
I have to parse two large text files. Each file contains a String mapping from a local identifier to a String value. The local identifier is just a temporary key. In the end, the mapping should go from value(file1) to value(file2).

So what I did was:

  • Build a HashMap from the mappings in each file.
  • By iterating over the key set, build a HashMap that maps value(file1) to value(file2).

After that I had three HashMaps:

  1. localid -> value(file1)
  2. localid -> value(file2)
  3. value(file1) -> value(file2)

For verification, I did the following for each localid:

  • a) get value(file1) out of map 1
  • b) get value(file2) out of map 2
  • c) get value(file2) out of map 3, using the value from step a) as the key
  • d) compare value(file2)_b with value(file2)_c

The two values in step d) are not equal for 15% of the key-value pairs.

Actually there is some kind of system there: for example, N2c changes into [N]2c, [nH]1c3c changes into n1c3c, and (N) changes into ([NH]).

Is it possible that Java interprets the Strings as regular expressions, or does anyone have another idea?

Thanks a lot.

EDIT: OK, here is some code. Yes, this is more readable, sorry.

    // localid -> value, one map per file
    HashMap<String, String> idToFile1 = File1.getMapping();
    HashMap<String, String> idToFile2 = File2.getMapping();

    // value(file1) -> value(file2), joined on the shared localid
    HashMap<String, String> file1ToFile2 = new HashMap<String, String>();
    for (String localid : idToFile1.keySet()) {
        file1ToFile2.put(idToFile1.get(localid), idToFile2.get(localid));
    }

    // verification: the joined map should return the same file2 value for every localid
    for (String localid : idToFile1.keySet()) {
        String file1val = idToFile1.get(localid);
        String file2val = idToFile2.get(localid);
        if (!file2val.equals(file1ToFile2.get(file1val))) {
            System.err.println("mismatch!");
        }
    }

I get a mismatch in 15% of the cases.

Huh? I understood absolutely nothing. Actually there is some kind of System there? Wut? –  m0skit0 Nov 15 at 16:39
Show some code. If you're using String.replaceAll() it takes a regexp, but otherwise Java won't start randomly interpreting Strings as regular expressions. –  Kayaman Nov 15 at 16:40

1 Answer (accepted)

If different identifiers can have the same value in file 1, your third map will keep only the entry for the last one parsed. For example:

File 1 :

  • localId1 => "aaaa"
  • localId2 => "bbbb"
  • localId3 => "cccc"
  • localId4 => "aaaa"

File 2:

  • localId1 => "1111"
  • localId2 => "2222"
  • localId3 => "3333"
  • localId4 => "4444"

Your first and second maps will store this mapping as it is in your files.

However, when you build your third map, you'll get:

  • "aaaa" => "4444"
  • "bbbb" => "2222"
  • "cccc" => "3333"

As you can see, when you verify the parsing of your files, you'll get an error for localId1 ("aaaa" in file 1, "1111" in file 2, but "aaaa" => "4444" in the third map).
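The example above can be reproduced with a minimal sketch like the following. The class and method names (`DuplicateValueDemo`, `buildValueMap`, `countMismatches`, `sampleFile1`, `sampleFile2`) are made up for illustration and are not from the question. The sample maps are `LinkedHashMap`s so that the ids are visited in file order; with a plain `HashMap`, which duplicate "wins" depends on iteration order, but one of the two ids would still mismatch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class DuplicateValueDemo {

    // Hypothetical stand-in for the parsed file 1 (localid -> value).
    static Map<String, String> sampleFile1() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("localId1", "aaaa");
        m.put("localId2", "bbbb");
        m.put("localId3", "cccc");
        m.put("localId4", "aaaa"); // same value as localId1
        return m;
    }

    // Hypothetical stand-in for the parsed file 2 (localid -> value).
    static Map<String, String> sampleFile2() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("localId1", "1111");
        m.put("localId2", "2222");
        m.put("localId3", "3333");
        m.put("localId4", "4444");
        return m;
    }

    // Builds value(file1) -> value(file2) the same way as in the question.
    static Map<String, String> buildValueMap(Map<String, String> idToFile1,
                                             Map<String, String> idToFile2) {
        Map<String, String> file1ToFile2 = new HashMap<>();
        for (String localId : idToFile1.keySet()) {
            // put() silently replaces an entry already stored under the same key
            file1ToFile2.put(idToFile1.get(localId), idToFile2.get(localId));
        }
        return file1ToFile2;
    }

    // Counts the local ids whose file 2 value differs from the one in the value map.
    static int countMismatches(Map<String, String> idToFile1,
                               Map<String, String> idToFile2,
                               Map<String, String> file1ToFile2) {
        int mismatches = 0;
        for (String localId : idToFile1.keySet()) {
            String expected = idToFile2.get(localId);
            String actual = file1ToFile2.get(idToFile1.get(localId));
            if (!Objects.equals(expected, actual)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, String> idToFile1 = sampleFile1();
        Map<String, String> idToFile2 = sampleFile2();
        Map<String, String> file1ToFile2 = buildValueMap(idToFile1, idToFile2);

        System.out.println(file1ToFile2.get("aaaa"));                              // prints 4444
        System.out.println(countMismatches(idToFile1, idToFile2, file1ToFile2));   // prints 1
    }
}
```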

If you can't ensure that the values in file 1 are unique, you can't store the mapping in a map of "value in file 1" => "value in file 2", because a map holds only one value per key.

This could explain the 15% of mismatches.
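To check whether this is what is happening in your data, one option is to collect all file 2 values per file 1 value instead of overwriting them, then list the keys that end up with more than one. This is only a sketch under assumptions: `ConflictFinder`, `groupByFile1Value` and `findConflicts` are hypothetical names, and it uses `Map.computeIfAbsent`, so it needs Java 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ConflictFinder {

    // Groups every file 2 value under its file 1 value instead of overwriting.
    static Map<String, Set<String>> groupByFile1Value(Map<String, String> idToFile1,
                                                      Map<String, String> idToFile2) {
        Map<String, Set<String>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : idToFile1.entrySet()) {
            String file2Val = idToFile2.get(entry.getKey());
            grouped.computeIfAbsent(entry.getValue(), k -> new LinkedHashSet<>())
                   .add(file2Val);
        }
        return grouped;
    }

    // Keeps only the file 1 values that map to more than one distinct file 2 value.
    static Map<String, Set<String>> findConflicts(Map<String, String> idToFile1,
                                                  Map<String, String> idToFile2) {
        Map<String, Set<String>> conflicts = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry
                : groupByFile1Value(idToFile1, idToFile2).entrySet()) {
            if (entry.getValue().size() > 1) {
                conflicts.put(entry.getKey(), entry.getValue());
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // Same sample data as in the example above.
        Map<String, String> idToFile1 = new LinkedHashMap<>();
        idToFile1.put("localId1", "aaaa");
        idToFile1.put("localId2", "bbbb");
        idToFile1.put("localId3", "cccc");
        idToFile1.put("localId4", "aaaa");

        Map<String, String> idToFile2 = new LinkedHashMap<>();
        idToFile2.put("localId1", "1111");
        idToFile2.put("localId2", "2222");
        idToFile2.put("localId3", "3333");
        idToFile2.put("localId4", "4444");

        System.out.println(findConflicts(idToFile1, idToFile2)); // prints {aaaa=[1111, 4444]}
    }
}
```

If the conflict map comes back empty, duplicate values are not the cause and the problem lies elsewhere, for instance in the parsing.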

Actually, this is exactly what happened -.- The data in the file is incorrect, so the mapping is not possible at the moment... Thanks for your answer! –  Arlzheim Nov 15 at 17:26
 
You're welcome :) –  ssssteffff Nov 15 at 22:39
