I have a scenario in which i have to compare 2 Maps. I have to take a keys of the source Map and iterate through the target Map for the key for equals and then compare their values. The problems is that these Maps are supposed to hold very high(>=10,000,000) volume of records. So i cannot keep these maps in-memory.
Possible solutions:
Storing both as text files as "key=value"
Problems:
We have to iterate through the target Map text file for every key value in the source which is ineffective and time consuming.
Possible solutions: Storing both as text files as "key=value" and creating an index of id=>line number for target
Problems:
No effective method to directly read a line based on the line number from a big text file. Some methods use api's for Java 1.8 and they again need the file to be loaded in-memory
Possible solutions:
Storing both in a database
Problems:
In this case, we have to query the databases for every single key value. If we have 1 million keys in source and target, We have to query 1 million times. Ineffective and time consuming
Possible solutions:
Using mapDB
Problems:
Tried this, but it failed after 260,000 records. It gave an Writer thread failed exception mainly because i was using a 32bit JVM. So i want to write the implementation myself rather than relying on MapDB.
How do i effectively store/retrieve and compare the key value maps so that it wont hit the performance much when i am doing the comparison. At any time i cannot bring any thing in-memory because it will give Out of Memory exception. The solution should read and writes to the disk, not in-memory - I don't have 12 GB of RAM. Also the solution should work for 32/64 bit systems
One reasonably simple option:
That way you only need two key/value pairs in memory at any one time.
For example, suppose we have a "source" map with keys A, B, H, I and a "target" map with keys B, C, I, J. The process would look like this:
See more on this question at Stackoverflow