I've got an array of primitive bytes that can hold arbitrary values. I'm trying to count the occurrences of each value in the array in the fastest way possible. Currently I'm using:
HashMap<Byte, Integer> dataCount = new HashMap<>();
for (byte b : data) dataCount.put(b, dataCount.getOrDefault(b, 0) + 1);
This one-liner takes ~500ms to process a byte[] of length 24883200. Using a regular for loop takes at least 600ms.
I've been thinking of constructing a set (since they only contain one of each element) then adding it to a HashMap using Collections.frequency(), but the methods to construct a Set from primitives require several other calls, so I'm guessing it's not as fast.
What would be the fastest way to accomplish counting of occurrences of each item?
I'm using Java 8 and I'd prefer to avoid using Apache Commons if possible.
I would create an array instead of a HashMap, given that you know exactly how many counts you need to keep track of:
int[] counts = new int[256];
for (byte b : data) {
    counts[b & 0xff]++;
}
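That way, nothing gets boxed into Byte or Integer objects, no hash codes or equality checks are needed, and the only memory used is a single 256-element int array.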
Note that the & 0xff is used to get a value in the range [0, 255] instead of [-128, 127], so it's suitable as the index into the array.
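For anyone who wants to try the array approach end to end, here is a minimal self-contained sketch. The class name ByteHistogram, the helper countBytes and the sample data are made up for illustration; only the counting loop comes from the answer above.

```java
// Hypothetical wrapper class, for illustration only.
class ByteHistogram {

    // Returns a 256-slot table indexed by the unsigned value of each byte.
    static int[] countBytes(byte[] data) {
        int[] counts = new int[256];
        for (byte b : data) {
            // b & 0xff maps -128..-1 onto 128..255, so the index is never negative.
            counts[b & 0xff]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input containing both negative and non-negative bytes.
        byte[] data = {0, 1, 1, -1, -128, 127, -1};
        int[] counts = countBytes(data);
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                // Casting the index back to byte recovers the original signed value.
                System.out.println((byte) i + " -> " + counts[i]);
            }
        }
    }
}
```

On Java 8 and later, Byte.toUnsignedInt(b) should give the same index as b & 0xff, if you prefer a named method over the mask.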