I have a pretty interesting topic - at least for me. Given a ByteArrayOutputStream with bytes in, for example, UTF-8, I need a function that can "translate" those bytes into another, new ByteArrayOutputStream in, for example, UTF-16, or ASCII, or you name it. My naive approach would have been to use an InputStreamReader and pass in the desired encoding, but that didn't work because that reads into a char[] and I can only write byte[] to the new BAOS.
public byte[] convertStream(Charset encoding) {
    ByteArrayInputStream original = new ByteArrayInputStream(raw.toByteArray());
    InputStreamReader contentReader = new InputStreamReader(original, encoding);
    ByteArrayOutputStream converted = new ByteArrayOutputStream();
    int readCount;
    char[] buffer = new char[4096];
    while ((readCount = contentReader.read(buffer, 0, buffer.length)) != -1)
        converted.write(buffer, 0, readCount);
    return converted.toByteArray();
}
Now, this obviously doesn't work and I'm looking for a way to make this scenario possible, without building a String out of the byte[].
@Edit: Since it seems rather hard to read the obvious things. 1) raw: a ByteArrayOutputStream containing the bytes of a BINARY object sent to us from clients. The bytes usually come in UTF-8 as part of an HTTP message. 2) The goal here is to send this BINARY data forward to an internal system that's not flexible - well, it is an internal system - and it accepts such attachments in UTF-16. I don't know why, don't even ask; it just does.
So to justify my question: is there a way to convert a byte array from Charset A to Charset B, or an encoding of your choice? Once again, building a String is NOT what I'm after.
Thank you and hope that clears up questionable parts :).
As mentioned in comments, I'd just convert to a string:
String text = new String(raw.toByteArray(), encoding);
byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
However, if that's not feasible (for some unspecified reason...) what you've got now is nearly there - you just need to add an OutputStreamWriter into the mix:
// Nothing here should throw IOException in reality - work out what you want to do.
public byte[] convertStream(Charset encoding) throws IOException {
    ByteArrayInputStream original = new ByteArrayInputStream(raw.toByteArray());
    InputStreamReader contentReader = new InputStreamReader(original, encoding);
    int readCount;
    char[] buffer = new char[4096];
    try (ByteArrayOutputStream converted = new ByteArrayOutputStream()) {
        try (Writer writer = new OutputStreamWriter(converted, StandardCharsets.UTF_8)) {
            while ((readCount = contentReader.read(buffer, 0, buffer.length)) != -1) {
                writer.write(buffer, 0, readCount);
            }
        }
        return converted.toByteArray();
    }
}
Note that you're still creating an extra temporary copy of the data in memory, admittedly in UTF-8 rather than UTF-16... but fundamentally this is hardly any more efficient than creating a string.
If memory efficiency is a particular concern, you could perform multiple passes in order to work out how many bytes will be required, create a byte array of the right length, and then adjust the code to write straight into that byte array.
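A rough sketch of that multi-pass idea, using CharsetDecoder and CharsetEncoder directly rather than the reader/writer pair. The class name TwoPassTranscoder, the methods transcode and run, and the CHUNK size are all made up for illustration; nothing here comes from the original code. The first pass encodes into a small scratch buffer purely to count bytes, and the second pass encodes straight into an array of exactly that size. It reports malformed or unmappable input as an UncheckedIOException, which is an assumption about the error handling you'd want.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of the original code.
final class TwoPassTranscoder {
    private static final int CHUNK = 4096; // size of the small working buffers

    static byte[] transcode(byte[] input, Charset source, Charset target) {
        try {
            int size = run(input, source, target, null); // pass 1: count only
            byte[] result = new byte[size];
            run(input, source, target, result);          // pass 2: fill the array
            return result;
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes and re-encodes chunk by chunk. With dest == null the encoded bytes
    // are only counted; otherwise they are written directly into dest.
    private static int run(byte[] input, Charset source, Charset target, byte[] dest)
            throws CharacterCodingException {
        CharsetDecoder decoder = source.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        CharsetEncoder encoder = target.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        boolean counting = (dest == null);
        ByteBuffer in = ByteBuffer.wrap(input);
        CharBuffer chars = CharBuffer.allocate(CHUNK);
        ByteBuffer out = counting ? ByteBuffer.allocate(CHUNK) : ByteBuffer.wrap(dest);
        int total = 0;
        boolean decoded = false; // true once the decoder has consumed and flushed all input

        while (true) {
            if (!decoded) {
                CoderResult dr = decoder.decode(in, chars, true);
                if (dr.isUnderflow()) dr = decoder.flush(chars);
                if (dr.isError()) dr.throwException();
                decoded = dr.isUnderflow();
            }
            chars.flip(); // switch the char buffer to reading
            CoderResult er = encoder.encode(chars, out, decoded);
            if (er.isUnderflow() && decoded) er = encoder.flush(out);
            if (er.isError()) er.throwException();
            chars.compact(); // keep any chars not yet encoded (e.g. half a surrogate pair)

            if (counting) {
                total += out.position(); // tally the bytes, then reuse the scratch buffer
                out.clear();
            } else if (er.isOverflow()) {
                throw new IllegalStateException("size computed in pass 1 was too small");
            }
            if (decoded && er.isUnderflow()) {
                return counting ? total : out.position();
            }
        }
    }
}
```

With something like this, the conversion in the question would be roughly TwoPassTranscoder.transcode(raw.toByteArray(), StandardCharsets.UTF_8, StandardCharsets.UTF_16). The trade-off is that the input is decoded twice, so you pay CPU time to avoid holding a second full-size intermediate copy.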