I have learned that in general, Java uses UTF-16 as the internal String representation.
My question is what actually happens when composing a response in Java and applying a different character encoding, e.g. response.setCharacterEncoding("ISO-8859-1").
Does it actually convert the response's body bytes from UTF-16 to ISO-8859-1, or does it just add some metadata to the response object?
I'm assuming you're talking about a class that works along the lines of HttpServletResponse. If that's the case, then yes, it changes the bytes of the response body, provided you write the body through getWriter. The writer returned by getWriter has to convert any strings that are written to it into bytes, and the character encoding you set is what it uses for that conversion.
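To make the conversion concrete, here is a minimal sketch using only the JDK. It does not use the servlet API at all: the OutputStreamWriter stands in for the writer a servlet container would hand back, and the class and method names (WriterEncodingDemo, encodeViaWriter) are made up for illustration. It shows the same string producing different bytes depending on the charset the writer was created with.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WriterEncodingDemo {

    // Hypothetical helper: writes text through a Writer bound to the given
    // charset and returns the bytes that ended up in the underlying stream.
    static byte[] encodeViaWriter(String text, Charset charset) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(body, charset)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return body.toByteArray();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", held in memory as UTF-16 chars

        byte[] latin1 = encodeViaWriter(text, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encodeViaWriter(text, StandardCharsets.UTF_8);

        System.out.println(latin1.length); // 4: the e-acute is the single byte 0xE9
        System.out.println(utf8.length);   // 5: the e-acute is the two bytes 0xC3 0xA9
    }
}
```

The in-memory String is the same in both calls; only the bytes that leave the writer differ, which is the sense in which the encoding affects the body rather than being pure metadata.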
If you've set the content type, then setting the character encoding will also make that information available via the Content-Type header. As per the ServletResponse docs:
Calling setContentType(java.lang.String) with the String of text/html and calling this method with the String of UTF-8 is equivalent with calling setContentType with the String of text/html; charset=UTF-8.
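As a rough illustration of what that combined header value looks like and how a client might read the charset back out of it, here is a small JDK-only sketch. Both helpers (contentTypeHeader and charsetOf) are hypothetical and are not part of the servlet API. The parsing is deliberately naive: it does not handle quoted parameter values, and the ISO-8859-1 fallback is just an assumption made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ContentTypeCharsetDemo {

    // Hypothetical helper: builds a header value of the form the docs describe,
    // e.g. "text/html; charset=UTF-8".
    static String contentTypeHeader(String mimeType, Charset charset) {
        return mimeType + "; charset=" + charset.name();
    }

    // Hypothetical client-side helper: extracts the charset parameter from a
    // Content-Type value. Falls back to ISO-8859-1 (an assumption for this
    // sketch) when no charset parameter is present.
    static Charset charsetOf(String headerValue) {
        for (String part : headerValue.split(";")) {
            String trimmed = part.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return Charset.forName(trimmed.substring("charset=".length()).trim());
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        String header = contentTypeHeader("text/html", StandardCharsets.UTF_8);
        System.out.println(header); // text/html; charset=UTF-8

        // The body bytes and the advertised charset have to agree for the
        // client to get the original text back.
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String decoded = new String(body, charsetOf(header));
        System.out.println(decoded.equals("caf\u00e9")); // true
    }
}
```

So the header is the metadata half of the picture: it tells the receiver which charset to use when turning the body bytes back into characters.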
See more on this question at Stack Overflow