I'm working on a big Java web application in Eclipse whose files have different encodings: some are in UTF-8, others in Cp1252, and yet others in ISO-8859-1. The encodings are mixed across file types (JSPs, Java source files and CSS alike), but I know the encoding of each file.
I'm converting the project to Maven, and this is a good opportunity to convert all of them to UTF-8.
Of course I don't want to lose a single character (so fully automated conversions do not apply here).
How should I go about it? Is there a tool that can help me ensure I don't lose any special character?
The webapp is in Italian, so there could be lots of accented letters, especially in JSPs (HTML entities have probably not been used everywhere).
The project is in Eclipse, but I can use an external editor if that could make the conversion easier.
It's very easy to write code to convert encodings, although I'd expect there are tools to do it anyway. Simply:
1. Open a FileInputStream to the existing file, and wrap it in an InputStreamReader with the appropriate encoding.
2. Open a FileOutputStream to the new file, and wrap it in an OutputStreamWriter with the appropriate encoding.
3. Copy the characters from the reader to the writer, then close both.
The first two steps are simpler with Files.newBufferedReader and Files.newBufferedWriter, too.
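The steps above can be sketched roughly as follows, assuming Java 7 or later. The class name EncodingConverter and the methods convertToUtf8 and sameText are hypothetical names chosen for illustration; only the Files and Charset calls come from the JDK. The second method is an optional sanity check that decodes both files and compares the resulting text, which is one way to confirm that no character was lost.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative, not from any library.
final class EncodingConverter {

    // Reads 'source' using the encoding you know it has and writes the
    // same characters to 'target' as UTF-8.
    public static void convertToUtf8(Path source, Charset sourceCharset, Path target)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, sourceCharset);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int read;
            // Copy in chunks of characters (not lines), so line endings are preserved.
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        } // try-with-resources flushes and closes the writer, then closes the reader
    }

    // Decodes both files fully into memory and compares the text.
    public static boolean sameText(Path original, Charset originalCharset, Path converted)
            throws IOException {
        String before = new String(Files.readAllBytes(original), originalCharset);
        String after = new String(Files.readAllBytes(converted), StandardCharsets.UTF_8);
        return before.equals(after);
    }
}
```

For a Cp1252 file, the call might look like convertToUtf8(oldPath, Charset.forName("windows-1252"), newPath), followed by sameText(oldPath, Charset.forName("windows-1252"), newPath). Writing to a separate target file rather than overwriting in place keeps the original available for comparison until the check has passed.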
See more on this question at Stackoverflow