By Xueming Shen
The UTF-8 charset implementation, which is available in all JDK/JRE releases from Sun, has been updated recently to reject non-shortest-form UTF-8 byte sequences. This is because the old implementation might be leveraged in security attacks. Since then I have been asked many times about what this "non-shortest-form" issue is and what the possible impact might be, so here are some answers.
The first question usually goes: "What is the non-shortest-form issue"?
Read more ...