How to convert these strange characters? (ë, Ã, ì, ù, Ã)
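These are UTF-8 encoded characters. Use utf8_decode() to convert them to normal ISO-8859-1 characters. Note that utf8_decode() is deprecated as of PHP 8.2; mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8') performs the same conversion.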
That’s the Unicode Replacement Character, \uFFFD. Something like this should work: String strImport = "For some reason my �double quotes� were lost."; strImport = strImport.replaceAll("\uFFFD", "\"");
To convert to HTML entities: <?php echo mb_convert_encoding( file_get_contents('http://www.tvrage.com/quickinfo.php?show=Surviver&ep=20x02&exact=0'), "HTML-ENTITIES", "UTF-8" ); ?> See the docs for mb_convert_encoding for more encoding options.
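For readers working in Java rather than PHP, the same idea can be sketched by hand. This is a minimal illustration, not a port of mb_convert_encoding: the class and method names are hypothetical, and it emits only numeric character references (such as &#235;) instead of named entities. ASCII characters, including & and <, are left untouched.

```java
class HtmlEntityEncoder {
    // Hypothetical helper: replaces every code point above U+007F
    // with a decimal numeric character reference.
    static String toNumericEntities(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00EB is e-diaeresis, \u2019 is the right single quotation mark.
        System.out.println(toNumericEntities("No\u00EBl\u2019s"));
        // prints: No&#235;l&#8217;s
    }
}
```

Iterating over code points rather than chars keeps characters outside the Basic Multilingual Plane as a single reference instead of two surrogate halves.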
I can indeed confirm that the Facebook download data is incorrectly encoded; it is Mojibake. The original data is UTF-8 encoded but was decoded as Latin-1 instead. I’ll make sure to file a bug report. In the meantime, you can repair the damage in two ways: Decode the data as JSON, then re-encode any strings …
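The general repair for this kind of damage (UTF-8 bytes decoded as Latin-1) is to reverse the wrong step: encode the garbled string back to Latin-1 to recover the original bytes, then decode those bytes as UTF-8. Below is a minimal Java sketch of that round trip. The class and method names are hypothetical, and it assumes the string has already been extracted from the JSON and contains only characters up to U+00FF; anything else would be replaced with '?' during encoding.

```java
import java.nio.charset.StandardCharsets;

class MojibakeRepair {
    // Hypothetical helper: undoes "UTF-8 bytes read as Latin-1".
    static String repair(String garbled) {
        // Latin-1 maps U+0000..U+00FF one-to-one onto bytes,
        // so this recovers the original UTF-8 byte sequence.
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00C3\u00AB is the two-character mojibake for \u00EB (e-diaeresis).
        String fixed = repair("\u00C3\u00AB");
        System.out.println(fixed.equals("\u00EB")); // prints: true
    }
}
```

Pure ASCII text passes through unchanged, so running the repair over strings that were never damaged is harmless only as long as they contain no characters above U+007F.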
Introduction: Normally, JSF/Facelets will set the request parameter character encoding to UTF-8 by default when the view is created/restored. But if any request parameter has been requested before the view has been created/restored, then it’s too late to set the proper character encoding, because the request parameters are parsed only once. PrimeFaces encoding …
So what’s the problem? It’s a ’ (RIGHT SINGLE QUOTATION MARK – U+2019) character which is being decoded as CP-1252 instead of UTF-8. If you check the encodings table, then you see that this character is composed in UTF-8 of the bytes 0xE2, 0x80 and 0x99. If you check the CP-1252 code page layout, then you’ll …
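The byte-level explanation above can be reproduced in a few lines of Java. This is only an illustrative sketch with hypothetical class and method names: it encodes U+2019 as UTF-8 (bytes 0xE2, 0x80, 0x99), decodes those bytes as windows-1252 to produce the familiar three-character garbage, and then reverses the step. It assumes the JVM provides the windows-1252 charset, which standard JDK builds normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Cp1252Demo {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Simulates the bug: UTF-8 bytes interpreted as CP-1252.
    static String misdecode(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Reverses the bug for text that was garbled this way.
    static String repair(String garbled) {
        return new String(garbled.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = misdecode("\u2019");
        // 0xE2 -> U+00E2, 0x80 -> U+20AC (euro sign), 0x99 -> U+2122 (trade mark sign)
        System.out.println(garbled.equals("\u00E2\u20AC\u2122")); // prints: true
        System.out.println(repair(garbled).equals("\u2019"));     // prints: true
    }
}
```

The round trip is shown for this one character only; a few byte values have no assignment in CP-1252, so a repair of arbitrary text is not guaranteed to be lossless.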
Three words for you: Byte Order Mark (BOM). That’s the representation of the UTF-8 BOM in ISO-8859-1. You have to tell your editor not to use BOMs, or use a different editor to strip them out. To automate the BOM’s removal you can use awk as shown in this question. As another answer says, the …
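The answer points to awk for automated removal; as an alternative illustration, here is a minimal Java sketch of the same check. The class and method names are hypothetical. It tests whether the data starts with the UTF-8 BOM bytes 0xEF, 0xBB, 0xBF and drops them if so, and the main method also shows why those bytes appear as three stray characters when read as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BomStripper {
    // Hypothetical helper: returns the data without a leading UTF-8 BOM.
    static byte[] stripUtf8Bom(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return Arrays.copyOfRange(data, 3, data.length);
        }
        return data; // no BOM present, return the input unchanged
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        // Read as ISO-8859-1, the BOM shows up as U+00EF, U+00BB, U+00BF.
        String asLatin1 = new String(withBom, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.startsWith("\u00EF\u00BB\u00BF")); // prints: true
        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8)); // prints: hi
    }
}
```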
That can happen if the request and/or response encoding isn’t properly set at all. For GET requests, you need to configure it at the servletcontainer level. It’s unclear which one you’re using, but in for example Tomcat that’s to be done by the URIEncoding attribute of the <Connector> element in its /conf/server.xml: <Connector … URIEncoding="UTF-8">. For POST requests, …