unicode – Make Me Engineer

How can I use Unicode characters on the Windows command line?

June 20, 2023 by Tarik

Try chcp 65001, which changes the console code page to UTF-8. You also need to use the Lucida Console font.

MySQL: Get character-set of database or table or column?

June 16, 2023 by Tarik

Here’s how I’d do it.

For Schemas (or Databases; they are synonyms): SELECT default_character_set_name FROM information_schema.SCHEMATA WHERE schema_name = "mydatabasename";

For Tables: SELECT CCSA.character_set_name FROM information_schema.`TABLES` T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "mydatabasename" AND T.table_name = "tablename";

For Columns: SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_schema = "mydatabasename" AND table_name … Read more

Output Unicode strings in Windows console

June 15, 2023 by Tarik

I have verified a solution using Visual Studio 2010, based on an MSDN article and an MSDN blog post. The trick is an obscure call to _setmode(…, _O_U16TEXT). Solution:

#include <iostream>
#include <io.h>
#include <fcntl.h>

int wmain(int argc, wchar_t* argv[])
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::wcout << L"Testing unicode -- English -- Ελληνικά -- Español." << std::endl;
}

… Read more

Matching only a unicode letter in Python re

June 15, 2023 by Tarik

You can construct a new character class: [^\W\d_] instead of \w. Translated into English, it means “Any character that is not a non-alphanumeric character ([^\W] is the same as \w), but that is also not a digit and not an underscore”. Therefore, it will only allow Unicode letters.
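As a minimal sketch of the character class described above, assuming Python 3 (where str patterns are Unicode-aware by default). The names LETTER, only_letters and the sample strings are illustrative, not part of the original answer:

```python
import re

# [^\W\d_]: start from \w, then exclude digits (\d) and the underscore.
LETTER = re.compile(r"[^\W\d_]")


def only_letters(text):
    """Return True if text is non-empty and every character matches [^\\W\\d_]."""
    return re.fullmatch(r"[^\W\d_]+", text) is not None


# Pull the letters out of a mixed string (Greek, underscore, digit, accent, punctuation).
letters = LETTER.findall("\u0395\u03bb_9\u00e9!")

print(letters)
print(only_letters("Espa\u00f1ol"))
print(only_letters("abc_1"))
```

In Python 2 the same pattern would need the re.UNICODE flag and a unicode pattern string to behave this way.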

Bytes in a unicode Python string

June 15, 2023 by Tarik

"In Python 2, Unicode strings may contain both unicode and bytes": No, they may not. They contain Unicode characters. Within the original string, \xd0 is not a byte that’s part of a UTF-8 encoding. It is the Unicode character with code point 208. u'\xd0' == u'\u00d0'. It just happens that the repr for Unicode strings … Read more
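The same point can be checked in Python 3, where every str is a Unicode string (so the u prefix is optional). This is only a small sketch; the variable names are illustrative:

```python
# "\xd0" in a text string is an escape for the code point U+00D0,
# not a raw byte from some encoding.
ch = "\xd0"

same_as_u_escape = ch == "\u00d0"   # both escapes name the same character
code_point = ord(ch)                # 208
utf8_bytes = ch.encode("utf-8")     # bytes only appear once the text is encoded

print(same_as_u_escape, code_point, utf8_bytes)
```

Note that the UTF-8 encoding of this character is two bytes, neither of which is 0xD0.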

UTF-8 in Windows 7 CMD [duplicate]

June 13, 2023 by Tarik

This question has already been answered in “Unicode characters in Windows command line – how?”. You missed one step: you need to use the Lucida Console font in addition to running chcp 65001 from the cmd console.

Converting unicode character to string format

June 13, 2023 by Tarik

A function from k.ken’s response:

function unicodeToChar(text) {
    return text.replace(/\\u[\dA-F]{4}/gi, function (match) {
        return String.fromCharCode(parseInt(match.replace(/\\u/g, ''), 16));
    });
}

It takes every \uXXXX escape sequence in the input string and converts it to the corresponding character.
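For comparison, here is a rough Python 3 port of the same idea. It is a sketch rather than the original author's code; unicode_to_char and _ESCAPE are names chosen for illustration:

```python
import re

# Matches a literal backslash, "u", and exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def unicode_to_char(text):
    """Replace each literal \\uXXXX sequence in text with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The raw string keeps the backslashes literal, as they would arrive from a file or API.
decoded = unicode_to_char(r"caf\u00e9 \u0391")
print(decoded)
```

One difference from the JavaScript version: an escaped surrogate pair stays as two separate surrogate code points in a Python string, so characters outside the Basic Multilingual Plane would need extra handling.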

Isn’t UTF-8’s byte order on big-endian machines different from that on little-endian machines? So why doesn’t UTF-8 require a BOM?

June 13, 2023 by Tarik

The byte order differs between big-endian and little-endian machines only for words/integers larger than a byte. For example, on a big-endian machine a 2-byte short integer stores the 8 most significant bits in the first byte and the 8 least significant bits in the second byte. On a little-endian machine the 8 most … Read more
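A short sketch of the distinction, using Python's struct module to lay out a 2-byte integer in both byte orders and then comparing text encodings. The values are arbitrary examples. UTF-16 is built from 2-byte units, so its bytes depend on the chosen order; UTF-8 is defined as a sequence of single bytes, so there is nothing to swap:

```python
import struct

value = 0x0102

# The same 16-bit integer, serialized in each byte order.
big = struct.pack(">H", value)      # most significant byte first
little = struct.pack("<H", value)   # least significant byte first

text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# UTF-16 uses 16-bit code units, so the two byte orders give different bytes.
utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")

# UTF-8 has 8-bit code units; its byte sequence is the same on every machine.
utf8 = text.encode("utf-8")

print(big, little)
print(utf16_be, utf16_le)
print(utf8)
```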

Is wchar_t needed for unicode support?

June 13, 2023 by Tarik

Technically, no. Unicode is a standard that defines code points, and it does not require a particular encoding. So you could use Unicode with the UTF-8 encoding, and then every character would fit in one char object or a short sequence of them, and the string could even still be null-terminated. The problem with UTF-8 and … Read more
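To illustrate why UTF-8 text can live in byte-sized units and still be null-terminated, the sketch below (in Python rather than C, with an arbitrary sample string) compares the encoded bytes. UTF-8 emits a zero byte only for U+0000 itself, whereas UTF-16 produces zero bytes for ordinary ASCII characters:

```python
text = "Hi \u0395\u03bb"  # "Hi " followed by two Greek letters

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# Each ASCII character is one byte in UTF-8; each Greek letter here takes two.
utf8_length = len(utf8)

# No embedded zero bytes, so a trailing 0 could still mark the end of a C string.
utf8_has_zero = 0 in utf8

# In UTF-16, the high byte of every ASCII character is zero.
utf16_has_zero = 0 in utf16

print(utf8_length, utf8_has_zero, utf16_has_zero)
```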

how to insert unicode text to SQL Server from query window

June 12, 2023 by Tarik

The following should work; N indicates a “Unicode constant string” in MSSQL: INSERT INTO tForeignLanguage ([Name]) VALUES (N'Араб')