How to change diacritic characters to non-diacritic ones [duplicate]

Since no one has ever bothered to post the code to do this, here it is:
    // \p{Mn} or \p{Non_Spacing_Mark}:
    //   a character intended to be combined with another
    //   character without taking up extra space
    //   (e.g. accents, umlauts, etc.).
    private static readonly Regex nonSpacingMarkRegex = new Regex(@"\p{Mn}", RegexOptions.Compiled);
    public static string RemoveDiacritics(string text) …
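The excerpt is cut off before the method body, so the following is not the original C# code. It is a minimal Python sketch of one common way to apply the same idea: decompose the text so that accents become separate non-spacing marks (category Mn), drop those marks, then recompose. The function name `remove_diacritics` is an assumption made for illustration.

```python
import unicodedata


def remove_diacritics(text: str) -> str:
    """Strip non-spacing marks (Unicode category Mn) from text."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep every character that is not a non-spacing mark.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Recompose whatever is left into the usual precomposed form.
    return unicodedata.normalize("NFC", stripped)


print(remove_diacritics("crème brûlée"))  # prints "creme brulee"
```

Letters such as ø that have no canonical decomposition pass through unchanged with this approach.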

Should I use accented characters in URLs?

There’s no ambiguity here: RFC 3986 says no. URIs cannot contain Unicode characters, only ASCII. How browsers represent encoded characters when displaying a URI is an entirely different matter: for example, some browsers will display a space in a URL instead of ‘%20’. This is how IDN works too: punycoded strings are encoded and …
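To make the distinction between the encoded form and the displayed form concrete, here is a small Python sketch using only the standard library. The path and host name are made-up examples, not taken from the original answer.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing an accented letter and a space.
path = "/caf\u00E9 menu"

# Percent-encoding turns each non-ASCII character into its UTF-8 bytes,
# written as %XX, so the result is pure ASCII.
encoded = quote(path)
print(encoded)  # prints "/caf%C3%A9%20menu"

# Decoding recovers the readable form that a browser might display.
print(unquote(encoded) == path)  # prints "True"

# Host names use a different scheme: each label is converted to Punycode.
host = "b\u00FCcher.example"  # hypothetical internationalized domain
print(host.encode("idna"))  # prints "b'xn--bcher-kva.example'"
```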

How to ignore acute accent in a javascript regex match?

The standard ECMAScript regex isn’t ready for Unicode (see http://blog.stevenlevithan.com/archives/javascript-regex-and-unicode), so you have to use an external regex library. I have used this one (with the Unicode plugin) in the past: http://xregexp.com/ In your case, you may have to escape the character é as \u00E9 and define a character class covering e, é, ê, etc. EDIT …
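The character-class idea can be sketched as follows. This is an illustration in Python rather than JavaScript, and the `ACCENT_CLASSES` table and the `accent_insensitive_pattern` helper are hypothetical names covering only a few letters.

```python
import re

# Hypothetical table: base letter -> the variants we are willing to match.
ACCENT_CLASSES = {
    "a": "a\u00E0\u00E1\u00E2\u00E4",  # a à á â ä
    "e": "e\u00E8\u00E9\u00EA\u00EB",  # e è é ê ë
}


def accent_insensitive_pattern(word: str) -> re.Pattern:
    """Build a regex where listed letters also match their accented forms."""
    parts = []
    for ch in word:
        variants = ACCENT_CLASSES.get(ch)
        # Letters in the table become a character class; others match literally.
        parts.append(f"[{variants}]" if variants else re.escape(ch))
    return re.compile("".join(parts))


pattern = accent_insensitive_pattern("cafe")
print(bool(pattern.search("un caf\u00E9 noir")))  # prints "True"
print(bool(pattern.search("un cafe noir")))  # prints "True"
```

This only matches precomposed characters; input in decomposed form (base letter followed by a combining accent) would need normalizing first.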

Remove diacritics from a string

If you have the intl extension (http://php.net/manual/en/book.intl.php) available, you can use this:
    $string = "Fóø Bår";
    $transliterator = Transliterator::createFromRules(':: Any-Latin; :: Latin-ASCII; :: NFD; :: [:Nonspacing Mark:] Remove; :: Lower(); :: NFC;', Transliterator::FORWARD);
    echo $normalized = $transliterator->transliterate($string);
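For comparison, the same chain of steps (decompose, remove non-spacing marks, lowercase, recompose) can be approximated without ICU. The Python sketch below is not equivalent to the transliterator rules above: the `FALLBACK` table is a hypothetical stand-in for the Latin-ASCII step and covers only a handful of letters.

```python
import unicodedata

# Hypothetical fallback for letters with no canonical decomposition.
# ICU's Latin-ASCII transform handles far more cases than these.
FALLBACK = str.maketrans({"ø": "o", "Ø": "O", "ß": "ss", "æ": "ae", "Æ": "AE"})


def to_plain_lower(text: str) -> str:
    """Lowercase text with accents removed, using a small fallback table."""
    # NFD: split accented letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Remove the non-spacing marks (category Mn).
    no_marks = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Map the remaining special letters, lowercase, and recompose (NFC).
    return unicodedata.normalize("NFC", no_marks.translate(FALLBACK).lower())


print(to_plain_lower("Fóø Bår"))  # prints "foo bar"
```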

Why can’t I use accented characters next to a word boundary?

JavaScript’s regex implementation is not Unicode-aware. It only knows the ‘word characters’ in standard low-byte ASCII, which does not include é or any other accented or non-English letters. Because é is not a word character to JS, é followed by a space can never be considered a word boundary. (It would match \b if used …
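The effect can be reproduced in Python, whose `re.ASCII` flag restricts `\w` and `\b` to ASCII in a way that resembles the behavior described above. This is an analogy for illustration, not JavaScript itself, and the sample text is made up. The last pattern shows one possible workaround: spelling out the boundary with a lookahead instead of relying on `\b`.

```python
import re

word = "caf\u00E9"
text = "caf\u00E9 au lait"

# ASCII-only word characters: é and the space are both "non-word",
# so there is no boundary between them and \b fails.
ascii_match = re.search(word + r"\b", text, re.ASCII)
print(ascii_match is None)  # prints "True"

# Unicode-aware matching (Python's default for str): é is a word character.
unicode_match = re.search(word + r"\b", text)
print(unicode_match is not None)  # prints "True"

# Workaround sketch: require whitespace or end of input after the word.
explicit = re.search(word + r"(?=\s|$)", text, re.ASCII)
print(explicit.group(0) == word)  # prints "True"
```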

Check If the string contains accented characters in SQL?

SQL Fiddle: http://sqlfiddle.com/#!6/9eecb7d/1607
    declare @a nvarchar(32) = 'àéêöhello!'
    declare @b nvarchar(32) = 'aeeohello!'
    select case when (cast(@a as varchar(32)) collate SQL_Latin1_General_Cp1251_CS_AS) = @a then 0 else 1 end HasSpecialChars
    select case when (cast(@b as varchar(32)) collate SQL_Latin1_General_Cp1251_CS_AS) = @b then 0 else 1 end HasSpecialChars
(based on solution here: How can I remove accents on …
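Outside SQL Server, a similar yes/no check can be written by looking for combining marks after decomposition. This is a different technique from the collation trick above, sketched in Python with a hypothetical function name, and it reuses the two sample strings from the query.

```python
import unicodedata


def has_accents(text: str) -> bool:
    """Return True if any character carries a combining mark after NFD."""
    # unicodedata.combining() is non-zero for combining characters,
    # which is what accents turn into once the text is decomposed.
    return any(unicodedata.combining(ch) for ch in unicodedata.normalize("NFD", text))


print(has_accents("àéêöhello!"))  # prints "True"
print(has_accents("aeeohello!"))  # prints "False"
```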