All characters in a .NET string are Unicode characters. Do you mean they’re non-ASCII? That shouldn’t make any difference, unless you run into composition issues, e.g. an “e + acute accent” sequence not being replaced when you try to replace an “e acute”.
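To make the composition issue concrete, here is a minimal sketch. The sample word and variable names are only illustrative, and it assumes that normalizing to form C before replacing is acceptable for your data:

```csharp
using System;
using System.Text;

class CompositionDemo
{
    static void Main()
    {
        // "é" stored as one precomposed character (U+00E9).
        string composed = "caf\u00E9";
        // "e" followed by a combining acute accent (U+0301).
        string decomposed = "cafe\u0301";

        // The decomposed string contains no U+00E9, so this replace changes nothing.
        string attempt = decomposed.Replace('\u00E9', 'e');
        Console.WriteLine(attempt == decomposed); // True

        // Normalizing to form C first composes "e" + U+0301 into U+00E9.
        string normalized = decomposed.Normalize(NormalizationForm.FormC);
        Console.WriteLine(normalized == composed); // True
        Console.WriteLine(normalized.Replace('\u00E9', 'e')); // cafe
    }
}
```

If your input can arrive in either form, normalizing once up front is probably simpler than trying to match every possible spelling of the same character.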
You could try using a regular expression with Regex.Replace, or StringBuilder.Replace. Here’s sample code doing the same thing with both:
    using System;
    using System.Text;
    using System.Text.RegularExpressions;

    class Test
    {
        static void Main(string[] args)
        {
            string original = "abcdefghijkl";

            // One pass: the alternation matches any of the characters to remove.
            Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);
            string removedByRegex = regex.Replace(original, "");

            // Several passes: each Replace call removes one character from the buffer.
            string removedByStringBuilder = new StringBuilder(original)
                .Replace("a", "")
                .Replace("c", "")
                .Replace("e", "")
                .Replace("g", "")
                .Replace("i", "")
                .Replace("k", "")
                .ToString();

            // Both lines print the same result.
            Console.WriteLine(removedByRegex);
            Console.WriteLine(removedByStringBuilder);
        }
    }
I wouldn’t like to guess which is more efficient; you’d have to benchmark with your specific application. The regex way may be able to do it all in one pass, but that pass will be relatively CPU-intensive compared with each of the many Replace calls on the StringBuilder.
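If you want a rough starting point for that benchmark, something like the following sketch might do. The input string, the iteration count, and the RemoveWithStringBuilder helper are placeholders I made up; swap in data that looks like what your application really processes. A simple Stopwatch loop like this is crude, so treat the numbers as a hint rather than a verdict.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

class ReplaceBenchmark
{
    static void Main()
    {
        // Placeholder input: replace with text representative of your application.
        string original = string.Concat(Enumerable.Repeat("abcdefghijkl", 10000));
        const int iterations = 100;

        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);

        // Warm up both approaches so JIT and regex compilation are not timed,
        // and check that they agree before measuring anything.
        string byRegex = regex.Replace(original, "");
        string byBuilder = RemoveWithStringBuilder(original);
        if (byRegex != byBuilder)
        {
            throw new InvalidOperationException("The two approaches disagree.");
        }

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.Replace(original, "");
        }
        sw.Stop();
        Console.WriteLine("Regex:         {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            RemoveWithStringBuilder(original);
        }
        sw.Stop();
        Console.WriteLine("StringBuilder: {0} ms", sw.ElapsedMilliseconds);
    }

    // Hypothetical helper: the same chain of Replace calls as in the sample above.
    static string RemoveWithStringBuilder(string input)
    {
        return new StringBuilder(input)
            .Replace("a", "")
            .Replace("c", "")
            .Replace("e", "")
            .Replace("g", "")
            .Replace("i", "")
            .Replace("k", "")
            .ToString();
    }
}
```

Run it as a release build, and vary the input length and the number of characters being removed, since either can change which approach comes out ahead.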