Remove characters that are not printable ASCII characters from a string in C#

The following TrimNonAscii extension method removes every character that is not a printable ASCII character from a string.
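// Requires: using System.Text.RegularExpressions;
// Extension methods must be declared inside a static class.
public static string TrimNonAscii(this string value)
{
    // Match characters outside the range ' ' (0x20) through '~' (0x7E).
    string pattern = "[^ -~]*";
    Regex reg_exp = new Regex(pattern);
    // Replace every match with the empty string.
    return reg_exp.Replace(value, "");
}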

The code builds a regular expression that matches any character outside the range " " (space, code 32) through "~" (tilde, code 126), repeated any number of times. That range is exactly the set of printable ASCII characters. The code uses the pattern to create a Regex object, calls the object's Replace method to replace each match with an empty string, and returns the result.
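To make the character range concrete, here is a minimal sketch of the same filter written without a regular expression. The AsciiFilter class and the KeepPrintableAscii method are hypothetical names used only for illustration; the sketch assumes the same range as above, space (0x20) through tilde (0x7E).

```csharp
using System;
using System.Linq;

public static class AsciiFilter
{
    // Hypothetical helper (not from the original article): keeps only the
    // characters from ' ' (0x20) through '~' (0x7E) and drops everything else.
    public static string KeepPrintableAscii(this string value)
    {
        return new string(value.Where(c => c >= ' ' && c <= '~').ToArray());
    }
}

public static class Program
{
    public static void Main()
    {
        // The é and the tab fall outside the range, so both are dropped.
        Console.WriteLine("caf\u00e9\tmenu".KeepPrintableAscii());   // prints "cafmenu"
    }
}
```

Both versions should produce the same output for any input. The regular expression is more compact, while the LINQ version spells out the comparison that the " -~" range performs.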

Note that this method removes many useful Unicode characters such as £, Æ, and ♥, as well as all text written in non-Latin scripts such as Cyrillic and Kanji. It also removes control characters such as tabs and newlines. It's mostly useful for plain English text.
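A quick check illustrates the limitation. The sketch below repeats the method inside a static class, which extension methods require. The class names StringExtensions and Demo, and the sample strings, are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    public static string TrimNonAscii(this string value)
    {
        // Remove every character outside space (0x20) through tilde (0x7E).
        return Regex.Replace(value, "[^ -~]*", "");
    }
}

public static class Demo
{
    public static void Main()
    {
        // The pound sign is dropped; the digits and the space before them stay.
        Console.WriteLine("Price: £25".TrimNonAscii());      // prints "Price: 25"

        // Æ and ♥ are dropped, leaving the two spaces that surrounded the heart.
        Console.WriteLine("Æsop ♥ fables".TrimNonAscii());   // prints "sop  fables"

        // Cyrillic text disappears entirely.
        Console.WriteLine("Привет".TrimNonAscii().Length);   // prints 0
    }
}
```

If the input is likely to contain such characters, it may be worth checking the result before using it, because an entirely non-Latin string comes back empty.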

   

 
