gsub() in R is not replacing ‘.’ (dot)

You may need to escape the . which is a special character that means “any character” (from @Mr Flick’s comment):

    gsub('\\.', '-', x)
    #[1] "2014-06-09"

Or:

    gsub('[.]', '-', x)
    #[1] "2014-06-09"

Or, as @Moix mentioned in the comments, we can also use fixed = TRUE instead of escaping the characters:

    gsub(".", "-", x, fixed = TRUE)
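To see why the unescaped call misbehaves, here is a minimal sketch comparing it with the three fixes. The value assigned to x is an assumption (the excerpt shows only the expected output), and the result variable names are illustrative.

```r
# Assumed input: the excerpt shows only the output, not x itself.
x <- "2014.06.09"

# Unescaped, "." is a regex wildcard, so every character is replaced.
all_replaced <- gsub(".", "-", x)

# Three equivalent ways to match a literal dot.
escaped   <- gsub("\\.", "-", x)               # backslash-escaped in the regex
bracketed <- gsub("[.]", "-", x)               # dot inside a character class
literal   <- gsub(".", "-", x, fixed = TRUE)   # pattern treated as plain text

print(all_replaced)  # "----------"
print(escaped)       # "2014-06-09"
```

With fixed = TRUE the pattern is not interpreted as a regex at all, which is usually the simplest choice when the search text contains no pattern syntax.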

How to remove unicode from string?

“I just want to remove unicode <U+00A6> which is at the beginning of string.”

Then you do not need gsub; you can use sub with the "^\\s*<U\\+\\w+>\\s*" pattern:

    q <- "<U+00A6> 1000-66329"
    sub("^\\s*<U\\+\\w+>\\s*", "", q)

Pattern details:

    ^ - start of string
    \\s* - zero or more whitespaces
    <U\\+ - a literal char sequence <U+ …
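A small sketch of the call above, run on the original string and on a second, hypothetical string (q2 is not from the source) to show that the ^ anchor limits the replacement to a token at the very start. It assumes the string holds the literal text <U+00A6>, as in the excerpt, rather than the actual character.

```r
pattern <- "^\\s*<U\\+\\w+>\\s*"

# String from the excerpt: the escaped code point appears as literal text.
q <- "<U+00A6> 1000-66329"
cleaned <- sub(pattern, "", q)
print(cleaned)   # "1000-66329"

# Hypothetical input with a second token in the middle of the string.
q2 <- "<U+00A6> 1000 <U+00A6> 66329"
cleaned2 <- sub(pattern, "", q2)
print(cleaned2)  # "1000 <U+00A6> 66329"
```

Because the pattern is anchored, gsub would give the same result here; sub simply makes the single-replacement intent explicit.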

R – gsub replacing backslashes

Here’s what you need:

    gsub("\\\\", "\\\\\\\\", "\\")
    [1] "\\\\"

The reason you need four backslashes to represent one literal backslash is that "\" is an escape character both in R strings and in the regex engine to which you’re ultimately passing your patterns. If you were talking directly to the regex engine, you’d use …
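The two layers of escaping are easier to follow when you count characters instead of reading the printed, re-escaped form. The sketch below is illustrative (the variable names are not from the source). It also shows fixed = TRUE, which skips the regex layer so that only R's own string escaping remains.

```r
one <- "\\"   # R source "\\" is a string holding ONE backslash
nchar(one)    # 1

# Regex version: four typed backslashes in the pattern reach the regex engine
# as two, which match one literal backslash. The replacement yields two.
doubled <- gsub("\\\\", "\\\\\\\\", one)
nchar(doubled)       # 2
writeLines(doubled)  # shows the raw characters: \\

# fixed = TRUE: pattern and replacement are taken literally,
# so half as many backslashes are needed.
doubled_fixed <- gsub("\\", "\\\\", one, fixed = TRUE)
identical(doubled, doubled_fixed)  # TRUE
```

print() displays strings with escapes, which is why the two-character result appears as "\\\\" at the console, while writeLines() shows the characters actually stored.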

Remove all punctuation except apostrophes in R

    x <- "I like %$@to*&, chew;: gum, but don't like|}{[] bubble@#^)( gum!?"
    gsub("[^[:alnum:][:space:]']", "", x)
    [1] "I like to chew gum but don't like bubble gum"

The above regex is much more straightforward. It replaces everything that is not an alphanumeric character, whitespace, or an apostrophe (note the caret, which negates the character class) with an empty string.
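For contrast, here is a short sketch that sets the negated class next to the more common [[:punct:]] approach, which removes the apostrophe as well. The variable names kept and stripped are illustrative only.

```r
x <- "I like %$@to*&, chew;: gum, but don't like|}{[] bubble@#^)( gum!?"

# Negated class: delete anything that is NOT a letter, digit,
# whitespace, or apostrophe.
kept <- gsub("[^[:alnum:][:space:]']", "", x)
print(kept)  # "I like to chew gum but don't like bubble gum"

# Plain punctuation class: also deletes the apostrophe in "don't".
stripped <- gsub("[[:punct:]]", "", x)
print(stripped)
```

Listing the characters to keep rather than the ones to drop is what lets the apostrophe survive without a lookahead or a second pass.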