How to identify/delete non-UTF-8 characters in R

Another solution uses iconv and its sub argument, a character string. If not NA (here I set it to ''), it is used to replace any non-convertible bytes in the input.
x <- "fa\xE7ile"
Encoding(x) <- "UTF-8"
iconv(x, "UTF-8", "UTF-8", sub = '') ## replace any non-UTF-8 bytes with ''
"faile"
Here note that if we choose the right encoding: …

How do I create a “macro” for regressors in R?

Here are some alternatives. No packages are used in the first 3.
1) reformulate
fo <- reformulate(regressors, response = "income")
lm(fo, Duncan)
or you may wish to write the last line like this, so that the formula shown in the output looks nicer:
do.call("lm", list(fo, quote(Duncan)))
in which case the Call: line of …

Pandas long to wide reshape, by two variables

A simple pivot might be sufficient for your needs, but this is what I did to reproduce your desired output:
df['idx'] = df.groupby('Salesman').cumcount()
Just adding a within-group counter/index will get you most of the way there, but the column labels will not be as you desired:
print(df.pivot(index='Salesman', columns='idx')[['product', 'price']])
product price idx 0 1 2 …
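The counter-then-pivot approach above can be tried end to end with a minimal sketch like the one below. The sample rows are made up for illustration. The final step, which flattens the column labels into names such as product_0, is an assumption about one way to tidy the labels and is not taken from the excerpt.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the excerpt.
df = pd.DataFrame({
    "Salesman": ["Knut", "Knut", "Knut", "Steve", "Steve"],
    "product": ["bat", "ball", "wand", "pen", "pad"],
    "price": [5, 1, 3, 2, 4],
})

# Within-group counter: 0, 1, 2, ... over each salesman's rows.
df["idx"] = df.groupby("Salesman").cumcount()

# Pivot to wide form; the columns become a (variable, idx) MultiIndex.
wide = df.pivot(index="Salesman", columns="idx")[["product", "price"]]

# Assumed tidy-up step: flatten the MultiIndex into single-level labels.
wide.columns = [f"{var}_{i}" for var, i in wide.columns]
print(wide)
```

Salesmen with fewer rows than the largest group get missing values in the trailing columns, which is why the price columns may come back as floats.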