stata
How to identify/delete non-UTF-8 characters in R
Another solution uses iconv and its argument sub: a character string. If not NA (here I set it to ''), it is used to replace any non-convertible bytes in the input.
    x <- "fa\xE7ile"
    Encoding(x) <- "UTF-8"
    iconv(x, "UTF-8", "UTF-8", sub = '')   ## replace any non-UTF-8 byte with ''
    "faile"
Note that if we choose the right encoding: … Read more
How do I create a “macro” for regressors in R?
Here are some alternatives. No packages are used in the first 3.
1) reformulate
    fo <- reformulate(regressors, response = "income")
    lm(fo, Duncan)
or you may wish to write the last line like this, so that the formula shown in the output looks nicer:
    do.call("lm", list(fo, quote(Duncan)))
in which case the Call: line of … Read more
Examples of the perils of globals in R and Stata
I also have the pleasure of teaching R to undergraduate students who have no experience with programming. The problem I found was that most examples of when globals are bad are rather simplistic and don’t really get the point across. Instead, I try to illustrate the principle of least astonishment. I use examples where it … Read more
Pandas long to wide reshape, by two variables
A simple pivot might be sufficient for your needs, but this is what I did to reproduce your desired output:
    df['idx'] = df.groupby('Salesman').cumcount()
Just adding a within-group counter/index will get you most of the way there, but the column labels will not be as you desired:
    print(df.pivot(index='Salesman', columns='idx')[['product', 'price']])
    product price idx 0 1 2 … Read more
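The cumcount-then-pivot idea from the excerpt can be sketched end to end as below. The sample rows (Ann, Bob and their products) are made up for illustration. The final step, which flattens the two-level column labels into names like product_0, is an assumption about one reasonable way to relabel the columns and is not taken from the truncated answer.

```python
import pandas as pd

# Hypothetical long-format data; only the column names come from the excerpt.
df = pd.DataFrame({
    "Salesman": ["Ann", "Ann", "Ann", "Bob"],
    "product": ["bat", "ball", "wand", "pen"],
    "price": [5, 1, 3, 2],
})

# Within-group counter: 0, 1, 2, ... restarting for each Salesman.
df["idx"] = df.groupby("Salesman").cumcount()

# Pivot to wide form; the columns become a (variable, idx) MultiIndex.
wide = df.pivot(index="Salesman", columns="idx")[["product", "price"]]

# Assumed relabeling step: flatten the MultiIndex into single-level names.
wide.columns = [f"{name}_{i}" for name, i in wide.columns]
wide = wide.reset_index()

print(wide)
```

Salesmen with fewer rows than the largest group get missing values in the unused columns, so a numeric column such as price may be upcast to float.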