stata – Make Me Engineer

Read Stata 13 file in R

June 11, 2023 by Tarik

Reading multiple files into multiple data frames

June 4, 2023 by Tarik

Mutate multiple columns in a dataframe

May 15, 2023 by Tarik

How to identify/delete non-UTF-8 characters in R

December 1, 2022 by Tarik

Another solution uses iconv and its argument sub: a character string. If not NA (here I set it to ''), it is used to replace any non-convertible bytes in the input.

    x <- "fa\xE7ile"
    Encoding(x) <- "UTF-8"
    iconv(x, "UTF-8", "UTF-8", sub = '')   ## replace any non-UTF-8 bytes by ''
    "faile"

Here note that if we choose the right encoding: …

How do I create a “macro” for regressors in R?

November 23, 2022 by Tarik

Here are some alternatives. No packages are used in the first 3.

1) reformulate

    fo <- reformulate(regressors, response = "income")
    lm(fo, Duncan)

or you may wish to write the last line as this so that the formula that is shown in the output looks nicer:

    do.call("lm", list(fo, quote(Duncan)))

in which case the Call: line of …

Examples of the perils of globals in R and Stata

July 20, 2022 by Tarik

I also have the pleasure of teaching R to undergraduate students who have no experience with programming. The problem I found was that most examples of when globals are bad are rather simplistic and don't really get the point across. Instead, I try to illustrate the principle of least astonishment. I use examples where it …

Pandas long to wide reshape, by two variables

May 13, 2022 by Tarik

A simple pivot might be sufficient for your needs, but this is what I did to reproduce your desired output:

    df['idx'] = df.groupby('Salesman').cumcount()

Just adding a within-group counter/index will get you most of the way there, but the column labels will not be as you desired:

    print(df.pivot(index='Salesman', columns='idx')[['product', 'price']])

    product price idx 0 1 2 …
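The counter-then-pivot idea in this excerpt can be tried end to end with a minimal sketch like the one below. The sample rows and the names "Ann" and "Bob" are hypothetical (the original question's data is not shown here); only the column names Salesman, product and price come from the excerpt. The final flattening step is one possible way to tidy the column labels, not necessarily what the original answer goes on to do.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the excerpt.
df = pd.DataFrame({
    "Salesman": ["Ann", "Ann", "Ann", "Bob", "Bob"],
    "product": ["bat", "ball", "wand", "pen", "ruler"],
    "price": [5, 1, 3, 2, 4],
})

# Within-group counter: 0, 1, 2, ... over each salesman's rows.
df["idx"] = df.groupby("Salesman").cumcount()

# One row per salesman; the columns become a (value, idx) MultiIndex.
wide = df.pivot(index="Salesman", columns="idx")[["product", "price"]]

# Assumed tidy-up step: flatten the MultiIndex into labels like "product_0".
wide.columns = [f"{name}_{i}" for name, i in wide.columns]

print(wide)
```

Salesmen with fewer rows than the largest group end up with missing values in the trailing columns, which is usually what a long-to-wide reshape should do.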