read.csv – Make Me Engineer

How can I read the header but also skip lines – read.table()?

June 15, 2023 by Tarik

How to detect the right encoding for read.csv?

November 3, 2022 by Tarik

First of all, according to a more general question on StackOverflow, it is not possible to detect the encoding of a file with 100% certainty. I’ve struggled with this many times and settled on a non-automatic solution. Use iconvlist to get all possible encodings:

    codepages <- setNames(iconvlist(), iconvlist())

Then read the data using each of them:

    x <- lapply(codepages, function(enc) try(read.table("encoding.asc", …
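
The trial-and-error idea described above can be sketched roughly as follows. This is not the original answer's full code (which is truncated here); the sample file, the short `candidates` list and the use of `tryCatch` are assumptions made to keep the example small and quick to run.

```r
# Build a small Latin-1 encoded file to experiment with (hypothetical sample data).
path <- tempfile(fileext = ".csv")
latin1_bytes <- c(charToRaw("name,city\nJos"), as.raw(0xE9),
                  charToRaw(",Li"), as.raw(0xE8), charToRaw("ge\n"))
writeBin(latin1_bytes, path)

# A hand-picked subset of encodings; iconvlist() would give the full list.
candidates <- c("UTF-8", "latin1", "UTF-16LE")

# Try each encoding; keep NULL for any attempt that errors or warns.
attempts <- lapply(setNames(candidates, candidates), function(enc) {
  tryCatch(
    read.csv(path, fileEncoding = enc, stringsAsFactors = FALSE),
    error = function(e) NULL,
    warning = function(w) NULL
  )
})

# Encodings that produced a data frame without complaint.
readable <- names(Filter(Negate(is.null), attempts))
print(readable)
```

A clean read does not prove the encoding is correct, so the surviving candidates still need to be inspected by eye, which is why the approach is described as non-automatic.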

Why am I getting X. in my column names when reading a data frame?

October 9, 2022 by Tarik

read.csv() is a wrapper around the more general read.table() function. The latter has an argument check.names, which is documented as: check.names: logical. If ‘TRUE’ then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names. If necessary they are adjusted (by ‘make.names’) so that they …
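
A minimal sketch of the effect of check.names, using a made-up header row (the column names below are illustrative assumptions, not taken from the original question):

```r
# Hypothetical CSV whose header contains names that are not valid R identifiers.
csv_text <- "%change,1st col,ok\n0.5,2,3\n"

# Default behaviour: names are passed through make.names().
default_names <- names(read.csv(text = csv_text))

# With check.names = FALSE the header is kept exactly as written.
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))

print(default_names)
print(raw_names)
```

Columns whose names were kept verbatim can still be accessed with backticks or `[[`, for example `dat[["1st col"]]`.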

Specifying colClasses in the read.csv

October 7, 2022 by Tarik

You can specify colClasses for just one column. So in your example you should use:

    data <- read.csv('test.csv', colClasses = c("time" = "character"))
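
To see what the named colClasses vector changes, here is a small self-contained sketch. The inline data is an assumption standing in for test.csv, which is not shown in the original.

```r
# Hypothetical contents standing in for test.csv.
csv_text <- "time,value\n0930,1.5\n1015,2.25\n"

# Without colClasses, the time column is guessed to be integer,
# so the leading zero is lost.
guessed <- read.csv(text = csv_text)

# Naming only the column of interest leaves the others to be guessed as usual.
typed <- read.csv(text = csv_text, colClasses = c(time = "character"))

str(guessed)
str(typed)
```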

How to read only lines that fulfil a condition from a csv into R?

July 26, 2022 by Tarik

You could use the read.csv.sql function in the sqldf package and filter using an SQL select. From the help page of read.csv.sql:

    library(sqldf)
    write.csv(iris, "iris.csv", quote = FALSE, row.names = FALSE)
    iris2 <- read.csv.sql("iris.csv", sql = "select * from file where `Sepal.Length` > 5", eol = "\n")

read.csv, header on first line, skip second line [duplicate]

July 21, 2022 by Tarik

This should do the trick:

    all_content = readLines("file.csv")
    skip_second = all_content[-2]
    dat = read.csv(textConnection(skip_second), header = TRUE, stringsAsFactors = FALSE)

The first step, using readLines, reads the entire file into a character vector, where each element represents a line in the file. Next, you discard the second line using the fact that negative …
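
The same steps can be tried end to end on a throwaway file. The file contents below (a units row on the second line) are an assumption chosen only to illustrate the idea.

```r
# Write a hypothetical file whose second line holds units rather than data.
path <- tempfile(fileext = ".csv")
writeLines(c("id,length", "n/a,cm", "1,10.5", "2,7.25"), path)

# Read all lines, drop the second one, and parse what remains.
all_content <- readLines(path)
skip_second <- all_content[-2]
dat <- read.csv(textConnection(skip_second), header = TRUE,
                stringsAsFactors = FALSE)

print(dat)
```

Because the units row is removed before parsing, the length column is read as numeric rather than character.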

Invalid multibyte string in read.csv

June 17, 2022 by Tarik

Encoding sets the encoding of a character string. It doesn’t set the encoding of the file represented by the character string, which is what you want. This worked for me, after trying "UTF-8":

    x <- read.csv(url, header=FALSE, stringsAsFactors=FALSE, fileEncoding="latin1")

And you may want to skip the first 16 lines, and read in the headers separately. …

‘Incomplete final line’ warning when trying to read a .csv file into R

May 21, 2022 by Tarik

The message indicates that the last line of the file doesn’t end with an end-of-line (EOL) character (linefeed (\n) or carriage return + linefeed (\r\n)). The original intention of this message was to warn you that the file may be incomplete; most data files have an EOL character as the very last character in the file. …
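
A small sketch of how the warning arises and one way to make it go away. It uses readLines as a stand-in because it reliably raises the same kind of warning on a file without a final EOL; the file contents are made up for illustration.

```r
# Write a hypothetical file with no EOL after the last line.
path <- tempfile(fileext = ".csv")
cat("a,b\n1,2\n3,4", file = path)

# readLines() warns about the incomplete final line.
warned_before <- tryCatch({ readLines(path); FALSE },
                          warning = function(w) TRUE)

# Appending a newline completes the final line.
cat("\n", file = path, append = TRUE)
warned_after <- tryCatch({ readLines(path); FALSE },
                         warning = function(w) TRUE)

# The data itself parses the same way as before.
dat <- read.csv(path)
print(c(before = warned_before, after = warned_after))
```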

Specify custom Date format for colClasses argument in read.table/read.csv

May 16, 2022 by Tarik

You can write your own function that accepts a string and converts it to a Date using the format you want, then use setAs to register it as an as method. Then you can use your function as part of the colClasses. Try:

    setAs("character", "myDate", function(from) as.Date(from, format="%d/%m/%Y") )
    tmp <- c("1, 15/08/2008", "2, 23/05/2010") …
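
Since the excerpt is cut off, here is a hedged, self-contained sketch of the same idea rather than the original answer's remaining code. The `setClass("myDate")` call, the column names and the inline data are assumptions; the class declaration is added only so that setAs does not complain about an undefined class.

```r
library(methods)

# Declare the placeholder class so setAs() has a known target (assumption).
setClass("myDate")

# Register a coercion from character to "myDate" that parses day/month/year.
setAs("character", "myDate",
      function(from) as.Date(from, format = "%d/%m/%Y"))

# Hypothetical data with dates in dd/mm/yyyy format.
csv_text <- "id,when\n1,15/08/2008\n2,23/05/2010\n"

# read.csv() calls the registered coercion for the "myDate" column.
dat <- read.csv(text = csv_text, colClasses = c("integer", "myDate"))
str(dat)
```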