read.csv
How to detect the right encoding for read.csv?
First of all, according to a more general question on StackOverflow, it is not possible to detect the encoding of a file with 100% certainty. I've struggled with this many times and settled on a non-automatic solution. Use iconvlist to get all possible encodings:
codepages <- setNames(iconvlist(), iconvlist())
Then read the data using each of them:
x <- lapply(codepages, function(enc) try(read.table("encoding.asc", …
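A minimal sketch of the try-every-encoding idea, under stated assumptions: the sample file, its contents, and the two-item candidate list are made up for illustration, and passing each candidate through `fileEncoding` is an assumption about how the truncated call above continues.

```r
# Build a small Latin-1 file byte by byte so the example does not depend on
# the session locale (0xE9 is "e acute" in Latin-1 and invalid alone in UTF-8).
path <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("name,value\ncaf"), as.raw(0xE9), charToRaw(",1\n")), path)

# A short, hypothetical candidate list keeps the run fast;
# iconvlist() would supply every encoding the platform knows.
candidates <- c("UTF-8", "latin1")
codepages <- setNames(candidates, candidates)

# Try each encoding; an error or a warning counts as a failed attempt (NULL).
attempts <- lapply(codepages, function(enc) {
  tryCatch(
    read.csv(path, fileEncoding = enc, stringsAsFactors = FALSE),
    error = function(e) NULL,
    warning = function(w) NULL
  )
})

# Encodings that read without complaint; the text still needs a visual check,
# because several encodings can decode the same bytes without any error.
readable <- Filter(Negate(is.null), attempts)
print(names(readable))
```

Which candidates survive can depend on the platform and locale, so the result is best treated as a shortlist to inspect by eye rather than a verdict.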
Why am I getting X. in my column names when reading a data frame?
read.csv() is a wrapper around the more general read.table() function. The latter has an argument check.names, which is documented as: check.names: logical. If 'TRUE' then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names. If necessary they are adjusted (by 'make.names') so that they …
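To see the effect of check.names, here is a small sketch with made-up column headers (the CSV text is purely illustrative). It reads the same data twice, once with the default and once with `check.names = FALSE`.

```r
# Hypothetical CSV whose 2nd and 3rd headers are not valid R variable names.
csv_text <- "id,2019 sales,$price\n1,10,3.5\n"

# Default behaviour: names are passed through make.names().
default_names <- names(read.csv(text = csv_text))

# check.names = FALSE keeps the headers exactly as they appear in the file.
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))

print(default_names)
print(raw_names)
```

Columns kept with their original names have to be accessed with backticks or `[[`, for example `df[["2019 sales"]]`.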
Specifying colClasses in the read.csv
You can specify colClasses for only one column. So in your example you should use: data <- read.csv('test.csv', colClasses=c("time"="character"))
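A short illustration of why this matters, using made-up data in place of test.csv: a time column with leading zeros loses them when R guesses the type, but keeps them when only that column is forced to character.

```r
# Hypothetical stand-in for test.csv: times stored with leading zeros.
csv_text <- "time,value\n0930,1.5\n1015,2.5\n"

# Without colClasses, the time column is guessed to be integer.
default_read <- read.csv(text = csv_text)

# A named colClasses vector overrides only the named column;
# every other column is still converted automatically.
typed_read <- read.csv(text = csv_text, colClasses = c(time = "character"))

str(default_read)
str(typed_read)
```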
How to read only lines that fulfil a condition from a csv into R?
You could use the read.csv.sql function in the sqldf package and filter using SQL select. From the help page of read.csv.sql:
library(sqldf)
write.csv(iris, "iris.csv", quote = FALSE, row.names = FALSE)
iris2 <- read.csv.sql("iris.csv", sql = "select * from file where `Sepal.Length` > 5", eol = "\n")
read.csv, header on first line, skip second line [duplicate]
This should do the trick:
all_content = readLines("file.csv")
skip_second = all_content[-2]
dat = read.csv(textConnection(skip_second), header = TRUE, stringsAsFactors = FALSE)
The first step using readLines reads the entire file into a character vector, where each element represents a line in the file. Next, you discard the second line using the fact that negative …
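The same steps as a self-contained sketch. The file contents here are invented: a header line, a units line that should be dropped, and two data rows.

```r
# Hypothetical file: header on line 1, a units row on line 2, then the data.
path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "units,points", "1,90", "2,85"), path)

# Read every line as text, drop the second one, and parse what is left.
all_content <- readLines(path)
skip_second <- all_content[-2]
dat <- read.csv(textConnection(skip_second), header = TRUE,
                stringsAsFactors = FALSE)

str(dat)
```

Because the units row is gone before parsing, the columns are converted to numeric types instead of falling back to character.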
Invalid multibyte string in read.csv
Encoding sets the encoding of a character string. It doesn't set the encoding of the file represented by the character string, which is what you want. This worked for me, after trying "UTF-8": x <- read.csv(url, header=FALSE, stringsAsFactors=FALSE, fileEncoding="latin1") And you may want to skip the first 16 lines, and read in the headers separately. …
‘Incomplete final line’ warning when trying to read a .csv file into R
The message indicates that the last line of the file doesn't end with an End Of Line (EOL) character (linefeed (\n) or carriage return+linefeed (\r\n)). The original intention of this message was to warn you that the file may be incomplete; most data files have an EOL character as the very last character in the file. …
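A small sketch that reproduces the situation with an invented three-line file. It writes the file without a trailing newline, records whether read.csv warns, then appends the missing EOL and reads it again.

```r
# cat() writes the text exactly as given, so this file has no final newline.
path <- tempfile(fileext = ".csv")
cat("a,b\n1,2\n3,4", file = path)

# Record any warning raised while reading, without stopping the read.
warned <- FALSE
dat <- withCallingHandlers(
  read.csv(path),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

# Appending the missing EOL character makes the file complete.
cat("\n", file = path, append = TRUE)
dat_fixed <- read.csv(path)

print(warned)
print(identical(dat, dat_fixed))
```

The data are read in full either way, so the warning is a prompt to check that the file was not cut off, not a sign that rows were lost.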
Specify custom Date format for colClasses argument in read.table/read.csv
You can write your own function that accepts a string and converts it to a Date using the format you want, then use setAs to register it as an as method. Then you can use your function as part of the colClasses. Try:
setAs("character","myDate", function(from) as.Date(from, format="%d/%m/%Y") )
tmp <- c("1, 15/08/2008", "2, 23/05/2010") …
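One way the approach could be completed, as a sketch rather than the original answer's full code: the column names and the `setClass("myDate")` call are assumptions added here (declaring the class first avoids a warning about an undefined class when the coercion method is registered).

```r
library(methods)

# Declare the placeholder class, then register the character -> myDate
# conversion that read.csv will call for columns of class "myDate".
setClass("myDate")
setAs("character", "myDate",
      function(from) as.Date(from, format = "%d/%m/%Y"))

# Hypothetical data with day/month/year dates.
csv_text <- c("id,when", "1,15/08/2008", "2,23/05/2010")
dat <- read.csv(text = csv_text, colClasses = c("integer", "myDate"))

str(dat)
```

The name myDate is only a label for the conversion; the resulting column holds ordinary Date values.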