I am new to R and am having trouble reading a series of strings until I encounter EOF. Not only do I not know how to detect EOF, I also don't know how to read a single whitespace-separated string, which is trivial in every other language I have seen so far. In C, I would simply do:
while (scanf("%s", s) == 1) { /* do something with s */ }
If possible, I would prefer a solution which does not require knowing the maximum length of strings in advance.
Any ideas?
EDIT: I am looking for a solution that does not store all the input in memory, but one that is equivalent, or at least similar, to the C code above.
Here's a way to read one item at a time... It uses the fact that scan has an nmax parameter (and n and nlines - it's actually kind of a mess!).
# First create a sample file to read from...
writeLines(c("Hello world", "and now", "Goodbye"), "foo.txt")
# Use a file connection to read from...
f <- file("foo.txt", "r")
i <- 0L
repeat {
  s <- scan(f, "", nmax=1, quiet=TRUE)
  if (length(s) == 0) break
  i <- i + 1L
  cat("Read item #", i, ": ", s, "\n", sep="")
}
close(f)
When scan encounters EOF, it returns a zero-length vector. So a more obscure but C-like way would be:
while (length(s <- scan(f, "", nmax=1, quiet=TRUE))) {
  i <- i + 1L
  cat("Read item #", i, ": ", s, "\n", sep="")
}
In any case, the output would be:
Read item #1: Hello
Read item #2: world
Read item #3: and
Read item #4: now
Read item #5: Goodbye
Finally, if you can vectorize what you do to the strings, you should probably try to read a bunch of them at a time - just change nmax to, say, 10000.
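As a rough sketch of the chunked variant, the loop below reads up to chunk_size items per call and processes each chunk with vectorized functions. The names path, chunk_size, n_items and total_chars are made up for illustration, and the chunk size is kept tiny so that the three-line sample file exercises more than one iteration; counting characters simply stands in for whatever per-string work you need.

```r
# Recreate the sample file so this sketch is self-contained
path <- tempfile(fileext = ".txt")
writeLines(c("Hello world", "and now", "Goodbye"), path)

f <- file(path, "r")
chunk_size <- 2L      # illustrative only; use something like 10000 for real input
n_items <- 0L         # running count of strings read
total_chars <- 0L     # running total of characters, as a stand-in for real work

repeat {
  # Read at most chunk_size whitespace-separated strings from the open connection
  chunk <- scan(f, "", nmax = chunk_size, quiet = TRUE)
  if (length(chunk) == 0) break          # zero-length result means EOF
  n_items <- n_items + length(chunk)
  total_chars <- total_chars + sum(nchar(chunk))   # vectorized over the chunk
}

close(f)
unlink(path)
```

Only one chunk is held in memory at a time, so memory use is bounded by chunk_size rather than by the size of the input.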
> txt <- "This is an example" # could be from a file but will use textConnection()
> read.table(textConnection(txt))
    V1 V2 V3      V4
1 This is an example
read.table is implemented with scan, so you can just look at the code to see how the experts did it.
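If you only need the strings and not a data frame, you can also call scan on the text connection yourself. This is a minimal sketch reusing the txt string from above; the names con and words are just for illustration.

```r
txt <- "This is an example"

# Open a read-only text connection over the string and scan it as character data
con <- textConnection(txt)
words <- scan(con, what = "", quiet = TRUE)
close(con)

print(words)
```

Here words is a plain character vector with one element per whitespace-separated item.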