I have a nice list, which looks like this:
tmp = NULL
t = NULL
tmp$resultitem$count = "1057230"
tmp$resultitem$status = "Ok"
tmp$resultitem$menu = "PubMed"
tmp$resultitem$dbname = "pubmed"
t$resultitem$count = "305215"
t$resultitem$status = "Ok"
t$resultitem$menu = "PMC"
t$resultitem$dbname = "pmc"
tmp = c(tmp, t)
t = NULL
t$resultitem$count = "1"
t$resultitem$status = "Ok"
t$resultitem$menu = "Journals"
t$resultitem$dbname = "journals"
tmp = c(tmp, t)
Which produces:
> str(tmp)
List of 3
$ resultitem:List of 4
..$ count : chr "1057230"
..$ status: chr "Ok"
..$ menu : chr "PubMed"
..$ dbname: chr "pubmed"
$ resultitem:List of 4
..$ count : chr "305215"
..$ status: chr "Ok"
..$ menu : chr "PMC"
..$ dbname: chr "pmc"
$ resultitem:List of 4
..$ count : chr "1"
..$ status: chr "Ok"
..$ menu : chr "Journals"
..$ dbname: chr "journals"
Now I want to search through the elements of each resultitem. For example, I want to know the dbname of every database that has a count of less than 10.
In this case it is very easy, as this list only has 3 elements, but the real list is a little bit longer.
This could easily be done with a for loop. But is there a way to do it with some other R function (like rapply)? My problem with the apply functions is that they only look at one element at a time.
If I do a grep to get all dbname elements, I cannot get the count of each element.
rapply(tmp, function(x) paste("Content: ", x))[grep("dbname", names(rapply(tmp, c)))]
Does anyone have a better idea than a for loop?
R generally wants to handle these things as data.frames, so I think your best bet is to turn your list into one (or even make a data.frame instead of a list to begin with, unless you need it to be in list form).
x <- do.call(rbind,tmp)
dat <- data.frame(x)
dat$count <- as.numeric(dat$count)
> dat
count status menu dbname
1 1057230 Ok PubMed pubmed
2 305215 Ok PMC pmc
3 1 Ok Journals journals
and then to get your answer(s) you can use normal data.frame subsetting operations:
> dat$dbname[dat$count<10]
$resultitem
[1] "journals"
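As the output above shows, do.call(rbind, tmp) leaves each column as a list, so the result is still wrapped in $resultitem. A minimal sketch of one alternative, assuming every resultitem has the same four fields, is to turn each item into a one-row data.frame first so the columns come out as plain vectors. The variable names rows and low are only illustrative.

```r
# Rebuild the list from the question.
tmp <- list(
  resultitem = list(count = "1057230", status = "Ok", menu = "PubMed", dbname = "pubmed"),
  resultitem = list(count = "305215", status = "Ok", menu = "PMC", dbname = "pmc"),
  resultitem = list(count = "1", status = "Ok", menu = "Journals", dbname = "journals")
)

# Convert each resultitem to a one-row data.frame, then stack the rows.
# unname() drops the repeated "resultitem" names before they reach rbind.
rows <- lapply(unname(tmp), as.data.frame, stringsAsFactors = FALSE)
dat <- do.call(rbind, rows)

# count arrives as character, so convert it before comparing.
dat$count <- as.numeric(dat$count)

# Plain character vector of the databases with fewer than 10 hits.
low <- dat$dbname[dat$count < 10]
print(low)
```

With atomic columns, the usual subsetting, ordering and merging operations on dat behave as expected.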
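If you're absolutely insistent that you must do this in a list, the following will work for the present case. Note that count is stored as a character string, so it has to be converted before the comparison.
x <- tmp[sapply(tmp, function(x) as.numeric(x$count) < 10)]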
str(x)
(the list items you wanted)
More generally, if you would like to actually use ragged lists in this way, you could use the same code but check for the presence of the item first:
testForCount <- function(x) {if ('count' %in% names(x)) as.numeric(x$count) < 10 else FALSE}
tmp[sapply(tmp, testForCount)]
This will work for your cases where the lists are not the same length as well as the present case. (I still think you should be using data frames for both speed and sensible representation of the data).
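If only the dbname values are needed rather than the whole sub-lists, the same idea can be sketched with vapply. This is an illustration, not the only way to do it: the fourth item without a count is a made-up entry to show the ragged case, and has_low_count and its limit argument are hypothetical names.

```r
# The list from the question, plus one hypothetical item that has no count.
tmp <- list(
  resultitem = list(count = "1057230", status = "Ok", menu = "PubMed", dbname = "pubmed"),
  resultitem = list(count = "305215", status = "Ok", menu = "PMC", dbname = "pmc"),
  resultitem = list(count = "1", status = "Ok", menu = "Journals", dbname = "journals"),
  resultitem = list(status = "Ok", menu = "NoCount", dbname = "nocount")
)

# TRUE only when the item has a count and that count is below the limit.
# [[ ]] is used instead of $ to avoid partial name matching.
has_low_count <- function(x, limit = 10) {
  !is.null(x[["count"]]) && as.numeric(x[["count"]]) < limit
}

# One logical value per item, then pull dbname out of the items that matched.
keep <- vapply(tmp, has_low_count, logical(1))
low <- vapply(tmp[keep], function(x) x[["dbname"]], character(1), USE.NAMES = FALSE)
print(low)
```

vapply is used instead of sapply here so that an unexpected item (for example, one with two dbname entries) raises an error instead of silently changing the shape of the result.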
It looks like your list comes from an XML structure. It is easier to navigate to what you want with XPath, using the getNodeSet function in the XML package, which returns the matching nodes as a node set.
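A rough sketch of the XPath approach follows. The XML document here is an assumption: it is reconstructed from the field names in the list, and the <result> root element is invented, so the real file may be laid out differently and the path would need adjusting.

```r
library(XML)

# Assumed XML layout, guessed from the list in the question.
xml_text <- '<result>
  <resultitem><count>1057230</count><status>Ok</status><menu>PubMed</menu><dbname>pubmed</dbname></resultitem>
  <resultitem><count>305215</count><status>Ok</status><menu>PMC</menu><dbname>pmc</dbname></resultitem>
  <resultitem><count>1</count><status>Ok</status><menu>Journals</menu><dbname>journals</dbname></resultitem>
</result>'

doc <- xmlParse(xml_text, asText = TRUE)

# Select the dbname of every resultitem whose count is below 10.
path <- "//resultitem[number(count) < 10]/dbname"
nodes <- getNodeSet(doc, path)            # list of matching nodes
low <- xpathSApply(doc, path, xmlValue)   # their text content as a character vector
print(low)
```

The filtering happens inside the XPath predicate, so no intermediate list or loop is needed on the R side.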