Fast assessment of corrupted Affymetrix CEL files

Developer https://www.devze.com 2022-12-12 22:28 Source: Internet

I'm trying to normalize a large number of Affymetrix CEL files using R. However, some of them appear to be truncated, so when reading them I get the error:

Cel file xxx does not seem to have the correct dimensions

and the normalization stops. Manually removing each corrupted file and restarting every time would take very long. Do you know of a fast way (in R or with another tool) to detect corrupted files?

PS: I'm 99.99% sure I'm normalizing together CELs from the same platform; it's really just truncated files :-)

One simple suggestion:

Can you wrap your read.table call (or whichever read command you're using) in a tryCatch block? Then you can skip a file whenever that error occurs. You can also collect the names of the corrupted files inside the error handler. I recommend doing that, so you have a record of them for future reference when running a big batch process like this. Here's a rough sketch:

# Assumes `files` is a character vector of file paths.
corrupted.files <- character(0)
for (f in files) {
    x <- tryCatch(
        read.table(file = f),
        error = function(e) {
            # Skip only the known truncation error; re-raise anything else.
            if (!grepl("correct dimensions", conditionMessage(e))) stop(e)
            corrupted.files <<- c(corrupted.files, f)
            NULL
        },
        finally = print(paste("finished with", f, "at", Sys.time())))
    if (is.null(x)) next
    # do something with the uncorrupted data
}
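
If the normalization step reads all files in a single call, a per-file loop like the one above may not fit directly. One possible variation is to screen the files first and then pass only the readable ones on. The sketch below is a minimal illustration under stated assumptions: `is_readable` and `screen_files` are hypothetical helper names, and `read.table` on small text files stands in for the real CEL reader. For actual CEL files a reader such as `affyio::read.celfile` could presumably be passed instead, though whether it raises an error on every truncated file has not been verified here.

```r
# Hypothetical helper: TRUE if `reader` loads `path` without raising an error.
is_readable <- function(path, reader) {
  tryCatch({
    reader(path)
    TRUE
  }, error = function(e) FALSE)
}

# Hypothetical helper: split `paths` into readable and unreadable files.
screen_files <- function(paths, reader) {
  ok <- vapply(paths, is_readable, logical(1), reader = reader,
               USE.NAMES = FALSE)
  list(good = paths[ok], bad = paths[!ok])
}

# Demo: two small text files; the second has a short last row,
# standing in for a truncated file.
good <- tempfile(fileext = ".txt")
bad  <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6"), good)
writeLines(c("1 2 3", "4 5"), bad)

result <- screen_files(c(good, bad), read.table)
print(basename(result$bad))  # name of the file that failed to load
```

Unlike the loop above, this version treats every error as a sign of corruption, so it is worth inspecting `result$bad` before discarding anything. It also reads each file twice (once to screen, once to normalize), which trades some speed for not having to restart the batch.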