When I take a .tsv file output by Excel on a Mac, zip it, send it to a Linux machine, and unzip it using the unzip command, I get a bunch of junk on the end of the file. The file has 19 rows of data. I zip it with the default "Compress" function from the right-click menu in Finder and upload it through PHP. Here is the command I run (manually or automatically from the script) on the zip file:
unzip -aajp {zipfile} > {newfile}
When I open the {newfile} I see all of this on the end of the file:
^@^E^V^G^@^B^@^@Mac OS X ^@^B^@^@^@ ^@^@^@2^@^@^@ ^@^@^@^B^@^@^@R^@^@^@^@TEXTXCEL^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
Is there any way to get rid of the junk on the end of the file?
When I run:
unzip -aaj {zipfile}
It will unzip the file, converting it to text/plain without the junk just fine. But then within my PHP script, I need to be able to get the exact name/location of the file.
I am open to doing this either way. I just cannot seem to find the correct solution. That being said, it needs to work for a file coming from Windows as well. Any ideas?
UPDATE:
Here is what I ended up doing, but it still feels sloppy. I am still open to a better solution.
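// $new_file is the destination path; it was undefined inside the original function,
// so it is passed in as a parameter here
function decompress($filename, $new_file) {
    // Generate a temporary directory name (unzip -d creates the directory)
    $tmpdir = '/tmp/' . mt_rand();
    // Decompress $filename, keep only the lines that name extracted files,
    // and drop the __MACOSX resource-fork entries
    $command = 'unzip -aao ' . escapeshellarg($filename) . ' -d ' . escapeshellarg($tmpdir . '/')
        . ' | egrep "(inflating:|extracting:)" | grep -v MACOS';
    // exec() returns the last line of output; $unzipstatus is the integer exit
    // status of the last command in the pipeline (the grep), not an array
    $unzip_output = exec($command, $dummy, $unzipstatus);
    // If things were unzipped properly
    if ($unzipstatus == 0) {
        preg_match('/\s*(inflating:|extracting:)(.*)$/', $unzip_output, $matches);
        $work_plain_file = trim($matches[2]);
        // Replace spaces in the extracted path with underscores
        $clean_name = str_replace(' ', '_', $work_plain_file);
        if ($clean_name != $work_plain_file) {
            exec('mv ' . escapeshellarg($work_plain_file) . ' ' . escapeshellarg($clean_name));
            $work_plain_file = $clean_name;
        }
        rename($work_plain_file, $new_file);
    }
}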
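unzip is dumb when it comes to the resource fork. The junk is the AppleDouble copy of the resource fork, which Finder's "Compress" stores in the archive under __MACOSX/. You must tell unzip to ignore anything it finds there (and any .DS_Store files).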