I am storing a number of individually serialized PHP arrays to a file. Each line of the file contains one serialized array. For example:
a:2:{s:4:"name";s:8:"John Doe";s:3:"age";s:2:"20";}
a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
a:2:{s:4:"name";s:12:"Steven Tyler";s:3:"age";s:2:"35";}
a:2:{s:4:"name";s:12:"Jim Morrison";s:3:"age";s:2:"25";}
a:2:{s:4:"name";s:13:"Apple Paltrow";s:3:"age";s:2:"75";}
a:2:{s:4:"name";s:12:"Drew Nickels";s:3:"age";s:2:"34";}
a:2:{s:4:"name";s:11:"Jason Proop";s:3:"age";s:2:"36";}
Here is my question:
Is it possible to grep this file for the following pattern: "name"*"*"?
Afterwards, I would like to sort the lines that are found based on the contents of the second wildcard.
I'm not sure where grepping comes into this, as all your lines seem to match the pattern. But anyway, you can use sort on its own to sort your sample input:
sort -t\" -k4 data.txt
This ignores the "real" structure of the text and just treats " as a delimiter, so it's quick and dirty, but it sorts how you want. Here it is in action:
http://ideone.com/ZugIX
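To see why field 4 is the right sort key, it can help to print just that field with cut, using the same delimiter. This is only a sketch: the temporary file is a stand-in for data.txt and holds three of the sample lines.

```shell
# Hypothetical stand-in for data.txt, holding three of the sample lines.
data=$(mktemp)
cat > "$data" <<'EOF'
a:2:{s:4:"name";s:8:"John Doe";s:3:"age";s:2:"20";}
a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
a:2:{s:4:"name";s:13:"Apple Paltrow";s:3:"age";s:2:"75";}
EOF

# Split each line on " and print the 4th field, which is the name value.
keys=$(cut -d'"' -f4 "$data")
printf '%s\n' "$keys"

# sort uses the same delimiter; -k4 compares from field 4 to the end of the line.
sorted=$(sort -t\" -k4 "$data")
printf '%s\n' "$sorted"
```

Note that this only holds while "name" is the first key on every line; if the key order varies, field 4 will be something else.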
If you do need to grep for "name".*".*", you can just do that first and pipe the output to the sort command.
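A rough sketch of that grep-then-sort pipeline follows. The temporary file stands in for data.txt, and the middle line (an array with no "name" key) is made up here purely to show that grep filters it out.

```shell
# Hypothetical stand-in for data.txt; the second line is invented for illustration.
data=$(mktemp)
cat > "$data" <<'EOF'
a:2:{s:4:"name";s:12:"Steven Tyler";s:3:"age";s:2:"35";}
a:1:{s:3:"age";s:2:"50";}
a:2:{s:4:"name";s:12:"Drew Nickels";s:3:"age";s:2:"34";}
EOF

# Keep only lines containing "name" followed by another quoted string,
# then sort the survivors on the 4th "-delimited field.
result=$(grep '"name".*".*"' "$data" | sort -t\" -k4)
printf '%s\n' "$result"
```

The single quotes around the pattern keep the shell from touching the double quotes and the asterisks before grep sees them.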
Here is how you can sort your lines based on the name. I've broken down the steps so you can see the intermediate output.
> cat data.txt
a:2:{s:4:"name";s:8:"John Doe";s:3:"age";s:2:"20";}
a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
a:2:{s:4:"name";s:12:"Steven Tyler";s:3:"age";s:2:"35";}
a:2:{s:4:"name";s:12:"Jim Morrison";s:3:"age";s:2:"25";}
a:2:{s:4:"name";s:13:"Apple Paltrow";s:3:"age";s:2:"75";}
a:2:{s:4:"name";s:12:"Drew Nickels";s:3:"age";s:2:"34";}
a:2:{s:4:"name";s:11:"Jason Proop";s:3:"age";s:2:"36";}
Now, we'll use the 'sed' command to extract the name using a regex. We then output the name, a tab, then the original line so we can sort it:
> cat data.txt | sed -rn 's/[^"]+"name";s:[0-9]+:"([^"]+)".*/\1\t\0/p'
John Doe a:2:{s:4:"name";s:8:"John Doe";s:3:"age";s:2:"20";}
Jane Doe a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
Steven Tyler a:2:{s:4:"name";s:12:"Steven Tyler";s:3:"age";s:2:"35";}
Jim Morrison a:2:{s:4:"name";s:12:"Jim Morrison";s:3:"age";s:2:"25";}
Apple Paltrow a:2:{s:4:"name";s:13:"Apple Paltrow";s:3:"age";s:2:"75";}
Drew Nickels a:2:{s:4:"name";s:12:"Drew Nickels";s:3:"age";s:2:"34";}
Jason Proop a:2:{s:4:"name";s:11:"Jason Proop";s:3:"age";s:2:"36";}
This sed command requires the 'name' value to be the first quoted string on the line. If you can't guarantee that, you should probably implement this step with a PHP script and deserialize the data using the native PHP functions. If 'name' is not present, or is not the first quoted string on the line, the line will be skipped. For more information on sed, there are many resources online.
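If the key order might vary but you still want to stay in the shell, one possible variation (not the command above, just a sketch) is to let the pattern find "name" anywhere on the line by starting it with .* instead of [^"]+. The sample line with "age" first is made up for illustration, and the tab is passed in through a shell variable so the command does not depend on sed understanding \t.

```shell
# Hypothetical stand-in for data.txt; the first line has "age" before "name".
data=$(mktemp)
cat > "$data" <<'EOF'
a:2:{s:3:"age";s:2:"41";s:4:"name";s:8:"Zed Last";}
a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
EOF

# A literal tab character, used to separate the extracted name from the line.
tab=$(printf '\t')

# Prefix each line with its name value (wherever "name" appears), sort on that
# prefix, then cut the prefix off again. & is the whole matched line.
result=$(sed -En 's/.*"name";s:[0-9]+:"([^"]+)".*/\1'"$tab"'&/p' "$data" | sort | cut -f2)
printf '%s\n' "$result"
```

This is still pattern matching rather than real parsing: a name containing a double quote, or a value that happens to be the string "name", would confuse it, so deserializing in PHP remains the safer route.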
Now that the names are first on the line, we can use the normal unix sort command to sort them:
> cat data.txt | sed -rn 's/[^"]+"name";s:[0-9]+:"([^"]+)".*/\1\t\0/p' | sort
Apple Paltrow a:2:{s:4:"name";s:13:"Apple Paltrow";s:3:"age";s:2:"75";}
Drew Nickels a:2:{s:4:"name";s:12:"Drew Nickels";s:3:"age";s:2:"34";}
Jane Doe a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
Jason Proop a:2:{s:4:"name";s:11:"Jason Proop";s:3:"age";s:2:"36";}
Jim Morrison a:2:{s:4:"name";s:12:"Jim Morrison";s:3:"age";s:2:"25";}
John Doe a:2:{s:4:"name";s:8:"John Doe";s:3:"age";s:2:"20";}
Steven Tyler a:2:{s:4:"name";s:12:"Steven Tyler";s:3:"age";s:2:"35";}
Now that we've got our lines sorted, we just need to get rid of the plain names at the front of the lines:
> cat data.txt | sed -rn 's/[^"]+"name";s:[0-9]+:"([^"]+)".*/\1\t\0/p' | sort | cut -f2
a:2:{s:4:"name";s:13:"Apple Paltrow";s:3:"age";s:2:"75";}
a:2:{s:4:"name";s:12:"Drew Nickels";s:3:"age";s:2:"34";}
a:2:{s:4:"name";s:8:"Jane Doe";s:3:"age";s:2:"15";}
a:2:{s:4:"name";s:11:"Jason Proop";s:3:"age";s:2:"36";}
a:2:{s:4:"name";s:12:"Jim Morrison";s:3:"age";s:2:"25";}
a:2:{s:4:"name";s:8:"John Doe";s:3:"age";s:2:"20";}
a:2:{s:4:"name";s:12:"Steven Tyler";s:3:"age";s:2:"35";}
Enjoy!