I have a CSV file with data like this:
Zoos, Sanctuaries & Animal Parks,7469,3.00
Unfortunately this is not correct, as the first section should be all one field, like this:
"Zoos, Sanctuaries & Animal Parks","7469","3.00"
As this is just a one-off import, I would be happy to transform it to
Zoos, Sanctuaries & Animal Parks|7469|3.00
with the last and second-to-last commas converted to pipes. Is there an easy way to do this with regex?
To convert the commas before the last two items to pipes, you could do this:
>>> re.sub(r",(\d+),([\d.]+)$", r"|\1|\2", "Zoos, Sanctuaries & Animal Parks,7469,3.00")
'Zoos, Sanctuaries & Animal Parks|7469|3.00'
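To apply the same substitution to every line rather than to one string, a minimal sketch might look like the following. The function name `commas_to_pipes` and the sample rows are made up for illustration; only the pattern comes from the answer above.

```python
import re

# Same pattern as above: an integer field followed by a numeric field
# (digits and dots) at the end of the line.
LAST_TWO = re.compile(r",(\d+),([\d.]+)$")

def commas_to_pipes(line: str) -> str:
    """Replace the commas before the two trailing numeric fields with pipes."""
    # Strip the newline first so that $ anchors at the last field.
    return LAST_TWO.sub(r"|\1|\2", line.rstrip("\n"))

# Hypothetical sample rows; in practice these would come from the file.
sample = [
    "Zoos, Sanctuaries & Animal Parks,7469,3.00",
    "a,100,2000",
]
for row in sample:
    print(commas_to_pipes(row))
```

Note that this pattern only matches when the last two fields are purely numeric with no surrounding spaces, so a row such as `a,b and c, 100,300` would be left unchanged.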
Something like this should work:
s/(\S),(\S)/\1|\2/g
(Replaces every comma that is surrounded on both sides by non-space characters with a pipe. This relies on the commas inside the first field always being followed by a space.)
You can convert to pipes this way. Just feed your text through this command:
sed 's/,\([^,]*\),\([^,]*\)$/|\1|\2/'
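The same "last two commas" idea can also be sketched without a regex, assuming Python is available. This is not what the sed command does internally, just an equivalent approach using `str.rsplit`; the function name is hypothetical.

```python
def last_two_commas_to_pipes(line: str) -> str:
    """Turn only the last two commas of a line into pipes."""
    # rsplit with maxsplit=2 splits from the right, at most twice,
    # so any commas inside the first field are left alone.
    parts = line.rstrip("\n").rsplit(",", 2)
    if len(parts) != 3:
        # Fewer than two commas: leave the line as it is.
        return line.rstrip("\n")
    return "|".join(parts)

print(last_two_commas_to_pipes("Zoos, Sanctuaries & Animal Parks,7469,3.00"))
print(last_two_commas_to_pipes("a,b and c, 100,300"))
```

Unlike the digit-based pattern, this does not care what the last two fields contain, so it also handles values with leading spaces.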
$ cat test.csv
Zoos, Sanctuaries & Animal Parks,7469,3.00
a,100,2000
a,b and c, 100,300
$ perl -pe 's/^(.*),(.*),(.*)$/$1|$2|$3/' test.csv
Zoos, Sanctuaries & Animal Parks|7469|3.00
a|100|2000
a,b and c| 100|300
To convert the last two commas into pipes:
Replace ^(.*?),([^,]*?),([^,]*?)$
with $1|$2|$3
Or, even better, to convert them into the correct format:
Replace ^(.*?),([^,]*?),([^,]*?)$
with "$1","$2","$3"
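If the goal is the properly quoted format, one option is to let a CSV library do the quoting instead of writing the quotes by hand. Below is a rough sketch using Python's standard `csv` module; the `requote` helper is hypothetical and assumes every line has exactly three logical fields, with the last two never containing commas.

```python
import csv
import io

def requote(lines):
    """Re-emit lines as CSV with every field quoted."""
    out = io.StringIO()
    # QUOTE_ALL wraps every field in double quotes.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for line in lines:
        # Split at the last two commas only, keeping the first field whole.
        writer.writerow(line.rstrip("\n").rsplit(",", 2))
    return out.getvalue()

print(requote(["Zoos, Sanctuaries & Animal Parks,7469,3.00"]), end="")
```

An advantage over the plain replacement is that the writer also escapes any double quotes that happen to appear inside a field.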