I have multiple files with hundreds of thousands of records in following format:
2010/08/10 10:07:52|TrainZoom|1393|38797|MyCustomerAPI->,mask = ZPTA,TransId = 1281460071578,testing :-> , capture -> : , IMDB = 9113290830, equipmentName = GMT, technologyName = RMS,,,)|
There are fields separated by pipes and inside a field (last) there are many other fields separated by commas. What I want to 开发者_StackOverflowdo is to store all these values in database. For the above example my DB fields would be date-time, process-name, thread-id, sequence-num, API name, transId, imdb, equipmentName, technologyName.
API Name is not in a fixed format but I can assume that its a fixed length.Please let me know if I can do this in Unix using awk or sed.
Thanks,
Just for fun:
cat file.txt | sed -e 's/\([^|]*\)|\([^|]*\)|\([^|]*\)|\([^|]*\)|/insert into table set date-time=\1, process-name=\2, thread-id=\3, sequence-num=\4, /' -e 's/,[^=]*,/,/g' -e 's/ \([,=]\)/\1/g' -e 's/\([,=]\) /\1/g' -e 's/\([^,]*\)=\([^,]*\),/\1="\2",/g' -e 's/"[^"]*$/";/' | mysql dbname
yes you can do at least the parsing in awk,
awk -F"|" 'BEGIN{q="\047"}
{
printf "insert into table values("
printf q $1 q","q $2,","
# fill up the rest yourself
}
' file
the above produces a bunch of insert statements. You can either pipe them to your database client, or save them to a .sql file, then use the database client to execute it.
I would use perl or python to do the job as they provide DB connectors for different flavor of DBs. I don't see what you want awk or sed to do, as they are "text to text transformers".
You should complete your question to clarify your goal (text => sql script? for example) and target DB if you want some sample code.
You can do the parsing, at least, directly in bash. I'm not sure what you want to do with the values, but this script parses each line and displays the individual fields:
#!/bin/bash
while IFS='|' read dateTime processName threadId sequenceNum other; do
IFS=',' read apiName mask transId testing capture imdb equipmentName technologyName other <<< "$other"
echo "date-time: $dateTime"
echo "process-name: $processName"
echo "thread-id: $threadId"
echo "sequence-num: $sequenceNum"
echo "api-name: $apiName"
echo "trans-id: $transId"
echo "imdb: $imdb"
echo "equipment-name: $equipmentName"
echo "technology-name: $technologyName"
done
精彩评论