2009-06-05

Dodgy delimited files

Very often I see files that are supposedly x-delimited but have x inside the fields. Obviously this is a problem. Sometimes some "kind" soul wraps only the affected values in quotes, instead of selecting a better delimiter or restricting their data (or even quoting every field, or at least every value in that column).

Thanks.

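This awk script turns a comma-delimited file, in which only the comma-containing fields are quoted, into a pipe-delimited file: commas outside quotes become pipes, the quotes are dropped, and commas inside the formerly quoted fields are left alone. OFS has to be set to the empty string; otherwise awk rejoins the pieces with its default space separator and leaves a space where every quote used to be.

awk -F '["]' -v OFS='' '{for (i=1; i <= NF; i+=2) {gsub(",","|",$i)} print }' file.csv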
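To see what the one-liner does, here is a small sketch run against made-up sample rows. The convert wrapper and the data are hypothetical, not from a real file. It splits each line on double quotes, so odd-numbered pieces sit outside quotes and even-numbered pieces sit inside them, and it sets OFS to the empty string so the pieces are glued back together with nothing in between.

```shell
# Hypothetical wrapper around the one-liner; reads the named files or stdin.
convert() {
  # Split on double quotes: odd-numbered pieces are outside quotes,
  # even-numbered pieces are the quoted values. Only the odd pieces
  # get their commas turned into pipes. OFS='' rejoins the pieces
  # without the default space.
  awk -F '"' -v OFS='' '{ for (i = 1; i <= NF; i += 2) gsub(",", "|", $i); print }' "$@"
}

# Made-up sample rows, with quotes only around the values that contain commas.
printf '%s\n' \
  'id,name,city' \
  '1,"Smith, John",Boston' \
  '2,Jane Doe,"Portland, OR"' | convert
# Output:
# id|name|city
# 1|Smith, John|Boston
# 2|Jane Doe|Portland, OR
```

This only works line by line: a quoted value containing a newline throws off the odd/even count, and any pipe already present in the data becomes ambiguous in the output.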
