Processing CSV file without double quotes

In other words, I'm looking for a way to ignore "," in one of the fields.

A field should be treated as a single field even though it contains a comma.

Example:

Round,Winner,place,prize
1,xyz,1,$4,500

If I read this with a DictReader, $4,500 is printed as $4, because 500 is considered a separate field. This makes sense, since I am reading the file as comma-delimited, so I can't complain, but I'm trying to find a workaround.

reader = csv.reader(f, delimiter=',', quotechar='"')

My source data is not enclosed in double quotes, so I cannot get the comma ignored by specifying a quote character.
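
To make the behaviour concrete, here is a minimal sketch that reproduces it. It assumes the sample rows above are held in an in-memory string rather than a real file; the variable names are only for illustration.

```python
import csv
import io

# Sample data from the example above, held in memory instead of a file
# (an assumption made so the sketch is self-contained).
data = "Round,Winner,place,prize\n1,xyz,1,$4,500\n"

reader = csv.DictReader(io.StringIO(data), delimiter=',', quotechar='"')
row = next(reader)

# The unquoted comma splits the amount in two: 'prize' keeps only the
# first part, and the surplus value lands in a list under the key None,
# which is DictReader's default restkey.
prize = row['prize']
extra = row[None]
print(prize, extra)
```

So the "500" is not lost, it just ends up outside the prize column.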

Is there any other way to handle this scenario? Perhaps something like declaring which fields are dollar amounts and having the reader ignore commas inside them? Or adding quotes around this field?

If not Python, can a shell script or Perl be used?

2 answers

Perhaps pre-process the data to wrap every money amount in double quotes, and then process it normally:

$line =~ s/( \$\d+ (?:,\d{3})* (?:\.\d{2})? )/"$1"/gx;

The pattern matches a number following $, optionally followed by any number of ,nnn groups and/or by one .nn. It also wraps $4.22 as well as $100, which I find good for consistency. Restrict what gets picked up if necessary, for example to (\$\d{1,3},\d{3}). With fractional cents, drop the {2}. This does not cover all possible edge or broken cases.

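The /g modifier replaces every match on the line, and /x allows whitespace inside the pattern for readability.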
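
To see which amounts each variant picks up, here is a small Python sketch. It assumes the two patterns carry over to Python's re module unchanged; the names GENERAL and RESTRICTED and the sample amounts are made up for illustration.

```python
import re

# The general pattern from the answer and the stricter variant it mentions.
GENERAL = re.compile(r'\$\d+(?:,\d{3})*(?:\.\d{2})?')
RESTRICTED = re.compile(r'\$\d{1,3},\d{3}')

# Hypothetical sample amounts.
samples = ['$4,500', '$400', '$4.22', '$1,234,567.89']

# fullmatch() requires the whole string to match, so partial hits do not count.
general_hits = [GENERAL.fullmatch(s) is not None for s in samples]
restricted_hits = [RESTRICTED.fullmatch(s) is not None for s in samples]

for s, g, r in zip(samples, general_hits, restricted_hits):
    print(s, 'general:', g, 'restricted:', r)
```

The general pattern accepts all four samples, while the restricted one accepts only $4,500 as a whole amount.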

As a one-liner:

perl -pe 's/(\$\d+(?:,\d{3})*(?:\.\d{2})?)/"$1"/g' input.csv > changed.csv

Add the -i switch to change the input file in place, or -i.bak to keep a backup as well.


Or, in a script:

use warnings;
use strict;

my $file = '...';
my $fout = '...';

open my $fh,     '<', $file or die "Can't open $file: $!";
open my $fh_out, '>', $fout or die "Can't open $fout for writing: $!";

while (my $line = <$fh>) {
    $line =~ s/( \$\d+ (?:,\d{3})* (?:\.\d{2})? )/"$1"/gx;
    print $fh_out $line;
}

close $fh;
close $fh_out;
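
Since the question starts from Python, the same pre-processing idea can probably be done there without a temporary file. This is only a sketch: quote_money is a hypothetical helper, and the in-memory source stands in for the real input file.

```python
import csv
import io
import re

# Same pattern as the Perl substitution above, with one capture group.
MONEY = re.compile(r'(\$\d+(?:,\d{3})*(?:\.\d{2})?)')

def quote_money(lines):
    """Yield each line with dollar amounts wrapped in double quotes."""
    for line in lines:
        yield MONEY.sub(r'"\1"', line)

# In-memory stand-in for the real input file (an assumption for this sketch).
source = io.StringIO(
    "Round,Winner,place,prize\n"
    "1,xyz,1,$4,500\n"
    "2,abc,3,$400\n"
)

# csv.DictReader accepts any iterable of lines, so the generator can be
# passed in directly; the default quotechar '"' then keeps the amount whole.
rows = list(csv.DictReader(quote_money(source)))
prizes = [row['prize'] for row in rows]
print(prizes)
```

With a real file, an open file object would take the place of source.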

If the extra commas only ever occur in the last field, this can be done in Bash:

#!/bin/bash

while IFS=, read -r f1 f2 f3 f4; do
   # f4 => has everything after f3, including extra commas as in $4,500
   # do your processing
   printf 'f1=[%s] f2=[%s] f3=[%s] f4=[%s]\n' "$f1" "$f2" "$f3" "$f4"
done < input.txt

Input:

1,xyz,1,$4,500
2,abc,3,$400

Output:

f1=[1] f2=[xyz] f3=[1] f4=[$4,500]
f1=[2] f2=[abc] f3=[3] f4=[$400]
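
The same trick should work in Python by capping the number of splits, so the last field absorbs whatever commas remain. A minimal sketch, assuming four fields per row and that only the last one can contain commas; split_last_field is a hypothetical helper name.

```python
def split_last_field(line, n_fields=4):
    """Split on commas, letting the last field keep any extra commas."""
    # maxsplit = n_fields - 1 stops splitting once the last field is reached.
    return line.rstrip('\n').split(',', n_fields - 1)

# The two input rows from the Bash example.
lines = ["1,xyz,1,$4,500\n", "2,abc,3,$400\n"]
records = [split_last_field(line) for line in lines]

for record in records:
    print(record)
```

Like the Bash loop, this breaks down if a field other than the last one contains a comma.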

Source: https://habr.com/ru/post/1668035/

