Sort CSV based on specific column?

I'm sure I did this in the past, and there is something small that I forget, but how can I sort the CSV file in a specific column? I'm interested in answers with and without third-party Perl modules. Mostly methods without, since I do not always have access to install additional modules.

Sample data:

  name, 25, female
 name, 24, male
 name, 27, female
 name, 21, male 

Desired end result after sorting by second numeric column:

  name, 21, male
 name, 24, male
 name, 25, female
 name, 27, female 
+3
source share
6 answers

Since CSV is a rather complicated format, it is better to use a module that does this work for us.

The following is an example of using the Text :: CSV module:

#!/usr/bin/env perl use strict; use warnings; use constant AGE => 1; use Text::CSV; my $csv = Text::CSV->new(); my @rows; while ( my $row_ref = $csv->getline( \*DATA ) ) { push @rows, $row_ref; } @rows = sort { $a->[AGE] <=> $b->[AGE] } @rows; for my $row_ref (@rows) { $csv->combine(@$row_ref); print $csv->string(), "\n"; } __DATA__ name,25,female name,24,male name,27,female name,21,male 
+11
source

In the spirit of the fact that there is always another way to do this, keep in mind that a simple old GNU look might be enough.

 $ sort -t, -k2 -n unsorted.txt name,21,male name,24,male name,25,female name,27,female 

If the command line arguments are:

 -t, # use comma as the record separator -k2 # sort on the second key (record) in the line -n # sort using numerical comparison (like using <=> instead of cmp in perl) 

If you need a Perl solution, wrap it in qx (); -)

+7
source

There is also DBD :: CSV :

 #!/usr/bin/perl use strict; use warnings; use DBI; my $dbh = DBI->connect('dbi:CSV:', undef, undef, { RaiseError => 1, f_ext => '.csv', csv_tables => { test => { col_names => [qw' name age sex '] } }, }); my $sth = $dbh->prepare(q{ SELECT name, age, sex FROM test ORDER BY age }); $sth->execute; while ( my @row = $sth->fetchrow_array ) { print join(',' => @row), "\n"; } $sth->finish; $dbh->disconnect; 

Output:

  name, 21, male
 name, 24, male
 name, 25, female
 name, 27, female 
+6
source

The original poster did not request third-party modules (that I can not say anything from CPAN). Although this is a limitation that will terribly limit your ability to write good modern Perl code, in this case you can use the (core) Text :: ParseWords module instead of the (non-main) text :: CSV. So, borrowing to a large extent from the example of Alan, we get:

 #!/usr/bin/env perl use strict; use warnings; use Text::ParseWords; my @rows; while (<DATA>) { push @rows, [ parse_line(',', 0, $_) ]; } @rows = sort { $a->[1] <=> $b->[1] } @rows; foreach (@rows) { print join ',', @$_; } __DATA__ name,25,female name,24,male name,27,female name,21,male 
+3
source

When you provide your own comparison code, you can sort anything. Just extract the element you want with a regular expression, or perhaps a split in this case, and then compare it. If you have many elements, I will analyze the data in the list of lists, and then the comparison code can access it without parsing. This would eliminate the parsing of the same line many times compared to other lines.

0
source

I would do something like this:

 #!/usr/bin/perl use warnings; use strict; my @rows = map { chomp; [split /[,\s]+/, $_] } <DATA>; #read each row into an array my @sorted = sort { $a->[1] <=> $b->[1] } @rows; # sort the rows (numerically) by second column for (@sorted) { print join(', ', @$_) . "\n"; # print them out as CSV } __DATA__ name,25,female name,24,male name,27,female name,21,male 
-2
source

Source: https://habr.com/ru/post/1398850/


All Articles