Using a CSV file as a data pool for sed substitutions

I have a large amount of text on which I would like to perform bulk substitutions with sed, using a CSV file as the lookup table. For example, suppose I create a CSV file that looks like this:

    bird,snake
    tree,bush
    river,stream

I then want sed to search my text for the values in column 1 and replace them with the corresponding values in column 2. Is this best done with a bash script, or would I have more success with a Perl script?

2 answers

Use Perl. Read the CSV file into a hash, create a regular expression from the hash keys and perform global substitution in the text using the hash for translation.

It looks like this:

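    use strict;
    use warnings;
    use 5.010;
    use autodie;

    my $str = <<'__END_TEXT__';
    The ripple-necked bird sang melodies by the curling river while the hooded tiger glowered in the tree beneath her, just out of reach.
    __END_TEXT__

    # Read the CSV into a hash: column 1 => column 2
    open my $fh, '<', 'words.csv';
    my %patterns = map { chomp; split /,/, $_, 2; } <$fh>;

    # Build one alternation, longest keys first so longer words win;
    # quotemeta keeps any regex metacharacters in the keys literal
    my $re = join '|', map { quotemeta } sort { length $b <=> length $a } keys %patterns;

    # Replace every whole-word match with its translation
    $str =~ s/\b($re)\b/$patterns{$1}/g;

    say $str;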

Output

    The ripple-necked snake sang melodies by the curling stream while the hooded tiger glowered in the bush beneath her, just out of reach.

This is probably best done by having one sed script convert the mapping file into a second sed script, which is then applied to the data being converted. Since you mention bash, I assume you have process substitution available. If you do not, either upgrade bash or use temporary files.

    # -i.bak (no space) is accepted by both GNU and BSD sed; '-i .bak' is BSD-only
    sed -i.bak -f <(sed 's%^ *\([^ ,]\{1,\}\), *\([^ ]\{1,\}\) *$%s/\1/\2/g%' \
                    control-file) \
        datafile-1 datafile-2 ...

The regular expression is rather complicated because the control data shown in the question has leading spaces, possibly trailing spaces, and a comma as the field separator. If the data in the control file were formatted more conventionally:

    bird,snake
    tree,bush
    river,stream

the code could be simpler:

    # -i.bak (no space) is accepted by both GNU and BSD sed; '-i .bak' is BSD-only
    sed -i.bak -f <(sed 's%\([^,]*\),\(.*\)%s/\1/\2/g%' control-file) \
        datafile-1 datafile-2 ...
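To make the two-stage idea easier to follow, here is a minimal sketch that runs the simpler form end to end and prints the intermediate script. It is an illustration under assumptions, not part of the original answer: the sample sentence and the temporary directory are made up, the file names are placeholders, and the result goes to standard output instead of being edited in place with -i.

```shell
#!/usr/bin/env bash
# Sketch only: file names are hypothetical and live in a temp directory.
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Mapping file: column 1 is the text to find, column 2 its replacement.
printf '%s\n' 'bird,snake' 'tree,bush' 'river,stream' > "$workdir/control-file"

# Sample text to convert (made up for this illustration).
printf '%s\n' 'The bird sat in the tree by the river.' > "$workdir/datafile-1"

# Step 1: turn each "old,new" row into an "s/old/new/g" command.
generated=$(sed 's%\([^,]*\),\(.*\)%s/\1/\2/g%' "$workdir/control-file")
printf 'Generated script:\n%s\n' "$generated"

# Step 2: feed the generated script to a second sed via process substitution.
result=$(sed -f <(printf '%s\n' "$generated") "$workdir/datafile-1")
printf 'Result:\n%s\n' "$result"
```

The generated script is three lines, s/bird/snake/g, s/tree/bush/g and s/river/stream/g, and the sample sentence becomes "The snake sat in the bush by the stream." Note that the generated commands run one after another and match substrings rather than whole words, and that a slash, ampersand or regex metacharacter in the CSV would need escaping before this approach is safe.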

Source: https://habr.com/ru/post/1201391/

