How do I iterate over multiple files while maintaining the base name for further processing?

I have several text files that need to be tokenized, POS-tagged, and NER-tagged. I am using the C&C taggers and have run through their tutorial, but I wonder if there is a way to tag multiple files at once rather than one by one.

At the moment I tokenize each file like this:

bin/tokkie --input working/tutorial/example.txt --quotes delete --output working/tutorial/example.tok

then run part-of-speech tagging:

bin/pos --input working/tutorial/example.tok --model models/pos --output working/tutorial/example.pos

and finally, Named Entity Recognition:

bin/ner --input working/tutorial/example.pos --model models/ner --output working/tutorial/example.ner

I'm not sure how to write a loop for this that keeps the output file name the same as the input, but with an extension indicating the tagging stage. I was thinking of a bash script, or possibly Perl, to open the directory, but I'm not sure how to run the C&C commands from within the script.
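For illustration, the naming scheme described here (same base name, a different extension per stage) can be sketched with shell parameter expansion. The helper name `with_ext` is hypothetical and is not part of the C&C tools; it only shows how the output paths could be derived.

```shell
#!/bin/bash
# with_ext is a hypothetical helper, not part of the C&C tools.
# It strips the last extension from a path and appends a new one.
with_ext() {
    # ${1%.*} removes the shortest trailing ".something" from the first argument
    printf '%s.%s\n' "${1%.*}" "$2"
}

with_ext working/tutorial/example.txt tok   # prints working/tutorial/example.tok
```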



Perl:

use strict;
use warnings;
# With :all, autodie makes system() die on failure (requires IPC::System::Simple).
use autodie qw(:all);
use File::Basename qw(basename);

for my $text_file (glob 'working/tutorial/*.txt') {
    # Strip the directory and the .txt suffix, leaving e.g. "example".
    my $base_name = basename($text_file, '.txt');
    # Tokenize.
    system 'bin/tokkie',
        '--input'  => "working/tutorial/$base_name.txt",
        '--quotes' => 'delete',
        '--output' => "working/tutorial/$base_name.tok";
    # Part-of-speech tagging.
    system 'bin/pos',
        '--input'  => "working/tutorial/$base_name.tok",
        '--model'  => 'models/pos',
        '--output' => "working/tutorial/$base_name.pos";
    # Named entity recognition.
    system 'bin/ner',
        '--input'  => "working/tutorial/$base_name.pos",
        '--model'  => 'models/ner',
        '--output' => "working/tutorial/$base_name.ner";
}

Bash:

#!/bin/bash
dir='working/tutorial'
for file in "$dir"/*.txt
do
    noext=${file%.txt}    # strip the .txt suffix

    bin/tokkie --input "$file" --quotes delete --output "$noext.tok"

    bin/pos --input "$noext.tok" --model models/pos --output "$noext.pos"

    bin/ner --input "$noext.pos" --model models/ner --output "$noext.ner"

done
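
A possible variation on the loop above, as a sketch only: it chains the three stages with `&&` so that a failed stage stops processing for that file, and it skips the loop body when no `.txt` files match. The function name `tag_file` and the `RUN` variable are hypothetical, introduced here for illustration; the C&C commands and options are the same ones used in the question.

```shell
#!/bin/bash
# Sketch only: tag_file and RUN are hypothetical names, not part of the C&C tools.
# Set RUN=echo to print the commands instead of executing them (a dry run).
tag_file() {
    file=$1
    noext=${file%.txt}
    # && stops the chain for this file as soon as one stage fails
    ${RUN:-} bin/tokkie --input "$file" --quotes delete --output "$noext.tok" &&
    ${RUN:-} bin/pos --input "$noext.tok" --model models/pos --output "$noext.pos" &&
    ${RUN:-} bin/ner --input "$noext.pos" --model models/ner --output "$noext.ner"
}

for file in working/tutorial/*.txt
do
    # If nothing matched, the pattern is left unexpanded; skip it
    [ -e "$file" ] || continue
    tag_file "$file" || echo "failed: $file" >&2
done
```

Running the script with `RUN=echo` set in the environment shows the commands that would be executed for each file without touching any data.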

Source: https://habr.com/ru/post/1795484/

