Using Awk to process a file, where each record has different fixed-width fields

Question

I have some data files from a legacy system that I would like to process using Awk. Each file consists of a list of records. There are several different record types, and each type has a different set of fixed-width fields (there is no field separator character). The first two characters of a record indicate its type, which tells you which fields follow. A file might look something like this:

AAField1Field2LongerField3
BBField4Field5Field6VeryVeryLongField7Field8
CCField99

Using Gawk, I can set FIELDWIDTHS, but that applies to the whole file (unless I am missing some way of setting it on a record-by-record basis), or I can set FS to "" and process the file one character at a time, but that is a bit cumbersome.

Is there any way to extract fields from such a file using awk?

Edit: Yes, I could use Perl (or something else). I would still like to know whether there is a sensible way to do it with Awk.

+3

linux unix awk gawk text-processing

Dan Dyer Sep 08 '09 at 11:34

6 answers

You could do it in two passes, for example:

1step.awk

# Emit the width list for each record type, then the record itself on the next line.
/^AA/{printf "2 6 6 12"    }
/^BB/{printf "2 6 6 6 18 6"}
/^CC/{printf "2 7"         }
{printf "\n%s\n", $0}

2step.awk

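# Odd lines carry the widths; gawk applies FIELDWIDTHS to the next record read.
NR%2 == 1 {FIELDWIDTHS=$0}
# Even lines are the original records, now split by those widths.
NR%2 == 0 {print $2}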

awk -f 1step.awk sample | gawk -f 2step.awk
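The two passes can possibly be folded into one, because gawk re-splits the current record when $0 is reassigned after FIELDWIDTHS changes. The following is only a sketch under that assumption, not part of the original answer: the file name onepass.awk is made up, the widths are read off the sample in the question, and it needs gawk rather than a generic awk.

```shell
# Sample data copied from the question.
cat > sample <<'EOF'
AAField1Field2LongerField3
BBField4Field5Field6VeryVeryLongField7Field8
CCField99
EOF

# onepass.awk is a hypothetical file name; FIELDWIDTHS is gawk-specific.
cat > onepass.awk <<'EOF'
# Choose the width list from the two-character type prefix.
/^AA/ { FIELDWIDTHS = "2 6 6 12" }
/^BB/ { FIELDWIDTHS = "2 6 6 6 18 6" }
/^CC/ { FIELDWIDTHS = "2 7" }
{
    $0 = $0     # reassigning $0 makes gawk re-split it with the new widths
    print $2    # first data field after the type code
}
EOF

gawk -f onepass.awk sample
```

On the sample this prints Field1, Field4 and Field99, one per line. A record whose prefix matches none of the patterns would be split with whatever widths were set last, so real data may need an explicit default.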

+5

Aleksey Otrubennikov Sep 08 '09 at 12:53

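You probably need to suppress (or, at least, ignore) awk's built-in field splitting and use a program along these lines: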

awk '/^AA/ { manually process record AA out of $0 }
     /^BB/ { manually process record BB out of $0 }
     /^CC/ { manually process record CC out of $0 }' file ...

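The manual processing will be fiddly - you will need to use the substr function to extract each field by position, so what is shown as one line per record type will in practice be closer to one line per field, plus the printing.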

I think you would be better off with Perl and its unpack function, but awk can handle it too, albeit verbosely.
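To make the substr approach concrete, here is a minimal sketch for the sample file in the question. The file name substr.awk and the choice to print every field separated by "|" are assumptions made for illustration; the offsets are counted from the sample layout, not taken from any real specification.

```shell
# Sample data copied from the question.
cat > sample <<'EOF'
AAField1Field2LongerField3
BBField4Field5Field6VeryVeryLongField7Field8
CCField99
EOF

# substr.awk is a hypothetical file name; offsets are counted from the sample.
cat > substr.awk <<'EOF'
# substr(s, start, length) is 1-based; the type code occupies columns 1-2.
/^AA/ { print "AA", substr($0, 3, 6), substr($0, 9, 6), substr($0, 15, 12) }
/^BB/ { print "BB", substr($0, 3, 6), substr($0, 9, 6), substr($0, 15, 6),
              substr($0, 21, 18), substr($0, 39, 6) }
/^CC/ { print "CC", substr($0, 3) }    # everything after the type code
EOF

awk -v OFS='|' -f substr.awk sample
```

This uses nothing gawk-specific, so it should run under any POSIX awk. For the sample it prints AA|Field1|Field2|LongerField3, then BB|Field4|Field5|Field6|VeryVeryLongField7|Field8, then CC|Field99.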

+4

Jonathan Leffler Sep 08 '09 at 12:21

Could you use Perl instead?

+3

Rob Wells Sep 08 '09 at 11:48

Better to use a full-featured scripting language such as Perl or Ruby.

0

Petar Kabashki Sep 08 '09 at 11:37

What about two scripts? For example, the first script inserts field separators based on the first characters, and the second then processes the result.

Or define a function in the awk script that splits each line into variables based on the input. I would go this way for the sake of reuse.
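One way the function idea could look is sketched below. The helper split_fixed, the layout table and the file name fixed.awk are all hypothetical names introduced for illustration, and the width lists are read off the sample in the question.

```shell
# Sample data copied from the question.
cat > sample <<'EOF'
AAField1Field2LongerField3
BBField4Field5Field6VeryVeryLongField7Field8
CCField99
EOF

# fixed.awk is a hypothetical file name.
cat > fixed.awk <<'EOF'
# split_fixed() is a hypothetical helper: it cuts rec into out[1..n]
# according to a space-separated list of widths and returns n.
function split_fixed(rec, widths, out,    w, n, i, pos) {
    n = split(widths, w, " ")
    pos = 1
    for (i = 1; i <= n; i++) {
        out[i] = substr(rec, pos, w[i])
        pos += w[i]
    }
    return n
}
BEGIN {
    # One width list per record type, keyed by the two-character prefix.
    layout["AA"] = "2 6 6 12"
    layout["BB"] = "2 6 6 6 18 6"
    layout["CC"] = "2 7"
}
{
    type = substr($0, 1, 2)
    if (!(type in layout)) next      # skip unknown record types
    split("", f)                     # clear fields left from the previous record
    n = split_fixed($0, layout[type], f)
    joined = f[1]
    for (i = 2; i <= n; i++) joined = joined "|" f[i]
    print joined
}
EOF

awk -f fixed.awk sample
```

Keeping the widths in a table means a new record type only needs one more layout entry, and the function itself avoids gawk extensions, so it should work with other awk implementations too.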

0

Zsolt Botykai Sep 08 '09 at 12:19

Darren Atkinson · Accepted Answer · 2009-09-08T13:23:42+0000

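Hopefully this will point you in the right direction. Assuming your multi-line records are guaranteed to be terminated by a "CC" record, you can process the file using simple if-then logic. I have presumed you need fields 1, 5 and 7 on one row; a sample awk script would be: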

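BEGIN {
    # Fields collected across the AA/BB records of one group
    field1=""
    field5=""
    field7=""
}
{
    record_type = substr($0,1,2)
    if (record_type == "AA")
    {
        field1=substr($0,3,6)       # Field1: columns 3-8
    }
    else if (record_type == "BB")
    {
        field5=substr($0,9,6)       # Field5: columns 9-14
        field7=substr($0,21,18)     # Field7: columns 21-38
    }
    else if (record_type == "CC")
    {
        # CC closes the group: print the collected fields, then reset them
        print field1"|"field5"|"field7
        field1=""; field5=""; field7=""
    }
}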

Create an awk script file named program.awk and put that code in it. Run the script with:

awk -f program.awk < my_multi_line_file.txt
