How to format a date string (with text and fractional seconds) using AWK

I am working on an AWK script that parses millions of lines of text. Each line contains (among other things) a date and time in the form:

16-FEB-2008 14:17:59.994669

I need to convert it to the following form:

20080216141759994669000

I would like to avoid manually translating the month name to its numeric value, if possible. In bash, I can simply run the following command to get the desired result:

date -d "16-FEB-2008 14:17:59.994669" +"%Y%m%d%H%M%S%N"

I tried calling this command from AWK, but I could not figure out how. I'd like to know:

  • Can this be achieved with AWK alone?
  • How can I use such a command in an AWK script file?

Thanks in advance

+4
5 answers

You can run the date command from awk through a pipe and read its output with getline:

awk '{
         cmd="date -d \""$0"\" +%Y%m%d%H%M%S%N"
         cmd | getline ts
         print $0, ts
         # awk opened a pipe for the communication with 
         # the command. close that pipe to avoid running
         # out of file descriptors
         close(cmd)
     }' <<< '16-FEB-2008 14:17:59.994669'

Output:

16-FEB-2008 14:17:59.994669 20080216141759994669000
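
Since the real input lines contain other text besides the date, the same pipe technique can be wrapped in an awk function that is given only the timestamp. The following is a minimal sketch, not a definitive implementation: it assumes GNU date, it assumes the date and time are the 2nd and 3rd space-separated fields, and the names stamp_lines and to_ts are made up for illustration. It also checks the return value of getline and refuses to build a shell command from text that does not look like a timestamp.

```shell
#!/bin/sh
# Hypothetical helper: reads lines whose 2nd and 3rd fields hold the date and
# time, and prints them with the timestamp reformatted by GNU date.
stamp_lines() {
    awk '
    # Run date on one timestamp string; return "" if it cannot be converted.
    function to_ts(s,    cmd, ts, rc) {
        # Only plain timestamps are passed to the shell, nothing else
        if (s !~ /^[0-9]+-[A-Za-z]+-[0-9]+ [0-9:.]+$/) return ""
        cmd = "date -d \"" s "\" +%Y%m%d%H%M%S%N 2>/dev/null"
        rc = (cmd | getline ts)   # 1 = read a line, 0 = no output, -1 = error
        close(cmd)                # release the pipe before the next call
        return (rc > 0) ? ts : ""
    }
    {
        ts = to_ts($2 " " $3)
        if (ts == "") { print "bad date on line " NR > "/dev/stderr"; next }
        print $1, ts, $4
    }'
}

printf '%s\n' 'this 16-FEB-2008 14:17:59.994669 that' | stamp_lines
```

This still starts one date process per line, so for millions of lines it will be slow; it is mainly useful when date's own parsing and validation are wanted.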

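As dave_thompson_085 points out, if you have date from GNU coreutils and gawk, you can avoid starting a new date process for every line. GNU date can read the dates from stdin, and gawk can run it as a coprocess, so a single date process is started and gawk talks to it through its stdin and stdout: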

{
    cmd = "stdbuf -oL date -f /dev/stdin +%Y%m%d%H%M%S%N"
    print $0 |& cmd 
    cmd |& getline ts
    print $0, ts
}

Here, stdbuf is needed to force date to write its output line by line instead of buffering it.

+4

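awk can do the conversion by itself, without running an external program such as date, by splitting the fields and looking up the month name: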

$ echo this 16-FEB-2008 14:17:59.994669 that \
> | awk '{ split($2,d,"-"); split($3,t,"[:.]"); 
    m=sprintf("%02d",index("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",d[2])/4+1);
    print $1,d[3] m d[1] t[1] t[2] t[3] t[4] "000",$4 }'
this 20080216141759994669000 that
$ # or can put the script in a file and use with awk -f
$ # or the whole thing in a shebang file like #!/bin/awk -f

This assumes the date and time are delimited by spaces from the rest of the line.
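
If the timestamp is not always in the same field, a variation on this idea is to locate it with match() and rewrite it in place. The sketch below is only an illustration under a few assumptions: the format is DD-MON-YYYY HH:MM:SS with an optional fraction, only the first timestamp on each line is rewritten, and the names rewrite_dates and convert are hypothetical. The fraction is padded (or truncated) to nine digits so that the result has the same width as date's %N.

```shell
#!/bin/sh
# Hypothetical helper: replaces the first DD-MON-YYYY HH:MM:SS[.fraction]
# timestamp found anywhere on each input line.
rewrite_dates() {
    awk '
    BEGIN {
        months = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC"
        # Written without interval expressions so that older awks accept it
        re = "[0-9][0-9]?-[A-Za-z][A-Za-z][A-Za-z]-[0-9][0-9][0-9][0-9] "
        re = re "[0-9][0-9]:[0-9][0-9]:[0-9][0-9](\\.[0-9]+)?"
    }
    # Convert one matched timestamp to YYYYmmddHHMMSS plus nine fraction digits.
    function convert(s,    p, n, pos, frac) {
        n = split(s, p, /[- :.]/)            # day, month, year, HH, MM, SS[, fraction]
        pos = index(months, toupper(p[2]))   # 1, 5, 9, ... for JAN, FEB, MAR, ...
        if (pos == 0) return s               # unknown month: leave the text unchanged
        frac = substr((n >= 7 ? p[7] : "") "000000000", 1, 9)
        return sprintf("%04d%02d%02d%s%s%s%s", p[3], (pos + 3) / 4, p[1], p[4], p[5], p[6], frac)
    }
    {
        if (match($0, re))
            $0 = substr($0, 1, RSTART - 1) convert(substr($0, RSTART, RLENGTH)) substr($0, RSTART + RLENGTH)
        print
    }'
}

printf '%s\n' 'this 16-FEB-2008 14:17:59.994669 that' | rewrite_dates
```

Unlike date, this does not validate the values (a day of 45 would pass through), so it trades checking for speed.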

+5

With perl:

LANG=C perl -MTime::Piece -plE 's/\b(\d+-\w{3}-\d{4}\s+\d+:\d+:\d+)\.(\d+)\b/Time::Piece->strptime($1,q{%d-%b-%Y %H:%M:%S})->strftime(q{%Y%m%d%H%M%S}).$2.q{000}/ge' < in >out

This replaces every timestamp that matches the pattern with the reformatted (and validated) date.

The core module Time::Piece does not support fractional seconds, so the solution is a bit hacky: the fraction is captured separately and appended after formatting.

+2

There are a lot of good answers here. Here's one that uses an awk helper function to format the dates.

awk '
  BEGIN { 
    mi["JAN"]="01"; mi["FEB"]="02"; mi["MAR"]="03"; mi["APR"]="04"; mi["MAY"]="05"; mi["JUN"]="06"
    mi["JUL"]="07"; mi["AUG"]="08"; mi["SEP"]="09"; mi["OCT"]="10"; mi["NOV"]="11"; mi["DEC"]="12"
  }
  function reformatDate(dtStr, tmStr) {
    split(dtStr, dtParts, "-"); gsub(/[:.]/, "", tmStr)
    return dtParts[3] mi[dtParts[2]] sprintf("%02d", dtParts[1]) tmStr "000"
  }
  { print reformatDate($1, $2) }
' <<<'16-FEB-2008 14:17:59.994669'
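
To address the question of using this from an AWK script file, the same program can be saved to a file and run with awk -f. The following is a sketch of that; the temporary file and the wrapper name reformat_file are only for illustration, and in practice the script would live in a file of your choosing. Compared with the one-liner, it builds the month table with split(), upper-cases the month name, and declares dtParts as an extra parameter so that it stays local to the function.

```shell
#!/bin/sh
# Write the awk program to a file (a temporary one here) and run it with -f.
script=$(mktemp)
cat > "$script" <<'EOF'
BEGIN {
    n = split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC", names, " ")
    for (i = 1; i <= n; i++) mi[names[i]] = sprintf("%02d", i)
}
# dtParts is listed as an extra parameter so that it is local to the function
function reformatDate(dtStr, tmStr,    dtParts) {
    split(dtStr, dtParts, "-")
    gsub(/[:.]/, "", tmStr)
    return dtParts[3] mi[toupper(dtParts[2])] sprintf("%02d", dtParts[1]) tmStr "000"
}
{ print reformatDate($1, $2) }
EOF

# Hypothetical wrapper: runs the script file on stdin or on the named files.
reformat_file() { awk -f "$script" "$@"; }

printf '%s\n' '16-FEB-2008 14:17:59.994669' | reformat_file
```

The script file can also be made executable by giving it a first line such as #!/usr/bin/awk -f, where the path depends on where awk is installed on your system.
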
+2

There is no need to call date; you just need a month lookup:

$ awk -F'[- :.]' -v OFS='' '
     BEGIN {split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",m);
            for(i=1;i<=12;i++) a[m[i]]=i<10?"0"i:i}
           {$2=a[$2]; y=$3; $3=$1; $1=y; print $0 "000"}' file
+1

Source: https://habr.com/ru/post/1672065/

