Regular expression and positioning elements in .csv format

Question

Here is what I need to solve:

Given the following set of letters as the header of the .csv file: H,A,D,E,R,T,Y,B,D

I need to place a group of letters into the correct positions. For example, the input could be any of the following groups: E,R,T,Y or B,D or T,Y,B,D or H,A,D,E,R, etc.

Each letter has its own fixed position. For example, “H” is always the first letter of the string, “A” is the second, and so on. I need to output the group of letters separated by commas, with each letter kept in its correct position.

For example, for the group of letters ERTY I will have: ,,,E,R,T,Y,,
And for HADER I will have: H,A,D,E,R,,,,

My first attempt was to count the number of missing commas. Example:

# The header has 9 fields, so a full row contains 8 commas.
echo "E,R,T,Y" | sed 's/[^,]//g' | awk '{ print length }' | xargs -n 1 bash -c 'echo $((8-$1))' args

Now I am trying to add the missing commas to the appropriate positions. But I'm stuck at this step.

regex awk sed csv position

user3641949 May 15, '14 at 17:56

3 answers

Using bash and GNU grep:

partial() { 
    # $1 is the header
    # $2 is the "substring" line
    local prefix suffix
    prefix=$( grep -oP ".*(?=$2)"  <<<"$1" ) || return 1
    suffix=$( grep -oP "(?<=$2).*" <<<"$1" )
    echo "${prefix//[^,]/}${2}${suffix//[^,]/}"
}
partial "H,A,D,E,R,T,Y,B,D" "B,D"
partial "H,A,D,E,R,T,Y,B,D" "A,D,E"
partial "H,A,D,E,R,T,Y,B,D" "A,D,E,"
partial "H,A,D,E,R,T,Y,B,D" "foo" || echo "foo is not a substring"

,,,,,,,B,D
,A,D,E,,,,,
,A,D,E,,,,,
foo is not a substring

A version that does not rely on grep:

partial () { 
    local prefix suffix
    prefix=${1%%${2}*}
    [[ $prefix == "$1" ]] && return 1
    suffix=${1##*${2}}
    echo "${prefix//[^,]/}${2}${suffix//[^,]/}"
}
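As a quick check, the parameter-expansion version can be run against the same header as the grep version. This is only a sketch: it repeats the function so that it runs on its own, assumes bash, and introduces the `header` variable purely for illustration.

```shell
#!/usr/bin/env bash
# Same function as above, repeated so the snippet is self-contained.
partial() {
    local prefix suffix
    prefix=${1%%${2}*}                   # header text before the first match
    [[ $prefix == "$1" ]] && return 1    # nothing was removed: no match
    suffix=${1##*${2}}                   # header text after the last match
    echo "${prefix//[^,]/}${2}${suffix//[^,]/}"
}

header="H,A,D,E,R,T,Y,B,D"               # illustrative variable name
partial "$header" "B,D"                  # prints ,,,,,,,B,D
partial "$header" "A,D,E"                # prints ,A,D,E,,,,,
partial "$header" "foo" || echo "foo is not a substring"
```

Because the prefix is cut at the first match and the suffix at the last one, a group that occurs more than once in the header (a lone "D", for instance) would not be placed reliably by this approach.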

glenn jackman May 15, '14 at 20:21

Using GNU sed:

sed -r 's/$/\nH,A,D,E,R,T,Y,B,D/;s/(.*)\n(.*)\1(.*)/\2\n\1\n\3/;h;s/[^,\n]//g;G;s/^(.*)\n.*\n(.*)\n.*\n(.*)\n.*/\1\3\2/' file

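Append the header to the input line, then split the header into the parts on either side of the input (prefix and suffix). Copy the result to the hold space, delete everything except commas and newlines, append the hold space, and finally rearrange the pieces (which are delimited by \n s) into the output line.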
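To make the sed one-liner above easier to follow, the sketch below first runs only its first two substitutions, so the intermediate pattern space can be inspected, and then runs the full command. GNU sed is assumed, as in the answer; the variable names `header`, `input`, `split` and `full` are illustrative and not part of the original.

```shell
#!/usr/bin/env bash
# Assumes GNU sed (-r, and \n usable in patterns, brackets and replacements).
header='H,A,D,E,R,T,Y,B,D'
input='E,R,T,Y'

# Stages 1-2: append the header after a newline, then split the header
# around the input, giving "PREFIX\nINPUT\nSUFFIX".
split=$(sed -r "s/\$/\n$header/;s/(.*)\n(.*)\1(.*)/\2\n\1\n\3/" <<<"$input")
printf '%s\n' "$split"
# H,A,D,
# E,R,T,Y
# ,B,D

# Full command: keep only the commas of PREFIX and SUFFIX (h, s, G),
# then reassemble as prefix-commas + INPUT + suffix-commas.
full=$(sed -r "s/\$/\n$header/;s/(.*)\n(.*)\1(.*)/\2\n\1\n\3/;h;s/[^,\n]//g;G;s/^(.*)\n.*\n(.*)\n.*\n(.*)\n.*/\1\3\2/" <<<"$input")
printf '%s\n' "$full"
# ,,,E,R,T,Y,,
```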

potong May 15, '14 at 21:01

anubhava · Accepted Answer · 2014-05-15T20:22:31+0000

The following awk script should work:

s='H,A,D,E,R,T,Y,B,D'

awk -v p='HADER' -F, 'NR==1{for (i=1; i<=NF; i++) 
 {printf "%s%s", index(p, $i)?$i:"", (i<NF)?OFS:RS; sub($i, "", p)} print ""}' OFS=, <<<"$s"
H,A,D,E,R,,,,

awk -v p='ERTY' -F, 'NR==1{for (i=1; i<=NF; i++)
 {printf "%s%s", index(p, $i)?$i:"", (i<NF)?OFS:RS; sub($i, "", p)} print ""}' OFS=, <<<"$s"
,,,E,R,T,Y,,
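One caveat worth checking: the header contains D twice, so matching letter by letter can pick the wrong D. With p='BD' the script above appears to print ,,D,,,,,B, rather than ,,,,,,,B,D. The following is a minimal bash sketch of a field-wise alternative. It is not part of the original answer: `place_group` is a hypothetical name, and the group is assumed to be comma-separated and to occur as a contiguous run of header fields.

```shell
#!/usr/bin/env bash
# place_group HEADER GROUP   (hypothetical helper, not from the thread)
# Finds the first contiguous run of HEADER fields equal to GROUP and
# prints a row with those fields kept and every other field left empty.
place_group() {
    local -a head grp out
    local i j n m
    IFS=, read -r -a head <<<"$1"
    IFS=, read -r -a grp  <<<"$2"
    n=${#head[@]} m=${#grp[@]}
    (( m > 0 )) || return 1
    for (( i = 0; i + m <= n; i++ )); do
        # Compare the group with the header fields starting at offset i.
        for (( j = 0; j < m; j++ )); do
            [[ ${head[i+j]} == "${grp[j]}" ]] || continue 2
        done
        # Match found: build n fields, empty outside [i, i+m).
        for (( j = 0; j < n; j++ )); do
            if (( j >= i && j < i + m )); then out[j]=${head[j]}; else out[j]=; fi
        done
        ( IFS=,; echo "${out[*]}" )      # join the fields with commas
        return 0
    done
    return 1                             # group not found in the header
}

place_group "H,A,D,E,R,T,Y,B,D" "B,D"       # prints ,,,,,,,B,D
place_group "H,A,D,E,R,T,Y,B,D" "E,R,T,Y"   # prints ,,,E,R,T,Y,,
place_group "H,A,D,E,R,T,Y,B,D" "foo" || echo "foo is not in the header"
```

Comparing whole fields rather than substrings means a repeated letter is only matched where the entire group lines up with the header.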
