Mapping strings to unique numbers?

Is there a good bash one-liner for mapping each distinct line in a file to a unique number?

For instance,

a
a
b
b
c
c

should be converted to

1
1
2
2
3
3

I am currently implementing it in C++, but a one-liner would be preferable.

4 answers
awk '{if (!($0 in ids)) ids[$0] = ++i; print ids[$0]}'

This maintains an associative array called ids. Each time it encounters a line it has not seen before, it assigns that line the next monotonically increasing id, ++i.

Example:

jkugelman$ echo $'a\nb\nc\na\nb\nc' | awk '{if (!($0 in ids)) ids[$0] = ++i; print ids[$0]}'
1
2
3
1
2
3
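
To check which id was assigned to which line, a small variation can print the id next to the original text. This is only a sketch building on the one-liner above; the function name number_lines is a hypothetical helper, not something from the answer.

```shell
# Hypothetical helper: print each input line prefixed by its id and a tab.
# Reads stdin, or the files given as arguments.
number_lines() {
    awk '{ if (!($0 in ids)) ids[$0] = ++i; print ids[$0] "\t" $0 }' "$@"
}

printf 'a\na\nb\nb\nc\nc\n' | number_lines
```

Keeping the original text in a second column makes it easy to verify the mapping by eye, and the id column can still be extracted afterwards with cut -f1.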

The awk solutions are great here, but here is the same approach in pure bash (>= 4):

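declare -A stringmap          # line text -> id
counter=0
while IFS= read -r string; do
    # First time this line is seen: give it the next id.
    if [[ -z ${stringmap[$string]} ]]; then
        let counter+=1
        stringmap[$string]=$counter
    fi
done < INPUTFILE              # redirect the whole loop, not each read
# Print the resulting id -> string mapping (iteration order is unspecified).
for string in "${!stringmap[@]}"; do
    printf "%d -> %s\n" "${stringmap[$string]}" "$string"
done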
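
The loop above prints the final mapping rather than one number per input line, which is what the question asks for. A possible pure-bash variant that emits an id for every line might look like the sketch below. The function name map_lines and the "x" key prefix are assumptions made for illustration; the prefix is there because bash can reject an empty associative-array subscript.

```shell
#!/usr/bin/env bash
# Sketch: print one id per input line in pure bash (>= 4).
# map_lines is a hypothetical function name; it reads stdin.
map_lines() {
    local -A ids=()          # prefixed line text -> assigned id
    local line key counter=0
    # "|| [[ -n $line ]]" also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # Prefix the key so an empty input line never yields an empty subscript.
        key="x$line"
        if [[ -z ${ids[$key]+set} ]]; then
            ids[$key]=$((++counter))
        fi
        printf '%d\n' "${ids[$key]}"
    done
}

printf 'a\na\nb\nb\nc\nc\n' | map_lines
```

For a file, the call would be map_lines < INPUTFILE, with the redirection applied to the function as a whole.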
awk 'BEGIN { num = 0; }
{
    if ($0 in seen) {
        print seen[$0];
    } else {
        seen[$0] = ++num;
        print num;
    }
}' [file]


The same thing without an explicit if, using a pattern in front of the action:

awk '!($0 in ids){ids[$0]=++i}{print ids[$0]}' file

Source: https://habr.com/ru/post/1767204/

