When I run wc on a string containing Ås (with the capital letter Å), I get a word count of 2 where I expect 1.
Counting the words in Å or sÅ gives 1, which seems right.
$ echo sÅ | wc
1 1 4
$ echo Å | wc
1 1 3
Counting the words in Ås or sÅs gives 2, which does not seem right.
$ echo sÅs | wc
1 2 5
$ echo Ås | wc
1 2 4
Only the letter Å reproduces this; none of åäöÄÖ do.
$ echo "Ås" | wc
1 2 4
$ echo "Äs" | wc
1 1 4
$ echo "Ös" | wc
1 1 4
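To see what sets Å apart from the other letters, it may help to compare their UTF-8 bytes. Below is a minimal Python sketch (Python is used only for illustration, and the helper name `utf8_hex` is made up for this example). The letters are written as escapes so that the result does not depend on how the file itself is encoded or normalized.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated hex (hypothetical helper)."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


# Precomposed code points for the letters Å å ä ö Ä Ö, in that order.
LETTERS = "\u00c5\u00e5\u00e4\u00f6\u00c4\u00d6"

if __name__ == "__main__":
    for ch in LETTERS:
        # One line per letter: the letter, then its UTF-8 bytes in hex.
        print(ch, utf8_hex(ch))
```

All six letters begin with the byte c3; only Å has 85 as its second byte, so whatever wc does differently for Å presumably involves that byte.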
I use the default locale settings that Mac OS provides when the terminal starts; they look like this:
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
I get the same results on MacOS Sierra and Lion.
Just to check what the string Ås looks like as bytes:
$ echo "Ås" | hexdump
0000000 c3 85 73 0a
0000004
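So is this a problem with Mac OS, or with how I am using wc?
Does the Mac OS wc not handle the UTF-8 Å?
Or (just a guess) does wc, like wc -c, work on bytes, so that 85 is treated as … in ASCII? (wc -m wordcount)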
- ?
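One way to probe the byte-85 guess is to model two word counters: one that treats every byte as a separate single-byte (Latin-1) character, and one that decodes the input as UTF-8 first. This is only a sketch in Python under that assumption, not the actual Mac OS wc implementation, and the function names are invented for illustration. It relies on Python's `str.split()`, which treats U+0085 (NEL, "next line") as whitespace.

```python
def words_bytewise(data: bytes) -> int:
    """Count words as if each byte were one Latin-1 character (assumed model)."""
    return len(data.decode("latin-1").split())


def words_utf8(data: bytes) -> int:
    """Count words after decoding the whole input as UTF-8."""
    return len(data.decode("utf-8").split())


if __name__ == "__main__":
    # The same strings as in the wc runs above: Ås, sÅs, sÅ, Å, Äs, Ös.
    samples = ("\u00c5s", "s\u00c5s", "s\u00c5", "\u00c5", "\u00c4s", "\u00d6s")
    for text in samples:
        data = (text + "\n").encode("utf-8")  # echo appends a newline
        print(text, words_bytewise(data), words_utf8(data))
```

Under this model the byte-wise counter reports 2 for Ås and sÅs and 1 for the other strings, the same as the wc output above, while the UTF-8 counter reports 1 throughout. That agreement is consistent with the guess but does not prove that wc works this way.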