Many thanks to PetSerAl for all his invaluable contributions.
tl;dr

-lt and -gt compare [char] instances numerically, by Unicode code point.
- Confusingly, so do -ilt, -clt, -igt and -cgt, even though they only make sense with string operands; this is a quirk of the PowerShell language itself (see below).
-eq (and its alias -ieq), by contrast, compares [char] instances case-insensitively, which is usually, but not necessarily, the same as a case-insensitive string comparison (-ceq, again, compares strictly numerically).
- -eq / -ieq ultimately also compares numerically, but first converts the operands to their uppercase equivalents using the invariant culture; as a result, this comparison is not fully equivalent to PowerShell's string comparison, which additionally recognizes so-called compatible sequences (distinct characters or even sequences considered to have the same meaning; see Unicode equivalence) as equal.
- In other words: PowerShell special-cases the behavior of only -eq / -ieq with [char] operands, and it does so in a way that is almost, but not quite, the same as a case-insensitive string comparison.
This distinction leads to counter-intuitive behavior, such as [char] 'A' -eq [char] 'a' and [char] 'A' -lt [char] 'a' both returning $true.
To be safe:
- always cast to [int] if you need a numeric (Unicode code point) comparison.
- always cast to [string] if you need a string comparison.
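As a minimal sketch of the two casts, using only the characters from the examples in this answer (the variable names are purely illustrative):

```powershell
$upper = [char] 'A'
$lower = [char] 'a'

# Numeric comparison: cast both operands to [int] (Unicode code points 65 and 97).
$numericLess  = [int] $upper -lt [int] $lower   # 65 -lt 97 -> $true
$numericEqual = [int] $upper -eq [int] $lower   # 65 -eq 97 -> $false

# String comparison: cast both operands to [string].
$stringEqualIgnoreCase = [string] $upper -eq  [string] $lower   # 'A' -eq 'a'  -> $true
$stringEqualMatchCase  = [string] $upper -ceq [string] $lower   # 'A' -ceq 'a' -> $false

$numericLess, $numericEqual, $stringEqualIgnoreCase, $stringEqualMatchCase
```

With the casts in place, the operator's behavior no longer depends on the [char] special-casing described in this answer.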
For background information, read on.
PowerShell's usually useful operator overloading can be tricky at times.
Note that in a numeric context (whether implicit or explicit), PowerShell treats characters ([char] ([System.Char]) instances) numerically, by their Unicode code point (not ASCII).
[char] 'A' -eq 65 # $true; in the numeric context imposed by 65, 'A' is equivalent to 65
What makes [char] unusual is that its instances are compared to each other numerically as-is, by Unicode code point, EXCEPT with -eq / -ieq.
- -ceq, -lt and -gt compare directly by Unicode code point, and, counter-intuitively, so do -ilt, -clt, -igt and -cgt:
[char] 'A' -lt [char] 'a' # $true; Unicode codepoint 65 ('A') is less than 97 ('a')
- -eq (and its alias -ieq) first converts the characters to uppercase and then compares the resulting Unicode code points:
[char] 'A' -eq [char] 'a' # !! ALSO $true; equivalent of 65 -eq 65
It's worth reflecting on this Buddhist twist of "this and that": in the PowerShell world, the character 'A' is both less than and equal to 'a', depending on how you compare.
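The following sketch emulates the uppercase-then-compare step by hand. It is an approximation of the behavior described above, not PowerShell's actual implementation; [char]::ToUpperInvariant() is the standard .NET method, and the variable names are illustrative:

```powershell
$a = [char] 'A'
$b = [char] 'a'

# What the operator reports for two [char] operands.
$viaOperator = $a -eq $b

# Rough manual equivalent: uppercase both characters, then compare code points.
$viaUppercase = [int] [char]::ToUpperInvariant($a) -eq [int] [char]::ToUpperInvariant($b)

# -ceq and -lt skip the uppercasing step and compare the raw code points (65 vs. 97).
$caseSensitiveEqual = $a -ceq $b
$lessThan           = $a -lt  $b

$viaOperator, $viaUppercase, $caseSensitiveEqual, $lessThan   # True, True, False, True
```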
In addition, comparing Unicode code points, whether directly or indirectly (after conversion to uppercase), is NOT the same as comparing the characters as strings, because PowerShell's string comparison additionally recognizes so-called compatible sequences, where characters (or even character sequences) are considered "the same" if they have the same meaning (see Unicode equivalence); e.g.:
# Distinct Unicode characters U+2126 (Ohm Sign) and U+03A9 (Greek Capital Letter Omega)
# ARE recognized as the "same thing" in a *string* comparison:
"Ω" -ceq "Ω" # $true, despite having distinct Unicode code points

# -eq / -ieq: with [char], only the transformation to uppercase is applied, so the results
# are still different code points, which - compared numerically - are NOT equal:
[char] 'Ω' -eq [char] 'Ω' # $false: uppercased code points differ

# -ceq always applies direct code point comparison.
[char] 'Ω' -ceq [char] 'Ω' # $false: code points differ
Please note that using the i or c prefixes to explicitly specify case-sensitivity behavior is NOT sufficient to force a string comparison, even though conceptually operators such as -ceq, -ieq, -clt, -ilt, -cgt and -igt only make sense with strings.
Effectively, the i and c prefixes are simply ignored when applied to -lt and -gt while comparing [char] operands; as it turns out (unlike what I originally thought), this is a general PowerShell pitfall - see the explanation below.
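A short sketch of how the prefixes make no difference with [char] operands (the $charResults table and variable names are only for illustration):

```powershell
$upper = [char] 'A'   # code point 65
$lower = [char] 'a'   # code point 97

# With [char] operands, every variant compares the code points 65 and 97.
$charResults = [ordered] @{
    '-lt'  = $upper -lt  $lower
    '-ilt' = $upper -ilt $lower
    '-clt' = $upper -clt $lower
    '-gt'  = $lower -gt  $upper
    '-igt' = $lower -igt $upper
    '-cgt' = $lower -cgt $upper
}
$charResults   # every entry is $true

# With string operands, -ilt treats 'A' and 'a' as equal, so neither is "less".
$stringIlt = 'A' -ilt 'a'   # $false
```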
As an aside: the logic of -lt and -gt in string comparison is not numeric, but based on collation order (a human-centric way of ordering that is independent of code points / byte values), which in .NET terms is controlled by cultures (either the one currently in effect by default, or one passed to methods as a culture parameter).
As @PetSerAl demonstrates in a comment (and unlike what I originally claimed), PS string comparisons use the invariant culture rather than the current culture, so their behavior is the same, irrespective of which culture is the current one.
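To illustrate the difference between collation order and code point order, here is a small sketch with two ASCII letters (the variable names are illustrative):

```powershell
# 'a' is U+0061 (97) and 'B' is U+0042 (66).

# Numerically, 97 is NOT less than 66.
$byCodePoint = [int] [char] 'a' -lt [int] [char] 'B'   # $false

# As strings, "a" sorts before "B" in the collation order.
$byCollation = 'a' -lt 'B'                             # $true

$byCodePoint, $byCollation
```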
Behind the scenes:
As @PetSerAl explains in the comments, PowerShell's parser does not distinguish between the base form of an operator and its i-prefixed form; for example, both -lt and -ilt are translated to the same value, Ilt.
Thus, PowerShell cannot implement different behavior for -lt vs. -ilt, -gt vs. -igt, ..., because it treats them the same at the syntax level.
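One way to observe this yourself is to inspect the parsed AST. In the sketch below, Get-BinaryOperatorKind is a hypothetical helper written for this illustration, not a built-in command; it relies on the public System.Management.Automation.Language parser API:

```powershell
# Hypothetical helper: returns the TokenKind that the parser assigns to the
# first binary operator found in a piece of PowerShell source text.
function Get-BinaryOperatorKind {
    param([string] $Code)
    # Parse without executing; tokens and parse errors are discarded here.
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref] $null, [ref] $null)
    $binary = $ast.Find(
        { param($node) $node -is [System.Management.Automation.Language.BinaryExpressionAst] },
        $false
    )
    $binary.Operator
}

$ltKind  = Get-BinaryOperatorKind '1 -lt 2'
$iltKind = Get-BinaryOperatorKind '1 -ilt 2'
$cltKind = Get-BinaryOperatorKind '1 -clt 2'

"$ltKind, $iltKind, $cltKind"   # Ilt, Ilt, Clt
```

Only the c-prefixed form yields a distinct token kind; the base form and the i-prefixed form are indistinguishable after parsing.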
This leads to somewhat counter-intuitive behavior, in that operator prefixes are effectively ignored when comparing data types for which case sensitivity has no meaning, as opposed to the operands being coerced to strings, as you might expect; e.g.:
"10" -cgt "2" # $false, because "2" comes after "1" in the collation order
10 -cgt 2 # !! $true; *numeric* comparison still happens; the `c` is ignored.
In the latter case, I would have expected the use of -cgt to coerce the operands to strings, given that case-sensitive comparison is only a meaningful concept in string comparison, but that is NOT how it works.
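If a string comparison is what you want, the operands have to be strings. A minimal sketch, with illustrative variable names:

```powershell
# Numbers stay numbers; the `c` prefix does not coerce them to strings.
$numeric = 10 -cgt 2                       # $true: 10 is greater than 2

# Casting both operands forces a string comparison.
$castBoth = [string] 10 -cgt [string] 2    # $false: "10" sorts before "2"

# A string on the left-hand side also results in a string comparison,
# because the right-hand operand is converted to the left operand's type.
$stringLhs = '10' -cgt 2                   # $false

$numeric, $castBoth, $stringLhs
```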
If you want to dig deeper into how PowerShell operates, see @PetSerAl's comments below.