CharInSet is much slower than IN, should the W1050 warning be fixed?

I use IN a lot in my project, and I have a lot of these warnings:

[DCC Warning] Unit1.pas (40): W1050 WideChar shortened to char byte in set expression. Consider using the CharInSet function in the SysUtils module.

I did a quick test and using CharInSet instead of IN with 65% -100% slower:

if s1[i] in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] then 

vs

 if CharInSet(s1[i], ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) then 

Here is the code for 2 tests, one works with a loop through shorter lines, one loop once through a large line:

Adding 2 buttons in the form I checked for a short line:

 procedure TForm1.Button1Click(Sender: TObject); var s1: string; t1, t2: TStopWatch; a, i, cnt, vMaxLoop: Integer; begin s1 := '[DCC Warning] Unit1.pas(40): W1050 WideChar reduced to byte char in set expressions. Consider using CharInSet function in SysUtils unit.'; vMaxLoop := 10000000; cnt := 0; t1 := TStopWatch.Create; t1.Start; for a := 1 to vMaxLoop do for i := 1 to Length(s1) do if s1[i] in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] then inc(cnt); t1.Stop; cnt := 0; t2 := TStopWatch.Create; t2.Start; for a := 1 to vMaxLoop do for i := 1 to Length(s1) do if CharInSet(s1[i], ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) then inc(cnt); t2.Stop; Button1.Caption := inttostr(t1.ElapsedMilliseconds) + ' - ' + inttostr(t2.ElapsedMilliseconds); end; 

And this is for 1 long line:

 procedure TForm1.Button2Click(Sender: TObject); var s1: string; t1, t2: TStopWatch; a, i, cnt, vMaxLoop: Integer; begin s1 := '[DCC Warning] Unit1.pas(40): W1050 WideChar reduced to byte char in set expressions. Consider using CharInSet function in SysUtils unit.'; s1 := DupeString(s1, 1000000); s1 := s1 + s1 + s1 + s1; // DupeString is limited, use this to create longer string cnt := 0; t1 := TStopWatch.Create; t1.Start; for i := 1 to Length(s1) do if s1[i] in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] then inc(cnt); t1.Stop; cnt := 0; t2 := TStopWatch.Create; t2.Start; for i := 1 to Length(s1) do if CharInSet(s1[i], ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) then inc(cnt); t2.Stop; Button2.Caption := inttostr(t1.ElapsedMilliseconds) + ' - ' + inttostr(t2.ElapsedMilliseconds); end; 

Why do they recommend a slower option, or how can I fix this warning without a performance penalty?

+5
source share
1 answer

A warning informs you that your code may be malfunctioning. Since sets can only be based on types of order 256 or less, the base type is truncated to that size. Char is now an alias for WideChar and has an ordinal of 65536. Thus, you warn that your program may not behave as you expect. For example, you might ask that this expression evaluates:

 ['A', chr(256)] = ['A'] 

You can expect it to evaluate to false, but in fact, it evaluates to true. Therefore, I think that you should, of course, listen to the compiler when it issues this warning.

Now it so happened that your set, which can and should be written more concisely, like ['A'..'Z'] , consists entirely of ASCII characters. And this happens (thanks to commentators Andreas and ventiseis) that in this case the compiler generates the correct code for such a set, regardless of the ordinal value of the character to the left of the in operator. So,

 if s1[i] in ['A'..'Z'] then 

will result in the correct code despite the warning. And the compiler is able to detect that the installed elements are contiguous and generate efficient code.

Note that this depends on the set being a literal, and therefore optimization can be done by the compiler. And so it can work much better than CharInSet . Since CharInSet is a function, and the Delphi optimizer has limited power, CharInSet cannot use the contiguous nature of this particular set literal.

The warning is annoying, and you really want to rely on remembering very specific details when this warning can be safely ignored. Another way to implement the test and get around this warning is to use the inequality operators:

 if (c >= 'A') and (c <= 'Z') then .... 

You will probably wrap this with a built-in function to make the code even more readable.

 function IsUpperCaseEnglishLetter(c: Char): Boolean; inline; begin Result := (c >= 'A') and (c <= 'Z'); end; 

You should also ask yourself if this code is a performance bottleneck. You should spend time on your real program, and not on such an artificial program. I bet this code is not a bottleneck, and therefore you should not consider performance as a key driver.

+8
source

Source: https://habr.com/ru/post/1244886/


All Articles