Is `u32` / `i32` recommended even for a number with a limited range?

Should we use u32 / i32 or a smaller variant (u8 / i8, u16 / i16) when dealing with a limited-range number, for example "days in a month", which ranges from 1 to 30, or "score of a subject", which ranges from 0 to 100? Or why shouldn't we?

Is there any optimization or benefit to the smaller variant (e.g. memory efficiency)?

+5
3 answers

There is no definitive answer. I doubt you will see any difference in benchmarks unless you do A LOT of arithmetic or process HUGE arrays of numbers.

You should probably just go with the type that makes the most sense (there is no reason to allow negatives or an upper bound in the millions for a day of the month) and that provides the methods you need (for example, you cannot call abs() directly on an unsigned integer).
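To illustrate the point about available methods, here is a minimal sketch. The helper name `days_apart` is hypothetical; `abs()` exists only on signed integers, while `abs_diff` (stable since Rust 1.60) exists on both signed and unsigned ones.

```rust
// Hypothetical helper: distance between two days of a month stored as u8.
// A plain `a - b` would panic in debug builds (or wrap in release) when
// b > a, so we use `abs_diff`, which never underflows.
fn days_apart(a: u8, b: u8) -> u8 {
    a.abs_diff(b)
}

fn main() {
    let delta: i32 = -5;
    println!("{}", delta.abs()); // `abs()` is available on signed types

    // let x: u32 = 5;
    // x.abs(); // does not compile: unsigned integers have no `abs`

    println!("{}", days_apart(3, 28));
}
```

So the choice of signedness changes which operations are directly available, not just the range.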

+2
Summary

Correctness should be prioritized over performance, and correctness-wise (for ranges such as 1-100) all solutions (u8, u32, ...) are equally bad. A better solution would be to create a new type to benefit from strong typing.

The rest of my answer tries to justify this statement and discusses various ways to create a new type.

Detailed explanation

Let's take the "score of a subject" example: the only legal values are 0-100. I argue that, correctness-wise, using u8 and using u32 are equally bad: in both cases your variable can hold values that are not legal in your semantic context. That is bad!

And arguing that u8 is better because there are fewer illegal values is like arguing that wrestling a bear is better than walking through New York, because you have only one way to die (blood loss from a bear attack) as opposed to the many ways to die (car accident, knife attack, drowning, ...) in New York.

So what we want is a type that is guaranteed to hold only legal values. We want to create a new type that does exactly this. However, there are several ways to proceed, each with different advantages and disadvantages.


(A) Make the inner value public

 struct ScoreOfSubject(pub u8); 

Advantage: at least APIs are easier to understand, because the parameter is already explained by its type. Which is easier to understand:

  • add_record("peter", 75, 47) or
  • add_record("peter", StudentId(75), ScoreOfSubject(47)) ?

I would say the latter :-)

Disadvantage: we don't actually do any range checking, so illegal values can still occur. Bad!
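A minimal sketch of option (A), to make both sides concrete. `StudentId` wrapping a `u32` and the body of `add_record` are assumptions for illustration; the original only shows the call.

```rust
// Option (A): newtypes with a public inner value.
struct StudentId(pub u32); // inner type is an assumption
struct ScoreOfSubject(pub u8);

// Hypothetical API from the example above; it only formats its arguments.
fn add_record(name: &str, id: StudentId, score: ScoreOfSubject) -> String {
    format!("{} (id {}): {}", name, id.0, score.0)
}

fn main() {
    println!("{}", add_record("peter", StudentId(75), ScoreOfSubject(47)));

    // Swapping the arguments is now a compile error:
    // add_record("peter", ScoreOfSubject(47), StudentId(75));

    // But nothing prevents an out-of-range score:
    let bogus = ScoreOfSubject(250);
    println!("{}", bogus.0);
}
```

The types document the call site and catch swapped arguments, but `ScoreOfSubject(250)` still compiles.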


(B) Make the inner value private and provide a range-checking constructor

 struct ScoreOfSubject(u8); // inner value is private
 impl ScoreOfSubject {
     // Panics if `value` is outside 0..=100.
     pub fn new(value: u8) -> Self {
         assert!(value <= 100);
         ScoreOfSubject(value)
     }
     // Read-only access to the inner value.
     pub fn get(&self) -> u8 {
         self.0
     }
 }

Advantage: we enforce legal values with very little code, yay :)

Disadvantage: working with the type can be annoying. Pretty much every operation requires the programmer to pack and unpack the value.
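The pack/unpack overhead might look like the following sketch. The module name `score` and the helper `average` are hypothetical; the module is only there so that the private field is actually inaccessible from the calling code.

```rust
mod score {
    pub struct ScoreOfSubject(u8); // private inner value

    impl ScoreOfSubject {
        pub fn new(value: u8) -> Self {
            assert!(value <= 100);
            ScoreOfSubject(value)
        }
        pub fn get(&self) -> u8 {
            self.0
        }
    }
}

use score::ScoreOfSubject;

// Hypothetical helper: average of two scores, rounded down.
fn average(a: &ScoreOfSubject, b: &ScoreOfSubject) -> ScoreOfSubject {
    // Unpack, compute in a wider type to avoid u8 overflow, pack again.
    let sum = u16::from(a.get()) + u16::from(b.get());
    ScoreOfSubject::new((sum / 2) as u8)
}

fn main() {
    let avg = average(&ScoreOfSubject::new(80), &ScoreOfSubject::new(95));
    println!("{}", avg.get());
    // `ScoreOfSubject(250)` does not compile here: the field is private.
    // `ScoreOfSubject::new(250)` compiles but panics at run time.
}
```

Even this tiny computation needs two `get()` calls and one `new()` call.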


(C) Add a bunch of implementations (in addition to (B))

(the code would be impl Add<_>, impl Display, etc.)

Advantage: the programmer can use the type and perform all useful operations on it directly - with range checking! This is pretty optimal.
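As a rough sketch of what option (C) could look like, here is one possible `Add` and `Display` implementation. Choosing `Add<u8>` (adding bonus points) and the `"{}/100"` output format are my assumptions; which operations make sense depends on your domain.

```rust
use std::fmt;
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct ScoreOfSubject(u8);

impl ScoreOfSubject {
    fn new(value: u8) -> Self {
        assert!(value <= 100, "score out of range: {}", value);
        ScoreOfSubject(value)
    }
}

// Adding bonus points goes back through `new`, so the range is re-checked.
impl Add<u8> for ScoreOfSubject {
    type Output = ScoreOfSubject;

    fn add(self, bonus: u8) -> ScoreOfSubject {
        // `saturating_add` avoids u8 overflow; `new` then rejects values > 100.
        ScoreOfSubject::new(self.0.saturating_add(bonus))
    }
}

impl fmt::Display for ScoreOfSubject {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}/100", self.0)
    }
}

fn main() {
    let score = ScoreOfSubject::new(47) + 10;
    println!("{}", score); // no manual unpacking needed
}
```

A non-panicking variant could return an `Option` or `Result` from a separate method instead; operator traits make that awkward, which is part of the design trade-off.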

Please take a look at Matthieu M.'s comment:

[...] generally, multiplying scores together, or dividing them, does not produce a score! Strong typing not only enforces valid values, it also enforces valid operations, so that you don't actually divide two scores to get another score.

I think this is a very important point that I failed to make clear before. Strong typing prevents the programmer from performing illegal operations on values (operations that make no sense). A good example is the cgmath crate, which distinguishes between points and direction vectors because the two support different operations. You can find an additional explanation here .

Disadvantage: a lot of code :(

Fortunately, this disadvantage can be reduced by Rust's macro / compiler plugin system. There are crates, such as newtype_derive or bounded_integer, that do this kind of code generation for you (disclaimer: I have never worked with them).
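To show the idea of "valid operations only", here is a small self-contained sketch of the point/vector distinction. It does not use cgmath's actual API; the `Point` and `Vector` types below are hypothetical stand-ins.

```rust
use std::ops::{Add, Sub};

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector {
    x: f32,
    y: f32,
}

// point - point = vector (the displacement between the two points)
impl Sub for Point {
    type Output = Vector;

    fn sub(self, rhs: Point) -> Vector {
        Vector { x: self.x - rhs.x, y: self.y - rhs.y }
    }
}

// point + vector = point (move the point by the displacement)
impl Add<Vector> for Point {
    type Output = Point;

    fn add(self, rhs: Vector) -> Point {
        Point { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

fn main() {
    let a = Point { x: 1.0, y: 2.0 };
    let b = Point { x: 4.0, y: 6.0 };
    let d = b - a;
    println!("{:?}", a + d);
    // `a + b` does not compile: adding two points is not defined.
}
```

Because there is no `impl Add<Point> for Point`, the meaningless operation is rejected at compile time rather than discovered later.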


But now you say: "You can't be serious? Am I supposed to spend my time writing new types?"

Not necessarily, but if you are working on production code (== at least somewhat important), then my answer is: yes, you should.

+5

Using smaller types can bring significant benefits, but you will need to benchmark your application on the target platform to know for sure.

The first and most easily realized benefit of a lower memory footprint is better caching. Not only is your data more likely to fit into the cache, it is also less likely to evict other data from the cache, which can potentially improve a completely different part of your application. Whether or not this kicks in depends on what memory your application touches and in what order. Do the benchmarks!

Network data transfers obviously benefit from using smaller types.

Smaller data allows "larger" instructions. A 128-bit SIMD unit can process four 32-bit values or sixteen 8-bit values at once, making certain operations 4 times faster. In benchmarks I ran, these instructions did indeed execute 4 times faster, BUT the whole application improved by less than 1%, and the code became messier. Shaping your program to make better use of SIMD can be tricky.

As for the signed / unsigned discussion, unsigned has some slightly better properties that the compiler may or may not take advantage of.
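For a rough feel of the footprint difference, a sketch like the following can help. The helper `footprint` and the element count are arbitrary choices for illustration; this only computes sizes and says nothing about actual cache behavior, which you still have to measure.

```rust
use std::mem::size_of;

// Hypothetical helper: bytes needed to store `n` densely packed values of T.
fn footprint<T>(n: usize) -> usize {
    n * size_of::<T>()
}

fn main() {
    let n = 1_000_000; // arbitrary number of ratings
    println!("u8:  {} bytes", footprint::<u8>(n));
    println!("u32: {} bytes", footprint::<u32>(n));
}
```

With a million values, the u32 version needs four times the memory of the u8 version, so it also competes for four times as much cache.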

+1

Source: https://habr.com/ru/post/1257912/

