Why are functions like std::is_permutation() unsafe?

C and C++ programmers have taken a beating over the past decade for frequently failing to perform proper bounds checking, especially on strings. These failures have often led to serious security holes in major software products. As buffer overflow vulnerabilities became better understood, the push for proper bounds checking drove many programmers away from traditional buffer and string handling functions such as strcpy() and sprintf(), at least in part because these functions tend to cause buffer overflows by making assumptions about the size of the destination buffer. One of the advantages of STL types such as std::string and std::vector is their tighter control over buffer access.

But one thing puzzles me. Many of the most widely used functions in the C++ <algorithm> header seem to be positively asking for overflow abuse: in particular, those functions that accept a begin iterator (especially an InputIterator) without a matching end iterator. For instance:

    template <class InputIterator, class OutputIterator>
    OutputIterator copy (InputIterator first, InputIterator last,
                         OutputIterator result);

    template <class InputIterator, class OutputIterator, class UnaryOperation>
    OutputIterator transform (InputIterator first1, InputIterator last1,
                              OutputIterator result, UnaryOperation op);

    template <class ForwardIterator1, class ForwardIterator2>
    bool is_permutation (ForwardIterator1 first1, ForwardIterator1 last1,
                         ForwardIterator2 first2);

The final example, is_permutation(), is especially instructive. copy() and transform() are well understood, so C++ programmers should know either to check the bounds of the output container manually before calling these functions, or to use something like back_inserter, which ensures that the output container grows as needed. We can therefore argue that although copy() and transform() can be misused, so can anything, and programmers can easily be taught best practices for such functions.
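To make the two options concrete, here is a minimal sketch of both. The helper names copy_growing, copy_if_fits, and demo_copy are made up for illustration and are not part of the standard library; only std::copy and std::back_inserter are.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: copy src into a new vector through std::back_inserter.
// Each write becomes a push_back, so the destination grows as needed.
std::vector<int> copy_growing(const std::vector<int>& src) {
    std::vector<int> dst;
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst;
}

// Hypothetical helper: copy src into an existing buffer only if it fits.
// Returns false and leaves dst untouched when dst is too small.
bool copy_if_fits(const std::vector<int>& src, std::vector<int>& dst) {
    if (dst.size() < src.size()) {
        return false;  // std::copy would write past dst.end() here
    }
    std::copy(src.begin(), src.end(), dst.begin());
    return true;
}

// Usage sketch for the two helpers above.
void demo_copy() {
    const std::vector<int> src{1, 2, 3};
    assert(copy_growing(src) == src);

    std::vector<int> too_small(2);
    assert(!copy_if_fits(src, too_small));

    std::vector<int> big_enough(3);
    assert(copy_if_fits(src, big_enough));
    assert(big_enough == src);
}
```

The manual check costs one size comparison per call; back_inserter trades that for possible reallocations while the destination grows.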

is_permutation() is a trickier case. Just by looking at the function declaration above, what would you assume about the size of the second range (the one that starts with first2)? Must the second range be the same size as the first, at least as large, or at most as large? I bet simple answers to these questions do not come to mind. The concept of a "permutation" is not as comfortable and familiar to most programmers as the concept of copying. It therefore seems relatively easy to get is_permutation() wrong and overflow a buffer anyway.

"Look it up!" I can hear you say. Yes, of course. But if programmers remembered everything they needed to and looked up everything else, we would not have bugs and security holes, would we?

So why don't is_permutation() and similar functions (i.e., functions that take input iterators but not a complete begin/end pair for every range) require a complete begin/end pair for all input ranges? (Note that lexicographical_compare(), for example, does meet this requirement.) Or are functions like is_permutation() not really as dangerous as I imagine?

+4
3 answers

C++14 adds four-iterator versions of equal, is_permutation, and mismatch to address this particular point.
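A short sketch of those overloads in use, assuming a compiler in C++14 mode or later. The alias Ints and the wrapper names same_elements, same_sequence, first_difference, and demo_bounded are invented for this example; the std:: calls are the standard four-iterator overloads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Ints = std::vector<int>;  // alias used only in this sketch

// Both ranges are fully delimited, so a length mismatch yields false
// instead of a read past the end of the second range.
bool same_elements(const Ints& a, const Ints& b) {
    return std::is_permutation(a.begin(), a.end(), b.begin(), b.end());
}

bool same_sequence(const Ints& a, const Ints& b) {
    return std::equal(a.begin(), a.end(), b.begin(), b.end());
}

// Index of the first differing position; if one range is a prefix of the
// other, this is the length of the shorter range.
std::size_t first_difference(const Ints& a, const Ints& b) {
    auto diff = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    return static_cast<std::size_t>(diff.first - a.begin());
}

// Usage sketch for the wrappers above.
void demo_bounded() {
    assert(same_elements({1, 2, 3}, {3, 1, 2}));
    assert(!same_elements({1, 2, 3}, {3, 1}));  // shorter second range
    assert(!same_sequence({1, 2, 3}, {1, 2}));
    assert(first_difference({1, 2, 3}, {1, 2}) == 2);
}
```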

+7

Most of the language is inherently unsafe; the programmer must use it correctly. The programmer must know, before calling a function, whether the arguments are being used correctly.

In addition, in some cases, such as copy, this design allows the use of output iterators on open-ended ranges. For instance:

 std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout," ")); 

There is no corresponding iterator that marks the end of the stream, and the stream really has no end: you can keep appending to it.

+8

I'm not sure that adding a last iterator for the second range of is_permutation would make the function any less awkward to use. I think it would make it more confusing.

The thing with permutations is that the semantics are in the name itself. To check that one sequence is a permutation of another, you expect the sequence given without a last iterator to be at least as long as the first sequence.

If it is not, there is no need to call is_permutation at all, because it simply cannot be a permutation. If it is longer, you expect the function not to read past the length of the first sequence. Why? Because it does not, and that is exactly what you expected, so no trust is lost.

C++ really does expect programmers to take basic precautions, and in many cases we are responsible for bounds checking. Taking that control away from the programmer would reduce the power of the language. If I call is_permutation, I know that my second iterator will not overflow, because I know what a permutation is. And of course I do not want to spend cycles on pointless checks.
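As a sketch of what such a basic precaution might look like with the three-iterator form, the hypothetical wrapper below (is_permutation_checked is not a standard function) compares the range lengths once and only then calls std::is_permutation. It assumes the caller does know where the second range ends, and it treats ranges of different length as "not a permutation".

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical wrapper: reject ranges of different length up front, then
// defer to the three-iterator std::is_permutation, which reads exactly
// distance(first1, last1) elements starting at first2.
template <class ForwardIt1, class ForwardIt2>
bool is_permutation_checked(ForwardIt1 first1, ForwardIt1 last1,
                            ForwardIt2 first2, ForwardIt2 last2) {
    if (std::distance(first1, last1) != std::distance(first2, last2)) {
        return false;
    }
    return std::is_permutation(first1, last1, first2);
}

// Usage sketch for the wrapper above.
void demo_checked() {
    const std::vector<int> a{1, 2, 3};
    const std::vector<int> shuffled{2, 3, 1};
    const std::vector<int> shorter{3, 1};
    const std::vector<int> longer{3, 2, 1, 4};

    assert(is_permutation_checked(a.begin(), a.end(),
                                  shuffled.begin(), shuffled.end()));
    assert(!is_permutation_checked(a.begin(), a.end(),
                                   shorter.begin(), shorter.end()));
    assert(!is_permutation_checked(a.begin(), a.end(),
                                   longer.begin(), longer.end()));

    // The unchecked three-iterator form only looks at the first three
    // elements of the longer range, so it reports a match here.
    assert(std::is_permutation(a.begin(), a.end(), longer.begin()));
}
```

For random-access iterators the length check is a constant-time subtraction; for plain forward iterators std::distance walks each range once.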

I think the old adage applies: with great power comes great responsibility. That's fair, isn't it?

+1

Source: https://habr.com/ru/post/1490811/

