Why does s.find return string::npos instead of s.length() on failure?

I was recently annoyed to discover that string::find returns string::npos when the needle is not found in the haystack. As a result, the following seemingly elegant composition compiles, but throws an out-of-range exception when there is no '#' in the string:

 s.erase(s.find('#')); // erase everything after a # if one exists 

If find returned s.length() on failure, this would work fine. Instead you need to write

 auto pos = s.find('#');
 if (pos != s.npos) s.erase(pos);

This is also inconsistent with std::find, which returns the end iterator if the element is not found.
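To make the difference concrete, here is a minimal sketch of both variants wrapped in functions. The names `strip_comment_unchecked` and `strip_comment` are hypothetical and exist only for this illustration.

```cpp
#include <stdexcept>
#include <string>

// Unchecked variant: when '#' is absent, find returns npos and
// erase(npos) throws std::out_of_range because npos > s.size().
std::string strip_comment_unchecked(std::string s) {
    s.erase(s.find('#'));
    return s;
}

// Checked variant: erase only when find actually located a '#'.
std::string strip_comment(std::string s) {
    const auto pos = s.find('#');
    if (pos != std::string::npos) {
        s.erase(pos);
    }
    return s;
}
```

Both functions give the same result when a '#' is present; they differ only in the not-found case, where the unchecked one throws.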

I know the standards people are pretty smart, so I assume they did not come up with this out of nowhere. It must provide some elegance somewhere else that I am not seeing. What is the reason for this?

+6
5 answers

I'm not sure about this, but the original std::string (before the STL) was not required to store its data contiguously. Consequently, returning size() on failure could have added overhead (if the size was not stored). In C++11, strings are contiguous, and I agree with your criticism.

+2

Your question is actually twofold:

1) Why does std::string have its own find function, which returns a std::size_t instead of an iterator?

This is largely because std::string was developed separately from most of the rest of the standard library. Only in more recent standards has it been embraced by the other templates (e.g. iostreams). So when it was added to the standard, some functionality was added to it, but its original functionality was largely left as it was (the exception being the common copy-on-write implementation, which was prohibited in C++11). It was left that way mainly for backward compatibility.

As for why it was this way from the very beginning: the original string.h was a very thin wrapper around several C string functions. It was not uncommon to see strlen used to compute the return value of length(), or strcpy used in the copy constructor. Nothing required implementations to use these functions, so implementers began to do some interesting things (for example, copy-on-write and non-contiguous blocks of memory), but they left the interface the same to maintain backward compatibility. Although functions have been added to it, no public functions have been removed from the interface. You can therefore trace the design decision to use a position and a length for function parameters back to the days when it was just a wrapper around C functions.

2) How can you erase a sequence from a string without having to check the return value?

This can be done simply using the find-erase idiom, but without using the std::string find function:

 s.erase(std::find(s.begin(), s.end(), '#'), s.end()); 
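
A minimal sketch of why this line needs no check, wrapped in a hypothetical helper named `strip_comment_iter` (the name is not from the answer):

```cpp
#include <algorithm>
#include <string>

std::string strip_comment_iter(std::string s) {
    // std::find returns s.end() when '#' is absent, so the range
    // [s.end(), s.end()) is empty and erase removes nothing.
    s.erase(std::find(s.begin(), s.end(), '#'), s.end());
    return s;
}
```

The not-found result of std::find is itself a valid argument to the iterator overload of erase, which is what makes the composition safe.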
+8

Using std::string::npos makes the result a constant expression, unlike std::string::length(). Since npos is not suitable as an iterator anyway, there is value in it being a constant expression: for example, it can be used as the default for parameters that take a std::string::size_type.
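
As a rough illustration of the constant-expression point, the sketch below checks npos at compile time and uses it as a default in a free function. `tail_of` is a hypothetical helper, not part of the standard library.

```cpp
#include <string>

// npos is a static constant, so it is usable at compile time.
static_assert(std::string::npos == static_cast<std::string::size_type>(-1),
              "npos is the largest value of size_type");

// Hypothetical helper: `count` defaults to npos, meaning "to the end".
// A default such as `count = s.length()` would not compile, because a
// default argument may not refer to another parameter.
std::string tail_of(const std::string& s,
                    std::string::size_type pos,
                    std::string::size_type count = std::string::npos) {
    return s.substr(pos, count);
}
```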

Another reason is that the basic interface of std::basic_string was settled before the STL was added to the C++ standard library (or at least part of the interface already existed by then). The original interface was basically an immutable string, and I think it did not support mutating the string itself.

+5

If you like the behavior of std::find, you should use it, since std::string is a container:

 s.erase( std::find( s.begin(), s.end(), '#' ), s.end() ); 

Changing s.find() to return s.length() might make this particular case more elegant, but would cause other problems. I think the best solution would be to make std::string::erase() accept std::string::npos as its first parameter and do nothing in that case.
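
The standard erase does not behave this way, but the suggested semantics can be approximated with a small wrapper. This is only a sketch; `erase_from` is a hypothetical name.

```cpp
#include <string>

// Hypothetical helper: behaves like s.erase(pos), except that npos is
// accepted and leaves the string unchanged. Any other pos greater than
// s.size() still throws std::out_of_range, as the member function does.
std::string& erase_from(std::string& s, std::string::size_type pos) {
    if (pos != std::string::npos) {
        s.erase(pos);
    }
    return s;
}
```

With such a wrapper, `erase_from(s, s.find('#'))` is safe whether or not a '#' is present.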

0

The problem is that many std::basic_string member functions use default arguments, which makes them more convenient to use. For example, consider the following constructor:

 basic_string(const basic_string& str, size_type pos, size_type n = npos, const Allocator& a = Allocator()); 

What default argument could be specified for the third parameter, n? The C++ standard does not allow non-static members to be used as default arguments:

Similarly, a non-static member shall not be used in a default argument, even if it is not evaluated, unless it appears as the id-expression of a class member access expression (5.2.5) or unless it is used to form a pointer to member

Thus, npos is a more convenient default argument.
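
The rule quoted above can be seen in a toy class. This is a sketch only: `Text`, `slice`, and `data_` are hypothetical names, and the class merely mimics the way basic_string declares npos.

```cpp
#include <cstddef>
#include <string>
#include <utility>

class Text {
public:
    using size_type = std::size_t;
    static constexpr size_type npos = static_cast<size_type>(-1);

    explicit Text(std::string data) : data_(std::move(data)) {}

    // OK: npos is a static member, so it may be a default argument.
    // Writing `n = data_.size()` instead would be ill-formed, because
    // data_ is a non-static member.
    std::string slice(size_type pos, size_type n = npos) const {
        return data_.substr(pos, n);
    }

private:
    std::string data_;
};
```

A call such as `Text("hello#world").slice(6)` then takes everything from position 6 to the end, without the caller having to know the length.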

0

Source: https://habr.com/ru/post/958883/

