Why does .flat_map() with .chars() not work with std::io::Lines, but does work with a vector of strings?

I am trying to iterate over the characters in stdin. The Read::chars() method does this, but it is unstable. The obvious alternative is to use BufRead::lines() with flat_map to turn it into a character iterator.

This seems like it should work, but it does not. It fails with the error "borrowed value does not live long enough".

    use std::io::BufRead;

    fn main() {
        let stdin = std::io::stdin();
        let mut lines = stdin.lock().lines();
        let mut chars = lines.flat_map(|x| x.unwrap().chars());
    }

This is mentioned in "Read Rust File", but that does not really explain why.

What confuses me most is how this differs from the example in the documentation for flat_map, which uses flat_map to apply .chars() to a vector of strings. I don't see why my case should behave any differently. The main difference I can see is that my code also needs to call unwrap(), but changing the last line to the following does not work either:

    let mut chars = lines.map(|x| x.unwrap());
    let mut chars = chars.flat_map(|x| x.chars());

The error is reported on the second line, so the problem does not appear to be the unwrap.

Why does this last line fail when a very similar line in the documentation works? Is there any way to make this work?

1 answer

Start by figuring out what the type of the closure's argument is:

    let mut chars = lines.flat_map(|x| {
        // Deliberate type mismatch: the compiler error names the type of x.
        let () = x;
        x.unwrap().chars()
    });

This shows that it is a Result<String, std::io::Error>. After unwrapping it, it will be a String.
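If a deliberate compiler error is inconvenient, the item type can also be printed at run time. The following is only a sketch: `type_of` and `first_line_type` are hypothetical helper names, an in-memory `Cursor` stands in for stdin, and the exact text returned by `std::any::type_name` is not guaranteed to be stable between compiler versions.

```rust
use std::any::type_name;
use std::io::{BufRead, Cursor};

// Hypothetical helper: returns the compiler's name for the type of `_value`.
fn type_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

// Returns the type name of the first item yielded by `lines()`.
// A Cursor over a string literal is used here instead of stdin (an assumption
// made so the example runs without input).
fn first_line_type() -> &'static str {
    let input = Cursor::new("ab\ncd\n");
    let mut lines = input.lines();
    let first = lines.next().expect("input has at least one line");
    type_of(&first)
}

fn main() {
    // Prints some spelling of Result<String, std::io::Error>.
    println!("{}", first_line_type());
}
```

The printed name should mention both `Result` and `String`, which matches what the compiler error reports.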

Next, look at str::chars:

    fn chars(&self) -> Chars<'_>

And the definition of Chars:

    pub struct Chars<'a> {
        // some fields omitted
    }

From this we can tell that calling chars on a string returns an iterator that holds a reference to the string.

Whenever we have a reference, we know that the reference cannot outlive the value it borrows from. In this case, the result of x.unwrap() is the owner. The next thing to check is where that ownership ends. Here the closure owns the String, so at the end of the closure the value is dropped and any references to it become invalid.

Except that the code tried to return a Chars that still referred to the string. Oops. Thanks to Rust, the code did not segfault!

The difference from the example that works is all in the ownership. In that case, the strings are owned by a vector outside the closure, and they are not dropped before the iterator is consumed. Thus there are no lifetime issues.

What this code really wants is an into_chars method on String. Such an iterator could take ownership of the value and return its characters.
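A minimal sketch of the working case, assuming a vector of `String` values that outlives the iterator (the function name `all_chars` is hypothetical and only used for illustration):

```rust
// Collects every character of every string in `words`.
// The strings are only borrowed from the caller, so each Chars iterator
// refers to data that stays alive for the whole iteration.
fn all_chars(words: &[String]) -> Vec<char> {
    words.iter().flat_map(|s| s.chars()).collect()
}

fn main() {
    // `words` owns the strings and lives until the end of main.
    let words = vec![String::from("alpha"), String::from("beta")];
    let chars = all_chars(&words);
    println!("{:?}", chars);
}
```

Here the closure receives a `&String` rather than an owned `String`, so nothing is dropped when the closure returns.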


This is not maximally efficient, but it is a good start:

    // Owns the String and yields its characters one at a time.
    struct IntoChars {
        s: String,
        offset: usize, // byte offset of the next character in `s`
    }

    impl IntoChars {
        fn new(s: String) -> Self {
            IntoChars { s, offset: 0 }
        }
    }

    impl Iterator for IntoChars {
        type Item = char;

        fn next(&mut self) -> Option<Self::Item> {
            let remaining = &self.s[self.offset..];

            match remaining.chars().next() {
                Some(c) => {
                    // Advance by the encoded width of the character.
                    self.offset += c.len_utf8();
                    Some(c)
                }
                None => None,
            }
        }
    }

    use std::io::BufRead;

    fn main() {
        let stdin = std::io::stdin();
        let lines = stdin.lock().lines();
        let chars = lines.flat_map(|x| IntoChars::new(x.unwrap()));

        for c in chars {
            println!("{}", c);
        }
    }
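A simpler alternative, not part of the answer above, is to collect each line's characters into an owned `Vec<char>` inside the closure, so that nothing returned from the closure borrows from the dropped `String`. This is only a sketch: `chars_of` is a hypothetical name, the `Cursor` stands in for stdin, and it allocates one vector per line.

```rust
use std::io::{BufRead, Cursor};

// Yields every character of every line read from `reader`.
// Each line's characters are copied into a Vec<char>, which owns its data,
// so the temporary String can be dropped safely at the end of the closure.
// Like the original code, this panics on an I/O error because of unwrap().
fn chars_of<R: BufRead>(reader: R) -> impl Iterator<Item = char> {
    reader
        .lines()
        .flat_map(|line| line.unwrap().chars().collect::<Vec<char>>())
}

fn main() {
    // In-memory input used instead of stdin (an assumption for the example).
    let input = Cursor::new("ab\ncd\n");
    for c in chars_of(input) {
        println!("{}", c);
    }
}
```

As with the original code, `lines()` strips the line terminators, so newline characters are not part of the output.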

Source: https://habr.com/ru/post/1259075/

