Haskell: break a string into words at the first space

Please note that this is not the same as using the words function.

I would like to convert from this:

"The quick brown fox jumped over the lazy dogs." 

into this:

 ["The"," quick"," brown"," fox"," jumped"," over"," the"," lazy"," dogs."] 

Notice how the breaks occur at the first space after each word.

The best I could come up with is this:

    import Data.Char (isSpace)

    parts "" = []
    parts s  = if null a then (c ++ e) : parts f else a : parts b
      where
        (a, b) = break isSpace s
        (c, d) = span isSpace s
        (e, f) = break isSpace d

It just looks a little inelegant. Can anyone think of a better way to express this?

7 answers

Edit: Sorry, I misread the question. I hope this new answer does what you want.

    > import qualified Data.List as List
    > List.groupBy (\x y -> y /= ' ') "The quick brown fox jumped over the lazy dogs."
    ["The"," quick"," brown"," fox"," jumped"," over"," the"," lazy"," dogs."]

The library function groupBy takes a predicate that tells it whether to add the next element, y, to the current list, which starts with x, or to start a new list.

In this case, we don't care where the current list started; we only want to start a new list (i.e. make the predicate evaluate to False) when the next element, y, is a space.
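
To make the first-element semantics concrete, here is a minimal sketch using only Data.List from base. The name splitBeforeSpace is made up for illustration and is not part of the original answer. It shows the same predicate wrapped in a function, and what happens when two spaces appear in a row.

```haskell
import Data.List (groupBy)

-- Hypothetical helper name, used only for this illustration.
-- Data.List.groupBy passes the FIRST element of the current group as x,
-- so ignoring x means "keep extending the group until the next space".
splitBeforeSpace :: String -> [String]
splitBeforeSpace = groupBy (\_ y -> y /= ' ')

-- splitBeforeSpace "The quick brown" == ["The"," quick"," brown"]
-- splitBeforeSpace "a  b"            == ["a"," "," b"]
```

With single spaces the result matches the desired output, but a run of spaces is cut into separate pieces, because every space forces a new group regardless of what came before it.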

Edit

Nm points out that the handling of multiple spaces is incorrect. In that case, you can switch to Data.List.HT, which has the semantics you want.

    > import Data.List.HT as HT
    > HT.groupBy (\x y -> y /= ' ' || x == ' ') "a b c d"
    ["a"," b"," c"," d"]

What makes this work is a different semantics: here x is the last element of the previous list (the one to which y is either added or not, in which case y starts a new list).
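
For readers who would rather not pull in a dependency, the adjacent-element semantics can be sketched in a few lines of plain Haskell. This is an illustrative approximation, not the Data.List.HT implementation; the names groupAdjacentBy and partsAdj are hypothetical.

```haskell
-- Hypothetical helper: like groupBy, but the predicate is applied to
-- ADJACENT elements (x directly precedes y in the input).
groupAdjacentBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupAdjacentBy p = foldr step []
  where
    -- Join x to the group that follows it when the predicate holds
    -- for x and that group's first element; otherwise start a new group.
    step x ((y:ys):gs) | p x y = (x : y : ys) : gs
    step x gs                  = [x] : gs

-- Break before a space, unless the previous character was also a space.
partsAdj :: String -> [String]
partsAdj = groupAdjacentBy (\x y -> y /= ' ' || x == ' ')

-- partsAdj "a  b c" == ["a","  b"," c"]
```

Because the fold runs from the right and pattern-matches on the accumulated groups, this sketch is not suitable for infinite lists.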


If you are doing many different kinds of splits, take a look at the split package. It lets you define this split as split (onSublist [" "]).

    words2 xs = head w : (map (' ':) $ tail w)
      where w = words xs

And here it is with arrows and applicative functors (not recommended for practical use):

    import Control.Arrow ((>>>))

    words3 = words >>> (:) <$> head <*> (map (' ':) . tail)

EDIT: My first solution is wrong because it eats extra spaces. Here is the correct one:

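    -- The signature keeps the point-free definition from tripping over
    -- the monomorphism restriction when compiled.
    words4 :: String -> [String]
    words4 = foldr (\x acc -> if x == ' ' || head acc == "" || (head $ head acc) /= ' '
                                then (x : head acc) : tail acc
                                else [x] : acc) [""]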

Here is my take:

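    import Data.Char (isSpace)

    -- Like break, but the predicate looks at two adjacent elements.
    break2 :: (a -> a -> Bool) -> [a] -> ([a], [a])
    break2 f (x : xs@(y : ys)) = if f x y then ([x], xs) else (x : u, us)
      where (u, us) = break2 f xs
    break2 f xs = (xs, [])

    -- True at the boundary between a non-space and a following space.
    onSpace x y = not (isSpace x) && isSpace y

    words2 "" = []
    words2 xs = y : words2 ys
      where (y, ys) = break2 onSpace xs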
    parts xs = foldr spl [] xs
      where
        spl x   []                = [[x]]
        spl ' ' (xs:xss)          = (' ':xs):xss
        spl x   xss@((' ':_):_)   = [x]:xss
        spl x   (xs:xss)          = (x:xs):xss

I like the idea of the split package, but split (onSublist [" "]) does not do what I want, and I cannot find a way to make it split on one or more spaces.

I also like the solution using Data.List.HT, but I would like to avoid dependencies if possible.

The cleanest I can come up with:

    import Data.Char (isSpace)

    parts s
      | null s    = []
      | null a    = (c ++ e) : parts f
      | otherwise = a : parts b
      where
        (a, b) = break isSpace s
        (c, d) = span isSpace s
        (e, f) = break isSpace d
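
One possible further simplification, offered as a sketch rather than as part of the original answer: when a is non-empty, s starts with a non-space character, so c is empty, d equals s, e equals a, and f equals b. The two non-trivial guards therefore appear to compute the same thing, and the definition can be collapsed to a single case. The name parts' and the local variable names below are made up for illustration.

```haskell
import Data.Char (isSpace)

-- Hypothetical single-case variant: take any leading run of spaces,
-- then the word that follows it, and recurse on the remainder.
parts' :: String -> [String]
parts' "" = []
parts' s  = (spaces ++ word) : parts' rest
  where
    (spaces, s')   = span isSpace s
    (word,   rest) = break isSpace s'

-- parts' "a  b" == ["a","  b"]
-- parts' "a "   == ["a"," "]
```

As with the guarded version, trailing whitespace ends up as its own final element.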

Here it is. Enjoy! :D

    import Data.Char (isSpace)

    words' :: String -> [String]
    words' [] = []
    words' te@(x:xs)
      | x == ' ' || x == '\t' || x == '\n' = words' xs
      | otherwise                          = a : words' b
      where (a, b) = break isSpace te

Source: https://habr.com/ru/post/895111/