Combining an item with the next in the list if the condition is met

I use Stanford NLP to split text into sentences, but it does not handle contractions: they come out as separate tokens.

Here is an example of a resulting sentence:

List(I, 'd, like, to, fix, this, sentence, because, it, 's, broken)

My goal is to combine the pieces of each contraction so that the result looks like this:

List(I'd, like, to, fix, this, sentence, because, it's, broken)

Is there an elegant way to do this in Scala? Basically, I am looking for an expression that iterates through a list, checks each element against the one that follows, concatenates the two if a condition is true, and returns a list of results as in my example.

+4
3 answers
scala> val l = List("I", "'d", "like", "to fix", "this", "sentence", "because", "it", "'s", "broken")
l: List[String] = List(I, 'd, like, to fix, this, sentence, because, it, 's, broken)

scala> l.reduceRight({(s1,s2) => if (s2.startsWith("'")) s1+s2 else s1+" "+s2})
        .split(" ").toList
res2: List[String] = List(I'd, like, to, fix, this, sentence, because, it's, broken)

Note that this throws an exception if the list is empty (because of reduceRight). Use foldRight or reduceRightOption to handle that case.
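
The reduceRight version has two caveats: it fails on an empty list, and it joins everything into one space-separated string, which also splits a token such as "to fix". Folding directly into a list may sidestep both. A minimal sketch, assuming the same startsWith("'") condition; the object and method names are made up for illustration:

```scala
object ContractionFoldRight {
  // Hypothetical helper: merges every token that starts with an apostrophe
  // into the token immediately before it.
  def mergeContractions(tokens: List[String]): List[String] =
    tokens.foldRight(List.empty[String]) {
      // The token to the right starts with an apostrophe: glue it onto this one.
      case (token, next :: rest) if next.startsWith("'") => (token + next) :: rest
      // Otherwise keep the token as its own element (also covers the empty accumulator).
      case (token, acc) => token :: acc
    }
}
```

Because no string is joined and split again, "to fix" stays a single element, and an empty input simply yields an empty list.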

+2
val broken = List("I", "'d", "like", "to", "fix", "this", "sentence", "because", "it", "'s", "broken")
broken.foldLeft(List.empty[String]) { (list, str) => 
  if (str.startsWith("'")) {
    list.init :+ (list.last + str) 
  } else {
    list :+ str
  }
}

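(Note that this fails if the first element starts with "'", because list.init and list.last throw on an empty list.)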
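
The foldLeft above calls list.init and list.last, which throw when the accumulator is still empty, and :+ copies the list on every step. One possible variant keeps the accumulator in reverse order so that only the head is touched. This is a sketch under the same startsWith("'") assumption, with hypothetical names:

```scala
object ContractionFoldLeft {
  // Hypothetical helper: the accumulator holds the result in reverse order,
  // so its head is always the most recently emitted token.
  def mergeContractions(tokens: List[String]): List[String] = {
    val reversed = tokens.foldLeft(List.empty[String]) {
      // An apostrophe token with something before it: append to the previous token.
      case (prev :: rest, token) if token.startsWith("'") => (prev + token) :: rest
      // Anything else, including a leading apostrophe token, is kept as is.
      case (acc, token) => token :: acc
    }
    reversed.reverse
  }
}
```

A leading "'d" is left untouched here instead of raising an exception; whether that is the right behaviour depends on the data.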

+1

An approach that extends the accepted answer to handle cases such as ca, n't:

implicit class StanfordNLPConcat(val words: List[String]) extends AnyVal {
  def SNLPConcat() = {
    val sep = "#"
    words.reduce{ (a,v) => if (v.contains("'")) a+v else a+sep+v }.split(sep).toList
  }
}

Given

val words = List("I", "'d", "like", "to", "fix", "this", "sentence", "because", "it", "'s", "broken")

we get

words.SNLPConcat()
res:  List[String] = List(I'd, like, to, fix, this, sentence, because, it's, broken)

Likewise,

List("It", "ca", "n't", "be", "wrong").SNLPConcat()
res: List[String] = List(It, can't, be, wrong)
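
The separator-based version relies on "#" never appearing in a token, and reduce throws on an empty list. A separator-free sketch of the same idea follows, with the merge condition passed in as a function. The names PairwiseMerge, mergeWhere and attachToPrevious are assumptions for illustration, not part of Stanford NLP:

```scala
import scala.annotation.tailrec

object PairwiseMerge {
  // Hypothetical helper: glues a token onto the one before it whenever
  // attachToPrevious(token) holds. No separator string is involved.
  def mergeWhere(words: List[String])(attachToPrevious: String => Boolean): List[String] = {
    @tailrec
    def loop(remaining: List[String], acc: List[String]): List[String] =
      remaining match {
        // The next token should attach: merge the pair and look at it again,
        // so that a run of attachable tokens collapses into one.
        case a :: b :: tail if attachToPrevious(b) => loop((a + b) :: tail, acc)
        // Nothing to attach: emit the current token and move on.
        case a :: tail => loop(tail, a :: acc)
        // Input exhausted: the accumulator was built in reverse.
        case Nil => acc.reverse
      }
    loop(words, Nil)
  }
}
```

Only the incoming token is tested, never the already merged one, so "can't" is not glued onto "It". Called as PairwiseMerge.mergeWhere(words)(_.contains("'")), it applies the same condition as SNLPConcat.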
+1

Source: https://habr.com/ru/post/1541945/