I do not understand how to use pattern matching with two or more regular expressions. For example, I wrote the following two programs:
    import scala.io.Source.{fromInputStream}
    import java.io._
    import java.net._

    object craw {
      def main(args: Array[String]) {
        val url = new URL("http://contentexplore.com/iphone-6-amazing-looks/")
        val content = fromInputStream(url.openStream).getLines.mkString("\n")
        val x = "<a href=(\"[^\"]*\")[^<]".r
          .findAllIn(content)
          .toList
          .map(x => x.substring(16, x.length() - 2))
          .mkString("")
          .split("/")
          .mkString("")
          .split(".com")
          .mkString("")
          .split("www.")
          .mkString("")
          .split(".html")
          .toList
        print(x)
      }
    }
The program above reads the text from all anchor tags.
    import scala.io.Source.{fromInputStream}
    import java.io._
    import java.net._

    object new1 {
      def main(args: Array[String]) {
        val url = new URL("http://contentexplore.com/iphone-6-amazing-looks/")
        val content = fromInputStream(url.openStream).getLines.mkString("\n")
        val x = "<p>.*?</p>".r
          .findAllIn(content)
          .toList
          .map(x => x.substring(3, x.length() - 4))
          .mkString("")
          .split("</strong>")
          .mkString("")
          .split("</em>")
          .mkString("")
          .split(";")
          .mkString("")
          .split("<em>")
          .mkString("")
          .split("<strong>")
          .mkString("")
          .split(" ")
          .toList
        print(x)
      }
    }
The program above reads the text from all paragraph tags.
I want to combine these two regular expressions into one program using pattern matching. Can anyone advise how to use two or more regular expressions together?
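A minimal sketch of one way the two expressions could be combined, shown only to make the goal concrete. It runs on a small inline HTML string instead of the live page, and the patterns are simplified versions of the ones above, each with a single capture group. The names `Combine`, `Anchor`, `Paragraph`, `AnyTag`, `classify`, and `extract` are hypothetical, chosen for illustration.

```scala
import scala.util.matching.Regex

object Combine {
  // Simplified patterns (assumption: not identical to the ones in the programs above).
  // Each has exactly one capture group, so it can be used as an extractor in `match`.
  val Anchor: Regex    = """<a href="([^"]*)"[^>]*>""".r
  val Paragraph: Regex = """<p>(.*?)</p>""".r

  // A single alternation that finds either kind of tag in one pass over the input.
  val AnyTag: Regex = (Anchor.regex + "|" + Paragraph.regex).r

  // Pattern matching decides which regex a matched fragment belongs to.
  // A Regex extractor only succeeds when the whole string matches the pattern.
  def classify(tag: String): String = tag match {
    case Anchor(href)    => s"link: $href"
    case Paragraph(text) => s"text: $text"
    case _               => "unknown"
  }

  // Find every anchor or paragraph fragment, then classify each one.
  def extract(html: String): List[String] =
    AnyTag.findAllIn(html).toList.map(classify)

  def main(args: Array[String]): Unit = {
    val html =
      """<p>Intro</p><a href="http://example.com/x">x</a><p>More <em>text</em></p>"""
    extract(html).foreach(println)
    // prints:
    // text: Intro
    // link: http://example.com/x
    // text: More <em>text</em>
  }
}
```

In this sketch the alternation does the searching and the `match` block does the dispatching, so a third regular expression would mean one more alternative in `AnyTag` and one more `case` in `classify`.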
NOTE: This question is about combining regular expressions, not about how to parse HTML effectively.