Is there a modern natural language date parser for Java?

I have two problems, and I wonder if I can solve them in one go. I am trying to do natural language date parsing in Java (well, Scala) and have been using JChronic, a port of the excellent Chronic RubyGem.

There are two problems:

  • JChronic uses java.util.Calendar rather than Joda-Time, and I think it's pretty reasonable to say that Joda-Time is, or should be, the replacement for the JDK date libraries. And if Joda-Time doesn't replace them, JSR 310 surely will someday, once Oracle is done with its lawsuits and gets back to Java.

  • JChronic does not handle general date and time parsing. If I tell it to parse "next Thursday 4pm" or something like that, it handles it correctly and gives me a Calendar object for the right time. But if I just say "2011" or "January 1963", it can't cope: it has no notion of general date ranges or, to speak in Joda-Time terms, Partials.
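To make the second point concrete, here is a minimal sketch of the kind of result I mean. It uses java.time (the API that eventually came out of JSR 310) rather than Joda-Time, so it is only an illustration of "partial" values, not something JChronic or Joda-Time provides. The class name `PartialDates`, the `parse` method, and the three accepted input shapes are all hypothetical choices for this example.

```java
import java.time.LocalDate;
import java.time.Year;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.Temporal;
import java.util.Locale;

// Hypothetical helper: returns the most specific value the text supports
// (full date, year and month, or year only) instead of forcing an instant.
public final class PartialDates {
    private static final DateTimeFormatter DAY_MONTH_YEAR =
            DateTimeFormatter.ofPattern("d MMMM uuuu", Locale.ENGLISH);
    private static final DateTimeFormatter MONTH_YEAR =
            DateTimeFormatter.ofPattern("MMMM uuuu", Locale.ENGLISH);

    private PartialDates() {}

    public static Temporal parse(String text) {
        String s = text.trim();
        try {
            return LocalDate.parse(s, DAY_MONTH_YEAR);   // e.g. "1 January 1963"
        } catch (DateTimeParseException ignored) {
            // not a full date, try something coarser
        }
        try {
            return YearMonth.parse(s, MONTH_YEAR);       // e.g. "January 1963"
        } catch (DateTimeParseException ignored) {
            // not a month and year, try a bare year
        }
        return Year.parse(s);                            // e.g. "2011"; throws if nothing matched
    }

    public static void main(String[] args) {
        System.out.println(parse("2011"));           // 2011
        System.out.println(parse("January 1963"));   // 1963-01
        System.out.println(parse("1 January 1963")); // 1963-01-01
    }
}
```

The point is the return type: a year stays a year and a month stays a month, which is what matters for publication and copyright dates. A real parser would of course need to accept far more input shapes than these three.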

The second problem worries me much more than the first. I am trying to extract dates from web pages about documents (books, newspaper articles, web pages, etc.), where publication dates and copyright dates are important.

Currently, I feel resigned to either writing my own parser or porting the aging JChronic to Joda-Time and adding support for partials. Is there an alternative solution that would satisfy at least (2) and possibly (1)?

1 answer

You might want to check out SUTime, which is a temporal expression parser. Its sample code and online demo show support for partial dates. SUTime.Temporal objects should be able to provide you with the appropriate Joda-Time objects.


Source: https://habr.com/ru/post/1335158/

