How to use the HeidelTime temporal tagger in a Java project?

I would like to automatically detect the dates in the text of a document, and for that I would like to use the code from the open source HeidelTime project, available here ( https://code.google.com/p/heideltime/ ). I installed the HeidelTime kit (not the standalone version), and now I'm wondering how to reference it and call it from my Java project. I have already added a HeidelTime dependency to my pom.xml:

    <dependency>
        <groupId>de.unihd.dbs</groupId>
        <artifactId>heideltime</artifactId>
        <version>1.7</version>
    </dependency>

However, I am not sure how to call the classes of that project from my own project. Both projects use Maven. Can anyone who has used it before give me a suggestion or some advice? Many thanks!

3 answers

The heideltime-kit is itself a Maven project, so you can add it to your own project as a dependency. (In NetBeans, right-click Dependencies → Add Dependency → Open Projects → HeidelTime; make sure the HeidelTime project is open first.)

Then move the config.props file to your project's src/main/resources folder, and set the path to TreeTagger inside config.props.
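A wrong TreeTagger path usually shows up only once tagging starts, so it can help to check the setting up front. The following is a minimal sketch using only the JDK. It assumes the TreeTagger location is stored under the key treeTaggerHome; check that name against your own config.props. ConfigCheck and its methods are hypothetical helper names, not part of HeidelTime.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

final class ConfigCheck {
    // Assumed key name; verify it against your own config.props.
    static final String TREE_TAGGER_KEY = "treeTaggerHome";

    /** Reads the configured TreeTagger directory, or returns null if the key is absent. */
    static String treeTaggerHome(Reader configSource) {
        Properties props = new Properties();
        try {
            props.load(configSource);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String home = props.getProperty(TREE_TAGGER_KEY);
        // Properties keeps trailing whitespace in values, so trim it here.
        return home == null ? null : home.trim();
    }

    /** True only if the value is non-empty and names an existing directory. */
    static boolean isUsable(String home) {
        return home != null && !home.isEmpty() && Files.isDirectory(Paths.get(home));
    }
}
```

You could call treeTaggerHome(Files.newBufferedReader(Paths.get("src/main/resources/config.props"))) before constructing the tagger and fail early if isUsable returns false. Note that backslashes are escape characters in .props files, so Windows paths need forward slashes or doubled backslashes.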

As for using the classes, you want to create an instance of HeidelTimeStandalone (see de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.java), passing POSTagger.TREETAGGER as the posTagger parameter and the absolute path to your src/main/resources/config.props file as the configPath parameter. For instance,

 HeidelTimeStandalone heidelTime = new HeidelTimeStandalone(Language.ENGLISH, DocumentType.COLLOQUIAL, OutputType.TIMEML, "path/to/config.props", POSTagger.TREETAGGER, true);
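Since configPath is a filesystem path, a config.props that lives in src/main/resources stops being a plain file once the project is packaged into a JAR. One way around that, sketched here with JDK classes only, is to copy the classpath resource to a temporary file and pass that file's absolute path to the constructor. ConfigLocator and its method names are hypothetical helpers, not part of HeidelTime.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ConfigLocator {

    /** Copies the stream to a temporary file and returns that file's absolute path. */
    static String copyToTempFile(InputStream in) {
        try {
            Path tmp = Files.createTempFile("heideltime-config", ".props");
            tmp.toFile().deleteOnExit();
            // createTempFile already created the file, so it has to be overwritten.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp.toAbsolutePath().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Looks up a classpath resource such as "/config.props" and copies it out. */
    static String fromClasspath(String resourceName) {
        try (InputStream in = ConfigLocator.class.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found on classpath: " + resourceName);
            }
            return copyToTempFile(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this in place, the string returned by ConfigLocator.fromClasspath("/config.props") can be passed as the configPath argument instead of a hard-coded path.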

Then, to use HeidelTime for text processing, you can simply call the process function:

 String result = heidelTime.process(text, date); 
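With OutputType.TIMEML the result is an XML string in which each detected expression is wrapped in a TIMEX3 element. If you want the dates as Java objects rather than as markup, a minimal sketch with the JDK's DOM parser could look like the following. TimexExtractor and Timex are hypothetical names, and the sample string in main is hand-written for illustration, not real HeidelTime output.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TimexExtractor {

    /** Plain value holder for one TIMEX3 annotation. */
    static final class Timex {
        final String tid, type, value, text;

        Timex(String tid, String type, String value, String text) {
            this.tid = tid;
            this.type = type;
            this.value = value;
            this.text = text;
        }

        @Override
        public String toString() {
            return tid + " [" + type + "] " + value + " <- \"" + text + "\"";
        }
    }

    /** Collects all TIMEX3 elements from a TimeML string, in document order. */
    static List<Timex> extract(String timeml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The document may reference an external TimeML.dtd; resolve it to an
            // empty source so that parsing does not depend on that file being present.
            builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(timeml)));
            NodeList nodes = doc.getElementsByTagName("TIMEX3");
            List<Timex> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                result.add(new Timex(e.getAttribute("tid"), e.getAttribute("type"),
                        e.getAttribute("value"), e.getTextContent()));
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Input is not well-formed TimeML", ex);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample in the general TimeML shape (an assumption, for illustration only).
        String sample = "<?xml version=\"1.0\"?>\n<!DOCTYPE TimeML SYSTEM \"TimeML.dtd\">\n<TimeML>\n"
                + "The meeting on <TIMEX3 tid=\"t1\" type=\"DATE\" value=\"2015-02-03\">"
                + "February 3, 2015</TIMEX3> was moved.\n</TimeML>";
        for (Timex t : extract(sample)) {
            System.out.println(t);
        }
    }
}
```

In your own code you would call TimexExtractor.extract(result) on the string returned by process. The value attribute holds the normalized form of the expression, which is usually what you want for further date handling.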

This library is not yet in the Maven Central repository. (You can check this on search.maven.org.)

To use the library in your project, you must download the JAR file and install it into your local repository. See the answers to this question: How to add local jar files to a Maven project?

Then you can simply import the package and use its functionality in your project.


Adding to jgloves's answer: you might be interested in parsing the HeidelTime result string into Java objects. The following code converts the UIMA XMI representation into Timex3 objects.

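  HeidelTimeStandalone time = new HeidelTimeStandalone(Language.GERMAN, DocumentType.SCIENTIFIC,
          OutputType.XMI, "config.props", POSTagger.STANFORDPOSTAGGER);
  // Apply HeidelTime and get the UIMA XMI representation
  String xmiRepresentation = time.process(document, documentCreationTime);

  // jcasFactory is assumed to be set up with the HeidelTime type system
  JCas cas = jcasFactory.createJCas();
  // Load the XMI output into the CAS (org.apache.uima.cas.impl.XmiCasDeserializer);
  // without this step the annotation index stays empty. UTF-8 output is assumed.
  XmiCasDeserializer.deserialize(
          new ByteArrayInputStream(xmiRepresentation.getBytes(StandardCharsets.UTF_8)), cas.getCas());

  // Print every Timex3 annotation found in the document
  for (FSIterator<Annotation> it = cas.getAnnotationIndex(Timex3.type).iterator(); it.hasNext(); ) {
      System.out.println(it.next());
  }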

Source: https://habr.com/ru/post/1208510/

