How to conjugate English words in Java?

Say I have the base form of a word and a tag from the Penn Treebank tag set. How can I get the conjugated form? For example, given "do" and "VBN", how can I get "done"?

I assume this has already been implemented in some NLP library, so I would prefer not to reinvent the wheel. Is there such a library?

2 answers

If you have a class:

public class Treebank { public String conjugate(String base, String formTag) { ... } ... } 

Then:

 String conjugated = treebank.conjugate(base, formTag); 

If you don't have a Treebank class, it might look something like this:

 import java.io.InputStream;
 import java.util.HashMap;
 import java.util.Map;

 public class Treebank {
     // base form -> (Penn Treebank tag -> conjugated form)
     private Map<String, Map<String, String>> m_map = new HashMap<String, Map<String, String>>();

     public Treebank() { populate(); }

     // Returns the conjugated form, or null if the base/tag pair is unknown.
     public String conjugate(String base, String formTag) {
         Map<String, String> entry = m_map.get(base);
         return entry == null ? null : entry.get(formTag);
     }

     private void populate() {
         InputStream istream = openDataFile();
         try {
             for (Record record = readRecord(istream); record != null; record = readRecord(istream)) {
                 // Add the entry, creating the inner map on first use
                 Map<String, String> entry = m_map.get(record.base);
                 if (entry == null) {
                     entry = new HashMap<String, String>();
                     m_map.put(record.base, entry);
                 }
                 entry.put(record.formTag, record.conjugatedForm);
             }
         } finally {
             closeDataFile(istream);
         }
     }

     // Data management - to be implemented.
     private InputStream openDataFile() { ... }
     private Record readRecord(InputStream istream) { ... }
     private void closeDataFile(InputStream istream) { ... }

     private static class Record {
         String base;
         String formTag;
         String conjugatedForm;
     }
 }

A better solution might use a database instead of a data file. I would also refactor the data access code into a data access object.
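As a rough illustration of the data access object idea, here is a minimal sketch. The names `ConjugationDao`, `InMemoryConjugationDao`, and `DaoBackedTreebank` are hypothetical and not part of any library; the in-memory class only stands in for whatever file- or database-backed implementation you would actually write.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical data access interface; the name and method are assumptions. */
interface ConjugationDao {
    /** Returns the conjugated form, or null if the base/tag pair is unknown. */
    String findForm(String base, String formTag);
}

/** In-memory stand-in for a file- or database-backed implementation. */
final class InMemoryConjugationDao implements ConjugationDao {
    // Composite key "base<TAB>tag" -> conjugated form.
    private final Map<String, String> rows = new HashMap<>();

    void add(String base, String formTag, String form) {
        rows.put(base + "\t" + formTag, form);
    }

    @Override
    public String findForm(String base, String formTag) {
        return rows.get(base + "\t" + formTag);
    }
}

/** Lookup class that depends only on the interface, not on the storage. */
final class DaoBackedTreebank {
    private final ConjugationDao dao;

    DaoBackedTreebank(ConjugationDao dao) {
        this.dao = dao;
    }

    String conjugate(String base, String formTag) {
        return dao.findForm(base, formTag);
    }
}
```

With this split, swapping the data file for a database would only mean writing another `ConjugationDao` implementation; the `conjugate` caller would not change.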


You would need to build a sparse array (a two-key lookup table) containing the answers, indexed by the term itself as one key and the Penn Treebank tag (e.g. CC, TO, VBD) as the other.
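A two-key table like this could be sketched with nested maps. This is only one possible shape: `ConjugationTable` and its methods are hypothetical names, and the entries in `main` are hand-entered sample data, not a real lexicon.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical two-key table: (base form, Penn Treebank tag) -> inflected form. */
final class ConjugationTable {
    // Outer key: base form; inner key: tag. Only known pairs are stored,
    // so the table stays sparse.
    private final Map<String, Map<String, String>> forms = new HashMap<>();

    void put(String base, String tag, String inflected) {
        forms.computeIfAbsent(base, k -> new HashMap<>()).put(tag, inflected);
    }

    /** Empty if either the base form or the tag is missing. */
    Optional<String> lookup(String base, String tag) {
        Map<String, String> byTag =
                forms.getOrDefault(base, Collections.<String, String>emptyMap());
        return Optional.ofNullable(byTag.get(tag));
    }
}

final class ConjugationTableDemo {
    public static void main(String[] args) {
        ConjugationTable table = new ConjugationTable();
        // Sample data entered by hand for illustration only.
        table.put("do", "VBN", "done");
        table.put("do", "VBD", "did");

        System.out.println(table.lookup("do", "VBN").orElse("?")); // prints "done"
        System.out.println(table.lookup("go", "VBN").orElse("?")); // prints "?"
    }
}
```

Returning `Optional` makes the "no entry" case explicit, which matters here because most (term, tag) pairs will be absent from a sparse table.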


Source: https://habr.com/ru/post/1306547/

