Reading POS Tag Models in Android

I have done POS tagging with OpenNLP POS models in a regular Java application. Now I would like to do the same on the Android platform. I am not sure what Android's requirements or limitations are, because I cannot get the (binary) model to load and run the POS tagger correctly.

I tried loading the .bin file from external storage, and also tried adding it as an external library, but neither worked. This is my code:

    InputStream modelIn = null;
    POSModel model = null;
    String path = Environment.getExternalStorageDirectory().getPath() + "/TextSumIt/en-pos-maxent.bin";
    modelIn = new BufferedInputStream(new FileInputStream(path));
    model = new POSModel(modelIn);

The error I received is:

    11-15 06:39:35.072: W/System.err(565): opennlp.tools.util.InvalidFormatException: The profile data stream has an invalid format!
    11-15 06:39:35.177: W/System.err(565): at opennlp.tools.dictionary.serializer.DictionarySerializer.create(DictionarySerializer.java:224)
    11-15 06:39:35.177: W/System.err(565): at opennlp.tools.postag.POSDictionary.create(POSDictionary.java:282)
    11-15 06:39:35.182: W/System.err(565): at opennlp.tools.postag.POSModel$POSDictionarySerializer.create(POSModel.java:48)
    11-15 06:39:35.182: W/System.err(565): at opennlp.tools.postag.POSModel$POSDictionarySerializer.create(POSModel.java:44)
    11-15 06:39:35.182: W/System.err(565): at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:135)
    11-15 06:39:35.197: W/System.err(565): at opennlp.tools.postag.POSModel.<init>(POSModel.java:93)
    11-15 06:39:35.197: W/System.err(565): at com.main.textsumit.SummarizationActivity.postagWords(SummarizationActivity.java:676)
    11-15 06:39:35.205: W/System.err(565): at com.main.textsumit.SummarizationActivity.generateSummary(SummarizationActivity.java:252)
    11-15 06:39:35.205: W/System.err(565): at com.main.textsumit.SummarizationActivity.onCreate(SummarizationActivity.java:127)

Does this mean the model is not being read correctly? And how do I resolve this? Please help.

Thanks.

2 answers

For what it's worth, if this is still a problem: I had a similar issue when trying to use the POS model in a different context (not Android), and in my case it turned out that the extraction from the .bin file was failing, rather than anything in the model itself. The failure seems to be local to the tags.tagdict file in the archive (as suggested here at http://sharpnlp.codeplex.com/discussions/263620 ), so if you don't need it at present (and I haven't needed it for my simple cases), try deleting it from the archive. (But otherwise leave the archive intact, as it still has to be in zip'd form.)
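A minimal sketch of that workaround, in case a zip tool is not handy: it copies the model archive entry by entry and leaves out tags.tagdict, so the result is still a zip archive. The class name ModelEntryStripper and the method copyWithoutEntry are hypothetical names made up for this illustration; only the standard java.util.zip classes are used, and whether the stripped model then loads on your setup is something you would need to check yourself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of OpenNLP.
final class ModelEntryStripper {

    /**
     * Copies the zip archive at source to target, skipping the entry whose
     * name equals entryToSkip. Returns the number of entries written.
     */
    static int copyWithoutEntry(Path source, Path target, String entryToSkip) throws IOException {
        int copied = 0;
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(source));
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().equals(entryToSkip)) {
                    continue; // getNextEntry() skips the rest of this entry
                }
                // Fresh ZipEntry so sizes and CRC are recomputed on write.
                out.putNextEntry(new ZipEntry(entry.getName()));
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ModelEntryStripper <input.bin> <output.bin>");
            return;
        }
        // "tags.tagdict" is the entry named in the answer above.
        int kept = copyWithoutEntry(Paths.get(args[0]), Paths.get(args[1]), "tags.tagdict");
        System.out.println("Entries kept: " + kept);
    }
}
```

Run it on a desktop JVM against a copy of the model, then ship the output file to the device in place of the original .bin.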


Try this; it worked for me:

    System.setProperty("org.xml.sax.driver", "org.xmlpull.v1.sax2.Driver");
    try {
        AssetFileDescriptor fileDescriptor = context.getAssets().openFd("en_pos_maxent.bin");
        FileInputStream inputStream = fileDescriptor.createInputStream();
        POSModel posModel = new POSModel(inputStream);
        posTaggerME = new POSTaggerME(posModel);
    } catch (Exception e) {
        e.printStackTrace(); // report the failure instead of silently swallowing it
    }

Source: https://habr.com/ru/post/1446150/

