Here is a solution using Apache OpenNLP. The code:
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import opennlp.tools.namefind.NameFinderME;
    import opennlp.tools.namefind.TokenNameFinderModel;
    import opennlp.tools.tokenize.Tokenizer;
    import opennlp.tools.tokenize.TokenizerME;
    import opennlp.tools.tokenize.TokenizerModel;
    import opennlp.tools.util.Span;

    public class tikaOpenIntro {

        // Tokens produced by tokenization(); input for the name finders.
        public String Tokens[];

        public static void main(String[] args) {
            tikaOpenIntro toi = new tikaOpenIntro();
            String cnt = "John is planning to specialize in Electrical Engineering in UC Berkley and pursue a career with IBM.";
            toi.tokenization(cnt);
            String names = toi.namefind(toi.Tokens);
            String org = toi.orgfind(toi.Tokens);
            System.out.println("person name is : " + names);
            System.out.println("organization name is: " + org);
        }

        // Finds person names in the token array.
        public String namefind(String cnt[]) {
            return find("/home/rahul/opennlp/model/en-ner-person.bin", cnt);
        }

        // Finds organization names in the token array.
        public String orgfind(String cnt[]) {
            return find("/home/rahul/opennlp/model/en-ner-organization.bin", cnt);
        }

        // Loads the given NER model and returns the matched entities, one per line.
        private String find(String modelPath, String cnt[]) {
            StringBuilder fd = new StringBuilder();
            try (InputStream is = new FileInputStream(modelPath)) {
                NameFinderME nf = new NameFinderME(new TokenNameFinderModel(is));
                Span sp[] = nf.find(cnt);
                for (String a : Span.spansToStrings(sp, cnt)) {
                    fd.append(a).append("\n");
                }
            } catch (IOException e) {
                // Also covers FileNotFoundException and InvalidFormatException.
                e.printStackTrace();
            }
            return fd.toString();
        }

        // Splits the raw text into tokens and stores them in Tokens.
        public void tokenization(String tokens) {
            try (InputStream is = new FileInputStream("/home/rahul/opennlp/model/en-token.bin")) {
                Tokenizer tz = new TokenizerME(new TokenizerModel(is));
                Tokens = tz.tokenize(tokens);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
If you also want locations, load the location model as well. It is available alongside the other OpenNLP models on SourceForge, where you can download and use them.
I'm not sure how accurate the detection of names, locations and organizations will be, but it should recognize almost all of them.
If OpenNLP turns out not to be sufficient, use the Stanford Parser to recognize named entities.
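To make the span handling in the code above more concrete: a name finder returns spans of token indices (start inclusive, end exclusive), and those spans are then turned back into text by joining the covered tokens with spaces. The following is only a minimal sketch of that idea using the standard library. `SpanSketch` and `TokenSpan` are hypothetical stand-ins invented for illustration, not OpenNLP classes, and the spans in `main` are written by hand rather than produced by a model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanSketch {

    // Hypothetical stand-in for a name-finder span: token indices, end exclusive.
    public static final class TokenSpan {
        public final int start;
        public final int end;
        public final String type;

        public TokenSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Joins tokens[start..end) with single spaces, producing one string per span.
    public static List<String> spansToStrings(List<TokenSpan> spans, String[] tokens) {
        List<String> out = new ArrayList<>();
        for (TokenSpan s : spans) {
            out.add(String.join(" ", Arrays.copyOfRange(tokens, s.start, s.end)));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] tokens = {"John", "wants", "a", "career", "with", "IBM", "."};
        // Hand-written spans for illustration; a real finder would compute these.
        List<TokenSpan> spans = Arrays.asList(
                new TokenSpan(0, 1, "person"),
                new TokenSpan(5, 6, "organization"));
        List<String> names = spansToStrings(spans, tokens);
        for (int i = 0; i < spans.size(); i++) {
            System.out.println(spans.get(i).type + ": " + names.get(i));
        }
    }
}
```

Run as written, this prints `person: John` and `organization: IBM`. Keeping the type next to each span is one way to handle several entity kinds, such as persons, organizations and locations, in a single pass over the results.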