I am working on a project where I need to extract corporate bond information from unstructured messages. After some research, I found that machine learning can be used for this kind of information extraction. I tried the OpenNLP NER (Named Entity Recognizer), but I'm not sure it is the right library for this problem: I do get results, but they are incomplete.
Can someone please suggest a library or algorithm that I could use to analyze this text and extract the data? I plan to look into Naive Bayes, n-gram models, or support vector machines (SVM), but I'm not sure whether any of these will help. Please suggest.
Example 1:
[/] Trading 10mm ABC 2.5 19 05/06 mkt can use 50mm ---> here I want to extract "ABC 2.5 19"
Example 2:
XYZ 6.5 15 10-2B 106-107 B3 AAA- 1.646MM 2x2 ---> here I want to extract "XYZ 6.5 15"
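In both examples the part I want looks like an issuer ticker, a coupon, and a two-digit maturity year. As a baseline to compare any trained model against, here is a minimal rule-based sketch in Python. It assumes that pattern always holds (2 to 5 uppercase letters, a decimal coupon, a two-digit year), which is only a guess from these two samples; the function name `extract_bond` and the regex are my own illustration, not part of OpenNLP or any other library.

```python
import re

# Assumed shape of a bond description, inferred from the two samples only:
#   ticker  : 2-5 uppercase letters          (e.g. ABC, XYZ)
#   coupon  : number with optional decimals  (e.g. 2.5, 6.5)
#   maturity: two-digit year                 (e.g. 19, 15)
BOND_PATTERN = re.compile(
    r"\b([A-Z]{2,5})\s+(\d{1,2}(?:\.\d{1,3})?)\s+(\d{2})\b"
)


def extract_bond(message):
    """Return the first 'TICKER COUPON YEAR' found in message, or None."""
    match = BOND_PATTERN.search(message)
    if match is None:
        return None
    ticker, coupon, year = match.groups()
    # Normalize whitespace so the result is always single-space separated.
    return f"{ticker} {coupon} {year}"


if __name__ == "__main__":
    samples = [
        "[/] Trading 10mm ABC 2.5 19 05/06 mkt can use 50mm",
        "XYZ 6.5 15 10-2B 106-107 B3 AAA- 1.646MM 2x2",
    ]
    for text in samples:
        print(extract_bond(text))
```

This will likely break on messages that write the coupon as a fraction, use a four-digit year, or mention several bonds, which is why I am asking about a learned approach.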