How to extract corporate bond information using machine learning

I am working on a project where I need to extract corporate bond information from unstructured emails. After a lot of research, I found that machine learning can be used for information extraction. I tried the OpenNLP NER (Named Entity Recognizer), but I'm not sure I picked the right library for this problem: I get results, but they are not good enough.

Can someone please suggest a library or algorithm I could use to analyze and extract this data? I plan to look into Naive Bayes, n-grams, or support vector machines, but I'm not sure whether any of them will help. Please advise.

Example 1:

[/] Trading 10mm ABC 2.5 19 05/06 mkt can use 50mm ---> here I want to extract "ABC 2.5 19"

Example 2:

XYZ 6.5 15 10-2B 106-107 B3 AAA- 1.646MM 2x2 ---> here I want to extract "XYZ 6.5 15"

1 answer

In Perl, you can use Marpa::R2, a general BNF parser.

This gist extracts information from your examples.
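The gist is not reproduced here. As a rough illustration of the same rule-based idea, below is a minimal Python sketch that uses a regular expression rather than the Marpa::R2 grammar. It assumes that a bond is always written as an uppercase ticker, a coupon, and a two-digit maturity year separated by whitespace, which is inferred only from the two examples in the question. The names `BOND_RE` and `extract_bonds` are hypothetical.

```python
import re

# Assumed layout (inferred from the two examples only):
#   TICKER  COUPON  YY      e.g. "ABC 2.5 19"
BOND_RE = re.compile(
    r"\b(?P<ticker>[A-Z]{2,5})\s+"           # 2-5 uppercase letters
    r"(?P<coupon>\d{1,2}(?:\.\d{1,3})?)\s+"  # coupon such as 2.5 or 6
    r"(?P<year>\d{2})\b"                     # two-digit maturity year
)


def extract_bonds(text):
    """Return every 'TICKER COUPON YY' match found in text."""
    return [
        " ".join(m.group("ticker", "coupon", "year"))
        for m in BOND_RE.finditer(text)
    ]


if __name__ == "__main__":
    print(extract_bonds("[/] Trading 10mm ABC 2.5 19 05/06 mkt can use 50mm"))
    print(extract_bonds("XYZ 6.5 15 10-2B 106-107 B3 AAA- 1.646MM 2x2"))
```

A pattern like this will probably break on messages that use a different layout (fractional coupons, four-digit years, lowercase tickers), which is where a proper grammar or a trained NER model becomes more useful.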

Hope this helps.


Source: https://habr.com/ru/post/1777710/

