How to create a CRF++ template file?

I am new to CRF++. I am teaching myself by reading its guide: http://crfpp.googlecode.com/svn/trunk/doc/index.html?source=navbar#templ

And I do not understand what this means:

This is a template to describe unigram features. When you give a template "U01:%x[0,1]", CRF++ automatically generates a set of feature functions (func1 ... funcN) like:

    func1 = if (output = B-NP and feature="U01:DT") return 1 else return 0
    func2 = if (output = I-NP and feature="U01:DT") return 1 else return 0
    func3 = if (output = O and feature="U01:DT") return 1 else return 0
    ....
    funcXX = if (output = B-NP and feature="U01:NN") return 1 else return 0
    funcXY = if (output = O and feature="U01:NN") return 1 else return 0

The number of feature functions generated by a template amounts to (L * N), where L is the number of output classes and N is the number of unique strings expanded from the given template.

Why are there so many unigram feature functions, and what do they mean?

2 answers

After looking at the documentation long enough, I think I understand it.

Let's take the example from the documentation, where the input data is:

    He        PRP  B-NP
    reckons   VBZ  B-VP
    the       DT   B-NP
    current   JJ   I-NP
    account   NN   I-NP

and the feature template (in the format %x[row,col], where row is relative to the current position) is %x[0,1].

When %x[0,1] is expanded, depending on the current token, it can yield one of the strings in the set [PRP, VBZ, DT, JJ, NN] (i.e. one of the unique strings from column 1, where the leftmost column is column 0). For each of these strings, CRF++ creates a set of feature functions of the form (looking at the 3rd row of the input data):

    func1 = if (output = B-NP and feature="U01:DT") return 1 else return 0
    func2 = if (output = I-NP and feature="U01:DT") return 1 else return 0
    func3 = if (output = O and feature="U01:DT") return 1 else return 0
    ...

where that particular string (DT in the code above) is paired with each individual output class.

So, if the output classes are [B-NP, I-NP, O], the feature template expands into feature functions that look like this:

    # row 1 (He, PRP, B-NP)
    func1 = if (output = B-NP and feature="U01:PRP") return 1 else return 0
    func2 = if (output = I-NP and feature="U01:PRP") return 1 else return 0
    func3 = if (output = O and feature="U01:PRP") return 1 else return 0
    # row 2 (reckons, VBZ, B-VP)
    func4 = if (output = B-NP and feature="U01:VBZ") return 1 else return 0
    func5 = if (output = I-NP and feature="U01:VBZ") return 1 else return 0
    func6 = if (output = O and feature="U01:VBZ") return 1 else return 0
    # row 3 (the, DT, B-NP)
    func7 = if (output = B-NP and feature="U01:DT") return 1 else return 0
    func8 = if (output = I-NP and feature="U01:DT") return 1 else return 0
    func9 = if (output = O and feature="U01:DT") return 1 else return 0
    # row 4 (current, JJ, I-NP)
    func10 = if (output = B-NP and feature="U01:JJ") return 1 else return 0
    func11 = if (output = I-NP and feature="U01:JJ") return 1 else return 0
    func12 = if (output = O and feature="U01:JJ") return 1 else return 0
    # row 5 (account, NN, I-NP)
    func13 = if (output = B-NP and feature="U01:NN") return 1 else return 0
    func14 = if (output = I-NP and feature="U01:NN") return 1 else return 0
    func15 = if (output = O and feature="U01:NN") return 1 else return 0

Regarding the part where the documentation says:

The number of feature functions generated by a template amounts to (L * N), where L is the number of output classes and N is the number of unique strings expanded from the given template.

In this case, L will be 3, and N will be 5.
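
The count can be checked with a small sketch. This is not CRF++ code, only an illustration in Python of the expansion described above; the names ROWS, LABELS, unigram_strings and make_feature_functions are made up for this example, and the label set is the three-class one assumed in this answer.

```python
from itertools import product

# Rows from the documentation example: (word, POS tag, chunk label).
ROWS = [
    ("He", "PRP", "B-NP"),
    ("reckons", "VBZ", "B-VP"),
    ("the", "DT", "B-NP"),
    ("current", "JJ", "I-NP"),
    ("account", "NN", "I-NP"),
]

# Output classes as assumed in the answer (hypothetical subset).
LABELS = ["B-NP", "I-NP", "O"]


def unigram_strings(rows, col, prefix="U01"):
    """Expand %x[0,col] at every position; keep unique strings in order."""
    seen = []
    for row in rows:
        s = f"{prefix}:{row[col]}"
        if s not in seen:
            seen.append(s)
    return seen


def make_feature_functions(labels, strings):
    """Build one indicator function per (expanded string, label) pair."""
    funcs = []
    for s, label in product(strings, labels):
        # Default arguments bind the current label and string to this function.
        def f(output, feature, _label=label, _s=s):
            return 1 if (output == _label and feature == _s) else 0
        funcs.append(((label, s), f))
    return funcs


strings = unigram_strings(ROWS, col=1)           # N unique strings
funcs = make_feature_functions(LABELS, strings)  # L * N functions
print(len(LABELS), len(strings), len(funcs))     # prints: 3 5 15
```

Here funcs[6] plays the role of func7 above: it returns 1 only when the output is B-NP and the feature string is "U01:DT".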


For a specific template %x[i,j], i is the row offset relative to the current position and j is the feature (column) that you want to use. Given the data:

    He        PRP  B-NP
    reckons   VBZ  B-VP
    the       DT   B-NP
    current   JJ   I-NP  << CURRENT TOKEN
    account   NN   I-NP

%x[0,1] refers to the token at offset 0 from the current word: its POS tag is JJ, and its output tag is I-NP.

Move the pointer to the next token, and %x[0,1] → POS tag = NN, output tag = I-NP.

Each feature function corresponds to one pairing of a possible output tag with a possible value of the expanded feature (here, the POS tag of the current word).
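
The row/column lookup can be sketched in a few lines of Python. This is only an illustration, not the CRF++ implementation: the expand helper, the MACRO regular expression, and the "<PAD>" placeholder for rows outside the sentence are all assumptions made for this example.

```python
import re

# Rows from the documentation example: (word, POS tag, chunk label).
ROWS = [
    ("He", "PRP", "B-NP"),
    ("reckons", "VBZ", "B-VP"),
    ("the", "DT", "B-NP"),
    ("current", "JJ", "I-NP"),
    ("account", "NN", "I-NP"),
]

# Matches a %x[row,col] macro; row may be negative, col is a column index.
MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")


def expand(template, rows, pos, pad="<PAD>"):
    """Replace every %x[row,col] in template, relative to position pos."""
    def lookup(match):
        row = pos + int(match.group(1))  # absolute row = current + offset
        col = int(match.group(2))
        if 0 <= row < len(rows):
            return rows[row][col]
        return pad  # hypothetical placeholder for out-of-range rows
    return MACRO.sub(lookup, template)


current = 3  # the token "current" (0-based row index)
print(expand("U01:%x[0,1]", ROWS, current))           # U01:JJ
print(expand("U01:%x[0,1]", ROWS, current + 1))       # U01:NN
print(expand("U02:%x[-1,1]/%x[0,1]", ROWS, current))  # U02:DT/JJ
```

The last call shows a template that combines two macros, so the resulting string depends on both the previous and the current POS tag.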

update:

I think the explanation above is pretty straightforward provided that you understand the CRF model well.

Link to CRF Model

CRF++ is a replication of Sha and Pereira (2003).


Source: https://habr.com/ru/post/974283/

