Laplace Smoothing in Biopython

I am trying to add Laplace smoothing support to the Biopython Naive Bayes code for my bioinformatics project.

I have read a lot of documents about the Naive Bayes algorithm and Laplacian smoothing, and I think I have the basic idea, but I can't work out how to integrate it into this code (in fact, I can't see where I should add the 1 for Laplace smoothing).

I am not familiar with Python and I am a newbie coder. I would appreciate it if anyone who is familiar with Biopython could give me some suggestions.

1 answer

Try using this definition of the _contents() function instead:

def _contents(items, laplace=False):
    # count occurrences of values
    counts = {}
    for item in items:
        counts[item] = counts.get(item,0) + 1.0
    # normalize
    for k in counts:
        if laplace:
            counts[k] += 1.0
            counts[k] /= (len(items)+len(counts))
        else:
            counts[k] /= len(items)
    return counts

Then change the call on line 194 to:

# Estimate P(value|class,dim)
nb.p_conditional[i][j] = _contents(values, True)

Pass True to enable the smoothing and False to disable it.
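
For reference, the two branches of _contents() correspond to the usual relative-frequency estimate and the add-one (Laplace) estimate. The notation below is introduced here only for illustration and does not appear in the Biopython source: n(v) is the number of times value v occurs in items, N is len(items), and K is len(counts), that is, the number of distinct values actually observed in items.

```latex
\[
P_{\text{plain}}(v) = \frac{n(v)}{N},
\qquad
P_{\text{Laplace}}(v) = \frac{n(v) + 1}{N + K}
\]
```

For example, with N = 5, K = 2 and n(Red) = 2, the estimate for Red moves from 2/5 to 3/7.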

Here is the output without and with the smoothing:

# without
>>> carmodel.p_conditional
[[{'Red': 0.40000000000000002, 'Yellow': 0.59999999999999998},
  {'SUV': 0.59999999999999998, 'Sports': 0.40000000000000002},
  {'Domestic': 0.59999999999999998, 'Imported': 0.40000000000000002}],
 [{'Red': 0.59999999999999998, 'Yellow': 0.40000000000000002},
  {'SUV': 0.20000000000000001, 'Sports': 0.80000000000000004},
  {'Domestic': 0.40000000000000002, 'Imported': 0.59999999999999998}]]

# with
>>> carmodel.p_conditional
[[{'Red': 0.42857142857142855, 'Yellow': 0.5714285714285714},
  {'SUV': 0.5714285714285714, 'Sports': 0.42857142857142855},
  {'Domestic': 0.5714285714285714, 'Imported': 0.42857142857142855}],
 [{'Red': 0.5714285714285714, 'Yellow': 0.42857142857142855},
  {'SUV': 0.2857142857142857, 'Sports': 0.7142857142857143},
  {'Domestic': 0.42857142857142855, 'Imported': 0.5714285714285714}]]
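
The first entry of each listing can be checked by hand. The sketch below is a standalone rewrite of the counting logic, not the Biopython code itself: the function name smoothed_frequencies and the five-row colors list are made up for illustration, with the list chosen only so that the unsmoothed proportions come out as 0.4 and 0.6.

```python
def smoothed_frequencies(items, laplace=False):
    """Relative frequency of each distinct value, optionally with add-one smoothing."""
    # Count occurrences of each value.
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1.0

    total = len(items)      # N: number of observations
    distinct = len(counts)  # K: number of distinct values seen in items

    if laplace:
        # Add one to every count and K to the denominator.
        return {k: (c + 1.0) / (total + distinct) for k, c in counts.items()}
    return {k: c / total for k, c in counts.items()}


# Hypothetical column of one feature for the rows of a single class.
colors = ["Red", "Red", "Yellow", "Yellow", "Yellow"]

plain = smoothed_frequencies(colors)                 # Red = 2/5, Yellow = 3/5
smooth = smoothed_frequencies(colors, laplace=True)  # Red = 3/7, Yellow = 4/7

print(plain)
print(smooth)
```

Note that K here counts only the values present in items, so a value that never occurs in this subset still gets no entry in the result.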



Source: https://habr.com/ru/post/1771220/

