Letter-level Markov chain for generating random text

I would like to generate random text using letter frequencies from a book stored in a .txt file, so that each new character (drawn from string.lowercase + ' ') depends on the previous one.

How can I use Markov chains for this? Or is it easier to use 27 arrays with conditional frequencies for each letter?

2 answers

I would like to create random text using letter frequencies from a book in a txt file

Consider using collections.Counter to tally frequencies as you step through the text file two letters at a time.
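As a minimal sketch of that idea (the sample string and the names `count_pairs` and `pair_counts` are illustrative assumptions, not part of the original answer), zipping the text with itself shifted by one character yields every adjacent pair, and `Counter` tallies them:

```python
from collections import Counter

ALPHABET = set('abcdefghijklmnopqrstuvwxyz ')

def count_pairs(text):
    """Tally adjacent character pairs, skipping any pair that
    contains a character outside ALPHABET."""
    text = text.lower()
    return Counter(
        (a, b)
        for a, b in zip(text, text[1:])
        if a in ALPHABET and b in ALPHABET
    )

pair_counts = count_pairs('the theme')
print(pair_counts[('t', 'h')])  # 2
print(pair_counts[('e', 'm')])  # 1
```

For a whole book, the same function could be fed the contents of the file read into a string.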

How can I use Markov chains for this? Or is it easier to use 27 arrays with conditional frequencies for each letter?

The two statements are equivalent. The Markov chain is what you are building; the 27 arrays of conditional frequencies are how you build it.
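To make the equivalence concrete, here is a rough sketch of the array-based view. The names `build_table` and `next_char` are hypothetical, and the uniform fallback for characters with no recorded successor is an assumption rather than something the question specifies. Each of the 27 rows holds the conditional frequencies for one preceding character, and one step of the Markov chain is a weighted draw from the matching row:

```python
import random

ALPHABET = 'abcdefghijklmnopqrstuvwxyz '
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def build_table(text):
    """Return 27 arrays of 27 counts:
    table[i][j] = number of times ALPHABET[j] followed ALPHABET[i]."""
    table = [[0] * len(ALPHABET) for _ in ALPHABET]
    text = text.lower()
    for prev, curr in zip(text, text[1:]):
        if prev in INDEX and curr in INDEX:
            table[INDEX[prev]][INDEX[curr]] += 1
    return table

def next_char(table, prev):
    """One Markov step: draw the next character from the row for `prev`."""
    row = table[INDEX[prev]]
    if sum(row) == 0:
        # Assumption: fall back to a uniform draw when `prev`
        # was never followed by a valid character in the text.
        return random.choice(ALPHABET)
    return random.choices(ALPHABET, weights=row)[0]

table = build_table('the quick brown fox')
print(next_char(table, 'q'))  # 'u' is the only character that follows 'q' here
```

A list of lists keeps the memory layout fixed at 27 x 27 regardless of the text, whereas a dictionary only stores the pairs that actually occur.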

Here is the dictionary-based code to get you started:

    from collections import defaultdict, Counter
    from random import choice, randrange

    def pairwise(iterable):
        # Yield overlapping pairs: (s0, s1), (s1, s2), ...
        it = iter(iterable)
        last = next(it, None)  # None if the input is empty; the loop below then yields nothing
        for curr in it:
            yield last, curr
            last = curr

    valid = set('abcdefghijklmnopqrstuvwxyz ')

    def valid_pair(pair):
        last, curr = pair
        return last in valid and curr in valid

    def make_markov(text):
        # markov[p][q] counts how often character q follows character p
        markov = defaultdict(Counter)
        lowercased = (c.lower() for c in text)
        for p, q in filter(valid_pair, pairwise(lowercased)):
            markov[p][q] += 1
        return markov

    def genrandom(model, n):
        curr = choice(list(model))
        for i in range(n):
            yield curr
            if curr not in model:
                # handle case where there is no known successor
                curr = choice(list(model))
            d = model[curr]
            # pick a successor with probability proportional to its count
            target = randrange(sum(d.values()))
            cumulative = 0
            for curr, cnt in d.items():
                cumulative += cnt
                if cumulative > target:
                    break

    model = make_markov('The qui_.ck brown fox')
    print(''.join(genrandom(model, 20)))

If each character depends only on the previous character, you can simply calculate the probabilities for all 27^2 = 729 pairs of characters.
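One way this could look in practice, as a sketch only: the function name `pair_probabilities` and the sample string are assumptions, and rows for characters that never appear as a predecessor are simply left at zero. Pair counts are divided by the number of times the first character starts a pair, which gives an estimate of P(next | previous) for every one of the 729 pairs:

```python
from collections import Counter
from itertools import product

ALPHABET = 'abcdefghijklmnopqrstuvwxyz '

def pair_probabilities(text):
    """Map each of the 27 * 27 pairs (prev, curr) to an estimate of P(curr | prev)."""
    text = text.lower()
    counts = Counter(
        (a, b)
        for a, b in zip(text, text[1:])
        if a in ALPHABET and b in ALPHABET
    )
    # How many times each character appears as the first element of a pair.
    totals = Counter()
    for (a, _), n in counts.items():
        totals[a] += n
    return {
        (a, b): counts[(a, b)] / totals[a] if totals[a] else 0.0
        for a, b in product(ALPHABET, repeat=2)
    }

probs = pair_probabilities('banana band')
print(len(probs))          # 729
print(probs[('a', 'n')])   # 0.75
print(probs[('b', 'a')])   # 1.0
```

In the sample string, 'a' is followed by 'n' three times and by a space once, hence 0.75.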


Source: https://habr.com/ru/post/1388356/

