Markov chain on alphabetic scale and random text

Question

I would like to generate random text using letter frequencies from a book in a .txt file, so that each new character (drawn from string.lowercase + ' ') depends on the previous one.

How can I use Markov chains for this? Or is it easier to use 27 arrays of conditional frequencies, one for each letter?

+4

python markov-chains

Julia Dec 28 '11 at 18:53

2 answers

If each character depends only on the previous character, you can simply calculate the probabilities for all 27^2 = 729 pairs of characters.
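As a rough sketch of that idea, the pair counts can be kept in a 27 x 27 table and each row normalized into conditional probabilities. The names ALPHABET, INDEX and transition_table, and the sample sentence, are illustrative assumptions rather than anything from the answer itself.

```python
ALPHABET = 'abcdefghijklmnopqrstuvwxyz '
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def transition_table(text):
    """Return a 27 x 27 table; table[i][j] estimates P(next = j | current = i)."""
    counts = [[0] * len(ALPHABET) for _ in ALPHABET]
    text = text.lower()
    # Count every adjacent pair whose two characters are both in the alphabet.
    for prev, curr in zip(text, text[1:]):
        if prev in INDEX and curr in INDEX:
            counts[INDEX[prev]][INDEX[curr]] += 1
    table = []
    for row in counts:
        total = sum(row)
        # Rows for characters that never had a successor stay all zero.
        table.append([c / total if total else 0.0 for c in row])
    return table

table = transition_table('the cat sat on the mat')
print(table[INDEX['t']][INDEX['h']])   # 't' is followed by 'h' in 2 of 4 cases: 0.5
```

Each row of the table is one of the "27 arrays" from the question: the conditional frequencies of the next character given the current one.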

+1

tlehman Dec 28 '11 at 19:01

Raymond Hettinger · Accepted Answer · 2011-12-28T19:02:08+0000

I would like to create random text using letter frequencies from a book in a txt file

Consider using collections.Counter to build up the frequencies as you loop through the text file two letters at a time.
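A minimal sketch of that counting step, assuming the book has already been read into a string (the short sample text below is only a stand-in for the file contents):

```python
from collections import Counter

text = 'the theory of the thing'   # stand-in for the contents of the book file

# zip(text, text[1:]) walks the text two letters at a time:
# ('t', 'h'), ('h', 'e'), ('e', ' '), ...
pair_counts = Counter(zip(text, text[1:]))

print(pair_counts[('t', 'h')])   # 4
print(pair_counts[('h', 'e')])   # 3
```

Pairs that never occur simply report a count of 0, so no special handling is needed for unseen combinations.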

How to use Markov chains for this? Or is it easier to use 27 arrays with conditional frequencies for each letter?

The two statements are equivalent. The Markov chain is what you are doing; the 27 arrays of conditional frequencies are how you do it.

Here is the dictionary-based code to get you started:

    from collections import defaultdict, Counter
    from random import choice, randrange

    def pairwise(iterable):
        # Yield overlapping pairs: (s0, s1), (s1, s2), ...
        it = iter(iterable)
        try:
            last = next(it)
        except StopIteration:
            return
        for curr in it:
            yield last, curr
            last = curr

    valid = set('abcdefghijklmnopqrstuvwxyz ')

    def valid_pair(pair):
        last, curr = pair
        return last in valid and curr in valid

    def make_markov(text):
        # markov[p][q] counts how often character q follows character p
        markov = defaultdict(Counter)
        lowercased = (c.lower() for c in text)
        for p, q in filter(valid_pair, pairwise(lowercased)):
            markov[p][q] += 1
        return markov

    def genrandom(model, n):
        curr = choice(list(model))
        for i in range(n):
            yield curr
            if curr not in model:   # handle case where there is no known successor
                curr = choice(list(model))
            d = model[curr]
            # Pick the next character with probability proportional to its count
            target = randrange(sum(d.values()))
            cumulative = 0
            for curr, cnt in d.items():
                cumulative += cnt
                if cumulative > target:
                    break

    model = make_markov('The qui_.ck brown fox')
    print(''.join(genrandom(model, 20)))
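On Python 3.6 or later, the cumulative-sum loop in genrandom could instead be written with random.choices, which accepts per-item weights. The sketch below is one possible variant, not the answer's own code; next_char and the hand-built model are hypothetical names used only for illustration.

```python
from collections import Counter
from random import choices

def next_char(model, curr):
    """Pick a successor of curr with probability proportional to its count."""
    successors = model[curr]
    letters = list(successors)
    weights = list(successors.values())
    # choices() returns a list of k samples; k defaults to 1.
    return choices(letters, weights=weights)[0]

# Tiny hand-built model: 'q' is always followed by 'u';
# after 'a', 'b' is three times as likely as 'c'.
model = {'q': Counter({'u': 5}), 'a': Counter({'b': 3, 'c': 1})}

print(next_char(model, 'q'))   # always 'u'
print(next_char(model, 'a'))   # 'b' or 'c'
```

The behavior is the same as the explicit loop; the trade-off is that the letter and weight lists are rebuilt on every call, which only matters for very long outputs.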
