I am working on a random text generator (without using Markov chains), and it currently runs without problems. First, here is the flow of my code:
1. Enter a sentence as input (this is called the trigger sentence and is assigned to a variable).
2. Get the longest word in the trigger sentence.
3. Search the entire Project Gutenberg database for sentences containing this word, regardless of uppercase or lowercase.
4. Return the longest sentence that contains the word I spoke about in step 3.
5. Append the sentences from step 1 and step 4 together.
6. Assign the sentence from step 4 as the new trigger sentence and repeat the process. Note that I then need to get the longest word in the second sentence, and so on.
And here is my code:
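import nltk
from nltk.corpus import gutenberg
from random import choice

triggerSentence = raw_input("Please enter the trigger sentence: ")  # get the input string

longestLength = 0
longestString = ""
listOfSents = gutenberg.sents()  # all Gutenberg sentences (list of lists of tokens)
listOfWords = gutenberg.words()  # all Gutenberg words (flat list of tokens)

while triggerSentence:
    split_str = triggerSentence.split()  # split the sentence into words

    # find the longest word in the trigger sentence
    for piece in split_str:
        if len(piece) > longestLength:
            longestString = piece
            longestLength = len(piece)

    # collect the sentences that contain the longest word and are
    # longer than 40 characters, then pick one of them at random
    sets = []
    for sentence in listOfSents:
        if sentence.count(longestString):
            sents = " ".join(sentence)
            if len(sents) > 40:
                sets.append(" ".join(sentence))

    triggerSentence = choice(sets)
    print triggerSentence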
My concern is that the loop eventually reaches a point where the same sentence is printed over and over, because it is the longest sentence that contains the longest word. To avoid getting the same sentence over and over, I thought of the following:
* If the longest word in the current sentence is the same as in the last sentence, simply remove this long word from the current sentence and look for the next longest word.
How can I implement this with the sentences and words that `.sents()` and `.words()` of NLTK's Gutenberg corpus return?
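For illustration, the idea in the bullet above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a tested solution: the function names (`pick_trigger_word`, `sentences_containing`, `generate`) and the `rounds` and `min_chars` parameters are hypothetical, and `tokenized_sents` is assumed to have the same shape as the result of `gutenberg.sents()`, that is, an iterable of lists of token strings. The loop is bounded so that it stops on its own, and the word match is case-insensitive as described in step 3.

```python
import random


def pick_trigger_word(sentence, previous_word=None):
    """Return the longest word in `sentence`, skipping `previous_word`.

    `sentence` is a plain string. Falls back to the overall longest word
    when skipping would leave nothing, and returns "" for an empty sentence.
    """
    words = sentence.split()
    if not words:
        return ""
    # Drop every occurrence of the word used last round (case-insensitive).
    skip = previous_word.lower() if previous_word else None
    candidates = [w for w in words if w.lower() != skip]
    # max() returns the first longest item, so ties resolve left to right.
    return max(candidates or words, key=len)


def sentences_containing(word, tokenized_sents, min_chars=40):
    """Join each token list that contains `word` and keep the long ones.

    `tokenized_sents` is assumed to look like gutenberg.sents(): an
    iterable of lists of token strings.
    """
    target = word.lower()
    matches = []
    for tokens in tokenized_sents:
        if any(tok.lower() == target for tok in tokens):
            joined = " ".join(tokens)
            if len(joined) > min_chars:
                matches.append(joined)
    return matches


def generate(trigger, tokenized_sents, rounds=5, rng=random):
    """Chain sentences for at most `rounds` steps and return them as a list."""
    output = []
    previous_word = None
    for _ in range(rounds):
        word = pick_trigger_word(trigger, previous_word)
        matches = sentences_containing(word, tokenized_sents)
        if not matches:
            break  # nothing to chain onto; stop instead of raising
        trigger = rng.choice(matches)
        output.append(trigger)
        previous_word = word
    return output
```

With NLTK installed, this could be called as `generate(triggerSentence, gutenberg.sents())`. Remembering only the previous word can still alternate between two words; keeping a set of all words used so far would be a stricter variant of the same idea.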