How to make this random text generator more efficient in Python?

I am working on a random text generator that does not use Markov chains, and it currently runs without errors. First, here is the flow of my code:

  • Take a sentence as input; this is called the trigger sentence and is assigned to a variable.

  • Find the longest word in the trigger sentence.

  • Search the entire Project Gutenberg corpus for sentences containing this word, regardless of case.

  • Return the longest sentence that contains the word from step 2.

  • Append the sentence from step 4 to the sentence from step 1.

  • Assign the sentence from step 4 as the new trigger sentence and repeat the process. Note that I then need to take the longest word of the second sentence, and so on.

And here is my code:

import nltk
from nltk.corpus import gutenberg
from random import choice

triggerSentence = raw_input("Please enter the trigger sentence: ")#get input str
longestLength = 0
longestString = ""
listOfSents = gutenberg.sents() #all sentences of gutenberg are assigned -list of  list format-
listOfWords = gutenberg.words()# all words in gutenberg books -list format-

while triggerSentence:
    #so this is run every time through the loop
    split_str = triggerSentence.split()#split the sentence into words

    #code to find the longest word in the trigger sentence input
    for piece in split_str:
        if len(piece) > longestLength:
            longestString = piece
            longestLength = len(piece)

    #code to get the sentences containing the longest word, then selecting
    #random one of these sentences that are longer than 40 characters
    sets = []
    for sentence in listOfSents:
        if sentence.count(longestString):
            sents = " ".join(sentence)
            if len(sents) > 40:
                sets.append(sents)

    triggerSentence = choice(sets)
    print triggerSentence

My concern is that the loop eventually reaches a point where the same sentence is printed over and over, because it is the longest sentence that contains the longest word. To avoid getting the same sentence repeatedly, I thought of the following:

* If the longest word in the current sentence is the same as in the previous sentence, remove that word from the current sentence and look for the next longest word.
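
A minimal sketch of this idea, written for Python 3 rather than the Python 2 used in the code above. The helper name `pick_longest_unused` and the `used` set are hypothetical, introduced only for illustration: the function returns the longest word that has not already served as a search key, and `None` once every word in the sentence has been used.

```python
def pick_longest_unused(sentence, used):
    """Return the longest word in `sentence` that is not already in `used`.

    The comparison is case-insensitive. Returns None if every word
    in the sentence has been used before.
    """
    candidates = [w for w in sentence.split() if w.lower() not in used]
    if not candidates:
        return None
    word = max(candidates, key=len)  # first word of maximal length
    used.add(word.lower())           # remember it for later rounds
    return word


used = set()
first = pick_longest_unused("the whale surfaced", used)
second = pick_longest_unused("the whale surfaced again", used)
third = pick_longest_unused("surfaced", used)
```

Keeping the used words in a set that lives outside the loop means a word is skipped even if it reappears several sentences later, not only when it matches the immediately preceding sentence.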

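Also, every pass through the loop scans all of gutenberg. Is there a way to make this more efficient? I am using .sents() and .words() from the NLTK Gutenberg corpus.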


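Some improvements:

  • The while loop will run forever, so you should probably remove it.
  • Use max to find the longest word.
  • Build the list of sentences longer than 40 characters that contain longestWord with a list comprehension. This should also be moved out of the while, since it only needs to happen once.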

    sents = [" ".join(sent) for sent in listOfSents if longestWord in sent and len(" ".join(sent)) > 40]

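  • If you want to print every sentence that was found, in random order, shuffle the list you just created (random.shuffle works in place and returns None, so iterate over the list itself):

    random.shuffle(sents)
    for sent in sents: print sent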
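
A small self-contained sketch of `max` with `key=len` and of `random.shuffle`, written for Python 3 and using made-up sample data instead of the Gutenberg corpus. It illustrates that `max` finds the longest word in a single call, and that `random.shuffle` reorders the list in place and returns `None`, so the loop has to iterate over the list itself.

```python
import random

trigger = "call me Ishmael"
longest_word = max(trigger.split(), key=len)  # first word of maximal length

# Hypothetical stand-in for the filtered sentence list.
sents = ["first sentence", "second sentence", "third sentence"]
original = list(sents)

result = random.shuffle(sents)  # shuffles in place; result is None
for sent in sents:
    print(sent)
```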

Here is how the code could look with these changes:

import nltk
from nltk.corpus import gutenberg
from random import shuffle

listOfSents = gutenberg.sents()
triggerSentence = raw_input("Please enter the trigger sentence: ")

longestWord = max(triggerSentence.split(), key=len)
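# keep sentences that contain the word and are longer than 40 characters
longSents = [" ".join(sent) for sent in listOfSents
                 if longestWord in sent
                 and len(" ".join(sent)) > 40]

shuffle(longSents)  # shuffles in place and returns None
for sent in longSents:
    print sent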
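
If the search has to run repeatedly, as in the original loop, scanning every sentence on each pass is the expensive part. One possible alternative, not taken from the code above, is to build a lookup table once that maps each lowercased word to the sentences containing it, which also makes the match case-insensitive. The sketch below is written for Python 3 and uses a tiny hypothetical list of tokenized sentences in place of `gutenberg.sents()`; `build_index` and `min_chars` are illustrative names.

```python
from collections import defaultdict


def build_index(tokenized_sents, min_chars=40):
    """Map each lowercased token to the joined sentences that contain it.

    Only sentences longer than `min_chars` characters are indexed.
    """
    index = defaultdict(list)
    for tokens in tokenized_sents:
        text = " ".join(tokens)
        if len(text) <= min_chars:
            continue
        # A set avoids listing a sentence twice under the same word.
        for word in set(t.lower() for t in tokens):
            index[word].append(text)
    return index


# Hypothetical stand-in for gutenberg.sents(): a list of token lists.
sample = [
    ["The", "Whale", "rose", "slowly", "from", "the", "dark",
     "and", "silent", "water", "."],
    ["A", "whale", "."],
    ["Nothing", "else", "stirred", "on", "the", "wide", "grey",
     "sea", "that", "morning", "."],
]
index = build_index(sample)
matches = index.get("whale", [])  # .get avoids creating empty entries
```

Building the table costs one pass over the corpus and extra memory; after that, each lookup is a dictionary access instead of a full scan.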

( , , ), : ( Project Gutenberg - ).


Source: https://habr.com/ru/post/1762163/

