Finding the average word length in a string

def word_count(x: str) -> None:
    characters = len(x)
    words = x.split()
    average = sum(len(w) for w in words) / len(words)
    print('Characters: ' + str(characters) + '\n' + 'Words: ' + str(len(words)) + '\n' + 'Avg word length: ' + str(average) + '\n')

This code works fine for regular strings, but for a string like:

'***The ?! quick brown cat:  leaps over the sad boy.'

How can I change the code so that tokens like "***" and "?!" are not taken into account? The average word length for the sentence above should be 3.888889, but my code gives a different number.


Try the following:

import re

def avrg_count(x):
    total_chars = len(re.sub(r'[^a-zA-Z0-9]', '', x))
    num_words = len(re.sub(r'[^a-zA-Z0-9 ]', '', x).split())
    print("Characters:{0}\nWords:{1}\nAverage word length: {2}".format(total_chars, num_words, total_chars / num_words))


phrase = '***The ?! quick brown cat:  leaps over the sad boy.'

avrg_count(phrase)

Output:

Characters:34
Words:9
Average word length: 3.7777777777777777
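
If the numbers are needed later rather than just printed, the same idea can be wrapped in a function that returns them. This is only a minimal sketch for Python 3; the name word_stats and the guard for input without any words are assumptions added here, not part of the answer above:

```python
import re

def word_stats(text):
    """Return (characters, words, average word length), counting only letters and digits."""
    # Drop everything except ASCII letters, digits and spaces, then split on whitespace.
    words = re.sub(r'[^a-zA-Z0-9 ]', '', text).split()
    total_chars = sum(len(w) for w in words)
    # Guard against input with no words at all (e.g. '?!'), which would divide by zero.
    average = total_chars / len(words) if words else 0.0
    return total_chars, len(words), average

chars, count, avg = word_stats('***The ?! quick brown cat:  leaps over the sad boy.')
print(chars, count, round(avg, 2))  # 34 9 3.78
```

Returning a tuple keeps the formatting decision with the caller, and the empty-input case yields 0.0 instead of raising ZeroDivisionError.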

Strings have a .translate() method that you can use for this (if you know all the characters you want to remove). In Python 3, pass it a deletion table built with str.maketrans():

>>> "***foo ?! bar".translate(str.maketrans("", "", "*?!"))
'foo  bar'
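
If listing the characters by hand is impractical, the standard library's string.punctuation covers the ASCII punctuation set. The following is a sketch for Python 3, assuming only ASCII punctuation needs to be removed; strip_punctuation is a hypothetical helper name:

```python
import string

# Translation table that deletes every ASCII punctuation character.
_DELETE_PUNCT = str.maketrans('', '', string.punctuation)

def strip_punctuation(text):
    """Return text with ASCII punctuation removed; whitespace is left untouched."""
    return text.translate(_DELETE_PUNCT)

phrase = '***The ?! quick brown cat:  leaps over the sad boy.'
words = strip_punctuation(phrase).split()
print(words)
print(sum(len(w) for w in words) / len(words))
```

One caveat of this approach: apostrophes and hyphens are punctuation too, so "don't" becomes "dont" and is counted as four characters.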

Alternatively, extract the words directly with a regular expression:

import re

full_sent = '***The ?! quick brown cat:  leaps over the sad boy.'
alpha_sent = re.findall(r'\w+',full_sent)
print(alpha_sent)

This prints:

['The', 'quick', 'brown', 'cat', 'leaps', 'over', 'the', 'sad', 'boy']

To get the average, compute:

average = sum(len(word) for word in alpha_sent)/len(alpha_sent)

This gives approximately 3.78.
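
Note that \w also matches digits and underscores, so tokens such as 12345 count as words. If only alphabetic words should be counted, a narrower pattern is one option. The sketch below assumes ASCII letters only; average_word_length and its pattern parameter are hypothetical names introduced for illustration:

```python
import re

def average_word_length(text, pattern=r'[A-Za-z]+'):
    """Average length of the substrings matched by pattern; 0.0 if there are none."""
    words = re.findall(pattern, text)
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)

full_sent = '***The ?! quick brown cat:  leaps over the sad boy.'
print(round(average_word_length(full_sent), 2))            # 3.78
print(round(average_word_length('abc 12345', r'\w+'), 2))  # 4.0
print(round(average_word_length('abc 12345'), 2))          # 3.0
```

The last two calls show the difference: with \w+ the digit run is counted as a five-character word, while the letters-only pattern ignores it.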


Source: https://habr.com/ru/post/1613919/

