The average number of characters per word in the list

Question

I am new to Python and I need to calculate the average number of characters per word in a list, using these definitions and the helper function clean_up.

A token is a str that you get from calling the string method split on a line of a file.

A word is a non-empty token from the file that does not consist entirely of punctuation. Find the "words" in a file by using str.split to find the tokens, then remove the punctuation from the words using the helper function clean_up.

A sentence is a sequence of characters that is terminated by (but does not include) the characters !, ?, . or the end of the file, excludes whitespace on either end, and is not empty.

This is my homework question from my computer science class at my college.

cleaning function:

def clean_up(s):
    punctuation = """!"',;:.-?)([]<>*#\n\\"""
    result = s.lower().strip(punctuation)
    return result

my code is:

def average_word_length(text):
    """ (list of str) -> float

    Precondition: text is non-empty. Each str in text ends with \n and at
    least one str in text contains more than just \n.

    Return the average length of all words in text. Surrounding punctuation
    is not counted as part of the words. 


    >>> text = ['James Fennimore Cooper\n', 'Peter, Paul and Mary\n']
    >>> average_word_length(text)
    5.142857142857143 
    """

    for ch in text:
        word = ch.split()
        clean = clean_up(ch)
        average = len(clean) / len(word)
    return average

I get 5.0 instead of the expected result and I am really confused. Some help would be greatly appreciated :) P.S. I am using Python 3.

+4

python string python-3.x regex

dev_prabh Feb 25 '14 at 16:49

2 answers

This is a short and sweet method to solve your problem, which is still readable.

def clean_up(word, punctuation="!\"',;:.-?)([]<>*#\n\\"):
    return word.lower().strip(punctuation)  # you don't really need ".lower()"

def average_word_length(text):
    cleaned_words = [clean_up(w) for w in (w for l in text for w in l.split())]
    return sum(map(len, cleaned_words))/len(cleaned_words)  # Python2 use float

>>> average_word_length(['James Fennimore Cooper\n', 'Peter, Paul and Mary\n'])
5.142857142857143
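For what it's worth, here is a small sketch of why the loop in the question returns 5.0. It replays the loop body on the docstring example and records each step; the `per_line` list is only there for illustration and is not part of the assignment. The loop measures the whole cleaned line (inner spaces and commas included) rather than each word, and `average` is overwritten on every pass, so only the last line counts.

```python
def clean_up(s):
    # Same idea as the helper from the question.
    punctuation = "!\"',;:.-?)([]<>*#\n\\"
    return s.lower().strip(punctuation)

text = ['James Fennimore Cooper\n', 'Peter, Paul and Mary\n']

# Replay the loop body from the question, one line at a time.
per_line = []
for line in text:
    tokens = line.split()
    cleaned_line = clean_up(line)  # the whole line, not the individual words
    per_line.append((len(cleaned_line), len(tokens),
                     len(cleaned_line) / len(tokens)))

for chars, count, average in per_line:
    print(chars, count, average)

# The original function returns only the value from the last iteration.
print(per_line[-1][2])  # 5.0
```

The second line cleans to 'peter, paul and mary', which is 20 characters over 4 tokens, hence 5.0.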

+5

Inbar Rose 25 . '14 17:03

Adam smith · Accepted Answer · 2014-02-25T16:59:15+0000

Let's clean up some of these functions with imports and generator expressions, shall we?

import string

def clean_up(s):
    # I'm assuming you REQUIRE this function as per your assignment
    # otherwise, just substitute str.strip(string.punctuation) anywhere
    # you'd otherwise call clean_up(str)
    return s.strip(string.punctuation)

def average_word_length(text):
    total_length = sum(len(clean_up(word)) for sentence in text for word in sentence.split())
    num_words = sum(len(sentence.split()) for sentence in text)
    return total_length/num_words

You may notice that this actually condenses to a lengthy and unreadable one-liner:

average = sum(len(word.strip(string.punctuation)) for sentence in text for word in sentence.split()) / sum(len(sentence.split()) for sentence in text)

It's gross and disgusting, which is why you shouldn't do it ;). Readability counts and all that.
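One thing to watch: the question defines a word as a non-empty token that is not made up entirely of punctuation, but dividing by the number of tokens still counts a token such as a lone "-". A minimal sketch of a variant that drops those tokens before averaging could look like this. The name `average_word_length_skipping_empty` is made up for illustration, and the sketch assumes the text contains at least one real word (otherwise it divides by zero).

```python
import string

def average_word_length_skipping_empty(text):
    # Clean every token, then keep only those that still have characters left.
    words = [cleaned
             for line in text
             for cleaned in (token.strip(string.punctuation)
                             for token in line.split())
             if cleaned]
    return sum(len(word) for word in words) / len(words)

print(average_word_length_skipping_empty(
    ['James Fennimore Cooper\n', 'Peter, Paul and Mary\n']))  # 5.142857142857143

# The lone "-" cleans to an empty string and is not counted as a word.
print(average_word_length_skipping_empty(['Peter - Paul and Mary!\n']))  # 4.0
```

On the second input, dividing by the token count instead would give 16 / 5 = 3.2.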
