Here is a more Pythonic, more general rewrite of the code:
    class ParseError(Exception):
        pass


    def safe_slice(data, start, end, exc=IndexError):
        """0 <= start <= end is assumed"""
        r = data[start:end]
        # A slice that runs past the end of data comes back short.
        if len(r) != end - start:
            raise exc()
        return r


    def lazy_parse(data):
        """extract (name, phone) from a data buffer.
        If the buffer could not be parsed, a ParseError is raised."""
        results = []
        ptr = 0
        while ptr < len(data):
            # Each field is one length character followed by that many characters.
            length = ord(data[ptr])
            ptr += 1
            results.append(safe_slice(data, ptr, ptr + length, exc=ParseError))
            ptr += length
        return tuple(results)


    if __name__ == '__main__':
        print(lazy_parse("\x04Jack\x0A0123456789"))
Most of the changes are in the body of lazy_parse: it now handles any number of values rather than exactly two, and the whole parse still succeeds only if the last element can be read in full.
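As a quick check of the "any number of values" claim, here is a minimal sketch that feeds the same parsing logic a buffer with three fields. The function is renamed parse_all here only to keep the snippet separate, and the sample buffer is made up for illustration.

```python
class ParseError(Exception):
    pass


def safe_slice(data, start, end, exc=IndexError):
    """0 <= start <= end is assumed"""
    r = data[start:end]
    if len(r) != end - start:
        raise exc()
    return r


def parse_all(data):
    """Same logic as the eager lazy_parse: returns every field as a tuple."""
    results = []
    ptr = 0
    while ptr < len(data):
        length = ord(data[ptr])
        ptr += 1
        results.append(safe_slice(data, ptr, ptr + length, exc=ParseError))
        ptr += length
    return tuple(results)


# Hypothetical buffer with three length-prefixed fields instead of two.
three_fields = parse_all("\x03Ann\x0512345\x02NY")
```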
Also, instead of safe_slice raising an IndexError that lazy_parse then converts to a ParseError, lazy_parse now tells safe_slice which exception to raise in case of an error (safe_slice defaults to IndexError if nothing is passed to it).
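To make the behavior of the exc parameter concrete, here is a small sketch. safe_slice and ParseError are repeated so the snippet runs on its own, and the sample strings are made up for illustration.

```python
class ParseError(Exception):
    pass


def safe_slice(data, start, end, exc=IndexError):
    """0 <= start <= end is assumed"""
    r = data[start:end]
    if len(r) != end - start:
        raise exc()
    return r


# A slice that fits inside the data is returned unchanged.
fits = safe_slice("Jack", 0, 4)

# A slice that runs past the end raises the default IndexError...
try:
    safe_slice("Jack", 0, 10)
    default_error = None
except IndexError as e:
    default_error = type(e).__name__

# ...or whichever exception class the caller asked for.
try:
    safe_slice("Jack", 0, 10, exc=ParseError)
    custom_error = None
except ParseError as e:
    custom_error = type(e).__name__
```

Passing the exception class in means the caller does not need its own try/except just to translate one exception into another.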
Finally, lazy_parse is not actually lazy: it processes the entire string at once and returns all the results. "Lazy" in Python means doing only what is needed to return the next piece. In the case of lazy_parse, this would mean returning the name and then, on a later call, returning the phone. With a minor modification, we can make lazy_parse lazy:
    def lazy_parse(data):
        """extract (name, phone) from a data buffer.
        If the buffer could not be parsed, a ParseError is raised."""
        ptr = 0
        while ptr < len(data):
            length = ord(data[ptr])
            ptr += 1
            result = safe_slice(data, ptr, ptr + length, ParseError)
            ptr += length
            # Hand back one field, then pause until the next one is requested.
            yield result


    if __name__ == '__main__':
        print(list(lazy_parse("\x04Jack\x0A0123456789")))
lazy_parse is now a generator that yields one piece at a time. Note that we had to put list() around the lazy_parse call in the main section to make lazy_parse give us all the results so that we can print them.
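To see the "one piece at a time" behavior directly, the generator can be driven by hand with the built-in next(). This is only a sketch; the definitions are repeated so it runs on its own.

```python
class ParseError(Exception):
    pass


def safe_slice(data, start, end, exc=IndexError):
    """0 <= start <= end is assumed"""
    r = data[start:end]
    if len(r) != end - start:
        raise exc()
    return r


def lazy_parse(data):
    """Yield each length-prefixed field of data in turn."""
    ptr = 0
    while ptr < len(data):
        length = ord(data[ptr])
        ptr += 1
        result = safe_slice(data, ptr, ptr + length, ParseError)
        ptr += length
        yield result


gen = lazy_parse("\x04Jack\x0A0123456789")
name = next(gen)        # parses only the first field
phone = next(gen)       # resumes where it left off and parses the second
remaining = list(gen)   # nothing is left, so this is an empty list
```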
Depending on what you are doing, laziness may not be what you want, because it can make recovering from errors more difficult:
    for item in lazy_parse(some_data):
        result = do_stuff_with(item)
        make_changes_with(result)
        ...
By the time the ParseError is raised, you may already have made changes that are difficult or impossible to undo. The solution in this case is the same as in the main section, where we print the results:
    for item in list(lazy_parse(some_data)):
        ...
Calling list consumes the lazy_parse generator completely and gives us a list of results; if an error is raised, we find out before the loop processes its first element.
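The difference in when the error surfaces can be sketched as follows. The truncated buffer is a made-up example in which the second field claims ten characters but only three are present, and appending to a list stands in for the real work done on each item.

```python
class ParseError(Exception):
    pass


def safe_slice(data, start, end, exc=IndexError):
    """0 <= start <= end is assumed"""
    r = data[start:end]
    if len(r) != end - start:
        raise exc()
    return r


def lazy_parse(data):
    """Yield each length-prefixed field of data in turn."""
    ptr = 0
    while ptr < len(data):
        length = ord(data[ptr])
        ptr += 1
        result = safe_slice(data, ptr, ptr + length, ParseError)
        ptr += length
        yield result


# Hypothetical bad input: the second field is cut short.
truncated = "\x04Jack\x0A012"

# Iterating lazily: the first item is processed before the error appears.
processed_lazily = []
try:
    for item in lazy_parse(truncated):
        processed_lazily.append(item)
except ParseError:
    pass

# Wrapping in list(): the error appears before any item is processed.
processed_eagerly = []
try:
    for item in list(lazy_parse(truncated)):
        processed_eagerly.append(item)
except ParseError:
    pass
```

The trade-off is that list() holds every result in memory at once, which gives up the main benefit of a generator for large inputs.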