You can mimic the way Python itself parses indentation. First, create a stack that holds the indentation levels. Then, for each line:
- If the indentation is greater than the top of the stack, push it and increase the depth level.
- If it is the same, continue at the same level.
- If it is lower, pop the top of the stack until it is no longer greater than the new indentation. If you reach a lower indentation level before finding an equal one, that is an indentation error.
    indentation = [0]  # stack of the indentation widths currently open
    depth = 0

    with open("test.txt") as f:
        for line in f:
            line = line.rstrip()   # drop the newline and any trailing whitespace
            content = line.lstrip()
            if not content:
                continue           # ignore blank lines
            indent = len(line) - len(content)
            if indent > indentation[-1]:
                depth += 1
                indentation.append(indent)
            elif indent < indentation[-1]:
                while indent < indentation[-1]:
                    depth -= 1
                    indentation.pop()
                if indent != indentation[-1]:
                    raise RuntimeError("Bad formatting")
            print(f"{content} (depth: {depth})")
With a file "test.txt" that contains the content you provided:
    Income
        Revenue
            IAP
            Ads
        Other-Income
    Expenses
        Developers
            In-house
            Contractors
        Advertising
        Other Expenses
Here is the result:
    Income (depth: 0)
    Revenue (depth: 1)
    IAP (depth: 2)
    Ads (depth: 2)
    Other-Income (depth: 1)
    Expenses (depth: 0)
    Developers (depth: 1)
    In-house (depth: 2)
    Contractors (depth: 2)
    Advertising (depth: 1)
    Other Expenses (depth: 1)
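The error case in the rules above is easiest to see with input held in a string rather than a file. The following is only a minimal sketch under a few assumptions: `iter_depths` is a hypothetical helper name, it raises `ValueError` rather than `RuntimeError`, and it derives the depth from the size of the stack instead of keeping a separate counter.

```python
def iter_depths(lines):
    """Yield (content, depth) pairs for an iterable of indented text lines."""
    indentation = [0]  # stack of the indentation widths currently open
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip()
        content = line.lstrip()
        if not content:
            continue  # ignore blank lines
        indent = len(line) - len(content)
        if indent > indentation[-1]:
            indentation.append(indent)
        else:
            while indent < indentation[-1]:
                indentation.pop()
            if indent != indentation[-1]:
                raise ValueError(
                    f"line {number}: unindent does not match any outer level"
                )
        # one stack entry per open level, so the depth is the stack size minus one
        yield content, len(indentation) - 1


good = "a\n    b\n        c\n    d\n"
print(list(iter_depths(good.splitlines())))
# [('a', 0), ('b', 1), ('c', 2), ('d', 1)]

bad = "a\n    b\n  c\n"  # "c" dedents to a width (2) that was never opened
try:
    list(iter_depths(bad.splitlines()))
except ValueError as err:
    print(err)
# line 3: unindent does not match any outer level
```

In the second input, the line "c" is indented less than "b" but more than "a", so popping the stack skips past its width without ever matching it, which is exactly the situation the last rule describes.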
So what can you do with this? Suppose you want to build nested lists. First, create a data stack.
- When you encounter an indent, push a new list onto the data stack.
- When you encounter a dedent, pop the top list and append it to the new top.
And in every case, for each line, append the content to the list at the top of the data stack.
Here is the relevant implementation:
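    indentation = [0]
    depth = 0
    data = [[]]  # stack of lists; data[0] ends up as the finished nested list

    with open("test.txt") as f:
        for line in f:
            line = line.rstrip()
            content = line.lstrip()
            if not content:
                continue
            indent = len(line) - len(content)
            if indent > indentation[-1]:
                depth += 1
                indentation.append(indent)
                data.append([])           # indent: open a new sublist
            elif indent < indentation[-1]:
                while indent < indentation[-1]:
                    depth -= 1
                    indentation.pop()
                    top = data.pop()      # dedent: close the current sublist
                    data[-1].append(top)  # and attach it to its parent
                if indent != indentation[-1]:
                    raise RuntimeError("Bad formatting")
            data[-1].append(content)

    # close any sublists that are still open at the end of the file
    while len(data) > 1:
        top = data.pop()
        data[-1].append(top)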
The nested list is at the top of the data stack. Output for the same file:
    ['Income',
     ['Revenue',
      ['IAP', 'Ads'],
      'Other-Income'],
     'Expenses',
     ['Developers',
      ['In-house', 'Contractors'],
      'Advertising',
      'Other Expenses']]
This is fairly easy to work with, although it is quite deeply nested. You can reach the data by chaining index accesses:
    >>> l = data[0]
    >>> l
    ['Income', ['Revenue', ['IAP', 'Ads'], 'Other-Income'], 'Expenses', ['Developers', ['In-house', 'Contractors'], 'Advertising', 'Other Expenses']]
    >>> l[1]
    ['Revenue', ['IAP', 'Ads'], 'Other-Income']
    >>> l[1][1]
    ['IAP', 'Ads']
    >>> l[1][1][0]
    'IAP'
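If chained indexing becomes unwieldy, a small recursive generator can visit every entry together with its depth. This is only a sketch: `walk` is a hypothetical helper name, and `tree` is a shortened, hand-written copy of the first part of the list above rather than output read from the file.

```python
def walk(nested, depth=0):
    """Yield (depth, item) for every string in a nested list, in order."""
    for item in nested:
        if isinstance(item, list):
            # a sublist holds the children of the item that precedes it
            yield from walk(item, depth + 1)
        else:
            yield depth, item


# shortened sample in the same shape as the parser's result
tree = ['Income', ['Revenue', ['IAP', 'Ads'], 'Other-Income']]

for depth, item in walk(tree):
    print("    " * depth + item)
```

Printing each item behind four spaces per level reproduces the indented layout the list was parsed from, which makes for a quick round-trip check.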