Continuing the analogy with Excel:
Firstly, I decided that querying at the second level of nesting and beyond is not really needed (this makes the problem much easier). The structure below cannot easily be adapted to support deeper queries, so fair warning. This is not a perfect answer to my original question, sorry! It seems that MongoDB is just not very well suited to my original problem.
I used a structure similar to how MongoDB recommends representing trees. Instead of trying to represent an Excel document as one MongoDB document, it is now represented by many: one for each unique formula, and another for storing strings and values. All MongoDB documents related to the same Excel workbook simply have a workbook field that stores which workbook they belong to.
A very simple example (the custom _id labels are not needed, but they are easier to read):
{_id: 'book1_1', workbook: 'book1', cell_values: [1, 2, 3], cell_strings: ['hello', 'world']}
{_id: 'book1_2', workbook: 'book1', formula: 'SUM', children: ['AVG', 'ABS']}
{_id: 'book1_3', workbook: 'book1', formula: 'AVG', children: ['SUM']}
{_id: 'book1_4', workbook: 'book1', formula: 'ABS', children: ['SUM']}
These 4 MongoDB documents represent one Excel workbook that has the following formula structures (this is not the only Excel sheet that could produce the MongoDB documents above):
=SUM(AVG())
=AVG(SUM())
=ABS(SUM(ABS()))
Along with the values 1, 2, 3 and the strings 'hello', 'world' somewhere inside it.
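To make the mapping from formulas to documents concrete, here is a rough sketch of how the per-formula documents could be derived. The helper names collectChildren and toDocuments are made up for illustration, and it assumes formulas consist only of nested function calls (no plain parentheses, no cell references that matter). It does not build the values/strings document.

```javascript
// Hypothetical helper: for each function name, collect the names of the
// functions nested directly inside it (first level only).
function collectChildren(formulas) {
  const children = new Map(); // formula name -> Set of direct child names
  for (const formula of formulas) {
    const stack = []; // names of the function calls that are currently open
    // Tokens are either a function name followed by "(" or a closing ")".
    for (const token of formula.match(/[A-Z]+\(|\)/g) || []) {
      if (token === ')') {
        stack.pop();
        continue;
      }
      const name = token.slice(0, -1); // drop the trailing "("
      if (!children.has(name)) children.set(name, new Set());
      if (stack.length > 0) children.get(stack[stack.length - 1]).add(name);
      stack.push(name);
    }
  }
  return children;
}

// Hypothetical helper: one document per unique formula, tagged with its workbook.
function toDocuments(workbook, formulas) {
  const docs = [];
  for (const [formula, kids] of collectChildren(formulas)) {
    docs.push({ workbook, formula, children: [...kids] });
  }
  return docs;
}

const docs = toDocuments('book1', ['=SUM(AVG())', '=AVG(SUM())', '=ABS(SUM(ABS()))']);
console.log(docs);
```

For the three formulas above this should give SUM with children AVG and ABS, and AVG and ABS each with the single child SUM, which matches the example documents (minus the custom _id values).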
The query to find workbooks with a SUM formula nested directly inside an AVG formula is:
db.collection.find({$and: [{formula: 'AVG'},{children: 'SUM'}]})
This returns the MongoDB document with _id book1_3. You can then look up the workbook however you want to.
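As a sanity check of what that query matches, here is a small in-memory sketch in plain JavaScript (no MongoDB connection). The function findNested and the book2 document are made up for illustration. It mimics the two conditions: formula equals the outer name, and the children array contains the inner name.

```javascript
const docs = [
  { _id: 'book1_1', workbook: 'book1', cell_values: [1, 2, 3], cell_strings: ['hello', 'world'] },
  { _id: 'book1_2', workbook: 'book1', formula: 'SUM', children: ['AVG', 'ABS'] },
  { _id: 'book1_3', workbook: 'book1', formula: 'AVG', children: ['SUM'] },
  { _id: 'book1_4', workbook: 'book1', formula: 'ABS', children: ['SUM'] },
  // Hypothetical second workbook, added only so the filter has something to exclude.
  { _id: 'book2_1', workbook: 'book2', formula: 'AVG', children: ['ABS'] },
];

// Hypothetical helper: keep documents whose formula is `outer` and whose
// children array contains `inner`.
function findNested(allDocs, outer, inner) {
  return allDocs.filter(
    (doc) =>
      doc.formula === outer &&
      Array.isArray(doc.children) &&
      doc.children.includes(inner)
  );
}

const matches = findNested(docs, 'AVG', 'SUM');

// Collect the distinct workbooks that the matching documents belong to.
const workbooks = [...new Set(matches.map((doc) => doc.workbook))];

console.log(matches.map((doc) => doc._id));
console.log(workbooks);
```

Against a real collection, the same filter can also be written without $and as {formula: 'AVG', children: 'SUM'}, since conditions on different fields in one filter document are combined with AND. If only the workbook names are needed, db.collection.distinct('workbook', {formula: 'AVG', children: 'SUM'}) is one possible way to get them directly.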