This amount of data is small.
If you do not need parallel data processing, you do not need MongoDB. Especially with small amounts of data, such as 4 GB, the overhead of distributing the work can easily exceed the actual evaluation effort.
4 GB / 60k nodes is not large for an XML database either. Once you have spent some time with it, you will come to see XQuery as a great tool for analyzing XML documents.
Is it really?
Or do you receive 4 GB every day and need to evaluate it together with all the data you have already stored? Then you will eventually reach a volume that you can no longer store and process on a single machine, and distributing the work will become necessary. That will not happen within days or weeks, but a year will already bring you about 1.5 TB (4 GB × 365 days).
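To get a feeling for how quickly a daily 4 GB delivery adds up, here is a rough back-of-the-envelope sketch. It assumes a constant daily volume and no compression or deletion; the function name `accumulated_gb` is made up for this illustration.

```python
def accumulated_gb(daily_gb: float, days: int) -> float:
    """Total stored volume in GB after `days` of ingesting `daily_gb` per day."""
    return daily_gb * days


DAILY_GB = 4  # daily volume assumed in the scenario above

for label, days in [("1 week", 7), ("1 month", 30), ("1 year", 365)]:
    total = accumulated_gb(DAILY_GB, days)
    # Decimal units: 1 TB = 1000 GB
    print(f"{label:>8}: {total:6.0f} GB = {total / 1000:.2f} TB")
```

Whether that volume still fits on one machine depends on your hardware and on how much of the history each evaluation has to touch.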
Convert to JSON
What does your input look like? Does it adhere to a schema, or even resemble tabular data? MongoDB's capabilities for analyzing semi-structured data are far weaker than what XML databases provide. On the other hand, if you only want to pull a few fields from well-defined paths, and you can process one input file after another, MongoDB will probably not be much of a handicap.
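If your case is the simple one (a few fields on fixed paths, one file at a time), the conversion could look roughly like the sketch below. It uses only Python's standard library; the element names (`record`, `title`, `meta/author`), the `FIELD_PATHS` mapping, and the sample document are invented for illustration and will differ for your data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical mapping: output field name -> path relative to each record element.
FIELD_PATHS = {
    "title": "title",
    "author": "meta/author",
}


def extract_records(xml_text, record_path="record", field_paths=FIELD_PATHS):
    """Turn every matching record element into a flat, JSON-ready dict."""
    root = ET.fromstring(xml_text)
    docs = []
    for record in root.iterfind(record_path):
        # findtext() returns None when the path does not exist in this record
        docs.append({name: record.findtext(path) for name, path in field_paths.items()})
    return docs


# Made-up sample input; the second record has no author.
SAMPLE = """<records>
  <record><title>First</title><meta><author>Ann</author></meta></record>
  <record><title>Second</title><meta/></record>
</records>"""

for doc in extract_records(SAMPLE):
    print(json.dumps(doc))
```

The resulting dicts are plain JSON-like documents, so they could be handed to a MongoDB driver for insertion. Anything beyond fixed paths (mixed content, recursive structures, joins across documents) is where this approach gets painful and an XML database with XQuery is the better fit.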
Migrate XML to the Cloud
If you want to use both the capabilities of an XML database for analyzing the data and some of a NoSQL system's capabilities for distributing the work, you could run the XML database from within that system.
BaseX is moving to the cloud with exactly the capabilities you need, but it will probably still take some time for that feature to become production-ready.