Store MongoDB data in compressed format

I use MongoDB to store the raw HTML of web pages crawled with the Scrapy framework. After one day of scraping, 25 GB of disk space is full. Is there a way to store the raw data in a compressed format?

+3
mongodb compression
Aug 2 '13 at 10:19
3 answers

MongoDB has no built-in compression. Some operating systems offer disk or file compression, but if you want more control, I would suggest compressing the data yourself with a library for whatever programming language you use, and managing the compression manually.

For example, Node.js offers simple, convenient methods for this in its zlib module: http://nodejs.org/api/zlib.html#zlib_examples
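Since the question uses Scrapy, the same idea can be sketched in Python with the standard `zlib` module. This is only a minimal sketch: the helper names, the sample page, and the `html_z` field name are hypothetical and not taken from the question.

```python
import zlib


def compress_html(html: str, level: int = 6) -> bytes:
    """Encode the page as UTF-8 and compress it with zlib."""
    return zlib.compress(html.encode("utf-8"), level)


def decompress_html(blob: bytes) -> str:
    """Reverse compress_html: decompress, then decode back to text."""
    return zlib.decompress(blob).decode("utf-8")


# Hypothetical page body; repetitive markup such as HTML compresses well.
page = "<html><body>" + "<p>Hello, world!</p>" * 500 + "</body></html>"
blob = compress_html(page)

# Hypothetical document layout: store the compressed bytes, not the raw string.
doc = {"url": "http://example.com/", "html_z": blob}

print(len(page.encode("utf-8")), "->", len(blob), "bytes")
```

With PyMongo on Python 3, a `bytes` value such as `doc["html_z"]` should be stored as BSON binary data, so the document can be inserted as usual and decompressed after reading it back.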

Update 3.0

If you decide to switch to the new WiredTiger storage engine that ships with 3.0, you can choose between several types of compression, as described here. Of course, you'll want to test this change against your own workload to see whether the additional CPU utilization is worth it.
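As a rough sketch of how a compressor could be chosen per collection from Python: the helper names below are hypothetical, and the example assumes a PyMongo `Database` object connected to a server that is running WiredTiger. The `storageEngine` option with a `block_compressor` config string is passed through to MongoDB's `create` command.

```python
def wiredtiger_collection_options(compressor: str = "zlib") -> dict:
    """Build the options for a collection that uses the given block compressor.

    Only the compressors available in MongoDB 3.0 are accepted here.
    """
    allowed = {"none", "snappy", "zlib"}
    if compressor not in allowed:
        raise ValueError(f"unsupported compressor: {compressor!r}")
    return {
        "storageEngine": {
            "wiredTiger": {"configString": f"block_compressor={compressor}"}
        }
    }


def create_compressed_collection(db, name: str, compressor: str = "zlib"):
    """Create a collection on `db` (assumed to be a pymongo Database)."""
    # Extra keyword arguments are forwarded as options to the create command.
    return db.create_collection(name, **wiredtiger_collection_options(compressor))


print(wiredtiger_collection_options("zlib"))
```

The server-wide default can instead be set in the mongod configuration; the per-collection form is shown only because it lets you try a heavier compressor on the collection that holds the raw HTML.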

+2
Aug 02 '13 at

Starting with MongoDB 2.8, you can use compression. The WiredTiger engine gives you three compression levels, whereas mmap (the default engine in 2.6) does not provide compression.

Here is an example of how much space you can save for 16 GB of data:

[Image not available]

The data is taken from this article.

+4
Dec 20 '14 at 6:35

You can compress your string before saving it: myhtml.encode('zlib') (this codec works on strings in Python 2 only).
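In Python 3, `str.encode('zlib')` no longer works, because zlib is a bytes-to-bytes codec. A minimal sketch of the equivalent call, assuming the page is held in a string named `myhtml` as in the answer (the sample content is made up):

```python
import codecs
import zlib

# Made-up sample content standing in for a scraped page.
myhtml = "<html><body><p>example</p></body></html>" * 100

# Encode the text to bytes first, then apply the zlib codec to the bytes.
raw = myhtml.encode("utf-8")
compressed = codecs.encode(raw, "zlib")

# The result is ordinary zlib data, so zlib.decompress can read it back.
restored = zlib.decompress(compressed).decode("utf-8")

print(len(raw), "->", len(compressed), "bytes")
```

Calling `zlib.compress(raw)` directly does the same job and also lets you pick a compression level.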

0
Sep 25 '13 at 17:07


