Best way to cache pages in a database?

I'm working on a project that involves collecting data from different sites; a good analogy is collecting statistics on eBay auctions. Besides storing the key data, though, I really need to provide access to the source page, and on some sites the source pages may not be permanent - for example, eBay deletes the auction page after the auction completes. Ideally I'd like something similar to how Google caches pages, i.e. storing a copy of each page on my own server. However, I've been told this could get complicated and could have a big impact on the resources my database needs.

+3
3 answers

Even if every page you cache is only 5 KB, caching 200 pages adds 1 MB to your database; cache 20,000 pages and you've used 100 MB - and many pages (once you count markup plus content) will be well over 5 KB.

One alternative is to save the pages to disk as (potentially compressed) files in a directory and store only the file name in your database. Unless you need to query the page contents through SQL after the initial data processing, this approach keeps your database and query results small while still preserving the full pages.
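A minimal sketch of this approach, assuming Python with SQLite and gzip compression (the `page_cache` directory, table name, and helper names are illustrative, not from the original answer):

```python
import gzip
import hashlib
import sqlite3
from pathlib import Path

CACHE_DIR = Path("page_cache")  # hypothetical on-disk cache directory
CACHE_DIR.mkdir(exist_ok=True)

# The database holds only the key data plus a reference to the cached file.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE auctions (
        id INTEGER PRIMARY KEY,
        url TEXT NOT NULL,
        cache_file TEXT NOT NULL
    )
""")

def cache_page(url: str, html: str) -> str:
    """Compress the page to disk; return the file name to store in the DB."""
    name = hashlib.sha256(url.encode()).hexdigest() + ".html.gz"
    (CACHE_DIR / name).write_bytes(gzip.compress(html.encode("utf-8")))
    return name

def load_page(name: str) -> str:
    """Read and decompress a cached page from disk."""
    return gzip.decompress((CACHE_DIR / name).read_bytes()).decode("utf-8")

# Usage: store the file reference, not the page body, in the database.
fname = cache_page("https://example.com/auction/123", "<html>...</html>")
db.execute("INSERT INTO auctions (url, cache_file) VALUES (?, ?)",
           ("https://example.com/auction/123", fname))
```

Hashing the URL gives a stable, filesystem-safe name, so re-fetching the same page overwrites its cached copy rather than accumulating duplicates.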

+3

If you do need to query the page contents later, you can store the pages in the database itself, for example in a varbinary column; for full-text search over the cached content, an engine such as Lucene can index it.
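A sketch of the in-database option. The answer names varbinary (a SQL Server type); this example substitutes SQLite's BLOB column as a stand-in and compresses with zlib, and it does not cover the Lucene indexing side. Table and function names are illustrative:

```python
import sqlite3
import zlib

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE cached_pages (
        url  TEXT PRIMARY KEY,
        body BLOB NOT NULL   -- compressed page bytes (varbinary-style column)
    )
""")

def store(url: str, html: str) -> None:
    """Compress the page and upsert it into the cache table."""
    db.execute("INSERT OR REPLACE INTO cached_pages VALUES (?, ?)",
               (url, zlib.compress(html.encode("utf-8"))))

def fetch(url: str) -> str:
    """Fetch and decompress a cached page by URL."""
    (blob,) = db.execute("SELECT body FROM cached_pages WHERE url = ?",
                         (url,)).fetchone()
    return zlib.decompress(blob).decode("utf-8")

store("https://example.com/auction/123", "<html>hello</html>")
```

Keeping the pages in the database simplifies backup and transactional consistency, at the cost of the database growth the first answer warns about.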

0

Do you cache the CSS and JS as well, or only the HTML markup of the page itself? And do you compress/decompress the cached copies on the fly?

Also, 5 KB per page seems low once you count the CSS, JS and AJAX-loaded content... do you plan to capture all of that too?

And how does Google handle this with its cache?

0

Source: https://habr.com/ru/post/1714907/

