I use wget to download a huge list of web pages (about 70,000). I have to sleep about 2 seconds between consecutive wget calls, so this takes a huge amount of time, something like 70 days. What I would like to do is use a proxy so that I can significantly speed up the process. I am using a simple bash script for this. All suggestions and comments are appreciated.
The first suggestion is not to use bash or wget. I would use Python and Beautiful Soup; wget is not really meant for screen scraping.
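For illustration, here is a minimal sketch of the kind of Python downloader this answer has in mind. It assumes the URLs are stored one per line in a file called urls.txt and uses 8 worker threads; both the filename and the thread count are assumptions for the example, not from the original. Beautiful Soup would only be needed if you also want to parse the downloaded HTML afterwards.

```python
# Minimal sketch: download a list of URLs with a small thread pool,
# keeping a polite per-thread delay between requests.
# Assumes the URL list lives in urls.txt (one URL per line).
import concurrent.futures
import time
import urllib.parse
import urllib.request


def fetch(url, delay=2.0):
    """Download one page to disk, then pause to stay polite to the server."""
    filename = urllib.parse.quote(url, safe="") + ".html"
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            data = resp.read()
        with open(filename, "wb") as f:
            f.write(data)
    except Exception as exc:          # keep going if one page fails
        print(f"failed: {url}: {exc}")
    time.sleep(delay)                 # per-thread delay between requests


def main():
    with open("urls.txt") as f:
        urls = [line.strip() for line in f if line.strip()]

    # With 8 threads each pausing 2 s per request, the run is roughly
    # 8x faster than the serial wget-plus-sleep loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        pool.map(fetch, urls)


if __name__ == "__main__":
    main()
```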
Distribute the load across multiple machines by running part of your list on each machine, as in the sketch below.
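A rough sketch of how the list could be split into per-machine chunks. It assumes the full list is in urls.txt and that four machines are available; both are arbitrary choices for illustration.

```python
# Split urls.txt into one chunk file per machine (round-robin),
# so each machine downloads roughly an equal share of the list.
NUM_MACHINES = 4

with open("urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for i in range(NUM_MACHINES):
    chunk = urls[i::NUM_MACHINES]
    with open(f"urls_machine_{i}.txt", "w") as out:
        out.write("\n".join(chunk) + "\n")
```

Each chunk file would then be copied to one machine and fed to the downloader there, cutting the total wall-clock time roughly in proportion to the number of machines.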
Source: https://habr.com/ru/post/1794010/