How to save scrapy crawl command output

I am trying to save the output of the scrapy crawl command. I tried scrapy crawl someSpider -o some.json -t json >> some.text but that didn't work ... can someone tell me how I can save the output to a text file ... I mean the logs and the information that scrapy prints ...

5 answers

You need to redirect stderr as well; you are only redirecting stdout. You can do it like this:

scrapy crawl someSpider -o some.json -t json 2> some.text

The key is the number 2, which selects stderr as the source for the redirection.

If you want to redirect stderr and stdout to the same file, you can use:

scrapy crawl someSpider -o some.json -t json &> some.text

More on output redirection: http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html
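Note that &> is a Bash shortcut; in a plain POSIX shell (sh, dash) the portable way to send both streams to the same file is to redirect stdout first and then point stderr at it. A minimal sketch, reusing the same example spider and file names as above:

 # portable form: stdout goes to the file, then stderr is sent wherever stdout points
 scrapy crawl someSpider -o some.json -t json > some.text 2>&1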


You can add these lines to your settings.py:

 LOG_STDOUT = True
 LOG_FILE = '/tmp/scrapy_output.txt'

Then run the crawl as usual:

 scrapy crawl someSpider 
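If you would rather not edit settings.py, the same settings can also be overridden for a single run with Scrapy's -s (--set) command-line option. A small sketch, with an example file path:

 # override the logging settings for this run only
 scrapy crawl someSpider -s LOG_FILE=/tmp/scrapy_output.txt -s LOG_STDOUT=True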

If you want to capture the output of the runspider command:

 scrapy runspider scraper.py -o some.json -t json 2> some.text 

This also works.


You can use nohup:

 nohup scrapy crawl someSpider & 

The log will be stored in nohup.out.
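By default nohup appends both stdout and stderr to nohup.out when they would otherwise go to the terminal. If you want the log in a file of your own choosing, you can combine nohup with an explicit redirect (the file name below is just an example):

 # keep the crawl running after logout and collect everything in one file
 nohup scrapy crawl someSpider > some.text 2>&1 &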


For any scrapy command you can add --logfile NAME_OF_FILE to write the log to a file, for example:

 scrapy crawl someSpider -o some.json --logfile some.text 

There are two other useful command line options for logging:

  • -L or --loglevel to control the logging level, e.g. -L INFO (the default is DEBUG)

  • --nolog to completely disable logging

These options are described in the Scrapy command-line tool documentation.
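Putting the options together, a run that writes the scraped items to JSON and keeps an INFO-level log in a separate file could look like this (spider and file names are placeholders):

 scrapy crawl someSpider -o some.json --logfile some.text -L INFO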


Source: https://habr.com/ru/post/945418/

