I wrote a script in Python to scrape some information from a web page. The code works flawlessly when run synchronously. However, I want it to run asynchronously so that it completes the task as quickly as possible without blocking. Since I have never worked with the asyncio library, I am seriously confused about how to do this. I tried wrapping my script in asyncio coroutines, but I seem to have done it wrong. If someone could lend a helping hand, I would be really grateful. Thanks in advance. Here is my failing code:
import requests
from lxml import html
import asyncio

link = "http://quotes.toscrape.com/"

async def quotes_scraper(base_link):
    response = requests.get(base_link)
    tree = html.fromstring(response.text)
    for titles in tree.cssselect("span.tag-item a.tag"):
        processing_docs(base_link + titles.attrib['href'])

async def processing_docs(base_link):
    response = requests.get(base_link).text
    root = html.fromstring(response)
    for soups in root.cssselect("div.quote"):
        quote = soups.cssselect("span.text")[0].text
        author = soups.cssselect("small.author")[0].text
        print(quote, author)

    next_page = root.cssselect("li.next a")[0].attrib['href'] if root.cssselect("li.next a") else ""
    if next_page:
        page_link = link + next_page
        processing_docs(page_link)

loop = asyncio.get_event_loop()
loop.run_until_complete(quotes_scraper(link))
loop.close()
After execution, I see on the console:
RuntimeWarning: coroutine 'processing_docs' was never awaited
  processing_docs(base_link + titles.attrib['href'])
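The warning points at the core issue: calling an `async def` function only creates a coroutine object; it never runs unless you `await` it (or schedule it as a task). Below is a minimal, network-free sketch of the missing pattern, with hypothetical function names standing in for `quotes_scraper` and `processing_docs`:

```python
import asyncio

async def process_page(url):
    # Stand-in for the real scraping work done per page.
    return f"processed {url}"

async def scrape(base):
    results = []
    # Stand-ins for the hrefs the real script extracts from the page.
    for path in ("/tag/love/", "/tag/books/"):
        # The await here is exactly what the original calls were missing:
        # without it, process_page(...) just creates an un-run coroutine.
        results.append(await process_page(base + path))
    return results

results = asyncio.run(scrape("http://quotes.toscrape.com"))
print(results)
```

Note that even with the `await`s in place, `requests.get` is a blocking call, so the event loop gains nothing while it runs; the usual options are to wrap it in `loop.run_in_executor(...)` or to switch to an async HTTP client such as aiohttp.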