How to create a web crawler using Node.js?

I recently wondered how search engines work, and I learned that they use "bots" or "web crawlers." I immediately wanted to understand how a crawler works and to build one myself! So, first: how do you write a program that requests a page from a server? A simple example in JavaScript would be great (I run it as a regular scripting language using Node). Next, is there a Node module that can parse HTML and build a DOM for me, so that I can extract all the links and so on? Correct me if I'm wrong, but I think that is how crawlers work. Examples in C++, C, or Python are also welcome, although I would prefer JS or Python because I am more familiar with higher-level scripting languages.
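For illustration, here is a minimal sketch of the two steps asked about: requesting a page and collecting its links. It assumes Node 18 or later, where `fetch` is available globally, and uses only built-ins. The names `extractLinks` and `crawl` and the `maxPages` limit are made up for this example. The regular expression is a deliberate simplification; a real HTML parser module that builds a DOM would handle messy markup more reliably.

```javascript
// Minimal crawler sketch using only Node.js built-ins (assumes Node 18+ for global fetch).

// Collect absolute http(s) links from <a href="..."> tags.
// The regex is a simplification; it only handles quoted href values.
function extractLinks(html, baseUrl) {
  const links = new Set();
  const anchorHref = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(anchorHref)) {
    try {
      // Resolve relative links against the page URL and drop #fragments.
      const url = new URL(match[1], baseUrl);
      url.hash = "";
      if (url.protocol === "http:" || url.protocol === "https:") {
        links.add(url.href);
      }
    } catch {
      // Ignore hrefs that cannot be parsed as URLs.
    }
  }
  return [...links];
}

// Breadth-first crawl: take a URL from the queue, fetch it, queue its links.
// Stops once maxPages URLs have been visited (hypothetical safety limit).
async function crawl(startUrl, maxPages = 10) {
  const queue = [startUrl];
  const visited = new Set();
  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    try {
      const response = await fetch(url);
      const type = response.headers.get("content-type") || "";
      // Skip failed requests and anything that is not HTML.
      if (!response.ok || !type.includes("text/html")) continue;
      const html = await response.text();
      for (const link of extractLinks(html, url)) {
        if (!visited.has(link)) queue.push(link);
      }
    } catch (err) {
      console.error(`Failed to fetch ${url}: ${err.message}`);
    }
  }
  return [...visited];
}
```

To try it, call something like `crawl("https://example.com", 5).then(console.log)`. A crawler meant for real sites would also need to respect `robots.txt` and pause between requests, which this sketch leaves out.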


Source: https://habr.com/ru/post/1388012/

