Mining groups of people from Wikipedia

I am trying to get a list of people from http://en.wikipedia.org/wiki/Category:People_by_occupation. I need to go through all of its sections (subcategories) and collect the people listed in each one.

How should I do it? Should I use a crawler to fetch the pages and parse them with BeautifulSoup?
Or is there another way to get the same data from Wikipedia?

3 answers

I would go with the pywikipediabot Python project.

Take a look at category.py. You can use:

* tree        - show a tree of subcategories of a given category
* listify     - make a list of all of the articles that are in a category
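
For readers who want to see what a tree-plus-listify walk amounts to, here is a minimal sketch. It does not use pywikipediabot. It calls the MediaWiki web API (`list=categorymembers`) directly with the standard library, which is a swapped-in technique rather than what category.py does internally. The function names `fetch_members` and `collect_articles`, the `User-Agent` string, and the `max_depth` parameter are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_members(category, cmtype):
    """Return titles of one kind of member ('page' or 'subcat') of a category.

    Hypothetical helper: queries the MediaWiki API and follows the
    'continue' tokens until every batch has been read.
    """
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": cmtype,
        "cmlimit": "500",
        "format": "json",
    }
    titles = []
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        # The User-Agent value is a placeholder; use one that identifies you.
        req = urllib.request.Request(url, headers={"User-Agent": "category-sketch/0.1"})
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        titles += [m["title"] for m in data["query"]["categorymembers"]]
        if "continue" not in data:
            return titles
        params.update(data["continue"])


def collect_articles(category, max_depth, fetch=fetch_members, seen=None):
    """Gather article titles from a category and its subcategories.

    max_depth limits how far down the tree to go; 'seen' guards against
    categories that contain each other.
    """
    seen = set() if seen is None else seen
    if category in seen:
        return set()
    seen.add(category)
    articles = set(fetch(category, "page"))
    if max_depth > 0:
        for sub in fetch(category, "subcat"):
            articles |= collect_articles(sub, max_depth - 1, fetch, seen)
    return articles
```

Calling `collect_articles("Category:People_by_occupation", 1)` would walk one level of subcategories. That category tree is very large, so a small depth is a sensible starting point, and not every page found this way is necessarily a person.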



You can use the CatScan tool to search for categories.

Instructions here
http://meta.wikimedia.org/wiki/CatScan

Search example: note that the HTML output shows at most 1000 results. Select CSV export to get all of the results. Also be sure to adjust the category depth and other parameters if necessary.

The already mentioned pywikipediabot is another option.


Source: https://habr.com/ru/post/1738876/

