Easy way to export Wikipedia translated names

Is there an easy way to export the interlanguage links from Wikipedia to get a mapping of the form
russian_title -> english_title?

I tried extracting them from ruwiki-latest-pages-meta-current.xml.bz2 and ruwiki-latest-pages-articles.xml.bz2, but those dumps contain fewer than 25 thousand translations.

I found out that some links are missing. For instance, the article on Yandex shows a link to the English wiki on the site, but the dump contains no [[en:Yandex]] link.

Maybe I should try parsing the English Wikipedia instead, but I'm sure there is a nicer solution.

By the way, I use wikixmlj, and I also tried to find en:Yandex with grep.

UPD: link to the data for @svick's solution: http://dumps.wikimedia.org/[language code]wiki/latest/, e.g. http://dumps.wikimedia.org/ruwiki/latest/

1 answer

Most of the links between Wikipedia articles in different languages are now stored on Wikidata. So, if you want to go to the source, you can download the Wikidata dump and parse it (it is in JSON).
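As a rough illustration of the Wikidata route, here is a minimal Python sketch. It assumes the JSON dump layout in which the file is one large array with a single entity per line, and each entity carries a "sitelinks" object keyed by site ID (such as ruwiki and enwiki). The function name sitelink_pairs and the file name in the usage comment are only placeholders for this example.

```python
import bz2
import json


def sitelink_pairs(lines, src="ruwiki", dst="enwiki"):
    """Yield (src_title, dst_title) for every entity linked on both wikis.

    Assumes one JSON entity per line, wrapped in "[" and "]" lines,
    with a trailing comma after each entity.
    """
    for line in lines:
        line = line.strip().rstrip(",")
        if not line or line in ("[", "]"):
            continue  # skip the array brackets and blank lines
        links = json.loads(line).get("sitelinks", {})
        if src in links and dst in links:
            yield links[src]["title"], links[dst]["title"]


# Usage (file name is an assumption; the full dump is very large):
# with bz2.open("latest-all.json.bz2", "rt", encoding="utf-8") as f:
#     for ru_title, en_title in sitelink_pairs(f):
#         print(ru_title, "->", en_title)
```

Reading line by line keeps memory use flat, which matters because the complete dump does not fit in memory on most machines.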

But I think the best way would be to use the dump of the langlinks table. It contains exactly the information you want, both for links that come from Wikidata and for links that are still in the old form.

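The dump is an SQL script. You can either import it into MySQL and query it there, or parse the SQL directly (for example, from a .NET program).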

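Note that langlinks identifies the source page only by its page ID, not by its title. To get the Russian title, you also need the dump of the page table; for a small number of articles, you could query the API instead.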
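For the langlinks route, parsing the SQL directly might look like the Python sketch below. It rests on several assumptions: both dumps consist of lines of the form INSERT INTO ... VALUES (...),(...);, the langlinks columns start with ll_from, ll_lang, ll_title, and the page columns start with page_id, page_namespace, page_title. The helper names parse_insert_rows, read_rows and build_title_map are made up for this example.

```python
import gzip


def parse_insert_rows(line):
    """Yield each (...) tuple of an INSERT ... VALUES line as a list.

    Values come back as strings; an unquoted NULL becomes None.
    A backslash escape keeps the following character literally,
    which is enough for \\' and \\\\ in titles.
    """
    start = line.find(" VALUES ")
    if not line.startswith("INSERT INTO") or start == -1:
        return
    i, n = start + len(" VALUES "), len(line)
    while i < n:
        if line[i] != "(":
            i += 1
            continue
        i += 1
        row, field, in_str, was_str = [], [], False, False
        while i < n:
            c = line[i]
            if in_str:
                if c == "\\" and i + 1 < n:
                    field.append(line[i + 1])
                    i += 2
                    continue
                if c == "'":
                    in_str = False
                else:
                    field.append(c)
            elif c == "'":
                in_str, was_str = True, True
            elif c in ",)":
                text = "".join(field)
                row.append(None if text == "NULL" and not was_str else text)
                field, was_str = [], False
                if c == ")":
                    i += 1
                    break
            else:
                field.append(c)
            i += 1
        yield row


def read_rows(path):
    """Stream rows from a plain or gzip-compressed SQL dump."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            yield from parse_insert_rows(line)


def build_title_map(page_rows, langlink_rows, target_lang="en"):
    """Map source title -> target title for pages in the main namespace."""
    titles = {r[0]: r[2].replace("_", " ") for r in page_rows if r[1] == "0"}
    return {
        titles[r[0]]: r[2]
        for r in langlink_rows
        if r[1] == target_lang and r[0] in titles and r[2]
    }


# Usage (file names assumed to match the ruwiki "latest" directory):
# mapping = build_title_map(read_rows("ruwiki-latest-page.sql.gz"),
#                           read_rows("ruwiki-latest-langlinks.sql.gz"))
```

This keeps all main-namespace titles in memory, which should be workable for a single wiki; importing both dumps into MySQL and joining on the page ID is the alternative if that is too much.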


Source: https://habr.com/ru/post/1548283/

