A huge amount of text data for parsing

I am developing a Ruby parser that handles irregular text data. Can anyone tell me where I can get a large amount of open text data?

+6
2 answers

You could copy Wikipedia, or just run a batch of its pages through lynx -dump. That will also give you an extensive source of non-English text. Project Gutenberg is another good source of large amounts of plain text.
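As a rough illustration of the lynx -dump idea, here is a minimal Ruby sketch. It assumes lynx is installed and on your PATH. The helper names lynx_dump_command and dump_page are made up for this example, and the page titles in the commented usage are arbitrary picks, not a recommendation.

```ruby
require "open3"

# Assumption: lynx is installed and on PATH. -dump prints the rendered
# page as plain text; -nolist omits the trailing list of links.
def lynx_dump_command(url)
  ["lynx", "-dump", "-nolist", url]
end

# Returns the page text, or nil if lynx is missing or exits with an error.
def dump_page(url)
  text, status = Open3.capture2(*lynx_dump_command(url))
  status.success? ? text : nil
rescue Errno::ENOENT
  nil
end

# Hypothetical usage: save a few pages as .txt files to feed the parser.
# %w[Parsing Ruby_(programming_language)].each do |title|
#   text = dump_page("https://en.wikipedia.org/wiki/#{title}")
#   File.write("#{title}.txt", text) if text
# end
```

For more than a handful of pages, it is probably kinder to the servers to download an offline copy once rather than fetching pages one by one.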

+4

Source: https://habr.com/ru/post/886677/

