The Wikipedia dump file is in XML format, so you can process it with any available XML tools.
Note that because of the size of the dump file, a SAX parser will usually be much more efficient than a DOM parser, since a DOM parser tries to load the entire document into memory, while a SAX parser streams through it and reports elements as events.
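As a rough illustration of the streaming approach, here is a minimal sketch using Python's standard `xml.sax` module. It assumes you only want the text of each `<title>` element; the `TitleHandler` class, the `extract_titles` helper, and the tiny `SAMPLE` document are made up for this example and are not the real dump layout in full.

```python
import io
import xml.sax


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks,
        # so accumulate them until the element closes.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def extract_titles(source):
    """Parse `source` (a file name or binary file object) and return all titles."""
    handler = TitleHandler()
    xml.sax.parse(source, handler)
    return handler.titles


# Hypothetical, heavily simplified stand-in for a dump file.
SAMPLE = b"""<mediawiki>
  <page><title>Alpha</title><revision><text>First page</text></revision></page>
  <page><title>Beta</title><revision><text>Second page</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    for title in extract_titles(io.BytesIO(SAMPLE)):
        print(title)
```

For a real dump you would pass the path of the XML file instead of the in-memory sample, or a file object from `bz2.open(path, "rb")` if the dump is still compressed. Collecting every title in a list is only reasonable for a demonstration; on a full dump you would more likely handle each page inside `endElement` and discard it, so that memory use stays flat.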