How to parse attribute values inside {{ }} (double curly braces) in an infobox

Inside an infobox on Wikipedia, some attribute values are themselves wrapped in curly braces {{ }}. They sometimes contain lines (|) as well. I need the values inside the braces as they appear on the Wikipedia web page. I read that these are also templates. Can someone give me a link or advise me how to deal with them?

2 answers

Double curly braces {{ }} mark a call to a magic word, variable, parser function, or template. Help can be found at MediaWiki.org/.../Manual:Magic_words . The small vertical lines (|) are called pipes; they act as delimiters that let the wiki's core parser tell apart the parameters passed to a magic word, variable, parser function, or template.
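
To illustrate how pipes delimit parameters, here is a minimal sketch that splits a flat template call into its name and parameters. The function name splitTemplateCall and the sample call are hypothetical, and the sketch assumes the call contains no nested {{ }} or [[ | ]] constructs, which real infoboxes often do; the actual parser handles those cases.

```php
<?php
// Hypothetical helper, not part of MediaWiki: splits a flat call
// such as {{name|a|key=b}}. Assumes no nested braces or links.
function splitTemplateCall(string $wikitext): array
{
    $inner = trim($wikitext);
    if (substr($inner, 0, 2) !== '{{' || substr($inner, -2) !== '}}') {
        throw new InvalidArgumentException('Not a template call');
    }
    // Strip the surrounding double braces.
    $inner = substr($inner, 2, -2);

    // The pipe separates the template name from its parameters.
    $pieces = explode('|', $inner);
    $name = trim(array_shift($pieces));

    $params = [];
    $position = 1; // unnamed parameters are numbered from 1
    foreach ($pieces as $piece) {
        if (strpos($piece, '=') !== false) {
            // Named parameter: split on the first "=" only.
            [$key, $value] = explode('=', $piece, 2);
            $params[trim($key)] = trim($value);
        } else {
            // Unnamed parameter (trimmed here for convenience).
            $params[$position++] = trim($piece);
        }
    }
    return ['name' => $name, 'params' => $params];
}

$call = splitTemplateCall('{{convert|100|km|abbr=on}}');
```

Here $call['name'] holds the template name and $call['params'] maps positions or names to values. For anything beyond simple calls, the parse tree approach described in the other answer is more reliable.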

I hope this helps anyone facing the same problem. Assuming you will process the infobox with PHP, you can request a URL like this: http://www.mywiki.com/wiki/api.php?format=xml&action=query&titles=PAGE_TITLE_THAT_CONTAINS_AN_INFOBOX&prop=revisions&rvprop=content&rvgeneratexml

When 'rvgeneratexml' is set to true (1), the <rev> XML node is generated with a "parsetree" attribute that contains the infobox information in XML format.
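
As a sketch, the request URL can be assembled with http_build_query rather than typed by hand, which avoids typos in parameter names. The endpoint and page title below are the placeholders from this answer, not real values, and rvgeneratexml is passed explicitly as 1.

```php
<?php
// Placeholder endpoint and title from the answer; replace with real values.
$endpoint = 'http://www.mywiki.com/wiki/api.php';

$params = [
    'format'        => 'xml',
    'action'        => 'query',
    'titles'        => 'PAGE_TITLE_THAT_CONTAINS_AN_INFOBOX',
    'prop'          => 'revisions',
    'rvprop'        => 'content',
    'rvgeneratexml' => 1, // ask for the "parsetree" attribute on <rev>
];

// http_build_query URL-encodes each value and joins the pairs with "&".
$url = $endpoint . '?' . http_build_query($params);
```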

Then in PHP you can load the whole response (everything from <api> to </api>, including <rev>) using SimpleXML:

 $xml = simplexml_load_file($url); 

Then you can load the template information by reading the "parsetree" attribute and parsing it as an XML string:

 // The parsetree attribute holds XML as text, so cast it to a string and parse it again
 $template = simplexml_load_string((string) $xml->query->pages->page->revisions->rev->attributes()->parsetree);
 $template = $template->template; // If more than 1 template, check template[0], [1], etc

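Then, following that structure, you can access the elements with something like:

 // Compare with ==; a single = would assign 'name' and make the condition always true
 if (trim((string) $template->part[0]->name) == 'name') $film = (string) $template->part[0]->value;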

Then $film will contain the name of the movie ( ->name is the name of the parameter, and ->value is its value).
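
Instead of checking one part at a time, all parameters can be collected in a loop. The sketch below uses a hand-written sample string that imitates the assumed shape of the parse tree (a <root> containing <template> elements with <part>, <name> and <value> children); the sample values are made up, and a real tree would come from the "parsetree" attribute.

```php
<?php
// Hand-written sample imitating the assumed parse tree shape.
$parsetree = '<root><template><title>Infobox film</title>'
    . '<part><name> name </name><equals>=</equals><value> Example Film </value></part>'
    . '<part><name> director </name><equals>=</equals><value> Jane Doe </value></part>'
    . '</template></root>';

$root = simplexml_load_string($parsetree);

// Map each parameter name to its value for the first template.
$infobox = [];
foreach ($root->template[0]->part as $part) {
    // Casting to string keeps only the direct text of the node, so a value
    // that itself contains nested templates would need extra handling.
    $infobox[trim((string) $part->name)] = trim((string) $part->value);
}
```

After the loop, $infobox['name'] plays the role of $film above, and the other parameters are available under their own keys.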

Source: https://habr.com/ru/post/1387865/

