Extract data from XML file using SimpleXML in PHP

Introduction:

I want to iterate over XML files with a flexible category structure.

Problem:

I don't know how to walk through the theoretically infinite levels of subcategories without nesting an arbitrary number of foreach loops (see the code example below). How can I navigate the category structure dynamically?

    <?xml version="1.0" encoding="utf-8"?>
    <catalog>
        <category name="Category - level 1">
            <category name="Category - level 2" />
            <category name="Category - level 2">
                <category name="Category - level 3" />
            </category>
            <category name="Category - level 2">
                <category name="Category - level 3">
                    <category name="Category - level 4" />
                </category>
            </category>
        </category>
    </catalog>

What I have now:

I have no problems looping through XML files whose structure has a fixed, known depth:

    <catalog>
        <category name="Category - level 1">
            <category name="Category - level 2">
                <category name="Category - level 3" />
            </category>
            <category name="Category - level 2">
                <category name="Category - level 3" />
            </category>
        </category>
    </catalog>

Coding Example:

    // $xml holds the XML file
    foreach ($xml as $category_level1) {
        echo $category_level1['name'];
        foreach ($category_level1->category as $category_level2) {
            echo $category_level2['name'];
            foreach ($category_level2->category as $category_level3) {
                echo $category_level3['name'];
            }
        }
    }
4 answers

Getting the name attributes of your categories is most likely faster when done through XPath, for example:

 $categoryNames = $doc->xpath('//category/@name'); 
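One detail worth knowing about the result: xpath() returns an array of SimpleXMLElement objects (one per matched attribute), so each entry has to be cast to a string if plain names are needed. A minimal sketch, assuming a shortened version of the XML from the question is held in a string; the variable names $source and $doc are only illustrative:

```php
<?php
// Shortened sample document; any nesting depth works the same way.
$source = <<<XML
<catalog>
    <category name="Category - level 1">
        <category name="Category - level 2" />
        <category name="Category - level 2">
            <category name="Category - level 3" />
        </category>
    </category>
</catalog>
XML;

$doc = simplexml_load_string($source);

// xpath() returns an array of SimpleXMLElement objects, one per @name match;
// strval() turns each of them into the plain attribute value.
$categoryNames = array_map('strval', $doc->xpath('//category/@name'));

print_r($categoryNames);
```

The names come back in document order, so the nesting information is lost; this fits the case where only the flat list of names matters.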

However, if you want to iterate recursively over an arbitrarily nested XML structure, you can also use SimpleXMLIterator. For example, with $xml being the string you gave:

    $sxi = new RecursiveIteratorIterator(
        new SimpleXMLIterator($xml),
        RecursiveIteratorIterator::SELF_FIRST
    );
    foreach ($sxi as $node) {
        echo str_repeat("\t", $sxi->getDepth()), // indenting
             $node['name'],                      // getting attribute name
             PHP_EOL;                            // line break
    }

will give

    Category - level 1
        Category - level 2
        Category - level 2
            Category - level 3
        Category - level 2
            Category - level 3
                Category - level 4

As said at the beginning, when you just want to get all the name attributes, use XPath, because iterating over each node is slow. Use the iterator approach only when you want to do more complex things with the nodes, for example adding something to them.

    <?php
    $xml = new SimpleXMLElement('.....');
    foreach ($xml->xpath('//category') as $cat) {
        echo $cat['name'];
    }

A possible solution would be to write a recursive function that does the following:

  • For each category at the current depth:
    • print the name of the current category
    • if it has child categories, call itself on them.

The advantage of this solution is that you can keep track of your current depth in the XML document, which is useful if you need to present your data as a tree, for example.


For example, if you loaded your XML as follows:

    $string = <<<XML
    <catalog>
        <category name="Category - level 1">
            <category name="Category - level 2">
                <category name="Category - level 3" />
            </category>
            <category name="Category - level 2">
                <category name="Category - level 3" />
            </category>
        </category>
    </catalog>
    XML;
    $xml = simplexml_load_string($string);


You can call the recursive function as follows:

 recurse_category($xml); 


And this function can be written as follows:

    function recurse_category($categories, $depth = 0) {
        foreach ($categories as $category) {
            // Indent by two non-breaking spaces per level (HTML output)
            echo str_repeat('&nbsp; ', 2 * $depth);
            echo (string) $category['name'];
            echo '<br />';
            // Descend only if this category has child categories
            if ($category->category) {
                recurse_category($category->category, $depth + 1);
            }
        }
    }


Finally, running this code will give you this output (as rendered by a browser):

    Category - level 1
        Category - level 2
            Category - level 3
        Category - level 2
            Category - level 3
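If the tree is needed as data rather than as printed output, the same recursion can return a nested array instead of echoing. This is only a sketch under that assumption; the function name category_tree and the array keys 'name' and 'children' are made up for illustration and are not part of the answer above:

```php
<?php
// Hypothetical helper: returns the categories directly below $parent
// as a nested array, recursing into each category's own children.
function category_tree(SimpleXMLElement $parent): array
{
    $tree = [];
    foreach ($parent->category as $category) {
        $tree[] = [
            'name'     => (string) $category['name'],
            // A leaf category has no <category> children, so this is [].
            'children' => category_tree($category),
        ];
    }
    return $tree;
}

$xml = simplexml_load_string(
    '<catalog><category name="A"><category name="B" /></category></catalog>'
);
$tree = category_tree($xml);

// Name of the first child of the first top-level category
echo $tree[0]['children'][0]['name'], PHP_EOL;
```

Returning the structure keeps the traversal separate from the presentation, so the same array could be rendered as HTML, JSON, or plain text.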

Using SimpleXML and XPath works well, but as a side note: if all you want is the name attribute of each <category> element in document order, you can also use DOMDocument::getElementsByTagName().
You can switch between DOM and SimpleXML via dom_import_simplexml() and simplexml_import_dom(). Both use the same internal data representation, so there is no costly conversion.

    $xml = '<?xml version="1.0" encoding="utf-8"?>
    <catalog>
        <category name="Category - level 1">
            <category name="Category - level 2" />
            <category name="Category - level 2">
                <category name="Category - level 3" />
            </category>
            <category name="Category - level 2">
                <category name="Category - level 3">
                    <category name="Category - level 4" />
                </category>
            </category>
        </category>
    </catalog>';

    $doc = new DOMDocument;
    $doc->loadXML($xml);
    foreach ($doc->getElementsByTagName('category') as $c) {
        echo $c->getAttribute('name'), "\n";
    }

prints

    Category - level 1
    Category - level 2
    Category - level 2
    Category - level 3
    Category - level 2
    Category - level 3
    Category - level 4
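To illustrate the switch between the two APIs mentioned above, here is a small sketch that starts from a SimpleXML object, hands it to the DOM, and converts back again. The sample XML and the variable names are assumptions chosen for illustration:

```php
<?php
$sxe = simplexml_load_string(
    '<catalog><category name="A"><category name="B" /></category></catalog>'
);

// dom_import_simplexml() returns the DOM node behind the SimpleXML element.
$root = dom_import_simplexml($sxe);

$names = [];
foreach ($root->getElementsByTagName('category') as $c) {
    $names[] = $c->getAttribute('name');
}

// And back again: simplexml_import_dom() accepts a DOM node.
$again = simplexml_import_dom($root->ownerDocument);

echo implode(', ', $names), PHP_EOL;     // names in document order
echo $again->category['name'], PHP_EOL;  // first top-level category
```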

Source: https://habr.com/ru/post/1306023/

