How does Excel read an XML file?

I have done a lot of research on converting an XML file to a 2D array the way Excel does it, that is, with the same algorithm Excel applies when you open an XML file in it.

    <items>
      <item>
        <sku>abc 1</sku>
        <title>a book 1</title>
        <price>42 1</price>
        <attributes>
          <attribute>
            <name>Number of pages 1</name>
            <value>123 1</value>
          </attribute>
          <attribute>
            <name>Author 1</name>
            <value>Rob dude 1</value>
          </attribute>
        </attributes>
        <contributors>
          <contributor>John 1</contributor>
          <contributor>Ryan 1</contributor>
        </contributors>
        <isbn>12345</isbn>
      </item>
      <item>
        <sku>abc 2</sku>
        <title>a book 2</title>
        <price>42 2</price>
        <attributes>
          <attribute>
            <name>Number of pages 2</name>
            <value>123 2</value>
          </attribute>
          <attribute>
            <name>Author 2</name>
            <value>Rob dude 2</value>
          </attribute>
        </attributes>
        <contributors>
          <contributor>John 2</contributor>
          <contributor>Ryan 2</contributor>
        </contributors>
        <isbn>6789</isbn>
      </item>
    </items>

I want to convert it to a two-dimensional array laid out the way Excel shows the same file when you open it:

[Image: the same XML file opened in Excel]

For now, I can extract the column names the way Excel does.

    // Collect the distinct tag names of all "complete" (leaf) elements
    function getColNames($array)
    {
        $cols = array();
        foreach ($array as $key => $val) {
            if (is_array($val) && $val['type'] == 'complete' && !in_array($val['tag'], $cols)) {
                $cols[] = $val['tag'];
            }
        }
        return $cols;
    }

    // $simple is assumed to hold the XML document as a string
    $p = xml_parser_create();
    xml_parse_into_struct($p, $simple, $vals, $index);
    xml_parser_free($p);

Goal

I want the result to be generated like this:

    array (
        0 => array (
            'sku' => 'abc 1',
            'title' => 'a book 1',
            'price' => '42 1',
            'name' => 'Number of Pages 1',
            'value' => '123 1',
            'isbn' => 12345
        ),
        1 => array (
            'sku' => 'abc 1',
            'title' => 'a book 1',
            'price' => '42 1',
            'name' => 'Author 1',
            'value' => 'Rob dude 1',
            'isbn' => 12345
        ),
        2 => array (
            'sku' => 'abc 1',
            'title' => 'a book 1',
            'price' => '42 1',
            'contributor' => 'John 1',
            'isbn' => 12345
        ),
        3 => array (
            'sku' => 'abc 1',
            'title' => 'a book 1',
            'price' => '42 1',
            'contributor' => 'Ryan 1',
            'isbn' => 12345
        ),
    )

XML Example 2 ..

    <items>
      <item>
        <sku>abc 1</sku>
        <title>a book 1</title>
        <price>42 1</price>
        <attributes>
          <attribute>
            <name>Number of pages 1</name>
            <value>123 1</value>
          </attribute>
          <attribute>
            <name>Author 1</name>
            <value>Rob dude 1</value>
          </attribute>
        </attributes>
        <contributors>
          <contributor>John 1</contributor>
          <contributor>Ryan 1</contributor>
        </contributors>
        <isbns>
          <isbn>12345a</isbn>
          <isbn>12345b</isbn>
        </isbns>
      </item>
      <item>
        <sku>abc 2</sku>
        <title>a book 2</title>
        <price>42 2</price>
        <attributes>
          <attribute>
            <name>Number of pages 2</name>
            <value>123 2</value>
          </attribute>
          <attribute>
            <name>Author 2</name>
            <value>Rob dude 2</value>
          </attribute>
        </attributes>
        <contributors>
          <contributor>John 2</contributor>
          <contributor>Ryan 2</contributor>
        </contributors>
        <isbns>
          <isbn>6789a</isbn>
          <isbn>6789b</isbn>
        </isbns>
      </item>
    </items>

XML Example 3 ..

    <items>
      <item>
        <sku>abc 1</sku>
        <title>a book 1</title>
        <price>42 1</price>
        <attributes>
          <attribute>
            <name>Number of pages 1</name>
            <value>123 1</value>
          </attribute>
          <attribute>
            <name>Author 1</name>
            <value>Rob dude 1</value>
          </attribute>
        </attributes>
        <contributors>
          <contributor>John 1</contributor>
          <contributor>Ryan 1</contributor>
        </contributors>
        <isbns>
          <isbn>
            <name>isbn 1</name>
            <value>12345a</value>
          </isbn>
          <isbn>
            <name>isbn 2</name>
            <value>12345b</value>
          </isbn>
        </isbns>
      </item>
      <item>
        <sku>abc 2</sku>
        <title>a book 2</title>
        <price>42 2</price>
        <attributes>
          <attribute>
            <name>Number of pages 2</name>
            <value>123 2</value>
          </attribute>
          <attribute>
            <name>Author 2</name>
            <value>Rob dude 2</value>
          </attribute>
        </attributes>
        <contributors>
          <contributor>John 2</contributor>
          <contributor>Ryan 2</contributor>
        </contributors>
        <isbns>
          <isbn>
            <name>isbn 3</name>
            <value>6789a</value>
          </isbn>
          <isbn>
            <name>isbn 4</name>
            <value>6789b</value>
          </isbn>
        </isbns>
      </item>
    </items>

3 answers

Going by your somewhat vague question, what you call "Excel" does the following, in my own words: it takes each /items/item element as a row. From that, in document order, the column names are the tag names of the leaf elements; if a name occurs more than once, its first position counts.
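
As a minimal sketch of the column rule just described (not necessarily what Excel does internally), the leaf-element names of the first record can be collected with XPath. The sample document is a trimmed-down, hypothetical version of the question's XML, and `$columnNames` is a name introduced only for this illustration.

```php
<?php
// Hypothetical sample: a shortened version of the question's first XML.
$source = <<<'XML'
<items>
  <item>
    <sku>abc 1</sku>
    <attributes>
      <attribute><name>Number of pages 1</name><value>123 1</value></attribute>
      <attribute><name>Author 1</name><value>Rob dude 1</value></attribute>
    </attributes>
    <contributors>
      <contributor>John 1</contributor>
    </contributors>
    <isbn>12345</isbn>
  </item>
</items>
XML;

$xml = simplexml_load_string($source);

// Select every element below the first record that has no child elements.
// Using the tag name as array key keeps a duplicate name at its first position.
$columns = [];
foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
    $columns[$leaf->getName()] = null;
}
$columnNames = array_keys($columns);

print_r($columnNames);
```

Here `name` and `value` appear twice in the document but only once in the column list, at the position where they were first seen.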

It then creates one row per item, but only if all of the item's children are leaf elements. Otherwise, the item serves as the base for several rows, and the children that contain further elements are interpolated. For instance, if such an item has two children that each contain two records, each of them is interpolated into two rows. Their values are then placed in the columns of the same name, following the logic described in the first paragraph.
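
The interpolation step can be illustrated on plain arrays, independent of any XML parsing. This is only a sketch under the assumptions stated in the paragraph above; `interpolate_rows()` and the sample data are hypothetical names and values made up for the illustration.

```php
<?php
// Hypothetical helper: expand one base row into one row per nested record.
function interpolate_rows(array $baseRow, array $groups): array
{
    // An item without nested groups yields exactly one row.
    if (!$groups) {
        return [$baseRow];
    }

    $rows = [];
    foreach ($groups as $group) {
        foreach ($group as $record) {
            // array_replace() keeps the base row's column order and
            // overwrites only the columns the nested record provides.
            $rows[] = array_replace($baseRow, $record);
        }
    }
    return $rows;
}

// Leaf values of the item itself; columns filled by nested records start as null.
$base = ['sku' => 'abc 1', 'name' => null, 'value' => null, 'contributor' => null, 'isbn' => '12345'];

$groups = [
    [ // records found under <attributes>
        ['name' => 'Number of pages 1', 'value' => '123 1'],
        ['name' => 'Author 1', 'value' => 'Rob dude 1'],
    ],
    [ // records found under <contributors>
        ['contributor' => 'John 1'],
        ['contributor' => 'Ryan 1'],
    ],
];

$rows = interpolate_rows($base, $groups);
print_r($rows);
```

Two groups of two records each give four rows, and every row repeats the item's own `sku` and `isbn` values.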

How deep into the tree this logic should go is not clear from your question, so I keep it at this level only. Otherwise, the interpolation would have to recurse deeper into the tree, and the algorithm as outlined might no longer be suitable.

To build this in PHP, you can particularly benefit from XPath, and the interpolation works well as a Generator.

    function tree_to_rows(SimpleXMLElement $xml)
    {
        // Column names: leaf elements of the first record, in document order
        $columns = [];
        foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
            $columns[$leaf->getName()] = null;
        }

        // The first yielded row is the header
        yield array_keys($columns);

        $name = $xml->xpath('/*/*[1]')[0]->getName();

        foreach ($xml->$name as $source) {
            $rowModel = array_combine(array_keys($columns), array_fill(0, count($columns), null));
            $interpolations = [];

            // Leaf children fill the row model, non-leaf children are interpolated
            foreach ($source as $child) {
                if ($child->count()) {
                    $interpolations[] = $child;
                } else {
                    $rowModel[$child->getName()] = $child;
                }
            }

            if (!$interpolations) {
                yield array_values($rowModel);
                continue;
            }

            // One row per record inside each non-leaf child
            foreach ($interpolations as $interpolation) {
                foreach ($interpolation as $interpolationStep) {
                    $row = $rowModel;
                    foreach ($interpolationStep->xpath('(.|.//*)[not(*)]') as $leaf) {
                        $row[$leaf->getName()] = $leaf;
                    }
                    yield array_values($row);
                }
            }
        }
    }

Using it can be as straightforward as:

    $xml = simplexml_load_file('items.xml');
    $rows = tree_to_rows($xml);
    echo new TextTable($rows);

This gives output along these lines:

    +-----+--------+-----+-----------------+----------+-----------+-----+
    |sku  |title   |price|name             |value     |contributor|isbn |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |Number of pages 1|123 1     |           |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |Author 1         |Rob dude 1|           |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |                 |          |John 1     |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |                 |          |Ryan 1     |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |Number of pages 2|123 2     |           |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |Author 2         |Rob dude 2|           |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |                 |          |John 2     |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |                 |          |Ryan 2     |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+

TextTable is a slightly modified version of https://gist.github.com/hakre/5734770 that can work with generators, in case you are looking for that code.
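
If the associative-array shape from the question is wanted rather than a text table, rows in this "header first, then data" form could be re-keyed as in the following sketch. Both `sample_rows()` (a stand-in so the example runs on its own) and `rows_to_assoc()` are hypothetical names, and dropping empty cells is an assumption based on the goal array shown in the question.

```php
<?php
// Hypothetical stand-in for a row generator: header first, then data rows.
function sample_rows(): Generator
{
    yield ['sku', 'name', 'contributor'];
    yield ['abc 1', 'Author 1', null];
    yield ['abc 1', null, 'John 1'];
}

// Hypothetical helper: key each data row by the header and drop empty cells.
function rows_to_assoc(iterable $rows): array
{
    $header = null;
    $result = [];
    foreach ($rows as $row) {
        if ($header === null) {
            $header = $row; // the first row carries the column names
            continue;
        }
        $assoc = [];
        foreach (array_combine($header, $row) as $column => $cell) {
            if ($cell !== null) {
                // The string cast also turns SimpleXMLElement cells into text.
                $assoc[$column] = (string) $cell;
            }
        }
        $result[] = $assoc;
    }
    return $result;
}

$assocRows = rows_to_assoc(sample_rows());
print_r($assocRows);
```

Because the helper accepts any iterable, a generator can be passed in directly without materializing all rows first.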

To get the array you want from the XML file you provided, you would have to do it this way. It wasn't too much fun, so I hope this is really what you wanted.

Given the exact XML from your question, this produces the output you listed as the end result.

It was written in PHP 5.6. If you run into problems in your environment, you may have to move the function calls onto their own lines and replace [] with array().

    $items = simplexml_load_file("items.xml");
    $items_array = [];

    foreach ($items as $item) {
        // One row per <attribute>
        foreach ($item->attributes->attribute as $attribute) {
            array_push($items_array, itemsFactory($item, (array) $attribute));
        }
        // One row per <contributor>
        foreach ((array) $item->contributors->contributor as $contributor) {
            array_push($items_array, itemsFactory($item, $contributor));
        }
    }

    // $vars is either an attribute array (name/value) or a contributor string
    function itemsFactory($item, $vars)
    {
        $item = (array) $item;
        return [
            "sku" => $item['sku'],
            "title" => $item['title'],
            "price" => $item['price'],
            "name" => (is_array($vars) ? $vars['name'] : ""),
            "value" => (is_array($vars) ? $vars['value'] : ""),
            "contributor" => (is_string($vars) ? $vars : ""),
            "isbn" => $item['isbn']
        ];
    }

    var_dump($items_array);

Here is the result when you run it against your XML file:

    array(8) {
      [0]=> array(7) {
        ["sku"]=> string(5) "abc 1"
        ["title"]=> string(8) "a book 1"
        ["price"]=> string(4) "42 1"
        ["name"]=> string(17) "Number of pages 1"
        ["value"]=> string(5) "123 1"
        ["contributor"]=> string(0) ""
        ["isbn"]=> string(5) "12345"
      }
      [1]=> array(7) {
        ["sku"]=> string(5) "abc 1"
        ["title"]=> string(8) "a book 1"
        ["price"]=> string(4) "42 1"
        ["name"]=> string(8) "Author 1"
        ["value"]=> string(10) "Rob dude 1"
        ["contributor"]=> string(0) ""
        ["isbn"]=> string(5) "12345"
      }
      [2]=> array(7) {
        ["sku"]=> string(5) "abc 1"
        ["title"]=> string(8) "a book 1"
        ["price"]=> string(4) "42 1"
        ["name"]=> string(0) ""
        ["value"]=> string(0) ""
        ["contributor"]=> string(6) "John 1"
        ["isbn"]=> string(5) "12345"
      }
      [3]=> array(7) {
        ["sku"]=> string(5) "abc 1"
        ["title"]=> string(8) "a book 1"
        ["price"]=> string(4) "42 1"
        ["name"]=> string(0) ""
        ["value"]=> string(0) ""
        ["contributor"]=> string(6) "Ryan 1"
        ["isbn"]=> string(5) "12345"
      }
      [4]=> array(7) {
        ["sku"]=> string(5) "abc 2"
        ["title"]=> string(8) "a book 2"
        ["price"]=> string(4) "42 2"
        ["name"]=> string(17) "Number of pages 2"
        ["value"]=> string(5) "123 2"
        ["contributor"]=> string(0) ""
        ["isbn"]=> string(4) "6789"
      }
      [5]=> array(7) {
        ["sku"]=> string(5) "abc 2"
        ["title"]=> string(8) "a book 2"
        ["price"]=> string(4) "42 2"
        ["name"]=> string(8) "Author 2"
        ["value"]=> string(10) "Rob dude 2"
        ["contributor"]=> string(0) ""
        ["isbn"]=> string(4) "6789"
      }
      [6]=> array(7) {
        ["sku"]=> string(5) "abc 2"
        ["title"]=> string(8) "a book 2"
        ["price"]=> string(4) "42 2"
        ["name"]=> string(0) ""
        ["value"]=> string(0) ""
        ["contributor"]=> string(6) "John 2"
        ["isbn"]=> string(4) "6789"
      }
      [7]=> array(7) {
        ["sku"]=> string(5) "abc 2"
        ["title"]=> string(8) "a book 2"
        ["price"]=> string(4) "42 2"
        ["name"]=> string(0) ""
        ["value"]=> string(0) ""
        ["contributor"]=> string(6) "Ryan 2"
        ["isbn"]=> string(4) "6789"
      }
    }

If you actually have access to the Excel file rather than the XML, this could be a lot easier. In that case we could use PHPExcel to produce the same thing, and it would work for any dataset, not just the one specified. If that is not the case, I cannot think of any other way to convert this XML file into what you want.

EDIT:

This may also shed more light on the subject, and it comes from the developer of PHPExcel himself: "PHPExcel factory error when reading XML from URL". As you can see there, I don't think you can write something that parses any XML file thrown at it without lifting some of Excel's source code or spending a very long time on it, which goes beyond the scope of this question. However, if you did have to write something that parses any XML file, I have a feeling it would look like the code above, but with a TON of conditionals.


The PHP library PHPExcel may solve your problem:

https://phpexcel.codeplex.com/

You can also find some examples here:

https://phpexcel.codeplex.com/wikipage?title=Examples&referringTitle=Home

https://github.com/PHPOffice/PHPExcel/wiki/User%20Documentation

At the time of writing this was the most reliable Excel library for PHP, and it was constantly maintained and updated; it has since been deprecated in favor of its successor, PhpSpreadsheet.

Keep in mind that it can both read (from Excel files, etc.) and write (to Excel files, PDF, etc.).


Source: https://habr.com/ru/post/975704/