Xml Navigator is a PHP library based on XMLReader.
You can supply XML as a string or as a URI (or as a file system path).
The navigator can present an XML document as an array or as an object.
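Because the library builds on XMLReader, the two input forms correspond to what XMLReader itself accepts. The sketch below uses only the built-in `XMLReader` class and assumes nothing about the library's own entry points; the temporary file and the variable names are illustrative.

```php
<?php
/* Assumption: plain XMLReader only, no library classes are used here */
$xml = '<outer any_attrib="attribute value"><inner>element value</inner></outer>';

/* 1. XML supplied as a string */
$fromString = new XMLReader();
$fromString->XML($xml);
$fromString->read(); /* move the cursor to the root element */
$nameFromString = $fromString->name;
$fromString->close();

/* 2. XML supplied as a URI or file system path */
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, $xml);
$fromFile = new XMLReader();
$fromFile->open($path);
$fromFile->read();
$nameFromFile = $fromFile->name;
$fromFile->close();
unlink($path);

echo $nameFromString . PHP_EOL;
echo $nameFromFile . PHP_EOL;
```

Both readers end up positioned on the same root element, so any code that consumes the cursor does not need to know where the XML came from.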
```php
$xml = <<<XML
<outer any_attrib="attribute value">
    <inner>element value</inner>
    <nested nested-attrib="nested attribute value">nested element value</nested>
</outer>
XML;

$result = \SbWereWolf\XmlNavigator\Convertation\FastXmlToArray
    ::prettyPrint($xml);
echo json_encode($result, JSON_PRETTY_PRINT);
```

OUTPUT:
```json
{
    "outer": {
        "@attributes": {
            "any_attrib": "attribute value"
        },
        "inner": "element value",
        "nested": {
            "@value": "nested element value",
            "@attributes": {
                "nested-attrib": "nested attribute value"
            }
        }
    }
}
```

Install the library with Composer:

```shell
composer require sbwerewolf/xml-navigator
```
```php
$xml = '<One attr="val">text</One><Other attr1="" attr2=""/>' . PHP_EOL;
$file = fopen('data-for-stream.xml', 'w');
fwrite($file, "<Collection>$xml$xml$xml</Collection>");
fclose($file);

/** @var XMLReader $reader */
$reader = XMLReader::open('data-for-stream.xml');
$extractor = \SbWereWolf\XmlNavigator\Parsing\FastXmlParser
    ::extractHierarchy(
        $reader,
        /* callback that detects the elements to parse */
        function (XMLReader $cursor) {
            return $cursor->name === 'One';
        }
    );

/* Extract all elements with name `One` */
foreach ($extractor as $element) {
    echo json_encode($element, JSON_PRETTY_PRINT) . PHP_EOL;
}

$reader->close();
```

The console output will be:
```json
{
    "n": "One",
    "v": "text",
    "a": {
        "attr": "val"
    }
}
{
    "n": "One",
    "v": "text",
    "a": {
        "attr": "val"
    }
}
{
    "n": "One",
    "v": "text",
    "a": {
        "attr": "val"
    }
}
```

Access time to the first element does not depend on file size.
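The reason is that XMLReader is a forward-only pull parser: the cursor advances only as far as the consumer asks. The following minimal sketch illustrates this with plain XMLReader and a generator. The helper `extractMatching` and its `$reads` counter are hypothetical names introduced for this illustration; this is not the library's implementation.

```php
<?php

/**
 * Hypothetical helper (not part of the library): lazily yields the outer
 * XML of every element accepted by $detect. $reads counts cursor moves.
 */
function extractMatching(XMLReader $reader, callable $detect, int &$reads): Generator
{
    $mayRead = $reader->read();
    $reads++;
    while ($mayRead) {
        $isMatch = $reader->nodeType === XMLReader::ELEMENT && $detect($reader);
        if ($isMatch) {
            yield $reader->readOuterXml();
            /* step over the subtree of the element just yielded */
            $mayRead = $reader->next();
        } else {
            $mayRead = $reader->read();
        }
        $reads++;
    }
}

$one = '<One attr="val">text</One><Other attr1="" attr2=""/>';
$reader = new XMLReader();
$reader->XML("<Collection>$one$one$one</Collection>");

$reads = 0;
$first = null;
$isOne = function (XMLReader $cursor) {
    return $cursor->name === 'One';
};
foreach (extractMatching($reader, $isOne, $reads) as $element) {
    $first = $element;
    break; /* stop after the first match; later nodes are never visited */
}
$reader->close();

echo $first . PHP_EOL;
echo "cursor moves before the first match: $reads" . PHP_EOL;
```

The number of cursor moves needed to reach the first match depends only on what precedes that element, not on how many elements follow it.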
Let us explain this with an example.
First, generate XML files with this script:
```php
function generateFile(string $filename, int $limit, string $xml): void
{
    /* 'w' truncates an existing file, so a rerun does not append a second root element */
    $file = fopen($filename, 'w');
    fwrite($file, '<Collection>');

    for ($i = 0; $i < $limit; $i++) {
        $content = "$xml$xml$xml$xml$xml$xml$xml$xml$xml$xml";
        fwrite($file, $content);
    }

    fwrite($file, '</Collection>');
    fclose($file);

    $size = round(filesize($filename) / 1024, 2);
    echo "$filename size is $size Kb" . PHP_EOL;
}

$xml = '<SomeElement key="123">value</SomeElement>' . PHP_EOL;
$generation['temp-465b.xml'] = 1;
$generation['temp-429Kb.xml'] = 1_000;
$generation['temp-429Mb.xml'] = 1_000_000;

foreach ($generation as $filename => $size) {
    generateFile($filename, $size, $xml);
}
```

```
temp-465b.xml size is 0.45 Kb
temp-429Kb.xml size is 429.71 Kb
temp-429Mb.xml size is 429687.52 Kb
```

Now run the benchmark with this script:
```php
/**
 * @param string $filename
 * @return void
 */
function parseFirstElement(string $filename): void
{
    $start = hrtime(true);

    /** @var XMLReader $reader */
    $reader = XMLReader::open($filename);
    $mayRead = true;
    /* scroll to first `SomeElement` */
    while ($mayRead && $reader->name !== 'SomeElement') {
        $mayRead = $reader->read();
    }

    /* Compose array from XML element with name `SomeElement` */
    $result = \SbWereWolf\XmlNavigator\Extraction\PrettyPrintComposer
        ::compose($reader);

    $reader->close();

    $finish = hrtime(true);
    $duration = $finish - $start;
    $duration = number_format($duration);
    echo "First element parsing duration of $filename is $duration ns" .
        PHP_EOL;
}

/* files to measure with the benchmark */
$files = [
    'temp-465b.xml',
    'temp-429Kb.xml',
    'temp-429Mb.xml',
];

echo 'Warm up OPcache' . PHP_EOL;
parseFirstElement(current($files));

echo 'Benchmark is starting' . PHP_EOL;
foreach ($files as $filename) {
    parseFirstElement($filename);
}
echo 'Benchmark was finished' . PHP_EOL;
```

```
Warm up OPcache
First element parsing duration of temp-465b.xml is 1,250,700 ns
Benchmark is starting
First element parsing duration of temp-465b.xml is 114,400 ns
First element parsing duration of temp-429Kb.xml is 132,400 ns
First element parsing duration of temp-429Mb.xml is 119,900 ns
Benchmark was finished
```

XmlConverter implements the array approach.
XmlConverter can be used to convert an XML document to an array, for example:
```php
$xml = <<<XML
<ElemWithNestedElems>
    <ElemWithVal>val</ElemWithVal>
    <ElemWithAttribs one="atrib" other="atrib"/>
    <ElemWithAll attribute_name="attribute-value">
        element value
    </ElemWithAll>
</ElemWithNestedElems>
XML;

$converter = new \SbWereWolf\XmlNavigator\Convertation\XmlConverter(
    val: 'value',
    attr: 'attributes',
    name: 'name',
    seq: 'sequence',
);
$xmlAsArray = $converter->toHierarchyOfElements($xml);

$prettyPrint = json_encode($xmlAsArray, JSON_PRETTY_PRINT);
echo 'JSON representation of XML:'
    . PHP_EOL
    . $prettyPrint
    . PHP_EOL;

echo 'Array representation of XML:'
    . PHP_EOL
    . var_export($xmlAsArray, true)
    . PHP_EOL;
```

OUTPUT:
```
JSON representation of XML:
{
    "name": "ElemWithNestedElems",
    "sequence": [
        {
            "name": "ElemWithVal",
            "value": "val"
        },
        {
            "name": "ElemWithAttribs",
            "attributes": {
                "one": "atrib",
                "other": "atrib"
            }
        },
        {
            "name": "ElemWithAll",
            "value": "\n        element value\n    ",
            "attributes": {
                "attribute_name": "attribute-value"
            }
        }
    ]
}
Array representation of XML:
array (
  'name' => 'ElemWithNestedElems',
  'sequence' =>
  array (
    0 =>
    array (
      'name' => 'ElemWithVal',
      'value' => 'val',
    ),
    1 =>
    array (
      'name' => 'ElemWithAttribs',
      'attributes' =>
      array (
        'one' => 'atrib',
        'other' => 'atrib',
      ),
    ),
    2 =>
    array (
      'name' => 'ElemWithAll',
      'value' => '
        element value
    ',
      'attributes' =>
      array (
        'attribute_name' => 'attribute-value',
      ),
    ),
  ),
)
```

XmlElement implements the object-oriented approach.
- `name(): string` // Returns the name of the XML element
- `hasValue(): bool` // Returns `true` if the XML element has a value
- `value(): string` // Returns the value of the XML element
- `hasAttribute(string $name = ''): bool` // Returns `true` if the XML element has an attribute with `$name`. If `$name` is omitted, then returns `true` if the XML element has any attribute
- `get(string $name = null): string` // Gets the value of the attribute with `$name`. If `$name` is omitted, then returns the value of a random attribute
- `attributes(): XmlAttribute[]` // Returns all attributes of the XML element
- `hasElement(?string $name = null): bool` // Returns `true` if the XML element has a nested element with `$name`. If `$name` is omitted, then returns `true` if the XML element has any nested element
- `pull(string $name = ''): Generator` // Pulls nested elements as `IXmlElement`. If `$name` is defined, then pulls only elements with that `$name`
- `elements(): IXmlElement[]` // Returns all nested elements
- `serialize(): array` // Generates a storable representation (`$data`) of an `IXmlElement`; use `new XmlElement($data)` to restore the `XmlElement` object
```php
$xml = <<<XML
<doc attrib="a" option="o" >
    <base/>
    <valuable>element value</valuable>
    <complex>
        <a empty=""/>
        <b val="x"/>
        <b val="y"/>
        <b val="z"/>
        <c>0</c>
        <c v="o"/>
        <c/>
        <different/>
    </complex>
</doc>
XML;

$content = \SbWereWolf\XmlNavigator\Convertation\FastXmlToArray
    ::convert($xml);
$navigator =
    new \SbWereWolf\XmlNavigator\Navigation\XmlElement($content);

/* Convert this XmlElement to an array;
   the array can be used to restore an identical XmlElement */
$gist = $navigator->serialize();
echo assert($content === $gist) ? 'is same' : 'is different';
echo PHP_EOL;

/* get name of element */
echo $navigator->name() . PHP_EOL;
/* doc */

/* get value of element */
echo "`{$navigator->value()}`" . PHP_EOL;
/* `` */

/* get list of attributes */
$attributes = $navigator->attributes();
foreach ($attributes as $attribute) {
    /** @var \SbWereWolf\XmlNavigator\Navigation\IXmlAttribute $attribute */
    echo "`{$attribute->name()}` `{$attribute->value()}`" . PHP_EOL;
}
/*
`attrib` `a`
`option` `o`
*/

/* get value of attribute */
echo $navigator->get('attrib') . PHP_EOL;
/* a */

/* get list of nested elements */
$elements = $navigator->elements();
foreach ($elements as $element) {
    echo "{$element->name()}" . PHP_EOL;
}
/*
base
valuable
complex
*/

/* get desired nested element */
/** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $elem */
$elem = $navigator->pull('valuable')->current();
echo $elem->name() . PHP_EOL;
/* valuable */

/* get all nested elements */
foreach ($navigator->pull() as $pulled) {
    /** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $pulled */
    echo $pulled->name() . PHP_EOL;
    /*
    base
    valuable
    complex
    */
}

/* get nested element with given name */
/** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $nested */
$nested = $navigator->pull('complex')->current();
/* get names of all elements of nested element */
$elements = $nested->elements();
foreach ($elements as $element) {
    echo "{$element->name()}" . PHP_EOL;
}
/*
a
b
b
b
c
c
c
different
*/

/* pull all elements with name `b` */
foreach ($nested->pull('b') as $b) {
    /** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $b */
    echo ' element with name' .
        ' `' . $b->name() .
        '` have attribute `val` with value' .
        ' `' . $b->get('val') . '`' .
        PHP_EOL;
}
/*
 element with name `b` have attribute `val` with value `x`
 element with name `b` have attribute `val` with value `y`
 element with name `b` have attribute `val` with value `z`
*/
```

The unit tests contain more usage examples; please look through them.
```shell
composer test
```

Volkhin Nikolay
e-mail ulfnew@gmail.com
phone +7-902-272-65-35
Telegram @sbwerewolf