Writing an XML Parser with PHP

Some recent articles have dealt with XML. Schemas remain to be covered before the topic is complete, but the basic syntax of the language has already been addressed. We have also looked at how the .NET DataSet uses XML and at SAX in Java (to find the individual articles, browse the posts tagged XML). This article tackles XML parsing in PHP, a powerful language that naturally offers such support: we will see how to create an XML parser in PHP.
XML documents are parsed through handlers, i.e. functions that the parser calls as it reads the file. A practical example will help clarify how this works. We start with an XML document containing our customers' personal data:

<?xml  version="1.0" ?>

<data>
   <customer idcust="1">
      <name>Thomas</name>
      <surname>Smith</surname>
      <web>http://www.test.com</web>
   </customer>
   <customer idcust="2">
      <name>Peter</name>
      <surname>Scott</surname>
      <web>http://www.example.com</web>
   </customer>
</data>

We save this simple XML file as “data.xml” on the server, in the same folder as the PHP file that parses it:

<?php

$file = "data.xml";

function startElement($parser, $name, $attrib)
{
   echo "Opening tag: $name<br>";
   if (count($attrib))
   {
      echo "attributes:";
      // each() was removed in PHP 8, so iterate with foreach instead
      foreach ($attrib as $key => $val)
         echo " $key = $val";

      echo "<br>";
   }
}

function endElement($parser, $name)
{
   echo "Closing tag: $name<br>";
}

function characterData($parser, $data)
{
   if (trim($data) != "")
      echo "Value: $data<br>";
}

?>

<html>
<head><title>Parsing XML in PHP</title></head>
<body>

<?php

$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 1);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

if (!($f = @fopen($file, "r")))
   die("Unable to open file for parsing");

echo "<b>Init parsing</b><br>";

while ($data = fread($f, 4096))
{
   if (!xml_parse($xml_parser, $data, feof($f)))
   {
      die(sprintf("%s(%d): %s<br>", $file,
      xml_get_current_line_number($xml_parser),
      xml_error_string(xml_get_error_code($xml_parser))));
   }
}

// Report completion and release resources only after the whole file has been read
fclose($f);
echo "<b>Parsing completed!</b>";
xml_parser_free($xml_parser);

?>

</body>
</html>

The program begins by defining the three handler functions, which are called respectively when an element opens, when an element closes, and when text content is read. The main part of the program creates a new parser with xml_parser_create() and then registers the handlers: xml_set_element_handler() for the two functions that handle the opening and closing of an element, and xml_set_character_data_handler() for the text between elements. Because XML_OPTION_CASE_FOLDING is enabled, element and attribute names are passed to the handlers in upper case. The XML parser supports many other handlers, such as those for processing instructions or notation declarations; for these, refer to the PHP and XML documentation. As in my other posts, this article offers a basic example to show how the mechanism works; it is then up to you to refine these examples and make the parsing performed by your PHP page more sophisticated.
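
As one possible refinement, the same three handlers can collect the data instead of printing it. The sketch below is only an illustration under stated assumptions: the function name parseCustomers() and the variable names are hypothetical and do not come from the example above, and the XML is passed as a string rather than read from a file in chunks. It gathers each customer element into an associative array, with keys in upper case because case folding is on.

```php
<?php
// Hypothetical helper (not part of the original example): collects each
// <customer> element into an associative array.
function parseCustomers(string $xml): array
{
    $customers = [];   // finished records
    $current = null;   // record being built, or null outside <customer>
    $field = null;     // name of the child element currently open

    $parser = xml_parser_create();
    // With case folding on, names reach the handlers in upper case.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 1);

    xml_set_element_handler(
        $parser,
        // Start handler: open a new record or a new field inside it.
        function ($p, $name, $attrib) use (&$current, &$field) {
            if ($name === 'CUSTOMER') {
                $current = ['IDCUST' => $attrib['IDCUST'] ?? null];
            } elseif ($current !== null) {
                $field = $name;
                $current[$field] = '';
            }
        },
        // End handler: store the record when </customer> is reached.
        function ($p, $name) use (&$customers, &$current, &$field) {
            if ($name === 'CUSTOMER') {
                $customers[] = $current;
                $current = null;
            }
            $field = null;
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$current, &$field) {
            // Text may arrive in several chunks, so append rather than assign.
            if ($current !== null && $field !== null) {
                $current[$field] .= $data;
            }
        }
    );

    // Parse the whole string in one call (third argument marks the final chunk).
    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $customers;
}

// Sample input, shortened from data.xml.
$sample = <<<'XML'
<?xml version="1.0" ?>
<data>
   <customer idcust="1">
      <name>Thomas</name>
      <surname>Smith</surname>
   </customer>
   <customer idcust="2">
      <name>Peter</name>
      <surname>Scott</surname>
   </customer>
</data>
XML;

$customers = parseCustomers($sample);
```

Whitespace between elements is ignored here because text is stored only while a field inside a customer is open, which plays the same role as the trim() check in characterData().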