xml2tree

Creating and manipulating XML tree objects in PHP

This README file documents the xml2tree package for PHP (4.0), ca. June 2000, by Bill Softky of Ricoh Silicon Valley, as an adjunct to Ricoh's Open Source web application initiative RiSource.org. It is covered by the Ricoh Public Source Code License. As a work in progress, there is no guarantee that it will do what you want, or even do what it is supposed to (although suggestions and improvements are welcome). The tarfile for this project can be downloaded from RiSource.org.

This project is effectively a library class for reading/writing/displaying XML documents, for manipulating them in memory by adding/deleting/changing attribute values and child nodes, and for extracting specific nodes from larger XML objects. It is written entirely in PHP (as a file to "require"), and depends on no other software, tools, libraries etc.

To see dummy applications of xml2tree, point your browser at one of the php files in this directory, served by a php4-enabled server; you can then see the structure of the XML in PHP memory in the browser window. Further explanations of the functions involved are below.

Here are the sample demo files. You may view these on the softky.com website, or perhaps on your own php4-server after you download and un-tar this package (there is no guarantee otherwise that they can be viewed via the Web, since the webserver configuration must be just right). Look at their .php source code in an editor to see how they work:

readWrite.php (the server must have write permission in this directory, so try it at home)

fromScratch.php (works on softky.com)

extracting.php (works on softky.com)


General advice on using xml2tree

NOTE: the phrase "xml2tree" is approximate only! Certain XML constructs are NOT supported (external entities, sibling-pointers, processing instructions). xml2tree was written to expedite a particular application (storing and manipulating a certain type of XML file); it does not conform to the Document Object Model (DOM), which is the formal standard for XML memory representations. This project is meant only to be more useful than nothing at all.

NOTE: Do not try to print out a treeNode with print_r() or var_dump(); you will get stuck in an infinite loop, because each node has a reference to its parent, which prints the child, which prints the parent, etc etc. Use the class member functions treeNode->printEcho() or printHTML() instead for debugging.
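
A minimal sketch of the debugging advice above, assuming xml2tree.inc sits in the same directory as the script. The tag name "note" and its text are made up for illustration:

```php
<?php
require "xml2tree.inc";

// Hypothetical node, built only so there is something to print.
$node = new treeNode("note");
$node->insertChildText("hello");

// Safe: the class's own printers, as recommended above.
$node->printHTML();   // indented HTML in the browser's display
$node->printEcho();   // indented XML, visible via the browser's Page Source

// Avoid: print_r($node); var_dump($node);
?>
```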

There is one file, xml2tree.inc, which should be included (or "require"-ed) in any php file making use of these tools. Here follows a description of what is in that file.

/************* "in-memory" functions *******************/

The main structure is a class called treeNode, whose structure is below. The whole purpose of xml2tree is to allow direct manipulation of an XML "object", so the member functions and class members here are the things you will presumably want to use.

 class treeNode{
    var $parent;  // a reference to the parent treeNode, if existing
    var $name;        // the tagname for this node
    var $attributes;  // an associative array of the attributes (maybe empty)
    var $children;    // an array of strings and treeNodes

    // constructor for a treeNode object

    function treeNode( $newName )



    // copy() does an infinitely deep copy, as does copy(-1); any non-negative
    // argument tells how deep the copy should be.  It returns the new 
    // treeNode (the argument $deep is optional).

    function copy($deep="-1")


    // insertChild[Node/Text/Copy] makes a new child.  The simplest is
    // insertChildText, which makes a child string; the next simplest is
    // insertChildNode, which puts a reference to a node into another node 
    // (it does NOT copy it); the last is insertChildCopy, which
    // makes a new copy of the argument node. 
    // The default for inserting a child is appending; any non-negative
    // integer passed in as $newPlace will put child as close to that 
    // location as possible (the final $newPlace argument is optional).

    function insertChildNode( &$newChild , $newPlace="-1")
    function insertChildText( $newChild , $newPlace="-1")
    function insertChildCopy( $newChild , $newPlace="-1")



    // kills the child at the location $killPlace (or at the nearest
    // location to it), and fills in the leftover hole.  $killPlace may
    // be an integer location, or a "template" node (as in extract() below)
    // which gives the name/attributes of any child to delete, w/o knowing
    // its exact location. 
    function deleteChild( $killPlace)


    // deletes ("unset") the attribute given by $attKey
    function deleteAttribute( $attKey )


    // prints out indented XML to the browser's Page Source
    function printEcho()

    // prints out indented HTML to the browser's display
    function printHTML()

    // prints out indented XML to a string (the final $indent argument
    // is optional).
    function printOut(&$xmlString,  $indent="")

    // like printOut() above, but only prints the text content of nodes
    // (recursively), without printing the tags or attributes and
    // without newlines or indentation.
    function printText(&$xmlString)


    // Takes a template node, and
    // extracts references to any node with the same name and 
    // attribute values as the template.  The results are put in the
    // two rows of $resultArray: $resultArray["nodes"] is a flat array
    // of the returned nodes (at whatever depths they were found), and
    //  $resultArray["depth"] is a flat array of the depths at which
    // those nodes were found.  (The final $depth argument is optional).

    function extract( &$resultArray, $template, $depth=0 )
      

    // extracts only those nodes at a specific depth, and returns
    // a flat array of those nodes rather than a 2xN array,
    // i.e. this returns array[N] rather than
    // array["nodes"][N] and array["depth"][N]
    function extractAtDepth( &$resultArray, $template, $targetDepth )

  }
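
As a rough sketch of how the members above might be used together, the following builds a small tree in memory, extracts nodes with a template, and prints the result. All tag names, attribute names and values are hypothetical, and setting attributes directly through the $attributes member is an assumption based on the class listing (no setter function is documented):

```php
<?php
require "xml2tree.inc";

// All tag names, attribute names and values below are hypothetical.
$list = new treeNode("list");

$first = new treeNode("item");
$first->attributes["status"] = "done";   // assumed: attributes are set via the class member
$first->insertChildText("write README");

$second = new treeNode("item");
$second->attributes["status"] = "open";
$second->insertChildText("test on php4");

// insertChildNode stores a reference to the node; insertChildCopy stores a new copy.
$list->insertChildNode($first);
$list->insertChildCopy($second);

// Template node: same name and attribute values as the nodes we want.
$template = new treeNode("item");
$template->attributes["status"] = "open";

$found = array();
$list->extract($found, $template);
if (isset($found["nodes"])) {
    for ($i = 0; $i < count($found["nodes"]); $i++) {
        $match =& $found["nodes"][$i];
        $text = "";
        $match->printText($text);   // text content only, no tags
        echo "depth " . $found["depth"][$i] . ": " . htmlspecialchars($text) . "<br>\n";
    }
}

// Dump the whole tree as indented XML into a string, escaped for the browser.
$xml = "";
$list->printOut($xml);
echo "<pre>" . htmlspecialchars($xml) . "</pre>";
?>
```

The isset() check is there because this README does not say what extract() leaves in $resultArray when nothing matches.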

/************ File functions *******************/

The function readXML( $xmlFile ) reads the file given by the string $xmlFile
and returns a treeNode with the contents.


The function writeXML($outFileName, $topOutNode) takes a string $outFileName
and a treeNode $topOutNode, and writes an xml file (indented, natch)
from the treeNode.
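
A short sketch of a read/modify/write round trip using the two file functions. The file names "in.xml" and "out.xml" and the attribute name "checked" are hypothetical, and the script assumes the server has write permission in the target directory:

```php
<?php
require "xml2tree.inc";

// "in.xml", "out.xml" and the attribute name "checked" are hypothetical.
$top = readXML("in.xml");

// Change the tree in memory, here by setting an attribute on the top node.
$top->attributes["checked"] = "yes";

// Write the modified tree back out as an indented XML file.
writeXML("out.xml", $top);
?>
```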