You know, I entered this project of mine (the Distance Ed. Forms Maker/ CMS thingy that is not fully formed yet) with a lot of hopes and dreams. I wanted to learn all the “x” technologies and the DOM.
Problem is, I’m a real person with a real life, and my job isn’t computer programming. So I need to finish this project and just don’t have the time to learn everything. I can do basic stuff with DOM and SimpleXML, but these things can get quirky when your data or data structure gets complicated and you’re not an expert. So I prefer parsing XML myself. How do I do it? I just put it all in an array and manipulate the data as I would any other array. I can do this because it’s MY program and I know the data (so I don’t need generic thingies like parents, children, ancestors, aunts, uncles, and cousins).
But just because I don’t have the time doesn’t mean I can’t do some serious XML manipulation. Why should I miss out on all the fun of this new technology?
I manipulate my files in one of two ways. The first is simply lining up my XML data in a single, normally indexed array, like this:
0 => <shoppingList>
1 => <fruits>
2 => <highCalories>
3 => banana
4 => </highCalories>
5 => <lowCalories>
6 => apple
7 => </lowCalories>
8 => </fruits>
9 => <snacks>
. . .
15 => </snacks>
…
23 => </shoppingList>
In this case, I simply use for loops, foreach loops, ifs and buts, and all the other stuff we normally do in PHP, because I know where the data is in the hierarchy. My point in all this is not to teach you how to manipulate your data but how to get your raw XML data into array form. Here is that function. It accepts a string, just as you would get from HTTP_RAW_POST_DATA (or from file_get_contents('php://input') in PHP 7 and later, where $HTTP_RAW_POST_DATA no longer exists).
///////////////////////////////////
/* The following procedure parses the string into an array of XML elements
by first hiding every space behind a placeholder, then padding the tag
markers with spaces, then exploding the string into an array of XML
elements and text, and finally putting the original spaces back in. */
function xmlstring2xmlarray($xmlstring)
{
    $xmlstring = str_replace(" ", "xxzz", $xmlstring); // hide real spaces behind a placeholder
    $xmlstring = str_replace("<", " <", $xmlstring);
    $xmlstring = str_replace(">", "> ", $xmlstring);
    //echo $xmlstring;
    $xmlstring = trim($xmlstring);
    $xmlArray = explode(" ", $xmlstring);
    for ($i = 0; $i < count($xmlArray); $i++)
    {
        $xmlArray[$i] = str_replace("xxzz", " ", $xmlArray[$i]);
        //echo "$xmlArray[$i]\n";
    }
    //endfor
    $xmlArray = stripArrayNonElements($xmlArray); //user-defined function
    //stripArrayNonElements trims every item, so no extra trim is needed here
    //echoArrayValues($xmlArray); //user-defined function
    return $xmlArray;
}
//endfunction xmlstring2xmlarray
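For comparison, here is a minimal sketch of the same idea done with a single regular expression instead of the placeholder trick. It is not the function above: the name xmlToFlatArray and the sample XML are made up for illustration, and it assumes PHP 7 or later and simple XML with no attributes containing ">" and no CDATA sections.

```php
<?php
// Hypothetical helper (not from the original post): split an XML string
// into a flat, normally indexed array of tags and text.
function xmlToFlatArray(string $xml): array
{
    // Split on every <...> tag and keep the tags themselves as pieces.
    $pieces = preg_split('/(<[^>]+>)/', $xml, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    // Trim every piece, then drop the pieces that held only whitespace.
    $pieces = array_map('trim', $pieces);
    $pieces = array_filter($pieces, function ($piece) { return $piece !== ''; });
    // Re-index so the keys run 0, 1, 2, ...
    return array_values($pieces);
}

// Assumed sample data, modelled on the shopping list above.
$xml = "<fruits>\n  <highCalories>banana</highCalories>\n  <lowCalories>apple</lowCalories>\n</fruits>";
$flat = xmlToFlatArray($xml);

// Because the layout is known, an ordinary loop finds the value we want.
$lowCalorie = null;
foreach ($flat as $i => $item) {
    if ($item === '<lowCalories>') {
        $lowCalorie = $flat[$i + 1]; // the text sits right after its opening tag
        break;
    }
}
echo $lowCalorie, "\n";
```

The loop at the end is the kind of plain array handling described earlier: no parents or children, just an index and the item after it.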
I do most of my manipulating from this format. However, occasionally I like to work with an associative array that goes more like this (a real sample from my app). (I put the brackets and bars around the values because it helps to check that there are no leading or trailing spaces.) The element names become the keys and the text becomes the values, e.g. <htmlbgcolor>pink</htmlbgcolor> <htmlcolor>#000033</htmlcolor>, etc.
Key: htmlbgcolor; Value: [|pink|]
Key: htmlcolor; Value: [|#000033|]
Key: htmlbordL; Value: [|#14223D|]
Key: htmlbordR; Value: [|#14223D|]
Key: htmlbordT; Value: [|#14223D|]
Key: htmlbordB; Value: [|#14223D|]
Key: bodybgcolor; Value: [|#f0f0f0|]
Key: bodycolor; Value: [|#000033|]
Key: bodybordL; Value: [|#800000|]
Key: bodybordR; Value: [|#800000|]
Key: bodybordT; Value: [|#800000|]
Key: bodybordB; Value: [|#800000|]
Here is the code:
function xmlSTR2xmlAssocARR($xmlstring)
{
    //*******************************
    //*****procedure xmlSTR2xmlARR
    //collapse every run of whitespace into one space
    //(preg_replace stands in for ereg_replace, which was removed in PHP 7)
    $xmlstring = preg_replace('/\s+/', ' ', $xmlstring);
    $xmlstring = str_replace(" ", "xxzz", $xmlstring);
    $xmlstring = str_replace("<", " <", $xmlstring);
    $xmlstring = str_replace(">", "> ", $xmlstring);
    $xmlstring = trim($xmlstring);
    $xmlArray = explode(" ", $xmlstring);
    for ($i = 0; $i < count($xmlArray); $i++)
    {
        $xmlArray[$i] = str_replace("xxzz", " ", $xmlArray[$i]);
        //echo "$xmlArray[$i]\n";
    }
    //endfor
    $xmlArray = stripArrayNonElements($xmlArray); //user-defined function
    $xmlArray = stripArrayNonElements2($xmlArray); //user-defined function (not shown in this post)
    //echo "x"; echoArrayKeyValue($xmlArray);
    //****end of procedure xmlSTR2xmlARR
    //*******************************
    //Begin building Associative Array
    $xmlAssocARR = array();
    $count = count($xmlArray);
    for ($i = 0; $i < $count; $i++)
    {
        $item = $xmlArray[$i];
        //look ahead one item, without reading past the end of the array
        $next = ($i + 1 < $count) ? $xmlArray[$i + 1] : "";
        if (preg_match('/^<\//', $item))
        {
            //closing tag: nothing to store
            continue;
        }
        elseif (preg_match('/\/>$/', $item))
        {
            //self-closing tag such as <name/>
            $name = str_replace(array("<", "/>"), "", $item);
            $xmlAssocARR[$name] = "none";
        }
        elseif (preg_match('/^</', $item) && preg_match('/^<\//', $next))
        {
            //opening tag followed directly by a closing tag: empty element
            $name = str_replace(array("<", ">"), "", $item);
            $xmlAssocARR[$name] = "none";
        }
        elseif (preg_match('/^</', $item) && $next !== "" && !preg_match('/^</', $next))
        {
            //opening tag followed by text: the text becomes the value
            $name = str_replace(array("<", ">"), "", $item);
            $xmlAssocARR[$name] = $next;
        }
        //endif
    }
    //endfor
    return $xmlAssocARR;
}
//endfunction xmlSTR2xmlAssocARR
I usually like to return the same type of variable as I received, but I didn't this time. You may like to do an implode before returning the variable.
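If all you need is the key/value listing shown earlier, the same result can be sketched more compactly with preg_match_all. This is only an illustration, not my function: the name xmlLeavesToAssoc and the sample string are invented, it assumes PHP 7 or later and elements without attributes, and it adds self-closing elements after the others instead of in document order.

```php
<?php
// Hypothetical helper (not from the original post): map each leaf element
// <name>text</name> to name => text, and each empty element to "none".
function xmlLeavesToAssoc(string $xml): array
{
    $assoc = array();
    // Leaf elements: an opening tag, text with no nested tags, the matching closing tag.
    preg_match_all('/<([A-Za-z_][\w.-]*)>([^<]*)<\/\1>/', $xml, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $text = trim($m[2]);
        $assoc[$m[1]] = ($text === '') ? 'none' : $text;
    }
    // Self-closing elements such as <name/>.
    preg_match_all('/<([A-Za-z_][\w.-]*)\s*\/>/', $xml, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $assoc[$m[1]] = 'none';
    }
    return $assoc;
}

// Assumed sample data, modelled on the colour settings above.
$styles = xmlLeavesToAssoc('<style><htmlbgcolor>pink</htmlbgcolor><htmlcolor>#000033</htmlcolor><logo/></style>');
foreach ($styles as $key => $value) {
    echo "Key: $key; Value: [|$value|]\n";
}
```

As with the longer function, a repeated element name overwrites the earlier value, so this only suits data where each name appears once.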
Here is where I trim the array of all useless array elements.
function stripArrayNonElements($array)
{
    //first trim each element to make sure there is no leading or trailing whitespace
    for ($i = 0; $i < count($array); $i++)
    {
        $array[$i] = trim($array[$i]);
        //echo "|$array[$i]|\n";
    }
    //now rid the array of all non-elements (any element that held only white spaces);
    //after the trim above these are empty strings, so test for that directly
    //(ereg was removed in PHP 7, and "^[[:space:]]" can never match a trimmed value)
    foreach ($array as $key => $value)
    {
        if ($value === "")
        {
            unset($array[$key]);
        }
        //endif
    }
    //endforeach
    $new_array = array_values($array);
    return $new_array;
}
//endfunction
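The same trim-and-strip step can also be written with PHP's built-in array functions. A minimal sketch, assuming PHP 7 or later; stripBlankItems is a made-up name, not one of my functions:

```php
<?php
// Hypothetical helper (not from the original post): trim every item,
// drop the ones that end up empty, and re-index the result.
function stripBlankItems(array $items): array
{
    $items = array_map('trim', $items);
    // An explicit callback is used because array_filter() without one
    // would also throw away the string "0".
    $items = array_filter($items, function ($item) { return $item !== ''; });
    return array_values($items);
}

print_r(stripBlankItems(array('<fruits>', "\n", ' apple ', '', '</fruits>')));
```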