[MTOS-dev] XML parsing

Mark Paschal mark at sixapart.com
Thu Mar 6 10:43:12 PST 2008


Timothy Appnel wrote:
> * As already mentioned yesterday, having expat is not enough.
> XML::Parser requires the source for expat to bind against while the
> Perl module is being compiled. This adds to the complications of
> getting that module installed, though these hurdles pale in
> comparison to installing LibXML from source. I know package managers
> can help here, but in my experience installing this module is A BEAST.

Yeah, it was easy for me because I did use RPMs. Do you suppose many
folks set up production deployments without package management?


> * XML::Atom is not 1.0 compliant to my knowledge -- both the
> syndication and protocol parts.

Because of the namespace handling, or some other issue?


> * The problem with using SAX is that it doesn't put things into a DOM
> or set up an XPath engine for you. Herein lies the big difference
> between Expat and LibXML.

Yes, that is a problem. I haven't written much actual SAX code since
my Java days, but using the event-driven model directly would mean
that we and plugin authors would have to write code completely
differently. I would hazard that DOM is vastly more familiar than the
event-driven model, since DOM is available in web browser JavaScript,
where the document has already been parsed for display.
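
To make the difference concrete, here is a rough sketch of the same
task done both ways: pulling entry titles out of a small feed. It uses
Python's standard library purely as a stand-in, not the Perl modules
under discussion; the sample feed, the TitleHandler class, and the two
function names are all made up for illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical sample document, invented for this illustration.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""


class TitleHandler(xml.sax.ContentHandler):
    """Event-driven style: we track the parser's position ourselves."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_entry = False
        self._buffer = None  # collects text chunks while inside <title>

    def startElement(self, name, attrs):
        if name == "entry":
            self._in_entry = True
        elif name == "title" and self._in_entry:
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None
        elif name == "entry":
            self._in_entry = False


def titles_via_sax(xml_text):
    """Collect entry titles by reacting to parse events."""
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


def titles_via_tree(xml_text):
    """Collect entry titles by querying an in-memory tree with a path."""
    root = ET.fromstring(xml_text)
    ns = {"atom": ATOM_NS}
    return [t.text for t in root.findall("atom:entry/atom:title", ns)]
```

Both functions return the same list of titles, but the event-driven
version needs a class with hand-managed state, while the tree version
is a single path query. That gap is what plugin authors would have to
absorb if only the event-driven model were guaranteed.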


>>  While we don't parse XML in any of the core functions, we do in:
>>
>>  * the Atompub server(s)
>>  * profile exchange for OpenID commenters
> 
> There are also the XML-RPC services and Feeds.App Lite.

Right. The XML-RPC endpoint uses SOAP::Lite, which uses
XML::Parser/expat in our configuration.


>>  ship XML::XPath in extlib to have known minimal XML support.
> 
> I'm not sure "known" is necessarily true. I seem to recall XML::XPath
> will check out, but if XML::Parser and expat are not installed
> properly it will blow up the first time it's used.

That's fair to say. Personally, I've construed our shipping it in 
extlib as a guarantee to plugin authors that it's available, hence its 
use in Action Streams. That it doesn't work in some configurations is 
a fault in MT we should fix.


>>  obvious what the problem is; mt-wizard reports that XML::Atom is
>>  missing, which isn't precisely the case.
> 
> Actually I thought it reports XML::Atom is present though it won't work.

I saw the behavior I reported yesterday on one of my dev machines. I
had removed expat, XML::Parser (removed automatically, as it depended
on expat), and XML::LibXML. I couldn't remove the libxml2 shared
library, as I didn't think of that until after I had removed expat,
and removing expat removed rpm.


>>  This isn't a pressing issue and isn't a performance focus, but I see
>>  us doing much more in the future with Atompub and XML APIs (I
>>  certainly am in my hack-day time), so we should plan for it.

> I agree and that is why I think some type of plan has to be developed.
> MT wouldn't be as powerful if it were tied to a specific web server or
> database system -- I don't see XML parsers being much different.

Totally. Thanks for helping us state the relevant issues!


Mark Paschal
Software developer, Movable Type
mark at sixapart.com


