I don’t know anything about Struts 1, but Bill de hÓra’s recent post has got some interesting web-application-design tips. There were two particular bits that spoke to me:
struts-config.xml struts-config tries to capture primarily the flow of application state on the server, by being an awkward representation of a call graph. In doing it misses a key aspect of the web - hypertext. In web architecture, HTML hypertext on the client is the engine of application state, not an XML file on the server.
In other words (I think) in web applications your state is the page you’re on, and taking action is about following the links (or submitting the forms) on the page. Your actions (and therefore the transitions between different states) are determined by what links and forms are on the page. But in fact, URLs should be hackable, and transitions unlimited. When you design the application what you really need to think about are the tasks the users want to achieve (and therefore the transitions that they might want to make) rather than the possible state transitions.
A lot of people run into problems with namespaces, and most of those arise from using default namespaces (i.e. not giving namespaces prefixes). The transformation technology you use can have a big effect on how confusing and irritating it gets.
Default namespaces make XML documents easier to read because they allow you to just give the local name of an element rather than using prefixes all over the place. For example, using:
<house status="For Sale" xmlns="http://www.example.com/ns/house">
  <askingPrice>...</askingPrice>
  <address>...</address>
  <layout>...</layout>
</house>
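To see why a default namespace can trip people up, here is a minimal sketch in Python using the standard library's ElementTree (not one of the transformation technologies the post is about). The `h` prefix and the variable names are assumptions made for illustration. The point it shows: the children inherit the default namespace even though nothing in their tags says so, so a lookup by bare local name finds nothing.

```python
import xml.etree.ElementTree as ET

HOUSE_NS = "http://www.example.com/ns/house"

doc = f"""<house status="For Sale" xmlns="{HOUSE_NS}">
  <askingPrice>...</askingPrice>
  <address>...</address>
  <layout>...</layout>
</house>"""

root = ET.fromstring(doc)

# The element name is really a (namespace, local name) pair.
root_tag = root.tag

# Looking up the bare local name fails: askingPrice is in the house namespace.
unqualified = root.find("askingPrice")

# Binding a prefix of our own choosing (hypothetical "h") to the namespace works.
qualified = root.find("h:askingPrice", {"h": HOUSE_NS})

# Unprefixed attributes are NOT in the default namespace.
status = root.get("status")

print(root_tag, unqualified, qualified is not None, status)
```

The prefix used in the query does not have to appear in the document at all; only the namespace URI has to match.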
I’m coming up to completion on the project that I’ve been working on for the last six months or so. It’s been very different from the projects that I’m used to: usually I fly in and either write a stylesheet or schema to a given specification and hand it off, or have to wade through, critique and improve a lot of existing XSLT code or schemas. Here, I’ve been involved in a much more end-to-end way: having to do the technical specification myself, do a lot of the testing, and deal with the bugs. Plus half of the project has involved customising existing (complex) stylesheets, written by someone else.
So, what have I learned?
David Carlisle’s posted a great tip on getting exsl:node-set() to work in IE:
In the above XSL-List thread I casually suggested that an alternative would be to just always use exslt:node-set in the body of the stylesheet and use the msxsl:script extension to define exslt:node-set for IE. That turned out not to be as easy as I thought as node-set isn’t a valid function name in either of the supported extension languages in msxsl (JScript or VBScript). However Julian Reschke came up with the construct needed, use associative array syntax so you can use [‘node-set’] to define the function.
Since there’s next to no ‘net connection at XTech 2007 (obviously the Web is not so ubiquitous as all that), I have nothing to do in the sessions but listen! Here are some thoughts about the sessions that I attended on the morning of Wednesday 16th. I haven’t included the keynotes, not because they weren’t interesting, but because I can’t think of anything to say about them at the moment.
I used to know how to arrange my XSLT modules. Each module had to be self-contained, and any common code imported into all the modules that used it. The reason? Because when you have on-going validation of your XSLT stylesheets, if the module can’t stand alone then you get all sorts of spurious errors. For example, if you define a variable in module A, which includes module B which uses that variable, then although the application as a whole will work fine, when you’re editing module B you’ll get errors because the variable isn’t defined in that module.
That rationale just got blown out of the water.
The big problem with the previous Levenshtein distance implementation is that it recurses a lot: a number of times (roughly) equal to the product of the lengths of the two strings you’re comparing. If you’re using an XSLT processor that doesn’t recognise the function as being tail recursive then you can’t compare two strings more than about 20 characters in length (400 recursions).
The problem is that the standard dynamic programming Levenshtein distance algorithm is written for procedural programming languages in which you can do useful things like updating variables. XSLT ain’t like that, so we need an alternative algorithm.
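One way to avoid updating variables is to build each row of the distance table from the previous row, and fold over the characters of the first string. The sketch below does that in Python; it is only an illustration of the idea, not necessarily the alternative algorithm the post goes on to describe, and the names `next_row` and `levenshtein_fold` are made up for this example.

```python
from functools import reduce
from itertools import accumulate


def next_row(prev_row, source_char, target):
    """Build one row of the distance table from the row above it."""

    def step(left, cell):
        # left: value just computed to the left in the new row
        # diag/above: values from the previous row
        (diag, above), target_char = cell
        cost = 0 if source_char == target_char else 1
        return min(left + 1, above + 1, diag + cost)

    cells = zip(zip(prev_row, prev_row[1:]), target)
    # The first cell of a row is one deletion more than the cell above it.
    return tuple(accumulate(cells, step, initial=prev_row[0] + 1))


def levenshtein_fold(source, target):
    """Levenshtein distance with no reassignment: a fold over rows."""
    first_row = tuple(range(len(target) + 1))
    last_row = reduce(lambda row, ch: next_row(row, ch, target), source, first_row)
    return last_row[-1]


print(levenshtein_fold("XSLT", "XQuery"))
```

Nothing is ever overwritten here: every row is a new tuple computed from the one before it, which is the shape a functional language needs.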
[UPDATE: Added a link to the full stylesheet, and edited the code so it doesn’t overlap the right-hand column.]
Levenshtein distance is a measure of how many edits it takes to get from one string to another. In the basic algorithm, each addition, deletion and substitution counts as a single edit. So, for example, the distance between "XSLT 1.0" and "XSLT 2.0" is 1: the only difference is the substitution of 2 for 1, whereas the distance between "XSLT" and "XQuery" is 5: three substitutions and two additions.
One of the interesting features of Levenshtein distance is that there’s a fairly straightforward dynamic programming algorithm that can be used to calculate it. I thought it might be interesting to see what an XSLT 2.0 implementation might look like.
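For reference, the textbook dynamic programming algorithm can be sketched in a procedural language as follows. This is Python rather than XSLT 2.0, it is not the stylesheet from the post, and the function and variable names are chosen only for this example.

```python
def levenshtein(source: str, target: str) -> int:
    """Edit distance where each addition, deletion and substitution costs 1."""
    rows, cols = len(source) + 1, len(target) + 1

    # dist[i][j] = number of edits to turn source[:i] into target[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters
    for j in range(cols):
        dist[0][j] = j  # add all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # addition
                dist[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return dist[-1][-1]


print(levenshtein("XSLT 1.0", "XSLT 2.0"))  # 1
print(levenshtein("XSLT", "XQuery"))        # 5
```

The two printed values match the worked examples above: one substitution in the first case, three substitutions and two additions in the second.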
Excellent post on XSL-List by Abel Braaksma on creating readable regular expressions in XSLT 2.0. He suggests always defining regular expressions in the content of a <xsl:variable>, using normal XML comments to annotate the different parts of the regex, and then using the x flag to ignore the extraneous whitespace that you’ve introduced.
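The same idea exists in other regex dialects. As a rough analogue (not the XSLT 2.0 code from the post), Python's `re.VERBOSE` flag ignores whitespace in the pattern and allows `#` comments, which play the role that XML comments play in the <xsl:variable> approach. The pattern and the name `ISO_DATE` below are assumptions invented for this sketch.

```python
import re

# Whitespace and "#" comments inside the pattern are ignored under re.VERBOSE.
ISO_DATE = re.compile(
    r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
    """,
    re.VERBOSE,
)

match = ISO_DATE.fullmatch("2007-05-16")
parts = match.groups() if match else None
print(parts)
```

Laid out this way, each part of the expression sits next to the note that explains it, which is the readability gain the tip is after.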
More rules of thumb: these are about when to use matching templates, when to use named templates, and when to use for-each. These have as much bearing in XSLT 1.0 as they do in XSLT 2.0.