Friday, October 30, 2009

Quote of the day

"It is well enough that people of the nation do not understand our banking and monetary system, for if they did, I believe there would be a revolution before tomorrow morning."
"Es ist gut, dass die Bürger der Nation unser Banken- und Währungssystem nicht verstehen. Denn ich glaube, wenn sie es verstünden, würde es noch vor morgen früh eine Revolution geben."
"C’est une chance que les gens de la nation ne comprennent pas notre système bancaire et monétaire parce que si tel était le cas, je crois qu'il y aurait une révolution avant demain matin."
Henry Ford

Friday, October 16, 2009

Taming the beast: managing SLF4J dependencies in complex Maven builds

More and more projects now use SLF4J instead of the good old Commons Logging as a logging facade. This introduces new challenges for complex Maven builds, because SLF4J will only work as expected if the dependencies are managed correctly. To understand why this is so, let's first review the different components that are part of SLF4J:
  • slf4j-api contains the SLF4J API, i.e. all the classes that an application or library using SLF4J directly depends on.
  • A number of bindings that implement the SLF4J API either based on an existing logging framework (slf4j-log4j12, slf4j-jdk14 and slf4j-jcl) or using a native implementation developed specifically for SLF4J (slf4j-nop and slf4j-simple).
  • A number of bridges that adapt SLF4J to existing logging facades (jul-to-slf4j) or emulate existing logging facades or implementations (jcl-over-slf4j and log4j-over-slf4j).
For SLF4J to work correctly in a project built with Maven, the following conditions must be met:
  1. The project must have a dependency on slf4j-api. If the project itself uses SLF4J and doesn't depend on any binding or bridge, then this should be a direct dependency. If SLF4J is used by one or more dependencies of the project, but not the project itself, then one may prefer to let Maven's dependency management system include it as a transitive dependency.
  2. If the project produces an executable artifact (JAR with Main-Class, WAR, EAR or binary distribution), then it must have a dependency on one and only one of the bindings. Indeed, a binding is always required at runtime, but the presence of multiple bindings would result in unpredictable behavior.
  3. The project may have any number of dependencies on SLF4J bridges, excluding the bridge for the API used by the binding. E.g. if slf4j-log4j12 is used as the binding, then the project must not depend on log4j-over-slf4j. Otherwise the application may crash because of infinite recursion.
  4. If the project has a dependency on a bridge that emulates an existing logging API, then it must not have at the same time a dependency on this API. E.g. if jcl-over-slf4j is used, then the project must not have a dependency on commons-logging. Otherwise the behavior will be unpredictable.
  5. The dependencies must not mix artifacts from SLF4J 1.4.x with artifacts from 1.5.x, since they are incompatible with each other.
Note that rule number 2 really only applies to executable artifacts. A project that produces a library artifact should never depend on any SLF4J binding, except in test scope. The reason is that depending on a given SLF4J binding in scope compile or runtime would impose a particular logging implementation on downstream projects. In a perfect world where every library (in particular third-party libraries) follows that practice, it would be very easy to validate the five conditions enumerated above: it would simply be sufficient to add a dependency on the desired binding (as well as any necessary bridges) from SLF4J 1.5.x to every Maven project producing an executable artifact.
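As a concrete illustration, the dependency section of a Maven project producing an executable artifact could look as follows (the version number is only an example; the important point is that all SLF4J artifacts use the same, consistent 1.5.x version):

```xml
<dependencies>
  <!-- Rule 1: the API the project codes against -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.5.8</version>
  </dependency>
  <!-- Rule 2: exactly one binding, only needed at runtime -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.5.8</version>
    <scope>runtime</scope>
  </dependency>
  <!-- Rule 3: a bridge is allowed here because the binding is log4j,
       not JCL. Rule 4: commons-logging itself must then be excluded
       from all transitive dependencies. -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jcl-over-slf4j</artifactId>
    <version>1.5.8</version>
    <scope>runtime</scope>
  </dependency>
</dependencies>
```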
Alas, the world is not perfect and there are many third-party libraries that do have dependencies on particular SLF4J bindings and logging implementations. If a project starts depending on these kinds of libraries, things easily get out of control if no countermeasures are taken. This is especially true for complex projects with lots of dependencies, which will almost certainly run into a situation where one of the five conditions above is no longer satisfied.
Unfortunately, Maven doesn't have the features necessary to enforce these conditions a priori, so enforcing them requires some discipline and manual intervention. On the other hand, there is a strategy that is quite simple and effective when applied systematically:
  • Make sure that in projects under your control, the policies described above are always followed.
  • For third-party libraries that don't follow best practices, use exclusions on the corresponding dependency declarations to remove transitive dependencies on SLF4J bindings. Note that if the library is used in several modules of a multi-module Maven project, then it is handy to declare these exclusions in the dependency management section of the parent POM, so that they don't have to be repeated every time the dependency is used.
An example of this would look as follows:
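For instance, assuming a third-party library (the coordinates below are purely illustrative) that transitively pulls in slf4j-log4j12 and commons-logging, the parent POM could declare:

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>some-library</artifactId>
      <version>1.0</version>
      <exclusions>
        <!-- Remove the binding imposed by the library (rule 2) -->
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <!-- Remove commons-logging because jcl-over-slf4j is used (rule 4) -->
        <exclusion>
          <groupId>commons-logging</groupId>
          <artifactId>commons-logging</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Modules that then declare a dependency on com.example:some-library inherit the exclusions automatically.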

Thursday, October 15, 2009

XML schema oddity: covariant literals (part 1)

If you look at the 19 primitive types defined by the second part of the XML Schema specification, you may notice that one of them, namely QName, has a very particular feature that distinguishes it from the 18 other types: there is no functional mapping between its lexical space and its value space.
The value space of a type describes the set of possible values for that type and is a semantic concept. For example, the value space of the boolean type has two elements: true and false. The lexical space on the other hand is the set of possible literals for that type. It is a syntactic concept and describes the possible ways in which the values of the type may appear in the XML document. E.g. the lexical space of the boolean type has four elements: 0, 1, true and false. For a given type, the existence of a functional mapping between the lexical space and the value space means that for every literal, there is one and only one value that corresponds to that literal. This implies that if for example the type is used to describe an XML element, it is sufficient to know the text inside that element to unambiguously determine the value.
The QName type doesn't have this property because its value space is the set of tuples {namespace name, local part}, while its lexical space is defined by the production (Prefix ':')? LocalPart. Therefore, a QName literal can only be translated into a QName value if the context in which the literal appears is known. More precisely, it is necessary to know the namespace context, i.e. the set of namespace declarations in scope for the context where the QName is used.
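To make this concrete, here is a minimal sketch (not tied to any particular XML framework; the resolve method is invented for illustration) showing that the same QName literal maps to different values depending on the namespace context:

```java
import javax.xml.namespace.QName;
import java.util.Map;

public class QNameDemo {
    // Resolve a QName literal of the form (Prefix ':')? LocalPart against
    // a namespace context given as a prefix -> namespace URI map.
    static QName resolve(String literal, Map<String, String> namespaceContext) {
        int colon = literal.indexOf(':');
        String prefix = colon == -1 ? "" : literal.substring(0, colon);
        String localPart = colon == -1 ? literal : literal.substring(colon + 1);
        String namespaceURI = namespaceContext.getOrDefault(prefix, "");
        return new QName(namespaceURI, localPart, prefix);
    }

    public static void main(String[] args) {
        // The same literal "p:foo" yields different values in different contexts:
        QName v1 = resolve("p:foo", Map.of("p", "urn:ns1"));
        QName v2 = resolve("p:foo", Map.of("p", "urn:ns2"));
        System.out.println(v1.getNamespaceURI()); // urn:ns1
        System.out.println(v2.getNamespaceURI()); // urn:ns2
    }
}
```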
Another interesting property of the schema type system is that none of the primitive types has a lexical space that is disjoint from the lexical spaces of all other primitive types. The proof is trivial: the lexical space of any simple type is in fact a subset of the lexical space of the string type. This implies that without knowledge of the schema, it is not possible to detect usages of QName in an instance document.
Why is all this important? Well, the consequence is that a transformation of an XML document can only leave QName values invariant if one of the following provisions is made:
  • The transformation leaves invariant the namespace context of every element. In that case it is sufficient to leave all literals invariant in order to leave all values invariant.
  • Before applying the transformation, all QName literals are translated into QName values. When serializing the document after the transformation, QName values are translated back into QName literals. In that case, QName literals are no longer invariant under the transformation. As noted above, this approach requires knowledge of the schema describing the document instance being transformed.
The situation is further complicated by the fact that there are custom types that have properties similar to QName, except that the semantics of these types are not defined at schema level, but by the application that eventually consumes the document. A typical example is XPath expressions: they also use namespace prefixes and their interpretation therefore also depends on the context in which they appear in the document.
Taking this into account, it is clear that the first approach described above is both simpler and more robust. The drawback is that it will in many cases cause a proliferation of namespace declarations in the transformation result, most of which are actually unnecessary. This can be seen, for example, in a transformation that simply extracts a single element from a document: to preserve the namespace context, it would be necessary to copy the namespace declarations of all ancestors of the element in the source document and add them to the element in the output document (except of course for those namespace declarations that are hidden).
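For illustration, consider extracting the c element from the following (invented) document:

```xml
<a xmlns:p="urn:ns1">
  <b xmlns:q="urn:ns2">
    <c>p:value</c>
  </b>
</a>
```

To preserve the namespace context, the extracted element must carry the namespace declarations of all its ancestors, even though only p is actually referenced:

```xml
<c xmlns:p="urn:ns1" xmlns:q="urn:ns2">p:value</c>
```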

In a second post I will examine how the issue described here is handled by various XML specifications and Java frameworks that describe or implement transformations on XML documents.

Electrabel thanks you

The government has negotiated a "structural contribution" to the federal budget with the energy sector. According to Le Soir, the amount will fall somewhere between 215 and 245 million euros per year. What you should know, however, is that the windfall profits this generates for Electrabel are estimated at 1.8 billion euros per year! You may object that Electrabel pays taxes and that part of this sum ends up in the state's coffers anyway. Not at all! Thanks to the notional interest deduction invented by our friend Didier and a clever financial arrangement set up by Suez to transfer part of its debt to its Belgian subsidiary, Electrabel currently pays no corporate income tax at all!
Electrabel's shareholders cordially thank you for your structural contribution to their dividends...

Wednesday, October 14, 2009

Euphemism of the day: "tactical solution"

Recently I got into a discussion with somebody from IBM about an issue in Axiom. I argued that the proposed solution is only a workaround and that I would prefer a clean solution that addresses the root cause and fixes the issue once and for all. In his reply the guy from IBM carefully avoided the word workaround, and instead used tactical solution. I really like that euphemism.
So, next time you report to your boss, don't speak about workarounds and proper fixes, but about tactical and strategic solutions. He will be impressed...

Tuesday, October 13, 2009

Inside Axis2 and Axiom: can somebody please clean up?

Recently my attention got caught by a set of issues in Axis2 and Axiom that at first glance may seem unrelated, but when considered together point towards an important design flaw in Axis2 and Axiom:
  • When MTOM or SwA is used, Axiom has the ability to offload the content of the attachments to temporary files. Axiom does this based on a threshold algorithm: it will first attempt to read the data into memory, and if the attachment turns out to be larger than a configurable threshold, it will move the data read so far to a temporary file and write the rest of the attachment to that file. In addition, Axiom also implements deferred loading of attachments: the data is only read from the message when the code consuming the request tries to access the attachments. Of course this only works within the limits imposed by the fact that these attachments must be read sequentially from the underlying stream.
    Recently a user reported an issue related to MTOM when used in an asynchronous Web Service, i.e. a service that returns an acknowledgement (HTTP 202) and then processes the request asynchronously on a separate thread, sending back the response using a different channel. This is a feature that is fully supported by Axis2. However it turns out that when used with MTOM, the attachments get lost. The reason is that sending back the HTTP 202 response will discard the part of the request that has not yet been read. More precisely, AbstractMessageReceiver, the class implementing the asynchronous feature, calls SOAPEnvelope#build(), which makes sure that the SOAP part is fully read into memory, but fails to tell Axiom to read the attachments before control is handed back to the servlet container.
    I advised the user to fix this by replacing the call to build with buildWithAttachments, which forces Axiom to fetch all attachments, or at least those that are referenced by xop:Include elements. However, this only led to the next problem, which is that AxisServlet calls TransportUtils#deleteAttachments(MessageContext) before the thread processing the request gets a chance to read the attachments. If the attachments have been loaded into memory, this is not an issue, but if they have been offloaded to temporary files, these files will be deleted at that moment.
  • An interesting aspect about the issue described above is that AxisServlet seems to be the only transport that uses deleteAttachments. This means that the other transports would be affected by the opposite problem, i.e. instead of deleting temporary files too early (in the asynchronous case), they would not delete the temporary files at all. There is indeed an open issue in Axiom that describes this type of problem, but it is not clear if this occurs on the server side or client side (i.e. this bug report may actually refer to the last bullet below).
    It should be noted that since JMS is message based and doesn't use streams, the only other (commonly used) transport that would be impacted is the standalone HTTP transport, which is also used by Axis2's JAX-WS implementation when creating HTTP endpoints outside of a servlet container.
  • Axiom has another highly interesting feature called OMSourcedElement. Basically, this makes it possible to create an XML fragment that is not backed by an in-memory representation of the XML, but by some other data source. To make this work, every OMSourcedElement is linked to an OMDataSource instance that knows how to produce XML from the backing data. Many of the databindings provided by Axis2 rely on this feature. We also use it in Synapse for XSLT results if the stylesheet produces text instead of XML. Here again, if the result of the XSLT is too large, we offload it to a temporary file. In that case, we end up with an OMSourcedElement/OMDataSource that is backed by a temporary file. A known issue with this is that Synapse doesn't properly manage the lifecycle of these files, i.e. it is unable to delete them at the right moment. It actually relies on File#deleteOnExit() and on garbage collection, so that these temporary files will in general be kept longer than necessary.
  • Over the last year(s), there have been many reports about Axis2 leaking file descriptors or not closing HTTP connections. The issue came up again during the release process of Axis2 1.5.1, but it is still not entirely clear if the issue is now solved completely. What we know though is that at least part of the reports are in principle non-issues that are due to the fact that the users didn't call ServiceClient#cleanupTransport() to properly release connections. However, as Glen pointed out, the Axis2 Javadoc didn't mention that it is mandatory to call that method (well, until I changed that). Also, I didn't check yet what happens inside cleanupTransport if the service response is MTOM. It might be that here again, Axis2 fails to clean up temporary files (see second bullet).
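The threshold algorithm mentioned in the first bullet can be sketched as follows. This is a simplified model using only the JDK, not Axiom's actual implementation; all class and method names are invented for illustration:

```java
import java.io.*;
import java.nio.file.*;

// Sketch of threshold-based attachment buffering: data up to the threshold
// stays in memory; once the threshold is exceeded, the buffered data is
// moved to a temporary file and the rest of the stream is written there.
public class ThresholdBuffer {
    private final int threshold;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private Path tempFile; // non-null once data has been spilled to disk

    public ThresholdBuffer(int threshold) {
        this.threshold = threshold;
    }

    public void write(InputStream in) throws IOException {
        byte[] chunk = new byte[4096];
        OutputStream out = memory;
        int n;
        while ((n = in.read(chunk)) != -1) {
            if (tempFile == null && memory.size() + n > threshold) {
                // Threshold exceeded: move the buffered data to a temp file
                // and continue writing the remaining data there.
                tempFile = Files.createTempFile("attachment", ".bin");
                Files.write(tempFile, memory.toByteArray());
                out = new BufferedOutputStream(
                        Files.newOutputStream(tempFile, StandardOpenOption.APPEND));
                memory = null;
            }
            out.write(chunk, 0, n);
        }
        if (tempFile != null) {
            out.close();
        }
    }

    public boolean onDisk() {
        return tempFile != null;
    }
}
```

The crucial point for the issues discussed above is the temporary file: somebody has to delete it at the right moment, which is exactly the lifecycle question Axis2 and Axiom struggle with.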
What is interesting to notice is that when processing a message, any SOAP stack may in general be expected to only acquire two kinds of resources that need explicit cleanup, namely network connections (or any other transport related resources) and temporary files. Indeed, assuming that the SOAP stack itself will not interact with any other external system, all other resources it acquires will be memory based and taken care of by the garbage collector (which of course doesn't exclude memory leaks). Only the clients and service implementations (and maybe some particular modules/handlers) will interact with external systems and acquire other resources requiring explicit cleanup (such as database connections).
The fact that Axis2 (and/or Axiom) does a poor job when it comes to managing both types of resources is a strong indication that there is an important design flaw that has yet to be addressed, or that it is lacking the appropriate infrastructure and APIs required to guarantee correct lifecycle management of these resources.