I have written about UIMA, IBM’s Natural Language Processing framework, before. Since then, I have made a couple of attempts to get a feel for it. Unfortunately, it kept feeling uncomfortable and confusing. Finally, I figured out why.
UIMA’s extensive documentation assumes that you are already committed to the framework, so it makes sure you understand the full architecture before it lets you near the tutorial. The tutorial itself starts somewhere around section 4.1 and is easy enough to understand, once you find it. However, I spent more than an hour looking for a quick example before giving up and skimming the entire architecture explanation just to find where the tutorial actually starts. It did not help that the documentation titles are rather ponderous and do not correspond to anything one would search for when trying to get a feel for the framework. There is nothing like "quick start", "walkthrough" or any similar keyword. In that sense, I found GATE easier to start with.
Additionally, the documentation comes as one large PDF file, which makes it harder to navigate and search than HTML documentation would be. Finally, the tutorial interleaves the Eclipse and non-Eclipse ways of doing each step, which makes it that much harder to follow.
I don’t really blame the IBM guys. It is quite obvious that all of these disadvantages come from UIMA being a loss leader for the commercial OmniFind product. But I think that if they really expect UIMA to be picked up, they should look into the usability of that first walkthrough. A separate document with explicit steps and without too much architecture would be a good way to show the capabilities and ease of use. Maybe it could even be available on the website itself, with screenshots, so that one can see it before committing to downloading the large setup files.