I am frustrated. I know my corpus (resolutions of the United Nations General Assembly) has a lot in common with the biomedical and legal domains. I can find interesting articles in the biomedical domain dealing with similar issues, such as complex tokenization and long named entity mentions (though mine are much longer). But I see nothing in the legal domain.
I have just gone through all of the Jurix proceedings as well as all of Artificial Intelligence and Law, and all I got is between 2 and 4 articles worth following up.
There must be somebody actually trying to parse real legal texts and figuring out how to deal with complex organisation, people and group names. But all I can see is articles dealing with the ontology level and above.
There might even be money in it!
The business model would center on an automatic notification option that alerts subscribers when a notice on a subscribed website has quietly changed and become much worse. That way one would pay for the peace of mind of knowing there were no unexpected service rule changes.
Digital edition of the “Artificial Intelligence and Law” journal: http://www.springerlink.com/content/100239/
My article set from the legal domain: http://www.citeulike.org/user/arafalov/tag/legal