Kentico content analysis – using LINQ and site export
I have recently started working with the Kentico Content Management System (version 6). As they advertise, it supports unlimited website possibilities, which means that it is most definitely not an Apple-style interface: it has a lot of options and multiple ways to edit something like a page template.
As a result, it is sometimes hard to understand how a site is organized, and one has to click through a lot of dialogs to figure out the relationships. And when that site is a delivery from an external consultant, the task can be nearly impossible.
One of the options, of course, is to run SQL queries directly on the Kentico database, but that may not be allowed in production, the results may be volatile because of ongoing site changes, and it is not well suited to repeated or comparative/historical snapshot analysis.
But there is another way to look at Kentico content, one that brings all the documents, configurations and settings into one place and one style, suitable for a single analytical approach. I am talking, of course, about Kentico Site Export: a process that bundles the content of all the various database tables and components that constitute a site and outputs it as a zip containing primarily a large number of XML files.
There are many ways to navigate and analyze XML files. There are always basic text or XML editors; there are tools for querying XML from the command line (like XMLStarlet); or one could even build a custom XML processing framework (like the one I built earlier for Lotus Notes export).
For this example, I want to look at the Page Templates and the Documents that use them. This makes it possible to check which templates are not used, shows any misplaced documents and produces nice documentation.
There are three files involved in getting this information (all inside the zip file, under the Data directory):
- Documents - Documents\cms_document.xml.export
- Reusable page templates - Objects\cms_pagetemplate.xml.export
- Non-reusable (Ad-hoc) page templates - Site\cms_pagetemplate.xml.export
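As a rough illustration, the three files can be read straight out of the export zip without unpacking it first. This is only a sketch under a few assumptions: the zip name `site_export.zip` and the helper names `ExportReader` and `LoadEntry` are made up for this example, and `ZipFile`/`ZipArchive` need .NET 4.5 or later (on older frameworks, unzip by hand and call `XDocument.Load` on the file paths instead).

```csharp
using System;
using System.IO;
using System.IO.Compression; // .NET 4.5+; on .NET Framework reference System.IO.Compression and System.IO.Compression.FileSystem
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Kentico or LINQPad.
static class ExportReader
{
    // Loads one XML file from the export zip, given its path relative to the Data directory.
    // Zip entry names may use either slash style, so both sides are normalized before comparing.
    public static XDocument LoadEntry(ZipArchive zip, string relativePath)
    {
        string wanted = Normalize("Data/" + relativePath);
        ZipArchiveEntry entry = zip.Entries.FirstOrDefault(
            e => Normalize(e.FullName).EndsWith(wanted, StringComparison.OrdinalIgnoreCase));
        if (entry == null)
            throw new FileNotFoundException("Not found in export: " + relativePath);

        using (Stream stream = entry.Open())
            return XDocument.Load(stream);
    }

    static string Normalize(string path)
    {
        return path.Replace('\\', '/');
    }
}

static class Program
{
    static void Main(string[] args)
    {
        // Assumed file name; pass the real export zip as the first argument.
        string zipPath = args.Length > 0 ? args[0] : "site_export.zip";

        using (ZipArchive zip = ZipFile.OpenRead(zipPath))
        {
            XDocument documents = ExportReader.LoadEntry(zip, @"Documents\cms_document.xml.export");
            XDocument reusable = ExportReader.LoadEntry(zip, @"Objects\cms_pagetemplate.xml.export");
            XDocument adHoc = ExportReader.LoadEntry(zip, @"Site\cms_pagetemplate.xml.export");

            // Count the records directly under each root element as a quick sanity check.
            Console.WriteLine("Documents:          {0}", documents.Root.Elements().Count());
            Console.WriteLine("Reusable templates: {0}", reusable.Root.Elements().Count());
            Console.WriteLine("Ad-hoc templates:   {0}", adHoc.Root.Elements().Count());
        }
    }
}
```

Matching on the end of the entry name keeps the sketch independent of whatever folder prefix the zip happens to use, while the `Objects` and `Site` segments still keep the two page template files apart.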
As I had to extract and correlate multiple files, I chose Microsoft LINQ, as it has (nearly) native support for XML and, combined with the amazing LINQPad, makes it easy to iterate on the scripts until they produce the exact results required.
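To give a feel for what correlating the files looks like in LINQ to XML, here is a minimal sketch of a group join between templates and documents. It is an illustration only, not the script from this post: the inline XML is made-up sample data, and the element names (`cms_pagetemplate`, `cms_document`, `PageTemplateID`, `PageTemplateDisplayName`, `DocumentPageTemplateID`, `DocumentNamePath`) are assumptions about the export layout that should be checked against a real export. `TemplateUsage` and `Describe` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class TemplateUsage
{
    // Returns one line per template, listing the documents that use it or "(unused)".
    public static string[] Describe(XElement templates, XElement documents)
    {
        var usage =
            from t in templates.Elements("cms_pagetemplate")
            // Group join: every template is kept, even when no document matches it.
            join d in documents.Elements("cms_document")
                on (string)t.Element("PageTemplateID")
                equals (string)d.Element("DocumentPageTemplateID")
                into docs
            orderby (string)t.Element("PageTemplateDisplayName")
            select new
            {
                Template = (string)t.Element("PageTemplateDisplayName"),
                Documents = docs.Select(d => (string)d.Element("DocumentNamePath")).ToArray()
            };

        return usage
            .Select(u => u.Template + ": " +
                (u.Documents.Length == 0 ? "(unused)" : string.Join(", ", u.Documents)))
            .ToArray();
    }

    static void Main()
    {
        // Made-up sample data standing in for the real export files.
        XElement templates = XElement.Parse(@"<NewDataSet>
  <cms_pagetemplate><PageTemplateID>1</PageTemplateID><PageTemplateDisplayName>Home</PageTemplateDisplayName></cms_pagetemplate>
  <cms_pagetemplate><PageTemplateID>2</PageTemplateID><PageTemplateDisplayName>News list</PageTemplateDisplayName></cms_pagetemplate>
</NewDataSet>");
        XElement documents = XElement.Parse(@"<NewDataSet>
  <cms_document><DocumentNamePath>/Home</DocumentNamePath><DocumentPageTemplateID>1</DocumentPageTemplateID></cms_document>
  <cms_document><DocumentNamePath>/About</DocumentNamePath><DocumentPageTemplateID>1</DocumentPageTemplateID></cms_document>
</NewDataSet>");

        foreach (string line in Describe(templates, documents))
            Console.WriteLine(line);
        // Prints:
        // Home: /Home, /About
        // News list: (unused)
    }
}
```

The `join ... into` form is what surfaces unused templates: a plain `join` would silently drop any template that no document references. With real data, the reusable and ad-hoc template files would presumably both be fed in (for example by concatenating their elements), assuming they share the same element names.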
Disclaimer: I am still learning LINQ, C# and Kentico. The code below is probably ugly or scary or both. And I don’t even want to admit how much internet searching it took to get to this point. Feel free to suggest improvements. I just hope that this example will show that bulk analysis of Kentico content and configuration is possible and is not particularly difficult.