All posts by Quirijn Slings

DD4T.Web: Publication Resolving

The next DD4T release (1.30), which will be out shortly, features a new .NET library: DD4T.Web. This library contains functionality that can be used within any .NET web application, regardless of whether you are using MVC or not. In a short series I will introduce the most important functionality in this library. This episode: Publication Resolving.

 

DD4T is all about URLs. When the request comes in, the framework looks in the Tridion broker database for pages with a matching URL. Or rather: a matching path. For example: if the URL is http://www.acme.com/products/foobar.html, the path is /products/foobar.html.

But what if there is more than one match? This is entirely possible because of Tridion’s BluePrinting model. Imagine the Acme corporation decides to launch a German web site, with the base URL http://www.acme.de. They manage this site in a Tridion publication which is a child of their ‘dotcom publication’ (which manages the www.acme.com site). Given the nature of BluePrinting, the German page about their ‘Foobar’ product will have the URL http://www.acme.de/products/foobar.html, and the path of this page is /products/foobar.html.
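To make the overlap concrete, here is a minimal C# sketch that splits the two example URLs into host and path with the standard System.Uri class. The UrlParts helper is a hypothetical name used only for illustration; it is not part of DD4T.

```csharp
using System;

// Hypothetical helper for illustration only; not part of DD4T.
public static class UrlParts
{
    // Returns the path portion of an absolute URL, e.g. "/products/foobar.html".
    public static string PathOf(string url)
    {
        return new Uri(url).AbsolutePath;
    }

    // Returns the host name of an absolute URL, e.g. "www.acme.de".
    public static string HostOf(string url)
    {
        return new Uri(url).Host;
    }
}
```

Both example URLs yield the same path; only the host differs, so the path alone cannot identify the page.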

As you see, the path is identical to the path in the English publication. So how can DD4T tell which page to serve? This is done by the PublicationResolver. This is a very simple class, whose job it is to find out which publication the current request belongs to. In the example above, the PublicationResolver could look at the host name and return the correct publication id: if the host name is ‘www.acme.de’, it should return the id of the German publication; if it’s ‘www.acme.com’, it should return the id of the DotCom publication.

Resolving the publication based on the host name is an obvious choice in many cases, but there are alternative scenarios. For example: you could look at the preferred browser language and base the publication id on that. So: if my browser language is German, I will see the German site, regardless of the host name. Technically, I wouldn’t need the www.acme.de domain at all (although I don’t think Google will like it if the same URL can return a page in different languages!)
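A language-based resolver could be built around a small lookup like the one sketched below. This is only a sketch under assumptions: the LanguagePublicationMap class and its Resolve method are hypothetical names, the publication ids 16 and 17 are the example ids used elsewhere in this post, and the input is assumed to be the array that ASP.NET exposes as HttpContext.Current.Request.UserLanguages (entries such as "de-DE" or "en;q=0.8", in order of preference).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DD4T: maps browser languages to publication ids.
public static class LanguagePublicationMap
{
    // Example ids only; a real site would read these from configuration.
    private static readonly Dictionary<string, int> PublicationIdsByLanguage =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            { "de", 17 },
            { "en", 16 }
        };

    // Returns the id for the first language that has a publication,
    // or fallbackId when none of the languages is known.
    public static int Resolve(string[] userLanguages, int fallbackId)
    {
        if (userLanguages != null)
        {
            foreach (string entry in userLanguages)
            {
                if (string.IsNullOrEmpty(entry)) continue;

                // "de-DE;q=0.8" -> "de-DE" -> "de"
                string tag = entry.Split(';')[0].Trim();
                string primary = tag.Split('-')[0];

                int id;
                if (PublicationIdsByLanguage.TryGetValue(primary, out id))
                {
                    return id;
                }
            }
        }
        return fallbackId;
    }
}
```

A custom resolver would then call this from its ResolvePublicationId() method, passing the request's languages and a default publication id of your choice.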

And then there is the simplest form of all: you can simply decide that all the requests in your web application should use the same publication id. This is actually the default behavior. DD4T comes with one implementation of the PublicationResolver interface, the DefaultPublicationResolver, and it simply looks in the Web.config for an appSetting called ‘DD4T.PublicationId’. This id is then used to look up all the pages, components and binaries. The downside is of course that you need to set up a separate web application for each language that you’re supporting. Not very efficient and tough to manage!

 

Implement your own resolver

It’s very easy to write your own publication resolver.

  • First, create a class library project in Visual Studio.
  • Add a class which implements the DD4T.ContentModel.Contracts.Resolvers.IPublicationResolver interface.
  • Right-click the word IPublicationResolver and select ‘implement interface’.

Your code now looks like this:

using System;
using System.Web;
using DD4T.ContentModel.Contracts.Resolvers;

namespace Trivident.DD4T.Examples.PublicationResolvers
{
   public class HostNamePublicationResolver : IPublicationResolver
   {
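      public int ResolvePublicationId()
      {
         // Stub generated by Visual Studio; the real logic follows below.
         throw new NotImplementedException();
      }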
   }
}

As you see, this interface defines only one method: ResolvePublicationId(). In that method, we will check the host name in the URL and use it to determine the publication id. In a very simple (not to say moronic) form, the code might look like this:

using System;
using System.Web;
using DD4T.ContentModel.Contracts.Resolvers;

namespace Trivident.DD4T.Examples.PublicationResolvers
{
   public class HostNamePublicationResolver : IPublicationResolver
   {
      public int ResolvePublicationId()
      {
         // Read the host name once and map it to a publication id.
         string host = HttpContext.Current.Request.Url.Host;
         switch (host)
         {
            case "www.acme.de":
               return 17; // German publication
            case "www.acme.com":
               return 16; // DotCom publication
         }
         throw new InvalidOperationException(string.Format("unknown hostname '{0}'", host));
      }
   }
}

In a real life implementation you would probably want to get rid of the hardcoded IDs, but the point is clear, I hope.
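One way to get rid of the hardcoded IDs is to keep the host-to-publication mapping in a configuration string. The sketch below assumes a format of the form "www.acme.com=16;www.acme.de=17"; the format, the HostNameMap class and its method names are all hypothetical and not something DD4T prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of DD4T: parses "host=id;host=id" settings.
public static class HostNameMap
{
    // Turns "www.acme.com=16;www.acme.de=17" into a case-insensitive lookup table.
    public static Dictionary<string, int> Parse(string setting)
    {
        var map = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string pair in setting.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] parts = pair.Split('=');
            if (parts.Length != 2)
            {
                throw new FormatException(string.Format("invalid mapping '{0}'", pair));
            }
            map[parts[0].Trim()] = int.Parse(parts[1].Trim(), CultureInfo.InvariantCulture);
        }
        return map;
    }

    // Returns the publication id for the host, or throws if the host is unknown.
    public static int Resolve(Dictionary<string, int> map, string host)
    {
        int id;
        if (map.TryGetValue(host, out id))
        {
            return id;
        }
        throw new InvalidOperationException(string.Format("unknown hostname '{0}'", host));
    }
}
```

Inside ResolvePublicationId() you could then read the string from an appSetting of your own choosing, parse it once, and call HostNameMap.Resolve with HttpContext.Current.Request.Url.Host.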

Using it in your application

The PublicationResolver is a property of the factories (PageFactory, ComponentFactory, BinaryFactory, LinkFactory). If you’re using a dependency injection framework (like MEF or Unity) you can simply configure it as a dependency of those factories. Otherwise, you can set it manually (in code). In your PageController for example:

 

private IPageFactory _pageFactory = null;
public override ContentModel.Factories.IPageFactory PageFactory
{
   get
   {
      if (_pageFactory == null)
      {
         _pageFactory = base.PageFactory;
         _pageFactory.PublicationResolver = new HostNamePublicationResolver();
      }
      return _pageFactory;
   }
}

 

Using DD4T without publication resolving

It’s possible to use DD4T without doing any publication resolving. In that case the publication is not used when looking up the page in the broker database. This works if all the paths in your system are unique. So in our Acme example, the URLs of the English and German ‘foobar’ pages should be:

  • http://www.acme.com/products/foobar.html (English)
  • http://www.acme.com/de/products/foobar.html (German)

In this case, DD4T can and will always find the correct page.

If ‘the business’ can live with this style of URL, then go for it. It saves you the trouble of resolving anything!

To use this approach, just use the default publication resolver (this requires no action) and remove the key ‘DD4T.PublicationId’ from the Web.config.

Inside DD4T: templates without metadata

If you’re using DD4T, you know that you have to assign metadata to your templates. It is through this metadata that DD4T knows which view to use to render a given page or component presentation.

Now, DD4T can also work without metadata on templates. How? Simple: by using a naming convention. Just give your view the same name as your template, and you’re done. This is in line with the ‘convention over configuration’ mechanism which is embraced by the MVC world.

There is one caveat: if your template name contains spaces, these are removed before looking up the view. So if your page template is called ‘Standard Mobile’, the file containing the view should be named ‘StandardMobile.cshtml’.
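The convention described above boils down to a one-line transformation. The sketch below only illustrates that rule; ViewNameConvention is a hypothetical name and this is not DD4T's actual implementation.

```csharp
using System;

// Hypothetical illustration of the naming convention; not DD4T's own code.
public static class ViewNameConvention
{
    // Derives the view name from a template title by removing all spaces.
    public static string FromTemplateTitle(string templateTitle)
    {
        if (templateTitle == null)
        {
            throw new ArgumentNullException("templateTitle");
        }
        return templateTitle.Replace(" ", string.Empty);
    }
}
```

For the template ‘Standard Mobile’ this returns ‘StandardMobile’, which matches the view file StandardMobile.cshtml.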

Of course, the good old metadata still works, and will continue to be supported as a way to override the default.

 

Keep those publish transactions down!

One recurring complaint from Tridion users is the publishing queue. It can be very slow to view it, and sometimes the performance becomes so bad that publishing itself will fail. A symptom of this is that items stay in the queue with the status ‘Waiting for publish’ even though the publisher is running.

Often, all these symptoms have one common cause: the publish transactions table in the database has grown too big. Tridion has no built-in clean-up mechanism for this table, so you have to do some work.

The solution is quite easy: run the Purge Tool regularly. It is installed with the Tridion content manager, so everyone should have it. When you start it, it shows you a window with some options. You can use it to purge old versions of components, pages and other items, to purge workflow histories, as well as purge the publish transactions table.

Of course you could simply run this tool by hand every day, but fortunately there is a command line mode. It is configurable through an XML file (see http://sdllivecontent.sdl.com/LiveContent/content/en-US/SDL_Tridion_2011_SPONE/idheading-259253576, login required). Using this XML file and a simple bat file you can schedule the Purge Tool to run every night, and let it remove all publish transactions that are older than, say, 2 days. You do this with the following XML snippet:

<PublishTransactions Purge="true" Before="2012-09-18" />

There is only one problem with this: the ‘Before’ date in the XML must be an absolute date. It would have been nicer if SDL had let us specify a date offset instead, like ‘2 days ago’.

I solved this by writing a simple VBScript that updates the config file and sets the Before date to ‘2 days ago’. My batch file calls this script first, and then the Purge Tool. The batch file is scheduled to run every night through the Windows task scheduler.
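The date arithmetic itself is simple. The sketch below shows the same idea in C# rather than VBScript, so it is not the script from the download: it takes the configuration XML as a string, computes the offset date and rewrites the Before attribute of every PublishTransactions element. The PurgeConfigUpdater class and its parameters are hypothetical names; only the element name, the attribute name and the yyyy-MM-dd format come from the snippet above.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical helper; a C# sketch of the idea, not the author's VBScript.
public static class PurgeConfigUpdater
{
    // Sets Before="<today minus daysToKeep>" on all PublishTransactions elements
    // and returns the updated XML as a string.
    public static string SetBeforeDate(string configXml, DateTime today, int daysToKeep)
    {
        XDocument doc = XDocument.Parse(configXml);
        string before = today.AddDays(-daysToKeep)
            .ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

        foreach (XElement element in doc.Descendants("PublishTransactions"))
        {
            element.SetAttributeValue("Before", before);
        }
        return doc.ToString();
    }
}
```

A scheduled job would read the config file, call SetBeforeDate with DateTime.Today and 2, write the result back, and then start the Purge Tool.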

I know many Tridion administrators have come up with this solution, or something similar (better?) already. But since I couldn’t find any of this online, I thought I’d post it here.

Installation instructions:

– Unzip

– Move files to a location of your choice on a Tridion CM server

– Schedule the file RunPurgeTool.bat every night (or as often as you like)

To download: RunPurgeTool.

 

Note: with slight modification, you can use the tool to maintain workflow histories and/or old versions of versionable items.

 

 

JMS: the nitty gritty

In a previous post I explained why using JMS instead of Tridion’s Cache Channel Service is a good idea. Now I will get down to earth and show you how to actually implement it. I’ve picked Apache’s ActiveMQ as JMS server, mostly because it’s free and it works well on Windows. I’m assuming you have Tridion running, with a deployer and a web server which you have configured separately. If your deployer and web site share the same Tridion configuration files, you will not be able to configure JMS successfully! This is because – as you will see – the deployer requires a slightly different storage configuration than the web site. Check the Tridion documentation to see how this can be done.

The architecture looks basically like this:

[Figure: JMS architecture diagram]

The installation consists of the following tasks:

  1. Install ActiveMQ
  2. Change storage configuration of the deployer
  3. Change storage configuration of the web site

1. Installing ActiveMQ

First, download the latest release from http://activemq.apache.org/download.html. In my case, that was ActiveMQ 5.6.0. The software is distributed as a zip file.

Next, unzip the file in a location of your choice. It will create a folder called apache-activemq-5.6.0 (or whatever the version is). Note: it is not an installer, it is the actual server, so you may want to put it in C:\Program Files or something.

Go to apache-activemq-5.6.0\bin\win64 and double click on InstallService.bat.

Go to the services panel. You should now see a service called ‘ActiveMQ’. Start it.

That’s all there is to it. Actually, there are many configuration options for ActiveMQ, but the default config seems to work fine with Tridion.

 

2. Configure the deployer

The Tridion deployer must have an ActiveMQ client installed. The client consists of a number of jar files that can be found in the ActiveMQ folder, under lib. Simply copy every jar file in that folder, except slf4j, to Tridion’s lib folder. Slf4j is already installed by Tridion.

You could also use a file called activemq-all-X.X.X.jar, which sits in the root of the ActiveMQ zip that you just downloaded. However, when I did that, the Tridion logging stopped working because of a conflict between log4j versions!

Next, look for the cd_storage_conf.xml and open it in an editor. You must add a RemoteSynchronization element inside the ObjectCache element like this:

<RemoteSynchronization Queuesize="512">
  <Connector Class="com.tridion.cache.JMSCacheChannelConnector" Topic="Tridion" Strategy="AsyncJMS11">
    <JndiContext>
      <Property Name="java.naming.factory.initial" Value="org.apache.activemq.jndi.ActiveMQInitialContextFactory"/>
      <Property Name="java.naming.provider.url" Value="tcp://SERVER:PORT?soTimeout=5000"/>
      <Property Name="topic.Tridion" Value="TOPIC NAME"/>
    </JndiContext>
  </Connector>
</RemoteSynchronization>

The SERVER is of course the machine where you installed your JMS server. The PORT is the port it’s listening on (61616 by default). The TOPIC NAME can be any string. It identifies the ‘queue’ you plan to use. Make sure the deployer uses the same topic name as the web server to which it deploys.
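Before restarting anything, it can save time to confirm that the SERVER and PORT you filled in are actually reachable from the machine you are configuring. The sketch below is a plain TCP connectivity check using the standard TcpClient class; it knows nothing about JMS or Tridion, and BrokerPortCheck is a hypothetical name.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper: checks whether a TCP port accepts connections.
public static class BrokerPortCheck
{
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    public static bool CanConnect(string host, int port, int timeoutMs)
    {
        try
        {
            using (var client = new TcpClient())
            {
                IAsyncResult result = client.BeginConnect(host, port, null, null);
                if (!result.AsyncWaitHandle.WaitOne(timeoutMs))
                {
                    return false; // timed out; disposing the client aborts the attempt
                }
                client.EndConnect(result); // throws SocketException if refused
                return true;
            }
        }
        catch (SocketException)
        {
            return false;
        }
    }
}
```

A successful check only tells you that something is listening on that port; whether the JMS configuration itself is correct still has to show up in the Tridion logs.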

Note that JMS can work with different strategies. They differ in the delivery method (synchronous or, more commonly, asynchronous), the JMS version (1.0 or 1.1) and whether or not they operate within a J2EE environment using Message Driven Beans (you may run into this if you use a JMS service within WebSphere, for example).

Tridion’s default strategy is AsyncJMS11. See the Tridion documentation for more information on JMS strategies.

Restart the deployer service to activate the changes. In case of in-process deployment you must restart IIS.

 

3. Configure the web site

The web site needs the same ActiveMQ client as the deployer, so start by copying those jars to the Tridion lib folder again. The lib folder, by the way, is normally found inside the bin folder of your web application (on .NET) or in the WEB-INF folder (on Java).

Next up is the cd_storage_conf.xml. It usually requires the same change as the deployer’s storage configuration:

<RemoteSynchronization Queuesize="512">
  <Connector Class="com.tridion.cache.JMSCacheChannelConnector" Topic="Tridion" Strategy="AsyncJMS11">
    <JndiContext>
      <Property Name="java.naming.factory.initial" Value="org.apache.activemq.jndi.ActiveMQInitialContextFactory"/>
      <Property Name="java.naming.provider.url" Value="tcp://SERVER:PORT?soTimeout=5000"/>
      <Property Name="topic.Tridion" Value="TOPIC NAME"/>
    </JndiContext>
  </Connector>
</RemoteSynchronization>

Note that the strategy in the web application can never be set to AsyncJMS10MDB or AsyncJMS11MDB: with those strategies the connector only sends messages and does not receive them itself.

Restart the web site to activate the changes.

Put it to the test

That’s it. Try it by publishing some content. Be sure to check the cd_core log for exceptions. If you have made a mistake in the configuration, it will normally show up in there.

If you want to learn more about ActiveMQ, check out their web site, or read this blog post by Christian Posta. Or you can just be satisfied that it works, like I was.

Tridion and JMS: removing another SPOF

When you use Tridion’s content delivery API to retrieve dynamic content, resolve links, etcetera, you are well advised to use the built-in caching mechanism. Without it, every page request would result in dozens of queries on your broker database (or dozens of data files being opened, which is just as bad for performance). With caching, most requests can be handled completely from memory, which makes your site a lot faster.

There is one catch to caching: Tridion being a content management system, it is highly likely that the content and links will change from time to time. That is why Tridion offers a notification system, so that your web application will always show content which is up to date and links that lead to the correct destination.

Actually, Tridion offers not one such notification system, but two. Most implementations use the so-called Cache Channel Service. This is a proprietary service which runs as a Windows service or a stand-alone Java process. Its job is to notify the broker instances when a new version of an item has been published to the broker repository. The Cache Channel Service uses the RMI protocol to communicate between the different virtual machines (which contain the broker instances).

Although this is certainly the most popular notification mechanism, it has some drawbacks:

  • RMI is not a messaging protocol, so things might go wrong when the Cache Channel Service is temporarily down. Fortunately, Tridion has solved this by some clever programming in the broker, but it is still a work-around which makes the CCS slightly unstable at times.
  • RMI uses unpredictable ports, which means that all ports > 1024 must be open between all your machines running a Tridion broker and the one running the CCS.
  • There is no easy way to handle multiple channels (like a staging site, a live site, a mobile site, development environments, etc). You can only achieve this by running more than one CCS instance, which means running multiple processes, each on a different port, or even running the CCS on multiple machines.
  • The CCS is a single point of failure (SPOF), since it cannot be scaled up.

As a solution architect, I find these disadvantages quite unsettling. Especially the last one: removing single points of failure from a solution is what we architects drool over.

Fortunately, SDL has come up with an alternative notification system. Instead of installing the CCS, you can also install a JMS instance. JMS – Java Message Service – is Java’s messaging system. There are many implementations available, including free ones like Apache’s ActiveMQ.

JMS is designed to pass messages from one application to another. These messages are organized in containers called ‘topics’. Applications can subscribe to a topic. Whenever a message becomes available in this topic, it will be passed on to all subscribing applications.
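The topic mechanism can be pictured with a toy in-memory model. The sketch below is not JMS and uses no JMS API; ToyTopicBroker is a hypothetical class that only mimics the behaviour described above: every subscriber of a topic receives each message published to it, and other topics are left alone.

```csharp
using System;
using System.Collections.Generic;

// Toy in-memory model of topics and subscribers; not a JMS implementation.
public class ToyTopicBroker
{
    private readonly Dictionary<string, List<Action<string>>> _subscribers =
        new Dictionary<string, List<Action<string>>>();

    // Registers a handler that is called for every message on the topic.
    public void Subscribe(string topic, Action<string> handler)
    {
        List<Action<string>> handlers;
        if (!_subscribers.TryGetValue(topic, out handlers))
        {
            handlers = new List<Action<string>>();
            _subscribers[topic] = handlers;
        }
        handlers.Add(handler);
    }

    // Delivers the message to all subscribers of the topic
    // and returns how many handlers were called.
    public int Publish(string topic, string message)
    {
        List<Action<string>> handlers;
        if (!_subscribers.TryGetValue(topic, out handlers))
        {
            return 0;
        }
        foreach (Action<string> handler in handlers)
        {
            handler(message);
        }
        return handlers.Count;
    }
}
```

In this picture a staging site and a live site would each subscribe to their own topic on the same broker. A real JMS server adds everything the toy leaves out, such as networking and delivery guarantees.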

JMS addresses each of the problems of the Cache Channel Service mentioned above:

  • JMS uses a fixed port (for ActiveMQ typically 61616 over TCP, but it can be configured to run on any port, and ActiveMQ can also work over HTTP)
  • Multiple channels can be accommodated on a single JMS instance by simply configuring extra topics
  • Since JMS works on a fixed, known port, it can easily be accessed through a load balancer. This removes the single point of failure from the architecture.

There is a common idea about JMS, and that is that it would be unsuited for a .NET presentation stack. This idea is incorrect. I can guarantee that the solution will work just as well if your web site is on IIS. After all, the content broker core is always Java, and it is this core that does the communication through JMS.

Where’s the catch then? Well, there isn’t any, other than the lack of awareness of this solution in the Tridion community, and the lack of configuration examples in Tridion’s product documentation. But with some perseverance, the help of this excellent post by Julian Wraith and – if necessary – the assistance of Tridion customer support, it can certainly be done. The end result is a more stable and more robust solution.