Clustering/Session Replication HOW-TO

Important Note

Table of Contents

For the impatient

Simply add

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/>

to your <Engine> or your <Host> element to enable clustering.

Using the above configuration will enable all-to-all session replication using the DeltaManager to replicate session deltas. By all-to-all, we mean that every session gets replicated to all the other nodes in the cluster. This works great for smaller clusters, but we don't recommend it for larger clusters — more than 4 nodes or so. Also, when using the DeltaManager, Tomcat will replicate sessions to all nodes, even nodes that don't have the application deployed.
To get around these problem, you'll want to use the BackupManager. The BackupManager only replicates the session data to one backup node, and only to nodes that have the application deployed. Once you have a simple cluster running with the DeltaManager, you will probably want to migrate to the BackupManager as you increase the number of nodes in your cluster.

Here are some of the important default values:

  1. Multicast address is 228.0.0.4
  2. Multicast port is 45564 (the port and the address together determine cluster membership.
  3. The IP broadcasted is java.net.InetAddress.getLocalHost().getHostAddress() (make sure you don't broadcast 127.0.0.1, this is a common error)
  4. The TCP port listening for replication messages is the first available server socket in range 4000-4100
  5. Listener is configured ClusterSessionListener
  6. Two interceptors are configured TcpFailureDetector and MessageDispatchInterceptor

The following is the default cluster configuration:

        <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="8">

          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

          <Channel className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="500"
                        dropTime="3000"/>
            <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="auto"
                      port="4000"
                      autoBind="100"
                      selectorTimeout="5000"
                      maxThreads="6"/>

            <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                 filter=""/>
          <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

          <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster>

We will cover this section in more detail later in this document.

Security

The cluster implementation is written on the basis that a secure, trusted network is used for all of the cluster related network traffic. It is not safe to run a cluster on a insecure, untrusted network.

There are many options for providing a secure, trusted network for use by a Tomcat cluster. These include:

  • private LAN
  • a Virtual Private Network (VPN)
  • IPSEC
  • Encrypt cluster traffic using the EncryptInterceptor

Cluster Basics

To run session replication in your Tomcat 8 container, the following steps should be completed:

  • All your session attributes must implement java.io.Serializable
  • Uncomment the Cluster element in server.xml
  • If you have defined custom cluster valves, make sure you have the ReplicationValve defined as well under the Cluster element in server.xml
  • If your Tomcat instances are running on the same machine, make sure the Receiver.port attribute is unique for each instance, in most cases Tomcat is smart enough to resolve this on it's own by autodetecting available ports in the range 4000-4100
  • Make sure your web.xml has the <distributable/> element
  • If you are using mod_jk, make sure that jvmRoute attribute is set at your Engine <Engine name="Catalina" jvmRoute="node01" > and that the jvmRoute attribute value matches your worker name in workers.properties
  • Make sure that all nodes have the same time and sync with NTP service!
  • Make sure that your loadbalancer is configured for sticky session mode.

Load balancing can be achieved through many techniques, as seen in the Load Balancing chapter.

Note: Remember that your session state is tracked by a cookie, so your URL must look the same from the out side otherwise, a new session will be created.

The Cluster module uses the Tomcat JULI logging framework, so you can configure logging through the regular logging.properties file. To track messages, you can enable logging on the key: org.apache.catalina.tribes.MESSAGES

Overview

To enable session replication in Tomcat, three different paths can be followed to achieve the exact same thing:

  1. Using session persistence, and saving the session to a shared file system (PersistenceManager + FileStore)
  2. Using session persistence, and saving the session to a shared database (PersistenceManager + JDBCStore)
  3. Using in-memory-replication, using the SimpleTcpCluster that ships with Tomcat (lib/catalina-tribes.jar + lib/catalina-ha.jar)

Tomcat can perform an all-to-all replication of session state using the DeltaManager or perform backup replication to only one node using the BackupManager. The all-to-all replication is an algorithm that is only efficient when the clusters are small. For larger clusters, you should use the BackupManager to use a primary-secondary session replication strategy where the session will only be stored at one backup node.
Currently you can use the domain worker attribute (mod_jk > 1.2.8) to build cluster partitions with the potential of having a more scalable cluster solution with the DeltaManager (you'll need to configure the domain interceptor for this). In order to keep the network traffic down in an all-to-all environment, you can split your cluster into smaller groups. This can be easily achieved by using different multicast addresses for the different groups. A very simple setup would look like this:

        DNS Round Robin
               |
         Load Balancer
          /           \
      Cluster1      Cluster2
      /     \        /     \
  Tomcat1 Tomcat2  Tomcat3 Tomcat4

What is important to mention here, is that session replication is only the beginning of clustering. Another popular concept used to implement clusters is farming, i.e., you deploy your apps only to one server, and the cluster will distribute the deployments across the entire cluster. This is all capabilities that can go into with the FarmWarDeployer (s. cluster example at server.xml)

In the next section will go deeper into how session replication works and how to configure it.

Cluster Information

Membership is established using multicast heartbeats. Hence, if you wish to subdivide your clusters, you can do this by changing the multicast IP address or port in the <Membership> element.

The heartbeat contains the IP address of the Tomcat node and the TCP port that Tomcat listens to for replication traffic. All data communication happens over TCP.

The ReplicationValve is used to find out when the request has been completed and initiate the replication, if any. Data is only replicated if the session has changed (by calling setAttribute or removeAttribute on the session).

One of the most important performance considerations is the synchronous versus asynchronous replication. In a synchronous replication mode the request doesn't return until the replicated session has been sent over the wire and reinstantiated on all the other cluster nodes. Synchronous vs. asynchronous is configured using the channelSendOptions flag and is an integer value. The default value for the SimpleTcpCluster/DeltaManager combo is 8, which is asynchronous. You can read more on the send flag(overview) or the send flag(javadoc). During async replication, the request is returned before the data has been replicated. async replication yields shorter request times, and synchronous replication guarantees the session to be replicated before the request returns.

Bind session after crash to failover node

If you are using mod_jk and not using sticky sessions or for some reasons sticky session don't work, or you are simply failing over, the session id will need to be modified as it previously contained the worker id of the previous tomcat (as defined by jvmRoute in the Engine element). To solve this, we will use the JvmRouteBinderValve.

The JvmRouteBinderValve rewrites the session id to ensure that the next request will remain sticky (and not fall back to go to random nodes since the worker is no longer available) after a fail over. The valve rewrites the JSESSIONID value in the cookie with the same name. Not having this valve in place, will make it harder to ensure stickiness in case of a failure for the mod_jk module.

Remember, if you are adding your own valves in server.xml then the defaults are no longer valid, make sure that you add in all the appropriate valves as defined by the default.

Hint:
With attribute sessionIdAttribute you can change the request attribute name that included the old session id. Default attribute name is org.apache.catalina.ha.session.JvmRouteOrignalSessionID.

Trick:
You can enable this mod_jk turnover mode via JMX before you drop a node to all backup nodes! Set enable true on all JvmRouteBinderValve backups, disable worker at mod_jk and then drop node and restart it! Then enable mod_jk Worker and disable JvmRouteBinderValves again. This use case means that only requested session are migrated.

Configuration Example

        <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">

          <Manager className="org.apache.catalina.ha.session.BackupManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"
                   mapSendOptions="6"/>
          <!--
          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>
          -->
          <Channel className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="500"
                        dropTime="3000"/>
            <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="auto"
                      port="5000"
                      selectorTimeout="100"
                      maxThreads="6"/>

            <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                 filter=".*\.gif|.*\.js|.*\.jpeg|.*\.jpg|.*\.png|.*\.htm|.*\.html|.*\.css|.*\.txt"/>

          <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster>

Break it down!!

        <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">

The main element, inside this element all cluster details can be configured. The channelSendOptions is the flag that is attached to each message sent by the SimpleTcpCluster class or any objects that are invoking the SimpleTcpCluster.send method. The description of the send flags is available at our javadoc site The DeltaManager sends information using the SimpleTcpCluster.send method, while the backup manager sends it itself directly through the channel.
For more info, Please visit the reference documentation

          <Manager className="org.apache.catalina.ha.session.BackupManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"
                   mapSendOptions="6"/>
          <!--
          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>
          -->

This is a template for the manager configuration that will be used if no manager is defined in the <Context> element. In Tomcat 5.x each webapp marked distributable had to use the same manager, this is no longer the case since Tomcat you can define a manager class for each webapp, so that you can mix managers in your cluster. Obviously the managers on one node's application has to correspond with the same manager on the same application on the other node. If no manager has been specified for the webapp, and the webapp is marked <distributable/> Tomcat will take this manager configuration and create a manager instance cloning this configuration.
For more info, Please visit the reference documentation

          <Channel className="org.apache.catalina.tribes.group.GroupChannel">

The channel element is Tribes, the group communication framework used inside Tomcat. This element encapsulates everything that has to do with communication and membership logic.
For more info, Please visit the reference documentation

            <Membership className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="500"
                        dropTime="3000"/>

Membership is done using multicasting. Please note that Tribes also supports static memberships using the StaticMembershipInterceptor if you want to extend your membership to points beyond multicasting. The address attribute is the multicast address used and the port is the multicast port. These two together create the cluster separation. If you want a QA cluster and a production cluster, the easiest config is to have the QA cluster be on a separate multicast address/port combination than the production cluster.
The membership component