CRX Clustering
In computing, a cluster is a group of computers linked together to work, in some respects, as a single computer. Depending on the type of clustering, this can improve availability, performance or both. With CRX, two types of clustering are possible:
- Read-Only Clustering
- Active Clustering
In read-only clustering, multiple CRX instances each hold identical copies of the same content and share the job of servicing read requests through some form of load-balancing.
In this setup, a web server acting as a front-end exposes a single address and port to the web. Read requests arriving at the external address are distributed among the CRX instances (which will typically be on separate machines on an internal network) and the responses are piped back through the front-end server to the web client.
A typical example of such an arrangement is the "single-author/multiple-publish" configuration often used with CQ (which uses CRX as its repository). All authoring is done on a single CQ author instance which then periodically replicates content to the CQ publish instances, meaning that it sends duplicates of the same data to all recipients, thus keeping them all synchronized. Caching and load-balancing are provided by the Dispatcher, which distributes the read requests across the publish instances.
This type of clustering provides both high-availability and high-performance, but only for publishing, not authoring. To set up read-only clustering on CQ consult the CQ documentation on deployment, replication and the dispatcher.
The second form of clustering available with CRX is active clustering. Under this system multiple instances are simultaneously active and are kept in sync at all times, enabling not just reading from any instance, but also writing to any instance. This arrangement is usually combined with load-balancing and caching provided by the Dispatcher, thus providing both high-availability and high performance.
The Role of Load-Balancing
Active clustering ensures that all CRX instances in the cluster are always synchronized. It does not, however, distribute the processing load for requests across those instances. Each instance in the cluster is still accessible individually at its particular address and port, and any request sent to the address and port of an instance will be served by that instance. The only difference is that all the instances will always appear to have identical content. This means that active clustering per se addresses only the issue of redundancy. It does not, by itself, increase performance. In order to improve response times, active clustering must be combined with load-balancing.
One common way of doing this in CQ installations (which, of course, use CRX as their repository) is to use the Dispatcher as a front end for the cluster. The Dispatcher will expose a single address to the web and distribute incoming requests across cluster instances on the internal network. Consult the Dispatcher documentation to learn how to set up this type of system.
In a CQ installation active clustering can be used for both author and publish (as opposed to just publish, as is the case with the read-only setup).
Note
Standard industry terminology refers to each instance in a cluster as a node. You may encounter this terminology in other documentation. In this documentation, the term instance is used to avoid confusion with the unrelated concept of a JCR node (see JCR Data Model).
CRX Data Storage Overview
To understand how active/active clustering works in CRX, a brief review of CRX data storage may be helpful. CRX data storage is composed of three elements:
- Workspace Store: This is a per-instance store that holds the content of all the workspaces in the repository except large binaries, which reside in the Data Store. The implementation of this store depends on the particular persistence manager (PM) being used for a particular workspace. PMs can be configured per-workspace, though usually they all use the same one. The default is the Tar PM, which stores data in the file system as conventional Unix-style tar files. Other PMs, such as the MySQL PM and the Oracle PM, store data in databases. In an active/active clustered environment the workspace stores are kept synchronized across instances through the journal (see below).
- Data Store: This store holds large binaries. On write, large data objects are streamed directly to the data store and only an identifier referencing the binary is written to the workspace store. By providing this level of indirection, the data store ensures that large binaries are only stored once, even if they appear in multiple locations or in multiple workspaces. This store can be configured to use either a filesystem directory shared among all cluster instances, a database shared among all cluster instances or a per-instance store automatically synchronized across all cluster instances.
- Journal: Whenever CRX writes data it first records the intended change in the journal. Maintaining the journal helps ensure data consistency and helps the system to recover quickly from crashes. Similar to the Data Store, the Journal can be configured as a shared filesystem directory, a shared database or an auto-synchronized per-instance journal.
Active Clustering Variants
CRX supports three variants of active/active clustering:
- Shared Nothing: This is the default configuration. In this configuration all three elements of CRX storage are held per instance and synchronized over the network. No shared storage is used. The Java classes used for this configuration are:
- Workspaces: TarPM
- Data Store: ClusterDataStore
- Journal: TarJournal
- Shared Data Store: In this configuration the workspace stores and the journal are maintained per-instance and synchronized over the network, but the data store is held in a shared directory accessible to all cluster instances. The Java classes used for this configuration are:
- Workspaces: TarPM
- Data Store: FileDataStore or DbDataStore
- Journal: TarJournal
- Shared Data Store and Journal: In this configuration the workspace stores are maintained per-instance but both the journal and the data store are held in a shared directory accessible to all cluster instances. The Java classes used for this configuration are:
- Workspaces: TarPM
- Data Store: FileDataStore or DbDataStore
- Journal: FileJournal or DatabaseJournal
GUI Setup of Shared Nothing Clustering
Every CRX 2.2 installation comes pre-configured to run within a cluster. Under these default settings, the cluster can be configured easily through the CRX console GUI.
The GUI Cluster Setup method provides the easiest way to evaluate active clustering, as it only requires minimal configuration and installation effort. CRX active clustering, on the other hand, is a sophisticated and flexible module, with many modes of operation and configuration and tuning options. As such, for anything but a simple evaluation the Manual Cluster Setup is recommended, as it gives full flexibility.
The GUI Cluster Setup method supports the following:
- Shared nothing clustering variant
- The default configuration of CRX (Tar-based persistence and journalling, Quickstart-based installation, the default settings, including port numbers used for internal communication between the cluster nodes)
To configure other cluster variants, or specify advanced configuration options, consult the Cluster Configuration instructions.
To set up an active Shared Nothing cluster using the GUI:
Install two or more CRX 2.2 instances. In a production environment, each would be installed on a dedicated server. For development and testing, you may install multiple instances on the same machine.
Ensure that all the instance machines are networked together and visible to each other over TCP/IP.
Caution
Cluster instances communicate with each other through port 8088. Consequently this port must be open on all servers within a cluster. If one or more of the servers is behind a firewall, that firewall must be configured to open port 8088.
If you need to use another port (e.g., due to firewall setup), use the Manual Cluster Setup approach. You can configure the cluster communication port to another port number, in which case that port must be visible through the firewall.
To configure the cluster communications port, change the portList parameter in the <Journal> element of repository.xml (see the portList entry in the journal parameter list).
Decide which instance will be the master instance. Note the host name and port of this instance. For example, if you are running the master instance on your local machine, its address might be localhost:4502.
Every instance other than the master will be a slave instance. Go to the CRX console of each slave instance, located at
http://<slave-address>/crx/index.jsp
For example, if you are running a slave instance on your local machine on port 4503 its CRX console will be found at
http://localhost:4503/crx/index.jsp
After logging in to the console (the Log In link is near the top of the page) click the Repository Configuration category near the top of the page.
When the Repository Configuration page appears, click Cluster, in the list under Tools.
In the Cluster Configuration page enter the web address of the master instance in the field marked Master URL, as follows:
http://<master-address>/crx/config/cluster.jsp
For example, if both your slave and master are on your local machine, you might enter
http://localhost:4502/crx/config/cluster.jsp
Once you have filled in the Master URL, enter your Username and Password and click Join. You must have administrator access to set up a cluster.
Joining the cluster may take a few minutes. Once the console reports that join was successful, repeat the join procedure on each of the other slave instances.
Note
A restart of the slave instance might be required to avoid stale sessions.
In some cases a user may wish to set up a cluster without using the GUI. There are two ways to do this: manual slave join and manual slave clone.
The first method, manual slave join, is the same as the standard GUI procedure except that it is done without the GUI. Using this method, when a slave is added, the content of the master is copied over to it and a new search index on the slave is built from scratch. In cases where a pre-existing instance with a large amount of content is being "clusterized" this process can take a long time.
In such cases it is recommended to use the second method, manual slave clone. In this method the master instance is copied to a new location either at the file system level (i.e., the crx-quickstart directory is copied over) or using the online backup feature and the new instance is then adjusted to play the role of slave. This avoids the rebuilding of the index and for large repositories can save a lot of time.
The following steps are similar to joining a cluster node using the GUI. That means the data is copied over the network, and the search index is re-created (which may take some time):
Master
- Copy the files crx-quickstart-2.2.*.jar and license.properties to the desired directory.
- Start the instance:
java -Xmx512m -jar *.jar
- If a shared data store should be used: stop the instance, change crx-quickstart/repository/repository.xml, move the datastore directory to the required place, start the instance, and verify it still works.
Slave
- Copy the files crx-quickstart-2.2.*.jar and license.properties to the desired directory (usually on a different machine from the master, unless you are just testing).
- Unpack the JAR file:
java -Xmx512m -jar crx-quickstart-2.2.*.jar -unpack
- Copy the files repository.xml and cluster.properties from the master:
cp ../n1/crx-quickstart/repository/repository.xml crx-quickstart/repository/
cp ../n1/crx-quickstart/repository/cluster.properties crx-quickstart/repository/
- If this new slave is on a different machine from the master, append the IP address of the master to the cluster.properties file of the slave:
echo "addresses=x.x.x.x" >> crx-quickstart/repository/cluster.properties
where x.x.x.x is replaced by the correct address. At the master, the IP address of the slave should be added to the cluster.properties file as well.
- Start the instance:
java -Xmx512m -jar crx-quickstart-2.2.*.jar
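The address step above has to be repeated on both the slave and the master, so it can help to wrap it in a small helper. The following is a minimal sketch, not part of CRX: the function name add_cluster_address is hypothetical, and it assumes cluster.properties is a plain key=value file in which a single addresses line holds a comma-separated list, as in the example file shown later on this page. Unlike the plain echo, it merges the new address into an existing addresses line and skips duplicates.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CRX): add an IP address to the
# "addresses" property of a cluster.properties file.
# Usage: add_cluster_address <path-to-cluster.properties> <ip-address>
add_cluster_address() {
    props_file=$1
    new_ip=$2

    # Current value of the addresses property (empty if not present).
    # Assumption: if the key appears more than once, the last line is used.
    current=$(sed -n 's/^addresses=//p' "$props_file" 2>/dev/null | tail -n 1)

    # Nothing to do if the address is already listed.
    case ",$current," in
        *",$new_ip,"*) return 0 ;;
    esac

    if [ -z "$current" ]; then
        updated=$new_ip
    else
        updated="$current,$new_ip"
    fi

    # Rewrite the file without any old addresses line, then add the new one.
    tmp_file="$props_file.tmp"
    grep -v '^addresses=' "$props_file" > "$tmp_file" 2>/dev/null || true
    printf 'addresses=%s\n' "$updated" >> "$tmp_file"
    mv "$tmp_file" "$props_file"
}
```

For example, on the slave one might run add_cluster_address crx-quickstart/repository/cluster.properties x.x.x.x with the master's address, and on the master the same command with the slave's address.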
The following steps clone the master instance and change that clone into a slave, preserving the existing search index:
Master
Your existing repository will be the master instance.
If it is feasible to stop the master instance:
- Stop the master instance either through the GUI switch or the command line stop script.
- Copy the crx-quickstart directory of the master over to the location where you want the slave installed, using a normal filesystem copy (cp, for example).
- Restart the master.
If it is not feasible to stop the master instance:
- Do an online backup of the instance to the new slave location. The online backup tool can be made to write the copy directly into another directory or to a zip which you can then unpack in the new location. See here for details. The process can be automated using curl or wget. For example:
curl -c login.txt "http://localhost:7402/crx/login.jsp?UserId=admin&Password=xyz&Workspace=crx.default"
curl -b login.txt -f -o progress.txt "http://localhost:7402/crx/config/backup.jsp?action=add&&zipFileName=&targetDir=<targetDir>"
Slave
In the new slave instance directory:
- Delete the file crx-quickstart/repository/cluster_node.id.
- Add the master instance IP address to the file crx-quickstart/repository/cluster.properties:
echo "addresses=127.0.0.1" >> crx-quickstart/repository/cluster.properties
- Start the slave instance. It will join the cluster without re-indexing.
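The two preparation steps on the cloned directory can be scripted so that they are not forgotten when several slaves are cloned. This is a sketch under stated assumptions: the function name prepare_cloned_slave is hypothetical, and it only performs the steps listed above (delete cluster_node.id, append the master address to cluster.properties). It does not start the instance.

```shell
#!/bin/sh
# Hypothetical helper: turn a copied crx-quickstart directory into a slave.
# Usage: prepare_cloned_slave <crx-quickstart-dir> <master-ip>
prepare_cloned_slave() {
    repo_dir="$1/repository"
    master_ip=$2

    if [ ! -d "$repo_dir" ]; then
        echo "No repository directory at $repo_dir" >&2
        return 1
    fi

    # The clone must not reuse the master's node id; the file is
    # created again automatically by the system.
    rm -f "$repo_dir/cluster_node.id"

    # Point the clone at the master, as in the echo command above.
    echo "addresses=$master_ip" >> "$repo_dir/cluster.properties"
}
```

A call such as prepare_cloned_slave /path/to/clone/crx-quickstart 127.0.0.1 (the path is a placeholder) would be followed by starting the slave as usual.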
Setting up either of the non-default active clustering variants (shared data store or shared data store and journal) requires manual configuration. This is done in the file crx-quickstart/repository/repository.xml.
The element DataStore holds the parameters that govern the data store which holds large binary objects. By default the data store is configured to use shared-nothing clustering, meaning that each instance maintains its own copy of the store and they are all kept in sync. The default configuration looks like this:
<DataStore class="com.day.crx.core.data.ClusterDataStore">
    <param name="minRecordLength" value="4096"/>
</DataStore>
- minRecordLength: The minimum size for an object to be stored in the Data Store as opposed to inline within the regular PM. The default is 4096 bytes. The maximum supported value is 32000.
It is possible to maintain shared-nothing clustering for the workspace stores and journal while using a shared directory for the data store. This may be done to reduce disk space requirements, since all cluster nodes can then share the same data store. To use a shared data store, change the data store configuration in the repository.xml file as follows:
<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
    <param name="minRecordLength" value="4096"/>
    <param name="path" value="${rep.home}/shared/datastore"/>
</DataStore>
- path: The path where the data store files are stored. All cluster nodes must point to the same physical directory.
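Because every cluster node must reach the same physical directory, a quick check on each machine before starting the instance can catch a missing mount or a permissions problem. The helper below is only a sketch: check_shared_datastore is a hypothetical name, and the caller is assumed to pass the already resolved directory (that is, with ${rep.home} expanded by hand).

```shell
#!/bin/sh
# Hypothetical pre-start check: is the shared data store directory
# present and writable from this machine?
# Usage: check_shared_datastore <resolved-datastore-path>
check_shared_datastore() {
    ds_dir=$1
    if [ ! -d "$ds_dir" ]; then
        echo "Data store directory not found: $ds_dir" >&2
        return 1
    fi
    if [ ! -w "$ds_dir" ]; then
        echo "Data store directory not writable: $ds_dir" >&2
        return 1
    fi
    echo "Data store directory looks usable: $ds_dir"
}
```

This only confirms local access. It does not prove that two machines see the same physical directory, which still has to be verified at the storage level.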
The element <Cluster> holds the parameters that govern the journal.
<Cluster syncDelay="2000">
    <Journal class="com.day.crx.persistence.tar.TarJournal">
        <param name="bindAddress" value=""/>
    </Journal>
</Cluster>
- bindAddress: Used if the synchronization between cluster nodes should be done over a specific network interface. Default: empty (meaning all network interfaces are used).
- syncDelay: Events that were issued by other cluster nodes are processed after at most this many milliseconds. Optional; the default is 5000 (5 seconds).
- maxFileSize: The maximum file size per journal tar file. If the current data file grows larger than this number (in megabytes), a new data file is created (if the last entry in a file is very big, a data file can actually be bigger, as entries are not split among files). The maximum file size is 1024 (1 GB). Data files are kept open at runtime. The default is 256 (256 MB).
- maximumAge: Age specified as duration in ISO 8601 or plain format. Journal files that are older than the configured age are automatically deleted. The default is "P1M", which means files older than one month are deleted.
- portList: The list of listener ports to be used by this cluster node. When using a firewall, the open ports must be listed. A list of ports or ranges is supported, for example: 9100-9110 or 9100-9110,9210-9220. By default, the following port list is used: 8088-8093.
- preferredMaster: Flag indicating whether this cluster node should be a preferred master. If this flag is set, the node will become the master upon startup. Default: false.
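When translating a portList value into firewall rules, it can be convenient to expand the ranges into individual port numbers. The following sketch does that for the list syntax described above (comma-separated ports or low-high ranges). The function name expand_port_list is hypothetical, and the input is assumed to be well formed.

```shell
#!/bin/sh
# Hypothetical helper: expand a portList value such as "9100-9102,9210"
# into one port number per line.
# Usage: expand_port_list <port-list>
expand_port_list() {
    # Split the argument on commas into the positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $1
    IFS=$old_ifs

    for item in "$@"; do
        case $item in
            *-*)
                # A range: print every port from low to high inclusive.
                low=${item%-*}
                high=${item#*-}
                port=$low
                while [ "$port" -le "$high" ]; do
                    echo "$port"
                    port=$((port + 1))
                done
                ;;
            *)
                # A single port.
                echo "$item"
                ;;
        esac
    done
}
```

Running expand_port_list 8088-8093 lists the six ports of the default port list, one per line.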
Workspace Store Configuration
All three of the active clustering variants (shared nothing, shared data store or shared data store and journal) use the TarPM persistence manager for workspace storage. Changing this PM is also possible, but since it is not an issue tied specifically to clustering (single-instance installs may also wish to change workspace PMs) it is not covered here, but in the Persistence Managers section.
Cluster Properties and Node ID
The file crx-quickstart/repository/cluster_node.id contains the cluster node id (unique for each cluster node). This file is automatically created by the system. By default it contains a randomly generated UUID, but it can be any name. When copying a cluster node, this file should not be copied (if two cluster nodes contain the same cluster node id, only the first cluster node can connect).
Example file:
08d434b1-5eaf-4b1c-b32f-e9abedf05f23
The file crx-quickstart/repository/cluster.properties contains cluster configuration properties. The file is automatically updated by the system if the cluster configuration is changed in the GUI.
Example file:
cluster_id=86cab8df-3aeb-4985-8eb5-dcc1dffb8e10
addresses=10.0.2.2,10.0.2.3
members=08d434b1-5eaf-4b1c-b32f-e9abedf05f23,fd11448b-a78d-4ad1-b1ae-ec967847ce94
The cluster_id property contains the cluster ID, which must be the same for all cluster nodes that participate in this cluster. By default this is a randomly generated UUID, but it can be any name.
The addresses property contains a comma-separated list of the IP addresses of all nodes in this cluster. This list is used at the startup of each cluster node to connect to the other nodes that are already running. The list is not needed if all cluster nodes are running on the same computer (which may be the case in certain circumstances, such as testing).
The members property contains a comma-separated list of the cluster node IDs that participate in the cluster. This property is not required for the cluster to work; it is for informational purposes only.
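Since cluster_id must match on every participating node, comparing the value in two copies of cluster.properties is a simple sanity check when a join fails. This is a sketch only: the function names cluster_prop and same_cluster are hypothetical, and the file is assumed to be a plain key=value list like the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key from a cluster.properties file.
# Usage: cluster_prop <file> <key>
cluster_prop() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Hypothetical helper: succeed if two cluster.properties files carry the
# same, non-empty cluster_id.
# Usage: same_cluster <file-a> <file-b>
same_cluster() {
    id_a=$(cluster_prop "$1" cluster_id)
    id_b=$(cluster_prop "$2" cluster_id)
    [ -n "$id_a" ] && [ "$id_a" = "$id_b" ]
}
```

For instance, after copying the slave's file next to the master's, same_cluster master.properties slave.properties returns a zero exit status only if both nodes belong to the same cluster (both file names are placeholders).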
The following system properties affect the cluster behavior:
socket.connectTimeout: The maximum number of milliseconds to wait for the master to respond (default: 1000). A timeout of zero means infinite (block until connection established or an error occurs).
socket.receiveTimeout: The maximum number of milliseconds to wait for a reply from the master or slave (SO_TIMEOUT; default: 60000). A timeout of zero means infinite.
com.day.crx.core.cluster.DisableReverseHostLookup: Disable the reverse lookup from the master to the slave when connecting (default: false). If not set, the master checks if the slave is reachable using InetAddress.isReachable(connectTimeout).
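Java system properties are conventionally passed to the JVM as -D options placed before -jar. Assuming the properties above are read as ordinary Java system properties, a start command with adjusted timeouts could be assembled as in the sketch below. The function name crx_start_command is hypothetical, and it only prints the command so that it can be reviewed before use.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a quickstart start command that
# sets the two socket timeouts as Java system properties.
# Usage: crx_start_command <jar-file> <connect-timeout-ms> <receive-timeout-ms>
crx_start_command() {
    # JVM options such as -Xmx and -D must come before -jar.
    printf 'java -Xmx512m -Dsocket.connectTimeout=%s -Dsocket.receiveTimeout=%s -jar %s\n' \
        "$2" "$3" "$1"
}
```

The timeout values passed in are illustrative choices for the caller to make, not recommended settings.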
Out-of-Sync Cluster Instances
In some cases, when the master instance is stopped while the other cluster instances are still running, the master instance cannot re-join the cluster after being restarted.
This can occur in cases where a write operation was in progress at the moment that the master node was stopped, or where a write operation occurred a few seconds before the master instance was stopped. In these cases, the slave instance may not receive all changes from the master instance. When the master is then re-started, CRX will detect that it is out of sync with the remaining cluster instances and the repository will not start. Instead, an error message is written to the server.log saying the repository is not available, and the following or a similar error message is written to the files crx-quickstart/logs/crx/error.log and crx-quickstart/logs/stdout.log:
ClusterTarSet: Could not open (ClusterTarSet.java, line 710)
java.io.IOException: This cluster node and the master are out of sync. Operation stopped. Please ensure the repository is configured correctly. To continue anyway, please delete the index and data tar files on this cluster node and restart. Please note the Lucene index may still be out of sync unless it is also deleted.
...
java.io.IOException: Init failed
...
RepositoryImpl: failed to start Repository: Cannot instantiate persistence manager ...
RepositoryStartupServlet: RepositoryStartupServlet initializing failed
Avoiding Out-of-Sync Cluster Instances
To avoid this problem, ensure that the slave cluster instances are always stopped before the master is stopped.
If you are not sure which cluster instance is currently the master, open the page http://localhost:port/crx/config/cluster.jsp. The master ID listed there will match the contents of the file crx-quickstart/repository/cluster_node.id of the master cluster instance.
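The comparison described above can be done by hand or with a small helper. In this sketch the function name is_master is hypothetical, and the master ID is assumed to have been copied manually from the cluster.jsp page. The helper only compares that value with the local cluster_node.id file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the local instance's cluster node id
# equals the master id shown on the cluster.jsp page.
# Usage: is_master <crx-quickstart-dir> <master-id>
is_master() {
    id_file="$1/repository/cluster_node.id"
    if [ ! -r "$id_file" ]; then
        echo "Cannot read $id_file" >&2
        return 2
    fi
    # Strip whitespace and line endings around the stored id before comparing.
    local_id=$(tr -d ' \t\r\n' < "$id_file")
    [ "$local_id" = "$2" ]
}
```

A shutdown script could use it as: if is_master crx-quickstart "$MASTER_ID"; then echo "stop this instance last"; fi, where MASTER_ID is a variable set by the operator.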
Recovering an Out-of-Sync Cluster Instance
To re-join a cluster instance that is out of sync, there are a number of solutions:
- Create a new repository and join the cluster node as normal.
- Use the Online Backup feature to create a cluster node. In many cases this is the fastest way to add a cluster node.
- Restore an existing backup of the cluster instance node and start it.
- As described in the error message, delete the index and data tar files that are out-of-sync on this cluster node and restart. Note that the Lucene search index may still be out of sync unless it is also deleted. This procedure is discouraged as it requires more knowledge of the repository, and may be slower than using the online backup feature (especially if the Lucene index needs to be re-built).
Locking in Active Clusters
Active clusters do not support session-scoped locks. Open-scoped locks or application-side solutions for synchronizing write operations should be used instead.
Alternative Clustering Setups
Apart from the two standard CRX clustering arrangements, read-only and active, another clustering setup, called active/passive, may sometimes be used.
Under this system, a number of CRX instances share a common data storage location but only one instance is active at a time. The active instance services all read and write requests. If, however, that instance fails, incoming requests are automatically re-routed to one of the other instances which then takes over. Because all instances use the same data storage, their content is always, by definition, "synchronized".
This type of clustering provides high-availability, but without the higher performance achieved by load balancing, since only one instance is running at any one time.
CRX is compatible with this form of clustering, though in almost all cases it will be inferior to the standard CRX active clustering.
http://dev.day.com/docs/en/cq/current/core/administering/cluster.html