Configure a Cassandra HA cluster

Tip   A new guide is available in the latest release of API Gateway that includes information on Apache Cassandra best practices and tuning, setting up high availability, and backup and restore. Much of the information in this guide also applies to API Gateway 7.5.3. See the API Gateway 7.6.2 Apache Cassandra Administrator Guide.

This topic describes how to set up an Apache Cassandra database cluster for high availability of your API Gateway system. It describes the necessary configuration steps and provides examples from a production environment.

Cassandra HA in a production environment

To tolerate the loss of one Cassandra node and to ensure 100% data consistency, API Gateway requires the following cluster configuration in an HA production environment:

  • Three Cassandra nodes (with one seed node)
  • QUORUM consistency to ensure that you are reading from a quorum of Cassandra nodes (two) every time
  • Replication factor set to 3 so each node holds 100% of the data and you can tolerate the loss of one node

If one Cassandra node fails, the cluster continues to operate with two nodes and remains highly available, consistent, and able to serve both reads and writes. If only one node remains, QUORUM consistency cannot be satisfied and the cluster is unavailable. This configuration applies in all supported use cases (for example, API Manager and API Gateway custom KPS, OAuth, and client registry data).
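
The arithmetic behind these requirements can be sketched in a few lines of Python. The sketch assumes the standard Cassandra definition of a quorum (half the replication factor, rounded down, plus one). The function names are illustrative only and are not part of any API Gateway or Cassandra tool.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerated failures={tolerated_failures(rf)}")
```

With a replication factor of 3 the quorum is 2, so one node can be lost. With a replication factor of 2 the quorum is still 2, so no node can be lost, which is why three nodes are required.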

Note   Eventual consistency is not supported in a production environment (due to a risk of stale and incomplete data).

For more details, see Example Cassandra HA configuration in a production environment.

For details on hosting Cassandra in datacenters, see Configure API Management in multiple datacenters.

Upgrade from previous API Gateway version

When upgrading from a previous API Gateway version, you need only one Cassandra node in the cluster to receive upgraded data. After upgrade, you can then add more nodes to this cluster to provide high availability (HA), and configure TLS security. For more details, see the API Gateway Upgrade Guide.

Cassandra HA configuration

Use the following tools to configure Cassandra and the API Gateway Cassandra client:

Tool Description
nodetool Located in CASSANDRA_HOME/bin. This tool is required to run most Cassandra administration operations. nodetool runs locally by default against a Cassandra node.
cqlsh Located in CASSANDRA_HOME/bin. This tool provides a query language interface to Cassandra. Cassandra Query Language (CQL) is similar in syntax to SQL. You can use tab completion with cqlsh (for example, press Tab to complete keyspace, table, and command names).
setup-cassandra Located in GATEWAY_INSTALL_DIR/bin. This script helps with Cassandra configuration and updates the cassandra.yaml configuration file. You can edit this file manually, but this script saves time, helps prevent errors, and creates a backup of the original cassandra.yaml file. setup-cassandra also outputs instructions for resetting the default user name and password. You can use this script when Cassandra is installed locally or remotely to API Gateway. For more details, see setup-cassandra script reference.
Policy Studio Policy Studio enables you to configure API Gateway and API Manager as clients of Cassandra. It also enables you to configure KPS table definitions, which are created in back-end storage if they do not exist (for example, in Cassandra or a relational database). For Cassandra, these tables are created in a group keyspace with an initial replication factor of 1. For more details, see Configure the group keyspace and replication factor in Cassandra.

The following general guidelines apply to configuring Cassandra HA:

  • Decide on the number of Cassandra nodes and the number of API Gateway nodes (local or remote). Axway recommends that you configure a Cassandra HA cluster with three Cassandra nodes and at least two API Gateway instances (local or remote). For details, see Cassandra deployment architectures.

Example Cassandra HA configuration in a production environment

This section describes an example Cassandra HA configuration supported by Axway in a production environment.

Note   In this section, API Gateway and API Manager are both clients of Cassandra, and all API Gateway steps refer to both API Gateway and API Manager. API Manager is used only when additional API Manager-specific configuration is required.

HA production environment requirements

The following system requirements apply for Cassandra HA in a production environment:

Hardware requirements

  • Nodes: Three Cassandra nodes (one seed node).
  • IP address: One IP address per Cassandra node.
  • Disk space and memory: Depend on how much data you plan to store and how often this data changes:
    • KPS data and API Manager data consume small amounts of storage (mostly read configuration data), and should not be an issue.
    • OAuth token use can be large, depending on the frequency of token generation and token time-to-live.
    • Double the amount of estimated storage: Needed for Cassandra to perform automatic compaction of data.
  • Storage: Cassandra is designed to run on commodity distributed drives. Storage Area Network (SAN) is not recommended or supported in a production environment.

For more details, see the API Management Capacity Planning Guide (available when logged into the Axway documentation website).
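
As a rough illustration of the disk guidance above, the following Python sketch estimates OAuth token storage and then doubles it for compaction headroom. The workload figures and the per-token size are hypothetical placeholders, not measured values, and the function names are invented for this example. Use the Capacity Planning Guide for real sizing.

```python
def live_token_count(tokens_per_second: float, ttl_seconds: float) -> float:
    """Approximate number of unexpired tokens held at steady state."""
    return tokens_per_second * ttl_seconds


def provisioned_bytes(estimated_bytes: float) -> float:
    """Double the estimate to leave room for automatic compaction."""
    return 2 * estimated_bytes


if __name__ == "__main__":
    # Hypothetical workload: 50 tokens/s, one-hour TTL, 2 KB stored per token.
    tokens = live_token_count(tokens_per_second=50, ttl_seconds=3600)
    estimate = tokens * 2048
    print(f"live tokens: {tokens:.0f}")
    print(f"disk to provision: {provisioned_bytes(estimate) / 1e9:.2f} GB")
```

Because the replication factor is 3 on a three-node cluster, each node holds all of the data, so the result applies to every node rather than being divided across the cluster.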

Software requirements

API Gateway supports the following systems in production:

  • Operating systems:
    • All Linux platforms supported by API Gateway. For more details, see System requirements.
    • Windows Server 2012 R2 is recommended.
  • Cassandra:
    • 64-bit Cassandra version 2.2.8 or 2.2.5 with 64-bit Oracle JRE on Linux and Windows (OpenJDK is not supported). Cassandra 2.2.8 is recommended. For more details, see Supported Cassandra versions.
Note   You must download a 64-bit Oracle JRE manually on UNIX/Linux when Cassandra is remote to API Gateway. This also applies on Windows when Cassandra is remote or local (see Install a 64-bit Oracle JRE on Windows).
  • Python:
    • Cassandra 2.2.5 requires Python 2.7.x (up to 2.7.10)
    • Cassandra 2.2.8 requires Python 2.7.x (up to the latest)
    For more details, see https://www.python.org/.

Network requirements

  • All Cassandra nodes can connect to each other:
    • Ensure you can ping from each node to each other node.
    • Ensure that your firewall rules allow the necessary Cassandra ports for client and server connections. API Gateway clients connect to Cassandra over the Cassandra native protocol on port 9042 by default. Cassandra uses port 7000 by default for communication between Cassandra nodes (and port 7001 when TLS/SSL is configured).
  • Use a time service such as NTP to ensure that time is in sync in the cluster. For more details, see Ensure time is in sync across the Cassandra cluster.
Note   Earlier API Gateway versions used a default port of 9160 to communicate with Cassandra over the Apache Thrift protocol. This protocol is not supported in API Gateway version 7.5.3 or later. However, if necessary, after upgrade, you can configure Cassandra and API Gateway to use port 9160 to communicate over the Cassandra native protocol.
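
One way to check the firewall rules described above is to attempt a TCP connection to each default port from each node. The following Python sketch uses only the standard library. The host list is a placeholder, and the helper names are illustrative rather than part of any Axway tool.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_cluster(hosts, ports=(9042, 7000)):
    """Map each (host, port) pair to its reachability."""
    return {(h, p): port_open(h, p) for h in hosts for p in ports}


if __name__ == "__main__":
    # Placeholder address; replace with the real node IPs (ipA, ipB, ipC).
    for (host, port), ok in check_cluster(["127.0.0.1"]).items():
        print(f"{host}:{port} {'reachable' if ok else 'unreachable'}")
```

A successful connection shows only that something is listening on the port and that the firewall allows it. It does not confirm that the listener is Cassandra. If TLS/SSL is configured for node-to-node traffic, check port 7001 instead of 7000.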

Start with one Cassandra seed node

You must always start with one Cassandra node (non-HA). You can test API Gateway and API Manager functionality and become familiar with Cassandra using one node, before growing the system for HA.

When upgrading from previous API Gateway and API Manager versions (with embedded Cassandra), you must upgrade to one node only. After upgrade, you can then grow the system for HA. You do not need to start with three nodes, or start from scratch to achieve HA. For more details on upgrade, see the API Gateway Upgrade Guide.

Note   Cassandra scales horizontally. This means that each node must have equal resources. Each node must run on the same hardware (CPU, disk, memory, and network) and on the same operating system. This ensures that nodes do not starve or out-compete other nodes, and that you can easily add, remove, and replace nodes, especially in cloud environments. For example, do not run some nodes with less or more memory than other nodes, or some nodes on Windows and some on Linux, or some nodes on SUSE Linux and some on CentOS Linux.

Cassandra HA configuration steps

The high-level approach to Cassandra HA configuration is as follows:

  1. Configure and verify the Cassandra HA cluster (non-secure).
  2. Configure the API Gateway or API Manager client and verify.
  3. Secure the Cassandra HA configuration and verify.

These steps are described in detail in the sections that follow.

Step 1 – Configure and verify the Cassandra HA cluster (non-secure)

This includes the following steps:

  1. Connect API Gateway to Cassandra
  2. Configure the group keyspace and replication factor in Cassandra
  3. Configure the Cassandra seed node
  4. Add the seed node to the HA cluster
  5. Replicate and verify the Cassandra cluster

Connect API Gateway to Cassandra

If you installed a Standard or Complete setup, Cassandra is installed on the same host, and listens on localhost by default. API Gateway runs on the same host and connects to Cassandra by default.

If you installed a Custom setup, and did not select the Quickstart tutorial, see Connect to API Gateway for the first time.

Configure the group keyspace and replication factor in Cassandra

If you have created a KPS collection or set up API Manager (which creates KPS collections), API Gateway creates a Cassandra keyspace and tables for data storage when it connects to a Cassandra node, if these do not exist. This topic assumes API Manager users have already run setup-apimanager in non-HA standalone mode, so the keyspace will exist. For details on configuring API Manager, see the API Manager User Guide.

By default, the created Cassandra keyspace has a name in the form of xDOMAINID_GROUPID. This enables API Gateways in a group to share data and enables a single Cassandra cluster to host data from multiple API Gateway domains (for example, development, test, and staging).

Tip   You can find your DOMAINID and GROUPID as follows:
  • Open the following file to view the DOMAINID:
    GATEWAY_INSTALL_DIR/apigateway/groups/topology.json
  • Run the following command to output the GROUPID:
    ls -l groups/topologylinks/GroupName
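
The naming rule can be sketched as follows. This is an illustration only: it assumes that hyphens in the domain and group IDs are replaced with underscores, which is inferred from the example keyspace name used in this topic and is not a documented rule. Always confirm the real name with DESCRIBE KEYSPACES in cqlsh.

```python
def group_keyspace_name(domain_id: str, group_id: str) -> str:
    """Build a keyspace name in the xDOMAINID_GROUPID form.

    Assumption: hyphens in both IDs become underscores, as the example
    keyspace name in this topic suggests.
    """
    def normalize(value: str) -> str:
        return value.replace("-", "_")

    return f"x{normalize(domain_id)}_{normalize(group_id)}"


if __name__ == "__main__":
    # IDs inferred from the example keyspace name used in this topic.
    print(group_keyspace_name("83709115-c70d-4996-83ad-339407e1117d", "group-2"))
```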

Configure the API Gateway keyspace and replication factor

Initially, the keyspace has a default Replication Factor (RF) of 1. You must increase this for HA configuration. Perform the following steps:

  1. Use cqlsh to verify that the keyspace has been created and to view its replication factor. For example:

     $ ./cqlsh ipA

     DESCRIBE KEYSPACES;
     describe x83709115_c70d_4996_83ad_339407e1117d_group_2;

     The following text is output at the start:

     CREATE KEYSPACE x83709115_c70d_4996_83ad_339407e1117d_group_2 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

  2. Update the replication factor to 3. For example:

     ALTER KEYSPACE x83709115_c70d_4996_83ad_339407e1117d_group_2 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

  3. Rerun the describe command in step 1. The RF should now be 3.
Note   You must repeat these steps for each API Gateway group.
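
When there are several API Gateway groups, it can help to generate the statements rather than type them. The following Python sketch reads the replication factor from the CREATE KEYSPACE text that cqlsh outputs and builds the matching ALTER KEYSPACE statement. It assumes SimpleStrategy, as in the example above. The function names are illustrative, and the sketch does not connect to Cassandra.

```python
import re

_RF_PATTERN = re.compile(r"'replication_factor'\s*:\s*'?(\d+)'?")


def replication_factor(describe_output: str) -> int:
    """Extract the SimpleStrategy replication factor from describe output."""
    match = _RF_PATTERN.search(describe_output)
    if match is None:
        raise ValueError("no replication_factor found")
    return int(match.group(1))


def alter_keyspace_statement(keyspace: str, rf: int = 3) -> str:
    """Build an ALTER KEYSPACE statement in the form shown above."""
    return (
        f"ALTER KEYSPACE {keyspace} WITH REPLICATION = "
        f"{{ 'class' : 'SimpleStrategy', 'replication_factor' : {rf} }};"
    )


if __name__ == "__main__":
    keyspace = "x83709115_c70d_4996_83ad_339407e1117d_group_2"
    described = (
        f"CREATE KEYSPACE {keyspace} WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor': '1'} "
        "AND durable_writes = true;"
    )
    if replication_factor(described) < 3:
        print(alter_keyspace_statement(keyspace))
```

The printed statement can be pasted into cqlsh for each group keyspace.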

Configure the system_auth keyspace replication factor

When Cassandra authentication is enabled, you must also replicate the system_auth keyspace so that API Gateway can communicate with the cluster (for example, if a node in the cluster goes down):

ALTER KEYSPACE "system_auth" WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };

QUIT

For more details, see Secure Cassandra HA configuration.

Configure the Cassandra seed node

Perform the following steps on the Cassandra seed node (ipA in this example):

  1. Update CASSANDRA_HOME/conf/cassandra.yaml manually or with the setup-cassandra script. You must ensure that the following settings are set (where ipA is the IP address of the first node):
    • seed_provider, parameters, seeds: ipA
    • start_native_transport: true
    • native_transport_port: 9042
    • listen_address: ipA
    • rpc_address: ipA
    • authenticator: org.apache.cassandra.auth.PasswordAuthenticator
    • authorizer: org.apache.cassandra.auth.CassandraAuthorizer

     For example, to set these with the setup-cassandra script:

     setup-cassandra --seed --own-ip=ipA --nodes=3 --cassandra-config=CONFIG_FILE

  2. Restart Cassandra. For details, see Install an Apache Cassandra database. You must always start the seed node first. For example, use the following command:

     /opt/cassandra/bin/cassandra

  3. To verify, run nodetool status. You should see the correct IP address output. Only one node should be output.
Note   The default username/password is cassandra/cassandra. You must specify these credentials when running Cassandra tools such as cqlsh. For example:

./cqlsh ipA -u cassandra -p cassandra
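
To compare a manually edited cassandra.yaml against the required settings, it can be useful to render the expected fragment for a node. The following Python sketch does this as plain text. The seed_provider layout and the SimpleSeedProvider class name follow the stock cassandra.yaml. The function name and the sample address are illustrative. The output is a fragment for comparison, not a complete configuration file.

```python
def node_settings(own_ip: str, seed_ip: str) -> str:
    """Render the cassandra.yaml settings listed above for one node."""
    return "\n".join([
        "seed_provider:",
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "      parameters:",
        f'          - seeds: "{seed_ip}"',
        "start_native_transport: true",
        "native_transport_port: 9042",
        f"listen_address: {own_ip}",
        f"rpc_address: {own_ip}",
        "authenticator: org.apache.cassandra.auth.PasswordAuthenticator",
        "authorizer: org.apache.cassandra.auth.CassandraAuthorizer",
    ])


if __name__ == "__main__":
    # On the seed node, own_ip and seed_ip are the same address (ipA).
    print(node_settings(own_ip="10.0.0.1", seed_ip="10.0.0.1"))
```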

Add the seed node to the HA cluster

You now must add two more Cassandra nodes in turn. The steps are similar to configuring the seed node, except the seed is now the first node (ipA).

Note   You must first have at least one designated seed node. Seed nodes are required at runtime when a node is added to the cluster. You can add more seed nodes or change the designation later.

Configure the second Cassandra node (ipB)

  1. On Cassandra node B, install Cassandra (see Install an Apache Cassandra database). Do not start it yet.
  2. Edit cassandra.yaml manually or use the setup-cassandra script. You must ensure that the following settings are set (where ipB is the IP address of the second node):
    • seed_provider, parameters, seeds: ipA
    • start_native_transport: true
    • native_transport_port: 9042
    • listen_address: ipB
    • rpc_address: ipB
    • authenticator: org.apache.cassandra.auth.PasswordAuthenticator
    • authorizer: org.apache.cassandra.auth.CassandraAuthorizer

     For example, to set these with the setup-cassandra script:

     setup-cassandra --seed-ip=ipA --own-ip=ipB --cassandra-config=CONFIG_FILE

  3. Start Cassandra. For example:

     /opt/cassandra/bin/cassandra

     This node should join the cluster after obtaining information from the seed node. For more details, see Install an Apache Cassandra database.
  4. To verify, run nodetool status. You should see two nodes reported with the correct IP addresses.

Configure the third Cassandra node (ipC)

  1. On Cassandra node C, install Cassandra (see Install an Apache Cassandra database). Do not start it yet.
  2. Edit cassandra.yaml manually or use the setup-cassandra script. You must ensure that the following settings are set (where ipC is the IP address of the third node):
    • seed_provider, parameters, seeds: ipA
    • start_native_transport: true
    • native_transport_port: 9042
    • listen_address: ipC
    • rpc_address: ipC
    • authenticator: org.apache.cassandra.auth.PasswordAuthenticator
    • authorizer: org.apache.cassandra.auth.CassandraAuthorizer

     For example, to set these with the setup-cassandra script:

     setup-cassandra --seed-ip=ipA --own-ip=ipC --cassandra-config=CONFIG_FILE

  3. Start Cassandra. For example:

     /opt/cassandra/bin/cassandra

     This node should join the cluster after obtaining information from the seed node. For more details, see Install an Apache Cassandra database.
  4. To verify, run nodetool status. You should see three nodes reported with the correct IP addresses.
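
The three setup-cassandra invocations differ only in a few arguments, so they can be generated from a node list. The following Python sketch builds the argument lists using only the options shown in this topic. The function name is illustrative, and ipA, ipB, ipC, and CONFIG_FILE are the same placeholders used above.

```python
def setup_cassandra_args(own_ip, seed_ip, config_file, total_nodes=3):
    """Return the setup-cassandra argument list for one node."""
    if own_ip == seed_ip:
        # Seed node form: --seed --own-ip=... --nodes=...
        args = ["setup-cassandra", "--seed",
                f"--own-ip={own_ip}", f"--nodes={total_nodes}"]
    else:
        # Non-seed form: point the node at the seed with --seed-ip.
        args = ["setup-cassandra", f"--seed-ip={seed_ip}", f"--own-ip={own_ip}"]
    args.append(f"--cassandra-config={config_file}")
    return args


if __name__ == "__main__":
    # Print the seed node command first, because the seed must start first.
    for ip in ("ipA", "ipB", "ipC"):
        print(" ".join(setup_cassandra_args(ip, "ipA", "CONFIG_FILE")))
```

Each list can be passed to subprocess.run on the relevant node, or printed and pasted into a shell.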

Replicate and verify the Cassandra cluster

To replicate data correctly around the cluster after this cluster configuration change, go to each node in turn, and run nodetool repair.

To verify, go to each node in turn, and run the following command for each group:

nodetool status keyspace_name

For example:

nodetool status x83709115_c70d_4996_83ad_339407e1117d_group_2

You should see three nodes with ownership of 100%.
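
This check can be scripted by parsing the nodetool status output. The following Python sketch is a minimal parser. The sample output is illustrative (addresses, loads, and host IDs are invented), and the function names are not part of nodetool. Column layout can vary between Cassandra versions, so treat the parser as a starting point.

```python
import re

# Node rows start with a two-letter code: Up/Down, then Normal/Leaving/Joining/Moving.
_NODE_LINE = re.compile(r"^[UD][NLJM]\s")

SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns (effective)  Host ID   Rack
UN  10.0.0.1  120.5 KB   256     100.0%            host-a    rack1
UN  10.0.0.2  118.2 KB   256     100.0%            host-b    rack1
UN  10.0.0.3  119.9 KB   256     100.0%            host-c    rack1
"""


def parse_nodetool_status(output: str):
    """Return a list of (status, address, owns_percent) tuples."""
    nodes = []
    for line in output.splitlines():
        if not _NODE_LINE.match(line):
            continue
        fields = line.split()
        owns = next((f for f in fields if f.endswith("%")), None)
        percent = float(owns.rstrip("%")) if owns else None
        nodes.append((fields[0], fields[1], percent))
    return nodes


def cluster_is_healthy(output: str, expected_nodes: int = 3) -> bool:
    """True if every expected node is Up/Normal and owns 100% of the data."""
    nodes = parse_nodetool_status(output)
    return len(nodes) == expected_nodes and all(
        status == "UN" and owns == 100.0 for status, _, owns in nodes
    )


if __name__ == "__main__":
    print("healthy" if cluster_is_healthy(SAMPLE) else "needs attention")
```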

Step 2 – Configure the client settings for API Gateway or API Manager

Note   You need at least two API Gateways in a group for HA.

Configure API Gateway Cassandra client settings

To update the Cassandra client configuration for API Gateway, perform the following steps:

Configure the API Gateway domain

  1. Ensure API Gateway has been installed on the API Gateway 1 and API Gateway 2 nodes. For details, see Install the API Gateway server.
  2. Ensure an API Gateway domain has been created on the API Gateway 1 node using managedomain. For more details, see Configure an API Gateway domain in the API Gateway Administrator Guide.

Configure the API Gateway Cassandra client connection

  1. In Policy Studio, open your API Gateway group configuration.
  2. Select Server Settings > Cassandra > Authentication, and enter your Cassandra user name and password (both default to cassandra).
  3. Select Server Settings > Cassandra > Hosts, and add an address for each Cassandra node in the cluster (ipA, ipB and ipC in this example).
Tip   You can automate these steps by running the updateCassandraSettings.py script against a deployment package (.fed). For more details, see Automate API Gateway Cassandra client settings.

Configure the API Gateway Cassandra consistency levels

  1. Ensure that the API Server KPS collection has been created under Environment Configuration > Key Property Stores. This is required to configure Cassandra consistency levels, and is created automatically if you installed the Complete setup type (see Installation options). If you installed the Custom or Standard setup, run one of the following scripts to create the required KPS collections:
  2. Select Environment Configuration > Key Property Stores > API Server > Data Sources > Cassandra Storage, and click Edit.
  3. In the Read Consistency Level and Write Consistency Level fields, select QUORUM.
  4. Repeat this step for each KPS collection using Cassandra (for example, Key Property Stores > OAuth, or API Portal for API Manager). This also applies to any custom KPS collections that you have created.
  5. If you are using OAuth and Cassandra, you must also configure quorum consistency for all OAuth2 stores under Libraries > OAuth2 Stores:
    • Access Token Stores > OAuth Access Token Store
    • Authorization Code Stores > Authz Code Store
    • Client Access Token Stores > OAuth Client Access Token Store
Note   By default, OAuth uses EhCache instead of Cassandra. For more details on OAuth, see the API Gateway OAuth User Guide.

Deploy the configuration

  1. Click Deploy in the toolbar to deploy this configuration to the API Gateway group.
  2. Restart each API Gateway in the group.

For details on any connection errors between API Gateway and Cassandra, see Configure a Cassandra HA cluster.

Configure API Manager Cassandra client settings

To update the Cassandra client configuration for API Manager, perform the following steps:

  1. Ensure the API Gateway and API Manager components have been installed on the API Gateway 1 and API Gateway 2 nodes. These can be local or remote to Cassandra installations. For details, see Install the API Gateway server and Install API Manager.
  2. Ensure an API Gateway domain, group, and instance have been created on the API Gateway 1 node using managedomain. For more details, see the API Gateway Administrator Guide.
Note   This section assumes that you have already run setup-apimanager on the first node in non-HA standalone mode. For more details, see the API Manager User Guide.
  3. Start the first API Gateway instance in the group. For example:

     startinstance -n "my_gw_server_1" -g "my_group"

  4. Configure the Cassandra connection on the API Gateway 1 node. For details, see Configure the API Gateway Cassandra client connection.
  5. Configure the Cassandra consistency levels for your KPS Collections. For details, see Configure the API Gateway Cassandra consistency levels.
  6. In the Policy Studio tree, select Server Settings > API Manager > Quota Settings, and ensure that Use Cassandra is selected.
  7. Under Cassandra consistency levels, in both the Read and Write fields, select QUORUM.
     Figure: Three node HA full consistency
  8. Add the API Gateway 2 host machine to the domain using managedomain.
  9. Create the second API Gateway instance in the same group on the API Gateway 2 node.
Note   Do not start this instance, and do not run setup-apimanager on this instance.
  10. Before starting the second API Manager-enabled instance, ensure that each instance has unique ports in the envSettings.props file. For example:
    1. Edit the envSettings.props file for the API Gateway instance in the following directory:
    2. Add the API Manager ports. For example, the defaults are:
  11. Start the second API Gateway instance. For example:

     startinstance -n "my_gw_server_2" -g "my_group"

     On startup, this instance receives the API Manager configuration for the group. It now shares the same KPS and Cassandra configuration and data, and uses the ports specified in the envSettings.props file.

Step 3 – Secure the Cassandra HA configuration and verify

To secure your Cassandra HA configuration, perform the following steps:

  1. Reset your default user name and password.
  2. Enable node-to-node traffic encryption.
  3. Enable client-to-node traffic encryption.
  4. Configure the cqlsh command for client-to-node traffic encryption.

For details, see Secure Cassandra HA configuration.

Note   nodetool can normally run on any machine against any Cassandra node. For improved security, you might have locked down JMX for localhost access only. In such cases, you could use ssh to access that machine, and then run nodetool.

Automate API Gateway Cassandra client settings

You can automate your API Gateway Cassandra client configuration by running the updateCassandraSettings.py script against a specified API Gateway deployment package (.fed). For example:

  1. Go to the following directory:

     INSTALL_DIR/apigateway/samples/scripts

  2. Enter the following command:

     run cassandra/updateCassandraSettings.py -f /opt/apigateway/conf/my_deployment.fed -r 3 -H "ipA:9042,ipB:9042,ipC:9042"
Tip   To ensure that the .fed file does not need to change between environments, you can specify the Cassandra hostnames as environment variables. For example:
-H "\${env.CASS.HOST1}:9042,\${env.CASS.HOST2}:9042,\${env.CASS.HOST3}:9042"
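
The value passed to -H is a comma-separated host:port list, which can be assembled from a node list as in the following Python sketch. The helper names are illustrative and are not part of updateCassandraSettings.py. The environment variable names match the example above.

```python
def hosts_argument(hosts, port=9042):
    """Join hosts into a comma-separated host:port list."""
    return ",".join(f"{host}:{port}" for host in hosts)


def env_placeholder(name):
    """Wrap an environment variable name in the ${env.NAME} form."""
    return "${env." + name + "}"


if __name__ == "__main__":
    # Literal addresses (placeholders from this topic).
    print(hosts_argument(["ipA", "ipB", "ipC"]))
    # Environment variable placeholders resolved at deployment time.
    names = ["CASS.HOST1", "CASS.HOST2", "CASS.HOST3"]
    print(hosts_argument([env_placeholder(n) for n in names]))
```

The backslash before each $ in the shell example stops the shell from expanding the placeholder. It is not needed when the string is passed as a single argument without a shell (for example, in a subprocess argument list).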

updateCassandraSettings options

The updateCassandraSettings.py script options are explained as follows:

Option Description
-f, --file Enter the API Gateway deployment package (.fed) to be updated. The default is INSTALL_DIR/system/conf/templates/FactoryConfiguration-VordelGateway.fed. If you do not specify a .fed file, you must back up this file before running the script.
-r, --replicationFactor Enter the Cassandra replication factor. For more details, see Configure the group keyspace and replication factor in Cassandra.
-H, --hosts Enter a comma-separated list of Cassandra host nodes in host:port format. For example, "127.0.0.1:9042,127.0.0.2:9042,127.0.0.3:9042". You can also enter hostnames as environment variables (for example, "\${env.CASS.HOST1}:9042,\${env.CASS.HOST2}:9042,\${env.CASS.HOST3}:9042").
--passphrase=PASSPHRASE Enter the encryption passphrase for the API Gateway group if required.
--passphrasePrompt Specify this option to prompt for the encryption passphrase for the API Gateway group. This is disabled by default.
-K, --keyspace KEYSPACE Enter the Cassandra keyspace name. For more details, see Configure the group keyspace and replication factor in Cassandra.
-U, --user CASSANDRA_USER Enter the Cassandra user name if authentication is enabled.
-P CASSANDRA_PASSWORD Enter the Cassandra password if authentication is enabled.

For more details on automating API Gateway configuration, see the API Gateway DevOps Deployment Guide.

Further details

For details on Cassandra administration, see Perform essential Cassandra operations.

For more details on Cassandra cluster configuration and its impact on your system, see the following:
