Cassandra Agent

stacksaga-agent-cassandra-starter

stacksaga-agent-cassandra-starter is one of the StackSaga Agent implementations; it supports Cassandra-based orchestrator services by retrying their transactions. stacksaga-agent-cassandra-starter is a ready-to-use dependency: create your own Spring Boot project, add stacksaga-agent-cassandra-starter as a dependency, and run the application with a few configurations.

Adding stacksaga-agent-cassandra-starter as a dependency
<dependency>
    <groupId>org.stacksaga</groupId>
    <artifactId>stacksaga-agent-cassandra-starter</artifactId>
    <version>${org.stacksaga.version}</version>
</dependency>

After adding the dependency, update the configuration properties of the application as needed.
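
For instance, the agent application itself can be a minimal Spring Boot application. The sketch below assumes standard Spring Boot conventions; the package and class names are illustrative, not part of the starter.

package com.example.agent; // hypothetical package name

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// A minimal agent application; the starter's auto-configuration provides the agent behavior.
@SpringBootApplication
public class OrderServiceAgentApplication {
    public static void main(String[] args) {
        SpringApplication.run(OrderServiceAgentApplication.class, args);
    }
}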

Profiles

Based on your environment, you can choose one of the profiles in stacksaga-agent-cassandra-starter. There are two profiles as follows:

  1. eureka - Eureka based environment

  2. k8s - Kubernetes based environment

eureka Profile

If your application is deployed in a Eureka environment, use the eureka profile when deploying the agent application. Under the eureka profile, the agent nodes can be scaled horizontally as required. A stacksaga-agent-cassandra-starter node can act as the leader or as a follower. If there are multiple agent nodes in a region, one node should be deployed as the leader and the other nodes as followers. The leader node is responsible for updating the token ranges based on the number of running nodes.

Agent-service as Leader and Follower

As mentioned above, after adding stacksaga-agent-cassandra-starter as a dependency, the application can be configured either as a leader or as a follower. Let’s see what the configuration looks like for both; complete example sketches follow the list.

  • Leader Instance configuration:

    stacksaga.agent.cassandra.eureka.instance-type=LEADER (1)
    eureka.instance.instance-id=order-service-agent-us-east-leader (2)
    1 Set the instance-type to LEADER to run the node as the leader.
    2 Set the Eureka instance ID to a fixed (static) value.
    It is recommended to use the following format for the leader instance ID.
    Format: ${service-name}-agent-${region}-leader
    Using the service name in the leader instance ID avoids collisions if you are using the same event-store for multiple services, because the followers identify the leader instance in the database by its instance ID. Adding the region to the leader instance ID guarantees region-based uniqueness.
  • Follower Instance configuration:

    stacksaga.agent.cassandra.eureka.instance-type=FOLLOWER (1)
    stacksaga.agent.cassandra.eureka.follower.leader-id=order-service-agent-us-east-leader (2)
    eureka.instance.instance-id=${spring.application.name}:${random.uuid} (3)
    1 Set the instance-type as the FOLLOWER.
    2 Set the leader’s static ID. This value must exactly match the leader’s ID configured on the leader node in the same region.
    3 Set the instance-id as a random ID.
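
For reference, a complete application.properties for each role might look like the following sketch. The service name, region, host names, and keyspace are illustrative values, not defaults.

# Leader instance (illustrative values)
spring.application.name=order-service-agent
spring.profiles.active=eureka
stacksaga.agent.cassandra.eureka.instance-type=LEADER
eureka.instance.instance-id=order-service-agent-us-east-leader
stacksaga.agent.cassandra.target-service=order-service
spring.cassandra.keyspace-name=order_service_event_store

# Follower instance (illustrative values)
spring.application.name=order-service-agent
spring.profiles.active=eureka
stacksaga.agent.cassandra.eureka.instance-type=FOLLOWER
stacksaga.agent.cassandra.eureka.follower.leader-id=order-service-agent-us-east-leader
eureka.instance.instance-id=${spring.application.name}:${random.uuid}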

Token range allocation for nodes

All agent applications are registered with the Eureka server in a Eureka environment, so the leader service has the details of all other agent instances through the Eureka server. The leader periodically checks for instance changes based on the local Eureka service-registry cache and updates the database with the relevant token range for each instance. The position of each instance is sorted based on the instance’s start time. For instance, if there are five StackSaga-agent instances in the cluster, the token range is divided with the help of the Murmur3 partitioner as follows:

How token range is shared with the available agents in Eureka

Steps:

1 The leader node uses the Eureka client’s cache to get the list of all instances in the region. (It can be a single Eureka server or peers.)
2 The leader node periodically calculates the range for each instance based on their timestamps, and the updated ranges are sent to each node.
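
As a rough illustration of the idea only (not StackSaga’s exact allocation code), evenly splitting the full Murmur3 token space (-2^63 to 2^63 - 1) among N instances could look like this:

import java.math.BigInteger;

// A sketch: divide the Murmur3 token space into N contiguous ranges,
// one per agent instance. StackSaga's actual allocation logic may differ.
public class TokenRangeSketch {

    record Range(long start, long end) { }

    static Range[] split(int instanceCount) {
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        // Total number of tokens: 2^64 (needs BigInteger to avoid overflow).
        BigInteger span = max.subtract(min).add(BigInteger.ONE);
        Range[] ranges = new Range[instanceCount];
        for (int i = 0; i < instanceCount; i++) {
            BigInteger start = min.add(span.multiply(BigInteger.valueOf(i))
                    .divide(BigInteger.valueOf(instanceCount)));
            BigInteger end = min.add(span.multiply(BigInteger.valueOf(i + 1))
                    .divide(BigInteger.valueOf(instanceCount))).subtract(BigInteger.ONE);
            ranges[i] = new Range(start.longValueExact(), end.longValueExact());
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Five agent instances, ordered by start time, each take one range.
        for (Range r : split(5)) {
            System.out.println(r.start() + " .. " + r.end());
        }
    }
}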

k8s Profile

When the StackSaga agent is deployed in a Kubernetes environment, the deployment architecture is slightly different from the Eureka environment. In Kubernetes, the nodes are deployed as a StatefulSet. The reason for using a StatefulSet is that each node calculates its own token range based on its position (the ordinal index of the pod) and the total number of nodes. All nodes continuously watch their StatefulSet for changes in real time. If an instance goes down or is added, all nodes are notified of the update in real time and then recalculate their token ranges accordingly.
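
For intuition, here is a sketch of how a pod might derive its index from the StatefulSet pod name and claim its slice of the token space. This is illustrative only; StackSaga’s internal logic may differ.

// A sketch: StatefulSet pods are named <statefulset>-0, <statefulset>-1, ...
// so each pod can derive its ordinal index from its own host name.
public class PodOrdinalSketch {

    static int ordinalFromHostname(String hostname) {
        // e.g. "your-app-2" -> 2
        int idx = hostname.lastIndexOf('-');
        return Integer.parseInt(hostname.substring(idx + 1));
    }

    public static void main(String[] args) {
        String hostname = System.getenv().getOrDefault("HOSTNAME", "your-app-0");
        int ordinal = ordinalFromHostname(hostname);
        int replicas = 3; // in practice, observed by watching the StatefulSet
        System.out.printf("Pod %d of %d takes slice %d of the token space%n",
                ordinal, replicas, ordinal);
    }
}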

Deploy stacksaga-agent-cassandra in the Kubernetes environment

First, you have to create a service account, because stacksaga-agent-cassandra accesses the Kubernetes API under the k8s profile. Then create a role and bind it to the created service account as follows; apply these manifests before deploying the StatefulSet.

ServiceAccount Manifest
apiVersion: v1
kind: ServiceAccount
metadata:
  name: stacksaga-agent-cassandra-service-account #the name of the service account.
  namespace: default #the namespace the application is deployed.
ClusterRole Manifest
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  namespace: default
  name: stacksaga-agent-cassandra-access
rules:
  # Grant read access to pods
  - apiGroups: [ "" ]
    resources: [ "pods" ]
    verbs: [ "get", "list", "watch" ]
  # Grant access to watch StatefulSets
  - apiGroups: [ "apps" ]
    resources: [ "statefulsets" ]
    verbs: [ "watch", "get", "list" ]
  # Grant access to nodes
  - apiGroups: [ "" ]
    resources: [ "nodes" ]
    verbs: [ "get", "list" ]
ClusterRoleBinding Manifest
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: stacksaga-agent-cassandra-access-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: stacksaga-agent-cassandra-service-account
    namespace: default
roleRef:
  kind: ClusterRole
  name: stacksaga-agent-cassandra-access
  apiGroup: rbac.authorization.k8s.io

Then create the StatefulSet to deploy the agent service.

StatefulSet Manifest
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: your-app
spec:
  serviceName: "your-app"
  replicas: 3
  selector:
    matchLabels:
      app: your-app
  template:
    metadata:
      labels:
        app: your-app
    spec:
      serviceAccountName: stacksaga-agent-cassandra-service-account #assign the service-account
      containers:
        - name: your-app-container
          image: your-app-image:latest
          ports:
            - containerPort: 8080
Headless Service Manifest
apiVersion: v1
kind: Service
metadata:
  name: your-app
spec:
  clusterIP: None
  selector:
    app: your-app
  ports:
    - port: 8080
      name: http
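
With the manifests applied, the agent’s configuration under the k8s profile might look like the following sketch (the service name, namespace, and keyspace are illustrative values; the properties are documented below):

spring.application.name=order-service-agent
spring.profiles.active=k8s
stacksaga.agent.cassandra.target-service=order-service
stacksaga.agent.cassandra.k8s.namespace=default
spring.cassandra.keyspace-name=order_service_event_store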

StackSaga Cassandra-Agent Configuration Properties

The StackSaga Cassandra agent supports both Eureka-based and Kubernetes-based environments. There is a list of configuration properties common to both, as well as some configuration properties specific to Eureka and to Kubernetes.

Common Configuration Properties

Each property below is listed with its type and default value, followed by its description.

spring.profiles.active (String, default: none)
    There are two profiles available: eureka and k8s. Choose one of them based on the deployment environment.

spring.application.name (String, default: none)
    The name of the agent application.

server.port (int, default: 8080)
    The port of the agent service.

NOTE: Because StackSaga Cassandra-Agent internally uses the Spring primary datasource, it can be configured in the same way that Spring Boot provides; the prefix is spring.cassandra.
Example: spring.cassandra.keyspace-name=order_service_event_store
To configure the Cassandra database, it is recommended to provide a separate configuration (.conf) file, like below:
spring.cassandra.config=classpath:stacksaga-cassandra.conf
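
For example, a minimal connection configuration through Spring Boot’s standard spring.cassandra properties might look like this (the contact point, datacenter, and keyspace names are illustrative):

spring.cassandra.contact-points=cassandra-host:9042
spring.cassandra.local-datacenter=datacenter1
spring.cassandra.keyspace-name=order_service_event_store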

stacksaga.agent.cassandra.target-service (String, default: none)
    The name of the target orchestrator service.

stacksaga.agent.cassandra.target-service-host (String, default: none)
    The host name at which the target service can be reached. (It can be a host inside the VPC or another external one.) The transactions are fetched from the event-store based on this name.

stacksaga.agent.cassandra.act-leader-as-follower (boolean, default: true)
    Whether the leader service also acts as a follower. If the cluster is small, you can run a single instance that acts as the leader as well as a follower.

Communication Thread-Pool: It is responsible for the communication that hands the transactions over to the target orchestrator services.

stacksaga.agent.cassandra.thread-pool.communication.core-size (int, default: available processors * 5)
    Core number of threads in the communication pool.

stacksaga.agent.cassandra.thread-pool.communication.max-size (int, default: available processors * 50)
    Maximum number of threads in the communication pool.

stacksaga.agent.cassandra.thread-pool.communication.allow-core-thread-timeout (boolean, default: true)
    Whether core threads are allowed to time out. This enables dynamic growing and shrinking of the communication pool.

stacksaga.agent.cassandra.thread-pool.communication.queue-capacity (int, default: Integer.MAX_VALUE)
    Queue capacity. An unbounded capacity does not increase the pool and therefore ignores the max-size property.

stacksaga.agent.cassandra.thread-pool.communication.keep-alive (Duration, default: 60 seconds)
    Time limit for which threads may remain idle before being terminated.

Recovery Thread-Pool: It is responsible for processing the transactions that should be recovered.

stacksaga.agent.cassandra.thread-pool.recovery.size (int, default: available processors * 5)
    Number of threads in the recovery pool.

stacksaga.agent.cassandra.thread-pool.recovery.allow-core-thread-timeout (boolean, default: true)
    Whether core threads are allowed to time out. This enables dynamic growing and shrinking of the recovery pool.

stacksaga.agent.cassandra.thread-pool.recovery.queue-capacity (int, default: Integer.MAX_VALUE)
    Queue capacity. An unbounded capacity does not increase the pool.

stacksaga.agent.cassandra.thread-pool.recovery.keep-alive (Duration, default: 60 seconds)
    Time limit for which threads may remain idle before being terminated.

stacksaga.agent.cassandra.recovery.batch-size (int, default: 1000)
    How much data should be fetched from the database at a time for recovering as a bulk.

stacksaga.agent.cassandra.recovery.delay-in-minutes (int, default: 30)
    How often the recovery task should be executed, in minutes.

Retry Thread-Pool: It is responsible for processing the transactions that should be retried.

stacksaga.agent.cassandra.thread-pool.retry.size (int, default: available processors * 5)
    Number of threads in the retry pool.

stacksaga.agent.cassandra.thread-pool.retry.allow-core-thread-timeout (boolean, default: true)
    Whether core threads are allowed to time out. This enables dynamic growing and shrinking of the retry pool.

stacksaga.agent.cassandra.thread-pool.retry.queue-capacity (int, default: Integer.MAX_VALUE)
    Queue capacity. An unbounded capacity does not increase the pool.

stacksaga.agent.cassandra.thread-pool.retry.keep-alive (Duration, default: 60 seconds)
    Time limit for which threads may remain idle before being terminated.

stacksaga.agent.cassandra.retry.batch-size (int, default: 1000)
    How much data should be fetched from the database at a time for retrying as a bulk.

stacksaga.agent.cassandra.retry.delay-in-minutes (int, default: 2)
    How often the retry task should be executed, in minutes.
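
As an illustration, tuning the pools and schedules in application.properties might look like this sketch (the values are arbitrary examples, not recommendations):

stacksaga.agent.cassandra.thread-pool.communication.core-size=20
stacksaga.agent.cassandra.thread-pool.communication.max-size=100
stacksaga.agent.cassandra.thread-pool.retry.size=10
stacksaga.agent.cassandra.retry.batch-size=500
stacksaga.agent.cassandra.retry.delay-in-minutes=5
stacksaga.agent.cassandra.recovery.delay-in-minutes=60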

Eureka Profile Configuration Properties

If you are in a Eureka environment, you have to configure the following properties.

stacksaga.agent.cassandra.eureka.instance-type (InstanceType [enum], default: none)
    The type of this agent: LEADER or FOLLOWER. As per the architecture, one leader instance should be deployed per region.

If the instance is deployed as the leader, the following configuration properties should be provided.

eureka.instance.instance-id (String, default: none)
    If the instance is the leader, the instance-id should be a static value. The recommended pattern is ${service-name}-agent-${region}-leader.

stacksaga.agent.cassandra.eureka.leader.communication-pool.core-size (int, default: AvailableProcessors())
    The core size of the thread-pool used for communicating with the followers.

stacksaga.agent.cassandra.eureka.leader.communication-pool.max-size (int, default: AvailableProcessors())
    The maximum size of the thread-pool used for communicating with the followers.

stacksaga.agent.cassandra.eureka.leader.communication-pool.keep-alive (long, default: AvailableProcessors())
    Time limit for which threads may remain idle before being terminated.

stacksaga.agent.cassandra.eureka.leader.token-range-update-interval (long, default: 10 * 60 * 1000 ms = 10 minutes)
    How often the leader updates the token ranges based on the available instance data.

stacksaga.agent.cassandra.eureka.leader.token-range-valid-duration (long, default: tokenRangeUpdateInterval + 5 minutes = 15 minutes)
    How long the token ranges remain valid. This validity time is sent to the follower instances along with the token range, and their executions are based on it. The extra 5 minutes are added to absorb possible network delays. The tokenRangeValidDuration must always be greater than tokenRangeUpdateInterval.
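
For example, assuming the values are given in milliseconds (as the defaults suggest), a leader that recalculates ranges every 5 minutes and keeps them valid for 10 minutes could be configured as:

stacksaga.agent.cassandra.eureka.leader.token-range-update-interval=300000
stacksaga.agent.cassandra.eureka.leader.token-range-valid-duration=600000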

If the instance is deployed as a follower, the following configuration properties should be provided.

stacksaga.agent.cassandra.eureka.follower.leader-id (String, default: none)
    If the instance is a follower, provide the leader’s static instance ID so that the follower can identify its leader.

Kubernetes Profile Configuration Properties

stacksaga.agent.cassandra.k8s.namespace (String, default: default)
    The namespace in which the application is deployed in the Kubernetes cluster.

stacksaga.agent.cassandra.k8s.zone-topology-name (String, default: topology.kubernetes.io/zone)
    The topology name of the zone in the Kubernetes cluster.

stacksaga.agent.cassandra.k8s.region-topology-name (String, default: topology.kubernetes.io/region)
    The topology name of the region in the Kubernetes cluster.