stacksaga-cassandra-support
`stacksaga-cassandra-support` is one of the StackSaga database-support (event-store) implementations shown in the architectural diagram.
It is used in the orchestrator service alongside the starter dependency when the orchestrator service uses Cassandra as its primary database. This library provides all the facilities the StackSaga engine needs to access the Cassandra database.
Here is how you can add the library to your existing orchestrator application as a dependency.

Adding `stacksaga-cassandra-support` as a dependency:

```xml
<dependency>
    <groupId>org.stacksaga</groupId>
    <artifactId>stacksaga-cassandra-support</artifactId>
    <version>${org.stacksaga.version}</version>
</dependency>
```
How the transaction data is stored
As mentioned above, the primary responsibility of the Cassandra database support implementation is to provide the facilities for saving the transaction data in the event store. The approach to saving transaction data is quite different from the SQL support implementations due to the nature of the database: data modeling, denormalized schema, clustering, partitioning, etc.
All the transactions are saved in four sets of tables in the Cassandra database.

- `es_transaction` table
- `es_transaction_tryout` table
- `transaction_recovery_retention` tables:
  - `es_{region_name}_transaction_recovery_retention_1` table
  - `es_{region_name}_transaction_recovery_retention_2` table
  - `es_{region_name}_transaction_recovery_retention_3` table
- `es_transaction_retry_retention` tables:
  - `es_{region_name}_transaction_retry_retention_1` table
  - `es_{region_name}_transaction_retry_retention_2` table
  - `es_{region_name}_transaction_retry_retention_3` table
es_transaction table
`es_transaction` is the main table in which each transaction's metadata is saved in the event store, using the transaction-Id as the partition key.
All the transactions are saved in this table as single-row partitions.
The single-row-partition approach makes it possible to handle millions of transactions without creating database hotspots on the Cassandra nodes, because the transactions are distributed efficiently across all the available nodes, as shown below.
Even though the single-row-partition approach helps to overcome the database hotspot problem, the data cannot be fetched from the database without knowing the exact transaction key.
This makes it difficult to collect the transactions that should be retried from the entire table.
That is discussed in the cassandra-stacksaga-agent section.
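Because the transaction-Id is the partition key, reading a single transaction is an efficient point lookup, while a status-based scan of the whole table is not. The following is a minimal, purely illustrative sketch of that access pattern using the DataStax Java driver; the keyspace and column names are assumptions rather than the library's actual schema, and `stacksaga-cassandra-support` performs this kind of access internally for you.

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;

public class TransactionLookupSketch {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder()
                .withKeyspace("stacksaga_event_store") // hypothetical keyspace name
                .build()) {

            // Efficient: the exact partition key (the transaction-Id) is known,
            // so the read is routed directly to the node that owns the partition.
            Row row = session.execute(
                    "SELECT * FROM es_transaction WHERE transaction_id = ?",
                    "tx-8dKz7LpJ2Q").one();

            System.out.println(row != null ? "transaction found" : "transaction not found");
        }
    }
}
```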
es_transaction_tryout table
`es_transaction_tryout` is the table in which each transaction's tryout data is saved, using the transaction-Id as the partition key.
Transaction tryouts are saved under the transaction-Id using a multi-row-partition approach.
Because the tryout data is partitioned by the transaction-Id, it is stored on the same node as the transaction itself.
This helps to optimize network latency, since all the data (`es_transaction` and `es_transaction_tryout`) goes to the same node.
As you can see in the above diagram, the tryout data is saved on the same node as the transaction.
If the `tx-8dKz7LpJ2Q` transaction metadata is saved on the `Node-1` node, the `tx-8dKz7LpJ2Q` transaction tryout data is also saved on the `Node-1` node.
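As a rough sketch of the multi-row-partition layout (again with assumed keyspace and column names, not the library's actual schema), all tryout rows of one transaction can be read with a single partition-scoped query, served by the same node that stores the `es_transaction` row:

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;

public class TryoutLookupSketch {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder()
                .withKeyspace("stacksaga_event_store") // hypothetical keyspace name
                .build()) {

            // Multi-row partition: every tryout row of tx-8dKz7LpJ2Q lives in the
            // same partition, so this query touches only the node owning that partition.
            ResultSet tryouts = session.execute(
                    "SELECT * FROM es_transaction_tryout WHERE transaction_id = ?",
                    "tx-8dKz7LpJ2Q");

            int count = 0;
            for (Row tryout : tryouts) {
                count++;
            }
            System.out.println("tryout rows for tx-8dKz7LpJ2Q: " + count);
        }
    }
}
```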
transaction_recovery_retention tables.
Unlike relational databases, transactions in Cassandra cannot be identified based on a field like status due to the way data is distributed across multiple nodes using a partition key. This distribution prevents bulk filtering of transactions directly from the transaction table.
Why Is Bulk Filtering Not Possible in Cassandra?

- Partition-Based Distribution: Cassandra shards data across multiple nodes, making global queries on large datasets inefficient.
- No Centralized Indexing for Status-Based Queries: Since transactions are stored based on partition keys, retrieving missing transactions requires querying multiple partitions, which is not feasible for large-scale systems (illustrated below).
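To make the limitation concrete, here is a hedged sketch of the kind of query that cannot be served efficiently. The `execution_status` column is hypothetical; the point is that filtering on anything other than the partition key would require touching every partition, so Cassandra rejects the query unless `ALLOW FILTERING` is explicitly added.

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.servererrors.InvalidQueryException;

public class BulkFilteringLimitationSketch {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder()
                .withKeyspace("stacksaga_event_store") // hypothetical keyspace name
                .build()) {
            try {
                // Filtering on a non-key column without an index: Cassandra refuses
                // this because it would mean a cluster-wide scan over all partitions.
                session.execute(
                        "SELECT * FROM es_transaction WHERE execution_status = 'INCOMPLETE'");
            } catch (InvalidQueryException e) {
                System.out.println("Rejected as expected: " + e.getMessage());
            }
        }
    }
}
```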
Alternative Mechanism to Identify Missing Transactions.
Since transactions cannot be filtered in bulk due to Cassandra’s partition-based distribution, an alternative mechanism is required to track missing transactions effectively.
- Use separate recovery tables
  - Instead of relying on status-based filtering, transactions that require tracking should be stored in separate recovery tables.
  - These tables can act as a centralized reference for identifying transactions that may not have completed.
- Time-Based Retention Mechanism
  - Each transaction is retained for a specific time window after being added to the transaction table.
  - If a transaction fails to complete within this configured time frame, it is considered missing and is exposed for re-invoking.
- Agent-Based Recovery Mechanism
  - Deploy an agent service that collects pending transactions from the `transaction_recovery_retention` tables.
  - The agent then shares the missing transactions with the orchestrator service for re-invoking.
As described earlier, when a transaction is initialized, its data is stored in one of the `transaction_recovery_retention` tables.
If the transaction completes successfully, its data is removed from the table upon process completion.
However, if a transaction does not reach completion, its data remains in the `transaction_recovery_retention` table, making it a potential candidate for recovery.
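Conceptually, the retention mechanism is just two writes around the transaction's life: an insert into one of the retention tables when the transaction is initialized, and a delete when it completes. The sketch below only illustrates that idea; the table and column names are assumptions, and the library performs these writes internally.

```java
import com.datastax.oss.driver.api.core.CqlSession;

public class RecoveryRetentionLifecycleSketch {

    private final CqlSession session;

    public RecoveryRetentionLifecycleSketch(CqlSession session) {
        this.session = session;
    }

    /** Called when the transaction is initialized: it becomes a recovery candidate. */
    void markForRecovery(String transactionId, int retentionTableIndex) {
        session.execute(
                "INSERT INTO transaction_recovery_retention_" + retentionTableIndex
                        + " (transaction_id) VALUES (?)",
                transactionId);
    }

    /** Called when the transaction completes its journey: the agent will never see it. */
    void clearRecovery(String transactionId, int retentionTableIndex) {
        session.execute(
                "DELETE FROM transaction_recovery_retention_" + retentionTableIndex
                        + " WHERE transaction_id = ?",
                transactionId);
    }
}
```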
Why Are There Three `transaction_recovery_retention` Tables?
The reason for using three tables is to allocate one table for each scheduled recovery time. This ensures that transactions are efficiently managed and recovered based on a predefined schedule.
Scheduled Time and Table Allocation
If the recovery retention time is 480 minutes, the system schedules recovery at:
1. 00:00 → transaction_recovery_retention_1
2. 08:00 → transaction_recovery_retention_2
3. 16:00 → transaction_recovery_retention_3
Each table stores the transactions that fall within its respective time frame. Because Cassandra has no feature for scanning data within a specific time frame, this approach helps to collect the transactions efficiently.
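The three schedule times follow directly from the configured retention period: with a 480-minute retention and three tables, the day is divided into three equal windows. A small illustrative calculation:

```java
import java.time.Duration;
import java.time.LocalTime;

public class RecoveryScheduleTimes {
    public static void main(String[] args) {
        Duration retention = Duration.ofMinutes(480); // configured recovery retention: 8 hours
        int tableCount = 3;
        for (int i = 0; i < tableCount; i++) {
            // prints: table 1 -> 00:00, table 2 -> 08:00, table 3 -> 16:00
            LocalTime scheduleTime = LocalTime.MIDNIGHT.plus(retention.multipliedBy(i));
            System.out.printf("transaction_recovery_retention_%d -> %s%n", i + 1, scheduleTime);
        }
    }
}
```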
Why Are Tables Created Per Region?
In a multi-region deployment, each service-agent operates within a specific region and should only access data within its assigned region.
For example, if a database is used in two different regions (`us-east-1` and `us-west-1`), the number of tables doubles, resulting in six tables:

Tables for the `us-east-1` Region

- `us_east_1_transaction_recovery_retention_1` (00:00)
- `us_east_1_transaction_recovery_retention_2` (08:00)
- `us_east_1_transaction_recovery_retention_3` (16:00)

Tables for the `us-west-1` Region

- `us_west_1_transaction_recovery_retention_1` (00:00)
- `us_west_1_transaction_recovery_retention_2` (08:00)
- `us_west_1_transaction_recovery_retention_3` (16:00)
Key Benefits of This Approach

- Efficient Recovery Processing: Instead of scanning a massive dataset, transactions are categorized based on scheduled recovery times.
- Regional Isolation for Faster Access: Service-agents in a specific region only access data within their region, reducing unnecessary cross-region queries.
- Scalability: The system can scale to handle multiple regions independently without affecting transaction recovery in other locations.
Identifying Missing Transactions
Consider a scenario where 1,000 transactions are executed within a specified time period across multiple nodes.
Ideally, all transactions that complete their journey are removed from the `transaction_recovery_retention` table.
However, if even one transaction fails to complete, it will persist in the table.
This could happen due to various reasons such as system failures, network issues, or unexpected interruptions.
After the configured time period, the transaction is exposed to the agent application, which then collects the missing transactions from the respective `transaction_recovery_retention` table and shares them with the available orchestrator services for re-invoking.
`transaction_recovery_retention` table selection formula
As mentioned above, there are 3 `transaction_recovery_retention` tables for holding the transactions temporarily.
In the Cassandra implementation, the transaction recovery retention time is not fixed as it is in the other database implementations.
The transaction recovery retention time oscillates within a range.
For instance, if you configure the transaction recovery retention time to be 8 hours, the effective retention time will be within the range of 4 hours to 12 hours.
How does that happen?
If the transaction recovery retention time is set to 8 hours (480 minutes), the 3 schedulers are triggered in a round-robin manner for collecting and re-invoking the missing transactions, as the diagram shows:

- 1st schedule at 00:00
- 2nd schedule at 08:00
- 3rd schedule at 16:00
To determine whether a transaction has sufficient time to complete its journey, the system uses the middle time of the configured duration as the boundary point. This boundary helps classify transactions into different recovery schedules.
Transaction Placement Logic

- Transactions Behind the Boundary (Back of the Boundary)
  - These transactions were initialized before the boundary point.
  - They are scheduled for the next upcoming recovery schedule.
- Transactions Ahead of the Boundary
  - These transactions were initialized after the boundary point.
  - They are scheduled for the recovery schedule after the next upcoming schedule, to allow more time for completion.

The middle time is considered the boundary point for determining whether a transaction has sufficient time to complete its journey. The transactions that are behind the boundary go to the next upcoming schedule, and the transactions that are ahead of the boundary go to the schedule after the next upcoming schedule.
| Transaction | Initialization Time | Time Until Next Schedule (16:00) | Position Relative to Boundary | Recovery Schedule |
|---|---|---|---|---|
| T4 | 11:30 | 4 hours 30 minutes | Back of the boundary | Next upcoming schedule |
| T5 | 12:30 | 3 hours 30 minutes | Ahead of the boundary | Schedule after the next |
| T6 | 15:59 | 1 minute | Ahead of the boundary | Schedule after the next |
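The placement rule shown in the table can be expressed as a small selection function: find the schedule window the initialization time falls into, compare it with the window's midpoint (the boundary), and map the chosen schedule onto the three tables in round-robin order. The following sketch only illustrates the formula described above; it is not the library's actual implementation.

```java
import java.time.Duration;
import java.time.LocalTime;

public class RecoveryTableSelectionSketch {

    /**
     * Returns the 1-based index (1..3) of the retention table whose schedule
     * the transaction will be exposed to.
     */
    static int selectTable(LocalTime initTime, Duration retention) {
        long retentionSec = retention.getSeconds();
        long initSec = initTime.toSecondOfDay();

        long windowIndex = initSec / retentionSec;                        // current schedule window
        long boundarySec = windowIndex * retentionSec + retentionSec / 2; // middle of the window

        // Back of the boundary -> next upcoming schedule;
        // ahead of the boundary -> the schedule after the next one.
        long targetSchedule = (initSec < boundarySec) ? windowIndex + 1 : windowIndex + 2;

        return (int) (targetSchedule % 3) + 1;                            // three tables, round-robin
    }

    public static void main(String[] args) {
        Duration retention = Duration.ofMinutes(480);
        // T4 (11:30) is behind the 12:00 boundary -> 16:00 schedule (table 3)
        System.out.println(selectTable(LocalTime.of(11, 30), retention));
        // T5 (12:30) and T6 (15:59) are ahead of the boundary -> 00:00 schedule (table 1)
        System.out.println(selectTable(LocalTime.of(12, 30), retention));
        System.out.println(selectTable(LocalTime.of(15, 59), retention));
    }
}
```

With the 8-hour retention, a transaction is therefore exposed to the agent somewhere between 4 and 12 hours after initialization, which is exactly the oscillation range mentioned above.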
Handling False Positives in Transaction Recovery & the Role of Idempotency
While transactions that remain in the `transaction_recovery_retention` table are generally considered missing, this is not always the case.

Scenario: Transactions Delayed but Not Missing

Consider a situation where a transaction is still in the queue, waiting for execution because the respective orchestrator service is too busy. In this case:

- The system mistakenly assumes the transaction is missing, since it has not been removed from the `transaction_recovery_retention` table within the expected time frame.
- As a result, the system triggers a recovery process, re-invoking the transaction.
- This can lead to the transaction (or certain atomic executions) being executed multiple times, causing unintended duplicate operations.

This is one of the possible ways that transactions can be executed multiple times. To prevent these kinds of unintended duplicate executions, idempotency should be implemented at the atomic-execution level of the transaction.
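One common way to get idempotency at the atomic-execution level is to record an execution marker with a Cassandra lightweight transaction and skip the work if the marker already exists. This is a generic, hedged sketch (the `execution_marker` table and its columns are hypothetical, not part of StackSaga's API):

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;

public class IdempotentExecutionSketch {

    private final CqlSession session;

    public IdempotentExecutionSketch(CqlSession session) {
        this.session = session;
    }

    /**
     * Runs the given atomic execution at most once per (transactionId, executionName),
     * even if the recovery or retry mechanism re-invokes the transaction.
     */
    void executeOnce(String transactionId, String executionName, Runnable atomicExecution) {
        // Lightweight transaction: the insert is applied only if no marker exists yet.
        ResultSet rs = session.execute(
                "INSERT INTO execution_marker (transaction_id, execution_name) "
                        + "VALUES (?, ?) IF NOT EXISTS",
                transactionId, executionName);

        if (rs.wasApplied()) {
            atomicExecution.run();   // first invocation: do the real work
        }
        // else: a previous invocation already performed this atomic execution; skip it.
    }
}
```

In practice the marker write and the side effect still need careful coordination (for example, making the side effect itself naturally idempotent or verifiable); the sketch only shows the general idea of rejecting a second invocation.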
transaction_retry_retention tables.
In StackSaga, asynchronous transaction retrying is an essential feature that ensures transactions are retried when they fail due to resource unavailability. When a transaction's execution fails with a resource-unavailability problem, the transaction is retried asynchronously by the service-agent, which collects such transactions from the event store. Unlike SQL databases, Cassandra does not support centralized indexing for bulk filtering. This makes it impossible to scan transactions by their states. More details on this limitation can be found here.
To overcome this, transaction retrying follows the same table-switching mechanism that is used for transaction recovery.
When a transaction is stopped due to resource unavailability, it is stored in one of the `transaction_retry_retention` tables.
It will be exposed to the service-agent at the upcoming retry schedule.
The only difference between transaction recovery retention and transaction retry retention, in the way transactions are scheduled and exposed to the agent, is that the retry delay time is shorter than the recovery retention time.
`transaction_retry_retention` table selection formula
For example, if you configure the transaction retry retention time to be 2 minutes, the effective retry retention time will be within the range of 1 minute to 3 minutes.
How does that happen?
If the transaction retry retention time is set to 2 minutes, the 3 schedulers are triggered every 2 minutes in a round-robin manner throughout the day for collecting and re-invoking the temporarily stopped transactions, as the diagram shows:

- 1st schedule at 00:00 (`transaction_retry_retention_1`)
- 2nd schedule at 00:02 (`transaction_retry_retention_2`)
- 3rd schedule at 00:04 (`transaction_retry_retention_3`)
- 4th schedule at 00:06 (`transaction_retry_retention_1`)
- 5th schedule at 00:08 (`transaction_retry_retention_2`)
- and so on…
To determine whether a transaction has had sufficient waiting time before being re-executed, the system uses the middle time of the configured duration as the boundary point. This boundary helps classify transactions into different retry schedules.
Transaction Placement Logic

- Transactions Behind the Boundary (Back of the Boundary)
  - These transactions were stopped before the boundary point.
  - They are scheduled for the next upcoming retry schedule.
- Transactions Ahead of the Boundary
  - These transactions were stopped after the boundary point.
  - They are scheduled for the retry schedule after the next upcoming schedule, to allow more waiting time.

The middle time is considered the boundary point for determining whether a transaction has had sufficient waiting time before being re-executed. The transactions that are behind the boundary go to the next upcoming schedule, and the transactions that are ahead of the boundary go to the schedule after the next upcoming schedule.
| Transaction | Stopped Time | Time Until Next Schedule (00:04) | Position Relative to Boundary | Retry Schedule |
|---|---|---|---|---|
| T4 | 00:02:50 | 1 minute and 10 seconds | Back of the boundary | Next upcoming schedule |
| T5 | 00:03:08 | 52 seconds | Ahead of the boundary | Schedule after the next |
| T6 | 00:03:58 | 2 seconds | Ahead of the boundary | Schedule after the next |
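Applying the same selection sketch shown earlier for recovery retention, but with a 2-minute retention period, reproduces the assignments in the table above (purely illustrative):

```java
import java.time.Duration;
import java.time.LocalTime;

public class RetryTableSelectionExample {
    public static void main(String[] args) {
        // Same boundary logic as recovery retention, just with a 2-minute period.
        Duration retryRetention = Duration.ofMinutes(2);

        // T4 stopped at 00:02:50, behind the 00:03:00 boundary
        //   -> next schedule at 00:04 (transaction_retry_retention_3)
        System.out.println(RecoveryTableSelectionSketch.selectTable(LocalTime.of(0, 2, 50), retryRetention));

        // T5 (00:03:08) and T6 (00:03:58), ahead of the boundary
        //   -> schedule after the next at 00:06 (transaction_retry_retention_1)
        System.out.println(RecoveryTableSelectionSketch.selectTable(LocalTime.of(0, 3, 8), retryRetention));
        System.out.println(RecoveryTableSelectionSketch.selectTable(LocalTime.of(0, 3, 58), retryRetention));
    }
}
```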