stacksaga-cassandra-support

stacksaga-cassandra-support is one of Stacksaga db support(event store support) implementations as per the architectural diagram.

stacksaga diagram stacksaga components database support

It is used for the orchestrator service along with the starter dependency if the orchestrator service uses the Cassandra database as their primary database. This library provides all the facilities for accessing the Cassandra database for the Stacksaga engine.

Here is the way that you can add the library into your existing orchestrator application as a dependency.

Adding stacksaga-cassandra-support as a dependency
<dependency>
    <groupId>org.stacksaga</groupId>
    <artifactId>stacksaga-cassandra-support</artifactId>
    <version>${org.stacksaga.version}</version>
</dependency>

How the transaction data is sorted

As mentioned above, the primary responsibility of the Cassandra database support implementation is providing the facilities for saving the transaction data on the event store. Saving the transaction data approach is quite different from the SQL support implementations due to the database’s nature like data modeling, Denormalized Schema, clustering, partitioning, etc.

All the transactions are saved in 4 tables in the Cassandra database.

  1. es_transaction table

  2. es_transaction_tryout table.

  3. transaction_recovery_retention tables.

    1. es_{region_name}_transaction_recovery_retention_1 table.

    2. es_{region_name}_transaction_recovery_retention_2 table.

    3. es_{region_name}_transaction_recovery_retention_3 table.

  4. es_transaction_retry_retention tables.

    1. es_{region_name}_transaction_retry_retention_1 table.

    2. es_{region_name}_transaction_retry_retention_2 table.

    3. es_{region_name}_transaction_retry_retention_3 table.

es_transaction table

es_transaction is the main table that transaction’s meta-data is saved in event store based on the transaction-Id as the Partition-Key. All the transactions are saved in this table as Single-Row Partition. Single-Row Partition approach refers to handling millions of transactions without making Database-Hotspot in your Cassandra database’s nodes. Because the transactions are shared withing the all available nodes efficiently like below.

StackSaga cassandra managing throughput

Even the Single-Row Partition helps to overcome the database hotspot problem, The data cannot be fetched from the database without knowing the exact transaction key. It makes a trouble for fetching the data that should be retried from all the transactions from the entire table. It is discussed in the cassandra-stacksaga-agent section.

es_transaction_tryout table

es_transaction_tryout is the table that transaction’s tryout-data is saved in based on the transaction-Id as the Partition-Key. transaction tryouts are saved under the transaction-Id with Multi-Row Partition approach. due to the tryout data is saved based on the transaction-Id, tryout data is also saved in the same node that the transaction has been saved. it helps to optimize the network latency, and all the data(es_transaction and es_transaction_tryout) goes to the same node.

StackSaga cassandra managing throughput

As you can see in the above diagram, the tryout data is saved in the same node that the transaction has been saved. If the tx-8dKz7LpJ2Q transaction metadata is saved in the Node-1 node, the tx-8dKz7LpJ2Q transaction tryout data is also saved in the Node-1 node.

transaction_recovery_retention tables.

Unlike relational databases, transactions in Cassandra cannot be identified based on a field like status due to the way data is distributed across multiple nodes using a partition key. This distribution prevents bulk filtering of transactions directly from the transaction table.

Why Bulk Filtering is Not Possible In Cassandra?

  • Partition-Based Distribution
    Cassandra shards data across multiple nodes, making global queries on large datasets inefficient.

  • No Centralized Indexing for Status-Based Queries
    Since transactions are stored based on partition keys, retrieving missing transactions requires querying multiple partitions, which is not feasible for large-scale systems.

Alternative Mechanism to Identify Missing Transactions.

Since transactions cannot be filtered in bulk due to Cassandra’s partition-based distribution, an alternative mechanism is required to track missing transactions effectively.

  1. Use separate recovery tables

    • Instead of relying on status-based filtering, transactions that require tracking should be stored in separate recovery tables.

    • These tables can act as a centralized reference for identifying transactions that may not have completed.

  2. Time-Based Retention Mechanism

    • Each transaction is retained for a specific time window after being added to the transaction table.

    • If a transaction fails to complete within this configured time frame, it is considered missing, and it will be exposed for re-invoking.

  3. Agent-Based Recovery Mechanism

    • Deploy an agent service that collects pending transactions from the transaction_recovery_retention table.

    • The agent then shares the missing transactions to the orchestrator service for re-invoke.

As described earlier, when a transaction is initialized, its data is stored in one of the transaction_recovery_retention tables. If the transaction completes successfully, its data is removed from the table upon process completion. However, if a transaction does not reach completion, its data remains in the transaction_recovery_retention table, making it a potential candidate for recovery.

Why Are There Three transaction_recovery_retention Tables?

The reason for using three tables is to allocate one table for each scheduled recovery time. This ensures that transactions are efficiently managed and recovered based on a predefined schedule.

Scheduled Time and Table Allocation

If the recovery retention time is 480 minutes, the system schedules recovery at: 1. 00:00transaction_recovery_retention_1 2. 08:00transaction_recovery_retention_2 3. 16:00transaction_recovery_retention_3

Each table stores transactions that fall within its respective time frame. Due to Cassandra has no feature for scanning data in a specific time frame, This approach helps to collect the transactions efficiently.

Why Are Tables Created Per Region?

In a multi-region deployment, each service-agent operates within a specific region and should only access data within its assigned region.

For example, if a database is used in two different regions (us-east-1 and us-west-1), the number of tables doubles, resulting in six tables:

Tables for us-east-1 Region

  • us_east_1_transaction_recovery_retention_1 (00:00)

  • us_east_1_transaction_recovery_retention_2 (08:00)

  • us_east_1_transaction_recovery_retention_3 (16:00)

Tables for us-west-1 Region

  • us_west_1_transaction_recovery_retention_1 (00:00)

  • us_west_1_transaction_recovery_retention_2 (08:00)

  • us_west_1_transaction_recovery_retention_3 (16:00)

Key Benefits of This Approach

  • Efficient Recovery Processing
    Instead of scanning a massive dataset, transactions are categorized based on scheduled recovery times.

  • Regional Isolation for Faster Access
    Service-agents in a specific region only access data within their region, reducing unnecessary cross-region queries.

  • Scalability
    The system can scale to handle multiple regions independently without affecting transaction recovery in other locations.

Identifying Missing Transactions::

Consider a scenario where 1,000 transactions are executed within a specified time period across multiple nodes. Ideally, all transactions that complete their journey are removed from the transaction_recovery_retention table. However, if even one transaction fails to complete, it will persist in the table. Then the transaction is exposed to the agent application after the configured time period, and then the agent application will collect the missing transactions from the respective transaction_recovery_retention table and shares them withing the available orchestrator service to re-invoke. This could happen due to various reasons such as system failures, network issues, or unexpected interruptions.

transaction_recovery_retention table selection formula.

As mentioned above, there are 3 transaction_recovery_retention tables for adding the transactions temporarily. In cassandra implementation, the Transaction recovery retention time is not a fixed one like in other database implementations. The Transaction recovery retention time is oscillated between a range.

stacksaga diagram stacksaga cassandra how transactions saved for recovery

Just imagine if you configure the Transaction recovery retention time to be 8 hours. The Transaction recovery retention time will be withing the range of 4 hours to 12 hours.

How does that happen?

If the Transaction recovery retention time has been mentioned as 8 hours(480 minutes), 3 schedulers can be triggered in the round robbin manner for collecting and re-invoke the missing transactions like the diagram shows.

  1. 1st schedule at: 00:00

  2. 2nd schedule at: 08:00

  3. 3rd schedule at: 16:00

To determine whether a transaction has sufficient time to complete its journey, the system uses the middle time of the configured duration as the boundary point. This boundary helps classify transactions into different recovery schedules.

Transaction Placement Logic

  1. Transactions Behind the Boundary (Back of the Boundary)

    • These transactions have already passed the boundary point.

    • They are scheduled for the next upcoming recovery schedule.

  2. Transactions Ahead of the Boundary

    • These transactions were initialized after the boundary point.

    • They are scheduled for the recovery schedule after the next upcoming schedule to allow more time for completion.

The middle time is considered as the boundary point for determining that the transaction has sufficient time to complete their Journey. The transactions that are at the back of the boundary go to the next upcoming schedule. And the transactions that are ahead of the boundary go to the schedule after the next upcoming schedule.

Transaction Initialization Time Duration Position Relative to Boundary Recovery Schedule

T4

11:30

4 hours 30 minutes

Back of the boundary

Next upcoming schedule

T5

12:30

3 hours 30 minutes

Ahead of the boundary

Schedule after the next

T6

15:59

1 minute

Ahead of the boundary

Schedule after the next

Handling False Positives in Transaction Recovery & the Role of Idempotency

While transactions that remain in the transaction_recovery_retention table are generally considered missing, this is not always the case.

Scenario: Transactions Delayed but Not Missing

Consider a situation where a transaction is still in the queue, waiting for execution because the respective orchestrator service is too busy. In this case:

  1. The system mistakenly assumes the transaction is missing since it has not been removed from the transaction_recovery_retention table within the expected time frame.

  2. As a result, the system triggers a recovery process, re-invoking the transaction.

  3. This can lead to the transaction(or certain atomic executions) executed multiple times, causing unintended duplicate operations.

This is one of possible ways the transactions can be executed multiple times. To prevent these kinds of unintended duplicate executions, idempotency should be implemented at the atomic execution level of the transaction.

transaction_retry_retention tables.

In StackSaga, asynchronous transaction retrying is an essential feature that ensures transactions are retried when they fail due to resource unavailability. When a transaction’s execution is failed with a Resource Unavailability problem, the transaction is retried asynchronous by the service-agent by collecting them from the event-store. Unlike SQL databases, Cassandra does not support centralized indexing for bulk filtering. This makes it impossible to scan transactions by their states. More details on this limitation can be found here.

To overcome this, transaction retrying follows the same table-switching mechanism that used for transaction recovery.

When the transaction is stopped due to resource unavailability, it is stored in one of the transaction_retry_retention tables. It will be exposed for the upcoming retry schedule by the service-agent.

The only difference of way of scheduling and exposing transactions to the agent between transaction recovery retention and transaction retry retention is that the delay time less the recovery retention time.

transaction_retry_retention table selection formula.

stacksaga diagram stacksaga cassandra how transactions saved for retry

Just imagine if you configure the Transaction retry retention time to be 2 minutes. The Transaction retry retention time will be withing the range of 1 minute to 3 minutes.

How does that happen?

If the Transaction retry retention time has been mentioned as 2 minutes, 3 schedulers can be triggered 30 times in the round robbin manner withing a day for collecting and re-invoke the temporally stopped transactions like the diagram shows.

  1. 1st schedule at: 00:00 (transaction_retry_retention_1)

  2. 2nd schedule at: 00:02 (transaction_retry_retention_2)

  3. 3rd schedule at: 00:04 (transaction_retry_retention_3)

  4. 4th schedule at: 00:06 (transaction_retry_retention_1)

  5. 5th schedule at: 00:08 (transaction_retry_retention_2)

  6. and so on…​

To determine whether a transaction has sufficient time to be stayed without being executed, the system uses the middle time of the configured duration as the boundary point. This boundary helps classify transactions into different retry schedules.

Transaction Placement Logic

  1. Transactions Behind the Boundary (Back of the Boundary)

    • These transactions have already passed the boundary point.

    • They are scheduled for the next upcoming recovery schedule.

  2. Transactions Ahead of the Boundary

    • These transactions were initialized after the boundary point.

    • They are scheduled for the recovery schedule after the next upcoming schedule to allow more time for completion.

The middle time is considered as the boundary point for determining that the transaction has sufficient time to be stayed without being executed. The transactions that are at the back of the boundary go to the next upcoming schedule. And the transactions that are ahead of the boundary go to the schedule after the next upcoming schedule.

Transaction Stopped Time Duration Position Relative to Boundary Recovery Schedule

T4

00:02:50

1 minute and 10 seconds

Back of the boundary

Next upcoming schedule

T5

00:03:08

52 seconds

Ahead of the boundary

Schedule after the next

T6

00:03:58

2 seconds

Ahead of the boundary

Schedule after the next