The Fifth Elephant 2025 Annual Conference CfP

Speak at The Fifth Elephant 2025 Annual Conference

Shaikh Md Ashif Iqbal

@sashifiqbal_atlassian

Multi-SDK Backwards Compatibility Tester

Submitted May 20, 2025

Abstract

In the dynamic landscape of distributed systems and streaming ETL (Extract, Transform, Load) pipelines, maintaining backwards compatibility poses a significant challenge, especially as complexity scales. Services that interact through multiple versions of SDKs must ensure seamless communication and data flow, even as new features are developed. Traditionally, manual testing for compatibility across versions has been both time-consuming and prone to errors, particularly as development velocity increases.

This challenge became evident in our work on a platform developed at Atlassian called Lithium, where we faced the complexity of ensuring compatibility across numerous SDK versions in a dynamic and ephemeral environment. To address this, we developed a robust framework for automating the verification of compatibility across SDK and service versions. This framework uses containerized test environments, custom environment generation with Docker Compose, and targeted version testing to streamline the process. Integrated into the developer workflow via CI/CD pipelines, it conducts comprehensive compatibility tests for every code change, preventing regressions and enhancing reliability. Developers can replicate test environments locally for in-depth debugging, facilitating rapid iteration and resolution. By automating compatibility testing, we have significantly reduced manual effort, increased developer confidence, and ensured a smooth experience during major platform updates.

Key Takeaways

  • Automating backwards compatibility testing shifts responsibility to developers during development, reducing manual QA efforts and preventing production regressions.
  • A robust testing framework, integrated into the developer workflow, can adapt to platform complexity, ensuring compatibility across dynamic environments and versions.

Audience Segment

  • Software Engineers and Developers: Professionals working with distributed systems, streaming ETL pipelines, or microservice architectures who face challenges with version compatibility and regression testing.
  • DevOps and Platform Engineers: Practitioners focused on building robust CI/CD pipelines, managing containerized environments, and integrating testing frameworks into development processes.
  • Quality Assurance (QA) Engineers: QA professionals interested in automating compatibility testing to reduce manual efforts and improve coverage and reliability.
  • Engineering Leads and Architects: Technical leaders responsible for designing scalable systems that handle compatibility challenges across multiple SDKs, APIs, or platform components.

About the Speaker

Ashif is a Senior Software Engineer with nearly 7 years of experience, including significant contributions to designing backwards compatibility testing solutions and service discovery mechanisms to enhance system stability and scalability.
LinkedIn: Ashif Iqbal

Reference Links

Slides Incoming

This is the first talk about backwards compatibility for the Lithium platform, but below are some links about Lithium:

Mind Map for the Presentation

Slide 1

Hello everyone, I’m Ashif from the Data Portability Organization at Atlassian. Our primary mission is to develop platforms and solutions that simplify data portability within Atlassian. Data portability is crucial for us, as it underpins numerous experiences and enterprise features, in addition to facilitating the migration of customer data from on-premises servers to Atlassian Cloud. Some of the features built on this foundation include BRIE (Backup & Restore) and Sandbox.

Slide 2

We have created a platform named Lithium to facilitate data movement, an approach we refer to as ETL++. The ++ signifies that while it is indeed ETL, it incorporates several unique features:

  • Pipelines are dynamically provisioned with the number of partitions at runtime.
  • It employs a Bring Your Own Host model, utilizing the SDK we provide.
  • It is distributed, allowing each part of the ETL pipeline to be hosted separately.
  • Pipelines are ephemeral, provisioned as needed, and deprovisioned once their work is complete.

Slide 3

I won’t delve deeply into how Lithium operates, as Robert’s talk covers that more effectively. However, I will highlight the essential components that illustrate the challenges we began to encounter:

  • lithium-control-plane: This component exposes two Kafka topics, controlplane-events and dataplane-events, used to send events to and receive events from the data plane components via their SDKs.
  • lithium-sdk: Located in the dataplane, this SDK manages communication with the control plane and provides constructs for data movement, enabling feature teams to focus solely on their features without needing to understand Kafka internals. Feature teams implement this SDK to integrate with the Lithium Pipeline, along the lines of the sketch below.
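To make that division of labour concrete, here is a purely illustrative sketch; every name in it is invented for this summary and is not the real lithium-sdk API. The point is only that the SDK owns the control plane and Kafka plumbing, while a feature team implements a small data-movement contract.

// Invented names for illustration only; not the actual lithium-sdk API.
interface DataMover {
    fun onProvisioned(partitions: Int)       // partition count is decided at runtime
    fun moveBatch(partition: Int): Boolean   // returns false once the partition is drained
}

class SampleExtractor : DataMover {
    override fun onProvisioned(partitions: Int) =
        println("provisioned with $partitions partitions")

    override fun moveBatch(partition: Int): Boolean {
        println("extracted one batch from partition $partition")
        return false // pretend each partition holds a single batch
    }
}

fun main() {
    // In the real platform the SDK would drive these callbacks from the
    // controlplane-events topic; here we invoke them directly to show the contract.
    val mover: DataMover = SampleExtractor()
    mover.onProvisioned(3)
    (0 until 3).forEach { mover.moveBatch(it) }
}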

Slide 4

In production, we have hundreds of hosts implementing our SDK, each contributing to different parts of the ETL pipeline. Consequently, the production environment resembles a complex landscape, where pipelines may involve SDKs from various publicly available versions. Additionally, communication with the control plane occurs across all these SDK versions.

Slide 5

The challenges we faced stemmed from the rapid development of this new platform. Early on, individual developers often worked independently on features, with minimal awareness of each other’s work. This led to regression issues, primarily due to insufficient backward compatibility testing.

We identified two key areas where maintaining backward compatibility was essential:

  • Between the SDKs and the control plane.
  • Among the SDKs for data movement, enabling multi-SDK pipelines.

Slide 6

At first glance, contract testing seemed sufficient for our use case, and while that is partially correct, we also needed to validate results and conduct feature testing. Essentially, we sought a solution that allowed us to write tests in a straightforward manner, assert as we typically would, and let the framework handle the heavy lifting. Finding no existing solution, we decided to build one ourselves. Moreover, we aimed to incorporate additional features, such as targeted tests for specific SDK versions. This was crucial for scenarios where we might deprecate a feature or introduce something new, necessitating tests only for newer versions.

Slide 7

It would be remiss of me not to show what we had previously before discussing our new solution. Initially, we relied on a lengthy and tedious manual process to test backward compatibility against the latest publicly released version.

Our main codebase resides in a monorepo that houses the control plane service and a proxy service designed to accept HTTP traffic into our otherwise event-driven platform. The repository also contains the dataplane-sdk, along with three other services implementing the dataplane-sdk, facilitating simultaneous work on the SDK and the corresponding ETL component changes; this setup provides quick feedback during the development loop. Additionally, we maintained a second repository housing a test service that implemented the latest publicly available version of the dataplane-sdk, which was used to test changes made to the control plane service in the main repository.
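Roughly, the monorepo layout looks like this (only the service names that also appear in the compose files later are real; the rest are illustrative):

lithium/                        (illustrative root name)
├── udpp-control-plane-svc/     the control plane service
├── udpp-cp-rest-proxy/         proxy accepting HTTP traffic into the platform
├── dataplane-sdk/              the SDK feature teams implement
├── udpp-extract-service/       three services implementing the dataplane-sdk
├── udpp-transform-service/     (illustrative name)
└── udpp-loader-service/        (illustrative name)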

Slide 8

The previous approach presented several challenges:

  • It was manual.
  • There was no test suite, requiring tests to be run from memory.
    • Developers were often unaware of previously developed features, leading to gaps in testing.
  • Even in an ideal scenario, we could only achieve backward compatibility with the most recent public version.

Slides 9 to 11

Let me show you one of the tests written using the new framework to illustrate how it appears to developers, followed by a deeper dive into the details.

@Tag("Parallel")
class TerminateWorkplanTest : ParallelBaseE2E() {
    @Test
    fun `terminating workplan from STOPPED state should work`() {
        val workplanCreationRequest = getSampleWorkplanCreationRequest("sample-workplan-creation-request-with-sink-disabled")

        WorkplanUtility.createWorkplanAndTestForSuccess(
            workplanCreationRequest,
            getHttpRequestBuilder().addNonAdminOwnerHeaders(),
            httpClient,
        )

        WorkplanUtility.testWorkplanStatus(
            workplanCreationRequest.getWorkplanId(),
            WorkplanStatus.STOPPED,
            getHttpRequestBuilder().addNonAdminOwnerHeaders(),
            httpClient,
        )

        WorkplanUtility.terminateWorkplanAndTestForSuccess(
            workplanCreationRequest.getWorkplanId(),
            getHttpRequestBuilder().addNonAdminOwnerHeaders(),
            httpClient,
        )

        WorkplanUtility.testWorkplanNotFound(
            workplanCreationRequest.getWorkplanId(),
            getHttpRequestBuilder().addNonAdminOwnerHeaders(),
            httpClient,
        )
    }
}

This may appear to be a standard JUnit test, and you would be correct—it is. We have built specific capabilities tailored to our use case, but overall, the tests remain familiar to developers. These capabilities are provided by the ParallelBaseE2E base class, along with some custom-developed annotations and tags.
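For orientation, here is a hedged sketch of what that base class might supply, inferred purely from the calls in the test above; the class body, base URL, and header names are assumptions, not the actual implementation.

import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest

// Illustrative only: inferred from the test above, not the real base class.
abstract class ParallelBaseE2E {
    // One HTTP client per test class, pointed at the docker-compose environment.
    protected val httpClient: HttpClient = HttpClient.newHttpClient()

    // Fresh builder targeting the locally running environment (base URL is an assumption).
    protected fun getHttpRequestBuilder(): HttpRequest.Builder =
        HttpRequest.newBuilder(URI.create(System.getenv("E2E_BASE_URL") ?: "http://localhost:7075"))
}

// Hypothetical helper: attaches the non-admin owner auth headers used in the tests.
fun HttpRequest.Builder.addNonAdminOwnerHeaders(): HttpRequest.Builder =
    this.header("X-Slauth-Mechanism", "slauth")        // header names mirror the compose
        .header("X-Slauth-Subject", "non-admin-owner") // healthcheck shown later; the
        .header("X-Slauth-Authorization", "true")      // exact values are assumptions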

Here is another test utilizing the MinimumLibraryTargetVersion annotation to map this test to specific versions, as this feature was not available in older versions:

@Tag("Parallel")
class CustomPartionerTest : ParallelBaseE2E() {
    @Nested
    @MinimumLibraryTargetVersion("3.4.0")
    inner class WithoutCustomPartitioner {
        @Test
        fun `With Max Sink Processors = 3 And Non Transactional Sink`() {
            ...
        }
    }
}

Slide 12

Now, let’s explore the architecture of the framework, which consists of three components:

  • Generating and publishing the required images.
  • Custom ENV Generator using Docker Compose.
  • Test Writing Framework.

Slides 13 to 14

First, let’s discuss the Test Writing Framework, which is a Custom Base E2E class that provides the following capabilities:

  • Running or skipping tests based on Library Version Tags.
  • Running or skipping tests based on Infra Setup Tags.

Additionally, we offer Sequential and Parallel variations, allowing tests to run either in isolation or alongside others. For Sequential, the base test class also cleans up all running resources, ensuring the next test starts fresh. A minimal sketch of how the version-based gating can work is shown below.
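
Here is that sketch, wired up with a standard JUnit 5 ExecutionCondition. This is an illustration rather than our actual implementation: the LIBRARY_TARGET_VERSION environment variable and the comparison logic are assumptions.

import org.junit.jupiter.api.extension.ConditionEvaluationResult
import org.junit.jupiter.api.extension.ExecutionCondition
import org.junit.jupiter.api.extension.ExtendWith
import org.junit.jupiter.api.extension.ExtensionContext
import org.junit.platform.commons.support.AnnotationSupport

// Hypothetical re-creation of the annotation shown in the earlier test.
@Target(AnnotationTarget.CLASS, AnnotationTarget.FUNCTION)
@Retention(AnnotationRetention.RUNTIME)
@ExtendWith(LibraryVersionCondition::class)
annotation class MinimumLibraryTargetVersion(val version: String)

class LibraryVersionCondition : ExecutionCondition {
    override fun evaluateExecutionCondition(context: ExtensionContext): ConditionEvaluationResult {
        // Assumed contract: the test pipeline exports the SDK version under test.
        val current = System.getenv("LIBRARY_TARGET_VERSION")
            ?: return ConditionEvaluationResult.enabled("No target version set; running everything")
        val required = AnnotationSupport
            .findAnnotation(context.element, MinimumLibraryTargetVersion::class.java)
            .map { it.version }
            .orElse(null)
            ?: return ConditionEvaluationResult.enabled("No minimum version on this element")
        return if (compare(current, required) >= 0) {
            ConditionEvaluationResult.enabled("$current >= $required")
        } else {
            ConditionEvaluationResult.disabled("Skipping: target $current is older than $required")
        }
    }

    // Naive dotted-version comparison; a real implementation would handle pre-release tags etc.
    private fun compare(a: String, b: String): Int {
        val pa = a.split('.').map { it.toIntOrNull() ?: 0 }
        val pb = b.split('.').map { it.toIntOrNull() ?: 0 }
        for (i in 0 until maxOf(pa.size, pb.size)) {
            val d = pa.getOrElse(i) { 0 } - pb.getOrElse(i) { 0 }
            if (d != 0) return d
        }
        return 0
    }
}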

Slides 15 to 18

These tests are executed in an environment established using Docker Compose files. The structure of these files is as follows:

lithium-e2e-tests/
│
├── docker-compose-test-environments/
│   ├── kafka.yml
│   ├── database.yml
│   ├── udpp-control-plane-svc/
│   │   ├── latest-master.yml
│   │   └── specified-build.yml
│   └── udpp-extract-service/
│       ├── latest-master.yml
│       └── specified-build.yml
│   .
│   .
│   .
└── scripts/
    ├── library-version/
    │   ├── branch.sh
    │   └── main.sh
    ├── single-components/
    │   ├── current-controlplane.sh
    │   └── current-loader.sh
    │   .
    │   .
    │   .
    └── all-current.sh

Now, let’s take a look at what the Docker Compose file looks like for one of the services:

version: "3.8"

services:
  udpp-extract-service:
    # IMAGE_PREFIX defaults to the internal registry and can be overridden,
    # e.g. to pull a PR-specific build published earlier in the pipeline.
    image: ${IMAGE_PREFIX:-docker.atl-paas.net/sox/atlassian}/udpp-extract-service:latest
    ports:
      - "7075:8300"
    healthcheck:
      # Probe the local healthcheck endpoint, passing the auth headers the service expects.
      test: curl --fail -H X-Slauth-Mechanism:slauth -H X-Slauth-Subject:udpp-extract-service -H X-Slauth-Authorization:true -H X-Asap-Issue:udpp-extract-service http://localhost:8300/healthcheck || exit 1
      start_period: 50s
      retries: 10
      timeout: 180s
      interval: 10s
    environment:
      SPRING_PROFILES_ACTIVE: local,e2e-test
      LIBRETTO_CODE_SERVER_LOCAL_URL: http://libretto-code-server:8080
      MEMORY_OPTS: -Xmx512M
      MICROS_AWS_REGION: micros-aws-e2e-region
      MICROS_SERVICE: udpp-extract-service
      MICROS_INSTANCE_ID: 9991
      MICROS_ENV: e2e-test
      MICROS_ENVTYPE: e2e-test
      SERVER_SSL_ENABLED: false
      USER: ${USER}
      JMX_OPTS: "-Dcom.sun.management.jmxremote.rmi.port=9015 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=9015 -Dcom.sun.management.jmxremote.authenticate=false"
      LITHIUM_FF_FILE_PATH: /opt/service/feature-flag.json
      MAX_PROCESSORS: 200
    depends_on:
      # Wait for the Kafka broker to report healthy before starting.
      kafka-broker-1:
        condition: service_healthy

These Docker Compose YAML files provide a highly granular way to spin up the desired environment with a single command, as shown below:

#!/bin/bash
# Compose an environment from granular files: database, a single Kafka broker,
# specified builds of the REST proxy and control plane, and three instances of
# the dataplane test service at a custom version.
docker-compose -f docker-compose-test-environments/sox-database.yml \
               -f docker-compose-test-environments/sox-single-kafka-broker.yml \
               -f docker-compose-test-environments/udpp-cp-rest-proxy/sox-specified-build-single-instance.yml \
               -f docker-compose-test-environments/udpp-control-plane-svc/sox-specified-build-single-instance.yml \
               -f docker-compose-test-environments/lithium-dataplane-test-svc/custom-version-triple-instance.yml \
               up --pull=always -d --wait

Slide 19

But how do these compose files and scripts work? The magic lies in their generation, which I will discuss next.

Slides 20 to 23

We create and publish these Docker images from several sources. Let’s review them one by one:

  • In the Pull Request Pipeline: Here, we generate and publish images tagged with the pipeline number, allowing them to be pulled in later steps when tests are run.
  • In the master/main Pipeline: We also generate and publish images here, but in addition to the pipeline number tag, we add a latest tag. This ensures that when other developers test their changes against the master/main branch, they are always working with the latest version.
  • In the Lithium Dataplane Test Client Repo master/main Pipeline: Similar to the previous cases, we generate and publish images tagged with the pipeline number, the latest tag, and the version of the SDK used to generate the image. This allows us to pull in specific versions during the testing pipeline whenever necessary.

Slide 24

Here’s a glimpse of a testing pipeline in action, showcasing the number of tests being run across various combinations.
pic to be inserted

Slide 25

What did we achieve after implementing these changes?

Slide 26

  • We transitioned from a Totally Manual Testing Process requiring hours of effort to a Fully Automated Process with zero testing effort.
  • We expanded from a Limited Number of Tests to validate to Every Single Feature being tested.
  • We reduced the occurrence of regression bugs from 1-2 per release to 0 regressions post-implementation.

Slide 27

I believe one of the most significant maturity metrics for a platform is its ability to accept contributions from others. Previously, we struggled with this due to the extensive manual testing effort required, creating a steep learning curve for those outside the development team. However, with our new processes, we have successfully accepted numerous contributions from colleagues in other teams, particularly our clients.

Slide 28

It’s incredibly reassuring to know that if the pipeline is GREEN, there is a high degree of confidence that the PR merged will not introduce regressions. I emphasize “high degree of confidence” because a testing framework is merely a tool; its effectiveness depends on how well it is utilized. If tests are written, the framework ensures that the PR aligns with those tests.

Slide 29

Q&A
