In the music distribution world, it’s not just about getting tracks onto DSPs. Once streams start coming in, platforms take on the complex responsibility of making sure royalties reach the right artists and rights holders. One of our clients, a fast-growing music distributor, was handling this well until scale caught up with them. Their existing system worked fine at lower volumes, but as monthly royalty data crossed hundreds of millions of records, it began to slow down significantly. Processing delays increased, technical overhead piled up, and it became clear that the architecture needed a rethink to keep up with growth. They needed a faster and more reliable way to scale.
As we dug into the system, several underlying issues came to light, most of them hidden during earlier stages of development but now impossible to ignore. The architecture was beginning to show strain on multiple fronts. The major challenges included:
• Processing raw royalty data with hundreds of millions of rows each month
• Managing duplicate ISRCs and inconsistent metadata from multiple DSPs
• Legacy codebase that became harder to maintain and debug over time
• Slow processing cycles that stretched into days, with no clear visibility into when they would finish
• Heavy load on the database, with both reads and writes competing for resources
• Frustrated users due to unreliable performance and sluggish reporting interfaces
• Limited observability and metrics — scaling meant throwing more infrastructure at the problem without fixing the root issues
As part of the re-architecture effort, we spent time researching not just how to handle the current scale, but also how to ensure the system could keep growing over the next few years without needing another complete overhaul. The existing setup was built in Ruby with heavy use of MariaDB stored procedures, which had become a bottleneck. Reporting and analytics ran on the same database that handled processing, leading to contention and major slowdowns.
There was no batching or parallelism in place, which meant everything was processed in a single thread, further compounding the delays. We explored a range of options: different batching techniques, caching strategies, and even alternative languages. But our goal wasn't just performance; we also wanted the system to be something the current team could support and evolve. So, instead of bringing in a completely new stack, we focused on technologies and patterns that were powerful but still within reach of the team's comfort zone.
We replaced the existing Ruby-based ingestion pipeline with a parallel processing pipeline in Golang (a language the client team was already comfortable with, which made long-term maintainability much easier) that handled records in batches of 50,000. This was combined with PebbleDB, a lightweight embedded key-value store, to perform fast local lookups without the memory bloat or network latency of Redis.
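To make the batching approach concrete, here is a minimal Go sketch of a fan-out pipeline that reads raw royalty lines and hands them to workers in fixed-size batches. The 50,000 batch size comes from the actual system; the file name, record shape, worker count, and the `processBatch` stub are illustrative assumptions, not the client's code.

```go
// Minimal sketch of a batched, parallel ingestion pipeline in Go.
// Only the batch size reflects the real system; everything else is
// simplified for illustration.
package main

import (
	"bufio"
	"log"
	"os"
	"runtime"
	"sync"
)

// RoyaltyRecord is an assumed shape for a parsed royalty line.
type RoyaltyRecord struct {
	ISRC    string
	DSP     string
	Streams int64
	Amount  float64
}

const batchSize = 50_000

func main() {
	f, err := os.Open("royalties.csv") // hypothetical input file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	batches := make(chan []string, 4) // raw CSV lines, grouped into batches
	var wg sync.WaitGroup

	// Fan out: one worker per CPU core processes batches independently.
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range batches {
				processBatch(batch)
			}
		}()
	}

	// Producer: read the file and emit fixed-size batches.
	scanner := bufio.NewScanner(f)
	batch := make([]string, 0, batchSize)
	for scanner.Scan() {
		batch = append(batch, scanner.Text())
		if len(batch) == batchSize {
			batches <- batch
			batch = make([]string, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		batches <- batch
	}
	close(batches)
	wg.Wait()
}

func processBatch(lines []string) {
	// Placeholder: in the real pipeline each line is parsed into a
	// RoyaltyRecord, its ISRC is resolved against the local Pebble store,
	// and per-track totals are accumulated.
	_ = lines
}
```

Channel-based fan-out keeps the producer simple and lets batch size, rather than per-record overhead, drive throughput.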
To handle real-world issues like duplicate ISRCs, we customized our storage logic to allow multiple values per key. This ensured accurate matching even when DSP data was inconsistent.
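The sketch below shows one way to implement "multiple values per key" on top of Pebble (github.com/cockroachdb/pebble): entries for the same ISRC are appended to a delimited list, so every DSP's metadata survives. The encoding, key, and sample values are assumptions for illustration; the production logic is more involved.

```go
// Sketch of multi-value storage in Pebble: each ISRC key maps to a
// newline-delimited list of DSP metadata entries.
package main

import (
	"bytes"
	"errors"
	"log"

	"github.com/cockroachdb/pebble"
)

// appendValue adds one more value under key, preserving existing entries.
func appendValue(db *pebble.DB, key, value []byte) error {
	existing, closer, err := db.Get(key)
	if err != nil && !errors.Is(err, pebble.ErrNotFound) {
		return err
	}
	var buf bytes.Buffer
	if err == nil {
		buf.Write(existing)
		closer.Close()
		buf.WriteByte('\n')
	}
	buf.Write(value)
	return db.Set(key, buf.Bytes(), pebble.NoSync)
}

// lookupValues returns every value stored under key.
func lookupValues(db *pebble.DB, key []byte) ([][]byte, error) {
	v, closer, err := db.Get(key)
	if errors.Is(err, pebble.ErrNotFound) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer closer.Close()
	// Copy before splitting, since v is only valid until closer is closed.
	return bytes.Split(append([]byte(nil), v...), []byte{'\n'}), nil
}

func main() {
	db, err := pebble.Open("isrc-index", &pebble.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Two DSPs reporting the same ISRC with slightly different metadata.
	appendValue(db, []byte("USRC17607839"), []byte(`{"dsp":"spotify","title":"Track A"}`))
	appendValue(db, []byte("USRC17607839"), []byte(`{"dsp":"apple","title":"Track A (Remastered)"}`))

	values, _ := lookupValues(db, []byte("USRC17607839"))
	log.Printf("found %d entries for ISRC", len(values))
}
```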
CSV imports into MariaDB were taking hours and failing often. We moved ingestion to ClickHouse, bringing the load time for 500 million records down from roughly 4 hours to about 10 minutes. We then used ClickHouse for aggregation and S3 integration, sending only the critical summary data to MariaDB for transactional operations.
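Conceptually, the ingestion and aggregation step looks like the following Go sketch, which drives ClickHouse through database/sql with the clickhouse-go driver. The table names, S3 path, DSN, and column set are hypothetical; only the overall flow (bulk load from S3, aggregate in ClickHouse, forward summaries to MariaDB) mirrors what we built.

```go
// Illustrative sketch: load raw DSP reports from S3 into ClickHouse,
// aggregate there, and forward only the summary rows to MariaDB.
package main

import (
	"database/sql"
	"log"

	_ "github.com/ClickHouse/clickhouse-go/v2" // registers the "clickhouse" driver
)

func main() {
	ch, err := sql.Open("clickhouse", "clickhouse://default:@localhost:9000/royalties")
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// Bulk load raw reports straight from S3 (minutes, not hours).
	_, err = ch.Exec(`
		INSERT INTO royalty_raw
		SELECT * FROM s3(
			'https://example-bucket.s3.amazonaws.com/reports/2024-05/*.csv',
			'CSVWithNames')`)
	if err != nil {
		log.Fatal(err)
	}

	// Aggregate inside ClickHouse; only this summary is sent to MariaDB.
	rows, err := ch.Query(`
		SELECT isrc, dsp, sum(streams) AS streams, sum(amount) AS amount
		FROM royalty_raw
		GROUP BY isrc, dsp`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var isrc, dsp string
		var streams int64
		var amount float64
		if err := rows.Scan(&isrc, &dsp, &streams, &amount); err != nil {
			log.Fatal(err)
		}
		// Here each summary row would be upserted into MariaDB for
		// transactional payout processing (omitted in this sketch).
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```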
On the frontend, we replaced the read-heavy reporting tables with denormalized ClickHouse views. This dramatically improved UI responsiveness and made filtering fast and smooth for end users.
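As a rough illustration of the reporting side, the sketch below creates a denormalized view and runs the kind of filter query the UI issues. The view, table, column names, and sample artist are made up for the example; the real views are shaped around the client's reporting screens.

```go
// Sketch of the reporting path: a denormalized ClickHouse view that the
// UI queries directly, instead of hitting the transactional MariaDB tables.
package main

import (
	"database/sql"
	"log"

	_ "github.com/ClickHouse/clickhouse-go/v2"
)

func main() {
	ch, err := sql.Open("clickhouse", "clickhouse://default:@localhost:9000/royalties")
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// Denormalized view: track, artist, and DSP fields are pre-joined so the
	// UI never joins at query time.
	if _, err := ch.Exec(`
		CREATE VIEW IF NOT EXISTS royalty_report AS
		SELECT r.isrc, m.track_title, m.artist_name, r.dsp,
		       sum(r.streams) AS streams, sum(r.amount) AS amount
		FROM royalty_raw AS r
		LEFT JOIN track_metadata AS m ON m.isrc = r.isrc
		GROUP BY r.isrc, m.track_title, m.artist_name, r.dsp`); err != nil {
		log.Fatal(err)
	}

	// Typical UI filter: one artist's earnings per DSP.
	rows, err := ch.Query(`
		SELECT dsp, sum(amount)
		FROM royalty_report
		WHERE artist_name = 'Some Artist'
		GROUP BY dsp`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var dsp string
		var amount float64
		if err := rows.Scan(&dsp, &amount); err != nil {
			log.Fatal(err)
		}
		log.Printf("%s: %.2f", dsp, amount)
	}
}
```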
• Processing time dropped by over 95%, all without increasing infrastructure costs.
• The system now comfortably handles over 500 million records per month, with headroom to scale further.
• Even with messy metadata and duplicate ISRCs, payouts remained accurate and traceable.
• A hybrid ClickHouse + MariaDB setup gave us the best of both worlds—speed for analytics and integrity for transactions.
• On the frontend, users saw a massive boost in responsiveness, with near-instant filtering and reporting.