Brief CQRS overview
CQRS is far from a brand-new concept. It was first described by Greg Young in 2010 and strongly promoted by Udi Dahan. The idea of CQRS builds on the Command Query Separation (CQS) principle that Bertrand Meyer defined in his book “Object-Oriented Software Construction” back in 1988.
CQRS is an architectural pattern. At its core, CQRS separates a system’s write (command) and read (query) operations by grouping them into different layers. Commands are operations that change the system’s state but do not return a result, and queries are operations that return a result without changing the state of the system. The main idea is that these layers have their own data models and can be built with their own mix of tools and technologies, which allows each layer to be handled independently without affecting the other.
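To make the split concrete, here is a minimal in-memory sketch. All names in it (CreatePost, GetFeed, WriteModel, ReadModel) are hypothetical and chosen for illustration only; a real system would typically put a database behind each side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreatePost:
    """Command: asks the system to change state; nothing is returned."""
    post_id: int
    author: str
    text: str


@dataclass(frozen=True)
class GetFeed:
    """Query: asks for data and must not change state."""
    author: str


class ReadModel:
    """Query side: a separate structure shaped for the feed lookup."""

    def __init__(self):
        self.feed_by_author = {}  # author -> list of post texts

    def apply(self, post_id: int, author: str, text: str) -> None:
        # Called by the write side to keep this model in sync.
        self.feed_by_author.setdefault(author, []).append(text)

    def handle(self, query: GetFeed) -> list:
        # Return a copy so callers cannot mutate the read model.
        return list(self.feed_by_author.get(query.author, []))


class WriteModel:
    """Command side: owns the authoritative state and validates changes."""

    def __init__(self, on_change):
        self.posts = {}  # post_id -> (author, text)
        self._on_change = on_change

    def handle(self, command: CreatePost) -> None:
        if command.post_id in self.posts:
            raise ValueError(f"post {command.post_id} already exists")
        self.posts[command.post_id] = (command.author, command.text)
        self._on_change(command.post_id, command.author, command.text)


read_side = ReadModel()
write_side = WriteModel(on_change=read_side.apply)

write_side.handle(CreatePost(1, "alice", "hello"))
write_side.handle(CreatePost(2, "alice", "second post"))
print(read_side.handle(GetFeed("alice")))  # ['hello', 'second post']
```

Here the read side is updated synchronously through a callback, which keeps the sketch short. In practice the update often travels through a queue, which is where the consistency questions discussed later come from.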
When to use the CQRS pattern?
The short answer is when you have a high-load system with asymmetrical read and write workloads that have different performance requirements. Think Facebook – a user may post once a week (or even less frequently) but read the feed religiously every day. This is a classic case of imbalance, where the number of reads far exceeds the number of writes.
Google Ads is, however, a different story. This adtech platform collects millions of data points including clicks, impressions, conversions, device type, click-through rate, and more. Such a data-heavy system needs to be able to continuously write this information without choking.
That’s exactly what CQRS can help with. Although it’s not strictly required, it’s possible – and often makes sense – to implement separate databases that are specifically optimized for read and write operations. For instance, an application can use MS SQL Server for consistent writes and real-time processing, and Cassandra, MongoDB or Elasticsearch for efficient retrieval of data.
In addition, read and write models can be backed by different persistence layers optimized for their roles. For example, the read database can have materialized views to avoid complex joins and considerably improve the speed of queries.
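The materialized-view idea can be sketched with Python's built-in sqlite3 module. SQLite has no native materialized views, so this example emulates one with a plain table that the write path refreshes. The table and function names are assumptions made for the illustration, and both sides share one in-memory database only to keep the example short.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Write side: normalized tables.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount_cents INTEGER NOT NULL
    );
    -- Read side: one pre-joined, pre-aggregated row per customer.
    CREATE TABLE customer_order_summary (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        order_count INTEGER NOT NULL,
        total_cents INTEGER NOT NULL
    );
""")


def refresh_summary(customer_id: int) -> None:
    """Rebuild the read-side row for one customer from the write-side tables."""
    db.execute(
        """
        INSERT OR REPLACE INTO customer_order_summary
        SELECT c.id, c.name, COUNT(o.id), COALESCE(SUM(o.amount_cents), 0)
        FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
        WHERE c.id = ?
        GROUP BY c.id, c.name
        """,
        (customer_id,),
    )


def place_order(customer_id: int, amount_cents: int) -> None:
    """Command: write to the normalized tables, then refresh the projection."""
    db.execute(
        "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)",
        (customer_id, amount_cents),
    )
    refresh_summary(customer_id)


def get_summary(customer_id: int):
    """Query: a single-row lookup with no join at read time."""
    return db.execute(
        "SELECT name, order_count, total_cents FROM customer_order_summary"
        " WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()


db.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
place_order(1, 2500)
place_order(1, 1000)
print(get_summary(1))  # ('Alice', 2, 3500)
```

The join and aggregation cost is paid once per write instead of once per read, which is the trade that tends to pay off when reads far outnumber writes.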
In a nutshell, the main benefits of the CQRS pattern include:
- Scalability. Segregating read and write operations can lead to independent and more efficient scaling of resources and storage capacity based on real-world needs.
- Performance. In addition to different optimization strategies that can be applied to read and write schemas, CQRS can reduce contention between read and write workloads, meaning that multiple users can query the system simultaneously without affecting the write side performance.
- Flexibility. The CQRS pattern allows for a more flexible architecture that can be optimized to meet specific business challenges and better evolve over time.
The inherent challenges of CQRS
Just like any technology or approach, CQRS comes with its own set of challenges and risks that must be carefully considered before diving right in.
Added development complexity
First and foremost, the CQRS pattern brings an additional layer of complexity to the table. Ultimately, we end up with a more sophisticated application design and more components to manage – commands, queries, events, handlers, data models, aggregates, and more.
For instance, a message broker is not obligatory but it is usually used to support asynchronous communication within an application. This translates into additional infrastructure that needs to be maintained and additional failover and error handling logic that needs to be built in.
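As a rough illustration of the extra error-handling logic, the sketch below uses Python's standard queue.Queue as a stand-in for a real broker. The retry budget, the dead-letter list, and the event shape are all assumptions made for the example, not features of any particular broker.

```python
import queue

MAX_ATTEMPTS = 3  # assumed retry budget per message

events = queue.Queue()  # stands in for the message broker
dead_letters = []       # messages that kept failing, parked for inspection
read_model = {}         # user_id -> display name


def publish(event: dict) -> None:
    """Write side: enqueue an event instead of updating the read side directly."""
    events.put({"attempts": 0, "body": event})


def apply_to_read_model(body: dict) -> None:
    """Read-side handler; rejects events it cannot interpret."""
    if "name" not in body:
        raise ValueError("malformed event")
    read_model[body["user_id"]] = body["name"]


def drain() -> None:
    """Process queued messages, retrying failures up to MAX_ATTEMPTS times."""
    while not events.empty():
        message = events.get()
        try:
            apply_to_read_model(message["body"])
        except Exception:
            message["attempts"] += 1
            if message["attempts"] < MAX_ATTEMPTS:
                events.put(message)  # requeue for another attempt
            else:
                dead_letters.append(message["body"])


publish({"user_id": 1, "name": "Alice"})
publish({"user_id": 2})  # malformed: exhausts its retries
drain()
print(read_model)    # {1: 'Alice'}
print(dead_letters)  # [{'user_id': 2}]
```

Even in this toy form, the retry counter and the dead-letter list are code that a direct, synchronous update would not need. A production setup would add monitoring, backoff, and failover on top of that.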
Handling eventual consistency
Separating the read and write models can become a true headache in terms of eventual consistency. While the write database is up to date and strongly consistent as soon as a command is processed, the read database may not yet have been updated with the latest changes. The read side will eventually catch up, but during that inconsistency window it can return stale data.
Now, imagine a financial solution that allows a user to send multiple commands in quick succession, withdrawing 1,000 USD each time. If the queue between the write and read models fills up, the read model may take longer than expected to reflect these withdrawals. If each withdrawal is approved against the stale balance that the read model reports, the user ends up withdrawing 10,000 USD in total although the balance was only 5,000 USD to begin with. This over-withdrawal can result in an overdraft penalty.
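The scenario can be reproduced in a few lines. The sketch assumes that withdrawals are approved against the balance reported by the read model, which lags behind the write side. The Account class and its method names are hypothetical. For contrast, it also shows a variant that checks the authoritative write-side balance.

```python
from collections import deque


class Account:
    def __init__(self, balance: int):
        self.write_balance = balance  # authoritative state (write side)
        self.read_balance = balance   # last value projected to the read side
        self.pending = deque()        # updates not yet applied to the read side

    def withdraw_using_read_model(self, amount: int) -> bool:
        """Flawed check: approves against the possibly stale read balance."""
        if self.read_balance < amount:
            return False
        self._debit(amount)
        return True

    def withdraw_using_write_model(self, amount: int) -> bool:
        """Safer check: approves against the authoritative write balance."""
        if self.write_balance < amount:
            return False
        self._debit(amount)
        return True

    def _debit(self, amount: int) -> None:
        self.write_balance -= amount
        self.pending.append(self.write_balance)  # queued, not yet visible to reads

    def sync_read_model(self) -> None:
        """Drain the queue so the read side catches up."""
        while self.pending:
            self.read_balance = self.pending.popleft()


stale = Account(5_000)
approved_stale = sum(stale.withdraw_using_read_model(1_000) for _ in range(10))
print(approved_stale, stale.write_balance, stale.read_balance)  # 10 -5000 5000

safe = Account(5_000)
approved_safe = sum(safe.withdraw_using_write_model(1_000) for _ in range(10))
print(approved_safe, safe.write_balance)  # 5 0
```

In the first run all ten withdrawals pass because the read balance never moves while the queue is backed up. In the second run only five pass, since the check no longer depends on how quickly the read side catches up.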
To work around these issues and keep the data models in sync, some extra work is needed, which may come at the cost of additional latency while waiting for the read model to catch up. Naturally, this trade-off may not be suitable for financial, banking, or healthcare systems with zero tolerance for data inconsistency.
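One possible form of that extra work is to version every write and let a query wait until the read model has reached a given version. The sketch below simulates the propagation delay with threading.Timer. The class names, the lag parameter, and the choice to have the command hand back a version number as an acknowledgement (a pragmatic deviation from strict "commands return nothing") are all assumptions made for the illustration.

```python
import threading


class ReadModel:
    def __init__(self, balance: int):
        self.balance = balance
        self.version = 0
        self._changed = threading.Condition()

    def apply(self, version: int, balance: int) -> None:
        """Called by the simulated queue consumer when an update arrives."""
        with self._changed:
            self.balance, self.version = balance, version
            self._changed.notify_all()

    def balance_at(self, min_version: int, timeout: float = 1.0) -> int:
        """Block until the read model has caught up to min_version."""
        with self._changed:
            caught_up = self._changed.wait_for(
                lambda: self.version >= min_version, timeout=timeout
            )
            if not caught_up:
                raise TimeoutError("read model is still behind")
            return self.balance


class WriteModel:
    def __init__(self, balance: int, read_model: ReadModel, lag: float):
        self.balance = balance
        self.version = 0
        self._read_model = read_model
        self._lag = lag  # simulated propagation delay in seconds

    def withdraw(self, amount: int) -> int:
        """Apply the command and return the new version as an acknowledgement."""
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        self.version += 1
        # Deliver the update to the read side after the simulated lag.
        threading.Timer(
            self._lag, self._read_model.apply, args=(self.version, self.balance)
        ).start()
        return self.version


read_model = ReadModel(5_000)
write_model = WriteModel(5_000, read_model, lag=0.05)

version = write_model.withdraw(1_000)
fresh = read_model.balance_at(version)  # waits roughly 50 ms for the update
print(fresh)  # 4000
```

The caller now sees its own write, but pays for it with the wait inside balance_at, which is exactly the latency trade-off described above.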
In addition to more challenging codebase development, another source of complexity for CQRS-based systems is testing. Multiple data models, complex scenarios, and asynchronous processes can make automated tests difficult to run reliably, because messages can get stuck in the queue and the read model may not be synchronized by the time the assertions run.
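A common way to cope with this in tests is to poll the read side until an assertion passes or a deadline expires, instead of asserting immediately after sending a command. The helper below is a minimal sketch. The name eventually, the default timings, and the place_order stand-in are assumptions, not part of any specific test framework.

```python
import threading
import time


def eventually(check, timeout: float = 2.0, interval: float = 0.01) -> None:
    """Re-run check() until it stops raising AssertionError or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            check()
            return
        except AssertionError:
            if time.monotonic() >= deadline:
                raise  # surface the last failure to the test runner
            time.sleep(interval)


read_model = {}  # hypothetical read side: order_id -> status


def place_order(order_id: int) -> None:
    """Stand-in command whose effect reaches the read side after a short delay."""
    threading.Timer(
        0.05, read_model.__setitem__, args=(order_id, "placed")
    ).start()


def test_order_becomes_visible() -> None:
    place_order(42)

    def check():
        assert read_model.get(42) == "placed"

    eventually(check)


test_order_becomes_visible()
print(read_model)  # {42: 'placed'}
```

Polling keeps such tests from failing spuriously, but it also makes them slower and hides how long synchronization actually takes, so the timeout is worth choosing deliberately.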
Putting it all together
As we start seeing an increased interest in the CQRS pattern, it’s worth remembering that CQRS is not a silver bullet. Depending on the business context and practical implementation, CQRS can create more problems than it solves. So, instead of boosting performance and enhancing scalability, you can end up with over-engineering and unnecessary complexity.
We have rounded up the arguments for when the CQRS pattern can be a valid design choice and when it’s better to opt for a simpler approach.
As has become clear, CQRS can bring many benefits, but it can just as easily add risky complexity. Even Udi Dahan, one of the foremost experts on the topic, strongly suggests avoiding CQRS most of the time.
The reason is that for most businesses, introducing the CQRS pattern would be over-engineering and a drain on resources. Unless you are the next Twitter or Facebook with millions of users, your performance and scalability problems can be solved in an easier and more cost-efficient way, for example by tuning cloud configurations or switching to managed services. In other words, the challenge is not to build a software solution that uses modern architectural patterns and frameworks but to deliver a solution that will meet your particular business needs – reliably and effectively.