top of page
  • kartikayluthra

Reactive Architecture: Part 2 - How to Build Scalable Systems?

In the first part of this blog we looked at the basics of reactive architecture, what is reactive architecture? The four pillars of reactive architecture, what are microservices, and how to build microservices. Here’s a link to the previous article which we advise you once go through in order to completely understand everything in this blog. Reactive Architecture: A complete guide

As we have mentioned in the last edition of this blog Reactive Architecture helps business owners build applications that are Elastic. This means that an application should have the ability to scale up when required without compromising its responsiveness but also be able to scale down efficiently.


How to build Scalable Systems?

The world around us is a distributed system. When you have a large application that is migrating from Monolithic to Microservices you need a Distributed System in order to reap its benefits. A distributed system is defined as a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another.

Consistency, Availability, and Scalability

These are the three important properties of distributed systems that are often considered trade-offs.

Consistency refers to the requirement that all nodes in a distributed system see the same data at the same time. In other words, it ensures that every read receives the most recent write or update.

Availability refers to the property that every request to the system receives a response, without a guarantee that it contains the most recent version of the information. In a highly available system, nodes are typically replicated to handle failures.

Scalability refers to the ability of a system to handle the increasing workload by adding more resources, such as adding more nodes to a cluster. A scalable system can handle growing amounts of data and user traffic without a decrease in performance.

How does this impact Performance?

At the end of the day, we want our application to perform the best which will in turn lead to the best interactive experience for our users.

Consistency, Availability, and Scalability do impact the performance of an application in the following ways:

  1. Consistency, by ensuring that all nodes in the system see the same data at the same time, can help to prevent data inconsistencies and conflicting updates. However, maintaining consistency can also introduce overhead in the form of additional network communication and synchronization, which can slow down the system and impact performance.

  1. Availability, by ensuring that every request to the system receives a response, can improve the overall reliability and responsiveness of the system. However, maintaining high availability can also require replication and redundancy, which can increase the complexity of the system and consume additional resources, impacting performance.

  1. Scalability, by allowing a system to handle the increasing workload by adding more resources, can help to ensure that the system can continue to perform well as the demand grows. However, scalability can also introduce new challenges such as data partitioning, load balancing, and coordination between nodes, which can also impact performance.

In conclusion, the relationship between consistency, availability, scalability, and performance in a distributed system is complex and dynamic. Designers and engineers need to consider all three properties and make trade-off decisions to achieve a balance that meets the specific needs and requirements of their system.

Laws of Scalability

Although it is extremely difficult to achieve all three at the same time for one particular system. We still want a mix of all three in order for the application to perform and have the best user experience possible.

For building systems that are scalable, there are several laws and principles that govern scalability in distributed systems. Here are a few of the most commonly cited ones:

  1. Amdahl's Law: This law states that the maximum theoretical speedup that can be achieved by parallelizing a task is limited by the fraction of the task that cannot be parallelized. In other words, the scalability of a system is limited by its bottlenecks.

  1. Brewer's CAP Theorem: This theorem states that it is impossible for a distributed system to provide all three guarantees of Consistency, Availability, and Partition Tolerance simultaneously and that a system can provide at most two of these guarantees.

These laws and principles provide a foundation for understanding the challenges and trade-offs involved in achieving scalability in distributed systems. By considering these laws, designers, and engineers can make informed decisions about how to design and build scalable systems that meet the specific requirements and constraints of their applications.

Now, when we hire additional resources for scaling our system, this often leads to contention and coherence delay within the application. We are going to look at how both of these terms affect the performance and scalability of the application.

Contention and Scalability

Contention in distributed systems is defined as competition between the resources in order to perform a specific service. This leads to problems in scaling up or scaling down our system. It is very important to minimize the impact of contention in our system in order to scale up or down.

Sharding is a technique used to minimize Contention in a system.

What is Sharding?

Sharding is a technique that is vastly implemented in databases. Here we are gonna take a look at how Sharding can be implemented in applications.

Sharding refers to the process of horizontal partitioning large and complex data and distributing those chunks of data across multiple servers. The ultimate goal of sharding is to improve the performance and Scalability of the system.

In reactive architecture, sharding is done to improve the response time of database-intensive applications by reducing the amount of data that needs to be processed by each server. This can also help reduce the downtime in case any of the servers fail as only a portion of the data gets affected.

How does Sharding improve System Performance?

As we have mentioned above Sharding is a process by which we divide the application into smaller chunks across multiple. As you must have thought it does improve the performance of the system as it reduces the load on servers. Sharding is however a new-age technique and is supported by the Databases that access NoSQL. Some of the commands in MongoDB for sharding are as follows:

Name of the Command



Aborts a resharding operation


Adds a shard to a sharded cluster


Associates a shard with the zone, supports configuring zones in sharded clusters


Returns information on whether the chunks of a sharded collection are balanced.

New in version 4.4


Starts a balancer thread.

This is just a list of very few commands although there are plenty more that you can look up on the internet.

Coherence Delay and Scalability

Coherence Delay in applications refers to the time it takes for different processing units in a parallel computing system to become completely aware of each other’s updates to shared memory. In a parallel processing system, multiple processing units may access and modify shared data concurrently.

Coherence Delay occurs because of the need to maintain cache coherence which ensures that all caches must have the same view of the shared data. All caches must share the same data as it is important in order to maintain consistency within the system. When processing units modify data the updates must be propagated to other caches of processing units.

How do Coherence Delay and Contention affect Scalability?

To build a scalable you need to make sure that the system suffers from very less contention and Coherence Delay. Gunther’s Theorem states that “Contention and Coherence Delay leads to a negative impact on performance, as the system scales up the cost of coordinating between nodes exceeds any benefits”

Due to the increased cost of coordinating between nodes, it leads to restricting our ability to scale our application. If there is a lot of contention between nodes this leads to an increase in the response time. For example, The API gateway of one service calls another service in order to proceed with user actions, now if there are multiple nodes competing with each other in order to complete the action this will lead to the user waiting for more which leads to poor performance and also does not abide by the 4 pillars of Reactive Architecture.

If there is an increase in coherence delay this will lead to certain services of applications responding slowly as the updated data has not reached them till now. Now, if this problem persists within our application and when we try to scale up, with the availability of additional resources the update will take a longer time in order to reach the desired node. This will lead to more performance issues than before.

In our next article, we are going to look at Availability and Consistency in Scalable Systems and also understand the very popular topics right now in Software known as Event Sourcing and CQRS.


Rated 0 out of 5 stars.
No ratings yet

Add a rating
bottom of page