Ways to Scale Your App and Which Scaling Strategy to Follow


When defining the characteristics of a future application, any entrepreneur is guided by their goals and requirements, and scalability is frequently at the top of the list. Everybody wants a stable app that passes the “Black Friday” test without a glitch when it matters.

Some intend to build a high-load app and insist on outstanding scalability, keeping in mind future business growth and increasing user demand. Others have no plans for a high-load app at the moment but add scalability just in case, believing that adding it later would cost significantly more than building it in from the start.

So is scalability a must for every kind of application? Can you avoid extra expenditure in the future by building in scalability at an early stage of web app development? Let’s take a closer look at an application’s ability to scale and answer these and other questions related to application scalability.

Table of Contents

  • Application scalability. What does this mean?
  • Define the strategy to scale your application
  • Final thoughts

Application scalability. What does this mean?

The term refers to a system’s ability to scale its resources up or out in line with demand. It helps your application handle a growing number of user requests without compromising performance. There are two ways to increase an application’s capacity:

Horizontal and vertical scaling

  • Vertical scaling, or scale-up
    This approach means increasing the resources available to the system. For example, before scaling, our app ran on a machine with 4 GB of memory and 8 cores; to scale, we switch to a more powerful machine with 8 GB of memory and the same 8 cores. Vertical scaling means increasing the capacity of the single server our application is deployed on.
  • Horizontal scaling, or scale-out
    The scale-out approach means that instead of increasing the resources of one machine, we increase the number of machines by duplicating them. For example, we have a service deployed on one machine. As the load increases, we duplicate the machine running the same service and balance the load between the copies.
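The scale-out idea can be illustrated with a toy round-robin balancer. This is a minimal sketch, not a production load balancer (in practice you would use something like NGINX, as discussed below); the instance addresses and function names are purely illustrative.

```python
from itertools import cycle

# Hypothetical pool of duplicated app instances (addresses are placeholders).
instances = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

# A naive round-robin balancer: each incoming request is handed to the
# next instance in the pool, spreading load evenly across the duplicates.
next_instance = cycle(instances).__next__

def route_request(request_id: int) -> str:
    """Return the instance that should handle this request."""
    return next_instance()

# Six requests cycle through the three duplicated machines twice.
targets = [route_request(i) for i in range(6)]
print(targets)
```

Real balancers add health checks and weighting on top of this basic rotation, which is exactly where the challenges mentioned below come from.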

The majority of systems are optimized for the scale-out approach. But load balancing brings its own challenges, related to balancing principles and the need for health monitoring.

These challenges can be addressed through balancing principles. On the backend, one of them is the stateless API approach, a principle governing how the backend is built and how endpoints are organized. It means that handling a request does not alter the server’s state, and each subsequent request neither affects nor depends on the previous one. If the backend follows REST principles and is stateless, a single request is enough to produce the final result, so calls can be made independently of one another. Moreover, every call carries all the data it needs to complete, and no hidden session data is used during request processing. The request leaves no state on the server, so a second request cannot access anything from the first. The stateless approach simplifies balancing and makes the system independent of the number of instances running.
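A stateless endpoint can be sketched as a pure function: every request carries all the data it needs, so any instance behind the balancer can serve it, and no call depends on a previous one. The handler name and request shape here are assumptions for illustration only.

```python
# A stateless handler sketch: the request dict carries everything needed,
# so there is no server-side session to consult or mutate.

def handle_quote(request: dict) -> dict:
    """Compute an order total from the request alone -- no hidden state."""
    total = sum(item["price"] * item["qty"] for item in request["items"])
    return {"user_id": request["user_id"], "total": total}

# Two independent calls: neither depends on state left behind by the other,
# so they could be served by two different machines.
r1 = handle_quote({"user_id": 1, "items": [{"price": 10.0, "qty": 2}]})
r2 = handle_quote({"user_id": 2, "items": [{"price": 5.0, "qty": 3}]})
print(r1, r2)
```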

Stateless API approach

Stateful API approach

This matters because different requests from the same user may land on different servers. So if the system needs to save state, it must be saved in an intermediate store. For example, a purchase request contains user data. Before the system redirects the user to the payment page for a separate payment request, that data must be saved somewhere. Various stores, such as databases, a Redis cache, or Azure Blob Storage, can hold the interim data. If the application is stateful rather than stateless, one option is to configure the load balancer to route all requests from a given user to the same server (sticky sessions keyed by a user session ID).
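The purchase flow above can be sketched as follows. A plain dictionary stands in for a shared store such as Redis; in a real deployment you would use an actual client (e.g. redis-py) so that every instance sees the same data. All names, keys, and server labels are hypothetical.

```python
# Sketch of keeping interim state outside the app servers. The dict is a
# stand-in for a shared store (Redis, a database, blob storage, ...).
shared_store: dict[str, dict] = {}

def begin_purchase(server: str, user_id: str, cart: dict) -> None:
    # Whichever instance receives the purchase request writes the user's
    # data to the shared store before redirecting to the payment page.
    shared_store[f"purchase:{user_id}"] = {"server": server, "cart": cart}

def complete_payment(server: str, user_id: str) -> dict:
    # A *different* instance can pick the state back up for the payment
    # request, because the state never lived on an individual server.
    return shared_store.pop(f"purchase:{user_id}")["cart"]

begin_purchase("server-a", "u42", {"item": "book", "price": 12})
cart = complete_payment("server-b", "u42")
print(cart)
```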

Tools and methods used for app scaling depend on two things: the deployment model and application type. 

  • Deployment model
    We can deploy the application on classic instances, such as Amazon EC2, or on virtual machines, and we choose scaling tools accordingly. For example, it can be the NGINX load balancer, or Apache ZooKeeper deployed on a separate machine and used to monitor the machines’ load. Depending on the task, we can also use hosted services to delegate part of the work: static content, for instance, can be offloaded to the CloudFront CDN. We then apply the balancer and configure scaling rules for each service in the Amazon infrastructure. For EC2, we can configure a rule so that instances are launched with a larger instance type rather than a small one (say, moving up from a micro to a large or xlarge size) to ensure vertical scalability. Or we can add instances when the load reaches a specified threshold: for example, if CPU utilization hits 80%, we deploy extra instances to keep the system running smoothly.
  • Application type
    There is no one-size-fits-all scaling method. To select an appropriate one, we consider the app type, its components, and which of them bear the highest load. There are CPU-intensive applications, where processor capacity is the bottleneck; data-intensive applications, which process massive amounts of data and require database scaling; and network-intensive applications, which depend on network capacity.

Define the strategy to scale your application

Many customers want their applications to have ideal scalability from the very beginning because they believe that adding it later would hit the wallet much harder. But this assumption is not entirely true.

Incurring significant infrastructure costs and spending time and effort to make the application scalable from the outset does not make much sense until you know for certain that it is in demand. When the first signals of scalability problems arrive, you can take steps toward growth with more confidence.

Alexander Shumski quote

When you know which components bear the high load, it is easier to tackle scalability challenges as they emerge. For example, you develop an application expecting it to be network-intensive, but once it goes live, it turns out to be CPU-intensive. In that case, the scalability issue can be solved by changing instance types in the AWS infrastructure, which is much cheaper than changing the application architecture.

Application complexity and scope are also decisive factors in choosing a scaling strategy. For a startup or a small company, the task is not overly complex. Applications with a relatively small codebase (1–2 years of development) are easier and cheaper to move to a microservices architecture, which allows each service to be scaled in the right proportion.

And if we have only 5–6 months of development behind us when we learn that the application is in demand and calls for scaling, the task is simpler still. The application contains little logic, a small database, and few dependencies, which makes it easier to split into microservices or simply to switch to more performant solutions where the bottlenecks are. For example, if the database is the bottleneck, we can adopt a Database-as-a-Service (DBaaS) solution and delegate scaling to the provider’s infrastructure (for example, AWS).

However, we strongly recommend taking the app’s specifics and business goals into account, because in some cases seemingly cheap ways to ensure scalability turn out to be extremely expensive in the end. Consider a simple example. You have a CPU- and network-intensive video processing application that handles production for a news channel and involves online streaming. In this case, it makes sense to build your own data center and move the processing load there, leaving only content distribution in the cloud. Otherwise, hosting expenses may be so high that they outweigh the advantages.

Complex legacy applications sometimes require scaling too. A monolithic system is a hodgepodge of logic and data: it contains multiple dependencies and one or several tightly coupled databases. Such a system is difficult to scale, yet it is possible. The approach depends on the task at hand and where the application is hosted. If the app is hosted in an in-house data center, we use the “lift and shift” method, migrating the entire application with its infrastructure to the cloud. Then we apply vertical scaling to buy some time and start reworking the application’s structure.

Final thoughts

Keep in mind that there is a huge difference between a typical startup and a legacy monolithic application, and no universal scaling approach fits both. When selecting the optimal scaling method, remember that it depends on your business goals, your application’s specifics, and the components it consists of.

We believe that investing in scalability before you are sure your application will face high load is an irrational approach. By waiting until you actually start facing scalability issues, you get the chance to define your strategy based on hard facts rather than gut feeling.
