The goal of scaling
Performance gain is not the goal of scaling. Scaling can improve performance in some cases, but it can also introduce a small amount of overhead.
The goal of scaling is to:
- Increase available resources to meet workload demands
- Decrease available resources to avoid overcapacity
- Find the balance between stability and cost
- Keep your performance stable with the minimum amount of resources (and perhaps a bit of redundancy)
Scaling is possible in several ways, each with its own advantages and disadvantages.
Vertical scaling means that you add more resources (CPU, RAM, etc.) to the web server.
Advantages of vertical scaling:
- It keeps your application architecture simple
- It has a positive impact on performance even when workload is low
Disadvantages of vertical scaling:
- It has a much lower ceiling on the workload capacity that you can gain compared to horizontal scaling
- It does not add any failover or redundancy to your architecture
- It cannot scale dynamically based on demand
Horizontal scaling means that you expand the infrastructure (cloud) or hardware (on-premises) by adding more web servers. A load test can provide an estimate of the required number of Indicium instances.
Advantages of horizontal scaling:
- It has virtually no upper bound on workload capacity
- It can scale dynamically based on demand
- It adds redundancy to your architecture
Disadvantages of horizontal scaling:
- It requires a more complicated architecture
- It can increase the overhead of the application
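As a rough illustration of how a load test result could translate into a number of web servers, here is a minimal sketch. The function name, its parameters, and the idea of expressing capacity as "concurrent users per instance" are assumptions made for this example; they are not part of the Thinkwise Platform.

```python
import math


def estimate_instances(peak_users: int, users_per_instance: int,
                       spare_instances: int = 1) -> int:
    """Estimate how many web server instances a workload needs.

    peak_users:         expected peak number of concurrent users
    users_per_instance: hypothetical capacity of one instance, as measured
                        in a load test
    spare_instances:    extra instances added for redundancy
    """
    if peak_users < 0 or users_per_instance <= 0 or spare_instances < 0:
        raise ValueError("arguments must be non-negative, capacity must be positive")
    # Round up: a partially used instance still has to exist.
    needed = math.ceil(peak_users / users_per_instance)
    # Always keep at least one instance running, then add the spares.
    return max(needed, 1) + spare_instances
```

For example, with 900 peak users and a measured capacity of 400 users per instance, the sketch returns three instances for the workload plus one spare.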
For horizontal scaling, load balancing is required. Load balancing is the process of distributing a set of tasks over a set of resources, to make their overall processing more efficient.
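One of the simplest distribution strategies is round robin, where each new request goes to the next server in the list. The sketch below only illustrates the idea; the class name and the server names are hypothetical, and real load balancers usually offer more strategies (for example, least connections).

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._servers)


balancer = RoundRobinBalancer(["web-1", "web-2"])
picks = [balancer.pick() for _ in range(4)]
```

After these four requests, `picks` alternates between the two servers, so each has handled the same number of requests.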
For your end application, built with the Thinkwise Platform, sticky sessions are an option in a load-balanced environment. With sticky sessions, also known as session persistence, a load balancer creates an affinity between a client and a specific network server for the duration of a session (i.e. the time a specific IP spends on a website). When you use sticky sessions, all requests inside a session are directed to the same web server. Because of this, all resource states created within that session can be kept in memory, as only that web server requires access to them.
The in-memory state is faster, but tying users to a single web server for their entire session limits the effectiveness of load balancing. It is, for example, possible that the sessions on one web server coincidentally cause much more load than the sessions on another web server. With sticky sessions, the load balancer can no longer correct that imbalance.
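The imbalance described above can be made concrete with a small sketch. It assumes a hypothetical balancer that assigns each new session to the next server in turn and then keeps that assignment; the class, function, and server names are invented for the illustration.

```python
from collections import Counter
from itertools import cycle


class StickyBalancer:
    """Assigns each new session to a server once, then always reuses it."""

    def __init__(self, servers):
        self._next_server = cycle(list(servers))
        self._affinity = {}  # session id -> assigned server

    def route(self, session_id):
        # Only the first request of a session picks a server.
        if session_id not in self._affinity:
            self._affinity[session_id] = next(self._next_server)
        return self._affinity[session_id]


def load_per_server(balancer, session_ids):
    """Count how many requests each server receives."""
    return Counter(balancer.route(s) for s in session_ids)


# Session "a" is very active, session "b" sends a single request.
requests = ["a"] * 8 + ["b"]
load = load_per_server(StickyBalancer(["web-1", "web-2"]), requests)
```

Here `web-1` handles all eight requests of session "a", while `web-2` handles only the single request of session "b". A balancer without sticky sessions would be free to spread those nine requests more evenly.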
When to use scaling
No scaling or vertical scaling performs best:
- If your workload is low and you do not need redundancy, do not use scaling.
- If your workload increases a bit, first look into vertical scaling.
Use horizontal scaling with sticky sessions:
- If your workload is medium (i.e. many hundreds of concurrent users) and very predictable. In that case, sticky sessions are ideal and perform better than non-sticky sessions.
Use horizontal scaling without sticky sessions:
- For the highest workloads, because it allows for virtually limitless scaling
- If you need dynamic scaling (just in time and just enough scaling)
- To ensure that there is no impact on users when a server goes down
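The guidelines above can be summarized as a small decision helper. This is only a sketch: the workload labels ("low", "increased", "medium", "highest") and the parameter names are made up for the example, and the choice for a medium but unpredictable workload is an assumption, since the guidelines do not cover that case explicitly.

```python
def recommend_scaling(workload: str,
                      needs_failover: bool = False,
                      needs_dynamic_scaling: bool = False,
                      predictable: bool = True) -> str:
    """Map the guidelines to a recommendation.

    workload is one of the hypothetical labels
    "low", "increased", "medium", or "highest".
    """
    if workload not in {"low", "increased", "medium", "highest"}:
        raise ValueError(f"unknown workload label: {workload!r}")

    # Highest workloads, dynamic scaling, or failover without user impact.
    if workload == "highest" or needs_dynamic_scaling or needs_failover:
        return "horizontal scaling without sticky sessions"
    if workload == "medium":
        if predictable:
            return "horizontal scaling with sticky sessions"
        # Assumption: not covered by the guidelines, so pick the more
        # flexible option.
        return "horizontal scaling without sticky sessions"
    if workload == "increased":
        return "vertical scaling"
    return "no scaling"
```

For instance, `recommend_scaling("medium")` yields horizontal scaling with sticky sessions, while `recommend_scaling("low", needs_failover=True)` yields horizontal scaling without sticky sessions.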
Scaling for different platforms
Read more about how to scale your environment for the platform you are using:
The best practice for exposing your application to the internet is through an application-level gateway at the edge of your network.
- A gateway is simply a reverse proxy that can forward traffic to another part of the network (such as a web server) and forward the response back to the client.
- A load balancer is a type of reverse proxy that is able to distribute load evenly amongst a set of web servers. A load balancer is mandatory in horizontal scaling setups.
- A WAF (Web Application Firewall) is also a type of reverse proxy, which can monitor, validate and filter incoming traffic for security purposes. In particular, a WAF can be an effective measure to prevent denial of service attacks, where an attacker attempts to make a service unavailable through high volumes of traffic or other means.
In short, a gateway is always a reverse proxy and it can also be a load balancer, a WAF or both. The best choice depends on the circumstances.
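To illustrate how these roles can be combined in one component, here is a deliberately simplified sketch. The `Gateway` class, its parameters, and the response strings are hypothetical. The per-client request cap stands in for a WAF rule; real WAFs typically use time windows and far richer checks.

```python
from itertools import cycle


class Gateway:
    """Reverse proxy sketch: optional WAF-style check, then forwarding.

    With one backend it is a plain gateway, with several backends it also
    acts as a load balancer, and with a request cap it also filters traffic.
    """

    def __init__(self, backends, max_requests_per_client=None):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = cycle(backends)
        self._max = max_requests_per_client
        self._seen = {}  # client IP -> number of requests so far

    def handle(self, client_ip, path):
        # WAF role: reject clients that exceed the (simplified) cap.
        if self._max is not None:
            count = self._seen.get(client_ip, 0) + 1
            self._seen[client_ip] = count
            if count > self._max:
                return "429 rejected by WAF"
        # Load balancer role: forward to the next backend in turn.
        backend = next(self._backends)
        return f"200 {path} served by {backend}"


gateway = Gateway(["web-1", "web-2"], max_requests_per_client=2)
responses = [gateway.handle("10.0.0.1", "/") for _ in range(3)]
```

In this run, the first two requests are forwarded to `web-1` and `web-2`, and the third request from the same client is rejected before it reaches a web server.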
This topic shows examples of scaling.
Example: two Indiciums
Example: a second portal
To reduce the risk of a single point of failure, a second portal can be added.
Example: a second server
The same applies to the SQL server: to reduce the risk of a single point of failure, you can add a second one.
Supported setups are:
- SQL Always On
- SQL Clustering (active/passive or active/active)