60% cost optimization through autoscaling and up to 70% savings from renting Spot Instances

Client’s results

0.05% of online profit on Black Friday spent on AWS resources

0.2 s website response time

60% cost savings on autoscaling

Industry | Retail Fashion

Company size | 1K-5K employees

Service | Migration to AWS, autoscaling configuration, AWS infrastructure support

Location | Ukraine

About INTERTOP

INTERTOP Ukraine is the leader of Ukrainian fashion retail. It has been on the market for 30 years and manages 14 retail chains of multi-brand and mono-brand stores. These include INTERTOP, Armani Exchange, Emporio Armani, EA7, Ecco, Geox, Marc O’Polo, Napapijri, Skechers, Timberland, The North Face, Vans, Kiko Milano cosmetics chain, and INTERTOP Outlet. The intertop.ua platform offers 130,000 products from more than 500 brands in the categories of shoes, clothes, accessories, and cosmetics. The main goal of INTERTOP Ukraine is to become a technology-driven fashion company that keeps up with market trends and provides its clients with exciting shopping experiences. The company has achieved full integration between the website, mobile application, 157 offline stores, terminals, warehouses, suppliers, and accounting systems.

Main challenges

Challenge 1

Stop system downtime and issues with stability during peak load periods

The company hosted its system on on-premises servers. This was enough at first, but as the business developed and the website load grew, stability issues and downtime increased to several incidents a month, occurring with any marketing activity.
Hardware limitations, maintenance overhead, complex and time-consuming scaling mechanisms, and significant ongoing costs made it impossible to stick to the current solution.

Challenge 2

Optimize infrastructure cost

Hosting the system on on-premises servers also caused overspending on infrastructure costs.
It was quite challenging for the company to control resource allocation and feasibility, scale resources efficiently, cover unforeseen maintenance needs, and reduce energy consumption.
INTERTOP wanted to spend money on infrastructure efficiently, track costs, and make informed decisions about infrastructure budgeting.

Challenge 3

Speed up the work of the software development team

Efficient communication and cooperation between DevOps and software development teams is one of the keys to a more productive software development process.
INTERTOP required DevOps assistance that could strengthen developers, facilitate the work on new features and updates, and shorten time-to-market.
They didn’t want to hire in-house experts because the company had no relevant internal expertise to manage them, and there was not enough work for a full-time employee.

What we did

Solution 1

Migration from on-premises to the cloud

As on-premises servers were unable to cope with the growing load on the system, we offered cloud migration.

Our advice was to migrate to AWS right away. However, due to budget constraints and hesitations, the client first chose a small virtual hosting provider.

Its competitive price was a great advantage but it had limited server capacity, its service ecosystem and integration opportunities weren’t extensive, and its reliability and uptime lagged behind the standards of the leading global providers.

With this small hosting provider, it was impossible to resize resources automatically. The team had to wait a few hours for additional servers to be provisioned and pay the full rate for idle servers. Even so, the system crashed several times a month, sometimes with data loss, and we had to restore the data from backups. All this was inconvenient and inefficient, so migration to AWS was a rescue.

We chose the lift-and-shift migration model to move the system to AWS as is. This is the least time- and cost-intensive model, and it leaves room for further infrastructure improvement and optimization. It took us 2 weeks to migrate to AWS using this approach, instead of the 2-3 months that rearchitecting could have taken.

We moved the system to EC2 instances, configured scalability and load balancing, and set up CI/CD pipelines.

The migration to AWS was completed before Black Friday 2019. This solution helped the website cope with the upcoming load and with the steady growth of peak loads overall.

Since then, the website load has grown yearly, and INTERTOP’s AWS system has been stable, durable, and scalable.

Thanks to the AWS-based solution, INTERTOP successfully went through Black Friday 2019.

Solution 2

System rightsizing and adaptation to the growing website load

There were three main tasks defined within this solution: adapt resources to the growing traffic load, prepare the system, and carry out load testing.

Adapting resources to the load increase consisted of the following steps:

Tuning autoscaling
Adding DB nodes and increasing their capacity
Creating separate groups of web servers for the admin panel and API
Increasing the performance of the web cluster
Optimizing software code and profiling it continuously to find bottlenecks

Autoscaling tuning
According to official AWS recommendations, scaling should be based on CPU load. In real life, however, this turns out not to be enough. The complexity of modern applications requires a combined approach, which is what we applied in the solution for INTERTOP.

We considered not only processor loads, but also the number of occupied and available PHP workers, and the number of connections to one server.
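A minimal sketch of how such a combined scale-out check could look is given below. It only illustrates the idea of looking at several signals at once. The field names, the thresholds, and the "any signal over its limit" rule are assumptions made for this example and are not taken from INTERTOP's actual configuration.

```python
from dataclasses import dataclass


@dataclass
class ServerMetrics:
    """Snapshot of one web server (all field names are illustrative)."""
    cpu_percent: float       # CPU utilisation, 0-100
    busy_php_workers: int    # PHP workers currently handling requests
    total_php_workers: int   # size of the PHP worker pool (assumed > 0)
    connections: int         # open connections to this server


# Hypothetical thresholds; real values would come from load testing.
CPU_LIMIT = 70.0
WORKER_BUSY_LIMIT = 0.8
CONNECTION_LIMIT = 400


def should_scale_out(fleet: list[ServerMetrics]) -> bool:
    """Return True if any of the three fleet-averaged signals exceeds its limit."""
    if not fleet:
        return True  # no servers reporting at all: always add capacity
    n = len(fleet)
    avg_cpu = sum(s.cpu_percent for s in fleet) / n
    avg_busy = sum(s.busy_php_workers / s.total_php_workers for s in fleet) / n
    avg_conn = sum(s.connections for s in fleet) / n
    return (
        avg_cpu > CPU_LIMIT
        or avg_busy > WORKER_BUSY_LIMIT
        or avg_conn > CONNECTION_LIMIT
    )
```

With these example limits, a fleet whose CPU sits at 35-40% but whose PHP workers are almost all busy would still trigger a scale-out, which is exactly the case a CPU-only rule misses.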

Adding and upgrading database nodes
This became a must before big and important events like sales. We observed that during sales the load on the database grew abruptly, so we also considered it reasonable to keep some capacity in reserve for peak loads.

Creating separate web server groups for the admin panel and API
The admin panel influenced the website greatly. It contained a lot of heavy parts, requests, and pieces of code, and when operators used it intensely, it slowed down the website. When we isolated the admin panel, the problem ended.

This decision proved very successful and reduced the total number of servers.

The numbers before dividing servers into logical parts were:
Night – 4 servers
Day – up to 35 servers
Peak load – up to 70 servers
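The split into separate server groups for the admin panel and API can be pictured as a routing rule placed in front of the groups. The sketch below assumes the split is made by URL path prefix. The source does not say how traffic was actually separated, and the prefixes and group names are hypothetical.

```python
# Hypothetical path prefixes and group names, chosen for illustration only.
ROUTES = [
    ("/admin", "admin-servers"),
    ("/api", "api-servers"),
]
DEFAULT_GROUP = "web-servers"


def pick_server_group(path: str) -> str:
    """Return the server group that should handle a request path."""
    for prefix, group in ROUTES:
        # Match the prefix itself or anything below it, but not "/administrator".
        if path == prefix or path.startswith(prefix + "/"):
            return group
    return DEFAULT_GROUP
```

Under this rule, heavy admin requests such as `/admin/orders` never reach the servers that render the storefront, so each group can be sized and scaled for its own load.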

Increasing the performance of the web cluster
Software developers noted that a piece of PHP code ran slowly on the production cluster, while the same code ran 30% faster in the test environment. When we compared the environments, we noticed that production partially used T3A instances, which run on AMD processors. These processors reduced the performance in production. When we removed them from the cluster, the issue was fixed.
Later we tested Intel and AMD processors once more, and the AMD processors again turned out to be 30% slower, so we had to return to Intel.
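A comparison like this can be made repeatable with a small timing helper that is run on each instance type. The sketch below is an assumption about how such a check might be done, not the tooling used in the project. It takes "30% slower" to mean 30% more run time than the baseline, which is one possible reading of the figure.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], repeats: int = 5) -> float:
    """Run fn several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """How much more time the candidate takes than the baseline, in percent."""
    return (candidate_s - baseline_s) / baseline_s * 100.0
```

For example, a workload that takes 1.0 s on the baseline instance and 1.3 s on the candidate gives a slowdown of about 30%. The median is used so that a single noisy run does not skew the result.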

Optimizing software code and profiling it continuously to find bottlenecks
This work was done non-stop. A profiler was used to find bottlenecks in the code, and developers analyzed and eliminated them, with our assistance if necessary.

System preparation consisted of the following parts:

We conducted load testing after every change in the system and checked everything until we received the expected result.
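The "test after every change until the result is as expected" loop can be sketched as a small harness that fires concurrent requests and checks a latency percentile against a target. Everything below is an illustrative assumption: the source does not name the load-testing tool or the acceptance criteria, and `request` stands in for whatever call hits the system under test.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_load_test(request: Callable[[], None], total: int, concurrency: int) -> list[float]:
    """Call request() `total` times from `concurrency` threads; return latencies in seconds."""
    def timed(_: int) -> float:
        start = time.perf_counter()
        request()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed, range(total)))


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def meets_target(samples: list[float], target_s: float, pct: float = 95.0) -> bool:
    """True if the chosen percentile of the latencies is within the target."""
    return percentile(samples, pct) <= target_s
```

A run could then be judged with something like `meets_target(run_load_test(request, total=1000, concurrency=50), target_s=0.2)`, where the 0.2 s target borrows the response time quoted in this case study purely as an example threshold.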

All the changes and improvements made it possible to normalize system provisioning, which also led to significant cost optimization as INTERTOP requested.

Here are the benefits the client received:

100% autoscaling
Ability to add and remove resources dynamically (server deployment takes up to 60 seconds)
Automatic failover (in case of a failure, AWS needs only 30-120 seconds to restore a backup copy of a service without human intervention)
Opportunity to use native AWS services in the future
60% cost savings on autoscaling
Up to 70% cost savings from renting Spot Instances
About 30% savings from purchasing Reserved Instances
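To see how the Spot and Reserved Instance discounts in this list combine, the arithmetic can be sketched as below. The two discount values are the figures quoted above (with 70% being the best case). The shares of spend moved to each purchase option are hypothetical inputs, since the source does not give INTERTOP's actual mix.

```python
# Discounts quoted in the case study, as fractions of the on-demand price.
SPOT_DISCOUNT = 0.70      # "up to 70%", so this is the best case
RESERVED_DISCOUNT = 0.30  # "about 30%"


def blended_cost(on_demand_cost: float, spot_share: float, reserved_share: float) -> float:
    """Cost after discounts, given the share of spend moved to each purchase option."""
    if spot_share < 0 or reserved_share < 0 or spot_share + reserved_share > 1:
        raise ValueError("shares must be non-negative and sum to at most 1")
    on_demand_share = 1.0 - spot_share - reserved_share
    factor = (
        spot_share * (1.0 - SPOT_DISCOUNT)
        + reserved_share * (1.0 - RESERVED_DISCOUNT)
        + on_demand_share  # the remainder is still paid at full price
    )
    return on_demand_cost * factor
```

As an illustration with made-up shares: if half of a 1,000-unit on-demand bill moved to Spot and 30% to Reserved Instances, the bill would drop to roughly 560 units, a saving of about 44%.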

Solution 3

Infrastructure support on an hourly basis

We offered infrastructure support services paid by the hour that would cover all of the client’s needs and requests with no fixed monthly budget commitment. Within this agreement, we worked on regular and emergency tasks, infrastructure optimization, security improvement, etc.

Examples of what we did within our infrastructure support services:

Optimized architecture and database performance by implementing Aurora Serverless and later Aurora Serverless v2 for finer-grained (fractional) vertical scaling
Connected Elasticsearch and SQS
Dockerized the mobile app and put Docker containers in ECS to unify development and testing and simplify app deployment
Implemented a microservices architecture for the mobile app

We still provide infrastructure support services to INTERTOP, but now they also have an in-house DevOps engineer. Their full-time specialist usually handles regular tasks, while our team provides more complex services as well as overall architecture guidance and consulting.

Key Results and Business Value:

1. 100% system scalability

2. 60% cost savings on autoscaling

3. Up to 70% cost savings from renting Spot Instances

4. About 30% savings from purchasing Reserved Instances

5. 0.05% of online profit spent on AWS

6. 0.2 s website response time

7. Increased system reliability (only 2 critical issues happened within 4 years)

Features Delivered:

1. AWS-based infrastructure

2. Improved system performance and reliability

3. System autoscaling

Technologies we used

Client’s feedback

“Partnering with IT-Magic has transformed our technological landscape. Their expertise in AWS cloud migration and infrastructure optimization has significantly improved our system’s stability and performance while reducing costs. The flexible support model has seamlessly complemented our in-house team, accelerating our development process and enabling us to provide our customers with an unparalleled shopping experience. We highly recommend their services to any business looking to scale and optimize their infrastructure.”