How To Speed Up A Microservice With Multithreading, Asynchrony, And Cache: A Step-By-Step Guide (FastApi, Redis)

Microservice architecture sounds good on its own. A fast microservice that makes efficient use of server resources is even better. I will show you how to apply several speed-up methods, one after another, to a simple microservice, and weigh the pros and cons of each.

Let’s see how you can parallelize the work of a microservice by distributing the load across several threads, increase performance with asynchronous execution, and speed things up further with a cache.

The methods listed here apply to programs written in Python 3 with FastAPI; they are not universal ways to speed up any microservice. And, of course, only the basic methods are described here; choosing the best option is up to the developer and depends on the situation.

What Will I Use

To run the microservices discussed here, you only need Docker Compose: they are already packaged in Docker containers. With it, you can quickly deploy any of the microservices on a local machine or in the cloud on a VPS/VDS. The speed test client is written in Python 3 using asyncio and HTTPS. As an example, I used the weather forecast from pogoda


It would be possible to use an open API instead of parsing the data, but then you would have to register and get an API key. In the current implementation, this is not required: everything works right away.

To test the speed of the microservices, I use a client that asynchronously sends requests for three different cities and, at the end, displays the total run time and the average time per request. It sends 18 requests, because the server bans the client for 10-20 seconds if more than 20 requests are sent in a row.

Eighteen requests are enough for this check. Even if we were using the API, we would not be able to send more than 60 requests per minute (see Free Subscription ). That is, the numbers are quite comparable.
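The shape of such a test client can be sketched as follows. This is a minimal illustration, not the article's actual client: the city names, the fake_fetch stand-in (which sleeps instead of making a network call), and the function names are all assumptions.

```python
import asyncio
import time

CITIES = ["moscow", "london", "paris"]  # hypothetical city names
REQUESTS_PER_CITY = 6                   # 3 cities x 6 = 18 requests


async def fake_fetch(city: str) -> str:
    """Stand-in for the real HTTP call; sleeps briefly instead of using the network."""
    await asyncio.sleep(0.01)
    return f"{city}: +10"


async def timed_call(fetch, city: str) -> float:
    """Return how long a single request took, in seconds."""
    start = time.perf_counter()
    await fetch(city)
    return time.perf_counter() - start


async def run_benchmark(fetch=fake_fetch):
    """Send all requests concurrently; return (count, total time, average time per request)."""
    targets = CITIES * REQUESTS_PER_CITY
    start = time.perf_counter()
    durations = await asyncio.gather(*(timed_call(fetch, c) for c in targets))
    total = time.perf_counter() - start
    return len(durations), total, sum(durations) / len(durations)


if __name__ == "__main__":
    count, total, average = asyncio.run(run_benchmark())
    print(f"{count} requests, total {total:.3f} s, average {average:.3f} s per request")
```

To measure a real service, fake_fetch would be replaced by a coroutine that performs the actual HTTP request to the microservice.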

Step 1. Deploy a Simple Microservice

All files for the simple microservice, including its source code, are in the 1-simple-microservice folder. It is built on the FastAPI web framework.

The microservice accepts a city name as input, requests an external resource in the main thread, parses the temperature in that city, and returns the result. This is the basic microservice that we will accelerate in the following steps.
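The flow described above (fetch, parse, return) might look roughly like this. It is a sketch under assumptions: the HTML pattern is hypothetical and will not match the real forecast page, and fetch_page is an injected placeholder for the blocking HTTP request. In a FastAPI application, the body of get_city_temperature would sit inside a route handler.

```python
import re
from typing import Callable

# Hypothetical markup: the real forecast page will look different.
TEMPERATURE_RE = re.compile(r'class="temperature">\s*([+-]?\d+)')


def parse_temperature(html: str) -> int:
    """Extract the temperature value from the page, or fail loudly."""
    match = TEMPERATURE_RE.search(html)
    if match is None:
        raise ValueError("temperature not found in page")
    return int(match.group(1))


def get_city_temperature(city: str, fetch_page: Callable[[str], str]) -> dict:
    """Blocking handler body: fetch the page, parse it, return the result."""
    html = fetch_page(city)  # blocks the thread until the response arrives
    return {"city": city, "temperature": parse_temperature(html)}
```

Because the fetch blocks the thread, a worker running this code handles one request at a time; that is the limitation the next steps address.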


Step 2. Add Multithreading To The Microservice

All files are in the 2-few-threads-microservice folder. To avoid complicating the example by distributing traffic between microservice instances, let’s take the most straightforward route and parallelize the work using the Gunicorn web server. In docker-compose.yml, add the --workers=n parameter to the line that starts with command: gunicorn -b, so that Gunicorn runs n worker processes; in our case, three.
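Gunicorn can also read the same settings from a Python configuration file instead of command-line flags. The fragment below is a sketch of that alternative, not what the article's repository does; the bind address, the port, and the main:app module path are assumptions.

```python
# gunicorn.conf.py
# Loaded with: gunicorn -c gunicorn.conf.py main:app
# (main:app is a hypothetical module:variable path to the FastAPI application)

bind = "0.0.0.0:8000"  # assumed address and port; the article does not state them
workers = 3            # same effect as --workers=3 on the command line

# ASGI worker class, needed because FastAPI is an ASGI application
worker_class = "uvicorn.workers.UvicornWorker"
```

Keeping the worker count in one place makes it easy to change n without editing the Compose file.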


You could also use the Nginx balancer; an example is here. The advantage of this approach is its simple implementation; the disadvantage is a linear increase in resource consumption.

For example, if one microservice instance consumes 100 MB of memory, then three workers use at least 300 MB. Consumption can be reduced by sharing resources between the workers, for example, as shown below, by using a cache shared by all of them.

Step 3. Adding Asynchronous Execution

In a regular program, when the executing thread reaches a point where an external resource is required, it blocks and waits for the response. When the program is executed asynchronously, the thread is handed over to another task while the first one waits, and this is where the performance gain comes from.

The microservice source files are in the 3-async-few-threads-microservice folder. From the start, we chose the FastAPI web framework and the Gunicorn web server, which runs the Uvicorn ASGI worker. This means that for asynchronous execution it is enough to mark the asynchronous methods with the async keyword and to await the operations inside them that wait on external resources.

In our case, the request for the weather forecast is the awaited operation. The main advantage of this approach is that it uses resources more efficiently. The usual disadvantage is greater implementation complexity, but this is not the case with FastAPI.
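The difference can be seen in a small asyncio sketch. Here asyncio.sleep stands in for a request to an external resource; the function names and the delays are illustrative assumptions.

```python
import asyncio
import time


async def external_call(delay: float) -> None:
    """Stand-in for a request to an external resource."""
    await asyncio.sleep(delay)  # yields the thread to other tasks while waiting


async def sequential(n: int, delay: float) -> float:
    """Run n calls one after another; return the elapsed time in seconds."""
    start = time.perf_counter()
    for _ in range(n):
        await external_call(delay)
    return time.perf_counter() - start


async def concurrent(n: int, delay: float) -> float:
    """Run n calls concurrently in one thread; return the elapsed time in seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(external_call(delay) for _ in range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {asyncio.run(sequential(3, 0.1)):.2f} s")
    print(f"concurrent: {asyncio.run(concurrent(3, 0.1)):.2f} s")
```

In the sequential version the waits add up, while in the concurrent version they overlap, so the total time is close to that of the single slowest call.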

Step 4. Add a Cache To The Microservice

When a multithreaded application uses a shared resource, the question of synchronizing data access naturally arises. Since version 4, Redis has become more multithreaded, not to mention the more recent Redis 6. Because of this, it is not immediately clear whether we can send unsynchronized requests to the cache.

The data access interface inside the Redis core remains single-threaded (proof here ). This means that no matter how many read/write commands we send, they are executed strictly sequentially.

Redis is an in-memory data structure store: it keeps data in RAM. Thanks to this, it has tremendous performance (upwards of 100 thousand get/set requests per second even on an entry-level server; for more details, see here ) and is often used to implement a cache.

As with all the microservice variations, I packaged Redis, configured as a cache, in a Docker container. Let’s take a closer look at the settings that I added to the redis.conf file :

  • bind – bound Redis to the internal localhost of the Docker container.
  • maxmemory 100mb – limited the size of the cache.
  • maxmemory-policy volatile-ttl – when the memory limit is reached, this policy evicts keys that have an expiry set, starting with those closest to expiring. Read more about this here.


You can download the default example redis.conf from here; it also contains recommendations for configuring Redis as a cache.

Next, I started Redis with my configuration and mounted the ./redis/data folder to the data folder inside the Docker container so that the data is saved to an external drive (see the docker-compose.yml file ).

A caveat is needed here: although a cache can significantly speed up a microservice, in some cases it cannot be used, for example, when every request needs up-to-the-moment data. In this example, the weather forecast (the temperature in the city) is stored in the cache for 1 hour. For such data this is acceptable, since it does not change much during that time.

The microservice files are in the 4-cache-async-few-threads-microservice folder. There is nothing complicated in the implementation: first, we try to read the value from the cache; if it is there, we return it. If not, we make an external request and process the response. Then we put the received data in the cache for 1 hour and return the answer. Both the read and the write requests are asynchronous. As mentioned above, the cache is an external resource in this case.
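The read-through logic described above can be sketched as follows. This is an illustration under assumptions: InMemoryCache is a hypothetical stand-in for an asynchronous Redis client (with Redis itself, expiry would be handled by the server), and fetch is a placeholder for the external forecast request.

```python
import asyncio
import time

CACHE_TTL_SECONDS = 3600  # 1 hour, as in the article


class InMemoryCache:
    """Hypothetical stand-in for an async Redis client: get/set with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    async def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave as a cache miss
            return None
        return value

    async def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


async def get_temperature(city, cache, fetch):
    """Return the cached value, or fetch it, store it for 1 hour, and return it."""
    cached = await cache.get(city)
    if cached is not None:
        return cached
    value = await fetch(city)  # external request, made only on a cache miss
    await cache.set(city, value, CACHE_TTL_SECONDS)
    return value
```

With this structure, repeated requests for the same city within the hour never reach the external resource, which is where the speed-up comes from.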
