14 Lines of Code

This is the smallest properly working Python 3.7 HTTP client based on asyncio/aiohttp that generates the maximum possible number of requests from your personal device to a server. You can use this template however you wish, e.g. to crawl the web or to test your own servers against a DoS (denial-of-service) attack.


#!/usr/bin/env python3.7

import aiohttp
import asyncio

URL = 'https://google.com'
TOTALREQ = 500  # total number of requests to send (500 and 5000 in the measurements)

async def worker(session):
    async with session.get(URL) as response:
        await response.read()

async def main():
    async with aiohttp.TCPConnector() as connector:
        async with aiohttp.ClientSession(connector=connector) as session:
            await asyncio.gather(*[worker(session) for _ in range(TOTALREQ)])

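if __name__ == '__main__':
    # Don't use asyncio.run() - it produces a lot of errors on exit.
    # Assumed replacement: drive main() with run_until_complete() instead.
    asyncio.get_event_loop().run_until_complete(main())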

See Implementation Notes in the annex below.

1 Line: The Most Important One

A lot happens on the following line:

await asyncio.gather(*[worker(session) for _ in range(TOTALREQ)]) 
  1. the coroutine worker() contains the logic to send an HTTP(S) request and receive the response using the aiohttp session;
  2. the list comprehension [...] creates a large number of coroutine objects (TOTALREQ of them) in order to utilize cooperative multitasking;
  3. asyncio.gather() schedules the coroutines as tasks and waits until they all complete.
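
The three steps above can be tried without any network access. The following is only a minimal sketch: the worker here sleeps instead of calling aiohttp, and the names `started` and `run` are illustrative helpers that are not part of the original script. It shows that building the list only creates coroutine objects, and that nothing executes until asyncio.gather() schedules them.

```python
import asyncio


async def worker(i, started):
    started.append(i)            # record that this worker really began running
    await asyncio.sleep(0.01)    # stand-in for the network round trip
    return i * 10


async def main(total):
    started = []
    coros = [worker(i, started) for i in range(total)]  # coroutine objects only
    ran_before_gather = list(started)       # still empty: nothing has run yet
    results = await asyncio.gather(*coros)  # schedule as tasks, wait for all
    return ran_before_gather, results


def run(coro):
    # A fresh loop driven by run_until_complete() rather than asyncio.run().
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


if __name__ == "__main__":
    print(run(main(3)))  # ([], [0, 10, 20])
```

asyncio.gather() returns the results in the order the coroutines were passed in, regardless of which one finishes first.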

Performance To Expect

Don't expect 1 million requests per second on a personal device. In practice, when sending GET/POST requests at the application layer (layer 7 of the OSI model) and post-processing each response, you can expect thousands of requests per minute, and only with the most efficient resource utilization.

Let me show you why:

Client limits: If the client expects a response to each request in order to post-process and analyze the result, it has to create many sockets (file descriptors) and keep the context of each one in memory. If the server then takes too long to respond, the client ends up with a limited number of open connections (e.g. 500-2000) and cannot generate more requests. The client also cannot prepare and process an unlimited number of requests simultaneously, since that work is bounded by the number of CPUs.

Server limits: The server cannot respond immediately. With HTTPS it first has to finish the TLS handshake, whose cryptographic operations are CPU-intensive, and then it needs time to parse the request and fetch data from a database. Meanwhile the client is waiting.
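
The client-side limit on open sockets mentioned above can be inspected from Python. This is a small sketch that assumes a Unix-like system, since the standard `resource` module is not available on Windows; the function name `descriptor_limits` is made up for illustration. The shell command `ulimit -n` reports the same soft limit.

```python
import resource  # standard library, Unix only


def descriptor_limits():
    # Soft limit: what the process may use right now.
    # Hard limit: the ceiling the soft limit can be raised to.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


if __name__ == "__main__":
    soft, hard = descriptor_limits()
    print("open file descriptors: soft limit", soft, "hard limit", hard)
```

Every open connection consumes one descriptor, so the soft limit is an upper bound on the number of simultaneous connections of a single process.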

To maximize the rate of client requests you basically need two things:

  1. cooperative multitasking (asyncio)
  2. connection pool (aiohttp)

Let's go back to the magic line:

await asyncio.gather(*[worker(session) for _ in range(TOTALREQ)]) 

1. Cooperative Multitasking (asyncio)

asyncio is here to get the most out of the computer's hardware and resources.

The event loop runs in a single thread and never blocks on I/O. Its job is to figure out how to carry out 1 million requests on the client's limited hardware (say, 4 CPU cores) through a single network card.

On the right side of the picture, you can see a limited count of physical CPUs.

On the left side of the picture, there are the tasks that our program wants to perform:

  • send 1 request and get 1 response: that is 1 task;
  • send 1000 requests and get 1000 responses: that is 1000 tasks, which can run concurrently.

Concretely, in Python a single task can be represented by an async coroutine (worker() in my example) consisting of a series of await expressions.

How do you execute 1000 tasks on 4-8 CPU cores in the most effective way?

Whenever a coroutine gets stuck awaiting a server response, the asyncio event loop suspends that coroutine, keeping its state in memory, and resumes another coroutine that is ready to run. The event loop runs all coroutines in one thread, so they do not execute on several cores at once; the gain comes from the processor never sitting idle while waiting for the network, because the event loop fills the gaps with other pending work.
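
The switching can be made visible with a small experiment. This is a sketch under simplifying assumptions: asyncio.sleep() stands in for waiting on a server, and the worker names, delays and the `run` helper are invented for illustration. Worker A talks to a "slow" server and worker B to a "fast" one; the log shows that B is started and finished while A is still waiting, and that both run in the same thread.

```python
import asyncio
import threading


async def worker(name, delay, log, threads):
    threads.add(threading.get_ident())  # which thread is executing this worker
    log.append(name + ":send")
    await asyncio.sleep(delay)          # suspended here; the loop runs others
    log.append(name + ":recv")


async def main():
    log, threads = [], set()
    await asyncio.gather(
        worker("A", 0.05, log, threads),  # slow server
        worker("B", 0.01, log, threads),  # fast server
    )
    return log, len(threads)


def run(coro):
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


if __name__ == "__main__":
    print(run(main()))  # (['A:send', 'B:send', 'B:recv', 'A:recv'], 1)
```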

2. Connection Pool (aiohttp)

If you generate many requests without any optimization, a huge amount of CPU time is spent closing the old connection and establishing a new one for each request. On top of that, the TLS handshake is a fairly expensive operation. Whenever a connection is closed, the TLS handshake must be carried out again to open a new one.

To increase the request rate, a TLS tunnel should be reused for multiple requests.

The connection pool keeps as many TLS connections open as possible (up to the connector's limit, which is 100 by default in aiohttp). aiohttp opens a new connection and holds it for as long as necessary. Subsequent requests are spread across the pool and reach the server via reused TLS tunnels. Each open connection means an open socket (file descriptor) and a context in memory.
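
To see why reuse matters, here is a toy model of a pool. It is not how aiohttp is implemented: the handshake and the round trip are simulated with asyncio.sleep(), and `simulate`, `HANDSHAKE`, `ROUND_TRIP` and `run` are hypothetical names chosen for this sketch. It only counts how many handshakes are needed with and without keeping connections alive.

```python
import asyncio

HANDSHAKE = 0.005   # simulated cost of TCP + TLS setup, in seconds
ROUND_TRIP = 0.001  # simulated request/response time, in seconds


async def simulate(total, limit, reuse):
    """Return how many handshakes `total` requests need with `limit` slots."""
    slots = asyncio.Semaphore(limit)  # at most `limit` connections in use
    idle = []                         # open connections waiting for reuse
    handshakes = 0

    async def request():
        nonlocal handshakes
        async with slots:
            if idle:
                conn = idle.pop()               # reuse: no handshake needed
            else:
                handshakes += 1
                conn = handshakes               # pretend connection id
                await asyncio.sleep(HANDSHAKE)  # pay for a new connection
            await asyncio.sleep(ROUND_TRIP)     # the request itself
            if reuse:
                idle.append(conn)               # keep-alive: back to the pool

    await asyncio.gather(*[request() for _ in range(total)])
    return handshakes


def run(coro):
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


if __name__ == "__main__":
    print(run(simulate(100, 10, reuse=True)))   # 10
    print(run(simulate(100, 10, reuse=False)))  # 100
```

With reuse, 100 requests over 10 slots need only 10 handshakes; without it, every request pays for its own.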

Performance Measurements

500 requests (TOTALREQ = 500)

$ time ./aiohttp-request-generator.py
real    0m3.701s

5000 requests (TOTALREQ = 5000)

$ time ./aiohttp-request-generator.py
real    0m25.393s

500 requests with only 1 coroutine
I don't provide a code example here; I just used 1 coroutine with a for _ in range(500) loop inside worker(), so the requests are sent one after another. You can see how much slower it is.

$ time ./aiohttp-request-generator.py
real    4m0.640s
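
The difference between the two variants can be reproduced offline. The sketch below is not the script that was measured above: `fake_request` replaces the aiohttp call with a fixed 10 ms sleep, and `sequential`, `concurrent` and `timed` are illustrative names. It only shows the shape of the effect, not real network timings.

```python
import asyncio
import time

DELAY = 0.01  # simulated server response time, in seconds


async def fake_request():
    await asyncio.sleep(DELAY)  # stand-in for session.get() + response.read()


async def sequential(total):
    # One coroutine: each request starts only after the previous one is done.
    for _ in range(total):
        await fake_request()


async def concurrent(total):
    # One coroutine per request, as in the main example.
    await asyncio.gather(*[fake_request() for _ in range(total)])


def timed(coro):
    # Run the coroutine on a fresh loop and return the elapsed seconds.
    loop = asyncio.new_event_loop()
    started = time.perf_counter()
    try:
        loop.run_until_complete(coro)
    finally:
        loop.close()
    return time.perf_counter() - started


if __name__ == "__main__":
    print("sequential: %.3fs" % timed(sequential(50)))
    print("concurrent: %.3fs" % timed(concurrent(50)))
```

The sequential run takes roughly total × DELAY, while the concurrent run takes roughly one DELAY plus scheduling overhead.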

In summary: at most about 5000 requests in 25 seconds, i.e. roughly 200 requests per second.

Implementation Notes

  1. You may come across more complex solutions with semaphores and various fancy coroutine configurations, but asyncio and aiohttp seem to be smart enough on their own: I saw comparable results in my experiments.
  2. google.com acts as the test server, and since google.com is able to process 40000 requests per second, our 5000 requests do not do much harm.
  3. I skip result and error handling for the sake of simplicity, so the example is easier to understand and test. The performance results are given in the Performance Measurements section. Note that Google quickly identifies the script as a robot and keeps returning a captcha.
  4. The connector is not really needed for this simple example, but I left the line in just in case: it is useful to have a connector there for configuring connection parameters during experiments.
  5. When you add real application logic to build the request and process the response, you should see CPU consumption increase.