Introduction
In this tutorial, we will write a simple Python program that downloads random images from a resource called https://picsum.photos/.
This resource allows downloading images at a specified size. In this script, we will fetch images at 370x250 px, so our URL will be https://picsum.photos/370/250.
Using the requests package
Our first script will use Python's most popular package for HTTP requests: requests. It is not in Python's standard library, so first we need to install it. Activate your virtual environment and run pip install requests. This will install the latest version of the package.
Let’s take a look at the full program:
```python
import os
import time
from urllib import parse
from uuid import uuid4

import requests

url = 'https://picsum.photos/370/250'


def save_post_image():
    # Make sure the target directory exists before writing into it
    os.makedirs('images', exist_ok=True)
    start = time.time()
    for _ in range(25):
        response = requests.get(url)
        # Take the extension from the path of the final (redirected) URL
        extension = os.path.splitext(parse.urlsplit(response.url).path)[-1]
        image_name = f'{uuid4()}{extension}'
        path = f'images/{image_name}'
        with open(path, mode='wb') as f:
            f.write(response.content)
    elapsed = time.time() - start
    print(f'{elapsed} s')


if __name__ == '__main__':
    save_post_image()
```
As you can see, the preceding script downloads 25 images of the defined size and saves them on the local file system (./images), keeping each downloaded image's extension. To extract the file extension, we use the os and urllib packages. Each image is also given a unique name using the uuid4() function.
When we send a request to the URL above, it redirects us to another URL like https://i.picsum.photos/id/372/370/250.jpg?hmac=<some_hmac>. To get this final URL, we use the .url attribute of the requests.models.Response object.
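To see how the extension extraction works in isolation, here is a small sketch using a sample redirected URL (the id and hmac values are invented for illustration):

```python
import os
from urllib import parse

# A sample redirected URL of the shape picsum.photos returns
# (the id and hmac values here are made up for illustration).
final_url = 'https://i.picsum.photos/id/372/370/250.jpg?hmac=abc123'

# urlsplit() separates the path ('/id/372/370/250.jpg') from the query string,
# and splitext() takes the extension from that path.
path = parse.urlsplit(final_url).path
extension = os.path.splitext(path)[-1]
print(extension)  # → .jpg
```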
We also track the time to measure how long it takes to download and save 25 images from the resource.
On my laptop, it took approximately 33.983 seconds.
Using asyncio and aiohttp
Next, we will rewrite our program to use the aiohttp library, a fast asynchronous HTTP client/server framework for asyncio and Python. It allows us to send multiple requests to the image resource's server (https://picsum.photos) concurrently, so the program finishes in less time than its synchronous version: while we are waiting for a response from the server, we can already send another request that starts downloading the next image, and so on.
First, we will need to install the required packages:
pip install aiohttp aiofiles
We also need the aiofiles package to save images on the local filesystem. aiofiles is a library for handling local disk files in asyncio applications; it allows reading and writing files in a non-blocking manner.
The asynchronous version of our script will look like this in the simplest version:
```python
import asyncio
import os
import time
from uuid import uuid4

import aiofiles
from aiohttp import ClientSession

url = 'https://picsum.photos/370/250'


async def make_request(session):
    try:
        resp = await session.request(method='GET', url=url)
    except Exception as ex:
        print(ex)
        return
    if resp.status == 200:
        image_name = f'{uuid4()}.jpg'
        path = f'async_images/{image_name}'
        # Write the image without blocking the event loop
        async with aiofiles.open(path, 'wb') as f:
            await f.write(await resp.read())


async def bulk_request():
    """Make requests concurrently."""
    async with ClientSession() as session:
        tasks = []
        for _ in range(25):
            tasks.append(make_request(session))
        await asyncio.gather(*tasks)


def download_images():
    # Make sure the target directory exists before writing into it
    os.makedirs('async_images', exist_ok=True)
    start = time.time()
    asyncio.run(bulk_request())
    print('{} s'.format(time.time() - start))


if __name__ == '__main__':
    download_images()
```
The coroutine bulk_request() serves as the main entry point into the script's chain of coroutines. It uses a single ClientSession, and a task is created for each image to be downloaded. Making all requests through one session lets them reuse the session's internal connection pool.
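The speed-up comes from asyncio.gather() running the coroutines concurrently: while one download is waiting on the network, another can proceed. A minimal stdlib-only sketch (using asyncio.sleep to stand in for network waits, no real requests involved) shows that the total time tracks the slowest task rather than the sum:

```python
import asyncio
import time

async def fake_download(delay):
    # Stands in for awaiting a network response.
    await asyncio.sleep(delay)
    return delay

async def main():
    start = time.time()
    # 25 "downloads" of 0.2 s each, run concurrently
    results = await asyncio.gather(*(fake_download(0.2) for _ in range(25)))
    elapsed = time.time() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(f'{len(results)} tasks in {elapsed:.2f} s')  # roughly 0.2 s, not 25 * 0.2 = 5 s
```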
The coroutine make_request() makes the GET request, awaits the response, and saves the downloaded image if the response status is 200.
The asynchronous version of the script took 3.149 seconds to download 25 random images, roughly 11 times faster than the version based on the requests package.
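One caveat: firing 25 requests at once is fine here, but for larger batches the server may throttle or reject you. A common refinement (not part of the script above) is to cap concurrency with asyncio.Semaphore; in this sketch, asyncio.sleep stands in for the real request:

```python
import asyncio

MAX_CONCURRENT = 5  # illustrative cap; tune for the target server

async def limited_request(semaphore, i):
    async with semaphore:
        # At most MAX_CONCURRENT coroutines run this section at a time;
        # a real version would await session.request() here instead.
        await asyncio.sleep(0.01)
        return i

async def bulk_limited():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(limited_request(semaphore, i) for i in range(25)))

results = asyncio.run(bulk_limited())
print(len(results))  # 25
```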
Resources
- requests library
- aiohttp library
- asyncio module
- aiofiles library