1. Introduction
In Python 3.9, when several tasks need to run concurrently, this is usually done with multithreading or coroutines.
This article is based on the official Python 3.9 documentation and summarizes some common applications in development.
2. Description
A thread is the smallest unit of computational scheduling in the operating system. A process can have multiple threads, which share all system resources of the current process.
Threads are sometimes called “lightweight processes”.
Although threads share the system resources of their process, each thread still keeps some state of its own:
- call stack
- register context
- thread-local storage
Because all system resources of the process are shared, a misbehaving thread can easily corrupt the private state of other threads.
Advantages of threads
- Saving system resources compared to the process approach
- Data sharing among multiple threads
- Can make use of multiple CPU cores (though in CPython the GIL limits this for pure-Python code, as discussed in section 5)
Disadvantages of threads
- A fixed per-thread memory footprint (each thread reserves its own stack)
- Time-consuming to switch thread contexts
On Linux, the default stack size allocated for each thread can be viewed with ulimit -s.
3. Usage
Starting a thread in Python3 is very simple.
Each thread in Python runs in a separate system-level thread (e.g. a POSIX thread or a Windows thread).
The code is as follows.
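A minimal sketch, assuming a hypothetical `worker` function (any callable works as the `target`):

```python
import threading

results = []

def worker(name):
    # Work done inside the new OS-level thread
    results.append(f"hello from {name}")

t = threading.Thread(target=worker, args=("worker-1",))
t.start()   # begins execution in a separate system-level thread
t.join()    # block the main thread until worker finishes
print(results)
```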
4. Message Subscriptions
Practical development often requires threads to communicate with each other or subscribe to certain messages to decide whether to execute the next step.
Messaging between threads:
- Queue objects: queue.Queue is thread-safe and can be used to exchange data between threads.
- Event objects: threading.Event can be used to signal state, letting one thread notify others that data is ready for the next processing step.
Suppose there are 2 threads: a producer thread pushes the string "hello" and waits for the consumer thread to append " world" to it, after which the producer prints the result immediately.
Inter-thread communication is actually passing object references between threads.
The code implementation is as follows.
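One possible implementation, passing a mutable list through the queue so the consumer can modify the string in place (the `producer` and `consumer` names are illustrative):

```python
import queue
import threading

q = queue.Queue()
done = threading.Event()
result = []

def producer():
    data = ["hello"]      # mutable container so the consumer can modify it in place
    q.put(data)           # pass the object reference to the consumer
    done.wait()           # block until the consumer signals it is finished
    result.append(data[0])
    print(data[0])

def consumer():
    data = q.get()        # receive the shared reference
    data[0] += " world"   # modify the string through the shared list
    done.set()            # notify the producer

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```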
From the output you can see that the producer thread received the "hello world" string within a millisecond of the consumer finishing.
5. GIL
The CPython interpreter uses a mutex called the GIL (Global Interpreter Lock) to prevent multiple threads from executing Python bytecode concurrently.
Without going into the details of the GIL, there are two practical conclusions:
- For CPU-bound tasks, multithreading brings no speedup, because the GIL lets only one thread execute Python bytecode at a time; use multiprocessing instead.
- For IO-bound tasks, multithreading or coroutines work well, because the GIL is released while a thread waits on IO.
For example, the following program computes the cumulative sum of the integers up to 100000000.
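A sketch of the sequential baseline, using an illustrative `cumulative` function that performs the sum twice in a plain Python loop (once per value we will later hand to the threads):

```python
import time

N = 100_000_000

def cumulative(n):
    # CPU-bound work: sum the integers from 1 to n in a plain Python loop
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

start = time.time()
a = cumulative(N)
b = cumulative(N)
print(a, b)
print(f"sequential: {time.time() - start:.2f}s")
```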
We use a multi-threaded approach to compute two 100000000 cumulative values in parallel.
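A sketch of the multithreaded version; it reuses the same CPU-bound loop, with each thread writing its result under a distinct dictionary key:

```python
import threading
import time

N = 100_000_000
results = {}

def cumulative(n, key):
    # CPU-bound work: the GIL prevents these loops from running in parallel
    total = 0
    for i in range(1, n + 1):
        total += i
    results[key] = total

start = time.time()
t1 = threading.Thread(target=cumulative, args=(N, "a"))
t2 = threading.Thread(target=cumulative, args=(N, "b"))
t1.start()
t2.start()
t1.join()
t2.join()
print(results)
print(f"threaded: {time.time() - start:.2f}s")
```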
The output shows no speed advantage over sequential execution: for CPU-bound tasks, multithreading does not make processing faster.
If you must process computationally intensive tasks in Python, consider process pools.
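As a sketch, the same computation can be handed to a concurrent.futures.ProcessPoolExecutor, where each worker process has its own interpreter and its own GIL (the `cumulative` function is the same illustrative loop as above):

```python
import time
from concurrent.futures import ProcessPoolExecutor

def cumulative(n):
    # CPU-bound work: sum the integers from 1 to n
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

if __name__ == "__main__":
    start = time.time()
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Each task runs in its own process, so both sums can use a CPU core
        a, b = pool.map(cumulative, [100_000_000, 100_000_000])
    print(a, b)
    print(f"process pool: {time.time() - start:.2f}s")
```

The `if __name__ == "__main__":` guard is required on platforms that spawn worker processes by re-importing the main module.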
6. Thread locking
In practice, multithreaded code must account for atomicity: a fixed sequence of steps must not be interrupted partway by the thread scheduler, or the result becomes inconsistent.
In CPython, individual operations on built-in dict/tuple/list objects are thread-safe (each completes while holding the GIL), but compound operations such as read-modify-write are not.
For example, each call to the following add method increments count 500000 times; with 2 threads each calling it once, the output should theoretically be 1000000.
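A sketch of such a program; `count += 1` compiles to separate load, add and store steps, so the scheduler can switch threads in between:

```python
import threading

count = 0

def add():
    global count
    for _ in range(500000):
        # count += 1 is not atomic: load, add and store can be
        # interleaved with the other thread's steps
        count += 1

threads = [threading.Thread(target=add) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(count)  # may be less than 1000000 when updates are lost
```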
The output shows that the value of count is not 1000000, and repeated executions give different values; this is the broken atomic operation at work.
In practical development, locks avoid such problems reliably, though they usually come with some performance cost.
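A sketch of the same program protected by a `threading.Lock`:

```python
import threading

count = 0
lock = threading.Lock()

def add():
    global count
    for _ in range(500000):
        with lock:      # only one thread at a time may execute this block
            count += 1

threads = [threading.Thread(target=add) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(count)
```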
With the lock in place, the value of count is 1000000 no matter how many times the program is executed.
7. Thread pooling
Sometimes the number of tasks is not known in advance, and this is where thread pools come in handy.
For example, suppose generating one random string takes 1 second. We use 2 threads to generate 4 random strings of lengths 5, 6, 7 and 8.
The code is as follows.
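A sketch using `concurrent.futures.ThreadPoolExecutor` with 2 workers; the 1-second cost is simulated with `time.sleep`, and the `gen_string` name is illustrative:

```python
import random
import string
import time
from concurrent.futures import ThreadPoolExecutor

def gen_string(length):
    # Simulate an IO-bound task that takes about 1 second
    time.sleep(1)
    return "".join(random.choice(string.ascii_letters) for _ in range(length))

start = time.time()
with ThreadPoolExecutor(max_workers=2) as pool:
    # 4 one-second tasks across 2 threads: about 2 seconds in total
    results = list(pool.map(gen_string, [5, 6, 7, 8]))
print(results)
print(f"elapsed: {time.time() - start:.2f}s")
```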
From the output you can see that only 2 threads handle string generation at any given time, each picking up a remaining task as it finishes. Thread pools are ideal when dealing with many IO-blocking tasks.
Normally, you should also only use thread pooling in I/O processing related code.