In this post, we will look at a simple yet very useful library for Python command-line scripts. We often use such scripts to process large volumes of data or files, and it helps to have a way to track how many items have been processed and how fast each process is running.
Command-line progress bars come to the rescue. Several are available, and you can read more about open source Python command-line progress bars.
How to create a progress bar on the command line in Python?
In this post, we will discuss atpbar, a multiprocessing-enabled progress bar for the Python terminal. atpbar provides the following features:
- Easy to install.
- Minimalistic progress bar without any fancy UX, and thus quite simple to use.
- Compatible with multi-processing and multi-threading.
- Each task in multiprocessing or multithreading code can be given its own named progress bar.
- Progress bars grow simultaneously to show the progress of loop iterations in threading or multiprocessing tasks.
- Compatible with Jupyter Notebook.
- On non-TTY devices, where progress bars cannot be drawn, it shows the status with numbers instead of a progress bar.
- The atpbar object is an iterable that can wrap another iterable and shows progress bars for both outer and inner iterations.
- If a loop ends with a break or an exception, the progress bar stops right there.
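Which of these displays you get depends on where the script's output goes. As a rough sketch, and assuming atpbar's choice tracks whether standard output is a terminal (an assumption, not something confirmed in this article), you can check this yourself with the standard library. The function name `output_is_tty` is made up for illustration and is not part of atpbar.

```python
import sys


def output_is_tty(stream=None):
    """Return True if the stream (default: sys.stdout) is attached to a terminal."""
    stream = sys.stdout if stream is None else stream
    # Some stream-like objects have no isatty() method; treat them as non-TTY.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty is not None and isatty())


if output_is_tty():
    print("stdout is a terminal: expect drawn progress bars")
else:
    print("stdout is not a terminal: expect plain numeric status lines")
```

Redirecting the script's output to a file or piping it into another program is the usual way to end up in the non-terminal branch.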
How to install atpbar for command-line progress bars in Python?
Create and activate a virtual environment, if you don't already have one, using the following commands:
virtualenv -p python3.9 venv
source venv/bin/activate
python3 --version
Now install atpbar, the multiprocessing-enabled progress bar for the Python terminal, using the command below.
pip install -U atpbar
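To confirm that the install landed in the active environment, you can ask Python for the installed version. This is a small sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if atpbar is installed in this environment, else None.
print(installed_version("atpbar"))
```

If this prints None, the package was most likely installed into a different environment than the one running the script.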
How to use atpbar?
You can find more details on the exact implementation on the atpbar project page on PyPI or on the GitHub page of atpbar. In this article, I will explain the functionality in brief.
One loop
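import random
import time
from atpbar import atpbar

# Pick a random number of iterations so each run looks a little different.
n = random.randint(1000, 10000)
for i in atpbar(range(n)):
    time.sleep(0.0001)  # stand-in for real work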
A Python terminal progress bar will look something like this.
For atpbar to show a progress bar, the wrapped iterable needs to have a length. If the length cannot be obtained by len(), atpbar won’t show a progress bar.
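Generators and other lazy iterables have no length, so this restriction applies to them. Here is a minimal sketch of one workaround, assuming the data fits in memory: turn the iterable into a list when `len()` is unavailable. The helper name `ensure_sized` is hypothetical and not part of atpbar.

```python
from collections.abc import Sized


def ensure_sized(iterable):
    """Return the iterable unchanged if it supports len(), else a list copy.

    Hypothetical helper, not part of atpbar. Building the list consumes
    the iterable up front, so only use it when the data fits in memory.
    """
    if isinstance(iterable, Sized):
        return iterable
    return list(iterable)


squares = (i * i for i in range(5))    # a generator: len(squares) raises TypeError
sized_squares = ensure_sized(squares)  # now a list with a known length
print(len(sized_squares))              # prints 5
```

The result can then be wrapped as usual, for example `atpbar(ensure_sized(squares), name='squares')`.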
Nested loops
atpbar can show progress bars for nested loops, as shown in the example below.
for i in atpbar(range(4), name='outer'):
    n = random.randint(1000, 10000)
    # Each inner loop gets its own named progress bar.
    for j in atpbar(range(n), name='inner {}'.format(i)):
        time.sleep(0.0001)
In this example, the outer loop iterates 4 times, and each iteration runs an inner loop of random length. The outer bar advances each time an inner loop completes.
Threading
atpbar can show multiple progress bars for loops concurrently iterating in different threads.
import random
import threading
import time
from atpbar import atpbar, flush

def run_with_threading():
    nthreads = 5

    def task(n, name):
        for i in atpbar(range(n), name=name):
            time.sleep(0.0001)

    threads = []
    for i in range(nthreads):
        name = 'thread {}'.format(i)
        n = random.randint(5, 100000)
        t = threading.Thread(target=task, args=(n, name))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    flush()  # wait for the progress bars to finish updating

run_with_threading()
As the screenshot below shows, the tasks run concurrently and the progress bars show the status of each task simultaneously.
One important thing to notice here is the flush() function: it returns once the loops have finished and the progress bars have finished updating.
As a task completes, the progress bar for the task moves up. The progress bars for active tasks are at the bottom.
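The same idea can also be written with a thread pool from the standard library instead of managing `threading.Thread` objects by hand. The following is a sketch, not taken from the example above: it assumes atpbar behaves in `concurrent.futures.ThreadPoolExecutor` threads as it does in plain threads, and it falls back to no-op stand-ins when atpbar is not installed so that the snippet still runs. The names `task` and `run_with_thread_pool` are illustrative.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

try:
    from atpbar import atpbar, flush
except ImportError:
    # Stand-ins (an assumption for illustration) so the sketch runs without atpbar.
    def atpbar(iterable, name=None):
        return iterable

    def flush():
        pass


def task(n, name):
    """Iterate n times under a named progress bar and return the count."""
    count = 0
    for _ in atpbar(range(n), name=name):
        time.sleep(0.0001)  # stand-in for real work
        count += 1
    return count


def run_with_thread_pool(n_workers=5, n_jobs=8):
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = [
            executor.submit(task, random.randint(5, 500), 'job {}'.format(i))
            for i in range(n_jobs)
        ]
        # Collect results in submission order; each call blocks until its job ends.
        results = [f.result() for f in futures]
    flush()  # wait until the progress bars have finished updating
    return results


results = run_with_thread_pool()
print(len(results))  # one result per submitted job
```

A pool caps the number of threads at `n_workers` however many jobs are submitted, which keeps the number of simultaneously active bars manageable.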
Multiprocessing
import multiprocessing
import random
import time
from atpbar import atpbar, find_reporter, flush, register_reporter

# 'fork' lets the child processes inherit the nested functions below
# without pickling them; this start method is not available on Windows.
multiprocessing.set_start_method('fork', force=True)

def run_with_multiprocessing():
    def task(n, name):
        for i in atpbar(range(n), name=name):
            time.sleep(0.0001)

    def worker(reporter, task, queue):
        register_reporter(reporter)
        while True:
            args = queue.get()
            if args is None:  # sentinel: no more tasks for this worker
                queue.task_done()
                break
            task(*args)
            queue.task_done()

    nprocesses = 4
    ntasks = 10
    reporter = find_reporter()
    queue = multiprocessing.JoinableQueue()
    for i in range(nprocesses):
        p = multiprocessing.Process(target=worker, args=(reporter, task, queue))
        p.start()
    for i in range(ntasks):
        name = 'task {}'.format(i)
        n = random.randint(5, 100000)
        queue.put((n, name))
    for i in range(nprocesses):
        queue.put(None)  # one sentinel per worker
    queue.join()
    flush()

run_with_multiprocessing()
When atpbar is used with multiprocessing, two more functions come into play:
- find_reporter() – This function must be called in the main process. It returns the reporter that atpbar uses to collect progress from subprocesses.
- register_reporter() – This function must be called inside every new subprocess, with the reporter passed in from the main process. Progress from the subprocess is then tracked by the main process, which shows a progress bar for each of its loops.
Simultaneously growing Python terminal progress bars will look something like this.
[AUTHOR’S CORNER]
This article is part one of the progress bar in Python series. Stay tuned for more such articles on singlequote.blog.
Ciao…