Streamlining Python Popen Output
When executing a long-running shell command from a Python script (such as within a Jenkins pipeline), you often encounter a specific set of requirements:
- You need to parse or grep the output for specific patterns.
- You need to stream the output to the terminal in real time.
- You need to capture the final exit code.
The subprocess.check_output method might seem like the obvious choice:
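import subprocess
import sys

try:
    # universal_newlines=True returns text rather than bytes on Python 3
    output = subprocess.check_output(['ping', '-c', '4', 'localhost'],
                                     universal_newlines=True)
except subprocess.CalledProcessError as e:
    # On a non-zero exit the captured output is still available on the exception
    output = e.output
sys.stdout.write(output)
# Now search the output...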
However, check_output buffers the output and only returns it after the
command has fully terminated. If you have a Jenkins job invoking a script that
runs for 10 minutes, sitting in the dark with no real-time terminal output is
unacceptable. Additionally, check_output was introduced in Python 2.7, which
poses a problem for legacy production systems (like CentOS 6) still heavily
reliant on Python 2.6.
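For comparison, a Python 2.6-compatible stand-in for check_output can be sketched with Popen.communicate(). The helper name run_and_capture is hypothetical and not part of the original script. Unlike check_output it returns the exit code instead of raising on failure, but it shares the same drawback: nothing is available until the command has exited.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Hypothetical helper: run cmd and return (exit_code, output) after it exits."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    # communicate() reads until EOF, so it only returns once the command is done
    output, _ = p.communicate()
    return p.returncode, output


if __name__ == '__main__':
    # A short Python one-liner stands in for the real command here
    code, output = run_and_capture([sys.executable, '-c', 'print("hello")'])
    sys.stdout.write(output)
    sys.exit(code)
```

This is fine for short commands, but for a 10-minute job the terminal stays silent until the very end.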
Other common alternatives like subprocess.check_call or os.system don’t allow
you to capture the output programmatically for parsing.
The Solution: Non-Blocking Reads with select
To achieve real-time streaming alongside programmatic output capture in
legacy-compatible Python, you can combine subprocess.Popen with I/O
redirection and the select module for non-blocking reads:
# foo.py
import sys
from subprocess import PIPE, Popen
from select import select

# bufsize=1 requests line buffering on the Python side of the pipe;
# universal_newlines=True yields text lines on both Python 2 and 3
p = Popen('ping -c 4 localhost', shell=True, stdout=PIPE, bufsize=1,
          universal_newlines=True)
while True:
    # Wait up to 0.1 s for data; a zero timeout would spin the CPU in a busy loop
    if select([p.stdout.fileno()], [], [], 0.1)[0]:
        line = p.stdout.readline()
        if not line:
            break  # Empty read means EOF: the command closed its stdout
        sys.stdout.write(line)  # You can perform your pattern matching/grep here
        sys.stdout.flush()
sys.exit(p.wait())
With this approach, output is printed as soon as it is generated, each line can be inspected programmatically, and the script exits with the command's exit code.
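The loop above can also be wrapped in a reusable function that covers all three requirements at once. The following is a minimal sketch, not a definitive implementation: the name stream_and_grep, its poll_interval parameter, and the demo child command are assumptions made for illustration, and it relies on select() working on pipes, which holds on Unix-like systems but not on Windows.

```python
import re
import sys
from select import select
from subprocess import PIPE, Popen


def stream_and_grep(cmd, pattern, poll_interval=0.1):
    """Hypothetical helper: echo cmd's stdout live and collect matching lines.

    Returns (exit_code, matching_lines).
    """
    regex = re.compile(pattern)
    matches = []
    p = Popen(cmd, stdout=PIPE, bufsize=1, universal_newlines=True)
    while True:
        # Wait up to poll_interval seconds for stdout to become readable
        if select([p.stdout.fileno()], [], [], poll_interval)[0]:
            line = p.stdout.readline()
            if not line:
                break  # empty string means EOF
            sys.stdout.write(line)  # stream to the terminal
            sys.stdout.flush()
            if regex.search(line):  # the "grep" step
                matches.append(line.rstrip('\n'))
    p.stdout.close()
    return p.wait(), matches


if __name__ == '__main__':
    # Demo child: a Python one-liner standing in for a long-running command
    child = 'import sys; print("ok 1"); print("ERROR 2"); sys.exit(3)'
    code, errors = stream_and_grep([sys.executable, '-c', child], r'ERROR')
    print('exit code: %d, matches: %r' % (code, errors))
    sys.exit(code)
```

Returning the exit code and the matches together lets the calling script decide whether to fail the Jenkins job on a non-zero exit, on a matched pattern, or on both.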
