A ksh bug on "wait"
WTF?
$ cat kshbug
{ return 0; } &
evil=$(/bin/true) # XXX: works fine without this line
wait $!
echo $?
$ ksh kshbug
127
$ ksh --version
version sh (AT&T Labs Research) 1993-12-28 r
The correct return code should be 0. Without the line of “eval=$(bin/true)” everything works fine. The problem happens only when
- Execute a function or a clause in background, and
- A subshell is invoked between the background execution and the “wait”, and
- An external command is executed in the subshell
I googled for a while, there’s no ksh bug report so far, workaround could be use output text for return code check instead. Note there’s a similar report for ksh on solaris but it’s not the identical issue.
Pdksh (public domain ksh) doesn’t have the problem. (See another bug).
Couldn’t figure out where to report this bug so gave up.
Update: this issue doesn’t happen on Ubuntu ksh version
sh (AT&T Research) 93s+ 2008-01-31.
TETware Infinite Loop: The ksh93 Bug That Filled My Hard Drive
While using TETware as our automated testing
framework, I’ve increasingly found it to be incredibly frustrating. The ksh
API portion, in particular, feels severely outdated. Despite making numerous
local modifications, it remained clunky. Today, however, I uncovered an infinite
loop hiding within one of its core logging interfaces. After diving deep into
the issue, it turned out to be a native bug in ksh93. If this interface hadn’t
been written so poorly in the first place, this shell bug might have remained
hidden forever.
Here is the breakdown of the bug.
In ksh, the ${parameter%pattern} syntax is used to strip a suffix from a
string, while ${parameter#pattern} strips a prefix. These are commonly used to
extract directories and filenames from paths. However, when the parameter is a
multi-line string and the pattern matches the \n.*(.*).* regex format, the
shell parser completely fails:
$ cat ksh93bug
NL=$'\n'
PAT="$1"
A="Hello $NL$PAT"
echo "${A%$NL$PAT}"
A="$PAT$NL world"
echo "${A#$PAT$NL}"
$ ksh ksh93bug '()'
Hello
()
()
world
$ ksh ksh93bug 'a(b)c'
Hello
a(b)c
a(b)c
world
This bug is strictly isolated to AT&T’s ksh93, including the latest versions.
Both bash and the public domain ksh (pdksh) handle it flawlessly:
$ bash ksh93bug '()'
Hello
world
$ bash ksh93bug 'a(b)c'
Hello
world
In our specific scenario, the code executed out=$(mount) followed by
tet_infoline "$out". This immediately caused a freeze. TETware’s tetapi.ksh
script relies on %% within a loop inside the tet_output function to process
multi-line text. Because the suffix was never actually deleted due to the bug,
the loop never terminated. When I attempted to debug the freeze by enabling
set -x, the infinite loop generated logs so rapidly that it filled my entire
hard drive to 100% capacity in seconds! :P
I initially intended to report this upstream, but after struggling to find a proper bug tracker on the ksh93 homepage, I gave up. For now, I’ve just patched our instance of TETware directly.