Educational Topic: Unix pipes, streaming safety, and production reliability
Real-World Problem: How broken pipes can corrupt data streams in CLI applications
Case Study: UnifyWeaver’s solution for Gemini CLI streaming compatibility
This appendix explores a real production issue encountered in UnifyWeaver: SIGPIPE warnings that could interfere with Gemini CLI streaming responses. This case study demonstrates Unix systems programming concepts, the subtleties of pipe handling, and engineering approaches to streaming safety.
In Unix systems, pipes connect the output of one process to the input of another:
producer | consumer
Normal flow:
1. producer writes data to the pipe
2. consumer reads data from the pipe
Broken pipe scenario:
1. producer is generating data
2. consumer exits early (e.g., grep -q finds first match)
3. producer tries to write to a closed pipe
4. producer receives SIGPIPE from the kernel
UnifyWeaver generates bash scripts for recursive predicates like:
# Find if 'target' is reachable from 'start'
ancestor_check() {
    local start="$1"
    local target="$2"
    # This pattern causes SIGPIPE:
    ancestor_all "$start" | grep -q "^$start:$target$"
    #                       ↑
    #                       grep -q exits after first match
    #                       ancestor_all keeps running and hits broken pipe
}
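To see what the producer experiences in a pipeline like this, a minimal sketch (not UnifyWeaver code) can record each stage's exit status with bash's PIPESTATUS array. The function name report_producer_status is hypothetical, and yes stands in for ancestor_all. With SIGPIPE at its default disposition the producer status is typically 141 (128 + signal 13); if SIGPIPE is ignored, the producer instead sees a failed write and exits with its own error status.

```bash
#!/usr/bin/env bash
# Hypothetical helper: shows how a producer ends when its consumer exits early.
report_producer_status() {
    # `yes` writes forever; `head -n 1` exits after a single line,
    # so a later write from `yes` lands on a closed pipe.
    yes | head -n 1 >/dev/null
    # PIPESTATUS must be copied immediately, before another command resets it.
    local statuses=("${PIPESTATUS[@]}")
    echo "producer=${statuses[0]} consumer=${statuses[1]}"
}

report_producer_status
```

The consumer reports success while the producer does not, which is why the problem is easy to miss when only the pipeline's final exit status is checked.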
Timeline of the problem:
1. ancestor_all starts BFS traversal (expensive operation)
2. grep -q starts reading from the pipe
3. grep -q finds match on iteration 3, exits immediately
4. ancestor_all continues to iteration 20, tries to output result
5. bash reports: bash: line 1: echo: write error: Broken pipe
The Core Issue: Error messages to stderr can corrupt streaming data in CLI applications.
Gemini CLI Context:
General CLI Streaming:
Implementation:
# The redirect must sit on the producer, whose stderr carries the warning
ancestor_all "$start" 2>/dev/null | grep -q "^$start:$target$"
Why rejected:
Implementation:
trap '' PIPE # Ignore SIGPIPE
ancestor_all "$start" | grep -q "^$start:$target$"
trap - PIPE # Restore default handling
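One behaviour worth checking before relying on the trap approach: when SIGPIPE is ignored, the write itself fails with EPIPE, and bash's echo builtin reports that failure on stderr. The sketch below, with the hypothetical function name demo_ignored_sigpipe, reproduces this in a subshell so the trap does not leak into the calling shell. It assumes a sleep that accepts fractional seconds, as the examples elsewhere in this appendix do.

```bash
#!/usr/bin/env bash
# Hypothetical demo: with SIGPIPE ignored, failed writes are reported instead.
demo_ignored_sigpipe() {
    (
        LC_ALL=C            # keep the error text untranslated for this demo
        trap '' PIPE        # ignore SIGPIPE in this subshell and its children
        for i in 1 2 3 4 5; do
            echo "line_$i"
            sleep 0.1       # give head time to exit before the next write
        done | head -n 1 >/dev/null
    ) 2>&1                  # surface bash's stderr message on stdout
}

demo_ignored_sigpipe
```

The output contains the same "write error: Broken pipe" text shown in the timeline above, so ignoring the signal changes how the failure surfaces rather than removing it.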
Limitations:
Implementation:
ancestor_check() {
    local start="$1"
    local target="$2"
    local temp_file="/tmp/ancestor_$$"
    # Materialize the full result first, so no pipe can break
    ancestor_all "$start" > "$temp_file"
    grep -q "^$start:$target$" "$temp_file"
    local result=$?
    rm -f "$temp_file"
    return $result
}
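A variant of the temporary-file approach can be sketched with mktemp, which picks an unused file name instead of relying on the process ID. This is an illustration rather than the UnifyWeaver implementation: the stand-in ancestor_all and its data are hypothetical, and grep -x -F is swapped in to match the whole line as a fixed string instead of a regular expression.

```bash
#!/usr/bin/env bash
# Stand-in producer (hypothetical data): prints start:node pairs.
ancestor_all() {
    printf '%s:%s\n' "$1" b "$1" c "$1" d
}

# Hypothetical variant of ancestor_check using mktemp.
ancestor_check_tmpfile() {
    local start="$1" target="$2" temp_file result
    temp_file=$(mktemp) || return 2        # unique file, created atomically
    ancestor_all "$start" > "$temp_file"   # producer always runs to completion
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$start:$target" "$temp_file"
    result=$?
    rm -f -- "$temp_file"
    return "$result"
}

if ancestor_check_tmpfile a c; then
    echo "a reaches c"
fi
```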
Trade-offs:
Implementation:
ancestor_check() {
    local start="$1"
    local target="$2"
    local tmpflag="/tmp/ancestor_found_$$"
    local timeout_duration="5s"
    # Timeout prevents infinite execution, tee prevents SIGPIPE
    # Note: timeout runs external commands; if ancestor_all is a shell function,
    # it has to be exported and invoked through bash -c for this to work
    timeout "$timeout_duration" ancestor_all "$start" |
        tee >(grep -q "^$start:$target$" && touch "$tmpflag") >/dev/null
    if [[ -f "$tmpflag" ]]; then
        echo "$start:$target"
        rm -f "$tmpflag"
        return 0
    else
        rm -f "$tmpflag"
        return 1
    fi
}
How it works:
- timeout "5s" - Bounds execution time (safety feature)
- tee - Duplicates output to two destinations:
  - >(grep -q ...) - Process substitution that exits when match found
  - >/dev/null - Stays open, preventing SIGPIPE
- tmpflag - Communicates grep result via filesystem
Why this approach wins:
- Output to /dev/null stays open
- tee is well-established for this purpose
UnifyWeaver uses a template system for code generation. The SIGPIPE fix required updating templates in two locations:
Files Modified:
- src/unifyweaver/core/template_system.pl - Main transitive closure template
- src/unifyweaver/core/recursive_compiler.pl - Descendant-specific template
Template Variable Design:
# Template with placeholders
_check() {
    local tmpflag="/tmp/_found_$$"
    timeout "" _all "$start" |
        tee >(grep -q "^$start:$target$" && touch "$tmpflag") >/dev/null
    # ... rest of implementation
}
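One detail to verify when a template passes a function name to timeout: timeout (GNU coreutils) executes a program, so a bash function is only reachable if it is exported and called through a child bash. The sketch below shows that arrangement under stated assumptions: the stand-in ancestor_all and the wrapper name ancestor_all_bounded are hypothetical and are not taken from the UnifyWeaver templates.

```bash
#!/usr/bin/env bash
# Stand-in producer (hypothetical data): prints start:node pairs.
ancestor_all() {
    local node
    for node in b c d; do
        echo "$1:$node"
    done
}
# Make the function visible to the child bash that timeout launches.
export -f ancestor_all

# Hypothetical wrapper: run ancestor_all under a time limit.
ancestor_all_bounded() {
    local duration="$1" start="$2"
    # timeout executes a program, so the function is called through bash -c.
    timeout "$duration" bash -c 'ancestor_all "$1"' _ "$start"
}

ancestor_all_bounded 5s a
```

When the limit is exceeded, timeout exits with status 124, which a caller can use to tell a timed-out traversal apart from one that finished without finding the target.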
Configuration Options:
- timeout_duration - Configurable timeout (default: “5s”)
- tmpflag pattern - Process-specific temporary files
Process Substitution Syntax:
tee >(grep -q "pattern" && touch flag) >/dev/null
What happens:
- >(...) creates a pipe and passes its path to tee (/dev/fd/N on Linux, a named FIFO on systems without /dev/fd)
- tee writes to both the FIFO and stdout
- grep -q reads from FIFO, exits when match found
- stdout (/dev/null) stays open
Bash Version Compatibility:
Test Strategy:
# Before fix - check for SIGPIPE warnings
swipl -s run_all_tests.pl -g "main, halt." 2>&1 | grep -i "broken pipe"
# After fix - should find none
swipl -s run_all_tests.pl -g "main, halt." 2>&1 | grep -i "broken pipe" || echo "No warnings found!"
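The same check can be wrapped in a small helper so it returns a proper exit status for scripts or CI. This is a sketch with a hypothetical function name, assert_no_broken_pipe. It captures stdout and stderr together, as the commands above do.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command and fail if its combined output
# mentions a broken pipe.
assert_no_broken_pipe() {
    local output
    output=$("$@" 2>&1) || true     # the command's own status is not judged here
    if grep -qi "broken pipe" <<<"$output"; then
        echo "FAIL: broken pipe warning found" >&2
        return 1
    fi
    echo "No warnings found!"
}

# Example with a harmless command; the swipl invocation above could be
# passed the same way.
assert_no_broken_pipe echo "all tests passed"
```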
Results:
The timeout + tee solution fixes the streaming safety issue but still computes full results. Future optimizations could improve efficiency:
Concept: Modify the BFS algorithm itself to stop when target found.
Current:
ancestor_all() {
    # BFS that finds ALL reachable nodes
    while [[ -s "$queue_file" ]]; do
        # ... traverse all nodes
        echo "$start:$node" # Output every relationship
    done
}
Optimized:
ancestor_all() {
    local target="$2" # Optional target parameter
    while [[ -s "$queue_file" ]]; do
        # ... BFS logic
        echo "$start:$node"
        # Early exit when specific target found
        if [[ -n "$target" && "$node" == "$target" ]]; then
            cleanup_and_exit 0
        fi
    done
}
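As a self-contained illustration of the early-exit idea, the sketch below runs a BFS over a small in-memory graph and stops as soon as the optional target is reached. It assumes bash 4 or later for associative arrays. The EDGES table and the function name reachable are hypothetical stand-ins for UnifyWeaver's queue-file implementation.

```bash
#!/usr/bin/env bash
# Hypothetical parent -> children edges, for illustration only.
declare -A EDGES=( [a]="b c" [b]="d" [c]="e" [d]="f" [e]="" [f]="" )

# BFS from $1; if $2 is given, stop as soon as it is reached.
reachable() {
    local start="$1" target="${2:-}"
    local -A seen=()
    local queue=("$start") node child
    seen[$start]=1
    while (( ${#queue[@]} > 0 )); do
        node="${queue[0]}"
        queue=("${queue[@]:1}")              # pop the front of the queue
        for child in ${EDGES[$node]:-}; do   # unquoted on purpose: split on spaces
            [[ -n "${seen[$child]:-}" ]] && continue
            seen[$child]=1
            echo "$start:$child"
            # Early exit when the specific target is found
            if [[ -n "$target" && "$child" == "$target" ]]; then
                return 0
            fi
            queue+=("$child")
        done
    done
    # Without a target the traversal succeeds; with one, it was not found.
    [[ -z "$target" ]]
}

reachable a d   # prints a:b, a:c, a:d and stops before visiting e and f
```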
Benefits:
Advanced Concept: Kill producer process when consumer has enough data.
Implementation complexity:
Use cases: High-frequency streaming scenarios where every microsecond matters.
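One way to experiment with this idea is to run the producer in the background behind a named pipe and terminate it once the consumer has seen what it needs. The sketch below is an illustration under stated assumptions, not a UnifyWeaver feature: producer and stop_after_match are hypothetical names, and the producer is stopped with SIGTERM while the read end is still open.

```bash
#!/usr/bin/env bash
# Hypothetical slow producer: one line every 10 ms.
producer() {
    local i
    for i in {1..500}; do
        echo "line_$i"
        sleep 0.01
    done
}

# Hypothetical helper: read until a line equals $1, then stop the producer.
stop_after_match() {
    local wanted="$1" line status=1 dir fifo pid
    dir=$(mktemp -d) || return 2
    fifo="$dir/stream"
    mkfifo "$fifo" || { rm -rf -- "$dir"; return 2; }

    producer > "$fifo" &              # background writer; $! is its PID
    pid=$!

    while IFS= read -r line; do
        if [[ "$line" == "$wanted" ]]; then
            status=0
            # Terminate the writer while the read end is still open,
            # so it never writes to a closed pipe.
            kill "$pid" 2>/dev/null || true
            break
        fi
    done < "$fifo"

    wait "$pid" 2>/dev/null || true   # reap the writer; ignore its status
    rm -rf -- "$dir"
    return "$status"
}

stop_after_match line_3 && echo "found line_3, producer stopped"
```

The extra moving parts (a FIFO, a background job, cleanup on every path) give a sense of why this direction is more complex than the tee pattern.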
Setup:
#!/bin/bash
# Create a simple producer that outputs many lines
producer() {
    for i in {1..100}; do
        echo "line_$i"
        sleep 0.01 # Small delay to show the issue
    done
}
# This will cause SIGPIPE
producer | head -5
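Observe: The “Broken pipe” error message when producer tries to write after head exits. The message appears only when SIGPIPE is ignored in the environment running the script; with the default disposition the producer is terminated silently, and its exit status (${PIPESTATUS[0]}) is typically 141.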
Stderr suppression (poor solution):
producer 2>/dev/null | head -5
Tee solution (good solution):
producer | tee >(head -5 > results.txt) >/dev/null
Compare: Which approach is more maintainable and debuggable?
Problem simulation:
# Simulate streaming JSON with errors
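json_stream() {
    local i sep
    echo '{"data": ['
    for i in {1..10}; do
        # Comma after every element except the last keeps the full stream valid JSON
        if (( i < 10 )); then sep=","; else sep=""; fi
        echo "  {\"id\": $i}$sep"
    done
    echo ']}'
}
# This corrupts the JSON stream with error messages
json_stream | head -3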
Solution: Apply the tee pattern to keep JSON clean.
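A possible way to work the exercise is sketched below. The variable name preview is an assumption made for illustration, and json_stream is repeated so the block runs on its own. Because head is started from a process substitution, it inherits the stdout of the surrounding command substitution, so $(...) returns only after head has finished writing.

```bash
#!/usr/bin/env bash
json_stream() {
    local i sep
    echo '{"data": ['
    for i in {1..10}; do
        if (( i < 10 )); then sep=","; else sep=""; fi
        echo "  {\"id\": $i}$sep"
    done
    echo ']}'
}

# tee feeds head through a process substitution while tee's own stdout
# goes to /dev/null; only head's three lines reach the captured output.
preview=$(json_stream | tee >(head -n 3) >/dev/null)
printf '%s\n' "$preview"
```

Running the command with 2>errors.log appended makes it easy to compare what, if anything, each variant writes to stderr on your system.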
Good streaming output:
# Clean, parseable output
api_tool --format=json | jq '.results[]'
api_tool --format=csv | cut -d, -f1,3
Bad streaming output:
# Mixed errors and data - breaks automation
api_tool 2>&1 | parser # stderr mixed with stdout
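The difference can be made concrete with a small sketch. Here api_tool and count_bad_lines are hypothetical stand-ins: the first writes data to stdout and a diagnostic to stderr, and the second counts lines a JSON-lines parser would reject.

```bash
#!/usr/bin/env bash
# Hypothetical tool: data on stdout, diagnostics on stderr.
api_tool() {
    echo '{"id": 1}'
    echo 'warning: slow response' >&2
    echo '{"id": 2}'
}

# Hypothetical check: count input lines that do not look like a JSON object.
count_bad_lines() {
    grep -vc '^{.*}$' || true   # grep exits 1 when the count is 0
}

error_log=$(mktemp)
merged=$(api_tool 2>&1 | count_bad_lines)                 # stderr mixed in
separate=$(api_tool 2>>"$error_log" | count_bad_lines)    # stderr kept apart
echo "merged stream: $merged bad line(s); separate streams: $separate bad line(s)"
# → merged stream: 1 bad line(s); separate streams: 0 bad line(s)
echo "logged: $(cat "$error_log")"
rm -f -- "$error_log"
```

Sending stderr to a log keeps the diagnostic available for debugging without letting it reach the parser.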
Monitor for SIGPIPE in production:
# Log analysis for broken pipes
grep -i "broken pipe" /var/log/app.log
# System-wide signal monitoring
dmesg | grep -i "killed by signal"
The SIGPIPE issue in UnifyWeaver demonstrates several important concepts:
The timeout + tee solution demonstrates that elegant engineering solutions often combine multiple simple tools (timeout, tee, process substitution) to solve complex problems safely and efficiently.
This approach provides:
The lesson: When building production systems, prioritize reliability and safety first, then optimize for efficiency. The timeout + tee pattern achieves both goals with minimal complexity.
Related UnifyWeaver Documentation:
This appendix documents a real production issue encountered October 14, 2025, and the engineering solution developed to address streaming safety concerns in CLI applications.