UnifyWeaver

Chapter 7: Variable Scope and Process Substitution

Understanding Bash Variable Scope in Generated Code

One of the most critical aspects of the bash code generated by UnifyWeaver is proper variable scoping. This chapter explains how UnifyWeaver handles variable scope to ensure correctness, and why certain patterns (like process substitution) are essential.

The Variable Scope Problem

Why Scope Matters

Consider this seemingly simple bash code:

# ❌ WRONG - Variable scope issue
cat data.txt | while IFS=":" read -r a b; do
    count=$((count + 1))  # This modifies a COPY of count!
done
echo $count  # Will be empty or wrong!

Problem: By default, bash runs each command of a pipeline in its own subshell. The while loop therefore increments a copy of count, and that copy is discarded when the subshell exits. The parent shell's count never changes.

This is critical because UnifyWeaver generates pipelines that must preserve data across multiple stages.
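
One way to see the subshell directly is to compare process IDs. The following is a minimal sketch, not UnifyWeaver output: the function name show_loop_shells is hypothetical, and it relies on bash's $BASHPID variable, which holds the PID of the shell that is currently executing.

```bash
#!/usr/bin/env bash
# Sketch: report whether a loop body runs in the calling shell or in a subshell.
show_loop_shells() {
    local here=$BASHPID   # PID of the shell executing this function

    # Loop fed by a pipe: bash forks a subshell for it by default.
    printf 'x\n' | while read -r _; do
        if [[ $BASHPID == "$here" ]]; then
            echo "pipe loop: current shell"
        else
            echo "pipe loop: subshell"
        fi
    done

    # Loop fed by process substitution: only the producer is forked.
    while read -r _; do
        if [[ $BASHPID == "$here" ]]; then
            echo "process substitution loop: current shell"
        else
            echo "process substitution loop: subshell"
        fi
    done < <(printf 'x\n')
}

show_loop_shells
```

With default shell options, the first line reports a subshell and the second reports the current shell.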

UnifyWeaver’s Solutions

Solution 1: Process Substitution (< <(command))

From ancestor.sh (actual generated code):

while IFS=":" read -r from to; do
    if [[ "$from" == "$current" && -z "${visited[$to]}" ]]; then
        visited["$to"]=1              # ← This works!
        echo "$to" >> "$next_queue"
        echo "$start:$to"
    fi
done < <(parent_get_stream | grep "^$current:")
#    ↑ This is process substitution, NOT a pipe!

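Why this works:

The command inside <(...) runs in a separate process, but the while loop itself stays in the current shell and simply reads that command's output through an ordinary redirection (<). Assignments to visited made inside the loop are therefore still in effect after the loop ends.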

Compare to a pipe (which would be wrong):

# ❌ WRONG
parent_get_stream | grep "^$current:" | while IFS=":" read -r from to; do
    visited["$to"]=1  # Modifies a COPY - lost when subshell exits!
done

Solution 2: Local Variables in Functions

From sum_list.sh (actual generated code):

sum_list() {
    local input="$1"
    local acc="$2"
    local result_var="$3"

    # ... processing ...

    local current_acc="$acc"

    for item in "${items[@]}"; do
        current_acc=$((current_acc + item))  # Modifies local variable
    done

    # Return via eval (for named variable) or echo
    if [[ -n "$result_var" ]]; then
        eval "$result_var=$current_acc"  # Set variable in CALLER'S scope
    else
        echo "$current_acc"
    fi
}

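Key points:

The parameters and the accumulator (input, acc, result_var, current_acc) are declared local, so they cannot overwrite variables of the same name in the caller. The result leaves the function in one of two ways: by assignment to the variable named in the third argument (via eval), or on stdout when no name is given.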
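
Because eval re-parses its argument, a result variable name containing shell syntax would be executed as code. As an alternative for hand-written code, the same return-by-name idea can be sketched with the bash builtin printf -v, which assigns to a named variable without re-parsing. The function sum_into below is hypothetical and is not what UnifyWeaver generates.

```bash
#!/usr/bin/env bash
# Sketch: return a result through a caller-named variable without eval.
# sum_into is a hypothetical helper, not part of UnifyWeaver.
sum_into() {
    local result_var="$1"   # name of the caller's variable
    shift
    local total=0 item
    for item in "$@"; do
        total=$((total + item))
    done
    # printf -v writes the formatted text into the variable named by result_var.
    printf -v "$result_var" '%s' "$total"
}

sum_into answer 1 2 3
echo "$answer"   # prints 6
```

One caveat of this approach: if the caller's variable has the same name as one of the function's locals (here result_var, total, or item), the assignment lands on the local instead.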

Solution 3: Global Associative Arrays for Memoization

From even_odd.sh (actual generated code):

# Shared memo table for all predicates in this group
declare -gA is_even_is_odd_memo  # ← -gA = global associative array

is_even() {
    local key="is_even:$*"

    # Check shared memo table
    if [[ -n "${is_even_is_odd_memo[$key]}" ]]; then
        echo "${is_even_is_odd_memo[$key]}"
        return 0
    fi

    # ... computation ...

    # Cache result in GLOBAL memo table
    is_even_is_odd_memo["$key"]="$result"
    echo "$result"
}

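Why global:

The predicates in this group share a single memo table, so the table has to live outside any one function. Keys are prefixed with the predicate name (is_even:...) to keep entries from different predicates apart. The -g flag makes the array global even when the declare statement is executed inside a function, for example when the script is sourced from one.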
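
A global table is still subject to the subshell rule from earlier in this chapter: a function that both caches and echoes its result only updates the caller's table when it is called directly, not through $(...). The sketch below illustrates this with a hypothetical square function and square_memo table; it is not generated code.

```bash
#!/usr/bin/env bash
# Sketch: a memoized function and the effect of calling it in a subshell.
# square and square_memo are hypothetical names used for illustration.
declare -gA square_memo

square() {
    local key="square:$1"
    if [[ -n "${square_memo[$key]}" ]]; then
        echo "${square_memo[$key]}"
        return 0
    fi
    local result=$(( $1 * $1 ))
    square_memo["$key"]="$result"   # cache in the global table
    echo "$result"
}

square 4 > /dev/null                 # direct call: runs in the current shell
echo "after direct call: ${#square_memo[@]} entries"      # 1 entries

val=$(square 5)                      # command substitution: runs in a subshell
echo "after \$(...) call: ${#square_memo[@]} entries"     # still 1 entries
```

The value 25 is returned correctly in val, but its cache entry is written to the subshell's copy of the table and disappears with it.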

Scope Patterns in Detail

Pattern 1: Join Operations with Process Substitution

From grandparent.sh:

parent_join() {
    while IFS= read -r input; do
        IFS=":" read -r a b <<< "$input"  # Parse input

        # Iterate over parent data
        for key in "${!parent_data[@]}"; do
            IFS=":" read -r c d <<< "$key"
            [[ "$b" == "$c" ]] && echo "$a:$d"
        done
    done
}

grandparent() {
    parent_stream | parent_join | sort -u
}

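Scope analysis:

Each stage of the pipeline runs in its own subshell. That is fine here because no stage needs to hand a variable back to the caller: data flows only through stdout. parent_data is only read, and every subshell inherits a copy of it. The variables set inside parent_join (input, a, b, c, d, key) disappear together with the subshell.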
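
To try the join in isolation, the following sketch supplies sample data and a parent_stream function. Both are assumptions made for illustration: the real contents of parent_data and the real body of parent_stream are not shown in this chapter.

```bash
#!/usr/bin/env bash
# Sketch: a runnable version of the join with hypothetical sample data.
declare -A parent_data=(
    ["alice:bob"]=1
    ["bob:carol"]=1
    ["carol:dave"]=1
)

# Assumed implementation: emit every parent:child key, one per line.
parent_stream() {
    local key
    for key in "${!parent_data[@]}"; do
        echo "$key"
    done
}

parent_join() {
    local input a b c d key
    while IFS= read -r input; do
        IFS=":" read -r a b <<< "$input"
        for key in "${!parent_data[@]}"; do
            IFS=":" read -r c d <<< "$key"
            # Emit a:d whenever a's child is c, i.e. a is d's grandparent.
            [[ "$b" == "$c" ]] && echo "$a:$d"
        done
    done
    return 0
}

grandparent() {
    parent_stream | parent_join | sort -u
}

grandparent
```

With this sample data the output is alice:carol followed by bob:dave. Nothing is stored in a variable across stages, which is why plain pipes are sufficient.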

Pattern 2: BFS with Temporary Files

From ancestor.sh:

ancestor_all() {
    local start="$1"
    declare -A visited  # Function-local associative array
    local queue_file="/tmp/ancestor_queue_$$"
    local next_queue="/tmp/ancestor_next_$$"

    trap "rm -f $queue_file $next_queue" EXIT PIPE

    echo "$start" > "$queue_file"
    visited["$start"]=1

    while [[ -s "$queue_file" ]]; do
        > "$next_queue"

        while IFS= read -r current; do
            # Process substitution preserves 'visited' array!
            while IFS=":" read -r from to; do
                if [[ "$from" == "$current" && -z "${visited[$to]}" ]]; then
                    visited["$to"]=1           # Works correctly!
                    echo "$to" >> "$next_queue"
                    echo "$start:$to"
                fi
            done < <(parent_get_stream | grep "^$current:")
        done < "$queue_file"

        mv "$next_queue" "$queue_file"
    done
}

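Why this pattern:

The visited array must survive every iteration of both loops, so the inner loop reads from process substitution rather than from a pipe. The BFS queue lives in temporary files, and the outer loop reads it with plain file redirection (< "$queue_file"), which also keeps that loop in the current shell. Declaring visited with declare -A inside the function makes it local to each call of ancestor_all.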
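
The sketch below is a self-contained variant of this pattern that can be run as-is. It makes two assumptions that differ from the generated code: parent_get_stream is a hypothetical stand-in that prints fixed sample data, and the temporary files are created with mktemp and removed explicitly instead of using fixed /tmp names and a trap.

```bash
#!/usr/bin/env bash
# Sketch: BFS over parent:child edges with state kept in the current shell.
# parent_get_stream is a hypothetical stand-in with sample data.
parent_get_stream() {
    printf '%s\n' "alice:bob" "bob:carol" "carol:dave" "bob:erin"
}

ancestor_all() {
    local start="$1"
    local -A visited=()
    local queue_file next_queue current from to
    queue_file=$(mktemp) || return 1
    next_queue=$(mktemp) || { rm -f "$queue_file"; return 1; }

    echo "$start" > "$queue_file"
    visited["$start"]=1

    while [[ -s "$queue_file" ]]; do
        : > "$next_queue"            # start the next BFS level empty
        while IFS= read -r current; do
            # Process substitution keeps this loop in the current shell,
            # so updates to visited are seen by later iterations.
            while IFS=":" read -r from to; do
                if [[ "$from" == "$current" && -z "${visited[$to]}" ]]; then
                    visited["$to"]=1
                    echo "$to" >> "$next_queue"
                    echo "$start:$to"
                fi
            done < <(parent_get_stream | grep "^$current:")
        done < "$queue_file"
        mv "$next_queue" "$queue_file"
    done

    rm -f "$queue_file" "$next_queue"
}

ancestor_all alice
```

For the sample data this prints alice:bob, alice:carol, alice:erin, and alice:dave, one per line, in breadth-first order.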

Common Pitfalls (What NOT to Do)

Pitfall 1: Using Pipes When Accumulating

# ❌ WRONG
counter=0
cat file.txt | while IFS= read -r line; do
    counter=$((counter + 1))  # Modifies subshell copy!
done
echo $counter  # Will be 0!

# ✅ CORRECT
counter=0
while IFS= read -r line; do
    counter=$((counter + 1))  # Modifies actual variable
done < file.txt
echo $counter  # Correct count!
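
For hand-written scripts, bash also offers the lastpipe shell option, which runs the last command of a pipeline in the current shell. The sketch below is an aside, not a pattern this chapter attributes to UnifyWeaver. It assumes bash 4.2 or later and a non-interactive script, because lastpipe only takes effect while job control is off.

```bash
#!/usr/bin/env bash
# Sketch: keep a piped while loop in the current shell with lastpipe.
# Assumes bash 4.2+ and job control off (the default in scripts).
shopt -s lastpipe

counter=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    counter=$((counter + 1))   # runs in the current shell under lastpipe
done
echo "$counter"   # prints 3 when run as a script
```

Process substitution remains the more portable choice across bash versions and interactive sessions, since it does not depend on a shell option.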

Pitfall 2: Forgetting local in Functions

# ❌ WRONG
process_item() {
    item="$1"  # Creates GLOBAL variable!
}

# ✅ CORRECT
process_item() {
    local item="$1"  # Function-scoped
}
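
The difference is easy to observe when the caller already has a variable with the same name. In this minimal sketch, the function names set_item_leaky and set_item_scoped are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: a missing 'local' silently overwrites the caller's variable.
item="original"

set_item_leaky()  { item="$1"; }         # assigns to the global item
set_item_scoped() { local item="$1"; }   # assigns to a function-local item

set_item_scoped "changed"
after_scoped=$item
set_item_leaky "changed"
after_leaky=$item

echo "after scoped call: $after_scoped"   # original
echo "after leaky call:  $after_leaky"    # changed
```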

Pitfall 3: Assuming Array Persistence Through Pipes

# ❌ WRONG
declare -A memo
command | while read -r key value; do
    memo["$key"]="$value"  # Lost after pipe!
done

# ✅ CORRECT
declare -A memo
while read -r key value; do
    memo["$key"]="$value"  # Persists!
done < <(command)
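
Both variants can be compared side by side by counting the entries that survive each loop. In this sketch, emit_pairs is a hypothetical stand-in for command, and the two array names are chosen only for illustration.

```bash
#!/usr/bin/env bash
# Sketch: count array entries that survive a piped loop vs. process substitution.
emit_pairs() {   # hypothetical stand-in for 'command'
    printf '%s\n' "a 1" "b 2" "c 3"
}

declare -A memo_pipe=() memo_procsub=()

emit_pairs | while read -r key value; do
    memo_pipe["$key"]="$value"       # written to the subshell's copy
done

while read -r key value; do
    memo_procsub["$key"]="$value"    # written to the current shell's array
done < <(emit_pairs)

echo "pipe: ${#memo_pipe[@]} entries"                     # 0 entries
echo "process substitution: ${#memo_procsub[@]} entries"  # 3 entries
```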

How UnifyWeaver Chooses Patterns

The compiler selects a pattern based on the predicate's structure, in particular on whether state has to outlive a loop or a function call:

Simple Stream Transformation → Pipes OK

# No accumulation needed
grandparent() {
    parent_stream | parent_join | sort -u
}

Need to Accumulate State → Process Substitution

# visited array must persist
while IFS=":" read -r from to; do
    visited["$to"]=1
done < <(command)

Need Global Persistence → Global Arrays

# Memoization across all calls
declare -gA function_memo

Testing Variable Scope

You can test if scope is correct:

# Test: Does counter persist?
test_scope() {
    local counter=0 line

    # Your loop here
    while IFS= read -r line; do
        counter=$((counter + 1))
    done < <(printf 'a\nb\nc\n')

    echo "Counter: $counter"  # Should be 3
}

test_scope

If counter is 0, you have a scope issue!

Summary

Key Takeaways:

  1. Pipes create subshells, so variables set inside a piped loop are lost
  2. Process substitution (< <(...)) keeps the loop in the current shell
  3. Use local for all function variables
  4. Use declare -gA for global memo tables
  5. Use temporary files for state that must persist across subshells

UnifyWeaver handles this automatically: the generated code uses the correct pattern for each situation. Understanding these patterns still helps when you read, debug, or modify the generated scripts.


Next Steps

Now that you understand variable scope, the next chapter will explore the template system that UnifyWeaver uses to generate this bash code cleanly and maintainably.

