fix(sanitization): close security gaps in ANSI escape sequence stripping#55
Conversation
…SI regex Co-authored-by: GingerGraham <[email protected]>
…omment Co-authored-by: GingerGraham <[email protected]>
Co-authored-by: GingerGraham <[email protected]>
Co-authored-by: GingerGraham <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR hardens bash-logger’s ANSI/terminal escape sanitization to prevent additional bypasses that could enable terminal escape injection, and adds targeted tests for the newly covered cases.
Changes:
- Expands CSI stripping to cover DEC private modes and additional CSI parameter bytes.
- Improves OSC stripping by supporting ST (
ESC \) termination and handling embedded escapes / consecutive OSC sequences. - Adds tests for the new CSI/OSC cases and for mixed newline/ANSI sanitization mode behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
logging.sh |
Updates _strip_ansi_codes to recognize more CSI/OSC variants (DEC private modes; ST-terminated OSC via a sed loop). |
tests/test_ansi_injection.sh |
Adds regression tests for DEC private modes, ST-terminated OSC, embedded escapes inside OSC, consecutive OSC sequences, mixed sanitization modes, and extra CSI parameter bytes. |
| # Remove CSI (Control Sequence Introducer) sequences: ESC [ ... letter | ||
| # Includes color codes (\e[...m), cursor movement (\e[H), clearing (\e[2J), etc. | ||
| # Pattern: \e[ followed by zero or more digits/semicolons, followed by a letter | ||
| # Also handles DEC private modes (e.g., \e[?25l, \e[?1049h) and other parameter bytes | ||
| # Pattern: \e[ followed by zero or more parameter bytes ([<=>?!] plus digits/semicolons), | ||
| # followed by a letter or @ | ||
| local step1 | ||
| # Use direct escapes to avoid quoting issues in patterns | ||
| # shellcheck disable=SC1117 | ||
| step1=$(printf '%s' "$input" | sed 's/\x1b\[[0-9;]*[a-zA-Z@]//g') | ||
| step1=$(printf '%s' "$input" | sed 's/\x1b\[[0-9;<?>=!]*[a-zA-Z@]//g') |
There was a problem hiding this comment.
The CSI stripping regex is still not spec-complete: CSI “parameter bytes” include ':' (0x3A) and the final byte can be any in 0x40–0x7E (e.g., sequences ending in '' like "\e[1"). With the current [0-9;<?>=!]*[a-zA-Z@] pattern, valid CSI sequences can remain unstripped, which keeps a terminal escape injection bypass open. Consider widening the parameter-byte and final-byte character classes to cover the full CSI grammar (and optionally intermediate bytes 0x20–0x2F).
| # Pattern: \([^\x1b]\|\x1b[^\\]\)* matches any char except ESC, OR ESC if not followed by \ | ||
| # This allows embedded ESC codes like \e[31m while still stopping at \e\\ terminator | ||
| # The loop ensures multiple consecutive OSC sequences are all removed | ||
| # shellcheck disable=SC1117 | ||
| step2=$(printf '%s' "$step2" | sed ':loop; s/\x1b\]\(\([^\x1b]\|\x1b[^\\]\)*\)\x1b\\//g; t loop') |
There was a problem hiding this comment.
The ST-terminated OSC sed pattern uses \| alternation, which is a GNU sed BRE extension and will fail on BSD/macOS sed. Since this sanitizer is security-sensitive, a non-portable sed expression can silently leave OSC sequences unstripped on macOS. Please rewrite this using a portable approach (e.g., sed -E with |, or another POSIX-compatible implementation) and ensure the behavior is covered across supported platforms.
| # Pattern: \([^\x1b]\|\x1b[^\\]\)* matches any char except ESC, OR ESC if not followed by \ | |
| # This allows embedded ESC codes like \e[31m while still stopping at \e\\ terminator | |
| # The loop ensures multiple consecutive OSC sequences are all removed | |
| # shellcheck disable=SC1117 | |
| step2=$(printf '%s' "$step2" | sed ':loop; s/\x1b\]\(\([^\x1b]\|\x1b[^\\]\)*\)\x1b\\//g; t loop') | |
| # Pattern (ERE): ([^\x1b]|\x1b[^\\])* matches any char except ESC, OR ESC if not followed by \ | |
| # This allows embedded ESC codes like \e[31m while still stopping at \e\\ terminator | |
| # The loop ensures multiple consecutive OSC sequences are all removed | |
| # shellcheck disable=SC1117 | |
| step2=$(printf '%s' "$step2" | sed -E ':loop; s/\x1b\](([^\x1b]|\x1b[^\\])*)\x1b\\//g; t loop') |
| # Test: Other CSI parameter bytes are handled | ||
| test_other_csi_parameter_bytes_stripped() { | ||
| start_test "CSI sequences with various parameter bytes are stripped" | ||
|
|
||
| # Test various parameter byte prefixes | ||
| local malicious_input=$'A\e[>cB\e[=cC\e[!pD' | ||
| local expected="ABCD" | ||
|
|
||
| LOG_UNSAFE_ALLOW_ANSI_CODES="false" | ||
| local sanitized | ||
| sanitized=$(_strip_ansi_codes "$malicious_input") | ||
|
|
||
| if [[ "$sanitized" == "$expected" ]]; then | ||
| pass_test | ||
| else | ||
| fail_test "Other parameter bytes not stripped: got '$sanitized' expected '$expected'" | ||
| fi |
There was a problem hiding this comment.
These new tests cover additional CSI parameter bytes, but they still don’t exercise common non-alphabetic CSI final bytes (e.g., sequences ending in ~). Adding at least one case like "ESC [ 1 ~" would prevent regressions where valid CSI sequences are not stripped.
Addresses review feedback on PR #54 identifying three security bypasses in ANSI escape sequence sanitization that could enable terminal escape injection attacks.
Changes
CSI parameter byte handling
\e[?25l,\e[?1049h) and parameter prefixes (<,>,=,!)[0-9;], missing valid CSI sequences with other parameter bytesOSC sequence termination
\e\\) handling alongside existing BEL (\a) support\([^\x1b]\|\x1b[^\\]\)*matches any char except ESC, or ESC if not followed by backslashIndependent sanitization modes
LOG_UNSAFE_ALLOW_NEWLINES=true+LOG_UNSAFE_ALLOW_ANSI_CODES=falseshould preserve newlines while stripping ANSITest coverage
Added 6 tests covering:
\e[?25l)\e]0;title\e\\)\e]0;\e[31mred\e\\)\e[>c)250 tests passing.
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.