Commit

fix: improve redaction of base64 encoded strings
Using only a regex to find and replace potential base64 strings (without actually trying to decode them) redacted strings that looked like base64 but were not. E.g. 'echo "yes" | foo' got redacted, even though 'yes' is not a base64 encoded string.
Realiserad committed Aug 21, 2024
1 parent d19ba2d commit 5cd0d6e
Showing 2 changed files with 12 additions and 6 deletions.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "fish_ai"
-version = "0.9.4"
+version = "0.9.5"
 authors = [{ name = "Bastian Fredriksson", email = "realiserad@gmail.com" }]
 description = "Provides core functionality for fish-ai, an AI plugin for the fish shell."
 readme = "README.md"
16 changes: 11 additions & 5 deletions src/fish_ai/redact.py
@@ -1,6 +1,7 @@
 # -*- coding: utf-8 -*-
 
 import re
+import base64
 
 
 def redact(messages):
@@ -62,8 +63,13 @@ def redact_pem_encoded_private_key_block(content):
 
 def redact_base64_data(content):
     pattern = r'["\']([A-Za-z0-9+\\/=]+={0,2})["\']'
-    replace_with = r'"<REDACTED>"'
-    return re.sub(
-        pattern,
-        replace_with,
-        content)
+
+    def redact_match(match):
+        encoded_string = match.group(1)
+        try:
+            base64.b64decode(encoded_string)
+            return r'"<REDACTED>"'
+        except base64.binascii.Error:
+            return match.group(0)
+
+    return re.sub(pattern, redact_match, content)
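
The decode check described in the commit message can be tried in isolation. The following is a minimal standalone sketch, not the packaged module: it copies the pattern and the callback logic from the diff into a single function and catches `binascii.Error` through a direct import of `binascii`, which is an assumption made here for readability.

```python
import base64
import binascii
import re

# Pattern copied from the diff: a quoted run of base64-alphabet characters.
PATTERN = r'["\']([A-Za-z0-9+\\/=]+={0,2})["\']'


def redact_base64_data(content):
    """Redact quoted strings only when they actually decode as base64."""

    def redact_match(match):
        try:
            # Raises binascii.Error on input that is not decodable base64,
            # for example a string with incorrect padding.
            base64.b64decode(match.group(1))
            return '"<REDACTED>"'
        except binascii.Error:
            # Not decodable: leave the original quoted text untouched.
            return match.group(0)

    return re.sub(PATTERN, redact_match, content)


if __name__ == "__main__":
    print(redact_base64_data('echo "yes" | foo'))
    print(redact_base64_data('echo "aGVsbG8gd29ybGQ=" | base64 -d'))
```

With this sketch, 'echo "yes" | foo' is left unchanged because 'yes' fails to decode, while the quoted 'aGVsbG8gd29ybGQ=' is replaced. A quoted word that happens to be decodable, such as 'test', is still redacted, so the check reduces false positives rather than eliminating them.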
