Redaction
Redaction runs before an event is passed to its sink. It applies recursively to
strings, dictionaries, lists, tuples, sets, dataclasses, and objects exposing
model_dump() or dict().
Default detectors
- Email addresses
- US phone numbers and Social Security numbers
- JSON Web Tokens and bearer credentials
- Common API-key assignments and AWS-style access keys
- Database connection URLs
- Signed URL query parameters such as
token,signature,sig,key, andX-Amz-Signature - Values under sensitive dictionary field names
Sensitive field names include password, secret, token, API key, access key, authorization, cookies, client secret, connection string, database URL, email, phone, and SSN variants.
from agentlogsafe import redact
redact({
"authorization": "Bearer abc.def.ghi",
"message": "Contact person@example.com",
})
# {
# "authorization": "[REDACTED]",
# "message": "Contact [REDACTED]",
# }
Configuration
from agentlogsafe import AgentLogger, RedactionConfig
config = RedactionConfig(
replacement="***",
redact_phone_numbers=False,
sensitive_field_names=("password", "secret", "employee_id"),
)
log = AgentLogger(redaction_config=config, sink="events.jsonl")
Supplying sensitive_field_names replaces the default tuple; include every field
name your policy requires.
Disabling redaction
This disables redaction for payload, risk, and metadata. Use it only when data has already been sanitized and the destination has suitable protections.
Operational limitations
Regex detectors can produce false positives and false negatives. Built-in
international coverage includes plus-prefixed phone numbers, IBANs, UK National
Insurance numbers, and Canadian SINs; it is not a comprehensive global identity
library. Values that do not resemble a known pattern should be placed beneath a
configured sensitive field name or removed before logging. Cyclic object graphs
raise RedactionError instead of producing ambiguous output.
Hashing and tokenization
Use deterministic hashing when correlating repeated values is necessary but the original value must not be retained:
from agentlogsafe import HashStrategy, RedactionConfig
config = RedactionConfig(strategy=HashStrategy(salt=b"tenant-specific-salt"))
Use keyed HMAC tokenization when correlation identifiers must not be reproducible without a secret key:
from agentlogsafe import RedactionConfig, TokenizationStrategy
config = RedactionConfig(
strategy=TokenizationStrategy(key=load_key_from_secret_manager())
)
Never hard-code production salts or tokenization keys. Rotate and scope keys using your secret-management policy. These strategies are intentionally one-way; they do not provide token vault lookup or reversible encryption.
Custom detectors
Implement RedactionDetector.redact(value, strategy) or use RegexDetector:
import re
from agentlogsafe import RedactionConfig, RegexDetector
employee_ids = RegexDetector("employee_id", re.compile(r"EMP-\d{6}"))
config = RedactionConfig(detectors=(employee_ids,))
Custom detectors run after enabled built-ins. Keep expressions bounded and avoid nested ambiguous repetitions that can cause pathological regex performance.