PIIDetector
llmfy.guardrails.pii.pii_detector
PIIDetector
Detects and optionally replaces Personally Identifiable Information in text.
Uses regex-based detection — no external NLP dependencies required.
Example:
from llmfy import PIIDetector, PIIMaskStyle, PIIStrategy, PIIType
# Default: MASK + PARTIAL (first 2 chars of value + *)
detector = PIIDetector()
result = detector.detect("Contact john@example.com or call 555-123-4567")
print(result.processed_text) # "Contact jo* or call 55*"
# MASK + TYPE_NAME style
detector = PIIDetector(mask_style=PIIMaskStyle.TYPE_NAME)
result = detector.detect("Contact john@example.com or call 555-123-4567")
print(result.processed_text) # "Contact [EMAIL] or call [PHONE_NUMBER]"
# REDACT strategy (mask_style ignored)
detector = PIIDetector(strategy=PIIStrategy.REDACT, types=[PIIType.EMAIL])
result = detector.detect("Email: jane@test.org, SSN: 123-45-6789")
print(result.processed_text) # "Email: [REDACTED], SSN: 123-45-6789"
# Custom types with name and regex
detector = PIIDetector(
    custom_types={"EMPLOYEE_ID": "EMP-[0-9]{6}", "PROJECT_CODE": "PRJ-[A-Z]{3}"}
)
result = detector.detect("Employee EMP-001234 is on project PRJ-ABC")
print(result.processed_text) # "Employee EM* is on project PR*"
# Scan without replacing
findings = detector.scan("Email: jane@test.org, IP: 10.0.0.1")
for f in findings:
    print(f.pii_type, f.value)
Source code in llmfy/guardrails/pii/pii_detector.py
strategy = strategy
instance-attribute
mask_style = mask_style
instance-attribute
types = [t for t in active_types if t.value not in self._custom_patterns]
instance-attribute
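The `types` expression above drops any built-in type whose value is also a key in the custom patterns, so each name is matched by a single pattern. A minimal sketch of that filter, assuming enum values equal the type names; `DemoType` and the sample pattern are hypothetical stand-ins, not the library's actual definitions:

```python
import re
from enum import Enum


class DemoType(Enum):
    """Hypothetical stand-in for PIIType; the real members may differ."""

    EMAIL = "EMAIL"
    PHONE_NUMBER = "PHONE_NUMBER"


# Custom patterns keyed by type name, mirroring how custom_types is described.
custom_patterns = {"EMAIL": re.compile(r"[\w.]+@corp\.example")}

# Assume every built-in type is active when no explicit list is given.
active_types = list(DemoType)

# Keep only the built-ins that no custom pattern has claimed by name.
types = [t for t in active_types if t.value not in custom_patterns]

print(types)
```

Here only `DemoType.PHONE_NUMBER` survives the filter, because the custom `EMAIL` pattern takes over that name.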
__init__(strategy=PIIStrategy.MASK, mask_style=PIIMaskStyle.PARTIAL, types=None, custom_types=None)
Initialize the PIIDetector.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy | PIIStrategy | PIIStrategy.MASK replaces PII using mask_style. PIIStrategy.REDACT replaces all PII with [REDACTED]. Defaults to PIIStrategy.MASK. | MASK |
| mask_style | PIIMaskStyle | Controls the placeholder format when strategy is MASK. PIIMaskStyle.PARTIAL shows the first 2 chars of the detected value followed by * (e.g. 'jo*'). PIIMaskStyle.TYPE_NAME shows the type in brackets (e.g. '[EMAIL]'). Defaults to PIIMaskStyle.PARTIAL. | PARTIAL |
| types | Optional[List[PIIType]] | List of PIIType values to detect. Pass None to detect all supported PII types. Defaults to None (all types). | None |
| custom_types | Optional[Dict[str, Union[str, Pattern]]] | Dict mapping a custom type name to a regex pattern (str or compiled). The name is used as the TYPE_NAME placeholder label. If a key matches a built-in PIIType name, the custom pattern replaces the built-in. Defaults to None (no custom types). | None |
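The interaction between `strategy` and `mask_style` described in the parameters can be sketched as one small function. This is an illustrative approximation built only from the descriptions above, not the library's code; the function name `placeholder` and the use of plain strings instead of the enums are assumptions made to keep the sketch self-contained:

```python
def placeholder(value: str, type_name: str,
                strategy: str = "MASK", mask_style: str = "PARTIAL") -> str:
    """Return the replacement text for one detected value (hypothetical helper)."""
    if strategy == "REDACT":
        # REDACT ignores mask_style entirely.
        return "[REDACTED]"
    if mask_style == "TYPE_NAME":
        # The type name (built-in or custom) becomes the bracketed label.
        return f"[{type_name}]"
    # PARTIAL: first two characters of the value followed by a single '*'.
    return value[:2] + "*"


print(placeholder("john@example.com", "EMAIL"))
print(placeholder("john@example.com", "EMAIL", mask_style="TYPE_NAME"))
print(placeholder("john@example.com", "EMAIL", strategy="REDACT"))
```

The three calls correspond to the default, `TYPE_NAME`, and `REDACT` cases shown in the class example.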
scan(text)
Find all PII in text without replacing anything.
Detections are returned sorted by their start character index.
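A regex-based scan of this kind can be approximated by running each pattern over the text and sorting the combined matches by start index. The sketch below is an assumption-laden illustration: the `Finding` tuple, the `IP_ADDRESS` label, and both patterns are simplified and hypothetical, not the detector's real ones.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    """Hypothetical record for one match; the real PIIDetection may differ."""

    pii_type: str
    value: str
    start: int
    end: int


# Deliberately simplified patterns, for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def scan(text: str) -> List[Finding]:
    # Collect matches per pattern; this order follows the dict, not the text.
    findings = [
        Finding(name, m.group(), m.start(), m.end())
        for name, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    # Sort by start offset so results follow reading order.
    return sorted(findings, key=lambda f: f.start)


found = scan("IP: 10.0.0.1, Email: jane@test.org")
for f in found:
    print(f.pii_type, f.value, f.start)
```

Without the final sort, the email would be listed first because its pattern is checked first, even though the IP address appears earlier in the text.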
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The input text to scan. | required |
Returns:
| Type | Description |
|---|---|
| List[PIIDetection] | List of PIIDetection instances, each describing one PII occurrence. |
detect(text)
Detect all PII in text and return a result with PII replaced.
Replacements are applied right-to-left to preserve character index validity as substitutions are made.
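Right-to-left order matters because a replacement usually changes the text length, which shifts every offset to its right but none to its left. A minimal sketch of the idea, assuming non-overlapping spans; `replace_spans` is a hypothetical helper, not part of the library:

```python
from typing import List, Tuple


def replace_spans(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Apply (start, end, replacement) spans, last span first."""
    # Starting from the largest offset keeps the earlier spans' indices
    # valid, since edits only shift text to their right.
    for start, end, replacement in sorted(spans, key=lambda s: s[0], reverse=True):
        text = text[:start] + replacement + text[end:]
    return text


text = "Contact john@example.com or call 555-123-4567"
email, phone = "john@example.com", "555-123-4567"
spans = [
    (text.index(email), text.index(email) + len(email), "[EMAIL]"),
    (text.index(phone), text.index(phone) + len(phone), "[PHONE_NUMBER]"),
]

result = replace_spans(text, spans)
print(result)
```

Applying the same spans left to right would corrupt the second replacement: once the 16-character email becomes the 7-character `[EMAIL]`, the phone number's recorded offsets no longer point at it.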
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The input text to process. | required |
Returns:
| Type | Description |
|---|---|
| PIIDetectionResult | PIIDetectionResult containing the original text, the processed text with PII replaced, and all individual detections. |