i18n tags tools
parent 61f04a6049
commit e952b4c9ee
1 changed file with 12 additions and 4 deletions
@@ -90,16 +90,24 @@ class NeutralizationService:
 
     _NEUT_INSTRUCTION = (
         "Analyze the following text and identify ALL sensitive content that must be neutralized:\n"
-        "1. Personal data (PII): names of persons, email addresses, phone numbers, "
-        "physical addresses, ID numbers, dates of birth, financial data (IBAN, account numbers), "
-        "social security numbers\n"
+        "1. Personal data (PII):\n"
+        " - Full names of persons\n"
+        " - Email addresses\n"
+        " - Phone numbers\n"
+        " - Physical addresses (street, city, postal code)\n"
+        " - ID numbers (passport, driver license, AHV/SSN)\n"
+        " - Dates of birth (e.g. '14.03.1982', '1982-03-14', 'March 14, 1982', 'born in 1982')\n"
+        " - Age when it identifies a person\n"
+        " - Financial data (IBAN, account numbers, salary, balances)\n"
+        " - Nationality, citizenship, place of origin\n"
         "2. Protected business logic: proprietary algorithms, trade secrets, confidential "
         "processes, internal procedures, code snippets that reveal implementation details\n"
         "3. Named entities: company names, product names, project names, brand names\n\n"
         "Return ONLY a JSON array (no markdown, no explanation):\n"
-        '[{"text":"exact substring","type":"name|email|phone|address|id|financial|logic|company|product|location|other"}]\n\n'
+        '[{"text":"exact substring","type":"name|email|phone|address|id|dob|financial|nationality|logic|company|product|location|other"}]\n\n'
         "Rules:\n"
         "- Every entry's 'text' must be an exact, verbatim substring of the input.\n"
+        "- Dates of birth MUST always be captured — use type 'dob'.\n"
         "- Do NOT include generic words, common language constructs or non-sensitive terms.\n"
         "- If nothing is sensitive, return [].\n\n"
     )
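
The prompt asks the model for a bare JSON array and requires each 'text' value to be a verbatim substring of the input. The commit does not show how the reply is consumed, so the following is only a minimal sketch of how a caller might enforce those rules. The helper name `parse_findings`, the `ALLOWED_TYPES` constant, and the fallback to 'other' for unknown labels are assumptions for illustration and are not taken from NeutralizationService.

```python
import json

# Type labels copied from the JSON schema line in the updated prompt.
ALLOWED_TYPES = frozenset(
    "name|email|phone|address|id|dob|financial|nationality|"
    "logic|company|product|location|other".split("|")
)


def parse_findings(raw_reply: str, source_text: str) -> list[dict[str, str]]:
    """Hypothetical helper: keep only entries that obey the prompt's rules."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []  # reply was not bare JSON (e.g. wrapped in markdown)
    if not isinstance(data, list):
        return []

    findings = []
    for item in data:
        if not isinstance(item, dict):
            continue
        text, kind = item.get("text"), item.get("type")
        if not isinstance(text, str) or not text:
            continue
        if text not in source_text:
            continue  # violates the "exact, verbatim substring" rule
        if not isinstance(kind, str) or kind not in ALLOWED_TYPES:
            kind = "other"  # assumption: unknown labels fall back to 'other'
        findings.append({"text": text, "type": kind})
    return findings


# Example input: the third entry is dropped because "Bob" is not in the source.
source = "Anna Muster, born 14.03.1982, works at Acme."
reply = (
    '[{"text":"Anna Muster","type":"name"},'
    '{"text":"14.03.1982","type":"dob"},'
    '{"text":"Bob","type":"name"}]'
)
findings = parse_findings(reply, source)
```

Filtering on the substring rule on the caller's side means a hallucinated span is dropped rather than searched for later, and returning an empty list on malformed JSON matches the prompt's own "return []" convention for the nothing-sensitive case.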