
Incognito is a Python module for anonymizing French text. It uses Regex and other strategies to mask names and personal information provided by the user.
This module was specifically designed for medical reports, ensuring that disease names remain unaltered.
Note: the documentation is not fully up to date 🙁
To install from PyPI:
pip install incognito-anonymizer
To install from source:
- Clone the repository:
git clone https://github.com/Micropot/incognito
- Install the dependencies (defined in pyproject.toml):
pip install .
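To anonymize a text with personal information defined in code:
# Relative import: valid only from inside the package itself.
# From an external script, import the installed package instead.
from . import anonymizer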
# Initialize the anonymizer
ano = anonymizer.Anonymizer()
# Define personal information
infos = {
"first_name": "Bob",
"last_name": "Jungels",
"birth_name": "",
"birthdate": "1992-09-22",
"ipp": "0987654321",
"postal_code": "01000",
"adress": ""
}
# Configure the anonymizer
ano.set_info(infos)
ano.set_strategies(['regex', 'pii'])
ano.set_masks('placeholder')
# Read and anonymize text
text_to_anonymize = ano.open_text_file("/path/to/file.txt")
anonymized_text = ano.anonymize(text_to_anonymize)
print(anonymized_text)
To anonymize a text with personal information loaded from a JSON file:
from . import anonymizer
# Initialize the anonymizer
ano = anonymizer.Anonymizer()
# Load personal information from JSON
infos_json = ano.open_json_file("/path/to/infofile.json")
# Configure the anonymizer
ano.set_info(infos_json)
ano.set_strategies(['regex', 'pii'])
ano.set_masks('placeholder')
# Read and anonymize text
text_to_anonymize = ano.open_text_file("/path/to/file.txt")
anonymized_text = ano.anonymize(text_to_anonymize)
print(anonymized_text)
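The layout of the JSON file read by open_json_file is not shown here. As a minimal sketch, assuming the file holds the same flat keys as the infos dictionary from the first example (an assumption, not something this README confirms), such a file could be produced with the standard library. The helper name write_info_file is hypothetical and is not part of Incognito.

```python
import json
from pathlib import Path

# Keys copied from the `infos` dictionary in the first example.
# Assumption: the JSON file uses the same flat key/value layout.
infos = {
    "first_name": "Bob",
    "last_name": "Jungels",
    "birth_name": "",
    "birthdate": "1992-09-22",
    "ipp": "0987654321",
    "postal_code": "01000",
    "adress": "",  # key spelled exactly as in the example
}


def write_info_file(path, data):
    """Write the personal-information dictionary to `path` as UTF-8 JSON.

    Hypothetical helper for illustration; not provided by Incognito.
    """
    path = Path(path)
    # ensure_ascii=False keeps accented French characters readable in the file.
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Show the JSON that would be written, without touching the disk.
    print(json.dumps(infos, ensure_ascii=False, indent=2))
```

Calling write_info_file("/path/to/infofile.json", infos) would then create a file that can be passed to open_json_file, provided the assumed layout matches what the module expects.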
To anonymize a file from the command line:
python -m incognito --input myinputfile.txt --output myanonymizedfile.txt --strategies mystrategies --mask mymasks
To view the general help options:
python -m incognito --help
To provide the personal information from a JSON file, use the json submodule:
python -m incognito --input myinputfile.txt --output myanonymizedfile.txt --strategies mystrategies --mask mymasks json --json myjsonfile.json
To view the help options for the json submodule:
python -m incognito json --help
To provide the personal information directly on the command line, use the infos submodule:
python -m incognito --input myinputfile.txt --output myanonymizedfile.txt --strategies mystrategies --mask mymasks infos --first_name Bob --last_name Dylan --birthdate 1800-01-01 --ipp 0987654312 --postal_code 75001
To view the help options for the infos submodule:
python -m incognito infos --help
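When many files have to be processed, the infos command line above can be assembled from Python. The sketch below only builds the argument list, using the flags shown in this README. The function name build_infos_command is hypothetical, and the values "regex" and "placeholder" are borrowed from the Python example on the assumption that the command line accepts the same names.

```python
import shlex
import sys


def build_infos_command(input_path, output_path, strategy, mask, infos):
    """Return the argument list for the `infos` submodule invocation.

    Hypothetical helper: it mirrors the command shown in this README
    and does not call Incognito itself.
    """
    cmd = [
        sys.executable, "-m", "incognito",
        "--input", str(input_path),
        "--output", str(output_path),
        "--strategies", strategy,
        "--mask", mask,
        "infos",  # global options come first, then the submodule name
    ]
    # Each non-empty entry becomes a `--key value` pair after the submodule.
    for key, value in infos.items():
        if value:
            cmd += [f"--{key}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_infos_command(
        "myinputfile.txt",
        "myanonymizedfile.txt",
        "regex",        # assumed to be a valid CLI strategy name
        "placeholder",  # assumed to be a valid CLI mask name
        {"first_name": "Bob", "last_name": "Dylan", "birthdate": "1800-01-01",
         "ipp": "0987654312", "postal_code": "75001"},
    )
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell quoting problems when a name or path contains spaces.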
Unit tests are included to verify the module’s behavior. You can adapt them to your needs.
To run the tests:
make test
To check code coverage:
make cov
One available anonymization strategy is Regex. It can extract and mask specific information from the input text, such as:
- Email addresses
- Phone numbers
- French NIR (social security number)
- First and last names (if preceded by titles like “Monsieur”, “Madame”, “Mr”, “Mme”, “Docteur”, “Professeur”, etc.)
For more details, see the RegexStrategy class and the self.title_regex variable.
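To illustrate the idea, here is a minimal, self-contained sketch of regex-based masking with Python's re module. The patterns, the placeholder strings and the function mask_text are simplified assumptions made for this example. They are not the patterns used by RegexStrategy and cover far fewer cases (for instance, the NIR pattern ignores Corsican department codes).

```python
import re

# Illustrative patterns only: simplified stand-ins, NOT those of RegexStrategy.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# 15-digit NIR, with optional spaces between its groups.
NIR_RE = re.compile(r"\b[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}\b")
# 10-digit French phone number starting with 0, pairs separated by space, dot or dash.
PHONE_RE = re.compile(r"\b0[1-9](?:[ .-]?\d{2}){4}\b")
# A title followed by one or two capitalized words.
TITLE_NAME_RE = re.compile(
    r"\b(Monsieur|Madame|Mr|Mme|Docteur|Professeur)\.?\s+"
    r"([A-ZÀ-Ý][\w'-]+(?:\s+[A-ZÀ-Ý][\w'-]+)?)"
)


def mask_text(text):
    """Replace emails, NIRs, phone numbers and titled names with placeholders."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = NIR_RE.sub("<NIR>", text)  # before phones, so digit groups are not split
    text = PHONE_RE.sub("<PHONE>", text)
    # Keep the title and mask only the name that follows it.
    text = TITLE_NAME_RE.sub(lambda m: f"{m.group(1)} <NAME>", text)
    return text


if __name__ == "__main__":
    sample = (
        "Madame Dupont a consulté le Docteur Jean Martin. "
        "NIR : 1 85 05 78 006 084 36. "
        "Contact : jean.martin@example.org, 06 12 34 56 78."
    )
    print(mask_text(sample))
```

Names are masked only when a title precedes them, which is why the disease names and other capitalized medical terms in a report are left untouched by this kind of rule.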
The documentation is available here.
This project is licensed under the terms of the MIT License.
- Maintainer: Micropot
Feel free to open issues or contribute via pull requests!
