Dictionaries can be used as an additional criterion that must be satisfied before a location is considered to contain identity matches. Using dictionaries is especially helpful when searching for PHI (Protected Health Information) and other health care-related information, PCI (Payment Card Industry) data, and other types of information that require multiple criteria to be met.
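To make the idea of an additional dependency concrete, here is a minimal conceptual sketch in Python. It is not how the product implements dictionaries. The SSN-style pattern, the function name `has_identity_match`, and the whole-word, case-insensitive comparison are all assumptions chosen for illustration.

```python
import re

# Assumption: a US SSN-like number stands in for "an identity pattern".
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def has_identity_match(text, dictionary_terms):
    """Return True only when the text contains both an identity pattern
    and at least one dictionary term.

    Hypothetical helper: terms are compared as single whole words,
    ignoring case. Multi-word terms are not handled in this sketch.
    """
    # First criterion: the identity pattern must be present.
    if not SSN_PATTERN.search(text):
        return False
    # Second criterion: at least one dictionary term must also be present.
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return any(term.lower() in words for term in dictionary_terms)
```

With a dictionary such as `{"asthma", "diabetes"}`, a document containing only a number that looks like an identity would be skipped, while one that also mentions a dictionary term would be reported.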
Dictionaries are easy to create and configure, and detailed instructions are provided in the online help.
Some sample dictionaries are attached to this article and may be used as a starting point. They are not intended to be used without review, are not comprehensive, and may not meet your particular needs.
- ICD9large - Contains many common ICD9 keywords, excluding a few that generate too many false positive hits (such as ‘green’). This dictionary is very large and can slow down your search. It also still contains some generic words from the ICD9 list, so it can produce false positives.
- names100, names400 - Contain the most common names typically found in documents with lists of names. One is large and one is small; the small one gives higher performance but fewer results.
For the best results, create custom dictionaries from your specific data sets. As you preview your results and find false positives, you can refine your dictionary by editing it in any text editor.
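The refinement step described above can also be scripted once you have a list of terms that produced false positives. The following is a rough sketch, assuming the dictionary is a plain-text file with one term per line. That file layout and the helper name `prune_dictionary` are assumptions, so check the online help for the format your version expects.

```python
def prune_dictionary(terms, false_positives):
    """Return the dictionary terms with known false positives removed.

    Hypothetical helper: comparison ignores case and surrounding
    whitespace; blank lines and duplicate terms are dropped, and the
    original order of the remaining terms is preserved.
    """
    blocked = {t.strip().lower() for t in false_positives}
    seen = set()
    kept = []
    for term in terms:
        cleaned = term.strip()
        key = cleaned.lower()
        # Skip blank lines, blocked terms, and repeats.
        if not cleaned or key in blocked or key in seen:
            continue
        seen.add(key)
        kept.append(cleaned)
    return kept
```

If the one-term-per-line assumption holds, the input list could come from `Path("mydictionary.txt").read_text().splitlines()` (the file name is a placeholder), and the result could be written back with `"\n".join(...)`. Keep a copy of the original file so that removed terms can be restored.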
To use dictionaries in a Policy on the Console, add them to a Sensitive Data Definition as explained in the following user guide: http://www.identityfinder.com/help/EnterpriseConsole/index.htm#3588.htm