Lighthouse AI for PII
Effectively Comply with Data Privacy Regulations
Protect consumer data in documents involved in litigation, investigations, data breach responses, and more with enhanced PII identification. Early and accurate detection of personally identifiable information and protected health information (PII/PHI) enables a streamlined, targeted review workflow while mitigating the risk of inadvertent production. Lighthouse PII workflows are powered by AI for the most robust PII detection in eDiscovery.
PII and PHI Detection Powered by Lighthouse AI
Types of PII Automatically Detected
Recall and Precision for a PHI Model Used in an Investigation
Finding the PII/PHI hidden within datasets is only getting harder as data volumes grow. The protection of PII/PHI is often required, and regulations covering the transmission and production of PII/PHI are only increasing.
Regular expressions (regex) alone lack the nuance to accurately identify PII, leading to overly broad screens and missed personal information.
Lighthouse sensitive information detection gives you the speed and confidence you need when handling regulatory and data privacy responses, cybersecurity incidents, investigations, and litigation.
Built with AI and linguistics for nuanced analysis, our solution was created with your needs at the forefront:
Rapidly assess datasets with pre-built, battle-tested PII models
Rely on comprehensive PII detection technology built with AI and linguistics
Reduce and support downstream review with predictive PII scores and other insights
Extract PII excerpts ready for automated redaction workflows within Relativity
Get in touch to learn more about sensitive data identification
Lighthouse AI Automatically Detects Over 50 Types of PII
Don’t see your PII type on the list? We can build it.
Our on-staff data scientists and linguists will work with you to create a custom model that can be used across your matters.
Identity Information
- Social Security number (SSN)
- National identifiers
- Passport numbers
- Driver's license or state ID
- Date of birth
- Phone number
- Address
- Organization names
Common Forms and Documents
- HR-related information, resumes
- Standard Form 86
- General forms and contact information
Medical Information
- Patient name
- Patient ID
- Patient account/billing information
- Medical record numbers
- Explanation of benefits
- Medical history
- Insurance information
Personal Record Information
- Criminal conduct information
- Student information
- Customer information
Financial Information
- Credit card numbers
- IRS forms
- Taxpayer identification number (TIN)
- Account and financial information
- SWIFT codes
- IBANs
Security
- Passwords
- IP address
- Security questions
- Azure account keys
- Connection strings
What Our Pharma Clients Are Saying
Unprecedented PII Detection Efficiency
Reuse PII Decisions with Lighthouse
PII coding and redaction from past matters can be copied to normalized duplicate documents.
Document reviewers can be supported by historical PII coding insights within Relativity.
PII coding from prior matters can be used to train PII classifiers powered by Lighthouse AI.
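One way to picture reusing coding on normalized duplicates is a lookup keyed on a hash of normalized text. The sketch below is an illustration only and is not Lighthouse's implementation: the normalization rule (lowercasing and collapsing whitespace), the function names, the document IDs, and the sample text are all assumptions made up for the example.

```python
import hashlib
import re


def normalized_hash(text: str) -> str:
    """Hash text after collapsing whitespace and lowercasing (assumed normalization)."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def reuse_coding(new_docs: dict, prior_coding: dict) -> dict:
    """Copy a prior coding decision to every new document whose normalized hash matches."""
    reused = {}
    for doc_id, text in new_docs.items():
        key = normalized_hash(text)
        if key in prior_coding:
            reused[doc_id] = prior_coding[key]
    return reused


# Hypothetical coding decision carried over from a past matter.
prior_coding = {normalized_hash("Patient: Jane Doe\nMRN 00123"): "Contains PHI"}

# Hypothetical documents in a new matter; DOC-1 differs only in case and spacing.
new_docs = {
    "DOC-1": "patient:  jane doe mrn 00123",
    "DOC-2": "Quarterly sales report",
}

reused = reuse_coding(new_docs, prior_coding)
print(reused)
```

Only DOC-1 inherits the earlier decision here; DOC-2 has no match and would go through normal review.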
Identifying PHI for a Fast-Paced Government Investigation
As part of a government investigation, a healthcare client had 3.8M documents to collect, review, and produce in only two months. Their data was not only large, it was complex due to low richness and the amount of PHI. Working with Lighthouse, the client was able to fulfill their requirements on time and with peace of mind.
Using a PHI model trained on the company’s past matters, with measured recall and precision of 90%, the team reduced PHI review to 20K documents. Eyes-on review revealed that 80% of these were redacted for PHI.
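For readers less familiar with the two metrics: recall is the share of documents that truly contain PHI that the model flags, and precision is the share of flagged documents that truly contain PHI. A minimal sketch of the arithmetic follows. The counts are hypothetical, chosen only so that both metrics come out at 90%; they are not figures from this investigation.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple:
    """Return (precision, recall) from counts on a validation sample."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical validation sample: 90 PHI documents correctly flagged,
# 10 documents flagged in error, 10 PHI documents the model missed.
precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=10)
print(f"precision={precision:.0%}, recall={recall:.0%}")
```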
Extracting and Linking PII in Scanned Documents
A financial institution had 300K documents to review to respond to a data breach. However, many of these were scanned documents. Lighthouse processed these documents and some non-image documents as images and developed a prompt to use Lighthouse AI for analysis. This method preserved the link between the PII and the PII owner, enabling the team to quickly extract and classify elements within the documents, delivering results as Relativity objects.
Case Studies
Global Law Firm Cuts 3M Documents to 440K, Achieving HSR Second Request Compliance in 11 Weeks
FAQs
Why are traditional methods like regular expressions (regex) insufficient for identifying sensitive information?
Regex-based methods often lack the sophistication to accurately detect personally identifiable information (PII) and protected health information (PHI). Patterns tend to be either too broad, generating too many false positives, or too narrow, missing critical personal information. This lack of precision can lead to significant risks in data privacy, regulatory compliance, and potential legal exposure for organizations dealing with sensitive datasets.
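A small illustration of both failure modes, using a deliberately naive Social Security number pattern. The pattern and the sample strings are assumptions written for this example; they are not taken from any Lighthouse model or client data.

```python
import re

# Naive pattern (illustrative): nine digits grouped 3-2-4, separators optional.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")

samples = [
    "Employee SSN: 123-45-6789",     # real SSN format: matched
    "Order tracking no. 987654321",  # not PII, but nine digits: false positive
    "SSN is one two three 45 6789",  # SSN partly spelled out: missed
]


def regex_flags(text: str) -> bool:
    """Return True if the naive pattern finds an SSN-like string."""
    return SSN_PATTERN.search(text) is not None


results = [regex_flags(s) for s in samples]
for sample, flagged in zip(samples, results):
    print(f"{flagged!s:5}  {sample}")
```

The pattern has no notion of context, so it cannot tell a tracking number from an SSN, and it has no way to recognize an identifier that does not follow the expected digit layout.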
How does Lighthouse’s PII detection differ from traditional approaches?
Lighthouse provides a more advanced, nuanced approach to identifying PII and PHI. Unlike basic regex methods, our solution offers enhanced speed and confidence when using eDiscovery for regulatory responses, data privacy management, cybersecurity, investigations, and litigation. The technology goes beyond surface-level scanning, using sophisticated algorithms that are built for big data.
Get in Touch
Ready to see how Lighthouse can help you reduce risk and power eDiscovery efficiency? Fill out the form to connect with our team.