
Lighthouse AI for PII

Effectively Comply with Data Privacy Regulations

Protect consumer data in documents involved in litigation, investigations, data breach responses, and more with enhanced PII identification. Early and accurate detection of personally identifiable information and protected health information (PII/PHI) enables a streamlined and targeted review workflow while mitigating the risk of inadvertent production. Lighthouse PII workflows are powered by AI for the most robust PII detection in eDiscovery.

PII and PHI Detection Powered by Lighthouse AI

50

Types of PII Automatically Detected

90%

Recall and Precision for a PHI Model Used in an Investigation

Finding the PII/PHI hidden within datasets is only getting harder as data volumes grow. The protection of PII/PHI is often required, and regulations covering the transmission and production of PII/PHI are only increasing.  

Regular expressions (regex) alone lack the nuance to accurately identify PII, leading to overly broad screens and missed personal information.

Lighthouse sensitive information detection gives you the speed and confidence you need when responding to regulatory and data privacy requests, cybersecurity incidents, investigations, and litigation.

Built with AI and linguistics for nuanced analysis, our solution was created with your needs at the forefront:

Rapidly assess datasets with pre-built, battle-tested PII models

Rely on comprehensive PII detection technology built with AI and linguistics

Reduce and support downstream review with predictive PII scores and other insights

Extract PII excerpts ready for automated redaction workflows within Relativity

Get in touch to learn more about sensitive data identification

Lighthouse AI Automatically Detects Over 50 Types of PII

Don’t see your PII type on the list? We can build it.  

Our on-staff data scientists and linguists will work with you to create a custom model that can be used across your matters.  

Identity Information

  • Social Security number (SSN)
  • National identifiers
  • Passport numbers
  • Driver's license or state ID
  • Date of birth
  • Phone number
  • Address
  • Email
  • Organization names

Common Forms and Documents

  • HR-related information, resumes
  • Standard Form 86
  • General forms and contact information

Medical Information

  • Patient name
  • Patient ID
  • Patient account/billing information
  • Medical record numbers
  • Explanation of benefits
  • Medical history
  • Insurance information

Personal Record Information

  • Criminal conduct information
  • Student information
  • Customer information

Financial Information

  • Credit card numbers
  • IRS forms
  • Taxpayer identification number (TIN)
  • Account and financial information
  • SWIFT codes
  • IBANs

Security

  • Passwords
  • IP address
  • Security questions
  • Azure account keys
  • Connection strings
Connect with an expert

Unprecedented PII Detection Efficiency  

Reuse PII Decisions with Lighthouse

PII coding and redaction from past matters can be copied to normalized duplicate documents.

Document reviewers can be supported by historical PII coding insights within Relativity.

PII coding from prior matters can be used to train PII classifiers powered by Lighthouse AI.
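One way to picture copying prior coding to normalized duplicates is a fingerprint lookup. The following is a minimal sketch only, not a description of how Lighthouse implements it: the normalization rule (lowercasing and collapsing whitespace), the function names, and the sample documents are all assumptions made for illustration.

```python
import hashlib
import re


def normalize(text):
    # Assumed normalization: collapse whitespace runs and lowercase the text.
    return re.sub(r"\s+", " ", text).strip().lower()


def fingerprint(text):
    # Hash the normalized text so formatting-only differences are ignored.
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def propagate(coded_docs, new_docs):
    """Copy prior coding onto new documents that are normalized duplicates.

    coded_docs: list of (text, coding) pairs from past matters (hypothetical).
    new_docs: dict mapping document ID to text for the current matter.
    """
    known = {fingerprint(text): coding for text, coding in coded_docs}
    result = {}
    for doc_id, text in new_docs.items():
        key = fingerprint(text)
        if key in known:
            result[doc_id] = known[key]
    return result


# Hypothetical sample data.
prior = [("Patient ID:  4471\nJane Doe", "contains_phi")]
current = {
    "DOC-1": "patient id: 4471 jane doe",
    "DOC-2": "Quarterly sales summary",
}
reused = propagate(prior, current)
print(reused)
```

In this sketch only DOC-1 inherits the earlier coding, because its text matches a previously coded document once casing and whitespace are ignored.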

Identifying PHI for a Fast-Paced Government Investigation

As part of a government investigation, a healthcare client had 3.8M documents to collect, review, and produce in only two months. Their data was not only large, it was complex due to low richness and the amount of PHI. Working with Lighthouse, the client was able to fulfill their requirements on time and with peace of mind.  

Using a PHI model that was trained on the company’s past matters and measured at 90% recall and precision, the team reduced PHI review to 20K documents. Eyes-on review revealed that 80% of these were redacted for PHI.
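For readers less familiar with the two metrics, recall is the share of documents that truly contain PHI that the model flags, and precision is the share of flagged documents that truly contain PHI. The short sketch below shows the arithmetic; the counts are hypothetical and chosen only to illustrate the formulas, not taken from the matter described above.

```python
def precision(true_positives, false_positives):
    # Of everything the model flagged, how much really contained PHI?
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    # Of everything that really contained PHI, how much did the model flag?
    return true_positives / (true_positives + false_negatives)


# Hypothetical validation sample (not real case data).
tp = 900  # flagged by the model and confirmed to contain PHI
fp = 100  # flagged by the model but no PHI found on review
fn = 100  # contained PHI but not flagged by the model

p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.0%} recall={r:.0%}")
```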


Extracting and Linking PII in Scanned Documents

A financial institution had 300K documents to review to respond to a data breach. However, many of these were scanned documents. Lighthouse processed these documents and some non-image documents as images and developed a prompt to use Lighthouse AI for analysis. This method preserved the link between the PII and the PII owner, enabling the team to quickly extract and classify elements within the documents, delivering results as Relativity objects.

Case Studies

Simplifying Complex Multi-District Document Review

Global Law Firm Cuts 3M Documents to 440K, Achieving HSR Second Request Compliance in 11 Weeks

FAQs

Why are traditional methods like regular expressions (regex) insufficient for identifying sensitive information?

Regex alone often lacks the sophistication to accurately detect personally identifiable information (PII) and protected health information (PHI). Regex-based screens tend to be imprecise: they can miss critical personal information or generate too many false positives. This lack of precision can lead to significant risks in data privacy, regulatory compliance, and potential legal exposure for organizations dealing with sensitive datasets.
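A small illustration of both failure modes, using an SSN-shaped pattern. This is a simplified sketch under stated assumptions: the pattern, the context keywords, and the sample sentences are hypothetical, and the keyword check stands in for the much richer signals an AI and linguistics based model would use. It is not a description of Lighthouse's technology.

```python
import re

# Naive screen: any 3-2-4 digit group is treated as a possible SSN.
SSN_SHAPED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical context keywords used only for this illustration.
CONTEXT_WORDS = ("ssn", "social security")


def naive_hits(text):
    # Return every SSN-shaped string, regardless of what surrounds it.
    return SSN_SHAPED.findall(text)


def context_hits(text, window=40):
    # Keep a match only if a context keyword appears shortly before it.
    hits = []
    lowered = text.lower()
    for match in SSN_SHAPED.finditer(text):
        before = lowered[max(0, match.start() - window):match.start()]
        if any(word in before for word in CONTEXT_WORDS):
            hits.append(match.group())
    return hits


samples = [
    "Employee SSN: 123-45-6789",              # true SSN mention
    "Part number 555-12-3456 ships Monday",   # same shape, not an SSN
    "Her social is one two three, 45, 6789",  # personal data with no matching shape
]

for sample in samples:
    print(sample, "|", naive_hits(sample), "|", context_hits(sample))
```

The naive pattern flags the part number (a false positive) and finds nothing in the third sentence (a miss). Adding context removes the false positive here but still cannot recover the miss, which is where pattern matching alone runs out.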

How does Lighthouse’s PII detection differ from traditional approaches?

Lighthouse provides a more advanced, nuanced approach to identifying PII and PHI. Unlike basic regex methods, our solution offers enhanced speed and confidence when using eDiscovery for regulatory responses, data privacy management, cybersecurity, investigations, and litigation. The technology goes beyond surface-level scanning, using sophisticated algorithms that are built for big data.

Get in Touch

Ready to see how Lighthouse can help you reduce risk and power eDiscovery efficiency? Fill out the form to connect with our team.
