(Updated) Purview | Data Loss Prevention: Enhanced content extraction and file type coverage for DLP on Windows devices

Message Information

Severity normal
Timeline
Start Date May 11, 2024
End Date April 28, 2025
Last Modified November 21, 2024
Services
Microsoft 365 suite
Category StayInformed

Message Details

Updated November 21, 2024: We have updated the rollout timeline below. Thank you for your patience.

Microsoft Purview: We are excited to announce upcoming enhancements to Microsoft Purview Data Loss Prevention (DLP). With the forthcoming update, the capability to scan, classify, and protect sensitive content on Windows endpoint devices will be significantly expanded. The number of supported file types will increase from approximately 40 to over 100, aligning endpoint coverage with other platforms like Exchange, SharePoint, and OneDrive. Additionally, this update will introduce several key enhancements, including:

  • Detection of labels from protected files (pfiles).
  • Identification of sensitive content within file metadata.
  • Recognition of sensitive information in PDF form fields.
  • Detection of sensitive information in files embedded inside Office files (for example, a .txt file inside a .pptx file).

This message is associated with Microsoft 365 Roadmap ID 171586

When this will happen:

Public Preview: We will begin rolling out late June (previously late July) 2024 and expect to complete by late October 2024 (previously early October).

General Availability Worldwide: We will begin rolling out early November 2024 (previously late October) and expect to complete by mid-March 2025 (previously late mid-February).

How this will affect your organization:

The upcoming update will enhance DLP’s content scanning on Windows devices. No changes to existing policies are required.

Summary of enhancements:

1. Enhanced file type coverage

The number of file types that can be scanned, classified, and protected on Windows endpoint devices will increase from approximately 40 to over 100.

This means that sensitive content in additional file types, such as BZ2 and EML, will also be scanned and protected by DLP policies.

2. Detect label in PFILE

The DLP condition “content contains sensitivity label” can now detect labels from protected files (pfiles). It can read labels not only from Office and PDF files, but also from any other file to which a MIP label with protection has been applied by applications such as the AIP client or Secude, which convert the file into a “pfile”.

Picture 1:

A txt file converted to .ptxt (PFile) after applying a label. This label can now be detected with this preview.

3. Scanning metadata

DLP can detect sensitive content in file metadata, such as custom properties in Office and PDF files.
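To make "custom properties" concrete: in Office Open XML files (DOCX, XLSX, PPTX), custom properties are stored in a part named docProps/custom.xml inside the ZIP package. The following is a minimal Python sketch of reading those name/value pairs so they could be passed to a content classifier. It only illustrates where such metadata lives and is not a description of how Purview DLP extracts it. The helper names (`read_custom_properties`, `make_sample_package`) and the `CustomerCard` property are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the custom-properties part of an Office Open XML package.
CUSTOM_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}"


def read_custom_properties(package: bytes) -> dict[str, str]:
    """Return the custom property name/value pairs from an OOXML package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        if "docProps/custom.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/custom.xml"))
    props = {}
    for prop in root.iter(f"{CUSTOM_NS}property"):
        # The value sits in a typed child element such as <vt:lpwstr>.
        props[prop.get("name", "")] = "".join(prop.itertext()).strip()
    return props


def make_sample_package() -> bytes:
    """Build a tiny stand-in package (hypothetical data, not a full DOCX)."""
    custom_xml = (
        b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        b'<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" '
        b'xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">'
        b'<property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="2" name="CustomerCard">'
        b"<vt:lpwstr>4111 1111 1111 1111</vt:lpwstr>"
        b"</property></Properties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("docProps/custom.xml", custom_xml)
    return buf.getvalue()


props = read_custom_properties(make_sample_package())
```

The point of the sketch is that the property value never appears in the visible document body, so scanning body text alone would miss it.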

Picture 2:

4. Scanning content embedded in Microsoft 365 Office files

If a file is embedded inside an Office file (Microsoft Word/Excel/PowerPoint), the content of the embedded file is also scanned. For example, if a DOCX file containing credit card numbers is inserted into an XLSX file, the content of both the XLSX file and the embedded DOCX file will be scanned, and the credit card numbers will be detected.
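The DOCX-inside-XLSX example can be illustrated with a short Python sketch. Office Open XML files are ZIP packages, and embedded documents are typically stored as parts under an embeddings folder (for example, xl/embeddings/). The sketch below walks a package recursively and looks for card-like numbers that pass the Luhn checksum. It is an assumption-laden illustration of the idea, not Purview's detection logic: the regular expression, the function names, and the sample data are all hypothetical, and embedded objects stored as OLE .bin parts are not handled.

```python
import io
import re
import zipfile

# 13-19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> list[str]:
    """Return digit strings that look like card numbers and pass Luhn."""
    found = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            found.append(digits)
    return found


def scan_package(data: bytes, path: str = "", max_depth: int = 3) -> dict[str, list[str]]:
    """Scan every part of a ZIP package, descending into embedded packages."""
    hits: dict[str, list[str]] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            member = zf.read(name)
            full = f"{path}/{name}" if path else name
            if max_depth > 0 and zipfile.is_zipfile(io.BytesIO(member)):
                hits.update(scan_package(member, full, max_depth - 1))
                continue
            cards = find_cards(member.decode("utf-8", errors="ignore"))
            if cards:
                hits[full] = cards
    return hits


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP from a name -> content mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Hypothetical stand-ins for a DOCX embedded in an XLSX (well-known test card number).
docx = make_zip({"word/document.xml": b"<w:t>Card: 4111 1111 1111 1111</w:t>"})
xlsx = make_zip({
    "xl/worksheets/sheet1.xml": b"<v>no sensitive data here</v>",
    "xl/embeddings/Microsoft_Word_Document.docx": docx,
})
result = scan_package(xlsx)
```

Here the only hit is reported under the embedded document's path, which is the case a scan limited to the outer file's own parts would miss.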

Picture 3:

5. Better scanning of PDF files

  • Ability to scan and detect sensitive content in PDF forms.
  • Ability to scan and detect sensitive content in permission-protected PDF files. A permission-protected PDF file can be opened and read without a password, but requires a password to edit or copy its content.

Picture 4:

What you need to do to prepare:

You do not need to make any changes to your existing policies. Your existing policies will automatically start scanning the additional content detailed above.

Additional Resources