Personal Data in Documents

When you upload a document to Ask.School, it is automatically scanned for personal data before being added to your chatbot’s knowledge base. This helps prevent sensitive information — such as student names or NHS numbers — from being included in chatbot responses.

How It Works

Every document goes through a personal data check during processing:

1. You upload a document (PDF, DOCX, TXT, etc.)
2. Ask.School extracts the text content
3. An AI-powered scanner checks the text for personal data entities
4. If personal data is found, the document is flagged instead of being added to the knowledge base
5. If no personal data is found, the document is marked as Ready and added normally

Flagged documents are not used by chatbots until you review and approve them.
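
The processing steps above can be pictured as a small pipeline. The following is a minimal Python sketch, not Ask.School’s actual implementation: the names `Document`, `scan_for_personal_data`, and `process_upload` are hypothetical, and a simple regular expression stands in for the AI-powered scanner.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stand-in for the AI-powered scanner: matches 10-digit
# numbers written in the 3-3-4 grouping commonly used for NHS numbers.
NHS_LIKE = re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b")


@dataclass
class Document:
    name: str
    text: str                    # text content already extracted from the file
    status: str = "processing"   # becomes "ready" or "flagged"
    detections: list = field(default_factory=list)


def scan_for_personal_data(text: str) -> list:
    """Return the suspected personal data entities found in the text."""
    return NHS_LIKE.findall(text)


def process_upload(doc: Document) -> Document:
    """Flag the document if anything is detected, otherwise mark it ready."""
    doc.detections = scan_for_personal_data(doc.text)
    doc.status = "flagged" if doc.detections else "ready"
    return doc


policy = process_upload(Document("uniform-policy.txt", "Blazers must be worn."))
medical = process_upload(Document("allergies.txt", "NHS number: 943 476 5919"))
print(policy.status)   # ready
print(medical.status)  # flagged
```

In this sketch a flagged document keeps its list of detections, which is what a reviewer would need to see before deciding whether to approve it.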

What Gets Detected

The personal data scanner looks for:

Entity Type	Examples
Person names	Student names, staff names, parent names
NHS numbers	UK National Health Service numbers

The scanner is designed to catch common personal data in school documents. It uses AI-powered entity recognition, so it may occasionally flag general references (like historical figures or place names) as personal data. Review flagged documents to confirm whether the detection is accurate.
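
NHS numbers are easier to verify than names, because the tenth digit of an NHS number is a Modulus 11 check digit calculated from the first nine. The sketch below shows that check in Python. It is an illustration only: this article does not say whether Ask.School’s scanner applies it, and the function name `is_valid_nhs_number` is hypothetical.

```python
def is_valid_nhs_number(candidate: str) -> bool:
    """Check a 10-digit NHS number against its Modulus 11 check digit."""
    cleaned = candidate.replace(" ", "").replace("-", "")
    if len(cleaned) != 10 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    digits = [int(ch) for ch in cleaned]
    # Weight the first nine digits 10, 9, ..., 2 and sum the products.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number cannot be valid.
    return check != 10 and check == digits[9]


print(is_valid_nhs_number("943 476 5919"))  # True
print(is_valid_nhs_number("943 476 5918"))  # False
```

A check like this reduces false positives from arbitrary 10-digit numbers such as order references, but it cannot confirm that a number has actually been issued to a patient.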

What a Flagged Document Looks Like

When a document is flagged, it appears in your Documents list with:

An orange “Flagged” badge instead of the usual green “Ready” badge
A warning message: “Contains personal data — review before adding to knowledge base”
An orange-highlighted background row
An Approve button to manually approve the document after review

Flagged documents also appear under the Flagged tab in the document status filters, making them easy to find.

Reviewing a Flagged Document

When you see a flagged document:

1. Download the document using the download button to review its contents.
2. Check what was detected. Consider whether the data is genuinely sensitive (e.g. a list of student names), acceptable to include (e.g. a headteacher’s name in a welcome letter), or a false positive (e.g. a historical figure or place name).
3. Decide what to do:
- Approve the document — Click Approve if the personal data is acceptable (e.g. the headteacher’s published name, or publicly available staff contact details). The document will be added to the knowledge base and chatbots can use it.
- Remove and re-upload — If the document contains genuinely sensitive data (e.g. student names, medical information), remove the personal data from the original file and upload a cleaned version.
- Delete the document — If the document shouldn’t be in the system at all, delete it.

Important: Approving a flagged document means the chatbot may include information from that document in its responses. Make sure you are comfortable with everything in the document being potentially surfaced to users.

Common Scenarios

Documents that are usually safe to approve

Welcome letters signed by the headteacher — the head’s name is public information
Staff directories with published contact details — already on the school website
Prospectus mentioning named department heads — intentionally public
Policy documents referencing the DSL by name — this is a statutory requirement

Documents that should be redacted first

Behaviour reports mentioning students by name
SEN/EHCP documents with student details
Medical or allergy lists with student names
Safeguarding case notes or referrals
Staff HR documents with personal details
Parent contact lists with phone numbers or addresses

Documents that work well without personal data

Generic policies (uniform, behaviour, admissions) — rarely contain names
Term dates and timetables — no personal data
Curriculum guides — subject content only
FAQs — question-and-answer format, no names needed

PII Whitelist

If your school regularly uploads documents that contain the same names (e.g. the headteacher’s name in every policy), Ask.School maintains a PII whitelist that can be configured to allow those terms automatically. Your school’s own profile information (school name, contact details) is whitelisted by default, so it won’t trigger false flags.

Contact support if you need help configuring your whitelist.
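
Conceptually, a whitelist removes allowed terms from the scanner’s detections before the flagging decision is made. The Python sketch below illustrates the idea under assumptions: the function `filter_whitelisted`, the case-insensitive matching rule, and the example names are all hypothetical and do not describe Ask.School’s actual configuration format.

```python
def filter_whitelisted(detections: list[str], whitelist: set[str]) -> list[str]:
    """Drop detections that match a whitelisted term, ignoring case and spacing."""
    def normalise(term: str) -> str:
        # Lower-case and collapse runs of whitespace to single spaces.
        return " ".join(term.lower().split())

    allowed = {normalise(term) for term in whitelist}
    return [d for d in detections if normalise(d) not in allowed]


# Hypothetical whitelist: the school name plus a published staff name.
whitelist = {"Oakfield Primary School", "Mrs A. Example"}
detections = ["mrs  a. example", "Sam Sample"]

remaining = filter_whitelisted(detections, whitelist)
print(remaining)  # ['Sam Sample']
```

In this sketch a document would be flagged only if `remaining` is non-empty, so a policy that mentions nobody but the whitelisted headteacher would go straight to Ready.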

Chatbot Conversation Protection

Personal data protection isn’t limited to documents. Ask.School also monitors chatbot conversations in real time:

Input scanning: When a user sends a message containing sensitive data (credit card numbers, NHS numbers, etc.), the system can detect and handle it appropriately
Output scanning: Before the chatbot sends a response, the response is checked to ensure it doesn’t expose sensitive data such as credit card numbers, NHS numbers, or other protected information

This works alongside Safeguarding Alerts to provide comprehensive protection across your chatbot platform.
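
To illustrate what scanning a message can involve, the Python sketch below redacts card-like numbers from a piece of text. It assumes a common general technique (a regular expression plus the Luhn checksum) and is not a description of how Ask.School implements input or output scanning. The names `passes_luhn` and `redact_card_numbers` and the `[REDACTED]` placeholder are hypothetical.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by
# single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def passes_luhn(number: str) -> bool:
    """Return True if the digits in the string satisfy the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def redact_card_numbers(message: str) -> str:
    """Replace Luhn-valid card-like numbers with a placeholder."""
    def replace(match: re.Match) -> str:
        return "[REDACTED]" if passes_luhn(match.group()) else match.group()

    return CARD_CANDIDATE.sub(replace, message)


print(redact_card_numbers("My card is 4111 1111 1111 1111, thanks."))
# My card is [REDACTED], thanks.
```

The same function could be applied to a user’s message before it is processed and to the chatbot’s reply before it is sent, which mirrors the input and output scanning described above.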

Good to Know

Personal data scanning happens automatically — you don’t need to enable it
The scanner runs on every new document upload, including documents attached to School Knowledge topics
Flagged documents can still be downloaded and viewed by administrators — they are just prevented from being used by chatbots until approved
The scanner uses AI-powered entity recognition, which is highly accurate but not infallible, so always review flagged documents manually
All detected personal data entities are logged, and the logs are encrypted at rest

Next Steps

Documents — Upload and manage school documents
Safeguarding Alerts — How conversation monitoring works
Security & 2FA — Account security settings