What makes CordonData different from other enterprise AI search tools?

CordonData combines on-premise RAG, document-level permission enforcement, automated PII/NHI/secret redaction, and advanced OCR in a single self-hosted package. Your data never leaves your infrastructure.

Can CordonData run completely air-gapped?

Yes. CordonData is designed for air-gapped, offline deployments. Run the entire stack — OCR, embedding, vector search, LLM inference, and SSO — entirely within your secure network.

How does document-level security work in CordonData?

ACL metadata is stored alongside vector embeddings. At query time, user identity is cross-referenced against this metadata, dynamically filtering the vector search space so users only see documents they have permission to access.

What document formats and languages does CordonData support?

CordonData supports PDF, DOCX, PPTX, XLSX, PNG, JPEG, TIFF, HTML, Markdown, plain text, and email formats. OCR supports 100+ languages including CJK, Arabic, Cyrillic, and Indic scripts.

CordonData Security & Compliance | PII Detection, Audit Trail & Zero-Trust AI

Defence in Depth

Security at every layer

CordonData is designed with the assumption that no single security control is sufficient. Each layer independently prevents a class of threat — so a failure in one layer doesn't compromise the whole system.

Identity verified before any request is processed

SSO via Keycloak — no local passwords, no shared credentials

Access decisions made on live ACL state, not cached tokens

Source-of-truth permissions synced continuously from your systems

Sensitive content detected and flagged before it reaches the AI

PII, NHI, and secret patterns scanned automatically on ingest

Every action logged with who, what, when, and outcome

Immutable audit trail — required for GDPR, HIPAA, and internal reviews
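
The page does not say how the audit trail is made immutable. One common technique is hash chaining, where each entry commits to the hash of the entry before it, so any later edit breaks the chain. The sketch below illustrates that idea under stated assumptions: `AuditLog`, its field names, and the CSV column order are hypothetical and are not CordonData's actual schema.

```python
import csv
import hashlib
import io
import json
from dataclasses import dataclass, field

FIELDS = ["who", "what", "when", "outcome", "prev_hash", "hash"]


def _digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each entry stores the hash of the previous entry."""
    entries: list = field(default_factory=list)

    def append(self, who: str, what: str, when: str, outcome: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        record = {"who": who, "what": what, "when": when,
                  "outcome": outcome, "prev_hash": prev_hash}
        record["hash"] = _digest(record)  # hash covers everything except itself
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry fails the check."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True

    def to_csv(self) -> str:
        """Export the log as CSV text, one row per entry."""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(self.entries)
        return buf.getvalue()
```

Calling `verify()` after tampering with any stored entry returns `False`, which is what makes the log tamper-evident rather than merely append-only by convention.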

Layer 1 — Identity & Authentication

Keycloak SSO · MFA support · JWT token validation

GATE 1

Layer 2 — Authorisation & ACL

Role-based capabilities · Live permission sync · Per-document access

GATE 2

Layer 3 — Compliance Scanning

PII · NHI · Secrets · Configurable scan rules · Redaction

GATE 3

Layer 4 — Immutable Audit Log

Every query · Every access · Every admin action · CSV export

GATE 4

Request approved — compliant content delivered to authorised user
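
The four layers above can be read as a chain of independent checks in which any single gate can stop a request. The following is a minimal sketch of that control flow, not the product's implementation: `Request`, the gate functions, and the audit tuple format are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Request:
    user: Optional[str]            # None means identity could not be verified
    groups: Set[str]               # groups from the identity provider
    doc_acl: Set[str]              # groups allowed to read the requested document
    doc_has_open_findings: bool    # unresolved PII/NHI/secret findings
    audit: List[Tuple[Optional[str], str, str]] = field(default_factory=list)


def gate_identity(req: Request) -> bool:
    return req.user is not None


def gate_authorisation(req: Request) -> bool:
    return bool(req.groups & req.doc_acl)


def gate_compliance(req: Request) -> bool:
    return not req.doc_has_open_findings


GATES: List[Tuple[str, Callable[[Request], bool]]] = [
    ("identity", gate_identity),
    ("authorisation", gate_authorisation),
    ("compliance", gate_compliance),
]


def handle(req: Request) -> bool:
    """Run every gate in order; log the outcome whether it passes or fails."""
    for name, gate in GATES:
        if not gate(req):
            req.audit.append((req.user, name, "denied"))
            return False
    req.audit.append((req.user, "deliver", "approved"))
    return True
```

Each gate reads only the request and returns a boolean, so a bug in one gate cannot switch off another. The audit entry is written on both the approve and deny paths, mirroring Layer 4.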

Detection Coverage

What CordonData scans for

Three independent scanning layers detect sensitive content before it enters the AI pipeline — and flag it for remediation.

PII

Personally Identifiable Information

22 types

Person name · Email address · Phone number · Postal address · Date of birth · Passport number · Driver licence · National ID · Tax ID · Bank account · Credit card · IBAN · SSN / NHS · Medical condition · IP address · Biometric ID · + 6 more

NHI

Non-Human Identities & Workload Creds

13 types

AWS IAM Role ARN · AWS STS assumed role · Azure managed identity · Azure service principal · GCP service account · GCP workload identity · GitHub Actions OIDC · OIDC workload subject · SPIFFE ID · K8s service account · + 3 more

Secrets

Credential Material & API Keys

35+ types

AWS access keys · Azure storage key · Google API key · GitHub PAT · GitLab PAT · Docker PAT · Slack token · JWT tokens · SSH private key · Stripe keys · Twilio API key · DB connection strings · Private key blocks · High-entropy secrets · + 21 more
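
As a rough illustration of how a few of these secret types can be detected, the sketch below pairs fixed-format regular expressions with a Shannon-entropy check for otherwise unlabelled random strings. The patterns follow the publicly documented shapes of AWS access key IDs, classic GitHub personal access tokens, and PEM private key headers. The 20-character minimum and the 4.0 bits-per-character threshold are assumptions chosen for the example, not CordonData's detection rules.

```python
import math
import re
from collections import Counter

PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}
TOKEN = re.compile(r"[A-Za-z0-9+/_\-]{20,}")  # candidate strings for the entropy check
ENTROPY_THRESHOLD = 4.0                        # bits per character (assumed cut-off)


def shannon_entropy(s: str) -> float:
    """Average information per character, in bits."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def find_secrets(text: str) -> list:
    """Return (kind, matched_text) pairs for every suspected secret."""
    findings, covered = [], []
    for kind, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            findings.append((kind, m.group(0)))
            covered.append(m.span())
    for m in TOKEN.finditer(text):
        # Skip tokens already reported by a more specific pattern.
        overlaps = any(m.start() < end and start < m.end() for start, end in covered)
        if not overlaps and shannon_entropy(m.group(0)) >= ENTROPY_THRESHOLD:
            findings.append(("high_entropy", m.group(0)))
    return findings
```

Specific patterns run first so that a recognised key is reported once under its own type, and the entropy pass only catches what the named patterns missed.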

Burn (destructive redact)

Permanently remove sensitive content from the document and purge it from the AI index. Irreversible. Audit logged.

Mark Safe

Acknowledge a finding as intentional or acceptable. Removes the compliance block so the document re-enters the AI pipeline.

Re-scan

Trigger a fresh scan with the latest detection rules. Automatically runs when the scanner taxonomy is updated.
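
The three remediation actions can be modelled as operations on a document's set of open findings. This is a simplified sketch under assumptions: `Document`, `email_scanner`, and the rule that a re-scan leaves previously accepted findings cleared are all hypothetical and only illustrate the behaviour described above.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

REDACTION = "[REDACTED]"


def email_scanner(text: str) -> Set[str]:
    """Stand-in detector: returns every email-like string in the text."""
    return set(re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text))


@dataclass
class Document:
    text: str
    open_findings: Set[str] = field(default_factory=set)  # awaiting a decision
    marked_safe: Set[str] = field(default_factory=set)    # accepted by a reviewer
    audit: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def indexable(self) -> bool:
        """A document re-enters the AI pipeline only with no open findings."""
        return not self.open_findings

    def burn(self, value: str, actor: str) -> None:
        """Destructively remove the value from the text. Cannot be undone."""
        if value not in self.open_findings:
            raise ValueError("no open finding for this value")
        self.text = self.text.replace(value, REDACTION)
        self.open_findings.discard(value)
        self.audit.append((actor, "burn"))  # log the action, never the value

    def mark_safe(self, value: str, actor: str) -> None:
        """Accept the finding; the value stays in the text."""
        if value not in self.open_findings:
            raise ValueError("no open finding for this value")
        self.open_findings.discard(value)
        self.marked_safe.add(value)
        self.audit.append((actor, "mark_safe"))

    def rescan(self, scanner: Callable[[str], Set[str]], actor: str) -> None:
        """Re-run detection on the current text, keeping earlier acceptances."""
        self.open_findings = scanner(self.text) - self.marked_safe
        self.audit.append((actor, "rescan"))
```

Burn rewrites the stored text, so a later re-scan has nothing left to find, while Mark Safe leaves the text intact and relies on the recorded decision.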

Enterprise-Ready, by Design

Built from the ground up for the security, compliance, and scale requirements of the world's most demanding organizations.

Air-Gapped Ready

Zero external API calls

AES-256 Encryption

At rest & in transit

Docker / Kubernetes

Single-node to multi-AZ

Keycloak SSO

AD, LDAP, OIDC, SAML

On-Premise

Deploy entirely within your data center. Air-gapped operation with no external dependencies. Full control over infrastructure, networking, and data residency.

Bare metal or VM deployment
Docker Compose or Kubernetes
Local LLM inference via Ollama

BYOC (Bring Your Own Cloud)

Deploy inside your own AWS, Azure, or GCP environment. You maintain control of the infrastructure while we provide the software and support.

Your VPC, your security groups
Your IAM roles and policies
Your encryption keys (BYOK)

Managed Single-Tenant

Let us host it for you — in a dedicated, physically isolated environment. No shared databases, no shared indexes, no cross-tenant data leakage.

Dedicated infrastructure per customer
99.9% uptime SLA
Managed updates & monitoring

// COMPLIANCE_ENGINE

Automated PII, NHI & Secret Detection

Before any document enters your AI pipeline, CordonData scans, classifies, and redacts sensitive data — ensuring compliance with GDPR, HIPAA, PCI-DSS, and internal data governance policies.

PII Detection

Automatically identify and classify Personally Identifiable Information across all ingested documents — names, addresses, phone numbers, email addresses, social security numbers, passport numbers, driver's license IDs, and more.

SSN · Email · Phone · Passport · DL# · DOB

NHI Detection

Detect Non-public Health Information and protected health data — medical record numbers, health insurance IDs, patient identifiers, diagnosis codes, and clinical trial data — ensuring HIPAA compliance.

MRN · HIPAA · ICD-10 · HITECH

Secret & Credential Detection

Scan for leaked API keys, access tokens, database connection strings, private keys, AWS/Azure/GCP credentials, and other secrets accidentally embedded in documents before they reach the AI model.

API Key · Token · ConnStr · PEM

How Compliance Scanning Works

1. Ingest: Document enters the pipeline from any connected source
2. Scan & Classify: Regex + ML models detect PII, NHI, and secrets with confidence scoring
3. Redact or Flag: Auto-redact sensitive spans or flag for manual review based on policy
4. Index Safely: Only sanitized content enters the vector index and LLM context window
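
The four steps can be sketched end to end as follows. This is an illustrative example only: it uses regex detectors with fixed, made-up confidence values in place of ML models, and the two policy thresholds (`AUTO_REDACT_AT`, `REVIEW_AT`) are assumptions rather than product defaults.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    kind: str
    start: int
    end: int
    confidence: float


# (kind, pattern, confidence) triples; confidences are illustrative constants.
DETECTORS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), 0.95),
    ("us_ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 0.90),
    ("phone", re.compile(r"\b\d{3}[ -]\d{3}[ -]\d{4}\b"), 0.60),
]
AUTO_REDACT_AT = 0.85  # at or above: redact automatically
REVIEW_AT = 0.50       # between REVIEW_AT and AUTO_REDACT_AT: hold for review


def scan(text: str) -> List[Finding]:
    """Step 2: detect and classify sensitive spans."""
    findings = [Finding(kind, m.start(), m.end(), conf)
                for kind, pattern, conf in DETECTORS
                for m in pattern.finditer(text)]
    return sorted(findings, key=lambda f: f.start)


def sanitise(text: str) -> Tuple[Optional[str], List[Finding]]:
    """Steps 3 and 4: return (text safe to index, findings needing review).

    If any finding needs manual review, the document is held back (None).
    """
    findings = scan(text)
    review = [f for f in findings if REVIEW_AT <= f.confidence < AUTO_REDACT_AT]
    if review:
        return None, review
    redact = [f for f in findings if f.confidence >= AUTO_REDACT_AT]
    # Replace from right to left so earlier offsets stay valid.
    for f in sorted(redact, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + f"[{f.kind.upper()}]" + text[f.end:]
    return text, []
```

For example, `sanitise("Email jane@example.com, SSN 123-45-6789")` returns `("Email [EMAIL], SSN [US_SSN]", [])`, while a document containing only a lower-confidence phone match is held back for review instead of being indexed.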

Enterprise-Grade Transparency

We built CordonData to remove the two biggest blockers to enterprise AI adoption: data security and hallucinations.

// RETRIEVAL_AUDIT_TRACE

Verifiable Retrieval Audit Trace

LLM hallucinations are unacceptable in the enterprise. CordonData provides a deterministic audit trace for every generated sentence. Instantly verify the exact document, page number, and extracted text chunk the AI used to formulate its response.

Direct links to source files in your DMS
Confidence scoring on vector matches
Exact text chunk highlighting

AI Response

Generated in 1.2s

Based on the current guidelines, the Q3 bonus pool has been increased by 15% across the APAC division [1].

Audit Trace: Reference [1]

MATCH_SCORE: 0.94

DOC: APAC_Project_Phoenix_Launch_Brief.pdf

PAGE: 12 | CHUNK: #402

"...the executive board has approved a 15% increase to the bonus pool specifically allocated for the APAC division following record sales..."

John Smith · Role: HR Director

Query: "Q3 Layoffs"

Found 4 matching documents.

Indexing Source: Enterprise_DMS/HR_Confidential

Emma Doe · Role: Engineering Intern

Query: "Q3 Layoffs"

No results found.

Filtered by Index Authorization Rules

// ZERO_TRUST_ACL

Permission-Safe Retrieval Routing

A search engine is only as safe as its weakest access control. While Keycloak handles seamless identity authentication, CordonData’s native authorization engine takes over at the data layer. When a user queries the system, the vector space is dynamically filtered by cross-referencing their username, group, and authority directly against the indexed document metadata.

Secure authentication via Keycloak/Active Directory
Index-level authorization (User/Group matching)
Enforced at the index before any text reaches the LLM, so prompt injection cannot widen what a user can retrieve
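
A minimal sketch of permission-filtered retrieval, assuming ACL metadata is stored on each indexed chunk: candidates are filtered by the caller's username and groups before similarity scoring, so an unauthorised chunk never becomes a search result. `User`, `Chunk`, the file names, and the two-dimensional embeddings are hypothetical, and the "authority" attribute mentioned above is left out for brevity.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class User:
    username: str
    groups: FrozenSet[str]


@dataclass(frozen=True)
class Chunk:
    doc: str
    embedding: Tuple[float, ...]
    allowed_users: FrozenSet[str]
    allowed_groups: FrozenSet[str]


def can_read(user: User, chunk: Chunk) -> bool:
    return (user.username in chunk.allowed_users
            or bool(user.groups & chunk.allowed_groups))


def cosine(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index: List[Chunk], user: User,
           query_embedding: Tuple[float, ...], top_k: int = 3) -> List[str]:
    """Filter by ACL first, then rank only what the user may see."""
    visible = [c for c in index if can_read(user, c)]
    ranked = sorted(visible, key=lambda c: cosine(c.embedding, query_embedding),
                    reverse=True)
    return [c.doc for c in ranked[:top_k]]


# Illustrative data echoing the HR Director / Engineering Intern example.
index = [
    Chunk("layoff_plan.pdf", (1.0, 0.0), frozenset(), frozenset({"hr"})),
    Chunk("holiday_policy.pdf", (0.6, 0.8), frozenset(),
          frozenset({"hr", "engineering"})),
]
john = User("jsmith", frozenset({"hr"}))
emma = User("edoe", frozenset({"engineering"}))
query = (1.0, 0.0)
```

With this data, `search(index, john, query)` returns both documents and `search(index, emma, query)` returns only `holiday_policy.pdf`. Because the filter depends on the authenticated identity and not on the query text, nothing a user types into the prompt changes which chunks are eligible.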

Ready to transform how your team works with documents?

Join the Waitlist
