Full Platform

Complete Product Suite

Everything you need to manage, search, and understand your enterprise documents — powered by AI that stays within your infrastructure.

Your Documents, Supercharged

CordonData is a complete Document Management System — upload, organize, version, share, and annotate. Then layer on AI search, compliance scanning, and OCR to unlock everything inside your files.

Document Management

Upload files of up to 500 MB each. Organize them in nested folders (up to 50 levels deep, 100K+ files per folder). Full version history, granular sharing with VIEW/EDIT/DELETE permissions, and PDF annotation with permanent redaction.

AI-Powered Search

Ask questions in natural language across all your documents. Hybrid search combines semantic vectors with BM25 keywords. Every answer cites the exact source document and page.
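
As an illustration of how a semantic ranking and a BM25 keyword ranking can be merged into one result list, here is a minimal sketch using reciprocal rank fusion (RRF). RRF is one common fusion technique and is an assumption here, not a description of CordonData's internals; the function name, the constant k=60, and the sample file names are all hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a vector search and a BM25 keyword search.
semantic_hits = ["policy.pdf", "handbook.docx", "memo.txt"]
bm25_hits = ["handbook.docx", "contract.pdf", "policy.pdf"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Documents found by both retrievers (here handbook.docx and policy.pdf) outrank documents found by only one, which is the behavior a hybrid search aims for.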

PII, NHI & Secret Detection

Auto-scan every document for sensitive data before it enters the AI pipeline. Detect SSNs, emails, medical records, API keys, and credentials. Auto-redact or flag for review.
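
To make the scan-then-redact idea concrete, the following is a minimal pattern-matching sketch. The three regular expressions, the labels, and the function names are illustrative assumptions; a production scanner would need far more patterns, validation, and overlap handling than shown here.

```python
import re

# Illustrative patterns only (assumed, not CordonData's actual rule set).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def scan(text):
    """Return (label, start, end) for every pattern match, ordered by position."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.start(), match.end()))
    return sorted(findings, key=lambda f: f[1])

def redact(text, findings):
    """Replace each finding with a [LABEL] placeholder.

    Works right to left so earlier offsets stay valid. Assumes findings
    do not overlap.
    """
    for label, start, end in sorted(findings, key=lambda f: f[1], reverse=True):
        text = text[:start] + f"[{label.upper()}]" + text[end:]
    return text

sample = "Contact jane@example.com, SSN 123-45-6789."
findings = scan(sample)
clean = redact(sample, findings)
print(clean)
```

The same findings list could drive either path described above: pass it to redact for automatic redaction, or store it and flag the document for review.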

OCR & Document Intelligence

Extract searchable text from scanned PDFs, images, and mixed-content documents. 100+ languages, RTL support, layout-aware parsing for multi-column and table-heavy files.

Full Audit Trail

Every query, retrieval, LLM prompt, and response is logged. Export deterministic audit traces showing exactly which document chunk produced each sentence.

Permission-Safe Retrieval

DMS permissions are the single source of truth. When you share or revoke access, the AI search index reflects the change immediately. Authorization is checked before retrieval, so users only see documents they are authorized to access.
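
One way to picture permission-safe retrieval is a filter that runs before ranking, so unauthorized chunks never reach the language model. The sketch below assumes an in-memory access list; the Chunk and AccessControl classes, the retrieve function, and the sample data are hypothetical and stand in for whatever the real index uses.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    doc_id: str
    text: str
    score: float  # relevance score from the search stage

@dataclass
class AccessControl:
    # Maps doc_id to the set of users holding at least VIEW permission.
    viewers: dict = field(default_factory=dict)

    def share(self, doc_id, user):
        self.viewers.setdefault(doc_id, set()).add(user)

    def revoke(self, doc_id, user):
        self.viewers.get(doc_id, set()).discard(user)

    def can_view(self, doc_id, user):
        return user in self.viewers.get(doc_id, set())

def retrieve(candidates, acl, user, top_k=3):
    """Drop chunks the user may not view, then rank what remains."""
    allowed = [c for c in candidates if acl.can_view(c.doc_id, user)]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:top_k]

acl = AccessControl()
acl.share("contract.pdf", "alice")
acl.share("handbook.docx", "alice")
acl.share("handbook.docx", "bob")

candidates = [
    Chunk("contract.pdf", "Termination clause", 0.92),
    Chunk("handbook.docx", "Leave policy", 0.81),
]

before = [c.doc_id for c in retrieve(candidates, acl, "alice")]
acl.revoke("contract.pdf", "alice")
after = [c.doc_id for c in retrieve(candidates, acl, "alice")]
print(before, after)
```

Because the filter reads the access list at query time, revoking access takes effect on the very next query in this sketch, with no separate index rebuild.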

Connect External Sources

Already have documents elsewhere? Connect to SharePoint, Alfresco, S3, email, file servers, and REST APIs. Index in-place — no file duplication, no data migration.

Model-Agnostic LLM Gateway

Use any LLM — OpenAI, Azure, Anthropic, local Ollama models. Configure per-knowledge-base with automatic fallback across priority tiers.

Deploy Anywhere

On-premise, air-gapped, your own cloud (BYOC), or managed single-tenant SaaS. Docker Compose or Kubernetes. AES-256 encryption. Keycloak SSO.

Head-to-Head

How CordonData compares

Enterprise document management has two common failure modes: legacy DMS with no AI, and cloud AI with no control. CordonData solves both.

Capability Legacy DMS Cloud AI Tool CordonData
On-premise / air-gap deployment
AI search with grounded answers
Access enforced before AI retrieval
Automated PII / secret detection Partial
Connect to existing SharePoint / Alfresco Limited
Immutable per-user audit trail Basic Basic
Use your own LLM (model-agnostic)
Version control + approval workflows Basic
Document redaction (burn PII)

Comparison is generalised. Results vary by vendor and deployment configuration.

Supported Formats

Every document format your teams use

Upload, version, index, and search across all major office and technical formats. OCR applied where needed.

Office & Documents
PDF, DOCX / DOC, XLSX / XLS, PPTX / PPT, ODT / ODS / ODP, RTF, EPUB

Text & Markup
TXT, Markdown, HTML, CSV, XML, JSON, YAML, LaTeX, AsciiDoc

Images & Media
PNG, JPEG, TIFF, WebP, BMP, MP4, MOV, VTT subtitles

OCR applied automatically to scanned PDFs and images. Vision model descriptions available for complex visual content.

AI Configuration

Purpose-specific AI model roles

Assign different models to different tasks. Use a large model for generation, a fast model for reranking, a vision model for images — all configurable with 3-tier failover.

GENERATION

Primary text generation — chat responses, document summaries, workflow decisions

THINKING

Extended reasoning for complex queries. Outputs a visible reasoning trace before the final answer.

EMBEDDING

Converts document chunks to vectors. Changing this model triggers a full re-index.

RERANKER

Re-scores retrieval candidates by relevance after initial recall, so the most relevant chunks are the ones passed to generation.

VISION

Describes images and video frames for indexing. Used as an OCR alternative for complex visual layouts.

CONDENSER + PRIVACY

Condenser compresses long context before generation. Privacy filter adds AI-contextual PII detection on top of pattern matching.

All roles support 3-tier priority failover. Context window auto-detected per model.
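
The priority failover described for each role can be sketched as a loop that tries each tier in order and returns the first successful response. This is a simplified illustration under stated assumptions: the ModelUnavailable exception, the call_with_failover function, and the three stub model functions are hypothetical, and a real gateway would also handle timeouts, retries, and logging.

```python
class ModelUnavailable(Exception):
    """Raised by a tier when its model cannot serve the request."""

def call_with_failover(tiers, prompt):
    """Try each (name, callable) tier in priority order.

    Returns (tier_name, response) from the first tier that succeeds,
    or raises RuntimeError if every tier fails.
    """
    errors = []
    for name, call in tiers:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))

# Stub models standing in for real providers (hypothetical).
def primary(prompt):
    raise ModelUnavailable("timeout")

def secondary(prompt):
    raise ModelUnavailable("rate limited")

def local_backup(prompt):
    return f"echo: {prompt}"

tiers = [("primary", primary), ("secondary", secondary), ("local", local_backup)]
used, answer = call_with_failover(tiers, "Summarize the contract")
print(used, answer)
```

Returning the tier name along with the answer makes it possible to record which model actually served each request.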

// AGENT_BUILDER

Agent Builder & Admin Platform

Beyond search — CordonData includes a full agent-builder platform for creating custom AI assistants, configuring model pipelines, and managing enterprise knowledge at scale.

Custom AI Agents

Build purpose-specific AI agents with custom system prompts, tool configurations, and knowledge base assignments. Each agent can use different LLM models and retrieval strategies tailored to specific business functions.

Global Model Settings

Configure LLM, embedding, reranker, condenser, and vision models globally across all knowledge bases. Set priority tiers with automatic fallback: for example, OpenAI as the primary and local Ollama models as backup.

Visual Workflow Editor

Design complex AI pipelines with a drag-and-drop workflow editor. Chain together data ingestion, text extraction, chunking, embedding, retrieval, and response generation nodes — no code required.

Knowledge Base Management

Create and manage multiple knowledge bases, each with independent data sources, chunking strategies, embedding models, and ACL policies. Monitor indexing status, document counts, and sync health from a unified dashboard.

Processing Pipeline Monitor

Real-time visibility into OCR, compliance scanning, chunking, embedding, and RAG indexing pipelines. Track per-document status, retry failed documents, and monitor throughput across all connected sources.

SSO & Identity Management

Integrated Keycloak SSO with support for Active Directory, LDAP, and OIDC/SAML identity providers. Role-based access control across admin console, chat interface, and API endpoints.

Document Management

Everything you need to manage documents at scale

CordonData's DMS is a full enterprise document workbench — not a file drop. Upload, version, annotate, share, route for approval, place on legal hold, and ask AI questions, all from one place.

File Management

Upload & organise

Drag-and-drop, bulk upload, folder nesting up to 50 levels deep, move & copy

Version control

Full version chain with comments; each version independently OCR/RAG indexed

Tags & saved searches

Multi-value tags with autocomplete; save and replay search queries

Favourites

Star documents and folders for quick access across sessions

Custom document types

Admin-defined metadata schemas applied per document type

Collaboration & Editing

Online editing (WOPI)

Edit DOCX, XLSX, PPTX, and ODF files in the browser via Collabora, with no download needed. Saving creates a new version and re-indexes it automatically.

Check-out / check-in

Lock a document for offline editing; others see it as read-only. Checking in uploads a new version and releases the lock.

Comments & threads

Threaded comments per document; visible inline in the document panel

PDF annotation layer

Add highlights, shapes, and markup on PDFs; annotations are shared and persisted server-side

Subscriptions & notifications

Watch any document or folder — receive real-time bell notifications and email alerts on changes

Sharing & Access Control

Granular permissions

Per-file and per-folder access control synced from SSO or set manually

Public share links

Password-protected links with expiry and per-IP rate limiting

Recipient redaction

Apply per-recipient PDF redaction overlays — the same document, different views per user

Multi-step approval workflows

Submit documents for approval with role routing, ALL/ANY/QUORUM conditions, SLA escalation, delegation, and recall
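
The ALL/ANY/QUORUM conditions can be read as three rules for deciding when an approval step is complete. The sketch below is one plausible interpretation, not the product's documented semantics: the Condition enum, the step_approved function, and the default quorum of a simple majority are assumptions made for illustration.

```python
from enum import Enum

class Condition(Enum):
    ALL = "ALL"        # every assigned approver must approve
    ANY = "ANY"        # a single approval is enough
    QUORUM = "QUORUM"  # at least `quorum` approvals are required

def step_approved(condition, approvers, approved_by, quorum=None):
    """Decide whether an approval step is satisfied."""
    approvers = set(approvers)
    votes = set(approved_by) & approvers  # ignore votes from non-approvers
    if condition is Condition.ALL:
        return votes == approvers
    if condition is Condition.ANY:
        return len(votes) >= 1
    if condition is Condition.QUORUM:
        # Assumed default: simple majority when no quorum is given.
        needed = quorum if quorum is not None else len(approvers) // 2 + 1
        return len(votes) >= needed
    raise ValueError(f"unknown condition: {condition}")

legal_team = ["ana", "ben", "chi"]
results = {
    "all": step_approved(Condition.ALL, legal_team, ["ana", "ben"]),
    "any": step_approved(Condition.ANY, legal_team, ["ana"]),
    "quorum": step_approved(Condition.QUORUM, legal_team, ["ana", "ben"]),
}
print(results)
```

With two of three approvers signed off, the ALL step is still pending while the ANY and QUORUM steps are complete.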

AI Compare

Compare any two document versions: side-by-side, inline diff, visual diff, or AI-generated summary of key changes

Records & Lifecycle

Declare as record

Convert any document to an immutable record; content-frozen records cannot be modified or deleted

Retention schedules

Admin-defined retention periods; automatic disposition (destruction or transfer) on expiry

Legal hold

Place legal holds that block disposition; holds propagate to ancestor folders and are fully audited

Disposition events

Schedule and execute disposition workflows; admin-managed bulk disposition runs

Records manager role

Delegate records management capabilities to designated users — separate from admin
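
The interaction between retention schedules and legal holds listed above can be summarized as a single eligibility check: a record may be disposed of only once its retention period has expired and no legal hold applies. The following sketch assumes retention is counted in days from the declaration date; the Record class, its field names, and the eligible_for_disposition function are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Record:
    name: str
    declared_on: date       # date the document was declared a record
    retention_days: int     # admin-defined retention period (assumed unit: days)
    legal_hold: bool = False

def eligible_for_disposition(record, today):
    """A record is eligible only after retention expires and with no hold."""
    if record.legal_hold:
        return False  # a legal hold always blocks disposition
    expiry = record.declared_on + timedelta(days=record.retention_days)
    return today >= expiry

invoice = Record("invoice-2020.pdf", date(2020, 1, 1), retention_days=365)
held = Record("dispute.pdf", date(2020, 1, 1), retention_days=365, legal_hold=True)

today = date(2021, 6, 1)
print(eligible_for_disposition(invoice, today))  # retention expired, no hold
print(eligible_for_disposition(held, today))     # retention expired, but on hold
```

Checking the hold first keeps the rule easy to audit: whatever the retention schedule says, a held record is never returned as eligible.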

AI built into every document

Available from the document panel, with no separate AI view needed

Ask AI on document

Ask questions directly about any document — answers are grounded in that specific file

AI Compare

Compare two versions with AI-generated change summaries, risk flags, and action items

OCR + Re-index

Every new version is OCR-processed and re-indexed for semantic search — automatically

Compliance scan

PII, NHI, and secrets are scanned on ingest; flagged documents are quarantined from AI retrieval

// DOCUMENT_INTELLIGENCE

Advanced OCR & Document Intelligence

CordonData extracts structured, searchable text from any document format — scanned PDFs, images, handwritten notes, and complex multi-column layouts — using state-of-the-art OCR and document understanding models.

Scanned PDF OCR

Convert image-based PDFs into fully searchable text. Supports multi-page documents, mixed content (text + images), and RTL languages including Arabic and Hebrew.

Image Text Extraction

Extract text from PNG, JPEG, TIFF, and other image formats. Handles low-resolution scans, skewed documents, and complex backgrounds with high accuracy.

Layout-Aware Parsing

Understands multi-column layouts, tables, headers, footnotes, and callout boxes. Preserves reading order and document structure for accurate chunking.

Multilingual OCR

Supports 100+ languages including CJK (Chinese, Japanese, Korean), Arabic, Cyrillic, and Indic scripts. Automatic language detection for mixed-language documents.

Supported Document Formats

PDF (Scanned & Native)
DOCX / DOC
PPTX / PPT
XLSX / XLS
PNG / JPEG / TIFF
HTML / XML
Markdown / Plain Text
EML / MSG (Email)

Ready to transform how your team works with documents?

Join enterprise teams using CordonData for secure, AI-powered document management — deployed on your infrastructure.