Technical Evaluation

Technical evaluation guide

Architecture, deployment models, security controls, and an honest comparison against alternatives. Everything a CTO or Head of IT needs for due diligence.

Architecture

Four layers, cleanly separated

CompleteFlow separates AI agents, workflow execution with document processing, governance, and infrastructure into independent layers. Each has a well-defined interface and can be modified or extended without affecting the others.

01

AI Agent Layer

Conversational AI agents handle the work that requires judgement: interpreting documents, answering questions, reasoning across multiple sources. Agents choose which tools and workflows to invoke based on the task.

  • PydanticAI agent framework
  • Multi-provider model registry
  • Tool registry
  • Per-user credential delegation

02

Workflow & Document Processing Layer

Deterministic workflows for tasks that must execute identically every time. Document ingestion, extraction, cross-referencing, and validation. Expression-based data transforms. Agents invoke workflows as tools, and workflows can call agents for steps that need reasoning.
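
One way to picture "idempotent, rerunnable" is a step runner that keys each result on the run, the step name, and the input, so a rerun reuses the stored result instead of repeating the side effect. The sketch below is illustrative only: `StepStore`, `step_key`, and `run_step` are hypothetical names and the in-memory store stands in for durable storage. It is not CompleteFlow's engine API.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class StepStore:
    """In-memory stand-in for a durable step-results table (hypothetical)."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def __contains__(self, key: str) -> bool:
        return key in self._results

    def get(self, key: str) -> Any:
        return self._results[key]

    def put(self, key: str, value: Any) -> None:
        self._results[key] = value


def step_key(run_id: str, step: str, payload: dict) -> str:
    """Deterministic key: the same run, step, and input always hash the same."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{run_id}:{step}:{canonical}".encode("utf-8")).hexdigest()


def run_step(store: StepStore, run_id: str, step: str, payload: dict,
             fn: Callable[[dict], Any]) -> Any:
    """Run fn once per key; a rerun returns the stored result instead."""
    key = step_key(run_id, step, payload)
    if key in store:
        return store.get(key)
    result = fn(payload)
    store.put(key, result)
    return result
```

Calling `run_step` twice with the same run, step, and payload invokes `fn` only once, which is the property that makes a failed workflow safe to rerun from the start.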

  • Workflow engine (idempotent, rerunnable)
  • Document processing pipeline
  • Expression language (JSONata)
  • Human-in-the-loop gates
  • Scheduled and event-driven triggers

03

Governance Layer

Workgroup-based access control with default-deny policies. Full audit trail on every agent action, tool call, and workflow step. Reasoning traces show how the AI reached each conclusion.
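
As a rough illustration of default-deny, the check below grants an action only when the user holds a workgroup role that explicitly lists it, and writes an audit record for every decision, allowed or not. The role names come from the security table in this guide; the action names, the `AccessPolicy` class, and the record fields are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical mapping of workgroup roles to permitted actions.
ROLE_ACTIONS: Dict[str, Set[str]] = {
    "owner": {"read", "run_workflow", "approve", "manage_members"},
    "contributor": {"read", "run_workflow"},
    "reviewer": {"read", "approve"},
    "viewer": {"read"},
}


@dataclass
class AccessPolicy:
    # (user, workgroup) -> role. A missing entry means no access at all.
    memberships: Dict[Tuple[str, str], str] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def is_allowed(self, user: str, workgroup: str, action: str) -> bool:
        role: Optional[str] = self.memberships.get((user, workgroup))
        # Default-deny: unknown users, unknown roles, and unlisted actions all fail.
        allowed = role is not None and action in ROLE_ACTIONS.get(role, set())
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "workgroup": workgroup,
            "action": action,
            "allowed": allowed,
        })
        return allowed
```

A user with no membership entry is refused without any special-case code, and the refusal still appears in the audit log.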

  • Workgroup-based RBAC
  • Audit logger (7-year retention)
  • Confidence scoring
  • Review queue and approval gates
  • Cost tracking per user and workflow

04

Infrastructure Layer

Deploys on your chosen infrastructure using managed services. Multi-provider LLM access via private endpoints (Azure OpenAI, Anthropic, open-weight models). PostgreSQL with pgvector for data and vector search. Azure Key Vault for secrets and per-user credentials. All components containerised with infrastructure-as-code.

  • Multi-provider LLM (private endpoints)
  • PostgreSQL 16 + pgvector
  • Azure Key Vault
  • Azure Cache for Redis
  • Hardened container images (CIS benchmarked)
  • Private networking (VNet/VPC)
  • Infrastructure as Code

Deployment

Three deployment models, each with clear trade-offs

The right deployment model depends on your regulatory constraints, existing infrastructure, and operational capacity. Here is exactly what each option involves.

Private Cloud

Typical timeline: 4-6 weeks

Recommended

Infrastructure

Your Azure, AWS, or GCP tenancy

Networking

Hub-spoke VNet/VPC with private endpoints. No public internet exposure for data plane. Management plane accessible via VPN or private link.

Compute

Azure Container Apps, AWS ECS/Fargate, or GCP Cloud Run. Auto-scaling based on agent workload.

LLM Providers

Works with any state-of-the-art model: Anthropic Claude, OpenAI, Google Gemini, Azure OpenAI, and open-weight models (Llama, Mistral) in your tenancy. Commercial API tiers. Your data is never used for model training.

Data Layer

PostgreSQL managed service (Azure Database for PostgreSQL, RDS, Cloud SQL) with customer-managed encryption keys.

Identity

Microsoft Entra ID with delegated OAuth. Agents inherit the requesting user's permissions.

Monitoring

OpenTelemetry to Azure Monitor, CloudWatch, or your existing SIEM.

On-Premises

Typical timeline: 6-10 weeks

Maximum isolation

Infrastructure

Your own hardware or private data centre

Networking

Fully air-gapped. No outbound internet required. All components run within your network perimeter.

Compute

Docker Compose for single-node. Kubernetes (any distribution) for multi-node. Bare metal or VM deployment.

LLM Providers

Open-weight models only (Llama, Mistral). Runs on your GPU infrastructure. No external API calls.

Data Layer

Self-managed PostgreSQL with your encryption and backup policies. Full control over data lifecycle.

Identity

On-premises Active Directory or SAML 2.0 identity provider. Local auth fallback available.

Monitoring

OpenTelemetry export to your existing monitoring stack (Grafana, Splunk, ELK).

CompleteFlow Cloud

Typical timeline: 2-3 weeks

Fastest setup

Infrastructure

Private cloud managed by Atchai (deploy in any region)

Networking

Dedicated tenant environment. Data residency in your chosen region guaranteed. SOC 2 controls applied.

Compute

Managed container infrastructure with automatic scaling and patching.

LLM Providers

Anthropic Claude, OpenAI, and Azure OpenAI. Commercial API tiers. Your data is never used for model training. Model selection per agent.

Data Layer

Managed PostgreSQL with encryption at rest. Data residency in your chosen geography.

Identity

Microsoft Entra ID SSO. Federated identity with your existing directory.

Monitoring

Built-in dashboard with agent metrics, cost tracking, and audit log viewer. Export to your SIEM available.

Security Model

Defence in depth, not perimeter-only

Security controls are applied at every layer: network, identity, encryption, policy, and audit. No control is a single point of failure.

Domain | Control | Detail
Single-tenant architecture | Dedicated deployment per client | No shared infrastructure, no shared database, no co-tenancy. You own the Azure subscription and can revoke access at any time.
LLM data privacy | Private LLM endpoints, zero training on your data | Azure OpenAI via private endpoint within your subscription. Prompts and completions never used for model training by Microsoft or OpenAI.
Per-user credentials | OAuth delegation with Key Vault storage | Each user authenticates separately with external services. Tokens stored in Azure Key Vault (hardware-backed). Agents see only what the authenticated user can see.
Encryption | AES-256 at rest, TLS 1.2+ in transit | Customer-managed keys available. HS256 signed access tokens. SHA-256 hashed API keys. SSL required on all database connections.
Network isolation | Private endpoints, no public data plane | Database, Redis, and LLM accessed via private networking only. External ingress limited to web app and Teams bot.
Access control | Workgroup-based RBAC with default-deny | Per-workgroup roles (owner, contributor, reviewer, viewer). AI tool access filtered by user permissions. No default access for new users.
Secrets management | Azure Key Vault with managed identity | All secrets in Key Vault. No secrets in code, config, or environment variables. Only the credential broker service can access tokens.
Audit logging | Full provenance chain, 7-year retention | Every agent action, tool call, workflow step, and approval decision logged with who, what, when, and cost. Reasoning traces exportable.
Container security | Hardened base images, CI/CD scanning | CIS-benchmarked container images (CBL-Mariner). Dependency vulnerability scanning and SAST on every build. CVE scanning on images.
Disaster recovery | Azure managed backups with geo-redundancy | 35-day automated backup retention. Point-in-time restore. Geo-redundant storage. Optional multi-region failover.
Data residency | Deploy in any geography | US, UK, EU, or any region. Data does not leave the designated geography. All processing within your subscription.
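
The "SHA-256 hashed API keys" control can be sketched with the Python standard library alone: the plaintext key is shown once at issue time, only its digest is stored, and verification uses a constant-time comparison. `issue_api_key` and `verify_api_key` are hypothetical helper names for this illustration, not part of the product.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def issue_api_key() -> Tuple[str, str]:
    """Return (plaintext_key, stored_hash). Only the hash should be persisted."""
    key = secrets.token_urlsafe(32)  # high-entropy random key
    return key, hashlib.sha256(key.encode("utf-8")).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

A plain, unsalted SHA-256 digest is reasonable here only because the key is long and random. It would not be an appropriate way to store user-chosen passwords.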

Compliance Mapping

How CompleteFlow maps to regulatory frameworks

Compliance is not a feature that gets added later. These controls are built into the agent execution pipeline. Below is a mapping of specific regulatory requirements to the CompleteFlow controls that address them.

FCA Consumer Duty

Requirement | CompleteFlow Control
Demonstrate outcomes monitoring | Immutable audit trails capture every AI decision with reasoning traces, confidence scores, and outcome data
Evidence of fair treatment | Output logging for bias review and audit. Human review gates for high-stakes decisions
Explainability of AI decisions | Exportable reasoning traces show tool calls, data sources, confidence scores, and the decision path for each output
Governance and oversight | Workgroup-based access control with audit logging. Configurable human-in-the-loop approval gates

SRA Standards and Regulations

Requirement | CompleteFlow Control
Client data confidentiality | Private deployment on firm infrastructure. Delegated OAuth ensures agents only access what the requesting user can access
Competence and supervision | Human-in-the-loop review for all client-facing outputs. Confidence thresholds trigger mandatory review
Record keeping | Two-level audit logging with 7-year retention. Every agent interaction recorded with full context
Technology risk management | Version-controlled agent configurations. Rollback capability. Sandbox testing before production deployment
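
The supervision rule in the table above (review every client-facing output, and anything below a confidence threshold) reduces to a small routing decision. This is a sketch under assumptions: the `AgentOutput` fields and the 0.85 default threshold are invented for illustration and would be configurable in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentOutput:
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)
    client_facing: bool


def needs_review(output: AgentOutput, threshold: float = 0.85) -> bool:
    """Route to a human when the output is client-facing or low-confidence."""
    return output.client_facing or output.confidence < threshold
```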

MiFID II

Requirement | CompleteFlow Control
Transaction reporting | Per-agent, per-user activity logging with timestamps. Exportable audit records in standard formats
Best execution documentation | Reasoning traces capture the full decision path including data sources consulted and alternatives considered
Algorithmic trading controls | Rate limits, circuit breakers, and mandatory human gates for actions above configurable thresholds
Record retention (5 years) | Configurable retention periods. Default 7-year retention exceeds the MiFID II 5-year requirement
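
For readers unfamiliar with the term, a circuit breaker stops calling a failing dependency after a run of consecutive errors and allows a trial call once a cool-down has passed. The class below is a generic, minimal version of that pattern with hypothetical names and defaults. It makes no claim about how CompleteFlow implements its own controls.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after max_failures consecutive failures; permits a trial call
    once reset_after seconds have elapsed since it opened."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: calls flow normally
        # Open: only allow a trial call after the cool-down.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

The caller checks `allow()` before each action and reports the outcome afterwards. The injectable `clock` exists so the behaviour can be tested without sleeping.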

GDPR

Requirement | CompleteFlow Control
Data minimisation | Agents process only the data required for each task. No persistent caching of personal data beyond the task lifecycle
Right to erasure | Tenant-level and user-level data deletion with cascade. Audit records anonymised (not deleted) to preserve regulatory trail
Data portability | Full data export in standard formats. No proprietary data lock-in
Breach notification | Automated anomaly detection with alerting. Incident response procedures with 72-hour notification support
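
The erasure row combines two behaviours: the user's own data is deleted, while audit records are kept with the identity replaced. A minimal sketch of that split follows, assuming simple in-memory collections and a hypothetical `erase_user` helper. The placeholder is random and no mapping back to the user is stored. Whether a given scheme counts as anonymisation under GDPR is a legal question this sketch does not settle.

```python
import secrets
from typing import Dict, List


def erase_user(user_id: str, profiles: Dict[str, dict],
               documents: List[dict], audit_log: List[dict]) -> str:
    """Delete the user's data; keep audit records under an opaque placeholder."""
    placeholder = f"erased-{secrets.token_hex(8)}"
    profiles.pop(user_id, None)                       # delete the profile
    documents[:] = [d for d in documents if d["owner"] != user_id]  # cascade
    for record in audit_log:                          # keep the trail, drop the identity
        if record["user"] == user_id:
            record["user"] = placeholder
    return placeholder
```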

EU AI Act

Requirement | CompleteFlow Control
Risk classification | Agent risk tiers mapped to EU AI Act categories. High-risk agents receive additional governance controls automatically
Transparency obligations | Users are informed when interacting with an AI agent. Reasoning traces provide full transparency of AI decision-making
Human oversight | Configurable human-in-the-loop gates. High-risk decisions require explicit human approval before execution
Technical documentation | Automated generation of technical documentation per agent: capabilities, limitations, training data lineage, and performance metrics

Integration Architecture

MCP-first, not API-spaghetti

CompleteFlow uses the Model Context Protocol (MCP) as its primary integration standard. MCP is an open standard for connecting AI agents to external tools and data sources. Any system with an MCP server can be connected to CompleteFlow agents without custom integration code.

Integration via MCP

Each external system connects via a tool interface that the agent discovers at runtime. Tools run within your network perimeter. No data leaves your deployment environment during tool calls. Adding a new integration does not require changes to agent code.
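
Runtime discovery in MCP happens over JSON-RPC 2.0: a client sends `tools/list` to learn what a server offers, then `tools/call` with a tool name and arguments. The sketch below only builds and reads those messages. The helper functions are hypothetical and transport, initialisation, and error handling are left out.

```python
from itertools import count
from typing import Any, Dict, List, Optional

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def jsonrpc_request(method: str, params: Optional[dict] = None) -> Dict[str, Any]:
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request() -> Dict[str, Any]:
    """Ask an MCP server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> Dict[str, Any]:
    """Invoke one discovered tool by name."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


def tool_names(list_response: Dict[str, Any]) -> List[str]:
    """Extract tool names from a tools/list result."""
    return [tool["name"] for tool in list_response["result"]["tools"]]
```

Because the agent reads the tool list from the server at runtime, connecting a new system means standing up an MCP server for it rather than changing agent code.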

Per-user credential delegation ensures each user's agent uses their own OAuth tokens when accessing external services. The external system's own permissions (folder access, ethical walls, matter restrictions) are enforced at the source.
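
The key property of per-user delegation is what happens when a token is missing: the call fails instead of falling back to a shared service account. A minimal sketch, with an in-memory dictionary standing in for the vault and hypothetical class and method names:

```python
from typing import Dict, Tuple


class MissingConsentError(LookupError):
    """Raised when a user has not connected the requested service."""


class CredentialBroker:
    def __init__(self) -> None:
        # (user_id, service) -> OAuth token. Stand-in for a secrets vault.
        self._vault: Dict[Tuple[str, str], str] = {}

    def store(self, user_id: str, service: str, token: str) -> None:
        self._vault[(user_id, service)] = token

    def token_for(self, user_id: str, service: str) -> str:
        """Return this user's own token. There is deliberately no shared fallback."""
        try:
            return self._vault[(user_id, service)]
        except KeyError:
            raise MissingConsentError(f"{user_id} has not connected {service}") from None
```

Since the external system only ever sees the requesting user's token, its own permission checks apply unchanged.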

Microsoft 365 Integration

Microsoft 365 has dedicated support via the Graph API with delegated OAuth tokens. When a user asks an agent to search SharePoint or access OneDrive, the agent uses that user's existing permissions. No admin consent for broad access required.

Agents surface in Microsoft Teams as a chat bot. Users interact through natural language in the same environment they already work in.

Comparison

CompleteFlow vs the alternatives for regulated environments

Four approaches to deploying AI in regulated industries. Each has trade-offs. This table presents the differences without editorialising. The right choice depends on your constraints, budget, and timeline.

Criterion | CompleteFlow | DIY (Open Source) | SaaS AI Tools | Palantir AIP
Time to production | 6 weeks | 12-18 months | 2-4 weeks | 6-12 months
Initial cost | From £40k | £200k-500k+ (team cost) | £20-1,000/user/month | £1M+/year
Data residency | Your infrastructure, any geography | Your infrastructure | Vendor cloud (usually US) | Your infrastructure
Audit trail depth | Two-level immutable logging, 7-year retention | Must build from scratch | Basic execution logs | Enterprise audit logging
Access control | Workgroup-based RBAC, default-deny | Must build or integrate | Role-based only | Proprietary policy engine
Human-in-the-loop | Native gates with suspend/resume | Must build | Limited or none | Available
Reasoning traces | Every decision, exportable | Must instrument | Rarely available | Available
LLM flexibility | Multi-provider, swap without code changes | Full flexibility (if built) | Vendor-locked model | Multi-provider
Regulatory mapping | FCA, SRA, MiFID II, GDPR, EU AI Act | Must map yourself | GDPR only (if any) | Enterprise compliance
Vendor lock-in | Low. Open-source stack, full data export. Migration requires re-integrating workflows | None | High | High
Ongoing team required | Minimal. We manage the platform; your team manages agents | 2+ FTE AI/platform engineers | 0.5 FTE | 2-4 FTE + Palantir engineers

Evaluation Checklist

What to look for in an AI agent platform

Use this checklist to evaluate CompleteFlow, competitors, or an in-house build. These are the questions that separate platforms built for real document work from wrappers around an LLM.

Document Processing

  • Can the platform ingest PDFs, Word docs, emails, and spreadsheets?
  • Does it extract structured data into tables, or just summarise?
  • Can it cross-reference figures and statements across multiple documents?
  • Is there vector search (RAG) across your document corpus?
  • Can it generate new documents from extracted data and templates?

AI Agent Capabilities

  • Can agents reason across document bundles, not just retrieve from them?
  • Do agents invoke deterministic workflows when precision is needed?
  • Can every team member access the agent through a conversational interface?
  • Are agent responses traceable back to source documents with references?
  • Can you swap LLM providers without rewriting agent logic?
  • Can you choose different models per workflow (e.g. fast/cheap for extraction, powerful for reasoning)?
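
To make the last two questions concrete: if workflows refer to models only through a registry, changing provider is a configuration change, not a rewrite. The sketch below assumes a hypothetical `ModelRegistry`, and the provider and model strings are placeholders, not real identifiers.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelSpec:
    provider: str
    model: str


class ModelRegistry:
    """Maps workflow names to models; unassigned workflows use the default."""

    def __init__(self, default: ModelSpec) -> None:
        self._default = default
        self._by_workflow: Dict[str, ModelSpec] = {}

    def assign(self, workflow: str, spec: ModelSpec) -> None:
        self._by_workflow[workflow] = spec

    def resolve(self, workflow: str) -> ModelSpec:
        return self._by_workflow.get(workflow, self._default)


# Placeholder identifiers: a cheap model for extraction, a stronger default.
registry = ModelRegistry(default=ModelSpec("provider-a", "large-reasoning"))
registry.assign("invoice-extraction", ModelSpec("provider-b", "small-fast"))
```

Workflow code calls `registry.resolve(name)` and never names a provider directly, so reassigning a workflow leaves its logic untouched.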

Workflow Engine

  • Is there a visual builder and an expression language for complex logic?
  • Are workflows idempotent (safe to rerun without duplication)?
  • Does the engine support human-in-the-loop gates with suspend/resume?
  • Can workflows run on a schedule or be triggered manually?
  • Can agents call workflows as tools, and can workflows call agents?
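
When asking a vendor about suspend/resume, it helps to know what the behaviour looks like in miniature: the run executes steps until it reaches a gate, persists its position, and continues only after a decision arrives. The sketch below is a toy model with hypothetical names. A real engine would persist the run state durably between `advance` and `resume`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple, Union

Step = Tuple[str, Union[Callable[[dict], None], str]]  # ("task", fn) or ("gate", label)


@dataclass
class WorkflowRun:
    steps: List[Step]
    data: dict = field(default_factory=dict)
    position: int = 0
    status: str = "running"  # running | suspended | completed | rejected
    waiting_on: Optional[str] = None

    def advance(self) -> None:
        """Run tasks in order until the end or the next human gate."""
        while self.position < len(self.steps):
            kind, item = self.steps[self.position]
            if kind == "gate":
                self.status, self.waiting_on = "suspended", str(item)
                return  # stop here; position is kept for resume()
            item(self.data)  # type: ignore[operator]
            self.position += 1
        self.status = "completed"

    def resume(self, approved: bool) -> None:
        """Apply the reviewer's decision and continue past the gate if approved."""
        if self.status != "suspended":
            raise RuntimeError("run is not waiting on a gate")
        self.waiting_on = None
        if not approved:
            self.status = "rejected"
            return
        self.position += 1
        self.status = "running"
        self.advance()
```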

Security and Access

  • Does the platform deploy on your infrastructure (not SaaS-only)?
  • When an agent accesses external systems, whose credentials does it use? Per-user or shared?
  • Where do LLM API calls terminate? In your tenancy or the vendor's?
  • Is there a full audit trail with reasoning traces, or just execution logs?
  • Does the access control model support per-team/per-matter isolation?

Operations and Ownership

  • Do you own the agents and workflows that are built? Can you export them?
  • What does the platform require from your team to operate day-to-day?
  • Is there version control and rollback for workflow definitions?
  • What is the cost model? Per-agent cost tracking and attribution?
  • What is the time to first production deployment?
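
Cost attribution is easier to evaluate with a concrete shape in mind: every model call is recorded against a user and a workflow, and totals are rolled up on either axis. The `CostLedger` below is a hypothetical sketch, and the per-1,000-token prices passed to it are made-up inputs, not real vendor rates.

```python
from collections import defaultdict
from decimal import Decimal
from typing import DefaultDict, Dict, Tuple


class CostLedger:
    def __init__(self, price_per_1k_tokens: Dict[str, Decimal]) -> None:
        self._prices = price_per_1k_tokens
        self._totals: DefaultDict[Tuple[str, str], Decimal] = defaultdict(Decimal)

    def record(self, user: str, workflow: str, model: str, tokens: int) -> Decimal:
        """Price one model call and attribute it to (user, workflow)."""
        cost = self._prices[model] * Decimal(tokens) / Decimal(1000)
        self._totals[(user, workflow)] += cost
        return cost

    def total_by_user(self, user: str) -> Decimal:
        return sum((c for (u, _), c in self._totals.items() if u == user), Decimal(0))

    def total_by_workflow(self, workflow: str) -> Decimal:
        return sum((c for (_, w), c in self._totals.items() if w == workflow), Decimal(0))
```

`Decimal` is used in place of floats so that small per-call amounts add up without rounding drift.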

FAQ

Technical evaluation questions

What types of documents can CompleteFlow process?
PDFs, Word documents, Excel spreadsheets, emails, and plain text. The platform extracts structured data into tables, searches across documents using vector search (RAG), cross-references figures between sources, and generates new documents from templates and extracted data.
How do agents access our document management system?
Each user authenticates separately with your DMS (NetDocuments, iManage, SharePoint) via OAuth. The agent uses that user's own credentials, so it only sees what that user can see. Tokens are stored in Azure Key Vault. No shared service accounts.
Can we choose which LLM to use?
Yes. CompleteFlow supports Anthropic Claude, OpenAI, Azure OpenAI, and open-weight models. You can assign different models per workflow: a fast, cheap model for extraction, a powerful model for reasoning. Swap providers without changing any workflow logic.
What deployment options are available?
Private cloud (your Azure, AWS, or GCP tenancy), on-premises (air-gapped, open-weight models only), or CompleteFlow Cloud (managed by us in your chosen region). All options include full audit trails and the same platform features; only the available LLM providers differ.
What happens if we want to leave?
You own everything built on your infrastructure. Workflows, agent configurations, audit records, and all data are exportable. The platform uses open standards throughout. No proprietary lock-in.
How long to get to production?
Six weeks from kickoff to first production workflows. This includes platform deployment, integration with your systems, workflow configuration, and team training.
Do we need AI expertise in-house?
No. We handle the platform setup and initial workflow configuration. Your team uses the visual builder and conversational interface to work with agents day-to-day. No ML engineering required.

Ready for a technical walkthrough?

Book a 30-minute session with an engineer. Bring your CISO, your architect, your hardest questions. We will walk through the architecture on your terms.