Technical Evaluation

The technical deep dive your team needs before recommending an AI agent platform

Architecture, deployment models, security controls, compliance mapping, and an honest comparison against alternatives. Everything a CTO, CISO, or Head of IT needs for due diligence.

Architecture

Four independent layers, each replaceable

CompleteFlow separates concerns into four layers. Each layer has a well-defined interface and can be modified, replaced, or extended without affecting the others. This is not a monolith with "modular" in the marketing copy. The layers are independently deployable and independently testable.

01

Channel Layer

Accepts requests from Microsoft Teams, Copilot Chat, web UI, and REST API. Each channel adapter handles authentication, message formatting, and response delivery. Agents are channel-neutral. The same agent serves every channel without modification.

  • Teams adapter
  • Copilot Chat adapter
  • Web UI
  • REST API
  • Webhook endpoints
02

Agent Layer

PydanticAI agents with type-safe tool definitions and structured output, backed by an idempotent workflow engine for multi-step processes. Workflows support parallel fan-out, conditional routing, human gates, scheduling, and automatic retry with state recovery. Agents auto-discover registered workflows and expose them as conversational tools.

  • PydanticAI agent framework
  • Workflow engine (idempotent, rerunnable)
  • Model registry (multi-provider)
  • Tool registry (MCP servers)
  • Python-native orchestration
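
To make "idempotent, rerunnable" concrete, here is a minimal sketch of one way a step runner could checkpoint results so that a rerun replays completed steps instead of repeating them. It is an illustration only, not CompleteFlow's engine: `WorkflowRun`, `run_step`, and the in-memory store are hypothetical names, and a real engine would persist state durably.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class WorkflowRun:
    """Hypothetical in-memory record of completed steps for one run."""
    run_id: str
    results: dict = field(default_factory=dict)


def run_step(run: WorkflowRun, name: str, fn: Callable[[], Any],
             max_attempts: int = 3) -> Any:
    """Execute a step once per run, retrying on failure and replaying on rerun."""
    if name in run.results:
        # Step already completed in this run: return the checkpointed result
        # rather than repeating its side effects.
        return run.results[name]
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            result = fn()
        except Exception as exc:  # a real engine would catch narrower error types
            last_error = exc
            continue
        run.results[name] = result  # checkpoint before moving to the next step
        return result
    raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts") from last_error


calls = {"fetch": 0}


def flaky_fetch() -> str:
    """Fails on the first call, succeeds afterwards (simulated transient error)."""
    calls["fetch"] += 1
    if calls["fetch"] == 1:
        raise ConnectionError("transient")
    return "document"


run = WorkflowRun(run_id="run-1")
first = run_step(run, "fetch", flaky_fetch)   # one failure, then success
second = run_step(run, "fetch", flaky_fetch)  # rerun: served from the checkpoint
```

In this sketch the second call never invokes `flaky_fetch`, which is the property that makes a whole workflow safe to rerun after a crash.
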
03

Governance Layer

Every agent action passes through the governance layer before execution and after completion. OPA evaluates agent-level access control. Cedar handles fine-grained tool-level authorisation with formal verification. The audit logger records every decision at configurable detail levels.

  • OPA (agent-level access control)
  • Cedar (tool-level authorisation)
  • Immutable audit logger
  • Confidence scoring engine
  • Review queue and approval gates
04

Infrastructure Layer

PostgreSQL 16 with pgvector for data persistence and vector search. Row-level security enforces tenant isolation at the database level. OpenTelemetry provides distributed tracing with per-agent cost attribution. All components deploy as containers with infrastructure-as-code.

  • PostgreSQL 16 + pgvector
  • Row-level security (multi-tenancy)
  • OpenTelemetry tracing
  • Container orchestration
  • Private networking (VNet/VPC)

Deployment

Three deployment models, each with clear trade-offs

The right deployment model depends on your regulatory constraints, existing infrastructure, and operational capacity. Here is exactly what each option involves.

Private Cloud

Typical timeline: 4-6 weeks

Recommended

Infrastructure

Your Azure, AWS, or GCP tenancy

Networking

Hub-spoke VNet/VPC with private endpoints. No public internet exposure for data plane. Management plane accessible via VPN or private link.

Compute

Azure Container Apps, AWS ECS/Fargate, or GCP Cloud Run. Auto-scaling based on agent workload.

LLM Providers

Azure OpenAI (private endpoint), Anthropic API (via private link), or self-hosted open-weight models in your tenancy. Commercial API tiers. Your data is never used for model training.

Data Layer

PostgreSQL managed service (Azure Database for PostgreSQL, RDS, Cloud SQL) with customer-managed encryption keys.

Identity

Microsoft Entra ID with delegated OAuth. Agents inherit requesting user permissions.

Monitoring

OpenTelemetry to Azure Monitor, CloudWatch, or your existing SIEM.

On-Premises

Typical timeline: 6-10 weeks

Maximum isolation

Infrastructure

Your own hardware or private data centre

Networking

Fully air-gapped. No outbound internet required. All components run within your network perimeter.

Compute

Docker Compose for single-node. Kubernetes (any distribution) for multi-node. Bare metal or VM deployment.

LLM Providers

Open-weight models only (Llama, Mistral). Runs on your GPU infrastructure. No external API calls.

Data Layer

Self-managed PostgreSQL with your encryption and backup policies. Full control over data lifecycle.

Identity

On-premises Active Directory or SAML 2.0 identity provider. Local auth fallback available.

Monitoring

OpenTelemetry export to your existing monitoring stack (Grafana, Splunk, ELK).

CompleteFlow Cloud

Typical timeline: 2-3 weeks

Fastest setup

Infrastructure

UK-hosted private cloud managed by Atchai

Networking

Dedicated tenant environment. UK data residency guaranteed. SOC 2 controls applied.

Compute

Managed container infrastructure with automatic scaling and patching.

LLM Providers

Anthropic Claude, OpenAI, and Azure OpenAI. Commercial API tiers. Your data is never used for model training. Model selection per agent.

Data Layer

Managed PostgreSQL with encryption at rest. Data residency in UK geography.

Identity

Microsoft Entra ID SSO. Federated identity with your existing directory.

Monitoring

Built-in dashboard with agent metrics, cost tracking, and audit log viewer. Export to your SIEM available.

Security Model

Defence in depth, not perimeter-only

Security controls are applied at every layer: network, identity, encryption, policy, and audit. The design does not rely on any single control.

Domain | Control | Detail
Encryption at rest | AES-256 for all stored data | Customer-managed keys via Azure Key Vault, AWS KMS, or HashiCorp Vault. Key rotation policies enforced.
Encryption in transit | TLS 1.3 for all connections | Certificate management automated. Internal service mesh with mTLS between components.
Network isolation | Private endpoints, no public data plane | Hub-spoke network topology. Management and data planes separated. NSG/security group rules restrict lateral movement.
Identity and access | Delegated OAuth via Microsoft Entra ID | Agents inherit requesting user permissions. No over-provisioned service accounts. JIT user provisioning. OPA enforces who can invoke which agents.
Key management | Customer-held encryption keys | CompleteFlow never holds customer keys. Bring-your-own-key (BYOK) supported across all deployment models.
Secrets management | Vault-based secret storage | API keys, tokens, and credentials stored in Azure Key Vault, AWS Secrets Manager, or HashiCorp Vault. No secrets in code or configuration files.
Audit logging | Immutable two-level audit trail | Minimal level: agent ID, user, action, model, tokens, cost, policy decisions. Maximal level: full prompt and response. 7-year default retention. Tamper-evident with cryptographic chaining.
Policy enforcement | OPA + Cedar policy-as-code | Rego policies version-controlled and tested in CI. Cedar policies formally verified. Every policy decision logged with full evaluation context.
Penetration testing | Annual third-party assessment | CREST-accredited providers. Continuous vulnerability scanning. Responsible disclosure programme.
Data residency | Deploy in any geography | UK (Azure UK South, AWS eu-west-2), EU, or any region. Data does not leave the designated geography.
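
The "tamper-evident with cryptographic chaining" control in the audit logging row can be illustrated with a hash chain, where each entry's hash covers the previous entry's hash. The sketch below is an assumption about the general technique, not the platform's actual log format: the field names, the SHA-256 choice, and the `GENESIS` value are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry in the chain


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(log: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify(log: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list = []
append_entry(audit_log, {"agent_id": "contracts", "user": "alice", "action": "search", "tokens": 812})
append_entry(audit_log, {"agent_id": "contracts", "user": "alice", "action": "summarise", "tokens": 1530})

intact = verify(audit_log)                   # chain verifies before any change
audit_log[0]["record"]["user"] = "mallory"   # simulate after-the-fact tampering
tamper_detected = not verify(audit_log)
```

Chaining makes tampering detectable rather than impossible, so in practice the latest hash would also need to be anchored somewhere the log writer cannot modify.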

Compliance Mapping

How CompleteFlow maps to regulatory frameworks

Compliance is not a feature that gets added later. These controls are built into the agent execution pipeline. Below is a mapping of specific regulatory requirements to the CompleteFlow controls that address them.

FCA Consumer Duty

Requirement | CompleteFlow Control
Demonstrate outcomes monitoring | Immutable audit trails capture every AI decision with reasoning traces, confidence scores, and outcome data
Evidence of fair treatment | Output logging for bias review and audit. Human review gates for high-stakes decisions
Explainability of AI decisions | Reasoning traces export showing tool calls, data sources, confidence scores, and the decision path for each output
Governance and oversight | OPA and Cedar policy-as-code with version control. Configurable human-in-the-loop approval gates

SRA Standards and Regulations

Requirement | CompleteFlow Control
Client data confidentiality | Private deployment on firm infrastructure. Delegated OAuth ensures agents only access what the requesting user can access
Competence and supervision | Human-in-the-loop review for all client-facing outputs. Confidence thresholds trigger mandatory review
Record keeping | Two-level audit logging with 7-year retention. Every agent interaction recorded with full context
Technology risk management | Version-controlled agent configurations. Rollback capability. Sandbox testing before production deployment
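
The competence and supervision row combines two rules: client-facing outputs are always reviewed, and low confidence forces review for everything else. A minimal sketch of that routing decision follows. The `AgentOutput` type, the `[0, 1]` confidence scale, and the `0.8` default threshold are assumptions for illustration, not documented platform values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentOutput:
    """Hypothetical shape of an agent result awaiting release."""
    text: str
    confidence: float   # assumed to be a score between 0 and 1
    client_facing: bool


def needs_human_review(output: AgentOutput, threshold: float = 0.8) -> bool:
    """Client-facing outputs always go to review; others only below the threshold."""
    return output.client_facing or output.confidence < threshold


internal_confident = AgentOutput("Filing index updated.", confidence=0.95, client_facing=False)
internal_unsure = AgentOutput("Clause may be unenforceable.", confidence=0.55, client_facing=False)
client_letter = AgentOutput("Dear client, ...", confidence=0.99, client_facing=True)

decisions = [needs_human_review(o) for o in (internal_confident, internal_unsure, client_letter)]
```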

MiFID II

Requirement | CompleteFlow Control
Transaction reporting | Per-agent, per-user activity logging with timestamps. Exportable audit records in standard formats
Best execution documentation | Reasoning traces capture the full decision path including data sources consulted and alternatives considered
Algorithmic trading controls | Rate limits, circuit breakers, and mandatory human gates for actions above configurable thresholds
Record retention (5 years) | Configurable retention periods. Default 7-year retention exceeds the MiFID II 5-year requirement
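
Two of the algorithmic trading controls lend themselves to a short illustration: a circuit breaker that blocks further actions after repeated failures, and a gate that sends actions above a threshold to a human. This is a generic sketch of both patterns under stated assumptions; `CircuitBreaker`, `requires_human_gate`, and the numeric limits are hypothetical and do not describe the platform's configuration.

```python
from typing import Any, Callable


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures and then rejects calls."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.consecutive_failures = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= self.max_failures

    def call(self, action: Callable[[], Any]) -> Any:
        if self.is_open:
            # Fail fast instead of sending more traffic to a failing dependency.
            raise RuntimeError("circuit open: action blocked until reset")
        try:
            result = action()
        except Exception:
            self.consecutive_failures += 1
            raise
        self.consecutive_failures = 0  # any success closes the failure streak
        return result

    def reset(self) -> None:
        """Manual reset, for example after an operator has investigated."""
        self.consecutive_failures = 0


def requires_human_gate(amount: float, threshold: float) -> bool:
    """Actions strictly above the configured threshold need explicit approval."""
    return amount > threshold
```

A production breaker would usually add a timed half-open state; this version keeps only the core open/closed behaviour.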

GDPR

Requirement | CompleteFlow Control
Data minimisation | Agents process only the data required for each task. No persistent caching of personal data beyond the task lifecycle
Right to erasure | Tenant-level and user-level data deletion with cascade. Audit records anonymised (not deleted) to preserve regulatory trail
Data portability | Full data export in standard formats. No proprietary data lock-in
Breach notification | Automated anomaly detection with alerting. Incident response procedures with 72-hour notification support

EU AI Act

Requirement | CompleteFlow Control
Risk classification | Agent risk tiers mapped to EU AI Act categories. High-risk agents receive additional governance controls automatically
Transparency obligations | Users are informed when interacting with an AI agent. Reasoning traces provide full transparency of AI decision-making
Human oversight | Configurable human-in-the-loop gates. High-risk decisions require explicit human approval before execution
Technical documentation | Automated generation of technical documentation per agent: capabilities, limitations, training data lineage, and performance metrics

Integration Architecture

MCP-first, not API-spaghetti

CompleteFlow uses the Model Context Protocol (MCP) as its primary integration standard. MCP is an open standard for connecting AI agents to external tools and data sources. Any system exposed through an MCP server can be connected to CompleteFlow agents without custom integration code.

MCP Server Architecture

Each external system is represented by an MCP server that exposes typed tools to CompleteFlow agents. The agent runtime discovers available tools at startup and makes them available as callable functions. MCP servers run within your network perimeter, so no data leaves your infrastructure during tool calls.

MCP servers handle authentication, rate limiting, and error handling for each external system. New integrations are added by deploying an MCP server. No changes to agent code or platform configuration required.
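
The idea of "typed tools discovered at startup" can be sketched in plain Python. The example below is a standard-library illustration of the pattern only: it is not the MCP SDK and does not implement the MCP wire protocol. `ToolRegistry`, `lookup_matter`, and the description format are hypothetical.

```python
import inspect
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical registry: tools register themselves, callers discover them by name."""

    def __init__(self) -> None:
        self._tools: dict = {}

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Decorator that registers a function under its own name."""
        self._tools[fn.__name__] = fn
        return fn

    def describe(self) -> dict:
        """Derive each tool's parameter names and types from its type hints."""
        described = {}
        for name, fn in self._tools.items():
            params = inspect.signature(fn).parameters.values()
            described[name] = {p.name: p.annotation.__name__ for p in params}
        return described

    def call(self, name: str, **kwargs: Any) -> Any:
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.tool
def lookup_matter(matter_id: str, include_closed: bool = False) -> dict:
    """Stand-in for a call to an internal system."""
    return {"matter_id": matter_id, "include_closed": include_closed}


available = registry.describe()
result = registry.call("lookup_matter", matter_id="M-1042")
```

The point of the sketch is that the agent side only needs `describe()` and `call()`; adding a tool does not change the caller.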

Microsoft 365 Integration

Microsoft 365 has dedicated support via the Graph API with delegated OAuth tokens. When a user asks an agent to search SharePoint or access OneDrive, the agent uses that user's existing permissions. No admin consent for broad access is required. No over-provisioned service accounts.

The channel layer supports Teams and Copilot Chat natively. Agents deployed to these channels receive messages through the Bot Framework and respond in the context of the user's Teams environment.

Channel Abstraction

Agents are channel-neutral. The same agent logic serves Teams, Copilot Chat, web UI, and API consumers without modification. Each channel has its own adapter:

Microsoft Teams

Bot Framework adapter. Adaptive cards for rich responses. Thread-aware conversations.

Copilot Chat

Microsoft 365 Copilot plugin. Agent tools surfaced as Copilot actions.

Web UI

Built-in chat interface. Review queue dashboard. Agent configuration and monitoring.

REST API

Programmatic access for system-to-system integration. Webhook callbacks for async workflows.
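
A minimal sketch of the channel abstraction: each adapter translates between a channel's payload and plain text, so the agent function never sees channel details. The payload shapes here are invented for illustration and are not the Bot Framework or CompleteFlow REST schemas. `ChannelAdapter`, `TeamsAdapter`, `RestAdapter`, and `echo_agent` are hypothetical names.

```python
from typing import Callable, Protocol

Agent = Callable[[str], str]  # assumption: an agent maps a prompt to a reply


class ChannelAdapter(Protocol):
    def parse(self, raw: dict) -> str: ...
    def render(self, reply: str) -> dict: ...


class TeamsAdapter:
    """Illustrative adapter; the dict shapes are placeholders."""

    def parse(self, raw: dict) -> str:
        return raw["text"]

    def render(self, reply: str) -> dict:
        return {"type": "message", "text": reply}


class RestAdapter:
    def parse(self, raw: dict) -> str:
        return raw["prompt"]

    def render(self, reply: str) -> dict:
        return {"reply": reply}


def handle(adapter: ChannelAdapter, agent: Agent, raw: dict) -> dict:
    """Channel-specific in, channel-specific out, channel-neutral in the middle."""
    return adapter.render(agent(adapter.parse(raw)))


def echo_agent(prompt: str) -> str:
    return f"Received: {prompt}"


teams_response = handle(TeamsAdapter(), echo_agent, {"text": "Summarise the lease"})
rest_response = handle(RestAdapter(), echo_agent, {"prompt": "Summarise the lease"})
```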

Comparison

CompleteFlow vs the alternatives for regulated environments

Four approaches to deploying AI in regulated industries. Each has trade-offs. This table presents the differences without editorialising. The right choice depends on your constraints, budget, and timeline.

Criterion | CompleteFlow | DIY (Open Source) | SaaS AI Tools | Palantir AIP
Time to production | 6 weeks | 12-18 months | 2-4 weeks | 6-12 months
Initial cost | From £40k | £200k-500k+ (team cost) | £20-1,000/user/month | £1M+/year
Data residency | Your infrastructure, any geography | Your infrastructure | Vendor cloud (usually US) | Your infrastructure
Audit trail depth | Two-level immutable logging, 7-year retention | Must build from scratch | Basic execution logs | Enterprise audit logging
Policy-as-code | OPA + Cedar, formally verified | Must build or integrate | Role-based only | Proprietary policy engine
Human-in-the-loop | Native gates with suspend/resume | Must build | Limited or none | Available
Reasoning traces | Every decision, exportable | Must instrument | Rarely available | Available
LLM flexibility | Multi-provider, swap without code changes | Full flexibility (if built) | Vendor-locked model | Multi-provider
Regulatory mapping | FCA, SRA, MiFID II, GDPR, EU AI Act | Must map yourself | GDPR only (if any) | Enterprise compliance
Vendor lock-in | Low. Open-source stack, full data export. Migration requires re-integrating workflows | None | High | High
Ongoing team required | Minimal. We manage the platform; your team manages agents | 3-5 FTE ML/platform engineers | 0.5 FTE | 2-4 FTE + Palantir engineers

Evaluation Checklist

What to look for in any AI agent platform for regulated industries

This checklist is vendor-neutral. Use it to evaluate CompleteFlow, competitors, or an in-house build. These are the questions that matter when AI decisions will be subject to regulatory scrutiny.

Data Sovereignty

  • Where do LLM API calls terminate? In your tenancy or the vendor's?
  • Can you deploy fully on-premises with no outbound internet?
  • Who holds the encryption keys? Can you bring your own?
  • Does the platform support your required data residency geography?
  • Is there a clear data processing agreement (DPA) with GDPR-compliant terms?

Governance and Audit

  • Does the audit trail capture the full decision path, or just execution logs?
  • Are audit records immutable and tamper-evident?
  • Can you export audit data to your existing SIEM or compliance tooling?
  • Is policy enforcement built into the execution pipeline or bolted on after?
  • Does the platform support configurable human-in-the-loop approval gates?

Regulatory Fit

  • Has the vendor mapped their controls to your specific regulatory framework?
  • Can reasoning traces be exported for regulatory review?
  • Does the platform support the record retention periods your regulator requires?
  • Is there bias monitoring and fairness measurement on AI outputs?
  • Can you demonstrate to a regulator exactly how each AI decision was made?

Architecture and Integration

  • Does the platform integrate with your existing identity provider (Entra ID, Okta, SAML)?
  • Can agents access your document management systems with user-level permissions?
  • Is the integration layer based on open standards (MCP, OAuth, REST)?
  • Can you swap LLM providers without rewriting agent logic?
  • Does the platform support multi-tenancy with database-level isolation?

Operations and Ownership

  • Do you own the agents and workflows that are built? Can you export them?
  • What does the platform require from your team to operate day-to-day?
  • Is there version control and rollback for agent configurations?
  • What observability is built in? Per-agent cost tracking? Distributed tracing?
  • What is the vendor's track record with regulated industries specifically?

FAQ

Technical evaluation questions

What deployment options does CompleteFlow support?
Three models: private cloud (your Azure, AWS, or GCP tenancy with container orchestration and private networking), on-premises (air-gapped deployment on your hardware with Docker Compose or Kubernetes, open-weight models only), and CompleteFlow Cloud (managed UK-hosted infrastructure). Hybrid configurations are supported for organisations that need on-premises data storage with cloud-based model inference.
How does CompleteFlow handle multi-tenancy?
PostgreSQL row-level security enforces tenant isolation at the database level. Each tenant's data is invisible to queries from other tenants. Tenant context is established at authentication and flows through every database operation automatically. Isolation does not depend on application-level filtering that could be bypassed.
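
For readers unfamiliar with row-level security, the sketch below generates the kind of PostgreSQL statements that enable it for a table. The statements use standard PostgreSQL syntax, but the table name, the `tenant_id` column, the `uuid` type, and the `app.tenant_id` setting are assumptions for illustration, not CompleteFlow's schema.

```python
def rls_statements(table: str, tenant_column: str = "tenant_id") -> list:
    """Build DDL that restricts every query on `table` to the current tenant.

    `table` and `tenant_column` must be trusted identifiers, never user input.
    """
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column} = current_setting('app.tenant_id')::uuid);",
    ]


# Run once per transaction after authentication; the tenant id is passed as a
# bound parameter. The third argument (true) scopes the setting to the transaction.
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', %s, true);"

statements = rls_statements("documents")
```

With a policy like this in place, a query such as `SELECT * FROM documents` returns only the current tenant's rows even if the application forgets a `WHERE` clause.
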
What happens if we want to leave CompleteFlow?
You own everything that is built on your infrastructure. Agent configurations, workflows, audit records, and all data are exportable in standard formats. There is no proprietary format lock-in. The platform uses open standards (MCP, OAuth, OpenTelemetry, PostgreSQL) throughout.
How does policy enforcement work at runtime?
Every agent request passes through OPA before execution. OPA evaluates Rego policies that define which users can invoke which agents with which parameters. Cedar provides a second layer for tool-level authorisation, determining which specific tools an agent can call for a given user. Both layers log their decisions to the audit trail. Policies are version-controlled and tested in CI before deployment.
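
The two-stage flow can be sketched in plain Python. This is a stand-in that mimics the order of checks and the audit logging described above; it does not call OPA or Cedar, and the `Policies` class, its fields, and the example names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policies:
    """In-memory stand-in for the two policy layers and the audit trail."""
    agent_access: dict               # user -> set of agents they may invoke
    tool_access: dict                # (user, agent) -> set of permitted tools
    audit: list = field(default_factory=list)

    def _log(self, layer: str, allowed: bool, **context: str) -> bool:
        # Every decision is recorded, whether it allows or denies.
        self.audit.append({"layer": layer, "allowed": allowed, **context})
        return allowed

    def may_invoke(self, user: str, agent: str) -> bool:
        """First layer (the role OPA plays above): can this user invoke this agent?"""
        allowed = agent in self.agent_access.get(user, set())
        return self._log("agent", allowed, user=user, agent=agent)

    def may_call_tool(self, user: str, agent: str, tool: str) -> bool:
        """Second layer (the role Cedar plays above): can the agent use this tool?"""
        allowed = tool in self.tool_access.get((user, agent), set())
        return self._log("tool", allowed, user=user, agent=agent, tool=tool)


def authorise(policies: Policies, user: str, agent: str, tool: str) -> bool:
    """The tool-level check only runs if the agent-level check passes."""
    return policies.may_invoke(user, agent) and policies.may_call_tool(user, agent, tool)


policies = Policies(
    agent_access={"alice": {"contracts"}},
    tool_access={("alice", "contracts"): {"search_sharepoint"}},
)
allowed = authorise(policies, "alice", "contracts", "search_sharepoint")
denied_tool = authorise(policies, "alice", "contracts", "delete_file")
denied_agent = authorise(policies, "bob", "contracts", "search_sharepoint")
```
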
What is the typical infrastructure footprint?
A standard private cloud deployment requires: 2-4 container instances (auto-scaling), 1 managed PostgreSQL instance, 1 Redis instance for caching, and network configuration (VNet/VPC with private endpoints). The total compute cost is typically £200-2,000 per month depending on usage volume, separate from CompleteFlow licence fees.
Can CompleteFlow agents call our internal APIs?
Yes. Any internal API can be exposed as an MCP server, which makes it available as a typed tool for agents. The MCP server handles authentication and schema definition. Alternatively, agents can call REST APIs directly through the HTTP tool with configurable authentication.
How do you handle LLM model updates and deprecations?
The model registry abstracts provider-specific model versions. When a provider deprecates a model, the registry maps the model tier (budget, standard, premium) to the replacement model. Agent code references tiers, not specific model versions, so provider changes require a registry update, not a code change.
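
A minimal sketch of tier-based resolution follows. The tier names come from the answer above; the model identifiers are deliberately fake placeholders, and `MODEL_REGISTRY` and `resolve_model` are hypothetical names rather than the platform's API.

```python
# Placeholder identifiers, not real model names.
MODEL_REGISTRY = {
    "budget": "provider-a/small-v1",
    "standard": "provider-a/medium-v1",
    "premium": "provider-b/large-v1",
}


def resolve_model(tier: str, registry: dict = MODEL_REGISTRY) -> str:
    """Map a tier to whichever concrete model the registry currently lists."""
    try:
        return registry[tier]
    except KeyError:
        raise ValueError(f"unknown model tier: {tier!r}") from None


before = resolve_model("standard")

# A provider deprecates the medium model: only the registry entry changes.
MODEL_REGISTRY["standard"] = "provider-a/medium-v2"
after = resolve_model("standard")
```

Agent code that asks for `"standard"` is untouched by the swap, which is the behaviour the registry is meant to guarantee.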

Ready for a technical walkthrough?

Book a 30-minute session with an engineer. Bring your CISO, your architect, your hardest questions. We will walk through the architecture on your terms.