Technical Evaluation
The technical deep dive your team needs before recommending an AI agent platform
Architecture, deployment models, security controls, compliance mapping, and an honest comparison against alternatives. Everything a CTO, CISO, or Head of IT needs for due diligence.
Architecture
Four independent layers, each replaceable
CompleteFlow separates concerns into four layers. Each layer has a well-defined interface and can be modified, replaced, or extended without affecting the others. This is not a monolith with "modular" in the marketing copy. The layers are independently deployable and independently testable.
Channel Layer
Accepts requests from Microsoft Teams, Copilot Chat, web UI, and REST API. Each channel adapter handles authentication, message formatting, and response delivery. Agents are channel-neutral. The same agent serves every channel without modification.
Agent Layer
PydanticAI agents with type-safe tool definitions and structured output, backed by an idempotent workflow engine for multi-step processes. Workflows support parallel fan-out, conditional routing, human gates, scheduling, and automatic retry with state recovery. Agents auto-discover registered workflows and expose them as conversational tools.
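As a rough illustration of what "idempotent" and "automatic retry with state recovery" can mean in practice, the sketch below runs a workflow step at most once and retries transient failures. It is not CompleteFlow's engine: `WorkflowState`, `run_step`, and `flaky_fetch` are hypothetical names, and an in-memory dict stands in for the persisted state.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WorkflowState:
    """Results of finished steps, keyed by step name (a dict here; a database row in practice)."""
    completed: dict = field(default_factory=dict)


def run_step(state: WorkflowState, name: str, fn: Callable[[], object],
             max_attempts: int = 3) -> object:
    """Run a step at most once per workflow and retry transient failures."""
    if name in state.completed:
        # Idempotency: a step that already finished is never executed again.
        return state.completed[name]
    last_error = None
    for _ in range(max_attempts):
        try:
            result = fn()
        except Exception as exc:  # a real engine would retry only known-transient errors
            last_error = exc
            continue
        state.completed[name] = result  # checkpoint before moving on
        return result
    raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts") from last_error


# Usage: the first call fails once and is retried; the second call is served from state.
calls = {"n": 0}

def flaky_fetch() -> str:
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient")
    return "document-123"

state = WorkflowState()
first = run_step(state, "fetch", flaky_fetch)
second = run_step(state, "fetch", flaky_fetch)
```

Because completed results are checkpointed by step name, a workflow that restarts after a crash can replay from the top and skip everything that already ran.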
Governance Layer
Every agent action passes through the governance layer before execution and after completion. OPA evaluates agent-level access control. Cedar handles fine-grained tool-level authorisation with formal verification. The audit logger records every decision at configurable detail levels.
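The before-and-after pattern described above can be sketched as a wrapper around each tool call. This is a minimal illustration under stated assumptions: `check_policy` is a hypothetical stand-in for an OPA or Cedar query, and `audit_log` is an in-memory list rather than the platform's audit logger.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    allowed: bool
    reason: str


audit_log: list = []  # stand-in for a durable audit store


def check_policy(user: str, tool: str) -> Decision:
    """Hypothetical policy check; a real deployment would query OPA or Cedar here."""
    allowed_tools = {"alice": {"search_documents"}}
    if tool in allowed_tools.get(user, set()):
        return Decision(True, "tool permitted for user")
    return Decision(False, "no matching permit rule")


def governed_call(user: str, tool: str, fn: Callable[[], str]) -> str:
    """Evaluate policy before execution and record an audit entry before and after."""
    decision = check_policy(user, tool)
    audit_log.append({"phase": "before", "user": user, "tool": tool,
                      "allowed": decision.allowed, "reason": decision.reason})
    if not decision.allowed:
        raise PermissionError(decision.reason)
    result = fn()
    audit_log.append({"phase": "after", "user": user, "tool": tool,
                      "status": "completed"})
    return result


# Usage: a permitted call produces two audit entries, one per phase.
result = governed_call("alice", "search_documents", lambda: "3 results")
```

A denied call still leaves a "before" entry, so refusals are as visible in the trail as successful actions.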
Infrastructure Layer
PostgreSQL 16 with pgvector for data persistence and vector search. Row-level security enforces tenant isolation at the database level. OpenTelemetry provides distributed tracing with per-agent cost attribution. All components deploy as containers with infrastructure-as-code.
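For readers unfamiliar with PostgreSQL row-level security, the sketch below renders the kind of statements that enforce tenant isolation on a table. The table name, the `tenant_id` column, and the `app.tenant_id` session setting are assumptions for illustration, not CompleteFlow's actual schema.

```python
import re

_IDENT = re.compile(r"[a-z_][a-z0-9_]*")
_SETTING = re.compile(r"[a-z_][a-z0-9_]*\.[a-z_][a-z0-9_]*")


def tenant_isolation_sql(table: str, tenant_column: str = "tenant_id",
                         setting: str = "app.tenant_id") -> list:
    """Render the statements that restrict a table to the current tenant's rows."""
    # Identifiers are interpolated into SQL, so accept only plain lowercase names.
    for ident in (table, tenant_column):
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    if not _SETTING.fullmatch(setting):
        raise ValueError(f"unsafe setting name: {setting!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column} = current_setting('{setting}')::uuid);",
    ]


# Usage: print the statements for a hypothetical "documents" table.
for statement in tenant_isolation_sql("documents"):
    print(statement)
```

With a policy of this shape, each connection sets `app.tenant_id` at the start of a request, and queries that forget a tenant filter still return only that tenant's rows.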
Deployment
Three deployment models, each with clear trade-offs
The right deployment model depends on your regulatory constraints, existing infrastructure, and operational capacity. Here is exactly what each option involves.
Private Cloud
Typical timeline: 4-6 weeks
Infrastructure
Your Azure, AWS, or GCP tenancy
Networking
Hub-spoke VNet/VPC with private endpoints. No public internet exposure for the data plane. Management plane accessible via VPN or private link.
Compute
Azure Container Apps, AWS ECS/Fargate, or GCP Cloud Run. Auto-scaling based on agent workload.
LLM Providers
Azure OpenAI (private endpoint), Anthropic API (via private link), or self-hosted open-weight models in your tenancy. Commercial API tiers. Your data is never used for model training.
Data Layer
PostgreSQL managed service (Azure Database for PostgreSQL, RDS, Cloud SQL) with customer-managed encryption keys.
Identity
Microsoft Entra ID with delegated OAuth. Agents inherit requesting user permissions.
Monitoring
OpenTelemetry to Azure Monitor, CloudWatch, or your existing SIEM.
On-Premises
Typical timeline: 6-10 weeks
Infrastructure
Your own hardware or private data centre
Networking
Fully air-gapped. No outbound internet required. All components run within your network perimeter.
Compute
Docker Compose for single-node. Kubernetes (any distribution) for multi-node. Bare metal or VM deployment.
LLM Providers
Open-weight models only (Llama, Mistral). Runs on your GPU infrastructure. No external API calls.
Data Layer
Self-managed PostgreSQL with your encryption and backup policies. Full control over data lifecycle.
Identity
On-premises Active Directory or SAML 2.0 identity provider. Local auth fallback available.
Monitoring
OpenTelemetry export to your existing monitoring stack (Grafana, Splunk, ELK).
CompleteFlow Cloud
Typical timeline: 2-3 weeks
Infrastructure
UK-hosted private cloud managed by Atchai
Networking
Dedicated tenant environment. UK data residency guaranteed. SOC 2 controls applied.
Compute
Managed container infrastructure with automatic scaling and patching.
LLM Providers
Anthropic Claude, OpenAI, and Azure OpenAI. Commercial API tiers. Your data is never used for model training. Model selection per agent.
Data Layer
Managed PostgreSQL with encryption at rest. Data residency in UK geography.
Identity
Microsoft Entra ID SSO. Federated identity with your existing directory.
Monitoring
Built-in dashboard with agent metrics, cost tracking, and audit log viewer. Export to your SIEM available.
Security Model
Defence in depth, not perimeter-only
Security controls are applied at every layer: network, identity, encryption, policy, and audit. No control is a single point of failure.
| Domain | Control | Detail |
|---|---|---|
| Encryption at rest | AES-256 for all stored data | Customer-managed keys via Azure Key Vault, AWS KMS, or HashiCorp Vault. Key rotation policies enforced. |
| Encryption in transit | TLS 1.3 for all connections | Certificate management automated. Internal service mesh with mTLS between components. |
| Network isolation | Private endpoints, no public data plane | Hub-spoke network topology. Management and data planes separated. NSG/security group rules restrict lateral movement. |
| Identity and access | Delegated OAuth via Microsoft Entra ID | Agents inherit requesting user permissions. No over-provisioned service accounts. JIT user provisioning. OPA enforces who can invoke which agents. |
| Key management | Customer-held encryption keys | CompleteFlow never holds customer keys. Bring-your-own-key (BYOK) supported across all deployment models. |
| Secrets management | Vault-based secret storage | API keys, tokens, and credentials stored in Azure Key Vault, AWS Secrets Manager, or HashiCorp Vault. No secrets in code or configuration files. |
| Audit logging | Immutable two-level audit trail | Minimal level: agent ID, user, action, model, tokens, cost, policy decisions. Maximal level: full prompt and response. 7-year default retention. Tamper-evident with cryptographic chaining. |
| Policy enforcement | OPA + Cedar policy-as-code | Rego policies version-controlled and tested in CI. Cedar policies formally verified. Every policy decision logged with full evaluation context. |
| Penetration testing | Annual third-party assessment | CREST-accredited providers. Continuous vulnerability scanning. Responsible disclosure programme. |
| Data residency | Deploy in any geography | UK (Azure UK South, AWS eu-west-2), EU, or any region. Data does not leave the designated geography. |
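The "tamper-evident with cryptographic chaining" property in the audit logging row can be illustrated with a simple hash chain. This is a generic sketch of the technique, not CompleteFlow's log format; the record fields and the `GENESIS` value are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers its payload and the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"payload": payload, "prev_hash": prev_hash,
              "hash": _digest(prev_hash, payload)}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited, removed, or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


# Usage: a clean chain verifies; editing an earlier record is detected.
chain = []
append_record(chain, {"agent": "contracts", "action": "search", "tokens": 812})
append_record(chain, {"agent": "contracts", "action": "summarise", "tokens": 1540})
intact = verify_chain(chain)
chain[0]["payload"]["tokens"] = 1
tampered_detected = not verify_chain(chain)
```

Chaining makes tampering evident rather than impossible, so the latest hash still needs to be anchored somewhere an attacker cannot rewrite.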
Compliance Mapping
How CompleteFlow maps to regulatory frameworks
Compliance is not a feature that gets added later. These controls are built into the agent execution pipeline. Below is a mapping of specific regulatory requirements to the CompleteFlow controls that address them.
FCA Consumer Duty
| Requirement | CompleteFlow Control |
|---|---|
| Demonstrate outcomes monitoring | Immutable audit trails capture every AI decision with reasoning traces, confidence scores, and outcome data |
| Evidence of fair treatment | Output logging for bias review and audit. Human review gates for high-stakes decisions |
| Explainability of AI decisions | Reasoning traces export showing tool calls, data sources, confidence scores, and the decision path for each output |
| Governance and oversight | OPA and Cedar policy-as-code with version control. Configurable human-in-the-loop approval gates |
SRA Standards and Regulations
| Requirement | CompleteFlow Control |
|---|---|
| Client data confidentiality | Private deployment on firm infrastructure. Delegated OAuth ensures agents only access what the requesting user can access |
| Competence and supervision | Human-in-the-loop review for all client-facing outputs. Confidence thresholds trigger mandatory review |
| Record keeping | Two-level audit logging with 7-year retention. Every agent interaction recorded with full context |
| Technology risk management | Version-controlled agent configurations. Rollback capability. Sandbox testing before production deployment |
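The review rules in the competence and supervision row can be read as a simple routing decision. The sketch below is one way to express them; the `AgentOutput` fields, the 0.8 threshold, and the list-based review queue are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentOutput:
    text: str
    confidence: float  # 0.0 to 1.0, as reported by the agent
    client_facing: bool


def needs_human_review(output: AgentOutput, threshold: float = 0.8) -> bool:
    """Client-facing outputs are always reviewed; others only below the threshold."""
    return output.client_facing or output.confidence < threshold


def route(output: AgentOutput, review_queue: list, threshold: float = 0.8) -> str:
    """Hold an output for a human reviewer or release it directly."""
    if needs_human_review(output, threshold):
        review_queue.append(output)
        return "held_for_review"
    return "released"


# Usage: an internal, high-confidence note is released; a client letter is held.
queue = []
internal = route(AgentOutput("Internal research note", 0.93, client_facing=False), queue)
letter = route(AgentOutput("Draft letter to client", 0.97, client_facing=True), queue)
```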
MiFID II
| Requirement | CompleteFlow Control |
|---|---|
| Transaction reporting | Per-agent, per-user activity logging with timestamps. Exportable audit records in standard formats |
| Best execution documentation | Reasoning traces capture the full decision path including data sources consulted and alternatives considered |
| Algorithmic trading controls | Rate limits, circuit breakers, and mandatory human gates for actions above configurable thresholds |
| Record retention (5 years) | Configurable retention periods. Default 7-year retention exceeds the MiFID II 5-year requirement |
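To make the circuit-breaker control in the algorithmic trading row concrete, here is a minimal sketch of the general pattern: after a run of consecutive failures, further actions are blocked until a cool-down elapses. The class name, defaults, and injectable clock are assumptions for illustration.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Block calls after repeated failures until a cool-down period has passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the cool-down can be tested without sleeping
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], object]) -> object:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: action blocked")
            # Cool-down elapsed: close the circuit and allow a trial call.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

A human gate can sit behind the same interface: instead of waiting for a timer, the breaker stays open until an authorised user resets it.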
GDPR
| Requirement | CompleteFlow Control |
|---|---|
| Data minimisation | Agents process only the data required for each task. No persistent caching of personal data beyond the task lifecycle |
| Right to erasure | Tenant-level and user-level data deletion with cascade. Audit records anonymised (not deleted) to preserve regulatory trail |
| Data portability | Full data export in standard formats. No proprietary data lock-in |
| Breach notification | Automated anomaly detection with alerting. Incident response procedures with 72-hour notification support |
EU AI Act
| Requirement | CompleteFlow Control |
|---|---|
| Risk classification | Agent risk tiers mapped to EU AI Act categories. High-risk agents receive additional governance controls automatically |
| Transparency obligations | Users are informed when interacting with an AI agent. Reasoning traces provide full transparency of AI decision-making |
| Human oversight | Configurable human-in-the-loop gates. High-risk decisions require explicit human approval before execution |
| Technical documentation | Automated generation of technical documentation per agent: capabilities, limitations, training data lineage, and performance metrics |
Integration Architecture
MCP-first, not API-spaghetti
CompleteFlow uses the Model Context Protocol (MCP) as its primary integration standard. MCP is an open standard for connecting AI agents to external tools and data sources. Any system with an MCP server can be connected to CompleteFlow agents without custom integration code.
MCP Server Architecture
Each external system is represented by an MCP server that exposes typed tools to CompleteFlow agents. The agent runtime discovers available tools at startup and makes them available as callable functions. MCP servers run within your network perimeter, so no data leaves your infrastructure during tool calls.
MCP servers handle authentication, rate limiting, and error handling for each external system. New integrations are added by deploying an MCP server. No changes to agent code or platform configuration required.
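The discovery step described above can be sketched without any SDK. The code below mirrors the general shape of MCP tool listing but does not use the MCP protocol or its libraries; `ToolSpec`, `ToolServer`, `CrmServer`, and the tool name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    handler: Callable[..., object]


class ToolServer(Protocol):
    """Anything that can list the tools it exposes."""
    def list_tools(self) -> list: ...


class CrmServer:
    """Hypothetical server wrapping an internal CRM."""
    def list_tools(self) -> list:
        return [ToolSpec("crm.lookup_client", "Find a client by ID",
                         lambda client_id: {"id": client_id, "name": "Example Ltd"})]


def discover_tools(servers: list) -> dict:
    """Build the agent's tool registry at startup from every configured server."""
    registry = {}
    for server in servers:
        for tool in server.list_tools():
            if tool.name in registry:
                raise ValueError(f"duplicate tool name: {tool.name}")
            registry[tool.name] = tool
    return registry


# Usage: adding an integration means adding a server; agent code is unchanged.
registry = discover_tools([CrmServer()])
client = registry["crm.lookup_client"].handler("C-42")
```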
Microsoft 365 Integration
Microsoft 365 has dedicated support via the Graph API with delegated OAuth tokens. When a user asks an agent to search SharePoint or access OneDrive, the agent uses that user's existing permissions. No admin consent for broad access is required. No over-provisioned service accounts.
The channel layer supports Teams and Copilot Chat natively. Agents deployed to these channels receive messages through the Bot Framework and respond in the context of the user's Teams environment.
Channel Abstraction
Agents are channel-neutral. The same agent logic serves Teams, Copilot Chat, web UI, and API consumers without modification. Each channel has its own adapter:
Microsoft Teams
Bot Framework adapter. Adaptive cards for rich responses. Thread-aware conversations.
Copilot Chat
Microsoft 365 Copilot plugin. Agent tools surfaced as Copilot actions.
Web UI
Built-in chat interface. Review queue dashboard. Agent configuration and monitoring.
REST API
Programmatic access for system-to-system integration. Webhook callbacks for async workflows.
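One way to picture the channel abstraction is a pair of adapters that translate channel-specific payloads to and from a shared request type, with a single agent function behind both. The payload shapes below are simplified and hypothetical, and the adapter classes are illustrative rather than CompleteFlow's own.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class AgentRequest:
    user_id: str
    text: str


class ChannelAdapter(Protocol):
    def parse(self, raw: dict) -> AgentRequest: ...
    def render(self, reply: str) -> dict: ...


class TeamsAdapter:
    """Simplified stand-in for a Bot Framework style activity payload."""
    def parse(self, raw: dict) -> AgentRequest:
        return AgentRequest(raw["from"]["id"], raw["text"])

    def render(self, reply: str) -> dict:
        return {"type": "message", "text": reply}


class RestAdapter:
    """Simplified stand-in for a JSON API request body."""
    def parse(self, raw: dict) -> AgentRequest:
        return AgentRequest(raw["user_id"], raw["prompt"])

    def render(self, reply: str) -> dict:
        return {"reply": reply}


def agent(request: AgentRequest) -> str:
    """The agent sees only AgentRequest, never a channel-specific payload."""
    return f"Hello {request.user_id}, you asked: {request.text}"


def handle(adapter: ChannelAdapter, raw: dict) -> dict:
    return adapter.render(agent(adapter.parse(raw)))


# Usage: the same agent answers both channels.
teams_reply = handle(TeamsAdapter(), {"from": {"id": "u1"}, "text": "status?"})
rest_reply = handle(RestAdapter(), {"user_id": "u1", "prompt": "status?"})
```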
Comparison
CompleteFlow vs the alternatives for regulated environments
Four approaches to deploying AI in regulated industries. Each has trade-offs. This table presents the differences without editorialising. The right choice depends on your constraints, budget, and timeline.
| Criterion | CompleteFlow | DIY (Open Source) | SaaS AI Tools | Palantir AIP |
|---|---|---|---|---|
| Time to production | 2-10 weeks, depending on deployment model | 12-18 months | 2-4 weeks | 6-12 months |
| Initial cost | From £40k | £200k-500k+ (team cost) | £20-1,000/user/month | £1M+/year |
| Data residency | Your infrastructure, any geography | Your infrastructure | Vendor cloud (usually US) | Your infrastructure |
| Audit trail depth | Two-level immutable logging, 7-year retention | Must build from scratch | Basic execution logs | Enterprise audit logging |
| Policy-as-code | OPA + Cedar, formally verified | Must build or integrate | Role-based only | Proprietary policy engine |
| Human-in-the-loop | Native gates with suspend/resume | Must build | Limited or none | Available |
| Reasoning traces | Every decision, exportable | Must instrument | Rarely available | Available |
| LLM flexibility | Multi-provider, swap without code changes | Full flexibility (if built) | Vendor-locked model | Multi-provider |
| Regulatory mapping | FCA, SRA, MiFID II, GDPR, EU AI Act | Must map yourself | GDPR only (if any) | Enterprise compliance |
| Vendor lock-in | Low. Open-source stack, full data export. Migration requires re-integrating workflows | None | High | High |
| Ongoing team required | Minimal. We manage the platform; your team manages agents | 3-5 FTE ML/platform engineers | 0.5 FTE | 2-4 FTE + Palantir engineers |
Evaluation Checklist
What to look for in any AI agent platform for regulated industries
This checklist is vendor-neutral. Use it to evaluate CompleteFlow, competitors, or an in-house build. These are the questions that matter when AI decisions will be subject to regulatory scrutiny.
Data Sovereignty
- ▢ Where do LLM API calls terminate? In your tenancy or the vendor's?
- ▢ Can you deploy fully on-premises with no outbound internet?
- ▢ Who holds the encryption keys? Can you bring your own?
- ▢ Does the platform support your required data residency geography?
- ▢ Is there a clear data processing agreement (DPA) with GDPR-compliant terms?
Governance and Audit
- ▢ Does the audit trail capture the full decision path, or just execution logs?
- ▢ Are audit records immutable and tamper-evident?
- ▢ Can you export audit data to your existing SIEM or compliance tooling?
- ▢ Is policy enforcement built into the execution pipeline or bolted on after?
- ▢ Does the platform support configurable human-in-the-loop approval gates?
Regulatory Fit
- ▢ Has the vendor mapped their controls to your specific regulatory framework?
- ▢ Can reasoning traces be exported for regulatory review?
- ▢ Does the platform support the record retention periods your regulator requires?
- ▢ Is there bias monitoring and fairness measurement on AI outputs?
- ▢ Can you demonstrate to a regulator exactly how each AI decision was made?
Architecture and Integration
- ▢ Does the platform integrate with your existing identity provider (Entra ID, Okta, SAML)?
- ▢ Can agents access your document management systems with user-level permissions?
- ▢ Is the integration layer based on open standards (MCP, OAuth, REST)?
- ▢ Can you swap LLM providers without rewriting agent logic?
- ▢ Does the platform support multi-tenancy with database-level isolation?
Operations and Ownership
- ▢ Do you own the agents and workflows that are built? Can you export them?
- ▢ What does the platform require from your team to operate day-to-day?
- ▢ Is there version control and rollback for agent configurations?
- ▢ What observability is built in? Per-agent cost tracking? Distributed tracing?
- ▢ What is the vendor's track record with regulated industries specifically?
FAQ
Technical evaluation questions
What deployment options does CompleteFlow support?
How does CompleteFlow handle multi-tenancy?
What happens if we want to leave CompleteFlow?
How does policy enforcement work at runtime?
What is the typical infrastructure footprint?
Can CompleteFlow agents call our internal APIs?
How do you handle LLM model updates and deprecations?
Ready for a technical walkthrough?
Book a 30-minute session with an engineer. Bring your CISO, your architect, your hardest questions. We will walk through the architecture on your terms.