Will Chen vibe-coded MikeOSS in two weeks and changed the conversation about legal AI. There has been a lot of enthusiasm from people keen for an alternative to Harvey and Legora, and a lot of criticism, mostly centred on the fact that it is still early and not ready for production. I would like to share some of my experience of how open source projects mature, and the boxes that law firms and enterprises need to check before they can adopt.
I am a big believer in open source. I have spent my career at Atchai helping Fortune 500 enterprises, government departments, non-profits and financial institutions adopt it. A lot of serious infrastructure runs on open source, and it can help buyers avoid vendor lock-in, manage exit risk, and get a much better cost structure when done right.
But getting open source past a regulated buyer’s assurance process is real work. You are answering for the quality of the code, the soundness of the architecture, maintainability, security posture, vulnerability response, the supply chain of every transitive dependency, the operational story (logging, monitoring, upgrade path), and the commercial wrap (who do I call at 2am, who indemnifies me if it breaks, who carries the cyber liability cover). None of that is hostile to open source. It is the cost of putting open source somewhere that matters.
Let’s unpack what enterprise ready actually means. I will share a few specifics you need to know if you are building for legal or any other regulated industry. It is also useful if you are a buyer or decision-maker who wants to do something a bit different, but you still need to make a responsible decision and manage technology risk.
The information security questionnaire says it all
A good demo wins you a meeting. The information security questionnaire wins the contract.
This process is owned by the IT department, who have the rather unenviable job of assessing the risk of every piece of software an organisation adopts. They serve a critical function, and when the software is open source they typically bear a greater burden than usual.
You must learn to love the IT department.
I have had the dubious pleasure of successfully completing numerous information security questionnaires over the years. For those of you who have not seen one of these mythical artefacts before, here is what they typically ask for.
01 — Information security governance
A documented Information Security Management System (ISMS), with a clear scope statement covering the products, infrastructure and staff involved. ISO/IEC 27001 certification, or a credible roadmap to it with a current Statement of Applicability. A named CISO or security lead, annual risk assessments, and a written information security policy reviewed at least yearly.
02 — Data protection
Client confidentiality and legal professional privilege come first, before any data protection regime. Can vendor staff read matter content? Can you ensure that privileged material stays privileged? Will the vendor commit not to respond to third-party disclosure requests (US discovery, foreign law enforcement, government subpoenas) without notifying the firm first?
Then the standard layer: an Article 28 Data Processing Agreement under UK GDPR for UK/EU work, alignment with the CCPA/CPRA and applicable state privacy laws in the US, evidenced data residency, a maintained sub-processor list with change notification, and valid international transfer mechanisms (UK IDTA, EU SCCs) with a transfer risk assessment.
03 — Identity and access control
SSO via SAML or OIDC, SCIM provisioning for joiner-mover-leaver flows, and per-user authentication to every downstream system rather than a shared service account. Matter-level access controls that enforce the firm’s ethical walls, role-based permissions, segregation of duties for administrative actions, and an attestable user access review cycle.
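To make the matter-level requirement concrete, here is a minimal sketch of how a deny-by-default check might combine role-based permissions with an ethical wall. The names (`User`, `Matter`, `can_access`, the `fee_earner` role) are hypothetical and chosen for illustration; they do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    user_id: str
    roles: frozenset[str]


@dataclass
class Matter:
    matter_id: str
    team: set[str] = field(default_factory=set)        # users staffed on the matter
    walled_off: set[str] = field(default_factory=set)  # users screened by an ethical wall


def can_access(user: User, matter: Matter, required_role: str = "fee_earner") -> bool:
    """Deny by default; an ethical wall overrides every other grant."""
    if user.user_id in matter.walled_off:
        return False
    if required_role not in user.roles:
        return False
    return user.user_id in matter.team
```

The point of the ordering is that the wall is checked first, so no role or team assignment, including an administrative one, can accidentally reopen access to a screened user.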
04 — Encryption and key management
TLS 1.2 or higher in transit and AES-256 at rest as a minimum. Customer-managed keys (CMK/BYOK) for firms that require them, HSM-backed key storage, FIPS 140-2 or 140-3 validated cryptographic modules where regulators expect them, and a documented key rotation and revocation policy.
05 — Hosting and infrastructure
Dedicated tenancy or, at minimum, strong logical isolation between customers. In-region data residency for the UK, EU or US as required, a named hyperscaler with regional commitments in writing, network segmentation, private endpoints, and no data leaving the chosen region for any reason including support or telemetry.
06 — Software architecture
Clean separation of concerns and documented service boundaries. Horizontal scalability for the components that matter, observability built in from day one rather than retrofitted, no shared mutable state across tenants, and row-level or equivalent enforcement at the data layer so a query bug cannot leak across customers.
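Enforcement at the data layer is usually a database feature (PostgreSQL row-level security policies are one example). The sketch below only imitates the idea with SQLite from the Python standard library: every query goes through a wrapper that is bound to a single tenant, so application code cannot forget the filter. The class, table and column names are assumptions made for illustration.

```python
import sqlite3


class TenantScopedStore:
    """Every read and write is bound to one tenant_id; callers cannot omit it."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add_document(self, title: str) -> None:
        self._conn.execute(
            "INSERT INTO documents (tenant_id, title) VALUES (?, ?)",
            (self._tenant_id, title),
        )

    def list_titles(self) -> list[str]:
        rows = self._conn.execute(
            "SELECT title FROM documents WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [title for (title,) in rows]


def make_connection() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, title TEXT NOT NULL)"
    )
    return conn
```

A wrapper like this is weaker than a policy enforced by the database itself, because anyone holding the raw connection can bypass it. It does show the property an assessor is looking for: tenant scoping that does not depend on each query being written correctly.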
07 — AI-specific controls
Two years ago this was one or two lines on a questionnaire. It is now a section in its own right. The questions cluster around four areas.
First, the data flow: signed DPAs with every model provider, written no-training commitments, and a clear picture of who in the chain (you, the cloud platform, the model provider, any fine-tuning partner) sees what.
Second, the controls on the model itself: version pinning, defences against prompt injection and jailbreaks, output citation and provenance, retention and logging of prompts and completions.
Third, the human layer: human-in-the-loop for consequential decisions, clear ownership of AI risk inside the vendor, and a published governance framework.
Fourth, regulatory alignment: the NIST AI Risk Management Framework, ISO/IEC 42001, and the EU AI Act provider and deployer obligations where the firm has European exposure.
08 — Secure development lifecycle
SAST and DAST in CI, dependency scanning with a defined remediation SLA, signed commits and signed releases, a code review policy that prevents single-author merges to main, secrets scanning on every push, and a Software Bill of Materials (SBOM) available to customers on request.
09 — Vulnerability and patch management
Defined CVE response SLAs by severity, annual third-party penetration tests with the report and remediation evidence available under NDA, a responsible disclosure policy with a published contact route, and a patch cadence that covers operating systems, base images, libraries and third-party services.
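A severity-based SLA is easy to state and easy to check mechanically. The following is a small sketch of that check; the day counts in `REMEDIATION_SLA_DAYS` are placeholder assumptions, not an industry standard, and the real figures would come from the vendor's own policy.

```python
from datetime import date, timedelta

# Hypothetical SLA targets in days; real values come from the vendor's policy.
REMEDIATION_SLA_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


def remediation_due(severity: str, disclosed_on: date) -> date:
    """Return the date by which a finding of this severity should be fixed."""
    return disclosed_on + timedelta(days=REMEDIATION_SLA_DAYS[severity.lower()])


def is_overdue(severity: str, disclosed_on: date, today: date) -> bool:
    """A finding is overdue once today is past its due date."""
    return today > remediation_due(severity, disclosed_on)
```

For example, with these assumed targets a critical finding disclosed on 1 January 2024 would be due on 8 January 2024.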
10 — Logging, monitoring, audit
Audit logs covering authentication, authorisation decisions, data access and configuration changes. Alignment with the regulatory and bar requirements that apply to the firm (SRA in the UK, state bar rules and ABA Model Rule 1.6 obligations in the US). User activity attribution down to the individual, and a matter-level audit trail that satisfies the firm’s own record-keeping duties.
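As an illustration of what one such audit record could carry, here is a minimal sketch of a structured event covering the four categories above, attributed to an individual user and optionally tied to a matter. The field names and the `record` helper are assumptions for illustration, not a required schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Category(str, Enum):
    AUTHENTICATION = "authentication"
    AUTHORISATION = "authorisation"
    DATA_ACCESS = "data_access"
    CONFIG_CHANGE = "config_change"


@dataclass(frozen=True)
class AuditEvent:
    category: Category
    user_id: str               # an individual, never a shared service account
    action: str
    outcome: str               # e.g. "allowed" or "denied"
    matter_id: Optional[str]   # None for events not tied to a matter
    timestamp: str             # ISO 8601, UTC

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def record(category: Category, user_id: str, action: str, outcome: str,
           matter_id: Optional[str] = None,
           now: Optional[datetime] = None) -> AuditEvent:
    """Build an event; `now` can be injected so tests are deterministic."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    return AuditEvent(category, user_id, action, outcome, matter_id, ts)
```

Recording denied attempts as well as allowed ones matters here, since an ethical wall is only demonstrable if the log shows it holding.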
11 — Business continuity and resilience
Stated RTO and RPO commitments backed by a tested DR plan. Graceful degradation paths for upstream model outages, and exactly-once execution semantics on expensive LLM calls so a retry does not double-bill or double-fire downstream side effects.
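In practice, exactly-once behaviour is usually approximated with idempotency keys: each logical request gets a stable key, and a retry with the same key replays the stored result instead of calling the model again. The sketch below assumes a hypothetical `call_model(model, prompt)` function and keeps results in memory. A real deployment would need a durable store, and would have to handle concurrent retries and crashes between the call and the write.

```python
import hashlib
import json
from typing import Callable


class IdempotentCaller:
    """Runs each distinct request at most once and replays the stored result on retry."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._results: dict[str, str] = {}  # in-memory only; not durable

    @staticmethod
    def key_for(request_id: str, model: str, prompt: str) -> str:
        """Derive a stable key from the caller's request id and the request content."""
        payload = json.dumps([request_id, model, prompt])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, request_id: str, model: str, prompt: str) -> str:
        key = self.key_for(request_id, model, prompt)
        if key not in self._results:
            # First time this key is seen: pay for the call and remember the answer.
            self._results[key] = self._call_model(model, prompt)
        return self._results[key]
```

The caller supplies `request_id`, so a deliberate re-run of the same prompt under a new id is still billed and executed, while an automatic retry under the same id is not.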
12 — Incident response
A named incident contact, breach notification SLAs that meet the applicable regime (72 hours under GDPR, varying state laws across the US, FTC and SEC obligations where relevant), playbooks for the most common scenarios, post-incident reporting with root cause and remediation, and a clear position on regulator liaison and customer notification.
13 — Third-party and supply chain risk
A current sub-processor list with change notification before any addition or replacement. Flow-down obligations into every sub-processor contract, a meaningful right to audit, and an exit assistance commitment covering data return, format and timeline.
14 — Certifications and assurance
SOC 2 Type II and ISO/IEC 27001 are the most widely recognised. Cyber Essentials Plus is typically required for UK firms. Certifications attest to process, not product quality.
15 — Contractual and liability
A Master Services Agreement and Data Processing Agreement that the firm’s GC can actually sign. Professional indemnity insurance and cyber liability cover at limits proportionate to the engagement, IP indemnity covering model outputs, source code escrow for business-critical deployments, and a defined exit clause covering data return and deletion.
On software architecture and design
Having been through this process with law firms and enterprise legal departments, I can say a bit more about what they are specifically looking for in legal AI. That will come in a separate post.
If this is useful, let me know. Happy to go deeper on enterprise adoption of open source, or on the legal AI requirements specifically.