Kirkland on Cloudflare — The cloud control plane for $500M of legal AI infrastructure

Kirkland is building the model. Cloudflare is the everything-in-front-of-it.

Fine-tuning a legal LLM is hard, expensive, sensitive work. The boring parts — routing, caching, observability, tenant isolation, attorney access from 17 offices, training-data movement — shouldn't be hard too. They shouldn't even be on the critical path.

Kirkland builds

The model itself and the workflows that wrap it

Fine-tuned open-source LLMs running on on-prem GPU clusters and Azure AI, governed by the Innovation Program, embedded into IP, Litigation, Restructuring, and Transactional practice groups by ~85 AI Advisors.

Domain-tuned weights on Kirkland matter data
Practice-group-specific prompt libraries and workflows
"Governed environments for rapid experimentation" (per the AI Director JD)
Client-facing AI-enhanced legal services

Cloudflare runs

The control plane between attorneys, models, and matter data

The edge you already deploy on kirkland.com extends to a single, governed control plane for every call into every model — on-prem, Azure, or third-party — with per-matter tenancy, cost attribution, and Zero Trust access baked in.

AI Gateway in front of GPU clusters + Azure AI + Anthropic + others
Workers for Platforms — per-matter / per-client tenant isolation
Vectorize + R2 for the legal corpus and zero-egress training movement
Zero Trust access to GPU envs from all 17 offices, including Tokyo

Six plays that fit this specific buildout.

Each one maps directly to a sentence in your published job descriptions or the public reporting on the project. None of them require Kirkland to rip anything out or pause the model work.

PLAY 01 / GOVERNED INFERENCE

AI Gateway in front of every model the firm uses.

One audited, rate-limited, cached, logged hop in front of every model: your fine-tuned on-prem model, Azure AI, Anthropic (already TXT-verified on your apex domain), and, during the transition, Harvey, Legora, CoCounsel, and Lexis+ AI.

Why now: The AI Director JD specifically calls for "secure, governed environments that enable rapid experimentation." AI Gateway is that governance layer for inference traffic — without slowing the Innovation Program down.

AI Gateway · Observability · Per-app keys · ~30–60% LLM spend
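A minimal sketch of what "one hop in front of every model" looks like from the caller's side, assuming the AI Gateway URL pattern and the cf-aig-metadata and cf-aig-cache-ttl request headers as we understand them from the public AI Gateway documentation. ACCOUNT_ID, the gateway name kirkland-inference, and the metadata keys are hypothetical placeholders, not values from your environment.

```python
import json

# Hypothetical identifiers for illustration only; substitute real values.
ACCOUNT_ID = "ACCOUNT_ID"
GATEWAY_ID = "kirkland-inference"

GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"


def gateway_url(provider: str, path: str) -> str:
    """Build the gateway URL that stands in for a provider's own base URL."""
    return f"{GATEWAY_BASE}/{ACCOUNT_ID}/{GATEWAY_ID}/{provider}/{path.lstrip('/')}"


def attribution_headers(practice_group: str, matter_id: str, cache_ttl_s: int) -> dict:
    """Headers that tag a request for the gateway log and opt it into caching."""
    return {
        # Custom metadata is recorded alongside the request in the gateway logs.
        "cf-aig-metadata": json.dumps(
            {"practice_group": practice_group, "matter_id": matter_id}
        ),
        # How long (in seconds) the gateway may serve a cached response.
        "cf-aig-cache-ttl": str(cache_ttl_s),
    }


url = gateway_url("anthropic", "/v1/messages")
headers = attribution_headers("ip-litigation", "matter-c", cache_ttl_s=3600)
print(url)
print(headers)
```

The application keeps its existing provider SDK and API key; only the base URL and a few headers change, which is why the gateway can be introduced without pausing the model work.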

PLAY 02 / PER-MATTER ISOLATION

Workers for Platforms = a sandbox per matter, per client.

Every matter or client gets its own tenant — its own keys, its own egress policy, its own logs, its own model routing rules. Privilege is enforced by infrastructure, not by checklist.

Why now: Privacy is the differentiator Artificial Lawyer flagged in their analysis. "Even if outputs are only marginally better, what Kirkland could have is increased privacy." Per-matter tenancy makes that a hard architectural claim instead of a marketing one.

Workers for Platforms · Durable Objects · Privilege · Audit-ready by default

PLAY 03 / LEGAL CORPUS

Vectorize + R2 for precedent, deal docs, and matter history.

Native vector DB for retrieval over your fine-tuning corpus and live RAG, with R2 holding the raw documents. R2 charges no egress fees when training data is read out of R2 to the GPU clusters or Azure ML.

Why now: Fine-tuning a domain LLM moves enormous volumes of data in both directions. R2's zero-egress pricing is the difference between "a known monthly cost" and "an object-storage egress bill that surprises the CFO every month for three years."

Vectorize · R2 · Zero egress · 40–60% storage egress
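The "known monthly cost" point is plain arithmetic, sketched here under stated assumptions. The 0.09 USD per GB metered rate is a hypothetical stand-in for a typical cloud internet-egress list price, and the monthly volumes are illustrative; neither comes from Kirkland's environment. The only figure taken from the text above is R2's egress rate of zero.

```python
def monthly_egress_cost(tb_read_out: float, usd_per_gb: float) -> float:
    """Egress charge for reading `tb_read_out` terabytes out of storage in a month."""
    return tb_read_out * 1000 * usd_per_gb  # decimal TB converted to GB


ASSUMED_METERED_RATE = 0.09  # USD per GB; hypothetical list price, not a quote
R2_EGRESS_RATE = 0.0         # R2 does not charge for egress

for tb in (10, 50, 200):  # illustrative monthly training-data volumes
    metered = monthly_egress_cost(tb, ASSUMED_METERED_RATE)
    r2 = monthly_egress_cost(tb, R2_EGRESS_RATE)
    print(f"{tb:>4} TB/month  metered: ${metered:,.0f}  R2: ${r2:,.0f}")
```

The metered line scales with every re-read of the corpus, which is exactly what fine-tuning iterations do; the R2 line stays flat. Storage and request charges still apply on both sides and are left out of this sketch.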

PLAY 04 / WORKFLOW ORCHESTRATION

Workers + Workflows for the 85 AI Advisor playbooks.

Your AI Advisors are hired to "translate legal tasks and workflows into scoped AI solutions." Workers + Workflows is the durable, replayable, observable runtime those scoped solutions need — without standing up Kubernetes for every practice group.

Why now: 85 AI Advisor seats × dozens of workflows each = well over a thousand small services. Each one needs to be governed, restartable, and auditable. Workers + Workflows gives you that without spinning up infra per workflow.

Workers · Workflows · Durable Objects · No infra-per-workflow
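"Durable, replayable" has a concrete meaning: each step's result is checkpointed, so a workflow that fails halfway resumes from the last completed step instead of starting over. The sketch below illustrates that idea in plain Python with a JSON file as the checkpoint store. It is not the Cloudflare Workflows API; run_workflow and the step names in the usage note are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def run_workflow(state_file: Path, steps: list[tuple[str, Callable[[dict], object]]]) -> dict:
    """Run named steps in order, persisting each result so a rerun can resume."""
    results = json.loads(state_file.read_text()) if state_file.exists() else {}
    for name, step in steps:
        if name in results:
            continue  # completed in an earlier run; reuse the stored result
        results[name] = step(results)  # a step may read earlier steps' results
        state_file.write_text(json.dumps(results))  # checkpoint after every step
    return results
```

An AI Advisor playbook such as extract, then summarize, then route for partner review would be three named steps. If the summarize step fails on a model timeout, rerunning the workflow skips extract and retries from summarize, and the checkpoint file doubles as an audit trail of what ran.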

PLAY 05 / ATTORNEY ACCESS

Zero Trust to the GPU clusters from all 17 offices.

The on-prem GPU environment is a high-value target. Cloudflare Access + Tunnel mean every authorized attorney, AI Advisor, and infra engineer reaches it through the same identity-aware proxy — including the just-announced Tokyo office on day one.

Why now: Wiz is already verified on your apex domain (TXT record). Cloudflare Zero Trust layers cleanly on top: Wiz watches cloud posture, Zero Trust enforces identity-aware access. The on-prem GPU clusters are the asset that most needs that boundary.

Access · Tunnel · Identity-aware proxy · Replaces VPN sprawl

PLAY 06 / GLOBAL EDGE FOR CLIENTS

The same edge already in front of kirkland.com.

You're already a Cloudflare customer. Extending the same edge to client-facing AI-enhanced legal services — which the AI Advisor JD explicitly mentions — is a roadmap move, not a procurement event.

Why now: No re-papering. No new vendor security review. The MSA, the SOC reports, the data-processing addenda — they already exist. The Innovation Program can ship faster because the underlying edge is already approved.

Existing relationship · Faster procurement · Same SOC · Zero net-new vendor risk

Privilege is a tenancy problem, not a policy problem.

Workers for Platforms lets you spin a fresh, isolated tenant per matter — own keys, own egress, own logs, own model-routing rules — all from the same shared control plane. The boundary is enforced by the infrastructure, which is what regulators, clients, and your General Counsel actually want to hear.

Per-matter tenant isolation, sketched

Each matter runs as its own isolated Worker inside a Workers for Platforms dispatch namespace. Same edge, same observability, completely isolated state and egress.

MATTER A · M&A: Project Atlas · tenant-a.workers.dev

MATTER B · Restructuring: Co. X · tenant-b.workers.dev

MATTER C · IP Litigation: Patent 5,672 · tenant-c.workers.dev

MATTER D · PE Fund Formation · tenant-d.workers.dev

↓

Shared control plane — AI Gateway + Workers for Platforms + Zero Trust

one routing layer · one observability surface · zero cross-matter leakage by construction
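The diagram above can be read as a data structure: one record per matter holding its own key reference, egress allow-list, and model-routing rules, with the shared control plane refusing anything the record does not permit. This is a language-neutral sketch of that idea, not the Workers for Platforms API. MatterTenant, route_request, the host names, and the model names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MatterTenant:
    """Everything scoped to a single matter (all values hypothetical)."""
    matter_id: str
    key_ref: str                     # reference to this matter's own API key
    allowed_egress_hosts: frozenset  # hosts this tenant may call out to
    model_routes: dict = field(default_factory=dict)  # task -> model name


def route_request(tenants: dict, matter_id: str, task: str, egress_host: str) -> str:
    """Return the model for `task`, or raise if the matter's policy forbids it."""
    tenant = tenants.get(matter_id)
    if tenant is None:
        raise PermissionError(f"no tenant for matter {matter_id!r}")
    if egress_host not in tenant.allowed_egress_hosts:
        raise PermissionError(f"{matter_id}: egress to {egress_host} not allowed")
    if task not in tenant.model_routes:
        raise PermissionError(f"{matter_id}: no route for task {task!r}")
    return tenant.model_routes[task]


TENANTS = {
    "matter-a": MatterTenant(
        "matter-a", "keys/matter-a",
        frozenset({"gpu.internal.example"}),
        {"summarize": "onprem-finetuned"},
    ),
    "matter-b": MatterTenant(
        "matter-b", "keys/matter-b",
        frozenset({"gpu.internal.example", "api.anthropic.com"}),
        {"summarize": "claude"},
    ),
}

print(route_request(TENANTS, "matter-a", "summarize", "gpu.internal.example"))
```

The default is denial: a matter with no record, an unlisted host, or an unrouted task gets an error rather than a fallback, which is what makes the privilege boundary an infrastructure property rather than a checklist item.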

The economics of 85 AI Advisor workflows.

AI Gateway gives you three things at once that none of Harvey/Legora/CoCounsel give you natively: per-practice-group cost attribution, semantic caching (legal research repeats itself constantly), and a single audit log across every model your firm calls.

A back-of-the-envelope, not a quote

Modeled at $15 per million blended tokens (Claude + GPT + your fine-tuned model)

85 AI ADVISORS × 200 QUERIES / DAY

~17K queries/day

Each AI Advisor is embedded in a practice group and "owns end-to-end workflow development." At 200 queries per advisor per day, this is a conservative usage assumption.

SEMANTIC CACHE HIT RATE

25–40%

Legal queries cluster heavily — precedent lookups, standard clause analysis, doc-summarization templates. Higher than typical LLM workloads.

ANNUAL INFERENCE SAVINGS

$1.4M–$2.8M

Before counting the value of per-practice-group attribution at billing time, which alone justifies the gateway for most firms.
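The arithmetic behind a figure like this can be sketched in a few lines. The advisor count, queries per day, hit-rate range, and $15 per million tokens come from the numbers above. Tokens per query and usage days per year are not stated, so the sketch treats them as explicit parameters and assumes a cache hit avoids the full token cost of a request. Under a 365-day year, the quoted range works out to roughly 60,000 tokens per query at a 25% hit rate and roughly 75,000 at 40%, which is plausible only for long-document workloads.

```python
def annual_cache_savings(advisors: int, queries_per_day: int, hit_rate: float,
                         tokens_per_query: float, usd_per_million_tokens: float,
                         days_per_year: int = 365) -> float:
    """Inference spend avoided per year when `hit_rate` of queries hit the cache."""
    cached_queries = advisors * queries_per_day * days_per_year * hit_rate
    return cached_queries * tokens_per_query * usd_per_million_tokens / 1_000_000


def implied_tokens_per_query(target_savings: float, advisors: int, queries_per_day: int,
                             hit_rate: float, usd_per_million_tokens: float,
                             days_per_year: int = 365) -> float:
    """Tokens per query needed for a given savings figure (inverse of the above)."""
    cached_queries = advisors * queries_per_day * days_per_year * hit_rate
    return target_savings * 1_000_000 / (cached_queries * usd_per_million_tokens)


low = implied_tokens_per_query(1_400_000, 85, 200, 0.25, 15)
high = implied_tokens_per_query(2_800_000, 85, 200, 0.40, 15)
print(f"implied tokens per query: {low:,.0f} to {high:,.0f}")
```

Plugging in your own measured tokens per query and working days per year replaces the assumption with a number you can defend.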

The real win isn't the savings, it's the attribution. When IP Litigation, Restructuring, M&A, and Funds each have their own AI cost line — broken out by partner, matter, and client — you can make defensible decisions about which workflows are worth productionizing and which were just experiments. That data doesn't exist today inside Harvey, CoCounsel, or your own fine-tuned model. It exists inside AI Gateway.

What you already have vs. what Cloudflare adds.

We're not asking you to throw anything out. The stack you've signaled in your DNS, hiring, and public statements is excellent. Cloudflare slots in as the connective tissue.

The current stack, with Cloudflare overlaid

Each row is something you've already publicly signaled. The right column is the thin Cloudflare layer that makes it governable.

| LAYER | WHAT KIRKLAND ALREADY USES | WHAT CLOUDFLARE ADDS |
| --- | --- | --- |
| PUBLIC EDGE | kirkland.com on Cloudflare (server: cloudflare, cf-ray on every response) | + AI Gateway + Workers for Platforms: same edge, more use cases |
| FOUNDATION MODELS | Anthropic (TXT-verified on apex), Azure OpenAI, fine-tuned open-source LLMs (planned) | + AI Gateway: one observability layer across all three, BYO keys |
| CLOUD SECURITY POSTURE | Wiz (TXT-verified on apex) | + Zero Trust: identity-aware access to GPU clusters & Azure environments |
| CORPORATE NETWORKING | Cisco (TXT-verified on apex) | + Magic WAN / Tunnel as overlay across 17 offices including Tokyo |
| EMAIL SECURITY | Proofpoint (MX + SPF + TXT confirmation) | + Area 1 / Email Security as defense-in-depth, not a replacement |
| WEBSITE CMS | OneNorth (AmLaw-specialized; onilive.com / onenorth.com aliases) | No change: OneNorth fronts cleanly behind your existing Cloudflare zone |
| MARKETING AUTOMATION | Pardot (Salesforce, TXT-verified) | No change: Workers can enrich client-facing forms server-side if useful |
| CORPUS / RAG STORAGE | Likely SharePoint / iManage + Azure Blob (industry standard for AmLaw) | + R2 (zero-egress) + Vectorize as the cheap, fast retrieval tier for fine-tuning |
| GPU CLUSTERS | On-prem (per the May 27 AI Infrastructure Director JD) | + Tunnel + Access in front; Workers AI as overflow capacity if useful |

Why this is the right week, not next quarter.

Three things lined up in the last two weeks: the $500M project went public (FT), the AI Infrastructure Director roles were posted (May 27), and the Tokyo office was announced (May 31). The first two define the platform; the third extends the perimeter.

The infrastructure choices being made right now — what governs inference traffic, how matters are isolated, where the training corpus lives, how Tokyo attorneys reach the GPU cluster — are the ones that will be expensive to change in 2027.

Cloudflare is already in the perimeter. Extending that perimeter inward to the AI buildout is a 60-day decision, not a 12-month one. And it doesn't require Kirkland to pause the model work for a single day to do it.

A $500M legal AI buildout deserves a control plane that's as carefully governed as the practice itself.