For the team building Kirkland's $500M legal AI platform

A $500M legal AI buildout deserves a control plane that's as carefully governed as the practice itself.

Kirkland & Ellis is fine-tuning open-source LLMs across on-prem GPU clusters and Microsoft Azure AI, with a 180-person team and 85+ open AI roles. Off-the-peg platforms (Harvey, Legora, CoCounsel, Lexis+ AI) are the floor — the goal is above it.

kirkland.com already runs on Cloudflare. Anthropic is already verified on your apex (anthropic-domain-verification TXT record). The same edge you use today is also the cleanest place to put an AI Gateway, per-matter tenancy, and a privately routed pipe to your GPU clusters.
Sources: Artificial Lawyer, 1 June 2026 · Tokyo office announcement, 31 May 2026

"As AI Infrastructure Director, you'll design, manage, and optimize the firm's AI infrastructure — spanning on‑premise GPU environments and Microsoft Azure–based AI platforms — to enable enterprise AI, automation, and innovation initiatives at scale."
— Kirkland & Ellis, AI Infrastructure Director role (Houston / Chicago, posted 27 May 2026). via Artificial Lawyer
  • $500M: Reported tech project budget (FT / Artificial Lawyer, May–June 2026)
  • 85+: Open AI roles across the firm (kirkland.com careers, June 2026)
  • 180: Team committed to the buildout (Kirkland statements, Q2 2026)
  • 17: Offices globally, incl. Tokyo (new) (kirkland.com/offices · 31 May 2026 announcement)

Kirkland is building the model. Cloudflare is the everything-in-front-of-it.

Fine-tuning a legal LLM is hard, expensive, sensitive work. The boring parts — routing, caching, observability, tenant isolation, attorney access from 17 offices, training-data movement — shouldn't be hard too. They shouldn't even be on the critical path.

Kirkland builds

The model itself and the workflows that wrap it

Fine-tuned open-source LLMs running on on-prem GPU clusters and Azure AI, governed by the Innovation Program, embedded into IP, Litigation, Restructuring, and Transactional practice groups by ~85 AI Advisors.

  • Domain-tuned weights on Kirkland matter data
  • Practice-group-specific prompt libraries and workflows
  • "Governed environments for rapid experimentation" (per the AI Director JD)
  • Client-facing AI-enhanced legal services

Cloudflare runs

The control plane between attorneys, models, and matter data

The edge you already deploy on kirkland.com extends to a single, governed control plane for every call into every model — on-prem, Azure, or third-party — with per-matter tenancy, cost attribution, and Zero Trust access baked in.

  • AI Gateway in front of GPU clusters + Azure AI + Anthropic + others
  • Workers for Platforms — per-matter / per-client tenant isolation
  • Vectorize + R2 for the legal corpus and zero-egress training movement
  • Zero Trust access to GPU envs from all 17 offices, including Tokyo

Six plays that fit this specific buildout.

Each one maps directly to a sentence in your published job descriptions or the public reporting on the project. None of them require Kirkland to rip anything out or pause the model work.

PLAY 01 / GOVERNED INFERENCE

AI Gateway in front of every model the firm uses.

One audited, rate-limited, cached, and logged hop in front of every model: your fine-tuned on-prem model, Azure AI, Anthropic (already on your apex), and Harvey, Legora, CoCounsel, or Lexis+ AI during the transition.

Why now: The AI Director JD specifically calls for "secure, governed environments that enable rapid experimentation." AI Gateway is that governance layer for inference traffic — without slowing the Innovation Program down.
AI Gateway · Observability · Per-app keys · ~30–60% LLM spend
PLAY 02 / PER-MATTER ISOLATION

Workers for Platforms = a sandbox per matter, per client.

Every matter or client gets its own tenant — its own keys, its own egress policy, its own logs, its own model routing rules. Privilege is enforced by infrastructure, not by checklist.

Why now: Privacy is the differentiator Artificial Lawyer flagged in their analysis. "Even if outputs are only marginally better, what Kirkland could have is increased privacy." Per-matter tenancy makes that a hard architectural claim instead of a marketing one.
Workers for Platforms · Durable Objects · Privilege · Audit-ready by default
PLAY 03 / LEGAL CORPUS

Vectorize + R2 for precedent, deal docs, and matter history.

Native vector DB for retrieval over your fine-tuning corpus and live RAG, with R2 holding the raw documents. R2 charges no egress fees when training data is read out to the GPU clusters or Azure ML.

Why now: Fine-tuning a domain LLM moves enormous volumes of data in both directions. R2's zero-egress pricing is the difference between "a known monthly cost" and "a cloud egress bill that surprises the CFO every month for three years."
Vectorize · R2 · Zero egress · 40–60% storage egress
PLAY 04 / WORKFLOW ORCHESTRATION

Workers + Workflows for the 85 AI Advisor playbooks.

Your AI Advisors are hired to "translate legal tasks and workflows into scoped AI solutions." Workers + Workflows is the durable, replayable, observable runtime those scoped solutions need — without standing up Kubernetes for every practice group.

Why now: 85 AI Advisor seats × dozens of workflows each = hundreds of small services. Each one needs to be governed, restartable, and auditable. Workers + Workflows gives you that without spinning up infra per workflow.
Workers · Workflows · Durable Objects · No infra-per-workflow
PLAY 05 / ATTORNEY ACCESS

Zero Trust to the GPU clusters from all 17 offices.

The on-prem GPU environment is a high-value target. Cloudflare Access + Tunnel mean every authorized attorney, AI Advisor, and infra engineer reaches it through the same identity-aware proxy — including the just-announced Tokyo office on day one.

Why now: Wiz is already verified on your apex (TXT record). Cloudflare ZT layers cleanly on top: Wiz watches cloud posture, ZT enforces identity-aware access. The on-prem GPU clusters are the asset that most needs that boundary.
Access · Tunnel · Identity-aware proxy · Replaces VPN sprawl
PLAY 06 / GLOBAL EDGE FOR CLIENTS

The same edge already in front of kirkland.com.

You're already a Cloudflare customer. Extending the same edge to client-facing AI-enhanced legal services — which the AI Advisor JD explicitly mentions — is a roadmap move, not a procurement event.

Why now: No re-papering. No new vendor security review. The MSA, the SOC reports, the data-processing addenda — they already exist. The Innovation Program can ship faster because the underlying edge is already approved.
Existing relationship · Faster procurement · Same SOC · Zero net-new vendor risk

Privilege is a tenancy problem, not a policy problem.

Workers for Platforms lets you spin a fresh, isolated tenant per matter — own keys, own egress, own logs, own model-routing rules — all from the same shared control plane. The boundary is enforced by the infrastructure, which is what regulators, clients, and your General Counsel actually want to hear.

Per-matter tenant isolation, sketched

Each matter is its own Worker namespace. Same edge, same observability, completely isolated state and egress.

  • MATTER A · M&A: Project Atlas · tenant-a.workers.dev
  • MATTER B · Restructuring: Co. X · tenant-b.workers.dev
  • MATTER C · IP Litigation: Patent 5,672 · tenant-c.workers.dev
  • MATTER D · PE Fund Formation · tenant-d.workers.dev
Shared control plane — AI Gateway + Workers for Platforms + Zero Trust
one routing layer · one observability surface · zero cross-matter leakage by construction
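
The per-matter boundary can be modeled in a few lines. This is a minimal sketch in plain Python, not the Workers for Platforms API: `MatterTenant`, `TenantRegistry`, and the model and host names used with them are hypothetical, chosen only to illustrate how a dispatch layer could refuse any call that falls outside the calling matter's own policy.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class MatterTenant:
    """Hypothetical per-matter policy: its own models, egress, and log stream."""
    matter_id: str
    allowed_models: FrozenSet[str]
    allowed_egress_hosts: FrozenSet[str]
    log_stream: str


class TenantRegistry:
    """Illustrative control-plane lookup; every name here is an assumption."""

    def __init__(self) -> None:
        self._tenants: Dict[str, MatterTenant] = {}

    def register(self, tenant: MatterTenant) -> None:
        # One tenant per matter; re-registering would blur the boundary.
        if tenant.matter_id in self._tenants:
            raise ValueError(f"matter {tenant.matter_id!r} already registered")
        self._tenants[tenant.matter_id] = tenant

    def authorize(self, matter_id: str, model: str, egress_host: str) -> MatterTenant:
        """Return the tenant only if the request stays inside its own policy."""
        tenant = self._tenants.get(matter_id)
        if tenant is None:
            raise PermissionError(f"unknown matter {matter_id!r}")
        if model not in tenant.allowed_models:
            raise PermissionError(f"model {model!r} not allowed for {matter_id!r}")
        if egress_host not in tenant.allowed_egress_hosts:
            raise PermissionError(f"egress to {egress_host!r} not allowed for {matter_id!r}")
        return tenant
```

The design choice worth noting is deny-by-default: a request is rejected unless the matter, the model, and the egress destination all appear in that matter's own record, so a misrouted call fails closed rather than landing in another matter's tenant.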

The economics of 85 AI Advisor workflows.

AI Gateway gives you three things at once that none of Harvey/Legora/CoCounsel give you natively: per-practice-group cost attribution, semantic caching (legal research repeats itself constantly), and a single audit log across every model your firm calls.

A back-of-the-envelope, not a quote
Modeled at $15 / M blended tokens (Claude + GPT + your fine-tuned model)
  • 85 AI Advisors × 200 queries / day: ~17K queries/day. Each AI Advisor is embedded in a practice group and "owns end-to-end workflow development." Conservative usage.
  • Semantic cache hit rate: 25–40%. Legal queries cluster heavily: precedent lookups, standard clause analysis, doc-summarization templates. Higher than typical LLM workloads.
  • Annual inference savings: $1.4M–$2.8M. Before counting the value of per-practice-group attribution at billing time, which alone justifies the gateway for most firms.
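
The arithmetic behind an estimate like this can be made explicit. The sketch below uses the three inputs stated above (85 advisors, 200 queries per day, $15 per million blended tokens) and two that are not stated and are therefore assumptions: tokens per query and billable days per year. `InferenceEstimate` is a hypothetical helper, and with the placeholder token count it is not expected to reproduce the range quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceEstimate:
    tokens_per_query: int                    # ASSUMPTION: not given in the text
    advisors: int = 85                       # from the text
    queries_per_advisor_per_day: int = 200   # from the text
    usd_per_million_tokens: float = 15.0     # from the text (blended rate)
    days_per_year: int = 365                 # ASSUMPTION: could be working days

    @property
    def queries_per_day(self) -> int:
        return self.advisors * self.queries_per_advisor_per_day

    @property
    def annual_spend_usd(self) -> float:
        tokens = self.queries_per_day * self.days_per_year * self.tokens_per_query
        return tokens / 1_000_000 * self.usd_per_million_tokens

    def annual_savings_usd(self, cache_hit_rate: float) -> float:
        """Savings if every cache hit costs nothing (a simplification)."""
        if not 0.0 <= cache_hit_rate <= 1.0:
            raise ValueError("cache_hit_rate must be between 0 and 1")
        return self.annual_spend_usd * cache_hit_rate


estimate = InferenceEstimate(tokens_per_query=50_000)  # hypothetical value
print(estimate.queries_per_day)  # 17000, matching "~17K queries/day"
for rate in (0.25, 0.40):
    print(f"{rate:.0%} hit rate: ${estimate.annual_savings_usd(rate):,.0f}")
```

Savings scale linearly with `tokens_per_query` and `days_per_year`, so those two assumptions move the result far more than the choice of hit rate within the 25–40% band.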
The real win isn't the savings, it's the attribution. When IP Litigation, Restructuring, M&A, and Funds each have their own AI cost line — broken out by partner, matter, and client — you can make defensible decisions about which workflows are worth productionizing and which were just experiments. That data doesn't exist today inside Harvey, CoCounsel, or your own fine-tuned model. It exists inside AI Gateway.
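
One way that attribution could be wired in is to tag every request with its practice group, partner, matter, and client before it reaches the gateway. The sketch below only builds the URL and headers and sends nothing. The base URL pattern and the `cf-aig-metadata` header reflect my understanding of the AI Gateway documentation and should be verified against the current docs; the function names, field names, and example values are hypothetical.

```python
import json
from typing import Dict

# Assumed AI Gateway base URL pattern; confirm against current Cloudflare docs.
GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"


def gateway_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the provider-specific endpoint behind one gateway."""
    return f"{GATEWAY_BASE}/{account_id}/{gateway_id}/{provider}"


def attribution_headers(practice_group: str, partner: str,
                        matter_id: str, client_id: str) -> Dict[str, str]:
    """Attach cost-attribution tags as custom metadata (header name assumed)."""
    metadata = {
        "practice_group": practice_group,
        "partner": partner,
        "matter_id": matter_id,
        "client_id": client_id,
    }
    # sort_keys keeps the serialized header stable across calls.
    return {"cf-aig-metadata": json.dumps(metadata, sort_keys=True)}


url = gateway_url("ACCOUNT_ID", "legal-ai", "anthropic")  # placeholder IDs
headers = attribution_headers("ip-litigation", "partner-123", "matter-c", "client-42")
```

Because the tags travel with each request, the same log entry that records tokens and cost also records who the cost belongs to, which is what makes a per-matter cost line possible at billing time.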

What you already have vs. what Cloudflare adds.

We're not asking you to throw anything out. The stack you've signaled in your DNS, hiring, and public statements is excellent. Cloudflare slots in as the connective tissue.

The current stack, with Cloudflare overlaid

Each row is something you've already publicly signaled. The right column is the thin Cloudflare layer that makes it governable.

| LAYER | WHAT KIRKLAND ALREADY USES | WHAT CLOUDFLARE ADDS |
| --- | --- | --- |
| PUBLIC EDGE | kirkland.com on Cloudflare (server: cloudflare, cf-ray on every response) | + AI Gateway + Workers for Platforms: same edge, more use cases |
| FOUNDATION MODELS | Anthropic (TXT-verified on apex), Azure OpenAI, fine-tuned open-source LLMs (planned) | + AI Gateway: one observability layer across all three, BYO keys |
| CLOUD SECURITY POSTURE | Wiz (TXT-verified on apex) | + Zero Trust: identity-aware access to GPU clusters & Azure environments |
| CORPORATE NETWORKING | Cisco (TXT-verified on apex) | + Magic WAN / Tunnel as overlay across 17 offices including Tokyo |
| EMAIL SECURITY | Proofpoint (MX + SPF + TXT confirmation) | + Area 1 / Email Security as defense-in-depth, not a replacement |
| WEBSITE CMS | OneNorth (AmLaw-specialized; onilive.com / onenorth.com aliases) | No change: OneNorth fronts cleanly behind your existing Cloudflare zone |
| MARKETING AUTOMATION | Pardot (Salesforce, TXT-verified) | No change: Workers can enrich client-facing forms server-side if useful |
| CORPUS / RAG STORAGE | Likely SharePoint / iManage + Azure Blob (industry standard for AmLaw) | + R2 (zero-egress) + Vectorize as the cheap, fast retrieval tier for fine-tuning |
| GPU CLUSTERS | On-prem (per the May 27 AI Infrastructure Director JD) | + Tunnel + Access in front; Workers AI as overflow capacity if useful |

Why this is the right week, not next quarter.

Three things lined up in the last two weeks: the $500M project went public (FT), the AI Infrastructure Director roles posted (May 27), and the Tokyo office was announced (May 31). The first two define the platform; the third extends the perimeter.

The infrastructure choices being made right now — what governs inference traffic, how matters are isolated, where the training corpus lives, how Tokyo attorneys reach the GPU cluster — are the ones that will be expensive to change in 2027.

Cloudflare is already in the perimeter. Extending that perimeter inward to the AI buildout is a 60-day decision, not a 12-month one. And it doesn't require Kirkland to pause the model work for a single day to do it.

Worth a 30-minute conversation?

I sketched this because the public signals are so specific that a generic deck would have been a waste of your time. If the framing is roughly right, I'd love to walk through it with whoever owns the AI Infrastructure or Innovation Program side — and if it's off, the correction itself is the most useful thing I could hear.

Matt Holscher · Calendar · Reply by email