Kirkland & Ellis is fine-tuning open-source LLMs across on-prem GPU clusters and Microsoft Azure AI, with a 180-person team and 85+ open AI roles. Off-the-peg platforms (Harvey, Legora, CoCounsel, Lexis+ AI) are the floor — the goal is above it.
Fine-tuning a legal LLM is hard, expensive, sensitive work. The boring parts — routing, caching, observability, tenant isolation, attorney access from 17 offices, training-data movement — shouldn't be hard too. They shouldn't even be on the critical path.
Each one maps directly to a sentence in your published job descriptions or the public reporting on the project. None of them require Kirkland to rip anything out or pause the model work.
One audited, rate-limited, cached, logged hop in front of your fine-tuned on-prem model and Azure AI and Anthropic (already on your apex) and Harvey/Legora/CoCounsel/Lexis+ AI during the transition.
Every matter or client gets its own tenant — its own keys, its own egress policy, its own logs, its own model routing rules. Privilege is enforced by infrastructure, not by checklist.
Native vector DB for retrieval over your fine-tuning corpus and live RAG, with R2 holding the raw documents. Zero egress fees when training data moves between R2, the GPU clusters, and Azure ML.
Your AI Advisors are hired to "translate legal tasks and workflows into scoped AI solutions." Workers + Workflows is the durable, replayable, observable runtime those scoped solutions need — without standing up Kubernetes for every practice group.
The on-prem GPU environment is a high-value target. Cloudflare Access + Tunnel mean every authorized attorney, AI Advisor, and infra engineer reaches it through the same identity-aware proxy — including the just-announced Tokyo office on day one.
You're already a Cloudflare customer. Extending the same edge to client-facing AI-enhanced legal services — which the AI Advisor JD explicitly mentions — is a roadmap move, not a procurement event.
Workers for Platforms lets you spin a fresh, isolated tenant per matter — own keys, own egress, own logs, own model-routing rules — all from the same shared control plane. The boundary is enforced by the infrastructure, which is what regulators, clients, and your General Counsel actually want to hear.
Each matter is its own Worker namespace. Same edge, same observability, completely isolated state and egress.
AI Gateway gives you three things at once that none of Harvey/Legora/CoCounsel give you natively: per-practice-group cost attribution, semantic caching (legal research repeats itself constantly), and a single audit log across every model your firm calls.
We're not asking you to throw anything out. The stack you've signaled in your DNS, hiring, and public statements is excellent. Cloudflare slots in as the connective tissue.
Three things lined up in the last two weeks: the $500M project went public (FT), the AI Infrastructure Director roles posted (May 27), and the Tokyo office was announced (May 31). The first two define the platform; the third extends the perimeter.
The infrastructure choices being made right now — what governs inference traffic, how matters are isolated, where the training corpus lives, how Tokyo attorneys reach the GPU cluster — are the ones that will be expensive to change in 2027.
Cloudflare is already in the perimeter. Extending that perimeter inward to the AI buildout is a 60-day decision, not a 12-month one. And it doesn't require Kirkland to pause the model work for a single day to do it.
I sketched this because the public signals are so specific that a generic deck would have been a waste of your time. If the framing is roughly right, I'd love to walk through it with whoever owns the AI Infrastructure or Innovation Program side — and if it's off, the correction itself is the most useful thing I could hear.