This is the living roadmap and architecture document for legalOS — an operating system for legal departments. It complements CLAUDE.md (conventions) and DECISION_LOG.md (why the architecture is what it is).
legalOS is the AI-native operating system that serves as the single entry point for every workflow, agent, and tool used by an in-house legal department. It starts with one corporate legal department (single-tenant) but is designed to become a multi-tenant SaaS if that path is chosen later.
The app supports two types of agents: external and native.

                ┌────────────────────────────┐
                │  Browser (Next.js client)  │
                │  - Launchpad UI            │
                │  - Chat UI                 │
                │  - Admin dashboard         │
                └──────────────┬─────────────┘
                               │
                     HTTPS / Auth cookies
                               │
                ┌──────────────▼─────────────┐
                │  Vercel (Next.js server)   │
                │  - Route handlers          │
                │  - Server actions          │
                │  - Proxy (auth)            │
                └──┬───────────────────────┬─┘
                   │                       │
       Supabase JS │                       │  Anthropic SDK
    (with user JWT)│                       │  (server-side)
                   │                       │
     ┌─────────────▼──────┐     ┌──────────▼──────────┐
     │  Supabase          │     │  Anthropic API      │
     │  - Postgres + RLS  │     │  - Messages API     │
     │  - Auth            │     │  - Streaming        │
     │  - Storage (later) │     └─────────────────────┘
     └────────────────────┘

organization_id is present from day one, even though we serve one organization for now.

| Role | Description | How Created |
|---|---|---|
| super_admin | Can manage all organizations. Reserved for platform owner. | Seed only |
| org_admin | Can manage users, roles, and agents within their organization. | Assigned by super_admin |
| dept_admin | Can manage agents within their department. View department analytics. | Assigned by org_admin |
| user | Can access departments they have been granted access to. | Assigned by org_admin or dept_admin |

Department access is independent of role. A user has zero or more rows in the user_department_roles table, each granting access to one department with a specific role scoped to that department.
Example: A user may be a dept_admin for Commercial and a user for M&A, and have no access to Privacy.
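
The department-scoped access model above can be sketched as a lookup over join-table rows. This is a minimal illustration, not the project's actual code: the `UserDepartmentRole` shape, the camelCase field names, the department identifiers, and the `roleInDepartment` helper are all assumptions, since the real `user_department_roles` columns are not listed in this document.

```typescript
// Hypothetical row shape; the real user_department_roles columns may differ.
type DepartmentRole = "dept_admin" | "user";

interface UserDepartmentRole {
  userId: string;
  departmentId: string;
  role: DepartmentRole;
}

// Returns the user's role in one department, or null when no row grants access.
function roleInDepartment(
  rows: readonly UserDepartmentRole[],
  userId: string,
  departmentId: string,
): DepartmentRole | null {
  const row = rows.find(
    (r) => r.userId === userId && r.departmentId === departmentId,
  );
  return row ? row.role : null;
}

// The example from the text: dept_admin for Commercial, user for M&A, nothing for Privacy.
const rows: UserDepartmentRole[] = [
  { userId: "u1", departmentId: "commercial", role: "dept_admin" },
  { userId: "u1", departmentId: "m-and-a", role: "user" },
];

console.log(roleInDepartment(rows, "u1", "commercial"));
console.log(roleInDepartment(rows, "u1", "m-and-a"));
console.log(roleInDepartment(rows, "u1", "privacy"));
```

Returning `null` for a missing row keeps "no access" distinct from any role value, which matches the zero-or-more-rows wording above.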
- proxy.ts validates the Supabase session on every request to authenticated routes.
- RLS policies on agents, conversations, messages, and usage_events reference user_department_roles to determine what the user can see. If the other three layers fail, the DB still refuses.

| Table | Purpose | Notes |
|---|---|---|
| organizations | The tenant. One row for a single-customer deployment. | Multi-tenant ready. |
| users | User profile, joined to Supabase auth.users. | One row per auth user. |
| departments | Commercial, M&A, Public Sector, GR&RA, Privacy, etc. | Seeded with the five starter departments. |
| user_department_roles | Join table: which user has which role in which department. | Enforces access control. |
| agents | All agents (external + native). | type column: external or native. category column added in 0003. |

| Table | Purpose |
|---|---|
| conversations | A chat thread between a user and a native agent. Snapshots system_prompt and model at creation per CLAUDE.md AI Integration Rules. |
| messages | Individual messages in a conversation. Immutable in practice. |
| usage_events | Per-call token usage and cost tracking. Append-only ledger. |

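
The snapshot behavior described for conversations can be illustrated with a short sketch. The `Agent` and `Conversation` shapes and the `startConversation` helper below are hypothetical; they only show the idea that the prompt and model are copied at creation, so later edits to the agent do not change an existing thread.

```typescript
// Hypothetical shapes for illustration; real column names may differ.
interface Agent {
  id: string;
  systemPrompt: string;
  model: string;
}

interface Conversation {
  id: string;
  agentId: string;
  userId: string;
  systemPrompt: string; // copied from the agent at creation
  model: string; // copied from the agent at creation
}

// Copies the agent's prompt and model into the new conversation row.
function startConversation(agent: Agent, userId: string, id: string): Conversation {
  return {
    id,
    agentId: agent.id,
    userId,
    systemPrompt: agent.systemPrompt,
    model: agent.model,
  };
}

const agent: Agent = {
  id: "a1",
  systemPrompt: "You review NDAs.",
  model: "anthropic/claude-sonnet-4-6",
};
const conversation = startConversation(agent, "u1", "c1");

// Editing the agent afterwards leaves the existing conversation untouched.
agent.systemPrompt = "Edited later";
console.log(conversation.systemPrompt);
```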
Phase 2 additions (see docs/AGENT_ARCHITECTURE.md):

| Table | Purpose |
|---|---|
| agent_attachments | Permanent per-agent attached references (PDF, DOCX, TXT, MD, XLSX). Includes cached extracted_text, delivery_mode, source_type. |
| message_attachments | Per-message file uploads (Section 5a: core chat capability). Turn-scoped, garbage-collected on a longer cadence. |
| formatted_outputs | Audit + dedup record for server-rendered exports (Word .docx in v1; XLSX / Google Workspace / PowerPoint deferred). |
| analytics_events | Promotion from localStorage to Supabase per D-010. Independent of agent runtime architecture; tracked as a Phase 2 work item but not part of the architecture doc’s phasing list. |

agents also gains: is_template, forked_from_agent_id, tools_enabled (JSONB), default_output_format, deleted_at (soft delete with 30-day undo). agents.created_by already exists from 0001 and is reused. usage_events gains cache_creation_tokens and cache_read_tokens for prompt caching. Two Supabase Storage buckets land alongside: agent-attachments and message-attachments, both RLS-policied.
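
To show why the two cache columns matter for cost tracking, here is a hedged sketch of per-call cost math. The `UsageTokens` and `ModelRates` shapes and the `costUsd` function are assumptions for illustration, and the rates are placeholders rather than real vendor pricing. It also assumes the four token counts are disjoint, so cached tokens are not counted again as regular input.

```typescript
// Hypothetical shapes; the rates used below are placeholders, not real pricing.
interface UsageTokens {
  inputTokens: number; // uncached input tokens
  outputTokens: number;
  cacheCreationTokens: number; // tokens written to the prompt cache
  cacheReadTokens: number; // tokens served from the prompt cache
}

interface ModelRates {
  // USD per million tokens
  inputPerMTok: number;
  outputPerMTok: number;
  cacheWritePerMTok: number;
  cacheReadPerMTok: number;
}

// Sums each token bucket at its own rate and converts to USD.
function costUsd(usage: UsageTokens, rates: ModelRates): number {
  const total =
    usage.inputTokens * rates.inputPerMTok +
    usage.outputTokens * rates.outputPerMTok +
    usage.cacheCreationTokens * rates.cacheWritePerMTok +
    usage.cacheReadTokens * rates.cacheReadPerMTok;
  return total / 1_000_000;
}

const placeholderRates: ModelRates = {
  inputPerMTok: 3,
  outputPerMTok: 15,
  cacheWritePerMTok: 3.75,
  cacheReadPerMTok: 0.3,
};

// A follow-up turn where most of the prompt is read from cache.
const followUpTurn: UsageTokens = {
  inputTokens: 1_000,
  outputTokens: 500,
  cacheCreationTokens: 0,
  cacheReadTokens: 10_000,
};

console.log(costUsd(followUpTurn, placeholderRates).toFixed(4));
```

Keeping the rates in a per-model record is one way a pricing table could stay vendor-neutral: each adapter only has to fill in the four token counts.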
Goal: A deployable-but-empty app with the scaffolding in place.
- legal-department-launchpad-template.
- CLAUDE.md, PROJECT_OUTLINE.md, DECISION_LOG.md, SETUP.md, README.md, CHANGELOG.md.
- agent-launchpad-template using Tailwind v4’s @theme directive and CSS variables.
- .claude/skills/ (see skills-checklist.md).
- config/site.ts with placeholder branding.

Definition of done: Push to main; site loads on Vercel; a health-check route returns 200; CLAUDE.md renders correctly on GitHub.
Goal: A working single-department launchpad with auth and role-based access, matching the UX of the previous agent-launchpad-template but in Next.js.
- organizations, users, departments, user_department_roles, agents.
- proxy.ts; login, logout, magic-link flows.

Definition of done: A test user can log in, land on the Commercial department, click an external agent card, and see the click recorded in localStorage. An admin user can see the admin dashboard and use the calculator.
Goal: Native agents become user-owned, user-configurable workspaces — with attached references, configurable tools, multi-format output, prompt caching, and a multi-vendor-ready directory structure. Phase 2 is a multi-session arc, not a single-week sprint, originally scoped narrower (D-023) and expanded mid-phase (D-025) to match the product vision captured in docs/AGENT_ARCHITECTURE.md.
Already shipped in Phase 2:
- 0004_native_agents.sql (conversations, messages, usage_events, message_role enum, full RLS in the user-owns + admin-read idiom). lib/anthropic/ runtime helpers (client.ts, pricing.ts, prompt-defense.ts, rate-limit.ts, stream.ts, types.ts). app/api/chat/route.ts SSE streaming endpoint. Test Smoke Agent seed at 0003_test_native_agent.sql. See D-023 for the bundled architectural commitments.
- /agents/[id] route, components/chat/ (interface, message list, bubbles, input, SSE parser, sanitized markdown renderer). components/launchpad/agent-card.tsx branches native vs external. End-to-end smoke test passed against the live runtime.

Remaining work (mirrors docs/AGENT_ARCHITECTURE.md § implementation phasing; sequenced when picked up, not pre-numbered):
- lib/anthropic/ → lib/llm/anthropic/ move + vendor-prefixed model ids + single-case dispatcher. Pure structural; lays the groundwork for multi-vendor without shipping a second adapter.
- agents columns (is_template, forked_from_agent_id, tools_enabled, default_output_format, deleted_at), agent_attachments, message_attachments, formatted_outputs, usage_events cache columns. RLS on every new table; Storage buckets and policies.
- extracted_text cache, attachments enter the cached prompt portion of every Anthropic request.
- cache_control: { type: "ephemeral" } markers on the cacheable portion; cache_creation_tokens / cache_read_tokens populated in usage_events; updated cost math. Required architecture per docs/AGENT_ARCHITECTURE.md §1, not optimization.
- message-attachments bucket and table, extraction reused from (5). The “here’s the NDA the other side sent us” workflow.
- tools_enabled validation against the catalog, sources rendered inline in chat for provenance, search cost into usage_events.
- .docx export: server-side renderer, “Download as Word” button bound to default_output_format = docx, formatted_outputs audit row.

Tracked as a Phase 2 commitment but independent of the architecture doc:
- analytics_events table, write path from existing lib/analytics/events.ts call sites, admin metrics page reads from Supabase instead of localStorage.

Definition of done: Users can create native agents from templates or from blank, attach references, enable web search, choose Word output, and have full chat conversations with streaming, prompt caching, and cost tracking. Six Commercial templates ship as the baseline catalog. Analytics events live in Supabase with admin metrics reading from the table.
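
The prompt-caching work item can be sketched as a helper that places a `cache_control: { type: "ephemeral" }` marker on the cacheable portion of a request. The block shape follows the Anthropic Messages API's system text blocks; the `buildSystemBlocks` helper and the choice to put the system prompt and attachment text in one cached prefix are assumptions for illustration, not the project's implementation.

```typescript
// Local copy of the text-block shape so the sketch runs without the SDK.
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical helper: system prompt first, then each attachment's extracted text.
function buildSystemBlocks(
  systemPrompt: string,
  attachmentTexts: readonly string[],
): TextBlock[] {
  const blocks: TextBlock[] = [{ type: "text", text: systemPrompt }];
  for (const text of attachmentTexts) {
    blocks.push({ type: "text", text });
  }
  // A marker on the last block makes the whole prefix up to it cacheable.
  const last = blocks[blocks.length - 1];
  if (last) {
    last.cache_control = { type: "ephemeral" };
  }
  return blocks;
}

const blocks = buildSystemBlocks("You review NDAs.", [
  "Playbook: standard NDA positions...",
  "Template: mutual NDA...",
]);
console.log(JSON.stringify(blocks, null, 2));
```

Because only content before the marker is cached, per-turn user messages stay outside the cached prefix in this sketch.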
Goal: Prove that adding a department is a scoped, repeatable task.
Definition of done: A user with Commercial + M&A access sees both departments. A user with only Commercial access gets a clean “not found” or redirect if they try to force-navigate to M&A. RLS stops them at the DB even if the proxy misses.
Goal: All five target departments live.
- dept_admin role.

Definition of done: All five departments are functional end-to-end. A demo walk-through covers at least one external agent and one native agent per department.
Goal: Admin-level oversight surface over the user-owned agent estate.
User-level agent create / edit ships in Phase 2 (per docs/AGENT_ARCHITECTURE.md and D-025 — every user creates and owns their own agents from templates or blank). Phase 5’s scope is the residual admin work that does not belong in the per-user surface:
Definition of done: An org_admin can see every agent in the organization, force-disable one if needed, transfer ownership, and view an audit trail of changes.
Goal: Agents can run on OpenAI and Google models, not just Anthropic.
The directory structure (lib/llm/<vendor>/), the vendor-prefixed model id format (anthropic/claude-sonnet-4-6), the single-case dispatcher, the bounded model picker, and the multi-vendor pricing table all ship in Phase 2 (per docs/AGENT_ARCHITECTURE.md § 6 and Phase 2 work item 1). Phase 6’s scope is the actual multi-vendor implementation against that structure:
- lib/llm/openai/. Adds a case to the dispatcher; populates pricing table rows for the supported models.
- lib/llm/google/. Same pattern.
- Anthropic uses cache_control markers (already wired in Phase 2 per item 6); OpenAI caches automatically with no markers; Gemini sets cache headers per Google’s API. Each adapter owns its caching strategy; the usage_events cache columns are vendor-neutral and each adapter populates them from its own SDK’s response shape.
- openai/gpt-5.1 or google/gemini-... from the same dropdown that already offers Claude models.

Definition of done: A user can select an OpenAI or Google model in the agent edit form, conversations stream through the appropriate adapter, cost tracking attributes spend correctly by vendor, and prompt caching works per each vendor’s semantics.
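
A vendor-prefixed model id and a single-case dispatcher could look roughly like the sketch below. The function names (`parseModelId`, `dispatch`), the `Adapter` signature, and the stand-in adapter body are hypothetical; real adapters would stream responses from the vendor SDK rather than return a string.

```typescript
interface ParsedModelId {
  vendor: string;
  model: string;
}

// Splits "anthropic/claude-sonnet-4-6" into vendor and model at the first slash.
function parseModelId(id: string): ParsedModelId {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "<vendor>/<model>", got "${id}"`);
  }
  return { vendor: id.slice(0, slash), model: id.slice(slash + 1) };
}

// Stand-in adapter signature, used only to show the routing.
type Adapter = (model: string, prompt: string) => string;

const anthropicAdapter: Adapter = (model, prompt) => `[anthropic:${model}] ${prompt}`;

// Single-case dispatcher: adding a vendor later means adding one case here.
function dispatch(modelId: string, prompt: string): string {
  const { vendor, model } = parseModelId(modelId);
  switch (vendor) {
    case "anthropic":
      return anthropicAdapter(model, prompt);
    default:
      throw new Error(`No adapter registered for vendor "${vendor}"`);
  }
}

console.log(dispatch("anthropic/claude-sonnet-4-6", "Summarize this clause."));
```

In this shape, an unsupported id such as `openai/gpt-5.1` fails loudly at the dispatcher until its adapter case exists.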
Goal: Production-quality guardrails for a growing agent catalog.
Definition of done: An admin can see a per-agent quality score, trace any slow or failed conversation, and get alerted when monthly spend crosses a threshold.
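
The spend-alert part of this definition of done could be derived from the usage ledger along these lines. This is a sketch under assumptions: the `UsageEvent` shape, the UTC month boundary, and both function names are hypothetical, and a real implementation would more likely aggregate in SQL than in application code.

```typescript
// Hypothetical ledger row; real usage_events columns may differ.
interface UsageEvent {
  createdAt: string; // ISO 8601 timestamp
  costUsd: number;
}

// Sums cost for one calendar month (month is 1-12), using UTC boundaries.
function monthlySpendUsd(
  events: readonly UsageEvent[],
  year: number,
  month: number,
): number {
  return events
    .filter((e) => {
      const d = new Date(e.createdAt);
      return d.getUTCFullYear() === year && d.getUTCMonth() + 1 === month;
    })
    .reduce((sum, e) => sum + e.costUsd, 0);
}

// True once the month's spend has reached or passed the threshold.
function spendThresholdCrossed(
  events: readonly UsageEvent[],
  thresholdUsd: number,
  year: number,
  month: number,
): boolean {
  return monthlySpendUsd(events, year, month) >= thresholdUsd;
}

const events: UsageEvent[] = [
  { createdAt: "2025-03-02T10:00:00Z", costUsd: 40 },
  { createdAt: "2025-03-20T15:30:00Z", costUsd: 70 },
  { createdAt: "2025-04-01T00:00:00Z", costUsd: 25 },
];

console.log(monthlySpendUsd(events, 2025, 3));
console.log(spendThresholdCrossed(events, 100, 2025, 3));
```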
Goal: Products, Compliance, Litigation, IP. Each department should now be ~80% configuration + agent definitions.
Goal: If the template is ever productized, this phase makes it a true SaaS.
The organization_id foundation from Phase 1 makes this a scoped project rather than a rewrite.
To keep scope honest, the following are explicitly not part of this project unless/until a future decision changes it:
These are adjacent to the legal department’s stack but out of scope for a launchpad.
- /workspace/workflows, /workspace/integrations, /workspace/help, breadcrumb rendered visually lowercase via text-transform with polished sub-leaf labels, dashboard transition attempted and reverted (commit 5947326, D-047).
- ?next= preservation in proxy.ts:24 (deferred follow-up from D-036), then Session 32’s Knowledge reshape (Research / Vault / Sources as real routes; cuts over the coming-soon URLs introduced in Session 31), then Sessions 33 / 34 / 35 build out Workflows / Integrations / Help respectively. Workspace dashboard deferred to Session 36+ (see README Future / Backlog).
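
As an illustration of what `?next=` preservation usually involves, the sketch below builds a login redirect that remembers the requested path and validates the value on the way back. It is not the code in `proxy.ts`: the `/login` path, both function names, and the fallback of `/` are assumptions.

```typescript
// Hypothetical login route; the real path is not stated in this document.
const LOGIN_PATH = "/login";

// Builds the redirect target for an unauthenticated request, keeping the original URL.
function loginUrlWithNext(pathname: string, search: string): string {
  return `${LOGIN_PATH}?next=${encodeURIComponent(pathname + search)}`;
}

// Accepts only same-site absolute paths so ?next= cannot become an open redirect.
function safeNext(next: string | null, fallback: string = "/"): string {
  if (!next) return fallback;
  const looksExternal = next.startsWith("//") || next.startsWith("/\\");
  if (!next.startsWith("/") || looksExternal) return fallback;
  return next;
}

console.log(loginUrlWithNext("/workspace/help", "?tab=faq"));
console.log(safeNext("/workspace/help?tab=faq"));
console.log(safeNext("https://example.com/phish"));
```

Validating on the return trip matters because the `next` value travels through the browser and can be edited by anyone who crafts a login link.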