Typed Prompt Engineering: Modeling LLM Inputs/Outputs with TypeScript
Model LLM prompts and outputs as typed contracts in TypeScript. Use generics, mapped types, and runtime schemas for safe Gemini and Claude integrations.
Why your LLM integration should feel like a typed API, not a guessing game
If you ship LLM features without compile-time contracts and runtime validation, you will hit bugs that only appear in production conversations. Teams integrating Apple's on-device assistant stack with Google Gemini or Anthropic's Claude/Cowork are already seeing how slight prompt or model-output drift can break user flows, leak data, or cause wrong actions. The fix is not more logging; it is typed prompt engineering — modeling inputs and outputs of LLM calls with TypeScript, validating at runtime, and treating prompts as first-class, versioned contracts.
Executive summary
In this article you will learn concrete patterns to design a typed prompt interface that combines advanced TypeScript features (generics, conditional types, mapped types) with runtime schemas and type guards. I use real 2025–2026 context — Apple relying on Google Gemini for Siri customization and Anthropic's Cowork/autonomy trends — to show why typed contracts are now essential. By the end you will have copy-pasteable patterns for:
- Defining PromptSpec types and inferring output types at compile time
- Mapping provider raw responses into validated domain objects with Zod or lightweight guards
- Creating typed adapters for multiple LLM providers (Gemini, Claude) so you can swap models safely
- Handling streaming, partial validation, and schema evolution
Why typed prompt engineering matters in 2026
Late 2025 and early 2026 cemented a few trends: Apple partnered with Google to bring Gemini into Siri, and Anthropic pushed Cowork toward non-technical desktop autonomy. These developments demonstrate two things. First, more critical user workflows depend on LLMs running at different trust boundaries: cloud models, partner-run models, and on-device models. Second, agents are taking actions on behalf of users, meaning a mistyped LLM output can have real-world consequences.
That makes contracts and validation unavoidable. TypeScript gives you helpful compile-time checks but cannot validate runtime responses. The solution is to combine advanced TypeScript patterns with runtime schemas and validators to make LLM integrations robust and auditable.
Core concepts and architecture
Treat each LLM endpoint as an RPC with a declared input type and an output type. Your stack should have three layers:
- Compile-time spec: TypeScript types describing inputs and expected outputs
- Runtime schema: Zod or similar validators that mirror types for runtime checks
- Provider adapter: Code that calls Gemini, Claude, or other LLMs and converts raw responses into validated objects
Pattern: PromptSpec and inferred outputs
Start by defining a generic PromptSpec type that encodes both the input shape and the output shape. Use conditional types to infer the output for typed clients.
// PromptSpec defines a name, input shape, and output shape.
// The optional `_output` field is a phantom marker: it is never set at runtime,
// but it ties O to the structure so SpecOutput below can infer it.
type PromptSpec<I, O> = {
  name: string
  promptTemplate: (input: I) => string
  outputSchema: unknown
  _output?: O
}
// Extract the output type from a spec at the type level
type SpecOutput<S> = S extends PromptSpec<any, infer O> ? O : never
// Example spec
type UserProfile = {
  id: string
  displayName: string
  bio?: string
}
const userProfileSpec: PromptSpec<{ userId: string }, UserProfile> = {
  name: 'getUserProfile',
  promptTemplate: input => `Return a JSON user profile for id ${input.userId}`,
  outputSchema: {} // wire up runtime validator later
}
The conditional type SpecOutput lets a typed LLM client return a correct TypeScript type so your front end and tests get compile-time safety.
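For example, a compile-time-only check (no runtime code) showing what that inference recovers from the spec defined above:
// Type-level usage: SpecOutput recovers UserProfile from the spec's type
type ProfileOutput = SpecOutput<typeof userProfileSpec> // resolves to UserProfile
A typed client can then declare its return type as SpecOutput<S>, so callers get autocomplete and compile errors for free.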
Pattern: Typed client with runtime validation
Next, create a typed client that receives a PromptSpec and returns a validated output. Use a runtime validator like Zod to parse the response and surface validation failures as explicit, safe errors.
import { z } from 'zod'
// runtime schema corresponding to UserProfile
const UserProfileSchema = z.object({
  id: z.string(),
  displayName: z.string(),
  bio: z.string().optional()
})
// Structural stand-in for any validator with a Zod-compatible parse method
type ZodType<T> = { parse: (v: unknown) => T }

async function runPrompt<I, O>(
  spec: PromptSpec<I, O> & { zod?: ZodType<O> },
  input: I,
  provider: (prompt: string) => Promise<unknown>
): Promise<O> {
  const prompt = spec.promptTemplate(input)
  const raw = await provider(prompt)
  // Try schema validation if provided
  if (spec.zod) {
    try {
      return spec.zod.parse(raw)
    } catch (e) {
      throw new Error('Validation failed: ' + String(e))
    }
  }
  // otherwise assume raw is O
  return raw as O
}
This function combines TypeScript generics with runtime Zod checks. If your provider is Gemini vs Claude, you keep the same compile-time contracts and only implement a provider adapter to normalize raw outputs.
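A minimal usage sketch, assuming the spec and schema defined above; the inline stub stands in for a real Gemini or Claude adapter (those follow in the next section):
// Hypothetical call site: attach the runtime schema and run the prompt
const profile = await runPrompt(
  { ...userProfileSpec, zod: UserProfileSchema },
  { userId: 'u_42' },
  async _prompt => ({ id: 'u_42', displayName: 'Ada' }) // stub provider; swap in callGemini or callClaude
)
// `profile` is typed as UserProfile, and malformed responses throw before reaching business logic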
Provider adapters: mapping raw LLM outputs to contracts
Each LLM provider returns text in different shapes. Gemini might give structured JSON via function calling, while Claude might return conversational text or tool output. Build small adapter layers that transform provider results to the expected shape before validation.
// Example adapter signatures
async function callGemini(prompt: string): Promise<unknown> {
  // call Google's Gemini via your API key or Apple's Siri gateway
  // normalize function-calling style responses to a plain JS object
  return fetch('/gemini', { method: 'POST', body: JSON.stringify({ prompt }) })
    .then(r => r.json())
    .then(r => r.result ?? r)
}

async function callClaude(prompt: string): Promise<unknown> {
  // Anthropic Claude may return text that includes JSON blocks
  const text = await fetch('/claude', { method: 'POST', body: JSON.stringify({ prompt }) }).then(r => r.text())
  // try to find JSON in the message and parse it
  const jsonMatch = text.match(/\{[\s\S]*\}/)
  if (jsonMatch) {
    try { return JSON.parse(jsonMatch[0]) } catch { return text }
  }
  return text
}
Adapter responsibilities (a metadata envelope sketch follows this list):
- Extract structured payloads from model text
- Map provider metadata into your contract (timestamps, model id, safety labels)
- Surface parse errors early for auditing
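One lightweight way to meet the metadata requirement is a small envelope around the adapter; the names below (NormalizedResponse, callGeminiWithMetadata) are illustrative, not part of any provider SDK:
// Illustrative envelope pairing the raw payload with audit metadata
type NormalizedResponse = {
  payload: unknown
  modelId: string
  receivedAt: string
  safetyLabels?: string[]
}

async function callGeminiWithMetadata(prompt: string): Promise<NormalizedResponse> {
  const payload = await callGemini(prompt)
  return {
    payload,
    // hypothetical: read the real model id from the provider response or config
    modelId: 'gemini-model-id',
    receivedAt: new Date().toISOString()
  }
}
Validate `payload` against the spec's schema and log the rest for auditing.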
Advanced TypeScript patterns: mapped types and conditional transforms
You can use mapped types to automatically produce optional input shapes for partial prompts, or conditional types to compute response types from spec signatures.
// Make all input fields optional for partial prompt composition
type PartialInputs<T> = { [K in keyof T]?: T[K] }
// Example: a prompt that can accept partial user info
type UserUpdate = { displayName?: string; bio?: string }
type UpdateSpec = PromptSpec<PartialInputs<UserUpdate>, UserProfile>
// Conditional type to extract input type
type SpecInput<S> = S extends PromptSpec<infer I, any> ? I : never
These patterns let you build helpers such as auto-generating UI forms from the input type or coercing optional fields in an agent step pipeline.
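As a small sketch of that second use case, a generic helper (the name withDefaults is illustrative) can coerce a PartialInputs value into a complete input before the prompt template runs:
// Fill in defaults so a partial input from an earlier agent step becomes a complete input
function withDefaults<T extends object>(defaults: T, partial: PartialInputs<T>): T {
  return { ...defaults, ...partial }
}

const updateInput = withDefaults<UserUpdate>(
  { displayName: 'Unknown', bio: '' },
  { bio: 'TypeScript fan' } // only the fields an earlier step produced
)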
Type guards and lightweight runtime validators
Not every project wants Zod. For small integrations or constrained environments like on-device Apple models, a tiny custom validator or hand-written type guard is effective and reduces bundle size.
// Example type guard for UserProfile
function isUserProfile(v: unknown): v is UserProfile {
  if (typeof v !== 'object' || v === null) return false
  const o = v as Record<string, unknown>
  return typeof o.id === 'string' && typeof o.displayName === 'string' && (o.bio === undefined || typeof o.bio === 'string')
}
// Use in client
async function runPromptWithGuard<I, O>(
  spec: PromptSpec<I, O> & { guard?: (v: unknown) => v is O },
  input: I,
  provider: (prompt: string) => Promise<unknown>
): Promise<O> {
  const raw = await provider(spec.promptTemplate(input))
  if (spec.guard) {
    if (spec.guard(raw)) return raw
    throw new Error('Guard validation failed')
  }
  return raw as O
}
Use guards for performance-sensitive code or when you cannot include a schema library. You still get a runtime safety check and clear error handling.
Handling streaming and partial validation
When you handle streaming outputs or agents (Anthropic Cowork style), you may receive partial responses. A robust approach:
- Stream tokens to the client UI incrementally
- Buffer until a complete JSON payload appears, then validate
- Emit typed partial events for UI progress and a final validated object for business logic
// Simplified streaming handler
async function streamAndValidate<O>(
  stream: AsyncIterable<string>,
  // collectJson should return a complete JSON string once the buffer contains one, else null
  collectJson: (chunks: string[]) => string | null,
  zod?: ZodType<O>
): Promise<O> {
  const buffer: string[] = []
  for await (const chunk of stream) {
    buffer.push(chunk)
    const maybeJson = collectJson(buffer)
    if (maybeJson) {
      const parsed = JSON.parse(maybeJson)
      if (zod) return zod.parse(parsed)
      return parsed as O
    }
  }
  throw new Error('Stream ended without valid JSON')
}
When building a streaming layer, consider patterns from real-time and edge ingestion playbooks — especially for buffering and token-level observability. See approaches for streaming and edge data meshes when you design the collector and partial-emit semantics.
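A minimal collector sketch for the collectJson parameter above, assuming the model was asked to emit a single JSON object (the helper name is illustrative):
// Returns the first parseable JSON object in the buffered text, or null to keep buffering
function collectFirstJsonObject(chunks: string[]): string | null {
  const text = chunks.join('')
  const start = text.indexOf('{')
  const end = text.lastIndexOf('}')
  if (start === -1 || end <= start) return null
  const candidate = text.slice(start, end + 1)
  try {
    JSON.parse(candidate)
    return candidate
  } catch {
    return null // braces seen but the payload is not complete yet
  }
}
Pass it as the collectJson argument to streamAndValidate; the final zod.parse still guards the business-logic boundary.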
Schema evolution and versioning
LLM providers change behavior; models drift. Protect your product by versioning prompt specs and schemas; a minimal registry sketch follows this list.
- Ship spec versions with major/minor numbers
- Use a registry (Git or internal schema store) that CI can validate against
- Run nightly contract tests against staging models to detect drift
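A small registry sketch under those assumptions (VersionedSpec and registerSpec are illustrative names, not an existing library):
// Versioned spec entry keyed by name@version so CI can contract-test each one
type VersionedSpec<I, O> = PromptSpec<I, O> & {
  version: `${number}.${number}` // bump major when the output contract breaks
}

const specRegistry = new Map<string, VersionedSpec<any, any>>()

function registerSpec<I, O>(spec: VersionedSpec<I, O>): void {
  specRegistry.set(`${spec.name}@${spec.version}`, spec)
}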
Practical example: typed assistant step for Siri using Gemini vs Claude
Imagine a feature where Siri (running on an iPhone) asks a model to summarize a meeting note and generate action items. You must support either Gemini (via Apple) or Claude (via Anthropic) depending on configuration. Here is a high-level plan.
- Define spec: inputs (meeting transcript, user id, date) and output type (action items array)
- Create runtime schema with Zod or a guard
- Implement two adapters: callGemini and callClaude that normalize to JSON
- Use runPrompt which validates and returns typed ActionItem[]
type ActionItem = { who: string; text: string; due?: string }
const ActionItemsSchema = z.array(z.object({ who: z.string(), text: z.string(), due: z.string().optional() }))
const summarizeSpec: PromptSpec<{ transcript: string }, ActionItem[]> & { zod: ZodType<ActionItem[]> } = {
  name: 'meetingSummary',
  promptTemplate: input => `Summarize this meeting and return JSON action items: ${input.transcript}`,
  outputSchema: ActionItemsSchema,
  zod: ActionItemsSchema
}
// Pick the provider adapter at runtime (featureFlagUseGemini comes from your config)
const provider = featureFlagUseGemini ? callGemini : callClaude
const items = await runPrompt(summarizeSpec, { transcript }, provider)
This setup ensures that whether the backend is Gemini or Claude, your app receives a validated array of ActionItem objects and you can safely generate calendar events or notifications.
Observability and safety: auditing model decisions
For agentic features (Anthropic Cowork makes the case for agents with file system access), add metadata to every prompt call:
- Model id and version
- Prompt hash and spec version
- Validation result and parse errors
- Decision trace: why a suggested action was accepted or rejected
Export these logs to a secure audit store and use them in your CI to detect when a model drifted or started hallucinating fields your schema expects. For UI and developer-facing traces, see patterns for edge-assisted observability and decision-plane telemetry.
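A sketch of such an audit record and a prompt-hash helper, assuming a Node backend (the type and function names are illustrative):
import { createHash } from 'node:crypto'

// One audit row per prompt call, exported to your audit store
type PromptAuditRecord = {
  specName: string
  specVersion?: string
  modelId: string
  promptHash: string
  validated: boolean
  parseError?: string
  timestamp: string
}

function hashPrompt(prompt: string): string {
  return createHash('sha256').update(prompt).digest('hex')
}
Wrap runPrompt so every call emits a record, then diff nightly records to spot drift before users do.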
Advanced strategies
Contract testing and CI
Add contract tests that call a staging model and assert the runtime schema. Fail the build when parsing fails. Use mocked provider responses for unit tests and live model calls in nightly integration tests.
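A minimal contract test sketch using Node's built-in test runner against the specs defined earlier; adapt it to your framework and point it at a staging model or a recorded fixture:
import test from 'node:test'
import assert from 'node:assert/strict'

test('meetingSummary output matches ActionItemsSchema', async () => {
  const raw = await callClaude(summarizeSpec.promptTemplate({ transcript: 'Alice: ship the fix by Friday' }))
  const result = ActionItemsSchema.safeParse(raw)
  assert.ok(result.success, 'model output drifted from the ActionItem contract')
})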
Schema registry and feature flags
Store validated schemas in a registry. When you update a prompt or schema, gate rollout behind feature flags and canary model calls.
Composable prompt building with mapped types
For multi-step agents, compose partial PromptSpecs using mapped types so later steps can accept outputs of earlier steps as typed inputs. This reduces casting and accidental shape mismatch.
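One way to express that chaining with the runPrompt helper from earlier: the second spec's input type must equal the first spec's output type, so mismatches fail at compile time (runChain is an illustrative name):
// Chain two specs: step two's input is step one's validated output
async function runChain<A, B, C>(
  first: PromptSpec<A, B> & { zod?: ZodType<B> },
  second: PromptSpec<B, C> & { zod?: ZodType<C> },
  input: A,
  provider: (prompt: string) => Promise<unknown>
): Promise<C> {
  const intermediate = await runPrompt(first, input, provider)
  return runPrompt(second, intermediate, provider)
}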
Predictions for 2026 and beyond
Expect these trends to accelerate:
- Provider-first contract features: Major providers will release stronger function-calling to return structured JSON and built-in schema validation endpoints.
- On-device typed runtimes: Apple and others will push lightweight schema validators to the device to reduce round trips and surface validation errors locally. Watch coverage of pocket edge hosts and lightweight device runtimes.
- Schema marketplaces and registries: Teams will trade and publish common prompt specs for tasks like summarization, invoices, or medical triage, with versioning and trust signals.
Checklist: Ship safe LLM integrations
- Model your prompts and responses as PromptSpec<Input, Output>
- Mirror each TypeScript type with a runtime validator (Zod or guards)
- Wrap provider calls in adapters that normalize raw output before validation
- Version specs and run nightly contract tests against staging models
- Log model id, prompt hash, and validation outcomes for auditing
- Use feature flags and canaries when switching models or prompt versions
A typed LLM integration turns brittle conversation glue into maintainable, auditable APIs. It might add a few lines of code up front, but it saves entire incident investigations later.
Actionable takeaways
- Start by creating a small PromptSpec for a single critical flow and add a Zod schema for its output
- Implement two provider adapters and run the same spec against both models to detect drift
- Build CI contract tests that fail when parsing fails and add nightly model checks
- For agentic features, log decisions and include schema version and model metadata for audits (governance matters)
Further reading and tools
Libraries that pair well with these patterns in 2026 include Zod for runtime validation, io-ts if you prefer FP style, and smaller hand-rolled guards for on-device scenarios. Watch for new provider SDKs adding schema-first endpoints — a trend that took off in late 2025 and continues in early 2026.
Conclusion and call to action
If your team is integrating Gemini via Apple, Anthropic's Claude or Cowork, or any other LLM, make prompts and outputs first-class typed contracts. Use TypeScript advanced types to keep your codebase safe and predictable, and pair them with runtime validation to avoid surprises in production. Start small: pick one mission-critical prompt, add a schema, and run it against multiple providers. You will reduce incidents and make model upgrades far less risky.
Ready to convert a brittle LLM call into a typed contract in your codebase? Try the sample patterns above, add a contract test to CI, and roll it out behind a feature flag. If you want a checklist or code templates adapted to your stack (React Native Siri extension, Node backend for Claude, or on-device Swift+TypeScript bridge), share your stack and I will provide a tailored starter.