Schema-derived AI prompts from compiled types

Problem

AI prompts describing the Dataface YAML spec are hand-maintained markdown strings in schema.py. When the schema evolves (new chart types, new layout options, new variable features), the prompt templates must be updated by hand, and in practice they drift. generate_face_schema_summary() is a 70-line hand-written markdown string that is already incomplete: it is missing settings, geo charts, and advanced layout options. json-render solved this with catalog.prompt(), which auto-generates a complete system prompt from the Zod-typed catalog, so adding a component automatically makes it available to AI.

Context

  • Research origin: ai_notes/research/json-render-deep-dive.md — Priority 1 section.
  • Current hand-maintained prompts: dataface/core/compile/schema.py: get_schema_for_prompt(), generate_face_schema_summary(), generate_variable_schema()
  • Pydantic input types: dataface/core/compile/types.py: Face, Chart, Variable, QueryDefinition, etc.
  • Compiled types: dataface/core/compile/compiled_types.py: CompiledFace, CompiledChart, etc.
  • Chart type enum: ChartType in types.py
  • AI integration: dataface/ai/ — MCP server and tools that consume schema prompts
  • Dependency: declarative-schema-definition-outside-python-code — a declarative schema makes auto-generation trivial; without it, we'd introspect Pydantic models directly (possible but messier)
  • Enables: extensible-schema-with-custom-elements-and-chart-types — when users register custom elements, they automatically appear in AI prompts

Possible Solutions

A. Pydantic introspection

Walk the Pydantic model tree (Face, Chart, the ChartType enum, etc.) at runtime and extract field names, types, defaults, docstrings, and enum values. Generate the markdown prompt from the live model structure.

Pros: No new schema format needed. Works today. Always in sync with code by definition.

Cons: Pydantic docstrings aren't optimized for AI context. Can't express "when to use this" hints (like json-render's description field on components). May produce verbose/noisy prompts.
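
A minimal sketch of the model walk in option A, assuming Pydantic v2 (model_fields, FieldInfo.is_required()). The Face, Chart, and ChartType classes below are small hypothetical stand-ins, not the real definitions in types.py, and model_to_markdown is an illustrative name rather than an existing function.

```python
from enum import Enum
from typing import Optional, get_args

from pydantic import BaseModel, Field


# Hypothetical stand-ins; the real models live in dataface/core/compile/types.py.
class ChartType(str, Enum):
    BAR = "bar"
    LINE = "line"


class Chart(BaseModel):
    """A single chart."""

    type: ChartType = Field(description="Chart kind.")
    title: Optional[str] = Field(default=None, description="Heading shown above the chart.")


class Face(BaseModel):
    """Top-level dashboard document."""

    title: str = Field(description="Dashboard title.")
    charts: list[Chart] = Field(default_factory=list, description="Charts to render.")


def _leaf_types(annotation):
    """Yield the plain classes inside an annotation such as Optional[list[Chart]]."""
    args = get_args(annotation)
    if not args:
        if isinstance(annotation, type):
            yield annotation
        return
    for arg in args:
        yield from _leaf_types(arg)


def model_to_markdown(model, seen=None):
    """Return markdown lines for a model and every nested model it references."""
    seen = set() if seen is None else seen
    if model in seen:  # describe each model once, even if referenced repeatedly
        return []
    seen.add(model)
    lines = [f"## {model.__name__}", (model.__doc__ or "").strip(), ""]
    nested = []
    for name, field in model.model_fields.items():
        required = "required" if field.is_required() else "optional"
        parts = [f"- `{name}` ({required})"]
        if field.description:
            parts.append(field.description)
        for leaf in _leaf_types(field.annotation):
            if issubclass(leaf, Enum):
                values = ", ".join(str(member.value) for member in leaf)
                parts.append(f"One of: {values}.")
            elif issubclass(leaf, BaseModel):
                parts.append(f"See {leaf.__name__}.")
                nested.append(leaf)
        lines.append(" ".join(parts))
    lines.append("")
    for child in nested:
        lines.extend(model_to_markdown(child, seen))
    return lines


print("\n".join(model_to_markdown(Face)))
```

Because the enum values and field descriptions are read from the classes at call time, adding a member to ChartType changes the generated text without touching the generator. The "when to use this" gap noted in the cons is visible here: the only prose available is whatever sits in description and the docstrings.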

B. Declarative schema with AI annotations

Once the declarative schema exists (see dependency), add AI-specific metadata: component descriptions, usage hints, common mistakes, examples. Generate prompts from this enriched schema.

Pros: Rich, curated AI context. Descriptions tuned for LLM consumption. Same schema powers validation, editor tooling, and AI prompts.

Cons: Blocked on declarative schema work. Requires maintaining description quality.
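
To make option B concrete, here is a rough sketch of rendering a prompt from an annotated schema. The declarative format does not exist yet, so everything about its shape is an assumption: the schema is shown as an already-parsed dict, and the key names (ai, when_to_use, common_mistakes, example) and the YAML snippet inside example are hypothetical.

```python
# Hypothetical declarative schema, shown as an already-parsed dict.
# All key names under "ai" are assumptions made for illustration only.
SCHEMA = {
    "chart_types": {
        "bar": {
            "description": "Compare a measure across categories.",
            "ai": {
                "when_to_use": "Categorical x-axis with one numeric measure.",
                "common_mistakes": ["Using bar for time series with many points."],
                "example": "type: bar\nx: region\ny: revenue",
            },
        },
        "line": {
            "description": "Show a measure over time.",
            "ai": {"when_to_use": "Ordered or temporal x-axis."},
        },
    }
}


def render_prompt(schema):
    """Render chart-type entries, including any AI annotations, as markdown."""
    lines = ["# Chart types", ""]
    for name, spec in schema["chart_types"].items():
        ai = spec.get("ai", {})  # annotations are optional per entry
        lines.append(f"## {name}")
        lines.append(spec["description"])
        if "when_to_use" in ai:
            lines.append(f"Use when: {ai['when_to_use']}")
        for mistake in ai.get("common_mistakes", []):
            lines.append(f"Avoid: {mistake}")
        if "example" in ai:
            lines.append("Example:")
            lines.extend(f"    {row}" for row in ai["example"].splitlines())
        lines.append("")
    return "\n".join(lines)


print(render_prompt(SCHEMA))
```

Entries without annotations still render, so description quality can be improved incrementally rather than gating the generator on full coverage.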

C. Hybrid: Pydantic introspection now, migrate to declarative later

Start with (A) to eliminate hand-maintenance immediately. When the declarative schema lands, switch the prompt generator to read from it instead. The public API (get_schema_for_prompt()) stays the same.

Pros: Immediate improvement. Clean migration path. No throwaway work.

Cons: Two rounds of implementation.
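
One way the stable public API in option C might look is a thin facade over interchangeable generators. This is only a sketch: the real signature of get_schema_for_prompt() is not shown in this document, so the no-argument, string-returning form is an assumption, and use_backend plus both backend functions are hypothetical names with placeholder bodies.

```python
from typing import Callable


# Hypothetical backends with placeholder bodies; each returns prompt text.
def _prompt_from_pydantic() -> str:
    return "prompt built by introspecting Pydantic models"


def _prompt_from_declarative_schema() -> str:
    return "prompt built from the declarative schema"


_BACKENDS: dict[str, Callable[[], str]] = {
    "pydantic": _prompt_from_pydantic,
    "declarative": _prompt_from_declarative_schema,
}

# Start on introspection; flip once the declarative schema lands.
_active_backend = "pydantic"


def use_backend(name: str) -> None:
    """Select which generator backs get_schema_for_prompt()."""
    global _active_backend
    if name not in _BACKENDS:
        raise ValueError(f"unknown prompt backend: {name!r}")
    _active_backend = name


def get_schema_for_prompt() -> str:
    """Public entry point; callers never see which backend produced the text."""
    return _BACKENDS[_active_backend]()


print(get_schema_for_prompt())
use_backend("declarative")
print(get_schema_for_prompt())
```

Keeping both backends registered during the migration would also allow their outputs to be compared before switching over.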

Plan

Implementation Progress

Review Feedback

  • Review cleared