Grain Inference and Fanout Risk

Objective

Infer candidate model grain, join multiplicity profiles, and fanout risk scores from profiling stats and the relationship graph. Surface the resulting warnings to agents and to compile-time linting.
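
For readers new to the term, the sketch below illustrates the kind of fanout this task aims to flag. The tables, column names, and values are hypothetical toy data, not taken from the project: joining an order-grain table to a line-item-grain table repeats each order row, so summing an order-level measure after the join overstates it.

```python
# Hypothetical toy data: one row per order, several rows per order in line_items.
orders = [
    {"order_id": 1, "shipping_fee": 5.0},
    {"order_id": 2, "shipping_fee": 7.0},
]
line_items = [
    {"order_id": 1, "sku": "A"},
    {"order_id": 1, "sku": "B"},
    {"order_id": 1, "sku": "C"},
    {"order_id": 2, "sku": "A"},
]

# Inner join on order_id: the result is at line-item grain, not order grain.
joined = [
    {**order, **item}
    for order in orders
    for item in line_items
    if order["order_id"] == item["order_id"]
]

# Summing an order-level measure before and after the join.
true_total = sum(order["shipping_fee"] for order in orders)
inflated_total = sum(row["shipping_fee"] for row in joined)

# Average number of joined rows produced per order row.
fanout_factor = len(joined) / len(orders)

print(true_total, inflated_total, fanout_factor)  # 12.0 22.0 2.0
```

The join changes the grain from order to line item, which is why grain and join multiplicity need to be known before an aggregate can be trusted.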

Deliverables

  • Define scope and decision boundaries.
  • Produce implementation plan and execution checkpoints.
  • Capture rollout and validation approach.

Approach

The implementation has three phases, each adding a new module under dataface/core/inspect/:

  1. Phase 1 — Grain candidate inference (grain_detector.py): Detect per-table grain from existing profiler stats (PK, uniqueness, naming). No new DB queries.
  2. Phase 2 — Join multiplicity profiling (join_multiplicity.py): Classify relationship cardinality (1:1, 1:N, N:1, N:M) and compute fanout factors from existing column stats.
  3. Phase 3 — Fanout risk scoring (fanout_risk.py): Score join risk (none → critical) with actionable recommendations. Wire into MCP tools and compile-time warnings.
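
The three phases above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the design in spec.md: `ColumnStats`, every function name, the intermediate risk levels, and the thresholds are hypothetical, and the fanout factor assumes every left-side key finds a match on the right. Like Phases 1 and 2, it works only from precomputed column stats and issues no database queries.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical subset of profiler output for one column."""
    name: str
    row_count: int
    distinct_count: int
    null_count: int = 0
    is_primary_key: bool = False


def grain_candidates(columns):
    """Phase 1 sketch: names of columns that could be a single-column grain."""
    candidates = []
    for col in columns:
        fully_unique = (
            col.row_count > 0
            and col.null_count == 0
            and col.distinct_count == col.row_count
        )
        if col.is_primary_key or fully_unique:
            candidates.append(col.name)
    return candidates


def side_multiplicity(col):
    """Average rows per distinct non-null key value; 1.0 means unique."""
    if col.distinct_count == 0:
        return 0.0
    return (col.row_count - col.null_count) / col.distinct_count


def classify_join(left, right):
    """Phase 2 sketch: cardinality label plus left-to-right fanout factor."""
    left_many = side_multiplicity(left) > 1.0
    right_many = side_multiplicity(right) > 1.0
    labels = {
        (False, False): "1:1",
        (False, True): "1:N",
        (True, False): "N:1",
        (True, True): "N:M",
    }
    return labels[(left_many, right_many)], side_multiplicity(right)


def fanout_risk(label, fanout):
    """Phase 3 sketch: risk for left-side measures (thresholds are assumed)."""
    if label in ("1:1", "N:1"):
        return "none"
    if label == "N:M":
        return "critical"
    if fanout < 2:
        return "low"
    if fanout < 10:
        return "medium"
    return "high"


# Hypothetical stats for an orders -> line_items relationship.
orders_key = ColumnStats("order_id", row_count=100, distinct_count=100,
                         is_primary_key=True)
items_key = ColumnStats("order_id", row_count=350, distinct_count=100)

label, fanout = classify_join(orders_key, items_key)
risk = fanout_risk(label, fanout)
print(grain_candidates([orders_key]), label, fanout, risk)
```

In this sketch the risk is scored from the left table's perspective only, so an N:1 join is treated as safe for left-side measures; a fuller treatment would likely also consider measures on the right side and composite grains.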

Phase 4 (multi-hop path analysis) is deferred to M2+.

See spec.md for full phased plan, output schemas, algorithms, non-goals, and acceptance criteria.

Tasks