Research¶
Key References¶
- Fanout deep dive — canonical explanation of fanout, GROUP BY pitfalls, and sidecar pattern
- Fivetran dbt packages fanout analysis — 25-model analysis across 5 Fivetran packages identifying real fanout patterns
- Inspector output schema — current profiler output format (grain/fanout fields will extend this)
- Profiler contract — versioning policy for inspector output
Key Findings from Fivetran Analysis¶
- Pre-aggregate then join is the dominant safe pattern (17/25 models). This is what fanout warnings should recommend.
- NetSuite has real structural fanout risk — multi-period currency conversion creates grain expansion. Grain detection would catch this immediately.
- No packages document grain explicitly — validates the need for automated grain inference.
- 5 risk patterns identified: pre-aggregate-join (safe), multi-period expansion (risky), SUM-after-dimension-join (watch), UNION-then-aggregate (safe), star-join-of-sidecars (safe).
Existing Inspector Signals Useful for Grain Detection¶
From the current profiler output:
- key_role=primary_key → direct grain signal (100% unique + name matches id pattern)
- key_role=foreign_key → relationship signal (points to parent table grain)
- uniqueness_ratio → composite grain candidate detection
- role=identifier vs role=dimension → grain column vs grouping column
- is_incremental + role=time → temporal grain component
- distribution=unique → likely grain column
- row_count → denominator for uniqueness checks