proofchat

Schema Labeling

Best practices for labeling tables and fields so AI responses are accurate and safe.

Summary

  • Purpose: Make the schema AI-friendly without overexposing data
  • Audience: Integrators, data owners
  • Prereqs: DDL Extraction

Guidance

  • The DDL extraction looks for the "[LLM]" tag in field comments.
  • Exposure is decided per table:
    • Whitelist mode: If any field in a table is tagged "[LLM]", only fields tagged "[LLM]" are exposed to the AI.
    • Open mode: If no fields are tagged "[LLM]", all fields in the table are exposed.
  • Add brief descriptions to disambiguate field meaning.
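  • Do not tag sensitive fields (PII, secrets) unless strictly necessary. Because a table with no tagged fields is in open mode, leaving a sensitive field untagged hides it only if at least one other field in that table is tagged "[LLM]".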
  • For categorical fields, prefer enumerations or a fixed set of tags over free text.
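
The per-table rule above can be sketched in a few lines of Python. This is a minimal illustration, not the actual DDL extraction code: the function name exposed_fields, the dictionary input shape (field name mapped to comment text), and the sample comments are all assumptions made for the example.

```python
LLM_TAG = "[LLM]"

def exposed_fields(field_comments):
    """Return the field names exposed to the AI for one table.

    field_comments maps field name -> comment text (hypothetical input shape).
    """
    tagged = [
        name
        for name, comment in field_comments.items()
        if LLM_TAG in (comment or "")
    ]
    # Whitelist mode: at least one field is tagged, so only tagged fields are exposed.
    if tagged:
        return tagged
    # Open mode: no field is tagged, so every field is exposed.
    return list(field_comments)

# Hypothetical field comments for an Invoices table.
invoices = {
    "Total": "[LLM] Invoice total including tax",
    "Status": "[LLM] One of: Draft, Sent, Paid",
    "Notes": "Free-text internal notes",
}
print(exposed_fields(invoices))  # ['Total', 'Status']
```

Removing both tags from the sample would switch the table to open mode, and Notes would be exposed along with the other fields.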

Example

  • If Invoices::Total and Invoices::Status are tagged "[LLM]" but Invoices::Notes is not, the table is in whitelist mode and the AI can access only Total and Status.

Checklist

  • Only necessary tables/fields labeled
  • Ambiguity reduced with descriptions
  • Sensitive data excluded or summarized
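
Parts of this checklist could be automated with a small audit script. The sketch below is one possible approach under stated assumptions: the function audit_table, the input shape, and the SENSITIVE name patterns are hypothetical and would need adjusting to a real schema. It flags tables left in open mode, exposed fields that lack a description, and exposed fields whose names look sensitive.

```python
import re

LLM_TAG = "[LLM]"
# Hypothetical name patterns that suggest sensitive data; adjust to your schema.
SENSITIVE = re.compile(r"ssn|password|secret|token|email|phone|birth", re.IGNORECASE)

def audit_table(table, field_comments):
    """Return a list of warning strings for one table.

    field_comments maps field name -> comment text (hypothetical input shape).
    """
    comments = {name: (text or "") for name, text in field_comments.items()}
    tagged = [name for name, text in comments.items() if LLM_TAG in text]
    # Same rule as the guidance: tagged fields only, or all fields if none are tagged.
    exposed = tagged or list(comments)

    warnings = []
    if not tagged:
        warnings.append(f"{table}: no [LLM] tags, all fields exposed (open mode)")
    for name in exposed:
        description = comments[name].replace(LLM_TAG, "").strip()
        if not description:
            warnings.append(f"{table}::{name}: exposed without a description")
        if SENSITIVE.search(name):
            warnings.append(f"{table}::{name}: exposed and name looks sensitive")
    return warnings

# Hypothetical table: Email is tagged but has no description and a sensitive-looking name.
customers = {
    "Name": "[LLM] Full customer name",
    "Email": "[LLM]",
    "Region": "Sales region",
}
for warning in audit_table("Customers", customers):
    print(warning)
```

Name matching is only a heuristic. It will miss sensitive fields with neutral names, so it complements a manual review rather than replacing one.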