Schema Labeling
Best practices for labeling tables and fields so AI responses are accurate and safe.
Summary
- Purpose: Make schema AI-friendly without overexposing data
- Audience: Integrators, data owners
- Prereqs: DDL Extraction
Guidance
- The DDL extraction looks for the "[LLM]" tag in field comments.
- Table behavior:
  - Whitelist mode: If any field in a table is tagged "[LLM]", only the fields tagged "[LLM]" in that table are exposed to the AI.
  - Open mode: If no fields in a table are tagged "[LLM]", all fields in that table are exposed.
- Add brief descriptions to disambiguate field meaning.
- Do not tag sensitive fields (PII, secrets) unless strictly necessary.
- Prefer enumerations or tags for categorical fields.
Example
- If Invoices::Total [LLM] and Invoices::Status [LLM] are tagged but Invoices::Notes is not, the AI can access only Total and Status.
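The whitelist and open-mode rules, applied to the Invoices example, can be sketched in Python. This is an illustration only, not the extractor's actual implementation: the function name `exposed_fields`, the dictionary input shape, the sample comment text, the `Customers` table, and the plain substring check for "[LLM]" are all assumptions made for this sketch.

```python
LLM_TAG = "[LLM]"  # tag the DDL extraction looks for in field comments


def exposed_fields(field_comments):
    """Return the field names exposed to the AI for a single table.

    field_comments maps field name -> comment text (hypothetical input shape;
    a missing comment may be None or an empty string).
    """
    # Assumption: a field counts as tagged if its comment contains the tag.
    tagged = [
        name
        for name, comment in field_comments.items()
        if LLM_TAG in (comment or "")
    ]
    # Whitelist mode: at least one tagged field, so expose only tagged fields.
    # Open mode: no tagged fields, so expose every field in the table.
    return tagged if tagged else list(field_comments)


# The Invoices example: Total and Status are tagged, Notes is not.
invoices = {
    "Total": "Invoice total [LLM]",
    "Status": "Invoice status [LLM]",
    "Notes": "Internal notes",
}
print(exposed_fields(invoices))  # ['Total', 'Status']

# A hypothetical table with no tagged fields falls back to open mode.
customers = {"Name": "Customer name", "Email": None}
print(exposed_fields(customers))  # ['Name', 'Email']
```

The second call shows why tagging matters for sensitive data: a table with no "[LLM]" tags exposes every field, so tagging at least one safe field switches that table to whitelist mode.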
Checklist
- Only necessary tables/fields labeled
- Ambiguity reduced with descriptions
- Sensitive data excluded or summarized