Behind the Data - How Metadata Drives Trustworthy AI in Utilities
Suppose I say the number 35 out loud. Does this mean anything to you? Probably not.
Now, what if I tell you 35 represents kilowatt-hours (kWh), recorded on September 1, 2025, at 23:59 by AMI meter #1234? Suddenly, you know much more:
- AMI meter #1234 recorded 35 kWh of usage.
- The recording occurred on September 1, 2025, at 23:59.
- By querying other data sources, you can determine which customer AMI meter #1234 belongs to and what type of meter it is.
The number 35 on its own is just a value. The supporting details around it are metadata. Metadata describes data, giving it both meaning and context.
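To make the distinction concrete, here is a minimal Python sketch of the same reading, first as a bare value and then wrapped in its metadata. The class and field names are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MeterReading:
    """A value plus the metadata that gives it meaning (names are illustrative)."""
    value: float           # the bare number
    unit: str              # what the number measures
    recorded_at: datetime  # when it was recorded
    meter_id: str          # which device recorded it

    def describe(self) -> str:
        return (f"Meter {self.meter_id} recorded {self.value:g} {self.unit} "
                f"at {self.recorded_at:%Y-%m-%d %H:%M}")


bare_value = 35  # on its own, this says nothing

reading = MeterReading(
    value=35,
    unit="kWh",
    recorded_at=datetime(2025, 9, 1, 23, 59),
    meter_id="1234",
)
print(reading.describe())
# Meter 1234 recorded 35 kWh at 2025-09-01 23:59
```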
You may ask: What does this have to do with AI in utilities?
Large language models (LLMs) and other AI systems can process any data stream and attempt to make inferences. But if you feed them only raw time-series numbers (say, meter readings with no labels, timestamps, or units), they struggle to make sense of what those numbers represent, let alone provide actionable insights.
What AI Can Do Without Metadata
Even without metadata, AI isn’t useless. It can still:
- Detect basic patterns
- Perform unsupervised learning such as clustering or anomaly detection
- Conduct general classifications
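As a rough illustration of pattern detection on unlabeled numbers, the sketch below flags outliers with a simple z-score test. The series is made up, and the threshold of 2 standard deviations is an arbitrary assumption, not a recommended setting.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up numbers with no labels, timestamps, or units
series = [34, 35, 36, 33, 35, 34, 36, 35, 90, 34]
print(find_outliers(series))
# [8]
```

The function can point at position 8, but nothing in the numbers says whether that is a real usage spike, a change of unit, or a faulty device.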
The Problems of Missing Metadata
But serious issues arise without metadata:
- Loss of provenance and context: Without time, location, or source, data can be misinterpreted. For example, was the temperature reading recorded in Fahrenheit or Celsius?
- Lower trust and transparency: If you can’t prove where data came from, you can’t prove its reliability
- Overly generalized results: Predictions lack precision
- Harder to explain and audit: Troubleshooting or regulatory audits become painful
- Difficult dataset mapping: Integrating multiple systems becomes guesswork
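The Fahrenheit-or-Celsius question can be shown in a few lines. In this sketch the same number leads to opposite conclusions depending on the unit. The overheating limit is a hypothetical value chosen only for the example.

```python
def to_celsius(value, unit):
    """Convert a temperature to Celsius; refuse to guess when the unit is unknown."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown temperature unit: {unit!r}")


OVERHEAT_LIMIT_C = 60.0  # hypothetical alarm threshold, for illustration only

raw_reading = 95.0  # the number as it arrives, with no unit attached
for unit in ("C", "F"):
    temp_c = to_celsius(raw_reading, unit)
    print(unit, temp_c, temp_c > OVERHEAT_LIMIT_C)
# C 95.0 True
# F 35.0 False
```

Raising an error for an unknown unit is a deliberate choice here: a loud failure is easier to audit than a silent wrong guess.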
For very narrow cases (say, a single model applied across identical residential feeders), AI may still deliver some value without explicit metadata. But that assumes consistent metadata already exists in the background.
How Metadata Helps
Metadata unlocks the real power of AI in utilities by:
- Ensuring data quality and integrity through traceability and evidence
- Supporting explainability and audits, making it easier to connect decisions back to data
- Integrating heterogeneous sources such as SCADA, smart meters, or smart devices like thermostats or water heaters. This is where semantics comes in: not just labeling data but defining its meaning. Standard ontologies and formats (like the Common Information Model) make sure systems speak the same language
- Adding context like time, location, weather, or Demand Response calls. Example: correlating extreme heat events with spikes in distributed energy resource activity
- Enabling responsible AI by supporting accountability, transparency, reliability, and compliance. Example: proving that a rebate program is applied equitably across demographics, rather than unintentionally clustering benefits in wealthier neighborhoods
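One way to picture the integration point is a small mapping from each system's field names to a shared vocabulary. The source names, field names, and vocabulary below are invented for illustration. They are not taken from the Common Information Model or any real system.

```python
# Hypothetical mapping from source-specific field names to a shared vocabulary
FIELD_MAP = {
    "ami": {"mtr": "device_id", "ts": "timestamp", "kwh": "energy_kwh"},
    "scada": {"point": "device_id", "time": "timestamp", "mw": "power_mw"},
}


def normalize(source, record):
    """Rename a record's fields to the shared vocabulary and tag its source."""
    mapping = FIELD_MAP[source]
    unknown = set(record) - set(mapping)
    if unknown:
        # An unmapped field has no agreed meaning yet, so stop instead of guessing
        raise KeyError(f"unmapped fields from {source}: {sorted(unknown)}")
    normalized = {mapping[name]: value for name, value in record.items()}
    normalized["source"] = source  # keep provenance with the data
    return normalized


ami_record = normalize("ami", {"mtr": "1234", "ts": "2025-09-01T23:59", "kwh": 35})
scada_record = normalize("scada", {"point": "FDR-7", "time": "2025-09-01T23:59", "mw": 1.2})
print(ami_record)
print(scada_record)
```

Once both records share `device_id` and `timestamp`, they can be joined or compared without anyone guessing which column means what.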
Context vs Semantics
It’s worth pausing here:
- Contextual metadata tells you when, where, how, and by whom data was created
- Semantic metadata tells you what the data means and how it relates to other data
In any business, including utilities, both are essential. Without context, AI can’t situate data in the right time or location. Without semantics, AI can’t distinguish between “temperature” and “voltage,” or know if “35” means kWh, MW, or °C.
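A simple check can report which kind of metadata a record is missing. This is a sketch under assumptions: the split of fields into contextual and semantic groups, and the field names themselves, are illustrative choices rather than a standard.

```python
CONTEXT_FIELDS = ("recorded_at", "location", "source")  # when, where, by whom
SEMANTIC_FIELDS = ("quantity", "unit")                  # what the value means


def missing_metadata(record):
    """List the contextual and semantic fields a record lacks."""
    return {
        "context": [f for f in CONTEXT_FIELDS if record.get(f) is None],
        "semantic": [f for f in SEMANTIC_FIELDS if record.get(f) is None],
    }


# A reading that knows when and by whom it was recorded, but not what it measures
record = {"value": 35, "recorded_at": "2025-09-01T23:59", "source": "AMI meter 1234"}
gaps = missing_metadata(record)
print(gaps)
# {'context': ['location'], 'semantic': ['quantity', 'unit']}
```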
Putting Metadata into Practice
Understanding metadata is one thing. Putting it into practice is another. For utilities, that means establishing a data governance framework—the policies, processes, and roles that keep data accurate, meaningful, and usable.
Here are a few steps to get started:
- Identify Data Owners – Assign clear ownership for every dataset. Owners are accountable for quality, access, and usage
- Map Data Flows – Visualize how data moves through your organization. Show how collection points (AMI systems, SCADA, IoT devices, customer portals) connect to business processes
- Create a Data Dictionary – Define a common vocabulary so everyone speaks the same data language. For stronger integration, adopt standards and ontologies like the Common Information Model (CIM)
- Assign Data Stewards – Designate roles to maintain metadata quality, monitor consistency, and close gaps. Remember: metadata isn’t a one-time project—it evolves with your organization
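The steps above can be sketched as a small data dictionary whose entries carry a definition, an owner, and a steward, plus a check for unfilled roles. The entry names, team names, and structure are hypothetical. A real governance tool or catalog would look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DictionaryEntry:
    """One term in a data dictionary (fields are illustrative)."""
    name: str
    definition: str
    unit: Optional[str] = None
    owner: Optional[str] = None    # accountable for quality, access, and usage
    steward: Optional[str] = None  # maintains metadata quality day to day


def governance_gaps(entries):
    """Map each entry name to the roles that are still unassigned."""
    gaps = {}
    for entry in entries:
        missing = [role for role in ("owner", "steward") if getattr(entry, role) is None]
        if missing:
            gaps[entry.name] = missing
    return gaps


dictionary = [
    DictionaryEntry("energy_kwh", "Energy delivered during the interval", "kWh",
                    owner="Metering team", steward="Data office"),
    DictionaryEntry("feeder_load_mw", "Average feeder load during the interval", "MW",
                    owner="Grid operations"),
    DictionaryEntry("outage_flag", "True if the interval overlaps a recorded outage"),
]
print(governance_gaps(dictionary))
# {'feeder_load_mw': ['steward'], 'outage_flag': ['owner', 'steward']}
```

Running a check like this on a schedule fits the point that metadata is never a one-time project: new datasets show up as gaps until someone takes responsibility for them.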
Closing Thought
Metadata is already central to utility operations—even before AI enters the picture. But as AI adoption accelerates, the risks of ignoring metadata grow sharper. Poorly contextualized or semantically mismatched data doesn’t just create inefficiency; it can lead to catastrophic decisions on the grid.
We help utilities design governance frameworks to ensure AI is built on a foundation everyone can trust. Contact us to see how we can help put this into practice.