Why We Built a Model Context Protocol Server for Health Data
Most health platforms treat AI as a chatbot skin over a database. We built an MCP server with a plugin architecture, time-series storage, and multi-step reasoning. Here's why.
Most health platforms treat AI as a chatbot skin over a database. You ask a question, a query runs, an LLM paraphrases the result. It works — until you ask something that requires reasoning across multiple data sources, time ranges, or analytical dimensions.
We took a different approach. Omnio’s AI is powered by a purpose-built Model Context Protocol (MCP) server with a rich toolkit of specialized tools, a plugin architecture for every data source, and a time-series database designed for exactly this kind of workload. This post explains why — and what it means for the insights you get.
The Problem with “Database Layer + LLM Interpreter”
The common pattern in health AI products goes like this:
- User asks a question
- App translates it into a database query
- Database returns numbers
- LLM writes a sentence about those numbers
This works for simple lookups: “What was my sleep score last night?” But it falls apart for the questions that actually matter:
- “Why did my sleep quality drop this week?”
- “How does my training volume affect my recovery the next day?”
- “Compare my sleep on days I lift heavy vs. rest days over the last 90 days.”
These questions require multi-step reasoning — fetching sleep data, then activity data, then computing correlations, then checking for outliers, then contextualizing the results. A static database layer can’t anticipate the chain of analysis an LLM needs to perform.
MCP: Letting the AI Drive the Analysis
The Model Context Protocol is an open standard for connecting AI models to external tools and data sources. Instead of pre-computing answers, MCP gives the AI a toolkit and lets it decide which tools to use, in what order, based on what the user is actually asking.
Our MCP server exposes tools across several categories:
Domain Tools
The foundation. Each tool returns structured, normalized data from one or more wearable sources — covering sleep, recovery, activity, workouts, strength training, heart rate, stress, environment, bloodwork, body composition, blood oxygen, and nutrition.
These tools abstract away the differences between devices. Whether your sleep data comes from an Oura Ring, a Garmin, or a WHOOP, the AI sees a consistent schema.
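As a minimal sketch of what "consistent schema" means here, the snippet below normalizes a vendor payload into one shared record type. The field names (`sleep_score`, `deep_sleep_duration`) and the `SleepRecord` shape are hypothetical, not Omnio's actual schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SleepRecord:
    """Normalized sleep record the AI sees, regardless of device."""
    day: date
    score: int            # 0-100 composite score
    deep_minutes: int
    rem_minutes: int
    source: str           # e.g. "oura", "garmin", "whoop"

def normalize_oura(raw: dict) -> SleepRecord:
    """Hypothetical mapping from one vendor's payload to the shared schema."""
    return SleepRecord(
        day=date.fromisoformat(raw["day"]),
        score=raw["sleep_score"],
        deep_minutes=raw["deep_sleep_duration"] // 60,  # vendor reports seconds
        rem_minutes=raw["rem_sleep_duration"] // 60,
        source="oura",
    )

record = normalize_oura({
    "day": "2024-06-01",
    "sleep_score": 82,
    "deep_sleep_duration": 5400,
    "rem_sleep_duration": 6600,
})
print(record.score, record.deep_minutes)  # 82 90
```

Each device plugin would own one such mapping; downstream tools only ever see `SleepRecord`.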
Cross-Source Analysis Tools
This is where it gets interesting. These tools operate across data sources to find relationships:
- Correlation analysis — Joins all metrics by date and computes correlations across known metric pairs (e.g., training volume vs. deep sleep, air quality vs. sleep score, protein intake vs. recovery)
- Temporal patterns — Groups metrics by day of week to surface behavioral patterns
- Outlier detection — Identifies unusual days and cross-references what else was atypical, connecting cause and effect
- Predictive modeling — Identifies which metrics are the strongest predictors of a target metric using lag analysis
Comparison & Snapshot Tools
Tools for high-level overviews and period-over-period comparisons — things like comprehensive health snapshots, automatic delta computation between time periods, recovery adherence tracking, and analysis of how specific behaviors (like training) affect subsequent recovery and sleep.
Why This Matters in Practice
When you ask “Why did my sleep drop this week?”, the AI doesn’t just fetch your sleep data. It:
- Pulls sleep data for the current and previous week
- Sees the decline, then pulls activity, stress, HRV, environment, and nutrition data for the full period to look for correlations
- Identifies what was unusual on your worst sleep nights
- Synthesizes: “Your sleep score dropped 8 points this week. The data shows three contributing factors: your training volume was 40% higher than your 30-day average, your bedroom PM2.5 was elevated on Tuesday and Wednesday (likely the wildfire smoke), and your protein intake dropped by 25g/day — which correlates with reduced deep sleep in your historical data.”
That’s multiple tool calls, chained by the AI’s reasoning about what to investigate next. No pre-built query could anticipate that chain.
The Plugin Architecture
Every data source in Omnio is a plugin — a self-contained module that declares its capabilities and implements standardized data access methods.
We currently support plugins for Oura, Garmin, WHOOP, strength training apps (LiftLog, Hevy), nutrition trackers (MyFitnessPal, Cronometer), environment sensors (via Home Assistant), smart scales, DEXA scans, bloodwork, and manual entries.
Each plugin declares a capability map — structured metadata about what data types it provides and what fields are available. When the MCP server starts, it auto-discovers plugins — if you have Oura and Garmin configured, those plugins register. If you add WHOOP later, its data automatically appears in every cross-source analysis tool. The AI sees a unified dataset; it doesn’t need to know which device generated which metric.
The aggregation layer runs all plugin queries in parallel, merges results by date, and handles partial failures gracefully — if one source times out, the others still return.
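A minimal sketch of that aggregation layer, assuming a hypothetical `Plugin` interface: queries fan out in parallel, and a failed source becomes an exception object rather than sinking the whole request:

```python
import asyncio

class Plugin:
    """Sketch of a plugin: a name, a capability map, and a fetch method."""
    def __init__(self, name: str, capabilities: dict):
        self.name = name
        self.capabilities = capabilities  # e.g. {"sleep": ["score", "deep_minutes"]}

    async def fetch(self, metric: str, start: str, end: str) -> dict:
        # Real plugins query the time-series DB; here we return canned data.
        return {self.name: {"2024-06-01": 82}}

async def aggregate(plugins, metric, start, end):
    """Run all plugin queries in parallel; tolerate partial failures."""
    results = await asyncio.gather(
        *(p.fetch(metric, start, end) for p in plugins),
        return_exceptions=True,   # a timed-out source doesn't sink the rest
    )
    merged = {}
    for r in results:
        if not isinstance(r, Exception):
            merged.update(r)      # merge per-source results by name
    return merged

plugins = [Plugin("oura", {"sleep": ["score"]}), Plugin("garmin", {"sleep": ["score"]})]
data = asyncio.run(aggregate(plugins, "sleep", "2024-06-01", "2024-06-07"))
print(sorted(data))  # ['garmin', 'oura']
```

`return_exceptions=True` is what makes the partial-failure behavior explicit: every source either contributes data or is skipped, never blocks.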
Why Time-Series, Not Relational
Most health platforms store metric data in PostgreSQL (or worse, SQLite). We use a purpose-built time-series database designed for exactly this workload.
Why it matters:
- Native range queries — Aggregations, rollups, and range queries are first-class operations, not bolted-on SQL window functions
- Efficient storage — Compression ratios of 10–20x for health metrics mean we can store years of high-frequency data (heart rate every few seconds, sleep stages minute-by-minute) without cost concerns
- Downsampling — Automatic retention policies keep high-resolution data for recent periods and progressively aggregate older data
- Sub-second queries on 365-day ranges — Asking the AI to analyze a full year of correlations isn’t a theoretical feature; it’s a fast operation
- Label-based filtering — Sleep type, workout activity type, environment room, sensor source — all queryable via labels without schema changes
Every plugin’s data client speaks the database’s native query language. When the AI asks for 90 days of correlated data, it’s running parallel range queries across every configured source, not scanning relational tables.
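To make the downsampling idea concrete, here is the principle in plain Python: bucket high-frequency samples and keep one aggregate per bucket. The time-series database does this natively via retention policies; this sketch only illustrates the transformation:

```python
from datetime import datetime, timedelta
from statistics import mean

def downsample(samples, bucket: timedelta) -> dict:
    """Group (timestamp, value) samples into fixed buckets, keep the mean."""
    width = bucket.total_seconds()
    buckets: dict = {}
    for ts, value in samples:
        key = datetime.fromtimestamp(ts.timestamp() // width * width)
        buckets.setdefault(key, []).append(value)
    return {k: mean(v) for k, v in sorted(buckets.items())}

# Heart rate every 5 seconds -> one mean per minute
t0 = datetime(2024, 6, 1, 8, 0, 0)
raw = [(t0 + timedelta(seconds=5 * i), 60 + i) for i in range(24)]  # 2 minutes
rolled = downsample(raw, timedelta(minutes=1))
print(len(rolled))  # 2 buckets
```

Applied progressively (5-second data for a week, minute data for a year), this is how years of heart-rate history stay cheap to store and fast to query.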
Security Model: The AI Never Sees What It Shouldn’t
Health data is sensitive. Our architecture enforces strict boundaries:
Hardened System Prompts
The system prompt is structured into prioritized sections with explicit instructions that override any user attempts to manipulate behavior. User messages are delimited to prevent prompt injection.
Tool Result Sanitization
Every tool result passes through a sanitization boundary before reaching the LLM. Results are wrapped in clearly marked delimiters, and content is validated and size-limited to prevent context stuffing.
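A minimal sketch of such a boundary, with an illustrative size limit and delimiter format (not Omnio's actual values):

```python
MAX_RESULT_CHARS = 4000  # illustrative cap, not the production number

def sanitize_tool_result(tool_name: str, result: str) -> str:
    """Wrap a tool result in explicit delimiters and cap its size
    before it enters the LLM context."""
    truncated = result[:MAX_RESULT_CHARS]   # prevent context stuffing
    return (
        f"<tool_result name={tool_name!r}>\n"
        f"{truncated}\n"
        f"</tool_result>"
    )

wrapped = sanitize_tool_result("sleep_summary", '{"score": 82}')
print(wrapped.startswith("<tool_result"))  # True
```

The delimiters let the system prompt instruct the model to treat everything inside them as data, never as instructions.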
Per-User Data Scoping
The MCP server scopes every database query to the authenticated user. There’s no way for the AI — or a malicious prompt — to access another user’s data. The scoping happens at the query client level, below the tool layer.
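The key design point is that the user ID is injected where queries are built, not where tools decide what to fetch. A sketch, with hypothetical names:

```python
class ScopedQueryClient:
    """Every query carries the authenticated user's ID; tools never pass it."""
    def __init__(self, user_id: str):
        self._user_id = user_id

    def build_query(self, metric: str, start: str, end: str) -> dict:
        # The user label is attached here, below the tool layer, so no
        # prompt — malicious or not — can change whose data is fetched.
        return {"metric": metric, "start": start, "end": end,
                "labels": {"user_id": self._user_id}}

client = ScopedQueryClient(user_id="user-123")
q = client.build_query("sleep_score", "2024-06-01", "2024-06-07")
print(q["labels"])  # {'user_id': 'user-123'}
```

Because tools receive an already-scoped client, there is no code path where a tool argument could select a different user.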
Rate Limiting
Per-user daily limits on both messages and tool calls prevent abuse.
Sanitized Error Messages
If a tool fails, the user sees a safe generic message. The full stack trace is logged server-side for debugging — never exposed to the client or the LLM.
The Chat Orchestrator
The MCP server is one half of the system. The other half is the chat service — the orchestrator that manages the conversation loop between the user, the LLM, and the tools.
Multi-Provider Support
We support multiple LLM providers with a protocol-based abstraction that makes them interchangeable. All support streaming responses and function calling, with automatic retry and circuit-breaking for resilience.
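"Protocol-based abstraction" can be sketched with Python's structural typing: any adapter that implements the interface is interchangeable. The interface shape below is an assumption for illustration:

```python
from typing import Iterable, Protocol

class ChatProvider(Protocol):
    """Minimal interface every LLM provider adapter must satisfy."""
    def stream_chat(self, messages: list, tools: list) -> Iterable[str]: ...

class FakeProvider:
    """Stand-in provider; real adapters wrap vendor SDKs behind this shape."""
    def stream_chat(self, messages, tools):
        yield "Hello"
        yield " world"

def run(provider: ChatProvider) -> str:
    # The orchestrator only depends on the Protocol, never on a vendor SDK.
    return "".join(provider.stream_chat([{"role": "user", "content": "hi"}], []))

print(run(FakeProvider()))  # Hello world
```

Swapping providers then means writing one adapter class; retries and circuit-breaking wrap the same interface.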
Autonomous Investigation
When the AI decides it needs data, the chat service executes tool calls, feeds results back to the LLM for interpretation, and allows the AI to request additional tool calls based on what it finds. This means the AI can pursue multi-step investigations — it doesn’t stop after one query; it follows the data.
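That loop can be sketched as follows. The message shapes and the `max_steps` budget are hypothetical simplifications of a real function-calling API:

```python
def investigate(llm, tools: dict, question: str, max_steps: int = 5) -> str:
    """Conversation loop: let the model request tools until it answers."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = llm(messages)              # either a final answer or a tool request
        if reply.get("tool") is None:
            return reply["content"]        # final synthesized answer
        result = tools[reply["tool"]](**reply.get("args", {}))
        messages.append({"role": "tool", "name": reply["tool"], "content": result})
    return "Investigation budget exhausted."

# Scripted "model": first asks for sleep data, then answers.
script = iter([
    {"tool": "get_sleep", "args": {"days": 7}},
    {"tool": None, "content": "Your sleep dropped after Tuesday's hard session."},
])
answer = investigate(lambda msgs: next(script),
                     {"get_sleep": lambda days: f"{days} days of sleep data"},
                     "Why did my sleep drop?")
print(answer)
```

The loop terminates either when the model stops requesting tools or when the step budget runs out — which is also where per-user tool-call limits naturally plug in.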
Persona System
The chat supports multiple personas — health coach (encouraging, action-oriented), clinical (objective, statistical), and casual (approachable, plain language) — each with carefully tuned system prompts. The health coach persona is specifically instructed to always explain why, not just what — connecting changes in one metric to correlated factors across other data sources.
Real-Time Streaming
Responses stream in real-time via Server-Sent Events. You see the AI thinking, see tool calls happening, and get the response word-by-word. A stop button lets you cancel generation mid-stream.
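For readers unfamiliar with SSE, the wire format is simple: named events with JSON payloads, separated by blank lines. A sketch of how the stream might frame tool calls and tokens (the event names are illustrative):

```python
import json

def sse_event(event_type: str, payload: dict) -> str:
    """Frame one Server-Sent Event as the browser's EventSource expects it."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"

# Illustrative stream: a tool-call notice, then word-by-word tokens
frames = [
    sse_event("tool_call", {"name": "sleep_summary"}),
    sse_event("token", {"text": "Your "}),
    sse_event("token", {"text": "sleep..."}),
]
print(frames[0].splitlines()[0])  # event: tool_call
```

Distinct event types are what let the UI render tool activity and answer text differently as they arrive.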
Observability: We Instrument Everything
Every layer of the system exposes metrics that we collect and visualize:
- Tool metrics — Call count, duration, error rate per tool
- Plugin metrics — Per-source call count and latency by capability type
- Database query metrics — Query count, duration, retries by query type
- Chat metrics — Messages, tool calls, token usage, latency, rate limit hits
- Infrastructure metrics — CPU, memory, disk I/O, network
When a user reports “the AI was slow answering my question,” we can trace it end-to-end — from chat latency to which tool calls were made to which plugin queries were slow to which database queries took longest. Full observability, not guesswork.
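The per-tool trio of count, duration, and error rate can be captured with a small decorator. This is a self-contained sketch, not our actual metrics client (which exports to a real collector):

```python
import time
from collections import defaultdict
from functools import wraps

METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_secs": 0.0})

def instrument(name: str):
    """Record call count, errors, and duration for one tool."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                METRICS[name]["errors"] += 1
                raise
            finally:
                METRICS[name]["calls"] += 1
                METRICS[name]["total_secs"] += time.perf_counter() - start
        return wrapper
    return decorator

@instrument("sleep_summary")
def sleep_summary(days: int) -> str:
    return f"summary over {days} days"

sleep_summary(7)
print(METRICS["sleep_summary"]["calls"])  # 1
```

The same pattern repeats at every layer — tools, plugins, database queries — which is what makes the end-to-end trace possible.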
What’s Next
The MCP server architecture is designed to grow. Adding a new data source means writing a single plugin — declare capabilities, implement the data methods, and every cross-source tool automatically includes the new data. We’re working on:
- Mobile app — Apple Health & Google Health Connect integration via mobile SDKs
- Fitbit and Polar device support
- Advanced analysis tools — multivariate analysis, seasonal decomposition, and ML-powered anomaly detection
- Proactive insights — scheduled analysis that surfaces notable changes before you ask
The health data space doesn’t need another dashboard that shows you numbers you already saw on your wrist. It needs infrastructure that turns fragmented data into understanding. That’s what we’re building.
Omnio is a health analytics platform that unifies wearable and health data with AI-powered insights. Learn more at getomn.io.
Related reading
- *Your AI Health Assistant Doesn't Know Who You Are* — Your AI health assistant analyzes your sleep, recovery, and training data — but it never sees your name, email, or account ID. Here's how we built privacy into the architecture itself.
- *Can Your Health AI Prove It Isn't Lying to You?* — We asked Oura, WHOOP, and Omnio the same research question. Two gave vague platitudes with zero citations. One gave verifiable evidence. Here's why the difference is architectural.
- *We Asked Oura, WHOOP, and Omnio the Same Sleep Question. Here's What Happened.* — Every major wearable now has an AI assistant. We asked all three to compare a week of sleep data. The difference in depth reveals a fundamental architectural gap.