Monitors
Monitors are how Sparvi watches the metrics that matter. A monitor combines a source (what to measure), an evaluation mode (how to decide when to alert), and a schedule (when to run). When a monitored value falls outside its expected range, Sparvi raises an issue that includes the metric, the segment (if applicable), and the historical trend.
Anatomy of a Monitor
Every monitor has four parts:
- Source: what numeric value to read from your warehouse
- Evaluation mode: ML / statistical (learns a baseline) or threshold (warning and critical lines you set)
- Schedule: interval, daily, weekly, or manual-only
- Segmentation (optional): split the metric by a dimension column so each segment gets its own baseline and its own alert path
Source Types
There are three:
Built-in table metric
A single value computed across the whole table. The only built-in table metric today is row_count. Use it for volume monitoring: it catches partial loads, missed runs, and duplicated batches.
Built-in column metric
A single value computed across one column. Available metrics:
- null_pct: percent of values that are NULL
- distinct_count: number of distinct values
- distinct_pct: distinct / total ratio
- min, max, avg, stddev: numeric statistics on numeric columns
No SQL required: pick the table, the column, and the metric.
Custom SQL
Write any query that returns a numeric column. Alias the numeric column as value (or whatever you specify in the value column field) and Sparvi reads the metric from there. Use this for:
- Business metrics: revenue, signups, churn rate, MRR
- Freshness: DATEDIFF('minute', MAX(updated_at), CURRENT_TIMESTAMP()) AS value
- Cross-table checks: JOINs and CTEs that no built-in metric covers
- Per-source health: split a single table's metric by upstream provider
Custom SQL monitors can also be segmented: alias a dimension column as segment (or your chosen name) and Sparvi runs one evaluation per segment value.
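As a rough illustration of the query shape described above, the sketch below runs a segmented query against an in-memory SQLite database. SQLite only stands in for your warehouse here, and the orders table, its columns, and the sample rows are hypothetical; the point is the two aliases, value and segment.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; the orders table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 120.0), ("emea", 80.0), ("apac", 50.0)],
)

# Query shape for a segmented custom SQL monitor:
# one numeric column aliased as value, one dimension column aliased as segment.
SEGMENTED_QUERY = """
    SELECT region AS segment, SUM(amount) AS value
    FROM orders
    GROUP BY region
    ORDER BY region
"""

rows = conn.execute(SEGMENTED_QUERY).fetchall()

# One metric value per segment; each would be evaluated on its own.
per_segment = {segment: value for segment, value in rows}
print(per_segment)
```

Each key in per_segment corresponds to one segment value, which is the unit a segmented monitor evaluates.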
Evaluation Modes
ML / Statistical
Sparvi learns a baseline from history and alerts when the latest value deviates beyond your configured sensitivity. Three methods:
- Z-score: number of standard deviations from the baseline mean. Good general-purpose default.
- IQR (Interquartile Range): robust to outliers in the baseline window. Good when occasional legitimate spikes would otherwise inflate the standard deviation and mask future anomalies.
- Moving average: compares the latest value to a rolling mean. Good when the metric trends naturally over time.
Sensitivity ranges from 0.5 (less sensitive, fewer alerts) to 2.0 (more sensitive, more alerts). The baseline window defaults to 14 days and can be set anywhere from 1 to 90 days. The minimum data points guard prevents the monitor from firing until enough history has been collected.
This is where Sparvi's anomaly detection lives: an "anomaly" is simply a statistical monitor's latest value falling outside the learned baseline.
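The three methods can be sketched in a few lines of Python. This is a minimal illustration and not Sparvi's implementation: the base limits (3 standard deviations, 1.5 times the IQR, 20% relative deviation), the 7-point guard, and the rule that sensitivity divides the base limit are all assumptions chosen only to reproduce the documented direction (higher sensitivity, more alerts).

```python
from statistics import mean, quantiles, stdev


def effective_limit(base_limit, sensitivity):
    """ASSUMPTION: higher sensitivity shrinks the allowed deviation."""
    if not 0.5 <= sensitivity <= 2.0:
        raise ValueError("sensitivity must be between 0.5 and 2.0")
    return base_limit / sensitivity


def enough_history(history, min_points):
    # Minimum data points guard: never fire on a thin baseline.
    return len(history) >= max(min_points, 2)


def zscore_anomaly(history, latest, sensitivity=1.0, min_points=7):
    """Flag latest when it is too many standard deviations from the mean."""
    if not enough_history(history, min_points):
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > effective_limit(3.0, sensitivity)


def iqr_anomaly(history, latest, sensitivity=1.0, min_points=7):
    """Flag latest when it falls outside the quartile fences."""
    if not enough_history(history, min_points):
        return False
    q1, _, q3 = quantiles(history, n=4)
    fence = effective_limit(1.5, sensitivity) * (q3 - q1)
    return latest < q1 - fence or latest > q3 + fence


def moving_average_anomaly(history, latest, sensitivity=1.0,
                           window=7, min_points=7):
    """Flag latest when it strays too far from the rolling mean."""
    if not enough_history(history, min_points):
        return False
    rolling = mean(history[-window:])
    if rolling == 0:
        return latest != 0
    return abs(latest - rolling) / abs(rolling) > effective_limit(0.2, sensitivity)


history = [100, 102, 98, 101, 99, 103, 97, 100]
print(zscore_anomaly(history, 105))                   # within 3 sigma here
print(zscore_anomaly(history, 105, sensitivity=2.0))  # tighter limit
```

With this sample history the baseline mean is 100 and the standard deviation is 2, so a value of 105 sits 2.5 standard deviations out: quiet at the default sensitivity, flagged once the sensitivity is doubled.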
Threshold
Set explicit warning and critical lines. The monitor alerts when the value crosses them. Pick a direction:
- Below: alert when the value drops below the line (e.g. revenue per region must stay above $50K/day)
- Above: alert when the value goes above the line (e.g. p95 latency must stay under 300ms)
You can set warning only, critical only, or both. When both are set, Sparvi enforces consistency (for below, critical must be ≤ warning; for above, critical must be ≥ warning).
Use threshold mode when you have a hard contract. Use statistical mode when "normal" drifts with the business.
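The consistency rule and the two directions can be sketched as follows. The function names and the returned labels ("ok", "warning", "critical") are hypothetical, and treating a crossing as a strict comparison (a value exactly on the line does not alert) is an assumption the text above does not settle.

```python
def validate_thresholds(direction, warning=None, critical=None):
    """Check that the configured lines are consistent for the direction."""
    if direction not in ("below", "above"):
        raise ValueError("direction must be 'below' or 'above'")
    if warning is None and critical is None:
        raise ValueError("set warning, critical, or both")
    if warning is not None and critical is not None:
        # For below, critical must be <= warning; for above, critical must be >= warning.
        if direction == "below" and critical > warning:
            raise ValueError("for 'below', critical must be <= warning")
        if direction == "above" and critical < warning:
            raise ValueError("for 'above', critical must be >= warning")


def evaluate_threshold(value, direction, warning=None, critical=None):
    """Return 'critical', 'warning', or 'ok' for a single metric value."""
    validate_thresholds(direction, warning, critical)

    def crossed(line):
        if line is None:
            return False
        # ASSUMPTION: strict comparison; a value exactly on the line is fine.
        return value < line if direction == "below" else value > line

    if crossed(critical):
        return "critical"
    if crossed(warning):
        return "warning"
    return "ok"


# p95 latency in ms: warn above 300, critical above 500.
print(evaluate_threshold(350, "above", warning=300, critical=500))
# Daily revenue per region: warn below 50,000, critical below 30,000.
print(evaluate_threshold(40_000, "below", warning=50_000, critical=30_000))
```

Checking the critical line first means a value that crosses both lines is reported once, at the higher severity.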
Segmented Metrics
Any monitor can be segmented by a dimension column. With segmentation enabled:
- Sparvi computes the metric once per segment value
- Each segment has its own baseline (statistical) or its own threshold check
- Issues name the segment so the alert is actionable on the first read
- The dashboard collapses per-segment results into one overall value using your chosen rollup: sum, avg, min, or max
- The detail view shows all four rollups side by side and lets you sort segments by deviation
For built-in metrics, choose any column on the same table. For custom SQL, alias your dimension column in the query (for example region AS segment) and name it in the segment column field.
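To make the rollup choice concrete, here is a small sketch that collapses per-segment values into one overall number. The segment names and values are made up, and the rollup helper is hypothetical; only the four rollup names come from the text above.

```python
from statistics import mean

# The four rollups named above, mapped to plain Python aggregations.
ROLLUPS = {"sum": sum, "avg": mean, "min": min, "max": max}


def rollup(per_segment, how):
    """Collapse a {segment: value} mapping into one overall value."""
    if how not in ROLLUPS:
        raise ValueError(f"unknown rollup: {how}")
    return ROLLUPS[how](list(per_segment.values()))


# Hypothetical per-region revenue for one monitor run.
revenue = {"emea": 200.0, "apac": 50.0, "amer": 110.0}

# The detail view shows all four side by side.
all_rollups = {name: rollup(revenue, name) for name in ROLLUPS}
print(all_rollups)
```

The right rollup depends on the metric: sum suits additive values such as revenue or row counts, while min or max is often more telling for rates and latencies, where the worst segment matters more than the total.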
When to segment
Segment when an aggregate would hide the problem. Common patterns:
- Revenue, signups, or orders per region or per channel
- Latency, error rate, or throughput per tenant
- Row counts per upstream source
- Quality metrics per environment (dev, staging, prod)
- Distributions per product line or per app version
Scheduling
Each monitor picks its own cadence:
- Interval: every N minutes or hours. Minimum total interval is 5 minutes.
- Daily: at a specific hour and minute in any IANA timezone
- Weekly: choose the day of week and time
- Manual: only runs when you click Run Now
Monitors also have an active flag. Toggle it off to pause without losing configuration or history; toggle it on to resume.
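The interval minimum and the daily cadence can be sketched as below. Both helpers are hypothetical and only illustrate the rules stated above: an interval shorter than 5 minutes is rejected, and a daily schedule resolves to the next occurrence of the given wall-clock time in the monitor's timezone.

```python
from datetime import datetime, timedelta, timezone

MIN_INTERVAL = timedelta(minutes=5)


def validate_interval(minutes=0, hours=0):
    """Return the total interval, rejecting anything under 5 minutes."""
    total = timedelta(minutes=minutes, hours=hours)
    if total < MIN_INTERVAL:
        raise ValueError("minimum total interval is 5 minutes")
    return total


def next_daily_run(now, hour, minute, tz):
    """Next occurrence of hour:minute in tz, strictly after now."""
    local_now = now.astimezone(tz)
    candidate = local_now.replace(hour=hour, minute=minute,
                                  second=0, microsecond=0)
    if candidate <= local_now:
        candidate += timedelta(days=1)  # today's slot has already passed
    return candidate


# A fixed UTC-5 offset keeps the example dependency-free.
tz = timezone(timedelta(hours=-5))
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)  # 07:00 local
print(next_daily_run(now, 6, 30, tz))  # 06:30 already passed locally
print(next_daily_run(now, 9, 0, tz))   # still ahead today
```

For a real IANA zone, passing zoneinfo.ZoneInfo("Europe/Berlin") from the standard library as tz should work the same way, provided time zone data is available on the machine.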
Issues and Alerts
When a monitor fires, Sparvi creates an issue containing:
- The monitor name and its current value
- The baseline (statistical) or threshold (threshold mode)
- The segment value, if the monitor is segmented
- The historical trend so context is one click away
- AI-suggested resolution based on similar past issues
- Downstream lineage so you can see which dashboards or pipelines depend on the affected data
Alerts route via Slack, email, or PagerDuty according to your notification preferences. Each user can choose between immediate alerts and a daily digest.
Best Practices
- Start with row counts: every critical table deserves a row count monitor with statistical evaluation. It is the cheapest monitor and the highest-signal one for catching pipeline regressions.
- Use threshold mode for contracts: SLAs and business minimums belong in threshold monitors, not statistical ones.
- Segment when an aggregate would hide the problem: revenue per region, latency per tenant, row counts per source.
- Set sensible schedules: hourly tables can be monitored every 10–60 minutes; reference tables that change daily do not need sub-hour cadence.
- Tune sensitivity after a week: collect a baseline first, then dial sensitivity up or down based on how often you actually act on alerts.
- Pause noisy monitors instead of deleting them: the active flag preserves configuration so you can come back after the upstream stabilizes.
Next Steps
- Read the Anomaly Detection Guide for a deeper look at statistical evaluation
- Set up Validation Rules for hard pass/fail contracts
- Explore Data Profiling to find the column statistics worth monitoring