Anomaly Detection Guide

Anomaly detection in Sparvi happens through monitors. A monitor watches a metric on a schedule, learns a baseline from history, and flags an anomaly when the latest value falls outside what is normal. This guide explains how the statistical evaluation mode works, when to use it, and how it pairs with thresholds and segmented metrics.

How Anomaly Detection Works

Every Sparvi monitor has an evaluation mode. There are two:

  • ML / statistical, Sparvi learns a baseline from your historical data and alerts when the latest value deviates beyond your configured sensitivity. This is the mode that produces anomalies.
  • Threshold, You set explicit warning and critical lines. The monitor alerts when the value crosses them. No anomaly framing, it is either inside or outside the contract.

Statistical mode supports three detection methods:

  • Z-score, How many standard deviations the latest value sits from the baseline mean. Good general-purpose default.
  • IQR (Interquartile Range), Robust to outliers in the baseline window. Good when the metric has occasional legitimate spikes you do not want the baseline to absorb.
  • Moving average, Compares the latest value to a rolling mean. Good when the metric trends naturally over time and a static mean would drift.

Each monitor also has a configurable sensitivity (0.5–2.0; higher fires on smaller deviations), a baseline window in days (default 14), and a minimum data points guard so detection does not fire on a thin history.

What You Can Detect

Anomaly detection runs on whatever metric the monitor is configured to track. Sparvi's three source types cover most cases:

Row count anomalies

A built-in table metric monitor on row_count plus statistical evaluation catches:

  • Partial loads (today's volume sits well below baseline)
  • Duplicate runs (volume sits well above baseline)
  • Failed pipelines (volume flatlines)
  • Upstream outages (volume drops on one source but not others)

Example: orders.row_count averages 10K rows/hour over the last 14 days; today's hour reads 500. Sparvi raises an issue with the metric, the baseline, and the deviation.

Column metric anomalies

Built-in column-metric monitors (null %, distinct count, distinct %, min, max, average, standard deviation) plus statistical evaluation catch:

  • Null spikes (null % on email column jumps from 2% to 15% after an upstream change)
  • Distribution shifts (average order value jumps from $50 to $5,000)
  • Cardinality drops (distinct count on customer_id falls off, an ID column is now constant)
  • Range escapes (min or max of a numeric column exits its historical band)

Custom SQL anomalies

A custom SQL monitor plus statistical evaluation lets you detect anomalies on any metric you can express as a numeric query: revenue per hour, p95 latency, churn rate, freshness, signups by source. The baseline is learned from the values your query has returned over time.

Segment-level anomalies

Any monitor can be segmented by a dimension column. With segmentation enabled:

  • Each segment value (e.g. each region, each tenant) has its own baseline
  • Statistical evaluation runs per segment so a deviation in one slice fires its own alert
  • Issues name the segment so on-call work is actionable on the first read

This is the difference between "revenue is down 5% overall" (probably noise) and "EMEA revenue is down 62% while AMER and APAC are flat" (immediately actionable).

Configuring a Monitor for Anomaly Detection

In Sparvi Cloud, create a monitor:

  1. Basics, Name, optional display name, description, and active flag.
  2. What to measure, Pick a source type:
    • Table metric (row count), choose a table.
    • Column metric, choose a table, a column, and a metric (null %, distinct count, distinct %, min, max, avg, stddev).
    • Custom SQL, paste a query that returns a numeric column, and name the alias of that column.
  3. Segment by (optional), Pick a column to segment by. For built-in metrics, pick any column on the same table. For custom SQL, alias a dimension column in the query (for example region AS segment) and reference it in the segment column field.
  4. How to alert, Choose ML / Statistical, then pick a detection method (z-score, IQR, or moving average), a sensitivity, and a baseline window.
  5. Schedule, Run every N minutes/hours, daily at a specific local time, weekly on a chosen day, or manually.

Save the monitor. Sparvi collects baseline data on the configured schedule and starts firing alerts once the minimum-data-points guard has been satisfied.

Anomaly Types in Detail

Volume anomalies

Row-count monitors detect unusual changes in table volume. Severity is derived from how far the value sits from the baseline.

Example: "orders row count down 92% from baseline (12,418 → 1,012)", critical severity.

Null rate anomalies

Null % monitors on a column flag unexpected upstream changes, ETL bugs, or data source issues.

Example: "users.email null % rose from 2% to 15% over the last 6 hours", warning severity.

Distribution anomalies

Column metric monitors on avg, stddev, min, or max catch shifts in value distributions that point to calculation errors, unit changes, or data corruption.

Example: "orders.amount avg jumped from $52.18 to $5,218, likely cents-to-dollars conversion regression", critical severity.

Freshness anomalies

A custom SQL monitor on DATEDIFF('minute', MAX(updated_at), CURRENT_TIMESTAMP()) plus statistical evaluation surfaces pipeline failures, source outages, and scheduling issues.

Example: "orders freshness lag is 4 hours; baseline is under 10 minutes", critical severity.

Uniqueness anomalies

distinct_count and distinct_pct monitors catch unintended duplicates, ID collisions, and cardinality drift.

Example: "user_id distinct % fell from 100% to 84%, duplicates introduced", critical severity.

Segment anomalies

Anomalies on segmented monitors are scoped to a single segment value. The issue names the segment so the alert is immediately actionable.

Example: "revenue per region, EMEA down 62% from baseline; AMER and APAC normal", critical severity, scoped to region = 'EMEA'.

Working with Anomalies in Sparvi Cloud

Filtering and viewing

In the dashboard you can:

  • Filter by severity (critical, warning)
  • Filter by monitor or by table
  • Filter by segment value (for segmented monitors)
  • View the trend chart for any monitor, with the baseline band overlaid

Taking action

When a monitor fires, Sparvi automatically creates an issue with:

  1. The metric name, the current value, the baseline, and the deviation
  2. The segment value (if the monitor is segmented)
  3. The historical trend so you can see context at a glance
  4. AI-suggested resolution based on similar past issues
  5. Downstream lineage so you can see which dashboards or pipelines depend on the affected data

You can assign the issue, comment in context, and resolve it without leaving Sparvi.

Tuning Sensitivity

Some advice for getting useful signal:

  1. Start with the defaults, sensitivity 1.0, baseline window 14 days, z-score detection. Adjust only after you have a few days of data and can see which alerts you actually act on.
  2. Lower sensitivity for noisy metrics, if a monitor pages weekly on benign blips, drop sensitivity toward 0.5 (less sensitive) or switch to IQR.
  3. Raise sensitivity for critical columns, payment amounts, primary keys, and similar should fire on smaller deviations.
  4. Use threshold mode when you have a hard contract, "p95 latency must stay below 300ms" is not an anomaly question; it is a threshold. Flip the monitor to threshold mode and set the line.
  5. Segment when an aggregate would hide the problem, per-region revenue, per-tenant latency, per-source row counts. Per-segment baselines surface localized issues that overall baselines smooth out.
  6. Set a meaningful minimum data points, for a once-daily monitor, three is fine; for a once-an-hour monitor, you might want a week's worth before firing.

Anomaly Detection vs. Validation

Both create issues and route alerts, but they answer different questions:

  • Validation, pass/fail check against a known rule you write in SQL. "Orders must have positive amounts." If a row violates the rule, the validation fails.
  • Anomaly detection (statistical monitor), track a metric over time and alert when it falls outside the baseline learned from history. "Order volume per hour should look like the last 14 days of order volume per hour."

Use validations when you know the rule. Use monitors when you do not, or when the bar moves with the business. Most teams use both.

Next Steps

After mastering anomaly detection:

  1. Set up Validation Rules to enforce hard contracts alongside statistical monitors
  2. Explore Data Profiling to see the column statistics that feed into column-metric monitors
  3. Learn about Sparvi Cloud Database Connections to connect Snowflake, BigQuery, or dbt Core