Tutorial 6: Deploy & Run Inference

This tutorial deploys the inference task graph and runs batch predictions using the trained model.

What you will learn

  • How the inference DAG is structured.
  • How batch inference loads features and generates predictions.
  • Where predictions are stored and how to inspect them.

Before you start

  • Training is complete and a model is registered (Tutorial 5).

Step 1: Deploy the inference DAG

make -C projects/pudo deploy-inference-dag

This creates a Snowflake task graph that orchestrates the inference pipeline:

  1. Load model: retrieves the latest (or configured) model version from the Model Registry.
  2. Generate features: computes inference-time features from the feature store for the target date.
  3. Run predictions: applies the model to the feature matrix.
  4. Write results: stores predictions in the project schema.
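The four tasks above can be sketched as plain Python functions composed in order. This is a hypothetical illustration only: the function names, feature shapes, and the stub "model" are assumptions, not the project's actual API.

```python
from datetime import date

# Hypothetical sketch of the four inference tasks; names and data shapes
# are illustrative stand-ins, not the project's real implementation.

def load_model(version="latest"):
    # Stand-in for a Model Registry lookup: returns a callable "model"
    # that sums its feature values.
    return lambda rows: [round(sum(r.values()), 2) for r in rows]

def generate_features(target_date):
    # Stand-in for feature-store reads: one feature dict per PUDO.
    return [{"pudo_id": i, "recent_volume": 10.0 * i} for i in (1, 2, 3)]

def run_predictions(model, feature_rows):
    # Drop identifier columns, then apply the model to the feature matrix.
    return model([{k: v for k, v in row.items() if k != "pudo_id"}
                  for row in feature_rows])

def write_results(feature_rows, preds, target_date):
    # Stand-in for the INSERT into the project schema.
    return [{"pudo_id": r["pudo_id"], "prediction_date": target_date,
             "predicted": p}
            for r, p in zip(feature_rows, preds)]

def run_inference_dag(target_date=date(2024, 1, 1)):
    model = load_model()
    rows = generate_features(target_date)
    preds = run_predictions(model, rows)
    return write_results(rows, preds, target_date)
```

In the real task graph, each of these steps is a Snowflake task and the hand-offs happen through tables rather than in-memory lists.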

Step 2: Run the inference DAG

make -C projects/pudo run-inference-dag

This triggers an immediate execution of the inference task graph.

Alternative: CLI-based inference

You can also run inference directly without the DAG:

make -C projects/pudo run-inference

This runs pudo-inference run, which performs the same steps as the DAG but from your local terminal. This is useful for debugging or ad-hoc runs.

How inference works

The inference pipeline:

  1. Reads the model version from configuration (or uses the latest).
  2. Constructs an inference spine, one row per (PUDO, target_date).
  3. Performs ASOF JOINs against feature views to get point-in-time features.
  4. Applies the trained XGBoost model to generate predictions.
  5. Writes predictions with metadata (model version, run timestamp, features used) to the project schema.
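Steps 2 and 3 above (spine construction and the point-in-time join) can be sketched in pandas, which mirrors a SQL ASOF JOIN with merge_asof. All column names here (PUDO_ID, TARGET_DATE, FEATURE_TS, RECENT_VOLUME, PREDICTED_CAPACITY) are assumptions for illustration; the real pipeline runs in Snowflake with the trained XGBoost model.

```python
import pandas as pd

target_date = pd.Timestamp("2024-06-01")

# Inference spine: one row per (PUDO, target_date).
spine = pd.DataFrame({"PUDO_ID": [1, 2], "TARGET_DATE": [target_date] * 2})

# Feature view: timestamped feature values per PUDO.
features = pd.DataFrame({
    "PUDO_ID": [1, 1, 2],
    "FEATURE_TS": pd.to_datetime(["2024-05-20", "2024-05-30", "2024-05-25"]),
    "RECENT_VOLUME": [100.0, 120.0, 80.0],
})

# ASOF JOIN: for each spine row, take the latest feature row at or before
# TARGET_DATE -- this is what keeps features point-in-time correct and
# prevents leakage from the future.
matrix = pd.merge_asof(
    spine.sort_values("TARGET_DATE"),
    features.sort_values("FEATURE_TS"),
    left_on="TARGET_DATE",
    right_on="FEATURE_TS",
    by="PUDO_ID",
)

# Apply the model (a trivial stub here in place of the trained XGBoost model).
matrix["PREDICTED_CAPACITY"] = matrix["RECENT_VOLUME"] * 1.1
```

For PUDO 1 the join picks the 2024-05-30 feature row (the latest before the target date), not the 2024-05-20 one, which is the behavior the SQL ASOF JOIN provides.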

Step 3: Inspect predictions

-- View recent predictions
SELECT *
FROM PUDO_DEV.PREDICTIONS
ORDER BY PREDICTION_DATE DESC
LIMIT 20;

-- Check prediction distribution
SELECT
  PREDICTION_DATE,
  COUNT(*) AS num_predictions,
  AVG(PREDICTED_CAPACITY) AS avg_predicted,
  STDDEV(PREDICTED_CAPACITY) AS std_predicted
FROM PUDO_DEV.PREDICTIONS
GROUP BY PREDICTION_DATE
ORDER BY PREDICTION_DATE DESC;

Inference CLI verbs

The pudo-inference CLI provides additional verbs for post-inference operations:

Command                    What it does
pudo-inference run         Run batch inference.
pudo-inference evaluate    Compare predictions to actuals and compute metrics.
pudo-inference alerts      Check for alert conditions (e.g., high prediction error).
pudo-inference summary     Print a summary of recent predictions.

You will use evaluate, alerts, and summary in Tutorial 7.

What you have now

  • Inference DAG deployed and executed.
  • Batch predictions generated and stored in Snowflake.
  • Understanding of the inference CLI verbs.

Next step

Continue to Tutorial 7: Simulate, Evaluate & Alert to simulate daily data cycles, evaluate prediction quality, and trigger alerts.