Troubleshooting¶
Common issues and how to resolve them.
Connection errors¶
"Cannot create session" or "Role does not exist"¶
Cause: The .env file is missing or the role has not been created yet.
Fix:
- Verify the .env file exists in the component directory.
- If this is the first run, use ACCOUNTADMIN for the hub bootstrap.
- After bootstrap, switch to the operational role created by the hub.
# Check your .env
cat hub/.env
# Verify the role exists in Snowflake
snow sql -q "SHOW ROLES LIKE 'PUDO%'"
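The first check above can be wrapped in two small helpers so it is easy to repeat for every component. This is a minimal sketch: the function names are made up for illustration, and the key `SNOWFLAKE_ROLE` in the usage comment is a hypothetical example, so substitute whatever key your .env actually uses.

```shell
# Report whether a component directory contains a .env file.
# Returns 0 when the file exists, 1 otherwise.
check_env_file() {
    component_dir="$1"
    if [ -f "$component_dir/.env" ]; then
        echo "ok: $component_dir/.env"
    else
        echo "missing: $component_dir/.env"
        return 1
    fi
}

# Return 0 when the given KEY= line is present in the .env file.
env_has_key() {
    env_file="$1"
    key="$2"
    grep -q "^${key}=" "$env_file"
}

# Usage (SNOWFLAKE_ROLE is a hypothetical key name):
#   check_env_file hub && env_has_key hub/.env SNOWFLAKE_ROLE
```

A non-zero exit status from either helper points at the .env file rather than at Snowflake, which narrows down which half of the cause applies.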
"Database does not exist"¶
Cause: Hub infrastructure has not been deployed yet.
Fix:
Training errors¶
Training DAG fails to deploy¶
Cause: The compute pool or warehouse does not exist.
Fix:
- Verify hub bootstrap completed successfully.
- Check that the compute pool exists:
- Check that the warehouse exists:
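The two existence checks can be sketched with the same `snow sql -q` call used earlier on this page. The helper names are illustrative, and the default `PUDO%` pattern is an assumption that compute pools and warehouses share the prefix used by the roles and databases; pass a different pattern if yours are named otherwise.

```shell
# List compute pools whose name matches a pattern.
# Assumption: the pool shares the PUDO prefix (default pattern 'PUDO%').
list_compute_pools() {
    snow sql -q "SHOW COMPUTE POOLS LIKE '${1:-PUDO%}'"
}

# List warehouses whose name matches a pattern (same prefix assumption).
list_warehouses() {
    snow sql -q "SHOW WAREHOUSES LIKE '${1:-PUDO%}'"
}

# Usage:
#   list_compute_pools
#   list_warehouses 'PUDO%'
```

An empty result from either helper suggests the hub bootstrap did not create that resource, or that it was created under a name the pattern does not match.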
Training job fails with "libomp not found" (macOS)¶
Cause: XGBoost requires libomp which may not be installed on macOS.
Fix:
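On macOS the OpenMP runtime is commonly installed through Homebrew. The sketch below assumes Homebrew is your package manager and only prints the suggested command rather than running it; the function name is illustrative.

```shell
# Print the suggested libomp install command for a platform name,
# as reported by `uname -s`. Assumes Homebrew is available on macOS.
libomp_install_hint() {
    case "$1" in
        Darwin) echo "brew install libomp" ;;
        *)      echo "not macOS: this fix does not apply" ;;
    esac
}

# Usage:
#   libomp_install_hint "$(uname -s)"
```

After installing, rerun the training job in a fresh shell so XGBoost can locate the library.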
Training takes too long in DEV¶
Cause: DEV configuration may still use production-scale parameters.
Fix: Check config/training/dev.override.yaml and reduce parameters:
Inference errors¶
"No model found in registry"¶
Cause: Training has not completed or the model was registered under a different name.
Fix:
- Verify training completed successfully:
- Check the model registry:
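Both checks can be sketched as `snow sql -q` wrappers. This assumes training runs as Snowflake tasks, so that the task history query from "Check Snowflake state" applies, and that models are registered in a schema you pass in. The helper names are illustrative, and `PUDO_MLOPS.PUDO_DEV` in the usage comment is a guess assembled from names that appear elsewhere on this page.

```shell
# Show the most recent task runs with their state (default: last 20).
# Assumption: the training DAG is deployed as Snowflake tasks.
recent_task_runs() {
    snow sql -q "SELECT NAME, STATE, SCHEDULED_TIME FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY()) ORDER BY SCHEDULED_TIME DESC LIMIT ${1:-20}"
}

# List models registered in the given schema.
list_models() {
    snow sql -q "SHOW MODELS IN SCHEMA $1"
}

# Usage (the schema name is an assumption):
#   recent_task_runs 10
#   list_models PUDO_MLOPS.PUDO_DEV
```

If the task runs show a successful state but no model is listed, compare the registered model name with the name the inference configuration expects.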
Inference produces no predictions¶
Cause: No data is available for the target date.
Fix:
- Check that shared data has been seeded:
- If the simulation has not advanced far enough, add data:
Mock data errors¶
"Simulation has not been initialised"¶
Cause: The shared data has not been seeded yet.
Fix:
Simulation clock is stuck¶
Cause: You may have already reached the simulation end date.
Fix:
- Check simulation status:
- Reset if needed:
General debugging tips¶
Check the environment¶
# Which Git branch am I on? (determines the Snowflake environment)
git branch --show-current
# Which Python version?
python --version # Should be 3.10.x
# Is uv installed?
uv --version
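The Python version check can be automated with a small helper. This is a sketch that accepts any 3.10 release, matching the "Should be 3.10.x" comment above; the function name is illustrative.

```shell
# Return 0 when the given version string is a Python 3.10 release.
is_supported_python() {
    case "$1" in
        3.10|3.10.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage: `python --version` prints "Python X.Y.Z", so take the second field.
#   is_supported_python "$(python --version 2>&1 | awk '{print $2}')" \
#       || echo "unexpected Python version"
```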
Check Snowflake state¶
-- What databases exist?
SHOW DATABASES LIKE 'PUDO%';
-- What schemas exist?
SHOW SCHEMAS IN DATABASE PUDO_MLOPS;
-- What tasks exist?
SHOW TASKS IN SCHEMA PUDO_DEV;
-- Recent task history
SELECT * FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
ORDER BY SCHEDULED_TIME DESC LIMIT 20;
Enable verbose logging¶
Most scripts accept environment variables for debugging: