Repo Layers & Ownership

The hub-spoke architecture

This repository uses a hub-spoke architecture where platform infrastructure is centralised and projects are independent:

graph TD
    Hub["Hub<br/>Platform infrastructure"]
    MockData["mock_data<br/>Shared data"]
    Projects["projects/pudo<br/>ML operations"]
    FeatureStore["feature_store<br/>Shared utilities"]

    Hub --> MockData
    Hub --> Projects
    Hub --> FeatureStore
    Projects --> FeatureStore

Ownership rule

Hub and shared code may be referenced by projects. Projects may never be referenced by hub or shared code.

This rule ensures:

Hub stability: the hub changes rarely and is owned by platform engineers.
Project independence: each project can be developed, tested, and deployed without coordinating with other projects.
Clear dependency direction: dependencies always point inward (project → hub/shared), never outward.
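
The ownership rule can be checked mechanically. The following is a minimal sketch, not part of the repository: it assumes the shared layers live in `hub/`, `mock_data/`, and `feature_store/` (as in the diagram) and that project code would be imported under a top-level package named `projects`, which is an assumption. The function names `imported_modules` and `find_violations` are hypothetical.

```python
import ast
from pathlib import Path

# Assumption: directory names mirror the layers shown in the diagram.
SHARED_LAYERS = ("hub", "mock_data", "feature_store")
# Assumption: project code is importable under this top-level package name.
FORBIDDEN_PACKAGE = "projects"


def imported_modules(source: str) -> set[str]:
    """Return the top-level package names imported by a Python source string."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names.add(node.module.split(".")[0])
    return names


def find_violations(repo_root: Path) -> list[Path]:
    """List files in hub/shared layers that import from the projects package."""
    violations: list[Path] = []
    for layer in SHARED_LAYERS:
        layer_dir = repo_root / layer
        if not layer_dir.is_dir():
            continue  # a missing layer directory is simply skipped
        for path in sorted(layer_dir.rglob("*.py")):
            source = path.read_text(encoding="utf-8")
            if FORBIDDEN_PACKAGE in imported_modules(source):
                violations.append(path)
    return violations
```

Calling `find_violations(Path("."))` from the repository root returns an empty list when no hub or shared file imports project code. A check like this could run in CI, although whether the repository does so is not stated here.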

Layer responsibilities

Hub (`hub/`)

Creates and manages Snowflake platform objects:

| Responsibility | Snowflake objects |
| --- | --- |
| Database management | `PUDO_MLOPS` database |
| Schema provisioning | `SHARED_DATA`, `FEATURE_STORE_<ENV>`, `MODEL_REGISTRY_<ENV>` |
| Role management | Operational roles and grants |
| Compute provisioning | Warehouses, compute pools |

Mock Data (`mock_data/`)

Generates and loads realistic test data:

| Responsibility | What it does |
| --- | --- |
| Data generation | PUDO locations, parcels, deliveries, occupancy |
| Simulation | Morning/evening cycles, temporal patterns |
| Seeding | Bulk load into `SHARED_DATA` schema |

Projects (`projects/<name>/`)

Each project is an independent ML workload:

| Block | Responsibility |
| --- | --- |
| `feature_view/` | Entity definitions and feature view implementations |
| `training/` | Training DAG, model training, evaluation |
| `inference/` | Inference DAG, batch prediction, CLI tools |
| `core/` | Shared utilities (session, config, SQL helpers) |
| `config/` | YAML configuration with environment overlays |
| `scripts/` | Deployment and execution entry points |
`scripts/`	Deployment and execution entry points

No root tooling

There is intentionally no root `pyproject.toml` or root `Makefile`. Each component is fully self-contained:

Its own Python dependencies (`pyproject.toml` + `uv.lock`).
Its own operational targets (`Makefile`).
Its own connection configuration (`.env`).

As a result, every command targets a specific component, for example with `make -C <component>` from the repository root:

make -C hub deploy-infra
make -C mock_data seed-shared-data
make -C projects/pudo deploy-schema
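
The self-containment convention above can also be expressed as a small layout check. This is a sketch under stated assumptions: the function name `check_layout` is hypothetical, the component list is passed in by the caller, and `.env` is deliberately not checked because connection files are often kept out of version control (an assumption about this repository).

```python
from pathlib import Path

# Files each component is expected to carry, per the list above.
REQUIRED_FILES = ("pyproject.toml", "uv.lock", "Makefile")
# Files that should NOT exist at the repository root.
ROOT_FORBIDDEN = ("pyproject.toml", "Makefile")


def check_layout(repo_root: Path, components: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the layout is as described."""
    problems: list[str] = []
    for name in ROOT_FORBIDDEN:
        if (repo_root / name).exists():
            problems.append(f"unexpected root file: {name}")
    for component in components:
        for name in REQUIRED_FILES:
            if not (repo_root / component / name).is_file():
                problems.append(f"{component}: missing {name}")
    return problems
```

For example, `check_layout(Path("."), ["hub", "mock_data", "projects/pudo"])` would report any component that lacks its own `Makefile` or lock file.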

Adding a new project

To add a new project spoke:

Create `projects/<name>/` following the project template.
Add a `pyproject.toml` with the required dependencies.
Add a `Makefile` with deploy and run targets.
Reference hub and shared code as needed (but never the reverse).
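
The first three steps could be scripted roughly as follows. This is an illustrative sketch only: the actual project template is not shown here, so the directory list is taken from the project blocks table, the `pyproject.toml` and `Makefile` are created as empty placeholders to be filled in by hand, and the function name `scaffold_project` is hypothetical.

```python
from pathlib import Path

# Assumption: a new project uses the same blocks as listed for projects/<name>/.
PROJECT_BLOCKS = ("feature_view", "training", "inference", "core", "config", "scripts")


def scaffold_project(repo_root: Path, name: str) -> Path:
    """Create an empty project skeleton under projects/<name>/ and return its path."""
    project_dir = repo_root / "projects" / name
    # exist_ok=False: refuse to overwrite an existing project.
    project_dir.mkdir(parents=True, exist_ok=False)
    for block in PROJECT_BLOCKS:
        (project_dir / block).mkdir()
    # Empty placeholders; dependencies and make targets still need to be written.
    (project_dir / "pyproject.toml").touch()
    (project_dir / "Makefile").touch()
    return project_dir
```

Raising `FileExistsError` for an existing project name is a deliberate choice in this sketch, so that a typo cannot silently modify another team's spoke.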
