AMP Project Structure

An Applied ML Prototype (AMP) is a portable, declarative Cloudera Machine Learning (CML) project. Its behavior is defined by `.project-metadata.yaml`, which tells CML how to set up the runtime, install dependencies, and configure resources.

`.project-metadata.yaml` Specification

Field	Value	Description
`name`	Distributed XGBoost with Dask on CML	Display name in the AMP catalog
`specification_version`	1.0	AMP specification version
`prototype_version`	1.0	Project version
`date`	2022-07-30	Publication date

Runtime

runtimes:
  - editor: JupyterLab
    kernel: Python 3.9
    edition: Standard

The project requires a JupyterLab editor with a Python 3.9+ kernel. The Standard edition provides the base CML runtime without GPU drivers.
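A notebook or script can check the kernel version before importing heavier dependencies. The following is a minimal sketch, not part of the AMP itself: `kernel_is_supported` and `MIN_PYTHON` are hypothetical names, and the (3, 9) floor is taken from the runtime entry above.

```python
import sys

# Minimum (major, minor) version, taken from the `kernel: Python 3.9` entry.
MIN_PYTHON = (3, 9)


def kernel_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the interpreter's (major, minor) is at least `minimum`.

    Hypothetical helper for illustration; this is not a CML API.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor so patch releases do not matter.
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling `kernel_is_supported()` with no arguments inspects the running interpreter; passing a tuple such as `(3, 8)` makes the check easy to exercise without switching kernels.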

Tasks

The AMP defines two tasks that run automatically during project setup:

Task	Type	Script	Resources	Description
Install Dependencies	`create_job`	`scripts/install_dependencies.py`	1 vCPU, 2 GiB	Creates a CML job for dependency installation
Install Dependencies	`run_job`	(none)	(inherited)	Executes the dependency installation job
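The two rows depend on their order: the `run_job` task executes the job that the preceding `create_job` task defined. As a rough sketch of that relationship, the snippet below models the table as a Python list and reports any `run_job` with no earlier matching `create_job`. The dictionary keys and the `unmatched_run_jobs` helper are assumptions made for illustration and may not match the exact field names of the AMP specification.

```python
# Illustrative in-memory view of the two tasks in the table above.
# The keys are assumptions chosen for this sketch, not the official schema.
TASKS = [
    {
        "type": "create_job",
        "name": "Install Dependencies",
        "script": "scripts/install_dependencies.py",
        "cpu": 1,     # vCPU
        "memory": 2,  # GiB
    },
    {"type": "run_job", "name": "Install Dependencies"},
]


def unmatched_run_jobs(tasks):
    """Return names of run_job tasks with no earlier create_job of that name."""
    created = set()
    unmatched = []
    for task in tasks:
        if task["type"] == "create_job":
            created.add(task["name"])
        elif task["type"] == "run_job" and task["name"] not in created:
            unmatched.append(task["name"])
    return unmatched
```

For the task list above the function returns an empty list; reversing the list would flag the `run_job` entry, because the job would not exist yet when it is run.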

The install_dependencies.py script uses CML’s shell escape syntax:

!pip3 install -r requirements.txt

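Note: This line is not valid standalone Python. The `!` prefix is a shell-escape feature of CML sessions that executes the rest of the line as a shell command, so the script raises a syntax error if run with a plain `python` interpreter.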
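If the script ever needs to run outside CML, the same install can be expressed in plain Python with the standard `subprocess` module. This is a sketch of one possible alternative, not the AMP's actual script; `pip_install_command` and `install_requirements` are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build the argument list for installing a requirements file with pip.

    Using `sys.executable -m pip` targets the interpreter running this
    script, rather than whichever `pip3` happens to be first on PATH.
    """
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install_requirements(requirements="requirements.txt"):
    """Run pip; raises subprocess.CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(requirements))
```

Calling `install_requirements()` from the project root would then stand in for the `!pip3` line. Separating the command builder from the call keeps the command easy to inspect without triggering an install.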

Build Script

cdsw-build.sh is the build script for CML Model Endpoints. It runs during endpoint deployment to install runtime dependencies:

pip3 install -r requirements.txt

This ensures the inference script (scripts/predict_fraud.py) has access to xgboost and numpy at serving time.
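A quick way to confirm that the build step had the intended effect is to check that the two packages can be located before the inference script imports them. The snippet below is a minimal sketch using only the standard library; `missing_packages` and `REQUIRED_PACKAGES` are hypothetical names, and the package list is limited to the two named above.

```python
import importlib.util

# Packages the inference script is described as needing at serving time.
REQUIRED_PACKAGES = ("xgboost", "numpy")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level packages in `names` that cannot be located.

    find_spec returns None for a top-level module that is not installed,
    and it does not import the package it finds.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty result from `missing_packages()` suggests the requirements were installed into the active environment; any names returned point to a build that did not complete.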

Deployment Methods

There are three ways to deploy this AMP on CML:

1. AMP Catalog: Navigate to the AMP Catalog in a CML workspace, select the “Distributed XGBoost with Dask on CML” tile, and follow the setup wizard.
2. AMP Prototype URL: In a CML workspace, create a new project with “AMP” as the initial setup option and provide the Git repository URL.
3. Manual Git Clone: Create a new project with “Git” as the initial setup option. The setup tasks do not run automatically in this case, so run `!pip install -r requirements.txt` manually in a JupyterLab session before executing the notebooks.

After deployment, run either notebook by starting a Python 3.9+ JupyterLab session with at least 1 vCPU / 2 GiB.
