Model Endpoint Contract

This is the authoritative specification for the inference API. An external client must conform to this contract to receive correct predictions from the deployed CML Model Endpoint.

Function Signature

def predict_fraud(args) -> int

CML invokes this function for each prediction request, passing the request body, parsed from JSON into a dict, as args.

Input Contract

{
    "features": [[<29 float values>]]
}

The features value is a list containing exactly one inner list of 29 floats. The feature order must match the training schema (alphabetical by column name), exactly as listed:

Amount, V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11, V12, V13,
V14, V15, V16, V17, V18, V19, V20, V21, V22, V23, V24, V25, V26,
V27, V28

See Feature Engineering Contract for details on how these features are derived from raw transaction data.
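A client can check the payload shape before sending it. The following is a minimal sketch, not part of the deployed endpoint: the helper name validate_request is hypothetical, and accepting integers alongside floats is an assumption made for convenience. It checks only the shape described above (one row of 29 numbers), not the feature order.

```python
N_FEATURES = 29  # row width required by the input contract


def validate_request(args):
    """Return the single feature row, or raise ValueError if the shape is wrong.

    Hypothetical client-side helper; the endpoint itself does not call it.
    """
    features = args.get("features") if isinstance(args, dict) else None
    if not isinstance(features, list) or len(features) != 1:
        raise ValueError("'features' must be a list containing exactly one row")
    row = features[0]
    if not isinstance(row, list) or len(row) != N_FEATURES:
        raise ValueError(f"the row must contain exactly {N_FEATURES} values")
    for value in row:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("every feature must be a number")
    return row
```

Running this check on the client side turns a malformed payload into an immediate, descriptive error instead of a failed or misleading prediction.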

Output Contract

Returns an integer:

Value	Meaning
`0`	Not fraud
`1`	Fraud

Internal Logic

Module load (once, at endpoint startup):

import numpy as np
import xgboost as xgb

booster = xgb.Booster(model_file='/home/cdsw/model/best-xgboost-model')
threshold = 0.35

Per-request:

def predict_fraud(args):
    # Score the single feature row; the result holds one probability per row.
    prediction = booster.inplace_predict(np.array(args['features']))
    if prediction[0] <= threshold:
        return 0  # not fraud
    return 1      # fraud

The model outputs a continuous probability in [0.0, 1.0]. The threshold of 0.35 converts it to a binary label: a score above 0.35 returns 1 (fraud), and a score at or below 0.35 returns 0 (not fraud).
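The boundary behavior is easy to get wrong, so here is a small sketch of the thresholding step on its own. The function name classify is hypothetical; it mirrors the per-request comparison above without loading the model.

```python
THRESHOLD = 0.35  # same value as set at module load


def classify(score, threshold=THRESHOLD):
    """Map a fraud probability to the endpoint's 0/1 output."""
    # Mirrors the per-request logic: a score at or below the threshold is not fraud.
    return 0 if score <= threshold else 1


print(classify(0.10))  # 0
print(classify(0.35))  # 0 (the boundary counts as not fraud)
print(classify(0.36))  # 1
```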

Sample Request

{
    "features": [[-1.35980713, -0.0727811733, 2.53634674, 1.37815522,
      -0.33832077, 0.462387778, 0.239598554, 0.0986979013,
      0.36378697, 0.090794172, -0.551599533, -0.617800856,
      -0.991389847, -0.311169354, 1.46817697, -0.470400525,
      0.207971242, 0.0257905802, 0.40399296, 0.251412098,
      -0.0183067779, 0.277837576, -0.11047391, 0.0669280749,
      0.128539358, -0.189114844, 0.133558377, -0.0210530535,
      149.62]]
}
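A client might wrap the sample body in an HTTP request along these lines. This is a sketch under stated assumptions, not the documented calling convention: the URL is a placeholder, and the accessKey/request envelope reflects how CML model endpoints are commonly called, so confirm both against the details shown for your deployment. The helper name build_request is hypothetical, and the request is built but not sent.

```python
import json
import urllib.request


def build_request(url, access_key, features):
    """Build (but do not send) a POST request for one prediction."""
    body = {
        "accessKey": access_key,              # assumption: CML-style access key field
        "request": {"features": [features]},  # contract payload: one row of 29 floats
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; substitute the URL and key for your own deployment.
req = build_request("https://modelservice.example.com/model", "YOUR_ACCESS_KEY", [0.0] * 29)
# urllib.request.urlopen(req) would send the request; it is omitted here.
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without network access.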

CML Model Endpoint Configuration

Setting	Value
Build File	`cdsw-build.sh`
Target File	`scripts/predict_fraud.py`
Function	`predict_fraud`

The build file runs pip3 install -r requirements.txt to ensure xgboost and numpy are available. The target file loads the model at import time and exposes the predict_fraud function for CML to invoke.

Distributed XGBoost with Dask on CML — Developer's Guide