Validation Rules Reference

The Studio validates resources at multiple points: on import (datasets, models, adapters, prompts, configs), on job submission (fine-tuning, evaluation), and on export (model deployment). This chapter catalogs all validation rules extracted from the source code.

Source: ft/jobs.py, ft/evaluation.py, ft/datasets.py, ft/models.py, ft/adapters.py, ft/prompts.py, ft/configs.py, ft/service.py

Rule ID Convention

Rule IDs follow the format {Domain}-{Number} where Domain is one of:

| Domain | Scope |
|--------|-------|
| FT | Fine-tuning job parameters |
| EV | Evaluation job parameters |
| DS | Dataset import |
| MD | Model import |
| AD | Adapter import |
| PR | Prompt template |
| CF | Configuration blob |
| EX | Model export / deployment |

All rules with severity ERROR abort the operation and return a gRPC error. Rules with severity INFO are advisory and do not block the operation.
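The convention above can be sketched as a small parser. This is an illustration only: the names `Severity`, `parse_rule_id`, and `blocks_operation` are hypothetical and do not come from the Studio source, and the three-digit number width is inferred from the tables in this chapter.

```python
import re
from enum import Enum
from typing import Tuple


class Severity(Enum):
    ERROR = "ERROR"  # aborts the operation
    INFO = "INFO"    # advisory only


# Domain prefixes listed in the table above.
DOMAINS = {"FT", "EV", "DS", "MD", "AD", "PR", "CF", "EX"}

# Assumption: two uppercase letters, a hyphen, and a three-digit number.
RULE_ID_RE = re.compile(r"([A-Z]{2})-(\d{3})")


def parse_rule_id(rule_id: str) -> Tuple[str, int]:
    """Split a rule ID such as 'FT-012' into (domain, number)."""
    match = RULE_ID_RE.fullmatch(rule_id)
    if match is None or match.group(1) not in DOMAINS:
        raise ValueError(f"not a valid rule ID: {rule_id!r}")
    return match.group(1), int(match.group(2))


def blocks_operation(severity: Severity) -> bool:
    """Only ERROR rules abort the operation; INFO rules do not."""
    return severity is Severity.ERROR
```

For example, `parse_rule_id("CF-004")` yields `("CF", 4)`, and `blocks_operation(Severity.INFO)` is `False`.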

Fine-Tuning Job Validation

Validated in ft/jobs.py when StartFineTuningJob is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| FT-001 | framework_type | Must be legacy or axolotl | ERROR |
| FT-002 | adapter_name | Must match ^[a-zA-Z0-9-]+$ (alphanumeric + hyphens, no spaces) | ERROR |
| FT-003 | out_dir | Must exist as a directory | ERROR |
| FT-004 | num_cpu | Must be > 0 | ERROR |
| FT-005 | num_gpu | Must be >= 0 | ERROR |
| FT-006 | num_memory | Must be > 0 | ERROR |
| FT-007 | num_workers | Must be > 0 | ERROR |
| FT-008 | num_epochs | Must be > 0 | ERROR |
| FT-009 | learning_rate | Must be > 0 | ERROR |
| FT-010 | dataset_fraction | Must be in (0, 1] | ERROR |
| FT-011 | train_test_split | Must be in (0, 1] | ERROR |
| FT-012 | axolotl_config_id | Required when framework_type=axolotl | ERROR |
| FT-013 | base_model_id | Must exist in models table | ERROR |
| FT-014 | dataset_id | Must exist in datasets table | ERROR |
| FT-015 | prompt_id | Must exist in prompts table (legacy framework only) | ERROR |
| FT-016 | axolotl_config_id | Must exist in configs table (when provided) | ERROR |

FT-001 through FT-011 are local validations that require no database access (FT-003 checks the local filesystem). FT-012 is a cross-field consistency check. FT-013 through FT-016 are foreign-key validations resolved against the DAO.
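The local and cross-field rules (FT-001 through FT-012) can be sketched as below. This is a minimal illustration, not the code in ft/jobs.py: `FineTuningRequest` and `validate_local` are hypothetical names, and the sketch collects every violated rule ID, whereas the real service may stop at the first failure.

```python
import os
import re
from dataclasses import dataclass
from typing import List, Optional

# Pattern from FT-002; fullmatch() is used so a trailing newline is rejected.
ADAPTER_NAME_RE = re.compile(r"[a-zA-Z0-9-]+")


@dataclass
class FineTuningRequest:
    """Hypothetical stand-in for the StartFineTuningJob request."""
    framework_type: str
    adapter_name: str
    out_dir: str
    num_cpu: int
    num_gpu: int
    num_memory: int
    num_workers: int
    num_epochs: int
    learning_rate: float
    dataset_fraction: float
    train_test_split: float
    axolotl_config_id: Optional[str] = None


def validate_local(req: FineTuningRequest) -> List[str]:
    """Return the IDs of violated rules among FT-001..FT-012."""
    failed = []
    if req.framework_type not in ("legacy", "axolotl"):
        failed.append("FT-001")
    if not ADAPTER_NAME_RE.fullmatch(req.adapter_name):
        failed.append("FT-002")
    if not os.path.isdir(req.out_dir):
        failed.append("FT-003")
    if req.num_cpu <= 0:
        failed.append("FT-004")
    if req.num_gpu < 0:  # zero GPUs is allowed
        failed.append("FT-005")
    # FT-006..FT-009: strictly positive values.
    for rule, value in (("FT-006", req.num_memory), ("FT-007", req.num_workers),
                        ("FT-008", req.num_epochs), ("FT-009", req.learning_rate)):
        if not value > 0:
            failed.append(rule)
    # FT-010, FT-011: half-open interval (0, 1].
    for rule, value in (("FT-010", req.dataset_fraction),
                        ("FT-011", req.train_test_split)):
        if not 0 < value <= 1:
            failed.append(rule)
    # FT-012: cross-field check between framework_type and axolotl_config_id.
    if req.framework_type == "axolotl" and not req.axolotl_config_id:
        failed.append("FT-012")
    return failed
```

The foreign-key rules (FT-013 through FT-016) are left out because they depend on the DAO, which this sketch does not model.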

Evaluation Job Validation

Validated in ft/evaluation.py when StartEvaluationJob is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| EV-001 | model_adapter_combinations | Must be non-empty | ERROR |
| EV-002 | dataset_id | Must be non-empty | ERROR |
| EV-003 | prompt_id | Must be non-empty | ERROR |
| EV-004 | num_cpu, num_gpu, num_memory | Must be provided | ERROR |
| EV-005 | model IDs in combinations | Each must exist in models table | ERROR |
| EV-006 | adapter IDs in combinations | Each must exist in adapters table (or empty for base model) | ERROR |
| EV-007 | dataset_id | Must exist in datasets table | ERROR |
| EV-008 | prompt_id | Must exist in prompts table | ERROR |

Evaluation jobs accept multiple model-adapter pairs in a single request. EV-005 and EV-006 are validated per combination entry.
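The per-combination checks can be sketched as follows. The function name and the use of plain sets in place of DAO lookups are assumptions made for illustration; the real code in ft/evaluation.py queries the database.

```python
from typing import List, Set, Tuple


def validate_combinations(
    combos: List[Tuple[str, str]],
    known_model_ids: Set[str],
    known_adapter_ids: Set[str],
) -> List[str]:
    """Check EV-001, then EV-005 and EV-006 for each (model_id, adapter_id) pair.

    The two sets stand in for lookups against the models and adapters tables.
    """
    if not combos:
        return ["EV-001: model_adapter_combinations is empty"]
    errors = []
    for index, (model_id, adapter_id) in enumerate(combos):
        if model_id not in known_model_ids:
            errors.append(f"EV-005: combination {index}: unknown model {model_id!r}")
        # An empty adapter ID means "evaluate the base model" and is allowed.
        if adapter_id and adapter_id not in known_adapter_ids:
            errors.append(f"EV-006: combination {index}: unknown adapter {adapter_id!r}")
    return errors
```

A request such as `[("m1", ""), ("m1", "a1")]` evaluates the base model and one adapter of the same model in a single job.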

Dataset Import Validation

Validated in ft/datasets.py when AddDataset is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| DS-001 | type | Must be one of: huggingface, project, project_csv, project_json, project_jsonl | ERROR |
| DS-002 | huggingface_name | Must resolve via HfApi.dataset_info() (huggingface type) | ERROR |
| DS-003 | location | File must exist (project_csv, project_json, project_jsonl) | ERROR |

DS-002 makes a network call to the HuggingFace Hub. If HUGGINGFACE_ACCESS_TOKEN is set, it is used for gated dataset access. DS-003 validates the local filesystem path and checks that the file extension matches the declared type.
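A minimal sketch of the DS-003 check follows. The type-to-extension mapping is an assumption inferred from the type names, and `check_ds003` is a hypothetical helper rather than a function in ft/datasets.py.

```python
import os
from typing import Optional

# Assumed mapping from declared dataset type to expected file extension.
EXPECTED_EXTENSION = {
    "project_csv": ".csv",
    "project_json": ".json",
    "project_jsonl": ".jsonl",
}


def check_ds003(dataset_type: str, location: str) -> Optional[str]:
    """Return an error message if DS-003 is violated, otherwise None."""
    expected = EXPECTED_EXTENSION.get(dataset_type)
    if expected is None:
        return None  # DS-003 does not apply to this dataset type
    if not os.path.isfile(location):
        return f"DS-003: file does not exist: {location}"
    extension = os.path.splitext(location)[1].lower()
    if extension != expected:
        return f"DS-003: expected a {expected} file for type {dataset_type}, got {extension!r}"
    return None
```

The sketch compares extensions case-insensitively, so `TRAIN.CSV` would pass for `project_csv`; whether the Studio does the same is not stated here.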

Model Import Validation

Validated in ft/models.py when AddModel is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| MD-001 | type | Must be one of: huggingface, project, model_registry | ERROR |
| MD-002 | huggingface_model_name | Must resolve via HfApi.model_info() (huggingface type) | ERROR |
| MD-003 | cml_registered_model_id | Must resolve via cmlapi (model_registry type) | ERROR |

MD-002 contacts the HuggingFace Hub. MD-003 queries the CML Model Registry through the cmlapi SDK.

Adapter Import Validation

Validated in ft/adapters.py when AddAdapter is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| AD-001 | name | Required, non-empty | ERROR |
| AD-002 | model_id | Required, must exist in models table | ERROR |
| AD-003 | location | Must exist as directory (project type) | ERROR |
| AD-004 | fine_tuning_job_id | Must exist in fine_tuning_jobs table (if provided) | ERROR |
| AD-005 | prompt_id | Must exist in prompts table (if provided) | ERROR |

AD-004 and AD-005 are optional foreign-key references. When provided, they link the adapter back to the job and prompt that produced it.

Prompt Validation

Validated in ft/prompts.py when AddPrompt is called.

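| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| PR-001 | name | Required, unique | ERROR |
| PR-002 | dataset_id | Required | ERROR |
| PR-003 | prompt_template | Required, non-empty | ERROR |
| PR-004 | input_template | Required | ERROR |
| PR-005 | completion_template | Required | ERROR |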

PR-001 enforces uniqueness at the application level before insert. The prompt_template, input_template, and completion_template fields use Python format-string syntax referencing dataset feature column names.
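To illustrate the format-string syntax, the sketch below extracts the placeholders from a template and compares them with a dataset's columns. The helper names and the sample columns (`question`, `context`, `answer`) are hypothetical. The rules above do not state that the Studio performs this cross-check; it is shown only to make the template syntax concrete.

```python
from string import Formatter
from typing import Iterable, Set


def template_fields(template: str) -> Set[str]:
    """Return the named placeholders used in a Python format string."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text and "" for a bare "{}".
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_columns(template: str, columns: Iterable[str]) -> Set[str]:
    """Return placeholders that do not correspond to any dataset column."""
    return template_fields(template) - set(columns)


# Hypothetical template and dataset row.
prompt_template = "Question: {question}\nAnswer: {answer}"
row = {"question": "What is 2 + 2?", "context": "", "answer": "4"}
rendered = prompt_template.format(**row)
```

Extra keys in `row` that the template does not reference, such as `context`, are ignored by `str.format`.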

Config Validation

Validated in ft/configs.py when AddConfig is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| CF-001 | type | Must be one of ConfigType enum values | ERROR |
| CF-002 | config | Must be valid JSON (non-axolotl types) | ERROR |
| CF-003 | config | Must be valid YAML (axolotl type) | ERROR |
| CF-004 | config | Deduplicated – returns existing ID if identical content exists | INFO |

CF-004 is a deduplication check, not an error. When a config with identical content already exists, the existing record’s ID is returned instead of creating a duplicate. The caller receives a successful response in either case.
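The deduplication behavior can be sketched with an in-memory store. `ConfigStore` is a hypothetical class, and the sketch assumes "identical content" means an exact string match; the real implementation in ft/configs.py may compare content differently, for example after parsing.

```python
import uuid
from typing import Dict


class ConfigStore:
    """Hypothetical in-memory stand-in for the configs table."""

    def __init__(self) -> None:
        self._id_by_content: Dict[str, str] = {}

    def add(self, content: str) -> str:
        """Return the ID for this content, creating a record only if needed."""
        existing_id = self._id_by_content.get(content)
        if existing_id is not None:
            # CF-004: identical content already stored, so reuse its ID.
            return existing_id
        new_id = str(uuid.uuid4())
        self._id_by_content[content] = new_id
        return new_id
```

Calling `add` twice with the same content returns the same ID both times, so the caller cannot tell from the response alone whether a new record was created.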

Export Validation

Validated in ft/models.py when ExportModel or RegisterModel is called.

| Rule ID | Field | Constraint | Severity |
|---------|-------|------------|----------|
| EX-001 | base_model_id | Required, non-empty | ERROR |
| EX-002 | adapter_id | Required, non-empty | ERROR |
| EX-003 | model_name | Required, non-empty | ERROR |
| EX-004 | adapter type | Must be PROJECT for CML Model deployment | ERROR |
| EX-005 | model type | Must be huggingface for CML Model deployment | ERROR |

EX-004 and EX-005 enforce deployment constraints. Only project-local adapters (those with files on disk) can be packaged for CML Model Registry, and only HuggingFace-sourced base models are supported for the export merge workflow.
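The export rules for the CML Model deployment path can be sketched as one function. `validate_export` is a hypothetical name, and the type values are compared as the literal strings shown in the table; the real code in ft/models.py may use enum members with different casing.

```python
from typing import List


def validate_export(
    base_model_id: str,
    adapter_id: str,
    model_name: str,
    adapter_type: str,
    model_type: str,
) -> List[str]:
    """Return the IDs of violated rules among EX-001..EX-005."""
    failed = []
    # EX-001..EX-003: required, non-empty fields.
    for rule, value in (("EX-001", base_model_id),
                        ("EX-002", adapter_id),
                        ("EX-003", model_name)):
        if not value:
            failed.append(rule)
    # EX-004: only project-local adapters can be deployed.
    if adapter_type != "PROJECT":
        failed.append("EX-004")
    # EX-005: only HuggingFace-sourced base models are supported.
    if model_type != "huggingface":
        failed.append("EX-005")
    return failed
```

An empty list means the request passes all five checks in this sketch.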