Resource Concepts

Fine Tuning Studio manages seven resource types. All use UUID string primary keys generated via uuid4(). Resources are metadata entries stored in SQLite – the actual artifacts (model weights, dataset files, adapter checkpoints) live on the filesystem, HuggingFace Hub, or the CML Model Registry.
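As a minimal sketch of how such a key can be produced with the standard library: `new_resource_id` is a hypothetical helper name used only for illustration and is not taken from the codebase.

```python
import uuid


def new_resource_id() -> str:
    """Return a random (version 4) UUID in its canonical 36-character string form."""
    return str(uuid.uuid4())


resource_id = new_resource_id()
print(resource_id)
```

Storing the key as a string rather than a native UUID type keeps it portable across SQLite, protobuf messages, and plain dictionaries.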

Resource Types

| Resource | Table | Purpose |
| --- | --- | --- |
| Dataset | datasets | Reference to a HuggingFace Hub dataset or local file (CSV, JSON, JSONL) |
| Model | models | Base foundation model from HuggingFace Hub or CML Model Registry |
| Adapter | adapters | PEFT LoRA adapter – produced by training, imported from disk, or fetched from Hub |
| Prompt | prompts | Format-string template mapping dataset features into training input |
| Config | configs | Named configuration blob (training args, BnB, LoRA, generation, Axolotl YAML) |
| FineTuningJob | fine_tuning_jobs | CML Job that trains a PEFT adapter |
| EvaluationJob | evaluation_jobs | CML Job that runs MLflow evaluation against model+adapter combinations |

Entity Relationships

Type Enums

All type enums are defined in ft/api/types.py as subclasses of both str and Enum, so each member is also a plain string.

| Enum | Values |
| --- | --- |
| DatasetType | huggingface, project, project_csv, project_json, project_jsonl |
| ModelType | huggingface, project, model_registry |
| AdapterType | project, huggingface, model_registry |
| PromptType | in_place |
| ConfigType | training_arguments, bitsandbytes_config, generation_config, lora_config, custom, axolotl, axolotl_dataset_formats |
| FineTuningFrameworkType | legacy, axolotl |
| ModelExportType | model_registry, cml_model |
| EvaluationJobType | mlflow |
| ModelFrameworkType | pytorch, tensorflow, onnx |
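To illustrate the `str, Enum` pattern, here is a sketch of what DatasetType might look like. The values come from the table above; the upper-case member names are an assumption and may differ from those in ft/api/types.py.

```python
import json
from enum import Enum


class DatasetType(str, Enum):
    # Member names are assumed for illustration; the values are from the table.
    HUGGINGFACE = "huggingface"
    PROJECT = "project"
    PROJECT_CSV = "project_csv"
    PROJECT_JSON = "project_json"
    PROJECT_JSONL = "project_jsonl"


# A value read back from the database can be turned into a member again.
restored = DatasetType("project_csv")

# Because members are also str instances, they compare equal to plain strings
# and serialize to JSON without a custom encoder.
matches_plain_string = restored == "project_csv"
encoded = json.dumps({"type": DatasetType.HUGGINGFACE})
```

Mixing in str means a type column can hold the raw value while application code still gets validation: constructing the enum from an unknown string raises ValueError.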

ORM Layer

All ORM models inherit from the declarative base class created by sqlalchemy.orm.declarative_base(), plus two mixins defined in ft/db/model.py:

MappedProtobuf – bidirectional protobuf conversion:

  • from_message(message) – class method. Extracts set fields from a protobuf message via ListFields() and passes them as kwargs to the ORM constructor.
  • to_protobuf(protobuf_cls) – instance method. Converts non-null ORM columns into a protobuf message by matching field names.
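The two methods can be sketched with standard-library stand-ins. This is not the code from ft/db/model.py: `FakeMessage` is a hypothetical substitute for a generated protobuf class, a `COLUMNS` tuple replaces the SQLAlchemy column list, and the `Dataset` fields shown are invented for the example. The one real protobuf behaviour relied on is that `ListFields()` yields (descriptor, value) pairs for set fields, with the field name available as `descriptor.name`.

```python
from types import SimpleNamespace


class FakeMessage:
    """Hypothetical stand-in for a generated protobuf message class."""

    FIELDS = ("id", "name", "description")

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(self.FIELDS)
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        self._values = dict(kwargs)

    def ListFields(self):
        # Mimics protobuf: only set fields, as (descriptor, value) pairs.
        return [(SimpleNamespace(name=k), v) for k, v in self._values.items()]


class MappedProtobuf:
    """Sketch of the mixin; COLUMNS stands in for the ORM's column names."""

    COLUMNS = ()

    @classmethod
    def from_message(cls, message):
        # Set fields on the message become constructor keyword arguments.
        return cls(**{field.name: value for field, value in message.ListFields()})

    def to_protobuf(self, protobuf_cls):
        # Copy non-null columns whose names match a field on the target class.
        values = {
            name: getattr(self, name)
            for name in self.COLUMNS
            if getattr(self, name) is not None and name in protobuf_cls.FIELDS
        }
        return protobuf_cls(**values)


class Dataset(MappedProtobuf):
    COLUMNS = ("id", "name", "description")  # assumed column names

    def __init__(self, id=None, name=None, description=None):
        self.id = id
        self.name = name
        self.description = description


dataset = Dataset.from_message(FakeMessage(id="abc", name="squad"))
round_trip = dataset.to_protobuf(FakeMessage)
fields = {field.name: value for field, value in round_trip.ListFields()}
```

Unset message fields and null columns are skipped in both directions, so an absent value stays absent rather than being replaced by a default.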

MappedDict – bidirectional dict conversion:

  • from_dict(d) – class method. Constructs an ORM instance from a plain dictionary.
  • to_dict() – instance method. Returns a dictionary of all non-null column values via SQLAlchemy inspect().
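The dictionary round trip can be sketched the same way. As an assumption made to keep the example dependency-free, a `COLUMNS` tuple replaces the column discovery that the real mixin performs through SQLAlchemy `inspect()`, and the `Prompt` fields are hypothetical.

```python
class MappedDict:
    """Sketch of the mixin; COLUMNS stands in for SQLAlchemy column inspection."""

    COLUMNS = ()

    @classmethod
    def from_dict(cls, d):
        # Dictionary keys are passed straight through as constructor kwargs.
        return cls(**d)

    def to_dict(self):
        # Only columns holding a non-null value appear in the result.
        return {
            name: getattr(self, name)
            for name in self.COLUMNS
            if getattr(self, name) is not None
        }


class Prompt(MappedDict):
    COLUMNS = ("id", "name", "prompt_template")  # assumed column names

    def __init__(self, id=None, name=None, prompt_template=None):
        self.id = id
        self.name = name
        self.prompt_template = prompt_template


original = {"id": "p-1", "prompt_template": "Question: {question}"}
prompt = Prompt.from_dict(original)
exported = prompt.to_dict()

# A key explicitly set to None does not survive the round trip.
sparse = Prompt.from_dict({"id": "p-2", "name": None}).to_dict()
```

Because null columns are dropped, consumers of `to_dict()` should treat a missing key as "no value" rather than expecting every column to be present.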

The serialization chain for any resource:

Protobuf message  <-->  ORM model  <-->  Python dict
     from_message() / to_protobuf()   from_dict() / to_dict()

Table Registry

ft/db/model.py maintains two registries used by the database import/export subsystem:

TABLE_TO_MODEL_REGISTRY = {
    'datasets': Dataset,
    'models': Model,
    'prompts': Prompt,
    'adapters': Adapter,
    'fine_tuning_jobs': FineTuningJob,
    'evaluation_jobs': EvaluationJob,
    'configs': Config,
}

MODEL_TO_TABLE_REGISTRY = {v: k for k, v in TABLE_TO_MODEL_REGISTRY.items()}

Any new resource type must be added to TABLE_TO_MODEL_REGISTRY for database import/export to function correctly. MODEL_TO_TABLE_REGISTRY is derived from it, so it needs no separate update.
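A short sketch of registering a new type, using empty stand-in classes instead of the real ORM models. `Experiment`, its `experiments` table, and the `model_for_table` helper are hypothetical names introduced only for this example.

```python
class Dataset:
    """Stand-in for the real ORM model."""


class Model:
    """Stand-in for the real ORM model."""


class Experiment:
    """Hypothetical new resource type."""


TABLE_TO_MODEL_REGISTRY = {
    "datasets": Dataset,
    "models": Model,
    "experiments": Experiment,  # the new entry goes in the forward mapping
}

# The reverse mapping is built once from the forward one, so the new type
# is picked up here as long as it was added above.
MODEL_TO_TABLE_REGISTRY = {v: k for k, v in TABLE_TO_MODEL_REGISTRY.items()}


def model_for_table(table_name):
    """Resolve a table name to its model class, failing loudly if unregistered."""
    try:
        return TABLE_TO_MODEL_REGISTRY[table_name]
    except KeyError:
        raise ValueError(f"no ORM model registered for table {table_name!r}") from None


table_for_experiment = MODEL_TO_TABLE_REGISTRY[Experiment]
```

Since the reverse mapping is computed when the module is loaded, an entry added to TABLE_TO_MODEL_REGISTRY afterwards would not appear in MODEL_TO_TABLE_REGISTRY; adding the type to the dictionary literal avoids that.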