Resource Concepts

Fine Tuning Studio manages seven resource types. All use UUID string primary keys generated via uuid4(). Resources are metadata entries stored in SQLite – the actual artifacts (model weights, dataset files, adapter checkpoints) live on the filesystem, HuggingFace Hub, or the CML Model Registry.
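As a minimal sketch of the ID scheme described above, the snippet below generates a UUID4 string and attaches it to a metadata record. The helper name new_resource_id and the dictionary keys are hypothetical and chosen only for illustration; the real column names are not shown in this section.

```python
import uuid


def new_resource_id() -> str:
    """Return a random UUID4 rendered as a string, usable as a primary key."""
    return str(uuid.uuid4())


# Hypothetical metadata record: it stores a reference to the artifact,
# not the artifact itself (the dataset files stay on the Hub or on disk).
dataset_row = {
    "id": new_resource_id(),
    "type": "huggingface",
    "huggingface_name": "org/dataset-name",  # assumed key, for illustration only
}
```

Because the key is a random string rather than an auto-incrementing integer, records can be created without asking the database for the next ID first.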

Resource Types

Resource	Table	Purpose
Dataset	`datasets`	Reference to a HuggingFace Hub dataset or local file (CSV, JSON, JSONL)
Model	`models`	Base foundation model from HuggingFace Hub or CML Model Registry
Adapter	`adapters`	PEFT LoRA adapter – produced by training, imported from disk, or fetched from Hub
Prompt	`prompts`	Format-string template mapping dataset features into training input
Config	`configs`	Named configuration blob (training args, BnB, LoRA, generation, Axolotl YAML)
FineTuningJob	`fine_tuning_jobs`	CML Job that trains a PEFT adapter
EvaluationJob	`evaluation_jobs`	CML Job that runs MLflow evaluation against model+adapter combinations

Entity Relationships

Type Enums

All type enums are defined in ft/api/types.py. Each one subclasses both str and Enum, so its members compare equal to their plain string values.

Enum	Values
`DatasetType`	`huggingface`, `project`, `project_csv`, `project_json`, `project_jsonl`
`ModelType`	`huggingface`, `project`, `model_registry`
`AdapterType`	`project`, `huggingface`, `model_registry`
`PromptType`	`in_place`
`ConfigType`	`training_arguments`, `bitsandbytes_config`, `generation_config`, `lora_config`, `custom`, `axolotl`, `axolotl_dataset_formats`
`FineTuningFrameworkType`	`legacy`, `axolotl`
`ModelExportType`	`model_registry`, `cml_model`
`EvaluationJobType`	`mlflow`
`ModelFrameworkType`	`pytorch`, `tensorflow`, `onnx`
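The following sketch shows what a str, Enum subclass looks like and why the pattern is convenient for values stored as text. The string values are taken from the DatasetType row above; the member names (HUGGINGFACE, PROJECT_CSV, and so on) are assumptions, since the table lists only the values.

```python
import json
from enum import Enum


class DatasetType(str, Enum):
    # Values come from the table above; member names are assumed.
    HUGGINGFACE = "huggingface"
    PROJECT = "project"
    PROJECT_CSV = "project_csv"
    PROJECT_JSON = "project_json"
    PROJECT_JSONL = "project_jsonl"


# Look a member up from the raw string stored in a database column.
parsed = DatasetType("project_csv")

# Members are also str instances, so they compare equal to plain strings.
is_plain_string_equal = parsed == "project_csv"

# The str mixin lets the standard JSON encoder serialize members directly.
encoded = json.dumps({"type": DatasetType.HUGGINGFACE})
```

Passing a string that is not one of the listed values, for example DatasetType("parquet"), raises ValueError, which gives a cheap validation step when reading untrusted input.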

ORM Layer

All ORM models inherit from the base class returned by sqlalchemy.orm.declarative_base(), plus two mixins defined in ft/db/model.py:

MappedProtobuf – bidirectional protobuf conversion:

from_message(message) – class method. Extracts set fields from a protobuf message via ListFields() and passes them as kwargs to the ORM constructor.
to_protobuf(protobuf_cls) – instance method. Converts non-null ORM columns into a protobuf message by matching field names.

MappedDict – bidirectional dict conversion:

from_dict(d) – class method. Constructs an ORM instance from a plain dictionary.
to_dict() – instance method. Returns a dictionary of all non-null column values via SQLAlchemy inspect().

The serialization chain for any resource:

Protobuf message  <-->  ORM model  <-->  Python dict
     from_message() / to_protobuf()   from_dict() / to_dict()
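The chain can be illustrated with a dependency-free sketch. This is not the code in ft/db/model.py: a dataclass stands in for the SQLAlchemy model, and FakeDatasetMessage stands in for a generated protobuf class, mimicking only ListFields(). All class names ending in Sketch or starting with Fake, and the field names, are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


class MappedDictSketch:
    """Dict conversion for dataclass stand-ins (not the real mixin)."""

    @classmethod
    def from_dict(cls, d):
        return cls(**d)

    def to_dict(self):
        # Keep only fields that are set, mirroring the "non-null columns" rule.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


class MappedProtobufSketch:
    """Protobuf conversion by matching field names (not the real mixin)."""

    @classmethod
    def from_message(cls, message):
        # ListFields() yields (field_descriptor, value) pairs for set fields.
        return cls(**{desc.name: value for desc, value in message.ListFields()})

    def to_protobuf(self, protobuf_cls):
        return protobuf_cls(**self.to_dict())


@dataclass(frozen=True)
class FakeFieldDescriptor:
    name: str


class FakeDatasetMessage:
    """Stand-in for a generated protobuf message class."""

    def __init__(self, **kwargs):
        self._values = kwargs

    def ListFields(self):
        return [(FakeFieldDescriptor(k), v) for k, v in self._values.items()]


@dataclass
class Dataset(MappedProtobufSketch, MappedDictSketch):
    id: str
    type: Optional[str] = None
    description: Optional[str] = None


message = FakeDatasetMessage(id="abc", type="huggingface")
row = Dataset.from_message(message)        # message -> model
as_dict = row.to_dict()                    # model -> dict, unset fields dropped
clone = Dataset.from_dict(as_dict)         # dict -> model
round_tripped = clone.to_protobuf(FakeDatasetMessage)  # model -> message
```

Note that unset fields (description here) never appear in the dictionary or in the outgoing message, which matches the non-null behaviour described for to_dict() and to_protobuf().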

Table Registry

ft/db/model.py maintains two registries used by the database import/export subsystem:

TABLE_TO_MODEL_REGISTRY = {
    'datasets': Dataset,
    'models': Model,
    'prompts': Prompt,
    'adapters': Adapter,
    'fine_tuning_jobs': FineTuningJob,
    'evaluation_jobs': EvaluationJob,
    'configs': Config,
}

MODEL_TO_TABLE_REGISTRY = {v: k for k, v in TABLE_TO_MODEL_REGISTRY.items()}

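Any new resource type must be added to TABLE_TO_MODEL_REGISTRY for database import/export to function correctly. MODEL_TO_TABLE_REGISTRY is derived from it, so the reverse mapping picks up the new entry automatically.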
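One way to guard against a forgotten registration is a small consistency check, sketched below with empty stand-in classes instead of the real ORM models. The helper unregistered_models and the Experiment class are hypothetical; they only illustrate how the two registries relate.

```python
# Empty stand-ins for ORM models; the real classes live in ft/db/model.py.
class Dataset: ...
class Model: ...
class Prompt: ...
class Experiment: ...  # hypothetical new resource that was never registered


TABLE_TO_MODEL_REGISTRY = {
    "datasets": Dataset,
    "models": Model,
    "prompts": Prompt,
}

# Derived reverse mapping, built the same way as in the listing above.
MODEL_TO_TABLE_REGISTRY = {v: k for k, v in TABLE_TO_MODEL_REGISTRY.items()}


def unregistered_models(all_models):
    """Return the names of model classes that have no table registered."""
    return [m.__name__ for m in all_models if m not in MODEL_TO_TABLE_REGISTRY]


missing = unregistered_models([Dataset, Model, Prompt, Experiment])
table_for_model = MODEL_TO_TABLE_REGISTRY[Model]
```

Running a check like this in a unit test would surface a missing entry before an import or export silently skips the new table.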