gRPC Service Design

The Fine Tuning Studio API is defined as a single gRPC service in ft/proto/fine_tuning_studio.proto. The catalog below lists 32 RPCs organized by resource domain. A generated Python stub provides the transport layer; FineTuningStudioClient wraps it with error handling and convenience methods.

Service Architecture

RPC Catalog

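Most domains follow the same base pattern: List, Get, Add (or Start for jobs), and Remove. The dataset and model domains each add one extra RPC (GetDatasetSplitByAdapter and ExportModel), and the database domain exposes only Export and Import. Request and response types use the naming convention {Action}{Domain}Request / {Action}{Domain}Response.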
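
As a small illustration of the naming convention, the sketch below derives the message type names from an action and a domain. The helper message_names is hypothetical and not part of the codebase; note that List RPCs use the plural form of the domain (ListDatasets) while the others use the singular (GetDataset), so the caller passes the domain already in the right form.

```python
from typing import Tuple


def message_names(action: str, domain: str) -> Tuple[str, str]:
    """Return the (request, response) type names for one RPC.

    Hypothetical helper that only illustrates the
    {Action}{Domain}Request / {Action}{Domain}Response convention.
    """
    rpc = f"{action}{domain}"
    return f"{rpc}Request", f"{rpc}Response"


# List RPCs take the plural domain, the rest take the singular.
print(message_names("List", "Datasets"))
print(message_names("Start", "FineTuningJob"))
```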

Dataset RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListDatasets | ListDatasetsRequest | ListDatasetsResponse | Return all registered datasets |
| GetDataset | GetDatasetRequest | GetDatasetResponse | Return a single dataset by ID |
| AddDataset | AddDatasetRequest | AddDatasetResponse | Register a HuggingFace or local dataset |
| RemoveDataset | RemoveDatasetRequest | RemoveDatasetResponse | Delete a dataset registration |
| GetDatasetSplitByAdapter | GetDatasetSplitByAdapterRequest | GetDatasetSplitByAdapterResponse | Get dataset split info for a specific adapter |

Model RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListModels | ListModelsRequest | ListModelsResponse | Return all registered models |
| GetModel | GetModelRequest | GetModelResponse | Return a single model by ID |
| AddModel | AddModelRequest | AddModelResponse | Register a HuggingFace or CML model |
| ExportModel | ExportModelRequest | ExportModelResponse | Export a model to CML Model Registry |
| RemoveModel | RemoveModelRequest | RemoveModelResponse | Delete a model registration |

Adapter RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListAdapters | ListAdaptersRequest | ListAdaptersResponse | Return all registered adapters |
| GetAdapter | GetAdapterRequest | GetAdapterResponse | Return a single adapter by ID |
| AddAdapter | AddAdapterRequest | AddAdapterResponse | Register a local or HuggingFace adapter |
| RemoveAdapter | RemoveAdapterRequest | RemoveAdapterResponse | Delete an adapter registration |

Prompt RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListPrompts | ListPromptsRequest | ListPromptsResponse | Return all prompt templates |
| GetPrompt | GetPromptRequest | GetPromptResponse | Return a single prompt by ID |
| AddPrompt | AddPromptRequest | AddPromptResponse | Create a new prompt template |
| RemovePrompt | RemovePromptRequest | RemovePromptResponse | Delete a prompt template |

Fine-Tuning RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListFineTuningJobs | ListFineTuningJobsRequest | ListFineTuningJobsResponse | Return all fine-tuning jobs |
| GetFineTuningJob | GetFineTuningJobRequest | GetFineTuningJobResponse | Return a single job by ID |
| StartFineTuningJob | StartFineTuningJobRequest | StartFineTuningJobResponse | Dispatch a new fine-tuning CML Job |
| RemoveFineTuningJob | RemoveFineTuningJobRequest | RemoveFineTuningJobResponse | Delete a fine-tuning job record |

Evaluation RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListEvaluationJobs | ListEvaluationJobsRequest | ListEvaluationJobsResponse | Return all evaluation jobs |
| GetEvaluationJob | GetEvaluationJobRequest | GetEvaluationJobResponse | Return a single evaluation job by ID |
| StartEvaluationJob | StartEvaluationJobRequest | StartEvaluationJobResponse | Dispatch a new evaluation CML Job |
| RemoveEvaluationJob | RemoveEvaluationJobRequest | RemoveEvaluationJobResponse | Delete an evaluation job record |

Config RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ListConfigs | ListConfigsRequest | ListConfigsResponse | Return all configuration blobs |
| GetConfig | GetConfigRequest | GetConfigResponse | Return a single config by ID |
| AddConfig | AddConfigRequest | AddConfigResponse | Create a new configuration |
| RemoveConfig | RemoveConfigRequest | RemoveConfigResponse | Delete a configuration |

Database RPCs

| RPC | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| ExportDatabase | ExportDatabaseRequest | ExportDatabaseResponse | Export entire database as JSON |
| ImportDatabase | ImportDatabaseRequest | ImportDatabaseResponse | Import database from JSON file |

Servicer Implementation

FineTuningStudioApp in ft/service.py extends the generated FineTuningStudioServicer. It holds two shared resources initialized in __init__:

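import os

import cmlapi


class FineTuningStudioApp(FineTuningStudioServicer):
    def __init__(self):
        # CML API client shared by every RPC that needs it
        self.cml = cmlapi.default_client()
        # Data access object; engine_args configure the database connection pool
        self.dao = FineTuningStudioDao(engine_args={
            "pool_size": 5,
            "max_overflow": 10,
            "pool_timeout": 30,
            "pool_recycle": 1800,
        })
        # ID of the CML project, read from the environment
        self.project_id = os.getenv("CDSW_PROJECT_ID")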

Every RPC method is a one-line delegation to the corresponding domain function. Most pass the request together with the shared resources, (request, self.cml, self.dao):

def ListDatasets(self, request, context):
    return list_datasets(request, self.cml, self.dao)

def StartFineTuningJob(self, request, context):
    return start_fine_tuning_job(request, self.cml, dao=self.dao)

Config and database RPCs omit the cml parameter since they operate on local data only.
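
The delegation pattern can be sketched in isolation as follows. This is a minimal illustration, not the project's code: FakeCml, FakeDao, SketchServicer, and the domain function list_configs are stand-ins invented for the example, and the domain functions return plain dictionaries instead of protobuf responses.

```python
class FakeCml:
    """Stand-in for the cmlapi client (assumption for this sketch)."""


class FakeDao:
    """Stand-in for FineTuningStudioDao (assumption for this sketch)."""


def list_datasets(request, cml, dao):
    # Domain function that receives both shared resources.
    return {"rpc": "ListDatasets", "uses_cml": cml is not None}


def list_configs(request, dao):
    # Hypothetical config domain function: local data only, so no cml argument.
    return {"rpc": "ListConfigs", "uses_cml": False}


class SketchServicer:
    def __init__(self):
        # Shared resources are created once and reused by every RPC.
        self.cml = FakeCml()
        self.dao = FakeDao()

    def ListDatasets(self, request, context):
        return list_datasets(request, self.cml, self.dao)

    def ListConfigs(self, request, context):
        return list_configs(request, self.dao)


servicer = SketchServicer()
print(servicer.ListDatasets(request=None, context=None))
print(servicer.ListConfigs(request=None, context=None))
```

Keeping the servicer this thin means the domain functions can be called directly with a request, a client, and a DAO, without starting a gRPC server.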

Client Wrapper

FineTuningStudioClient in ft/client.py wraps the generated stub with automatic error handling. On construction, it introspects all callable methods on the stub and wraps each one to convert grpc.RpcError into ValueError with cleaned messages.

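import os

import grpc

# Import path assumed from the location of the generated file
from ft.proto.fine_tuning_studio_pb2_grpc import FineTuningStudioStub


class FineTuningStudioClient:
    def __init__(self, server_ip=None, server_port=None):
        # Fall back to the environment variables when no address is given
        if not server_ip:
            server_ip = os.getenv("FINE_TUNING_SERVICE_IP")
        if not server_port:
            server_port = os.getenv("FINE_TUNING_SERVICE_PORT")
        self.channel = grpc.insecure_channel(f"{server_ip}:{server_port}")
        self.stub = FineTuningStudioStub(self.channel)

        # Auto-wrap all stub methods with error handling
        # (_grpc_error_handler is defined on the class; not shown in this excerpt)
        for attr in dir(self.stub):
            if not attr.startswith('_') and callable(getattr(self.stub, attr)):
                setattr(self, attr, self._grpc_error_handler(getattr(self.stub, attr)))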
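
The excerpt does not show the error handler itself. The sketch below is one way such a wrapper could look, under stated assumptions: FakeRpcError stands in for grpc.RpcError so the example runs without grpcio, FakeStub stands in for the generated stub, and "cleaning" the message is assumed to mean keeping the stripped details() string. The real _grpc_error_handler may differ.

```python
import functools


class FakeRpcError(Exception):
    """Stand-in for grpc.RpcError; real call errors also expose details()."""

    def __init__(self, details):
        super().__init__(details)
        self._details = details

    def details(self):
        return self._details


def grpc_error_handler(func, error_type=FakeRpcError):
    """Wrap func so that error_type is re-raised as ValueError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except error_type as err:
            # Assumed cleanup: keep only the trimmed detail string.
            raise ValueError(err.details().strip()) from None
    return wrapper


class FakeStub:
    """Stand-in for the generated stub."""

    def ListDatasets(self, request):
        return ["alpaca"]

    def GetDataset(self, request):
        raise FakeRpcError("  Dataset not found  ")


class SketchClient:
    def __init__(self, stub):
        self.stub = stub
        # Same introspection loop as in the excerpt above.
        for attr in dir(stub):
            if not attr.startswith("_") and callable(getattr(stub, attr)):
                setattr(self, attr, grpc_error_handler(getattr(stub, attr)))


client = SketchClient(FakeStub())
print(client.ListDatasets(None))
try:
    client.GetDataset(None)
except ValueError as exc:
    print(f"caught: {exc}")
```

With this design, callers handle a plain ValueError and do not need to import grpc.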

Convenience Methods

The client provides shorthand accessors that construct the request internally:

| Method | Returns | Equivalent RPC |
| --- | --- | --- |
| get_datasets() | List[DatasetMetadata] | ListDatasets(ListDatasetsRequest()).datasets |
| get_models() | List[ModelMetadata] | ListModels(ListModelsRequest()).models |
| get_adapters() | List[AdapterMetadata] | ListAdapters(ListAdaptersRequest()).adapters |
| get_prompts() | List[PromptMetadata] | ListPrompts(ListPromptsRequest()).prompts |
| get_fine_tuning_jobs() | List[FineTuningJobMetadata] | ListFineTuningJobs(ListFineTuningJobsRequest()).fine_tuning_jobs |
| get_evaluation_jobs() | List[EvaluationJobMetadata] | ListEvaluationJobs(ListEvaluationJobsRequest()).evaluation_jobs |
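
To show how one of these accessors maps onto its RPC, here is a minimal sketch. The dataclasses are stand-ins for the generated protobuf messages, ConvenienceSketch is a hypothetical class, and its ListDatasets returns canned data instead of calling a server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ListDatasetsRequest:
    """Stand-in for the generated request message (takes no fields here)."""


@dataclass
class ListDatasetsResponse:
    """Stand-in for the generated response message."""
    datasets: List[str] = field(default_factory=list)


class ConvenienceSketch:
    def ListDatasets(self, request):
        # Pretend RPC: a real client would send the request over the channel.
        return ListDatasetsResponse(datasets=["alpaca", "dolly"])

    def get_datasets(self):
        # Build the empty request internally and unwrap the repeated field.
        return self.ListDatasets(ListDatasetsRequest()).datasets


client = ConvenienceSketch()
print(client.get_datasets())
```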

Usage Example

from ft.client import FineTuningStudioClient
from ft.api import *

client = FineTuningStudioClient()

# List all datasets
datasets = client.get_datasets()

# Add a HuggingFace dataset
client.AddDataset(AddDatasetRequest(
    type="huggingface",
    huggingface_name="tatsu-lab/alpaca",
    name="Alpaca"
))

# Start a fine-tuning job
client.StartFineTuningJob(StartFineTuningJobRequest(
    base_model_id="model-uuid",
    dataset_id="dataset-uuid",
    prompt_id="prompt-uuid",
    adapter_name="my-adapter",
    num_cpu=2,
    num_gpu=1,
    num_memory=16,
    framework_type="legacy"
))

All request and response types are importable from ft.api, which re-exports the generated protobuf classes.

Protobuf Regeneration

After modifying ft/proto/fine_tuning_studio.proto, regenerate the Python bindings:

./bin/generate-proto-python.sh

This produces ft/proto/fine_tuning_studio_pb2.py (message classes) and ft/proto/fine_tuning_studio_pb2_grpc.py (stub and servicer base class). Both are checked into the repository. Do not edit them by hand.

Server Startup

The gRPC server is started by bin/start-grpc-server.py:

  1. Creates a grpc.server with ThreadPoolExecutor(max_workers=10).
  2. Registers FineTuningStudioApp() as the servicer.
  3. Binds to [::]:50051 (all interfaces).
  4. Updates CML project environment variables (FINE_TUNING_SERVICE_IP, FINE_TUNING_SERVICE_PORT) via cmlapi so that any workload in the project can locate the server.
  5. Blocks on server.wait_for_termination().

The server process is launched as a background subprocess by bin/start-app-script.sh before Streamlit starts. See System Overview for the full initialization sequence.
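
The startup steps listed above can be sketched as follows. This is an approximation under stated assumptions, not the contents of bin/start-grpc-server.py: the function names bind_address and serve are hypothetical, the port is assumed to be insecure because the client uses grpc.insecure_channel, and step 4 (publishing the address through cmlapi) is left out because its exact calls are not shown in this document. The caller passes the servicer and the generated registration function, so the sketch does not import project modules.

```python
from concurrent import futures


def bind_address(port: int = 50051) -> str:
    """Return the address that listens on all interfaces."""
    return f"[::]:{port}"


def serve(servicer, register_servicer, port: int = 50051) -> None:
    """Start a gRPC server and block until it terminates.

    register_servicer is the generated add_*Servicer_to_server function,
    called as register_servicer(servicer, server).
    """
    # Imported here so bind_address stays usable without grpcio installed.
    import grpc

    # Step 1: thread pool with ten workers.
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    # Step 2: attach the servicer implementation.
    register_servicer(servicer, server)
    # Step 3: listen on all interfaces (insecure port is an assumption).
    server.add_insecure_port(bind_address(port))
    server.start()
    # Step 4 (updating project environment variables) is omitted here.
    # Step 5: block until the server shuts down.
    server.wait_for_termination()


print(bind_address())
```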