Prompt Template Specification

A Prompt resource defines a format-string template that maps dataset feature columns into structured training input. Prompts bind a dataset’s column names to named placeholders in the training text, controlling how raw data is presented to the model during fine-tuning and evaluation.

Source: ft/prompts.py, ft/utils.py, ft/jobs.py, ft/db/model.py

Template Fields

| Field | Purpose | Example |
|---|---|---|
| prompt_template | Full prompt format string used during training | "Instruction: {instruction}\nInput: {input}\nOutput: {output}" |
| input_template | Input portion (informational, used in evaluation) | "Instruction: {instruction}\nInput: {input}" |
| completion_template | Expected output portion (informational, used in evaluation) | "Output: {output}" |

Placeholders use Python format-string syntax: {feature_name}. Each placeholder must correspond to a column name in the linked dataset’s features JSON array.
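To make the placeholder rule concrete, here is a minimal sketch that extracts placeholder names with the standard-library string.Formatter and checks them against a dataset's feature list. The helper names placeholder_names and missing_features are hypothetical and do not come from ft/prompts.py; the sample row is invented for illustration.

```python
from string import Formatter


def placeholder_names(template: str) -> list[str]:
    """Return placeholder names in order of first appearance (hypothetical helper)."""
    names: list[str] = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        # field_name is None for trailing literal text and "" for a bare "{}".
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def missing_features(template: str, features: list[str]) -> list[str]:
    """Return placeholders that do not match any dataset column (hypothetical helper)."""
    return [name for name in placeholder_names(template) if name not in features]


prompt_template = "Instruction: {instruction}\nInput: {input}\nOutput: {output}"
features = ["instruction", "input", "output"]

# Every placeholder is backed by a dataset column, so nothing is missing.
assert missing_features(prompt_template, features) == []

# Rendering one (invented) dataset row with str.format.
row = {"instruction": "Translate to French", "input": "Hello", "output": "Bonjour"}
rendered = prompt_template.format(**row)
```

A template such as "Q: {question}" checked against the same feature list would report question as missing, which is the situation the rule above is meant to prevent.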

ORM Schema

class Prompt(Base, MappedProtobuf, MappedDict):
    __tablename__ = "prompts"
    id = Column(String, primary_key=True)               # UUID
    type = Column(String)                                # PromptType enum value
    name = Column(String)                                # Display name (unique)
    description = Column(String)
    dataset_id = Column(String, ForeignKey('datasets.id'))  # Linked dataset FK
    prompt_template = Column(String)                     # Full template
    input_template = Column(String)                      # Input portion
    completion_template = Column(String)                  # Output portion

Import Validation

_validate_add_prompt_request() enforces:

  1. Required fields: id, name, dataset_id, prompt_template, input_template, completion_template must all be present on the PromptMetadata message.
  2. Non-blank name: name.strip() must be non-empty.
  3. Unique name: No existing prompt may share the same name.
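The three checks above could be sketched roughly as follows. This is not the code in ft/prompts.py: PromptFields is a hypothetical stand-in for the PromptMetadata message, "present" is assumed to mean "non-empty string", existing names are passed in as a plain set rather than queried from the database, and the error type is an assumption.

```python
from dataclasses import dataclass

REQUIRED_FIELDS = (
    "id", "name", "dataset_id",
    "prompt_template", "input_template", "completion_template",
)


@dataclass
class PromptFields:
    """Hypothetical stand-in for the PromptMetadata protobuf message."""
    id: str = ""
    name: str = ""
    dataset_id: str = ""
    prompt_template: str = ""
    input_template: str = ""
    completion_template: str = ""


def validate_prompt(prompt: PromptFields, existing_names: set[str]) -> None:
    """Raise ValueError if any of the three documented checks fails."""
    # 1. Required fields (assumption: an empty string counts as missing).
    missing = [f for f in REQUIRED_FIELDS if not getattr(prompt, f)]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    # 2. Non-blank name: whitespace-only names are rejected.
    if not prompt.name.strip():
        raise ValueError("Prompt name must not be blank")
    # 3. Unique name (assumption: compared exactly, without normalization).
    if prompt.name in existing_names:
        raise ValueError(f"A prompt named {prompt.name!r} already exists")
```

Running the checks in this order means a whitespace-only name passes the required-field check but is still rejected by the non-blank check.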

The prompt is created via Prompt.from_message(request.prompt), which uses the MappedProtobuf.from_message() method to map protobuf fields directly to ORM columns.

Auto-Generation from Dataset Columns

ft/utils.py::generate_templates(columns) produces default templates from a list of dataset column names:

  1. Output column detection: Compares column names against a ranked list of 500 common output column names (e.g., answer, response, output, label, target). The column matching the highest-ranked name becomes the output column. If no column matches, the last column is used.

  2. Input columns: All columns except the identified output column.

  3. Prompt template: Generated as:

    You are an LLM responsible for generating a response. Please provide a response given the user input below.
    
    <Column1>: {column1}
    <Column2>: {column2}
    <Output>:
    
  4. Completion template: {output_column}\n

Returns (prompt_template, completion_template).
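A minimal sketch of the four steps, under stated assumptions: the five-entry RANKED_OUTPUT_NAMES list is a hypothetical stand-in for the real ranked list, matching is assumed to be case-insensitive, and labels are assumed to be the capitalized column names (including the final output label). The function name generate_templates_sketch is chosen to avoid implying this is the actual ft/utils.py implementation.

```python
# Hypothetical short list; the real ranked list is described as much longer.
RANKED_OUTPUT_NAMES = ["answer", "response", "output", "label", "target"]

PREAMBLE = (
    "You are an LLM responsible for generating a response. "
    "Please provide a response given the user input below.\n\n"
)


def generate_templates_sketch(columns: list[str]) -> tuple[str, str]:
    """Return (prompt_template, completion_template) for the given columns."""
    if not columns:
        raise ValueError("columns must not be empty")

    # 1. Output column: highest-ranked match wins, else fall back to the last column.
    by_lower = {c.lower(): c for c in columns}  # assumption: case-insensitive match
    output_column = columns[-1]
    for name in RANKED_OUTPUT_NAMES:
        if name in by_lower:
            output_column = by_lower[name]
            break

    # 2. Input columns: everything except the output column.
    input_columns = [c for c in columns if c != output_column]

    # 3. Prompt template: one "<Label>: {column}" line per input, then an open output label.
    lines = [f"<{c.capitalize()}>: {{{c}}}" for c in input_columns]
    lines.append(f"<{output_column.capitalize()}>:")
    prompt_template = PREAMBLE + "\n".join(lines)

    # 4. Completion template: the output placeholder followed by a newline.
    completion_template = f"{{{output_column}}}\n"
    return prompt_template, completion_template


prompt, completion = generate_templates_sketch(["question", "context", "answer"])
```

With these inputs, answer is picked as the output column because it appears in the ranked list; for columns such as text and summary, neither matches and the last column (summary) is used instead.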

Axolotl Auto-Prompt

ft/jobs.py::_add_prompt_for_dataset() generates a prompt automatically when using the Axolotl framework and no prompt is provided:

  1. Load the Axolotl config from the database by axolotl_config_id.
  2. Parse the YAML config and extract the dataset type from config['datasets'][0]['type'].
  3. Query the Config table for the axolotl_dataset_formats config whose description equals dataset_type.
  4. Parse the format config JSON to extract feature column names.
  5. Build a template: <Feature>: {feature}\n for each feature.
  6. Check for an existing prompt with the same dataset_id and prompt_template. If found, return its ID.
  7. Otherwise, create a new Prompt named "AXOLOTL_AUTOGENERATED : {dataset_type}_{dataset_name}".
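Steps 5 through 7 can be illustrated with a small in-memory sketch. It assumes the feature names have already been extracted from the format config, represents stored prompts as plain dicts instead of ORM rows, and capitalizes the feature label; the function names and the returned (id, created) pair are hypothetical and not taken from ft/jobs.py.

```python
import uuid


def build_axolotl_template(features: list[str]) -> str:
    """Step 5: one '<Feature>: {feature}' line per feature."""
    return "".join(f"<{f.capitalize()}>: {{{f}}}\n" for f in features)


def axolotl_prompt_for_dataset(
    features: list[str],
    dataset_id: str,
    dataset_type: str,
    dataset_name: str,
    existing_prompts: list[dict],
) -> tuple[str, bool]:
    """Return (prompt_id, created). Prompts are plain dicts in this sketch."""
    template = build_axolotl_template(features)

    # Step 6: reuse a prompt with the same dataset and the same template.
    for prompt in existing_prompts:
        if prompt["dataset_id"] == dataset_id and prompt["prompt_template"] == template:
            return prompt["id"], False

    # Step 7: otherwise create a new, auto-named prompt.
    new_prompt = {
        "id": str(uuid.uuid4()),
        "name": f"AXOLOTL_AUTOGENERATED : {dataset_type}_{dataset_name}",
        "dataset_id": dataset_id,
        "prompt_template": template,
    }
    existing_prompts.append(new_prompt)
    return new_prompt["id"], True


store: list[dict] = []
first_id, created = axolotl_prompt_for_dataset(
    ["instruction", "input", "output"], "ds-1", "alpaca", "my_dataset", store
)
second_id, created_again = axolotl_prompt_for_dataset(
    ["instruction", "input", "output"], "ds-1", "alpaca", "my_dataset", store
)
```

The second call finds the prompt created by the first one, so repeated jobs against the same dataset and format do not accumulate duplicate auto-generated prompts.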

Removal

remove_prompt() deletes the Prompt record by ID. Prompts are also cascade-deleted when their parent dataset is removed with remove_prompts=True.

Protobuf Message

PromptMetadata fields: id, type, name, description, dataset_id, prompt_template, input_template, completion_template.