Integration Guide

Your model is live. Here's how to wire it into your product in under 5 minutes — no infrastructure changes needed.

Get your API key

Go to Dashboard → API Keys → Create Key. Copy the key immediately — it's displayed only once. All keys start with dst_live_.

Test your model endpoint

Paste your key and model name into the playground below and click Run Test. You'll see the live response and latency. Or use curl:

bash

curl https://api.distillfast.com/v1/chat/completions \
  -H "Authorization: Bearer dst_live_xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "support-bot-v1",
    "messages": [{"role": "user", "content": "Hello, are you working?"}],
    "max_tokens": 64
  }'

Drop into your codebase

Change the base_url in any OpenAI-compatible client. No other changes needed.

python

import os

from openai import OpenAI

# Read the key from the environment rather than hardcoding it.
client = OpenAI(
    base_url="https://api.distillfast.com/v1",
    api_key=os.environ["DISTILLFAST_API_KEY"],
)

user_message = "Hello, are you working?"  # replace with your user's input

resp = client.chat.completions.create(
    model="support-bot-v1",
    messages=[{"role": "user", "content": user_message}],
    max_tokens=256,
    temperature=0.3,
)
reply = resp.choices[0].message.content

javascript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.distillfast.com/v1",
  apiKey: process.env.DISTILLFAST_API_KEY,
});

const userMessage = "Hello, are you working?"; // replace with your user's input

const resp = await client.chat.completions.create({
  model: "support-bot-v1",
  messages: [{ role: "user", content: userMessage }],
  max_tokens: 256,
  temperature: 0.3,
});
const reply = resp.choices[0].message.content;

✅ Tip: Store your API key in an environment variable — never hardcode it in frontend code or commit it to git.
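As a minimal sketch of the tip above, the helper below reads the key from an environment variable and checks the dst_live_ prefix mentioned earlier. The function name load_api_key and the decision to raise on a missing or malformed key are illustrative assumptions, not part of the Distillfast API.

```python
import os


def load_api_key(env=None, var_name="DISTILLFAST_API_KEY"):
    """Return the API key from the environment, or raise if it looks wrong.

    `load_api_key` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    # The guide states that all keys start with dst_live_.
    if not key.startswith("dst_live_"):
        raise RuntimeError(f"{var_name} does not look like a Distillfast key")
    return key
```

Failing at startup with a clear message is usually easier to debug than a rejected request later on.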

API Playground

Paste your API key and model name below to fire a real request at your live endpoint directly from this page.

ℹ️ Note: The request goes from your browser directly to api.distillfast.com. Your API key is sent only with that request and nowhere else.

Distillfast Docs

Distillfast turns your existing examples — support tickets, FAQs, instruction pairs — into a fine-tuned model that runs at 10× lower cost. You upload the data, we handle everything else.

Your model is private, yours to keep, and served on an OpenAI-compatible API endpoint the moment training completes.

Upload

50+ JSONL examples from your domain

Train

We generate & fine-tune overnight

Deploy

OpenAI-compatible API endpoint live

How It Works

When you create a project, Distillfast runs a 4-stage pipeline automatically:

Data synthesis

Your seed examples are sent to a teacher model (Claude) which generates thousands of high-quality variations. Every generated example is quality-scored and deduplicated before use.

Fine-tuning

The synthetic dataset is used to fine-tune an open-weight base model (Llama, Mistral, or Phi-3) using parameter-efficient techniques. This is what makes inference 10× cheaper — a small domain-specific model beats a large general one on your use case.

Deployment

The trained model is loaded onto a GPU inference server and an OpenAI-compatible endpoint is provisioned. Your API key gates access.

DistillScore™ evaluation

Optionally benchmark your model against 10 real-world Indian SaaS scenarios. You get a score out of 100 and a comparison against the GPT-3.5 baseline (71).
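This guide does not describe how the quality scoring or deduplication in the data synthesis stage is implemented. Purely to illustrate what deduplication means here, the sketch below drops examples that match an earlier one after normalising case and whitespace. The function deduplicate and its normalisation rule are assumptions for illustration; the platform's actual method may be quite different.

```python
def deduplicate(examples):
    """Drop examples whose text matches an earlier one after normalisation.

    Hypothetical illustration only; not the platform's real algorithm.
    """
    seen = set()
    unique = []
    for text in examples:
        # Normalise case and whitespace so trivial variants collapse together.
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


generated = ["Refund kab milega?", "refund  kab milega?", "GST invoice chahiye"]
unique_examples = deduplicate(generated)
```

The same idea can be applied to your own seed file before upload so that near-identical examples do not crowd out more diverse ones.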

File Format

Distillfast accepts JSONL files — one valid JSON object per line, no trailing commas, UTF-8 encoded.

ℹ️ Note: Minimum 50 examples; 200+ recommended for best quality. More diverse examples generally produce a higher DistillScore™.

Format	Use case	Required fields
Q&A	Chatbots, support bots	messages[]
Instruction	Tasks with clear I/O	instruction, output
Classification	Routing, tagging, labelling	text, label
Completion	Text continuation, templating	prompt, completion
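Checking a file locally before upload can catch most format problems early. The sketch below applies the rules stated in this section: one JSON object per line, the required fields from the table above, and at least 50 examples. The function validate_jsonl, the format keys ("qa", "instruction", and so on), and the choice to skip blank lines are assumptions for illustration; the platform's own validator may behave differently.

```python
import json

# Required fields per format, taken from the table above.
REQUIRED_FIELDS = {
    "qa": ("messages",),
    "instruction": ("instruction", "output"),
    "classification": ("text", "label"),
    "completion": ("prompt", "completion"),
}
MIN_EXAMPLES = 50  # minimum stated in this guide


def validate_jsonl(lines, data_format, min_examples=MIN_EXAMPLES):
    """Return a list of human-readable problems; an empty list means OK."""
    required = REQUIRED_FIELDS[data_format]
    problems = []
    count = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped in this sketch
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        missing = [field for field in required if field not in record]
        if missing:
            problems.append(f"line {number}: missing {', '.join(missing)}")
            continue
        count += 1
    if count < min_examples:
        problems.append(f"only {count} valid examples; need at least {min_examples}")
    return problems
```

To use it, open the file with open(path, encoding="utf-8") and pass the file object as lines; opening with an explicit UTF-8 encoding also surfaces encoding errors before upload.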

Q&A Format (messages[])

Best for customer support bots, internal helpdesks, and conversational assistants. Uses the same message schema as OpenAI Chat.

jsonl

{"messages": [{"role": "user", "content": "Mera refund 7 din ho gaye, kab milega?"}, {"role": "assistant", "content": "Apologies for the delay. Refunds take 5-7 business days. I've escalated your case — you should receive it within 24 hours."}]}
{"messages": [{"role": "user", "content": "GST invoice chahiye last month ka"}, {"role": "assistant", "content": "Go to Settings → Billing → Invoice History, select the month and click Download PDF."}]}
{"messages": [{"role": "user", "content": "How do I add a team member?"}, {"role": "assistant", "content": "Go to Settings → Team → Invite Member. Enter their email address and choose their role. They'll receive an invite link valid for 48 hours."}]}

✅ Tip: Mix English and Hinglish examples if your users write in both. The model learns the distribution of your actual users.
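If your examples live in a spreadsheet or helpdesk export, they need to be converted into this shape. The following is a minimal sketch that turns question/answer pairs into Q&A-format lines; the helper name qa_to_jsonl_line and the sample pair are illustrative.

```python
import json


def qa_to_jsonl_line(question, answer):
    """Serialise one question/answer pair as a Q&A-format JSONL line."""
    record = {
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }
    # ensure_ascii=False keeps characters such as → and ₹ readable in the UTF-8 file.
    return json.dumps(record, ensure_ascii=False)


pairs = [
    ("How do I add a team member?", "Go to Settings → Team → Invite Member."),
]
lines = [qa_to_jsonl_line(question, answer) for question, answer in pairs]
```

Write the result with open(path, "w", encoding="utf-8") and one line per record. json.dumps escapes any newlines inside the text, so each record stays on a single line.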

Instruction / Output Format (instruction + output)

Use this for tasks where there is a clear instruction and a correct output — summarisation, rewriting, extraction, or structured generation.

jsonl

{"instruction": "Classify this support ticket by department: 'My Razorpay payment failed during checkout'", "output": "billing"}
{"instruction": "Write a one-line auto-reply for this ticket: 'I cannot log into my account after the password reset'", "output": "Thanks for reaching out! We've received your request and our team will respond within 2 hours."}
{"instruction": "Extract the product name from this complaint: 'Your mobile app crashes whenever I try to export a PDF report'", "output": "Mobile App — PDF Export"}

Classification Format (text + label)

For routing tickets, tagging emails, or any multi-class labelling task. Each example needs a text field and a label from a fixed set of classes.

jsonl

{"text": "My payment failed three times and I was charged for all of them", "label": "billing"}
{"text": "I cannot access my account after the new update", "label": "access"}
{"text": "Can you add support for NEFT transfers?", "label": "feature_request"}
{"text": "The mobile app crashes on iOS 17 when I open reports", "label": "bug"}
{"text": "Mujhe apna plan downgrade karna hai", "label": "subscription"}

⚠️ Warning: Include at least 10 examples per class. Imbalanced datasets (e.g. 100 billing, 2 bug) produce biased models, so aim for a roughly equal distribution.
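A quick label count shows whether a classification file meets the guidance in the warning above. In this sketch, the 10-per-class minimum comes from this guide, while the function label_report and the max_ratio threshold are assumptions chosen only for illustration.

```python
import json
from collections import Counter

MIN_PER_CLASS = 10  # minimum per class recommended in the warning above


def label_report(lines, max_ratio=3.0):
    """Count labels and flag classes that are too small or badly imbalanced.

    `max_ratio` is an arbitrary illustrative threshold, not a platform rule.
    """
    counts = Counter(json.loads(line)["label"] for line in lines if line.strip())
    if not counts:
        return counts, [], False
    too_small = sorted(label for label, n in counts.items() if n < MIN_PER_CLASS)
    largest, smallest = max(counts.values()), min(counts.values())
    imbalanced = largest / smallest > max_ratio
    return counts, too_small, imbalanced
```

For the "100 billing, 2 bug" case, this reports bug as too small and flags the dataset as imbalanced.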

Completion Format (prompt + completion)

For text continuation — email templates, document drafting, or any task where the model completes a partially-written text.

jsonl

{"prompt": "Subject: Follow-up on your refund request\n\nDear customer,", "completion": " thank you for your patience. We're happy to confirm that your refund of ₹2,499 has been processed and will appear in your account within 3-5 business days."}
{"prompt": "Ticket summary: Customer unable to login after password reset.", "completion": "Root cause: Password reset email expired before use. Resolution: Manually reset password via admin panel. Follow-up: Sent confirmation SMS."}

Create a Project

From the dashboard, go to Projects → New Project. You'll see a setup form with the following fields:

Project name

Identifies your model in the dashboard and API. E.g. support-bot-v1, invoice-classifier.

Data format

Matches the format of your JSONL file — Q&A, Instruction, Classification, or Completion.

Samples to generate

How many synthetic training pairs to generate from your seeds. Start with 500; use 2,000+ for production-grade models.

Model tier

Balanced (7B) is recommended for most use cases. Use Fast (3B) for latency-critical applications.

Training data

Your .jsonl file. Must contain at least 50 examples.

After clicking Create Project, the platform automatically:

Uploads and validates your JSONL file
Generates synthetic training data (quality-filtered and deduplicated)
Fine-tunes the selected model
Deploys the endpoint, after which the status changes to Live

ℹ️ Note: Typical time: 2–4 hours for 500 samples, 6–12 hours for 5,000+. You can close the browser — the pipeline continues on the server.

Project Status

Generating

Synthesizing training data from your seed examples.

Training

Fine-tuning the model. A progress bar shows the percentage complete.

Ready

Training complete but not yet deployed. Click Deploy to go live.

Live

Endpoint is active and accepting requests.

Failed

An error occurred. The error message is shown on the card.
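If you track project state in your own tooling, the five statuses above map naturally onto an enum. This is only a sketch: the guide does not document a status API, so how you obtain the status string is left open, and the names ProjectStatus, can_serve_requests, and needs_user_action are hypothetical.

```python
from enum import Enum


class ProjectStatus(Enum):
    """The five statuses shown on a project card."""

    GENERATING = "Generating"
    TRAINING = "Training"
    READY = "Ready"
    LIVE = "Live"
    FAILED = "Failed"


def can_serve_requests(status):
    """Only a Live project has an active endpoint."""
    return status is ProjectStatus.LIVE


def needs_user_action(status):
    """Ready waits for a Deploy click; Failed needs its error reviewed."""
    return status in (ProjectStatus.READY, ProjectStatus.FAILED)
```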

Authentication

Every request to your model endpoint must include a Bearer token in the Authorization header. Create an API key from Dashboard → API Keys.

bash

# Test your key
curl https://api.distillfast.com/v1/models \
  -H "Authorization: Bearer dst_live_xxxxxxxxxxxx"

⚠️ Warning: API keys are shown only once at creation. Store them in environment variables and never hardcode them in client-side code.
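The same key check can be done from Python with only the standard library. The sketch below builds the request shown in the curl command above. The helper names are illustrative, and it assumes a rejected key is reported as an HTTP error status, which this guide does not specify.

```python
import urllib.error
import urllib.request

BASE_URL = "https://api.distillfast.com/v1"


def build_models_request(api_key):
    """Build the same GET /models request as the curl command above."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def key_is_valid(api_key, timeout=10):
    """Return True if the endpoint answers with HTTP 200 for this key."""
    try:
        with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Assumption: a rejected key surfaces as an HTTP error status.
        return False
```

Network failures such as DNS errors or timeouts are deliberately not caught, so they are not mistaken for an invalid key.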

Chat Completions

Your endpoint is OpenAI-compatible — any library or tool that supports OpenAI's API works with Distillfast by changing the base_url.

bash

curl https://api.distillfast.com/v1/chat/completions \
  -H "Authorization: Bearer dst_live_xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "support-bot-v1",
    "messages": [
      {"role": "user", "content": "Mera refund kab aayega?"}
    ],
    "max_tokens": 256,
    "temperature": 0.3
  }'

json — response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "support-bot-v1",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Apologies for the delay! Refunds typically process in 5-7 business days. I've flagged your account for priority review — please check your bank by tomorrow."
    },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 18, "completion_tokens": 42, "total_tokens": 60 }
}
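When calling the endpoint without an SDK, the fields you usually need are the reply text, the finish reason, and the token usage. The sketch below parses a body shaped like the response above (with the content shortened). The helper summarise_completion is illustrative. Treating a finish_reason of "length" as a truncated reply follows the OpenAI convention and is assumed, not confirmed, to apply here.

```python
import json

# Sample response body, shortened from the example above.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "support-bot-v1",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Apologies for the delay!"},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 18, "completion_tokens": 42, "total_tokens": 60}
}
"""


def summarise_completion(payload):
    """Pull the reply text, truncation flag, and token usage out of a response."""
    choice = payload["choices"][0]
    return {
        "reply": choice["message"]["content"],
        # Assumption: "length" means the reply hit max_tokens, as in OpenAI's API.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": payload["usage"]["total_tokens"],
    }


summary = summarise_completion(json.loads(raw))
print(summary["reply"])
```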

Python SDK

Use the openai Python library — just point it at the Distillfast base URL.

bash

pip install openai

python

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.distillfast.com/v1",
    api_key=os.environ["DISTILLFAST_API_KEY"],  # avoid hardcoding the key
)

response = client.chat.completions.create(
    model="support-bot-v1",
    messages=[
        {"role": "system", "content": "You are a helpful customer support agent for an Indian SaaS company."},
        {"role": "user",   "content": "GST invoice chahiye last month ka"},
    ],
    max_tokens=256,
    temperature=0.3,
)

print(response.choices[0].message.content)

✅ Tip: Set temperature=0.1–0.3 for support bots (more consistent, factual answers) and 0.7–0.9 for creative or generative tasks.
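One way to keep these settings consistent across a codebase is a small helper that builds the request arguments. This is a sketch only: the preset names, the specific values picked from within the suggested ranges, and the function completion_params are assumptions for illustration.

```python
# Illustrative presets picked from within the ranges in the tip above.
TEMPERATURE_PRESETS = {
    "support": 0.2,   # within the suggested 0.1–0.3 range
    "creative": 0.8,  # within the suggested 0.7–0.9 range
}


def completion_params(model, user_message, task="support", max_tokens=256):
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        "temperature": TEMPERATURE_PRESETS[task],
    }


params = completion_params("support-bot-v1", "GST invoice chahiye last month ka")
```

The result can be passed to the client as client.chat.completions.create(**params).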

What Is DistillScore™?

DistillScore™ is Distillfast's proprietary benchmark for Indian SaaS support quality. It evaluates your model on 10 handcrafted scenarios covering the most common support situations: Hinglish refund queries, GST invoice requests, billing disputes, Razorpay failures, account lockouts, and feature requests.

Accuracy

Does the response contain correct, factual information?

Fluency

Is the language natural, clear, and grammatically correct?

Relevance

Does the response directly address what was asked?

Helpfulness

Does the customer have a clear next step after reading the reply?

Each dimension is scored 0–25. Total score is out of 100. The GPT-3.5 baseline on this benchmark is 71. A score above 80 indicates a production-ready model for Indian SaaS support.
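The arithmetic can be sketched as follows, assuming the total is the plain sum of the four dimension scores (which the 4 × 25 = 100 scale suggests) and that scores are whole numbers. The function summarise_score and the sample scores are made up for illustration.

```python
GPT35_BASELINE = 71        # baseline quoted in this guide
PRODUCTION_THRESHOLD = 80  # a score above 80 is described as production-ready
DIMENSIONS = ("accuracy", "fluency", "relevance", "helpfulness")


def summarise_score(scores):
    """Sum the four 0–25 dimension scores and compare with the baseline."""
    for name in DIMENSIONS:
        if not 0 <= scores[name] <= 25:
            raise ValueError(f"{name} must be between 0 and 25")
    total = sum(scores[name] for name in DIMENSIONS)
    delta = total - GPT35_BASELINE
    return {
        "total": total,
        "delta": f"{delta:+d} vs GPT-3.5",
        "production_ready": total > PRODUCTION_THRESHOLD,
    }


# Hypothetical scores: 22 + 21 + 20 + 20 = 83, which is 12 above the baseline.
result = summarise_score(
    {"accuracy": 22, "fluency": 21, "relevance": 20, "helpfulness": 20}
)
```

A total of exactly 80 does not count as production-ready here, because the guide says "above 80".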

Running the Benchmark

From your Projects page, click a Live project card to expand it, then click DistillScore™ → Run Benchmark. The evaluation takes 1–2 minutes. Results show:

Aggregate score (0–100)
Score delta vs GPT-3.5 baseline (e.g. +12 vs GPT-3.5)
Per-dimension breakdown: Accuracy, Fluency, Relevance, Helpfulness
Pass/fail count across the 10 test scenarios

✅ Tip: Run DistillScore™ after every new version of your model to track quality over time. A higher score directly correlates with fewer escalations and higher CSAT.