Getting Started
DisTillux API Reference
The DisTillux REST API lets you programmatically submit distillation jobs, poll pipeline status, download artifacts, and configure continuous calibration, all without touching the dashboard.
Base URL: All API requests go to https://api.distillux.fluxcybers.polsia.app/v3
Quick Start: 3 steps
Step 1: Submit a distillation job
curl -X POST https://api.distillux.fluxcybers.polsia.app/v3/jobs \
-H "Authorization: Bearer dx_prod_sk_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model_url": "https://your-bucket.s3.amazonaws.com/llama-3-8b.safetensors",
"target_params": "1B",
"quality_mode": "balanced",
"output_formats": ["onnx", "tensorrt", "gguf_q4", "safetensors"]
}'
Step 2: Poll for completion
curl https://api.distillux.fluxcybers.polsia.app/v3/jobs/job_abc123 \
-H "Authorization: Bearer dx_prod_sk_live_YOUR_KEY"
Step 3: Download artifacts
curl -L https://api.distillux.fluxcybers.polsia.app/v3/jobs/job_abc123/artifacts/onnx/download \
-H "Authorization: Bearer dx_prod_sk_live_YOUR_KEY" \
--output model_distilled.onnx
Authentication
All API requests must include your API key in the Authorization header using Bearer token format.
Authorization: Bearer dx_prod_sk_live_eXFPkB7yxqA3mNc9wLZK...
Obtain your API key from the DisTillux Dashboard → API Keys. Keys are prefixed with dx_prod_ (production) or dx_test_ (test/sandbox).
Security: Never expose API keys in client-side code or commit them to version control. Use environment variables or a secrets manager.
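As a minimal sketch of the advice above, the helper below reads the key from an environment variable and builds the request headers. The variable name DISTILLUX_API_KEY matches the one used in the Node.js and cURL examples in this reference; the function name auth_headers is hypothetical and not part of any SDK.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the DISTILLUX_API_KEY environment variable."""
    key = env.get("DISTILLUX_API_KEY", "")
    # Documented key prefixes: dx_prod_ (production) and dx_test_ (test/sandbox).
    if not key.startswith(("dx_prod_", "dx_test_")):
        raise ValueError("DISTILLUX_API_KEY is missing or has an unexpected prefix")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Checking the prefix early catches an unset or mistyped variable before any request is sent.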
Error Handling
DisTillux uses standard HTTP status codes. Error responses include a JSON body with error and message fields; the example below also carries status and docs_url.
{
  "error": "model_too_large",
  "message": "Model exceeds 7B parameter limit on Starter tier. Upgrade to Pro.",
  "status": 422,
  "docs_url": "https://docs.distillux.fluxcybers.polsia.app/errors/model_too_large"
}
Code  Meaning
400   Bad request: check required parameters
401   Invalid or missing API key
403   Action not permitted on your tier
404   Job or artifact not found
409   Conflict: duplicate job for same model hash
422   Validation error: see message
429   Rate limit exceeded: see Retry-After header
500   Internal server error: contact support
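The status table and error body can be combined into a small client-side helper. The sketch below is an illustration only: the function name describe_error and the choice to treat just 429 and 500 as retryable are assumptions, not documented DisTillux behaviour.

```python
import json

# Assumption: only rate limiting and server errors are worth retrying automatically.
RETRYABLE = {429, 500}


def describe_error(status, body_text):
    """Return (retryable, summary) for an error response."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    code = body.get("error", "unknown_error")
    message = body.get("message", "")
    summary = f"{status} {code}"
    if message:
        summary += f": {message}"
    return status in RETRYABLE, summary
```

Falling back to "unknown_error" keeps the helper usable when a proxy or gateway returns a non-JSON body.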
Rate Limits
Tier        Concurrent Jobs  API Requests/min  Max Model Size
Starter     1                60                7B parameters
Pro         3                300               70B parameters
Enterprise  Unlimited        Custom            Unlimited
Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
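A client can use these headers to pace itself. The following is a sketch under stated assumptions: header names are looked up exactly as written above, Retry-After is read as a number of seconds when it parses as one, and exponential backoff is used otherwise. The function names, the backoff constants, and the reserve parameter are hypothetical.

```python
def retry_delay_seconds(headers, attempt, base=1.0, cap=60.0):
    """Pick a wait time (in seconds) after a 429 response."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            # Retry-After may also be an HTTP date; fall back to backoff here.
            pass
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * (2 ** attempt))


def should_throttle(headers, reserve=1):
    """True when the remaining request budget is at or below `reserve`."""
    remaining = headers.get("X-RateLimit-Remaining")
    return remaining is not None and int(remaining) <= reserve
```

The format of X-RateLimit-Reset is not stated in this reference, so the sketch deliberately does not interpret it.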
Core API
Create Distillation Job
POST /jobs
Submits a new distillation job. The model file must be reachable through a pre-signed URL; alternatively, upload it through the upload endpoint first and pass the resulting URL.
Request Body
Parameter                 Type    Description
model_url (required)      string  Pre-signed URL or DisTillux upload URL for the model file
target_params (required)  string  Target parameter count: "1B", "1.5B", "3B", "7B", or "custom"
quality_mode              string  "balanced" (default), "max_compression", "max_quality", "speed"
output_formats            array   Array of format strings. See Output Formats. Default: ["safetensors","onnx"]
quantization              string  "int4", "int8", "fp8", "mixed". Default: "int4"
calibration_data_url      string  Optional URL to a JSONL calibration dataset (Pro/Enterprise only)
accuracy_threshold        number  Max acceptable accuracy delta (e.g., -0.01 for -1%). Default: -0.02
webhook_url               string  URL to POST completion events to. See Webhooks
metadata                  object  Custom key-value pairs attached to this job (returned in all responses)
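To catch typos before a request is sent, the enumerated values in the table can be checked client-side. This is a sketch, not SDK code: build_job_payload is a hypothetical helper, and the server remains the authority on what is valid.

```python
# Allowed values copied from the parameter table above.
TARGET_PARAMS = {"1B", "1.5B", "3B", "7B", "custom"}
QUALITY_MODES = {"balanced", "max_compression", "max_quality", "speed"}
QUANTIZATIONS = {"int4", "int8", "fp8", "mixed"}


def build_job_payload(model_url, target_params, **optional):
    """Return a JSON-ready dict for POST /jobs, omitting unset optional fields."""
    if target_params not in TARGET_PARAMS:
        raise ValueError(f"unsupported target_params: {target_params!r}")
    quality_mode = optional.get("quality_mode")
    if quality_mode is not None and quality_mode not in QUALITY_MODES:
        raise ValueError(f"unsupported quality_mode: {quality_mode!r}")
    quantization = optional.get("quantization")
    if quantization is not None and quantization not in QUANTIZATIONS:
        raise ValueError(f"unsupported quantization: {quantization!r}")
    payload = {"model_url": model_url, "target_params": target_params}
    # Leave out None values so the documented server-side defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Omitting unset fields, rather than sending nulls, keeps the request body aligned with the defaults listed in the table.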
Response
{
  "job_id": "job_kR7mNqPvXcW9zA",
  "status": "queued",
  "created_at": "2026-01-22T14:30:00Z",
  "estimated_completion": "2026-01-22T18:42:00Z",
  "estimated_cost_usd": 1499.00,
  "tier": "pro",
  "pipeline_url": "wss://stream.distillux.fluxcybers.polsia.app/jobs/job_kR7mNqPvXcW9zA"
}
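The two timestamps in the response are enough to derive an expected run time. The sketch below assumes both fields are ISO 8601 strings in UTC with a trailing "Z", as in the example; parse_ts and estimated_minutes are hypothetical helper names.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2026-01-22T14:30:00Z."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def estimated_minutes(job):
    """Minutes between created_at and estimated_completion."""
    delta = parse_ts(job["estimated_completion"]) - parse_ts(job["created_at"])
    return delta.total_seconds() / 60
```

For the example response above this gives 252 minutes (14:30 to 18:42).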
Get Job Status
GET /jobs/{job_id}
Get pipeline status and progress
Returns the current status of a distillation job, including per-layer progress and projected metrics.
{
  "job_id": "job_kR7mNqPvXcW9zA",
  "status": "running",
  "current_layer": 2,
  "layer_name": "distill",
  "progress_pct": 38.4,
  "layers": {
    "analyze": { "status": "done", "duration_min": 42 },
    "distill": { "status": "running", "progress": 0.38 },
    "refine": { "status": "queued" },
    "optimize": { "status": "queued" },
    "deliver": { "status": "queued" },
    "evolve": { "status": "queued" }
  },
  "eta_minutes": 194,
  "projected_accuracy_delta": -0.0021,
  "projected_compression_ratio": 8.7
}
For real-time updates, connect to the WebSocket stream at the pipeline_url returned when creating the job. See WebSocket Streaming.
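When polling rather than streaming, a status response like the one above can be reduced to a single log line plus a stop flag. The sketch assumes that done, failed, and cancelled are the terminal statuses, which is inferred from the status filter values of the List Jobs endpoint; summarize is a hypothetical helper.

```python
# Assumption: these statuses mean the job will not change any further.
TERMINAL = {"done", "failed", "cancelled"}


def summarize(status):
    """Return (is_terminal, one_line_summary) for a job status response."""
    layers = status.get("layers", {})
    finished = sum(1 for info in layers.values() if info.get("status") == "done")
    text = (
        f"{status['status']}: {status.get('layer_name', '?')} "
        f"{status.get('progress_pct', 0.0):.1f}% "
        f"({finished}/{len(layers)} layers done)"
    )
    return status["status"] in TERMINAL, text
```

A polling loop can stop as soon as the first element of the returned tuple is true.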
List Jobs
GET /jobs
List all jobs with optional filters
Query Parameters
Parameter  Type     Description
status     string   Filter by status: queued, running, done, failed, cancelled
limit      integer  Results per page (max 100, default 20)
cursor     string   Pagination cursor from previous response
since      string   ISO 8601 timestamp; returns jobs created after this time
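Cursor pagination over this endpoint might look like the sketch below. The shape of the list response is not documented here, so the field names "jobs" and "next_cursor" are assumptions, as are the helper names; fetch_page is a caller-supplied function that performs the HTTP GET and returns the decoded JSON body.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.distillux.fluxcybers.polsia.app/v3"


def list_jobs_url(status=None, limit=20, cursor=None, since=None):
    """Build the GET /jobs URL from the documented query parameters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"status": status, "limit": limit, "cursor": cursor, "since": since}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/jobs?{query}"


def iter_jobs(fetch_page, **filters):
    """Yield jobs page by page until no further cursor is returned."""
    cursor = None
    while True:
        page = fetch_page(list_jobs_url(cursor=cursor, **filters))
        yield from page.get("jobs", [])        # assumed field name
        cursor = page.get("next_cursor")       # assumed field name
        if not cursor:
            break
```

Passing the fetch function in keeps the pagination logic independent of any particular HTTP client.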
Get Artifacts
GET /jobs/{job_id}/artifacts
List all output artifacts for a completed job
{
  "job_id": "job_kR7mNqPvXcW9zA",
  "artifacts": [
    {
      "format": "onnx",
      "size_bytes": 1621000000,
      "size_human": "1.51 GB",
      "compression_ratio": 8.74,
      "accuracy_delta": -0.0019,
      "benchmark": { "mmlu": 0.998, "truthfulqa": 0.9981, "hellaswag": 0.9974 },
      "checksum_sha256": "a3f92e…d8b1",
      "expires_at": "2026-04-22T14:30:00Z"
    }
  ]
}
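The size_bytes and checksum_sha256 fields make it possible to verify a download. The sketch below assumes checksum_sha256 is the full hex SHA-256 digest of the artifact file (the example above shows it abbreviated); sha256_of and verify_artifact are hypothetical helper names.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, artifact):
    """Compare a downloaded file against one entry of the artifacts list."""
    # Cheap check first: a size mismatch means the download is incomplete.
    if os.path.getsize(path) != artifact["size_bytes"]:
        return False
    return sha256_of(path) == artifact["checksum_sha256"].lower()
```

Checking the size before hashing avoids reading a multi-gigabyte file that is already known to be truncated.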
GET /jobs/{job_id}/artifacts/{format}/download
Download a specific artifact. Returns a 302 redirect to a pre-signed download URL (1-hour expiry). Use -L with cURL to follow redirects.
SDK Examples
Python SDK
Install: pip install distillux
Basic
from distillux import DisTillux

client = DisTillux(api_key="dx_prod_sk_live_YOUR_KEY")

# Submit distillation job
job = client.jobs.create(
    model_url="https://example.com/llama-3-8b.safetensors",
    target_params="1B",
    quality_mode="balanced",
    output_formats=["onnx", "tensorrt", "gguf_q4"],
    accuracy_threshold=-0.01,
)
print(f"Job created: {job.job_id}")
print(f"ETA: {job.estimated_completion}")
print(f"Est. cost: ${job.estimated_cost_usd:.2f}")
Async
import asyncio
from distillux import AsyncDisTillux


async def distill_model():
    client = AsyncDisTillux(api_key="dx_prod_sk_live_YOUR_KEY")
    # Stream pipeline events for an existing job until it completes
    async with client.jobs.stream("job_abc123") as stream:
        async for event in stream:
            if event.type == "layer_progress":
                print(f"Layer {event.layer}: {event.progress:.1%}")
            elif event.type == "job_complete":
                print(f"Done! Accuracy delta: {event.accuracy_delta}")
                break


asyncio.run(distill_model())
With Polling
from distillux import DisTillux
import time

client = DisTillux(api_key="dx_prod_sk_live_YOUR_KEY")
job = client.jobs.create(
    model_url="https://example.com/model.safetensors",
    target_params="1B",
    output_formats=["onnx", "gguf_q4"],
)

# Poll until the job reaches a terminal state (or use webhooks/streaming instead).
# "cancelled" is included so a cancelled job does not loop forever.
while job.status not in ("done", "failed", "cancelled"):
    time.sleep(60)
    job = client.jobs.get(job.job_id)
    print(f"Status: {job.status} | Layer: {job.layer_name} | {job.progress_pct:.1f}%")

if job.status == "done":
    artifacts = client.artifacts.list(job.job_id)
    client.artifacts.download(job.job_id, "onnx", path="./model_distilled.onnx")
    print(f"Downloaded! Size: {artifacts[0].size_human}")
Node.js SDK
Install: npm install @fluxcybers/distillux
ESM
import { DisTillux } from '@fluxcybers/distillux';

const client = new DisTillux({ apiKey: process.env.DISTILLUX_API_KEY });

const job = await client.jobs.create({
  modelUrl: 'https://example.com/llama-3-8b.safetensors',
  targetParams: '1B',
  qualityMode: 'balanced',
  outputFormats: ['onnx', 'tensorrt', 'gguf_q4'],
});
console.log(`Job: ${job.jobId} | ETA: ${job.estimatedCompletion}`);

// Wait for completion
const completed = await client.jobs.waitForCompletion(job.jobId, {
  pollInterval: 60_000,
  onProgress: (status) => console.log(`${status.layerName}: ${status.progressPct}%`),
});
await client.artifacts.download(completed.jobId, 'onnx', './distilled.onnx');
CommonJS
const { DisTillux } = require('@fluxcybers/distillux');

const client = new DisTillux({ apiKey: process.env.DISTILLUX_API_KEY });

client.jobs.create({
  modelUrl: 'https://example.com/model.safetensors',
  targetParams: '1B',
  outputFormats: ['onnx', 'gguf_q4'],
}).then((job) => {
  console.log('Job created:', job.jobId);
});
Stream
import { DisTillux } from '@fluxcybers/distillux';

const client = new DisTillux({ apiKey: process.env.DISTILLUX_API_KEY });

// Stream real-time pipeline events
const stream = await client.jobs.stream('job_abc123');
stream.on('layer_progress', (event) => {
  console.log(`Layer ${event.layer}: ${(event.progress * 100).toFixed(1)}%`);
});
stream.on('job_complete', (event) => {
  console.log(`Done! Compression: ${event.compressionRatio}x`);
  stream.close();
});
Go
Install: go get github.com/fluxcybers/distillux-go
package main

import (
	"context"
	"fmt"

	distillux "github.com/fluxcybers/distillux-go"
)

func main() {
	client := distillux.New(distillux.Config{
		APIKey: "dx_prod_sk_live_YOUR_KEY",
	})

	job, err := client.Jobs.Create(context.Background(), distillux.CreateJobParams{
		ModelURL:      "https://example.com/model.safetensors",
		TargetParams:  "1B",
		QualityMode:   "balanced",
		OutputFormats: []string{"onnx", "gguf_q4"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("Job: %s | ETA: %s\n", job.JobID, job.EstimatedCompletion)
}
cURL Examples
Create Job
curl -X POST https://api.distillux.fluxcybers.polsia.app/v3/jobs \
  -H "Authorization: Bearer $DISTILLUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model_url": "https://example.com/llama-3-8b.safetensors",
  "target_params": "1B",
  "quality_mode": "balanced",
  "output_formats": ["onnx", "tensorrt", "gguf_q4"],
  "accuracy_threshold": -0.01,
  "webhook_url": "https://your-server.com/webhooks/distillux"
}
EOF
Get Status
curl https://api.distillux.fluxcybers.polsia.app/v3/jobs/job_abc123 \
-H "Authorization: Bearer $DISTILLUX_API_KEY" | jq .
Download
# Download the ONNX artifact (follows 302 redirect)
curl -L \
-H "Authorization: Bearer $DISTILLUX_API_KEY" \
https://api.distillux.fluxcybers.polsia.app/v3/jobs/job_abc123/artifacts/onnx/download \
--output model_distilled.onnx
Guides
CI/CD Integration
Use the official GitHub Action to trigger distillation on every model push and gate deployments on accuracy thresholds.
# .github/workflows/distill.yml
name: Distill on Model Push
on:
  push:
    paths: ['models/**']
jobs:
  distill:
    runs-on: ubuntu-latest
    steps:
      - uses: fluxcybers/distillux-action@v2
        with:
          api-key: ${{ secrets.DISTILLUX_API_KEY }}
          model-path: models/llama-3-8b.safetensors
          target-params: 1B
          output-formats: onnx,tensorrt,gguf_q4
          accuracy-threshold: -0.01
          fail-on-threshold: true  # Fail CI if accuracy drops too much
      - name: Deploy distilled model
        run: |
          echo "Artifact URLs available in $DISTILLUX_ARTIFACT_URLS"
          # Your deploy step here
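If you gate deployments in your own script rather than through the action, the threshold comparison is easy to get backwards because both numbers are negative. The sketch below assumes a job passes when its accuracy delta is greater than or equal to the threshold (for example, -0.0019 passes against -0.01); the function names and the inclusive boundary are assumptions, and the default of -0.02 is taken from the accuracy_threshold parameter description.

```python
def passes_accuracy_gate(accuracy_delta, threshold=-0.02):
    """True when the measured delta is no worse than the allowed delta."""
    # Both values are typically negative: -0.0019 is a smaller loss than -0.01.
    return accuracy_delta >= threshold


def gate_exit_code(accuracy_delta, threshold=-0.02):
    """Process exit code for a CI step: 0 to continue, 1 to fail the build."""
    return 0 if passes_accuracy_gate(accuracy_delta, threshold) else 1
```

A CI step could pass the returned code to sys.exit so that a regression beyond the threshold stops the deploy.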
Webhooks
DisTillux POSTs JSON events to your webhook_url when job status changes. Verify the signature using the X-DisTillux-Signature header.
Example webhook payload (job_complete event):
{
  "event": "job_complete",
  "job_id": "job_kR7mNqPvXcW9zA",
  "timestamp": "2026-01-22T18:42:00Z",
  "data": {
    "compression_ratio": 8.74,
    "accuracy_delta": -0.0019,
    "artifacts": ["onnx", "tensorrt", "gguf_q4"]
  }
}
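This reference names the X-DisTillux-Signature header but does not specify how the signature is computed. The sketch below assumes a common scheme, a hex-encoded HMAC-SHA256 of the raw request body keyed with a shared webhook secret; treat the scheme, the secret, and the function name verify_signature as assumptions and confirm the actual algorithm before relying on it.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Check X-DisTillux-Signature, ASSUMING hex HMAC-SHA256 over the raw body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = header_value.strip().lower()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

Verification should run on the raw bytes as received, before any JSON parsing, because re-serialising the payload can change whitespace and key order.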