
DisTillux API Reference

The DisTillux REST API lets you programmatically submit distillation jobs, poll pipeline status, download artifacts, and configure continuous calibration, all without touching the dashboard.

Base URL: All API requests go to https://api.distillux.fluxcybers.polsia.app/v3

Quick Start: 3 steps

Step 1: Submit a distillation job

curl -X POST https://api.distillux.fluxcybers.polsia.app/v3/jobs \
  -H "Authorization: Bearer dx_prod_sk_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_url": "https://your-bucket.s3.amazonaws.com/llama-3-8b.safetensors",
    "target_params": "1B",
    "quality_mode": "balanced",
    "output_formats": ["onnx", "tensorrt", "gguf_q4", "safetensors"]
  }'

Step 2: Poll for completion

curl https://api.distillux.fluxcybers.polsia.app/v3/jobs/job_abc123 \
  -H "Authorization: Bearer dx_prod_sk_live_YOUR_KEY"

Step 3: Download artifacts

curl -L https://api.distillux.fluxcybers.polsia.app/v3/jobs/job_abc123/artifacts/onnx/download \
  -H "Authorization: Bearer dx_prod_sk_live_YOUR_KEY" \
  --output model_distilled.onnx

Authentication

All API requests must include your API key in the Authorization header using Bearer token format.

Authorization: Bearer dx_prod_sk_live_eXFPkB7yxqA3mNc9wLZK...

Obtain your API key from the DisTillux Dashboard → API Keys. Keys are prefixed with dx_prod_ (production) or dx_test_ (test/sandbox).

Security: Never expose API keys in client-side code or commit them to version control. Use environment variables or a secrets manager.
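
The advice above can be sketched in Python as follows. This is a minimal illustration, assuming the key lives in an environment variable named DISTILLUX_API_KEY (the name used by the cURL and Node.js examples in this reference); the helper name auth_headers is hypothetical and not part of any SDK.

```python
import os


def auth_headers(env=None):
    """Build request headers from an API key held in the environment.

    DISTILLUX_API_KEY is an assumed variable name; use whatever your
    deployment or secrets manager exposes.
    """
    env = os.environ if env is None else env
    key = env.get("DISTILLUX_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DISTILLUX_API_KEY is not set")
    # Documented key prefixes: dx_prod_ (production) and dx_test_ (sandbox).
    if not key.startswith(("dx_prod_", "dx_test_")):
        raise ValueError("API key does not start with dx_prod_ or dx_test_")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing fast on a missing or malformed key keeps the problem out of a later 401 response, where it is harder to tell a typo from a revoked key.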

Error Handling

DisTillux uses standard HTTP status codes. Error responses carry a JSON body with error and message fields; the example below also includes status and docs_url.

{
  "error": "model_too_large",
  "message": "Model exceeds 7B parameter limit on Starter tier. Upgrade to Pro.",
  "status": 422,
  "docs_url": "https://docs.distillux.fluxcybers.polsia.app/errors/model_too_large"
}

Code | Meaning
400 | Bad request: check required parameters
401 | Invalid or missing API key
403 | Action not permitted on your tier
404 | Job or artifact not found
409 | Conflict: duplicate job for same model hash
422 | Validation error: see message
429 | Rate limit exceeded: see Retry-After header
500 | Internal server error: contact support
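
One way to consume these responses from Python is to wrap them in an exception that carries the documented fields. The sketch below is illustrative only: DistilluxError and parse_error are hypothetical names, and treating 500 as retryable is an assumption, since the table only says to contact support.

```python
import json

# Status codes from the table above where a retry may help.
# Including 500 is an assumption of this sketch.
RETRYABLE = {429, 500}


class DistilluxError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, status, code, message, docs_url=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.docs_url = docs_url

    @property
    def retryable(self):
        return self.status in RETRYABLE


def parse_error(status, body):
    """Turn an HTTP status and raw response body into a DistilluxError."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        data = {}  # Body was missing or not JSON.
    if not isinstance(data, dict):
        data = {}
    return DistilluxError(
        status=status,
        code=data.get("error", "unknown_error"),
        message=data.get("message", ""),
        docs_url=data.get("docs_url"),
    )
```

The HTTP status is passed in separately rather than read from the body, so the function still works if a proxy returns an error page that is not JSON.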

Rate Limits

Tier | Concurrent Jobs | API Requests/min | Max Model Size
Starter | 1 | 60 | 7B parameters
Pro | 3 | 300 | 70B parameters
Enterprise | Unlimited | Custom | Unlimited

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
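
A client that receives a 429 needs to decide how long to wait. The sketch below reads the Retry-After header mentioned in the error table and falls back to capped exponential backoff. The fallback policy and the function name retry_delay are this sketch's choices, not documented DisTillux behaviour, and X-RateLimit-Reset is deliberately not used because its format is not specified here.

```python
import email.utils
import time


def retry_delay(headers, attempt, now=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying a 429 response.

    Prefers Retry-After (delay in seconds or an HTTP date, the two forms
    HTTP allows). Otherwise uses capped exponential backoff.
    """
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        value = value.strip()
        if value.isdigit():
            return float(value)
        try:
            when = email.utils.parsedate_to_datetime(value)
            return max(0.0, when.timestamp() - now)
        except (TypeError, ValueError):
            pass  # Unparseable header: fall through to backoff.
    return min(cap, base * (2 ** attempt))
```

Here attempt counts retries from zero, so with the defaults the fallback waits 1, 2, 4, 8 seconds and so on, up to one minute.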

Create Distillation Job

POST /jobs Submit a new model distillation job

Submits a new distillation job. The model file must be reachable through a pre-signed URL, or uploaded beforehand through the upload endpoint.

Request Body

Parameter | Type | Description
model_url (required) | string | Pre-signed URL or DisTillux upload URL for the model file
target_params (required) | string | Target parameter count: "1B", "1.5B", "3B", "7B", or "custom"
quality_mode | string | "balanced" (default), "max_compression", "max_quality", "speed"
output_formats | array | Array of format strings. See Output Formats. Default: ["safetensors", "onnx"]
quantization | string | "int4", "int8", "fp8", "mixed". Default: "int4"
calibration_data_url | string | Optional URL to a JSONL calibration dataset (Pro/Enterprise only)
accuracy_threshold | number | Max acceptable accuracy delta (e.g., -0.01 for a 1% drop). Default: -0.02
webhook_url | string | URL to POST completion events to. See Webhooks.
metadata | object | Custom key-value pairs attached to this job (returned in all responses)
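
The parameter table can be turned into a small client-side check before a request is sent. The following is a sketch under stated assumptions: build_job_payload is a hypothetical helper, the allowed values are copied from the table, and rejecting a positive accuracy_threshold is an inference from the negative examples rather than a documented rule. The API remains the authority on what is valid.

```python
import json

TARGET_PARAMS = {"1B", "1.5B", "3B", "7B", "custom"}
QUALITY_MODES = {"balanced", "max_compression", "max_quality", "speed"}
QUANTIZATION = {"int4", "int8", "fp8", "mixed"}
OPTIONAL = {
    "quality_mode", "output_formats", "quantization",
    "calibration_data_url", "accuracy_threshold", "webhook_url", "metadata",
}


def build_job_payload(model_url, target_params, **optional):
    """Validate arguments against the table above and return a JSON body."""
    if not model_url:
        raise ValueError("model_url is required")
    if target_params not in TARGET_PARAMS:
        raise ValueError(f"target_params must be one of {sorted(TARGET_PARAMS)}")
    unknown = set(optional) - OPTIONAL
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if optional.get("quality_mode", "balanced") not in QUALITY_MODES:
        raise ValueError("invalid quality_mode")
    if optional.get("quantization", "int4") not in QUANTIZATION:
        raise ValueError("invalid quantization")
    # Documented examples use negative deltas (-0.01 means a 1% drop), so a
    # positive threshold is treated here as a likely mistake (assumption).
    if optional.get("accuracy_threshold", -0.02) > 0:
        raise ValueError("accuracy_threshold should be zero or negative")

    payload = {"model_url": model_url, "target_params": target_params}
    payload.update(optional)
    return json.dumps(payload)
```

Only parameters the caller supplies are serialized, so server-side defaults stay in effect for everything else.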

Response

{
  "job_id": "job_kR7mNqPvXcW9zA",
  "status": "queued",
  "created_at": "2026-01-22T14:30:00Z",
  "estimated_completion": "2026-01-22T18:42:00Z",
  "estimated_cost_usd": 1499.00,
  "tier": "pro",
  "pipeline_url": "wss://stream.distillux.fluxcybers.polsia.app/jobs/job_kR7mNqPvXcW9zA"
}

Get Job Status

GET /jobs/{job_id} Get pipeline status and progress

Returns the current status of a distillation job, including per-layer progress and projected metrics.

{
  "job_id": "job_kR7mNqPvXcW9zA",
  "status": "running",
  "current_layer": 2,
  "layer_name": "distill",
  "progress_pct": 38.4,
  "layers": {
    "analyze": { "status": "done", "duration_min": 42 },
    "distill": { "status": "running", "progress": 0.38 },
    "refine": { "status": "queued" },
    "optimize": { "status": "queued" },
    "deliver": { "status": "queued" },
    "evolve": { "status": "queued" }
  },
  "eta_minutes": 194,
  "projected_accuracy_delta": -0.0021,
  "projected_compression_ratio": 8.7
}

For real-time updates, connect to the WebSocket stream at the pipeline_url returned when you create the job. See WebSocket Streaming.
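
When WebSocket streaming is not an option, plain polling of this endpoint can be sketched as below. The helper wait_for_job is hypothetical, and fetch_status stands for any caller-supplied function that performs GET /jobs/{job_id} and returns the decoded JSON. The set of terminal states is taken from the status filter values of GET /jobs and is an assumption about which states are final.

```python
import time

# Assumed terminal states, from the status filter values of GET /jobs.
TERMINAL = {"done", "failed", "cancelled"}


def wait_for_job(fetch_status, job_id, interval=60.0, timeout=None,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status, then return it.

    fetch_status(job_id) must return the status response as a dict.
    sleep and clock are injectable so the loop can be tested offline.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = fetch_status(job_id)
        if status["status"] in TERMINAL:
            return status
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(
                f"{job_id} still {status['status']} after {timeout}s"
            )
        sleep(interval)
```

A 60-second default interval matches the pollInterval used in the Node.js example and stays well inside the Starter tier's 60 requests per minute.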

List Jobs

GET /jobs List all jobs with optional filters

Query Parameters

Parameter | Type | Description
status | string | Filter by status: queued, running, done, failed, cancelled
limit | integer | Results per page (max 100, default 20)
cursor | string | Pagination cursor from previous response
since | string | ISO 8601 timestamp: return jobs created after this date
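
The query parameters above can be assembled into a request URL as in the following sketch. The function name list_jobs_url is hypothetical, and the lower bound of 1 on limit is an assumption (only the maximum of 100 is documented). The name of the response field that carries the next cursor is not given in this reference, so fetching subsequent pages is left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.distillux.fluxcybers.polsia.app/v3"
STATUSES = {"queued", "running", "done", "failed", "cancelled"}


def list_jobs_url(status=None, limit=20, cursor=None, since=None):
    """Build a GET /jobs URL from the documented query parameters."""
    if status is not None and status not in STATUSES:
        raise ValueError(f"status must be one of {sorted(STATUSES)}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"status": status, "limit": limit, "cursor": cursor, "since": since}
    # Drop unset filters so they are not sent as empty strings.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/jobs?{query}"
```

urlencode percent-encodes the colons in an ISO 8601 timestamp, so a since value can be passed through unchanged.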

Get Artifacts

GET /jobs/{job_id}/artifacts List all output artifacts for a completed job

{
  "job_id": "job_kR7mNqPvXcW9zA",
  "artifacts": [
    {
      "format": "onnx",
      "size_bytes": 1621000000,
      "size_human": "1.51 GB",
      "compression_ratio": 8.74,
      "accuracy_delta": -0.0019,
      "benchmark": { "mmlu": 0.998, "truthfulqa": 0.9981, "hellaswag": 0.9974 },
      "checksum_sha256": "a3f92e…d8b1",
      "expires_at": "2026-04-22T14:30:00Z"
    }
  ]
}

GET /jobs/{job_id}/artifacts/{format}/download Download a specific artifact (302 redirect to pre-signed URL)

Returns a 302 redirect to a pre-signed download URL (1-hour expiry). Use -L with cURL to follow redirects.
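
After downloading, the checksum_sha256 field from the artifact listing can be used to confirm the file arrived intact. This sketch assumes the field is the lowercase or uppercase hex SHA-256 digest of the raw file bytes, which the reference does not state explicitly; sha256_of and verify_artifact are hypothetical helper names.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large artifacts are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True when the file's digest matches checksum_sha256."""
    actual = sha256_of(path).encode()
    expected = expected_sha256.strip().lower().encode()
    return hmac.compare_digest(actual, expected)
```

Chunked reading matters here because artifacts are gigabyte-sized (the example listing shows 1.51 GB).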

Python SDK

Install: pip install distillux

import os

from distillux import DisTillux

# Read the key from the environment instead of hardcoding it
client = DisTillux(api_key=os.environ["DISTILLUX_API_KEY"])

# Submit distillation job
job = client.jobs.create(
    model_url="https://example.com/llama-3-8b.safetensors",
    target_params="1B",
    quality_mode="balanced",
    output_formats=["onnx", "tensorrt", "gguf_q4"],
    accuracy_threshold=-0.01,
)

print(f"Job created: {job.job_id}")
print(f"ETA: {job.estimated_completion}")
print(f"Est. cost: ${job.estimated_cost_usd:.2f}")

Node.js SDK

Install: npm install @fluxcybers/distillux

import { DisTillux } from '@fluxcybers/distillux';

const client = new DisTillux({ apiKey: process.env.DISTILLUX_API_KEY });

const job = await client.jobs.create({
  modelUrl: 'https://example.com/llama-3-8b.safetensors',
  targetParams: '1B',
  qualityMode: 'balanced',
  outputFormats: ['onnx', 'tensorrt', 'gguf_q4'],
});

console.log(`Job: ${job.jobId} | ETA: ${job.estimatedCompletion}`);

// Wait for completion
const completed = await client.jobs.waitForCompletion(job.jobId, {
  pollInterval: 60_000,
  onProgress: (status) => console.log(`${status.layerName}: ${status.progressPct}%`),
});

await client.artifacts.download(completed.jobId, 'onnx', './distilled.onnx');

Go

Install: go get github.com/fluxcybers/distillux-go

package main

import (
	"context"
	"fmt"

	distillux "github.com/fluxcybers/distillux-go"
)

func main() {
	client := distillux.New(distillux.Config{
		APIKey: "dx_prod_sk_live_YOUR_KEY",
	})

	job, err := client.Jobs.Create(context.Background(), distillux.CreateJobParams{
		ModelURL:      "https://example.com/model.safetensors",
		TargetParams:  "1B",
		QualityMode:   "balanced",
		OutputFormats: []string{"onnx", "gguf_q4"},
	})
	if err != nil {
		panic(err)
	}

	fmt.Printf("Job: %s | ETA: %s\n", job.JobID, job.EstimatedCompletion)
}

cURL Examples

curl -X POST https://api.distillux.fluxcybers.polsia.app/v3/jobs \
  -H "Authorization: Bearer $DISTILLUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model_url": "https://example.com/llama-3-8b.safetensors",
  "target_params": "1B",
  "quality_mode": "balanced",
  "output_formats": ["onnx", "tensorrt", "gguf_q4"],
  "accuracy_threshold": -0.01,
  "webhook_url": "https://your-server.com/webhooks/distillux"
}
EOF

CI/CD Integration

Use the official GitHub Action to trigger distillation on every model push and to gate deployments on accuracy thresholds.

# .github/workflows/distill.yml
name: Distill on Model Push
on:
  push:
    paths: ['models/**']
jobs:
  distill:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so model-path below resolves to a real file
      - uses: actions/checkout@v4
      - uses: fluxcybers/distillux-action@v2
        with:
          api-key: ${{ secrets.DISTILLUX_API_KEY }}
          model-path: models/llama-3-8b.safetensors
          target-params: 1B
          output-formats: onnx,tensorrt,gguf_q4
          accuracy-threshold: -0.01
          fail-on-threshold: true  # Fail CI if accuracy drops too much
      - name: Deploy distilled model
        run: |
          echo "Artifact URLs available in $DISTILLUX_ARTIFACT_URLS"
          # Your deploy step here

Webhooks

DisTillux POSTs JSON events to your webhook_url when job status changes. Verify the signature using the X-DisTillux-Signature header.

Example webhook payload (job_complete event):

{
  "event": "job_complete",
  "job_id": "job_kR7mNqPvXcW9zA",
  "timestamp": "2026-01-22T18:42:00Z",
  "data": {
    "compression_ratio": 8.74,
    "accuracy_delta": -0.0019,
    "artifacts": ["onnx", "tensorrt", "gguf_q4"]
  }
}
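
Signature verification on the receiving side might look like the sketch below. The signing scheme is not described in this reference, so the code ASSUMES that X-DisTillux-Signature holds a hex HMAC-SHA256 of the raw request body keyed with a shared webhook secret. Confirm the actual scheme before relying on this; verify_webhook and the secret value are hypothetical.

```python
import hashlib
import hmac
import json


def verify_webhook(raw_body, signature_header, secret):
    """Check a webhook signature and return the decoded event.

    ASSUMPTION: the signature is a hex HMAC-SHA256 of the raw body bytes,
    keyed with a shared webhook secret. This is not confirmed by the docs.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(expected.encode(), received.encode()):
        raise ValueError("webhook signature mismatch")
    return json.loads(raw_body)
```

The signature is computed over the raw bytes as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and break the comparison.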

Output Formats

Format Key | File Type | Best For | Tier
safetensors | .safetensors | HuggingFace / PyTorch inference | All
onnx | .onnx | Cross-platform CPU/GPU inference | All
tensorrt | .trt | NVIDIA GPU (max throughput) | Pro+
gguf_q4 | .gguf | CPU / ARM / Edge (4-bit) | All
gguf_q5 | .gguf | CPU (5-bit, better quality) | Pro+
gguf_q8 | .gguf | CPU (8-bit, near-lossless) | Pro+
coreml | .mlpackage | Apple Silicon / iOS / macOS | Pro+
openvino | .xml + .bin | Intel CPU / iGPU / VPU | Pro+
tflite | .tflite | Android / embedded | Pro+
execflow | .exf | ExecFlow platform native | All
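
The tier column can be checked locally before submitting a job, which avoids a round trip that would presumably end in a 403. The sketch below assumes that "Pro+" means Pro and Enterprise and that tier names are the lowercase strings "starter", "pro", and "enterprise" ("pro" appears in the job creation response; the other two are guesses). unavailable_formats is a hypothetical helper.

```python
# Format keys from the table above. True marks formats listed as "Pro+".
PRO_ONLY = {
    "safetensors": False, "onnx": False, "tensorrt": True,
    "gguf_q4": False, "gguf_q5": True, "gguf_q8": True,
    "coreml": True, "openvino": True, "tflite": True, "execflow": False,
}


def unavailable_formats(requested, tier):
    """Return the requested formats that the given tier cannot produce."""
    unknown = [f for f in requested if f not in PRO_ONLY]
    if unknown:
        raise ValueError(f"unknown format keys: {unknown}")
    if tier in ("pro", "enterprise"):
        return []
    return [f for f in requested if PRO_ONLY[f]]
```

For example, the Quick Start request asks for onnx, tensorrt, gguf_q4, and safetensors; under these assumptions a Starter account would get back ["tensorrt"] from this check.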