Contact
Learn more at distillabs.ai and reach out via [email protected]
What are we building?
What if you could train a 1B-parameter model to outperform a 1T-parameter model from just 10 examples? At distil labs, we're building the platform that makes this possible. Our distillation pipeline turns a handful of seed examples into production-ready small language models that match or exceed cloud LLM accuracy while running privately on-prem or at the edge.
We're looking for an ML Science Intern to join us for 4–6 months and tackle an open research problem at the core of our platform.
About the internship
As a Machine Learning Science Intern, you’ll join the small, high‑impact research group that powers our platform. You’ll work side‑by‑side with experienced ML scientists and engineers to explore new ideas in knowledge distillation, synthetic‑data generation and model self‑improvement, then turn the best of them into prototypes (or production features, if you wish) and publishable papers. Expect fast iteration cycles, lots of autonomy and a direct line of sight from your code to real customers. The internship is planned for 4–6 months of full‑time work, which should be enough time to explore an idea in depth.
Example areas you might work on:
- Synthetic data diversity: How do we ensure generated training data covers the right distribution without drifting off-task? We model controllable properties (length, topic, style) of the data so we can sample exactly the new examples we need - there's plenty of room to push this further (see the first sketch after this list).
- Validation and filtering: What makes a synthetic example "good enough" to train on? We currently use similarity thresholds, schema checks, and deduplication - but LLM-as-judge and learned quality models are open questions (see the second sketch after this list).
- Knowledge distillation at the edge: Can we distill into even smaller models (100M–1B parameters) without losing task performance? What about multi-task or multi-modal distillation?
- Reasoning and tool use: Our tool-calling benchmarks show massive gains from distillation. How far can we push small models on agentic tasks?
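To give a flavor of the first area, here is a minimal sketch of coverage-guided sampling over controllable properties. The property names, values, prompt template and round-robin coverage heuristic are illustrative assumptions for this posting, not our actual pipeline:

```python
# Illustrative sketch: coverage-guided sampling over controllable properties.
# Property names, values and the prompt template are assumptions, not the
# actual distil labs pipeline.
from collections import Counter
from itertools import product

PROPERTIES = {
    "length": ["short", "medium", "long"],
    "topic": ["billing", "shipping", "returns"],
    "style": ["formal", "casual"],
}

def least_covered_combo(seen: Counter) -> tuple:
    """Pick the property combination with the fewest generated examples so far."""
    combos = list(product(*PROPERTIES.values()))
    return min(combos, key=lambda c: seen[c])

def build_prompt(combo: tuple) -> str:
    """Turn a property combination into a generation prompt for a teacher LLM."""
    length, topic, style = combo
    return (f"Write a {length}, {style} customer question about {topic}, "
            f"followed by an ideal answer.")

seen = Counter()
for _ in range(5):
    combo = least_covered_combo(seen)
    seen[combo] += 1            # record coverage so the next draw explores elsewhere
    print(build_prompt(combo))  # in practice: send to a teacher model
```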
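And a minimal sketch of the kind of filtering described in the second area: a schema check plus near-duplicate removal under a similarity threshold. The schema, the word-level Jaccard similarity and the 0.8 threshold are illustrative assumptions:

```python
# Illustrative sketch: validating and filtering a batch of synthetic examples.
# The schema, Jaccard similarity and 0.8 threshold are assumptions, not the
# actual distil labs pipeline.
import re

REQUIRED_KEYS = {"question", "answer"}
SIM_THRESHOLD = 0.8

def passes_schema(example: dict) -> bool:
    """Require the expected fields, each with non-empty content."""
    return REQUIRED_KEYS <= example.keys() and all(example[k].strip() for k in REQUIRED_KEYS)

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity; a cheap stand-in for embedding similarity."""
    ta, tb = set(re.findall(r"\w+", a.lower())), set(re.findall(r"\w+", b.lower()))
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def filter_batch(candidates: list[dict]) -> list[dict]:
    kept: list[dict] = []
    for ex in candidates:
        if not passes_schema(ex):
            continue  # malformed: fails the schema check
        if any(jaccard(ex["question"], k["question"]) >= SIM_THRESHOLD for k in kept):
            continue  # near-duplicate of an example we already kept
        kept.append(ex)
    return kept

batch = [
    {"question": "How do I reset my password?", "answer": "Use the reset link."},
    {"question": "How do I reset my password ?", "answer": "Click forgot password."},
    {"question": "", "answer": "n/a"},
]
print(filter_batch(batch))  # keeps only the first example
```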
Key responsibilities
- Research & prototyping – implement, benchmark and iterate on novel algorithms that shrink large models without sacrificing accuracy.
- Evaluation – help expand our internal benchmark suite to measure latency, memory footprint and task performance across hardware profiles.
- Production testing – deploy and stress‑test successful prototypes in live production environments, collect telemetry and drive iterative improvements.
- Collaboration – pair with engineers to integrate successful experiments into our Python/PyTorch stack and customer‑facing APIs.
- Publication – prepare and publish benchmark results and research findings as scientific papers at top‑tier machine‑learning conferences.
- Communication & culture – share learnings in weekly demos, contribute to paper‑reading clubs and give feedback on product direction.