Member of Technical Staff (Research)

Bagel Labs · Toronto, Canada

We are Bagel Labs - an artificial intelligence research lab pioneering distributed training of frontier diffusion models on commodity hardware.

We ignore years of experience and pedigree. If you have high agency - meaning your default assumption is that you can control the outcome of whatever situation you are in - we want to hear from you. Every requirement below is flexible for a candidate with high enough agency and tolerance for ambiguity.

Role Overview

We encourage curiosity-driven research and welcome bold, untested ideas that challenge conventional paradigms. You will explore frontiers in continual learning, world modelling, and reinforcement learning on diffusion models.

Key Responsibilities

  • Advance decentralized diffusion models (DDM) and pioneer next-generation approaches, including rectified flows, EDM variants, and latent consistency models.
  • Develop novel sampling algorithms, guidance mechanisms, and conditioning strategies that unlock new capabilities in controllable generation.
  • Push the frontier of video generation and synthesis, including temporal modeling and multi-modal architectures.
  • Publish at top-tier ML venues and share insights through blog posts, open-source contributions, and community engagement.

Who You Might Be

You are extremely curious. You actively consume the latest ML research - scanning arXiv, attending conferences, dissecting new open-source releases, and integrating breakthroughs into your own experimentation. You thrive on first-principles reasoning, see potential in unexplored ideas, and view learning as a perpetual process.

Desired Skills

  • Deep expertise in modern diffusion models including training, sampling, denoising schedulers, score matching, flow matching, consistency training, and distillation techniques.
  • Experience with transformer-based diffusion architectures such as DiT and MM-DiT, and a solid grasp of attention mechanisms.
  • Hands-on experience with distributed training at scale across multi-GPU and multi-node setups, with familiarity in mixed-precision training (FP8, BF16).
  • Experience with video generation and synthesis, including temporal modeling and 3D positional encodings.
  • Knowledge of VAE architectures (e.g. HunyuanVAE, DC-AE) and latent representations, as well as motion modeling and optical flow.
  • Strong mathematical foundation in SDEs, ODEs, optimal transport, and variational inference for designing novel generative objectives.
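
To give a concrete sense of the level we mean by "deep expertise": the flow-matching / rectified-flow objective listed above reduces to a simple regression onto a velocity field. Here is a minimal NumPy sketch of that idea (the `toy_model` and the linear interpolant are illustrative assumptions, not our training code):

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(model, x1, rng):
    """Rectified-flow objective: interpolate linearly between noise x0 and
    data x1, then regress the model onto the constant velocity x1 - x0."""
    x0 = rng.standard_normal(x1.shape)       # noise sample
    t = rng.uniform(size=(x1.shape[0], 1))   # random time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1             # linear interpolant
    v_target = x1 - x0                       # target velocity field
    v_pred = model(xt, t)
    return np.mean((v_pred - v_target) ** 2)

# Toy "model" that predicts zero velocity everywhere (a stand-in for a DiT).
toy_model = lambda xt, t: np.zeros_like(xt)

x1 = rng.standard_normal((4, 2))             # a tiny batch of "data"
loss = flow_matching_loss(toy_model, x1, rng)
print(loss)
```

In practice the zero-velocity stand-in would be a transformer (e.g. a DiT) trained by gradient descent on this loss; candidates should be comfortable deriving why this objective recovers an ODE whose flow transports noise to data.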

What We Offer

  • Top-of-market compensation and time to pursue open-ended research.
  • A deeply technical culture where bold, frontier ideas are debated, stress-tested, and built.
  • In-person role at our Toronto office.
  • Ownership of work that can set the direction for frontier diffusion models.
  • Paid travel opportunities to the top ML conferences around the world.