writing
| Date | Title |
|---|---|
| Jan 1, 2024 | Our Mission |
| Jan 5, 2024 | Linear Transformers Are Faster |
| May 16, 2024 | Compute-Optimal Context Size |
| Aug 14, 2024 | LongCrawl64: A Long-Context Dataset |
| Aug 15, 2024 | Symmetric Power Transformers |
| Sep 23, 2024 | Why Gradient Descent Minimizes Training Loss |
| Jul 6, 2025 | Release: Power Attention |
about us
Carles Gelada, Jacob Buckman, Sean Zhang. We are deep learning researchers with 8+ years of experience, formerly at labs such as OpenAI, Google Brain, and Meta. Our research has been published at NeurIPS, ICLR, ICML, and other venues, and has been cited 2000+ times. We are backed by a small group of highly technical long-term investors, including Decibel and True.
join us
We are hiring core technical team members.
The role has elements of both software engineering and research. Responsibilities include implementing deep learning architectures, deriving algorithms, developing research infrastructure, running large-scale experiments, and interpreting and communicating results.
We will work well together if you are independent-minded, capable of self-teaching, and value thinking from first principles. Skills we are looking for include comfort with mathematics, strong communication, deep knowledge in areas like CUDA, XLA/MLIR, JAX, or distributed systems/HPC, and experience training large-scale deep learning models.
We do not care about formal credentials. If you share our vision and would like to get involved, please send an example of some technical work that you are proud of to contact@manifestai.com.