Definition
An open mixture-of-experts model with 30 billion total parameters but only about 3 billion active per token.
A 30B/3B-active sparse mixture-of-experts open-weight base model used as the backbone for SU-01's olympiad-math post-training.
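As a rough illustration of what "30 billion total, about 3 billion active per token" means, the sketch below shows generic top-k expert routing in plain Python. It is a toy under stated assumptions, not the model's actual router: the function names, the four-expert logits, and k=2 are all hypothetical, and only the 30B/3B figures come from the definition above.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Generic top-k gating sketch (hypothetical, not the model's real router):
    only the selected experts run for this token, which is why the active
    parameter count is much smaller than the total.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_fraction(total_params, active_params):
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


# Toy router scores for one token over 4 hypothetical experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
chosen = [index for index, _ in routes]

# Figures from the definition: 30B total, about 3B active per token.
fraction = active_fraction(30e9, 3e9)
```

Here `chosen` holds the indices of the two experts that would process the token, and `fraction` gives the approximate share of parameters touched per token under the stated 30B/3B figures.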