With only 6.6B active parameters, GRIN MoE performs exceptionally well across a diverse set of tasks, particularly coding and mathematics.
Microsoft releases GRIN😁 MoE
GRadient-INformed MoE
Demo: https://huggingface.co/spaces/GRIN-MoE-Demo/GRIN-MoE
Model: https://huggingface.co/microsoft/GRIN-MoE
GitHub: microsoft/GRIN-MoE