This seems really powerful. It has native multimodal capabilities, and it is a mixture-of-experts (MoE) model.
It is amazing to see how Meta has moved from dense models to MoEs.
It also has longer context windows and seems to be quite good at coding and reasoning, although it is not a dedicated reasoning model.