Ran the MiniMax 2.7 230B model locally with IQ3_S quantized weights. Inference was done on the CPU. The runtime was the Vulkan build of llama.cpp (Linux).
Prompt processing: 9 tk/s
Prompt evaluation: 6 tk/s
Because it is Q3 quantized, the quality is not as good as the full BF16 model.
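For a rough sense of why the Q3 build is the one that fits on a local machine, here is a minimal sketch of the weight-size arithmetic. The 230B parameter count is from the post; the 3.44 bits per weight for IQ3_S is an assumed approximate average, not a measured figure, and the estimate covers the weights only (no KV cache or runtime buffers).

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 230e9      # from the post: 230B parameters
BF16_BITS = 16.0      # BF16 stores 16 bits per weight
IQ3_S_BITS = 3.44     # assumption: approximate average bits per weight for IQ3_S

bf16_gb = weight_size_gb(N_PARAMS, BF16_BITS)
iq3_gb = weight_size_gb(N_PARAMS, IQ3_S_BITS)

print(f"BF16 : {bf16_gb:.0f} GB")
print(f"IQ3_S: {iq3_gb:.0f} GB")
```

Under these assumptions the quantized weights come out at roughly a fifth of the BF16 size, which is the trade being made against quality.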