2024-10-25
SiliconANGLE
9 related
Meta debuts “quantized” versions of Llama 3.2 1B and 3B models, designed to run on low-powered devices and developed in collaboration with Qualcomm and MediaTek
so today we're releasing new quantized versions of Llama 3.2 1B & 3B that deliver up to 2-4x increases in inference speed and, on average, 56% reduction in model size, and 41% reduction in memory foot...
Loading articles...