2025-12-16
Finally, you can count the r's in strawberry and check if 3.11 is higher than 3.9 without tokenisation interfering. Here's Bolmo, a fully open byte-level LLM with latent tokenisation, derived from a SOTA LLM (Olmo 3). Promising on coding and char-level understanding!
VentureBeat
Allen Institute for AI launches Bolmo 7B and Bolmo 1B, claiming they are “the first fully open byte-level language models”, built on its Olmo 3 models
and every token gets the same compute, regardless of complexity.
Benjamin Minixhofer / @bminixhofer: There are also some things Bolmo lets us do which we just can't do using subwo...