abacaj · TEXXR

Why pay for this if the cost is the same and you get access to a worse model? I get the speed factor but for code - I want the best model

2025-08-02 View on X

Cerebras

Cerebras announces the $50/month Code Pro and the $200/month Code Max plans, offering users access to Qwen3-Coder at speeds of up to 2,000 tokens per second

Two interesting examples of inference speed as a flagship feature of LLM services today. Bluesky: Tim Kellogg / @timkellogg.me : Cerebras Code — use models hosted on Cerebras with ...

You can just fork vscode and make billions

2025-04-17 View on X

Bloomberg

Source: OpenAI is in talks to acquire Windsurf, an AI coding tool formerly known as Codeium, for ~$3B; Windsurf was valued at $1.25B in a 2024 funding deal

they are the black hole of Startups People still don't get it Evil institution Deedy / @deedydas : @EMostaque Perhaps the time (say even 6mos) it takes to build and grow a Windsurf...

llama 4 is really a bit disappointing, not a model I would use for assistance (code, etc). turns out gemini 2.5 is a really good model for code & sonnet for agentic tasks. not sure where llama 4 fits in with all of the available models today...

2025-04-06 View on X

Meta launches Llama 4 Maverick with 400B parameters and Scout with 109B parameters and a 10M context window, and previews Behemoth with 2T total parameters

Takeaways — We're sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

It's funny that SOTA models like deepseek will gladly output they are trained by OAI (1.5 years after gpt-4) but maybe not so funny... how hard could it be to steer the model in post training to not admit it is trained by OAI/GPT-4? It's obvious the model will output this when [image]

2024-12-28 View on X

TechCrunch

DeepSeek-V3 sometimes identifies itself as ChatGPT when asked which model it is; some speculate that its training datasets may contain text generated by ChatGPT

Kyle Wiggers / TechCrunch

open source models are kind of cooked, if it takes this much compute to get the right answer for complex questions there's no shot you can run that “locally”

2024-12-22 View on X

TechCrunch

OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025

12 Days of OpenAI: Day 12 Naomi Li Gan / Tech in Asia : OpenAI unveils AI model for advanced reasoning Bojan Stojkovski / Interesting Engineering : OpenAI unveils o3 reasoning AI m...

open source models are kind of cooked, if it takes this much compute to get the right answer for complex questions there's no shot you can run that “locally”

2024-12-21 View on X

TechCrunch

OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025

OpenAI announced its new o3 models on Friday. — In a tweet ahead of its final livestream for its …

Not feeling the vibe with the new gemini flash 2.0 thinking model (it seems a lot worse than o1). I am impressed with gemini flash 2.0 vision capabilities though pretty sure it is SOTA or very close

2024-12-20 View on X

TechCrunch

Google releases Gemini 2.0 Flash Thinking, an experimental “reasoning” model that “explicitly shows its thoughts” and can use them to strengthen its reasoning

Quick: what sort of prompts should you run against GPT-4o vs Gemini 1.5 Flash vs o1 vs o1-pro vs gemini-2.0-flash-thinking-exp? X: Jeff Dean / @jeffdean : Introducing Gemini 2.0 Fl...

This is effectively the most important feature possible outside of just making models better. Cheaper prompts (10x) and faster (30-80%) first token on cache hits, incredible when doing few shot prompting with images or text

2024-08-15 View on X

VentureBeat

Anthropic releases prompt caching, which lets developers cache frequently used context between API calls, in public beta on its API

Anthropic introduced prompt caching on its API, which remembers the context between API calls and allows developers to avoid repeating prompts.
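The "10x cheaper" claim in the post above can be made concrete with a little arithmetic. The sketch below assumes cached prefix tokens are billed at one tenth of the normal input rate, as the post states, and ignores any extra charge for writing to the cache; the function name and token counts are hypothetical.

```python
def billable_input_tokens(prefix_tokens: int, fresh_tokens: int,
                          cache_hit: bool, cached_discount: float = 10.0) -> float:
    """Input tokens expressed in full-price token equivalents.

    prefix_tokens: the reusable part of the prompt (e.g. few-shot examples).
    fresh_tokens: the part that changes on every call.
    cached_discount: assumed price ratio between normal and cached tokens.
    """
    prefix = prefix_tokens / cached_discount if cache_hit else prefix_tokens
    return prefix + fresh_tokens


# Hypothetical few-shot prompt: 10,000 shared tokens plus a 200-token query.
miss = billable_input_tokens(10_000, 200, cache_hit=False)
hit = billable_input_tokens(10_000, 200, cache_hit=True)
print(f"cache miss: {miss:.0f}, cache hit: {hit:.0f}, ratio: {miss / hit:.1f}x")
```

The overall saving approaches the full 10x only when the shared prefix dominates the prompt, which is why few-shot prompting benefits most.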

What's so special about search gpt? Am I missing the hype? There's already a dozen or so services doing this

2024-07-26 View on X

The Verge

OpenAI unveils SearchGPT, a GPT-4-powered search tool that can organize links and summarize its findings, limited to 10K users but eventually coming to ChatGPT

Whenever AI companies present a vision for the role of artificial intelligence … Greg Noone / Tech Monitor : OpenAI announces AI search engine SearchGPT Samantha Dunn / CCN.com : S...

openai just effectively made llama 3 70B obsolete (outside of being able to fine tune it for cheaper) [image]

2024-07-19 View on X

Simon Willison's Weblog

GPT-4o mini costs $0.15 per 1M input tokens and $0.60 per 1M output tokens, prices lower than those of Claude 3 Haiku and Gemini 1.5 Flash

GPT-4o mini. I've been complaining about how under-powered GPT 3.5 is for the price for a while now (I made fun of it in a keynote a few weeks ago).
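To see what those per-million-token prices mean for a single request, here is a minimal sketch. The two prices come from the headline above; the function name and the token counts in the example are hypothetical.

```python
# Prices quoted in the headline above, in USD per 1M tokens.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 0.60


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the quoted GPT-4o mini prices."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
cost = request_cost_usd(10_000, 1_000)
print(f"${cost:.4f}")
```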

won't be able to run this locally (even 4bit will require > 200GB) but we'll very likely see some providers deploy it

2024-07-13 View on X

The Information

Source: Meta plans to release the largest version of its Llama 3 model, expected to have 405B parameters and multimodal capabilities, on July 23
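The "> 200GB" estimate in the post above follows from the parameter count alone. A rough sketch, assuming only the weights are counted (no KV cache, activations, or quantization metadata) and using decimal gigabytes; the function name is illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed to hold model weights, in decimal GB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# 405B parameters, as reported for the largest Llama 3 model.
llama3_params = 405e9
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(llama3_params, bits):.1f} GB")
```

At 4 bits per parameter the weights alone come to about 202.5 GB, so actual requirements would be somewhat higher once runtime overhead is included.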

Another AGI lab bites the dust [image]

2024-06-29 View on X

GeekWire

Amazon hires the CEO and co-founders of Adept, which builds AI agents that automate enterprise workflows, to join its AGI team and will use some of Adept's tech

Amazon is amping up its AI efforts by hiring executives from Adept, a San Francisco-based startup building “agents” that automate enterprise workflows.

Wow these numbers for the new phi-3-vision are really good... going to have to give this a try [image]

2024-05-22 View on X

VentureBeat

Microsoft announces the general availability of its Phi-3 models, including Phi-3-Silica, a 3.3B parameter model that will be embedded on all Copilot+ PCs

here's what you can use it for Pradeep Viswav / MSPoweruser : Microsoft and Khan Academy announce AI partnership Kevin Okemwa / Windows Central : Microsoft ships Azure AI Studio in...

Wow these numbers for the new phi-3-vision are really good... going to have to give this a try [image]

2024-05-22 View on X

Windows Central

Microsoft ships Azure AI Studio in broad availability, adds support for OpenAI's GPT-4o, and announces a new multimodal model in its lightweight Phi-3 family

Bearish for OpenAI from now on [image]

2024-05-15 View on X

CNBC

Ilya Sutskever says he will leave OpenAI to work on a “personally meaningful” project; Director of Research Jakub Pachocki will become OpenAI's chief scientist

OpenAI co-founder Ilya Sutskever said Tuesday that he's leaving the Microsoft-backed startup.

what are these models? they don't seem like the next iteration of gpt-n

2024-05-08 View on X

Simon Willison's Weblog

OpenAI built the gpt2-chatbot, renamed to “im-also-a-good-gpt2-chatbot”, per the gpt2-chatbot's 429 rate limit error message, which appeared in the LMSYS arena

gpt2-chatbot confirmed as OpenAI (via) The mysterious gpt2-chatbot model that showed up in the LMSYS arena a few days ago …

im-a-good-gpt2-chatbot will *gladly* hallucinate information where the current gpt-4-turbo *does* not, left “gpt2” right gpt-4-turbo [image]

2024-05-08 View on X

Simon Willison's Weblog

OpenAI built the gpt2-chatbot, renamed to “im-also-a-good-gpt2-chatbot”, per the gpt2-chatbot's 429 rate limit error message, which appeared in the LMSYS arena

gpt2-chatbot confirmed as OpenAI (via) The mysterious gpt2-chatbot model that showed up in the LMSYS arena a few days ago …

So “gpt2” seems to be coming from OpenAI lol. They should probably not return exact errors from providers

2024-05-08 View on X

Simon Willison's Weblog

OpenAI built the gpt2-chatbot, renamed to “im-also-a-good-gpt2-chatbot”, per the gpt2-chatbot's 429 rate limit error message, which appeared in the LMSYS arena

gpt2-chatbot confirmed as OpenAI (via) The mysterious gpt2-chatbot model that showed up in the LMSYS arena a few days ago …

If I had to take a wild guess it's just the phi-3 variants that aren't released yet

2024-05-08 View on X

Simon Willison's Weblog

OpenAI built the gpt2-chatbot, renamed to “im-also-a-good-gpt2-chatbot”, per the gpt2-chatbot's 429 rate limit error message, which appeared in the LMSYS arena

gpt2-chatbot confirmed as OpenAI (via) The mysterious gpt2-chatbot model that showed up in the LMSYS arena a few days ago …

AI bubble bursting? [image]

2024-04-20 View on X

Financial Times

Nvidia closed down 10% on Friday, falling the most since March 2020 and losing more than $200B of its market value, as investors pull back from AI bets

US stock markets suffer their worst run since October 2022 as investors pull back from AI bets — Nvidia's share price plunged …
