2024-12-26
Recently tried to (painstakingly) write a convolution layer kernel using AWS Trainium with their NKI programming model (their latest and greatest), only to realize the CUDA moat is well and truly alive!
SemiAnalysis
Benchmarking AMD's MI300X and Nvidia's H100 and H200; in theory, AMD's GPU has advantages in specs and total cost of ownership, but software bugs hold it back
This Nvidia monopoly on #ai hardware is not good for anybody. … X: Nicholas Wilt / @cudahandbook: NVIDIA to churn the hardware instruction set with abandon, sometimes even committ...
2024-12-25
Recently tried to (painstakingly) write a convolution layer kernel using AWS Trainium with their NKI programming model (their latest and greatest), only to realize the CUDA moat is well and truly alive!
SemiAnalysis
Benchmarking AMD's MI300X and Nvidia's H100 and H200; in theory, AMD's GPU has advantages in specs and total cost of ownership, but software bugs hold it back
Intro: SemiAnalysis has been on a five-month-long quest to settle the reality of the MI300X. In theory, the MI300X …