cudahandbook · TEXXR

CUDA's software stack has a few distinct pillars that are triumphs of software engineering (let alone software architecture). The driver API was built in C, portable across both operating systems and CPU architectures. Across the 6 years I worked on CUDA, no one questioned 1/x

2024-12-26 View on X

SemiAnalysis

Benchmarking AMD's MI300X and Nvidia's H100 and H200; in theory, AMD's GPU has advantages in specs and total cost of ownership, but software bugs hold it back

This NVidia monopoly on #ai hardware is not good for anybody. … X: Nicholas Wilt / @cudahandbook : NVIDIA to churn the hardware instruction set with abandon, sometimes even committ...

View original

NVIDIA to churn the hardware instruction set with abandon, sometimes even committing featurecide and relying on the PTX translator to emulate instructions that were removed. The PTX translation code is in the driver as well as the offline toolchain (ptxas) and is 3/x

2024-12-26 View on X

supercomputers in the world. So yeah, CUDA is a deep, deep moat. I get pretty offended when folks intimate that any luck was involved. We knew exactly what we were doing and why. /fin

2024-12-26 View on X

why I was making sure it ran on Windows as well as Linux. It was healthy for the code base. Today, NVIDIA has parlayed CUDA's Windows support into a monopoly position in GPU workstations, because 1,200 workstation apps use CUDA. Another pillar is PTX, which enables 2/x

2024-12-26 View on X

multithreaded, so it can exploit modern multicore CPUs for performance gains proportional to the core count. Another triumph of software engineering. All of this great software runs on a span of platforms from tiny SOCs for cars and drones and robots, to the biggest 4/x

2024-12-26 View on X
