VOICE ARCHIVE

Nicholas Wilt

@cudahandbook
11 posts
2024-12-26
CUDA's software stack has a few distinct pillars that are triumphs of software engineering (let alone software architecture). The driver API was built in C, portable across both operating systems and CPU architectures. Across the 6 years I worked on CUDA, no one questioned 1/x
2024-12-26 View on X
SemiAnalysis

Benchmarking AMD's MI300X and Nvidia's H100 and H200; in theory, AMD's GPU has advantages in specs and total cost of ownership, but software bugs hold it back

This NVidia monopoly on #ai hardware is not good for anybody. … X: Nicholas Wilt / @cudahandbook : NVIDIA to churn the hardware instruction set with abandon, sometimes even committ...

NVIDIA to churn the hardware instruction set with abandon, sometimes even committing featurecide and relying on the PTX translator to emulate instructions that were removed. The PTX translation code is in the driver as well as the offline toolchain (ptxas) and is 3/x
2024-12-26 View on X

supercomputers in the world. So yeah, CUDA is a deep, deep moat. I get pretty offended when folks intimate that any luck was involved. We knew exactly what we were doing and why. /fin
2024-12-26 View on X

why I was making sure it ran on Windows as well as Linux. It was healthy for the code base. Today, NVIDIA has parlayed CUDA's Windows support into a monopoly position in GPU workstations, because 1,200 workstation apps use CUDA. Another pillar is PTX, which enables 2/x
2024-12-26 View on X

multithreaded, so it can exploit modern multicore CPUs for performance gains proportional to the core count. Another triumph of software engineering. All of this great software runs on a span of platforms from tiny SOCs for cars and drones and robots, to the biggest 4/x
2024-12-26 View on X

2024-12-25
CUDA's software stack has a few distinct pillars that are triumphs of software engineering (let alone software architecture). The driver API was built in C, portable across both operating systems and CPU architectures. Across the 6 years I worked on CUDA, no one questioned 1/x
2024-12-25 View on X
SemiAnalysis

Benchmarking AMD's MI300X and Nvidia's H100 and H200; in theory, AMD's GPU has advantages in specs and Total Cost of Ownership, but software bugs hold it back

Intro  —  SemiAnalysis has been on a five-month long quest to settle the reality of MI300X.  In theory, the MI300X …

NVIDIA to churn the hardware instruction set with abandon, sometimes even committing featurecide and relying on the PTX translator to emulate instructions that were removed. The PTX translation code is in the driver as well as the offline toolchain (ptxas) and is 3/x
2024-12-25 View on X

supercomputers in the world. So yeah, CUDA is a deep, deep moat. I get pretty offended when folks intimate that any luck was involved. We knew exactly what we were doing and why. /fin
2024-12-25 View on X

why I was making sure it ran on Windows as well as Linux. It was healthy for the code base. Today, NVIDIA has parlayed CUDA's Windows support into a monopoly position in GPU workstations, because 1,200 workstation apps use CUDA. Another pillar is PTX, which enables 2/x
2024-12-25 View on X

multithreaded, so it can exploit modern multicore CPUs for performance gains proportional to the core count. Another triumph of software engineering. All of this great software runs on a span of platforms from tiny SOCs for cars and drones and robots, to the biggest 4/x
2024-12-25 View on X

2024-10-11
I added the CUDA features for MCP7, back when GPUs were integrated into the northbridge. The fundamental problem w APUs is that any GPU work is bookended by launch latency on the one hand and waiting for the GPU to finish on the other.
2024-10-11 View on X
CNBC

AMD launches its Instinct MI325X GPU to compete with Nvidia's upcoming Blackwell chips and says production will start in 2024, but doesn't disclose its pricing

AMD launched a new artificial-intelligence chip on Thursday that is taking direct aim at Nvidia's data center graphics processors, known as GPUs.
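
The "bookending" described in the 2024-10-11 post can be put as simple arithmetic. What follows is a minimal sketch in Python, not anything taken from the post: the function names and all of the numbers are hypothetical and purely illustrative. It assumes a fixed launch latency before the GPU work and a fixed wait for completion after it, so offloading only pays off when the CPU time saved exceeds that overhead on both ends.

```python
def offload_time_us(gpu_work_us, launch_latency_us, sync_wait_us):
    """Wall-clock cost of one offloaded task: fixed overhead on both ends plus the GPU work."""
    return launch_latency_us + gpu_work_us + sync_wait_us


def worth_offloading(cpu_work_us, gpu_work_us, launch_latency_us, sync_wait_us):
    """True when the GPU path, overhead included, beats doing the work on the CPU."""
    return offload_time_us(gpu_work_us, launch_latency_us, sync_wait_us) < cpu_work_us


# Illustrative, made-up numbers in microseconds (assumptions, not measurements).
LAUNCH_US = 5.0
SYNC_US = 5.0

# A small task: the GPU is 8x faster on the work itself, but the overhead dominates.
small = worth_offloading(cpu_work_us=8.0, gpu_work_us=1.0,
                         launch_latency_us=LAUNCH_US, sync_wait_us=SYNC_US)

# A large task with the same 8x speedup: the fixed overhead is amortized.
large = worth_offloading(cpu_work_us=800.0, gpu_work_us=100.0,
                         launch_latency_us=LAUNCH_US, sync_wait_us=SYNC_US)

print(small, large)  # False True
```

Under these assumed numbers the same speedup loses on the small task and wins on the large one, which is the point of the post: the fixed costs at either end set a floor on how small a piece of work is worth sending to the GPU.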