Bookmarks

How to Think About GPUs

03 CUDA Fundamental Optimization Part 1

Detailed lecture on foundational CUDA performance techniques (memory coalescing, occupancy, and kernel launch parameters), illustrated through hands-on code profiling and optimization steps.

Live at NVIDIA GTC with Acquired

NVIDIA Doesn't Care About GPUs

How does Groq LPU work? (w/ Head of Silicon Igor Arsovski!)

Deep technical interview on Groq’s Language Processing Unit architecture: single-cycle SIMD fabric, compiler stack, and network scaling versus GPUs.

My notes while reading about GPUs

Accelerate

Analyzing Modern NVIDIA GPU cores

User Guide for NVPTX Back-end

Tenstorrent Wormhole Series Part 1: Physicalities

Udacity CS344: Intro to Parallel Programming

Writing CUDA Kernels for PyTorch
