Understanding GPU Architectures: What's Changed from Turing to Ampere?

Ah, the grand unfolding saga of GPU architectures – it’s like a never-ending game of technological leapfrog, with engineers at NVIDIA constantly upping the ante. As a devoted computer repair technician in the bustling UK market, I’ve seen firsthand how these rapid advancements can leave customers scratching their heads. But fear not, my friends, for I’m here to demystify the latest and greatest from the Turing to Ampere lineage.

Let’s start with the Turing architecture, shall we? This bad boy burst onto the scene in 2018, boasting a slew of new features that left the industry abuzz. Chief among them was the arrival of second-generation Tensor Cores [1] – specialized processing units, first seen in the data-centre Volta chips, designed to accelerate AI workloads. With support for INT8, INT4, and even binary 1-bit precision, these Tensor Cores could deliver up to a whopping 32x performance boost over the previous Pascal generation.
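
Now, if you're the sort who likes to peek under the bonnet, here's roughly what talking to a Tensor Core looks like from CUDA. This is only a bare-bones sketch using the WMMA API on a single 16x16 half-precision tile (compile for sm_70 or newer); a real kernel would tile a large matrix across many warps.

```cuda
#include <mma.h>
#include <cuda_fp16.h>

using namespace nvcuda;

// Multiply one 16x16x16 tile of half-precision inputs on the Tensor Cores,
// accumulating in FP32. A single warp cooperatively owns the whole tile.
__global__ void tile_matmul(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // start the accumulator at zero
    wmma::load_matrix_sync(a_frag, a, 16);           // leading dimension = 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // D = A*B + C on the Tensor Cores
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}
```

Launched with a single warp – something like tile_matmul<<<1, 32>>>(d_a, d_b, d_c) – it computes just that one tile; the point is simply that the multiply-accumulate is handed to the Tensor Cores rather than the ordinary CUDA cores.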

But NVIDIA wasn’t done there. Turing also ushered in the first-generation Ray Tracing Cores [1], enabling real-time, photo-realistic rendering for gaming and visualization. Suddenly, the virtual worlds we created came alive with lifelike shadows, reflections, and refractions. It was a true game-changer, no pun intended.

Now, fast forward to the present, where the Ampere architecture has taken the GPU world by storm. Unveiled in 2020, Ampere builds upon the foundations laid by Turing, while introducing a bevy of new enhancements that will make your head spin faster than a GPU fan.

For starters, the third-generation Tensor Cores [2] in Ampere are an absolute powerhouse. Not only do they support even more data types, like TF32 and BF16 (bfloat16), but they can also provide up to 4x the throughput of their Turing predecessors. Imagine crunching through those complex deep learning models with the speed of a cheetah on Red Bull.
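
The nice part is that you rarely program TF32 by hand – the libraries flip the switch for you. As a rough sketch (assuming cuBLAS 11 or newer on an Ampere-class card), it boils down to something like this:

```cuda
#include <cublas_v2.h>

// Sketch: ask cuBLAS to route an ordinary FP32 matrix multiply through the
// third-generation Tensor Cores in TF32 mode. d_a, d_b and d_c are assumed
// to be device pointers to n x n column-major float matrices.
void sgemm_tf32(cublasHandle_t handle,
                const float *d_a, const float *d_b, float *d_c, int n) {
    // Allow FP32 math to run as TF32 on the Tensor Cores.
    cublasSetMathMode(handle, CUBLAS_TF32_TENSOR_OP_MATH);

    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n,
                &alpha, d_a, n, d_b, n,
                &beta, d_c, n);
}
```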

But the real showstopper is Ampere’s introduction of fine-grained structured sparsity [3]. With this technique, a trained network is pruned so that two out of every four consecutive weights are zero, and the Tensor Cores then skip those zeroed values entirely, effectively doubling their throughput on sparse matrices. It’s like having a personal assistant who knows exactly which tasks to focus on, leaving the rest to gather dust.
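
The pattern behind the magic is called "2:4" sparsity, and it's simple enough to sketch in plain C++. The function below is purely a conceptual illustration of the pruning step – NVIDIA's actual tooling lives in libraries such as cuSPARSELt – but it shows the rule: in every group of four weights, the two smallest get zeroed.

```cpp
#include <cmath>
#include <vector>
#include <algorithm>

// Conceptual 2:4 structured pruning: in every group of four weights,
// keep the two with the largest magnitude and zero the other two.
void prune_2_4(std::vector<float> &weights) {
    for (size_t g = 0; g + 4 <= weights.size(); g += 4) {
        // Indices 0..3 within this group, sorted by descending magnitude.
        int idx[4] = {0, 1, 2, 3};
        std::sort(idx, idx + 4, [&](int a, int b) {
            return std::fabs(weights[g + a]) > std::fabs(weights[g + b]);
        });
        // Zero the two smallest-magnitude weights in the group.
        weights[g + idx[2]] = 0.0f;
        weights[g + idx[3]] = 0.0f;
    }
}
```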

And let’s not forget about the improvements to the Streaming Multiprocessors (SMs) – the heart and soul of any NVIDIA GPU. Ampere’s SMs [3] feature a unified shared memory/L1 cache that is both larger (128 KB per SM versus Turing’s 96 KB) and capable of twice the bandwidth, and they double the number of FP32 datapaths, with one path able to execute either FP32 or INT32 instructions. The result is smoother, more efficient execution of mixed workloads, making Ampere a true jack-of-all-trades.
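
If you write your own kernels, you can even nudge that unified L1/shared memory split in your favour. Here's a hedged sketch – the kernel itself is just a stand-in, and the interesting bit is the carveout hint:

```cuda
#include <cuda_runtime.h>

// Stand-in kernel that stages data through dynamic shared memory.
__global__ void stage_through_shared(const float *in, float *out, int n) {
    extern __shared__ float tile[];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    if (i < n) out[i] = tile[threadIdx.x] * 2.0f;   // placeholder work
}

int main() {
    const int n = 256;
    float *in, *out;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    // Hint: carve the unified L1/shared memory array mostly as shared
    // memory for this kernel (the value is a percentage preference).
    cudaFuncSetAttribute(stage_through_shared,
                         cudaFuncAttributePreferredSharedMemoryCarveout, 100);

    stage_through_shared<<<1, n, n * sizeof(float)>>>(in, out, n);
    cudaDeviceSynchronize();

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```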

But the changes don’t stop there. Ampere also boasts a beefed-up second-generation Ray Tracing Core [3], delivering twice the performance of its Turing counterpart. For those of you who live and breathe rendering, this is a game-changer (pun absolutely intended this time).

And let’s not forget about the memory subsystem. Ampere’s support for PCIe Gen 4 [3] doubles the available interface bandwidth compared to the previous generation, while a much larger L2 cache (up to 6 MB on the GA102 gaming chips, and a massive 40 MB on the data-centre A100) and new Compute Data Compression features ensure that your data is always within arm’s reach.
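
Curious how much L2 your own card carries? The CUDA runtime will happily tell you. A quick sketch (the PCIe link generation isn't part of this particular query – tools like nvidia-smi report that separately):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Print a few memory-subsystem figures for the first installed GPU.
int main() {
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, 0);
    printf("GPU: %s\n", prop.name);
    printf("L2 cache: %d bytes\n", prop.l2CacheSize);
    printf("Memory bus width: %d bits\n", prop.memoryBusWidth);
    return 0;
}
```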

So, in summary, the journey from Turing to Ampere has been nothing short of a quantum leap. With advancements in Tensor Cores, Ray Tracing Cores, Streaming Multiprocessors, and the memory subsystem, NVIDIA has solidified its position as the undisputed champion of GPU innovation. Whether you’re a hardcore gamer, a deep learning researcher, or a professional visualizer, the Ampere architecture is poised to take your experiences to new heights.

And as your friendly neighborhood computer repair technician, I can’t wait to see what the future holds. Who knows what mind-bending advancements the Hopper microarchitecture will bring? One thing’s for sure – the GPU race is far from over, and I’ll be here to guide you through every twist and turn.

References:
[1] https://wolfadvancedtechnology.com/articles/nvidia-gpu-architecture
[2] https://www.cherryservers.com/blog/everything-you-need-to-know-about-gpu-architecture
[3] NVIDIA Ampere GA102 GPU Architecture Whitepaper, https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf
