nvidia/cuda-samples

★ 9,170 C++ Audience · developer Complexity · 5/5 Setup · hard

Mindmap

mindmap
  root((cuda-samples))
    What it does
      GPU programming examples
      CUDA feature demos
    Topics Covered
      Memory management
      Thread coordination
      Multi GPU setup
    Platforms
      Linux with CMake
      Windows Visual Studio
      Tegra embedded
    Audience
      GPU developers
      CUDA learners


Things people build with this

USE CASE 1

Build and run a targeted CUDA sample to understand exactly how GPU memory management or thread coordination works in real code.

USE CASE 2

Use a CUDA sample as a starting point for your own GPU-accelerated algorithm, studying NVIDIA's recommended code patterns.

USE CASE 3

Cross-compile CUDA samples for an NVIDIA Tegra embedded device for use in robotics or automotive applications.

Tech stack

C++ · CUDA · CMake · Visual Studio

Getting it running

Difficulty · hard Time to first run · 1h+

Requires an NVIDIA GPU and a matching CUDA Toolkit version. Windows builds also need Visual Studio.

In plain English

This repository is a collection of example programs created by NVIDIA to show developers how to use CUDA, which is NVIDIA's programming system for running code on a graphics card (GPU) instead of a regular processor. GPUs can handle many calculations at once, making them much faster than a regular CPU for certain tasks like graphics, simulations, and machine learning. Each sample in this collection demonstrates a specific feature or technique available in the CUDA Toolkit, which is the set of tools NVIDIA provides for GPU programming.

The samples cover a wide range of topics, from basic memory management and thread coordination to more advanced features like multi-GPU setups, cooperative computation patterns, and integration with graphics APIs. Each sample is a self-contained program you can build and run to see a particular concept in action. The collection is kept in sync with specific CUDA Toolkit versions, so you can match the samples to the version of CUDA you have installed.

Building the samples requires installing the CUDA Toolkit and a compatible C++ build system. On Linux, you use CMake and a standard compiler. On Windows, you use Visual Studio. The README provides step-by-step instructions for both platforms, including cross-compilation for NVIDIA Tegra devices used in embedded systems like robots and automotive hardware.

This is not an end-user application. It is a reference library for developers who are already writing or learning to write GPU-accelerated code. If you are new to GPU programming and want to understand how specific CUDA features work in practice, these samples give you working code to read, compile, and experiment with directly on your own machine.
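To give a feel for the two basic topics mentioned above, here is a minimal sketch of GPU memory management and thread coordination in one small program. It is not taken from the repository. The file name `block_sum.cu` and the names `blockSum`, `sumOnGpu`, `check`, and `kThreads` are hypothetical, chosen only for illustration. The CUDA runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`) and `__syncthreads()` are standard, but treat the overall structure as one possible approach, not NVIDIA's recommended pattern.

```cuda
// block_sum.cu: hypothetical illustration, not a file from cuda-samples.
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // threads per block; a power of two for the reduction

// Each block adds up its slice of `in` and writes one partial sum to `out`.
__global__ void blockSum(const float* in, float* out, int n) {
    __shared__ float tile[kThreads];  // fast memory shared by one block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();  // thread coordination: wait until the whole tile is loaded

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) {
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        }
        __syncthreads();  // every thread must finish a step before the next one
    }
    if (threadIdx.x == 0) out[blockIdx.x] = tile[0];
}

// Abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: sums `host` on the GPU and returns the total.
float sumOnGpu(const std::vector<float>& host) {
    if (host.empty()) return 0.0f;
    const int n = static_cast<int>(host.size());
    const int blocks = (n + kThreads - 1) / kThreads;

    // Memory management: allocate device buffers and copy the input over.
    float* dIn = nullptr;
    float* dOut = nullptr;
    check(cudaMalloc(&dIn, n * sizeof(float)), "cudaMalloc(dIn)");
    check(cudaMalloc(&dOut, blocks * sizeof(float)), "cudaMalloc(dOut)");
    check(cudaMemcpy(dIn, host.data(), n * sizeof(float),
                     cudaMemcpyHostToDevice), "copy to device");

    blockSum<<<blocks, kThreads>>>(dIn, dOut, n);
    check(cudaGetLastError(), "kernel launch");

    // Copy the per-block partial sums back (this copy waits for the kernel).
    std::vector<float> partial(blocks);
    check(cudaMemcpy(partial.data(), dOut, blocks * sizeof(float),
                     cudaMemcpyDeviceToHost), "copy to host");
    check(cudaFree(dIn), "cudaFree(dIn)");
    check(cudaFree(dOut), "cudaFree(dOut)");

    // Finish the sum on the CPU.
    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}

int main() {
    std::vector<float> data(1000, 1.0f);
    std::printf("sum = %.1f\n", sumOnGpu(data));
    return 0;
}
```

Assuming the CUDA Toolkit is installed and a CUDA-capable GPU is present, `nvcc -o block_sum block_sum.cu && ./block_sum` should build and run the sketch and print `sum = 1000.0`. The samples in the repository are more thorough than this and are the better reference for real projects.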

Copy-paste prompts

Prompt 1

Show me how to build and run a basic CUDA memory management sample on Linux using CMake. What commands do I run?

Prompt 2

I want to understand CUDA thread coordination. Which cuda-samples example demonstrates it best and what does the code do?

Prompt 3

Help me modify an nvidia/cuda-samples example to run a parallel computation across multiple GPUs.

Prompt 4

How do I match the right cuda-samples branch or tag to the CUDA Toolkit version installed on my machine?


Verify against the repo before relying on details.