CUDA Occupancy Calculator
GPU Occupancy Data is displayed here and in the graphs
Active Threads per Multiprocessor
Active Warps per Multiprocessor
Active Thread Blocks per Multiprocessor
Occupancy of each Multiprocessor
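The four figures above are related by simple arithmetic. The following is a minimal sketch of that relationship, not the calculator's own code. The function name `occupancy_summary` and the default limits (32 threads per warp, 64 warps per multiprocessor) are assumptions chosen for illustration. It also assumes occupancy is defined as active warps divided by the maximum warps per multiprocessor.

```python
import math


def occupancy_summary(threads_per_block, active_blocks_per_sm,
                      threads_per_warp=32, max_warps_per_sm=64):
    """Derive the four occupancy figures from a block size and a block count.

    The default limits are illustrative assumptions, not values taken
    from this page.
    """
    # A partially filled warp still occupies a whole warp slot.
    warps_per_block = math.ceil(threads_per_block / threads_per_warp)
    active_warps = active_blocks_per_sm * warps_per_block
    active_threads = active_blocks_per_sm * threads_per_block
    return {
        "active_threads": active_threads,
        "active_warps": active_warps,
        "active_blocks": active_blocks_per_sm,
        "occupancy": active_warps / max_warps_per_sm,
    }


if __name__ == "__main__":
    # 4 resident blocks of 256 threads: 32 of 64 warp slots are in use.
    print(occupancy_summary(256, 4))
```

With these assumed limits, four resident blocks of 256 threads give 32 active warps, which is half of the available warp slots.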
Physical Limits for GPU Compute Capability
Version
Threads per Warp
Warps per Multiprocessor
Threads per Multiprocessor
Thread Blocks per Multiprocessor
Total # of 32-bit registers per Multiprocessor
Register allocation unit size
Register allocation granularity
Max registers per Block
Max registers per thread
Shared Memory per Multiprocessor (bytes)
Shared Memory Allocation unit size
Warp allocation granularity (for register allocation)
Max thread block size
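One way to carry these limits around in a script is a small record type with one field per row. This is a sketch only: the class name `SmLimits`, the field names, and every number in `EXAMPLE_LIMITS` are hypothetical placeholders, not values read from this page. Real values depend on the compute capability and should be taken from NVIDIA's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmLimits:
    """Per-multiprocessor limits, one field per row of the table above."""
    threads_per_warp: int
    warps_per_sm: int
    threads_per_sm: int
    blocks_per_sm: int
    registers_per_sm: int
    register_alloc_unit: int
    register_alloc_granularity: str  # e.g. "warp" or "block"
    max_registers_per_block: int
    max_registers_per_thread: int
    shared_mem_per_sm: int           # bytes
    shared_mem_alloc_unit: int       # bytes
    warp_alloc_granularity: int
    max_block_size: int

    def is_consistent(self) -> bool:
        """Basic sanity checks that any plausible set of limits should pass."""
        return (
            self.threads_per_sm == self.warps_per_sm * self.threads_per_warp
            and self.max_block_size <= self.threads_per_sm
            and self.max_registers_per_block <= self.registers_per_sm
        )


# Hypothetical placeholder values for illustration only.
EXAMPLE_LIMITS = SmLimits(
    threads_per_warp=32,
    warps_per_sm=64,
    threads_per_sm=2048,
    blocks_per_sm=32,
    registers_per_sm=65536,
    register_alloc_unit=256,
    register_alloc_granularity="warp",
    max_registers_per_block=65536,
    max_registers_per_thread=255,
    shared_mem_per_sm=65536,
    shared_mem_alloc_unit=256,
    warp_alloc_granularity=4,
    max_block_size=1024,
)

if __name__ == "__main__":
    print(EXAMPLE_LIMITS.is_consistent())
```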
Allocation Per Thread Block
Warps
Registers
Shared Memory
Note: CUDA Runtime uses bytes of Shared Memory per Thread Block.
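The per-block allocation can be approximated by rounding each resource up to its allocation unit. The sketch below assumes warp-granularity register allocation and ignores the warp allocation granularity for simplicity. The function names, the default unit sizes, and the `runtime_smem_per_block` parameter (standing in for the runtime's per-block shared memory overhead mentioned in the note, whose value is not given here) are all assumptions.

```python
import math


def round_up(value, unit):
    """Round value up to the next multiple of unit."""
    return math.ceil(value / unit) * unit


def block_allocation(threads_per_block, regs_per_thread, smem_per_block,
                     threads_per_warp=32, reg_alloc_unit=256,
                     smem_alloc_unit=256, runtime_smem_per_block=0):
    """Estimate warps, registers, and shared memory consumed by one block.

    All default unit sizes are illustrative assumptions.
    runtime_smem_per_block is a hypothetical knob for runtime overhead;
    it defaults to 0 because this page does not state the value.
    """
    warps = math.ceil(threads_per_block / threads_per_warp)
    # Assumed warp granularity: registers are reserved per warp and
    # rounded up to the register allocation unit.
    regs_per_warp = round_up(regs_per_thread * threads_per_warp, reg_alloc_unit)
    registers = regs_per_warp * warps
    shared_mem = round_up(smem_per_block + runtime_smem_per_block,
                          smem_alloc_unit)
    return warps, registers, shared_mem


if __name__ == "__main__":
    # 256 threads, 37 registers per thread, 5000 bytes of shared memory.
    print(block_allocation(256, 37, 5000))  # (8, 10240, 5120)
```

In this example 37 registers per thread cost as much as 40, because 37 × 32 = 1184 registers per warp rounds up to 1280 under the assumed 256-register unit.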
Maximum Thread Blocks Per Multiprocessor
Limited by Max Warps / Blocks per Multiprocessor
Limited by Registers per Multiprocessor
Limited by Shared Memory per Multiprocessor
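The number of blocks that can be resident on a multiprocessor is the smallest of these three limits. The following sketch shows that calculation under stated assumptions: the function name `block_limits` and all default per-multiprocessor limits are hypothetical, and the per-block inputs are taken as already rounded to their allocation units.

```python
def block_limits(warps_per_block, regs_per_block, smem_per_block,
                 max_warps_per_sm=64, max_blocks_per_sm=32,
                 regs_per_sm=65536, smem_per_sm=65536):
    """Return each limit, the resulting block count, and the occupancy.

    Default limits are illustrative assumptions, not values from this page.
    """
    by_warps = min(max_blocks_per_sm, max_warps_per_sm // warps_per_block)
    # A kernel that uses none of a resource is not limited by it.
    by_regs = (regs_per_sm // regs_per_block
               if regs_per_block > 0 else max_blocks_per_sm)
    by_smem = (smem_per_sm // smem_per_block
               if smem_per_block > 0 else max_blocks_per_sm)
    limits = {"warps_or_blocks": by_warps,
              "registers": by_regs,
              "shared_memory": by_smem}
    limiter = min(limits, key=limits.get)
    blocks = limits[limiter]
    return {
        "limits": limits,
        "limited_by": limiter,
        "active_blocks": blocks,
        "occupancy": blocks * warps_per_block / max_warps_per_sm,
    }


if __name__ == "__main__":
    # 8 warps, 10240 registers, and 5120 bytes of shared memory per block.
    result = block_limits(8, 10240, 5120)
    print(result["limits"])      # {'warps_or_blocks': 8, 'registers': 6, 'shared_memory': 12}
    print(result["limited_by"])  # registers
    print(result["occupancy"])   # 0.75
```

Here registers are the binding constraint: only 6 blocks fit, so 48 of the assumed 64 warp slots are occupied. Reducing register use per thread would be the first thing to try for this hypothetical kernel.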