Streaming multiprocessors Thread block
An illustration of a streaming multiprocessor (SM) and its resources.
To achieve its purpose, an SM contains the following:
Execution cores (single-precision floating-point units, double-precision floating-point units, and special function units (SFUs)).
Caches.
Schedulers for warps. (These issue instructions to warps based on particular scheduling policies.)
A substantial number of registers. (An SM may be running a large number of active threads at a time, so it must have registers in the thousands.)
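The register requirement can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the two constants are assumed Fermi-class figures that do not come from the text above, and `registers_needed` is a hypothetical helper name.

```python
# Assumed Fermi-class figures, for illustration only (not from the text above).
MAX_RESIDENT_THREADS = 1536   # threads that can be active on one SM
REGISTER_FILE_SIZE = 32768    # 32-bit registers in one SM's register file


def registers_needed(active_threads, regs_per_thread):
    """Registers an SM must provide so every active thread keeps its own set."""
    return active_threads * regs_per_thread


# Even a modest 20 registers per thread adds up quickly at full load.
demand = registers_needed(MAX_RESIDENT_THREADS, 20)
print(demand)                        # 30720
print(demand <= REGISTER_FILE_SIZE)  # True
```

Under these assumptions, a fully loaded SM already needs tens of thousands of registers, which is why the register file is so large.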
The hardware schedules thread blocks onto an SM. In general, an SM can handle multiple thread blocks at the same time. An SM may contain up to 8 thread blocks in total. A thread ID is assigned to each thread by its respective SM.
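As a rough sketch of how the block limit interacts with block size, the function below estimates how many thread blocks one SM can hold at once. The limit of 8 blocks comes from the text. The limit of 1,536 resident threads is an assumption, and `resident_blocks` is a hypothetical name. Register and shared-memory limits, which also matter on real hardware, are deliberately ignored here.

```python
def resident_blocks(threads_per_block,
                    max_blocks_per_sm=8,       # limit mentioned in the text
                    max_threads_per_sm=1536):  # assumed value, for illustration
    """Upper bound on thread blocks one SM can hold at the same time."""
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    # Blocks that fit if only the resident-thread limit mattered.
    by_threads = max_threads_per_sm // threads_per_block
    return min(max_blocks_per_sm, by_threads)


print(resident_blocks(128))   # 8  (capped by the block limit)
print(resident_blocks(512))   # 3  (capped by the thread limit)
```

Small blocks hit the block-count cap first, while large blocks are limited by the number of threads the SM can keep resident.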
Whenever an SM executes a thread block, all the threads inside the thread block are executed at the same time. Hence, to free the memory of a thread block inside the SM, it is critical that the entire set of threads in the block have concluded execution. Each thread block is divided into scheduled units known as warps. These are discussed in detail in the following section.
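The division of a block into warps can be sketched as simple integer arithmetic. This is a minimal illustration, assuming a warp size of 32 threads (the size used on NVIDIA GPUs) and a linear thread index within the block. The helper names are hypothetical.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def warps_per_block(threads_per_block):
    """Number of warps needed to cover a block (the last one may be partial)."""
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE


def warp_and_lane(thread_id):
    """Map a linear thread index in the block to (warp index, lane in warp)."""
    return divmod(thread_id, WARP_SIZE)


print(warps_per_block(256))  # 8
print(warps_per_block(100))  # 4  (the fourth warp has only 4 active threads)
print(warp_and_lane(70))     # (2, 6)
```

A block whose size is not a multiple of 32 still occupies a whole number of warps, so the last warp runs with some lanes idle.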
An illustration of a double warp scheduler implemented in the Fermi micro-architecture of Nvidia.
The warp scheduler of an SM decides which warp gets prioritized during the issuance of instructions.
Some of the warp prioritizing policies are discussed in the following sections.
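To give a feel for what a prioritizing policy does, here is a minimal sketch of one simple option, round-robin, in which ready warps take turns issuing one instruction each. Round-robin is chosen here purely as an illustration and is not claimed to be the policy Fermi uses. The function name and the input format are hypothetical.

```python
from collections import deque


def round_robin_issue(ready):
    """Return the order in which warps issue instructions.

    `ready` maps a warp id to the number of instructions it still has to
    issue. Each turn, the warp at the head of the queue issues one
    instruction and, if it has more work, moves to the back of the queue.
    """
    queue = deque(sorted(ready))   # warp ids in ascending order
    remaining = dict(ready)        # copy so the caller's dict is untouched
    order = []
    while queue:
        warp = queue.popleft()
        if remaining[warp] == 0:
            continue               # nothing to issue for this warp
        order.append(warp)
        remaining[warp] -= 1
        if remaining[warp] > 0:
            queue.append(warp)
    return order


print(round_robin_issue({0: 2, 1: 1, 2: 3}))  # [0, 1, 2, 0, 2, 2]
```

Other policies differ only in how the next warp is picked from the set of ready warps, for example by age or by how recently a warp was last fetched.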