GTX 680, Kepler, and maximum registers per thread

I am asking the following question because I am confused...

I find claims on various sites and in articles that the Kepler architecture has increased the number of registers per thread, but on my GTX 680 that does not seem to be the case: RegsPerBlock is 65536, so for 1024 threads that works out to 64 registers per thread. What am I missing? Will there be more registers per thread in the future?

Regards, Daniel

2 answers

There are two variants of the Kepler architecture: sm_30 and sm_35. The GTX 680 is based on the GK104 GPU, which implements the sm_30 architecture. This architecture has 64 registers per thread, of which 63 are available to user code and one is the dedicated zero register. Future parts based on GK110, such as the K20, will implement the sm_35 architecture, which provides 256 registers per thread, of which 255 are available to user code (one is again the dedicated zero register).
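To make the numbers in this answer concrete, here is a minimal Python sketch of the arithmetic. The table values are the ones stated above; the function names are hypothetical and chosen only for illustration. It also shows that the 64 obtained in the question by dividing RegsPerBlock by 1024 happens to coincide with the sm_30 per-thread limit, although the two figures come from different constraints.

```python
# Per-thread register counts as stated in the answer (includes the zero register).
REGISTERS_PER_THREAD = {"sm_30": 64, "sm_35": 256}
ZERO_REGISTERS = 1  # one dedicated zero register per thread


def usable_registers(arch):
    """Registers per thread available to user code (hypothetical helper)."""
    return REGISTERS_PER_THREAD[arch] - ZERO_REGISTERS


def naive_registers_per_thread(regs_per_block, threads_per_block):
    """The division from the question: register file size / block size."""
    return regs_per_block // threads_per_block


print(usable_registers("sm_30"))                # 63
print(usable_registers("sm_35"))                # 255
print(naive_registers_per_thread(65536, 1024))  # 64
```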

In addition to what @njuffa wrote, it is important to note that the maximum number of registers per thread does not necessarily equal (register file size / maximum number of threads per block). It may be that you can only use the maximum allowed registers per thread with smaller thread blocks.

... and in fact, that is exactly the case with Kepler CC 3.5 cards, and with Maxwell 5.x and Pascal 6.0: the register file has 64 Ki registers; the maximum number of threads per block is 1024; but the maximum number of registers per thread is 255 (plus the zero register). Only grids with at most 256 threads per block get 255 registers per thread.
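The trade-off between block size and registers per thread can be sketched in Python as follows. This is a rough estimate, not the driver's actual allocation logic. The 64 Ki register file and the 1024-thread cap come from the answer; the per-thread rounding to a multiple of 8 and the rounding of the block size down to whole 32-thread warps are simplifying assumptions, and the function name is hypothetical.

```python
REGISTER_FILE = 64 * 1024      # registers available to one block (from the answer)
MAX_THREADS_PER_BLOCK = 1024   # hardware cap on block size (from the answer)
WARP_SIZE = 32
ALLOC_GRANULARITY = 8          # assumption: per-thread allocation rounds up to 8


def max_threads_per_block(regs_per_thread):
    """Estimate the largest block that fits when each thread uses regs_per_thread."""
    # Round the per-thread register count up to the assumed allocation granularity.
    allocated = -(-regs_per_thread // ALLOC_GRANULARITY) * ALLOC_GRANULARITY
    # How many threads fit in the register file, in whole warps.
    threads = REGISTER_FILE // allocated
    threads = (threads // WARP_SIZE) * WARP_SIZE
    return min(threads, MAX_THREADS_PER_BLOCK)


for regs in (32, 64, 128, 255):
    print(regs, max_threads_per_block(regs))
```

Under these assumptions, 64 registers per thread still allows the full 1024 threads per block, while 255 registers per thread limits the block to 256 threads, which matches the figure quoted above.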

Source: https://habr.com/ru/post/1442202/

