This is possible on pretty much every shader model (especially if you are targeting SM4+). I would not recommend making SM3 a hard requirement if you want any market penetration. I still regret that we did not ship an SM2 fallback for our last game, because quite a lot of people still use crappy old SM2 cards.
On to the question. You can use render-to-texture (RTT) and never go back to main memory (readback is slow as hell, so minimize transfers from graphics memory to main memory). The downside is that if you go pure GPU you need some rather complicated tricks to calculate the AABB, which you want on the CPU side.
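To make the AABB point concrete: when particle positions already live in CPU memory, the bounds are a single linear pass. This is only a minimal sketch; `Vec3`, `Aabb` and `computeAabb` are hypothetical names chosen for illustration, not code from our engine.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical minimal types, for illustration only.
struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

// Computes the bounds of all particle positions in one linear pass.
// Precondition: positions is non-empty.
inline Aabb computeAabb(const std::vector<Vec3>& positions) {
    assert(!positions.empty());
    const float inf = std::numeric_limits<float>::infinity();
    Aabb box{{inf, inf, inf}, {-inf, -inf, -inf}};
    for (const Vec3& p : positions) {
        box.min.x = std::min(box.min.x, p.x);
        box.min.y = std::min(box.min.y, p.y);
        box.min.z = std::min(box.min.z, p.z);
        box.max.x = std::max(box.max.x, p.x);
        box.max.y = std::max(box.max.y, p.y);
        box.max.z = std::max(box.max.z, p.z);
    }
    return box;
}
```

In practice you would probably also pad the box by the largest particle radius, since this only bounds the particle centers.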
Instead, we do everything that requires changing particle state on the CPU side. We then keep a compact in-memory representation of that data, which is uploaded to the GPU. The vertex shader is quite beefy (which is completely fine: do as much as possible in the vertex shader!). It unpacks the compressed particle representation, transforms it, and passes the uncompressed data on to the pixel shader. The important point here is that you can and should split the data into per-vertex and per-particle streams. That implies instancing (which is just another way of saying: use stream frequency dividers). We represent a particle's orientation as a normal plus a rotation around that normal.
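As a rough illustration of what such a compact per-particle record could look like, here is a sketch that stores the normal as three signed-normalized bytes and the rotation around it as one byte. The layout, the field names and the 8-bit quantization are all assumptions made for the example, not our actual format; the unpack helpers are written in C++ only to mirror what the vertex shader would do on its side.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr float kTwoPi = 6.28318530718f;

// Hypothetical per-instance record: 16 bytes, no padding.
struct PackedParticle {
    float        position[3];  // world-space center
    std::int8_t  normal[3];    // signed-normalized, 127 maps to 1.0
    std::uint8_t angle;        // rotation about the normal, 256 steps per turn
};
static_assert(sizeof(PackedParticle) == 16, "record should stay tightly packed");

// Quantizes a value in [-1, 1] to a signed byte (out-of-range input is clamped).
inline std::int8_t packSnorm8(float v) {
    const float clamped = std::max(-1.0f, std::min(1.0f, v));
    return static_cast<std::int8_t>(std::lround(clamped * 127.0f));
}

// Inverse of packSnorm8; -128 is mapped to -1.0 as well.
inline float unpackSnorm8(std::int8_t q) {
    return std::max(static_cast<float>(q) / 127.0f, -1.0f);
}

// Wraps an angle in radians into [0, 2*pi) and quantizes it to 8 bits.
inline std::uint8_t packAngle(float radians) {
    float wrapped = std::fmod(radians, kTwoPi);
    if (wrapped < 0.0f) wrapped += kTwoPi;
    return static_cast<std::uint8_t>(std::lround(wrapped / kTwoPi * 256.0f) & 0xFF);
}

// Inverse of packAngle, returning radians in [0, 2*pi).
inline float unpackAngle(std::uint8_t q) {
    return static_cast<float>(q) * (kTwoPi / 256.0f);
}
```

With 8 bits the angle is accurate to about 0.7 degrees, which is usually fine for billboards; a 16-bit field would be the obvious alternative if that turns out to be too coarse.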
Another reason for doing the state changes on the CPU is that it is much harder to compose behaviors on the GPU. Any half-decent particle system needs a few knobs to be able to create interesting particle effects.
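One way to picture composable behaviors on the CPU is a chain of small affectors, each with its own knob, applied to the whole batch in turn. The sketch below is an assumption about how this could be structured, with hypothetical names (`ParticleState`, `Affector`, `makeGravity`, `makeDrag`, `runAffectors`); it is not how our system is actually written.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical mutable state that the affectors operate on.
struct ParticleState { float vx, vy, vz; };

// An affector processes the whole batch in one call, so the indirect
// call is paid once per batch rather than once per particle.
using Affector = std::function<void(std::vector<ParticleState>&, float)>;

// Knob: acceleration along Y, in units per second squared.
inline Affector makeGravity(float g) {
    return [g](std::vector<ParticleState>& ps, float dt) {
        for (ParticleState& p : ps) p.vy += g * dt;
    };
}

// Knob: linear drag coefficient; velocity is scaled by max(0, 1 - k * dt).
inline Affector makeDrag(float k) {
    return [k](std::vector<ParticleState>& ps, float dt) {
        const float scale = std::max(0.0f, 1.0f - k * dt);
        for (ParticleState& p : ps) {
            p.vx *= scale;
            p.vy *= scale;
            p.vz *= scale;
        }
    };
}

// Applies every affector in order to the same batch.
inline void runAffectors(const std::vector<Affector>& chain,
                         std::vector<ParticleState>& ps, float dt) {
    assert(dt >= 0.0f);
    for (const Affector& a : chain) a(ps, dt);
}
```

An effect is then just a list such as `{makeGravity(-9.8f), makeDrag(0.3f)}`, and adding a new knob means adding one more small function.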
EDIT: And if you have something like Particle::Update that cannot be inlined, you have failed. Minimize calls to per-particle functions, especially virtual ones, and keep the in-memory particle layout tightly packed!
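For what it is worth, here is a minimal sketch of what that can look like: a plain-data particle with no virtuals, stored contiguously, and updated in a single loop whose body the compiler can inline. The struct fields, the swap-remove policy and the name `updateParticles` are assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-data particle: 8 floats, no virtuals, no padding.
struct Particle {
    float px, py, pz;
    float vx, vy, vz;
    float age;
    float lifetime;
};
static_assert(sizeof(Particle) == 32, "particle should stay tightly packed");

// Advances the whole batch by dt. Dead particles are overwritten by the
// last element so the array stays dense. Returns the number of survivors.
inline std::size_t updateParticles(std::vector<Particle>& ps, float dt, float gravityY) {
    assert(dt >= 0.0f);
    std::size_t i = 0;
    while (i < ps.size()) {
        Particle& p = ps[i];
        p.age += dt;
        if (p.age >= p.lifetime) {
            p = ps.back();   // move the last particle into this slot
            ps.pop_back();
            continue;        // slot i now holds an unprocessed particle
        }
        p.vy += gravityY * dt;  // velocity first, then position
        p.px += p.vx * dt;
        p.py += p.vy * dt;
        p.pz += p.vz * dt;
        ++i;
    }
    return ps.size();
}
```

The whole update is one function call per batch, and the loop walks memory linearly, which is the main thing you are after here.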