OpenGL ES 2.0: optimizing 2D texture output and frame rate

I was hoping someone could help me make progress with some texture benchmarks I am running in OpenGL ES 2.0 on an iPhone 4.

I have an array containing sprite objects. The render loop cycles through all the sprites for each texture and collects their texture coordinates and vertex coordinates. It adds them to one giant interleaved array, using degenerate vertices and indices, and sends them to the GPU (code is included below). All of this is done per texture, so I bind the texture once, build my triangle strip array, and then draw it. Everything works fine, and the results on screen are exactly what they should be.

My benchmark adds 25 new sprites on each touch, each with a different opacity, and changes their vertices in the update step so that they bounce around the screen while rotating. I run the app under OpenGL ES Analyzer while doing this.

Here is where I am hoping for some help. I can get about 275 32x32 sprites with varying opacity bouncing around the screen at 60 frames per second. By 400 I am down to 40 frames per second. When I run the OpenGL ES Performance Detective, it tells me:

"The app's performance is bound by triangle rasterization, the process of converting triangles into pixels. The total area in pixels of all the triangles being rendered is too large. To get a faster frame rate, simplify your scene by reducing the number of triangles, their size, or both."

The thing is, I just whipped up a test in cocos2D using CCSpriteBatchNode with the same texture, created 800 transparent sprites, and got a frame rate of 60 frames per second.

Here is some code that may be relevant:

Shader.vsh (the matrices are set once at the beginning)

    void main() {
        gl_Position = projectionMatrix * modelViewMatrix * position;
        texCoordOut = texCoordIn;
        colorOut = colorIn;
    }

Shader.fsh (colorOut is used to calculate opacity)

    void main() {
        lowp vec4 fColor = texture2D(texture, texCoordOut);
        gl_FragColor = vec4(fColor.xyz, fColor.w * colorOut.a);
    }

VBO setup

    glGenBuffers(1, &_vertexBuf);
    glGenBuffers(1, &_indiciesBuf);
    glGenVertexArraysOES(1, &_vertexArray);
    glBindVertexArrayOES(_vertexArray);

    glBindBuffer(GL_ARRAY_BUFFER, _vertexBuf);
    glBufferData(GL_ARRAY_BUFFER, sizeof(TDSEVertex)*12000, &vertices[0].x, GL_DYNAMIC_DRAW);

    glEnableVertexAttribArray(GLKVertexAttribPosition);
    glVertexAttribPointer(GLKVertexAttribPosition, 2, GL_FLOAT, GL_FALSE, sizeof(TDSEVertex), BUFFER_OFFSET(0));
    glEnableVertexAttribArray(GLKVertexAttribTexCoord0);
    glVertexAttribPointer(GLKVertexAttribTexCoord0, 2, GL_FLOAT, GL_FALSE, sizeof(TDSEVertex), BUFFER_OFFSET(8));
    glEnableVertexAttribArray(GLKVertexAttribColor);
    glVertexAttribPointer(GLKVertexAttribColor, 4, GL_FLOAT, GL_FALSE, sizeof(TDSEVertex), BUFFER_OFFSET(16));

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, _indiciesBuf);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(ushort)*12000, indicies, GL_STATIC_DRAW);

    glBindVertexArrayOES(0);

Update code

    /*
       Here it cycles through all the sprites, gets their vert info
       (includes coords, texture coords, and color) and adds them to this giant array.
       The array is of...
       typedef struct {
           float x, y;
           float tx, ty;
           float r, g, b, a;
       } TDSEVertex;
    */
    glBindBuffer(GL_ARRAY_BUFFER, _vertexBuf);
    //glBufferSubData(GL_ARRAY_BUFFER, sizeof(vertices[0])*(start), sizeof(TDSEVertex)*(indicesCount), &vertices[start]);
    glBufferData(GL_ARRAY_BUFFER, sizeof(TDSEVertex)*indicesCount, &vertices[start].x, GL_DYNAMIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);

Render code

    GLKTextureInfo* textureInfo = [[TDSETextureManager sharedTextureManager].textures objectForKey:textureName];
    glBindTexture(GL_TEXTURE_2D, textureInfo.name);
    glBindVertexArrayOES(_vertexArray);
    glDrawElements(GL_TRIANGLE_STRIP, indicesCount, GL_UNSIGNED_SHORT, BUFFER_OFFSET(start));
    glBindVertexArrayOES(0);

Here is a screenshot at 400 sprites (800 triangles + 800 degenerate triangles) to give an idea of how the opacity layers up as the textures move. Again, note that one VBO is built and sent per texture, so I only bind and draw twice per frame (since there are only two textures).

screenshot showing layering of sprites

Sorry if this is overwhelming, but it is my first post here and I wanted to be thorough. Any help would be greatly appreciated.

PS: I know I could just use Cocos2D instead of writing everything from scratch, but where is the fun (and the learning) in that?!

UPDATE #1: When I reduce my fragment shader to only

    gl_FragColor = texture2D(texture, texCoordOut);

it gets up to 80 sprites at 50 frames per second (4804 triangles, including degenerate triangles), although per-sprite opacity is lost. Any suggestions on how I can still handle opacity in my shader without running at 1/4 of the speed?

UPDATE #2: So I dropped the GLKit view and view controller and wrote a custom view loaded from the AppDelegate. 902 sprites with opacity and transparency at 60 frames per second.

1 answer

Mostly miscellaneous thoughts:

If you are triangle bound, try switching from GL_TRIANGLE_STRIP to GL_TRIANGLES. You will still specify exactly the same number of indices, six per quad, but the GPU never has to spot that the connecting triangles between quads are degenerate (that is, it never has to rasterize them into zero pixels). You will need to profile to find out whether you end up paying a cost for no longer implicitly sharing edges.
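As a rough illustration of the GL_TRIANGLES layout, here is a minimal sketch of how such an index list could be generated. The function name `build_quad_indices` and the assumption that each quad owns four consecutive vertices in strip order (bottom-left, bottom-right, top-left, top-right) are hypothetical and not taken from the question's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `out` with six indices per quad, suitable for GL_TRIANGLES.
 * Assumption: quad q owns vertices 4q .. 4q+3 in strip order, so the two
 * triangles are (0,1,2) and (2,1,3) relative to the quad's first vertex.
 * `out` must have room for quad_count * 6 entries.
 * Returns the number of indices written. */
size_t build_quad_indices(uint16_t *out, size_t quad_count)
{
    /* 16-bit indices can address at most 65536 vertices. */
    assert(out != NULL);
    assert(quad_count <= 65536 / 4);

    for (size_t q = 0; q < quad_count; ++q) {
        uint16_t base = (uint16_t)(q * 4);
        uint16_t *dst = out + q * 6;
        dst[0] = base;
        dst[1] = (uint16_t)(base + 1);
        dst[2] = (uint16_t)(base + 2);
        dst[3] = (uint16_t)(base + 2);
        dst[4] = (uint16_t)(base + 1);
        dst[5] = (uint16_t)(base + 3);
    }
    return quad_count * 6;
}
```

The resulting array would be uploaded with glBufferData on GL_ELEMENT_ARRAY_BUFFER and drawn with glDrawElements using GL_TRIANGLES and GL_UNSIGNED_SHORT. No degenerate vertices are needed between quads.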

You should also reduce the size of your vertices. I would dare to suggest that you can specify x, y, tx and ty as 16-bit integers, and your colors as 8-bit integers, without any noticeable change in rendering. That would cut the size of each vertex from 32 bytes (eight components, each four bytes in size) to 12 bytes (four two-byte values plus four single-byte values, with no padding needed because everything is already aligned), a reduction of almost 63% in memory bandwidth cost there.
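A minimal sketch of what such a 12-byte vertex might look like follows. The names `PackedVertex` and `pack_vertex` are hypothetical, and the sketch assumes positions are whole pixels and that texture coordinates and colors lie in 0.0..1.0, so they can be stored as normalized integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compact replacement for the 32-byte all-float vertex. */
typedef struct {
    int16_t  x, y;        /* position, whole pixels (assumption)          */
    uint16_t tx, ty;      /* texture coords: 0..65535 stands for 0.0..1.0 */
    uint8_t  r, g, b, a;  /* color: 0..255 stands for 0.0..1.0            */
} PackedVertex;

_Static_assert(sizeof(PackedVertex) == 12, "vertex should pack to 12 bytes");

/* Clamp to 0..1, then scale to the integer range with rounding. */
static float clamp01(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }
static uint8_t  unit_to_u8(float v)  { return (uint8_t)(clamp01(v) * 255.0f + 0.5f); }
static uint16_t unit_to_u16(float v) { return (uint16_t)(clamp01(v) * 65535.0f + 0.5f); }

/* Round a pixel coordinate to the nearest 16-bit integer. */
static int16_t pixel_to_i16(float v)
{
    assert(v >= -32768.0f && v <= 32767.0f);
    return (int16_t)(v < 0.0f ? v - 0.5f : v + 0.5f);
}

PackedVertex pack_vertex(float x, float y, float tx, float ty,
                         float r, float g, float b, float a)
{
    PackedVertex v;
    v.x = pixel_to_i16(x);
    v.y = pixel_to_i16(y);
    v.tx = unit_to_u16(tx);
    v.ty = unit_to_u16(ty);
    v.r = unit_to_u8(r);
    v.g = unit_to_u8(g);
    v.b = unit_to_u8(b);
    v.a = unit_to_u8(a);
    return v;
}
```

With this layout the attribute pointers would use GL_SHORT for position (offset 0), GL_UNSIGNED_SHORT with normalization enabled for texture coordinates (offset 4), and GL_UNSIGNED_BYTE with normalization enabled for color (offset 8), so the shaders keep receiving floats. If sub-pixel motion matters, positions could instead be stored in fixed point and scaled in the vertex shader.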

Since you actually seem to be fill-rate bound, you should also look at your source texture. Anything you can do to trim its byte size will directly help texel fetches and therefore fill rate.

It sounds like you are using art that is deliberately pixel-precise, so switching to PVR is probably not an option. However, people sometimes do not appreciate the full benefit of PVR textures: if you switch to, say, the 4 bits per pixel mode, you can scale your image up to twice as wide and twice as tall to reduce compression artifacts and still pay only 16 bits per source pixel, while probably getting a better luminance range than a 16 bpp RGB texture.
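The arithmetic behind the "16 bits per source pixel" figure can be checked with a small sketch. The helper `texture_bytes` is hypothetical; it only multiplies out width, height and bit depth, and it ignores mipmaps and any minimum-size or padding rules a real compressed format may impose.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a single level of a texture at a given bit depth. */
size_t texture_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return width * height * bits_per_pixel / 8;
}

/* Effective cost in bits per pixel of the ORIGINAL image when it is stored
 * at a different (for example, upscaled) resolution and bit depth. */
size_t bits_per_source_pixel(size_t src_w, size_t src_h,
                             size_t stored_w, size_t stored_h,
                             size_t stored_bpp)
{
    assert(src_w > 0 && src_h > 0);
    return texture_bytes(stored_w, stored_h, stored_bpp) * 8 / (src_w * src_h);
}
```

For a 32x32 sprite, 32 bpp costs 4096 bytes, while a 64x64 upscale at 4 bpp costs 2048 bytes, the same as the original 32x32 image at 16 bpp.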

Assuming you are currently using a 32 bpp texture, you should at least check whether a plain 16 bpp RGB texture is sufficient, using any of the hardware modes provided (especially if 1 bit of alpha plus 5 bits per color channel suits your art, since that loses only 9 bits of color information compared with the original while cutting bandwidth cost by 50%).
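To try the 5-5-5-1 mode, 8-bit RGBA pixels have to be repacked before upload. The following is a minimal sketch; the function names are hypothetical, the alpha threshold of 128 is an arbitrary choice, and the bit layout (red in the top five bits, alpha in the lowest bit) is the one used by GL_UNSIGNED_SHORT_5_5_5_1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack one RGBA8888 pixel into 16 bits: RRRRRGGG GGBBBBBA.
 * Each color channel keeps its top 5 bits; alpha becomes on/off. */
uint16_t pack_rgba5551(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return (uint16_t)(((r >> 3) << 11) |
                      ((g >> 3) << 6)  |
                      ((b >> 3) << 1)  |
                      (a >= 128 ? 1 : 0));
}

/* Convert a tightly packed RGBA8888 image (4 bytes per pixel) into
 * an array of 16-bit 5551 pixels of the same dimensions. */
void convert_rgba8888_to_5551(const uint8_t *src, uint16_t *dst, size_t pixel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t *p = src + i * 4;
        dst[i] = pack_rgba5551(p[0], p[1], p[2], p[3]);
    }
}
```

The converted data would then be uploaded through glTexImage2D with format GL_RGBA and type GL_UNSIGNED_SHORT_5_5_5_1. Because alpha is reduced to a single bit, soft sprite edges become hard ones; for art that needs smooth edges, the 4-4-4-4 mode may be the better 16 bpp candidate.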

It also looks like you are uploading your indices every single frame. Upload only when you add extra objects to the scene, or when the buffer as last uploaded is far larger than it needs to be. You can simply limit the count passed to glDrawElements to drop objects without re-uploading. You should also check whether you actually gain anything by uploading your vertices to a VBO and then using them from there if they change every frame anyway. It may be faster to supply them directly from client memory.
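One way to picture the "upload rarely, limit the count" idea is the sketch below. The `IndexBufferState` type and both helpers are hypothetical bookkeeping, and the sketch assumes six indices per quad and an index buffer whose contents do not depend on which sprites are active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Tracks how many quads the index buffer currently on the GPU covers. */
typedef struct {
    size_t uploaded_quads;
} IndexBufferState;

/* True only when the scene has outgrown the uploaded index buffer,
 * which is the one case where the indices must be sent again. */
bool needs_index_upload(const IndexBufferState *state, size_t active_quads)
{
    return active_quads > state->uploaded_quads;
}

/* Index count to pass to glDrawElements for the active quads.
 * Drawing fewer quads than were uploaded needs no re-upload at all. */
size_t draw_index_count(const IndexBufferState *state, size_t active_quads)
{
    assert(active_quads <= state->uploaded_quads);
    return active_quads * 6;
}
```

In a frame loop, `needs_index_upload` would gate the glBufferData call on GL_ELEMENT_ARRAY_BUFFER, and the value from `draw_index_count` would be the count argument of glDrawElements.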


Source: https://habr.com/ru/post/1402350/

