Array of structures or structure of arrays

I am using a 2011 Mac mini with an AMD Radeon HD 6630M. I draw a mesh using a structure of arrays, and all is well: 60 frames per second (using CVDisplayLink). I use a shader with built-in attributes. Life is good. I then switch to an array of structures (interleaved), because I understand that this is preferable for "modern" GPUs. The attributes are defined in the shader. The mesh is drawn beautifully, but the frame rate drops by about 33% (to 40 frames per second). And there are several copies of these calls. Using Instruments (Time Profiler), I get the following comparison:

Using structure of arrays (60 fps)

    Running Time      Self    Symbol Name
    3.0ms    0.0%     3.0     0x21b76c4                          ATIRadeonX3000GLDriver
    2.0ms    0.0%     0.0     gldUpdateDispatch                  ATIRadeonX3000GLDriver
    2.0ms    0.0%     0.0     gleDoDrawDispatchCore              GLEngine
    2.0ms    0.0%     0.0     glDrawElements_ACC_Exec            GLEngine
    2.0ms    0.0%     0.0     glDrawElements                     libGL.dylib
    2.0ms    0.0%     0.0     -[Mesh draw]                       me

Using array of structures (40 fps)

    Running Time      Self    Symbol Name
    393.0ms  7.4%     393.0   0x86f6695                          ?
    393.0ms  7.4%     0.0     gleDrawArraysOrElements_ExecCore   GLEngine
    393.0ms  7.4%     0.0     glDrawElements_IMM_Exec            GLEngine
    393.0ms  7.4%     0.0     glDrawElements                     libGL.dylib
    393.0ms  7.4%     0.0     -[Mesh draw]                       me

It seems that libGL takes a different path in each case, and with the array of structures it looks like the X3000 driver is never called. Is this being executed in Apple's software emulation? Should I just stay with the structure of arrays? Has anyone seen something like this?
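For readers comparing the two layouts, here is a minimal sketch in C of what each term means for vertex data. The component counts (3 position, 3 normal, 2 texture coordinate floats) and the names `MeshSoA`, `VertexAoS`, and `interleave` are assumptions made for illustration; the question does not show the actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical component counts; the question does not state them. */
enum { POS_SIZE = 3, NORM_SIZE = 3, TEX_SIZE = 2 };

/* Structure of arrays: one tightly packed array per attribute. */
struct MeshSoA {
    const float *positions;  /* count * POS_SIZE floats  */
    const float *normals;    /* count * NORM_SIZE floats */
    const float *texCoords;  /* count * TEX_SIZE floats  */
    size_t count;            /* number of vertices       */
};

/* Array of structures: all attributes of one vertex sit next to each other. */
struct VertexAoS {
    float position[POS_SIZE];
    float normal[NORM_SIZE];
    float texCoord[TEX_SIZE];
};

/* Copy the separate attribute arrays into one interleaved vertex array.
   dst must have room for src->count vertices. */
static void interleave(const struct MeshSoA *src, struct VertexAoS *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < src->count; ++i) {
        for (size_t k = 0; k < POS_SIZE; ++k)
            dst[i].position[k] = src->positions[i * POS_SIZE + k];
        for (size_t k = 0; k < NORM_SIZE; ++k)
            dst[i].normal[k] = src->normals[i * NORM_SIZE + k];
        for (size_t k = 0; k < TEX_SIZE; ++k)
            dst[i].texCoord[k] = src->texCoords[i * TEX_SIZE + k];
    }
}
```

With this layout, each interleaved vertex occupies eight consecutive floats, so the stride passed to the attribute pointer calls would be `sizeof(struct VertexAoS)`.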


The attribute code comes from an Apple example and is used throughout my application (in at least 10 other places) without any performance problems there. The code below is from the slow version. As I already mentioned, I use the built-in attributes in the fast version, since its data is not interleaved. Everything renders correctly, just slowly.

Hope this is what you are looking for:

    // Step 5 - Bind each of the vertex shader attributes to the programs
    [self.meshShader addAttribute:@"inPosition"];
    [self.meshShader addAttribute:@"inNormal"];
    [self.meshShader addAttribute:@"inTexCoord"];

    // Step 6 - Link the program
    if([[self meshShader] linkShader] == 0){
        self.posAttribute = [meshShader attributeIndex:@"inPosition"];
        self.normAttribute = [meshShader attributeIndex:@"inNormal"];
        self.texCoordAttribute = [meshShader attributeIndex:@"inTexCoord"];

    ...

    - (void) addAttribute:(NSString *)attributeName
    {
        if ([attributes containsObject:attributeName] == NO){
            [attributes addObject:attributeName];
            glBindAttribLocation(program, [attributes indexOfObject:attributeName], [attributeName UTF8String]);
        }
    }

Update: After further investigation:

1) I am using the dhpoWare modelObj loader (modified), and since it uses an interleaved array of structures, it shows the same kind of performance behaviour as my array of structures, just not as much. Perhaps I am misreading Instruments. The modelObj code calls glDrawElements_IMM_Exec; it also calls gleDoDrawDispatchCore in a loop. I'm not sure whether it just accumulates a batch of calls in glDrawElements_IMM_Exec and then flushes them through gleDoDrawDispatchCore. I do not know.

2) I think Instruments is having problems, since it shows GLEngine calling one of my unused internal 3DS object methods that has no external hooks. I double-checked by setting an Xcode breakpoint there, and it was never hit. I no longer use 3DS.

I think I will keep looking and may stumble on the answer. If someone could give me an opinion on whether an array of structures ought to work here, that would be appreciated.

SOLUTION: I added a VBO in front of the attribute setup, and everything is fine. The original source code came from the OpenGL ES 2.0 guide, and adding the VBO fixes my problem. The frame rate is back to 60 fps, with the driver calls taking 1 ms. Here is the code:

    glGenVertexArrays(1, &vaoName);
    glBindVertexArray(vaoName);

    // new - create VBO
    glGenBuffers(1, &vboName);
    glBindBuffer(GL_ARRAY_BUFFER, vboName);

    // Allocate and load position data into the VBO
    glBufferData(GL_ARRAY_BUFFER, sizeof(struct vertexAttribs) * self.numVertices, vertexAttribData, GL_STATIC_DRAW);
    // end of new

    NSUInteger vtxStride = sizeof(struct vertexAttribs);
    //GLfloat *vtxBuf = (GLfloat *)vertexAttribData;  // no longer use this
    GLfloat *vtxBuf = (GLfloat *)NULL;                 // use this instead

    glEnableVertexAttribArray(self.posAttribute);
    glVertexAttribPointer(self.posAttribute, VERTEX_POS_SIZE, GL_FLOAT, GL_FALSE, vtxStride, vtxBuf);
    vtxBuf += VERTEX_POS_SIZE;

    glEnableVertexAttribArray(self.normAttribute);
    glVertexAttribPointer(self.normAttribute, VERTEX_NORM_SIZE, GL_FLOAT, GL_FALSE, vtxStride, vtxBuf);
    vtxBuf += VERTEX_NORM_SIZE;

    glEnableVertexAttribArray(self.texCoordAttribute);
    glVertexAttribPointer(self.texCoordAttribute, VERTEX_TEX_SIZE, GL_FLOAT, GL_FALSE, vtxStride, vtxBuf);
    ...
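Once a VBO is bound, the last argument of glVertexAttribPointer is interpreted as a byte offset into the buffer rather than a client-memory address, which is why the code above starts from NULL. A possible alternative to incrementing a NULL pointer is to take the offsets from `offsetof`. The sketch below assumes a `struct vertexAttribs` with 3 position, 3 normal, and 2 texture coordinate floats; the real definition is not shown in the post, and `AttribLayout` and `describeLayout` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the post does not show struct vertexAttribs. */
struct vertexAttribs {
    float position[3];
    float normal[3];
    float texCoord[2];
};

/* What one glVertexAttribPointer call needs to know about an attribute. */
struct AttribLayout {
    int    components;  /* number of floats in the attribute          */
    size_t stride;      /* bytes from one vertex to the next          */
    size_t offset;      /* byte offset of the attribute in the vertex */
};

enum { ATTRIB_POSITION, ATTRIB_NORMAL, ATTRIB_TEXCOORD, ATTRIB_COUNT };

/* Fill in the layout of each attribute of the interleaved vertex.
   Each entry would then be passed as
   glVertexAttribPointer(index, components, GL_FLOAT, GL_FALSE,
                         (GLsizei)stride, (const GLvoid *)offset). */
static void describeLayout(struct AttribLayout out[ATTRIB_COUNT])
{
    assert(out != NULL);
    const size_t stride = sizeof(struct vertexAttribs);
    out[ATTRIB_POSITION] = (struct AttribLayout){
        3, stride, offsetof(struct vertexAttribs, position)};
    out[ATTRIB_NORMAL] = (struct AttribLayout){
        3, stride, offsetof(struct vertexAttribs, normal)};
    out[ATTRIB_TEXCOORD] = (struct AttribLayout){
        2, stride, offsetof(struct vertexAttribs, texCoord)};
}
```

Taking the offsets from the struct definition keeps them correct if a field is later added or reordered, whereas the running pointer has to be updated by hand.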
1 answer

Using a structure of arrays to get unit-stride memory access is a rule of thumb. It applies not only to GPUs, but also to CPUs and coprocessors such as the Intel Xeon Phi.

In your case, I do not believe that this part of the code is being sent to the GPU; instead, the performance loss comes from memory access that is not unit-stride (CPU to/from memory).
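To make "unit stride" concrete, here is a small sketch in C that reads one float per element at a configurable byte stride. The function names are hypothetical, and the example only illustrates the access pattern; it does not measure or reproduce the slowdown described in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sum `count` floats, reading one float every `strideBytes` bytes.
   strideBytes == sizeof(float) is unit-stride (tightly packed) access;
   a larger stride skips over the other attributes of each vertex. */
static float sumStrided(const void *base, size_t strideBytes, size_t count)
{
    assert(base != NULL && strideBytes >= sizeof(float));
    const unsigned char *bytes = base;
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        float value;
        memcpy(&value, bytes + i * strideBytes, sizeof value);
        total += value;
    }
    return total;
}

/* Bytes covered from the first float read to the end of the last one. */
static size_t bytesSpanned(size_t strideBytes, size_t count)
{
    return count == 0 ? 0 : (count - 1) * strideBytes + sizeof(float);
}
```

Reading four values from a packed array spans 16 bytes when `float` is 4 bytes, while reading the same four values from vertices that are eight floats wide spans 100 bytes, so the same loop touches more cache lines for the same amount of useful data.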


Source: https://habr.com/ru/post/1400766/

