Unexplained slowdown in OpenGL on OS X

I added a new GL rendering path to my engine that uses the core profile. Although it works fine on Windows and/or NVIDIA cards, it is about 10 times slower on OS X (3 frames per second instead of 30). Strangely, my compatibility-profile renderer works fine.

I captured some traces with Instruments and the OpenGL Profiler:

https://www.dropbox.com/sh/311fg9wu0zrarzm/31CGvUcf2q

These show that the application is wasting its time in glDrawRangeElements. I have tried the following things:

  • Use glDrawElements instead (no effect)
  • disabling shadow casting (no effect on speed)
  • disable some GL_DYNAMIC_DRAW buffers (no effect)
  • bind index buffer after VAO when drawing (no effect)
  • converted the indices to 2 bytes (no effect)
  • use GL_BGRA textures (no effect)
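
The experiments above all revolve around the same draw path. As a rough sketch of what is being profiled (the buffer and VAO variable names are illustrative, not from the original engine), the call looks something like:

```c
/* Sketch of the profiled core-profile draw path.
 * meshVAO, meshIBO, vertexCount and indexCount are hypothetical names. */
glBindVertexArray(meshVAO);

/* One of the experiments listed above: re-binding the index buffer
 * after the VAO, in case the VAO captured a stale binding. */
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, meshIBO);

glDrawRangeElements(GL_TRIANGLES,
                    0,                /* start: lowest index used  */
                    vertexCount - 1,  /* end: highest index used   */
                    indexCount,
                    GL_UNSIGNED_SHORT,
                    NULL);            /* offset into the bound IBO */
```

This requires a live GL context and loaded buffers, so it is only meant to clarify which call the traces point at.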

What I have not tried yet is aligning my vertices to a 16-byte boundary and/or converting the indices to 4 bytes, but seriously, if that were the problem, then why the hell does the standard allow it?

I create a context like this:

NSOpenGLPixelFormatAttribute attributes[] =
{
    NSOpenGLPFAColorSize, 24,
    NSOpenGLPFAAlphaSize, 8,
    NSOpenGLPFADepthSize, 24,
    NSOpenGLPFAStencilSize, 8,
    NSOpenGLPFADoubleBuffer,
    NSOpenGLPFAAccelerated,
    NSOpenGLPFANoRecovery,
    NSOpenGLPFAOpenGLProfile, NSOpenGLProfileVersion3_2Core,
    0
};

NSOpenGLPixelFormat* format = [[NSOpenGLPixelFormat alloc] initWithAttributes:attributes];
NSOpenGLContext* context = [[NSOpenGLContext alloc] initWithFormat:format shareContext:nil];

[self.view setOpenGLContext:context];
[context makeCurrentContext];

Tested on the following configurations:

  • Radeon 6630M, OS X 10.7.5
  • Radeon 6750M, OS X 10.7.5
  • GeForce GT 330M, OS X 10.8.3

Do you have any idea what I might be doing wrong? Again, it works fine with the compatibility profile (but that path doesn't use VAOs).

UPDATE: Reported to Apple.

UPDATE: Apple doesn't give a damn about the problem... Anyway, I created a small test program that actually performs fine. Comparing the call stacks in Instruments, I found that in the engine glDrawRangeElements goes through two calls:

  • gleDrawArraysOrElements_ExecCore
  • gleDrawArraysOrElements_Entries_Body

whereas in the test program only the second one is called. The first call does something like immediate-mode rendering (gleFlushPrimitivesTCLFunc, gleRunVertexSubmitterImmediate), so the slowdown is obvious.

2 answers

Finally, I was able to reproduce the slowdown. This is just crazy... It is clearly caused by calling glBindAttribLocation for the my_Position attribute. I ran some tests:

  • by default the location is 1 (as returned by glGetAttribLocation)
  • if I set it to zero, the problem does not occur
  • if I set it explicitly to 1, rendering becomes slow
  • if I set it to any larger number, it is slow again

Obviously, I am going to rewrite that (attribute binding) code. This is not a problem in my implementation; I also tested it with "normal" values.
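
To make the workaround concrete, here is a minimal sketch of the binding that avoided the slowdown in my tests. The shader object names are hypothetical; only the my_Position attribute name and the location values come from the tests above:

```c
/* Sketch: force the position attribute to location 0 before linking.
 * Locations other than 0 triggered the software fallback in my tests.
 * vertexShader and fragmentShader are assumed to be compiled already. */
GLuint program = glCreateProgram();
glAttachShader(program, vertexShader);
glAttachShader(program, fragmentShader);

/* Must be called BEFORE glLinkProgram to take effect. */
glBindAttribLocation(program, 0, "my_Position");

glLinkProgram(program);
```

Note that glBindAttribLocation only affects the next link, so re-linking is required if the program was already linked.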

Test program:

https://www.dropbox.com/s/dgg48g1fwgyc5h0/SLOWDOWN_REPRO.zip

How to reproduce:

  • open it with Xcode
  • open common/glext.h (don't be confused by the name)
  • change the GLDECLUSAGE_POSITION constant from 0 to 1
  • compile and run => slow
  • change it back to zero => fast

I hit and solved the same problem on OS X Mavericks, under the following circumstances:

  • instanced rendering using array buffers to give each instance its own modelToWorld and inverseNormal matrices; attribute locations are set with layout qualifiers rather than glGetAttribLocation

  • leaving one of these array buffers unused in the shader: the location is declared, but the attribute is never actually used in the GLSL code

In this case, the glDrawElementsInstanced call takes a lot of CPU time (under normal circumstances this call uses almost zero CPU, even when drawing several thousand instances).

You can tell you are hitting this problem if almost all of the CPU time spent in glDrawElementsInstanced goes to gleDrawArraysOrElements_ExecCore. Making sure every array buffer attribute is actually referenced in your shader code brings the CPU time back down to (almost) zero.

I suspect this is one of those situations where leaving a variable out of your main() in GLSL confuses the compiler into deleting every reference to that variable, leaving you with a dangling attribute or uniform.
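
A minimal sketch of the problematic shader pattern described above (all names here are illustrative, not from the answerer's code):

```glsl
#version 330 core

layout(location = 0) in vec4 position;
layout(location = 1) in mat4 modelToWorld;   // per-instance, via glVertexAttribDivisor
layout(location = 5) in mat3 inverseNormal;  // declared but NEVER read below:
                                             // the compiler may strip it, and the
                                             // still-bound array buffer then triggers
                                             // the slow gleDrawArraysOrElements_ExecCore path

uniform mat4 worldToClip;

void main()
{
    gl_Position = worldToClip * modelToWorld * position;
    // Fix: either actually use inverseNormal here, or stop setting up
    // its vertex attribute arrays on the CPU side.
}
```

Note the mat4/mat3 attributes each occupy several consecutive locations, which is why inverseNormal starts at location 5 in this sketch.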


Source: https://habr.com/ru/post/1479496/
