Should you take the GPU into account when coding? In principle, no: the OpenGL API is the layer between your application and the hardware.
This is largely true for desktop graphics, since all desktop GPUs are immediate renderers; however, it does not hold for mobile graphics.
Mali GPUs use tile-based deferred rendering. For this type of rendering, the framebuffer is divided into tiles of 16 by 16 pixels. The Polygon List Builder (PLB) organizes input from the application into polygon lists, one list per tile. When a primitive covers part of a tile, an entry called a polygon list command is added to that tile's polygon list. The pixel processor takes the polygon list for one tile and computes the values for all pixels in that tile before starting work on the next tile. Because this tile-based approach uses a fast on-chip tile buffer, the GPU only writes the tile buffer contents to the framebuffer in main memory at the end of each tile. Immediate-mode renderers that are not tile-based generally need many more framebuffer accesses. The tile-based method therefore consumes less memory bandwidth and efficiently supports operations such as depth testing, blending, and anti-aliasing.
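To make the binning step concrete, here is a minimal sketch in C of how a primitive could be assigned to 16 by 16 tiles. It is not how the PLB is actually implemented: it assumes a conservative bounding-box test, and the names `BBox`, `TileRange`, `tiles_along`, `tiles_for_bbox` and `tile_count` are made up for illustration.

```c
#define TILE_SIZE 16  /* tile edge in pixels, as described above */

/* Screen-space bounding box of a primitive, in pixels (inclusive). */
typedef struct { int min_x, min_y, max_x, max_y; } BBox;

/* Inclusive range of tile indices that a bounding box touches. */
typedef struct { int tx0, ty0, tx1, ty1; } TileRange;

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Number of tiles needed to cover `pixels` pixels along one axis. */
int tiles_along(int pixels)
{
    return (pixels + TILE_SIZE - 1) / TILE_SIZE;
}

/* Clamp the box to the framebuffer, then convert pixel coordinates
 * to tile indices. Every tile in the returned range would get a
 * polygon list command for this primitive. */
TileRange tiles_for_bbox(BBox b, int fb_width, int fb_height)
{
    TileRange r;
    r.tx0 = clampi(b.min_x, 0, fb_width  - 1) / TILE_SIZE;
    r.ty0 = clampi(b.min_y, 0, fb_height - 1) / TILE_SIZE;
    r.tx1 = clampi(b.max_x, 0, fb_width  - 1) / TILE_SIZE;
    r.ty1 = clampi(b.max_y, 0, fb_height - 1) / TILE_SIZE;
    return r;
}

/* How many per-tile polygon lists the primitive ends up in. */
int tile_count(TileRange r)
{
    return (r.tx1 - r.tx0 + 1) * (r.ty1 - r.ty0 + 1);
}
```

For a hypothetical 1280 by 800 framebuffer, a primitive whose bounding box spans pixels (10, 10) to (40, 20) lands in tiles 0 to 2 horizontally and 0 to 1 vertically, so it is added to six polygon lists.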
Another difference is the treatment of rendered buffers. Immediate renderers will "save" the contents of your buffer, effectively allowing you to draw only the differences in the rendered scene on top of what was already there. This is available on Mali; however, it is not enabled by default, as it can cause undesired effects if used incorrectly.
There is an example in the Mali GLES2 SDK showing how to use "EGL Preserve" correctly. It is available in the GLES2 SDK here
The reason the GeForce ULP-based Nexus 7 works as intended is that, as an immediate renderer, it preserves buffers by default, whereas Mali does not.
From the Khronos EGL specification:
EGL_SWAP_BEHAVIOR
Specifies the effect on the color buffer of posting a surface with eglSwapBuffers. A value of EGL_BUFFER_PRESERVED indicates that color buffer contents are unaffected, while EGL_BUFFER_DESTROYED indicates that color buffer contents may be destroyed or changed by the operation.
*The initial value of EGL_SWAP_BEHAVIOR is chosen by the implementation.*
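Since the initial value is up to the implementation, an application that relies on preserved contents should request them explicitly. The following is a minimal sketch in C using standard EGL 1.4 calls, not the Mali SDK sample itself; the helper name `request_preserved_swap` is made up for illustration, and it assumes the display, config and window surface have already been created.

```c
#include <EGL/egl.h>

/* Hypothetical helper: returns EGL_TRUE if the surface ends up with
 * EGL_BUFFER_PRESERVED swap behavior, EGL_FALSE otherwise. */
EGLBoolean request_preserved_swap(EGLDisplay display,
                                  EGLConfig config,
                                  EGLSurface surface)
{
    EGLint surface_type = 0;
    EGLint behavior = EGL_BUFFER_DESTROYED;

    /* The config must advertise support for preserved swaps. */
    if (!eglGetConfigAttrib(display, config, EGL_SURFACE_TYPE, &surface_type))
        return EGL_FALSE;
    if (!(surface_type & EGL_SWAP_BEHAVIOR_PRESERVED_BIT))
        return EGL_FALSE;

    /* Ask for the color buffer to survive eglSwapBuffers. */
    if (!eglSurfaceAttrib(display, surface,
                          EGL_SWAP_BEHAVIOR, EGL_BUFFER_PRESERVED))
        return EGL_FALSE;

    /* Read the attribute back rather than assuming it took effect. */
    if (!eglQuerySurface(display, surface, EGL_SWAP_BEHAVIOR, &behavior))
        return EGL_FALSE;

    return behavior == EGL_BUFFER_PRESERVED ? EGL_TRUE : EGL_FALSE;
}
```

If the helper returns EGL_FALSE, the safe fallback is to redraw the whole scene every frame instead of drawing only the differences.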
The default value of EGL_SWAP_BEHAVIOR on the Mali platform is EGL_BUFFER_DESTROYED. This is because preserving the buffer carries a performance hit: the previous buffer has to be fetched from memory before the new frame is rendered and then stored again at the end, which costs bandwidth (and is also very bad for battery life on mobile devices). I cannot comment with confidence on the default behavior of the Tegra SoC, but it seems clear to me that its default is EGL_BUFFER_PRESERVED.
To clarify Mali's position with regard to the Khronos GLES specifications: Mali is fully compliant.
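As a rough illustration of that cost, the sketch below estimates the extra read traffic caused by fetching the previous frame back from main memory. The function names are made up, and the figures used afterwards (a 1280 by 800 surface, 4 bytes per pixel, 60 frames per second) are assumptions chosen for the example, not measurements from any device.

```c
#include <stdint.h>

/* Bytes that must be read back per frame to restore the previous
 * color buffer contents (hypothetical helper). */
uint64_t preserve_read_bytes_per_frame(uint32_t width,
                                       uint32_t height,
                                       uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}

/* The same cost expressed per second at a given frame rate. */
uint64_t preserve_read_bytes_per_second(uint32_t width,
                                        uint32_t height,
                                        uint32_t bytes_per_pixel,
                                        uint32_t fps)
{
    return preserve_read_bytes_per_frame(width, height, bytes_per_pixel) * fps;
}
```

With the assumed figures this comes to 4,096,000 bytes per frame, or about 246 MB of additional reads per second, on top of the writes that happen in either mode.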