Efficient texture access for convolution in GLSL ES 1.1

Question

I am doing a convolution with a 3x3 kernel in an iPhone shader, GLSL ES 1.1. Currently, I am just doing 9 texture lookups. Is there a faster way? Some ideas:

Passing the input image as a buffer rather than a texture, to avoid invoking texture interpolation.
Passing nine varying vec2 coordinates from the vertex shader (rather than just one, as I do now) to encourage the processor to fetch the texture efficiently.
Looking through the various Apple extensions for something suited to this.
(Added) Looking for an ES equivalent of the GLSL textureOffset call (which is not available in ES, but there may be an equivalent).

In terms of hardware, I am targeting the iPhone 4S in particular.

iphone opengl-es glsl

Alex Flint Jun 21 '12 at 15:42

2 answers

Why not run two passes, as you would for a Gaussian-style blur? Take 3 taps vertically in the first pass, then 3 taps horizontally in the second pass.
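A minimal sketch of one such pass is shown here, assuming the kernel is separable (it can be written as the product of a column vector and a row vector); for a kernel that is not separable, two 3-tap passes do not reproduce the full 3x3 convolution. The uniform kernelWeights and the three varying names are hypothetical. The varyings are assumed to be computed in the vertex shader as the center coordinate minus and plus one texel step along the direction of the pass.

```glsl
// Sketch of a single one-dimensional pass; render once per direction,
// feeding the output of the first pass in as the input of the second.
precision highp float;

uniform sampler2D inputImageTexture;

// Hypothetical uniform: the three 1-D kernel weights (previous, center, next).
uniform mediump vec3 kernelWeights;

// Assumed to be precomputed in the vertex shader: the center coordinate
// minus / plus one texel step along the direction of this pass.
varying vec2 previousTextureCoordinate;
varying vec2 centerTextureCoordinate;
varying vec2 nextTextureCoordinate;

void main()
{
    mediump vec4 previousColor = texture2D(inputImageTexture, previousTextureCoordinate);
    mediump vec4 centerColor = texture2D(inputImageTexture, centerTextureCoordinate);
    mediump vec4 nextColor = texture2D(inputImageTexture, nextTextureCoordinate);

    // Weighted sum of the three taps.
    gl_FragColor = previousColor * kernelWeights.x
                 + centerColor * kernelWeights.y
                 + nextColor * kernelWeights.z;
}
```

This trades nine texture reads in one pass for six reads across two passes, at the cost of an extra render to an intermediate texture; whether that is a net win would need to be measured on the target device.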

Bart hender Aug 17 '12 at 8:00

Brad Larson · Accepted Answer · 2012-06-23T01:25:02+0000

Are you sure you do not mean OpenGL ES 2.0? You cannot write shaders of any kind using OpenGL ES 1.1. I'll assume the former.

In my experience, the fastest way I've found to do this is the second item in your list. I do several types of 3x3 convolutions in my GPUImage framework (which you could just use instead of trying to roll your own), and for those I feed in the texel offset for the horizontal and vertical directions and calculate the nine texture coordinates needed in the vertex shader. From there, I pass them as varyings to the fragment shader.

This (for the most part) avoids dependent texture reads in the fragment shader, which are terribly expensive on the iOS PowerVR GPUs. I say “for the most part” because on older devices such as the iPhone 4, only eight of these varyings are used to avoid a dependent texture read. As I found out last week, the ninth triggers a dependent texture read on older devices, so this slows things down a bit. The iPhone 4S does not have this problem, however, since it supports more varyings used in this way.
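To illustrate the distinction, here is a minimal sketch (not taken from GPUImage) of a fragment shader that mixes both kinds of read. It assumes a uniform texelWidth and a single varying textureCoordinate supplied by the vertex shader; the averaging at the end is only there to make the shader complete.

```glsl
precision highp float;

uniform sampler2D inputImageTexture;
uniform highp float texelWidth;  // assumed: offset of one texel horizontally

varying vec2 textureCoordinate;

void main()
{
    // Non-dependent read: the coordinate comes straight from a varying.
    mediump vec4 centerColor = texture2D(inputImageTexture, textureCoordinate);

    // Dependent read: the coordinate is computed in the fragment shader
    // before the lookup, which is the pattern described above as expensive.
    mediump vec4 leftColor = texture2D(inputImageTexture,
                                       textureCoordinate - vec2(texelWidth, 0.0));

    gl_FragColor = 0.5 * (centerColor + leftColor);
}
```

Moving the subtraction into the vertex shader and passing the result as its own varying turns the second lookup into the first kind of read.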

For the vertex shader, I use the following:

    attribute vec4 position;
    attribute vec4 inputTextureCoordinate;

    // Offset of one texel in the horizontal and vertical directions,
    // supplied by the application
    uniform highp float texelWidth;
    uniform highp float texelHeight;

    varying vec2 textureCoordinate;
    varying vec2 leftTextureCoordinate;
    varying vec2 rightTextureCoordinate;

    varying vec2 topTextureCoordinate;
    varying vec2 topLeftTextureCoordinate;
    varying vec2 topRightTextureCoordinate;

    varying vec2 bottomTextureCoordinate;
    varying vec2 bottomLeftTextureCoordinate;
    varying vec2 bottomRightTextureCoordinate;

    void main()
    {
        gl_Position = position;

        vec2 widthStep = vec2(texelWidth, 0.0);
        vec2 heightStep = vec2(0.0, texelHeight);
        vec2 widthHeightStep = vec2(texelWidth, texelHeight);
        vec2 widthNegativeHeightStep = vec2(texelWidth, -texelHeight);

        // Precompute all nine sample coordinates here so the fragment
        // shader can use them directly
        textureCoordinate = inputTextureCoordinate.xy;
        leftTextureCoordinate = inputTextureCoordinate.xy - widthStep;
        rightTextureCoordinate = inputTextureCoordinate.xy + widthStep;

        topTextureCoordinate = inputTextureCoordinate.xy - heightStep;
        topLeftTextureCoordinate = inputTextureCoordinate.xy - widthHeightStep;
        topRightTextureCoordinate = inputTextureCoordinate.xy + widthNegativeHeightStep;

        bottomTextureCoordinate = inputTextureCoordinate.xy + heightStep;
        bottomLeftTextureCoordinate = inputTextureCoordinate.xy - widthNegativeHeightStep;
        bottomRightTextureCoordinate = inputTextureCoordinate.xy + widthHeightStep;
    }

and for the fragment shader:

    precision highp float;

    uniform sampler2D inputImageTexture;

    uniform mediump mat3 convolutionMatrix;

    varying vec2 textureCoordinate;
    varying vec2 leftTextureCoordinate;
    varying vec2 rightTextureCoordinate;

    varying vec2 topTextureCoordinate;
    varying vec2 topLeftTextureCoordinate;
    varying vec2 topRightTextureCoordinate;

    varying vec2 bottomTextureCoordinate;
    varying vec2 bottomLeftTextureCoordinate;
    varying vec2 bottomRightTextureCoordinate;

    void main()
    {
        // Sample the 3x3 neighborhood at the coordinates precomputed
        // in the vertex shader
        mediump vec4 bottomColor = texture2D(inputImageTexture, bottomTextureCoordinate);
        mediump vec4 bottomLeftColor = texture2D(inputImageTexture, bottomLeftTextureCoordinate);
        mediump vec4 bottomRightColor = texture2D(inputImageTexture, bottomRightTextureCoordinate);
        mediump vec4 centerColor = texture2D(inputImageTexture, textureCoordinate);
        mediump vec4 leftColor = texture2D(inputImageTexture, leftTextureCoordinate);
        mediump vec4 rightColor = texture2D(inputImageTexture, rightTextureCoordinate);
        mediump vec4 topColor = texture2D(inputImageTexture, topTextureCoordinate);
        mediump vec4 topRightColor = texture2D(inputImageTexture, topRightTextureCoordinate);
        mediump vec4 topLeftColor = texture2D(inputImageTexture, topLeftTextureCoordinate);

        // Weighted sum of the nine samples
        mediump vec4 resultColor = topLeftColor * convolutionMatrix[0][0] + topColor * convolutionMatrix[0][1] + topRightColor * convolutionMatrix[0][2];
        resultColor += leftColor * convolutionMatrix[1][0] + centerColor * convolutionMatrix[1][1] + rightColor * convolutionMatrix[1][2];
        resultColor += bottomLeftColor * convolutionMatrix[2][0] + bottomColor * convolutionMatrix[2][1] + bottomRightColor * convolutionMatrix[2][2];

        gl_FragColor = resultColor;
    }

Even with the caveat mentioned above, this shader runs in ~2 ms for a 640x480 frame of video on an iPhone 4, and a 4S can easily handle 1080p video at 30 FPS with a shader like this.
