use a dot product
dc=cos(ang)=dot(col1,col2); dc=r1*r2+g1*g2+b1*b2
for normalized RGB colors (unit vectors) this gives you a coefficient in the range dc=<0,1> , where 0 means a 90 degree angle between the colors (maximum difference) and 1 means the same color (regardless of intensity)
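As a minimal sketch of the dot product comparison above (the function names normalize and color_similarity are made up for illustration, and colors are assumed to be (r,g,b) tuples with non-negative channels):

```python
import math

def normalize(col):
    """Scale an (r, g, b) color to unit length; black stays (0, 0, 0)."""
    r, g, b = col
    length = math.sqrt(r * r + g * g + b * b)
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (r / length, g / length, b / length)

def color_similarity(col1, col2):
    """dc = cos(ang) between two colors, in <0,1> for non-negative channels."""
    r1, g1, b1 = normalize(col1)
    r2, g2, b2 = normalize(col2)
    return r1 * r2 + g1 * g2 + b1 * b2
```

Here color_similarity((255,0,0),(128,0,0)) is 1 (same color, different intensity) and color_similarity((255,0,0),(0,255,0)) is 0 (perpendicular colors).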
performance
use 8 bits per channel, so the range is <0,255> , to avoid using the FPU. For unnormalized colors you can avoid sqrt simply by using the squared coefficient:
dc^2=(r1*r2+g1*g2+b1*b2)^2/(|col1|^2*|col2|^2) where |col|^2=r*r+g*g+b*b
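The sqrt-free form can be sketched with integers only. This is an illustration under assumptions: cos2_terms and is_similar are hypothetical names, and the threshold is passed as a fraction num/den so that even the final division can be replaced by a cross-multiplication.

```python
def cos2_terms(col1, col2):
    """Return (numerator, denominator) of dc^2 using integer math only."""
    r1, g1, b1 = col1
    r2, g2, b2 = col2
    dot = r1 * r2 + g1 * g2 + b1 * b2
    len1_sq = r1 * r1 + g1 * g1 + b1 * b1   # |col1|^2, no sqrt needed
    len2_sq = r2 * r2 + g2 * g2 + b2 * b2   # |col2|^2, no sqrt needed
    return dot * dot, len1_sq * len2_sq

def is_similar(col1, col2, num, den):
    """True if dc^2 >= num/den, tested without any division."""
    n, d = cos2_terms(col1, col2)
    if d == 0:               # at least one color is black: angle undefined
        return False
    return n * den >= num * d
```

For example is_similar((255,0,0),(128,0,0),9,10) is True, while is_similar((255,0,0),(0,255,0),9,10) is False. In a language with fixed-width integers the products need a 64-bit type, since dot*dot can exceed 32 bits for 8-bit channels.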
[edit1] more information
normalized colors are 3D unit vectors
if you scale them to an 8-bit range, for example 255*(r,g,b) , then each channel fits into 8 bits and you can treat it either as an integer or as a fixed-point number. For fixed point you only need to change multiplication and division, all other operations stay the same:
add=a+b sub=a-b mul=(a*b)>>8 div=(a<<8)/b
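A small sketch of these fixed-point operations, assuming 8 fractional bits so that 256 represents 1.0 (the names FRAC_BITS, ONE and the fx_* helpers are illustrative only). Note that division pre-shifts the dividend left and needs no shift back afterwards:

```python
FRAC_BITS = 8
ONE = 1 << FRAC_BITS          # 256 represents 1.0

def fx_from_float(x):
    """Convert a float to fixed point (for setup and testing only)."""
    return int(round(x * ONE))

def fx_to_float(a):
    """Convert fixed point back to a float (for inspection only)."""
    return a / ONE

def fx_mul(a, b):
    """The raw product has 16 fractional bits, so drop 8 of them."""
    return (a * b) >> FRAC_BITS

def fx_div(a, b):
    """Pre-shift the dividend so the quotient keeps 8 fractional bits."""
    return (a << FRAC_BITS) // b
```

With these helpers fx_mul(128,128) gives 64 (0.5*0.5=0.25) and fx_div(64,128) gives 128 (0.25/0.5=0.5). Addition and subtraction are the plain integer a+b and a-b.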
when you use normalized colors then |col|=1 so you don't need sqrt or division. For fixed point you just shift right by 8 bits instead ... For integers in <0,255> |col|=255 , and dividing by it is also done (approximately) by shifting right by 8 bits. For unnormalized colors you need to divide by |col| , which needs sqrt and division, but the dc coefficient is in the range <0,1> , so if you use dc^2 you only change the linearity of the coefficient, which should not matter for your purpose, and for |col|^2 no sqrt is needed, because |col|^2=sqrt(r*r+g*g+b*b)^2=(r*r+g*g+b*b) .
For greater speed, you should convert the entire image to normalized colors before your task. If coded properly, it should take around 10+ ms for common desktop resolutions.
[Note]
There are other color spaces that are more suitable for your purpose, such as HSV