Optimization of the average color program of the image in haskell using REPA

Problem

I wrote a Haskell program that goes through a folder and finds the average color of each image in the folder. It uses the hack repa-devil package to load images into repa arrays. I find the average by adding all the red, blue and green values, and then dividing by the number of pixels:

-- compiled with -O2 import qualified Data.Array.Repa as R import Data.Array.Repa.IO.DevIL import Control.Monad.Trans (liftIO) import System.Directory (getDirectoryContents) size :: (R.Source re) => R.Array r R.DIM3 e -> (Int, Int) size img = (w, h) where (RZ R.:. h R.:. w R.:. 3) = R.extent img averageColour :: (R.Source re, Num e, Integral e) => R.Array r R.DIM3 e -> (Int, Int, Int) averageColour img = (r `div` n, g `div` n, b `div` n) where (w, h) = size img n = w * h (r,g,b) = f 0 0 0 0 0 f row col rgb | row >= w = f 0 (col + 1) rgb | col >= h = (r, g, b) | otherwise = f (row + 1) col (addCol 0 r) (addCol 1 g) (addCol 2 b) where addCol xv = v + fromIntegral (img R.! (RZ R.:. col R.:. row R.:. x)) main :: IO () main = do files <- fmap (map ("images/olympics_backup/" ++) . filter (`notElem` ["..", "."])) $ getDirectoryContents "images/olympics_backup" runIL $ do images <- mapM readImage files let average = zip (map (\(RGB img) -> averageColour img) images) files liftIO . print $ average 

I also wrote this program in Python using the Python image library. He finds the average value of the images in the same way:

 import Image def get_images(folder): images = [] for filename in os.listdir(folder): images.append(folder + filename) return images def get_average(filename): image = Image.open(filename) pixels = image.load() r = g = b = 0 for x in xrange(0, image.size[0]): for y in xrange(0, image.size[1]): colour = pixels[x, y] r += colour[0] g += colour[1] b += colour[2] area = image.size[0] * image.size[1] r /= area g /= area b /= area return [(r, g, b), filename, image] def get_colours(images): colours = [] for image in images: try: colours.append(get_average(image)) except: continue return colours imgs = get_images('images/olympics_backup/') print get_colours(imgs) 

When both of them are launched in a folder with 301 images, the Haskell version exceeds 0.2 seconds (0.87 against 0.64). This seems odd because Haskell is a compiled language (which is often faster than interpreted), and I heard that repo arrays had good performance (although this may have been compared to other Haskell data types like list) .

What i tried

The first thing I did was to notice that I was using explicit recursion, and therefore decided to replace it using a fold, which would also mean that I no longer needed to check if I was outside the array:

 (r,g,b) = foldl' f (0,0,0) [(x, y) | x <- [0..w-1], y <- [0..h-1]] f (r,g,b) (row,col) = (addCol 0 r, addCol 1 g, addCol 2 b) where addCol xv = v + fromIntegral (img R.! (RZ R.:. col R.:. row R.:. x)) 

This made it run slower (1.2 seconds), so I decided to profile the code and see where most of the time wasted (in case I created an obvious bottleneck or the repa-devil package was just slow). The profile told me that ~ 58% of the time was spent on the function f, and ~ 35% of the time was spent in addCol.

Unfortunately, I can’t figure out how to make this run faster. The function is just an array index and the addition is the same as python code. Is there a way to improve the performance of this code, or is the Python image library just providing more performance?

+4
source share
1 answer

Although the following code is hacky, it is pretty fast.

  • Gets 75x75 images in 0.03 ms (16 ticks / pixels) => approx. 10-20 ms for 300 images

  • 512x512 (Lenna) in 1 ms (13.5 ticks / pixel)

  • 2560x1600 at 12 ms (9.2 ticks / pixel)

yarr was specifically designed to solve problems such as yours, unfortunately, there are some problems (indicated in the comments to the code) that do not allow you to make the code very compressed and fast at the same time.

One pixel routine is 3 reads in memory + 3 add s, so I expect 3 ticks / pixel to be the limit of this task.

In addition, you can easily parallelize calculations using parallel from the parallel-io package.

 {-# LANGUAGE FlexibleContexts, TypeFamilies #-} import System.Environment import Data.Yarr import Data.Yarr.IO.Image import Data.Yarr.Walk import Data.Yarr.Utils.FixedVector as V import Data.Yarr.Shape as S main :: IO () main = do [file] <- getArgs print =<< getAverage file getAverage :: FilePath -> IO (Int, Int, Int) getAverage file = do -- Meaningful choice, for homogenious images, -- in preference to readRGB(Vectors). -- readRGB make the case of representation -> polymorfic access -> -- poor performance (RGB imageArr) <- readImage file -- let imageArr = readRGBVectors file let ext = extent imageArr avs <- averageColour imageArr return $ V.inspect avs (Fun (,,)) averageColour :: (Vector v Int, Dim v ~ N3, Integral e, UVecSource r slr l Dim2 ve, PreferredWorkIndex l Dim2 i) => UArray rl Dim2 (ve) -> IO (VecList N3 Int) {-# INLINE averageColour #-} averageColour image = fmap (V.map (`div` (w * h))) compSums where -- `walk (reduce ... (V.zipWith (+))) (return V.zero) image` -- would be more idiomatic and theoretically faster, -- but had problems with perf too :( compSums = walkSlicesSeparate sum (return 0) image -- would better to `mapElems fromIntegral imageArr` before counting, -- but faced some performance problems and I have no time to dig them {-# INLINE sum #-} sum = reduceL sumFold (\xy -> x + (fromIntegral y)) sumFold = S.unrolledFoldl n8 noTouch (w, h) = extent image 

Compile

 ghc-7.6.1 --make -Odph -rtsopts -threaded -fno-liberate-case -funbox-strict-fields -funfolding-keeness-factor1000 -fllvm -optlo-O3 -fexpose-all-unfoldings -fsimpl-tick-factor=500 -o avc average-color.hs 
+1
source

Source: https://habr.com/ru/post/1486616/


All Articles