As an exercise, I rewrote the example program from the blog post Gibbs sampler in various languages (revisited) by Darren Wilkinson.
Below is the code. It runs in about 53 seconds on my (5 year old) machine using SBCL 1.0.56, building an executable with buildapp and then running it with
time ./gibbs > gibbs.dat
Since this is how the timings for the other languages in that post were measured, I thought I would do something comparable. The C code in the post runs in about 25 seconds. I would like to speed up the Lisp code if possible.
gibbs.lisp

(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :cl-rmath)
  (setf *read-default-float-format* 'double-float))

(defun gibbs (N thin)
  (declare (fixnum N thin))
  (declare (optimize (speed 3) (safety 1)))
  (let ((x 0.0) (y 0.0))
    (declare (double-float x y))
    (print "Iter x y")
    (dotimes (i N)
      (dotimes (j thin)
        (declare (fixnum i j))
        (setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4))))
        (setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0))
                                 (/ 1.0 (sqrt (+ (* 2 x) 2))))))
      (format t "~a ~a ~a~%" i x y))))

(defun main (argv)
  (declare (ignore argv))
  (gibbs 50000 1000))
Then I built an executable gibbs by running sh gibbs.sh, where gibbs.sh is
buildapp --output gibbs --asdf-tree /usr/share/common-lisp/source/ --asdf-tree /usr/local/share/common-lisp/source/ --load-system cl-rmath --load gibbs.lisp --entry main
I get 6 compiler notes when compiling with SBCL 1.0.56, which are reproduced below. I'm not sure what to do with them, but I would be grateful for any hints.
; compiling file "/home/faheem/lisp/gibbs.lisp" (written 30 MAY 2012 02:00:55 PM):

; file: /home/faheem/lisp/gibbs.lisp
; in: DEFUN GIBBS
;     (SQRT (+ (* 2 X) 2))
;
; note: unable to
;   optimize
; due to type uncertainty:
;   The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT))
;                           &OPTIONAL), not a (VALUES FLOAT &REST T).

;     (/ 1.0d0 (SQRT (+ (* 2 X) 2)))
;
; note: unable to
;   optimize
; due to type uncertainty:
;   The second argument is a (OR (DOUBLE-FLOAT 0.0)
;                                (COMPLEX DOUBLE-FLOAT)), not a (COMPLEX
;                                                                DOUBLE-FLOAT).
;
; note: forced to do static-fun Two-arg-/ (cost 53)
;       unable to do inline float arithmetic (cost 12) because:
;       The second argument is a (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)), not a DOUBLE-FLOAT.
;       The result is a (VALUES (OR (COMPLEX DOUBLE-FLOAT) (DOUBLE-FLOAT 0.0))
;                               &OPTIONAL), not a (VALUES DOUBLE-FLOAT &REST T).

;     (CL-RMATH:RGAMMA 3.0d0 (/ 1.0d0 (+ (* Y Y) 4)))
;
; note: doing float to pointer coercion (cost 13)

;     (SQRT (+ (* 2 X) 2))
;
; note: doing float to pointer coercion (cost 13)

;     (CL-RMATH:RNORM (/ 1.0d0 (+ X 1.0d0)) (/ 1.0d0 (SQRT (+ (* 2 X) 2))))
;
; note: doing float to pointer coercion (cost 13)
;
; compilation unit finished
;   printed 6 notes

; /home/faheem/lisp/gibbs.fasl written
; compilation finished in 0:00:00.073
UPDATE 1: Rainer Joswig's answer pointed out that the argument to SQRT could be negative, which was the source of the obscure compiler notes I was seeing, namely
The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)) &OPTIONAL), not a (VALUES FLOAT &REST T).
The compiler was complaining that, since it could not tell whether the argument was positive, the result might be a complex number. Since in this example the value of x is a draw from a gamma distribution, it is always greater than zero. As Stefan pointed out on the SBCL users mailing list (see the second message in the thread "Optimizing a simple Common Lisp Gibbs sampler program"), this can be addressed by declaring x to be greater than or equal to zero:
(declare (type (double-float 0.0 *) x))
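For concreteness, here is a minimal sketch of how that declaration slots into the LET binding of GIBBS (only the binding and declarations are shown; the loop body is unchanged):

(let ((x 0.0) (y 0.0))
  ;; X only ever holds draws from RGAMMA, so it can be declared as a
  ;; nonnegative double-float; Y stays an ordinary DOUBLE-FLOAT.  The
  ;; literal 0.0 reads as a double-float here because
  ;; *READ-DEFAULT-FLOAT-FORMAT* is set at the top of the file.
  (declare (type (double-float 0.0 *) x)
           (double-float y))
  ;; ... rest of the body as before ...
  )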
See the Common Lisp HyperSpec for the relevant documentation on FLOAT types and interval designators.
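To make the interval notation concrete (my own REPL example, not from the thread): the compound type specifier takes optional lower and upper bounds, with * meaning unbounded, so (double-float 0.0d0 *) describes the nonnegative double-floats:

(typep 1.5d0 '(double-float 0.0d0 *))   ; => T
(typep -1.5d0 '(double-float 0.0d0 *))  ; => NIL, below the lower bound
;; 0.0d0 is spelled out here because *READ-DEFAULT-FLOAT-FORMAT* is not
;; changed at a fresh REPL, unlike in gibbs.lisp.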
This speeds up the code a little: it now comes in reliably under 52 seconds, but that is still not much of a gain. It also leaves the notes of the form
note: doing float to pointer coercion (cost 13)
If this cannot be fixed for some reason, I would like to know why. An explanation of what the note means would also be interesting in its own right. In particular, what does the word pointer mean here? Is it because C functions are being called? Also, the cost of 13 does not seem very meaningful; what is actually being measured?
Rainer also suggested that consing could be reduced, which might shorten the run time. I don't know whether reducing consing is possible here, or whether it would actually reduce the run time, but I would be interested in opinions and approaches. Overall, there does not seem to be much that can be done to improve the performance of this function. Perhaps it is just too small and simple.
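For what it's worth, a quick way to see how much a run conses is to wrap a small, purely illustrative run in TIME at the REPL after loading gibbs.lisp; SBCL's TIME report includes bytes consed, and it writes to *TRACE-OUTPUT*, so the sample output itself can be discarded:

(let ((*standard-output* (make-broadcast-stream)))
  ;; Discard the printed samples; TIME's report still appears on
  ;; *TRACE-OUTPUT*.  500 x 10 is just a short illustrative run, not
  ;; the real 50000 x 1000 workload.
  (time (gibbs 500 10)))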