Kalman filters with four input parameters

I have been studying the Kalman filter for a couple of days to improve the performance of my face detection program. From the information I gathered, I put together the following code for the Kalman filter part:

int Kalman(int X, int faceWidth, int Y, int faceHeight, IplImage *img1)
{
    CvRandState rng;
    const float T = 0.1;

    // Initialize Kalman filter object, window, number generator, etc.
    cvRandInit( &rng, 0, 1, -1, CV_RAND_UNI );
    //IplImage* img = cvCreateImage( cvSize(500,500), 8, 3 );

    CvKalman* kalman = cvCreateKalman( 4, 4, 0 );

    // Initializing with random guesses
    // state x_k
    CvMat* state = cvCreateMat( 4, 1, CV_32FC1 );
    cvRandSetRange( &rng, 0, 0.1, 0 );
    rng.disttype = CV_RAND_NORMAL;
    cvRand( &rng, state );

    // Process noise w_k
    CvMat* process_noise = cvCreateMat( 4, 1, CV_32FC1 );

    // Measurement z_k
    CvMat* measurement = cvCreateMat( 4, 1, CV_32FC1 );
    cvZero( measurement );

    /* create matrix data */
    const float A[] = { 1, 0, T, 0,
                        0, 1, 0, T,
                        0, 0, 1, 0,
                        0, 0, 0, 1 };

    const float H[] = { 1, 0, 0, 0,
                        0, 0, 0, 0,
                        0, 0, 1, 0,
                        0, 0, 0, 0 };

    // Didn't use this matrix in the end as it gave an error:
    // 'ambiguous call to overloaded function'
    /*
    const float P[] = { pow(320,2),   pow(320,2)/T,        0,            0,
                        pow(320,2)/T, pow(320,2)/pow(T,2), 0,            0,
                        0,            0,                   pow(240,2),   pow(240,2)/T,
                        0,            0,                   pow(240,2)/T, pow(240,2)/pow(T,2) };
    */

    const float Q[] = { pow(T,3)/3, pow(T,2)/2, 0,          0,
                        pow(T,2)/2, T,          0,          0,
                        0,          0,          pow(T,3)/3, pow(T,2)/2,
                        0,          0,          pow(T,2)/2, T };

    const float R[] = { 1, 0, 0, 0,
                        0, 0, 0, 0,
                        0, 0, 1, 0,
                        0, 0, 0, 0 };

    // Copy created matrices into kalman structure
    memcpy( kalman->transition_matrix->data.fl,     A, sizeof(A) );
    memcpy( kalman->measurement_matrix->data.fl,    H, sizeof(H) );
    memcpy( kalman->process_noise_cov->data.fl,     Q, sizeof(Q) );
    //memcpy( kalman->error_cov_post->data.fl,      P, sizeof(P) );
    memcpy( kalman->measurement_noise_cov->data.fl, R, sizeof(R) );

    // Initialize other Kalman Filter parameters
    //cvSetIdentity( kalman->measurement_matrix, cvRealScalar(1) );
    //cvSetIdentity( kalman->process_noise_cov, cvRealScalar(1e-5) );
    /*cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(1e-1) );*/
    cvSetIdentity( kalman->error_cov_post, cvRealScalar(1e-5) );

    /* choose initial state */
    kalman->state_post->data.fl[0] = X;
    kalman->state_post->data.fl[1] = faceWidth;
    kalman->state_post->data.fl[2] = Y;
    kalman->state_post->data.fl[3] = faceHeight;
    //cvRand( &rng, kalman->state_post );

    /* predict position of point */
    const CvMat* prediction = cvKalmanPredict( kalman, 0 );

    // generate measurement (z_k)
    cvRandSetRange( &rng, 0, sqrt(kalman->measurement_noise_cov->data.fl[0]), 0 );
    cvRand( &rng, measurement );
    cvMatMulAdd( kalman->measurement_matrix, state, measurement, measurement );

    // Draw rectangles in detected face location
    cvRectangle( img1,
                 cvPoint( kalman->state_post->data.fl[0], kalman->state_post->data.fl[2] ),
                 cvPoint( kalman->state_post->data.fl[1], kalman->state_post->data.fl[3] ),
                 CV_RGB( 0, 255, 0 ), 1, 8, 0 );
    cvRectangle( img1,
                 cvPoint( prediction->data.fl[0], prediction->data.fl[2] ),
                 cvPoint( prediction->data.fl[1], prediction->data.fl[3] ),
                 CV_RGB( 0, 0, 255 ), 1, 8, 0 );
    cvShowImage( "Kalman", img1 );

    // adjust kalman filter state
    cvKalmanCorrect( kalman, measurement );
    cvMatMulAdd( kalman->transition_matrix, state, process_noise, state );

    return 0;
}

In the face detection part (not shown), a bounding box is computed for the detected face. X, Y, faceWidth and faceHeight are the coordinates and dimensions of that box, and they are passed to the Kalman filter. img1 is the current frame of the video.

Results:

Although I get two new rectangles from state_post and prediction (as seen in the code), neither of them looks more stable than the original rectangle drawn without the Kalman filter.

Here are my questions:

  • Are the initialized matrices (transition matrix A, measurement matrix H, etc.) correct for four inputs? (e.g., should they be 4×4 matrices for four inputs?)
  • Can each matrix simply be set to an identity matrix?
  • Is the way I construct the rectangles correct? I followed the examples in this and in the book "Learning OpenCV", neither of which uses external inputs.

Any help in this regard would be greatly appreciated!

1 answer

H[] must be the identity if you measure the state directly. With 4 measurements, putting zeros on the diagonal forces the corresponding predicted measurements (H*x) to 0 when they are not actually zero. The innovation (z - H*x) in the Kalman filter will then be large.

R[] should also be diagonal, although the measurement error covariances may differ from one another. If you work in normalized coordinates (width = height = 1), R can be about 0.01. If you are dealing with pixel coordinates, R = identity means a one-pixel error, and that's fine. You can also try 2, 3, 4, etc.

Your transition matrix A[], which should propagate the state from frame to frame, looks like the transition matrix for a state consisting of x, y, v_x and v_y. But there are no velocities in your model; you are feeding it only measurements. Be careful not to confuse the state (which describes the face's position) with the measurements (which are used to update the state). Your state could be position, velocity and acceleration, while your measurements could be n points in the image, or the x and y position of the face.

Hope this helps.


Source: https://habr.com/ru/post/914705/

