You made this nice ASCII art picture:
                          Screen
                             B
                             |
                             | h = H/2
                             |
    x-------- n -------------A
                             |
                             | h = H/2
                             |
                             B'
The field of view is defined as the angle fov = angle((x,B), (x,B')) spanned between the two ends B, B' of the screen "line", as seen from the point x. The trigonometric tangent function (tan) tells us that
h/n = tan( angle((x,A), (x,B)) )
And since length(A, B) == length(A, B') == h == H/2 we know that
H/(2·n) == tan( fov/2 ) == tan( angle((x,B), (x,B')) / 2 ) == tan( angle((x,A), (x,B)) )
In trigonometry, angles are given in radians, but most people are more comfortable with degrees, so you may have to convert from degrees to radians. Since we are only interested in half the screen extent (= h), we also only need half the angle. And if we want to accept degrees, we have to convert them to radians as well. That is what this expression is for:
tanValue = DEG_TO_RAD * theta/2;
Using this, we then calculate h as
h = tan(tanValue) * n
Whether the FoV refers to the horizontal or the vertical extent of the screen depends on which of the two extents you scale by the aspect ratio.
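Putting the pieces above together, a minimal sketch of this conversion could look like the following (the function and parameter names are just placeholders; DEG_TO_RAD, theta/2 and n correspond to what was discussed above):

    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif
    #define DEG_TO_RAD (M_PI / 180.0)

    /* Compute the half extents of the near plane from a full vertical FoV.
     * fovDegrees : full vertical field of view (theta) in degrees
     * n          : distance of the near plane (the "n" in the ASCII picture)
     * aspect     : viewport width / height
     * halfHeight : h = H/2 from the picture
     * halfWidth  : horizontal counterpart, scaled by the aspect ratio */
    static void nearPlaneHalfExtents(double fovDegrees, double n, double aspect,
                                     double *halfHeight, double *halfWidth)
    {
        double tanValue = DEG_TO_RAD * fovDegrees / 2.0; /* theta/2, in radians */
        *halfHeight = tan(tanValue) * n;                 /* h = tan(theta/2) * n */
        *halfWidth  = *halfHeight * aspect;              /* scale one extent by aspect */
    }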
How are headX and headY calculated, and why do you subtract 0.5 in the code above? I noticed that it brings the x value into the range [-0.5, 0.5] and the y value into [0.5, -0.5] as msX and msY change.
Your mouse coordinates are given in screen space, i.e. in the range [0, screenWidth] × [0, screenHeight]. However, since we do the frustum calculations in the normalized range [-1, 1]², we want to map the device-absolute mouse coordinates into center-relative, normalized coordinates. This then allows you to specify the off-axis shift relative to the normalized near plane size. With an offset of 0 it looks like this (the grid has a spacing of 0.1 units in this image):

And with an X offset of -0.5 it looks like this (orange outline); you can see that the left edge of the near plane is offset by -0.5.

Now just imagine that the grid is your screen, and moving your mouse around drags the borders of the projection frustum's near plane with it.
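A minimal sketch of that device-to-normalized mapping (assuming msX, msY, screenWidth and screenHeight as in the code you quoted; the y sign flip assumes a top-left window origin, and the function name is just a placeholder):

    /* Map absolute mouse coordinates in [0, screenWidth] x [0, screenHeight]
     * to center-relative, normalized offsets in [-0.5, 0.5]. The y axis is
     * flipped because window coordinates usually grow downwards, while the
     * projection plane's y axis grows upwards. */
    static void mouseToHeadOffset(double msX, double msY,
                                  double screenWidth, double screenHeight,
                                  double *headX, double *headY)
    {
        *headX = msX / screenWidth  - 0.5;  /* left edge -> -0.5, right edge -> +0.5 */
        *headY = 0.5 - msY / screenHeight;  /* top edge  -> +0.5, bottom edge -> -0.5 */
    }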
Why does this make sense? fFov was computed as the tan of theta/2, so how can you add headY to -fFov directly?
Because fFov is not an angle, but the extent H/2 = h from the ASCII picture above. And headX and headY are relative shifts within the normalized projection plane.
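To make the role of fFov concrete, here is a hedged sketch of how such offsets typically enter an off-axis glFrustum call (this illustrates the idea, not necessarily the exact code of the demo you quoted; it assumes headX and headY are already expressed in the same units as the near plane extents):

    #include <GL/gl.h>

    /* Off-axis projection: shift the near plane window by the head offsets
     * instead of keeping it centered on the view axis.
     * fFov   : half extent h = H/2 of the near plane (a length, NOT an angle)
     * aspect : viewport width / height
     * n, f   : near and far plane distances */
    static void setOffAxisFrustum(double fFov, double aspect,
                                  double headX, double headY,
                                  double n, double f)
    {
        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        glFrustum(-fFov * aspect + headX,  /* left   */
                   fFov * aspect + headX,  /* right  */
                  -fFov + headY,           /* bottom */
                   fFov + headY,           /* top    */
                   n, f);
    }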
How does headX·headZ give me the eye x position, and headY·headZ the eye y position, that I can use in gluLookAt() here?
The code you are quoting is an ad-hoc solution for that particular demo, chosen to emphasize the effect. In a real stereoscopic head-tracking system you would do it a little differently. Technically, headZ should be used to calculate either the near plane distance or the distance of the head from the projection plane.
In any case, the main idea is that the head is located at some distance from the projection plane, and the off-axis center point is shifted in units relative to the projection plane. Thus, you have to scale the relative headX, headY by the actual distance to the projection plane to obtain the correct eye position.
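As a sketch of that scaling (hypothetical helper; it assumes the projection plane sits at z = 0, the viewer looks down the negative z axis, and headZ is the head's distance from the plane):

    #include <GL/gl.h>
    #include <GL/glu.h>

    /* Place the eye according to the tracked head: the relative offsets
     * headX, headY are scaled by the distance headZ to the projection plane,
     * which is exactly the headX*headZ / headY*headZ you asked about. The
     * view direction stays perpendicular to the screen, so the look-at point
     * is straight ahead of the eye on the plane. */
    static void placeEye(double headX, double headY, double headZ)
    {
        double eyeX = headX * headZ;
        double eyeY = headY * headZ;
        double eyeZ = headZ;

        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        gluLookAt(eyeX, eyeY, eyeZ,   /* eye position */
                  eyeX, eyeY, 0.0,    /* look-at point on the projection plane */
                  0.0, 1.0, 0.0);     /* up vector */
    }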
Update due to comment / request
So far, we have only considered one dimension when converting a field of view (FoV) angle into screen extents. For an undistorted image, the aspect ratio of the [left, right] / [bottom, top] extents of the near clipping plane must match the width/height aspect ratio of the viewport.
If we define the FoV angle to be the vertical FoV, then the horizontal extent of the near clipping plane is the vertical near clipping plane extent scaled by the aspect ratio width/height.
This is nothing specific to off-axis projection, but can be found in every perspective projection helper function; compare the gluPerspective source code for reference:
    void GLAPIENTRY
    gluPerspective(GLdouble fovy, GLdouble aspect, GLdouble zNear, GLdouble zFar)
    {
        GLdouble xmin, xmax, ymin, ymax;

        ymax = zNear * tan(fovy * M_PI / 360.0);
        ymin = -ymax;
        xmin = ymin * aspect;
        xmax = ymax * aspect;

        glFrustum(xmin, xmax, ymin, ymax, zNear, zFar);
    }
And if we keep the near clipping plane extents at [-aspect, aspect] × [-1, 1], then of course the headX position is not in the normalized range [-1, 1] either, but must be given in the range [-aspect, aspect].
If you look at the paper you linked, you will find that for each screen the head position reported by the tracker is transformed into absolute coordinates relative to that screen.
Two weeks ago, I had the opportunity to try out a display system called "Z-space", which combines a polarized stereo display with a head tracker, creating an off-axis frustum / lookat combination that matches your physical head position in front of the display. It also offers a "pen" for interacting with the 3D scene in front of you. This is one of the most impressive things I have seen in the past few years, and I am currently nagging my boss to buy us one :)