I figured this out from the small amount of information on this page: http://cg.skeelogy.com/depth-of-field-using-raytracing/ , in particular the diagrams below. I think I did it a little differently than shown, but the concept is pretty simple.
I can explain the general idea of what is happening and how to implement it (I'll try to be brief). Light is reflected from any given point in all directions (generally speaking), so it is never really a single ray that travels from the point being rendered to your eye: it is a cone of light that leaves the point and has expanded by the time it reaches the eye. The lens of your eye / camera bends those light rays so that the cone stops expanding and starts to contract again. For everything to be in focus, the cone should shrink back down to a point exactly at your retina / film, but that only works for points at one particular distance from the lens: the distance marked "focal plane" on the linked page (although I think it should really be a sphere centered on the eye, not a plane).
For points in front of the focal plane, the cone of light is bent more sharply: it converges to a point in front of the retina / film and then starts to expand again, so by the time it reaches the film it is no longer a point but a circle. Similarly, for points beyond the focal plane, the cone is bent less and has not yet converged to a point when it reaches the film. In both cases the effect is that a single point in the scene ends up smeared across several pixels: it looks blurry.
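To put a rough number on this (my own back-of-the-envelope estimate, not something from the linked page): if $A$ is the aperture radius, $F$ the focal distance, and $d$ the distance of a scene point from the eye, the point gets smeared over a disc of radius roughly

$$ r = A \, \frac{|d - F|}{d} $$

measured on the focal plane. When $d = F$ this is zero (perfectly sharp); for very distant points it approaches the full aperture radius $A$; and for points much closer than $F$ it grows without bound, which is why very near objects blur the most.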
For the implementation, you can turn this idea on its head: instead of rendering each point in the scene into several pixels, you render several nearby points into each single pixel, which amounts to the same thing, since the "smeared" circles of light from neighboring points overlap and therefore each of them contributes to the pixel.
So, here is how I implemented it:
First, define the aperture: a flat area centered on your eye and parallel to the retina / film plane. The larger the aperture, the more pronounced the DOF effect will be. Apertures are usually just circles, in which case the aperture is fully described by its radius. Other shapes will give the blur a different character.
Also define the "focal length". I don't think that's actually the correct term for it, but it's the distance from the eye at which everything will be in focus.
To render each pixel:
1. Start with a ray from the eye through the pixel into the scene, just as in normal ray tracing. But instead of intersecting it with the objects in the scene, you only want to find the point on the ray whose distance from the eye equals the chosen focal length. Call this point the focal point for the pixel.
2. Now pick a random starting point on the aperture. For a circular aperture this is fairly easy: you can pick a random polar angle and a random radius (no greater than the aperture's radius). You want the samples to be uniformly distributed over the whole aperture; don't bias them towards the center or anything like that. (Note that picking the radius uniformly actually clusters samples near the center; taking the square root of a uniform random number for the radius fixes this, as in the sketch after this list.)
3. Cast a ray from your chosen point on the aperture through the focal point. Note that it won't necessarily pass through the same pixel, and that's fine.
4. Trace this ray as usual (e.g. path tracing, or just finding the nearest intersection, etc.).
5. Repeat steps 2, 3, and 4 several times, using a different random starting point on the aperture each time, but always sending the ray through the focal point. Sum the color values returned by all the rays and use that as the value for this pixel, averaging (dividing by the number of rays) as usual.
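Here is a minimal sketch of the per-pixel loop above in Python, just to make the steps concrete. It assumes you already have some trace(origin, direction) function in your ray tracer that returns a color, and it uses plain tuples for vectors; all the names here (sample_aperture_point, render_pixel_dof, and so on) are mine, not from any particular library.

```python
import math
import random

def sample_aperture_point(eye, right, up, aperture_radius):
    """Pick a uniformly distributed point on a circular aperture
    centered on the eye and lying in the camera's right/up plane."""
    # The square root keeps samples uniform over the disc's area;
    # picking the radius directly would cluster them near the center.
    r = aperture_radius * math.sqrt(random.random())
    theta = 2.0 * math.pi * random.random()
    dx, dy = r * math.cos(theta), r * math.sin(theta)
    return tuple(e + dx * rt + dy * u for e, rt, u in zip(eye, right, up))

def render_pixel_dof(eye, right, up, pixel_dir, focal_distance,
                     aperture_radius, trace, num_samples=64):
    """Average many aperture rays that all pass through this pixel's focal point."""
    # Step 1: the focal point is the point on the primary ray whose
    # distance from the eye equals the chosen focal distance.
    length = math.sqrt(sum(c * c for c in pixel_dir))
    unit = tuple(c / length for c in pixel_dir)
    focal_point = tuple(e + focal_distance * c for e, c in zip(eye, unit))

    color = (0.0, 0.0, 0.0)
    for _ in range(num_samples):
        # Steps 2-3: random aperture point, then a ray through the focal point
        # (direction is not normalized; normalize it if your tracer expects that).
        origin = sample_aperture_point(eye, right, up, aperture_radius)
        direction = tuple(f - o for f, o in zip(focal_point, origin))
        # Step 4: trace as usual and accumulate.
        color = tuple(c + s for c, s in zip(color, trace(origin, direction)))
    # Step 5: average the samples.
    return tuple(c / num_samples for c in color)
```

You would call render_pixel_dof once per pixel, with pixel_dir being the usual eye-to-pixel direction from your camera setup and right/up being the camera's basis vectors (so the aperture stays parallel to the image plane).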
The more rays you use per pixel, the better the quality, of course. I use about 150 rays per pixel to get decent but not great quality. You can see the effect with a fairly small count (say, 50 or 60 rays), but using fewer rays tends to produce a grainy image, especially for things that are badly out of focus. The number of rays you need also depends on the size of the aperture: a smaller aperture won't require as many rays, but you also won't get as much blur.
Obviously you increase your workload considerably by doing this, essentially multiplying it by the number of rays per pixel, so if you have any optimizations left to make in your ray tracer, now is a good time for them. The good news is that if you have several processors available, this parallelizes very nicely: once you've found the focal point for a pixel, that pixel's rays are completely independent of every other pixel's. A rough sketch of that is below.
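For what it's worth, here is the kind of thing I mean, using Python's standard library; render_row is just a placeholder for whatever per-pixel loop you already have (for instance the render_pixel_dof sketch above):

```python
from concurrent.futures import ProcessPoolExecutor

def render_row(y, width=640):
    # Placeholder: call your per-pixel DOF routine for every x in this row
    # and return the resulting list of colors.
    return [(0.0, 0.0, 0.0)] * width

def render_image(height=480):
    # Every pixel (and hence every row) is independent, so rows can be
    # handed to separate processes with no shared state.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(render_row, range(height)))

if __name__ == "__main__":
    image = render_image()
```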
A bit more explanation.
The image below should give you an idea of what is happening and why it is equivalent to what really happens in an eye or a camera. It shows two pixels, one illustrated in red and the other in blue. The dashed lines from the eye through each pixel to the focal "plane" are the rays you cast first to determine the focal point for each pixel. The translucent cones indicate the complete set of rays that might be randomly chosen when rendering each pixel (the red cone for pixel 1, the blue cone for pixel 2). Note that because all the rays pass through the focal point, each cone converges to a single point exactly at the focal point.
The regions where the cones overlap are the points in the scene that can be rendered into both pixel 1 and pixel 2: in other words, they are blurred. Since each cone narrows to a single point at the focal "plane", there is no overlap between the cones there, so points at that distance are rendered by only one pixel each: they are completely in focus. Meanwhile, the farther you move from the focal "plane" (toward the eye or away from it), the wider the cones are, so more cones overlap at any given point. So points that are very near or very far get rendered into a large number of different pixels, and they end up badly out of focus.
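To attach some numbers to that picture (using the same back-of-the-envelope formula as above, which is my own estimate rather than anything from the linked page): with an aperture radius A = 0.05 and a focal distance F = 5, a point at distance d = 10 is smeared over a disc of radius A(d - F)/d = 0.05 * 5 / 10 = 0.025 on the focal plane, while a point at d = 2.5 gets A(F - d)/d = 0.05 * 2.5 / 2.5 = 0.05, the full aperture radius, so the nearer point spills into roughly twice as many pixels in each direction.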
