Read this: http://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/773-GG/lectA-773.htm . It explains 3D reconstruction using two cameras. Now, for a simple summary, look at the drawing from that site:

You only know the image points pl / pr. By tracing a line from each focal point Ol / Or through its image point, you get two rays (Pl / Pr) that both contain the point P. Since you know the position and orientation of the two cameras, you can build 3D equations for these lines. Their intersection is then the 3D point P. Voilà, it's that simple.
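In practice the two rays will not intersect exactly because of noise in the image points, so you take the point closest to both. Here is a minimal numpy sketch (my own illustration, not from the lecture above; `intersect_rays` and its arguments are hypothetical names, with `o1`, `o2` the camera centers and `d1`, `d2` the back-projected ray directions):

```python
import numpy as np

def intersect_rays(o1, d1, o2, d2):
    # Midpoint of the shortest segment between the rays o_i + t_i * d_i;
    # with noisy image points the two rays rarely intersect exactly.
    w = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b              # ~0 if the rays are (near-)parallel
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    return (o1 + t1 * d1 + o2 + t2 * d2) / 2
```

For example, rays from (0,0,0) along (1,1,1) and from (1,0,0) along (0,1,1) both pass through (1,1,1), and `intersect_rays` returns exactly that point.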
But when you drop one camera (say the left one), you only know the ray Pr; the depth information is lost. Fortunately, you know the radius of your ball, and this extra piece of information can supply the missing depth. See the following drawing (ignore my drawing skills):
Now you can recover the depth using the intercept theorem.
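In formulas: the intercept theorem (similar triangles through the focal point) gives image_radius / ball_radius = f / Z, so Z = f * ball_radius / image_radius. A tiny sketch, assuming the ball sits near the optical axis and the focal length f is expressed in pixels (all names here are mine):

```python
def depth_from_ball_radius(f_px, ball_radius_m, image_radius_px):
    # Similar triangles through the focal point (intercept theorem):
    # image_radius / ball_radius = f / Z  =>  Z = f * ball_radius / image_radius
    return f_px * ball_radius_m / image_radius_px

# e.g. f = 800 px and a 0.11 m football imaged with a 22 px radius:
# depth_from_ball_radius(800, 0.11, 22)  ->  4.0 m
```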
One last issue: the shape of the ball changes when it is projected at an angle (i.e. not perpendicular to the image plane). However, you know the angle, so compensation is possible, but I leave that to you :p
edit: answering @ripkars' comment (the comment field was too small)
1) ok
2) ah, the correspondence problem :D It is usually solved by correlation analysis or feature matching (mostly matching followed by tracking in the video). (There are other methods too.) I haven't used the image/vision toolbox myself, but there should be functions in it that help you with this; see the correlation sketch below this list.
3) = calibrate your cameras. Usually you only have to do this once, when installing the cameras (and again every time their relative position changes).
4) yes, just write down the Longuet-Higgins equations, i.e. solve
P = C1 + mu1 * R1 * K1^(-1) * p1
P = C2 + mu2 * R2 * K2^(-1) * p2
with:
P = the 3D point to find
C = camera center (a vector)
R = rotation matrix expressing the orientation of the camera in the world frame
K = camera calibration matrix (containing the internal camera parameters, not to be confused with the external parameters contained in R and C)
p1 and p2 = the image points
mu = parameter expressing the position of P on the projection line from the camera center C through P (if I'm correct, R * K^(-1) * p gives the direction vector of that line, pointing from C towards P)
These are 6 equations in 5 unknowns: mu1, mu2 and the three coordinates of P, so you can solve them in a least-squares sense, as sketched below.
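A minimal numpy sketch of that solve (my own code, following the notation above; p1 and p2 are homogeneous image points like [u, v, 1], and the exact sign/orientation conventions for R and C depend on how your calibration defines them):

```python
import numpy as np

def triangulate(C1, R1, K1, p1, C2, R2, K2, p2):
    # Solve the two ray equations from above in a least-squares sense:
    #   P = C1 + mu1 * R1 @ inv(K1) @ p1
    #   P = C2 + mu2 * R2 @ inv(K2) @ p2
    v1 = R1 @ np.linalg.inv(K1) @ p1   # direction of the ray from C1
    v2 = R2 @ np.linalg.inv(K2) @ p2   # direction of the ray from C2
    # Stack into A @ [Px, Py, Pz, mu1, mu2] = [C1; C2]: 6 equations, 5 unknowns
    A = np.zeros((6, 5))
    A[:3, :3] = np.eye(3); A[:3, 3] = -v1
    A[3:, :3] = np.eye(3); A[3:, 4] = -v2
    b = np.concatenate([C1, C2])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3]                       # P; x[3] and x[4] are mu1, mu2
```

Stacking both vector equations gives a 6x5 linear system in (P, mu1, mu2); lstsq returns the point closest to both rays.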
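And for point 2, a bare-bones illustration of correlation-based matching (again my own sketch with hypothetical names; real implementations restrict the search to the epipolar line and are far faster):

```python
import numpy as np

def ncc(a, b):
    # Normalized cross-correlation of two equal-size patches; ~1 = good match
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_match(template, image):
    # Brute-force scan of the image for the patch most similar to `template`
    th, tw = template.shape
    best, best_rc = -2.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(template, image[r:r + th, c:c + tw])
            if score > best:
                best, best_rc = score, (r, c)
    return best_rc
```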
edit: @ripkars' comment again (the comment field is still too small) The only computer vision library that pops into my head is OpenCV ( http://opencv.willowgarage.com/wiki ). But that is a C library, not MATLAB... I think Google is your friend ;)
About calibration: yes, if those two images contain enough information to fit the calibration parameters. If you change the relative position of the cameras, you will of course have to recalibrate.
The choice of world frame is arbitrary; it only becomes important when you want to analyze the recovered 3D data. For example, you could align one of the world planes with the plane of motion, which simplifies the motion equations if you want to fit them. This world frame is just a reference frame, and you can move to another one with a "change of reference frame" transformation (a translation and/or a rotation), as sketched below.
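A trivial sketch of such a change of reference frame (hypothetical names; R and t describe the old frame's orientation and origin as seen from the new one):

```python
import numpy as np

def change_frame(points, R, t):
    # points: (N, 3) array in the old frame; R, t: rotation and translation
    # of the old frame expressed in the new one. Returns points in the new frame.
    return points @ R.T + t
```

Pick R and t so that, say, the plane of motion becomes the z = 0 plane, and the motion equations simplify accordingly.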