Here is an explicit example based on form functions.
Consider the functions:
u1 (x, z) = (x-x_b) / (x_c-x_b)
We have u1 (x_b, z_b) = u1 (x_a, z_a) = 0 (because x_a = x_b) and u1 (x_c, z_c) = u1 (x_d, z_d) = 1
u2 (x, z) = 1 - u1 (x, z)
Now we have u2 (x_b, z_b) = u2 (x_a, z_a) = 1 and u2 (x_c, z_c) = u2 (x_d, z_d) = 0
v1 (x, z) = (z-z_b) / (z_a-z_b)
This function satisfies the condition v1 (x_a, z_a) = v1 (x_d, z_d) = 1 and v1 (x_b, z_b) = v1 (x_c, z_c) = 0
v2 (x, z) = 1 - v1 (x, z)
We have v2 (x_a, z_a) = v2 (x_d, z_d) = 0 and v2 (x_b, z_b) = v2 (x_c, z_c) = 1
Now create new functions as follows:
S_D (x, z) = u1 (x, z) * v1 (x, z)
We get S_D (x_d, z_d) = 1 and S_D (x_a, z_a) = S_D (x_b, z_b) = S_D (x_c, z_c) = 0
S_C (x, z) = u1 (x, z) * v2 (x, z)
We get S_C (x_c, z_c) = 1 and S_C (x_a, z_a) = S_C (x_b, z_b) = S_C (x_d, z_d) = 0
S_A (x, z) = u2 (x, z) * v1 (x, z)
We get S_A (x_a, z_a) = 1 and S_A (x_b, z_b) = S_A (x_c, z_c) = S_A (x_d, z_d) = 0
S_B (x, z) = u2 (x, z) * v2 (x, z)
We get S_B (x_b, z_b) = 1 and S_B (x_a, z_a) = S_B (x_c, z_c) = S_B (x_d, z_d) = 0
Now define your interpolation function as
H (x, z) = h_a * S_A (x, z) + h_b * S_B (x, z) + h_c * S_C (x, z) + h_d * S_D (x, z),
where h_a is the height at point A, h_b is the height at point B, etc.
You can easily verify that H is indeed an interpolating function:
H (x_a, z_a) = h_a, H (x_b, z_b) = h_b, H (x_c, z_c) = h_c and H (x_d, z_d) = h_d.
Now, to approximate the height at point P, all you have to do is estimate H at that point:
h_p = H (x_p, z_p)
S functions are usually called "form functions." There is one such function for each node you want your interpolated value to depend on, in which case they all satisfy the Kronecker delta property (they take a value of one on one node and zero on all other nodes).
There are many ways to build shape functions for a given set of nodes. If I remember correctly, the construction of two-dimensional form-functions by multiplying the 1D-form functions (as we did in this case) is called the "tensor product of functions" (in this case it is simple because the grid is rectangular). We ended up with four functions (one per node), all of which are linear combinations of {1, x, z, xz}.
If you want to use only three points for your interpolation, you should easily build three shape functions in the form of linear combinations {1, x, z}, but you will lose 25% of the height information provided by the grid, and your interpolator will not be smooth inside the rectangle. when h_b! = h_d.