Least-Squares Approximation
Like the method of cubic splines, the least-squares method attempts to fit a function through a set of data points without the wiggle problem associated with higher-order polynomial approximation. But unlike the cubic spline technique, the derived least-squares function does not necessarily pass through every data point. The method approximates a function such that the sum of the squares of the differences between the approximating function and the actual values given by the data is a minimum.
The basis for the method is as follows. Given a set of data points

(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)

we wish to fit an approximating function of the form

y = f(x) = c_1 f_1(x) + c_2 f_2(x) + ... + c_m f_m(x)

where f_1, f_2, ..., f_m are assumed functions of the independent variable x. The problem is to evaluate the regression coefficients c_1, c_2, ..., c_m. The method of least squares suggests that these can be calculated by minimizing the sum of the squares of the vertical distances (deviations)

S(c_1, ..., c_m) = sum_{i=1}^{n} (y_i - f(x_i))^2

The minimum value of S is determined by setting the first partial derivatives of S with respect to c_1, ..., c_m equal to zero:

dS/dc_j = 0,  j = 1, 2, ..., m

This set of m linear algebraic equations can then be solved for the unknowns c_1, c_2, ..., c_m. These regression coefficients are then substituted into y = f(x) to give the desired approximating function.
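As a sketch of this procedure (in Python rather than the text's Maple; the function name, basis functions, and data points below are invented for illustration), the normal equations can be assembled and solved as follows:

```python
# Illustrative sketch (Python, not the text's Maple); the basis functions
# and data below are made up for demonstration.
import numpy as np

def least_squares_coeffs(basis, xs, ys):
    """Coefficients c minimizing S = sum_i (y_i - sum_j c_j * f_j(x_i))^2.

    Setting dS/dc_j = 0 for each j yields the normal equations
    (F^T F) c = F^T y, where F[i, j] = f_j(x_i).
    """
    F = np.array([[f(x) for f in basis] for x in xs], dtype=float)
    y = np.array(ys, dtype=float)
    return np.linalg.solve(F.T @ F, F.T @ y)

# Fit y = c1*x + c2 to points that lie exactly on y = 2x + 1,
# so the fit should recover c1 = 2, c2 = 1.
c = least_squares_coeffs([lambda x: x, lambda x: 1.0], [0, 1, 2, 3], [1, 3, 5, 7])
```

Here the basis is {x, 1}, giving a straight line; any other set of assumed functions f_j works the same way.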
The technique used to estimate a linear relationship of the form

y = c_1 x + c_2

is known as simple regression. Geometrically, this amounts to finding the line in the plane that best fits the data points

(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)

This line is called the least squares line, and the coefficients c_1 and c_2 are called regression coefficients. If all the points were to lie exactly on the least squares line, we would have

c_1 x_i + c_2 = y_i,  i = 1, 2, ..., n

Following the procedure given above, we get the normal equations

c_1 sum x_i^2 + c_2 sum x_i = sum x_i y_i
c_1 sum x_i   + c_2 n       = sum y_i
See also LinearFit in the Statistics package.
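For the straight-line case the two normal equations can be solved in closed form by Cramer's rule; a small Python sketch (the sample data are made up):

```python
# Closed-form solution of the two normal equations for the line
# y = c1*x + c2 (Python sketch; the sample data are invented).
def regression_line(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations:  c1*sxx + c2*sx = sxy   and   c1*sx + c2*n = sy.
    det = sxx * n - sx * sx          # determinant (Cramer's rule)
    c1 = (sxy * n - sx * sy) / det
    c2 = (sxx * sy - sx * sxy) / det
    return c1, c2

# Points lying exactly on y = 2x + 1 should be reproduced exactly.
c1, c2 = regression_line([0, 1, 2], [1, 3, 5])
```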
The least-squares method can be written in matrix form

A x = b

where

A = | x_1  1 |      x = | c_1 |      b = | y_1 |
    | x_2  1 |          | c_2 |          | y_2 |
    |  ...   |                           | ... |
    | x_n  1 |                           | y_n |

But since most of these points probably do not lie on the line, we have

A x <> b

or

A x + e = b

where e is the residual vector. The vector element e_i is the vertical deviation from the point (x_i, y_i) to the least squares line.

These n linear equations in the two unknowns c_1 and c_2 form an overdetermined system of equations that probably has no solution. The system is inconsistent. Therefore we find a least squares solution to the linear system, given by the normal equations

A^T A x = A^T b

presented in Chapter 6.9. The system is solved for x by Gauss-Jordan elimination of the augmented matrix [A^T A | A^T b], which yields

x = (A^T A)^(-1) A^T b

The least squares solution of A x = b minimizes the norm of the residual, ||b - A x||.
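A sketch of this matrix formulation in Python (the data are invented; numpy's `lstsq`, which minimizes ||b - A x||, serves as an independent check on the normal equations):

```python
# Sketch (Python, made-up data): least squares solution of an
# overdetermined system A x = b.  numpy's lstsq minimizes ||b - A x||_2
# and should agree with solving the normal equations A^T A x = A^T b.
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])          # rows [x_i, 1]
b = np.array([1.1, 2.9, 5.2, 6.8])  # invented y-values, not all on one line

x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
```

Both routes give the same regression coefficients; for larger or ill-conditioned problems `lstsq` (based on an orthogonal factorization) is numerically safer than forming A^T A explicitly.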
Example 8
Find the least squares line for the data points given in the table.

x | 0    | 0.5  | 1.0  | 1.5  | 2.0  | 2.5  | 3.0  | 3.5  | 4.0
y | 0.99 | 0.70 | 0.49 | 0.24 | 0.12 | 0.25 | 0.49 | 0.83 | 0.91
Solution
>  restart:MathMaple:ini():alias(GaussJord=ReducedRowEchelonForm): 
>  X:=[seq(0.5*i,i=0..8)]: Y:=[0.99,0.70,0.49,0.24,0.12,0.25,0.49,0.83,0.91]: 
We use the x-coordinates to build the matrix A and the y-coordinates to build the vector b.
>  A:=Matrix([[0,1],[0.5,1],[1.0,1],[1.5,1],[2.0,1],[2.5,1],[3.0,1],[3.5,1],[4.0,1]]): b:=<Y>:
'A'=A,'b'=b; 
The augmented matrix

[ A^T A | A^T b ]

is given by

| 51  18 | 10.08 |
| 18   9 |  5.02 |

Gauss-Jordan elimination gives

| 1  0 | 0.002667 |
| 0  1 | 0.552444 |

so c_1 = 0.002667 and c_2 = 0.552444. The equation for the least squares line is

y = 0.002667 x + 0.552444

Direct computation, x = (A^T A)^(-1) A^T b, gives the same result.
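The worksheet uses Maple, but the same computation is easy to cross-check; here is a Python sketch using the table's data and the normal equations:

```python
# Cross-check of Example 8 (Python sketch; the worksheet itself uses Maple).
import numpy as np

X = [0.5 * i for i in range(9)]
Y = [0.99, 0.70, 0.49, 0.24, 0.12, 0.25, 0.49, 0.83, 0.91]

A = np.array([[x, 1.0] for x in X])          # rows [x_i, 1]
b = np.array(Y)
c1, c2 = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations A^T A x = A^T b
```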
>  p1:=PlotData(X,Y,style=point,symbol=solidcircle,symbolsize=16): p2:=plot(yx,x=0..4,legend=typeset(y=yx)): display(p1,p2,thickness=2);
(Plot: the data points together with the least squares line.)
Let us also compute the regression coefficients using the method described in the first section. We have the data sums

n = 9,  sum x_i = 18,  sum x_i^2 = 51,  sum y_i = 5.02,  sum x_i y_i = 10.08

so the normal equations become

51 c_1 + 18 c_2 = 10.08
18 c_1 +  9 c_2 =  5.02

which again give c_1 = 0.002667 and c_2 = 0.552444.
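The sums themselves are quick to verify; a short Python check (illustrative, not part of the worksheet):

```python
# Quick check of the sums entering the normal equations (illustrative).
X = [0.5 * i for i in range(9)]
Y = [0.99, 0.70, 0.49, 0.24, 0.12, 0.25, 0.49, 0.83, 0.91]

sx  = sum(X)                              # sum of x_i
sxx = sum(x * x for x in X)               # sum of x_i^2
sy  = sum(Y)                              # sum of y_i
sxy = sum(x * y for x, y in zip(X, Y))    # sum of x_i * y_i
```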
The LeastSquares procedure in the CurveFitting package gives

0.552444 + 0.002667 x

and the Fit procedure in the Statistics package returns the same line.