Least-Squares Approximation 

Like the method of cubic splines, the least-squares method attempts to fit a function through a set of data points without the wiggle problem associated with higher-order polynomial approximation. But unlike the cubic spline technique, the derived least-squares function does not necessarily pass through every data point. The method involves approximating a function such that the sum of the squares of the differences between the approximating function and the actual values given by the data is a minimum. 

 

The basis for the method is: 

Given a set of n data points  (x[1], y[1]), (x[2], y[2]), ..., (x[n], y[n])


we wish to fit an approximating function y(x) of the form

y(x) = a[1]*f[1](x) + a[2]*f[2](x) + ... + a[k]*f[k](x)

where k <= n and f[1](x), f[2](x), ..., f[k](x) are assumed functions of the independent variable x. The problem is to evaluate the regression coefficients a[1], a[2], ..., a[k]. The method of least squares suggests that these can be calculated by minimizing the sum of the squares S of the vertical distances (deviations) d[i] = y[i] - y(x[i]):

S = Sum(d[i]^2, i = 1 .. n) = Sum((y[i] - y(x[i]))^2, i = 1 .. n)

The minimum value of S is determined by setting the first partial derivatives of S with respect to the coefficients a[1], a[2], ..., a[k] equal to zero:

∂S/∂a[j] = 0,   j = 1 .. k

This set of k linear algebraic equations in the k unknowns can then be solved for a[1], a[2], ..., a[k].

These regression coefficients are then substituted into

y(x) = a[1]*f[1](x) + a[2]*f[2](x) + ... + a[k]*f[k](x)

to give the desired approximating function.
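
To make the recipe concrete, here is a minimal Maple sketch of these steps for a hypothetical data set with the two assumed basis functions f[1](x) = 1 and f[2](x) = x (the data and names below are illustrative only, not taken from the text):

> # Illustrative data and basis functions (hypothetical)
> xdata := [1, 2, 3, 4]: ydata := [1.1, 1.9, 3.2, 3.9]:
> yfit := x -> a[1]*1 + a[2]*x:                        # y(x) = a[1]*f[1](x) + a[2]*f[2](x)
> S := add((ydata[i] - yfit(xdata[i]))^2, i = 1 .. 4): # sum of squared deviations
> solve({diff(S, a[1]) = 0, diff(S, a[2]) = 0}, {a[1], a[2]});  # the normal equations dS/da[j] = 0

Solving the two normal equations returns the regression coefficients a[1] and a[2] for this assumed model.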

 

 

 

Linear Regression 

The technique used to estimate a linear relationship of the form 

y = a[1]*x + a[0]

is known as simple regression. Geometrically, this amounts to finding the line in the plane that best fits the data points 

(x[1], y[1]), (x[2], y[2]), ..., (x[n], y[n])

This line is called the least squares line, and the coefficients a[1] and a[0] are called regression coefficients. If all the points were exactly on the least squares line, we would have 

y[i] = a[1]*x[i] + a[0],   i = 1 .. n

Following the procedure given above, we get  

a[0] = ( (Σ y[i])(Σ x[i]^2) - (Σ x[i])(Σ x[i]*y[i]) ) / ( n Σ x[i]^2 - (Σ x[i])^2 )

a[1] = ( n Σ x[i]*y[i] - (Σ x[i])(Σ y[i]) ) / ( n Σ x[i]^2 - (Σ x[i])^2 )

where all sums are taken over i = 1 .. n.
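
The sketch below evaluates these closed-form expressions for a small, purely illustrative data set (the names xd, yd, Sx, Sy, Sxx, Sxy, a0, a1 are shorthands introduced only for this sketch):

> # Illustrative data only
> xd := [1, 2, 3]: yd := [2.0, 2.9, 4.1]: n := 3:
> Sx  := add(xd[i], i = 1 .. n):    Sy  := add(yd[i], i = 1 .. n):
> Sxx := add(xd[i]^2, i = 1 .. n):  Sxy := add(xd[i]*yd[i], i = 1 .. n):
> a1 := (n*Sxy - Sx*Sy)/(n*Sxx - Sx^2);    # slope a[1]
> a0 := (Sy*Sxx - Sx*Sxy)/(n*Sxx - Sx^2);  # intercept a[0]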

 

  • LeastSquares(xdata, ydata, v), from the CurveFitting package, computes a least-squares approximation to the given points; by default it fits a straight line in v unless the curve = f option is provided.
    The function f must be linear in the unknown parameters, though f itself need not be linear in the independent variable (see the sketch below).
 

See also LinearFit in the Statistics package. 
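
As an illustration of the curve = f option (the data and the parameter names c1, c2, c3 below are invented for this sketch), the model may be nonlinear in the variable as long as it is linear in the parameters:

> # Hypothetical data for illustration
> xd := [0, 1, 2, 3]: yd := [1.0, 2.7, 7.4, 20.1]:
> CurveFitting:-LeastSquares(xd, yd, v);                              # default: least squares line in v
> CurveFitting:-LeastSquares(xd, yd, v, curve = c1 + c2*v + c3*v^2);  # linear in c1, c2, c3, quadratic in v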

 

The least-squares method can be written in matrix form 

A x = b

where 

A = Matrix([[x[1], 1], [x[2], 1], ..., [x[n], 1]]),   x = <a[1], a[0]>,   b = <y[1], y[2], ..., y[n]>

 

But since most of these points probably do not lie on the line, we have 

y[i] = a[1]*x[i] + a[0] + ε[i],   i = 1 .. n

or 

A x + ε = b,    ε = <ε[1], ε[2], ..., ε[n]>

where ε is the residual vector. The vector element ε[i] is the vertical deviation from the point (x[i], y[i]) to the least squares line. 

These n linear equations in the two unknowns a[0] and a[1] form an overdetermined system that probably has no solution; the system is inconsistent. Therefore we find a least squares solution to the linear system A x = b, given by 

A^T A x = A^T b

presented in Chapter 6.9. The system is solved for x by Gauss-Jordan elimination. 

[ A^T A | A^T b ]  →  [ I | x ]

The least squares solution of A x = b minimizes

|| b - A x ||
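
For the straight line y = a[1]*x + a[0] it is worth writing the normal equations out explicitly; from the definition of A above, a direct calculation gives (all sums over i = 1 .. n)

A^T A = Matrix([[Σ x[i]^2, Σ x[i]], [Σ x[i], n]]),    A^T b = <Σ x[i]*y[i], Σ y[i]>

so A^T A x = A^T b is simply a compact form of the two linear equations whose solution gives the formulas for a[0] and a[1] quoted earlier.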

 

 

Example 8 

Find the least squares line for the data points given in the table. 

 x |   0   |  0.5  |  1.0  |  1.5  |  2.0  |  2.5  |  3.0  |  3.5  |  4.0
 y | -0.99 | -0.70 | -0.49 | -0.24 |  0.12 |  0.25 |  0.49 |  0.83 |  0.91

 

Solution 

> restart:MathMaple:-ini():alias(GaussJord=ReducedRowEchelonForm):
 

> X:=[seq(0.5*i,i=0..8)]: Y:=[-0.99,-0.70,-0.49,-0.24,0.12,0.25,0.49,0.83,0.91]:
 

We use the x-coordinates to build the matrix A and the y-coordinates to build the vector b. 

> A:=Matrix([[0,1],[0.5,1],[1.0,1],[1.5,1],[2.0,1],[2.5,1],[3.0,1],[3.5,1],[4.0,1]]): b:=<Y>:
'A'=A,'b'=b;
 

A = Matrix([[0, 1], [0.5, 1], [1.0, 1], [1.5, 1], [2.0, 1], [2.5, 1], [3.0, 1], [3.5, 1], [4.0, 1]]),   b = Vector([-0.99, -0.70, -0.49, -0.24, 0.12, 0.25, 0.49, 0.83, 0.91])
 

The augmented matrix 

[ A^T A | A^T b ]

is given by 

> ATA:=Transpose(A).A:
ATb:=Transpose(A).b:
 

> B:=Matrix([ATA,ATb]):%;
 

Matrix([[51.00, 18.0, 7.68], [18.0, 9, 0.18]])

Gauss-Jordan elimination gives 

> GaussJord(B);
 

Matrix([[1., 0., 0.4880000000], [0., 1., -0.9560000000]])
 

> av:=Column(%,3):
<a[1],a[0]> =map(evalf,av,3);
 

<a[1], a[0]> = <0.488, -0.956>
 

The equation for the least squares line is 

> y=evalf(av[1]*x+av[2],3);
 

y = 0.488*x - 0.956
 

> yx:=rhs(%):
 

Direct computation gives 

> <a[1],a[0]> =map(evalf,LinearAlgebra:-LeastSquares(A,b),3);
 

<a[1], a[0]> = <0.488, -0.956>
 

> p1:=PlotData(X,Y,style=point,symbol=solidcircle,symbolsize=16): p2:=plot(yx,x=0..4,legend=typeset(y=yx)): display(p1,p2,thickness=2);

 

[Plot: the data points (solid circles) together with the least squares line y = 0.488*x - 0.956]
 

Let us also compute the regression coefficients using the basic method described in the first section. We have the data 

> 'X'=X,'Y'=Y;
 

X = [0., .5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0], Y = [-.99, -.70, -.49, -.24, .12, .25, .49, .83, .91]
 

> a:='a':b:='b':
 

> y:=t->a*t+b;
 

y := t -> a*t + b

> S=Sum('(y(X[i])-Y[i])^2',i=1..9);
 

S = Sum((y(X[i]) - Y[i])^2, i = 1 .. 9)
 

> S:=value(rhs(%));
 

S := (0.99 + b)^2 + (0.5*a + b + 0.70)^2 + (1.0*a + b + 0.49)^2 + (1.5*a + b + 0.24)^2 + (2.0*a + b - 0.12)^2 + (2.5*a + b - 0.25)^2 + (3.0*a + b - 0.49)^2 + (3.5*a + b - 0.83)^2 + (4.0*a + b - 0.91)^2
 

> eq1:=diff(S,a)=0:%;
 

102.00*a + 36.0*b - 15.360 = 0
 

> eq2:=diff(S,b)=0:%;
 

36.0*a + 18*b - 0.36 = 0
 

> solve({eq1,eq2});
 

{a = .4880000000, b = -.9560000000}
 

The least-squares procedure in the CurveFitting package gives 

> y=CurveFitting:-LeastSquares(X,Y,t);
 

y = -0.9560000000 + 0.4880000000*t
 

or by using the Fit procedure from the Statistics package 

> y=Fit(a*t+b,X,Y,t);
 

y = 0.487999999999999878*t - 0.955999999999999740
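
Since LinearFit in the Statistics package was mentioned earlier, a minimal sketch on the same data is given below; the call form is a suggestion, and the displayed precision may differ between Maple versions.

> Statistics:-LinearFit([1, t], X, Y, t);   # basis functions 1 and t; should reproduce -0.956 + 0.488*t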