Understanding the matrix formulation of least squares regression can help you understand the essence of DOE.

The equation of a straight line has the following well-known form:

y = β_0 + β_1x

Using this notation the equation can be generalised to a P^{th} order polynomial:

y = β_0 + β_1x + β_2x^{2} + … + β_Px^{P}

If there are N independent observations then

y_i = β_0 + β_1x_i + β_2x_i^{2} + … + β_Px_i^{P},  for i = 1, 2, …, N

We have N simultaneous equations with (P+1) unknown parameters. The equations are most conveniently expressed using a matrix notation.

The N observations of y can be represented as a vector **Y** of order N, and the β-parameters can be represented by a vector **B** of order P+1:

**Y** = [y_1, y_2, …, y_N]^{t},  **B** = [β_0, β_1, …, β_P]^{t}

Furthermore an **X** matrix of dimensions N x (P+1) can be constructed, with row i containing the successive powers of x_i:

| 1  x_1  x_1^{2}  …  x_1^{P} |
| 1  x_2  x_2^{2}  …  x_2^{P} |
| ⋮   ⋮    ⋮          ⋮      |
| 1  x_N  x_N^{2}  …  x_N^{P} |

In matrix notation the equation now becomes:

**Y** = **XB**
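As a sketch of how the **X** matrix might be built in practice (using NumPy; the data values and variable names here are purely illustrative):

```python
import numpy as np

# Illustrative data: N = 5 observations of a single x-variable
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
P = 2  # fit a 2nd-order polynomial

# Construct the N x (P+1) X matrix: column j holds x raised to the power j
X = np.column_stack([x**j for j in range(P + 1)])
print(X.shape)  # (5, 3), i.e. N x (P+1)
```

Each row of `X` is then [1, x_i, x_i^{2}, …, x_i^{P}], matching the matrix above.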

The unknowns in this equation are the β-parameters that form the elements of the **B** vector.

The naïve way to solve the equation is the following:

**B** = X^{-1}**Y**

where X^{-1} denotes the inverse of the X matrix.

This is “naïve” because we can only take the inverse of a square matrix, and **X** is generally not square. If we have a matrix of dimensions n x m then its transpose (X^{t}) has dimensions m x n, and multiplying the transpose by the original matrix results in a square matrix of dimensions m x m. This matrix can be inverted. Hence we take the following steps:

X^{t}**Y** = X^{t}X**B**

**B** = (X^{t}X)^{-1}X^{t}**Y**
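A minimal NumPy sketch of these steps (the observations here are made up for illustration):

```python
import numpy as np

# Made-up observations, roughly following a straight line
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# X matrix for a straight-line model (P = 1): columns [1, x]
X = np.column_stack([np.ones_like(x), x])

# Normal equations: B = (X^t X)^{-1} X^t Y
B = np.linalg.inv(X.T @ X) @ X.T @ y

# Numerically, np.linalg.lstsq is preferred over an explicit
# inverse, but it yields the same estimates:
B_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Both routes give the same intercept and slope estimates; the explicit inverse is shown only because it mirrors the derivation above.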

The above analysis was based on a single x-variable; however, the matrix formulation generalises to the case where multiple x-variables are included in the model.

In least squares regression, the x-variables are presumed to be controlled and all errors are assumed to be in the observations y. Consequently the variance-covariance matrix of the estimated parameters is:

Var(**B**) = σ^{2}(X^{t}X)^{-1}

where σ^{2} is the variance of the errors in y.

If our goal is to produce estimates of the β-parameters with minimum variance then we need to minimise the following matrix:

(X^{t}X)^{-1}

Notice that this quantity is a function of two things:

- The type of model that we wish to fit (i.e. the number of polynomial terms)

- Our choice of x-values

Both of these are under our control and known before the experiment is performed!
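To illustrate, here is a sketch comparing two invented candidate designs before any data are collected. Because (X^{t}X)^{-1} depends only on the x-values and the model, we can see in advance that placing runs at the extremes of the range gives a smaller variance for the slope estimate than spacing them evenly:

```python
import numpy as np

def slope_variance_factor(x):
    """Return the (1, 1) element of (X^t X)^{-1} for a straight-line
    model -- proportional to the variance of the slope estimate."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.inv(X.T @ X)[1, 1]

# Two candidate designs: same range, same number of runs
evenly_spaced = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
extremes_only = np.array([0.0, 0.0, 0.5, 1.0, 1.0])

print(slope_variance_factor(evenly_spaced))  # ≈ 1.6
print(slope_variance_factor(extremes_only))  # ≈ 1.0
```

No y-observations appear anywhere in this comparison: the quality of the design is evaluated purely from the planned x-values and the chosen model.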

*This is the basis of design of experiments.*
