Linear least squares (mathematics)

Linear least squares (LLS) is the least squares approximation of linear functions to data. It is a set of formulations for solving statistical problems involved in linear regression, including variants for ordinary (unweighted), weighted, and generalized (correlated) residuals. Numerical methods for linear least squares include inverting the matrix of the normal equations and orthogonal decomposition methods.
Main formulations
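The three main linear least squares formulations are ordinary least squares (OLS), weighted least squares (WLS), and generalized least squares (GLS), which treat unweighted, weighted, and correlated residuals respectively.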
Alternative formulations
Other formulations include percentage least squares and constrained least squares.
Percentage least squares focuses on reducing percentage errors, which is useful in forecasting and time series analysis. It is also useful when the dependent variable has a wide range without constant variance, because the larger residuals at the upper end of the range would dominate the fit if OLS were used. When the percentage or relative error is normally distributed, least squares percentage regression provides maximum likelihood estimates. Percentage regression is linked to a multiplicative error model, whereas OLS is linked to models containing an additive error term.[6]
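As a minimal sketch of the idea, percentage least squares can be treated as minimizing the sum of squared relative errors, which amounts to an ordinary least squares problem after dividing each row by its observation. The function name `percentage_least_squares` and the data below are assumptions made for illustration, and the approach requires every observation to be nonzero.

```python
import numpy as np

def percentage_least_squares(X, y):
    """Minimize sum(((y - X @ b) / y)**2) over b; requires all y != 0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(y == 0):
        raise ValueError("relative errors are undefined when an observation is zero")
    # Dividing row i by y_i gives an ordinary least squares problem
    # whose target vector is all ones.
    X_scaled = X / y[:, None]
    b, *_ = np.linalg.lstsq(X_scaled, np.ones_like(y), rcond=None)
    return b

# Hypothetical data spanning two orders of magnitude.
x = np.array([1.0, 2.0, 5.0, 10.0, 50.0, 100.0])
y = np.array([2.1, 3.9, 10.4, 19.5, 103.0, 196.0])
X = np.column_stack([np.ones_like(x), x])  # intercept and slope columns

b_pct = percentage_least_squares(X, y)
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

def relative_sse(b):
    """Sum of squared relative errors for coefficient vector b."""
    return np.sum(((y - X @ b) / y) ** 2)
```

By construction, `relative_sse(b_pct)` cannot exceed `relative_sse(b_ols)`, while the OLS fit has the smaller sum of squared absolute errors.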
In constrained least squares, one is interested in solving a linear least squares problem with an additional constraint on the solution.
Objective function
In OLS (i.e., assuming unweighted observations), the optimal value of the objective function is found by substituting the optimal expression for the coefficient vector into the objective function.
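For reference, a standard closed form for this optimal value can be written as follows. The symbols are assumptions introduced here: $X$ is the $m \times n$ design matrix with full column rank, $\mathbf{y}$ is the vector of observations, and $H$ is the projection ("hat") matrix.

```latex
\begin{align}
\hat{\boldsymbol\beta} &= (X^{\mathsf T} X)^{-1} X^{\mathsf T} \mathbf{y},
\qquad H = X (X^{\mathsf T} X)^{-1} X^{\mathsf T}, \\
S &= \lVert \mathbf{y} - X \hat{\boldsymbol\beta} \rVert^2
   = \mathbf{y}^{\mathsf T} (I - H)^{\mathsf T} (I - H)\, \mathbf{y}
   = \mathbf{y}^{\mathsf T} (I - H)\, \mathbf{y}.
\end{align}
```

The last equality holds because $I - H$ is symmetric and idempotent. With unit weights and uncorrelated zero-mean errors of common variance $\sigma^2$, the expected value of $S$ is $(m - n)\sigma^2$.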
These values can be used for a statistical criterion as to the goodness of fit. When unit weights are used, the numbers should be divided by the variance of an observation.
For WLS, the ordinary objective function above is replaced by a weighted sum of squared residuals.
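A minimal sketch of a weighted fit follows, assuming the weights are the reciprocals of known observation variances. The function name `weighted_least_squares` and the data are hypothetical.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimize sum(w_i * (y_i - (X @ b)_i)**2) for nonnegative weights w."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sqrt_w = np.sqrt(np.asarray(w, dtype=float))
    # Scaling row i by sqrt(w_i) reduces WLS to an ordinary least squares problem.
    b, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return b

# Hypothetical measurements with assumed standard deviations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.7])
sigma = np.array([0.1, 0.1, 0.2, 0.2, 0.5])

X = np.column_stack([np.ones_like(x), x])  # straight-line model
w = 1.0 / sigma**2                         # weights as inverse variances
b_wls = weighted_least_squares(X, y, w)
b_unit = weighted_least_squares(X, y, np.ones_like(y))  # same as OLS
```

With all weights equal to one, the function returns the OLS solution, so the weighting only matters when the observations differ in precision.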
Discussion
In statistics and mathematics, linear least squares is an approach to fitting a mathematical or statistical model to data in cases where the idealized value provided by the model for any data point is expressed linearly in terms of the unknown parameters of the model. The resulting fitted model can be used to summarize the data, to predict unobserved values from the same system, and to understand the mechanisms that may underlie the system.
Mathematically, linear least squares is the problem of approximately solving an overdetermined system of linear equations, where the best approximation is defined as that which minimizes the sum of squared differences between the data values and their corresponding modeled values. The approach is called linear least squares since the assumed function is linear in the parameters to be estimated. Linear least squares problems are convex and have a closed-form solution that is unique, provided that the number of data points used for fitting equals or exceeds the number of unknown parameters, except in special degenerate situations. In contrast, non-linear least squares problems generally must be solved by an iterative procedure, and the problems can be non-convex with multiple optima for the objective function. If prior distributions are available, then even an underdetermined system can be solved using the Bayesian MMSE estimator.
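The closed-form solution of a full-rank overdetermined system can be computed in several ways. The sketch below compares the normal equations, an orthogonal (QR) decomposition, and NumPy's `numpy.linalg.lstsq` on synthetic data. The matrix sizes, the random seed, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 3                       # more data points than unknowns
X = rng.normal(size=(m, n))        # synthetic design matrix
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=m)

# 1. Normal equations: solve (X^T X) beta = X^T y.
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# 2. Reduced QR decomposition: X = Q R, then solve R beta = Q^T y.
Q, R = np.linalg.qr(X)
beta_qr = np.linalg.solve(R, Q.T @ y)

# 3. Library routine (SVD-based), which also reports the rank of X.
beta_ls, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
```

All three agree for a well-conditioned problem. Forming `X.T @ X` squares the condition number, which is one reason orthogonal decomposition methods are often preferred for ill-conditioned data.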
In statistics, linear least squares problems correspond to a particularly important type of statistical model called linear regression which arises as a particular form of regression analysis. One basic form of such a model is an ordinary least squares model. The present article concentrates on the mathematical aspects of linear least squares problems, with discussion of the formulation and interpretation of statistical regression models and statistical inferences related to these being dealt with in the articles just mentioned. See outline of regression analysis for an outline of the topic.
Properties
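If the experimental errors are uncorrelated and have zero mean and constant variance, the Gauss–Markov theorem states that the least-squares estimator has the minimum variance among all linear unbiased estimators. For example, the arithmetic mean of a set of measurements of a quantity is the least-squares estimator of the value of that quantity. If the conditions of the Gauss–Markov theorem apply, the arithmetic mean is optimal, whatever the distribution of errors of the measurements might be.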
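The claim about the arithmetic mean can be checked in a few lines. In this sketch, $y_1, \dots, y_m$ denote the measurements and $\beta$ the unknown quantity (notation assumed here):

```latex
\begin{align}
S(\beta) &= \sum_{i=1}^{m} (y_i - \beta)^2, \\
\frac{dS}{d\beta} &= -2 \sum_{i=1}^{m} (y_i - \beta) = 0
\quad\Longrightarrow\quad
\hat\beta = \frac{1}{m} \sum_{i=1}^{m} y_i .
\end{align}
```

Since $d^2S/d\beta^2 = 2m > 0$, this stationary point is the unique minimum.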
However, if the experimental errors are normally distributed, the least-squares estimator is also a maximum likelihood estimator.[9]
These properties underpin the use of the method of least squares for all types of data fitting, even when the assumptions are not strictly valid.
Limitations
An assumption underlying the treatment given above is that the independent variable, x, is free of error. In practice, the errors on the measurements of the independent variable are usually much smaller than the errors on the dependent variable and can therefore be ignored. When this is not the case, total least squares or more generally errors-in-variables models, or rigorous least squares, should be used. This can be done by adjusting the weighting scheme to take into account errors on both the dependent and independent variables and then following the standard procedure.[10][11]
Applications
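Polynomial fitting: models are polynomials in an independent variable, x. Straight line: $f(x, \boldsymbol\beta) = \beta_1 + \beta_2 x$.[12] Quadratic: $f(x, \boldsymbol\beta) = \beta_1 + \beta_2 x + \beta_3 x^2$. Cubic, quartic and higher polynomials are handled in the same way. For regression with high-order polynomials, the use of orthogonal polynomials is recommended.[13]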
Numerical smoothing and differentiation, an application of polynomial fitting.
Multinomials in more than one independent variable, including surface fitting
Curve fitting with B-splines [10]
Chemometrics, calibration curve, standard addition, Gran plot, analysis of mixtures
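As an illustration of the polynomial-fitting application, the sketch below fits a quadratic twice: once with a monomial (Vandermonde) design matrix and once in a Legendre basis, one common choice of orthogonal polynomials. The helper name `polyfit_lls` and the noise-free data are assumptions made for this example.

```python
import numpy as np

def polyfit_lls(x, y, degree):
    """Fit beta_1 + beta_2*x + ... + beta_{degree+1}*x**degree by least squares."""
    x = np.asarray(x, dtype=float)
    # Columns of the design matrix are 1, x, x**2, ..., x**degree.
    X = np.vander(x, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coeffs

# Noise-free samples of a known quadratic on [-1, 1].
x = np.linspace(-1.0, 1.0, 11)
y = 2.0 - 1.0 * x + 0.5 * x**2

coeffs = polyfit_lls(x, y, 2)  # monomial-basis coefficients

# The same fit expressed in a Legendre (orthogonal polynomial) basis.
leg_coeffs = np.polynomial.legendre.legfit(x, y, 2)
y_leg = np.polynomial.legendre.legval(x, leg_coeffs)
```

Both fits describe the same polynomial. The coefficients differ only because the basis differs, and the orthogonal basis tends to give a better-conditioned problem as the degree grows.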
Uses in data fitting
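Ideally, the model function fits the data exactly, so $y_i = f(x_i, \boldsymbol\beta)$ for every data point $i = 1, \dots, m$. This is usually not possible in practice, because there are more data points than parameters to be determined. The approach is then to make the sum of the squares of the residuals $r_i(\boldsymbol\beta) = y_i - f(x_i, \boldsymbol\beta)$ as small as possible, that is, to minimize the function
$$S(\boldsymbol\beta) = \sum_{i=1}^{m} r_i^2(\boldsymbol\beta),$$
and the best fit can be found by solving the normal equations.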
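The sketch below shows what "linear in the parameters" means in practice: the model is a linear combination of basis functions that may themselves be nonlinear in x, the design matrix has entries X[i, j] = phi_j(x_i), and the fit solves the normal equations. The particular basis functions (1, x, sin x) and the synthetic data are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical basis functions phi_j(x); the model is sum_j beta_j * phi_j(x).
basis = [
    lambda x: np.ones_like(x),  # constant term
    lambda x: x,                # linear term
    lambda x: np.sin(x),        # nonlinear in x, still linear in beta
]

def design_matrix(x, basis):
    """Return the matrix with entries X[i, j] = basis[j](x[i])."""
    return np.column_stack([phi(x) for phi in basis])

x = np.linspace(0.0, 6.0, 25)
beta_true = np.array([0.5, 1.5, -2.0])
X = design_matrix(x, basis)
y = X @ beta_true               # noise-free data from the model

# Normal equations: (X^T X) beta = X^T y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

Because the data are generated without noise, `beta_hat` recovers `beta_true` up to rounding error.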
Example
The goal is to solve an overdetermined system of four equations in two unknowns in some "best" sense.
The residual, at each point, between the curve fit and the data is the difference between the right- and left-hand sides of the equations above. The least squares approach to solving this problem is to make the sum of the squares of these residuals as small as possible; that is, to find the minimum of the function $S = \sum_{i=1}^{4} r_i^2$, where $r_i$ is the residual at the $i$-th data point.
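The minimum is found by setting the partial derivatives of this function with respect to the two unknowns to zero. This results in a system of two equations in two unknowns, called the normal equations, whose solution gives the least-squares values of the parameters.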
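A worked instance of this procedure can be sketched as follows. The four data points and the straight-line model $y = \beta_1 + \beta_2 x$ are assumptions chosen here for illustration, since the example's data do not appear in the text above.

```python
import numpy as np

# Illustrative data (assumed): four (x, y) points.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Four equations in two unknowns: beta_1 + beta_2 * x_i = y_i.
X = np.column_stack([np.ones_like(x), x])

# Normal equations (X^T X) beta = X^T y: two equations in two unknowns.
A = X.T @ X                  # [[4, 10], [10, 30]]
b = X.T @ y                  # [28, 77]
beta = np.linalg.solve(A, b) # approximately [3.5, 1.4]

residuals = y - X @ beta
S_min = residuals @ residuals  # minimum sum of squared residuals, about 4.2
```

For these assumed data the fitted line is approximately y = 3.5 + 1.4x.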
Using a quadratic model
The partial derivative with respect to the parameter (this time there is only one) is again computed and set to 0, and the resulting equation is solved for that parameter.
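A minimal sketch of the one-parameter case follows, assuming the restricted quadratic model $y = \beta_1 x^2$ and four illustrative data points that are not taken from the text above. Setting the derivative of $S(\beta_1) = \sum_i (y_i - \beta_1 x_i^2)^2$ to zero gives $\beta_1 = \sum_i x_i^2 y_i / \sum_i x_i^4$.

```python
import numpy as np

# Illustrative data (assumed): four (x, y) points.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([6.0, 5.0, 7.0, 10.0])

# dS/dbeta_1 = -2 * sum(x**2 * (y - beta_1 * x**2)) = 0
# => beta_1 = sum(x**2 * y) / sum(x**4)
beta1 = np.sum(x**2 * y) / np.sum(x**4)  # 249 / 354, about 0.703

# The derivative of S at the solution, which should vanish.
dS = -2.0 * np.sum(x**2 * (y - beta1 * x**2))
```

The model is quadratic in x but linear in its single parameter, so the same least-squares machinery applies.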
See also
Line-line intersection (nearest point to non-intersecting lines), an application
Line fitting
Nonlinear least squares
Regularized least squares
Simple linear regression
Partial least squares regression