Identifiability of Parametric Models

This post was originally published by Georgi Tancev at Towards Data Science

The importance of recognizing non-identifiability.

In statistics, identifiability is a property that a model must satisfy for precise inference to be possible. When developing probabilistic or deterministic models, it is up to the scientist how the model equations are parametrized. Estimating the parameters of a model from data means solving an inverse problem. Inverse problems are among the most important mathematical problems in science because they provide information about parameters that cannot be observed directly, for example the parameters of a generalized linear model. Such an inverse problem is well-posed if

  1. a solution exists,
  2. the solution is unique,
  3. and the solution’s behavior changes continuously with the initial conditions.

Aspects two and three are especially difficult to fulfill. Imagine parametrizing a model as

where x and y are data, and a and b are parameters. This model is clearly overparametrized, i.e. it has more parameters than can be estimated from the data, which makes the problem ill-posed. The global optimum of an objective function such as the mean squared error is then not a point in the parameter space but a trajectory: only the lumped parameter c can be estimated, and many different combinations of a and b are equally optimal. Depending on the starting point of gradient descent, different solutions are obtained.
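The dependence on the starting point can be sketched numerically. The snippet below assumes a product form, y = a · b · x with c = a · b, which is only one model that matches the description; the function name `fit_product_model`, the learning rate, and the synthetic data are likewise illustrative choices.

```python
import numpy as np

def fit_product_model(a0, b0, x, y, lr=0.01, steps=5000):
    """Gradient descent on the mean squared error of the assumed model y = a * b * x."""
    a, b = a0, b0
    for _ in range(steps):
        residual = a * b * x - y
        grad_c = 2.0 * np.mean(residual * x)  # derivative of the MSE w.r.t. the product a * b
        # Chain rule: dMSE/da = grad_c * b and dMSE/db = grad_c * a
        a, b = a - lr * grad_c * b, b - lr * grad_c * a
    return a, b

x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x  # noise-free synthetic data generated with c = 2

a1, b1 = fit_product_model(1.0, 1.0, x, y)  # first starting point
a2, b2 = fit_product_model(4.0, 0.1, x, y)  # second starting point

print("start (1.0, 1.0):", a1, b1, "product:", a1 * b1)
print("start (4.0, 0.1):", a2, b2, "product:", a2 * b2)
```

Both runs should recover the same product while landing on different individual values of a and b, which is what a trajectory of equally good optima looks like in practice.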

A model that is not identifiable is said to be non-identifiable or unidentifiable, i.e. two or more parametrizations are equivalent. A more familiar case is a quadratic equation, where the sign of x cannot be identified. For instance, x² = 4 is solved by both two and minus two; the solution is not unique, because it can be either positive or negative.

This is called structural identifiability. Once recognized, non-identifiability can be removed by replacing the non-identifiable parameters with their combination, as in the example above. By exploring the degrees of freedom of the model, non-identifiability can be discovered a priori, i.e. even before inference. In particular, computing the rank of the sensitivity matrix (the derivatives of the model outputs with respect to the parameters) is informative: if the rank is lower than the number of parameters, the parameters cannot all be identified from the model structure.
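A rank check of this kind can be sketched in a few lines. The example assumes a product model y = a · b · x as a stand-in for an overparametrized model and approximates the sensitivity matrix by central finite differences; the helper `sensitivity_matrix`, the parameter values, and the tolerance are illustrative assumptions, not a prescribed procedure.

```python
import numpy as np

def sensitivity_matrix(model, params, x, eps=1e-6):
    """Central finite-difference derivatives of the model outputs w.r.t. each parameter."""
    params = np.asarray(params, dtype=float)
    columns = []
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = eps
        columns.append((model(params + step, x) - model(params - step, x)) / (2 * eps))
    return np.column_stack(columns)  # shape: (number of data points, number of parameters)

def product_model(p, x):
    # Assumed overparametrized form: y = a * b * x
    return p[0] * p[1] * x

def lumped_model(p, x):
    # Reparametrized form with a single lumped parameter: y = c * x
    return p[0] * x

x = np.linspace(0.0, 1.0, 20)
S_product = sensitivity_matrix(product_model, [1.5, 0.8], x)
S_lumped = sensitivity_matrix(lumped_model, [1.2], x)

# The explicit tolerance keeps finite-difference rounding noise from inflating the rank.
rank_product = np.linalg.matrix_rank(S_product, tol=1e-6)
rank_lumped = np.linalg.matrix_rank(S_lumped, tol=1e-6)

print("product model: rank", rank_product, "for", S_product.shape[1], "parameters")
print("lumped model:  rank", rank_lumped, "for", S_lumped.shape[1], "parameter")
```

In the product model the two columns are proportional to each other, so the rank falls short of the parameter count; after lumping, rank and parameter count agree.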

Another issue arises from a lack of data or from noisy data. Variance in the data propagates to variance in the parameter estimates, and the Cramér-Rao bound gives a lower limit on that variance. If the confidence interval of a parameter includes zero, the parameter could be redundant. In a case like this, more or better data are needed. This is called practical identifiability.
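The effect of sample size and noise can be illustrated with the Cramér-Rao bound for a simple case. The sketch assumes a straight-line model y = a + b · x with independent Gaussian noise of known standard deviation; this model, the function name `cramer_rao_std`, and the slope estimate `b_hat` are hypothetical choices made for the example.

```python
import numpy as np

def cramer_rao_std(x, sigma):
    """Lower bound on the standard deviation of unbiased estimates of (a, b)
    in the assumed model y = a + b * x with Gaussian noise of known std sigma."""
    X = np.column_stack([np.ones_like(x), x])  # sensitivities of y w.r.t. (a, b)
    fisher = X.T @ X / sigma**2                # Fisher information matrix
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

few_noisy = cramer_rao_std(np.linspace(0.0, 1.0, 5), sigma=1.0)
many_noisy = cramer_rao_std(np.linspace(0.0, 1.0, 500), sigma=1.0)
few_clean = cramer_rao_std(np.linspace(0.0, 1.0, 5), sigma=0.1)

print("5 points,   sigma 1.0:", few_noisy)
print("500 points, sigma 1.0:", many_noisy)
print("5 points,   sigma 0.1:", few_clean)

b_hat = 1.0                       # hypothetical slope estimate, for illustration only
half_width = 1.96 * few_noisy[1]  # approximate 95% interval half-width at the bound
print("slope interval with 5 noisy points:", b_hat - half_width, b_hat + half_width)
```

With five noisy points the interval around the hypothetical slope spans zero, so the slope could not be distinguished from a redundant parameter; more points or less noise tighten the bound.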

In both cases, a solution to the optimization problem might be obtained, but it might not be the best one, and predictions may fail. Hence, it is important to recognize situations of non-identifiability: any downstream steps are worthless, and more time should be invested in better-parametrized models or better data.
