# Regression and Inverse Problems

### Description

Regression problems occur in many metrological applications, e.g. in everyday calibration tasks (as illustrated in Annex H.3 of the GUM), in the evaluation of inter-laboratory comparisons [Toman et al. 2012], in the characterization of sensors [Matthews et al. 2014], in the determination of fundamental constants [Bodnar et al. 2014], and in interpolation or prediction tasks [Wübbeler et al. 2012]. Such problems arise when the quantity of interest cannot be measured directly but has to be inferred from measurement data (and their uncertainties) using a mathematical model that relates the quantity of interest to the data.

### Definition and Examples

Regression problems often take the form
$$\begin{equation*} \label{int_reg_eq1} y_i = f_{\boldsymbol{\theta}}(x_i) + \varepsilon_i , \quad i=1, \ldots, n \,, \end{equation*}$$
where the measurements $\boldsymbol{y}=(y_1, \ldots, y_n)^\top$ are explained by a function $f_{\boldsymbol{\theta}}$ evaluated at values $\boldsymbol{x}=(x_1, \ldots, x_n)^\top$ and depending on unknown parameters $\boldsymbol{\theta}=(\theta_1, \ldots, \theta_p)^\top$. The measurement errors $\boldsymbol{\varepsilon}=(\varepsilon_1, \ldots, \varepsilon_n)^\top$ follow a specified distribution $p(\boldsymbol{\varepsilon} | \boldsymbol{\theta}, \boldsymbol{\delta})$, where $\boldsymbol{\delta}$ denotes additional, possibly unknown, parameters of the error distribution.
Regressions may be used to describe the relationship between a traceable, highly accurate reference device with values denoted by $x$ and a device to be calibrated with values denoted by $y$. The pairs $(x_i,y_i)$ then denote simultaneous measurements of the same measurand (for example, temperature) made by the two devices.
A simple example is the Normal straight line regression model (as illustrated in Figure 1)
$$\begin{equation*} \label{int_reg_eq4} y_i = \theta_1 + \theta_2 x_i + \varepsilon_i , \quad \varepsilon_i \stackrel{iid}{\sim} \text{N}(0, \sigma^2), \quad i=1, \ldots, n \,. \end{equation*}$$
The basic goal of regression tasks is to estimate the unknown parameters $\boldsymbol{\theta}$ of the regression function, and possibly also the unknown parameters $\boldsymbol{\delta}$ of the error distribution. The estimated regression model may then be used to evaluate the shape of the regression function, to predict the response at intermediate (interpolated) or extrapolated $x$-values, or to invert the regression function in order to predict $x$-values for new measurements.
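The straight-line model and its inversion can be sketched in a few lines of Python. This is a minimal illustration using ordinary least squares, not a full uncertainty evaluation. The function names `fit_straight_line` and `predict_x` and the simulated data are assumptions made for this example only.

```python
import numpy as np

def fit_straight_line(x, y):
    """Ordinary least-squares fit of y = theta1 + theta2 * x + eps."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])        # design matrix [1, x]
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares estimate
    residuals = y - X @ theta
    n, p = X.shape
    s2 = residuals @ residuals / (n - p)             # unbiased estimate of sigma^2
    cov_theta = s2 * np.linalg.inv(X.T @ X)          # estimated covariance of theta
    return theta, cov_theta, s2

def predict_x(theta, y_new):
    """Invert the fitted line to obtain the x-value for a new reading y_new."""
    return (y_new - theta[0]) / theta[1]

# Simulated calibration data (illustrative values only, not from the source)
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 11)
y = 0.5 + 2.0 * x + rng.normal(0.0, 0.1, size=x.size)

theta_hat, cov_theta, s2 = fit_straight_line(x, y)
x_new = predict_x(theta_hat, y_new=10.5)
print("estimates:", theta_hat)
print("standard uncertainties:", np.sqrt(np.diag(cov_theta)))
print("predicted x for y = 10.5:", x_new)
```

The inverse prediction here is a plain point estimate. A reliable uncertainty for `x_new` would also have to account for the uncertainty of `theta_hat` and of the new reading.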

### Uncertainty evaluation

Decisions based on regression analyses require a reliable evaluation of measurement uncertainty. However, the current state of the art in uncertainty evaluation in metrology (i.e. the GUM and its supplements) provides little guidance. One reason is that the GUM guidelines are based on a model that relates the quantity of interest (the measurand) to the input quantities, whereas regression models cannot be uniquely formulated as such a measurement function. Annex H.3 of the GUM nevertheless suggests, by way of example, one possible way to analyse regression problems. However, this analysis combines elements of classical (least-squares) and Bayesian statistics, so its results are not rigorously derived state-of-knowledge distributions and usually differ from those of a purely classical or a purely Bayesian approach.

Consequently, there is a need for guidance and research in metrology on uncertainty evaluation in regression problems. The Joint Committee for Guides in Metrology (JCGM) has identified this need. The EMRP project NEW04 developed template solutions for specific regression problems with known values $x$ (cf. [Elster et al., 2015]). These solutions are based on Bayesian inference and consider (1) a simple, analytically solvable, Normal linear regression [Klauenberg et al., 2015], (2) a similar problem with additional constraints on the values of the regression curve [Kok et al., 2015], (3) a problem where the regression function is not known explicitly but needs to be determined through the numerical solution of a partial differential equation [Allard et al., 2015], (4) a problem where the variances of the observations are not constant and the information gained in the regression is used completely for a subsequent prediction of values of $x$ [Klauenberg et al., 2015] and (5) a regression function which is computationally expensive to evaluate [Heidenreich et al., 2014]. Other Bayesian research on metrological regression problems includes [Rocha et al., 2004, Toman et al., 2006, Grientschnig et al., 2011, Willink et al., 2008, Wübbeler et al., 2012, Toman et al., 2012-2, Elster et al., 2011 and Possolo et al., 2007].
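To make case (1) more concrete, the following sketch draws samples from the posterior of a Normal linear regression. It assumes the standard noninformative prior $p(\boldsymbol{\theta}, \sigma^2) \propto 1/\sigma^2$, which is a choice made for this illustration and not necessarily the prior used in the cited template solutions. Under this prior, $\sigma^2$ given the data follows a scaled inverse chi-squared distribution with $n-p$ degrees of freedom, and $\boldsymbol{\theta}$ given $\sigma^2$ is Normal around the least-squares estimate. The function name `sample_posterior` and the data are hypothetical.

```python
import numpy as np

def sample_posterior(X, y, n_draws=10000, seed=0):
    """Draw (theta, sigma^2) from the posterior under the prior p ∝ 1/sigma^2."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    theta_hat = XtX_inv @ X.T @ y                    # least-squares estimate
    resid = y - X @ theta_hat
    s2 = resid @ resid / (n - p)                     # residual variance estimate
    # sigma^2 | y: scaled inverse chi-squared with n - p degrees of freedom
    sigma2 = (n - p) * s2 / rng.chisquare(n - p, size=n_draws)
    # theta | sigma^2, y ~ N(theta_hat, sigma^2 * (X^T X)^{-1})
    L = np.linalg.cholesky(XtX_inv)
    z = rng.standard_normal((n_draws, p))
    theta = theta_hat + np.sqrt(sigma2)[:, None] * (z @ L.T)
    return theta, sigma2

# Illustrative straight-line data (values are assumptions for this example)
x = np.arange(10.0)
y = 1.0 + 2.0 * x + 0.1 * (-1.0) ** np.arange(10)
X = np.column_stack([np.ones_like(x), x])

theta_draws, sigma2_draws = sample_posterior(X, y)
estimate = theta_draws.mean(axis=0)                          # posterior means
interval = np.percentile(theta_draws, [2.5, 97.5], axis=0)   # 95 % credible intervals
print("posterior means:", estimate)
print("95 % credible intervals:", interval.T)
```

For this particular prior the marginal posterior of $\boldsymbol{\theta}$ is also available in closed form (a multivariate $t$-distribution), so sampling is not strictly needed. The sampled version is shown because it extends directly to derived quantities such as predictions.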

### Research

To improve reliability and comparability in many fields of metrology, a consistent evaluation of regression problems is indispensable. Important issues in achieving this goal are
• the proper treatment of the error structure associated with measured data (including errors in both stimulus and response variables),
• the thoughtful inclusion of all available information such as prior knowledge from previous measurements or physical constraints,
• the availability of reliable numerical methods (such as Monte Carlo methods),
• the quantification of the sensitivity of the results obtained to the assumptions made, and
• the consideration of model uncertainty and model validation.
In order to enable metrologists to address these issues, there is also a need to
• develop tutorials, guides and template solutions for typical regression problems,
• implement these in easy-to-use software,
• define conditions under which simple (approximate) methods are applicable, and
• bridge the gap between metrologists and statisticians (especially at smaller metrology institutes) so that more complex problems can also be tackled.

### Related journal papers

| Authors | Title | Journal | Year |
|---|---|---|---|
| C. Elster, G. Wübbeler | Bayesian inference using a noninformative prior for linear Gaussian random coefficient regression with inhomogeneous within-class variances | Comput. Stat., 32(1), 51-69 | 2017 |
| F. Rolle, F. Pennecchi, S. Perini, M. Sega | Metrological traceability of Polycyclic Aromatic Hydrocarbons (PAHs) measurements in green tea and mate | Measurement 98 | 2017 |
| K. Klauenberg, M. Walzel, B. Ebert and C. Elster | Informative prior distributions for ELISA analyses | Biostatistics | 2015 |
| K. Klauenberg, G. Wübbeler, B. Mickan, P. M. Harris, and C. Elster | A Tutorial on Bayesian Normal Linear Regression | Metrologia, 52(6) | 2015 |
| C. Elster and G. Wübbeler | Bayesian inference using a noninformative prior for linear Gaussian random coefficient regression with inhomogeneous within-class variances | Comput. Stat. 30 (4) | 2015 |
| Andrea Malengo and Francesca Pennecchi | A weighted total least-square algorithm for any fitting model with correlated variables | Metrologia, vol. 50, nr. 6, 654-662 | 2013 |
| C. Elster, K. Klauenberg, M. Bär, A. Allard, N. Fischer, G. Kok, A. van der Veen, P. Harris, I. Smith, L. Wright, S. Cowen, P. Wilson and S. Ellison | Novel mathematical and statistical approaches to uncertainty evaluation in the context of regression and inverse problems | 16th International Congress of Metrology | 2013 |
| M.-A. Henn, H. Gross, F. Scholze, M. Wurm, C. Elster, and M. Bär | A maximum likelihood approach to the inverse problem of scatterometry | Opt. Express 20, 12771-12786 | 2012 |
| M. A. Henn, S. Heidenreich, H. Gross, C. Elster and M. Bär | Improved grating reconstruction by determination of line roughness in extreme ultraviolet scatterometry | Opt. Letters, vol. 37, nr. 24, 5229-5231 | 2012 |
| B. Toman, D.L. Duewer, H.G. Aragon, F.R. Guenther and G.C. Rhoderick | A Bayesian approach to the evaluation of comparisons of individually value-assigned reference materials | Analytical and Bioanalytical Chemistry 403(2), 537-548 | 2012 |
| K. Klauenberg, B. Ebert, J. Voigt, M. Walzel, J. E. Noble, A. E. Knight and C. Elster | Bayesian analysis of an international ELISA comparability study | Clin. Chem. Lab. Med. | 2011 |
| C. Elster and B. Toman | Bayesian uncertainty analysis for a regression model versus application of GUM supplement 1 to the least-squares estimate | Metrologia 48 (5), 233 | 2011 |
| D. Grientschnig and I. Lira | Reassessment of a calibration model by Bayesian reference analysis | Metrologia 48 (1), L7 | 2011 |
| R. Willink | Estimation and uncertainty in fitting straight lines to data: different techniques | Metrologia 45(3), 290 | 2008 |
| A. Possolo and B. Toman | Assessment of measurement uncertainty via observation equations | Metrologia 44(6), 464 | 2007 |
| M. J. T. Milton, P. M. Harris, I. M. Smith, A. Brown and B. Goody | Implementation of a generalised least-squares method for determining calibration curves from data with general uncertainty structures | Metrologia 43, 291-298 | 2006 |
| B. Toman | Linear statistical models in the presence of systematic effects requiring a Type B evaluation of uncertainty | Metrologia 43(1), 27 | 2006 |