Curve fitting

Provided by: Edvancium

As we already know, immunoassays are tests that use antibodies to detect specific proteins or other molecules in a sample. The end result of an immunoassay protocol is a signal that we measure. A measurable signal tells us that the antigen was present in the sample, but the raw signal alone does not tell us how much of the analyte is there. Quantification, that is, measuring the concentration of a particular analyte in a sample, requires a dedicated calculation method. Let's discover it below!

How does it work

For an immunoassay to be quantitative, it must be calibrated using standards. Standards are analyte solutions with known concentrations. If that sounds complex, imagine a ruler. Every ruler has marks on it — defined centimetres, millimetres, and so on. These are standards. We can compare any object to this set of standard lengths (ruler) to know the object's length. The same happens here: after the immunoassay experiment, researchers compare the standards with the results they obtained from the experiment. The best way to compare these values is to plot them against each other. The generated graph is called a standard curve:

Standard curves are plotted with the known concentrations as x values on the horizontal axis and the signal measured for these standards (e.g. fluorescence intensity (MFI) or absorbance) as y values on the vertical axis. After we plot the curve, we can use it to calculate the concentration of the remaining samples, whose concentrations are unknown:

However, straight lines, as shown on the graph above, are not always the best type of curve to describe the relationship between the signal and the concentration. This is why there are various curve fitting methods that can be used to fit a standard curve, including linear regression, logarithmic transformation, and nonlinear regression analysis. The choice of method depends on several factors, such as the type and quality of the available data points, the desired precision and accuracy, and the available computational resources and expertise. Below we will cover the most widely used methods: linear and logistic regression.

Linear Regression

Generally speaking, linear regression is a statistical technique that can be used to assess the relationship between two variables, x and y. In linear regression, the aim is to find the equation of a line that best describes the data points. This line can then be used to make predictions about future values of y, given new values of x (or vice versa). As already mentioned above, a standard curve is a graph that shows how a known concentration of a substance produces a known response. To create a standard curve, different concentrations of the substance are measured and plotted on the graph. The linear regression line is then drawn through these points. The slope and intercept of this line can be used to calculate unknown concentrations from future measurements. Here's how it looks:
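As a rough illustration of the procedure described above, the following sketch fits a least-squares line to a set of standards and then uses the slope and intercept to back-calculate an unknown concentration. The function names and all numbers are hypothetical and chosen only for the example; they do not come from a real assay.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_signal(signal, slope, intercept):
    """Invert y = slope * x + intercept to get x (concentration) from y (signal)."""
    return (signal - intercept) / slope


# Hypothetical standards: known concentrations and their measured signals
standard_concentrations = [0.0, 1.0, 2.0, 4.0, 8.0]
standard_signals = [0.05, 0.25, 0.45, 0.85, 1.65]

slope, intercept = fit_line(standard_concentrations, standard_signals)

# Hypothetical signal measured for a sample of unknown concentration
unknown_signal = 0.65
unknown_concentration = concentration_from_signal(unknown_signal, slope, intercept)
print(slope, intercept, unknown_concentration)
```

The back-calculation is only trustworthy for signals that fall within the range covered by the standards.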

However, a straight line does not always describe the data accurately enough. For such cases, logistic regression is used as a more flexible alternative to linear regression.

Four-Parameter Logistic Regression

Four-parameter logistic regression (or simply 4PL) can better predict values for new data points. It does this by fitting an S-shaped (sigmoidal) curve to the data instead of a straight line, allowing for more accurate predictions. The parameters of the equation correspond to properties of the curve:

$y = D + {A - D \over 1 + (x /C)^B}$

where y is the optical density (or MFI), x is the concentration of the analyte, A is the minimum value that can be obtained, D is the maximum value that can be obtained, C is the point of inflection (mid-range concentration), and B reflects the steepness of the curve at point C (slope factor). Here is how it looks on a graph:

Similar to the linear equation above, the equation can then be used to determine unknown concentrations (x) from the assay data (y). While the 4PL regression is one of the most accurate types of curve fitting, sometimes we need an additional parameter to describe our data even better. Here comes the five-parameter logistic regression!
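To make the back-calculation concrete, here is a minimal sketch of the 4PL equation and its algebraic inverse, obtained by solving the equation for x. The parameter values are hypothetical; in practice A, B, C and D would come from a nonlinear fit to the standards, which this sketch does not perform.

```python
def four_pl(x, A, B, C, D):
    """4PL response: y = D + (A - D) / (1 + (x / C) ** B)."""
    return D + (A - D) / (1 + (x / C) ** B)


def inverse_four_pl(y, A, B, C, D):
    """Solve the 4PL equation for x: x = C * ((A - D) / (y - D) - 1) ** (1 / B).

    Only signals strictly between the two asymptotes A and D can be inverted.
    """
    low, high = min(A, D), max(A, D)
    if not (low < y < high):
        raise ValueError("signal lies outside the range of the fitted curve")
    return C * ((A - D) / (y - D) - 1) ** (1 / B)


# Hypothetical fitted parameters (illustrative only)
A, B, C, D = 0.1, 1.5, 50.0, 2.5

signal_at_20 = four_pl(20.0, A, B, C, D)
recovered_concentration = inverse_four_pl(signal_at_20, A, B, C, D)

# At x = C the response sits halfway between A and D
signal_at_C = four_pl(C, A, B, C, D)
print(signal_at_20, recovered_concentration, signal_at_C)
```

Signals at or beyond the asymptotes have no corresponding concentration, which is why the inverse raises an error there instead of returning a value.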

Five-Parameter Logistic Regression

The five-parameter logistic regression (5PL) equation is similar to the four-parameter logistic regression (4PL) equation, but with an additional parameter G. In cases where the standard curve is not symmetrical, the additional parameter improves the fit.

The equation is as follows:

$y = D + {A - D \over [1 + (x/C)^B]^G}$

where y is the optical density (or MFI), x is the concentration of the analyte, A is the minimum value that can be obtained, D is the maximum value that can be obtained, C is the point of inflection (mid-range concentration), B reflects the steepness of the curve at point C (slope factor), and G is an asymmetry factor. Here's an example of an asymmetrical curve and how it changes as the G parameter increases:
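The effect of the extra parameter can be explored with a short sketch of the 5PL equation and its inverse. The parameter values are hypothetical. With G = 1 the function reduces to the 4PL equation, and for other values of G the response at x = C is D + (A - D) / 2^G, which is no longer halfway between A and D.

```python
def five_pl(x, A, B, C, D, G):
    """5PL response: y = D + (A - D) / (1 + (x / C) ** B) ** G."""
    return D + (A - D) / (1 + (x / C) ** B) ** G


def inverse_five_pl(y, A, B, C, D, G):
    """Solve the 5PL equation for x.

    x = C * (((A - D) / (y - D)) ** (1 / G) - 1) ** (1 / B),
    valid for signals strictly between the asymptotes A and D.
    """
    low, high = min(A, D), max(A, D)
    if not (low < y < high):
        raise ValueError("signal lies outside the range of the fitted curve")
    return C * (((A - D) / (y - D)) ** (1 / G) - 1) ** (1 / B)


# Hypothetical parameters (illustrative only)
A, B, C, D = 0.1, 1.5, 50.0, 2.5

# The same concentration gives different signals as G changes
for G in (0.5, 1.0, 2.0):
    print(G, five_pl(20.0, A, B, C, D, G))

# Round trip with an asymmetric curve (G = 2)
signal = five_pl(20.0, A, B, C, D, 2.0)
recovered_concentration = inverse_five_pl(signal, A, B, C, D, 2.0)
```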

Goodness of fit

There are many ways to measure the goodness of fit, but one of the most common is the R-squared value. It measures how well the data fit the model and ranges from 0 to 1, with 1 being a perfect fit. The R-squared value can be calculated with any statistical software, Python, or MS Excel. For logistic regressions, however, other statistical parameters are used to assess the goodness of fit, for example, Standards Recovery. This approach involves back-calculating the concentration of each standard from the fitted curve and comparing it to the actual concentration using the following formula:

$\text{Recovery} = {\text{Observed Concentration} \over \text{Expected Concentration}} \times 100\%$

This method provides information on the relative error in sample calculation. Typically, the value should be in the range of 70% to 130%.
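Both measures are simple to compute by hand. The sketch below assumes the usual definition of R-squared as one minus the ratio of the residual sum of squares to the total sum of squares, and applies the recovery formula with the 70% to 130% window mentioned above. The function names and the example concentrations are hypothetical.

```python
def r_squared(observed, predicted):
    """R-squared = 1 - SS_res / SS_tot for observed vs. model-predicted values."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def recovery_percent(observed_concentration, expected_concentration):
    """Standards Recovery: observed / expected * 100."""
    return observed_concentration / expected_concentration * 100


def within_limits(recovery, lower=70.0, upper=130.0):
    """Check a recovery value against the acceptance window."""
    return lower <= recovery <= upper


# Hypothetical standards: expected (known) vs. back-calculated concentrations
expected = [10.0, 50.0, 100.0]
back_calculated = [9.0, 52.0, 140.0]

recoveries = [recovery_percent(o, e) for o, e in zip(back_calculated, expected)]
flags = [within_limits(r) for r in recoveries]

for e, r, ok in zip(expected, recoveries, flags):
    print(f"standard {e}: recovery {r:.1f}% {'OK' if ok else 'out of range'}")
```

Recovery is reported per standard, so it shows which part of the curve fits poorly, whereas R-squared summarises the whole curve in a single number.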

Conclusion

The two most widely used curve-fitting models for immunoassays are linear regression and logistic regression. Although linear regression is effective for analyzing samples that fall within the linear portion of the curve, logistic regression gives the broadest range of concentrations at which unknown samples can be correctly predicted. We can check the goodness of fit for linear and logistic regression by calculating the R-squared value or Standards Recovery. Calculating the goodness of fit is important because it allows us to see how well our data fits a model.
