Discovering Mathematical Formulas from Sensor Data with Symbolic Regression

Key Takeaways:
  • Discovered temperature calibration formula Tref = (-0.348865)*(T3-T2)+T1 in under 5 minutes
  • Achieved RMSE of 0.011°C—comparable to neural networks but fully interpretable
  • Formula requires 3 arithmetic operations vs. hundreds of weights for an equivalent neural network
  • TuringBot works offline—no cloud services or external libraries required

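Sensors generate massive amounts of time series data daily. Traditional machine learning models can predict outcomes from this data, but they do not expose the relationship they have learned, so they can't explain the underlying physics.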

Symbolic regression solves this by discovering explicit mathematical formulas that connect the sensor variables. Unlike black-box models, it searches the space of possible mathematical formulas for the ones that best predict the output variables. It is particularly effective on problems with a small number of dimensions, which makes it well suited to sensor applications.
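To make the idea of searching a space of formulas concrete, here is a deliberately tiny sketch. It is not TuringBot's algorithm, which this article does not describe. It simply enumerates every expression of the form "x op y" over a few operators and ranks them by RMSE. The function name `toy_search`, the variable names, and the data are all made up for illustration.

```python
import itertools
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Binary operators the toy search is allowed to combine variables with.
OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def toy_search(columns, target):
    """Score every 'x op y' expression; return (rmse, formula) pairs, best first."""
    results = []
    for (name_a, a), (name_b, b) in itertools.permutations(columns.items(), 2):
        for symbol, op in OPERATORS.items():
            predicted = [op(x, y) for x, y in zip(a, b)]
            results.append((rmse(predicted, target), f"{name_a} {symbol} {name_b}"))
    return sorted(results)

# Made-up data in which the hidden rule is y = x1 - x2.
columns = {"x1": [1.0, 2.0, 4.0, 7.0], "x2": [0.5, 3.0, 1.0, 2.0]}
target = [0.5, -1.0, 3.0, 5.0]

best_error, best_formula = toy_search(columns, target)[0]
print(best_formula, best_error)
```

A real symbolic regression tool explores far larger expression trees and fits numeric constants as well, so exhaustive enumeration like this only works for toy cases.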

Here we'll demonstrate how TuringBot, a desktop application for symbolic regression, makes it easy to find hidden mathematical relationships in sensor measurements.

Case Study: Temperature Sensor Calibration and Drift Compensation

Let's tackle a common scenario: compensating for drift across multiple temperature sensors.

Our dataset contains readings from three temperature sensors (T1, T2, T3) showing different values due to calibration differences and drift, plus reference measurements (Tref) from a high-precision thermometer.

Step 1: Data Import

After launching TuringBot, we pasted our sensor dataset into the built-in spreadsheet. The dataset is a time series in which each row holds the sensor readings taken at one point in time:

[Screenshot: TuringBot's data import interface showing temperature sensor data with T1, T2, T3 readings and Tref reference values for calibration]
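If your readings live in a script rather than a spreadsheet, the same row-per-reading layout can be written out as CSV text first. The sketch below uses only Python's standard library. The column order (inputs first, target last) is our assumption, and the numbers are placeholders, not the dataset used in this article.

```python
import csv
import io

# Assumed column layout: the three inputs followed by the target.
COLUMNS = ["T1", "T2", "T3", "Tref"]

def to_csv_text(rows):
    """Serialize a list of reading dicts into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Placeholder readings for illustration only; NOT the article's data.
rows = [
    {"T1": 20.10, "T2": 19.80, "T3": 20.40, "Tref": 19.89},
    {"T1": 20.35, "T2": 20.00, "T3": 20.70, "Tref": 20.11},
]

csv_text = to_csv_text(rows)
print(csv_text)
```

The resulting text can be saved to a file or pasted into a spreadsheet, one reading per row.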

Step 2: Configuration Setup

We configured TuringBot to search for a formula with T1, T2, and T3 as inputs and Tref as output, using RMSE (root-mean-square error) as the error metric. All available base functions were allowed in the search, including basic arithmetic, exp(x), sin(x), and sqrt(x):

[Screenshot: TuringBot's configuration interface for symbolic regression showing input variables T1, T2, T3 and target variable Tref with RMSE scoring]
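For reference, RMSE is the square root of the mean squared difference between predicted and reference values, so it is expressed in the same unit as the target (°C here). A minimal sketch, where the function name `rmse` and the sample numbers are our own and not part of TuringBot:

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up values: two perfect predictions and one that is off by 2.
example = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(example)
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.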

Step 3: Running the Search and Results

TuringBot's algorithm searched through possible mathematical expressions, displaying candidate formulas of varying complexity in real time. In less than 5 minutes, it discovered this compact formula:

Tref = (-0.348865)*(T3-T2)+T1

With an RMSE of 0.011°C, this simple equation captures the relationship closely. The other formulas found during the search are shown below:

[Screenshot: TuringBot's solution showing the discovered formula (-0.348865)*(T3-T2)+T1 with error metrics and visualization of actual vs predicted values]
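Because the result is an explicit equation, applying it takes only a few lines of code. The sketch below wraps the discovered formula in a Python function. The function name and the sample readings are hypothetical; only the coefficient comes from the search result.

```python
COEFFICIENT = -0.348865  # coefficient from the discovered formula

def compensated_temperature(t1, t2, t3):
    """Estimate the reference temperature (°C) from three raw sensor readings."""
    return COEFFICIENT * (t3 - t2) + t1

# Hypothetical readings: T3 reads 1.0°C above T2.
estimate = compensated_temperature(t1=25.0, t2=24.0, t3=25.0)
print(round(estimate, 6))
```

With these readings, T1 is adjusted downward by about 0.35°C, since T3 exceeds T2 by exactly one degree.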

Insights from the Formula

This formula reveals several key insights about our sensors:

  • T1 serves as the baseline measurement, suggesting it's the most reliable sensor overall
  • The difference between T3 and T2 provides a correction factor
  • The negative coefficient (-0.348865) indicates that when T3 reads higher than T2, a downward correction is needed: about 0.35°C is subtracted from T1 for every 1°C by which T3 exceeds T2

Most interestingly, the formula structure reveals that the differential between sensors T3 and T2 contains valuable information about measurement error, something that wouldn't be obvious from simple inspection of the data.
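One way to check this kind of relationship on your own data is to regress the error of T1 (Tref - T1) on the differential (T3 - T2), with no intercept. The sketch below does this with a closed-form least-squares slope. The function name is ours, and the readings are synthetic values generated from the discovered formula, so the slope it recovers is the coefficient by construction.

```python
def correction_slope(t1, t2, t3, tref):
    """Least-squares slope c (no intercept) in: tref - t1 ≈ c * (t3 - t2)."""
    diffs = [c - b for b, c in zip(t2, t3)]      # T3 - T2 for each reading
    errors = [r - a for a, r in zip(t1, tref)]   # Tref - T1 for each reading
    denominator = sum(d * d for d in diffs)
    if denominator == 0:
        raise ValueError("T3 and T2 never differ, so the slope is undefined")
    return sum(d * e for d, e in zip(diffs, errors)) / denominator

# Synthetic readings generated from the formula itself; NOT the article's data.
t1 = [20.0, 21.0, 22.0, 23.0]
t2 = [19.5, 20.8, 21.9, 22.4]
t3 = [20.5, 21.2, 22.6, 23.9]
tref = [a - 0.348865 * (c - b) for a, b, c in zip(t1, t2, t3)]

slope = correction_slope(t1, t2, t3, tref)
print(round(slope, 6))
```

On real measurements the recovered slope would only approximate the coefficient, and a slope near zero would suggest the differential carries little information about the error in T1.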

Additional Sensor Applications

While our example used temperature sensors, symbolic regression is equally applicable to other sensing domains:

  • Vibration sensor data can reveal early bearing failures, with formulas linking frequency components to specific mechanical issues.
  • Pressure transducers can compensate for non-linear response curves through multi-term equations.
  • Electrochemical sensors can correct for cross-sensitivity to interfering gases by encoding these relationships in explicit mathematical terms.

Symbolic Regression vs. Neural Networks for Sensor Calibration

Metric                      | TuringBot Formula        | Equivalent Neural Network
----------------------------|--------------------------|--------------------------------
Accuracy (RMSE)             | 0.011°C                  | ~0.01°C (with overfitting risk)
Operations per prediction   | 3                        | 100s-1000s
Parameters to store         | 1 coefficient            | 100s of weights
Interpretability            | Full (explicit equation) | None (black box)
Microcontroller deployment  | Trivial                  | Requires TensorFlow Lite/ONNX
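To give a feel for the "100s of weights" figure, the sketch below counts the weights and biases in a small fully connected network. The architecture (3 inputs, two hidden layers of 16 units, 1 output) is a hypothetical example chosen for illustration, not a network that was trained for this article.

```python
def mlp_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected network."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical architecture: T1, T2, T3 in; Tref out.
network_parameters = mlp_parameter_count([3, 16, 16, 1])
formula_parameters = 1  # the single fitted coefficient in the formula

print(network_parameters, formula_parameters)
```

Even this modest network stores 353 parameters, compared with the single coefficient in the discovered formula.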

Conclusion

Symbolic regression with TuringBot transforms raw sensor data into practical insights. By extracting the explicit mathematical relationship Tref = (-0.348865)*(T3-T2)+T1, we've solved a real calibration problem with a formula that's immediately deployable and physically meaningful. This concrete example demonstrates how explicit mathematical formulas can be competitive with traditional predictive approaches in sensor applications.

About TuringBot

TuringBot finds mathematical formulas from data using symbolic regression. Load a CSV, select your target variable, and get interpretable equations—not black-box models.

Free version available for Windows, macOS, and Linux.