TuringBot – Symbolic Regression AI | Discover Formulas from Data

Turn your data into accurate formulas.

TuringBot is a desktop program for symbolic regression. With its highly efficient algorithm, it can find the best formulas to predict y = f(x1, x2, x3, ...), without any prior knowledge.

Discover formulas in 3 simple steps


TuringBot makes it easy to find formulas that make accurate predictions:

  1. Paste your data into its built-in spreadsheet, or import data from .csv or .txt files.
  2. Select the target variable and the input variables in the UI.
  3. Press the Play button.

That's it! TuringBot will begin its search, and will show the results in real time. It generates a Pareto frontier of the best formulas of each size found so far, where a larger formula is only shown if its error is lower than all smaller formulas.
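
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not TuringBot's implementation: the function name `pareto_frontier`, the candidate formulas, and their sizes and errors are all made up for the example.

```python
def pareto_frontier(candidates):
    """Keep the best formula of each size, and only if it beats every smaller one.

    candidates: iterable of (size, error, formula) tuples (hypothetical format).
    """
    # Step 1: keep the lowest-error formula for each size.
    best_per_size = {}
    for size, error, formula in candidates:
        if size not in best_per_size or error < best_per_size[size][1]:
            best_per_size[size] = (size, error, formula)

    # Step 2: walk sizes in increasing order and keep a formula only if its
    # error is strictly lower than the error of every smaller formula kept.
    frontier = []
    lowest_error = float("inf")
    for size in sorted(best_per_size):
        entry = best_per_size[size]
        if entry[1] < lowest_error:
            frontier.append(entry)
            lowest_error = entry[1]
    return frontier


# Invented candidates: (size, error, formula)
candidates = [
    (1, 0.90, "1.5"),
    (3, 0.40, "1.1*length"),
    (3, 0.55, "length+0.9"),
    (4, 0.45, "cos(length)+1"),  # larger than size 3 but not more accurate
    (5, 0.02, "1.97638*sqrt(length)"),
]
frontier = pareto_frontier(candidates)
for size, error, formula in frontier:
    print(size, error, formula)
```

In this sketch the size-4 candidate is dropped because a smaller formula already has a lower error, so only sizes 1, 3, and 5 remain.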

Example

If we enter a dataset of pendulum period measurements into the program, it finds a set of formulas for period = f(length), among which is:

\( \mathrm{period} = 1.97638 \times \sqrt{\mathrm{length}} \approx 2\pi \sqrt{\frac{\mathrm{length}}{10.10694}}\)

Without any prior knowledge about the problem, the program found the correct pendulum period formula, including an approximation to Earth's gravitational acceleration, in less than 1 minute.
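
The link between the fitted constant and gravity can be checked by hand. For a simple pendulum, period = 2π·sqrt(length/g), so a fit of the form c·sqrt(length) implies g = (2π/c)². The short Python check below applies this to the constant 1.97638 from the formula above. It assumes the dataset gives length in meters and period in seconds, which is not stated on this page.

```python
import math

coefficient = 1.97638  # constant from the fitted formula above

# period = c * sqrt(length) and period = 2*pi*sqrt(length/g)  =>  g = (2*pi/c)**2
implied_g = (2 * math.pi / coefficient) ** 2

# Standard gravity is defined as 9.80665 m/s^2; the comparison only makes
# sense under the assumed units (meters and seconds).
standard_g = 9.80665
relative_difference = (implied_g - standard_g) / standard_g

print(f"implied g = {implied_g:.4f}")
print(f"relative difference from standard gravity = {relative_difference:.1%}")
```

Under those assumed units, the implied value is roughly 10.107, about 3% above standard gravity, which is the sense in which the fitted constant approximates it.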

Customize your search

You can choose what base mathematical functions the program should use as building blocks. The following functions are available:

  • Arithmetic: Addition/subtraction (+/-), Multiplication (*), Division (/)
  • Trigonometric: sin, cos, tan, asin, acos, atan
  • Exponential: exp, log, log2, log10, sqrt, pow
  • Hyperbolic: sinh, cosh, tanh, asinh, acosh, atanh
  • Logical: smaller, greater, equal, different, min, max, and, or
  • Other: abs, floor, ceil, round, sign, mod, gamma, erf
  • History: delay, moving average

The program also features a variety of error metrics for the search, allowing it to tackle both regression and classification problems.

Regression metrics:

  • RMS error
  • Mean error
  • Percentile error
  • Maximum error
  • Mean relative error
  • Maximum relative error
  • Correlation coefficient
  • Hybrid (correlation coefficient + RMS error)
  • Nash-Sutcliffe efficiency
  • Residual sum of squares (RSS)
  • Root mean squared log error (RMSLE)
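
For reference, a few of these metrics can be written out using their common textbook definitions. This is a sketch only: the function names are illustrative, and TuringBot's exact definitions (for example, how RMSLE handles negative values) may differ, so consult the documentation for the authoritative versions.

```python
import math


def rms_error(y_true, y_pred):
    """Root mean square of the residuals."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mean_relative_error(y_true, y_pred):
    """Mean of |residual / true value|; undefined if any true value is 0."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def nash_sutcliffe(y_true, y_pred):
    """1 - (residual sum of squares) / (variance of the data around its mean)."""
    mean_true = sum(y_true) / len(y_true)
    residual = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    spread = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - residual / spread


def rmsle(y_true, y_pred):
    """Root mean squared error between log(1 + prediction) and log(1 + target)."""
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred))
        / len(y_true)
    )


# Tiny made-up example: only the last prediction is off, by 1.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 2.0, 3.0]
print(rms_error(y_true, y_pred))
print(mean_relative_error(y_true, y_pred))
print(nash_sutcliffe(y_true, y_pred))
print(rmsle(y_true, y_pred))
```

Note that lower is better for the error metrics, while a Nash-Sutcliffe efficiency of 1 indicates a perfect fit.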

Classification metrics:

  • Classification accuracy
  • F-score
  • Binary cross-entropy
  • Matthews correlation coefficient
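
The classification metrics can be sketched the same way for binary labels (0 or 1), again using the usual textbook definitions rather than TuringBot's own code. The function names are illustrative, and the F-score shown is the common F1 variant, which is an assumption about which F-score is meant.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(y_true, y_pred):
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def matthews_corrcoef(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def binary_cross_entropy(y_true, probabilities, eps=1e-12):
    """Mean negative log-likelihood; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, probabilities):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)


# Made-up labels: 2 true positives, 2 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), matthews_corrcoef(y_true, y_pred))
print(binary_cross_entropy([1, 0], [0.5, 0.5]))
```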

These are just the basic options; TuringBot offers many additional parameters for fine control over the search. Consult the documentation for the full list of parameters.

Why choose TuringBot

TuringBot uses a completely novel symbolic regression algorithm based on simulated annealing that has unmatched performance.

The space of all possible mathematical formulas is infinite and difficult to navigate. TuringBot can discover formulas that alternative algorithms might never recover.
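
To give a feel for simulated annealing as a general technique, here is a toy Python sketch that anneals a single constant c in period = c * sqrt(length). It is generic textbook simulated annealing, not TuringBot's algorithm, whose internals are not described on this page. A real symbolic regression search would also have to vary the structure of the formula, which this sketch does not attempt. The data, step size, cooling schedule, and function names are all assumptions chosen for illustration.

```python
import math
import random


def rms_error(c, lengths, periods):
    """RMS error of the model period = c * sqrt(length)."""
    squared = [(c * math.sqrt(length) - period) ** 2
               for length, period in zip(lengths, periods)]
    return math.sqrt(sum(squared) / len(squared))


def anneal_constant(lengths, periods, steps=5000, seed=0):
    rng = random.Random(seed)  # seeded so the run is repeatable
    current = 1.0
    current_err = rms_error(current, lengths, periods)
    best, best_err = current, current_err
    for step in range(steps):
        # Geometric cooling from 1.0 down to about 1e-4.
        temperature = (1e-4) ** (step / steps)
        candidate = current + rng.gauss(0.0, 0.1)
        candidate_err = rms_error(candidate, lengths, periods)
        delta = candidate_err - current_err
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_err = candidate, candidate_err
            if current_err < best_err:
                best, best_err = current, current_err
    return best, best_err


# Synthetic, noise-free data generated with c = 2.0 (made up for the example).
lengths = [0.25, 0.5, 1.0, 2.0, 4.0]
periods = [2.0 * math.sqrt(length) for length in lengths]
best_c, best_err = anneal_constant(lengths, periods)
print(best_c, best_err)
```

Accepting some worse moves early on is what lets the search escape poor regions before it settles; on this one-dimensional toy problem the search should end close to c = 2.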

As evidence of this unmatched performance, a 2020 paper found that TuringBot performs noticeably better than Eureqa (a previously popular symbolic regression program that is no longer commercially available) on a variety of physics-inspired problems (arXiv:2010.11328). In this paper, TuringBot even managed to solve problems for which Eureqa could not find a solution at all.

With TuringBot, you are not running a naive symbolic regression search. You are using software that pushes the boundaries of computation, hence the tribute to Alan Turing in the software's name.

Additionally:

  • TuringBot has been in continuous development since its initial public release in February 2020, incorporating improvements and bug fixes based on user feedback. You can find recent improvements on the Changelog page.
  • The program has been used in 40+ academic publications across a wide range of fields (see below), including but not limited to: Engineering, Physics, Astronomy, Mathematics, Materials Science, Agriculture, Healthcare, Computer Graphics, and Finance. It has also been successfully used by clients in an equally wide range of industry applications.
  • Setup is very simple: TuringBot is a standalone desktop program that can be downloaded and installed in less than 5 minutes. No Python dependencies, no Docker, no virtual environments.

Advantages over black-box machine-learning methods

Symbolic regression can be applied to problems where neural networks could also be applied, but it has the advantage of generating models that are simple, explicit, and interpretable. This can allow them to generalize better outside the training domain.

Additionally, symbolic regression models require very few resources to deploy, and can easily be ported to any programming language.
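
As an illustration of how little is needed to deploy such a model, the pendulum formula from the Example section becomes a one-line function. The function name `predict_period` is made up for this sketch, and the same expression could be written just as easily in any other language.

```python
import math


def predict_period(length):
    """Pendulum period predicted by the formula from the Example section."""
    return 1.97638 * math.sqrt(length)


print(predict_period(1.0))
print(predict_period(4.0))
```

The deployed model has no dependencies beyond a square-root function, and anyone reading the code can see exactly how the prediction is computed.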

Academic publications

Some publications that use TuringBot are:

  1. AI Descartes: Combining data and theory for derivable scientific discovery
    Cornelio, C., Dash, S., Austel, V., Josephson, T., Goncalves, J., Clarkson, K., ... & Horesh, L. (2021). arXiv preprint arXiv:2109.01634. [URL]
  2. From Kepler to Newton: explainable AI for science
    Li, Z., Ji, J., & Zhang, Y. (2021). arXiv preprint arXiv:2111.12210. [URL]
  3. Study of air exchange and temperature efficiency in a room – based on parameter variations at the supply air vent for use with heated supply air
    Simensen, J. (2021). (Master's thesis, OsloMet-storbyuniversitetet). [URL, in Norwegian]
  4. Logic guided genetic algorithms (student abstract)
    Ashok, D., Scott, J., Wetzel, S. J., Panju, M., & Ganesh, V. (2021, May). In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 18, pp. 15753-15754). [URL]
  5. An analytic BRDF for materials with spherical Lambertian scatterers
    d'Eon, E. (2021, July). In Computer Graphics Forum (Vol. 40, No. 4, pp. 153-161). [URL]
  6. Estimation of C* Integral for mismatched welded compact tension specimen
    Katinić, M., Turk, D., Konjatić, P., & Kozak, D. (2021). Materials, 14(24), 7491. [URL]
  7. Yield Load Solutions for SE (B) Fracture Toughness Specimen with I-Shaped Heterogeneous Weld
    Konjatić, P., Katinić, M., Kozak, D., & Gubeljak, N. (2021). Materials, 15(1), 214. [URL]
  8. A review of the fractal market hypothesis for trading and market price prediction
    Blackledge, J., & Lamphiere, M. (2021). Mathematics, 10(1), 117. [URL]
  9. Effect of the refrigerant charge, expansion restriction, and compressor speed interactions on the energy performance of household refrigerators
    Knabben, F. T., Ronzoni, A. F., & Hermes, C. J. (2021). International Journal of Refrigeration, 130, 347-355. [URL]
  10. Practical level-of-detail aggregation of fur appearance
    Zhu, J., Zhao, S., Wang, L., Xu, Y., & Yan, L. Q. (2022). ACM Transactions on Graphics (TOG), 41(4), 1-17. [URL]
  11. The SDSS-Gaia View of the Color–Magnitude Relation for Blue Horizontal-branch Stars
    Barbosa, F. O., Santucci, R. M., Rossi, S., Limberg, G., Pérez-Villegas, A., & Perottoni, H. D. (2022). The Astrophysical Journal, 940(1), 30. [URL]
  12. Semi-empirical equation for determination of stress concentration factors (SCF) in tubular joints of fixed offshore platforms subjected to axial forces
    Costa, L. A., & de Sousa, J. R. M. (2022). In XLIII Ibero-Latin American Congress on Computational Methods in Engineering (Vol. 4, No. 04). [URL]
  13. Buckling Resistance of Single and Double Angle Compression Members
    Alenezi, A. M. M. (2022). (Doctoral dissertation, Université d'Ottawa/University of Ottawa). [URL]
  14. Dynamic Economic Load Dispatch Using Linear Programming and Mathematical-Based Models
    Al-Subhi, A. (2022). Mathematical Modelling of Engineering Problems, 9(3). [URL]
  15. Identifying influential nodes with centrality indices combinations using symbolic regressions
    Mukhtar, M. F., Abas, Z. A., Rasib, A. H. A., Anuar, S. H. H., Zaki, N. H. M., Rahman, A. F. N. A., ... & Shibghatullah, A. S. (2022). International Journal of Advanced Computer Science and Applications, 13(5). [URL]
  16. Development of a Simple Method for Predicting Rice Biomass at Harvest Based on Biomass Accumulation Data
    Takeuchi Eisuke, Tanaka Yu, Yoshida Hiroe, Saito Kazuki, Katsura Keisuke, & Shiraiwa Tachihiko. (2022, September). In Proceedings of the 254th Japanese Society of Crop Science Conference (pp. 50-50). Japanese Society of Crop Science. [URL, in Japanese]
  17. Data-driven artificial intelligence (AI) algorithms for modelling potential maize yield under maize–legume farming systems in East Africa
    Agboka, K. M., Tonnang, H. E., Abdel-Rahman, E. M., Odindi, J., Mutanga, O., & Niassy, S. (2022). Agronomy, 12(12), 3085. [URL]
  18. Novel machine-learning-based stall delay correction model for improving blade element momentum analysis in wind turbine performance prediction
    Syed Ahmed Kabir, I. F., Gajendran, M. K., Ng, E. Y. K., Mehdizadeh, A., & Berrouk, A. S. (2022). Wind, 2(4), 636-658. [URL]
  19. High-strain rate compressive behavior of fiber-reinforced rubberized concrete
    Lai, D., Demartino, C., & Xiao, Y. (2022). Construction and Building Materials, 319, 125739. [URL]
  20. Approximating the Boundaries of Unstable Nuclei Using Analytic Continued Fractions
    Moscato, P., & Grebogi, R. (2023, July). In Proceedings of the Companion Conference on Genetic and Evolutionary Computation (pp. 751-754). [URL]
  21. Symbolic Regression Applied to Cosmology: An Approximate Expression for the Density Perturbation Variance
    Carvalho, A., Oliveira, D. M., Krone-Martins, A., & Da Silva, A. (2023, October). In 2023 IEEE 19th International Conference on e-Science (e-Science) (pp. 1-2). IEEE. [URL]
  22. A Regression-Based Approach for Assessing the Buckling Coefficient of Stiffened and Unstiffened Elements
    Lakshmi, J. R., & Kumar, J. V. V. (2023, September). In IOP Conference Series: Earth and Environmental Science (Vol. 1237, No. 1, p. 012010). IOP Publishing. [URL]
  23. Machine learning based predictive modeling of stochastic systems
    Gajendran, M. K. (2023). University of Missouri-Kansas City. [URL]
  24. Multi-Method Simulation and Multi-Objective Optimization for Energy-Flexibility-Potential Assessment of Food-Production Process Cooling
    Howard, D. A., Jørgensen, B. N., & Ma, Z. (2023). Energies, 16(3), 1514. [URL]
  25. Influence of Open Differential Design on the Mass Reduction Function
    Karakašić, M., Konjatić, P., Glavaš, H., & Grgić, I. (2023). Applied Sciences, 13(24), 13300. [URL]
  26. Machine learning-based approach to wind turbine wake prediction under yawed conditions
    Gajendran, M. K., Kabir, I. F. S. A., Vadivelu, S., & Ng, E. Y. K. (2023). Journal of Marine Science and Engineering, 11(11), 2111. [URL]
  27. Continued fractions and the Thomson problem
    Moscato, P., Haque, M. N., & Moscato, A. (2023). Scientific Reports, 13(1), 7272. [URL]
  28. Galactic Archaeology through the Blue Stars of the Horizontal Branch
    Barbosa, F. O. (2023). (Doctoral dissertation, Universidade de São Paulo). [URL, in Portuguese]
  29. Forecasting Emergency Department Waiting Times Using Deep Neural Networks
    Pak, A., & Trinh, K. (2023). Value in Health, 26(12), S10. [URL]
  30. Machine Learning to Identify Atopic Dermatitis Prevalence Using Healthcare Utilisation Patterns of Both Diagnosed and Non-Diagnosed AD Patients Based on Danish Register Data
    Liljendahl, M., Torpet, M., Lyngsie, P. J., Rudolfsen, J. H., Pedersen, M., & Ibler, K. S. (2023). Value in Health, 26(12), S10. [URL]
  31. Finite Element Analysis and Machine Learning Guided Design of Carbon Fiber Organosheet-Based Battery Enclosures for Crashworthiness
    Shaikh, S. A., Taufique, M. F. N., Balusu, K., Kulkarni, S. S., Hale, F., Oleson, J., ... & Soulami, A. (2024). Applied Composite Materials, 1-19. [URL]
  32. New postoperative pain instrument for toddlers—Secondary analysis of prospectively collected assessments after tonsil surgery
    Gude, P., Geldermann, N., Gustedt, F., Grobe, C., Weber, T. P., & Georgevici, A. I. (2024). Pediatric Anesthesia, 34(4), 347-353. [URL]
  33. The electron density at the midpoint of the plasmapause
    Denton, R. E., Tengdin, P. M., Hartley, D. P., Goldstein, J., Lee, J., & Takahashi, K. (2024). Frontiers in Astronomy and Space Sciences, 11, 1376073. [URL]
  34. New alternatives to the Lennard-Jones potential
    Moscato, P., & Haque, M. N. (2024). Scientific Reports, 14(1), 11169. [URL]
  35. Approximating the nuclear binding energy using analytic continued fractions
    Moscato, P., & Grebogi, R. (2024). Scientific Reports, 14(1), 11559. [URL]
  36. Optimization and Comparative Evaluation of Novel Marine Engines Integrated with Fuel Cells Using Sustainable Fuel Choices
    Seyam, S., Dincer, I., & Agelin-Chaab, M. (2024). Energy, 131629. [URL]
  37. A New Approximation for the Perimeter of an Ellipse
    Moscato, P., & Ciezak, A. (2024). Algorithms, 17(10), 464. [URL]

This list is constantly growing and is probably incomplete. If your paper is not shown, please email it to us and we will add it to the list.

Community

  • The TuringBot Forum is a growing community where you can get help, ask questions, and connect with other users of the software.

Documentation

Check out the documentation for the software, where you can find detailed descriptions of all of its parameters.

Ready to get started?

Start discovering formulas from your data today.