### A regression model example and how to generate it

In this example, we use symbolic regression to predict house prices as a function of their characteristics.

TuringBot is desktop software that discovers mathematical formulas from data using symbolic regression.

Say you want to predict a numerical value from a set of input variables. In 2020, most people would go about it in one of two ways:

- Fit a line or a polynomial to the data.
- Use some horribly complicated black-box method (neural networks, random forests, etc.).

The first option is very limited. It barely scratches the surface of the space of all possible mathematical relationships that could be relevant.

The second option yields models that can be prone to overfitting and that offer little insight into the data.

This is where TuringBot comes in: it solves the problem by finding explicit mathematical formulas that connect the variables. This way, it generalizes curve-fitting methods (including linear and polynomial regression), while generating models that are simple and explainable.

Whether you are a data scientist, an engineer, a quant trader, or a researcher, TuringBot will give you a HUGE edge.

TuringBot implements a technique called symbolic regression. It tries to combine a set of base functions into simple formulas that accurately predict the desired variable. Examples of base functions are addition, sin(x), exp(x), etc.
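To make the idea concrete, here is a minimal Python sketch of a search over formulas built from base functions. It is not TuringBot's algorithm: it simply enumerates every formula of the form `a` or `a op b` over a tiny set of base functions and keeps the one with the lowest RMS error. The names `UNARY`, `BINARY`, `candidates` and `best_formula` are assumptions made for this illustration.

```python
import itertools
import math

# Hypothetical base functions chosen for this sketch (not TuringBot's list)
UNARY = {"sin": math.sin, "cos": math.cos, "exp": math.exp}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def candidates():
    """Yield (text, function) pairs for formulas 'a' and 'a op b'."""
    terms = [("x", lambda x: x)]
    for name, f in UNARY.items():
        terms.append((f"{name}(x)", f))
    # Single-term formulas: x, sin(x), cos(x), exp(x)
    for text, f in terms:
        yield text, f
    # Two-term formulas: every unordered pair of terms joined by an operator
    for (ta, fa), (tb, fb) in itertools.combinations_with_replacement(terms, 2):
        for op, g in BINARY.items():
            yield f"{ta} {op} {tb}", (lambda x, fa=fa, fb=fb, g=g: g(fa(x), fb(x)))


def rms_error(f, xs, ys):
    """Root-mean-square error of formula f on the data."""
    return math.sqrt(sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def best_formula(xs, ys):
    """Return the text of the candidate formula with the lowest RMS error."""
    return min(candidates(), key=lambda c: rms_error(c[1], xs, ys))[0]


# Toy data generated from a known formula, so the search has an exact answer
xs = [i / 10 for i in range(-20, 21)]
ys = [x * math.sin(x) for x in xs]
print(best_formula(xs, ys))
```

On this toy data, which is generated from `x * sin(x)`, the search recovers the generating formula. Exhaustive enumeration only works for very small formula spaces, so practical tools rely on smarter search strategies.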

It uses TXT or CSV files as input, which may contain an arbitrary number of columns. The target and input columns can be selected from the interface.
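As a rough illustration of this kind of input, the Python sketch below parses a small CSV and splits it into input columns and a target column. The file layout (a header row followed by comma-separated numeric values) is an assumption, and the column names and values are made up for the example. `load_columns` is a hypothetical helper, not part of TuringBot.

```python
import csv
import io


def load_columns(text, target, inputs=None):
    """Parse CSV text with a header row.

    Returns (input_names, X, y), where X is a list of rows of floats
    and y is the list of target values. If inputs is None, every
    column except the target is used as an input.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    names = inputs if inputs is not None else [c for c in rows[0] if c != target]
    X = [[float(r[c]) for c in names] for r in rows]
    y = [float(r[target]) for r in rows]
    return names, X, y


# Made-up sample data, for illustration only
sample = """rooms,area,age,price
3,120,10,250000
2,80,25,160000
4,200,5,410000
"""

names, X, y = load_columns(sample, target="price")
print(names)
print(X[0], y[0])
```

Passing `inputs=["rooms", "area"]` would restrict the model to those two columns, which mirrors choosing the input columns in an interface.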

Both TuringBot and Eureqa are implementations of symbolic regression, but the algorithms used by each are completely different. Eureqa is based on genetic programming, while TuringBot is based on simulated annealing.

Eureqa was acquired by a consulting company and is no longer commercially available.

A recent paper has shown that TuringBot performs noticeably better than Eureqa on a variety of physics-inspired problems (arXiv:2010.11328). In this paper, TuringBot even managed to solve problems for which Eureqa could not find a solution at all.
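For readers unfamiliar with simulated annealing, here is a minimal Python sketch of the general technique applied to a toy formula search. It is not TuringBot's implementation, whose details are not described here. The state layout (`left op right`), the base functions in `FUNCS` and `OPS`, the cooling schedule and all parameter values are assumptions made for illustration.

```python
import math
import random

# Hypothetical building blocks for this sketch
FUNCS = {"x": lambda x: x, "sin(x)": math.sin, "cos(x)": math.cos, "exp(x)": math.exp}
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def evaluate(state, x):
    """Evaluate a formula stored as (left term, operator, right term)."""
    left, op, right = state
    return OPS[op](FUNCS[left](x), FUNCS[right](x))


def rms_error(state, xs, ys):
    return math.sqrt(sum((evaluate(state, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def neighbor(state, rng):
    """Return a copy of state with one randomly chosen slot replaced."""
    new = list(state)
    slot = rng.randrange(3)
    pool = list(OPS) if slot == 1 else list(FUNCS)
    new[slot] = rng.choice(pool)
    return tuple(new)


def anneal(xs, ys, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing with geometric cooling; returns (best state, best error)."""
    rng = random.Random(seed)
    current = ("x", "+", "x")
    current_err = rms_error(current, xs, ys)
    best, best_err = current, current_err
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        cand = neighbor(current, rng)
        cand_err = rms_error(cand, xs, ys)
        # Metropolis rule: always accept improvements, and accept worse
        # candidates with a probability that shrinks as t decreases
        if cand_err <= current_err or rng.random() < math.exp((current_err - cand_err) / t):
            current, current_err = cand, cand_err
        if current_err < best_err:
            best, best_err = current, current_err
    return best, best_err


# Toy data generated from exp(x) * cos(x)
xs = [i / 10 for i in range(-20, 21)]
ys = [math.exp(x) * math.cos(x) for x in xs]
best, err = anneal(xs, ys)
print(" ".join(best), err)
```

Occasionally accepting a worse formula early on is what lets the search escape local minima. As the temperature drops, it behaves more and more like a greedy search.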

Many free symbolic regression packages have been developed in the past, most notably gplearn, along with many smaller repositories that can be found on GitHub.

If you try any of these packages and compare them to TuringBot, you will likely notice a large gap in performance. There are two main reasons for this:

- It is easy to write a basic symbolic regression program, but extremely difficult to write an efficient one. As a result, many projects are started with great hopes, maintained for a few months, and then abandoned because they turn out to be of little practical use.
- Most of these packages are written in scripting languages like Python, which are slow and unsuitable for numerical programming. No amount of flashy libraries like NumPy or Cython will make a Python program anywhere near as fast as a program written from scratch in C, as TuringBot is.

TuringBot can be downloaded and used for free for as long as you want, but it also has a paid version that unlocks more features. You can find more details on the Pricing page.

The applications of TuringBot are endless. To give a few examples:

- Are you an engineer or a researcher? You can use TuringBot to create simple and robust regression models. See: A regression model example and how to generate it.
- Are you a data scientist? You can use TuringBot to create models that classify rare events. See: Using Symbolic Regression to predict rare events.
- Are you a quant trader? You can use TuringBot to create classification models that predict whether a stock will rise or fall based on a set of features of your choice. See: How to create an AI trading system.

Figure: a decision boundary found with symbolic regression (see the tutorial).

What makes TuringBot so powerful is that many different search metrics are included, allowing models with different goals to be generated. Those include:

- RMS error
- Mean relative error
- Classification accuracy
- Mean error
- F1 score (for classification problems on highly imbalanced datasets)
- Correlation coefficient
- Maximum error
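To show what these metrics measure, here is a short Python sketch using standard textbook definitions. TuringBot's exact definitions are not given here, so treat these as assumptions. In particular, "mean error" is assumed to mean mean absolute error, and the classification metrics assume integer class labels.

```python
import math


def rms_error(y, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))


def mean_error(y, p):
    # Assumption: "mean error" is taken to be the mean absolute error
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)


def mean_relative_error(y, p):
    # Undefined if any target value is zero
    return sum(abs((a - b) / a) for a, b in zip(y, p)) / len(y)


def max_error(y, p):
    return max(abs(a - b) for a, b in zip(y, p))


def correlation(y, p):
    """Pearson correlation coefficient between targets and predictions."""
    my, mp = sum(y) / len(y), sum(p) / len(p)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    var_y = sum((a - my) ** 2 for a in y)
    var_p = sum((b - mp) ** 2 for b in p)
    return cov / math.sqrt(var_y * var_p)


def accuracy(labels, preds):
    return sum(a == b for a, b in zip(labels, preds)) / len(labels)


def f1_score(labels, preds, positive=1):
    """F1 score for the positive class; 0.0 when there are no true positives."""
    tp = sum(a == positive and b == positive for a, b in zip(labels, preds))
    fp = sum(a != positive and b == positive for a, b in zip(labels, preds))
    fn = sum(a == positive and b != positive for a, b in zip(labels, preds))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Imbalanced toy labels: a model that always predicts 0 looks accurate
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_zero = [0] * 10
print(accuracy(labels, always_zero), f1_score(labels, always_zero))
```

The last two lines illustrate why F1 score suits highly imbalanced datasets: a model that never predicts the rare class still scores well on accuracy, but gets an F1 score of zero.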

Start finding formulas for any dataset today.

Want to see TuringBot in action?

Check out the official blog.

Here we use TuringBot to develop a classification algorithm that predicts stock market price changes.

Learn how to classify credit card fraud in a real-world dataset using Symbolic Regression.