v3.1

May 26, 2025

Changes

Make a major optimization to the symbolic regression algorithm that helps it escape local minima more effectively.
Add a new percentile error search metric with an option to set the quantile. This metric is useful for datasets with outliers.
Add new options for constraining formulas: minimum and maximum number of constants, minimum and maximum number of different variables, maximum number of occurrences per variable.
Add the option to include constants like pi, e, sqrt2, ln2 in the search.
Make solutions with 'Normalize the dataset' directly usable by replacing variables with (variable-avg)/std in the output formulas.
Add a plot to the Log tab showing record errors over time to visualize search progress.
Add "New record!" log messages when solutions with new record-low errors are found.
Improve logarithmic scales for plots, which now allow visualizing actual numbers (rather than transformed numbers) while only changing the axis scale itself. The symlog options had to be removed to support this change.
Add a 'Plot type' dropdown to switch between regular plots and observed vs predicted plots.
Make it possible to select and load multiple seed formula files at once.
Rename 'cross-validation' to 'test-sample' throughout the UI, since this more accurately represents what is implemented.
Add more diagnostic metrics to the Search tab: Mean error, Median error, Maximum error, Median relative error, Maximum relative error.
Organize the search metrics dropdown with regression metrics at the top and classification metrics at the bottom.
Reduce the number of log messages by not logging solutions with negligible error improvements.
Add a confirmation message after finishing processing seed formulas from a file.
Improve the formatting of error numbers in the Solutions table.
Only show test sample settings (random seed, sequential vs random) when a test sample is selected.
Only show the F-score beta parameter input when the F-score metric is selected.
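
As a rough illustration of why a percentile error metric helps with outliers, the sketch below takes a chosen quantile of the absolute errors using the nearest-rank method. The function name, the nearest-rank choice, and the use of absolute errors are assumptions made for this example; the program's actual implementation may differ.

```python
import math

def percentile_error(observed, predicted, quantile=0.9):
    """Return the given quantile of the absolute errors (nearest-rank method)."""
    if not 0.0 < quantile <= 1.0:
        raise ValueError("quantile must be in (0, 1]")
    errors = sorted(abs(o - p) for o, p in zip(observed, predicted))
    if not errors:
        raise ValueError("no data")
    rank = math.ceil(quantile * len(errors))  # 1-based nearest rank
    return errors[rank - 1]

observed = [1.0, 2.0, 3.0, 4.0, 100.0]   # the last point is an outlier
predicted = [1.1, 2.1, 2.9, 4.2, 5.0]

# The median error ignores the outlier; the 100th percentile is dominated by it.
median_like = percentile_error(observed, predicted, quantile=0.5)
worst_case = percentile_error(observed, predicted, quantile=1.0)
```

With a quantile below 1, the few largest errors do not affect the score, so a formula is not penalized for missing isolated outliers.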

Bug fixes

Fix the UI sometimes crashing or becoming all white and unresponsive during searches.
Fix the program failing to launch when two instances are launched simultaneously.
Prevent seed formulas with disallowed functions from being loaded.
Fix loading formulas containing spaces from files.
Fix the R-squared calculation to match the most commonly used definition exactly.
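
For reference, the definition of R-squared most often seen in textbooks is the coefficient of determination, 1 - SS_res / SS_tot. The sketch below assumes that this is the definition the entry refers to; the function name is chosen for this example only.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: how far predictions are from the data.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: how far the data is from its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for a constant target")
    return 1.0 - ss_res / ss_tot

perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Under this definition a perfect fit scores 1, and a model that always predicts the mean of the target scores 0.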

v3.1.1

May 27, 2025

Fix "Maximum occurrences per variable" not counting the x in functions like sin(x).
Show an error message instead of crashing when trying to load a badly formatted input file.

v3.1.2

June 1, 2025

Fix a crash during searches introduced in v3.1. This fix also improves search speed, so upgrading is highly recommended!

v3.1.3

June 18, 2025

Fix a rare crash during searches.

v3.1.4

July 3, 2025

Fix a crash when the number of generated formulas exceeds around 4e9.
Fix the displayed elapsed time doubling when the search is paused.
Fix custom searches with constants, like y = f(x) + pi.

v3.0

March 11, 2025

Changes

New UI layout with a left sidebar for navigation across tabs.
New Input tab to enter data directly in the UI through a built-in spreadsheet.
New Prediction tab to predict model outputs for fixed inputs or ranges of inputs, with a plot to make projections. The default values come from the last row in the dataset.
Consolidate all search options in the Search tab and put them in collapsible sections.
Make all plots interactive, with options to zoom in and out, pan, and save the plot. Also improve the plot styles.
Improve the style for the selected solution in the solutions box to make it more readable.
Add a "Show combined data (training + testing)" option for the main plot.
Make the input field for the custom formula larger.
Make the statistical summary area a table instead of raw text.
Check all but the last checkbox under "Input variables" by default when loading a dataset.

v3.0.1

March 28, 2025

Make log messages more readable, with horizontal scrolling for long entries.
Make the "Normalize the dataset" option normalize the row number as well.
Fix spreadsheet compatibility with decimal numbers that use commas (common when pasting from Excel).
Fix an issue where the spreadsheet would repeat pasted values when the selection had more rows than the pasted content.
Fix UI crashes that caused a gray screen in some cases after the 3.0 update.
Fix the "Solution info" box not clearing when loading a new dataset.
Fix variable names containing "$" (like "$x") not rendering properly in the Solutions box.

v3.0.2

April 11, 2025

Make it possible to resize the spreadsheet columns.
Fix pasting data to the spreadsheet when no selection is made.

v2.23

February 6, 2025

Changes

Repeated terms in custom search: You can now use indexed function terms (e.g., f1(x), f2()) to find formulas with multiple occurrences of the same unknown function. Please consult the updated documentation for details.
UI: Increase the vertical space between solutions in the Solutions box for better readability.

v2.23.1

February 17, 2025

Make the program launch slightly faster.
Fix the program becoming unresponsive after trying to start a search with an invalid custom formula.

v2.22

November 30, 2024

Changes

New command-line flags: New flags have been added to customize searches without a config file, including --search-metric, --train-test-split, and --custom-formula. All UI settings can now be configured through these command-line flags. Run turingbot --help or see the documentation for more details.
Python library updates: Added support for arguments like search_metric and train_test_split in the start_process method, allowing programmatic control of all UI settings without a config file. See the documentation for details. Also added exception handling to raise errors for invalid paths.
History functions generalization: The history functions (delay and moving_average) have been updated to accept general functions in the first argument, enabling formulas like delay(x*sin(x),2). The history functions are now also supported in custom formulas, such as y = delay(f(x),1) or y = delay(f(x),f()).
Improve the UI responsiveness during optimizations when using large input files, such as those containing millions of rows.
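
To make the history functions more concrete, here is a minimal sketch of what delay and moving_average could compute on a column of values. The handling of the first rows (returning None) and the choice to include the current row in the averaging window are assumptions for this example, not a description of the program's internals.

```python
import math

def delay(values, n):
    """Shift a series by n rows: the output at row i is values[i - n]."""
    # Assumption: rows without enough history yield None.
    return [values[i - n] if i >= n else None for i in range(len(values))]

def moving_average(values, n):
    """Mean of the n most recent rows, up to and including the current one."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            out.append(None)  # not enough history yet
        else:
            window = values[i + 1 - n : i + 1]
            out.append(sum(window) / n)
    return out

x = [0.0, 1.0, 2.0, 3.0]
# A general expression as the first argument, as in delay(x*sin(x),2).
inner = [v * math.sin(v) for v in x]
delayed = delay(inner, 2)
averaged = moving_average([1.0, 2.0, 3.0, 4.0], 2)
```

Evaluating the inner expression row by row first and then shifting the result is what allows an arbitrary function in the first argument.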

Bug fixes

Fix a command-line mode issue where only the first column was used as input by default instead of all columns except the last.
Fix cases where zombie processes were still left behind after the program was closed.
Fix a bug that caused the program to hang when base functions appeared in custom formulas without being selected under "Search options".
Fix an issue that prevented the program from working in command-line mode over SSH.

v2.22.2

December 30, 2024

Increase the maximum formula size to 1024 (from 500), and enforce this limit in the UI.
Add __enter__ and __exit__ methods to the Python library for use in a "with" statement.
Add a --version command-line flag.
Minor aesthetic improvements to the UI.
Fix the Python library leaving zombie processes when the script is interrupted by Ctrl+C, killed by closing the CMD window on Windows, or terminated with kill -9 on Linux.
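
The general pattern behind "with" statement support can be sketched as follows. ManagedProcess is a hypothetical class written for this example, not part of the TuringBot Python library, and the child process is a stand-in that simply sleeps.

```python
import subprocess
import sys

class ManagedProcess:
    """Hypothetical wrapper: starts a child process and always cleans it up."""

    def __init__(self, args):
        self.args = args
        self.process = None

    def __enter__(self):
        self.process = subprocess.Popen(self.args)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and when an exception such as KeyboardInterrupt
        # propagates out of the "with" block.
        if self.process.poll() is None:
            self.process.terminate()
        self.process.wait()
        return False  # do not suppress exceptions

# Stand-in child process; a real script would launch its own worker here.
with ManagedProcess([sys.executable, "-c", "import time; time.sleep(60)"]) as mp:
    child = mp.process
```

A context manager of this kind covers exceptions raised inside the block, but it cannot run if the parent itself is killed with kill -9, so that case needs a separate mechanism.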

v2.21

September 9, 2024

Changes

Add the option to specify a custom number of rows for the train/test split.
Add the RMSLE (Root Mean Squared Logarithmic Error) metric. Similar to "Mean Relative Error," it handles target variables across multiple magnitudes but penalizes large errors less aggressively. It requires the target variable to be strictly positive.
Add a y=x line to the "Observed vs Predicted" plot to better visualize the points.
Add the option to choose between log and symlog scales for plots. Previously, plots were always scaled using symlog, which behaves oddly when values are between 0 and 1.
When using cross-validation, export the training rows first, followed by the testing rows, in order, when clicking "Export data + predictions".
Set the "niceness" level of the program to 19 by default on Linux/macOS. This helps prevent the UI and system from becoming unresponsive when using all CPU threads on these operating systems.
Toggle checkboxes when clicking on function names under "Search options".
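
The Root Mean Squared Logarithmic Error added in this release can be sketched as below. This version takes the plain logarithm of each value, which matches the requirement of a strictly positive target; many libraries use log(1 + y) instead, so the exact form used by the program is an assumption here.

```python
import math

def rmsle(observed, predicted):
    """Root mean squared difference between log(predicted) and log(observed)."""
    if any(v <= 0 for v in observed) or any(v <= 0 for v in predicted):
        raise ValueError("values must be strictly positive")
    squared = [(math.log(p) - math.log(o)) ** 2
               for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

exact = rmsle([1.0, 10.0, 100.0], [1.0, 10.0, 100.0])
# Being off by a factor of 2 costs the same at any magnitude.
factor_two = rmsle([1.0, 100.0], [2.0, 200.0])
```

Because the error depends on the ratio between prediction and target, small and large target values contribute on the same scale.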

Bug fixes

Fix the program sometimes crashing when starting a new run.
Fix incorrect output for "Export data+predictions" when the solution contains delay or moving_average.
Fix moving_average persisting after starting a search with both delay and moving_average, stopping, and then starting a search with only delay.
Prevent "Load formulas from file" from loading formulas containing base functions not allowed in the current search.
Fix custom formulas not being parsed when a dataset variable has a name similar to a base formula, such as "y = tan(x) + f()" not parsing if one of the variables is named "ta".
Fix the LaTeX output of logical_and and logical_or.

v2.21.1

September 19, 2024

Fix the program sometimes crashing when starting a new run.
Python library: fix some macOS crashes, fix zombie processes being left behind if the script gets interrupted.

v2.20

August 15, 2024

Changes

Add a progress message to the UI when a very large file is being loaded.
Do not truncate variable names to 10 characters. Instead, use the full variable names in formulas.
Add the RSS (residual sum of squares) error metric.

Bug fixes

Fix history functions (delay and moving_average) sometimes being shown with the wrong variable in the argument when the search contains more than one input variable.
Fix a bug that caused the program to take a lot longer than necessary to load large files (> 1 GB) in recent releases.
Add missing "Correlation coefficient" option in the "Analysis" tab.
Fix LaTeX rendering for the "moving_average" function.
Fix the "Statistical summary" field in the "Analysis" tab overflowing when many input columns are present.

v2.19

July 19, 2024

Changes

LaTeX rendering: Equations in the "Solutions" box are now rendered as LaTeX, making them a lot easier to read and interpret.
Export as LaTeX: This new menu option makes it possible to export equations in a format that can be directly copied and pasted into academic publications and technical reports.
Fewer parentheses: A new algorithm for placing parentheses in equations has been developed, reducing their frequency and thus the character count of equations.
Select/deselect groups of base functions by clicking on the corresponding headers. For instance, "Trigonometric functions" and "Exponential functions" are now clickable.
Do not select "Row number" when clicking on the "All/none" button for input variables, as the row number is usually not a relevant variable.
Minor UI optimizations to reduce its CPU usage.

Bug fixes

macOS signing: TuringBot is now a signed macOS application, resolving the cryptic "TuringBot is damaged and can't be opened. You should move it to the Trash" error message.
Add back the "Correlation coefficient" metric, accidentally removed in v2.18.
Fix periodic solutions being saved with "_tmp" extension on Windows.

v2.18

June 10, 2024

Changes

Seamless cross-platform experience: We have rewritten the UI from scratch in a new framework, ensuring a consistent and identical experience across all major operating systems.
macOS support: We have added new .dmg installers for macOS, available in both Apple Silicon (M1/M2/M3) and Intel versions. These updated installers do not require installing any dependencies manually and should "just work".
Caching of plots and UI errors: Solution errors and plots are now only calculated once per solution, allowing you to go back and forth between different solutions on the list without wasting CPU cycles.
Improved logging: We have streamlined "New solution found" log messages, grouping them together every 1 second for enhanced readability.

v2.18.1

June 16, 2024

Smoother startup: The program's launch has been refined to be visually smoother and less glitchy.
Fix cross-validation errors not being correctly shown in the UI.
Fix the program crashing when "Matthews correlation" was selected as the error metric.
Windows only: Fix loading input files with UTF-8 characters in the filename, and fix saving settings when the username has UTF-8 characters.
Fix loading "Integer constants only" setting from a settings.cfg file (it was being ignored).

v2.16

February 3, 2023

Changes

Implement dataset normalization option (Advanced > Normalize the dataset). This speeds up the search when the input values are large.
Optimize the program for very large numbers of rows/columns. The interface should be much more responsive now and files with many columns should also load faster.
When cross-validation is enabled, only show the corresponding points (train or test sample) in the plot.
Reduce the minimum value of "Maximum formula size" from 10 to 1.
Add the input file filename to the "Statistical summary" box.
Make periodic output save immediately when the periodic output option is first set.
Use >= instead of > for upper bound search, and <= instead of < for lower bound search.
Stop the elapsed time counter from advancing while the program is paused.

Bug fixes

Make "recent files" work in the file selection dialogues.
Fix bug where not all moving average lengths up to "Maximum history size" are used.

v2.16.1

February 21, 2023

Fix the program still getting slow and unresponsive for large numbers of rows.
Fix interface errors not being updated after loading formulas from a file with the search paused.

v2.14

November 17, 2022

Changes

F-score: the F1-score metric has been replaced by the more general F-score metric with a customizable beta parameter. Lower values of beta favor precision over recall. Useful for stock market applications.
log10(x) base function has been added.
Random seed for random train/test split generation can now be manually set. When left at the default value (-1), a random seed will be used each time.
~30% faster performance after optimizations.
Customizable interface size: the interface size can now be adjusted under menu > interface scale. By default, this scale is now set to 90%.
Folder memory: the last used folders for exporting and loading files are now remembered even if the program is closed and launched again.
Negative predictive value metric has been added to the "Solution info" box for completeness of the confusion matrix.
Safe saving: whenever an output file is generated, it is now first saved with a _tmp extension and then renamed. This prevents the creation of partially written files in the event of power outages or crashes.
Select/deselect all: history functions are no longer toggled on by this button and must now be toggled manually.
macOS version: no longer requires manual dependency installations using brew. Feedback from macOS users as to whether TuringBot is working as expected would be highly appreciated.
Input dialog: on Windows and macOS, the system dialog for selecting the input file has been replaced by a more practical internal dialog.
Command-line improvements: it is now possible to load seed formulas from the command line with --formulas-file FILENAME and export formulas+predictions periodically with --predictions-file FILENAME.
Python improvements: the same options above have been added to the start_process method: predictions_file=None, formulas_file=None.
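
The effect of the beta parameter in the F-score can be seen in a small worked example. The sketch below uses the standard F-beta formula written in terms of confusion-matrix counts; the function name and the two example models are made up for illustration.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    if denominator == 0:
        raise ValueError("F-score is undefined when tp, fp and fn are all zero")
    return (1 + b2) * tp / denominator

# Model A: high precision (50/55), low recall (50/100).
# Model B: low precision (90/180), high recall (90/100).
a_low_beta = f_beta(tp=50, fp=5, fn=50, beta=0.5)
b_low_beta = f_beta(tp=90, fp=90, fn=10, beta=0.5)
a_high_beta = f_beta(tp=50, fp=5, fn=50, beta=2.0)
b_high_beta = f_beta(tp=90, fp=90, fn=10, beta=2.0)
```

With beta = 0.5 the high-precision model A scores higher, while with beta = 2 the high-recall model B does, which is the trade-off the beta setting controls.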

Bug fixes

Fix large formulas not being shown when more than 60 formulas with decreasing error are on display.
Fix the broken "output type" button under the "set periodic output" window on Windows 10.

v2.12

May 20, 2022

Changes

macOS version: a native macOS version of the program is now available. Feedback from Mac users would be highly appreciated.
Analysis tab: this new tab contains a customizable "Error vs Size" plot as well as statistical information about the input data. It replaces the old Prediction tab, which most users did not consider useful.
Binary cross-entropy: new search metric. Solves binary classification problems in terms of probabilities between 0 and 1. To use this metric, the target variable must contain two (and only two) classes represented by the numbers 0 and 1.
Matthews Correlation Coefficient: new search metric. Regarded to be one of the best classification metrics, it combines true positives, true negatives, false positives and false negatives into a single number between -1 and 1. See the documentation for details.
Precision, Recall, and Sensitivity: these metrics have been added to the "Solution info" box in the main tab of the interface. They allow solutions obtained with F1 score/MCC metrics to be much better interpreted and analyzed.
min(x, y) and max(x, y): new base formulas.
Export data+predictions: this option has been added to the menu, allowing data and predictions to be manually exported at any time.
Interface improvements: menus have been made more user friendly (no more overflowing menus with white space) and the interface colors have been made slightly more readable.
Periodic output: outputs are now only saved if new solutions have been found, preventing unnecessary disk usage.
File reading: input files are now read 30% faster.
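
For readers unfamiliar with the Matthews Correlation Coefficient, the standard formula can be sketched as follows. The function name and the convention of returning 0 when the denominator vanishes are choices made for this example.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention when a whole row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator

perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
inverted = matthews_corrcoef(tp=0, tn=0, fp=5, fn=5)
```

A value of 1 means every row is classified correctly, -1 means every row is classified incorrectly, and values near 0 indicate performance no better than chance.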

Bug fixes

Fix the broken interface on the latest Ubuntu versions.
Fix the optimization considering y = 1e-200 to be a perfect fit when the correlation coefficient metric is used.

v2.10

February 18, 2022

Changes

Observed vs predicted plot: you can now see your predictions as a function of the target variable by clicking on Plot x axis > Observed.
Symlog plot: when selecting log y or log x in the "Plot scale" menu, the behavior is now to calculate the "symlog" of the data just like in Python's Matplotlib, allowing negative numbers to be visualized nicely.
Folder memory: if you select a file in a menu, and then open that same menu later, the folder shown is now remembered.
Update frequency: the interface now updates more frequently.
Input file reading: now 30% faster, more robust, and with error handling: if the file is empty or unreadable, the program now shows an error message instead of crashing.
Special characters: the program now supports reading/saving files with UTF-8 filenames (including accented characters and non-Latin alphabets like Chinese and Russian).
Decimal places: when exporting solutions as text/Python/C or exporting data, constants are now written with more decimal places to avoid loss of accuracy.
Error comments: when exporting solutions as Python/C, a comment is now shown above each solution with its corresponding error.
Cross-validation error exporting: when saving solutions as text, the cross-validation error is now also saved (if cross-validation is enabled).
Cross-validation optimization: optimizations with cross-validation enabled are now much faster, especially for large input files.
Command-line help: running turingbot (or TuringBot.exe) with --help flag now shows a help message.
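
The idea behind a symlog scale can be sketched with a simplified symmetric-log transform. This is a common textbook variant and not Matplotlib's exact transform; the function name and the linthresh parameter are assumptions for this example.

```python
import math

def symlog(x, linthresh=1.0):
    """Simplified symmetric log: roughly linear near zero, logarithmic for large |x|."""
    # copysign keeps the sign of x, so negative values map to negative outputs.
    return math.copysign(math.log10(1.0 + abs(x) / linthresh), x)

at_zero = symlog(0.0)
positive = symlog(9.0)    # log10(1 + 9)
negative = symlog(-9.0)   # mirror image of the positive case
```

Unlike a plain logarithm, this transform is defined at zero and for negative numbers, which is what allows such values to appear on a log-like axis.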

Bug fixes

Fix bug that caused the program to randomly crash after running for several hours.
Fix "Plot x axis" and "Plot scale" menus extending beyond the program window and looking weird.
Fix ghost solutions appearing in the Log messages.
Fix "Set periodic output" dialog disappearing after being closed once.
Fix broken function declarations when exporting as C/Python in Advanced mode.
log2(x) is now correctly read when typed into Advanced mode.
R^2 is now 1 and not 0 for perfect fits.