Program still getting slow and unresponsive with a large number of rows.
Interface errors not updated after loading formulas from a file with the search paused.
v2.16
February 3rd, 2023
Changes
Implement dataset normalization option (Advanced > Normalize the dataset). This speeds up the search when the input values are large.
Optimize the program for very large numbers of rows/columns. The interface should be much more responsive now and files with many columns should also load faster.
When cross-validation is enabled, only show the corresponding points (train or test sample) in the plot.
Reduce the minimum value of "Maximum formula size" from 10 to 1.
Add the input file filename to the "Statistical summary" box.
Make periodic output save immediately when the periodic output option is first set.
Use >= instead of > for upper bound search, and <= instead of < for lower bound search.
Stop the elapsed time counter from advancing while the program is paused.
Bug fixes
Make "recent files" work in the file selection dialogs.
Fix bug where not all moving average lengths up to "Maximum history size" are used.
v2.14
November 17th, 2022
Changes
F-score: the F1-score metric has been replaced by the more general F-score metric with a customizable beta parameter. Lower values of beta favor precision over recall. Useful for stock market applications.
log10(x) base function has been added.
Random seed for random train/test split generation can now be manually set. When left at the default value (-1), a random seed will be used each time.
~30% faster performance after optimizations.
Customizable interface size: the interface size can now be adjusted under menu > interface scale. By default, this scale is now set to 90%.
Folder memory: the last used folders for exporting and loading files are now remembered even if the program is closed and launched again.
Negative predictive value metric has been added to the "Solution info" box for completeness of the confusion matrix.
Safe saving: whenever an output file is generated, it is now first saved with a _tmp extension and then renamed. This prevents the creation of partially written files in the event of power outages or crashes.
Select/deselect all: history functions are no longer toggled on by this button and must now be toggled manually.
macOS version: no longer requires manual dependency installations using brew. Feedback from macOS users as to whether TuringBot is working as expected would be highly appreciated.
Input dialog: on Windows and macOS, the system dialog for selecting the input file has been replaced by a more practical internal dialog.
Command-line improvements: it is now possible to load seed formulas from the command line with --formulas-file FILENAME and export formulas+predictions periodically with --predictions-file FILENAME.
Python improvements: the same options above have been added to the start_process method: predictions_file=None, formulas_file=None.
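As an illustration of the F-score change listed above, here is a minimal sketch of the standard F-beta formula computed from confusion-matrix counts. The function name f_beta and its arguments are hypothetical and are not part of TuringBot; how the program evaluates the metric internally is not stated in this changelog.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-beta score from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta**2 * precision + recall
    if denom == 0:
        return 0.0  # no true positives at all
    return (1 + beta**2) * precision * recall / denom


# Example counts with precision 0.8 and recall 0.5.
for beta in (0.5, 1.0, 2.0):
    print(beta, f_beta(tp=8, fp=2, fn=8, beta=beta))
```

With these counts, precision is higher than recall, so the score rises as beta is lowered and falls as beta is raised. beta = 1 recovers the F1-score.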
Bug fixes
Large formulas are not shown if more than 60 formulas with decreasing error are on display.
Broken "output type" button under the "set periodic output" window on Windows 10.
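The safe saving change in this release follows a common write-then-rename pattern. The sketch below shows that pattern in Python under stated assumptions: safe_write is a hypothetical helper, and appending "_tmp" to the full path is a guess at the naming, since the changelog only says a _tmp extension is used.

```python
import os
import tempfile


def safe_write(path: str, text: str) -> None:
    """Write text to a temporary file, then rename it over the final path."""
    tmp_path = path + "_tmp"  # assumed naming, see the note above
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before renaming
    # os.replace overwrites an existing destination file.
    os.replace(tmp_path, path)


# Usage: write into a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "solutions.txt")
safe_write(demo_path, "example output\n")
```

If the process dies before the rename, the final file is either absent or still holds its previous contents, and only the temporary file may be incomplete.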
v2.12
May 20th, 2022
Changes
macOS version: a native macOS version of the program is now available. Feedback from Mac users would be highly appreciated.
Analysis tab: this new tab contains a customizable "Error vs Size" plot as well as statistical information about the input data. It replaces the old Prediction tab, which most users did not consider useful.
Binary cross-entropy: new search metric. Solves binary classification problems in terms of probabilities between 0 and 1. To use this metric, the target variable must contain two (and only two) classes represented by the numbers 0 and 1.
Matthews Correlation Coefficient: new search metric. Regarded as one of the best classification metrics, it combines true positives, true negatives, false positives and false negatives into a single number between -1 and 1. See the documentation for details.
Precision, Recall, and Sensitivity: these metrics have been added to the "Solution info" box in the main tab of the interface. They make solutions obtained with the F1 score/MCC metrics much easier to interpret and analyze.
min(x, y) and max(x, y): new base functions.
Export data+predictions: this option has been added to the menu, allowing data and predictions to be manually exported at any time.
Interface improvements: menus have been made more user friendly (no more overflowing menus with white space) and the interface colors have been made slightly more readable.
Periodic output: outputs are now only saved if new solutions have been found, preventing unnecessary disk usage.
File reading: input files are now read 30% faster.
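To make the binary cross-entropy metric above concrete, here is a minimal sketch of the textbook formula. The function binary_cross_entropy and the clipping constant eps are assumptions made for illustration. The check on the target values mirrors the requirement that the target contains exactly the classes 0 and 1.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    if set(y_true) != {0, 1}:
        raise ValueError("target must contain exactly the two classes 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep log() finite at p = 0 or p = 1
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


print(binary_cross_entropy([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.2]))
```

Lower values mean the predicted probabilities are closer to the true classes. An uninformative prediction of 0.5 for every row gives ln(2).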
Bug fixes
Broken interface on the latest Ubuntu versions.
Optimization considers y = 1e-200 to be a perfect fit when the correlation coefficient metric is used.
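The classification metrics introduced in this release can all be derived from the four confusion-matrix counts. The sketch below uses the standard definitions. The function classification_metrics is hypothetical, and returning 0 when a denominator is zero is a convention chosen here, not necessarily the one the program uses.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Precision, recall and Matthews correlation coefficient from counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "mcc": mcc}


print(classification_metrics(tp=40, tn=45, fp=5, fn=10))
```

An MCC of 1 corresponds to perfect classification, -1 to every prediction being wrong, and values near 0 to predictions no better than chance.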
v2.10
February 18th, 2022
Changes
Observed vs predicted plot: you can now see your predictions as a function of the target variable by clicking on Plot x axis > Observed.
Symlog plot: when selecting log y or log x in the "Plot scale" menu, the behavior is now to calculate the "symlog" of the data just like in Python's Matplotlib, allowing negative numbers to be visualized nicely.
Folder memory: if you select a file in a menu, and then open that same menu later, the folder shown is now remembered.
Update frequency: the interface now updates more frequently.
Input file reading: now 30% faster, more robust, and with error handling: if the file is empty or unreadable, the program now shows an error message instead of crashing.
Special characters: the program now supports reading/saving files with UTF-8 filenames (including accented characters and non-Latin alphabets like Chinese and Russian).
Decimal places: when exporting solutions as text/Python/C or exporting data, constants are now written with more decimal places to avoid loss of accuracy.
Error comments: when exporting solutions as Python/C, a comment is now shown above each solution with its corresponding error.
Cross-validation error exporting: when saving solutions as text, the cross-validation error is now also saved (if cross-validation is enabled).
Cross-validation optimization: optimizations with cross-validation enabled are now much faster, especially for large input files.
Command-line help: running turingbot (or TuringBot.exe) with the --help flag now shows a help message.
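The symlog plot scale mentioned above can be illustrated with one simple symmetric-log transform. This is only a sketch of the idea: the function symlog below is hypothetical, and the exact transform used by TuringBot or by Matplotlib (which treats a small region around zero as linear) is not reproduced here.

```python
import math


def symlog(x: float) -> float:
    """Sign-preserving log transform: sign(x) * log10(1 + |x|)."""
    return math.copysign(math.log10(1.0 + abs(x)), x)


for value in (-1000.0, -9.0, 0.0, 9.0, 1000.0):
    print(value, symlog(value))
```

Unlike a plain logarithm, this transform is defined for zero and for negative numbers, which is what lets negative data appear on a log-like axis.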
Bug fixes
Fix bug that caused the program to randomly crash after running for several hours.
Fix "Plot x axis" and "Plot scale" menus extending beyond the program window and rendering incorrectly.
Fix ghost solutions appearing in the Log messages.
Fix "Set periodic output" dialog disappearing after being closed once.
Fix broken function declarations when exporting as C/Python in Advanced mode.
log2(x) is now correctly read when typed into Advanced mode.