TuringBot already offers a solid range of error metrics, including the hybrid CC + RMS, which balances correlation and scale. However, for more complex, non-linear datasets—especially those that seem almost random—this metric might not be enough. I’d like to suggest adding a hybrid of Mutual Information (MI) + NRMSE as a new option.
MI captures both linear and non-linear dependencies, making it much better suited for complex data where standard correlation metrics like Pearson CC can miss hidden patterns.
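To make that concrete, here is a small Python sketch (not TuringBot code, just an illustration assuming NumPy and scikit-learn's mutual_info_regression estimator) where Pearson CC is essentially blind to a strong non-linear dependence that MI picks up:

```python
# Toy comparison (not TuringBot code): Pearson CC vs. mutual information
# on a purely non-linear relationship y = x^2 + noise.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1000)
y = x**2 + rng.normal(0, 0.02, 1000)

pearson_cc = np.corrcoef(x, y)[0, 1]                  # ~0: CC sees nothing
mi = mutual_info_regression(x.reshape(-1, 1), y)[0]   # clearly > 0
print(f"Pearson CC = {pearson_cc:.3f}, MI = {mi:.3f} nats")
```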
NRMSE complements this by normalizing the error to the data’s scale, so the error magnitude stays comparable across targets with very different ranges.
Since TuringBot uses simulated annealing, which excels at exploring non-linear search spaces, adding a hybrid of MI + NRMSE could further enhance its ability to find optimal formulas in more challenging datasets.
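For illustration only, here is a rough sketch of how such a hybrid score might be computed for a candidate formula’s predictions. The range-based NRMSE normalization, the exponential squashing of MI, and the equal 50/50 weighting are my own assumptions for the sake of the example; I don’t know how the existing CC + RMS hybrid is weighted internally.

```python
# Rough sketch of an MI + NRMSE hybrid error for a candidate formula's
# predictions (lower = better). The range-based normalization, the
# exp-squashing of MI, and the 50/50 weighting are my own guesses,
# not TuringBot's internal CC + RMS formula.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def hybrid_mi_nrmse(y_true, y_pred):
    # NRMSE: RMSE divided by the target's range, so the scale drops out.
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    nrmse = rmse / (np.max(y_true) - np.min(y_true))

    # MI between predictions and targets, squashed to [0, 1) so it lives
    # on a scale comparable to NRMSE (1 ~ strong dependence, 0 = none).
    mi = mutual_info_regression(y_pred.reshape(-1, 1), y_true)[0]
    mi_score = 1.0 - np.exp(-mi)

    # Equal weights: reward dependence (via 1 - mi_score) and penalize
    # normalized error, in the same spirit as the CC + RMS hybrid.
    return 0.5 * (1.0 - mi_score) + 0.5 * nrmse

# Usage: err = hybrid_mi_nrmse(y_observed, y_candidate)
```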
I believe this hybrid metric could benefit users working with complex, non-linear data, without affecting TuringBot’s current performance. Thanks!