There are several issues with the "mean relative error" metric in TuringBot when fitting models where the target values f(x) span multiple orders of magnitude:
- Non-symmetry: The metric is biased, because overestimates weigh more heavily than underestimates (see the sketch after this list). This is explained in https://towardsdatascience.com/mean-absolute-log-error-male-a-better-relative-performance-metric-a8fd17bc5f75
- Undefined for Zero Values: The relative error is undefined when the data includes zeros, f(x)=0.
- Disproportionate Impact on Small Values: When f(x) is very small, even a tiny absolute difference can produce a large relative error, skewing the overall metric when small values are less important or are smaller than the observation uncertainty.
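To make the first two issues concrete, here is a minimal Python sketch (not TuringBot code; the numbers are purely illustrative):

```python
# Relative error treats over- and under-estimates of the same
# multiplicative size very differently.
y_true = 10.0
over  = abs(20.0 - y_true) / y_true   # model = 2x the data   -> 1.0 (100%)
under = abs(5.0  - y_true) / y_true   # model = 0.5x the data -> 0.5 (50%)
print(over, under)                    # both off by a factor of 2, yet penalized unequally

# And the metric breaks down entirely when the data contains zeros:
# abs(model - 0.0) / 0.0 -> division by zero
```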
Eureqa solved this with two options: the user could choose the balanced error metric "Logarithm error", or could optionally apply row weights. Neither option is available in TuringBot.
I therefore recommend that you implement some variant of RMSLE (Root Mean Squared Log Error). It is not necessary to calculate the root or the mean; it suffices to calculate the Sum of Squared Log Residuals (SSLR) or the Sum of Absolute Log Residuals. In addition, instead of the fixed shift term of 1 commonly used in RMSLE, the shift could be a user-definable parameter with default value shift = 1 that the user can change in Advanced settings:
SSLR = SUM_i [ ln( (y_i + shift) / (y'_i + shift) ) ]^2
where y_i is the observed data and y'_i is the model prediction.
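A minimal sketch of the proposed metric, assuming NumPy arrays and a user-settable shift parameter; the function names are illustrative, not an existing TuringBot API:

```python
import numpy as np

def sum_squared_log_residuals(y_data, y_model, shift=1.0):
    """Proposed SSLR: sum of squared log residuals with a
    user-definable shift (default 1, as in standard RMSLE)."""
    y_data = np.asarray(y_data, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    log_residuals = np.log((y_data + shift) / (y_model + shift))
    return np.sum(log_residuals ** 2)

def sum_absolute_log_residuals(y_data, y_model, shift=1.0):
    """Variant using absolute rather than squared log residuals."""
    y_data = np.asarray(y_data, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    return np.sum(np.abs(np.log((y_data + shift) / (y_model + shift))))

# Example: the log residual is symmetric in the multiplicative sense,
# and zeros in the data are handled as long as y + shift > 0.
y     = np.array([0.0, 0.01, 1.0, 100.0])
y_hat = np.array([0.005, 0.02, 2.0, 50.0])
print(sum_squared_log_residuals(y, y_hat, shift=1.0))
```

Dropping the root and mean leaves the ranking of candidate models unchanged, so the cheaper sum is sufficient for the search loop.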