Recent Posts

1
Symbolic Regression General / Re: Multiple data sets, 2 variables, how to proceed
« Last post by RaptorUK on December 19, 2021, 06:11:19 pm »
Hello RaptorUK,

TuringBot's plot makes it seem like it works in 1 dimension, but in reality, it can find relationships in any number of dimensions.

In your case, you are looking for Y = f(X, Z). The best way is to combine all your datasets into a single input file while adding a corresponding Z column to each row. This way, you can use the program to find a general formula Y = f(X, Z) without having to run multiple regressions.

I hope this makes sense.

Hi,

Many thanks for the reply . . . I think I had done this already: I did have three columns in my CSV file. I have further refined this to remove duplicates and maintain an ascending order for the target variable.

It looks like it's working better now but I need to do more predictions to make sure . . .

Thanks again for the help.
2
Symbolic Regression General / Re: Multiple data sets, 2 variables, how to proceed
« Last post by admin on December 16, 2021, 12:07:34 pm »
Hello RaptorUK,

TuringBot's plot makes it seem like it works in 1 dimension, but in reality, it can find relationships in any number of dimensions.

In your case, you are looking for Y = f(X, Z). The best way is to combine all your datasets into a single input file while adding a corresponding Z column to each row. This way, you can use the program to find a general formula Y = f(X, Z) without having to run multiple regressions.

I hope this makes sense.
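
For illustration, the combining step described above could be sketched in Python as follows. This is only a sketch under assumptions: the data values, the `series_by_z` layout, the helper names, the column order (x, z, y) and the header row are all hypothetical, so adjust them to whatever your input file actually needs.

```python
import csv
import io

# Hypothetical example data: one X-vs-Y series per fixed Z value.
series_by_z = {
    1.0: [(0.0, 1.0), (1.0, 2.0)],
    2.0: [(0.0, 2.0), (1.0, 4.0)],
}


def combine_series(series_by_z):
    """Flatten {z: [(x, y), ...]} into one list of (x, z, y) rows."""
    rows = []
    for z, points in sorted(series_by_z.items()):
        for x, y in points:
            # Every row carries the Z value of the series it came from.
            rows.append((x, z, y))
    return rows


def write_csv(rows, handle):
    """Write the combined rows as CSV with an (assumed) header line."""
    writer = csv.writer(handle)
    writer.writerow(["x", "z", "y"])
    writer.writerows(rows)


rows = combine_series(series_by_z)
buffer = io.StringIO()  # stands in for a real file opened with open(..., "w", newline="")
write_csv(rows, buffer)
print(buffer.getvalue())
```

Each row of the combined file then describes one (X, Z, Y) point, so a single regression sees all series at once.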
3
Symbolic Regression General / Multiple data sets, 2 variables, how to proceed
« Last post by RaptorUK on December 03, 2021, 11:38:47 am »
Hi folks,

I was searching via Google for tools to create formulas from data sets. What Excel provides me by way of a trendline formula was somewhat helpful but ended up not being quite precise enough for my needs. So I came across TuringBot and it can do a much better job . . . obviously.

So I started using it on simple X vs Y data and it seemed to do a good job . . . so I thought "I wonder if it can take all my data sets in one go".

I have X vs Y data at various fixed Z values, and I want to be able to predict my Y value for given X and Z values. It seemed to be working until I started testing using the prediction tab, where the values were obviously wrong. For a set X value, when I vary the Z value the predicted Y value should increase or decrease with it . . . this wasn't happening. Increasing Z would sometimes lead to an increasing Y, sometimes a decreasing Y.

Hope this makes sense . . . What I think is happening is that TuringBot assumes all the data is one single series . . . when it is actually several different series. So I need it to do a regression in 2 dimensions . . . I think. Is this possible?

My Excel data looks like this, which may help to explain a little better:

 
4
One more thing for clarification:
If a fixed number of rows (e.g. 100 rows) is chosen together with the last points option, does this mean that the first 100 rows from the beginning are used for training?
5
Many thanks. Looking forward to the next release having a plot of observed vs predicted data.

By the way, when a train/test split is chosen, is the R-squared value for the whole set of data or only training data?

In the train/test split, I wonder whether it can include a user-defined number of rows or a customized percentage split as an option.

Meanwhile, I wonder whether it is possible to carry out a search for a defined function with a combined variable. For example,  y = f(x1+f(), x2), where x1 and x2 are independent variables.
7
You are right, log2(x) is being ignored if you type it in a custom formula. This is a small bug that is affecting only this particular function.

Also, the R-squared is indeed being calculated as 0 for a perfect fit, which is another bug. For fits that are not perfect the result is correct though.

Thanks for reporting, both of these will be fixed soon.

log10(x) is not available, but meanwhile you can compute it with log(x)/log(10), and the same for log2(x).
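
The workaround relies on the change-of-base identity log_b(x) = log(x)/log(b). A quick way to convince yourself, sketched in Python (the helper name `log_base` is made up for this example and is not a TuringBot function):

```python
import math


def log_base(x, base):
    """Logarithm of x in an arbitrary base, using natural logs only."""
    return math.log(x) / math.log(base)


# Compare against Python's built-in log10 and log2 for a few sample values.
for x in (0.5, 8.0, 123.4, 1000.0):
    assert math.isclose(log_base(x, 10), math.log10(x))
    assert math.isclose(log_base(x, 2), math.log2(x))

print(log_base(1000.0, 10), log_base(8.0, 2))
```

The results can differ from the exact value in the last floating-point digit, which is why the comparison uses `math.isclose` rather than `==`.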
8
About different errors appearing each time: if you leave the Test sample setting as the default value, Chosen randomly, the program will generate a new random split each time you start a new optimization, resulting in different errors each time.

To get consistent errors, you can switch this option to The last points, so that a sequential split is used instead of a random one.
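
To see why the two settings behave differently, here is a minimal Python sketch of a sequential versus a random split. It only illustrates the idea and is not the program's actual code; the function names and the `n_test` parameter are assumptions made for the example.

```python
import random


def split_last_points(rows, n_test):
    """Sequential split: the last n_test rows become the test sample."""
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


def split_random(rows, n_test, seed=None):
    """Random split: n_test rows are drawn at random as the test sample."""
    rng = random.Random(seed)
    test_idx = set(rng.sample(range(len(rows)), n_test))
    train = [r for i, r in enumerate(rows) if i not in test_idx]
    test = [r for i, r in enumerate(rows) if i in test_idx]
    return train, test


rows = list(range(10))

# The sequential split gives the same train/test sets on every call.
print(split_last_points(rows, 3))

# Without a fixed seed, the random split usually changes between calls,
# so the errors computed on each part change as well.
print(split_random(rows, 3))
print(split_random(rows, 3))
```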

Your suggestions of exporting R-squared values and adding an observed/predicted plot are both great, I'll try to add them in the next release.

As a rule of thumb, the 100/1000/10000 points settings are useful to speed up the optimization if you are using a very large dataset, for instance, one with millions of rows. Otherwise, the other options are more appropriate.
9
Symbolic Regression General / Re: Nvidia Cuda support in the future?
« Last post by admin on November 26, 2021, 12:39:04 pm »
Hello all,

In neuroscience, we mostly use 128-512-960 bits large integers with lots of parameters.
Do you have any plans to support Nvidia Cuda as an option for faster calculations in future releases?

Thanks.

Hi Joseph,

You are actually the first person to make this suggestion. It should be possible to pass at least part of the computations to a GPU (if available). I have added this to our backlog, I'll see if we can come up with something.
10
Symbolic Regression General / Re: Size of input numbers
« Last post by admin on November 26, 2021, 12:34:01 pm »
Hello,
There is a problem with large integers.
In my case (TuringBot 2.0 + Win10 x64), integers larger than 1 million are converted to their exponential form, such as 1e+06.
It looks fine at the beginning, but then the software uses that rounded exponential number as the value.
Eventually it finds wrong results.
For example
343669578, 343669579, 343669580 are all shown and calculated as 3.4367e+08.
A setting can be added to prevent this problem (up to maximum integer size, which is 10**308 as you mentioned.)

Thanks.

Joseph, I could not reproduce this issue. If I try the following input file on Windows:

1 343669578
2 343669579
3 343669580

the program finds the correct solution, y = 343669577+x:



Do you have an example of input file for which this problem happens? Thanks.
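
For reference, the reported symptom is what one would see if the values were written out with only about six significant digits at some point. The Python sketch below reproduces that effect on its own; it is not TuringBot's code, and whether or where such rounding happens in the program is not established in this thread.

```python
values = [343669578, 343669579, 343669580]

# The "g" format keeps 6 significant digits by default, so the three
# distinct integers collapse into the same string.
short = [format(v, "g") for v in values]
print(short)

# Reading those strings back gives one rounded value, not the originals.
reparsed = [float(s) for s in short]
print(reparsed)

# Stored as double-precision floats, the same integers survive exactly
# (doubles represent every integer up to 2**53 without loss).
exact = [float(v) for v in values]
assert [int(e) for e in exact] == values

# The solution quoted above, y = 343669577 + x, matches the input file.
assert [343669577 + x for x in (1, 2, 3)] == values
```

If an input file was produced by a tool that exports numbers in this shortened form, the precision is already lost before the regression starts, so it is worth checking the raw file in a text editor.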