ML - random forest regressor

Sukisen1981
Champion

OK, so I have run a random forest regressor on my sample data, trying to predict field A from the fields Dealers and Orders. I need help interpreting the results.
Under Fit Model Parameters Summary it gives me 2 rows:

feature    importance
Dealers    0.219929427086
Orders     0.780070572914

So, given that I am happy with the R-squared value (0.9467), does this prediction mean
field A = 0.219929427086 * Dealers + 0.780070572914 * Orders
?

aoliner_splunk
Splunk Employee

Feature importance in a random forest is different from the coefficients in a model like LinearRegression. Unfortunately, there's no simple equation like that to write down for a random forest; it's an ensemble of many different regression trees, each of which is (even by itself) not a simple linear equation.

You can find a decent explanation of how the importance is calculated, and how it should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
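
To make the difference concrete, here is a minimal pure-Python sketch of the idea. It is not the MLTK or scikit-learn implementation, and the toy data, tree depth, and number of trees are made up for illustration. Each tree's importances are the share of squared-error reduction credited to each feature, so they sum to 1, while the forest's prediction is the average of leaf means rather than a weighted sum of the inputs.

```python
import random
from statistics import fmean

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty node)."""
    if not ys:
        return 0.0
    m = fmean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows, ys):
    """Find the (gain, feature, threshold) that most reduces squared error."""
    best = (0.0, None, None)
    parent = sse(ys)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            gain = parent - sse(left) - sse(right)
            if gain > best[0]:
                best = (gain, f, t)
    return best

def build(rows, ys, depth, gains):
    """Grow a regression tree; add each split's error reduction to gains[feature]."""
    gain, f, t = best_split(rows, ys)
    if depth == 0 or f is None:
        return fmean(ys)  # leaf: predict the mean target of this node
    gains[f] += gain
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], depth - 1, gains),
            build([r for r, _ in right], [y for _, y in right], depth - 1, gains))

def predict_tree(node, row):
    """Walk from the root to a leaf and return that leaf's mean."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

# Hypothetical toy data: (dealers, orders) -> A, deliberately non-linear.
rows = [(d, o) for d in (1000, 2000, 4000, 8000) for o in (10, 20, 40, 80)]
targets = [10 * o + (50 if d >= 4000 else 0) for d, o in rows]

rng = random.Random(0)
trees, per_tree = [], []
for _ in range(5):  # 5 trees, each fit on a bootstrap sample of the rows
    idx = rng.choices(range(len(rows)), k=len(rows))
    gains = [0.0, 0.0]
    trees.append(build([rows[i] for i in idx], [targets[i] for i in idx], 3, gains))
    total = sum(gains)
    per_tree.append([g / total for g in gains])  # normalize so each tree sums to 1

importances = [fmean(col) for col in zip(*per_tree)]  # [dealers, orders]
query = (6000, 63)
prediction = fmean(predict_tree(t, query) for t in trees)
naive = importances[0] * query[0] + importances[1] * query[1]  # the formula from the question

print("importances:", importances)
print("forest prediction:", prediction)
print("importance-weighted sum:", naive)
```

The two printed numbers at the end differ because the importances only say how much each field contributed to reducing error during training; they carry no units and no sign, so they cannot be plugged in as coefficients.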

aljohnson_splun
Splunk Employee

For the quibblers who are interested, this Stack Overflow question goes into more detail on how it's calculated: https://stackoverflow.com/questions/15810339/how-are-feature-importances-in-randomforestclassifier-d...

Sukisen1981
Champion

OK, so then how do I predict field A using dealers and orders?

aoliner_splunk
Splunk Employee

When you call the fit command, add an 'into your_model_name' clause to save the model. Then you can use that model later with the apply command:
[training data] | fit RandomForestRegression A from dealers orders into my_model
[new data] | apply my_model

Sukisen1981
Champion

Sorry, that's not clear.
I have already saved my model as "sc" in the MLTK app.
Now the customer is asking me what the predicted value for field A is when dealers=6000 and orders=63.

aoliner_splunk
Splunk Employee

... | eval dealers=6000 | eval orders=63 | apply sc
Then look at the value of the field called predicted(A) in the results.
