ML - random forest regressor

Sukisen1981
Champion

OK, so I have run a random forest regressor on my sample data, trying to predict field A from the fields Dealers and Orders. I need help interpreting the results.
Under Fit Model Parameters Summary it gives me 2 rows:
feature importance
Dealers 0.219929427086
Orders 0.780070572914

So, given that I am happy with the R-squared value (0.9467), does this prediction mean
field A = 0.219929427086 * Dealers + 0.780070572914 * Orders?

aoliner_splunk
Splunk Employee

Feature importance in a random forest is different from the coefficients in a model like LinearRegression. The importances are normalized to sum to 1 and describe how much each field contributed to the trees' splits, not a weight to multiply the field by. Unfortunately, there's no simple equation like that to write down for a random forest; it consists of many different regression trees, each of which is (even by itself) not a simple linear equation.

You can find a decent explanation of how the importance is calculated, and should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
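
To make the point concrete, here is a minimal sketch in plain Python of how a forest produces a prediction. The two "trees", their thresholds, and their leaf values are invented purely for illustration; a real model learns hundreds of much deeper trees from the training data.

```python
# Toy illustration only: hand-written "trees" with made-up thresholds
# and leaf values. A fitted forest learns these from the training data.

def tree_1(dealers, orders):
    # Each tree is a chain of threshold tests ending in a constant leaf value.
    if orders <= 50:
        return 100.0
    return 300.0 if dealers <= 5000 else 350.0


def tree_2(dealers, orders):
    if orders <= 70:
        return 120.0 if dealers <= 4000 else 180.0
    return 320.0


def forest_predict(dealers, orders):
    # A regression forest averages the outputs of its individual trees.
    trees = (tree_1, tree_2)
    return sum(tree(dealers, orders) for tree in trees) / len(trees)


print(forest_predict(6000, 63))  # 265.0
print(forest_predict(6000, 64))  # 265.0 (no threshold crossed, so no change)
print(forest_predict(6000, 71))  # 335.0 (jumps once orders passes 70)
```

The output stays flat and then jumps when a threshold is crossed, which is why it cannot be rewritten as a weighted sum of Dealers and Orders.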

aljohnson_splun
Splunk Employee

For anyone interested in the details, this Stack Overflow question goes into more depth on how it's calculated: https://stackoverflow.com/questions/15810339/how-are-feature-importances-in-randomforestclassifier-d...
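
As a rough sketch of the idea (not the library's actual code), the impurity-based importance of a feature in a single regression tree can be computed as the variance reduction of each split on that feature, weighted by the share of rows reaching the split, and then normalized so the importances sum to 1. The rows and the tree shape below are made up for illustration; in a forest these per-tree values are then averaged across trees.

```python
from statistics import pvariance

# Made-up toy rows: (dealers, orders, A)
rows = [
    (1000, 10, 10.0),
    (5000, 10, 20.0),
    (1000, 60, 50.0),
    (5000, 60, 60.0),
]


def impurity(subset):
    """Variance of the target, the usual impurity for regression trees."""
    return pvariance([a for _, _, a in subset]) if subset else 0.0


def weighted_decrease(parent, left, right, n_total):
    """Impurity decrease of one split, weighted by the share of rows reaching it."""
    n = len(parent)
    children = (len(left) * impurity(left) + len(right) * impurity(right)) / n
    return (n / n_total) * (impurity(parent) - children)


n_total = len(rows)

# Hypothetical tree: the root splits on orders, then each child splits on dealers.
low = [r for r in rows if r[1] <= 35]
high = [r for r in rows if r[1] > 35]
raw = {"orders": weighted_decrease(rows, low, high, n_total), "dealers": 0.0}
for node in (low, high):
    left = [r for r in node if r[0] <= 3000]
    right = [r for r in node if r[0] > 3000]
    raw["dealers"] += weighted_decrease(node, left, right, n_total)

# Normalize so the importances sum to 1, like the values MLTK reports.
total = sum(raw.values())
importances = {name: value / total for name, value in raw.items()}
print(importances)
```

In this toy tree the orders split removes most of the variance, so orders gets most of the importance, even though both fields are needed to reach the final leaf values.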

Sukisen1981
Champion

OK, so then how do I predict field A using dealers and orders?

aoliner_splunk
Splunk Employee

When you call the fit command, add 'into your_model_name' to save the model. You can then use that model later with the apply command:
[training data] | fit RandomForestRegression A from dealers orders into my_model
[new data] | apply my_model
Use the field names exactly as they appear in your data; Splunk field names are case-sensitive.

Sukisen1981
Champion

Sorry, that's not clear.
I have already saved my model as "sc" in the MLTK app.
Now the customer is asking me what the predicted value of field A is when dealers=6000 and orders=63.

aoliner_splunk
Splunk Employee

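... | eval dealers=6000 | eval orders=63 | apply sc
Then look at the value of the field called predicted(A). If you have no existing search to build on, | makeresults can generate the single starting row in place of the leading "...".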
