All Apps and Add-ons

ML - random forest regressor

Sukisen1981
Champion

ok so I have run a random forest regressor on my sample data, trying to predict field A based on fields Dealers &Orders . I need help on interpreting the results.
Under Fit Model Parameters Summary it gives me 2 rows
feature importance
Dealers 0.219929427086

Orders 0.780070572914

so given that I am happy with R square value (0.9467), does this prediction mean
field A=0.219929427086 * Dealers + 0.780070572914 * Orders
??

0 Karma
1 Solution

aoliner_splunk
Splunk Employee
Splunk Employee

Feature importance with a random forest is different from the coefficients in a model like LinearRegression. Unfortunately, there's no simple equation like that to write down for a random forest; it's many many different regression trees, each of which are (even by themselves) not a simple linear equation.

You can find a decent explanation of how the importance is calculated, and should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation

View solution in original post

0 Karma

aoliner_splunk
Splunk Employee
Splunk Employee

Feature importance with a random forest is different from the coefficients in a model like LinearRegression. Unfortunately, there's no simple equation like that to write down for a random forest; it's many many different regression trees, each of which are (even by themselves) not a simple linear equation.

You can find a decent explanation of how the importance is calculated, and should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

For the quibblers interested, this SO goes into more detail on how its calculated: https://stackoverflow.com/questions/15810339/how-are-feature-importances-in-randomforestclassifier-d...

0 Karma

Sukisen1981
Champion

ok so , then how do i predict field A using dealers and orders?

0 Karma

aoliner_splunk
Splunk Employee
Splunk Employee

When you call the fit command, supply 'into your_model_name'. Then, you can use that model later with the apply command:
[training data] | fit RandomForestRegression A from dealers orders into my_model
[new data] | apply my_model

0 Karma

Sukisen1981
Champion

sorry , not clear.
I have already saved my model as "sc" in the MLTK app.
Now, customer is asking me what is the predicted value for field A when dealers=6000 and orders= 63

0 Karma

aoliner_splunk
Splunk Employee
Splunk Employee

... | eval dealers=6000 | eval orders=63 | apply sc
Then look at the value of the field called predicted(A)

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Event Series: Telemetry Pipeline Management

Balancing Scale and Spend: Gaining Control Over High-Volume Metrics in Splunk Observability Cloud As ...

Kick the Tires Before You Commit: A Hands-On Tour of the Splunk Observability Cloud ...

Evaluating an enterprise observability platform usually goes like this: fill out a form, get a free trial with ...

Deep insights, no barriers: Splunk Observability Cloud Free Edition

As software delivery cycles continue to accelerate, observability shouldn’t be a luxury — it should be a ...