All Apps and Add-ons

How can I predict the number of events sent for each host in the Splunk Machine Learning Toolkit?

davietch
Path Finder

Hi,

I am using the MachineLearning Toolkit in order to predict how many events each host are usually sending.
To do so, I selected the "Predict Numeric Fields" showcase and created the following command:

| tstats count where index=*  by host,_time span=1h
|eval date_wday=strftime(_time,"%w"), date_hday=strftime(_time,"%H")

This gives me the number of event per host for each hour. I also compute 2 fields: the weekday and the hour of the day.

But when I run the Linear Regression with "count" field to predict and the "host", "date_wday" and "date_hday" as used fields for predicting, the result is awful.
When I filter on just one host, the prediciting is working quite well but as soon as there are severals hosts names, the ML does not work.

Any idea how to create a model that take in account the name of the host? Maybe some preprocessing?

Thanks

0 Karma

jcoates
Communicator

I expect that means that each host is a different context with different data and needs a different linear regression. If they were all the same then the model of one's past would predict future for all the others. Since your results show that isn't true...

0 Karma

davietch
Path Finder

Yes they all have a different behavior, but I can not create a model for my 20K Forwarders.... Can I? I bet there is a more clever solution..

0 Karma

jcoates
Communicator

do groups behave similarly? Can you make a model for each group?

0 Karma

davietch
Path Finder

I do not have groups... They all behave differently

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...