
When using a search as an input for the Machine Learning (ML) Toolkit Numeric Field Prediction, how do we overcome the following error?

damucka
Builder

Hello,

We would like to use the following search as an input for the ML Toolkit Numeric Field Prediction (DecisionTreeRegressor):

index=mlbso violation sourcetype=*BWP_hanatraces* | timechart span=10s count as mlbso_hana_composite_ooms

The idea is to count the occurrences of the word "violation" (out-of-memory errors) in the specific source type, create the field "mlbso_hana_composite_ooms" from that count, and then predict on it. To improve the prediction quality, we need to run the above search over a sufficiently long time window in the past, ideally several months. This, however, produces the following error:

The specified span would result in too many (>50000) rows
I understand where the error comes from (maxresultrows in limits.conf), but we do not want to increase the span. We would actually even want to reduce it to 1 second. Nor do we want to make the time window smaller; here, too, we would rather extend it to several months.
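To make the trade-off concrete, here is a rough back-of-the-envelope sketch of how many rows a fixed span produces over a given window. The 50,000 figure is taken from the error message above; the helper name `timechart_rows` and the 90-day window (standing in for "several months") are assumptions for illustration only.

```python
import math


def timechart_rows(window_seconds: int, span_seconds: int) -> int:
    """Number of time buckets (rows) a fixed span produces over a window."""
    return math.ceil(window_seconds / span_seconds)


ROW_LIMIT = 50_000       # limit quoted in the error message
DAY = 24 * 60 * 60       # seconds per day

# Longest window (in days) that stays within the limit at a 10-second span.
max_window_days = ROW_LIMIT * 10 / DAY

# Rows needed for an assumed 90-day window at 10-second and 1-second spans.
rows_90d_10s = timechart_rows(90 * DAY, 10)
rows_90d_1s = timechart_rows(90 * DAY, 1)

print(f"Window that fits at span=10s: {max_window_days:.1f} days")
print(f"Rows for 90 days at span=10s: {rows_90d_10s}")
print(f"Rows for 90 days at span=1s:  {rows_90d_1s}")
```

Under these assumptions, a 10-second span fits just under six days of data, while 90 days would need several hundred thousand rows at 10 seconds and several million at 1 second, so the gap to the limit is one to two orders of magnitude rather than a small adjustment.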

When I reduce the selected time window or increase the span in order to stay below 50,000 rows, the search works fine, but only around 3,000 of the resulting rows are non-empty. In short, out of 50,000 samples, I get about 3,000 where the word "violation" was found, and with this the quality of the DecisionTreeRegressor is not good enough.

How would I overcome this issue?

Am I overlooking something?

Kind Regards,
Kamil


hkeswani_splunk
Splunk Employee

If you are trying to increase the maximum number of events that the algorithms can handle, you can change it in the mlspl.conf file. Alternatively, if you have MLTK 4.0 installed, you can change it directly in the app via the Settings tab in MLTK.
