Hi experts,
I am early in my experimentation with the Splunk App for DSDL (formerly DLTK), pulling events into a Jupyter notebook via Option 2, i.e.:
<SPL search> | fit MLTKContainer mode=stage algo=my_test * into app:my_test_data
where my_test is simply cloned from barebone_template, and I want the input data file to be created with the name "my_test_data".
I ran into the following issue, as the SPL search returns 500+ events:
Upon checking mlspl.conf, max_inputs is, fair enough, set to its default of 100,000. However, the resulting my_test_data.csv contains only 1,153 lines, i.e. only 1,152 events of interest excluding the header row.
Why don't I get 100,000 events in the CSV file? It is not a disk space issue; I have verified that.
More importantly, how can I get the full 100,000 events into my CSV file?
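For reference, this is roughly how I counted the staged rows. The helper below is a minimal sketch; the tiny inline CSV stands in for the real my_test_data.csv, whose location inside the container may differ in your setup:

```python
import csv
import io

def count_events(csv_text: str) -> int:
    """Count data rows in a staged DSDL CSV, excluding the header."""
    rows = sum(1 for _ in csv.reader(io.StringIO(csv_text)))
    return max(rows - 1, 0)

# Tiny demo in place of the real my_test_data.csv:
sample = "_time,host,message\n1,web01,ok\n2,web02,ok\n"
print(count_events(sample))  # -> 2
```

In my case the same count against the staged file yields 1,152 events.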
Any advice is greatly appreciated.
Thanks,
MCW
Hi @MCW
1. How many events are returned by your <SPL search>?
2. Can you share the output of your <SPL search> (e.g. as CSV)? I'd like to replicate your situation on my server.
3. Do you have access to the server Splunk is running on? If yes, can you provide the output of the following two commands?
./splunk show config mlspl | grep max_inputs
./splunk btool mlspl list --debug | grep max_inputs
Without knowing any more details, my guess is that your <SPL search> returned more events than your max_inputs setting allows (e.g. your search returns 200'000 events while max_inputs=100'000). Consequently, the events are downsampled by DSDL/MLTK, and the my_test_data.csv with 1,153 lines that you see in the Jupyter notebook environment is exactly that sample.
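Conceptually, the downsampling behaves like the sketch below. This is not the actual MLTK implementation (which uses its own seeded sampling internally); it only illustrates why the staged CSV can hold fewer rows than the search returned:

```python
import random

def downsample(events, max_inputs=100_000, seed=42):
    """Illustrative sketch: if the search returns more events than
    max_inputs allows, keep only a random sample of max_inputs rows.
    NOT the real MLTK code, just the shape of the behavior."""
    events = list(events)
    if len(events) <= max_inputs:
        return events
    rng = random.Random(seed)
    return rng.sample(events, max_inputs)

# A 200'000-event search capped at max_inputs=100'000:
print(len(downsample(range(200_000), max_inputs=100_000)))  # -> 100000
```

If that is what is happening to you, raising max_inputs in mlspl.conf (or narrowing your search) is the way to get the full result set into the staged CSV.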
Regards,
Gabriel