Clarification on Splunk App DSDL max_inputs limit

MCW
Explorer

Hi experts,

I am in the early stages of experimenting with the Splunk App for DSDL (aka DLTK) and am trying to pull some events into a Jupyter notebook by way of Option 2, i.e.:

<SPL search> | fit MLTKContainer mode=stage algo=my_test * into app:my_test_data

where my_test is simply a clone of barebone_template, and I want the input data file to be created with the name "my_test_data".

I ran into the following error, since the SPL returns 500+ events:

  • Input event count exceeds max_inputs for MLTKContainer (100000), model will be fit on a sample of events. To configure limits, use mlspl.conf or the "Settings" tab in the app navigation bar.

I checked mlspl.conf and, fair enough, max_inputs is set to the default of 100,000. However, the resulting my_test_data.csv contains only 1153 lines, i.e. only 1152 events of interest once the header row is excluded.

Why don't I get 100,000 events in the CSV file? It is not a disk space issue either; I have verified that.

More importantly, how can I get the full 100,000 events into my csv file?

Any advice is greatly appreciated.

Thanks,
MCW

Gabriel
Path Finder

Hi @MCW

1. How many events are returned by your <SPL search>?

2. Can you share the output of the <SPL search> you used (e.g. as CSV)? I'd like to replicate your situation on my server.

3. Do you have access to the server on which Splunk is running? If so, can you provide the output of the following two commands?

./splunk show config mlspl | grep max_inputs

./splunk btool mlspl list --debug | grep max_inputs

Without knowing more details, my guess is that your <SPL search> returned more events than your max_inputs setting allows (e.g. the search returns 200'000 events while max_inputs=100'000). In that case, the events are downsampled by DSDL/MLTK. The resulting my_test_data.csv with 1153 lines that you see within the Jupyter notebook environment would be exactly this sample.
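To confirm how many events actually reached the notebook, it may help to count CSV records rather than raw lines, since a field containing embedded newlines (e.g. a multi-line _raw) spans several lines. Below is a minimal sketch using only the Python standard library. The function name count_staged_events is hypothetical, and the path data/my_test_data.csv is an assumption about where the staged file sits inside the notebook environment, so adjust it to your setup.

```python
import csv
from pathlib import Path


def count_staged_events(csv_path):
    """Return the number of data records (header excluded) in a CSV file.

    Uses csv.reader so that a quoted field spanning several physical
    lines is still counted as a single event.
    """
    with Path(csv_path).open(newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        return sum(1 for _ in reader)


if __name__ == "__main__":
    # Assumed location of the staged file; change this to match your container.
    staged = Path("data/my_test_data.csv")
    if staged.exists():
        print(f"{staged}: {count_staged_events(staged)} events")
    else:
        print(f"{staged} not found; adjust the path")
```

If the record count differs from the line count minus one, the file contains multi-line fields and the line count understates or overstates nothing about sampling; if both agree, the number reflects what was actually staged.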

Regards,

Gabriel
