Clarification on Splunk App DSDL max_inputs limit

MCW
Explorer

Hi experts,

I am early in my experiments with the Splunk App for DSDL (aka DLTK), trying to pull some events into a Jupyter notebook by way of Option 2, i.e.:

<SPL search> | fit MLTKContainer mode=stage algo=my_test * into app:my_test_data

where my_test is just cloned from barebone_template, and I want the input data file to be created with the name "my_test_data".

I ran into the following error, since the SPL returns 500+ events:

  • Input event count exceeds max_inputs for MLTKContainer (100000), model will be fit on a sample of events. To configure limits, use mlspl.conf or the "Settings" tab in the app navigation bar.

I checked mlspl.conf and, fair enough, max_inputs is set to the default of 100,000. However, the resulting my_test_data.csv contains only 1153 lines, i.e. only 1152 events of interest once the header row is excluded.

Why don't I get 100,000 events in the CSV file? It is not a disk space issue either; I have verified that.

More importantly, how can I get the full 100,000 events into my CSV file?

Any advice is greatly appreciated.

Thanks,
MCW

Gabriel
Path Finder

Hi @MCW

 

1. How many events are returned by your <SPL search>?

2. Can you share the output of your <SPL search> that you used (e.g. as CSV)? I'd like to replicate your situation on my server.

3. Do you have access to the server that Splunk is running on? If yes, can you provide the output of the following two commands?

./splunk show config mlspl | grep max_inputs

./splunk btool mlspl list --debug | grep max_inputs

 

Without knowing more details, my guess is that your <SPL search> returned more events than your max_inputs setting allows (e.g. your search returns 200'000 events and max_inputs=100'000). Consequently, the events are downsampled by DSDL/MLTK. The resulting my_test_data.csv with 1153 lines that you see within the Jupyter notebook environment is exactly this sample.
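
To compare the staged file against the event count from question 1, something like the following could be run in a notebook cell. It is only a minimal sketch: the helper name count_staged_rows is made up for illustration, and the path data/my_test_data.csv is an assumption about where the staged file lands in your container, so adjust it as needed. It counts parsed CSV records rather than raw lines, which matters if any event contains embedded newlines.

```python
import csv
from pathlib import Path


def count_staged_rows(csv_path):
    """Return (column_names, data_row_count) for a staged CSV file.

    Hypothetical helper for illustration; not part of DSDL or MLTK.
    """
    with Path(csv_path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])            # first record holds the field names
        data_rows = sum(1 for _ in reader)   # every remaining record is one event
    return header, data_rows


if __name__ == "__main__":
    # Assumed location of the staged file; change it to match your setup.
    staged = Path("data/my_test_data.csv")
    if staged.exists():
        columns, rows = count_staged_rows(staged)
        print(f"{len(columns)} columns, {rows} events staged")
    else:
        print(f"{staged} not found; adjust the path for your environment")
```

If the record count differs from the 1152 lines you counted, multi-line events are probably involved; if it matches, the file really holds that many events.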

Regards,

Gabriel
