
Clarification on Splunk App DSDL max_inputs limit

MCW
Explorer

Hi experts,

I am at an early, experimental stage with the Splunk App for DSDL (aka DLTK) and am trying to pull some events into a Jupyter notebook by way of Option 2, i.e.:

<SPL search> | fit MLTKContainer mode=stage algo=my_test * into app:my_test_data

where my_test is simply cloned from barebone_template, and I want the input data file to be created with the name "my_test_data".

I ran into the following error since the SPL returns 500+ events:

  • Input event count exceeds max_inputs for MLTKContainer (100000), model will be fit on a sample of events. To configure limits, use mlspl.conf or the "Settings" tab in the app navigation bar.

I checked mlspl.conf and, fair enough, max_inputs is set to the default of 100,000. However, the resulting my_test_data.csv contains only 1153 lines, i.e. only 1152 events of interest once the header row is excluded.

Why don't I get 100,000 events in the CSV file? It is not a disk space issue; I have verified that.

More importantly, how can I get the full 100,000 events into my csv file?

Any advice is greatly appreciated.

Thanks,
MCW

Gabriel
Path Finder

Hi @MCW

 

1. How many events are returned by your <SPL search>?

2. Can you share the output of your <SPL search> that you used (e.g. as CSV)? I'd like to replicate your situation on my server.

3. Do you have access to the server that Splunk is running on? If yes, can you provide the output of the following two commands?

./splunk show config mlspl | grep max_inputs

./splunk btool mlspl list --debug | grep max_inputs

 

Without more details, my guess is that your <SPL search> returned more events than your max_inputs setting allows (e.g. your search returns 200,000 events while max_inputs=100,000). Consequently, DSDL/MLTK downsamples the events. The my_test_data.csv with 1153 lines that you see in the Jupyter notebook environment is exactly this sample.
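To double-check how many events actually ended up in the staged file, a quick count from inside the notebook may help. This is only a minimal sketch: the relative path data/my_test_data.csv is an assumption about where the staged file sits in your Jupyter environment, and count_data_rows is a hypothetical helper name, not part of DSDL. It uses Python's csv module rather than a plain line count, because an event containing embedded newlines spans several lines in the file.

```python
import csv
from pathlib import Path


def count_data_rows(csv_path):
    """Return the number of CSV records in csv_path, excluding the header row."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        # Skip the header; an empty file has no header and therefore no events.
        if next(reader, None) is None:
            return 0
        # csv.reader treats a quoted multi-line field as part of one record.
        return sum(1 for _ in reader)


# Assumed location of the staged file; adjust to your environment.
staged_file = Path("data") / "my_test_data.csv"
if staged_file.exists():
    print(f"{staged_file}: {count_data_rows(staged_file)} events")
```

If the number printed here differs from the line count of the file, the difference comes from multi-line events rather than from missing data.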

Regards,

Gabriel
