REST Modular input drops records when printing to STDOUT

wmoustafa
Engager

Hi,
I have been using the REST modular input to ingest records from a couple of REST endpoints into Splunk in real time. After installing a single instance of rest_ta/ under the etc/apps directory and defining multiple input stanzas, a high percentage of the records are not indexed into Splunk. While troubleshooting, I can see that the modular input queries the REST endpoints and retrieves all records, but when the records are written to STDOUT with print(), a large share of them never show up in Splunk. If I print() to a text file instead of STDOUT, all the records are persisted. Please note that I have implemented a custom authentication class and a custom response handler.

Could you please let me know whether having a single rest_ta app serve multiple input stanzas is appropriate? Also, how can I troubleshoot and debug this STDOUT record loss, given that nothing shows up in splunkd.log after the modular input prints the records to STDOUT? How can I follow a record after print()?

Thanks in advance.

1 Solution

wmoustafa
Engager

Thanks Damien.
It turned out that Splunk was placing the missing events far from the insertion time on the timeline. This is because Splunk prefers to derive a timestamp from the record data rather than use the current system time at indexing. More details in the docs: http://docs.splunk.com/Documentation/Splunk/7.0.1/Data/HowSplunkextractstimestamps

Configuring TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD in props.conf resolved the issue.
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configurepositionaltimestampextraction
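
For anyone hitting the same thing, here is a rough sketch of what such a props.conf stanza might look like. The sourcetype name and the "event_time" field are made up for illustration, and the lookahead value is only a guess; adjust all three to your own data. As far as I understand it, the stanza has to live on the instance that parses the data.

```ini
# props.conf (sketch; stanza name and field name are hypothetical)
[my_rest_sourcetype]
# Regex that must match immediately before the timestamp in the raw event
TIME_PREFIX = "event_time"\s*:\s*"
# How many characters past the TIME_PREFIX match Splunk scans for a timestamp
MAX_TIMESTAMP_LOOKAHEAD = 30
```

With both settings in place, Splunk should only consider the timestamp right after the prefix and ignore other date-like strings elsewhere in the record.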

Damien_Dallimor
Ultra Champion

It is pretty standard to have multiple stanzas running per instance of the modular input.

Have you ruled out any issues in your custom response handler? Do you have logging in this handler to catch and debug any errors?
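
As a minimal sketch of the kind of logging meant here (not the rest_ta code itself): the helper below writes records to STDOUT, logs failures per record, and reports how many were actually emitted. The function name `emit_records` and the logger name are hypothetical; you would call something like this from your own handler. It assumes diagnostics go to stderr, which Splunk normally captures into splunkd.log for modular inputs.

```python
import logging
import sys

# Log to stderr so diagnostics stay out of the event stream on stdout.
logging.basicConfig(
    stream=sys.stderr,
    level=logging.DEBUG,
    format="%(levelname)s %(message)s",
)
log = logging.getLogger("rest_handler_debug")  # hypothetical logger name


def emit_records(records, out=None):
    """Write each record to `out` (stdout by default); return the count written.

    A failure on one record is logged and skipped instead of aborting the batch.
    """
    out = sys.stdout if out is None else out
    written = 0
    for i, record in enumerate(records):
        try:
            out.write(str(record).rstrip("\n") + "\n")
            written += 1
        except Exception:
            log.exception("failed to emit record %d", i)
    out.flush()  # push buffered output to the reader before returning
    log.debug("emitted %d of %d records", written, len(records))
    return written
```

If the logged count matches what the endpoint returned, the handler is probably not where the records go missing.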

Can you post your custom handler and some sample data?

Are you using your custom handler to break up large events?

Is it possible that you are indexing events longer than the default limit of 10000 characters and haven't configured your system for larger event sizes?
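
One quick way to check the size question is to measure the records before they are printed. This is only a sketch: the `oversized` helper is hypothetical, and it assumes the limit in question is the props.conf TRUNCATE setting, whose default is 10000 bytes.

```python
DEFAULT_TRUNCATE = 10000  # props.conf TRUNCATE default, in bytes


def oversized(records, limit=DEFAULT_TRUNCATE):
    """Return (index, byte_length) for each record whose UTF-8 size exceeds limit."""
    hits = []
    for i, record in enumerate(records):
        size = len(str(record).encode("utf-8"))
        if size > limit:
            hits.append((i, size))
    return hits
```

If this returns anything, either break the events up in the handler or raise TRUNCATE for that sourcetype in props.conf.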
