Splunk Dev

REST API call is timing out due to high volume of data

deev
Observer

Hi, 

We are using a curl script to call the Splunk REST API to send data out of Splunk (to Kafka/ES). We have more than 1 lakh (100,000) events every second, so the REST API call, which we make every 5 seconds, is timing out.

Sample curl command for calling the REST API:

curl -k -u admin:changeme \
     https://localhost:8089/services/search/jobs/ -d search="search index=sample sourcetype=access_* earliest=-5m"

What is the limit on the number of events we can extract at a time through a REST API call?

What is the default timeout setting? Is it possible to change it?

Is there a better way to send Splunk data outside?

We also tried a Python script using splunklib.client. That failed as well.

 

Thanks in advance for your input.

Regards

Deev

 


PickleRick
SplunkTrust

Regardless of what @bowesmana already pointed out, the REST endpoint you're using is not supposed to return search results; it just creates a search job and should return the search job ID.

The question is what exactly is timing out. Aren't you simply trying to access a filtered port and getting a timeout at the network level?
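
To illustrate the point about the endpoint only creating a job, here is a rough Python sketch of the full lifecycle: create the job, poll until it finishes, then page through the results. It assumes the standard search/jobs endpoints on the management port from the question, JSON output, and basic authentication. The helper names (call_splunk, create_job, is_done, wait_until_done, results_path, fetch_results), the page size, and the wait limits are made up for illustration, so treat it as a starting point rather than a tested integration.

```python
import base64
import json
import ssl
import time
import urllib.parse
import urllib.request

BASE_URL = "https://localhost:8089"  # management port from the curl command in the question


def call_splunk(path, auth, form=None):
    """GET `path`, or POST when `form` is given; return the parsed JSON reply."""
    body = urllib.parse.urlencode(form).encode() if form is not None else None
    req = urllib.request.Request(BASE_URL + path, data=body)
    token = base64.b64encode(f"{auth[0]}:{auth[1]}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    # Same effect as curl -k: no certificate verification (test setups only).
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, context=ctx, timeout=30) as resp:
        return json.load(resp)


def create_job(search, auth):
    """POST the search; the reply carries only the job ID (sid), not events."""
    reply = call_splunk("/services/search/jobs?output_mode=json", auth, {"search": search})
    return reply["sid"]


def is_done(status_reply):
    """Read the isDone flag from a job status reply."""
    return bool(status_reply["entry"][0]["content"]["isDone"])


def wait_until_done(sid, auth, poll_seconds=1.0, max_wait=300.0):
    """Poll the job status until isDone is set, or give up after max_wait seconds."""
    deadline = time.monotonic() + max_wait
    path = f"/services/search/jobs/{urllib.parse.quote(sid, safe='')}?output_mode=json"
    while not is_done(call_splunk(path, auth)):
        if time.monotonic() > deadline:
            raise TimeoutError(f"search job {sid} did not finish in {max_wait}s")
        time.sleep(poll_seconds)


def results_path(sid, count, offset):
    """Build the path for one page of results of a finished job."""
    query = urllib.parse.urlencode({"output_mode": "json", "count": count, "offset": offset})
    return f"/services/search/jobs/{urllib.parse.quote(sid, safe='')}/results?{query}"


def fetch_results(sid, auth, page_size=10000):
    """Yield result rows page by page until an empty page comes back."""
    offset = 0
    while True:
        rows = call_splunk(results_path(sid, page_size, offset), auth).get("results", [])
        if not rows:
            return
        yield from rows
        offset += len(rows)


# Example use against a reachable splunkd (credentials as in the question):
#   auth = ("admin", "changeme")
#   sid = create_job("search index=sample sourcetype=access_* earliest=-1m@m latest=@m", auth)
#   wait_until_done(sid, auth)
#   for row in fetch_results(sid, auth):
#       print(row)
```

If the very first call (create_job) is the one that times out, that points at the network or port rather than at the data volume.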

 


bowesmana
SplunkTrust
SplunkTrust

Why are you calling this every 5 seconds if you are searching 5 minutes of data? You will get duplicate data.

You should search using a specific time range, e.g. earliest=-1m@m latest=@m, and call that every minute. Then you will not get duplicates and your data will be in smaller chunks.
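
A small Python sketch of the same idea, in case the calling script builds the time range itself: it snaps "now" down to the minute and searches the previous full minute, so consecutive runs cover back-to-back windows. The function names are hypothetical, the index and sourcetype are taken from the question, and it assumes the search accepts epoch seconds for earliest/latest with latest treated as exclusive.

```python
from datetime import datetime, timedelta, timezone


def previous_minute_window(now):
    """Return (earliest, latest) for the last complete minute before `now`."""
    latest = now.replace(second=0, microsecond=0)  # same effect as latest=@m
    earliest = latest - timedelta(minutes=1)       # same effect as earliest=-1m@m
    return earliest, latest


def build_search(now, index="sample", sourcetype="access_*"):
    """Build a search string with explicit epoch bounds for the window."""
    earliest, latest = previous_minute_window(now)
    return (
        f"search index={index} sourcetype={sourcetype} "
        f"earliest={int(earliest.timestamp())} latest={int(latest.timestamp())}"
    )


# Two runs a minute apart produce adjacent windows that do not overlap.
first = previous_minute_window(datetime(2024, 5, 14, 11, 0, 42, tzinfo=timezone.utc))
second = previous_minute_window(datetime(2024, 5, 14, 11, 1, 3, tzinfo=timezone.utc))
print(first[1] == second[0])
```

Using explicit bounds instead of relative ones also makes it easy to re-run a window that failed without shifting the range.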

splunkdConnectionTimeout defaults to 30 seconds in web.conf:

https://docs.splunk.com/Documentation/Splunk/8.2.3/Admin/Webconf


deev
Observer

Thank you for your feedback. Point noted. I tried with 1 minute of data, but the script took 15 minutes to execute. Is there a better way to send a high volume of data out of Splunk?


bowesmana
SplunkTrust
SplunkTrust

If you want your data both in Splunk and elsewhere, you may want to look at something that can fork the data, so it goes to both places instead of going first into Splunk, then getting it out from there.

Have a look at Cribl, which would help you send data to both places.

https://cribl.io/

 
