Splunk Dev

REST API call times out due to high volume of data

deev
Observer

Hi, 

We are using a curl script to call the Splunk REST API to send data out of Splunk (to Kafka/ES). We have more than 1 lakh (100,000) events every second, so when we call the REST API (every 5 seconds), the call times out.

Sample curl command for calling the REST API:

curl -k -u admin:changeme \
     https://localhost:8089/services/search/jobs/ -d search="search index=sample sourcetype=access_* earliest=-5m"

What is the limit on the number of events we can extract at a time through a REST API call?

What is the default timeout setting? Is it possible to change it?

Is there a better way to send Splunk data outside?

I also tried a Python script using splunklib.client. That failed as well.

Appreciate your inputs in advance.

Regards

Deev

 


PickleRick
SplunkTrust

Regardless of what @bowesmana already pointed out, the REST endpoint you're using is not supposed to return search results, it's just creating a search job and should return the search job ID.
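To illustrate the point about the job ID, here is a minimal sketch of the three REST steps involved: create the job, poll its status, then page through the results. It only builds requests and parses responses, so the HTTP client is left out. The helper names, the page size, and the sample SID are assumptions for illustration; the base URL is taken from the curl command in the question, and the JSON shapes assume `output_mode=json`.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://localhost:8089"  # management port from the curl example


def create_job_request(search):
    """Build (url, body) for POST /services/search/jobs.

    The response to this request carries only a search ID (SID), not events.
    """
    url = f"{BASE_URL}/services/search/jobs"
    body = urlencode({"search": search, "output_mode": "json"})
    return url, body


def parse_sid(response_text):
    """Extract the SID from the JSON body returned when the job is created."""
    return json.loads(response_text)["sid"]


def status_url(sid):
    """URL to poll for the job's state."""
    return f"{BASE_URL}/services/search/jobs/{quote(sid, safe='')}?output_mode=json"


def is_done(status_text):
    """Return True once the job status response reports isDone."""
    content = json.loads(status_text)["entry"][0]["content"]
    return bool(content["isDone"])


def results_url(sid, offset=0, count=1000):
    """URL for one page of results; count=1000 is an arbitrary page size."""
    query = urlencode({"output_mode": "json", "offset": offset, "count": count})
    return f"{BASE_URL}/services/search/jobs/{quote(sid, safe='')}/results?{query}"


if __name__ == "__main__":
    url, body = create_job_request("search index=sample sourcetype=access_*")
    print(url)
    print(body)
    sid = parse_sid('{"sid": "1700000000.42"}')  # hypothetical SID
    print(status_url(sid))
    print(results_url(sid, offset=1000, count=500))
```

Paging with `offset` and `count` keeps each response small, which may matter more than the timeout value itself at this event volume.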

The question is what exactly is timing out. Aren't you simply trying to access a filtered port and getting a timeout at the network level?

 


bowesmana
SplunkTrust

Why are you calling this every 5 seconds if you are searching 5 minutes of data? You will get duplicate data.

You should search using a specific time range, e.g. earliest=-1m@m latest=@m, and call that every minute. Then you will not get duplicates and your data will be in smaller chunks.
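As a rough sketch of why minute-snapped windows avoid duplicates, the snippet below computes the same boundaries that earliest=-1m@m and latest=@m would give and passes them as explicit epoch values. The function names are made up for illustration, and the base search string is copied from the question; `earliest_time` and `latest_time` are the job-creation parameters this sketch assumes you would send along with `search`.

```python
from datetime import datetime, timedelta, timezone


def previous_minute_window(now):
    """Return (earliest, latest) epoch seconds for the last complete minute."""
    latest = now.replace(second=0, microsecond=0)  # equivalent to latest=@m
    earliest = latest - timedelta(minutes=1)       # equivalent to earliest=-1m@m
    return int(earliest.timestamp()), int(latest.timestamp())


def build_job_params(now, base_search="search index=sample sourcetype=access_*"):
    """Build the form fields for one scheduled call (hypothetical helper)."""
    earliest, latest = previous_minute_window(now)
    return {
        "search": base_search,
        "earliest_time": earliest,
        "latest_time": latest,
    }


if __name__ == "__main__":
    first_run = datetime(2024, 1, 1, 12, 5, 30, tzinfo=timezone.utc)
    second_run = first_run + timedelta(minutes=1)
    # Consecutive runs cover adjacent, non-overlapping minutes.
    print(build_job_params(first_run))
    print(build_job_params(second_run))
```

Because each window ends exactly where the next one begins, running the script once a minute covers every event once, regardless of the exact second the script starts.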

splunkdConnectionTimeout defaults to 30 seconds in web.conf:

https://docs.splunk.com/Documentation/Splunk/8.2.3/Admin/Webconf


deev
Observer

Thank you for your feedback. Point noted. I tried with 1 minute of data, but the script took 15 minutes to run. Is there a better way to get a high volume of data out of Splunk?


bowesmana
SplunkTrust

If you want your data both in Splunk and elsewhere, you may want to look at something that can fork the data, so it goes to both places instead of going first into Splunk, then getting it out from there.

Have a look at Cribl, which would help you send data to both places.

https://cribl.io/

 
