Getting Data In

Why is my HTTP event collector Indexing slowly?

New Member

Fellow Splunksters,

I have been able to send data to Splunk via TCP sockets for a while and never had any issues. I switched some of our apps over to using the HTTP event collector via the Python Splunk API, so for example...

import splunklib.client as splunk_client

service = splunk_client.connect(host='127.0.0.1', port=8089, username=<username>, password=<password>)
index = service.indexes['my_index']

index.submit(message, sourcetype='_json', host='local')

I think it is important to note that data coming into my script running the API is MQTT data, as the data is coming (about 1 event every 2 seconds) Splunk is able to index the data just fine. However, if the data stream is interrupted the events are stored until the connection is re-established and all the events flood to the Splunk server. This is when it takes about 10-15 minutes to index anywhere from 150 to 300 events. I certainly do expect some delay just not 15 minutes.

I'm wondering if anybody else has had this issue with the HTTP Event Collector? Is there a more efficient way of indexing data so this doesn't happen? Is a TCP socket faster than the HEC?

I am currently waiting for our IT department to allocate more resource to our Splunk server (such as RAM and CPU cores), so maybe that will help increase performance?

Thanks!

0 Karma

Champion

that definitely seems like too long, so not sure if this will help at all but it can't hurt....this is a conf session from 2017 that had some good tuning tips - may be a bit outdated now?

https://conf.splunk.com/files/2017/slides/measuring-hec-performance-for-fun-and-profit.pdf

0 Karma
State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!