
Why is my HTTP Event Collector indexing slowly?

New Member

Fellow Splunksters,

I have been sending data to Splunk over TCP sockets for a while and never had any issues. I recently switched some of our apps over to the HTTP Event Collector via the Python Splunk API, for example:

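import splunklib.client as splunk_client

# Connect to the Splunk management port; replace the placeholders with real credentials
service = splunk_client.connect(host='127.0.0.1', port=8089, username='<username>', password='<password>')
index = service.indexes['my_index']

# message holds the payload of a single event received from MQTT
index.submit(message, sourcetype='_json', host='local')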

I think it is important to note that the data coming into the script that runs the API is MQTT data. While the data is flowing normally (about 1 event every 2 seconds), Splunk indexes it just fine. However, if the data stream is interrupted, the events are stored until the connection is re-established, and then they all flood to the Splunk server at once. When that happens, it takes about 10-15 minutes to index anywhere from 150 to 300 events. I certainly expect some delay, just not 15 minutes.

I'm wondering if anybody else has had this issue with the HTTP Event Collector? Is there a more efficient way of indexing data so this doesn't happen? Is a TCP socket faster than the HEC?

I am currently waiting for our IT department to allocate more resources (such as RAM and CPU cores) to our Splunk server, so maybe that will help performance?

Thanks!


Champion

That definitely seems too long. I'm not sure if this will help at all, but it can't hurt: this is a .conf session from 2017 that had some good tuning tips, though it may be a bit outdated now.

https://conf.splunk.com/files/2017/slides/measuring-hec-performance-for-fun-and-profit.pdf
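
One more thing that might be worth trying for the backlog burst: HEC accepts several events in a single request, so a queued backlog can go out as one POST instead of one call per event. Also note that, as far as I know, index.submit in the snippet above goes through the management port (8089) rather than HEC itself, which by default listens on port 8088 and authenticates with a token. Below is a minimal sketch, not a tested fix. The function names build_hec_batch and send_hec_batch, the URL, and the token are placeholders made up for illustration, and it assumes HEC is enabled and presents a certificate that Python trusts.

```python
import json
import urllib.request


def build_hec_batch(events, index="my_index", sourcetype="_json", host="local"):
    """Wrap each event in a HEC envelope and join them into one request body.

    HEC reads a batch as a sequence of JSON objects, one after another;
    a newline between them just keeps the body readable.
    """
    return "\n".join(
        json.dumps(
            {"event": event, "index": index, "sourcetype": sourcetype, "host": host}
        )
        for event in events
    )


def send_hec_batch(body, url, token):
    """POST one batched body to the HEC event endpoint and return the HTTP status.

    url and token are assumptions: something like
    https://<splunk-host>:8088/services/collector/event and a HEC token
    created in Splunk's data input settings.
    """
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Authorization": "Splunk " + token},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

With the stored MQTT events collected in a list called backlog (a hypothetical name), the flush after a reconnect would be something like send_hec_batch(build_hec_batch(backlog), "https://127.0.0.1:8088/services/collector/event", "<hec-token>"). Whether this removes the 10-15 minute delay depends on where the time is actually going, so it is worth measuring before and after.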
