Getting Data In

What’s the recommended batch size per request for sending to HEC?

spammenot66
Contributor

Is there a limit to how many events can be sent to Splunk HEC per request? What’s recommended, and are there any guidelines?


This Splunk .conf presentation has it at 5-50, but I’ve seen some folks send 1k-6k events per request. Is there a point where the number of events per request starts to affect performance, and would it affect just the input with the large requests or the overall HEC server?

https://conf.splunk.com/files/2017/slides/measuring-hec-performance-for-fun-and-profit.pdf

“Recommendation: Batch size between 5 and 50”

1 Solution

bowesmana
SplunkTrust

I have done performance testing in the last 12 months using a HEC buffer size of 256K and then various sizes of up to 5MB, with an event size of approx 170 bytes. At 256K that equates to approx 740 messages per buffer (allowing for the additional HEC metadata sent with the request); with the 5MB buffer size it is up to 15K events per batch.
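As a rough check of those figures, here is a back-of-envelope sketch. It assumes binary units (256K = 256 × 1024 bytes) and infers the per-event size on the wire from the "256K is about 740 events" figure; neither assumption is stated in the post itself.

```python
# Back-of-envelope check of the batch sizes quoted above.
# Assumptions (not from the post): binary units, and a per-event wire size
# inferred from "256K buffer holds about 740 events".
BUFFER_SMALL = 256 * 1024        # 256K buffer, in bytes
BUFFER_LARGE = 5 * 1024 * 1024   # 5MB buffer, in bytes
RAW_EVENT_BYTES = 170            # approximate raw event size from the post
EVENTS_PER_SMALL_BUFFER = 740    # figure quoted in the post

# Average bytes each event occupies in the request, metadata included.
wire_bytes_per_event = BUFFER_SMALL / EVENTS_PER_SMALL_BUFFER
# Implied HEC metadata overhead per event.
metadata_overhead = wire_bytes_per_event - RAW_EVENT_BYTES
# Scale the 256K figure up to the 5MB buffer (integer arithmetic).
events_per_large_buffer = BUFFER_LARGE * EVENTS_PER_SMALL_BUFFER // BUFFER_SMALL

print(round(wire_bytes_per_event))   # 354
print(round(metadata_overhead))      # 184
print(events_per_large_buffer)       # 14800
```

So roughly half of each request is metadata at this event size, and the 5MB buffer works out to just under the 15K events per batch mentioned above.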

A lot will depend on how many events you are generating, but just in terms of getting data into Splunk, the number of events per batch doesn't seem to affect performance. If you have very large data volumes, then there are other performance settings you will need to manage, as referred to in that CONF link. I have tested with 3 parallel pipelines and 8 dedicated IO threads, across a 6-indexer cluster with rates of 900 events/sec (~256MB/sec).

That conf PDF says 5-50 for batch size, but later says 100 events/request, so I'm not sure what it's actually recommending.

Note that timing also plays a part. If you generate 1 event per second and only send the HEC payload after 100 events, you will get an index lag of up to 100 seconds in your event stream, so factor that in.
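One common way to bound that lag is to flush on whichever comes first, a full batch or a maximum age. The sketch below is illustrative only: `HecBatcher`, `max_events`, `max_age_s` and the `send` callable are hypothetical names, and `send` stands in for whatever actually POSTs the body to your HEC endpoint.

```python
import json
import time


class HecBatcher:
    """Buffers events and flushes on batch size or age, whichever comes first."""

    def __init__(self, send, max_events=100, max_age_s=5.0, clock=time.monotonic):
        self._send = send              # callable that receives the request body (str)
        self._max_events = max_events
        self._max_age_s = max_age_s
        self._clock = clock            # injectable so the timing can be tested
        self._events = []
        self._first_at = None          # arrival time of the oldest buffered event

    def add(self, event):
        if not self._events:
            self._first_at = self._clock()
        self._events.append({"event": event})
        self.flush_if_due()

    def flush_if_due(self):
        """Flush if the batch is full or its oldest event has waited too long."""
        if not self._events:
            return
        full = len(self._events) >= self._max_events
        stale = self._clock() - self._first_at >= self._max_age_s
        if full or stale:
            self.flush()

    def flush(self):
        if not self._events:
            return
        # HEC accepts several JSON event objects concatenated in one request body.
        body = "\n".join(json.dumps(e) for e in self._events)
        self._events = []
        self._send(body)
```

With 1 event per second, `max_events=100` and `max_age_s=5.0`, a batch goes out roughly every 5 seconds instead of every 100. Note that the age check here only runs when `add` or `flush_if_due` is called, so a source that can go completely quiet would also need a timer calling `flush_if_due` periodically.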

 


PickleRick
SplunkTrust

Let's add some additional stuff to the mix.

1. The raw number of events is one thing, but their size also matters. It's one thing to send 1000 short syslog-received messages and quite another to send 1000 stack dumps, each several kilobytes long, from a Java app.

2. Technically, at some point you will hit some limit (after all, server memory doesn't grow on trees ;-). But sending tens of ISO images within a single batch request probably isn't what you're aiming at 🙂

3. And finally, even if you're not using acks, will your source be able to resend the event batch from a given point if an error happens in the middle of the batch and only some events were accepted?
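On point 1, one way to make batching insensitive to event size is to cap each request by bytes as well as by event count. This is a minimal sketch under assumed limits: `chunk_events`, `max_bytes` and `max_events` are hypothetical names and values, not Splunk settings.

```python
import json


def chunk_events(events, max_bytes=256 * 1024, max_events=1000):
    """Yield lists of HEC JSON strings, each list within both limits.

    A single event larger than max_bytes still gets a batch of its own.
    """
    batch, batch_bytes = [], 0
    for event in events:
        line = json.dumps({"event": event})
        size = len(line.encode("utf-8")) + 1   # +1 for the newline separator
        over_bytes = batch_bytes + size > max_bytes
        over_count = len(batch) >= max_events
        if batch and (over_bytes or over_count):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(line)
        batch_bytes += size
    if batch:
        yield batch
```

With these limits, 1000 short messages fit in a single request, while 1000 multi-kilobyte stack dumps are split across several. Keeping each yielded batch around until the server has accepted it also gives the sender a clean unit to resend, which is the concern in point 3.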


isoutamo
SplunkTrust
One thing to consider is that if/when your HEC receiver crashes, you will lose those events unless you have enabled indexer acknowledgment (ack) and your HEC sender/client has implemented it too! The first part is an easy step, but the second part isn’t! Also, when you are using an LB in front of multiple HEC nodes, you will get some duplicate events from time to time.
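To give a feel for the client-side part, here is a rough sketch of the bookkeeping a sender needs. As far as I recall from the Splunk docs, with acks enabled each request carries a channel ID (the `X-Splunk-Request-Channel` header), the response includes an `ackId`, and the client later polls `/services/collector/ack` with the outstanding IDs; treat those details as assumptions and check them against the documentation for your version. The function names and the `pending` dict are hypothetical, and no network calls are made here.

```python
import json
import uuid


def new_channel_id():
    """Channel IDs are GUIDs chosen by the client (one per sender here)."""
    return str(uuid.uuid4())


def ack_query_body(pending_ack_ids):
    """Build the body for polling the ack endpoint about outstanding batches."""
    return json.dumps({"acks": sorted(pending_ack_ids)})


def apply_ack_response(pending, response_text):
    """Drop acknowledged batches from `pending` (a dict of ackId -> batch body).

    Assumes a response shaped like {"acks": {"0": true, "1": false}}.
    Whatever remains in `pending` is still unconfirmed and may need resending.
    """
    statuses = json.loads(response_text)["acks"]
    for ack_id, indexed in statuses.items():
        if indexed:
            pending.pop(int(ack_id), None)
    return pending
```

The hard part is everything around this: holding each batch until it is acknowledged, deciding how long to wait before resending, and accepting that a resend can itself produce duplicates.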