I have an event generator that simulates five servers running uberAgent. Data is sent to Splunk via the REST API. When I start the event generator, everything is fine. But while it keeps running, the index lag keeps increasing. In other words: it takes longer and longer for the events to show up in a search.
I am seeing the REST API calls as they are made in splunkd_access.log. Example:
192.168.8.1 - uainput [15/Dec/2013:18:05:38.139 +0100] "POST /services/receivers/simple?source=uberAgent&sourcetype=uberAgent%3aApplication%3aApplicationUsage&host=RDS-1&index=uberagent HTTP/1.1" 200 215 - - - 0ms
In metrics.log I can see that the max_age is increasing. It starts out slow and keeps getting bigger. Example:
12-15-2013 18:05:22.428 +0100 INFO Metrics - group=per_sourcetype_thruput, series="uberagent:application:applicationusage", kbps=0.402483, eps=9.450443, kb=12.478516, ev=293, avg_age=921.771331, max_age=938
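To watch the lag grow over time rather than eyeballing single lines, the age fields can be pulled out of metrics.log with a small script. This is only a sketch: the function names are made up for illustration, and it assumes the key=value layout of the example line above.

```python
import re

# Matches key=value pairs; values are either "quoted" or run up to a comma/space.
FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|[^,\s]+)')


def parse_metrics_line(line):
    """Return the key=value fields of one metrics.log line as a dict of strings."""
    fields = {}
    for key, raw, quoted in FIELD_RE.findall(line):
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def max_age_for(line, series):
    """Return max_age as an int if the line belongs to the given series, else None."""
    fields = parse_metrics_line(line)
    if fields.get("group") != "per_sourcetype_thruput":
        return None
    if fields.get("series") != series:
        return None
    return int(fields["max_age"])


sample = (
    '12-15-2013 18:05:22.428 +0100 INFO Metrics - group=per_sourcetype_thruput, '
    'series="uberagent:application:applicationusage", kbps=0.402483, eps=9.450443, '
    'kb=12.478516, ev=293, avg_age=921.771331, max_age=938'
)
age = max_age_for(sample, "uberagent:application:applicationusage")
```

Running max_age_for over every line of the file and printing the results in order should show whether the lag climbs steadily or in steps.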
I have no errors in splunkd.log. What is happening here? Is there some kind of quota that limits the number of events to be processed?
Update: This issue is not fixed in Splunk 6.0.1.
The receivers/simple endpoint does not scale very well. It opens and closes a socket for every event you send.
Use the receivers/stream endpoint or, as GK mentions, send your data directly to a TCP input in Splunk.
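A rough sketch of what batching through receivers/stream could look like from Python, so that many events share one connection instead of one socket per event. Several things here are assumptions and not taken from the thread: the management port 8089, basic authentication, the placeholder password, newline-separated events in the request body, and the helper names.

```python
import base64
import http.client
import ssl
from urllib.parse import urlencode


def build_stream_path(source, sourcetype, host, index):
    """Build the request path with the same metadata parameters used for receivers/simple."""
    query = urlencode(
        {"source": source, "sourcetype": sourcetype, "host": host, "index": index}
    )
    return "/services/receivers/stream?" + query


def build_body(events):
    """Join events into one newline-terminated payload (assumes one event per line)."""
    return "".join(e.rstrip("\n") + "\n" for e in events).encode("utf-8")


def send_batch(events, path, splunk_host, port=8089, user="uainput", password="changeme"):
    """POST a whole batch of events in a single request and return the HTTP status."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # Default context verifies certificates; a self-signed Splunk cert needs extra setup.
    context = ssl.create_default_context()
    conn = http.client.HTTPSConnection(splunk_host, port, context=context)
    try:
        conn.request("POST", path, body=build_body(events),
                     headers={"Authorization": "Basic " + token})
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


path = build_stream_path(
    "uberAgent", "uberAgent:Application:ApplicationUsage", "RDS-1", "uberagent"
)
# Example call (hypothetical hostname):
# send_batch(["event one", "event two"], path, "splunk.example.com")
```

How Splunk splits the body into events depends on the line-breaking settings for the sourcetype, so check that against your own configuration.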
Possibly. The REST API is rarely used for data input, so it has not been hardened by years of field use across thousands of installations the way the network, file, and program inputs have. You might be better off sending data to a dedicated TCP port instead.
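For the TCP route, the sender side can be as simple as the following sketch, which keeps one connection open and writes newline-terminated events to it. The host name and port in the usage comment are hypothetical, and it assumes a TCP input has already been configured on the Splunk side.

```python
import socket


def send_events(sock, events):
    """Write events to an already-connected socket, one per line; return bytes sent."""
    payload = "".join(e.rstrip("\n") + "\n" for e in events).encode("utf-8")
    sock.sendall(payload)
    return len(payload)


# Example usage (hypothetical host and port of a configured TCP input):
# with socket.create_connection(("splunk.example.com", 5140)) as sock:
#     send_events(sock, ["first event", "second event"])
#     send_events(sock, ["third event"])  # same connection is reused
```

Unlike the REST call, a raw TCP stream carries no metadata in the request, so source type and index would need to be assigned on the input itself.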