Getting Data In

Universal forwarder with 2 hours delay on new log files

mihelic
Path Finder

Log messages received by our central loghost take up to 2 hours to become visible on the indexer.

Our network hardware and servers send their messages via syslog to a central loghost running syslog-ng, which filters the messages into their respective files. That comes to about 1600 log files with current data. The log files rotate daily at midnight (their names are of the form service-20130410.log). A universal forwarder then monitors the folders/files for changes and forwards the data to a Splunk indexer. Altogether we index about 12GB of data per day.

A typical monitor stanza looks like this:

[monitor:///logs/splunk/servicetype]
host_segment = 4
index = main
ignoreOlderThan = 3d
sourcetype = servicetype

On the indexer, a scheduled search runs every hour and requires data from the past hour. The results of the 1AM and 2AM runs come up empty every day, and there have also been instances of the 3AM run coming up empty. After that point all subsequent searches return data as they should, and there is no more delay. I have checked the indexer, and it should not be the bottleneck.

Disk I/O, CPU (24 cores), and RAM (32GB) should not be a problem on the loghost server, although the UF is constantly maxing out one core.

There is a delay between the time files are created and the time the universal forwarder notices and forwards them.
How can I tune this to speed things up?

Are 1600 monitored files considered a high or a low number for the universal forwarder?

Kind regards, Mitja

esix_splunk
Splunk Employee

How about the timezone settings on the forwarder vs. the indexer vs. the search head?
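A two-hour offset between the timestamps parsed from the logs and the clock on the indexer/search head would produce exactly this kind of gap in hourly searches. One way to rule that out is to pin the timezone per sourcetype in props.conf on the indexer (a minimal sketch; the sourcetype name is taken from the question, and the timezone value is only an example):

[servicetype]
TZ = Europe/Ljubljana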

rharrisssi
Path Finder

Did you ever figure this out?

kristian_kolb
Ultra Champion

Perhaps you are pushing the envelope a little during peak hours. By default, the UF is limited to 256 KB/s of forwarding throughput (configurable). 12GB/day averages out to about 139 KB/s, so peak hours can easily exceed the default limit.

http://splunk-base.splunk.com/answers/53138/maximum-traffic-of-a-universal-forwarder
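
If the forwarder is in fact hitting that cap, you can raise (or remove) it in limits.conf on the UF, e.g. in $SPLUNK_HOME/etc/system/local/limits.conf, and then restart the forwarder. A minimal sketch, assuming you simply want to double the default (a value of 0 removes the limit entirely):

[thruput]
maxKBps = 512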

Also, you could have a case of blocked queues on the indexer side;

http://splunk-base.splunk.com/answers/31151/index-performance-issue-high-latency
http://wiki.splunk.com/Community:TroubleshootingBlockedQueues
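
A quick way to confirm blocked queues is to search the indexer's own metrics.log, along these lines (a sketch; group and blocked are fields as they appear in metrics.log):

index=_internal source=*metrics.log* group=queue blocked=true | stats count by host, name | sort -count

Any queue name that shows up repeatedly there points to the stage that is backing up.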

To find out how the UF is performing when reading the files, you could also check the REST API on the UF itself:

https://your-splunk-forwarder:8089/services/admin/inputstatus/TailingProcessor:FileStatus
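
For example, with curl from the forwarder host (hostname and credentials are placeholders):

curl -k -u admin:changeme https://your-splunk-forwarder:8089/services/admin/inputstatus/TailingProcessor:FileStatus

The output should list every file the tailing processor knows about and how far it has read into each one, which quickly shows whether the UF is simply lagging behind on its 1600 files.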

Also, you should install the S.o.S (Splunk on Splunk) app, which is great for diagnosing problems like this.

/k

Runals
Motivator

To expand on what Kristian posted, try running this search:

index=_internal sourcetype=splunkd "current data throughput"
| rex "Current data throughput \((?<kb>\S+)"
| eval rate=case(kb < 500, "256", kb >= 500 AND kb < 520, "512", kb >= 520 AND kb < 770, "768", kb >= 770 AND kb < 1210, "1024", 1=1, ">1024")
| stats count as Count sparkline as Trend by host, rate
| where Count > 4
| rename rate as "Throughput rate(kb)"
| sort -"Throughput rate(kb)", -Count

It is one I baked into the Forwarder Health app.
