Getting Data In

Universal Forwarder only sending 2 logs??

shakataganai
New Member

I just installed Splunk 4.2 (Several and universal forwarders) on 4 * Ubuntu 11.04 machines. The server (x64) is working fine, and the universal forwarder is working fine on 1 machine (x64). The problem is on the last 2 machines (x86), the forwarder sends only /var/log/dmesg and /var/log/udev - every other log file (IE /var/log/syslog, /var/log/auth, /var/log/kern etc) are NOT logged. The 2 problem machines are talking to the server via OpenVPN, are x86, and are in different time zones - these are the only differences I can think of.

All 3 of the forwarders were installed the same way:

 * sudo -s
 * dpkg -i splunkforwarder*.deb
 * cd /opt/splunkforwarder/bin
 * ./splunk start --accept-license
 * ./splunk enable boot-start
 * ./splunk set deploy-poll splunk.internal.company:8089 -auth admin:changeme
 * ./splunk add forward-server splunk.internal.company:9997 -auth admin:changeme
 * ./splunk add monitor /var/log

The services are still running on all the machines and while I've tinkered, restarted, poked, prodded and pleaded... the forwarders on the two machines will not budge. Here's the regular output from $SPLUNKHOME/var/log/metrics.log (which reports like every 30 seconds)

09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=pipeline, name=parsing, processor=readerin, cpu_seconds=0.000000, executes=3, cumulative_hits=114
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=pipeline, name=parsing, processor=send-out-light-forwarder, cpu_seconds=0.000000, executes=3, cumulative_hits=114
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=pipeline, name=parsing, processor=tcp-output-light-forwarder, cpu_seconds=0.000000, executes=3, cumulative_hits=114
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=pipeline, name=parsing, processor=thruput, cpu_seconds=0.000000, executes=3, cumulative_hits=114
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=pipeline, name=parsing, processor=utf8, cpu_seconds=0.000000, executes=3, cumulative_hits=114
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=thruput, name=index_thruput, instantaneous_kbps=0.000000, instantaneous_eps=0.000000, average_kbps=0.000000, total_k_processed=0, kb=0.000000, ev=0, load_average=0.000000
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=thruput, name=thruput, instantaneous_kbps=0.145130, instantaneous_eps=0.000000, average_kbps=5.606074, total_k_processed=2085, kb=4.499023, ev=3, load_average=0.000000
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=map, name=pipelineinputchannel, current_size=12, inactive_channels=1, new_channels=0, removed_channels=0, reclaimed_channels=0, timedout_channels=0, abandoned_channels=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=per_host_thruput, series="cnc-4", kbps=0.145130, eps=0.096774, kb=4.499023, ev=3, avg_age=0.000000, max_age=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=per_index_thruput, series="_internal", kbps=0.145130, eps=0.096774, kb=4.499023, ev=3, avg_age=0.000000, max_age=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=per_source_thruput, series="/opt/splunkforwarder/var/log/splunk/metrics.log", kbps=0.145130, eps=0.096774, kb=4.499023, ev=3, avg_age=0.000000, max_age=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=per_sourcetype_thruput, series="splunkd", kbps=0.145130, eps=0.096774, kb=4.499023, ev=3, avg_age=0.000000, max_age=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=queue, name=tcpout_splunk.internal.company_9997, max_size=512000, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=queue, name=aeq, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=queue, name=aq, max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=queue, name=auditqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=queue, name=fschangemanager_queue, max_size_kb=5120, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.335 -0400 INFO  Metrics - group=queue, name=indexqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.336 -0400 INFO  Metrics - group=queue, name=nullqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.336 -0400 INFO  Metrics - group=queue, name=parsingqueue, max_size_kb=6144, current_size_kb=0, current_size=0, largest_size=1, smallest_size=0
09-23-2011 19:41:55.336 -0400 INFO  Metrics - group=queue, name=tcpin_queue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-23-2011 19:41:55.336 -0400 INFO  Metrics - group=realtime_search_data, system total, drop_count=0
09-23-2011 19:41:55.336 -0400 INFO  Metrics - group=search_concurrency, system total, active_hist_searches=0, active_realtime_searches=0
09-23-2011 19:41:55.336 -0400 INFO  Metrics - group=tcpout_connections, splunk.internal.company_9997:10.2.99.6:9997:0, sourcePort=8089, destIp=10.2.99.6, destPort=9997, _tcp_Bps=12.87, _tcp_KBps=0.01, _tcp_avg_thruput=0.13, _tcp_Kprocessed=45, _tcp_eps=0.03

And the list of monitors:

$ ./splunk list monitor
Monitored Directories:
        $SPLUNK_HOME/var/log/splunk
                /opt/splunkforwarder/var/log/splunk/audit.log
                /opt/splunkforwarder/var/log/splunk/btool.log
                /opt/splunkforwarder/var/log/splunk/first_install.log
                /opt/splunkforwarder/var/log/splunk/license_audit.log
                /opt/splunkforwarder/var/log/splunk/license_usage.log
                /opt/splunkforwarder/var/log/splunk/metrics.log
                /opt/splunkforwarder/var/log/splunk/scheduler.log
                /opt/splunkforwarder/var/log/splunk/searchhistory.log
                /opt/splunkforwarder/var/log/splunk/splunkd.log
                /opt/splunkforwarder/var/log/splunk/splunkd_access.log
                /opt/splunkforwarder/var/log/splunk/splunkd_stderr.log
                /opt/splunkforwarder/var/log/splunk/splunkd_stdout.log
        $SPLUNK_HOME/var/spool/splunk/...stash_new
        /var/log
                /var/log/alternatives.log
                /var/log/apt
                /var/log/apt/history.log
                /var/log/apt/term.log
                /var/log/aptitude
                /var/log/auth.log
                /var/log/boot
                /var/log/btmp
                /var/log/dmesg
                /var/log/dmesg.0
                /var/log/dpkg.log
                /var/log/fsck
                /var/log/fsck/checkfs
                /var/log/fsck/checkroot
                /var/log/kern.log
                /var/log/lastlog
                /var/log/mail.err
                /var/log/mail.log
                /var/log/mysql
                /var/log/mysql.err
                /var/log/mysql.log
                /var/log/mysql.log.1.gz
                /var/log/mysql/error.log
                /var/log/mysql/mysql-bin.000001
                /var/log/mysql/mysql-bin.000002
                /var/log/mysql/mysql-bin.000003
                /var/log/mysql/mysql-bin.index
                /var/log/news
                /var/log/news/news.crit
                /var/log/news/news.err
                /var/log/news/news.notice
                /var/log/syslog
                /var/log/syslog.1
                /var/log/syslog.2.gz
                /var/log/udev
                /var/log/wtmp
Monitored Files:
        $SPLUNK_HOME/etc/splunk.version
        $SPLUNK_HOME/var/log/splunk/splunkd.log
        $SPLUNK_HOME/var/spool/splunk

I tried adding a "./splunk add monitor /var/log/syslog" but that didn't help either. Any suggestions? I'm at a loss here as to why it refuses to log anything other than those two files.
Thanks
-Jon

0 Karma

gkanapathy
Splunk Employee
Splunk Employee

Different time zones, huh? What makes you think the data isn't being sent? Is it possible that it is being sent, but when you check for it, it's being recorded several hours in the future or several hours in the past, so you don't see if if you look, e.g., "past 15 minutes", or "last hour"?

0 Karma

shakataganai
New Member

Well I changed the timezones to all match up. Still no logs. I installed something new today and the logs from that are showing up.

0 Karma

gkanapathy
Splunk Employee
Splunk Employee

Since you are only looking in the "last 60 minutes" you may in fact be logging your data, and it is simply getting an incorrect timestamp because it being logged with the wrong time zone. that was what I was getting at. you should fix the time zone settings for that data. udev and dmesg don't have this problem because they don't have timestamps at all, and so just get the file mod time. if you change the time range, you will probably find your "missing" data.

0 Karma

shakataganai
New Member

I know that no data is being sent because there are no events for any time period, except when I restart the machine (And generate new data for udev/dmesg).

IE: http://imagebin.org/174328

Last hour, shows zero events. There should at least be a few per hour from Cron and what not into syslog.

0 Karma

shakataganai
New Member

Upon restarting splunk I see 3 of the following lines:

Could not send data to output queue (parsingQueue), retrying..

Could that be related to the issue?

0 Karma

shakataganai
New Member

I turned on debugging per http://www.splunk.com/wiki/Community:Troubleshooting_Monitor_Inputs - I pulled out a section that seems to indicate that it's reading syslog, but I'm not sure what it all means.

09-26-2011 14:35:00.824 -0400 DEBUG TailingProcessor - File state notification for path='/var/log/syslog'.
09-26-2011 14:35:00.824 -0400 DEBUG TailingProcessor -   Have seen this item before (since splunkd was restarted).
09-26-2011 14:35:00.824 -0400 DEBUG BatchReader - inflight=NULL state=/var/log/syslog rc=false
09-26-2011 14:35:00.824 -0400 DEBUG TailingProcessor -   Will attempt to read file: /var/log/syslog.
09-26-2011 14:35:00.824 -0400 DEBUG WatchedFile -   Loading state from fishbucket.
09-26-2011 14:35:00.824 -0400 DEBUG WatchedFile - Record found, will advance file by offset=135522.
09-26-2011 14:35:00.824 -0400 DEBUG TailingProcessor - About to read data (Opening file: /var/log/syslog).
09-26-2011 14:35:00.824 -0400 DEBUG WatchedFile - seeking /var/log/syslog to off=135522
09-26-2011 14:35:00.824 -0400 DEBUG WatchedFile - Reached EOF: /var/log/syslog (key=0x32121e9e2a603984 sptr=135970 scrc=0x2b425ccade0c807d fnamecrc=0xf5fdf6ea0ea92860 modtime=1317062100)
09-26-2011 14:35:00.825 -0400 DEBUG TailingProcessor -   Will doublecheck EOF (in 3000ms)..
09-26-2011 14:35:03.826 -0400 DEBUG TailingProcessor - **********
09-26-2011 14:35:03.826 -0400 DEBUG TailingProcessor - Deferred notification for path='/var/log/syslog'.
09-26-2011 14:35:03.826 -0400 DEBUG TailingProcessor -   Have seen this item before (since splunkd was restarted).
09-26-2011 14:35:03.826 -0400 DEBUG BatchReader - inflight=NULL state=/var/log/syslog rc=false
09-26-2011 14:35:03.826 -0400 DEBUG TailingProcessor -   Will attempt to read file: /var/log/syslog from existing fd.
09-26-2011 14:35:03.826 -0400 DEBUG WatchedFile -   Loading state from fishbucket.
09-26-2011 14:35:03.827 -0400 DEBUG WatchedFile - Record found, will advance file by offset=135970.
09-26-2011 14:35:03.827 -0400 DEBUG TailingProcessor - About to read data (Reusing existing fd for file='/var/log/syslog').
09-26-2011 14:35:03.827 -0400 DEBUG WatchedFile - seeking /var/log/syslog to off=135970
09-26-2011 14:35:03.827 -0400 DEBUG WatchedFile - Reached EOF: /var/log/syslog (key=0x32121e9e2a603984 sptr=135970 scrc=0x2b425ccade0c807d fnamecrc=0xf5fdf6ea0ea92860 modtime=1317062100)
09-26-2011 14:35:03.827 -0400 DEBUG TailingProcessor - Hit EOF immediately.
09-26-2011 14:35:03.827 -0400 DEBUG TailingProcessor - Have definitely hit EOF.
09-26-2011 14:35:17.842 -0400 DEBUG TailingProcessor - **********
0 Karma

mikelanghorst
Motivator

I find the best way to diagnose file inputs is to use the REST endpoint:
https://forwarder:port/services/admin/inputstatus/TailingProcessor%3AFileStatus

It will show the files seen by the configured monitors and their current status. If the file is already read, not matching a whitelist, matching a blacklist, etc.

0 Karma

shakataganai
New Member

I pulled out the part for syslog:

136430/s:key136430/s:key/var/log/*/s:key100.00/s:keyfinished reading/s:key/s:dict

That seems to jive with my debugging output below that it is try to read those files... but I guess it isn't getting sent back to HQ?

0 Karma