Recently, I installed a new Splunk Enterprise 9.2.1 (on-prem) instance on a RHEL 8 server and installed Universal Forwarders on a bunch of Linux (RHEL and Ubuntu) and Windows clients, and logs are being ingested fine.
However, after a few days I installed the Universal Forwarder on a few more Linux machines (following the same process as before). The installation was successful, but logs are not showing up on the indexers.
I have checked and compared inputs.conf, outputs.conf, and server.conf under $SPLUNK_HOME/etc/system/local against the other (working) hosts, and they look fine.
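For reference, the working UFs use a minimal configuration roughly like the following (the indexer address is a placeholder, and the output group name is just what "add forward-server" creates by default):

$SPLUNK_HOME/etc/system/local/outputs.conf:
[tcpout]
defaultGroup = default-autolb-group
[tcpout:default-autolb-group]
server = <indexer_ip>:9997

$SPLUNK_HOME/etc/system/local/inputs.conf:
[monitor:///var/log]
disabled = false
index = my-index-name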
I have run tcpdump on the clients and the indexers, and the client is sending logs to the indexer.
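Roughly, this is the capture I ran on the client (interface name and indexer IP are placeholders), plus the equivalent filter on the indexer side:
tcpdump -nn -i <interface> host <indexer_ip> and port 9997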
I also looked for the new hosts in $SPLUNK_HOME/var/log/splunk, and they do show up in metrics.log, but when I search index="my-index-name" I only see logs from the hosts I installed last week; nothing from the new UFs I installed/configured yesterday.
What's the best way to troubleshoot further?
Just wanted to provide an update on my issue. It looks like the problem was that the splunkfwd user created during the UF install didn't have permission to read /var/log. After I changed that (setfacl -R -m u:splunkfwd:rX /var/log), I started seeing logs in my indexer.
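In case it helps anyone hitting the same thing, a quick way to confirm the forwarder user can actually read the files is something like this (the messages file is just an example path):
getfacl /var/log
sudo -u splunkfwd ls /var/log
sudo -u splunkfwd head -n 1 /var/log/messages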
Thanks everyone for your help.
So I tried a couple of search strings and I am able to see my new hosts:
index="_internal" sourcetype="splunkd" source="*metrics.lo*" group=tcpin_connections component=Metrics | eval sourceHost=if(isnull(hostname), sourceHost,hostname) | eval connectionType=case(fwdType=="uf","universal forwarder", fwdType=="lwf", "lightweight forwarder",fwdType=="full", "heavy forwarder", connectionType=="cooked" or connectionType=="cookedSSL","Splunk forwarder", connectionType=="raw" or connectionType=="rawSSL","legacy forwarder") | eval version=if(isnull(version),"pre 4.2",version) | eval guid=if(isnull(guid),sourceHost,guid) | eval os=if(isnull(os),"n/a",os)| eval arch=if(isnull(arch),"n/a",arch) | fields connectionType sourceIp sourceHost splunk_server version os arch kb guid ssl tcp_KBps | eval lastReceived = case(kb>0, _time) | eval lastConnected=max(_time) | stats first(sourceIp) as sourceIp first(connectionType) as connectionType max(version) as version first(os) as os first(arch) as arch max(lastConnected) as lastConnected max(lastReceived) as lastReceived sparkline(avg(tcp_KBps)) as "KB/s" avg(tcp_KBps) as "Avg_KB/s" by sourceHost guid ssl | addinfo | eval status=if(lastConnected<(info_max_time-900),"missing",if(mystatus="quiet","quiet","active")) | fields sourceHost sourceIp version connectionType os arch lastConnected lastReceived KB/s Avg_KB/s status ssl | rename sourceHost as Forwarder version as "Splunk Version" connectionType as "Forwarder Type" os as "Platform" status as "Current Status" lastConnected as "Last Connected" lastReceived as "Last Data Received" | fieldformat "Last Connected"=strftime('Last Connected', "%D %H:%M:%S %p") | fieldformat "Last Data Received"=strftime('Last Data Received', "%D %H:%M:%S %p") | sort Forwarder
OK. Again - do you see events from your UF in _internal index? (try a longer timespan extending some time into the future).
Hi @jkamdar, check splunkd.log on the forwarders; it's usually a good place to start diagnosing. Good luck!
Thanks, and I see a bunch of lines like the one below:
TailReader [19453 tailreader0] - error from read call from '/var/log/message'
Is that a permission issue?
Hi @jkamdar,
please perform these tests:
index=_internal host=<one_of_the_missing_hosts>
if you have logs, the connection is OK.
If the connection is not OK, try this on the missing forwarder:
telnet <ip_splunk_server> 9997
if it cannot connect, there is a routing issue; maybe there are local or network firewalls in the way.
if instead you have internal logs, you should check on the forwarder whether the user you're using to run Splunk has permission to read the files, and obviously whether the paths of the files to read are correct.
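For example, something like this on the forwarder (the user name and monitored path are only examples, adjust them to your environment):
ps -ef | grep -i splunkd
sudo -u splunkfwd test -r /path/to/monitored/file && echo readable || echo "not readable"
ls -ld /var/log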
Ciao.
Giuseppe
Thanks, I tried index=_internal | stats count by host but I don't see the newly installed UF host name there.
Then I tried "./splunk add forward-server <host name or ip address>:<listening port>" but it says it's already there. So I removed both inputs.conf and outputs.conf and ran the command above, which re-created outputs.conf. I also re-added inputs.conf manually and then restarted Splunk, without any success.
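For reference, what the UF has actually picked up can be double-checked with commands like these on the forwarder (run from $SPLUNK_HOME/bin):
./splunk list forward-server
./splunk btool outputs list --debug
./splunk btool inputs list --debug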
I do see errors in splunkd.log on the UF, as shown below:
TailReader [19453 tailreader0] - error from read call from '/var/log/message'. Maybe it's a permission issue.
These are two separate issues. If you have local permissions/SELinux issues, you might not be able to ingest "production" data, but you should still be getting events into the _internal index, since those are the forwarder's own logs.
Check splunkd.log on the forwarder and see whether it's able to connect to the receiving indexer(s). If not, see what the reason is.
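For example, something along these lines on the forwarder (assuming a default log location):
grep -iE "TcpOutputProc|blocked|WARN|ERROR" $SPLUNK_HOME/var/log/splunk/splunkd.log | tail -n 30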
Well, in that case I am really confused.
I can telnet from the UF host to the indexer on port 9997. One more thing: I do see the UF host names in metrics.log on my indexer. And tcpdump shows traffic being sent from the UF host to the indexer and, on the indexer, traffic being received from the UF host.
Ok. Do a simple
index=_internal host=your_uf
search and run it as a real-time search. That's one of the very few use cases when real-time search is actually warranted.
If you see something, check the validity of the data. A typical problem when you're supposedly ingesting data but don't see it (apart from a non-existent destination index) is a time problem - if the timezone is misconfigured on the source, the data can appear to come from long ago, so it's indexed into the past. You then won't see it when searching "last 15 minutes" or so.
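A quick way to check for that is to compare index time with event time for the new host, for example (host name is a placeholder, index name taken from your earlier post):
index="my-index-name" host=<new_uf_host> earliest=-30d@d latest=+1d
| eval lag_seconds = _indextime - _time
| stats count min(_time) as oldest max(_time) as newest avg(lag_seconds) as avg_lag by host
| convert ctime(oldest) ctime(newest)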