I have a problem finding syslog from some Juniper devices in Splunk. I did a packet capture on the server and confirmed the syslog packets are reaching it. It's a Windows Server 2012 R2 box, and the port is UDP 514. Syslog from many other devices shows up fine, and some of them are the same model and firmware with identical syslog configurations. The Splunk version is 6.3.1. How can I troubleshoot this?
After further research, it turns out to be the difference between the JUNOS standard-format (unstructured) system log and the structured system log (RFC 5424). Adding a syslog server that writes the events to a file (probably adding the receiving timestamps) like Rich suggested would definitely avoid this kind of issue. Thanks again for the quick response and the awesome troubleshooting steps!
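Roughly, the two formats differ like this (illustrative lines only, not copied from a real device) - the standard format carries no year in its timestamp, while the structured/RFC 5424 format carries a full ISO timestamp:
Mar 20 10:15:42 router1 mgd[4153]: UI_COMMIT: User 'admin' requested 'commit' operation
<165>1 2017-03-20T10:15:42.123-07:00 router1 mgd 4153 UI_COMMIT [junos@2636.1.1.1.2.26 username="admin"] User 'admin' requested 'commit' operation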
Problem resolved by adding:
set system syslog host <splunk server IP> structured-data brief
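For reference, a slightly fuller sketch of that host stanza (192.0.2.10 is just a placeholder for the Splunk server IP, and 'any any' mirrors whatever facility/severity you already send):
set system syslog host 192.0.2.10 any any
set system syslog host 192.0.2.10 structured-data brief
commit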
Appreciate the help!
Many thanks!
index=* TERM(10.10.10.10)
with the host IP works. I also needed to expand the search time range (I used "All time" in this case). I found the syslog events with the same date/time but a different year though (2017 or 2016 in my case, for two devices with the same firmware/model); going to dig into why they're being parsed like that.
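One way to dig into that is to compare the parsed event time against the time Splunk actually indexed the event - a quick sketch, reusing the TERM() trick with your device IP:
index=* TERM(10.10.10.10)
| eval event_time=strftime(_time, "%Y-%m-%d %H:%M:%S"), index_time=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table host event_time index_time _raw
A big gap between the two usually points at timestamp parsing rather than delivery (the classic BSD-style syslog timestamp carries no year, so Splunk has to guess it).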
First off, are you sending syslog directly to Splunk? While supported, it's not really a good idea. But that's OK, it's probably not the problem here.
I'd start with the standard way to find *any* source.
First, search all indexes for the host, like
index=* host=<yourHostName>
Unfortunately, if your hostname isn't what you think it is (which is very possible, since you can't find this host!), that won't help. But try
index=* TERM(10.10.10.10)
Obviously, use your host's IP address there, and see if you can find it.
Another helpful trick is a more wide-open search, setting the time range to cover when this host *should* have started sending in logs (e.g. if you set that up on Thursday, try going Wednesday to Friday):
index=* | timechart count by host
You might see a sudden new host show up about mid-day Thursday. And that is probably it - again, do NOT discount "Oh, I know that hostname can't possibly be it", because the problem is that *you can't find that host yet*, so everything is a candidate until proven by examining the actual events.
You might see another host suddenly double its event count, too - that means you may have DNS or the host misconfigured, so it is sending in events as if it's another hostname.
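One quick way to check for that, assuming you know the device's IP, is to see which host value its events are actually landing under:
index=* TERM(10.10.10.10) | stats count by host, sourcetype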
Swap in 'by src_ip' or whatever field the IP comes in on, too, and take a look there if you need to.
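For example (assuming the field really is called src_ip in your data - substitute whatever your sourcetype uses):
index=* | timechart count by src_ip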
Lastly, and a bit more complex, you could look at the first time any host sent in data:
index=*
| stats min(_time) as first_mention by host
| eval first_mention = strftime(first_mention, "%Y-%m-%d %H:%M:%S")
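If that returns a long list, you could optionally bring the newest arrivals to the top by appending something like:
| sort - first_mention | head 20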
There's really no "magic bullet" that works in all cases, because the list of things that might be wrong is HUGE.
Good luck hunting!
-Rich
So why not just have Splunk listen for syslog itself? For our purposes, because a dedicated syslog server simplifies troubleshooting. If syslog-NG were listening for the syslog messages, it would receive them and write them to a file. That splits the troubleshooting into two FAR easier steps: are the syslog messages coming in and being written to the file? If not, troubleshoot that side of things. Are they being written to the file? Then we know they're arriving, and we can test, retest, and build searches that look at the actual contents of the messages.
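As a rough sketch of what that looks like, in syslog-ng syntax since that's the one mentioned (paths, names, and the host_segment value are assumptions to adapt to your environment):
source s_net { udp(ip(0.0.0.0) port(514)); };
destination d_byhost { file("/var/log/remote/${HOST}/messages" create-dirs(yes)); };
log { source(s_net); destination(d_byhost); };
Then have Splunk monitor those files with an inputs.conf stanza along these lines:
[monitor:///var/log/remote/*/messages]
sourcetype = syslog
host_segment = 4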
This, of course, is just in addition to all the standard reasons for not having Splunk listen on 514 itself; you can Google for those (there are lots of reasons why, many of them outlined right here in Answers).