Hi All - Pretty new to Splunk and having an issue sorting/parsing data from our syslog server. We have many RHEL7 Linux hosts all sending their logs to one server, where they get aggregated. This works fine: I can go into /var/log/secure, messages, etc. and see entries from all the hosts we have. We are running a splunkforwarder on this host in the hope that it forwards all the data to Splunk as it hits this RHEL7 log aggregator. We just have a single search head/indexer, and if I run the query index="*" I do get quite a few results, BUT it only shows 2 hosts: the Splunk instance and the RHEL7 system that we are aggregating the logs on. If I change the search to index="*" hostname, with the hostname being one of the RHEL hosts, I can find the entries specific to that host. I hope this makes sense? So somehow I need to tell Splunk about these hosts so they are recognized as separate hosts. What can I do to make this work? Thank you all in advance!
OK. Processing syslog is not as easy as it seems 🙂
1. There is hardly any such thing as "standard syslog". Yes, there are some RFCs describing the syslog protocol, but in practice many solutions send practically anything on port 514 and consider it "syslog".
2. If the data is properly formatted (RFC3164 or RFC5424), the hostname can (and will, if using properly configured sourcetypes) be parsed out from the event itself. Too bad many environments have - for example - a ton of routers in various locations, each called "gateway".
3. When you're receiving messages directly on a network port with Splunk (either UF or "full" Splunk), you lose most of the metadata about the source (if properly configured, the input can set the host field to the source IP or hostname, but it can be subsequently overwritten by the value from the event - see previous point).
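To make point 2 concrete, here is a minimal Python sketch of what "parsing the hostname out of the event" means for an RFC3164-style line. It only illustrates the idea and is not how Splunk implements it; the regex, the function name `extract_host` and the sample lines are made up for this example.

```python
import re

# RFC3164-style layout: "<PRI>Mmm dd hh:mm:ss HOSTNAME TAG: message".
# Lines already written to a file by a syslog daemon usually have no <PRI>
# prefix, so it is treated as optional here.
SYSLOG_LINE = re.compile(
    r"^(?:<\d{1,3}>)?"                                         # optional priority
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "  # e.g. "Mar  3 14:02:11"
    r"(?P<host>\S+) "                                           # originating host
    r"(?P<message>.*)$"
)


def extract_host(line):
    """Return the hostname embedded in a syslog line, or None if it does not match."""
    match = SYSLOG_LINE.match(line)
    return match.group("host") if match else None


# Hypothetical sample events, invented for illustration only.
sample_lines = [
    "Mar  3 14:02:11 web01 sshd[2211]: Accepted publickey for admin from 10.0.0.5 port 50022 ssh2",
    "<86>Mar 13 14:02:12 db02 sudo: pam_unix(sudo:session): session opened for user root",
    "not a syslog line",
]

for line in sample_lines:
    print(extract_host(line))
```

If the lines in the aggregated file look like the first sample, the original hostname is still in the event and can be used to override the host field at parse time.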
So the recommended way of ingesting syslog data into Splunk is to set up an intermediate syslog daemon which either:
1) Forwards to HEC input on Splunk adding proper metadata information (this can be done on rsyslog, syslog-ng or SC4S) or
2) Writes to files from which the UF picks up data and forwards it to Splunk (kinda similar to what you did).
But the 2nd option works best when the daemon writes to a separate file for each host (for example with dynamic filename generation based on source IP). Then you have the source file path in the source field and you can parse the original IP or hostname from that.
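As a rough illustration of the per-host-file idea, the Python sketch below derives the host from a file path. The directory layout `/var/log/remote/<host>/<file>` and the name `host_from_source` are assumptions made for this example, not something taken from your setup.

```python
from pathlib import PurePosixPath

# Assumed (hypothetical) layout written by the syslog daemon:
#   /var/log/remote/<host-or-ip>/<logfile>
BASE_DIR = PurePosixPath("/var/log/remote")


def host_from_source(source):
    """Return the <host> directory name from a source path, or None if the
    path does not follow the assumed layout."""
    path = PurePosixPath(source)
    try:
        relative = path.relative_to(BASE_DIR)
    except ValueError:
        # The file is not under the per-host base directory.
        return None
    # Need at least "<host>/<logfile>" below the base directory.
    if len(relative.parts) < 2:
        return None
    return relative.parts[0]


print(host_from_source("/var/log/remote/web01/secure"))
print(host_from_source("/var/log/remote/10.0.0.7/messages"))
print(host_from_source("/var/log/messages"))
```

On the Splunk side you would not run a script for this. As far as I know, the monitor input's `host_segment` setting in inputs.conf picks one path segment as the host, which does the same thing as the function above.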
It sounds to me like when data is aggregated on the one server the original host information is lost.
Would it be possible for each RHEL7 host to forward their logs directly to Splunk? That would preserve the host information.
Thanks for the reply. Unfortunately that is not an option, as we need to keep the logs from all the servers: they all live on giant RAMDISKs, and when a system is shut down it all goes away except for this one host. I was hoping that we could somehow massage the data (Heavy Forwarder maybe?) on the log aggregator and push it to Splunk with the correct hostname?
It may be possible. Does the aggregated log contain a field that tells what the original host was? If so, the HF could be configured to extract that field as the host field.
Hi
it sounds like you have a normal Linux box with standard syslog configured to take all remote syslog entries into one log file. Instead of that, it's better to configure syslog (rsyslog or syslog-ng) to separate the logs into their own files (like /var/logs/syslogs/<host>/<date>/xyz) when they come in. Then just read those files and use that <host> as the hostname when you are sending them to Splunk.
Another option is to set up SC4S to collect those syslog events and send them to Splunk.
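For the HEC route, the point is that every event carries its own host metadata. Here is a minimal Python sketch of such a payload, using the JSON keys that Splunk's HTTP Event Collector event endpoint accepts (`event`, `host`, `source`, `sourcetype`, `index`). The function name `build_hec_event` and the sample values are invented for illustration, and nothing is actually sent.

```python
import json


def build_hec_event(message, host, source, sourcetype="syslog", index=None):
    """Build the JSON body for one HEC event, with per-event host metadata."""
    event = {
        "event": message,          # the raw log line
        "host": host,              # original sender, not the aggregator
        "source": source,
        "sourcetype": sourcetype,
    }
    if index is not None:
        event["index"] = index
    return json.dumps(event)


# Hypothetical values for illustration only.
payload = build_hec_event(
    message="Mar  3 14:02:11 web01 sshd[2211]: Accepted publickey for admin",
    host="web01",
    source="/var/log/secure",
)
print(payload)
```

A sender would POST a body like this to the collector's `/services/collector/event` endpoint with an `Authorization: Splunk <token>` header. With SC4S you do not write this yourself, since it builds and sends the events for you.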
r. Ismo
Hi @doadams85,
after you installed the Splunk Universal Forwarder on the target host, did you:
Ciao.
Giuseppe
Hi Giuseppe,
Yes to all you mention. The data gets into splunk but ONLY the log aggregator shows up as a host on the search window on the left. I need all the hosts showing up. If I search using the other hostnames I can see the logfile from that host - just doesn't show as a host on the left. Make sense?
Hi @doadams85 ,
sorry, but it isn't clear. Could you share a sample of your logs and the search you're using?
Ciao.
Giuseppe
Hi All - Any ideas on what I posted?