Getting Data In

Aggregating syslogs and forwarding to Splunk as hosts

doadams85
Observer

Hi All - I'm pretty new to Splunk and having an issue sorting/parsing data from our syslog server. We have many RHEL7 Linux hosts all sending their logs to one server, where they get aggregated. This works fine: I can go into /var/log/secure, /var/log/messages, etc. and see entries from all the hosts we have. We are running a splunkforwarder on this host in the hope that it would forward all the data to Splunk as it hits this RHEL7 log aggregator.

We have just a single search head/indexer, and if I run the query index=* I do get quite a few results, BUT it shows only two hosts: the Splunk instance and the RHEL7 system we aggregate the logs on. If I change the search to index=* <hostname>, with the hostname being one of the RHEL hosts, I can find the entries specific to that host. I hope this makes sense. So somehow I need to tell Splunk about these hosts so they are recognized as separate hosts. What can I do to make this work? Thank you all in advance!


PickleRick
SplunkTrust

OK. Processing syslog is not as easy as it seems 🙂

1. There is hardly such a thing as "standard syslog". Yes, there are some RFCs describing the syslog protocol, but in practice many solutions send practically anything to port 514 and consider it "syslog".

2. If the data is properly formatted (RFC3164 or RFC5424), the hostname can (and will if using properly configured sourcetypes) be parsed out from the event itself. Too bad many environments have - for example - a ton of routers in various locations, each called "gateway".

3. When you're receiving messages directly on a network port with Splunk (either a UF or "full" Splunk), you lose most of the metadata about the source (if properly configured, the input can set the host field to the source IP or hostname, but it can subsequently be overwritten by the value from the event - see the previous point).

So the recommended way of ingesting syslog data into Splunk is to set up an intermediate syslog daemon which either:

1) Forwards to a HEC input on Splunk, adding the proper metadata (this can be done with rsyslog, syslog-ng or SC4S), or

2) Writes to files from which the UF picks up data and forwards it to Splunk (kinda similar to what you did).

But the 2nd option is best done when writing to separate files for each host (for example with dynamic filename generation based on the source IP). Then you have the source file path in the source field and can parse the original IP or hostname from that. A rough sketch of such a setup is below.
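For illustration only, a minimal rsyslog sketch of that per-host file split (the port, paths and template name here are assumptions; adjust to your environment):

```
# /etc/rsyslog.d/10-remote.conf -- sketch only
module(load="imudp")
input(type="imudp" port="514")

# One file per sending host, keyed on the sender's IP address
template(name="PerHostFile" type="string"
         string="/var/log/remote/%FROMHOST-IP%/messages.log")

# Write remote messages to the per-host file, then stop further processing
if ($fromhost-ip != "127.0.0.1") then {
    action(type="omfile" dynaFile="PerHostFile")
    stop
}
```

The UF would then monitor /var/log/remote/ and could derive the host field from the path (see host_segment further down the thread).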

richgalloway
SplunkTrust

It sounds to me like when data is aggregated on the one server the original host information is lost.

Would it be possible for each RHEL7 host to forward its logs directly to Splunk? That would preserve the host information.

---
If this reply helps you, Karma would be appreciated.

doadams85
Observer

Thanks for the reply. Unfortunately that is not an option: we need to keep the logs from all the servers, and they all live on giant RAM disks, so when a system is shut down its logs all go away - except on this one host. I was hoping that we could somehow massage the data on the log aggregator (heavy forwarder maybe?) and push it to Splunk with the correct hostname.


richgalloway
SplunkTrust

It may be possible. Does the aggregated log contain a field that tells what the original host was? If so, the HF could be configured to extract that field as the host field; a sketch of that follows.
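As a hedged example only (it assumes a plain RFC3164 header where the hostname is the token after the timestamp, and a sourcetype named syslog), the override on the HF could look like:

```
# props.conf
[syslog]
TRANSFORMS-set_host = syslog_host_override

# transforms.conf
[syslog_host_override]
# Capture the hostname following the RFC3164 timestamp, e.g.
# "Jan  5 12:34:56 webserver01 sshd[123]: ..."
REGEX = ^\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s+(\S+)
FORMAT = host::$1
DEST_KEY = MetaData:Host
```

Note this is an index-time transform, so it has to live on the first heavy forwarder (or indexer) that parses the data.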

---
If this reply helps you, Karma would be appreciated.

isoutamo
SplunkTrust

Hi

it sounds like you have a normal Linux box with standard syslog configured to put all remote syslog entries into one log file. Instead of that, it's better to configure syslog (rsyslog or syslog-ng) to separate logs into their own files, like /var/log/syslogs/<host>/<date>/xyz, as they come in. Then just read those files and use that <host> as the hostname when you send them to Splunk (a sketch of the forwarder side is below).
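A minimal sketch of the matching universal forwarder input, assuming the directory layout above (the index and sourcetype names are placeholders):

```
# inputs.conf on the universal forwarder -- sketch only
[monitor:///var/log/syslogs]
sourcetype = syslog
index = main
# Take the 4th path segment as the host: /var(1)/log(2)/syslogs(3)/<host>(4)
host_segment = 4
```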

Another option is to set up SC4S (Splunk Connect for Syslog) to collect those syslogs and send them to Splunk.

r. Ismo


gcusello
SplunkTrust

Hi @doadams85,

after you installed the Splunk Universal Forwarder on the target host, did you:

  • configure the Indexer to receive logs from forwarders (by default on port 9997)? (see the sketch after this list)
  • configure your UF to send logs to that Indexer (outputs.conf)?
  • install the Splunk Add-on for Unix and Linux (Splunk_TA_nix)?
  • enable the input stanzas?
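A minimal sketch of the first two items, with a placeholder indexer hostname:

```
# On the indexer: enable receiving on TCP 9997
splunk enable listen 9997

# outputs.conf on the universal forwarder -- sketch only
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = splunk-indexer.example.com:9997
```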

Ciao.

Giuseppe


doadams85
Observer

Hi Giuseppe,

Yes to all you mention. The data gets into Splunk, but ONLY the log aggregator shows up as a host in the search window on the left, and I need all the hosts showing up. If I search using the other hostnames, I can see the log entries from those hosts; they just don't show up as hosts on the left. Make sense?


gcusello
SplunkTrust

Hi @doadams85 ,

sorry, but it isn't clear; could you share a sample of your logs and the search you're using?

Ciao.

Giuseppe


doadams85
Observer

Hi All - Any ideas on what I posted?
