Getting Data In

How to set up a centralized server for logs from network devices?

Path Finder

Hi guys. How do you deal with logs from network devices? I know that logs from network devices should be sent to a centralized log server, where an installed forwarder will grab them and send them over to the indexer – this avoids a gap in the transfer of logs if the Splunk indexer is down for maintenance or any other reason.
But what if you need to perform maintenance on your log server (updates, etc.) and it needs to be restarted? I do not want to lose any logs at any time.
Can you advise how to set up the log server(s) for availability and redundancy? (I have about 200 network devices that will be sending logs to the server(s).)


SplunkTrust

This should be a good start: http://www.georgestarcher.com/splunk-success-with-syslog/

Making the syslog server highly available can be achieved with whatever means you prefer – for example VMware HA, Veritas clusters, HAProxy, whatever.
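As a hedged sketch of the HAProxy option: if your devices can send syslog over TCP, a minimal TCP passthrough in front of two collectors could look like this (hostnames, addresses, and ports are illustrative assumptions; plain UDP/514 syslog would need a different approach, such as keepalived with a floating VIP):

```
# haproxy.cfg — illustrative sketch, not a drop-in config
frontend syslog_in
    bind *:514
    mode tcp
    default_backend syslog_nodes

backend syslog_nodes
    mode tcp
    balance roundrobin
    server rsyslog1 10.0.0.11:514 check
    server rsyslog2 10.0.0.12:514 check
```

With `check` enabled, HAProxy stops sending traffic to a node that is down for maintenance and resumes once it returns.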


Path Finder

Thanks for the link. It is a good start.
However, how do you deal with restarting syslog servers and losing the logs that come in while a server is down (e.g. a restart after Windows updates)? How do you have your syslog servers set up to make sure data is never lost?


Communicator

For our setup we have a highly available eight-node syslog cluster (with an F5 load balancer sitting out in front) running rsyslog on Red Hat.

This way, if/when maintenance is needed, we never have more than the tolerable number of nodes down, so data isn't lost. We then have a Splunk UF installed on each of the eight hosts to ingest the data into our index cluster.
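For reference, a minimal rsyslog receiver on each node might look like the sketch below (ports, paths, and the per-host file layout are assumptions; the article linked above walks through a fuller setup):

```
# /etc/rsyslog.d/remote.conf — illustrative sketch
module(load="imtcp")
module(load="imudp")
input(type="imtcp" port="514")
input(type="imudp" port="514")

# Write one file per sending device so the Splunk UF
# can recover the host name from the directory path
template(name="PerHostFile" type="string"
         string="/var/log/remote/%HOSTNAME%/syslog.log")
action(type="omfile" dynaFile="PerHostFile")
```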

Hope this helps.

Communicator

Forgot to add: this is only for data that cannot be ingested using the UF – think routers, switches, firewalls, and other black-box Linux appliances.

We do have quite a few other scenarios where we utilize heavy forwarders in lieu of sending to syslog where it makes sense, such as where we have database connections/tails, or we need to run some Python against the data (NetFlow/AppFlow) prior to ingestion.

Path Finder

Thank you. Looks like a syslog server cluster is the way to go. I will need to find more info on that.

In your situation, how do you filter out redundant data? I assume that all of your syslog servers have the same data, but you don't want to index the exact same data that resides on different servers.


Communicator

Regarding the redundant data: in the rsyslog configuration, the rsyslog service determines what the data is and where it should be stored, so if a node fails, traffic can move on and be written to another host.

These messages typically get archived and roll once or twice per day, as they are temporary before we move them into Splunk and replicate them around based on the rules set up for our index cluster. The rsyslog cluster isn't a permanent storage solution; it was designed as a temporary holding place before the data moves into Splunk.
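One hedged way to get that once-or-twice-daily roll is plain logrotate over the spool directories (paths and retention here are assumptions, not the poster's actual config):

```
# /etc/logrotate.d/remote-syslog — illustrative sketch
/var/log/remote/*/syslog.log {
    daily
    rotate 2
    compress
    missingok
    notifempty
    postrotate
        # Tell rsyslog to reopen its output files
        /usr/bin/systemctl kill -s HUP rsyslog.service
    endscript
}
```

Keeping only a couple of rotations is fine here precisely because the cluster is a temporary holding place, not the system of record.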

We then have a Splunk UF installed on each of the eight hosts, where we configure inputs.conf to read in only the data we are interested in.
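An inputs.conf for that kind of layout might look like the sketch below (index name, sourcetype, and the per-host directory path are assumptions to match the rsyslog file layout described above):

```
# inputs.conf on each syslog node — illustrative sketch
[monitor:///var/log/remote/*/syslog.log]
sourcetype = syslog
index = network
# With paths like /var/log/remote/<device>/syslog.log,
# the 4th path segment is the sending device's host name
host_segment = 4
```

Since each message lands on only one node behind the load balancer, monitoring the same path on every node does not double-index the data.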

Here's the link for rsyslog if it helps:

rsyslog

Hope this helps!