The infrastructure department of an enterprise would like to index logs from several BlueCoat proxies in Splunk, but they have little or no control over these proxies or over what type of logging is enabled (i.e. which fields are being logged). What they would like to know is how Splunk can deal with BlueCoat logs whose format may change without advance notice.
There are several BCs within the enterprise, managed by people in different departments/business areas. These people may change the log file format as they see fit, without telling anyone. Currently, logs are continuously sent to a central location (via syslog, I believe). From what I've been told, if you watch the data as it comes in, the change of format "just happens", and after a little while there is a message stating what the new log file format is (this seems to be a BlueCoat flaw).
From a Splunk point of view this is not good, since field extractions will not work well, if at all.
What would be the best course of action in order to get the BC logs splunkified?
Thanks in advance,
Could you explain why you need to extract fields in advance? As you know, we can search the data with any keyword, such as an IP address or a URL.
I was under the impression that BC normally logs like most web servers, i.e. in a CSV or TSV format. However, somebody suggested that it's possible to configure BC to log with key=value. Could someone confirm?
Dart: Well, yes, I'm quite aware of that, but since I think we would be dealing with data coming in over syslog, it would have to be done as I described in the question above: one syslog server (instance) per log format.
If you have 2 different log formats, say Basic and Full, you set up 2 syslog receivers, say udp/514 and udp/515. Then you configure your BCs to send to 514 when logging Basic, and to 515 when logging Full.
The syslog server(s) write files to different directory structures, and the forwarder(s) can easily set the sourcetype based on which file(s) it reads.
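To illustrate the idea (this is only a Python sketch of the routing logic, not Splunk configuration), the mapping from directory to sourcetype could look like the following. The directory names, sourcetype names, and the `sourcetype_for` helper are all made up for the example.

```python
from pathlib import PurePosixPath

# Hypothetical layout: each syslog receiver writes under its own directory.
SOURCETYPE_BY_DIR = {
    "bluecoat_basic": "bluecoat:basic",  # written by the udp/514 receiver
    "bluecoat_full": "bluecoat:full",    # written by the udp/515 receiver
}


def sourcetype_for(path: str, default: str = "bluecoat:unknown") -> str:
    """Pick a sourcetype from the first known directory name in the path."""
    for part in PurePosixPath(path).parts:
        if part in SOURCETYPE_BY_DIR:
            return SOURCETYPE_BY_DIR[part]
    return default
```

The point is that the format decision is made once, by whoever chooses the destination port, and everything downstream only has to look at where the file landed.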
Any other ideas?
I have a similar problem, just with inertia in the operations department. I ended up creating a different event type for each format I encountered; I'm up to five. In our environment, all of our logs are first FTP'ed to BlueCoat Reporter, and I can pick them up from there with a forwarder. I ran head on the first 6 lines of each file, and the field order is provided there.
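As a rough sketch of how that header could be used, here is some Python that reads the field order from the file itself. It assumes the header contains a `#Fields:` line as in the W3C extended log file format, and that values are space-separated with double quotes around values containing spaces; check that against your own files. The `parse_elff` name and the sample lines are invented for the example.

```python
import csv
from typing import Dict, Iterable, Iterator


def parse_elff(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log line, keyed by the most recent #Fields: header."""
    fields = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # A new header replaces the field order for the lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # other directives and blank lines carry no event data
        if not fields:
            continue  # no header seen yet, so the field order is unknown
        # Assumes single spaces between values and "..." around quoted values.
        values = next(csv.reader([line], delimiter=" ", quotechar='"'))
        yield dict(zip(fields, values))


# Made-up sample showing a format change in mid-stream.
sample = [
    "#Fields: date time c-ip cs-uri",
    "2024-01-01 10:00:00 10.0.0.1 /index.html",
    "#Fields: date time cs-uri c-ip",
    "2024-01-01 10:00:01 /login 10.0.0.2",
]
records = list(parse_elff(sample))
```

Both records end up with `c-ip` under the same key even though the column order changed between them. This only helps when the header arrives before the data it describes, which is the case for files but, per the original question, apparently not for the syslog feed.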
For each different format, I created an eventtype (LogFormatA, LogFormatB, etc.) based on the IP address of the proxy. These are in eventtypes.conf.
Then I created new stanzas in transforms.conf for each of the new events and moved around the fields in the REGEX to match the format.
Finally, in props.conf, I added all of the new LogFormat_x events to the REPORT-main statement.
That got me consistent parsing across the different formats.
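The approach above boils down to "one regex per format, same field names everywhere". Here is a small Python sketch of that idea; it is not the actual Splunk configuration, and the proxy IP addresses, field names, and two-format layout are invented for illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical mapping of proxy IP address to the format that proxy logs in.
FORMAT_BY_PROXY = {
    "10.1.1.10": "LogFormatA",
    "10.1.1.20": "LogFormatB",
}

# One pattern per format: same group names, different field order.
PATTERNS = {
    "LogFormatA": re.compile(
        r"^(?P<date>\S+) (?P<time>\S+) (?P<c_ip>\S+) (?P<cs_uri>\S+)$"
    ),
    "LogFormatB": re.compile(
        r"^(?P<date>\S+) (?P<time>\S+) (?P<cs_uri>\S+) (?P<c_ip>\S+)$"
    ),
}


def extract(proxy_ip: str, line: str) -> Optional[Dict[str, str]]:
    """Return named fields for a line, or None if the proxy or line is unknown."""
    fmt = FORMAT_BY_PROXY.get(proxy_ip)
    if fmt is None:
        return None
    match = PATTERNS[fmt].match(line)
    return match.groupdict() if match else None
```

Because every pattern uses the same group names, searches written against `c_ip` or `cs_uri` keep working no matter which proxy the event came from. The cost is that a proxy switching formats without notice still needs its mapping entry updated by hand.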