Good afternoon,
I have some syslog data coming into Splunk. I am trying to write the props.conf and transforms.conf settings for the field extractions and want to make sure I am doing it the best way.
Question:
How do I accommodate different log formats in the same input source? All my events are in CSV format with clean comma delimiters, but the set of fields differs by log type. For example:
Log 1 format:
Source, random_characters, log_TYPE, Login_Name, Last_Name, First_Name, Event_ID, Severity, Status
Log 2 format:
Source, random_characters, log_TYPE, Login_Name, Last_Name, Time, Date, Status, Resolution
Key point: the fields differ after the log_TYPE field.
How should I build the props and transforms to most effectively and accurately extract the fields for each log type?
So far, my idea is to use a regex that matches up to the log type in the middle of the event, where the format can be identified, and then to continue the same REGEX line to extract the fields that belong to that log type.
Example of my idea:
Log example:
2015-04-16 13:27:37,278, some words and random characters, AUDIT_LOG, Interesting field 1, not needed, not needed, Interesting field 2, not needed, Interesting field 3.
My transforms.conf
[audit_log_field_extractions]
REGEX = [\w\.\s\:-]+,[\w\.\s\:-]+,[\w\.\s\:-]+,\sAUDIT_LOG,\s(\w+),[\w\.\s\:-]+,[\w\.\s\:-]+,\s(\w+),[\w\.\s\:-]+,\s(\w+)
FORMAT = Field1::$1 Field2::$2 Field3::$3
All of this in English now:
[\w\.\s\:-]+,[\w\.\s\:-]+,[\w\.\s\:-]+,\s matches everything up to the point where the log type appears in the event,
AUDIT_LOG is the exact text I use to identify this specific log type, which tells me which fields are relevant and what their headers should be,
The rest of the regex either captures a field into a group or skips past it.
Then the FORMAT assigns names to each of the groups that I collected from the REGEX line.
In theory I would have a similar stanza for each log type: each one identifies its log type in the event and then uses a customized REGEX to grab the fields I want.
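As a rough sketch of what such a second stanza might look like (the log type ACCESS_LOG, the stanza name, and the captured positions are hypothetical, made up only to illustrate the pattern):

```ini
# transforms.conf (sketch only; ACCESS_LOG and this stanza name are hypothetical)
[access_log_field_extractions]
# Same three-field prefix as the AUDIT_LOG stanza, then the literal log type,
# then capture the first two fields that follow it.
REGEX = [\w\.\s\:-]+,[\w\.\s\:-]+,[\w\.\s\:-]+,\sACCESS_LOG,\s(\w+),\s(\w+)
FORMAT = Login_Name::$1 Last_Name::$2
```

Only the literal log type and the capture groups would change from one stanza to the next.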
I plan on calling these transforms from REPORT settings in props.conf, which means the extraction is done at search time. My concern is whether this will be too resource-intensive at search time. Is this a real concern? Is there a better way to do this?
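For reference, a minimal sketch of the props.conf side I have in mind, assuming a sourcetype named my_syslog (a placeholder, not my real sourcetype name):

```ini
# props.conf (sketch; the sourcetype name my_syslog is a placeholder)
[my_syslog]
# REPORT-<class> points at a transforms.conf stanza and is applied at search time.
REPORT-audit_log = audit_log_field_extractions
```

Each additional log type would get its own REPORT- line pointing at its transforms stanza.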