I am getting logs from SBG, but Splunk is not able to index those logs. I need to index them. I did field extraction for the first 3 fields, which are common to every event. The main problem is that the remaining fields depend on the 3rd field, which is the action. Now I want to extract those fields and name them for search purposes. Below are some example events.
1> jul 7 04:02:01 wipro-blr-out01 ecelerity: 1404685921|cb5bdd57-f792c6d000001154-e6-53b9ce619b84|ACCEPT|126.96.36.199:50090
2> Jul 7 04:02:01 wipro-blr-out01 ecelerity: 1404685921|cb5bdd57-f792c6d000001154-e7-53b9ce61f951|IRCPTACTIONfirstname.lastname@example.org|annotate
3> Jul 7 04:02:01 wipro-blr-out01 bmserver: 1404685921|cb5bdd57-f792c6d000001154-e7-53b9ce61f951|VERDICTemail@example.com|content_300|default|legal disclaimer
As you can see in the events above, the first 3 fields are common, so I have named them. Now I want to name the rest of the fields, using a search like the following.
index=main sourcetype=ec_sbg_outbound action=accept
This returns all the events with action=accept. That action has only one field after it, so I need to name that field.
I have to do the same for all the action types.
Please help me.
Thanks in advance.
Since you have the extraction for the first three fields, I suspect you can make the extraction for the remaining ones; you just need to know how to set it up and make it work right.
In $SPLUNK_HOME/etc/apps/myappname/local/transforms.conf (or wherever) you will need to create several REGEX statements. You have log lines like:
jul 7 04:02:01 wipro-blr-out01 ecelerity: 1404685921|cb5bdd57-f792c6d000001154-e6-53b9ce619b84|ACCEPT|188.8.131.52:50090
So, use the REGEX you already have for the first three fields, only don't extract the third field yet. Instead, capture everything from the third field to the end of the line as something like "sbg_extra_info". BTW, I assume there's some "header" type of information that's not really "field1", i.e. timestamp and so on. It doesn't matter for my explanation, I just mention it to prevent confusion below.
Some "pseudo regex" - meaning you may have to escape pipes, and honestly I just whipped it up so it's probably totally wrong, but it's close enough for our purposes:
[sbg-message-parse]
REGEX = ^(?<timestamp>[^ ]*\s+[^ ]*\s+[^ ]*)\s+(?<host>[^ ]*)\s+(?<some_other_field>[^ :]*)[:]\s+(?<field1>[^|]*)\|(?<field2>[^|]*)\|(?<sbg_extra_info>.*)
But the important part is field1, field2, then "everything else" as sbg_extra_info.
Now, also in that same transforms.conf, create more stanzas, one for each type of service (ACCEPT, IRCPTACTION, VERDICT, etc...). Use "SOURCE_KEY = sbg_extra_info" so that each of these extractions starts from that field.
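If you want to check the split outside Splunk first, here is a minimal Python sketch of the same first-stage idea, run against the ACCEPT sample line. It assumes the pipes are escaped and that field2 ends at the next pipe. Python spells named groups (?P<name>...), a form Splunk's regex engine also accepts. The function name parse_header is made up for illustration.

```python
import re

# First-stage pattern: header fields, field1, field2, then everything else.
# Assumption: fields are pipe-delimited and field2 stops at the next pipe.
STAGE1 = re.compile(
    r"^(?P<timestamp>\S+\s+\S+\s+\S+)\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<some_other_field>[^ :]+):\s+"
    r"(?P<field1>[^|]*)\|"
    r"(?P<field2>[^|]*)\|"
    r"(?P<sbg_extra_info>.*)$"
)


def parse_header(line):
    """Return the first-stage fields as a dict, or None if the line does not match."""
    match = STAGE1.match(line)
    return match.groupdict() if match else None


sample = (
    "jul 7 04:02:01 wipro-blr-out01 ecelerity: "
    "1404685921|cb5bdd57-f792c6d000001154-e6-53b9ce619b84|ACCEPT|188.8.131.52:50090"
)
fields = parse_header(sample)
print(fields["sbg_extra_info"])  # ACCEPT|188.8.131.52:50090
```

Once sbg_extra_info comes out as the action plus whatever follows it, the per-service extractions only have to deal with that one field.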
[sbg-extrainfo-accept-parse]
SOURCE_KEY = sbg_extra_info
REGEX = (?<service>ACCEPT)\|(?<accept_field1>[^|]*)(?<accept_field2>...

[sbg-extrainfo-ircptaction-parse]
SOURCE_KEY = sbg_extra_info
REGEX = (?<service>IRCPTACTION)\|(?<ircptaction_field1>[^|]*)(?<ircptaction_field2>...
Notice that in each of those, I pull out the "service" (IRCPTACTION, ACCEPT... ) as well, then the rest of the REGEX just extracts whatever is appropriate for the rest of the message. Add more field extractions and stanzas as required.
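The per-service idea can be sketched in Python too: every pattern is tried against the sbg_extra_info value, and only the one whose service name matches contributes fields. The field names (accept_src, accept_port, verdict_recipient, verdict_rule) and the layouts are my assumptions for illustration, and the VERDICT test string assumes a pipe follows the action name, as it does in the ACCEPT event.

```python
import re

# One pattern per service, each anchored on its own service name.
# Field names and layouts here are hypothetical; adjust them to your data.
SERVICE_PATTERNS = [
    re.compile(
        r"^(?P<service>ACCEPT)\|(?P<accept_src>[^|:]*)(?::(?P<accept_port>\d+))?"
    ),
    re.compile(
        r"^(?P<service>VERDICT)\|(?P<verdict_recipient>[^|]*)\|(?P<verdict_rule>[^|]*)"
    ),
]


def parse_extra_info(extra):
    """Try every service pattern; keep the fields of whichever one matches."""
    fields = {}
    for pattern in SERVICE_PATTERNS:
        match = pattern.match(extra)
        if match:
            # Skip optional groups that did not take part in the match.
            fields.update(
                {k: v for k, v in match.groupdict().items() if v is not None}
            )
    return fields


accept = parse_extra_info("ACCEPT|188.8.131.52:50090")
verdict = parse_extra_info(
    "VERDICT|user@example.com|content_300|default|legal disclaimer"
)
print(accept)
print(verdict)
```

Because each pattern starts with its own service name, the patterns that do not apply to an event simply fail to match and add nothing.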
Lastly, you have to call all of these from props.conf. Order is important in that you have to pull out your sbg_extra_info FIRST. All the rest are on equal footing because there are no "nested" dependencies; that one field just needs to be created first. So, in $SPLUNK_HOME/etc/apps/myappname/local/props.conf, call them all.
[mysourcetype]
REPORT-sbg_info = sbg-message-parse,sbg-extrainfo-accept-parse,sbg-extrainfo-ircptaction-parse,...
That should be it. I usually recommend getting the main sbg-message-parse right first, then proceeding with the rest. That way you can tweak each regex as a rex in a search directly and get it just right before committing it to your transforms.conf file.
I created this regex, which extracts fields from the majority of Symantec Messaging Gateway's logs:
^<142>(?P<date>\w+\s+\d+)\s+(?P<time>[^ ]+)\s+(?P<server>\w+)\s+(?P<process_name>[a-z]+)\[(?P<process_number>\d+)[^ \n]* (?P<process_id>[^\|]+)\|(?P<message_id>[^\|]+)\|(?P<action>IRCPTACTION|VERDICT|UNTESTED|FIRED|SENDER|LOGICAL_IP|EHLO|MSG_SIZE|MSGID|SOURCE|SUBJECT|ORCPTS|TRACKERID|ATTACH|UNSCANNABLE|VIRUS|DELIVER|ACCEPT)(?:(?:(?<=ACCEPT|DELIVER|LOGICAL_IP)\|(?P<src>[^:\s]+)(?::(?P<port>[0-9]+))?(?:\|(?P<to>[^\s]+))?)|(?:(?<=FIRED|IRCPTACTION|ORCPTS|TRACKERID|UNTESTED|VERDICT)\|(?P<recipient>[^\s\|]+)(?:\|)?(?P<result>[a-z][^\|\s]+)?(?:\|(?P<result_2>[a-z][^\|]+))?(?:\|(?P<result_3>.+))?)|(?:(?<=SENDER)\|(?P<from>[^\s]+))|(?:(?<=MSG_SIZE)\|(?P<msg_size>\w+))|(?:(?<=SUBJECT)\|(?P<subject>.*))|(?:(?<=ATTACH)\|(?P<attachment>.+))|(?:(?<=UNSCANNABLE)\|(?P<reason>.+))|(?:(?<=VIRUS)\|(?P<virus_name>.+))|(?:(?<=EHLO)\|(?P<fqdn>.+)))?
I wrote a short blog post about it here: http://alec.dhuse.com/?p=217
try, i used the logic: