Trying to understand how this SEDCMD works so I can modify it for something else. It works in props.conf but I can't seem to get it to work in SPL.
Here is the event log:
Jul 1 19:58:45 filterlog: 67,,,1509205722,igb1,match,pass,in,4,0x0,,64,43017,0,none,17,udp,56,192.168.X.X,X.X.X.X,56393,53,36
Here is the SEDCMD:
s/^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+.\S+\s+/\1/g
This works in props.conf, but not when I test it within SPL with this search:
index=pfsense source="udp:5114"
| rex mode=sed field=fieldtestudp "s/^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+.\S+\s+/\1/g"
| table fieldtestudp
It is supposed to remove everything up to filterlog and, like I said, it does in props.conf but not in SPL. What am I missing?
Thanks.
Here's the test search that I was using:
| makeresults
| eval fieldtestudp="Jul 1 19:58:45 filterlog: 67,,,1509205722,igb1,match,pass,in,4,0x0,,64,43017,0,none,17,udp,56,192.168.X.X,X.X.X.X,56393,53,36"
| rex mode=sed field=fieldtestudp "s/^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+.\S+\s+/\1/g"
| table fieldtestudp
The result I got was this:
Jul 1 19:58:45 67,,,1509205722,igb1,match,pass,in,4,0x0,,64,43017,0,none,17,udp,56,192.168.X.X,X.X.X.X,56393,53,36
Which seems correct to me... The SEDCMD says:
s/^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+.\S+\s+/\1/g
s/ => substitute.
^ => from the start
( => capture into buffer 1
\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s => the time stamp
) => end capture.
\S+.\S+\s+ => <at least 1 non-whitespace><any char><at least 1 non-whitespace><at least 1 whitespace>
/ => substitute with
\1 => the contents of buffer 1.. (the time stamp)
/g => (globally)
In a nutshell it says...
s/Jul 1 19:58:45 filterlog:/Jul 1 19:58:45/g
I'd think that there'd be an easier way to get that done, but there ya go.
Maybe if you play around with my search it'll help you a bit.
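If you'd rather poke at the regex outside of Splunk, here is a minimal sketch in Python. It assumes Python's re module behaves like Splunk's PCRE engine for these particular tokens (\w, \s, \d, \S, capture groups), which I believe holds here but is still an assumption. It prints exactly what the pattern consumes and what \1 holds.

```python
import re

# Same pattern as the SEDCMD above, with the unescaped "." (any character).
PATTERN = r"^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+.\S+\s+"

# Sample event from the original post.
event = ("Jul 1 19:58:45 filterlog: 67,,,1509205722,igb1,match,pass,in,4,0x0,,"
         "64,43017,0,none,17,udp,56,192.168.X.X,X.X.X.X,56393,53,36")

match = re.match(PATTERN, event)
matched_text = match.group(0)  # everything the whole pattern consumed
timestamp = match.group(1)     # capture group 1, reused as \1 in the replacement

# Rough equivalent of s/PATTERN/\1/g: replace the matched text with group 1.
result = re.sub(PATTERN, r"\1", event)

print(repr(matched_text))  # 'Jul 1 19:58:45 filterlog: '
print(repr(timestamp))     # 'Jul 1 19:58:45 '
print(result)
```

Note how the unescaped . ends up matching the "g" of filterlog, with the trailing ":" matching the second \S+. That is the only way the pattern can reach the whitespace that \s+ needs.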
Hope that helps.
First, thank you for the help. I made a mistake in my original post. The actual SEDCMD is the following; there is a backslash in front of the period, which escapes it so that it matches a literal period:
"s/^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+\.\S+\s+/\1/g"
To "reset" the conversation the first thing the props file does is call a command in transforms:
[pfsense_sourcetyper]
REGEX = ^\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s+(\w+)([\d+])?:
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::pfsense:$1
This should extract "filterlog" and set the sourcetype to pfsense:filterlog, which it does.
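As a rough check of that stanza outside Splunk, the sketch below runs the REGEX exactly as posted against a shortened copy of the sample event and builds the FORMAT value by hand. Python's re module is only a stand-in for Splunk's regex engine, and the string concatenation merely mimics what $1 does in FORMAT; neither is Splunk's actual implementation.

```python
import re

# REGEX copied from the [pfsense_sourcetyper] stanza as posted.
REGEX = r"^\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s+(\w+)([\d+])?:"

# Sample event from the thread, shortened for readability.
event = "Jul 1 19:58:45 filterlog: 67,,,1509205722,igb1,match,pass,in,4"

m = re.search(REGEX, event)

# Mimic FORMAT = sourcetype::pfsense:$1 by appending capture group 1.
sourcetype_value = "sourcetype::pfsense:" + m.group(1)
print(sourcetype_value)  # sourcetype::pfsense:filterlog
```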
I am assuming that once that is done and control returns to props.conf, the SEDCMD is trying to remove everything up to the space in front of filterlog, because the next section in props.conf is the field extraction for pfsense:filterlog:
[pfsense:filterlog]
EXTRACT-ipv4_tcp = filterlog:\s(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?4),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?tcp),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^$])$
EXTRACT-ipv4_udp = filterlog:\s(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?4),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?udp),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,]),(?[^,])
I agree the SEDCMD is re-inserting the date just as you stated. What I don't understand is how the SEDCMD is then "removing" the information before the event gets to the EXTRACT section of the props.conf file.
The EXTRACT will work regardless of whether there is a timestamp in front. Note the EXTRACT does not start with ^, so it does not have to match from the start of the event. It will just look for filterlog: and start extracting fields from there onwards.
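To illustrate the unanchored behaviour, here is a small Python sketch. The pattern is a hypothetical, heavily shortened stand-in for the EXTRACT regex (it only grabs the first comma-separated value), and the group name first_field is made up for this example.

```python
import re

# Hypothetical, shortened stand-in for the EXTRACT regex: it captures only the
# first comma-separated value after "filterlog:". The group name is made up.
EXTRACT = r"filterlog:\s(?P<first_field>[^,]*),"

# Shortened copies of the sample event, with and without the leading timestamp.
with_timestamp = "Jul 1 19:58:45 filterlog: 67,,,1509205722,igb1,match,pass,in,4"
without_timestamp = "filterlog: 67,,,1509205722,igb1,match,pass,in,4"

# re.search scans the whole string, so the match may begin anywhere in it.
first = re.search(EXTRACT, with_timestamp).group("first_field")
second = re.search(EXTRACT, without_timestamp).group("first_field")
print(first, second)  # 67 67
```

Both events yield the same value because nothing in the pattern ties it to the start of the event.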
If the SEDCMD tries to match a literal ., that will never match the sample event you shared, so the SEDCMD won't do anything with the event as you posted it here. Have you compared the raw events going into Splunk with the raw events you see in Splunk? That should tell you what the SEDCMD actually does (if anything at all).
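A quick way to convince yourself of this is the Python sketch below, again using re as an assumed stand-in for Splunk's regex engine. The second input, with a dotted host name after the timestamp, is invented purely to show a line the escaped pattern would match; it is not taken from any real pfSense log.

```python
import re

# The corrected SEDCMD pattern, with the period escaped.
ESCAPED = r"^(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s)+\S+\.\S+\s+"

# Sample event from the thread: the token after the timestamp has no period.
event = ("Jul 1 19:58:45 filterlog: 67,,,1509205722,igb1,match,pass,in,4,0x0,,"
         "64,43017,0,none,17,udp,56,192.168.X.X,X.X.X.X,56393,53,36")
unchanged = re.sub(ESCAPED, r"\1", event)
print(unchanged == event)  # True: no match, so nothing is replaced

# Made-up line (hypothetical host name) where the token does contain a period.
invented = "Jul 1 19:58:45 fw.example.lan filterlog: 67,,,1509205722"
stripped = re.sub(ESCAPED, r"\1", invented)
print(stripped)  # Jul 1 19:58:45 filterlog: 67,,,1509205722
```

The periods in the IP addresses further along do not help the pattern, because \S+ cannot cross the space after filterlog:.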
Under what sourcetype is the SEDCMD placed?
You are absolutely right, the SEDCMD is not doing anything, I commented it out and restarted.
The SEDCMD is placed in the props.conf file under [pfsense], the first section, just after the transforms file is called to parse out filterlog and change the sourcetype to pfsense:filterlog. I thought the SEDCMD was being called right after that to cut out everything in front of filterlog so the extract would work. I did not realize the extract would work without the event starting with filterlog.
Thanks for the help!
The SEDCMD might still do something. It would actually kick in before the TRANSFORMS bit. I don't know if you only looked at the raw events in Splunk, or also checked the raw logs before they entered Splunk?
Could also be that your events are already OK and the SEDCMD is just there for a different pfsense version or so that needed some preprocessing.
Correct, prior to pfSense 2.2 the logging was different. Starting with pfSense 2.2 they went with comma-delimited values. Again, I appreciate all the help!