Hello,
I have the message field of a Windows event, which contains data delimited by ':'. Is there any way to split the message data into key-value (KV) pairs? The field names are not consistent (so I don't know them in advance), and neither is the number of fields.
Example:
Audit event: event_time:2019-06-12 12:24:21.8542963 sequence_number:1 action_id:CR succeeded:true is_column_permission:false session_id:59 server_principal_id:2(...)
desired result (fieldname)->(value):
event_time->2019-06-12 12:24:21.8542963
sequence_number->1
action_id->CR
succeeded->true
is_column_permission->false
session_id->59
server_principal_id->2
Preferably at search time, but index time would also be acceptable.
Do you already have the Splunk Add-on for Windows (https://splunkbase.splunk.com/app/742/) installed, and is this particular event not compatible with it? Normally that add-on takes care of extracting the key: value pairs from the message field.
I guess something like this should work (REGEX might require tuning for your actual data, not sure if your example covers all possible cases). See also: https://regex101.com/r/FQFfQQ/1
props.conf
[yoursourcetype]
REPORT-ZZcustom_msg_kv = custom_msg_kv
transforms.conf
[custom_msg_kv]
SOURCE_KEY = message
REGEX = ([a-zA-Z]\w+):(.*?)(?=\s+[a-zA-Z]\w+:|$)
FORMAT = $1::$2
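As a quick sanity check outside Splunk, the pattern above can be exercised against the sample line from the question. This is only a sketch: it uses Python's re module rather than Splunk's own regex engine (the two should agree on this particular pattern, but that is an assumption), the function name extract_kv is made up for illustration, and the sample is cut off where the original is elided with "(...)". Note that the "event:" header also matches, with an empty value, which the sketch drops.

```python
import re

# Same pattern as in transforms.conf above: a key starting with a letter,
# a colon, then a lazy value that stops before the next " key:" or at end of line.
KV_PATTERN = re.compile(r"([a-zA-Z]\w+):(.*?)(?=\s+[a-zA-Z]\w+:|$)")


def extract_kv(message):
    """Return a dict of key -> value pairs found in message (hypothetical helper)."""
    pairs = {}
    for key, value in KV_PATTERN.findall(message):
        # "event:" from the "Audit event:" header matches with an empty value; skip it.
        if value:
            pairs[key] = value
    return pairs


# Sample from the question, trimmed before the elided "(...)".
sample = (
    "Audit event: event_time:2019-06-12 12:24:21.8542963 sequence_number:1 "
    "action_id:CR succeeded:true is_column_permission:false session_id:59 "
    "server_principal_id:2"
)

fields = extract_kv(sample)
for name, value in fields.items():
    print(f"{name}->{value}")
```

The timestamp survives intact because the lookahead only stops at a token that starts with a letter and ends with a colon, so "12:24:21" is not mistaken for a new key.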
I decided to accept the solution because, as a general idea, it works. Unfortunately the message field is not standardized, so I don't think there is a perfect regex that will cover all cases.
The TA extracts specific parts of the Message on a case-by-case basis and doesn't tokenize everything.
I tested with some adjustments to the regex (Audit event:\s([a-zA-Z]\w+):(.*?)(?=\s+[a-zA-Z]\w+:|$)\s?) and to some of the syntax, but it doesn't work even for one KV pair.
Btw: the example is real, at least to some extent.
Did you also test without making changes to the regex I suggested? To enable it to extract all the fields, you should definitely not include that "Audit event:" part.
Is the message field perhaps extracted as Message (capital M) instead of message?
Because as you can see in the regex101 link in my answer, the regex itself works fine with the data you shared. So either that sample data is not representative, or something is not working perfectly in the setup of the REPORT extraction (e.g. wrong SOURCE_KEY).
Can you perhaps share the config exactly as you tested it (to make sure there are no mistakes in it)? And perhaps a screenshot of what the message field looks like in Splunk (feel free to mask sensitive info in it). Also: where did you deploy the tested config (it should be on the search head(s))?
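To illustrate the point about the "Audit event:" prefix: a rough check in Python (an approximation of Splunk's regex engine, with variable names chosen only for this example, and the sample trimmed before the elided "(...)") suggests the modified pattern can match only once per event, because the literal header occurs only once. That would explain getting at most one pair, though not zero, so the config itself is probably still worth checking.

```python
import re

# Sample from the question, trimmed before the elided "(...)".
sample = (
    "Audit event: event_time:2019-06-12 12:24:21.8542963 sequence_number:1 "
    "action_id:CR succeeded:true is_column_permission:false session_id:59 "
    "server_principal_id:2"
)

# The modified pattern from the follow-up, with the literal header in front.
modified = re.compile(r"Audit event:\s([a-zA-Z]\w+):(.*?)(?=\s+[a-zA-Z]\w+:|$)\s?")

# Every match must start with "Audit event:", which appears once,
# so only the first key/value pair can ever be captured.
matches = modified.findall(sample)
print(matches)
```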
The "Audit event:" part is some kind of header, so I don't actually need it. That is the reason I changed the regex.
True, I saw it, and I also fixed the FORMAT= line (it was complaining).
I tested both before and after the change. I changed the regex a bit so that it includes the space at the end, just in case... I don't know, it didn't like it.
Also, for the sake of testing (as I have seen this matter before), I closed browsers, restarted Splunk, made the changes from the GUI, etc.
I will try to get some masked data and yes, the config is on the SH only.
Right, the missing = was indeed a typo, fixed that in my answer post as well, thanks for highlighting that.
Hi, no, these are MSSQL logs and the message contains all the information that is useful for ES. The TA does some general splitting. I initially tried with rex as a test, but it was rather difficult to go through all the "sub"fields. I will have a look at yours, thx!