Can Splunk filter data at the field level before indexing?
Field level means that we want to remove some fields from an event before indexing.
From what I know, a heavy forwarder can filter data, but only at the “event level”, meaning we can filter out all events of a specific type. We can't filter out only some fields.
I'm a newbie in Splunk and this is my first question. Hopefully it helps others too 🙂
Yes, logs can most certainly be filtered before indexing, just as you mention. However, the filtering is not based off an extracted field, simply because the fields are not yet extracted.
The solution is to create a similar regex extraction as the one being performed at search time for most field extraction, and then modify the extracted data prior to indexing. It sounds more complicated than it is, but you need to have some grasp of regex syntax. See the example below, where parts of session_id's are being replaced with ####. You could create a regex that captures your desired field=field_value and replace it with nothing.
Hope this helps,
K
For better help, always post a few sample events.
The easiest way to achieve this would be a SEDCMD.
See http://docs.splunk.com/Documentation/Splunk/5.0.4/admin/Propsconf for how to configure an SEDCMD. You can simply replace the parts you want to remove with "nothing".
E.g.:
props.conf:
[xxxxtesfilterxxxxxx]
SEDCMD-test = s/Domain=EPC-SubscriberId=[^,]+,//g
SEDCMD-test2 = s/EPC-SubscriberId=[^,]+,//g
This is untested. Please test it in a dev environment before rolling it out to production.
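Since the SEDCMD lines are untested, one cheap check before touching a forwarder is to run the same two patterns against a sample event outside Splunk. The sketch below does that with Python's `re` module, which only approximates Splunk's sed-style replacement. The event text is the sample body posted in this thread; the helper name `apply_sedcmds` is made up for illustration, and applying the patterns in the order listed is an assumption.

```python
import re

# Sample event body, taken from the sample events posted in this thread.
event = (
    "Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,"
    "EPC-SubscriberId=ValueEPC,NextField=ValueField"
)

# The search halves of the two SEDCMD expressions (s/<pattern>//g).
# Assumption: they run in the order listed here.
PATTERNS = [
    r"Domain=EPC-SubscriberId=[^,]+,",
    r"EPC-SubscriberId=[^,]+,",
]


def apply_sedcmds(raw: str) -> str:
    """Hypothetical helper: replace every match of each pattern with nothing."""
    for pattern in PATTERNS:
        raw = re.sub(pattern, "", raw)
    return raw


cleaned = apply_sedcmds(event)
print(cleaned)
```

Note that the order matters in this approximation: if the second pattern ran first, it would consume the `EPC-SubscriberId=ValueDomain,` part and leave a dangling `Domain=` behind.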
Yes, but I don't just want to change the value of the field to null; I want to remove the field.
After I checked again, the filtering works! The license usage is smaller than the size of the log file.
Thank you very much for the help!
To make sure, I will try it on another case.
And yes, the field I want to remove is this "field" (I don't know how to highlight it) 🙂
XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013 Field1; Field2;
"Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,"
EPC-SubscriberId=ValueEPC,
"NextField=ValueField"
I take it that you have read the docs at http://docs.splunk.com/Documentation/Splunk/5.0.4/Data/Anonymizedatausingconfigurationfiles
From what I understand, you want to remove the highlighted portions:
XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013 Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC*,NextField=ValueField*
is that correct?
Sample events (I changed some values):
XXXXX: Tue Aug 27 13:25:11 2013, Host:úú Tue Aug 27 13:25:22 2013
Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC,NextField=ValueField
XXXXX: Tue Aug 27 13:25:12 2013, Host:úú Tue Aug 27 13:25:22 2013
Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC,NextField=ValueField
XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013
Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC,NextField=ValueField
Here is my props.conf
[xxxxtesfilterxxxxxx]
TRANSFORMS-anonymize = remove-fieldtes
My transforms.conf:
[remove-fieldtes]
REGEX=(?msi)^(.*?)(Domain=.*?)(EPC-SubscriberId.*?)(EPC-SubscriberId=.*?)(\,.*?)$
FORMAT=$1$4
DEST_KEY=_raw
Well, I have tried that and the filtering is still not working.
Actually, my concern is to reduce license usage by removing fields. The link you provided is generally used for masking data on the indexer.
So what I tried is editing props.conf and transforms.conf on the heavy forwarder to mask the field to a null value.
I still don't know why it is not working. I will try again later 🙂
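As a debugging aid for the transform above, it can help to see which text each capture group picks up, because `FORMAT=$1$4` writes only groups 1 and 4 back to `_raw`. The sketch below runs the posted REGEX through Python's `re` module against the single-line sample body. Python's engine is only an approximation of the PCRE engine Splunk uses, and this does not explain why the transform was not applied at all; it only shows what the expression would keep if it did match.

```python
import re

# Sample event body, taken from the sample events posted in this thread.
event = (
    "Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,"
    "EPC-SubscriberId=ValueEPC,NextField=ValueField"
)

# REGEX from the [remove-fieldtes] stanza, copied unchanged.
pattern = re.compile(
    r"(?msi)^(.*?)(Domain=.*?)(EPC-SubscriberId.*?)(EPC-SubscriberId=.*?)(\,.*?)$"
)

match = pattern.search(event)
groups = match.groups() if match else ()

# Emulate FORMAT=$1$4: keep only the first and fourth capture groups.
rewritten = match.group(1) + match.group(4) if match else event

for number, text in enumerate(groups, start=1):
    print(f"${number} = {text!r}")
print("rewritten:", rewritten)
```

In this approximation, group 4 is the second `EPC-SubscriberId=...` pair, so the rewritten event keeps that field and drops both `NextField` pairs, which is the opposite of removing the subscriber ID.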
If you want any help with the construction of the regexes, you will need to provide some sample events. Mask sensitive data as needed.
Good luck!
Thanks, I will try this first 🙂
Yes, you can filter events based on field values within the event. But, it sounds like you are trying to change the value of a field to null, while saving the rest of the event. Is that right?
See the following answer. You can use this same type of filtering through RegEx.
As lukejadamec says in his comment, luthfi seems to want to change the contents of (or remove altogether) certain fields in events, not remove the whole event based on a field value.