I have tried everything I could think of in Splunk, but I couldn't get it to index only part of the file.
For example:
2015/11/30 19:00:00 ad32ah req:connection srv:vm1pskndx3 method HTTPS txnid:986218312825 and from here 20 to 30 lines of event.
I need to index only ad32ah, vm1pskndx3, HTTPS and 986218312825. I only need these 4 fields and want to ignore the rest of the event, and this should be repeated for all the events.
I understand I should do something in props.conf or transforms.conf.
Please advise!
Use SEDCMD in your props.conf, as described in the docs excerpt below (http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf):
SEDCMD-<class> = <sed script>
* Only used at index time.
* Commonly used to anonymize incoming data at index time, such as credit card or social
  security numbers. For more information, search the online documentation for "anonymize
  data."
* Used to specify a sed script which Splunk applies to the _raw field.
* A sed script is a space-separated list of sed commands. Currently the following subset of
  sed commands is supported:
    * replace (s) and character substitution (y).
* Syntax:
    * replace - s/regex/replacement/flags
        * regex is a perl regular expression (optionally containing capturing groups).
        * replacement is a string to replace the regex match. Use \n for back
          references, where "n" is a single digit.
        * flags can be either: g to replace all matches, or a number to replace a
          specified match.
    * substitute - y/string1/string2/
        * substitutes the string1[i] with string2[i]
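To make the replace syntax above concrete, here is a minimal sketch of the anonymizing use case the docs mention. The stanza name my_sourcetype and the class name mask_card are made up for illustration, and the pattern assumes card numbers written as four dash-separated groups of four digits; adjust it to your own data.

```ini
# props.conf (sketch; [my_sourcetype] and mask_card are hypothetical names)
[my_sourcetype]
# Mask everything except the last two digits of a 16-digit, dash-separated number.
# \1 is the back reference to the captured last two digits; g replaces every match.
SEDCMD-mask_card = s/\d{4}-\d{4}-\d{4}-\d{2}(\d{2})/xxxx-xxxx-xxxx-xx\1/g
```

With this pattern, a value such as 1335-1235-1531-1353 would be rewritten in _raw as xxxx-xxxx-xxxx-xx53.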
Be aware that this approach has 2 drawbacks:
1: Any index-time fields created by INDEXED_EXTRACTIONS (and maybe in other ways, too) will still hold the original, unmodified values, which proves that:
2: Even though you will be "decluttering" your raw events, you will still be metered against your license for the pre-modified size.
How do I reduce my license usage? Please help.
But that is for anonymizing the incoming data.
For example:
1335-1235-1531-1353 is a card number and needs to be indexed in a masked format, xxxx-xxxx-xxxx-xx53.
Please correct me if I'm wrong.
Basically this is true: the doc talks about anonymizing data using a sed script, and what the script does in that example is match a pattern and replace it.
You should do the same, but replace the match with nothing. You can try out the effect using the data onboarding wizard (Add Data).
It would be something like this in props.conf:
SEDCMD-nullDataIDontWant = s/req:connection//g
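
Going one step further, the same mechanism could keep only the four values from the question by capturing them and replacing the whole event with the back references. This is only a sketch: the stanza name my_sourcetype and the class name keepFourFields are placeholders, and the regex assumes every event starts exactly like the sample line (date, time, id, req:, srv:, method, txnid:). Test it in the Add Data preview before relying on it.

```ini
# props.conf (sketch; [my_sourcetype] and keepFourFields are hypothetical names)
[my_sourcetype]
# (?s) lets . match newlines, so the trailing .* swallows the remaining
# 20 to 30 lines of a multi-line event.
# Captures: \1 = id (ad32ah), \2 = srv (vm1pskndx3), \3 = method (HTTPS), \4 = txnid
SEDCMD-keepFourFields = s/(?s)^\S+ \S+ (\S+) req:\S+ srv:(\S+) method (\S+) txnid:(\d+).*$/\1 \2 \3 \4/
```

For the sample event, _raw would become "ad32ah vm1pskndx3 HTTPS 986218312825". Events that do not match the pattern are left unchanged, so a stricter or looser regex may be needed depending on how uniform the data is.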