I have a CSV file coming in, and we want to replace whatever is in the name field with "XXXX".
Sample events:
"2019-04-16 15:02:42",,22290412_163115_00725.pdf,111111,,,,,--------Please Select Member --------, 0000, 000,AlertID-000000,AlertID-000000,AlertID-000000,Success,"Get New File",1,Prod,UserName,COMPANYNAME,WSH4109162,Pega Robotics,8.0.2009,CareAlerts Provider Feedback,1.38,
"2019-04-17 11:43:15",123470044,20190415_115516_00257.pdf,4000146,Provider,123612,General Feedback , 123456789,Jane L Doe , 0000, 000,123758-100000,123233-100000,AlertID-000000,Failed,"General Feedback : AlertID not found or not enabled.",13,Prod,username,CompanyName,WSWH4051106,Pega Robotics,8.0.2009,CareAlerts Provider Feedback,1.38,
In these events, the following strings should be replaced with XXXX:
--------Please Select Member --------
Jane L Doe
I got this extraction from Splunk for the field, but it does not work in SEDCMD:
SEDCMD-CSV1 = s/(?ms)^(?:[^,\\n]*,){8}(?P<MemberTest>[^,]+)/XXXX/g
Edit: inserted an image.
Hi,
Please try the config below.
props.conf
[yoursourcetype]
SEDCMD-abc=s/^((?:[^,]*[,]){8})(?:[^,]*)/\1XXXX/g
You have an extra backslash (\\n should be \n), so try this:
SEDCMD-CSV1 = s/(?ms)^(?:[^,\n]*,){8}(?P<MemberTest>[^,]+)/XXXX/g
I tried your SEDCMD-abc string, and it sort of works but behaves oddly. See the image I added to the original question.
It replaces the desired text in _raw, but the field value is still present for the data that was replaced with XXXX.
Also, it replaces a second set of data 8 more commas down the line.
The first highlighted XXXX with the red circle is what we wanted, the second we did not. Also, MemberName= clearly still shows the value of what is now XXXX.
To fix the first problem, so that the later value is not replaced, remove the trailing g: SEDCMD-abc=s/^((?:[^,]*[,]){8})(?:[^,]*)/\1XXXX/
However, it looks like you are using INDEXED_EXTRACTIONS = csv or sourcetype = csv, and because of that only _raw is modified, not the indexed fields.
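As a quick sanity check of what the non-global expression does to one event, here is a minimal Python sketch. It assumes Python's re module treats this pattern the same way Splunk's engine does, which is an assumption, and it does not try to model how Splunk applies the g flag. The function name mask_ninth_field is made up for illustration.

```python
import re

# Same pattern as the SEDCMD above, without the trailing g flag:
# group 1 captures the first eight comma-terminated fields, and the
# non-capturing group that follows consumes the ninth field.
NINTH_FIELD = re.compile(r'^((?:[^,]*[,]){8})(?:[^,]*)')


def mask_ninth_field(line: str, mask: str = "XXXX") -> str:
    """Replace the ninth comma-separated field of one CSV line with mask."""
    # count=1 mirrors dropping the g flag: only the first match is rewritten.
    return NINTH_FIELD.sub(lambda m: m.group(1) + mask, line, count=1)


# Shortened copy of the second sample event from the question.
sample = ('"2019-04-17 11:43:15",123470044,20190415_115516_00257.pdf,4000146,'
          'Provider,123612,General Feedback , 123456789,Jane L Doe , 0000, 000')

masked = mask_ninth_field(sample)
print(masked)
```

The pattern simply counts commas, so it only lines up with the name column as long as none of the first eight fields contains a comma inside quotes.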
In my lab environment I tested the config below on a standalone Splunk instance, and it masks the data correctly (raw data as well as indexed fields).
props.conf (you might not need INDEXED_EXTRACTIONS = CSV in the config below on Splunk Enterprise if you already set it on the Universal Forwarder)
[yoursourcetype]
SEDCMD-abc=s/^((?:[^,]*[,]){8})(?:[^,]*)/\1XXXX/
INDEXED_EXTRACTIONS = CSV
TRANSFORMS-test = remove_member
transforms.conf
[remove_member]
REGEX = (?m)^(.*Member\:\:)(?:\"[^\"]*\"|[^\s]*)(\s.*)
FORMAT = $1XXXX$2
WRITE_META = false
SOURCE_KEY = _meta
DEST_KEY = _meta
Note that, according to the Splunk documentation, using DEST_KEY = _meta is not recommended:
If DEST_KEY = _meta (not recommended) you should also add $0 to the
start of your FORMAT setting. $0 represents the DEST_KEY value before
Splunk software performs the REGEX (in other words, _meta).
Thanks for your help, but the [remove_member] is still not working as expected.
I've applied this to the app on the UF (\etc\appname\local) and to my indexers (../splunk/etc/system/local), in props.conf and transforms.conf. Which should it be? The _raw gets redacted as expected, but the extracted field "MemberName" still comes through with the original value.
Is this because the replacement regex is (.*Member::) not (.*MemberName::)?
Yes, correct. I used Member as the field name; if yours is MemberName, replace Member with MemberName in REGEX.
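The field-name mismatch can be reproduced with a small Python sketch. The _meta string below is hypothetical, and mask_field is an illustrative helper, not part of Splunk.

```python
import re


def mask_field(meta: str, field: str, mask: str = "XXXX") -> str:
    """Mask the value of one key in a _meta-style 'key::value' string."""
    pattern = re.compile(
        r'(?m)^(.*' + re.escape(field) + r'::)(?:"[^"]*"|[^\s]*)(\s.*)'
    )
    return pattern.sub(lambda m: m.group(1) + mask + m.group(2), meta)


# Hypothetical _meta content using the MemberName header from the CSV.
meta = 'Status::Failed MemberName::"Jane L Doe " Environment::Prod'

unchanged = mask_field(meta, "Member")    # "Member::" never appears, so no match
masked = mask_field(meta, "MemberName")   # key matches, value is rewritten
print(unchanged)
print(masked)
```

The key in the expression has to match the CSV header exactly, because the literal text followed by :: is what anchors the match.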
OK, that got it. Thanks!
To summarize for posterity:
I had to upgrade from 6.5.2 to 7.2.3 for proper field extraction from the CSV.
The files below are located in /etc/appname/local on the UF, deployed from the deployment server.
inputs.conf
[monitor://E:\CareAlerts_Fax_Prod\Reporting\*csv$]
disabled = 0
sourcetype=cafax:prod
ignoreOlderThan = 30d
index = application
crcSalt = <SOURCE>
props.conf
[cafax:prod]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true
SEDCMD-mname=s/^((?:[^,]*[,]){8})(?:[^,]*)/\1XX-REDACTED-XX/
TRANSFORMS-test = remove_membername
transforms.conf
[remove_membername]
REGEX = (?m)^(.*MemberName\:\:)(?:\"[^\"]*\"|[^\s]*)(\s.*)
FORMAT = $1XX-REDACTED-XX$2
WRITE_META = false
SOURCE_KEY = _meta
DEST_KEY = _meta