Splunk Search

Can I scrub IP only?

rstanonik
Engager

I've been tasked with providing Apache logs to a third party for analysis, but the IP addresses must be replaced to hide the visitors' identities.
It sounds like a simple Splunk job: run the search, pipe it through scrub, then export.
But scrub anonymizes more than just the IPs; it also scrubs the URLs.
I'm looking for alternatives that scrub only the IPs.
Alternatively, I'm considering hacking scrub so that it affects only IPs.
Any thoughts?
Thanks.

kpkeimig
Path Finder

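After a lot of reading and too many attempts, I think renaming the fields is the best option. In the example below, src is the field that holds the IP address. Every other field is renamed with a leading underscore so that scrub ignores it, then renamed back afterwards. This behavior is undocumented.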

| rename * AS _*
| rename _src AS src
| scrub
| rename _* AS *

(It would be nice if scrub accepted a list of fields as an option. It appears you can do this through config files, but getting that done on Splunk Cloud would be $#%^py. Please upvote the idea.)

d3
Explorer

You can use a rename within your search to temporarily hide what you don't want scrubbed, then rename it again after the scrub but before the results are presented. The example below is something we've come up with to scrub a firewall IPS log. The search looks for the device (FG is a FortiGate) and message type (ips). The "ref" is a reference to a real URL from the vendor website. We rename the ref to _ref which gets ignored by scrub, then rename _ref to ref and build the report table.

device_id="FG*" type="ips" | stats count by msg devname ref | rename ref as _ref | scrub | sort 10 -count | rename _ref as ref | table msg devname ref

MarioM
Motivator

Do you run it on the IP field? | scrub ipfieldname

chris
Motivator

One way of getting a result, although it is not very elegant:

  1. Export the events
  2. Save a list of the IP addresses (append '| dedup ip-field | table ip-field' to your search)
  3. Run $SPLUNK_HOME/bin/splunk anonymize file -source /tmp/events.txt -private-terms /opt/splunk/etc/anonymizer/ip-list.txt -public-terms /tmp/events.txt

The public terms won't get replaced (which is everything, since the events file itself is used as the public-terms list). The private terms do get replaced, and they seem to take priority over the public terms.

I got the idea from here:
http://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/AnonymizedatasamplestosendtoSuppo...

Not very nice and not very efficient, I know, but it worked ...
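
For anyone who would rather post-process the exported file outside Splunk, the three steps above can be approximated in plain Python. This is only a minimal sketch and not what splunk anonymize does internally: the function names, the host-0001 placeholder format, and the sample events are all made up for illustration. It assumes you already have the exported events as text and the deduplicated IP list from step 2.

```python
import re

def build_mapping(ips):
    """Give each distinct IP a stable placeholder (hypothetical format: host-0001)."""
    mapping = {}
    for ip in ips:
        if ip not in mapping:
            mapping[ip] = f"host-{len(mapping) + 1:04d}"
    return mapping

def replace_ips(text, mapping):
    """Replace every listed IP in text; everything else is left untouched."""
    if not mapping:
        return text
    alternatives = "|".join(re.escape(ip) for ip in mapping)
    # The lookarounds stop 192.0.2.1 from matching inside 192.0.2.12.
    pattern = re.compile(r"(?<![0-9.])(" + alternatives + r")(?![0-9])")
    return pattern.sub(lambda m: mapping[m.group(1)], text)

# Made-up sample standing in for the exported events and the IP list.
events = (
    '192.0.2.1 - - "GET /a HTTP/1.1" 200\n'
    '192.0.2.12 - - "GET /b HTTP/1.1" 404\n'
    '192.0.2.1 - - "GET /c HTTP/1.1" 200\n'
)
ip_list = ["192.0.2.1", "192.0.2.12"]

mapping = build_mapping(ip_list)
scrubbed = replace_ips(events, mapping)
print(scrubbed)
```

Because the same IP always maps to the same placeholder, the third party can still count requests per visitor. The mapping itself reverses the anonymization, so it should stay with you.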

dwaddle
SplunkTrust

A better choice may be to use rex in sed mode. This is a rough stab at a sed expression that does some (not optimal) scrubbing: it masks the first three octets of each IP and keeps the last one.

... | rex mode=sed "s/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.([0-9]{1,3})/xxx.yyy.zzz.\2/g"
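
To see what that expression does before running it against real data, here is a small Python sketch that applies the same pattern to a made-up Apache log line. Python's re module stands in for Splunk's sed mode here, so treat it as an approximation. The sample line and the mask_ips name are assumptions for illustration only.

```python
import re

# Same pattern as the sed expression above: group 1 is the first three
# octets, group 2 is the last octet, which is kept in the output.
IP_PATTERN = re.compile(r"([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.([0-9]{1,3})")

def mask_ips(line):
    """Replace the first three octets of anything that looks like an IPv4 address."""
    return IP_PATTERN.sub(r"xxx.yyy.zzz.\2", line)

# Made-up sample event in Apache common log format.
sample = '203.0.113.45 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
masked = mask_ips(sample)
print(masked)

# Caveat: any four dot-separated numbers match, not only client IPs.
print(mask_ips("Chrome/120.0.0.0"))
```

Two things to keep in mind with this approach. Keeping the last octet means unrelated addresses that share it become indistinguishable. And the pattern will also rewrite four-part version numbers, for example in user-agent strings, so restricting the replacement to the IP field may be safer if your events have one extracted.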