Can I scrub IP only?

rstanonik
Engager

I'm tasked to provide apache logs to a third party for their analysis, but the IPs must be replaced to hide the browsers' identity.
Sounds like a simple Splunk job: select, pipe through scrub, then export.
But the scrub command scrubs more than just the IPs; it also scrubs the URLs.
I'm hunting for alternatives that will only scrub IPs.
Alternatively, I'm considering hacking scrub to only affect IPs.
Any thoughts?
Thanks.

kpkeimig
Path Finder

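After lots of reading and too many attempts, renaming the fields is the best option, IMO. In the example below, src is the field that holds the IP address: every field is renamed with a leading underscore so scrub skips it, src is renamed back so it is the only field scrubbed, and the original names are restored afterwards. This is undocumented.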

| rename * AS _*
| rename _src AS src
| scrub
| rename _* AS *

(It would be nice if scrub took a field list as an option. It appears you can do this through config files, but getting that done on Splunk Cloud would be $#%^py. Please upvote the idea.)

d3
Explorer

You can use a rename within your search to temporarily hide what you don't want scrubbed, then rename it again after the scrub but before the results are presented. The example below is something we've come up with to scrub a firewall IPS log. The search looks for the device (FG is a FortiGate) and message type (ips). The "ref" is a reference to a real URL from the vendor website. We rename the ref to _ref which gets ignored by scrub, then rename _ref to ref and build the report table.

device_id="FG*" type="ips" | stats count by msg devname ref | rename ref as _ref | scrub | sort 10 -count | rename _ref as ref | table msg devname ref

MarioM
Motivator

Do you run it on the IP field? | scrub ipfieldname

chris
Motivator

One way of getting a result although it is not very elegant:

  1. Export the events
  2. Save a list of the IP addresses (append '|dedup ip-field| table ip-field' to your search)
  3. Run $SPLUNK_HOME/bin/splunk anonymize file -source /tmp/events.txt -private-terms /opt/splunk/etc/anonymizer/ip-list.txt -public-terms /tmp/events.txt

The public terms won't get replaced (and that is everything, since the events file itself is used as the public-terms list). The private terms do get replaced, and they seem to have a higher priority.

I got the idea from here:
http://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/AnonymizedatasamplestosendtoSuppo...

Not very nice, not very efficient I know, but it worked ...
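
If the events are already exported, the IP list in step 2 could also be built outside Splunk. The following is only a rough sketch in Python, not part of the original procedure: the function names are made up for illustration, the regex matches anything that merely looks like an IPv4 address, and it assumes the private-terms file takes one term per line.

```python
import re
from pathlib import Path

# Loose IPv4 matcher: four dot-separated groups of 1-3 digits.
# It does not check that each octet is <= 255.
IPV4 = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")


def collect_ips(text: str) -> list[str]:
    """Return the unique IPv4-looking strings in text, in first-seen order."""
    return list(dict.fromkeys(IPV4.findall(text)))


def write_private_terms(events_path: str, terms_path: str) -> int:
    """Write one IP per line to terms_path (assumed format) and return the count."""
    events = Path(events_path).read_text(encoding="utf-8", errors="replace")
    ips = collect_ips(events)
    Path(terms_path).write_text("\n".join(ips) + "\n", encoding="utf-8")
    return len(ips)


# Hypothetical usage, with the paths from step 3:
# write_private_terms("/tmp/events.txt", "/opt/splunk/etc/anonymizer/ip-list.txt")
```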

dwaddle
SplunkTrust

A better choice may be to use rex in sed mode. This is a rough stab at a sed expression to do some (not optimal) scrubbing: it masks the first three octets of each IP and keeps the last one.

... | rex mode=sed "s/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.([0-9]{1,3})/xxx.yyy.zzz.\2/g"
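
For logs that have already been exported, the same substitution can be tried outside Splunk. This is a minimal sketch in Python using the same pattern as the sed expression; mask_ips is a hypothetical helper name, and like the original it is not optimal: any four dot-separated numbers (a version string such as 1.2.3.4, for example) get masked too.

```python
import re

# Same pattern as the sed expression: three octets, a dot, then the last octet.
IP_PATTERN = re.compile(r"([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.([0-9]{1,3})")


def mask_ips(line: str) -> str:
    """Mask the first three octets of anything that looks like an IPv4 address."""
    return IP_PATTERN.sub(r"xxx.yyy.zzz.\2", line)


if __name__ == "__main__":
    sample = '192.168.10.25 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
    # The URL and the HTTP version are left alone; only the client IP changes.
    print(mask_ips(sample))
```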