Splunk Search

Randomizing text?

the_wolverine
Champion

I'm looking to obscure data by randomizing text. Does anyone have a simple way to do this against a field in Splunk? Let's assume that I'm doing this to export the data sample. I could eliminate the field altogether, but would like a randomized placeholder, vs just using eval of a fixed value or using random() which is numeric.


tscroggins
Champion

Similar to @woodcock's answer, Splunk 8.1 adds the mvmap function to iterate over multi-valued field values. This makes it easy to replace characters in a string:

| makeresults
| eval value="randomize this"
| eval mask=" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
| eval random_value=mvjoin(mvmap(split(value, ""), substr(mask, random()%len(mask)+1, 1)), "")

The replace function and rex command can also be used to mask values, but the replacement value is only evaluated once, so every character is replaced with the same value:

| makeresults
| eval value="randomize this"
| eval masked_value=replace(value, ".", "*")

| makeresults
| eval value="randomize this"
| rex field=value mode=sed "s/./*/g"

Hash functions, lookup tables, and other methods are also useful, depending on why you want to randomize, mask, or deidentify your data.
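
As a rough sketch of the hash option: if the same input should always map to the same placeholder (for example, so that exported events can still be correlated), a salted hash may fit better than per-character randomization. The salt value and the field names here are assumptions for illustration only; sha256 returns a 64-character hexadecimal string regardless of the input length.

```spl
| makeresults
| eval value="randomize this"
| eval salt="example-salt-change-me"
| eval hashed_value=sha256(salt . value)
| fields - salt
```

Keeping the salt out of the exported data makes it harder to recover short or predictable values by hashing guesses.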


894859
Explorer

Another way to generate a random string is to define a few character sets with eval and then concatenate randomly chosen characters from them. As an example, this creates a random 8-character alphanumeric string that includes a special character:

| makeresults
| eval
rand_lower="abcdefghijklmnopqrstuvwxyz"
,rand_upper=upper("abcdefghijklmnopqrstuvwxyz")
,rand_number="1234567890"
,rand_special="!#$%*"
,len_lower=len(rand_lower)
,len_upper=len(rand_upper)
,len_number=len(rand_number)
,len_special=len(rand_special)
,RandomString=substr(rand_upper, (random() % len_upper) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1).substr(rand_special, (random() % len_special) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1).substr(rand_number, (random() % len_number) + 1, 1).substr(rand_upper, (random() % len_upper) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1)
| fields - rand_* len_*


somesoni2
Revered Legend

How about this? It samples the data by keeping every 50th event:

your base search | streamstats count as sno | where sno%50=0 | table whateverfield
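
The search above keeps every 50th event, so the sample is evenly spaced. If the sample should be random instead, one possible variant (a sketch that reuses the same placeholder base search and field name) filters on random(), which keeps roughly 1 event in 50:

```spl
your base search | where random() % 50 = 0 | table whateverfield
```

The number of events returned varies from run to run, because each event is kept or dropped independently.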

woodcock
Esteemed Legend

Does it have to be random?

Start with this:

| makeresults
| eval fieldToObscure="This is organized text of length=35"
| eval len=len(fieldToObscure)

Then try this:

| eval base_string="12345678901234567890123456789012345678901234567890123456789012345678901234567890"
| eval fieldToObscure=substr(base_string, 1, len)

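Or this, which is not random but probably looks random enough to work. Note that md5 returns 32 hexadecimal characters, so the result is never longer than 32 characters:

| eval fieldToObscure=substr(md5(fieldToObscure), 1, len)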