Randomizing text?

the_wolverine
Champion

I'm looking to obscure data by randomizing text. Does anyone have a simple way to do this against a field in Splunk? Let's assume that I'm doing this to export the data sample. I could eliminate the field altogether, but would like a randomized placeholder, vs just using eval of a fixed value or using random() which is numeric.

tscroggins
Champion

Similar to @woodcock's answer, Splunk 8.1 adds the mvmap function to iterate over multi-valued field values. This makes it easy to replace characters in a string:

| makeresults
| eval value="randomize this"
| eval mask=" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
| eval random_value=mvjoin(mvmap(split(value, ""), substr(mask, random()%len(mask)+1, 1)), "")
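
If the masked value should keep its word boundaries, the same mvmap approach can pass spaces through unchanged. This is a minimal sketch rather than a tested recipe: chars is a hypothetical helper field so that the mvmap expression can refer to the current character by name, the shorter alphanumeric mask is only for illustration, and 1 is added to the random offset on the assumption that substr positions start at 1.

```
| makeresults
| eval value="randomize this"
| eval mask="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
| eval chars=split(value, "")
| eval random_value=mvjoin(mvmap(chars, if(chars==" ", " ", substr(mask, random()%len(mask)+1, 1))), "")
| fields - chars mask
```

The result has the same length as the input, with spaces in the same positions.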

The replace function and rex command can also be used to mask values, but the replacement value is only evaluated once:

| makeresults
| eval value="randomize this"
| eval masked_value=replace(value, ".", "*")

| makeresults
| eval value="randomize this"
| rex field=value mode=sed "s/./*/g"

Hash functions, lookup tables, and other methods are also useful, depending on why you want to randomize, mask, or deidentify your data.
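
For example, when the same input should always map to the same placeholder (so that exported events can still be correlated), a salted hash is one option. This is a sketch only: the salt field and its value are made-up placeholders and should be replaced with a secret of your own.

```
| makeresults
| eval value="randomize this"
| eval salt="replace-with-your-own-secret"
| eval hashed_value=sha256(salt.value)
| fields - salt value
```

Unlike the random mask, the hash is deterministic. Without a salt, short or predictable values could be recovered by hashing likely candidates and comparing the results.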

894859
Explorer

Another way to generate a random string is to define a few character sets with eval and then concatenate randomly chosen characters from them. As an example, this creates a random 8-character alphanumeric string that includes a special character.

| makeresults
| eval
rand_lower="abcdefghijklmnopqrstuvwxyz"
,rand_upper=upper("abcdefghijklmnopqrstuvwxyz")
,rand_number="1234567890"
,rand_special="!#$%*"
,len_lower=len(rand_lower)
,len_upper=len(rand_upper)
,len_number=len(rand_number)
,len_special=len(rand_special)
,RandomString=substr(rand_upper, (random() % len_upper) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1).substr(rand_special, (random() % len_special) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1).substr(rand_number, (random() % len_number) + 1, 1).substr(rand_upper, (random() % len_upper) + 1, 1).substr(rand_lower, (random() % len_lower) + 1, 1)
| fields - rand_* len_*

somesoni2
Revered Legend

How about this? It samples the results by keeping every 50th event:

your base search | streamstats count as sno | where sno%50=0 | table whateverfield
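
The where clause above keeps every 50th event, so the sample is evenly spaced rather than random. If exactly 50 randomly chosen events are wanted, one possible variation is to tag each event with random() and keep the 50 events with the smallest values. This is a sketch: rand is a hypothetical helper field, and the base search and field name are the same placeholders as above.

```
your base search | eval rand=random() | sort 50 rand | table whateverfield
```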

woodcock
Esteemed Legend

Does it have to be random?

Start with this:

| makeresults
| eval fieldToObscure="This is organized text of length=35"
| eval len=len(fieldToObscure)

Then try this:

| eval base_string="12345678901234567890123456789012345678901234567890123456789012345678901234567890"
| eval fieldToObscure=substr(base_string, 1, len)

Non-random, but it probably looks random enough to work:

| eval fieldToObscure=substr(md5(fieldToObscure), 1, len)
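
One caveat: md5 returns 32 hexadecimal characters, so the 35-character sample value above comes back shorter than len. A possible workaround, sketched here on the assumption that substr positions start at 1, is to concatenate several digests before trimming:

```
| makeresults
| eval fieldToObscure="This is organized text of length=35"
| eval len=len(fieldToObscure)
| eval fieldToObscure=substr(md5(fieldToObscure).sha1(fieldToObscure).sha256(fieldToObscure), 1, len)
```

This covers values up to 136 characters (32 + 40 + 64); longer values would still be truncated.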