I have a requirement to mask the value of a field after 30 days.
The events are JSON events. The users need to be able to see/search all the fields except one for up to a year. That one field must be hidden from view after 30 days.
My plan was to define a calculated field that, when _time is more than 30 days ago, overwrites the value of the field with one I supply. The calculation would be performed for every search. What I failed to consider was two things:
First, the field to be overwritten is a JSON field named foo{}.id. If I use
|eval foo{}.id = if ((_time < (now() - (86400*30))), "TOO OLD", foo{}.id)
I get an error that the eval is malformed. If I add quotes around the field names, like this:
|eval "foo{}.id" = if ((_time < (now() - (86400*30))), "TOO OLD", "foo{}.id")
I get a new field called foo.id which equals TOO OLD, but I still have the original foo{}.id with the original value.
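(For what it's worth, the quoting problem by itself has a known fix: in eval, a field name containing special characters takes double quotes on the left side of the assignment, but single quotes when referenced as a value on the right; double quotes on the right are treated as a string literal. A sketch:

| eval "foo{}.id" = if ((_time < (now() - (86400*30))), "TOO OLD", 'foo{}.id')

This addresses the malformed-eval error, though not the _raw problem below.)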
Second, even if I can get the calculated field to behave properly, the original value is still in the _raw field, which is easily visible in the events view or by adding _raw to a table.
So, is it possible to overwrite a single field at search time such that every search will return the overwritten value?
Also, can I somehow remove the _raw field for every search, and if so, are there any weird consequences from doing that?
I would do this: at index time, modify the event to create a hash of the time-sensitive field and replace the field value in the raw event with the hash. At the same time, add the value, the hash, and a date to a KV store so that the data exists in two separate places. Then, every day, purge the KV store of any data that is older than 30 days. When you search, use a lookup on the hash in the event to pull in the field value from the KV store; after 30 days, the lookup will fail.
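A minimal sketch of the search-time side, assuming a KV-store-backed lookup named masked_values with fields hash, value, and created_at (all hypothetical names):

index=myindex
| lookup masked_values hash AS "foo{}.id" OUTPUT value AS id_clear
| eval id_clear = coalesce(id_clear, "TOO OLD")

The daily purge could be a scheduled search along the lines of | inputlookup masked_values | where created_at >= relative_time(now(), "-30d@d") | outputlookup masked_values , which rewrites the collection keeping only entries newer than 30 days.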
This sounds like a great approach. So I'd need a script to pre-process the data files before they are given to the Splunk Universal Forwarder, right?
You've got it.
You will need to re-index the event after modifying it and then delete the original event. You can use collect to do this.
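Under those assumptions, a one-time backfill might look something like this (index names, sourcetype, and the JSON pattern are all hypothetical; test on a small time range first):

index=myindex sourcetype=my_json
| eval hash = sha256('foo{}.id')
| eval _raw = replace(_raw, "\"id\":\s*\"[^\"]+\"", "\"id\": \"" . hash . "\"")
| collect index=masked_index

Here eval sha256() computes the replacement value and replace() rewrites it into _raw before collect re-indexes the event; you would also need to write the hash/value pairs out to the KV store in the same pass.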
I saw a reference to this solution in another answer, but didn't understand it. I thought summary indexes were mainly used to collect the output of stats commands so you can keep counts longer than the actual data. How does a summary index work when you just want to re-index an entire event that is already indexed? Does it just send the _raw field value through the index/parsing pipeline again? If so, do I just need to use |rex to mask the field in the raw JSON?
Are the same props and transforms applied to the summary indexed data that is applied to the original data? I want to make sure that I can just add the summary index to all of my searches and have them still work.
Any details you can give me would be greatly appreciated. I'd really like to more fully understand how this works.
Thanks...
Although collect is intended to write to a summary index, in actuality it can write to any index. Play around with it and you will see what it does:
|noop|stats count AS TestOfCollect | collect index=myIndex
Then check it out:
index=myIndex | where isnotnull(TestOfCollect)
Then throw it away and refine:
index=myIndex | where isnotnull(TestOfCollect) | delete
Be aware that using collect to a non-summary index will incur a double license hit.