Remove specific string from event at index time

New Member

I am trying to remove specific strings and their values from Splunk events at index time as they are not needed in the event that is being indexed.
e.g. 08-09-2016 12:59:25 {"menu":{"id":"file","value":"File","popup":{"menuitem":[{"value":"New","onclick":"CreateNewDoc()"},{"value":"Open","onclick":"OpenDoc()"},{"value":"Close","onclick":"CloseDoc()"}]}}}

For example, from this event I would like to remove the "onclick" key and value.
I have created an entry in the props.conf for a transform to be performed for the sourcetype, and in the transforms.conf, I have configured the following:
REGEX = ^(.*)\,\"onclick\":\"[^\"]+\"(.*)$
FORMAT = $1$2
DEST_KEY = _raw

The aim is to get everything before the "onclick" string, then get everything after it, and format the event to concatenate these together.
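For reference, the setup described above might be wired together roughly as follows. This is only a sketch: the sourcetype name my_json_events and the transform name remove_onclick are placeholders that do not come from the original post, and the REGEX line assumes the two capture groups were meant to be (.*).

```ini
# props.conf
# "my_json_events" is a hypothetical sourcetype name used for illustration.
[my_json_events]
TRANSFORMS-remove_onclick = remove_onclick

# transforms.conf
# "remove_onclick" is a hypothetical stanza name; it must match the value
# referenced by the TRANSFORMS- setting in props.conf.
[remove_onclick]
# Group 1 captures the text before the "onclick" pair, group 2 the text after it.
REGEX = ^(.*)\,\"onclick\":\"[^\"]+\"(.*)$
# Rebuild the event from the two captured parts.
FORMAT = $1$2
# Write the result back to the raw event text.
DEST_KEY = _raw
```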

When the event is indexed, the strings are removed correctly. However, when the event is large (over 4096 characters), Splunk truncates it to 4096 characters when applying the regex, so the resulting event is cut off at the end and the rest of the event data is lost.
I have tried indexing the event without any transformation being performed and the event is indexed entirely without any string truncation.

Is there any configuration value that needs to be set to avoid this, or is there another approach I can take to remove specific strings at index time from an event?




Have you looked at SEDCMD? Something like this should work (please verify regex)

SEDCMD-remove_class = s/(\"onclick[^\}]+)//g
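One thing to check when verifying: the expression above starts matching at "onclick and stops before the closing brace, so the comma in front of the key would stay in the event (for example {"value":"New",}). A possible variant that also consumes the leading comma is sketched below. The sourcetype name my_json_events is a placeholder, and the pattern assumes the onclick values never contain an escaped double quote.

```ini
# props.conf
# "my_json_events" is a hypothetical sourcetype name used for illustration.
[my_json_events]
# Remove every ,"onclick":"..." pair, including the comma before the key.
# The trailing g flag applies the substitution to all matches in the event.
SEDCMD-remove_onclick = s/,\"onclick\":\"[^\"]*\"//g
```

With the sample event from the question, {"value":"New","onclick":"CreateNewDoc()"} would become {"value":"New"}.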



New Member

Thanks sundareshr, that solution will work perfectly as well!



No answer has been selected yet. If you are not using transforms, this is the best way to handle this case.


New Member

In transforms.conf, the LOOKAHEAD setting defaults to 4096, so I had to increase it for the regex to work on the whole event.



It's possible that Splunk is applying the TRUNCATE setting from props.conf. From the props.conf spec:

#******************************************************************************
# Line breaking
#******************************************************************************

# Use the following attributes to define the length of a line.

TRUNCATE = <non-negative integer>
* Change the default maximum line length (in bytes).
* Although this is in bytes, line length is rounded down when this would
  otherwise land mid-character for multi-byte characters.
* Set to 0 if you never want truncation (very long lines are, however, often
  a sign of garbage data).
* Defaults to 10000 bytes.

You might need to set this to a higher value for this particular sourcetype. Also make sure the same settings are in place on both the indexer and the heavy forwarder (HF), if you have an HF in between.
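If TRUNCATE did turn out to be the limit, raising it for a single sourcetype might look like the sketch below. The sourcetype name my_json_events and the value 20000 are placeholders chosen for illustration, not values from this thread.

```ini
# props.conf
# "my_json_events" is a hypothetical sourcetype name used for illustration.
[my_json_events]
# Maximum line length in bytes (default 10000). 20000 is an arbitrary example;
# pick a value above the longest event you expect for this sourcetype.
TRUNCATE = 20000
```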


New Member

Thanks for the reply, I hadn't changed the TRUNCATE value for this sourcetype, so it still had the default value of 10000 bytes.

After further investigation, I have found the solution.
In transforms.conf, the LOOKAHEAD setting defaults to 4096, so I had to increase it for the regex to work on the whole event.
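For anyone finding this later, the change amounts to adding one line to the transform stanza. A rough sketch follows, where the stanza name remove_onclick and the value 20000 are placeholders rather than my actual settings, and the REGEX assumes the capture groups are (.*).

```ini
# transforms.conf
# "remove_onclick" is a hypothetical stanza name used for illustration.
[remove_onclick]
REGEX = ^(.*)\,\"onclick\":\"[^\"]+\"(.*)$
FORMAT = $1$2
DEST_KEY = _raw
# Number of characters of the event the regex is allowed to search (default 4096).
# 20000 is an arbitrary example; it should exceed the longest expected event.
LOOKAHEAD = 20000
```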
