Re: Remove specific string from event at index tim...

benUnicoSplunk · ‎09-07-2016

I am trying to remove specific strings and their values from Splunk events at index time as they are not needed in the event that is being indexed.
eg. 08-09-2016 12:59:25 {"menu":{"id":"file","value":"File","popup":{"menuitem":[{"value":"New","onclick":"CreateNewDoc()"},{"value":"Open","onclick":"OpenDoc()"},{"value":"Close","onclick":"CloseDoc()"}]}}}

For example, from this event I would like to remove the "onclick" key and value.
I have created an entry in props.conf that applies a transform to the sourcetype, and in transforms.conf I have configured the following:
[remove_onclick]
REGEX = ^(.*)\,\"onclick\":\"[^\"]+\"(.*)$
FORMAT = $1$2
DEST_KEY = _raw

The aim is to get everything before the "onclick" string, then get everything after it, and format the event to concatenate these together.
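
The props.conf entry itself is not shown in the post. A minimal sketch of what it would typically look like follows; the sourcetype name my_json_sourcetype and the class name after TRANSFORMS- are placeholders, not values from the original configuration.

```ini
# props.conf ("my_json_sourcetype" is a hypothetical sourcetype name)
[my_json_sourcetype]
# Apply the remove_onclick stanza from transforms.conf at index time
TRANSFORMS-remove_onclick = remove_onclick
```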

When the event is indexed, the strings are removed correctly. However, when the event string is large (over 4096 characters), Splunk truncates it to 4096 characters when performing the regex, so the resulting event is cut off at the end and the rest of the event data is lost.
I have tried indexing the event without any transformation being performed and the event is indexed entirely without any string truncation.

Is there any configuration value that needs to be set to avoid this, or is there another approach I can take to remove specific strings at index time from an event?

Thanks!

sundareshr · ‎09-08-2016

Have you looked at SEDCMD? Something like this should work (please verify the regex):

SEDCMD-remove_class = s/(\"onclick[^\}]+)//g

http://docs.splunk.com/Documentation/Splunk/6.4.3/Data/Anonymizedata#Anonymize_data_through_a_sed_sc...
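
For reference, a SEDCMD goes in props.conf under the sourcetype's stanza. The sketch below uses a hypothetical sourcetype name, my_json_sourcetype, and a hypothetical class name. It also uses a slightly different pattern, which is an assumption and not the one posted above: as written, the posted expression appears to leave behind the comma in front of "onclick" (giving {"value":"New",}), so this variant matches the leading comma and the quoted value together. Test it against real data before relying on it.

```ini
# props.conf ("my_json_sourcetype" is a hypothetical sourcetype name)
[my_json_sourcetype]
# Remove every ,"onclick":"..." pair, including the comma in front of it
SEDCMD-remove_onclick = s/,"onclick":"[^"]*"//g
```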

benUnicoSplunk · ‎09-08-2016

Thanks sundareshr, that solution will work perfectly as well!

akocak · ‎02-15-2019

No answer has been selected, but if you are not using transforms, this is the best way to handle this case.

renjith_nair · ‎09-08-2016

It's possible that Splunk is applying the TRUNCATE setting in props.conf:

#******************************************************************************
# Line breaking
#******************************************************************************

# Use the following attributes to define the length of a line.

TRUNCATE = <non-negative integer>
* Change the default maximum line length (in bytes).
* Although this is in bytes, line length is rounded down when this would
  otherwise land mid-character for multi-byte characters.
* Set to 0 if you never want truncation (very long lines are, however, often
  a sign of garbage data).
* Defaults to 10000 bytes.

You might need to set this to a higher value for this particular sourcetype. Also make sure the same settings are applied on both the indexer and the heavy forwarder (HF), if you have an HF in between.
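
A per-sourcetype override could be sketched like this. The sourcetype name and the 50000-byte limit are hypothetical; choose a limit that fits the longest events expected.

```ini
# props.conf ("my_json_sourcetype" is a hypothetical sourcetype name)
[my_json_sourcetype]
# Raise the maximum line length above the 10000-byte default (illustrative value)
TRUNCATE = 50000
```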

benUnicoSplunk · ‎09-08-2016

Thanks for the reply. I hadn't changed the TRUNCATE value for this sourcetype, so it still had the default value of 10000 bytes.

After further investigation, I have found the solution.
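In transforms.conf, the LOOKAHEAD setting defaults to 4096 characters, so the regex was only searching the first 4096 characters of each event. Increasing LOOKAHEAD in the transform stanza is what allowed the regex to work on the whole event.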
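
As a rough sketch of that fix, the stanza might look like the following. The value 10000 is an illustrative assumption, chosen to line up with the default TRUNCATE of 10000 bytes mentioned above, and is not the figure actually used. The capture groups are assumed to be (.*), matching the stated intent of capturing everything before and after the key.

```ini
# transforms.conf
[remove_onclick]
# Capture groups assumed to be (.*): everything before and after the key
REGEX = ^(.*)\,\"onclick\":\"[^\"]+\"(.*)$
FORMAT = $1$2
DEST_KEY = _raw
# Default is 4096; 10000 is an illustrative value, not the poster's
LOOKAHEAD = 10000
```

If TRUNCATE is raised for the sourcetype, LOOKAHEAD would probably need to be raised to match, so that the regex still sees the full event.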

Karthikeya · ‎02-18-2025

Can someone help with this ticket? https://community.splunk.com/t5/Getting-Data-In/Exclude-or-Remove-few-fields-while-on-boarding-data/...
