Discarding source type data from _raw event after per-event source type override?

Graham_Hanningt
Builder

Disclaimer: This is a "self-answering" question: I'm already doing what the question asks. I'm "asking" this question because I think the answer might be useful to other users. I also welcome others suggesting different, and perhaps better, ways to do this.

Background to this question

I help to develop a tool that forwards events from multiple log types—in Splunk terms, multiple source types—to Splunk in JSON Lines format.

The tool can forward all such events to a single Splunk input (for example, the same TCP port), or it can forward each source type to a separate input (for example, a different TCP port for each source type).

Optionally, to support the first case, where a file or stream contains events from multiple log types, the tool can include in each line of JSON Lines a property that identifies the source type. Let's say that property is named code.
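To make the input format concrete, here is a minimal Python sketch of what such JSON Lines data might look like. The source type names, the other property names, and the to_json_lines helper are all hypothetical, made up for illustration; only the code property comes from the description above. The sketch also assumes compact serialization, with no whitespace between tokens.

```python
import json


def to_json_lines(events):
    """Serialize dicts as JSON Lines: one compact JSON object per line."""
    # separators=(",", ":") suppresses the spaces that json.dumps adds by default.
    return "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)


# Hypothetical events from two log types; "code" identifies the source type.
events = [
    {"code": "hypothetical-type-a", "time": "2024-01-01T00:00:00Z", "msg": "started"},
    {"code": "hypothetical-type-b", "time": "2024-01-01T00:00:01Z", "user": "alice"},
]

print(to_json_lines(events), end="")
```

Each output line is a complete JSON object, so events from different log types can share one stream and still be told apart by the code value.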

In the corresponding Splunk configuration, I use a transform that uses the value of the code property to override source types on a per-event basis.

After the transform, the code property is redundant: its value is now stored in the sourcetype default field.

The "question"

Can I discard the now-redundant code property from the event before it is indexed, to conserve storage and license usage?

Graham_Hanningt
Builder

The property in the incoming JSON Lines event data that contains the event timestamp is also, after timestamp extraction, similarly redundant. But I'll leave discarding that property for another, ahem, time.

Graham_Hanningt
Builder

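Yes. Follow the existing transform that overrides the source type with a new, second transform that removes the code property from the _raw event data. Order matters: Splunk applies the transforms in the order listed, and the source type override must run first because it reads the code property that the second transform removes.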

In props.conf, in the stanza that matches the incoming data (stanza header not shown here):

TRANSFORMS-changesourcetype = set_sourcetype, remove_code_property

In transforms.conf:

[set_sourcetype]
# Set the source type to the value of the code property in the JSON
REGEX = \"code\":\"([^\"]+)\"
FORMAT = sourcetype::$1
DEST_KEY = MetaData:Sourcetype

[remove_code_property]
# Remove the code property to conserve license usage
REGEX = ^({.*)"code":"[^"]+",(.*)$
FORMAT = $1$2
DEST_KEY = _raw

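(In my case, in the serialized incoming JSON Lines data, the code property happens to be the first property in each line. The remove_code_property regex expects a comma immediately after the code property, so it does not match, and leaves _raw unchanged, if code is the last property in the line.)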
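If you want to sanity-check the two regular expressions outside Splunk, here is a minimal Python sketch that applies the same patterns, in the same order, to a made-up event. This is only an approximation: Splunk uses PCRE rather than Python's re module (the two behave the same for these particular patterns), the apply_transforms helper is hypothetical, and the sample event is invented for illustration.

```python
import re

# Same patterns as the REGEX settings in the transforms.conf stanzas above.
SET_SOURCETYPE = re.compile(r'\"code\":\"([^\"]+)\"')
REMOVE_CODE = re.compile(r'^({.*)"code":"[^"]+",(.*)$')


def apply_transforms(raw):
    """Mimic the two transforms in order; return (sourcetype, new_raw)."""
    sourcetype = None
    m = SET_SOURCETYPE.search(raw)
    if m:
        sourcetype = m.group(1)  # FORMAT = sourcetype::$1
    m = REMOVE_CODE.match(raw)
    if m:
        raw = m.group(1) + m.group(2)  # FORMAT = $1$2
    return sourcetype, raw


# Hypothetical event with code as the first property.
event = '{"code":"hypothetical-type","time":"2024-01-01T00:00:00Z","msg":"hello"}'
sourcetype, raw = apply_transforms(event)
print(sourcetype)
print(raw)
```

Feeding it an event where code is the last property (so there is no trailing comma) shows the limitation noted above: the source type is still extracted, but the event text comes back unchanged.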
