
Routing to a dynamic index based on a JSON field

trenin
Explorer

I have JSON data that I am ingesting. I would like to route the event to an index based on one of the JSON fields. I've seen examples that use REGEX, but I want to avoid hard coding the indexes since I will need to update multiple config files if I start getting new types of data.

My JSON data includes the following section:

...
"collection": {
  "date": "...",
  "source": <Canada | US | Mexico>
},
...

I would like to have 3 separate indexes, one each for Canada, US, and Mexico. I would like the index to be determined dynamically based on the input.

I've seen examples that suggest this is easy to do with REGEX, and I think I could do it that way as follows:

indexes.conf:

[index-Canada]
...
[index-US]
...
[index-Mexico]
...

props.conf:

[default]
TRUNCATE = 0
INDEX_EXTRACTIONS = json
TIMESTAMP_FIELDS = collection.date
TRANSFORMS-SetIndex = setIndex-Canada, setIndex-US, setIndex-Mexico

transforms.conf:

[setIndex-Canada]
REGEX = "source": "Canada"
DEST_KEY = _MetaData::Index
FORMAT = index-Canada

[setIndex-US]
REGEX = "source": "US"
DEST_KEY = _MetaData::Index
FORMAT = index-US

[setIndex-Mexico]
REGEX = "source": "Mexico"
DEST_KEY = _MetaData::Index
FORMAT = index-Mexico

I think this will work. However, I would like to make it so that I don't have to hard code the transforms.conf for each index. One way is to do the following:

props.conf:

[default]
TRUNCATE = 0
INDEX_EXTRACTIONS = json
TIMESTAMP_FIELDS = collection.date
TRANSFORMS-SetIndex = setIndex

transforms.conf:

[setIndex]
REGEX = "source": "(.*)"
DEST_KEY = _MetaData::Index
FORMAT = index-$1

I have a couple questions about this:

  1. If the data names an index I haven't configured, can I somehow set up a fallback so that events that don't match a configured index are not lost?
  2. Can I use SOURCE_KEY somehow to use the value of the JSON field instead of REGEX? I would rather rely on Splunk's JSON parsing than on my REGEX skills to make sure I am getting the right field. If text matching my REGEX shows up elsewhere in the contents of an event later, I could get data routed to the wrong index.

amitm05
Builder

For #1
I think you'd need to handle that with a logical set of rules. Maybe something like defining 2 stanzas in transforms.conf for setting your indexes. One would assign the index only if the source is US, Mexico, or Canada:

[setIndex_KnownLocations]
REGEX = "source": "(Canada|US|Mexico)"
DEST_KEY = _MetaData::Index
FORMAT = index-$1

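And the second would assign your backup index. Its REGEX matches every source value, including the known ones, so it should be listed first in TRANSFORMS-SetIndex; the known-locations transform then runs after it and overrides the index for Canada, US, and Mexico: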
[setIndex_UnKnownLocations]
REGEX = "source": "(.*)"
DEST_KEY = _MetaData::Index
FORMAT = index-BackupIndex
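
As a rough sketch of how these two stanzas might be wired together, the props.conf below lists the fallback first. This assumes that transforms within one TRANSFORMS- class are applied left to right, with a later match overwriting _MetaData::Index, and that an index named index-BackupIndex has been defined in indexes.conf.

props.conf:

[default]
# Assumed ordering: the fallback runs first, then the known-locations
# transform overwrites the index for events it matches.
TRANSFORMS-SetIndex = setIndex_UnKnownLocations, setIndex_KnownLocations

With this ordering, every event that has a source value is first assigned to the backup index, and events matching the known-locations REGEX are then re-routed. Events with no source field match neither transform and go to the input's default index.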

trenin
Explorer

Thanks - I will try that. Any thoughts on how to use the Splunk JSON parsing instead of REGEX?
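
One possible direction, offered as an untested sketch rather than a confirmed answer: newer Splunk versions support INGEST_EVAL in transforms.conf, which evaluates an eval expression at index time and can assign the index field. The example below assumes the spath eval function is usable at ingest time on your version (worth verifying), uses a hypothetical stanza name setIndexFromJson, and reuses the index-BackupIndex fallback suggested earlier in this thread.

transforms.conf:

[setIndexFromJson]
# Hypothetical stanza. Reads collection.source via JSON path rather than a
# REGEX over the raw text; anything that is not exactly Canada, US, or
# Mexico (including a missing field) is sent to the fallback index.
INGEST_EVAL = index=if(match(coalesce(spath(_raw, "collection.source"), ""), "^(Canada|US|Mexico)$"), "index-" . spath(_raw, "collection.source"), "index-BackupIndex")

props.conf:

[default]
TRUNCATE = 0
TRANSFORMS-SetIndex = setIndexFromJson

The list of known locations is still written out once here, because Splunk does not create indexes on demand; an event routed to an index that does not exist would not be stored where you expect. The benefit over the REGEX approach is that the value comes from the collection.source path specifically, so a stray "source": "..." string elsewhere in the event should not affect routing.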
