I have JSON data that I am ingesting. I would like to route the event to an index based on one of the JSON fields. I've seen examples that use REGEX, but I want to avoid hard coding the indexes since I will need to update multiple config files if I start getting new types of data.
My JSON data includes the following section:
...
"collection": {
"date": "...",
"source": <Canada | US | Mexico>
},
...
I would like to have 3 separate indexes, one each for Canada, US, and Mexico, with the index determined dynamically from the input.
I've seen examples that suggest this is easy to do with REGEX, and I think I could do it that way as follows:
indexes.conf:
[index-Canada]
...
[index-US]
...
[index-Mexico]
...
props.conf:
[default]
TRUNCATE = 0
INDEX_EXTRACTIONS = json
TIMESTAMP_FIELDS = collection.date
TRANSFORMS-SetIndex = setIndex-Canada, setIndex-US, setIndex-Mexico
transforms.conf:
[setIndex-Canada]
REGEX = "source": "Canada"
DEST_KEY = _MetaData::Index
FORMAT = index-Canada
[setIndex-US]
REGEX = "source": "US"
DEST_KEY = _MetaData::Index
FORMAT = index-US
[setIndex-Mexico]
REGEX = "source": "Mexico"
DEST_KEY = _MetaData::Index
FORMAT = index-Mexico
I think this will work. However, I would like to make it so that I don't have to hard code the transforms.conf for each index. One way is to do the following:
props.conf:
[default]
TRUNCATE = 0
INDEX_EXTRACTIONS = json
TIMESTAMP_FIELDS = collection.date
TRANSFORMS-SetIndex = setIndex
transforms.conf:
[setIndex]
REGEX = "source": "(.*)"
DEST_KEY = _MetaData::Index
FORMAT = index-$1
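To sanity-check what the capture group in the stanza above would produce, here is a minimal Python sketch that mimics the REGEX / FORMAT pair. It is only an emulation using Python's re module, not Splunk itself. The helper name route_index is hypothetical, and the "main" fallback is an assumption standing in for whatever index the event would otherwise keep.

```python
import re

# Same pattern as the [setIndex] stanza; Python's re stands in for Splunk's regex engine.
SOURCE_RE = re.compile(r'"source": "(.*)"')

def route_index(raw_event, default="main"):
    """Return the index name that FORMAT = index-$1 would yield (hypothetical helper)."""
    match = SOURCE_RE.search(raw_event)
    if match is None:
        # No match: the transform does nothing and the event keeps its index.
        return default
    return "index-" + match.group(1)

event = '{"collection": {"date": "...", "source": "Canada"}}'
print(route_index(event))  # prints index-Canada
```

One thing the sketch makes visible: (.*) is greedy, so if another quoted string follows "source" on the same line, the capture runs on to the last double quote. A tighter group such as ([^"]+) would likely be safer.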
I have a couple questions about this:
For #1
I think you'd need to handle that with some logical set of rules. Maybe something like defining 2 stanzas in transforms.conf for setting your indexes. One would assign the index only if the source is US, Mexico, or Canada:
[setIndex_KnownLocations]
REGEX = "source": "(Canada|US|Mexico)"
DEST_KEY = _MetaData::Index
FORMAT = index-$1
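And the second would assign your backup index for all events from other sources. As far as I know, transforms run in the order they are listed in TRANSFORMS-SetIndex and a later match overwrites the index, so this catch-all should be listed first: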
[setIndex_UnKnownLocations]
REGEX = "source": "(.*)"
DEST_KEY = _MetaData::Index
FORMAT = index-BackupIndex
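A rough way to reason about how the two stanzas interact is to emulate them in Python. This is a sketch under a stated assumption: every matching transform writes the index key, so the last match in the list wins. The function name resolve_index and the "main" starting index are hypothetical, and the known-location pattern is written with an explicit capture group so that $1 has something to refer to.

```python
import re

# (pattern, FORMAT) pairs in the order they would appear in TRANSFORMS-SetIndex.
# Assumption: each matching transform overwrites the index, so the last match wins.
TRANSFORMS = [
    (re.compile(r'"source": "(.*)"'), "index-BackupIndex"),      # catch-all first
    (re.compile(r'"source": "(Canada|US|Mexico)"'), "index-$1"),  # known locations last
]

def resolve_index(raw_event, index="main"):
    """Apply each transform in order and return the final index (hypothetical helper)."""
    for pattern, fmt in TRANSFORMS:
        match = pattern.search(raw_event)
        if match:
            # Substitute the first capture group for $1, as FORMAT would.
            index = fmt.replace("$1", match.group(1))
    return index

print(resolve_index('"source": "US"'))      # prints index-US
print(resolve_index('"source": "Brazil"'))  # prints index-BackupIndex
```

If the two entries were swapped, the catch-all would run last and every event with a "source" field would land in the backup index under the same assumption.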
Thanks - I will try that. Any thoughts on how to use Splunk's JSON parsing instead of REGEX?