I'm new to Splunk. We pre-process HTTP logs and assign geo tags so that those tags can be included in the index. One of the geo tags we add is called 'region'. This tag describes the user's region at several levels of granularity. For example, here is a sample log entry:
asn=19262 location="MD, US" region=us region=us_east region=us_south region=us_south_southatlantic region=america_north
The number of regions varies with the user's location (e.g. US regions have more granularity than those of other countries). We need to be able to both search and group on region.
Searching: region="us" currently works, but region="us_east" does not, because the index appears to use only the first value. We can do a full-text search for "region=us_east", but this has the undesirable effect of also picking up us_east_coast.
Grouping: we also want to group by region using the stats function, e.g. stats median(rate) by region, but this again picks up only the first instance of the region key.
There are no parentheses around the region value in the logs. I control the log input, so if it would be easier to accomplish this using delimited values like "region=us,us_east,us_south,us_south_southatlantic,america_north", I could change the format.
I've read other posts suggesting use of MV_ADD or REPEAT_MATCH but haven't had much luck getting either to work. I would like the region to be added to the index, not applied at search time.
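To make the matching problem in the question concrete, here is a minimal Python sketch (not Splunk configuration) of the behavior being asked for: every value of a repeated key is kept, a search matches whole values only, and grouping counts an event once under each of its regions. The helper names `parse_fields` and `median_rate_by_region`, the shortened sample line, and the `rate` values are assumptions made up for illustration.

```python
import shlex
from collections import defaultdict
from statistics import median


def parse_fields(line):
    """Parse key=value pairs, keeping every value of a repeated key."""
    fields = defaultdict(list)
    # shlex keeps quoted values such as "MD, US" together as one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key].append(value)
    return dict(fields)


def median_rate_by_region(events):
    """Count each event under every one of its regions, then take the median."""
    rates = defaultdict(list)
    for line in events:
        fields = parse_fields(line)
        for region in fields.get("region", []):
            rates[region].append(float(fields["rate"][0]))
    return {region: median(values) for region, values in rates.items()}


# Shortened version of the sample line, with hypothetical rate values.
events = [
    'asn=19262 location="MD, US" rate=10 region=us region=us_east region=us_south',
    'rate=30 region=us region=us_east_coast',
]

# A raw substring search matches the second event by accident...
substring_hit = "region=us_east" in events[1]
# ...while a whole-value match against the parsed list does not.
exact_hit = "us_east" in parse_fields(events[1])["region"]
```

With these two events, `substring_hit` is true and `exact_hit` is false, which is the difference between the full-text search and the field search described above.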
Instead of
region=us region=us_east region=us_south
could you do
region=us;us_east;us_south
?
That would give you a multivalue (mv) field.
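As a rough illustration of what the delimited format buys you, the sketch below splits a single semicolon-delimited value into a list, so each region can be matched as a whole value. This is plain Python, not Splunk; the function name `split_region` is hypothetical. In Splunk itself, the search-time `makemv` command with `delim=";"` performs this kind of split; whether any splitting happens without extra configuration is not established in this thread.

```python
def split_region(raw, delim=";"):
    """Split a delimited region string into a list of whole values."""
    # Skip empty pieces so a trailing delimiter does not add a blank value.
    return [value for value in raw.split(delim) if value]


regions = split_region("us;us_east;us_south")

# Whole-value membership: "us_east" is present, a partial name is not.
has_us_east = "us_east" in regions
has_partial = "us_eas" in regions
```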
I can make that change. Will MV be automatic, or will I also need to make a change to any configuration files?
I'm assuming that performance will be better if it is applied at index time.
Why don't you want it to be applied at search time?