Knowledge Management

How do I create indexed fields in a summary index?

Lowell
Super Champion

I'm populating a summary index with data that I would like to be able to search very quickly using tstats. I've got this mostly working but can't quite seem to figure out if I'm doing something wrong or why it isn't working as expected.

Summary index generating search: search_foo
Fields to index: a, b, c, d
I want to be able to write a search like this: | tstats sum(a), sum(b), values(c) WHERE index=summary source=search_foo by d

Here are the settings I'm trying to make work:

props.conf:

[source::search_foo]
TRANSFORMS-index-fields = search_foo_indexfields

transforms.conf:

[search_foo_indexfields]
REGEX = \b(a|b|c|d)=("?)([^"]*?)\2(?:,|$)
FORMAT = $1::$3
WRITE_META = true
REPEAT_MATCH = true

I know that I have all the names and meta settings correct because the first field does get added as an indexed field. (I confirmed this by running exporttool -csv on one of the buckets and checking that the field showed up in the _meta field.) Splunk seems to be ignoring the REPEAT_MATCH setting.

So as a workaround, I've made REGEX match all 4 fields directly and index them all at once (e.g., FORMAT = a::$1 b::$2 c::$3 d::$4). This works, but I really don't like the approach because it assumes a hard-coded order of the fields, which seems unnecessarily fragile. In my actual use case, sometimes "a" or "b" is missing from the data. I've been able to make the regex cope with that, but it still results in an empty indexed field. (In other words, if "b" is missing from the data, I still see b:: in _meta when I run exporttool.) I also considered making 4 transforms entries, one for each field, but that seems silly as well.
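
For readers following along, the all-in-one workaround described above might look roughly like the sketch below. The REGEX here is an assumption (the post does not show the actual one): it presumes the raw event lists unquoted, comma-separated values in exactly the order a, b, c, d. It also illustrates the fragility mentioned above, since it stops matching entirely as soon as a field is missing or reordered.

```ini
# transforms.conf (illustrative sketch, not the poster's actual regex)
[search_foo_indexfields]
# Assumes the raw event contains a=..., b=..., c=..., d=... in this order,
# with no quotes and no commas inside the values
REGEX = \ba=([^,]*),\s*b=([^,]*),\s*c=([^,]*),\s*d=([^,]*)
FORMAT = a::$1 b::$2 c::$3 d::$4
WRITE_META = true
```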

Bonus question, somewhat related: how do I avoid double-escaping backslashes in my solution? One of my actual fields is "source", so Windows paths show up in the raw data with escaped backslashes ( \\ ), which get translated to double-escaped backslashes ( \\\\ ) in the _meta field. That means that at search time, the indexed fields look like "C:\\Windows\\.." instead of "C:\Windows\..".

0 Karma

woodcock
Esteemed Legend

Double-check that the source value for the data in your Summary Index matches your stanza header specification.

0 Karma

woodcock
Esteemed Legend

Many people do not know about _KEY_1 and _VAL_1 (you can search on it). Try this:

[search_foo_indexfields]
REGEX = \b(?<_KEY_1>a|b|c|d)=("?)(?<_VAL_1>[^"]*?)\2(?:,|$)
WRITE_META = true
MV_ADD = true
0 Karma

Lowell
Super Champion

Okay, so this adds a new field with the name of the transforms stanza ("search_foo_indexfields") with the value of either "a" or "b".

Just confirmed it in the _meta field dumped out with exporttool. "... date_mday::25 date_zone::0 search_foo_indexfields::a"

From the docs, it's not 100% clear whether _KEY_x and _VAL_x are supported at index time, but it doesn't seem to be working.

0 Karma

woodcock
Esteemed Legend

You have to deploy these configurations to the INDEXING SERVER. In most cases this is your indexers; HOWEVER, in the case of Summary Indices, by default (unless you went out of your way to change it), the data is stored on the SEARCH HEAD, so you will have to EITHER deploy the configurations to the Search Head OR make sure that Summary Indexing happens on the Indexers.

0 Karma

Lowell
Super Champion

I assume _KEY_! is a typo for _KEY_1? I was aware of that syntax, but didn't think it held any advantages here. (But I'll give it a try.) I haven't tried MV_ADD as the docs say, "This attribute is only valid for search-time field extractions."

0 Karma

woodcock
Esteemed Legend

Yes, fixed.

0 Karma

somesoni2
SplunkTrust

How about creating a separate TRANSFORMS stanza for each field, so that even if one field is missing, the others show up independently?
For the double escaping, maybe try applying some command in the summary index search to remove the escaped backslashes.
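
A minimal sketch of the one-stanza-per-field idea, assuming the same key=value layout that the original regex targets. The stanza names search_foo_a and search_foo_b are hypothetical, and c and d would follow the same pattern. Since each transform only writes its field when its own REGEX matches, a missing field should simply produce no entry in _meta rather than an empty one.

```ini
# props.conf
[source::search_foo]
TRANSFORMS-index-fields = search_foo_a, search_foo_b

# transforms.conf
[search_foo_a]
# Group 1 is the optional opening quote, group 2 is the value
REGEX = \ba=("?)([^"]*?)\1(?:,|$)
FORMAT = a::$2
WRITE_META = true

[search_foo_b]
# "\bb=" is a word boundary followed by the literal field name "b"
REGEX = \bb=("?)([^"]*?)\1(?:,|$)
FORMAT = b::$2
WRITE_META = true
```

The trade-off is the one raised in the thread: the configuration grows by one stanza per field, but no stanza depends on field order.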

0 Karma

Lowell
Super Champion

I'd like to avoid one transforms stanza per field if possible. My real use case has more than just 4 fields. (Not an unmanageable number, it just seems like there has to be a better solution.)

I'm pretty sure the backslash escaping is happening automatically in the summary indexing plumbing commands. (I'm just using the default built-in alert actions for summary indexing.) And in fact, I'm already dealing with escaped backslashes in part of my search, so I know they've been taken care of in my base search.

And yes I could remove them at search time, but since I'm in control of the data generation, it seems silly to deal with something in every search I write, if I could fix the issue once when the data is written.

0 Karma

somesoni2
SplunkTrust

Give this a try?

[search_foo_indexfields]
REGEX = \b(?<_KEY_1>(a|b|c|d))=("?)(?<_VAL_1>[^"]*?)\3(?:,|$)
WRITE_META = true
REPEAT_MATCH = true
0 Karma