Does index time field extraction make sense for our situation?

(currently using Splunk 4.3.3 build 128297)

I have poked around the docs covering index time field extraction and some of the related Q&A, but I decided to ask directly and outline our situation.

We have a logging facility that several of our future products will use. This facility receives JSON payloads containing key/value pairs like the following (names have been changed to protect the innocent).

{ 
  "key1" : "value1",
  "key2" : "value2",
  (could contain more pairs)
  "entries" : [
                {
                 "key3" : "value3a",
                 "key4" : "value4a",
                 "key5" : "value5",
                 (could contain more pairs)
                },
                {
                 "key3" : "value3b",
                 "key4" : "value4b",
                 "key6" : "value6",
                 (could contain more pairs)
                },
                (could contain more entries)
              ]
}

When the logging facility gets the above example JSON payload, it turns it into the following two log statements (one per element of "entries", each prefixed with the shared top-level pairs) and pushes those to Splunk via TCP.

timestamp key1="value1" key2="value2" key3="value3a" key4="value4a" key5="value5"
timestamp key1="value1" key2="value2" key3="value3b" key4="value4b" key6="value6"
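To make the transformation concrete, here is a minimal Python sketch of the flattening step. It is only an illustration under assumptions: the function name `flatten_payload` is hypothetical, the timestamp is passed in as an opaque string, and the real facility may handle quoting and escaping differently.

```python
import json

def flatten_payload(payload, timestamp):
    """Return one key="value" log line per element of payload["entries"]."""
    # Top-level pairs (everything except "entries") are repeated on every line.
    shared = {k: v for k, v in payload.items() if k != "entries"}
    lines = []
    for entry in payload.get("entries", []):
        # Entry-specific pairs follow the shared ones, in insertion order.
        pairs = {**shared, **entry}
        body = " ".join('{}="{}"'.format(k, v) for k, v in pairs.items())
        lines.append("{} {}".format(timestamp, body))
    return lines

# The example payload from above, without the "(could contain more ...)" notes.
payload = json.loads("""
{
  "key1": "value1",
  "key2": "value2",
  "entries": [
    {"key3": "value3a", "key4": "value4a", "key5": "value5"},
    {"key3": "value3b", "key4": "value4b", "key6": "value6"}
  ]
}
""")

lines = flatten_payload(payload, "timestamp")
for line in lines:
    print(line)
```

With the literal string "timestamp" standing in for a real timestamp, this prints the two log statements shown above.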

We are defining "key1" to denote the product/component submitting the data. Its value follows a reverse-DNS-style naming convention, with no real restrictions on the hierarchy other than ensuring it is likely unique across our family of products. For example: "mycompany.product.component" or "mycompany.mydivision.product.component.subcomponent".

The remaining key/value pairs are product specific (that is, they can be whatever the product wants). In other words, key1 will be used to namespace the rest of the key/value pairs.

We are considering extracting "key1" at index time. I believe doing so would speed up our ability to focus on the events coming from a particular product and/or component out in the field.

Search possibilities...

key1="mycompany.product.*" ...blah...
key1="mycompany.product.component"  ...blah...
key1="*.component.*"  ...blah...
etc.
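As a rough check of which key1 values each of those patterns would select, here is a small Python sketch. It uses shell-style globbing from the standard library as a stand-in for Splunk's `*` wildcard, which is an assumption: Splunk's own matching (case handling in particular) may differ, and `matching_values` is a hypothetical helper, not anything Splunk provides.

```python
from fnmatch import fnmatchcase

def matching_values(pattern, values):
    """Return the values that match a glob-style pattern ('*' = any run of characters)."""
    return [v for v in values if fnmatchcase(v, pattern)]

# The two example key1 values from the naming convention above.
values = [
    "mycompany.product.component",
    "mycompany.mydivision.product.component.subcomponent",
]

for pattern in ("mycompany.product.*",
                "mycompany.product.component",
                "*.component.*"):
    print(pattern, "->", matching_values(pattern, values))
```

Under these assumptions, "*.component.*" selects only the longer value: it requires a literal dot after "component", so a value that ends in ".component" is not matched.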

Opinions?

Splunk Employee

Based on this post, it sounds like this may be one of the cases where it does make sense:

http://splunk-base.splunk.com/answers/842/do-search-time-fields-have-performance-considerations?page...

Splunk Employee

Have you considered making key1 the sourcetype or the source? It is a safer solution and will still allow you to use metasearch and other fun indexed field tricks.

I advise against the use of custom indexed fields, mainly because they change the structure of your index compared to your other indices and the docs advise against them.
