What is better, MV search time extraction or split?

Communicator

My data in Splunk looks like so:

     geo {
     id: 0
     internal_name: "TEST"
     type: LIST
    zip: 1   zip: 2        zip: 3            zip:4 zip: 5 zip: 6 zip: 7 zip: 9 ... etc
     description: "TEST"
     }
     geo {
      id: 1
     internal_name: "TEST"
      type: LIST
     zip: 1   zip: 2        zip: 3            zip:4 zip: 5 zip: 6 zip: 7 zip: 9 ... etc
      description: "TEST"
     }
     geo {
     id: 2
     internal_name: "TEST"
     type: LIST
      zip: 1   zip: 2        zip: 3            zip:4 zip: 5 zip: 6 zip: 7 zip: 9 ... etc
     description: "TEST"
     }
     geo {
     id: 3
     internal_name: "TEST"
     type: LIST
     zip: 1   zip: 2        zip: 3            zip:4 zip: 5 zip: 6 zip: 7 zip: 8
     description: "TEST"
     }

I want to get the zip numbers all into their own field called zip. If I do it via regex, it only takes the FIRST value, not all the others per event. Reading some of the docs, it seems like I need to do something with MV_ADD in my props or transforms config files, but I can't find anything that clearly states what I'm supposed to do.

Esteemed Legend

Just add MV_ADD = true to the same transforms.conf stanza that holds your REGEX = line. If that stanza is referenced from a TRANSFORMS- line in props.conf on your Indexers, then you are too late, because that runs at index time and will not change events that are already indexed. But if it is referenced from a REPORT- line on your Search Head, then just give it some time to refresh and it will work. Make sure that your RegEx is not anchored (test it in RegEx101 to confirm that it captures all values).

Communicator

lol that's all too complicated - I need specifics, man!
This is my current props.conf:
[test]
DATETIME_CONFIG = CURRENT
MAX_TIMESTAMP_LOOKAHEAD =
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT =
TIME_PREFIX =
TRUNCATE = 200000
category = Custom
pulldown_type = true
BREAK_ONLY_BEFORE_DATE =
LINE_BREAKER = ([\r\n]+)\s*geo\s{
disabled = false

Esteemed Legend

First of all, if you can get your developers to output proper JSON, then you can just add KV_MODE = json and you are done. Short of that, you need a line like REPORT-test_extractions = test_mv_zip in that props.conf stanza, too, and then in transforms.conf, something like:

[test_mv_zip]
REGEX = \s+zip:\s*(?<zip>\d+)
MV_ADD = true
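
To make the props.conf side concrete: the REPORT line goes into the same [test] stanza as the settings posted above. This is only a sketch, assuming [test] is the sourcetype stanza and that props.conf and transforms.conf live in the same app on the Search Head. The class name test_extractions is arbitrary.

```ini
# props.conf on the Search Head: the existing [test] stanza with one line added
[test]
# ... existing settings (LINE_BREAKER, TRUNCATE, etc.) stay as they are ...
# "test_extractions" is an arbitrary class name; "test_mv_zip" must match
# the stanza name in transforms.conf exactly.
REPORT-test_extractions = test_mv_zip
```

Once the change has been picked up, something like `sourcetype=test | eval zip_count=mvcount(zip) | table zip_count zip` should show more than one value per event if the extraction is working.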

Communicator

Thanks, looks like that works, but some of the MV fields are huge and it's really slowing down the search. So is it better to do this at index time? And if so, what's the proper setup?

Esteemed Legend

Like I said earlier: get your developers to output valid JSON or XML, and then you can use KV_MODE and/or spath / xpath.

Esteemed Legend

Do everything that you can at Search Time, because Indexer horsepower is almost always your most limited resource, and anything done at index time takes up more disk space (another limited resource).
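
For reference only, an index-time version of the zip extraction would need roughly the three pieces below, which is part of why it costs more. This is a hedged sketch, not a recommendation: the stanza and class names (test_zip_indexed) are made up for illustration, and the regex assumes that values such as zip:4 (no space after the colon) should match too. It would only affect events indexed after the change.

```ini
# props.conf, wherever parsing happens (Indexers or Heavy Forwarders)
[test]
# Hypothetical class and transform names
TRANSFORMS-test_zip_indexed = test_zip_indexed

# transforms.conf, same place as the props.conf above
[test_zip_indexed]
REGEX = \s+zip:\s*(\d+)
# Index-time fields are written as <field>::<value>
FORMAT = zip::$1
WRITE_META = true
# Re-run the REGEX so that every zip in the event is indexed, not just the first
REPEAT_MATCH = true

# fields.conf on the Search Head, so that searches treat zip as an indexed field
[zip]
INDEXED = true
```

Every zip value of every event then ends up in the index itself, which is the disk-space cost mentioned above.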

Communicator

Looks like this might be along the lines of what I need. The docs say I might only need props.conf:

The EXTRACT field extraction type is considered to be "inline," which means
that it does not reference a field transform. It contains the regular
expression that Splunk software uses to extract fields at search time. You
can use EXTRACT to define a field extraction entirely within props.conf, no
transforms.conf component is required.
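
An inline version of the zip extraction might look like the sketch below (the class name after EXTRACT- is arbitrary). As far as I can tell, MV_ADD is a transforms.conf setting, so an inline EXTRACT like this would still return only the first zip per event.

```ini
# props.conf only; no transforms.conf stanza involved
[test]
# "zip" after EXTRACT- is an arbitrary class name; the field name comes
# from the named capture group in the regex.
EXTRACT-zip = \s+zip:\s*(?<zip>\d+)
```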

Communicator

After further reading, it now sounds like REPORT in props.conf is ideal, which requires a matching stanza in transforms.conf.
