Splunk Search

How to rewrite metadata using the values of a few keys in the event data?

Builder

We are looking at [potentially] adding an abstraction layer between a host and the indexers, but we would of course lose the metadata that is so key to Splunk. We are looking to use fluentd as the abstraction layer/data pipeline. In many cases I have nice JSON output with key/value pairs, but I would like to use the values of a few keys to rewrite the metadata (host, index, source, sourcetype). So let's say we have this:

{"sourcetype":"fluentd","index":"main"}

How do I carve out those fields and rewrite them as metadata? It seems that I need to use a regex; can't I use the keys?

Any help is much appreciated!

SplunkTrust

Extracting and indexing processes happen in a particular order. Regex can be confusing, but this one can be really simple.

Try something like this -

props.conf:

 [bv]
 KV_MODE = json
 INDEXED_EXTRACTIONS = json
 TRANSFORMS-extract = json_extraction, index_reset
 FIELDALIAS-conn_id = protocol.session_id AS conn_id

transforms.conf:

 [json_extraction]
 SOURCE_KEY = _raw
 DEST_KEY = _raw
 REGEX = ^([^{]+)({.+})$
 FORMAT = $2

 [index_reset]
 SOURCE_KEY = index
 DEST_KEY = _MetaData:Index
 # the capture group is needed so that $1 has a value
 REGEX = (.+)
 FORMAT = $1

Most of this was copied from your comment on this one - https://answers.splunk.com/answers/501118/setting-event-time-and-host-metadata-from-keyvalue.html. I've just modified it so that TRANSFORMS-extract runs two transforms stanzas, the second of which takes the entire value of the index field and uses it to rewrite the index metadata.

Basic method is cribbed from here - https://answers.splunk.com/answers/1026/route-data-to-index-based-on-host.html

Wiser heads should feel free to comment on any issues with this code.


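Note: for the index ONLY, use _MetaData:Index (with the leading underscore). For any other metadata, use MetaData:Host, MetaData:Source, or MetaData:Sourcetype; those three expect the FORMAT value to be prefixed with host::, source::, or sourcetype:: respectively (for example, FORMAT = host::$1). The key names are case-sensitive.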

http://docs.splunk.com/Documentation/Splunk/5.0.3/Admin/Transformsconf

Builder

Thank you for the response. It did not work:

props.conf:

[bv]
KV_MODE = json
INDEXED_EXTRACTIONS = JSON
TRANSFORMS-extract = json_extraction, host_extraction
FIELDALIAS-conn_id = protocol.session_id AS conn_id
FIELDALIAS-timestamp = protocol.timestamp AS ts
TIMESTAMP_FIELDS = "protocol.timestamp"

transforms.conf:

[json_extraction]
SOURCE_KEY = _raw
DEST_KEY = _raw
REGEX = ^([^{]+)({.+})$
FORMAT = $2

[host_extraction]
SOURCE_KEY = protocol.host
DEST_KEY = MetaData:host
REGEX = .
FORMAT = $1

You will see that I am trying to rewrite the host metadata and not the index. However, I get this error:

Splunk> Take the sh out of IT.

Checking prerequisites...
    Checking http port [8000]: open
    Checking mgmt port [8089]: open
    Checking appserver port [127.0.0.1:8065]: open
    Checking kvstore port [8191]: open
    Checking configuration...  Done.
    Checking critical directories...    Done
    Checking indexes...
        Validated: _audit _internal _introspection _telemetry _thefishbucket bro bv firedalerts history main os summary unix_summary
    Done
    Checking filesystem compatibility...  Done
    Checking conf files for problems...
    Done
Undocumented key used in transforms.conf; stanza='host_extraction' setting='SOURCE_KEY' key='protocol.host'
Undocumented key used in transforms.conf; stanza='host_extraction' setting='DEST_KEY' key='MetaData:host'
Please resolve these problems by correcting typos in key names, or by adding them to [accepted_keys] in transforms.conf if they are intended.
    Checking default conf files for edits...
    Validating installed files against hashes from '/opt/splunk/splunk-6.5.2-67571ef4b87d-linux-2.6-x86_64-manifest'
    All installed files intact.
    Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...  
Done
                                                           [  OK  ]

Waiting for web server at http://127.0.0.1:8000 to be available........... Done

What am I missing here? Again, any help is MUCH appreciated!

SplunkTrust

It is telling you that the key 'protocol.host' is not known at the time the config is analyzed. Check that the spelling and capitalization are exactly what you expect to extract, and that the underlying field will have been extracted before this rule runs. If so, you can tell Splunk not to worry about it with an entry in [accepted_keys].

Regarding the second one, from reviewing a few other posts, I believe that MetaData:Host has to have a capital H.

http://docs.splunk.com/Documentation/Splunk/5.0.3/Admin/Transformsconf

"By adding entries to [accepted_keys], you can tell Splunk that a key that is not documented is a key you intend to work for reasons that are valid in your environment / app / etc."

[accepted_keys]
is_valid = protocol.host
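
On the original question of using keys rather than a regex against _raw: transforms.conf.spec describes a field: prefix for SOURCE_KEY that reads an already indexed field, which is the kind of field INDEXED_EXTRACTIONS = json produces. The following is a minimal, untested sketch. It assumes that prefix is available in your Splunk version and that the JSON field really is named protocol.host as in the config above; the stanza name is hypothetical.

```ini
# transforms.conf (sketch; stanza name is made up for illustration)
[host_from_indexed_field]
# "field:" looks up an already indexed field instead of a KEY such as _raw
SOURCE_KEY = field:protocol.host
# capture the whole value so that $1 is defined
REGEX = (.+)
DEST_KEY = MetaData:Host
# the host key expects the host:: prefix
FORMAT = host::$1
```

A regex is still required, but in this form it only has to capture the whole field value rather than parse the event.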

Builder

Did this question ever get answered? I find it hard to believe that Splunk cannot rewrite metadata from kv pairs (or JSON, XML, etc.). The links above do not seem to cover this from a kv pair perspective; they all seem to require a REGEX. What am I missing?

Revered Legend

It sounds like you want to override metadata such as host, sourcetype, and index based on event data. You have to use a transform to override them. See these links for overriding host and sourcetype; overriding index works the same way.

https://docs.splunk.com/Documentation/Splunk/6.5.1/Data/Overridedefaulthostassignments
http://docs.splunk.com/Documentation/Splunk/6.5.1/Data/Advancedsourcetypeoverrides
https://answers.splunk.com/answers/301504/how-to-override-sourcetype-and-index-assignment.html
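
As a rough sketch of that transform-based approach for the JSON example from the question, the config below matches the raw event text and writes the captured values into the metadata keys. The props.conf stanza name [fluentd_incoming] and the transform stanza names are hypothetical, and the regexes assume flat string values without escaped quotes. Per transforms.conf.spec, _MetaData:Index takes the bare index name, while the host and sourcetype keys expect a host:: or sourcetype:: prefix; verify this against the spec for your version. These are index-time settings, so they belong on the instance that parses the data (typically an indexer or heavy forwarder), and the target index must already exist.

```ini
# props.conf (sketch; the stanza is whatever sourcetype the input assigns)
[fluentd_incoming]
TRANSFORMS-set_meta = set_index_from_json, set_host_from_json, set_sourcetype_from_json

# transforms.conf (SOURCE_KEY is omitted, so each REGEX runs against _raw)
[set_index_from_json]
REGEX = "index"\s*:\s*"([^"]+)"
DEST_KEY = _MetaData:Index
FORMAT = $1

[set_host_from_json]
REGEX = "host"\s*:\s*"([^"]+)"
DEST_KEY = MetaData:Host
FORMAT = host::$1

[set_sourcetype_from_json]
REGEX = "sourcetype"\s*:\s*"([^"]+)"
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::$1
```

With an event such as {"sourcetype":"fluentd","index":"main"}, the first and third transforms would match and the host transform would leave the host unchanged, since a transform whose REGEX does not match is skipped.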
