Way to insert/create field based on source?

mfrost8
Builder

I have a need (OK, it's a desire) to create a field that I can search on based on an input. The particular field I want to create exists in the path to the source file, but unfortunately, there's no way to reliably predict which part of the path it shows up in. I don't want to generically identify the field as simply "tag".

Also note that this field does not exist anywhere in the events of the source.

What I'd like to do is something like the following (imaginary) inputs.conf:

[monitor:///var/opt/MQHA/FOO/data/FOO/errors]
_whitelist = AMQERR01\.LOG$
field = qmgr = FOO

so that when I do a search, I could simply say qmgr="FOO". I realize I could wildcard with source="*FOO*", but I don't like that solution. So I'm really trying to inject a key and value for any events from this input.

If there's some way to do this, it's eluding me.

I'm running 4.1.4.

Thanks

1 Solution

Stephen_Sorkin
Splunk Employee

There are two possible approaches for this: create an index-time field at input time or create an eventtype that represents your data.

For the eventtype, create in $SPLUNK_HOME/etc/apps/search/local/eventtypes.conf a stanza like:

[qmgr-foo]
search = source=*FOO*

To retrieve based on this, you'll have to search for eventtype=qmgr-foo, so it's a bit suboptimal.

A more direct way is to use an index-time field.

In inputs.conf, we will pre-populate the _meta key, which holds the index-time fields. The stanza will look like:

[monitor:///var/opt/MQHA/FOO/data/FOO/errors]
_whitelist = AMQERR01\.LOG$
_meta = qmgr::foo

You'll also want to edit fields.conf to add qmgr as an index-time field:

[qmgr]
INDEXED = true
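
Since the value is set per input, each queue manager would get its own monitor stanza with its own _meta value. A rough sketch, assuming a second queue manager named BAR under a similar path (the BAR path and value are hypothetical, not from the original post):

```
# inputs.conf (one stanza per queue manager)
[monitor:///var/opt/MQHA/FOO/data/FOO/errors]
_whitelist = AMQERR01\.LOG$
_meta = qmgr::foo

# Hypothetical second queue manager, for illustration only
[monitor:///var/opt/MQHA/BAR/data/BAR/errors]
_whitelist = AMQERR01\.LOG$
_meta = qmgr::bar
```

With both in place, a search for qmgr=foo or qmgr=bar should return only events from the matching input, wherever the name sits in the path.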

lguinn2
Legend

You could also use the rex command to create a field on the fly:

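 <your search here> | rex field=source "W3SVC(?<key>[^\\\.]*)" | <your next command>

This means "create a field named key which consists of all the characters following W3SVC in the source field, up to but not including the next \ or ." The quantifier has to be greedy here: a lazy *? at the end of the pattern would match zero characters and leave the field empty. Adjust the literal prefix to whatever precedes the value in your own paths.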

southeringtonp
Motivator

A third approach is to do it at search time. If you're looking for a discrete set of strings in the source, you can extract the field at search time by pointing SOURCE_KEY at source in transforms.conf.

Something like this:

transforms.conf

[addfoo]
SOURCE_KEY = source
REGEX = (foo|bar|baz)
FORMAT = qmgr::$1

props.conf

[source::/var/opt/MQHA/FOO/data/FOO/errors/*]
REPORT-foo = addfoo

Stephen_Sorkin
Splunk Employee

I considered this approach, but it has a key problem: in most cases you can't search on qmgr=foo and get the right results. This is because of the heuristic used for search-time fields, where the search engine assumes that the value itself is searchable. So with qmgr=foo, first the index is consulted for events that match "foo", and then those are filtered. Unfortunately, an embedded string in the source is not indexed and hence not searchable. This can be defeated by setting INDEXED_VALUE=false in fields.conf, but this comes at the expense of speed, since a full table scan is performed.
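
For reference, the fields.conf override mentioned above would look roughly like this for the qmgr field produced by the search-time extraction. This is a sketch under the assumption that the field name matches the FORMAT in the transform, and it has not been tested against 4.1.4:

```
# fields.conf
[qmgr]
# Do not assume the value (e.g. "foo") exists as an indexed term in the event text;
# events are retrieved first and filtered on qmgr afterwards, which is slower.
INDEXED_VALUE = false
```

This setting belongs with the search-time extraction only. It is an alternative to the INDEXED = true stanza used for the index-time field, not something to combine with it.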

Stephen_Sorkin
Splunk Employee

Yes, this is an undocumented feature of inputs.conf. Any DEST_KEY in transforms.conf can be set directly in inputs.conf. You are correct that the inputs.conf setting goes on the forwarder but fields.conf goes on the indexer.
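
As a sketch of that split (the etc/system/local locations are an assumption; an app's local directory should work as well):

```
# On the forwarder: $SPLUNK_HOME/etc/system/local/inputs.conf
[monitor:///var/opt/MQHA/FOO/data/FOO/errors]
_whitelist = AMQERR01\.LOG$
_meta = qmgr::foo

# On the indexer: $SPLUNK_HOME/etc/system/local/fields.conf
[qmgr]
INDEXED = true
```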

mfrost8
Builder

I'm a little confused about this in 2 ways. First, I don't see the _meta key defined in the inputs.conf documentation. Is this an undocumented feature of inputs.conf?

Second, the inputs.conf comes from a LW forwarder. I assume I can put the lines above there, but I would also assume that the fields.conf definition would need to go on the indexer itself and not the LWF, correct? Thanks.
