How to do a field extraction on the source field?

Champion

Hi,

I need to extract a new field from the source field, but I'm not sure how to do that. Can someone help me?

Motivator

Have you tried using rex?

... search terms here ... | rex field=source "instances\/(?<NewFieldName>[^\/]+)" | stats count by NewFieldName
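
To see what that rex would capture before running it in Splunk, here is a minimal Python sketch. It assumes the sample source path posted elsewhere in this thread; the other paths are made up for illustration. Python writes named groups as (?P<name>...), while Splunk's PCRE also accepts (?<name>...), so the pattern is adapted accordingly. The Counter only loosely mimics "stats count by NewFieldName".

```python
import re
from collections import Counter

# Python spelling of the named group; Splunk's rex accepts (?<NewFieldName>...).
PATTERN = re.compile(r"instances/(?P<NewFieldName>[^/]+)")

# The first path is the sample from this thread; the rest are hypothetical.
sources = [
    "/apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log",
    "/apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/access.log",
    "/apps/oracle/install/admin/instances/ovdprtp2b/diagnostics/logs/OVD/ovd1/diagnostic.log",
    "/var/log/messages",
]

def count_by_field(paths):
    """Count events per extracted value, skipping paths that do not match."""
    counts = Counter()
    for path in paths:
        match = PATTERN.search(path)
        if match:
            counts[match.group("NewFieldName")] += 1
    return dict(counts)

counts = count_by_field(sources)
print(counts)
```

Paths without an "instances/" segment simply produce no field, which is also how a non-matching rex behaves.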

Influencer

Try this:

On your search heads, in props.conf, within the stanzas you want to create this extraction for, add:

EXTRACT-vdsHost = instances\/(?<vdsHost>[^\/]+)/diagnostics in source

After saving, either reload your search head(s), or less intrusively, open the following URL while logged into the search head under an admin account:

https://YOURSPLUNKSERVERHERE:8000/en-US/debug/refresh

Lastly, run a search on the data and verify that the new "vdsHost" field appears in the sidebar.

Splunk Employee

An even easier props.conf method is to use EXTRACT- without referencing transforms.conf:

[LogFiles]
TIME_FORMAT = %m/%d/%Y
...
...
EXTRACT-myfield = instances/(?<myField>[^/]*)/diagnostics in source
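
As a rough illustration of how the "<regex> in <src_field>" form reads, here is a small Python sketch. It is not Splunk's parser: split_extract and to_python are hypothetical helpers, the _raw text is invented, and the fallback to _raw when no "in" clause is present is my understanding of the documented default.

```python
import re

def split_extract(value):
    """Split an EXTRACT value into (regex, source_field)."""
    regex, sep, field = value.rpartition(" in ")
    if sep and re.fullmatch(r"\w+", field.strip()):
        return regex.strip(), field.strip()
    # No trailing "in <field>": assume the regex applies to _raw.
    return value.strip(), "_raw"

def to_python(regex):
    """Rewrite (?<name>...) as (?P<name>...), leaving lookbehinds alone."""
    return re.sub(r"\(\?<(?![=!])", "(?P<", regex)

# Hypothetical event; the source path is the sample from this thread.
event = {
    "source": "/apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log",
    "_raw": "2015-01-01T00:00:00.000 some log line",
}

regex, field = split_extract("instances/(?<myField>[^/]*)/diagnostics in source")
match = re.search(to_python(regex), event[field])
fields = match.groupdict() if match else {}
print(field, fields)
```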

Splunk Employee

Awesome, I didn't know that you could do "in source"

Champion

Tried this, but the field is not showing up.

Put this in my props, pushed it via my cluster manager, even did a rolling restart on the indexers, but it's not appearing.

EXTRACT-vdsHost = instances/(?[^/]*)/diagnostics in source

Splunk Employee

Search-time field extractions go on the search head, not the indexers. Also, the capture group in your regex is missing a name, such as myField in my example above.
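
A quick way to see why the missing name matters: a group that opens with "(?" but has no name is not valid regex syntax. The sketch below uses Python's re module as a stand-in for Splunk's PCRE engine (an assumption; the error text differs, but both reject the pattern). try_compile is a hypothetical helper.

```python
import re

def try_compile(pattern):
    """Return "ok" if the pattern compiles, otherwise the rejection reason."""
    try:
        re.compile(pattern)
        return "ok"
    except re.error as exc:
        return f"rejected: {exc}"

# As posted: "(?" followed directly by the character class, no group name.
unnamed = r"instances/(?[^/]*)/diagnostics"
# With a name; Python needs (?P<...>), Splunk also accepts (?<vdsHost>...).
named = r"instances/(?P<vdsHost>[^/]*)/diagnostics"

print(try_compile(unnamed))
print(try_compile(named))
```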

Splunk Employee

You could also apply it in props/transforms.conf. I had one scenario where, given a file like /var/log/SystemAOutput.good, I wanted to extract "SystemAOutput" and "good". I did this via props.conf and transforms.conf:

props.conf:

[LogFiles]
TIME_FORMAT = %m/%d/%Y
MAX_EVENTS = 100000
NO_BINARY_CHECK = true
disabled = false
pulldown_type = true
REPORT-reporting = extract_filename

transforms.conf:

[extract_filename]
SOURCE_KEY = source
REGEX = [/\\]([^\\/\.]*?)(?:_File\d*){0,1}\.(bad|good)$
FORMAT = srcfile::$1 status::$2

Output will then be:

Filename: /var/log/SystemAOutput.good
srcfile: SystemAOutput
status: good
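
To check the transform outside Splunk, here is a hedged Python sketch that applies the regex and then fills in the FORMAT template. It assumes the pattern starts with a path-separator class, [/\\], so that the first capture begins right after the last slash. apply_transform is a hypothetical helper, and the second test path is invented.

```python
import re

# Assumed form of the transform: separator, file stem, optional _File<N>, status.
REGEX = re.compile(r"[/\\]([^\\/\.]*?)(?:_File\d*){0,1}\.(bad|good)$")
FORMAT = "srcfile::$1 status::$2"

def apply_transform(source):
    """Apply REGEX to the source path and build fields from FORMAT."""
    match = REGEX.search(source)
    if match is None:
        return {}
    fields = {}
    for pair in FORMAT.split():
        name, _, template = pair.partition("::")
        # Replace each $N with the text of capture group N.
        fields[name] = re.sub(
            r"\$(\d+)", lambda m: match.group(int(m.group(1))), template
        )
    return fields

result = apply_transform("/var/log/SystemAOutput.good")
print(result)
```

A name such as /var/log/SystemB_File2.bad (hypothetical) would give srcfile SystemB, because the optional _File group is matched outside the first capture.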

Champion

I decided to go with the props/transforms method. Can someone help me with the regex? I'm not very good with these expressions.

Source = /apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log

I need to extract the value between instances and diagnostics.

Motivator

Like @David said, props/transforms.conf is the way to go. From the docs on props.conf-only extractions:

All extraction configurations in props.conf are restricted by a specific source, source type, or host. Start by identifying the source type, source, or host that provide the events that your field should be extracted from.

Also from the docs on transforms.conf extractions:

Your search-time field extractions require a field transform component if you need to:
• Reuse the same field-extracting regular expression across multiple sources, source types, or hosts (in other words, configure one field transform for multiple field extractions). If you find yourself using the same regex to extract fields for different sources, source types, and hosts, you may want to set it up as a transform. Then, if you find that you need to update the regex, you only have to do so once, even though it is used in more than one field extraction.

So you can't wildcard the sourcetype. To do what you want while keeping maintenance easy, create a field transform in transforms.conf and reference it in props.conf for each host/source/sourcetype to which it applies:

transforms.conf:

[myNewFieldExtract]
REGEX = instances\/(?<NewFieldName>[^\/]+)
SOURCE_KEY = source

props.conf:

[first_sourcetype_this_applies_to]
REPORT-my_class_name = myNewFieldExtract

[second_sourcetype_this_applies_to]
REPORT-my_class_name = myNewFieldExtract

... and so on...
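
The idea of one transform shared by several sourcetypes can be sketched in Python as follows. This is only a toy model: TRANSFORMS and PROPS are hypothetical in-memory stand-ins for the two conf files, the example events are invented apart from the source path, and the named group uses Python's (?P<...>) spelling.

```python
import re

# Stand-in for transforms.conf: the regex is defined once.
TRANSFORMS = {
    "myNewFieldExtract": {
        "REGEX": r"instances/(?P<NewFieldName>[^/]+)",
        "SOURCE_KEY": "source",
    },
}

# Stand-in for props.conf: each sourcetype lists the transforms it uses.
PROPS = {
    "first_sourcetype_this_applies_to": ["myNewFieldExtract"],
    "second_sourcetype_this_applies_to": ["myNewFieldExtract"],
}

def extract_fields(event):
    """Run every transform referenced by the event's sourcetype."""
    fields = {}
    for name in PROPS.get(event["sourcetype"], []):
        transform = TRANSFORMS[name]
        text = event.get(transform["SOURCE_KEY"], "")
        match = re.search(transform["REGEX"], text)
        if match:
            fields.update(match.groupdict())
    return fields

path = "/apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log"
first = extract_fields({"sourcetype": "first_sourcetype_this_applies_to", "source": path})
second = extract_fields({"sourcetype": "second_sourcetype_this_applies_to", "source": path})
other = extract_fields({"sourcetype": "unlisted_sourcetype", "source": path})
print(first, second, other)
```

Updating the regex in the single TRANSFORMS entry changes the result for every sourcetype that references it, which is the maintenance benefit the docs describe.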

Champion

Not sure what I'm doing wrong here... followed what you have...

props.conf:

[sourcetype::vds_access]
ANNOTATE_PUNCT = false
KV_MODE = auto
LINE_BREAKER = ([\r\n]+).\d{4}-\d{2}-\d{2}
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = ^.
TRUNCATE = 999999
REPORT-vdsaccessExtract = vdsHost_extract

[sourcetype::vds_diagnostic]
ANNOTATE_PUNCT = false
KV_MODE = auto
LINE_BREAKER = ([\r\n]+).\d{4}-\d{2}-\d{2}
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = ^.
TRUNCATE = 999999
REPORT-vdsdiagExtract = vdsHost_extract
pulldown_type = 1

transforms.conf:

[vdsHost_extract]
REGEX = instances\/(?[^\/]+)
SOURCE_KEY = source

I pushed these out via the cluster manager, but still don't see the field.

Motivator

It looks like you're missing the name of the new field in your transforms.conf stanza. Assuming you want the field to show up in Splunk as vdsHost, it should be:

[vdsHost_extract]
REGEX = instances/(?<vdsHost>[^/]+)
SOURCE_KEY = source

Or if you want to do it the old school way:

[vdsHost_extract]
REGEX = instances/([^/]+)
FORMAT = vdsHost::$1
SOURCE_KEY = source
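
Both variants should produce the same field. A small Python sketch of the equivalence, using the sample source path from this thread; the two function names are hypothetical, and the named group is written in Python's (?P<...>) spelling:

```python
import re

SOURCE = "/apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log"

def named_group(source):
    """Variant 1: the field name comes from the named capture group."""
    match = re.search(r"instances/(?P<vdsHost>[^/]+)", source)
    return match.groupdict() if match else {}

def positional_with_format(source):
    """Variant 2: unnamed group, field name supplied as in FORMAT = vdsHost::$1."""
    match = re.search(r"instances/([^/]+)", source)
    return {"vdsHost": match.group(1)} if match else {}

print(named_group(SOURCE))
print(positional_with_format(SOURCE))
```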

Champion

It's there - just getting stripped by this website.

[vdsHost_extract]
REGEX = instances\/(?<vdsHost>[^\/]+)
SOURCE_KEY = source

Motivator

So you pushed the configuration out to your search heads... What is the status of the knowledge bundle? You can check it with: splunk show cluster-bundle-status

Have you tried refreshing or restarting your search head? You can refresh at https://your_splunk_url:port/en-US/debug/refresh

Have you checked the permissions on the field transformation? They are most likely fine but I'm trying to cover all bases. Settings --> Fields --> Field Transformations.

Champion

OK, just want to make sure that I'm following this...

The props.conf listed above goes on the indexers, and the transforms.conf on the search head, correct?

Motivator

Also note that the class names in each props.conf report stanza should be unique.

Influencer

This should work.

Champion

Thanks. I'm testing out both methods. Is there a way to put the rex above in an extraction, rather than in a search?

Champion

So, this?

vds_* : EXTRACT-vdsHost Inline field=source "instances\/(?[^\/]+)"

I need this to work across multiple sources and sourcetypes, can I wildcard a sourcetype when creating a field extraction?

Splunk Employee

Yes, the method I posted below.

Community Manager

Hi @a212830

Would you be able to provide sample data and what exactly you're trying to extract from the source field so users have more content to work with?
