Splunk Search

How to do a field extraction on the source field?

a212830
Champion

Hi,

I need to extract a new field from the source field, but I'm not sure how to do that. Can someone help me?

wpreston
Motivator

Have you tried using rex?

... search terms here ... | rex field=source "instances\/(?<NewFieldName>[^\/]+)" | stats count by NewFieldName
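
To sanity-check the pattern outside Splunk, here is a minimal Python sketch run against the sample source path quoted in this thread. It is only a stand-in for rex: Python's re module spells named groups (?P<name>...) where rex takes the PCRE form (?<name>...), and the function name extract_instance is made up for illustration.

```python
import re

# Python spells named groups (?P<name>...); Splunk's rex uses the PCRE form (?<name>...).
PATTERN = re.compile(r"instances/(?P<NewFieldName>[^/]+)")


def extract_instance(source):
    """Return the path segment that follows 'instances/', or None if absent."""
    match = PATTERN.search(source)
    return match.group("NewFieldName") if match else None


sample = "/apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log"
print(extract_instance(sample))  # ovdprtp2a
```

A source path without an instances/ segment simply yields no value, which mirrors how the field would be absent for such events.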

masonmorales
Influencer

Try this:

On your search heads, in props.conf, within the stanzas you want to create this extraction for, add:

EXTRACT-vdsHost = instances\/(?<vdsHost>[^\/]+)/diagnostics in source

After saving, either reload your search head(s), or less intrusively, open the following URL while logged into the search head under an admin account:

https://YOURSPLUNKSERVERHERE:8000/en-US/debug/refresh

Lastly, run a search on the data and verify that the new "vdsHost" field appears in the sidebar.

_d_
Splunk Employee

An even easier props.conf method is to use EXTRACT- without referencing transforms.conf:

[LogFiles]
TIME_FORMAT = %m/%d/%Y
...
...
EXTRACT-myfield = instances/(?<myField>[^/]*)/diagnostics in source

David
Splunk Employee

Awesome, I didn't know that you could do "in source"

a212830
Champion

Tried this, but the field is not showing up.

Put this in my props, pushed it via my cluster manager, even did a rolling restart on the indexers, but it's not appearing.

EXTRACT-vdsHost = instances/(?[^/]*)/diagnostics in source

_d_
Splunk Employee

Field extractions go on the search head, not the indexers. Also, the capture group in your regex is missing a name, like myField in my example above.
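
A small Python sketch of why the name matters. Python's re module stands in for Splunk's regex engine here, so the exact error Splunk reports may differ. The helper name compiles is made up for illustration.

```python
import re


def compiles(pattern):
    """Report whether Python's regex engine accepts the pattern at all."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Group opened with "(?" but no name: rejected as an unknown extension.
unnamed_ok = compiles(r"instances/(?[^/]*)/diagnostics")
# Same group with a name (Python spelling of PCRE's (?<myField>...)).
named_ok = compiles(r"instances/(?P<myField>[^/]*)/diagnostics")

print(unnamed_ok, named_ok)  # False True
```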

David
Splunk Employee

You could also do this with props.conf and transforms.conf. In one scenario, given a file like /var/log/SystemAOutput.good, I wanted to extract "SystemAOutput" and "good". I did it with the following props.conf and transforms.conf:

props.conf:

[LogFiles]
TIME_FORMAT = %m/%d/%Y
MAX_EVENTS = 100000
NO_BINARY_CHECK = true
disabled = false
pulldown_type = true
REPORT-reporting = extract_filename

transforms.conf:

[extract_filename]
SOURCE_KEY = source
REGEX = [/\\]([^\\/\.]*?)(?:_File\d*){0,1}\.(bad|good)$
FORMAT = srcfile::$1 status::$2

Output will then be:

Filename: /var/log/SystemAOutput.good
srcfile: SystemAOutput
status: good
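
As a rough check of that output, here is a Python sketch that mimics the transform with the re module. It assumes the pattern opens with [/\\], a class that matches the path separator itself. A negated class [^/\\] in that position would instead consume the first character of the file name and yield ystemAOutput. The function name extract_filename_fields is hypothetical.

```python
import re

# Leading [/\\] matches the path separator itself, so group 1 starts at the
# first character of the file name.
REGEX = re.compile(r"[/\\]([^\\/.]*?)(?:_File\d*)?\.(bad|good)$")


def extract_filename_fields(source):
    """Mimic FORMAT = srcfile::$1 status::$2 as a dict, or {} on no match."""
    match = REGEX.search(source)
    if not match:
        return {}
    return {"srcfile": match.group(1), "status": match.group(2)}


print(extract_filename_fields("/var/log/SystemAOutput.good"))
# {'srcfile': 'SystemAOutput', 'status': 'good'}
```

The optional (?:_File\d*) group means a name such as SystemAOutput_File12.bad would also produce srcfile SystemAOutput, with status bad.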

a212830
Champion

I decided to go with the props/transforms method. Can someone help me with the regex? I'm not very good with these expressions.

Source = /apps/oracle/install/admin/instances/ovdprtp2a/diagnostics/logs/OVD/ovd1/diagnostic.log

I need to extract the value between instances and diagnostics.

wpreston
Motivator

Like @David said, props/transforms.conf is the way to go. From the docs on props.conf-only extractions:

All extraction configurations in props.conf are restricted by a specific source, source type, or host. Start by identifying the source type, source, or host that provide the events that your field should be extracted from.

Also from the docs on transforms.conf extractions:

Your search-time field extractions require a field transform component if you need to:
• Reuse the same field-extracting regular expression across multiple sources, source types, or hosts (in other words, configure one field transform for multiple field extractions). If you find yourself using the same regex to extract fields for different sources, source types, and hosts, you may want to set it up as a transform. Then, if you find that you need to update the regex, you only have to do so once, even though it is used in more than one field extraction.

So you can't wildcard the sourcetype. To do what you want while keeping maintenance easy, create a field transform in transforms.conf and reference it in props.conf for each host/source/sourcetype to which it applies:

transforms.conf:

[myNewFieldExtract]
REGEX = instances\/(?<NewFieldName>[^\/]+)
SOURCE_KEY = source

props.conf:

# Sourcetype stanzas use the bare sourcetype name; only host:: and source:: stanzas take a prefix.
[first_sourcetype_this_applies_to]
REPORT-my_class_name = myNewFieldExtract

[second_sourcetype_this_applies_to]
REPORT-my_class_name = myNewFieldExtract

... and so on...

a212830
Champion

Not sure what I'm doing wrong here... followed what you have...

props.conf:

[sourcetype::vds_access]
ANNOTATE_PUNCT = false
KV_MODE = auto
LINE_BREAKER = ([\r\n]+).\d{4}-\d{2}-\d{2}
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = ^.
TRUNCATE = 999999
REPORT-vdsaccessExtract = vdsHost_extract

[sourcetype::vds_diagnostic]
ANNOTATE_PUNCT = false
KV_MODE = auto
LINE_BREAKER = ([\r\n]+).\d{4}-\d{2}-\d{2}
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = ^.
TRUNCATE = 999999
REPORT-vdsdiagExtract = vdsHost_extract
pulldown_type = 1

transforms.conf:

[vdsHost_extract]
REGEX = instances\/(?[^\/]+)
SOURCE_KEY = source

I pushed these out via the cluster manager, but still don't see the field.

wpreston
Motivator

It looks like you're missing the name of the new field in your transforms.conf stanza. Assuming you want the field to show up in Splunk as vdsHost, it should be:

[vdsHost_extract]
REGEX = instances/(?<vdsHost>[^/]+)
SOURCE_KEY = source

Or if you want to do it the old school way:

[vdsHost_extract]
REGEX = instances/([^/]+)
FORMAT = vdsHost::$1
SOURCE_KEY = source

a212830
Champion

It's there - just getting stripped by this website.

[vdsHost_extract]
REGEX = instances\/(?<vdsHost>[^\/]+)
SOURCE_KEY = source

wpreston
Motivator

So you pushed the configuration out to your search heads... What is the status of the knowledge bundle? You can check it with: splunk show cluster-bundle-status

Have you tried refreshing or restarting your search head? You can refresh at https://your_splunk_url:port/en-US/debug/refresh

Have you checked the permissions on the field transformation? They are most likely fine but I'm trying to cover all bases. Settings --> Fields --> Field Transformations.

a212830
Champion

OK, just want to make sure that I'm following this...

The props listed above go on the indexer, and the transforms on the search head? Correct?

wpreston
Motivator

Also note that the class names in each props.conf report stanza should be unique.

masonmorales
Influencer

This should work.

a212830
Champion

Thanks. I'm testing out both methods. Is there a way to put the rex above in an extraction, rather than in a search?

a212830
Champion

So, this?

vds_* : EXTRACT-vdsHost Inline field=source "instances\/(?[^\/]+)"

I need this to work across multiple sources and sourcetypes, can I wildcard a sourcetype when creating a field extraction?

David
Splunk Employee

Yes, the method I posted below.

ppablo
Retired

Hi @a212830

Would you be able to provide sample data and what exactly you're trying to extract from the source field so users have more content to work with?
