
Enhancing Data with Field Extraction and Automated Lookups

Flenwy
Explorer

Hello all,
I have the following issue:
I receive logs from an older machine whose logging settings I cannot adjust. When extracting data in Splunk, I get the following field with values such as:

id = EF_jblo_fdsfew42_sla
id = EF_space_332312_sla
id = EF_97324_pewpew_sla

With a field extraction I then get my location from the id.
For example:

id = EF_jblo_fdsfew42_sla    => location = jblo
id = EF_space_332312_sla     => location = space
id = EF_97324_pewpew_sla     => location = 97324   <- this is not actually a location

 

Now I want to replace the location using an automatic lookup based on the ID "EF_97324_pewpew_sla". Unfortunately, I either get only the location from the lookup table and lose the rest, or I get only the values from the field extraction.

I've reviewed the search sequence as per the documentation, ensuring that field extraction precedes lookup. However, I'm perplexed as to why it consistently erases all the values rather than just overwriting a single one. Is there an automated solution running in the background, similar to automatic lookup, that could resolve this?

The lookup I had in mind:

ID                     Solution
EF_97324_pewpew_sla    TSINOC

 

My original concept was as follows:

  1. Data is ingested into Splunk.
  2. A field extraction extracts the location from the ID.
  3. For the IDs that I know contain no location information, the extracted value is replaced with the lookup data.

I wanted to run the whole thing in the "background" so that the users do not have to run it as a search string.

I also tried to use calculated fields to build one field from the two, but since calculated fields are evaluated before lookups, this unfortunately did not work.

Hope someone can help me.
Kind regards,
Flenwy





yuanliu
SplunkTrust

> Now, I aim to replace the location using an automatic lookup based on the ID "EF_97324_pewpew_sla." Unfortunately, I encounter an issue where I either retrieve only the location from the table, omitting the rest, or I only receive the values extracted from the field extraction.

I think you mean that your extraction populates the location field for every id, even those that do not contain location information. Instead of creating a table with all possible IDs, you want to use a sparsely populated lookup to selectively override the "bad" location value in events with "bad" IDs. Is this correct?

Let me restate the requirement as this: if a lookup value exists, you want it to take precedence over any value your field extraction populates; if a lookup value does not exist, use the extracted value.

SPL can use coalesce to express precedence. You need to name the extracted field and the lookup output field differently. Say you name your extracted field location_may_be_bad and the lookup output field just location; you can then use this to get the location:

| eval location = coalesce(location, location_may_be_bad)
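
To try the idea end to end without touching real data, here is a minimal sketch using makeresults. It makes several assumptions: the lookup definition name example_id_overrides is hypothetical, the lookup is assumed to have the columns ID and Solution as in the question, and the rex pattern only stands in for whatever extraction is actually configured.

```spl
| makeresults
| eval id=split("EF_jblo_fdsfew42_sla,EF_space_332312_sla,EF_97324_pewpew_sla", ",")
| mvexpand id
| rex field=id "^[A-Z]{2}_(?<location_may_be_bad>[^_]+)"
| lookup example_id_overrides ID AS id OUTPUT Solution AS location
| eval location = coalesce(location, location_may_be_bad)
| table id location_may_be_bad location
```

If the lookup contains only the row for EF_97324_pewpew_sla, the first two events should keep their extracted value and the third should take the value from the lookup, because coalesce returns its first non-null argument.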

Hope this helps.



isoutamo
SplunkTrust

Hi,

Would it be possible to put all locations into this automatic lookup and use only that, without any additional field extractions?

r. Ismo

Flenwy
Explorer

Hello @isoutamo, hello @yuanliu,

thank you for your replies.
At the moment I use "coalesce" as a quick fix for the issue, but I think in the long run I will implement the lookup solution.

Thank you both for your help!

Kind regards,
Flenwy


yuanliu
SplunkTrust

Based on your illustrated data, the id field seems to have a certain format that can help you extract only location.  For example,

 

| rex field=id "^[A-Z]{2}_(?<location>\D[^_]*)"

 

will give you

id                     location
EF_jblo_fdsfew42_sla   jblo
EF_space_332312_sla    space
EF_97324_pewpew_sla

If you can find the correct format and a regex that populates location only when the format is correct, you can use the OUTPUTNEW feature in lookup. (Automatic lookup also has the OUTPUTNEW feature; I believe it is the default.) This way, you do not have to perform the field-name acrobatics.
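
As a rough sketch of how the two pieces combine in a search, assuming a hypothetical lookup definition named example_id_overrides with the columns ID and Solution from the question:

```spl
| makeresults
| eval id=split("EF_jblo_fdsfew42_sla,EF_space_332312_sla,EF_97324_pewpew_sla", ",")
| mvexpand id
| rex field=id "^[A-Z]{2}_(?<location>\D[^_]*)"
| lookup example_id_overrides ID AS id OUTPUTNEW Solution AS location
| table id location
```

With OUTPUTNEW, the lookup leaves location alone in events where rex already populated it and fills it only where it is still empty, so a single field name is enough and no coalesce is needed.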

Flenwy
Explorer

Hello,

thank you for this idea.
I will try this solution this week.

Thanks,
Flenwy


isoutamo
SplunkTrust
Nice to hear that you found a solution, or actually several. Remember that with Splunk there are almost always several ways to do things, not just one! When you need to pick "the best" one, look at performance etc. in the Job Inspector to better understand how each of them works.
Happy Splunking!