Splunk Search

Why can't values of fields from field extractions be searched as expected?

bitnapper
Path Finder

Hi there,

I created multiple field extractions, extracting values from different sourcetypes into the same field:

sourcetype0: "field0":"(?<geolocation_code>.{7})
sourcetype1: "field1":"(?<geolocation_code>.{7})
sourcetype2: "field2":"(?<geolocation_code>.{7})

They are populated as expected, all looking like ABCDEF0, CDEFGH3 or ZDEGFH9. But when I search for geolocation_code=ABCDEF0 I get zero hits, even though the preview in the fields pane on the left shows plenty of those values. Searching for geolocation_code!=ABCDEF0, on the other hand, works exactly as intended. Also, geolocation_code=ABCDEF0* gives the result I expected from geolocation_code=ABCDEF0, even though the field contains exactly the value I'm looking for and nothing more. I don't understand what is happening here, or why it happens only with this extraction and not with others.


bitnapper
Path Finder

The data is indeed JSON. But I want another field that contains just the first 7 characters of field0, so I can use it for cross-referencing. When I extract it with "field0":"(?<geolocation_code>.{7}) I can only search it as geolocation_code=ABCDEF1*, not as geolocation_code=ABCDEF1.

{"info":"text","host":"SOURCE00033","@timestamp":"2022-12-07T09:33:01.000Z","user":"UserName@domain","device":"DVD/CD-ROM","ctime":"2022-12-07T09:33:01.000Z","id":737112781,"field0":"ABCDEF1ABCDEF01.domain.local"}

 


yuanliu
SplunkTrust

geolocation_code is a new field, is that correct?  If you only want the first 7 characters, you can do so easily with substr (or any number of other methods).

| eval geolocation_code = substr(coalesce(field0, field1, field2), 1, 7)

 


bitnapper
Path Finder

And substr works in field extractions? But why does the search only work with the asterisk? I'd love to understand what I misunderstood about how Splunk processes this regex, so I can avoid repeating the mistake in the future. As far as I understand it, this should not happen, but obviously it does, so there must be something I got wrong.


yuanliu
SplunkTrust

substr only works in SPL. What I was saying is that there is no need to do this in transforms (or worse, as an index-time extraction). Generally, you should do things at search time, and it is bad practice to run regex against raw events that contain structured data like JSON. (Run the regex on fields extracted from the JSON instead.)
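As a rough sketch of that advice, the regex can be applied to an already extracted field rather than to _raw. This assumes field0, field1 and field2 are already available as fields at search time (the thread does not say how JSON extraction is configured), and geo_source is a hypothetical helper field introduced only for this example:

```
| eval geo_source = coalesce(field0, field1, field2)
| rex field=geo_source "^(?<geolocation_code>.{7})"
```

Because rex runs against geo_source, the pattern no longer needs to match the surrounding JSON key and quotes.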

As I said earlier, without knowing the actual data, it is not possible to tell why you need that asterisk. Using substr can actually help you diagnose the problem, because it gives you an independent data point that does not rely on your automatic extraction, and automatic extractions are generally more difficult to diagnose.
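A minimal sketch of such a diagnostic search, assuming field0 is already extracted from the JSON and using sourcetype0 and ABCDEF0 from the original question as stand-ins; geo_check and extracted_len are hypothetical field names chosen for this example:

```
sourcetype=sourcetype0
| eval geo_check = substr(field0, 1, 7)
| where geo_check == "ABCDEF0"
| eval extracted_len = len(geolocation_code)
| table field0 geo_check geolocation_code extracted_len
```

If geo_check and geolocation_code agree and extracted_len is 7 for every row, the extraction itself is producing the right value, and the problem is more likely in how the base search matches it.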


bitnapper
Path Finder

So with substr I have no such effect. 
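One possible explanation, which nobody in this thread confirms: for a search-time field, Splunk by default first looks for the searched value as an indexed token in the raw event. ABCDEF1 is only the leading part of the token ABCDEF1ABCDEF01, so geolocation_code=ABCDEF1 finds no candidate events, while geolocation_code=ABCDEF1* matches that token by prefix. A field built with eval is filtered after the events are retrieved, which would fit the observation that substr shows no such effect. If this is the cause, a fields.conf stanza along these lines on the search head is the usual workaround (treat it as an assumption to test, not a confirmed fix for this instance):

```
[geolocation_code]
INDEXED_VALUE = false
```

With this setting, Splunk stops assuming that the field value appears as a whole token in the raw event, at the cost of scanning more events for searches on that field.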


yuanliu
SplunkTrust

Without seeing actual data, I cannot tell why your instance behaves that way. However, the snippet you showed suggests that your data is actually JSON. If this is correct, using regex for extraction is counterproductive. You should have "field0", "field1", etc., already. Just use coalesce.

| eval geolocation_code = coalesce(field0, field1, field2)

If the raw events are not JSON but part of it is, aim to extract the entire JSON object, then run spath on it. 
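For that mixed case, a hedged sketch might look like the following. It assumes each event contains a single JSON object, delimited by the first { and the last } in the raw text; json_payload is a hypothetical field name used only here:

```
| rex "(?<json_payload>\{.*\})"
| spath input=json_payload
| eval geolocation_code = substr(coalesce(field0, field1, field2), 1, 7)
```

The greedy .* is deliberate so that nested braces stay inside the captured object. Events with several separate JSON objects would need a more specific pattern.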
