Splunk Search

Single field not always extracted, but appears when piping into "extract"

spock_yh
Path Finder

I have set up a search-time field extraction that pulls several fields out of a URL in a log file.

My problem is that one of these fields is extracted for some events but not for others, for no apparent reason. Here are two such examples. The first manages to extract the field, the second doesn't:

1.1.9.1 - [20/Mar/2011:17:39:37 -0700] 15625 "some.web.site" GET "/myaccount/videos/B004CZXC54.flv" "" 307 - "medusa" "-" "Python-urllib/2.6" "2.2.2.2"

1.1.9.1 - [20/Mar/2011:18:10:45 -0700] 0 "some.web.site" GET "/myaccount/videos/B003QMJAXM.flv" "" 307 - "medusa" "-" "Python-urllib/2.6" "2.2.2.2"

The field I'm trying to extract is the one corresponding to the "myaccount" part. As you can see, the two events are extremely similar, yet the field shows up for only one of them.

The odd thing about this is that:

* If I pipe my search into | extract reload=T, I can see the missing field for all results.
* There are a number of fields after this missing field (for the "videos" part, "B003QM.." part, "flv" part, etc.) that are extracted fine.

The original regular expression was quite complex but I stripped it down to something simple that still shows the problem:

 /(?<medusa_account_alias>[^/]+)/(?<medusa_restype>videos|images)

The problem field is the medusa_account_alias field. The fields following it seem to be extracted ok.
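
As a sanity check outside Splunk, the stripped-down pattern can be run against both sample events. This is only a sketch in Python, not Splunk's own extraction engine: Python spells named groups as (?P<name>...), so the pattern is adapted accordingly, and the helper name extract_fields is made up for illustration. If both events yield the same fields here, the regex itself is unlikely to be the cause.

```python
import re

# Adapted from the pattern in the post: Python's re module needs (?P<name>...)
# instead of (?<name>...). The field names are unchanged.
PATTERN = re.compile(
    r'/(?P<medusa_account_alias>[^/]+)/(?P<medusa_restype>videos|images)'
)

# The two sample events from the post, copied verbatim.
EVENTS = [
    '1.1.9.1 - [20/Mar/2011:17:39:37 -0700] 15625 "some.web.site" GET '
    '"/myaccount/videos/B004CZXC54.flv" "" 307 - "medusa" "-" '
    '"Python-urllib/2.6" "2.2.2.2"',
    '1.1.9.1 - [20/Mar/2011:18:10:45 -0700] 0 "some.web.site" GET '
    '"/myaccount/videos/B003QMJAXM.flv" "" 307 - "medusa" "-" '
    '"Python-urllib/2.6" "2.2.2.2"',
]


def extract_fields(event):
    """Hypothetical helper: return the named groups of the first match, or None."""
    match = PATTERN.search(event)
    return match.groupdict() if match else None


for event in EVENTS:
    print(extract_fields(event))
```

Both events produce the same two fields under this pattern, which points away from the regex and toward something else in the configuration.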

Any ideas will be greatly appreciated. Is this some kind of bug in Splunk, or am I missing something?

Ledion_Bitincka
Splunk Employee

Can you please provide the props.conf/transforms.conf stanzas that are responsible for performing the extractions and field aliasing?

spock_yh
Path Finder

The problem is caused by a field alias I have defined.

What I want is to have medusa_account_alias filled either from the above regex, or from another field ("accountId") extracted for another format of the log row. I used an alias from accountId to medusa_account_alias, and this caused the problem.

How do I achieve this otherwise? Having a field that can get filled by two disjoint cases?

Also, this doesn't explain why Splunk's behavior was so arbitrary. Why would it generate medusa_account_alias for one event and not for the other?
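
One possible way to think about filling a single field from two disjoint cases is "first non-null wins", which is the behavior of the coalesce function in Splunk's eval command. The sketch below shows that logic in Python rather than in Splunk configuration. The helper names and the sample accountId value are hypothetical, and it is not a confirmed fix for the alias problem described above.

```python
def coalesce(*values):
    """Return the first value that is not None, or None if all are missing."""
    for value in values:
        if value is not None:
            return value
    return None


def resolve_account_alias(fields):
    """Hypothetical helper: prefer the regex-extracted field, else accountId."""
    return coalesce(fields.get("medusa_account_alias"), fields.get("accountId"))


# Case 1: the URL regex matched, so accountId is absent.
print(resolve_account_alias({"medusa_account_alias": "myaccount"}))

# Case 2: the other log format, where only accountId was extracted.
# "12345" is a made-up sample value.
print(resolve_account_alias({"accountId": "12345"}))
```

Unlike a plain alias, this never discards a value that is already present: the second source is consulted only when the first is missing.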
