Extract URL field with regex for certain error codes

splunkchris2
New Member

Hi everyone,

I have one logfile per day that is filled with several lines of information showing requests to play video streams:

ABC: [2019:09:10 09:39:15] abcdefg 1234567890 -hijklmnopqrs !warning! Request to play stream : "http://holiday.mpeg" on [website]

ABC: [2019:09:10 09:39:16] abcdefg 1234567890 -hijklmnopqrs !warning! Show error message : "Streamfail"

ABC: [2019:09:10 09:39:20] abcdefg 1234567890 -hijklmnopqrs success

I am trying to extract the URLs listed in the file whenever the request is followed by the error message "Streamfail".

For the example above, I would like to extract the video name along with its number of occurrences:
1 x holiday.mpeg

I have tried the following:
index=website.log ("Show error message" AND "Streamfail")
| rex field=_raw "\/(?[^\?\/]+)\?"
| stats count by Streamfail


splunkchris2
New Member

I'm quite new to Splunk, so sorry for asking.
The logs have an individual logID, but how could this be of any help?
By the lists, do you mean e.g. list1 = failures: Streamfail, Loginfail, Downloadfail, etc., and list2 = successes: streamsuccess, loginsuccess, downloadsuccess, ...?


jpolvino
Builder

If there is some sort of transaction ID, it can help tie the events together. In your sample, what do these fields signify: ABC, abcdefg, 1234567890, hijklmnopqrs? Are these values static across all events, or are one or more of them unique to each file request?

In some systems, you'll find something like this (simplifying your example):
[2019:09:10 09:39:15] 12345 !warning! Request to play stream : "http://holiday.mpeg" on [website]
[2019:09:10 09:39:16] 12345 !warning! Show error message : "Streamfail"
[2019:09:10 09:39:20] 12345 success
[2019:09:10 09:39:21] 6789 !warning! Request to play stream : "http://splunk.mpeg" on [website]
[2019:09:10 09:39:30] 6789 success

In this case, transaction ID 12345 has the error message you seek, so the file associated with that transaction ID is what you want displayed. Transaction ID 6789 does not have the error message, so you don't care about that transaction's file.

If this is accurate, then one approach would be to collect all transaction IDs that have Streamfail, and then use those transaction IDs to find the filenames.
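As a rough sketch of that approach, and only under the assumption that events look like the simplified sample above (a numeric transaction ID directly after the timestamp), something along these lines might work. The field names txn_id, video, and error are made up for illustration, and the first rex would need adjusting to wherever the ID actually sits in the real events:

```spl
index=website.log ("Request to play stream" OR "Streamfail")
| rex field=_raw "^\[[^\]]+\]\s+(?<txn_id>\d+)\s"
| rex field=_raw "Request to play stream : \"https?://(?<video>[^\"]+)\""
| rex field=_raw "Show error message : \"(?<error>[^\"]+)\""
| stats values(video) AS video values(error) AS error BY txn_id
| search error="Streamfail"
| stats count BY video
```

The first stats groups the request event and the error event under the same transaction ID, the search keeps only transactions that contain Streamfail, and the final stats counts failed transactions per video. For the sample data this should list holiday.mpeg once and omit splunk.mpeg.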


jpolvino
Builder

One potential problem is that you may have several requests intermingled, some successful and some not. Do your logs have any unique identifiers, such as a session ID or request ID? If so, you could use that identifier to find the IDs associated with errors, and then use those IDs to find the file.

Can you provide a listing of events that are successful, and a listing of events that indicate failure?


thiru_wf
Observer

| rex field=_raw "\/(?P.[^\?\/\"]+)[\?\/\"]"


splunkchris2
New Member

Thanks, but it returns the following error:
⚠ Error in 'rex' command: Encountered the following error while compiling the regex '\/(?P.[^\?\/"]+)[\?\/"]': Regex: unrecognized character after (?P
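
The error suggests that the capture group has no name: in rex, (?P or (? must be followed by a field name in angle brackets, and the name appears to be missing from the snippet as posted (possibly lost when the post was rendered, which is a guess). A minimal sketch with a hypothetical field name, video, that matches the quoted URL in the sample "Request to play stream" events:

```spl
index=website.log "Request to play stream"
| rex field=_raw "Request to play stream : \"https?://(?<video>[^\"]+)\""
| table _time video
```

This only extracts the name from the request events. It does not yet tie a request to a later Streamfail event, which would still need some shared identifier between the two.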
