Splunk Search

Help with regex to extract fields with different patterns

vrmandadi
Builder

Hello Experts,

I am trying to extract some data from events with different patterns and save it in a field called "details", but I don't think a regex can do that, since the events do not follow a single pattern. Below are sample events; I have highlighted in bold the text to be extracted into the new field. Is it possible?

sample events

Sample 1

28 Aug 2017 22:33:49 [WARN ] http_srv: DONE 1023533 0.023082 404[Not Found] UNKNOWN-ID 24.211.252.82:58699 GET http://mmdai-linear-west-02.ti.com/linear-scope010.ti.com/LIVE/1024/hls/ae/HGTV_HD/.swnd8bdfc1a-7d30...... (id 19702873)
Sample 2

28 Aug 2017 14:21:53 [WARN ] Content Generator: Client with unknown ID. Rejecting request (id 30299754) - uuid 5777d11e-c8d6-49ac-a5b2-fa163e11220e

Sample 3
28 Aug 2017 20:21:53 [WARN ] http_cli: Origin latency exceeded threshold: 0.183154 seconds GET Status: 200[OK] Bytes: 10087 Origin URL: http://aa.video.cdn.ch.com/LIVE/1027/hls/ae/VH1HD/3400.m3u8 refReqId 22804466 reqWait 0 (id 3314781656)

Sample 4
28 Aug 2017 20:41:08 [WARN ] Content Generator: Media Time Line Broken. Reset time line for session 185d563a-dd54-4234-a38a-005056b20601 (id 1052189)

Sample 5
28 Aug 2017 20:45:24 [WARN ] ManifestCache: Request Failed: add entry status 404 url http://mmdai-vod-west-01.ti.com/TWCTV_vod/ooh/vod-9.ti.com/HLS_DRM/move1572890050300002/index.m3u8 (id 3292891677)
Sample 6
28 Aug 2017 14:46:30 [INFO ] ManifestCache: HLS STATS: requests=0 reqHit=0 reqMiss=0 urlHit=0 urlMiss=0 toServer=0
Sample 7
28 Aug 2017 20:47:30 [WARN ] ManifestCache: Sequence number jumped back from 2069693 to 2069689 for http://linear-scope010.t.com/LIVE/2002/hls/ae/NFLNHD_13698/150.m3u8, keep original content (id 3313583396)

Sample 8
28 Aug 2017 20:45:22 [ERROR] snalarmd: Only one health check operation supported at a time

Sample 9
28 Aug 2017 14:50:20 [DEBUG] CSAP: traverseClientInitBufferAndUpdateState: size of client list is: 0

Sample 10
28 Aug 2017 20:50:07 [WARN ] AAA Manager: VMAP VAST ADS Plugin: undefined variable in server URL "http://69.134.155.15/adrouter/vmap/v1/scte?caid=tntdrama.com/TNTD0006071701018135&csid=stva_kindle_t..." (id 3020264156)

Sample 11
28 Aug 2017 15:04:45 [DEBUG] CSAP: traverseClientInitBufferAndUpdateState: size of client list is: 0
Sample 12
28 Aug 2017 13:25:29 [INFO ] snalarmd: NET-SNMP version 5.7.1 AgentX subagent connected

Sample 12
28 Aug 2017 15:32:16 [WARN ] ServerManager: Satellite 192.168.0.12 changed to use Other as Master.

Sample 13
28 Aug 2017 15:32:16 [INFO ] ServerManager: Socket 31 is ready for 192.168.0.8:5551 set bufsize 67108864

DalJeanis
SplunkTrust

Here's the thing - yes, you can do it, but it's probably what they call in analytics "overfitting". It's probably going to need constant tweaking.

Why do you want to leave out the session ID, the latency that was exceeded, the other specifics?

In any case, here's the run-anywhere code that does these 13 samples. Basically, it was easier to extract a little bit more and then sed away the last bits.

| makeresults 
| eval mydata="28 Aug 2017 22:33:49 [WARN ] http_srv: DONE 1023533 0.023082 404[Not Found] UNKNOWN-ID 24.211.252.82:58699 GET http://mmdai-linear-west-02.ti.com/linear-scope010.ti.com/LIVE/1024/hls/ae/HGTV_HD/.swnd8bdfc1a-7d30.... (id 19702873)!!!!28 Aug 2017 14:21:53 [WARN ] Content Generator: Client with unknown ID. Rejecting request (id 30299754) - uuid 5777d11e-c8d6-49ac-a5b2-fa163e11220e!!!!28 Aug 2017 20:21:53 [WARN ] http_cli: Origin latency exceeded threshold: 0.183154 seconds GET Status: 200[OK] Bytes: 10087 Origin URL: http://aa.video.cdn.ch.com/LIVE/1027/hls/ae/VH1HD/3400.m3u8 refReqId 22804466 reqWait 0 (id 3314781656)!!!!28 Aug 2017 20:41:08 [WARN ] Content Generator: Media Time Line Broken. Reset time line for session 185d563a-dd54-4234-a38a-005056b20601 (id 1052189)!!!!28 Aug 2017 20:45:24 [WARN ] ManifestCache: Request Failed: add entry status 404 url http://mmdai-vod-west-01.ti.com/TWCTV_vod/ooh/vod-9.ti.com/HLS_DRM/move1572890050300002/index.m3u8 (id 3292891677)!!!!28 Aug 2017 14:46:30 [INFO ] ManifestCache: HLS STATS: requests=0 reqHit=0 reqMiss=0 urlHit=0 urlMiss=0 toServer=0!!!!28 Aug 2017 20:47:30 [WARN ] ManifestCache: Sequence number jumped back from 2069693 to 2069689 for http://linear-scope010.t.com/LIVE/2002/hls/ae/NFLNHD_13698/150.m3u8, keep original content (id 3313583396)!!!!28 Aug 2017 20:45:22 [ERROR] snalarmd: Only one health check operation supported at a time!!!!28 Aug 2017 14:50:20 [DEBUG] CSAP: traverseClientInitBufferAndUpdateState: size of client list is: 0!!!!28 Aug 2017 20:50:07 [WARN ] AAA Manager: VMAP VAST ADS Plugin: undefined variable in server URL \"http://69.134.155.15/adrouter/vmap/v1/scte?caid=tntdrama.com/TNTD0006071701018135&csid=stva_kindle_tab_vod&vcid=850ec238-bd28-319a-b6b7-1efcd689b9f2&adId={{.CLIENT_URI.PARAM[adId]}}&idt=CHTR_ADM_STVA_IH_VMAP\" (id 3020264156)!!!!28 Aug 2017 15:04:45 [DEBUG] CSAP: traverseClientInitBufferAndUpdateState: size of client list is: 0!!!!28 Aug 2017 13:25:29 [INFO ] snalarmd: NET-SNMP version 5.7.1 AgentX subagent connected!!!!28 Aug 2017 15:32:16 [WARN ] ServerManager: Satellite 192.168.0.12 changed to use Other as Master.!!!!28 Aug 2017 15:32:16 [INFO ] ServerManager: Socket 31 is ready for 192.168.0.8:5551 set bufsize 67108864"
| makemv delim="!!!!" mydata
| mvexpand mydata
| rename mydata as _raw
| rename COMMENT as "The above just inputs your test data."

| rename COMMENT as "Below is the requested code"
| rex "(?:[^:]*:){3} (?<details>DONE|Request Failed: [^\d]+|HLS STATS: [^:\(\[\n]+|Socket [^\n]+|[^:\(\[\n]+)"
| rex mode=sed field=details "s/[-0-9a-fA-F]{36} ?$//g s/for( http)?$//g"
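
For anyone who wants to check the "extract a bit more, then trim" idea outside Splunk, here is a minimal Python sketch of the same two steps. It ports the rex pattern to Python's `re` syntax and applies the two sed-style substitutions afterwards. The helper name `extract_details` and the final `rstrip()` are illustrative additions that are not part of the search above, and it assumes Python's regex engine treats this pattern the same way Splunk's does.

```python
import re

# Python port of the rex pattern above (Python spells named groups "(?P<name>...)").
DETAILS_RE = re.compile(
    r"(?:[^:]*:){3} (?P<details>DONE|Request Failed: [^\d]+"
    r"|HLS STATS: [^:(\[\n]+|Socket [^\n]+|[^:(\[\n]+)"
)


def extract_details(event):
    """Hypothetical helper: capture slightly too much, then trim the variable tail."""
    match = DETAILS_RE.search(event)
    if match is None:
        return None
    details = match.group("details")
    # Same substitutions as the sed step: drop a trailing 36-character
    # UUID-like token, then a dangling "for" or "for http".
    details = re.sub(r"[-0-9a-fA-F]{36} ?$", "", details)
    details = re.sub(r"for( http)?$", "", details)
    return details.rstrip()  # whitespace trim is an addition, not in the search above


# Four of the sample events from the question, copied as posted.
SAMPLE_1 = (
    "28 Aug 2017 22:33:49 [WARN ] http_srv: DONE 1023533 0.023082 404[Not Found] "
    "UNKNOWN-ID 24.211.252.82:58699 GET http://mmdai-linear-west-02.ti.com/"
    "linear-scope010.ti.com/LIVE/1024/hls/ae/HGTV_HD/.swnd8bdfc1a-7d30...... (id 19702873)"
)
SAMPLE_4 = (
    "28 Aug 2017 20:41:08 [WARN ] Content Generator: Media Time Line Broken. "
    "Reset time line for session 185d563a-dd54-4234-a38a-005056b20601 (id 1052189)"
)
SAMPLE_5 = (
    "28 Aug 2017 20:45:24 [WARN ] ManifestCache: Request Failed: add entry status 404 "
    "url http://mmdai-vod-west-01.ti.com/TWCTV_vod/ooh/vod-9.ti.com/HLS_DRM/"
    "move1572890050300002/index.m3u8 (id 3292891677)"
)
SAMPLE_7 = (
    "28 Aug 2017 20:47:30 [WARN ] ManifestCache: Sequence number jumped back from "
    "2069693 to 2069689 for http://linear-scope010.t.com/LIVE/2002/hls/ae/"
    "NFLNHD_13698/150.m3u8, keep original content (id 3313583396)"
)

for event in (SAMPLE_1, SAMPLE_4, SAMPLE_5, SAMPLE_7):
    print(repr(extract_details(event)))
```

On these four events the sketch returns `DONE` for Sample 1 and drops the session UUID and the trailing `for http` from Samples 4 and 7. Each new message shape would likely need another alternative in the pattern, which is the overfitting concern mentioned above.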

woodcock
Esteemed Legend

You will have to write a couple of regexes and apply the most specific ones first. This is as close as a general case can get:

... | rex "(?:[^:]*:){3}(?<details>[^:\(]+)"
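
To see what this general pattern captures, it can be tried outside Splunk. The following is a rough Python sketch, assuming Python's `re` module behaves like Splunk's regex engine for this pattern; the helper name `general_details` and the `strip()` call are illustrative additions. The pattern skips past the third colon (the end of the component name) and captures everything up to the next colon or opening parenthesis.

```python
import re

# Python port of the general rex pattern (named groups are "(?P<name>...)" in Python).
GENERAL_RE = re.compile(r"(?:[^:]*:){3}(?P<details>[^:(]+)")


def general_details(event):
    """Hypothetical helper: return the captured text with surrounding spaces removed."""
    match = GENERAL_RE.search(event)
    return match.group("details").strip() if match else None


# Three of the sample events from the question, copied as posted.
SAMPLE_1 = (
    "28 Aug 2017 22:33:49 [WARN ] http_srv: DONE 1023533 0.023082 404[Not Found] "
    "UNKNOWN-ID 24.211.252.82:58699 GET http://mmdai-linear-west-02.ti.com/"
    "linear-scope010.ti.com/LIVE/1024/hls/ae/HGTV_HD/.swnd8bdfc1a-7d30...... (id 19702873)"
)
SAMPLE_2 = (
    "28 Aug 2017 14:21:53 [WARN ] Content Generator: Client with unknown ID. "
    "Rejecting request (id 30299754) - uuid 5777d11e-c8d6-49ac-a5b2-fa163e11220e"
)
SAMPLE_8 = (
    "28 Aug 2017 20:45:22 [ERROR] snalarmd: Only one health check operation "
    "supported at a time"
)

for event in (SAMPLE_1, SAMPLE_2, SAMPLE_8):
    print(repr(general_details(event)))
```

Samples 2 and 8 come out as the plain message text. Sample 1 comes out as `DONE 1023533 0.023082 404[Not Found] UNKNOWN-ID 24.211.252.82`, because nothing stops the capture until the colon before the port number. Events like that are the ones that need a more specific pattern applied first.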

vrmandadi
Builder

I tried your rex, but it fails for sample events 1, 4, and 7.

woodcock
Esteemed Legend

Yes, that is why I wrote exactly what I wrote. Read it again.
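
One way to read "write a couple and do the most specific ones first" is as an ordered list of patterns in which the first hit wins. The Python sketch below illustrates that idea only. The three specific patterns, their order, and the helper name `first_match_details` are hypothetical choices made to fit these samples and were not posted in this thread. In Splunk the same idea could be expressed with several rex calls.

```python
import re

# Shared prefix: skip past the third colon (end of the component name) and any spaces.
PREFIX = r"(?:[^:]*:){3}\s*"

# Hypothetical ordering: specific patterns first, the general pattern as the fallback.
ORDERED_PATTERNS = [
    # http_srv access lines: keep only the DONE keyword.
    re.compile(PREFIX + r"(?P<details>DONE)\b"),
    # Messages that end in a 36-character UUID-like token: stop before the token.
    re.compile(PREFIX + r"(?P<details>[^:(]+?)\s*[-0-9a-fA-F]{36}"),
    # Messages followed by "for http://...": stop before "for".
    re.compile(PREFIX + r"(?P<details>[^:(]+?)\s+for http"),
    # General case from the post above.
    re.compile(PREFIX + r"(?P<details>[^:(]+)"),
]


def first_match_details(event):
    """Return the capture of the first pattern that matches, or None."""
    for pattern in ORDERED_PATTERNS:
        match = pattern.search(event)
        if match:
            return match.group("details").strip()
    return None


# Four of the sample events from the question, copied as posted.
SAMPLE_1 = (
    "28 Aug 2017 22:33:49 [WARN ] http_srv: DONE 1023533 0.023082 404[Not Found] "
    "UNKNOWN-ID 24.211.252.82:58699 GET http://mmdai-linear-west-02.ti.com/"
    "linear-scope010.ti.com/LIVE/1024/hls/ae/HGTV_HD/.swnd8bdfc1a-7d30...... (id 19702873)"
)
SAMPLE_2 = (
    "28 Aug 2017 14:21:53 [WARN ] Content Generator: Client with unknown ID. "
    "Rejecting request (id 30299754) - uuid 5777d11e-c8d6-49ac-a5b2-fa163e11220e"
)
SAMPLE_4 = (
    "28 Aug 2017 20:41:08 [WARN ] Content Generator: Media Time Line Broken. "
    "Reset time line for session 185d563a-dd54-4234-a38a-005056b20601 (id 1052189)"
)
SAMPLE_7 = (
    "28 Aug 2017 20:47:30 [WARN ] ManifestCache: Sequence number jumped back from "
    "2069693 to 2069689 for http://linear-scope010.t.com/LIVE/2002/hls/ae/"
    "NFLNHD_13698/150.m3u8, keep original content (id 3313583396)"
)

for event in (SAMPLE_1, SAMPLE_2, SAMPLE_4, SAMPLE_7):
    print(repr(first_match_details(event)))
```

The order matters: if the general pattern were tried first, it would match every event and the specific patterns would never run.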

DalJeanis
SplunkTrust

Sorry, many of us cannot (or won't) download files from strangers. If you edit your post and post the events in clear text, marked with the code button, then we may be able to help you more.

vrmandadi
Builder

@DalJeanis

Thanks for letting me know. I have edited the post and pasted the sample events.
