Splunk Search

regex field extraction question

attgjh1
Communicator

This is the search I use:
sourcetype="Outbound" | head 10000 | rex "(?im)^(?:[^:\n]*:){3}\d+\|\w+\s+\w+\s+\w+\s+(?P<Socket_time>.+)" | top 50 Socket_time

This works and extracts the field Socket_time.

Correctly extracted data: 0ms (or whatever time is specified)

However, the moment I define it as a field, the extracted data goes all wrong.
Extracted output:
0ms
<and the rest of the log is included, so the search returns a lot of unique hits>

Example of one Event:

2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms
2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms
2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms
2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314 BQC911
2012-03-21 00:00:12.436 - > Process successfully|Total processing time is 160ms

As you can see, I'm just trying to get the 2ms out, but the search extracts everything all the way to the end of the event.

My question to anyone who's willing to help: which regex should I use to ignore everything after '2ms'?

Thanks!

EDIT: I ran it through the Field Extractor and it produced the expected results, e.g.:

0ms 12
12ms 21
19ms 43

BUT when I select it normally as a field in the Search app, this is what shows up:

Socket_time=0ms2012-03-21 11:16:51.756 DEBUG - BQC911|Compress|From 00173 to 00078|Time taken is 0ms2012-03-21 11:16:51.877 DEBUG - BQC911|Socket send|10.53.16.120|Time taken is 120ms2012-03-21 11:16:51.877 INFO - BQC911|Send|00078|BQC911CM00413 BQC911 2012-03-21 11:16:51.877 INFO - BQC911|Process successfully|Total processing time is 127ms

Basically the entire event has been absorbed into this field.

1 Solution

Damien_Dallimor
Ultra Champion

You might be better off breaking the log lines up into individual events by setting SHOULD_LINEMERGE to "false" in props.conf.

And then use a regex like:

(?im)Socket connect\|.*\|Time\staken\sis\s(?<socket_connect_time>.+)

You could also add well-named field extractions for the other fields:

(?im)Compress\|.*\|Time\staken\sis\s(?<compression_time>.+)
(?im)Socket send\|.*\|Time\staken\sis\s(?<socket_send_time>.+)
(?im)Process successfully\|Total\sprocessing\stime\sis\s(?<total_processing_time>.+)
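
To see how these extractions behave once each log line is its own event, here is a minimal sketch in Python rather than Splunk. It is only an approximation: Python's `re` module needs `(?P<name>...)` for named groups, the `extract` helper is hypothetical, and the optional `\s*` after the last pipe is my own tweak because the sample Compress line in the question has a space there.

```python
import re

# The sample event from the question, split into one "event" per log line
# (roughly what Splunk would index with line merging turned off).
lines = [
    "2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms",
    "2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms",
    "2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms",
    "2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314 BQC911",
    "2012-03-21 00:00:12.436 - > Process successfully|Total processing time is 160ms",
]

# Python translations of the four extractions above.
patterns = [
    re.compile(r"Socket connect\|.*\|\s*Time\staken\sis\s(?P<socket_connect_time>.+)", re.I),
    re.compile(r"Compress\|.*\|\s*Time\staken\sis\s(?P<compression_time>.+)", re.I),
    re.compile(r"Socket send\|.*\|\s*Time\staken\sis\s(?P<socket_send_time>.+)", re.I),
    re.compile(r"Process successfully\|Total\sprocessing\stime\sis\s(?P<total_processing_time>.+)", re.I),
]


def extract(line):
    """Return every named field that any pattern finds in a single line."""
    fields = {}
    for pattern in patterns:
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields


extracted = [extract(line) for line in lines]
for line, fields in zip(lines, extracted):
    print(fields, "<-", line)
```

Because each event now ends where the log line ends, the trailing `.+` has nothing extra to swallow, and the Send line simply yields no fields.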

attgjh1
Communicator

Thanks a lot 😉

Damien_Dallimor
Ultra Champion

Well, if you really want to stick with one single merged event:

(?im)Socket connect\|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{2,5}\|Time\staken\sis\s(?<socket_connect_time>\d+ms)
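
The difference between an open-ended `.+` capture and a bounded `\d+ms` capture can be sketched in Python (not Splunk, so treat it as an illustration only). It assumes the merged event reaches the regex without line breaks, as the field value pasted in the question suggests. The names `merged_event`, `greedy`, and `bounded` are made up for this example.

```python
import re

lines = [
    "2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms",
    "2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms",
    "2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms",
    "2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314 BQC911",
    "2012-03-21 00:00:12.436 - > Process successfully|Total processing time is 160ms",
]

# Assumption: the merged event has no line breaks between the log lines.
merged_event = "".join(lines)

prefix = r"Socket connect\|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{2,5}\|Time\staken\sis\s"

# Open-ended capture: takes everything up to the end of the event.
greedy = re.compile(prefix + r"(?P<socket_connect_time>.+)", re.I)
# Bounded capture: stops right after the digits and the "ms" suffix.
bounded = re.compile(prefix + r"(?P<socket_connect_time>\d+ms)", re.I)

greedy_value = greedy.search(merged_event).group("socket_connect_time")
bounded_value = bounded.search(merged_event).group("socket_connect_time")

# The bounded capture gives the same result if the lines are newline-separated.
bounded_multiline = bounded.search("\n".join(lines)).group("socket_connect_time")

print(repr(bounded_value))
print(repr(greedy_value[:40]), "...")
```

The bounded capture does not depend on where the event ends, which is why it keeps working on a merged event.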

attgjh1
Communicator

The whole chunk of text is one entire event, that's why it's annoying =/
I'm wondering if there's any regex that ignores the remaining lines?
