topic Re: regex field extraction question in Splunk Search

regex field extraction question

attgjh1 — Mon, 23 Apr 2012 04:53:06 GMT

this is the search i use:
sourcetype="Outbound" | head 10000 | rex "(?im)^(?:[^:\n]*:){3}\d+\|\w+\s+\w+\s+\w+\s+(?P<Socket_time>.+)" | top 50 Socket_time

which works and is able to extract the field: Socket_time

Correctly extracted data: 0ms (or whatever time is specified)

however, the moment i identify it as a fieldtype, the extracted data goes all wrong.
what gets extracted:
0ms
<plus all the remaining info from the log, which makes this search return a lot of unique hits>

Example of one Event:

2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms
2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms
2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms
2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314 BQC911
2012-03-21 00:00:12.436 - > Process successfully|Total processing time is 160ms

as you can see, i'm just trying to get the 2ms out, but the search is extracting everything all the way to the end of the event.

my question to anyone who's willing to help: which regex should i use to ignore everything after '2ms'?

Thanks!

EDIT: i ran it through the Field extractor and was able to produce results:
e.g.

0ms 12
12ms 21
19ms 43

BUT when i select it normally as a field in the search app, this is what shows up:

basically the entire 'event' has been absorbed into this field.

Re: regex field extraction question

Damien_Dallimor — Mon, 23 Apr 2012 05:20:43 GMT

You might be better off breaking up the log lines into individual events by setting the SHOULD_LINEMERGE value to "false" in props.conf.

And then use a regex like :

(?im)Socket connect\|.*\|Time\staken\sis\s(?<socket_connect_time>.+)

You could also add well named field extractions for the other fields too :

(?im)Compress\|.*\|Time\staken\sis\s(?<compression_time>.+)
(?im)Socket send\|.*\|Time\staken\sis\s(?<socket_send_time>.+)
(?im)Process successfully\|Total\sprocessing\stime\sis\s(?<total_processing_time>.+)
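For anyone who wants to try these patterns outside Splunk, here is a minimal Python sketch, not a Splunk config. It assumes that splitting the sample event on newlines roughly stands in for what SHOULD_LINEMERGE = false would give you. Python's re module needs the (?P<name>...) spelling for named groups, and the helper names (extract_fields, extract_per_line) are made up for illustration. One adjustment: the sample Compress line has a space after the second pipe, so the sketch allows optional whitespace there.

```python
import re

# Sample event from the question, one log line per row.
EVENT = (
    "2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms\n"
    "2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms\n"
    "2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms\n"
    "2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314 BQC911\n"
    "2012-03-21 00:00:12.436 - > Process successfully|Total processing time is 160ms\n"
)

# The four extractions from the answer, in Python named-group syntax.
# The \s* in the Compress pattern is an adjustment for the sample data.
PATTERNS = [
    re.compile(r"(?i)Socket connect\|.*\|Time\staken\sis\s(?P<socket_connect_time>.+)"),
    re.compile(r"(?i)Compress\|.*\|\s*Time\staken\sis\s(?P<compression_time>.+)"),
    re.compile(r"(?i)Socket send\|.*\|Time\staken\sis\s(?P<socket_send_time>.+)"),
    re.compile(r"(?i)Process successfully\|Total\sprocessing\stime\sis\s(?P<total_processing_time>.+)"),
]


def extract_fields(line):
    """Return every named field that any pattern finds in a single log line."""
    fields = {}
    for pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields


def extract_per_line(event):
    """Treat each line as its own event and extract fields from each."""
    return [extract_fields(line) for line in event.splitlines()]


if __name__ == "__main__":
    for fields in extract_per_line(EVENT):
        print(fields)
```

Because each line is handled on its own, the trailing .+ can never run past the end of that line, which is the point of breaking the events up.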

Re: regex field extraction question

attgjh1 — Mon, 23 Apr 2012 05:35:28 GMT

the whole chunk of text is one entire event. that's why it's annoying =/
wondering if there's any regex that ignores the remaining lines?

Re: regex field extraction question

Damien_Dallimor — Mon, 23 Apr 2012 05:59:03 GMT

Well if you really want to stick with 1 single merged event :

(?im)Socket connect\|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{2,5}\|Time\staken\sis\s(?<socket_connect_time>\d+ms)
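A rough Python sketch of why bounding the capture with \d+ms helps on a merged event. It is an illustration only: Python's re module stands in for Splunk's regex engine, named groups use Python's (?P<name>...) spelling, and FLAT is a hypothetical copy of the event with the line breaks replaced by spaces, to mimic a case where the regex does not see newlines between the log lines. Whether that is what actually happens in the original poster's data is an assumption.

```python
import re

# Sample event from the question, kept as one merged multi-line event.
EVENT = (
    "2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms\n"
    "2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms\n"
    "2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms\n"
    "2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314 BQC911\n"
    "2012-03-21 00:00:12.436 - > Process successfully|Total processing time is 160ms\n"
)

# Hypothetical variant: same text, but with no newlines for '.' to stop at.
FLAT = " ".join(EVENT.splitlines())

# Open-ended capture: .+ takes whatever is left on the "line".
GREEDY = re.compile(
    r"(?im)Socket connect\|.*\|Time\staken\sis\s(?P<socket_connect_time>.+)"
)

# Bounded capture: digits followed by "ms", nothing more.
BOUNDED = re.compile(
    r"(?im)Socket connect\|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{2,5}\|"
    r"Time\staken\sis\s(?P<socket_connect_time>\d+ms)"
)


def extract(pattern, text):
    """Return the captured socket_connect_time, or None if nothing matches."""
    match = pattern.search(text)
    return match.group("socket_connect_time") if match else None


if __name__ == "__main__":
    print("bounded, merged event:", extract(BOUNDED, EVENT))
    print("bounded, flattened   :", extract(BOUNDED, FLAT))
    print("greedy, flattened    :", extract(GREEDY, FLAT))
```

The bounded pattern returns only the duration in both cases, while the open-ended .+ swallows the rest of the flattened text, which looks a lot like the symptom described in the question.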

Re: regex field extraction question

attgjh1 — Tue, 24 Apr 2012 01:50:00 GMT

thanks a lot 😉