Splunk Search

How to write the regex to extract the date fields and another key-value pair from my sample data?

vtsguerrero
Contributor

Hello guys, sup?

We've got this piece of log which is a MySql log and we should not change the layout, but need to extract START_DATE, END_DATE and NUM_ROWS.
As follows a piece of data:

******************************************************
*****************  EXECUTION HYW   *****************
*****************    TRTD08IIL.SQL    *****************
******************************************************

***************   26/02/2015-07:00:06  ***************
Conectting to MYSQL

Execution Data
Process Code : TRTD08IIL
Program           : TRTD08IIL.SQL
Descripton          : Table
Critic      : E
Reference Date : 20140103

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

Table tb.User stats: [num_partitions: 0, num_files: 0, num_rows: 65931447, total_size: 0, raw_data_size: 15793929087]
65931447 Rows loaded to tab_clie
MapReduce Jobs Launched:
Job 0: Map: 18  Reduce: 4   Cumulative CPU: 608.98 sec   HDFS Read: 1037601634 HDFS Write: 604987045 SUCCESS
Job 1: Map: 59  Reduce: 58   Cumulative CPU: 4337.55 sec   HDFS Read: 14707612009 HDFS Write: 29413721 SUCCESS
Total MapReduce CPU Time Spent: 0 days 7 hours 59 minutes 42 seconds 920 msec
OK
Time taken: 2256.431 seconds


***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, Karma would be appreciated.

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, Karma would be appreciated.

vtsguerrero
Contributor

Thanks @richgalloway! Worked here!

somesoni2
Revered Legend

Is data already ingested and all these lines appear as one event?

0 Karma

vtsguerrero
Contributor

Yeah, max_event was applied with 10.000 in index preview...

0 Karma

theouhuios
Motivator
rex field=_raw "*\s+(?P<Start_date>\S+)\s+"

should give you the time value.

rex field=_raw "(?P<ROLLS>num_rows:\s+\d+)"

This will get the ROLLS.

But you would want num_rows = 65931447 right? for that..

rex field=_raw "num_rows:\s+(?P<num_rows>\d+)\,"

Didn't test these. Should be good though.

0 Karma

theouhuios
Motivator

Sorry about that. Been a while since I posted here. Looks like formatting got messed up.

0 Karma

vtsguerrero
Contributor

I need to extract by regex
should be:

START_DATE = *************** 26/02/2015-07:00:06 ***************
END_DATE = *************** 26/02/2015-07:37:47 ***************
ROLLS = num_rows: 65931447

0 Karma
Get Updates on the Splunk Community!

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer at Splunk .conf24 ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...

Combine Multiline Logs into a Single Event with SOCK: a Step-by-Step Guide for ...

Combine multiline logs into a single event with SOCK - a step-by-step guide for newbies Olga Malita The ...