Splunk Search

How to write the regex to extract the date fields and another key-value pair from my sample data?

Contributor

Hello guys, sup?

We've got this piece of log which is a MySql log and we should not change the layout, but need to extract START_DATE, END_DATE and NUM_ROWS.
As follows a piece of data:

******************************************************
*****************  EXECUTION HYW   *****************
*****************    TRTD08IIL.SQL    *****************
******************************************************

***************   26/02/2015-07:00:06  ***************
Conectting to MYSQL

Execution Data
Process Code : TRTD08IIL
Program           : TRTD08IIL.SQL
Descripton          : Table
Critic      : E
Reference Date : 20140103

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

Table tb.User stats: [num_partitions: 0, num_files: 0, num_rows: 65931447, total_size: 0, raw_data_size: 15793929087]
65931447 Rows loaded to tab_clie
MapReduce Jobs Launched:
Job 0: Map: 18  Reduce: 4   Cumulative CPU: 608.98 sec   HDFS Read: 1037601634 HDFS Write: 604987045 SUCCESS
Job 1: Map: 59  Reduce: 58   Cumulative CPU: 4337.55 sec   HDFS Read: 14707612009 HDFS Write: 29413721 SUCCESS
Total MapReduce CPU Time Spent: 0 days 7 hours 59 minutes 42 seconds 920 msec
OK
Time taken: 2256.431 seconds


***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL
0 Karma
1 Solution

SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, an upvote would be appreciated.

View solution in original post

SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, an upvote would be appreciated.

View solution in original post

Contributor

Thanks @richgalloway! Worked here!

Revered Legend

Is data already ingested and all these lines appear as one event?

0 Karma

Contributor

Yeah, max_event was applied with 10.000 in index preview...

0 Karma

Motivator
rex field=_raw "*\s+(?P<Start_date>\S+)\s+"

should give you the time value.

rex field=_raw "(?P<ROLLS>num_rows:\s+\d+)"

This will get the ROLLS.

But you would want num_rows = 65931447 right? for that..

rex field=_raw "num_rows:\s+(?P<num_rows>\d+)\,"

Didn't test these. Should be good though.

0 Karma

Motivator

Sorry about that. Been a while since I posted here. Looks like formatting got messed up.

0 Karma

Contributor

I need to extract by regex
should be:

START_DATE = *************** 26/02/2015-07:00:06 ***************
END_DATE = *************** 26/02/2015-07:37:47 ***************
ROLLS = num_rows: 65931447

0 Karma