Splunk Search

How to write the regex to extract the date fields and another key-value pair from my sample data?

vtsguerrero
Contributor

Hello guys, sup?

We've got this piece of log which is a MySql log and we should not change the layout, but need to extract START_DATE, END_DATE and NUM_ROWS.
As follows a piece of data:

******************************************************
*****************  EXECUTION HYW   *****************
*****************    TRTD08IIL.SQL    *****************
******************************************************

***************   26/02/2015-07:00:06  ***************
Conectting to MYSQL

Execution Data
Process Code : TRTD08IIL
Program           : TRTD08IIL.SQL
Descripton          : Table
Critic      : E
Reference Date : 20140103

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

Table tb.User stats: [num_partitions: 0, num_files: 0, num_rows: 65931447, total_size: 0, raw_data_size: 15793929087]
65931447 Rows loaded to tab_clie
MapReduce Jobs Launched:
Job 0: Map: 18  Reduce: 4   Cumulative CPU: 608.98 sec   HDFS Read: 1037601634 HDFS Write: 604987045 SUCCESS
Job 1: Map: 59  Reduce: 58   Cumulative CPU: 4337.55 sec   HDFS Read: 14707612009 HDFS Write: 29413721 SUCCESS
Total MapReduce CPU Time Spent: 0 days 7 hours 59 minutes 42 seconds 920 msec
OK
Time taken: 2256.431 seconds


***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, Karma would be appreciated.

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, Karma would be appreciated.

vtsguerrero
Contributor

Thanks @richgalloway! Worked here!

somesoni2
SplunkTrust
SplunkTrust

Is data already ingested and all these lines appear as one event?

0 Karma

vtsguerrero
Contributor

Yeah, max_event was applied with 10.000 in index preview...

0 Karma

theouhuios
Motivator
rex field=_raw "*\s+(?P<Start_date>\S+)\s+"

should give you the time value.

rex field=_raw "(?P<ROLLS>num_rows:\s+\d+)"

This will get the ROLLS.

But you would want num_rows = 65931447 right? for that..

rex field=_raw "num_rows:\s+(?P<num_rows>\d+)\,"

Didn't test these. Should be good though.

0 Karma

theouhuios
Motivator

Sorry about that. Been a while since I posted here. Looks like formatting got messed up.

0 Karma

vtsguerrero
Contributor

I need to extract by regex
should be:

START_DATE = *************** 26/02/2015-07:00:06 ***************
END_DATE = *************** 26/02/2015-07:37:47 ***************
ROLLS = num_rows: 65931447

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...