Why are there completely different formats in the same logfile?

sini
Explorer

Hi all,

We have an application that produces logfiles into which other logs are inserted (they are pulled from stdout when another program is executed). We are only interested in the stdout generated by the SQL statements of that other program, which are themselves multiline entries in a specific format. So basically an SQL event starts with a date and ends at the next date of an SQL event. We have a regex that captures all the SQL lines we are interested in, but we cannot see a way to ignore the rest of the logfile: all routing to nullQueue or SEDCMD takes place after timestamp recognition and event breaking, and those other entries either mess up the event breaking or get attached to the SQL events if we specify a timestamp configuration that only matches the SQL statements.

Basically, all lines not matching ^(\d+|\t+|\s\s+|CREATE|SELECT|DROP|UPDATE|INSERT|FROM|TBLPROPERTIES|\)).* need to be excluded before any timestamp recognition or event breaking is applied.
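
To illustrate what that filter would keep and drop, here is a minimal Python sketch. The regex is copied from the post; the sample lines and the helper name `keep_sql_lines` are made up for illustration, since the real log format is not shown in this thread.

```python
import re

# Pattern copied verbatim from the post: a line is kept if it starts with
# digits, tabs, two or more whitespace characters, an SQL keyword, or ")".
SQL_LINE = re.compile(
    r"^(\d+|\t+|\s\s+|CREATE|SELECT|DROP|UPDATE|INSERT|FROM|TBLPROPERTIES|\)).*"
)

def keep_sql_lines(lines):
    """Return only the lines that match the SQL pattern (hypothetical helper)."""
    return [line for line in lines if SQL_LINE.match(line)]

# Hypothetical sample lines; not taken from the real logfile.
sample = [
    "2021-03-01 10:00:00 executing statement",
    "SELECT id, name",
    "\tFROM customers",
    "INFO worker started",
    ")",
    "Traceback: something else",
]

kept = keep_sql_lines(sample)
for line in kept:
    print(repr(line))
```

Note that the `\d+` alternative keeps any line starting with a digit, so date lines from the unwanted entries would also pass if they begin with a number.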

To make it clear again: the problem is that all events, including those we want to get rid of, are multiline events with different starts and ends, and the dates for the event types appear in different locations and formats. Hence the exclusion must occur before merging takes place.

Is this possible? 

Regards

1 Solution

richgalloway
SplunkTrust

As you know, timestamp extraction and event breaking happen early in the processing pipeline and the order cannot be changed.

Is it possible to break events based on the end of the SQL rather than the beginning of the next SQL?

Consider using Cribl (cribl.io) to filter out unwanted events before they get to Splunk.

---
If this reply helps you, Karma would be appreciated.

PickleRick
SplunkTrust

You could use INGEST_EVAL and/or CLONE_SOURCETYPE.

https://conf.splunk.com/files/2020/slides/PLA1154C.pdf

sini
Explorer

Hi,

Thanks for confirming. I might as well just use different scripted inputs to get exactly what I need. The file isn't written constantly so it's sufficient to parse it once a day and then send the required contents to the Indexer.
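
A minimal sketch of such a pre-filter in Python, assuming the script is run once a day against the finished file and its stdout is what gets sent on. The function names `filter_stream` and `main` are hypothetical, and the demo lines are made up; only the regex comes from the original question (the trailing `.*` is dropped because `match()` only needs the prefix).

```python
import io
import re
import sys

# Line prefixes that belong to an SQL event, taken from the question above.
SQL_LINE = re.compile(
    r"^(\d+|\t+|\s\s+|CREATE|SELECT|DROP|UPDATE|INSERT|FROM|TBLPROPERTIES|\))"
)

def filter_stream(src, dst, pattern=SQL_LINE):
    """Copy every line of src that matches pattern to dst; return the count."""
    written = 0
    for line in src:
        if pattern.match(line):
            dst.write(line)
            written += 1
    return written

def main(path):
    """Hypothetical entry point: print the SQL lines of one logfile to stdout."""
    with open(path, encoding="utf-8", errors="replace") as logfile:
        return filter_stream(logfile, sys.stdout)

# In-memory demonstration with made-up log lines.
demo_in = io.StringIO("2021-03-01 10:00:00 run\nINFO noise\nSELECT 1\n")
demo_out = io.StringIO()
count = filter_stream(demo_in, demo_out)
```

Because the unwanted lines are gone before the data is handed over, timestamp recognition and event breaking only ever see the SQL entries.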

Regards
