Getting Data In

Why are there completely different formats in the same logfile?

sini
Explorer

Hi all,

We have an application whose logfiles contain output from other programs (it is pulled from stdout when the other program is executed). We are only interested in the stdout generated by the SQL statements of one of those programs. These are multiline entries in a specific format: an SQL event starts with a date and ends at the date of the next SQL event.

We have a regex that captures all the SQL lines we are interested in, but we cannot see a way to ignore the rest of the logfile. Routing to nullQueue and SEDCMD both take place after timestamp recognition and event breaking, so the other entries either mess up the event breaking or, if we specify a time config that matches only the SQL statements, get attached to the SQL events.

Basically, all lines not matching ^(\d+|\t+|\s\s+|CREATE|SELECT|DROP|UPDATE|INSERT|FROM|TBLPROPERTIES|\)).* need to be excluded before any timestamp recognition or event breaking is applied.
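
For illustration, here is a minimal Python sketch of that exclusion step, applied to raw lines before Splunk sees them. The regex is the one quoted above; the function name keep_sql_lines and the sample log lines are made up for this example, since the real log layout is not shown in the post.

```python
import re

# Pattern copied from the post; a line that does not match it is dropped.
SQL_LINE = re.compile(
    r"^(\d+|\t+|\s\s+|CREATE|SELECT|DROP|UPDATE|INSERT|FROM|TBLPROPERTIES|\)).*"
)


def keep_sql_lines(text):
    """Return the lines of `text` that match SQL_LINE, in their original order."""
    return [line for line in text.splitlines() if SQL_LINE.match(line)]


# Hypothetical sample input (assumption: not taken from the real logfile).
sample = "\n".join([
    "2024-01-01 10:00:00 executing statement",
    "SELECT id,",
    "\tname",
    "FROM users",
    "INFO worker started",
    "[main] heartbeat ok",
    ")",
])

for line in keep_sql_lines(sample):
    print(line)
```

Because the filter works line by line, it does not need to know where any event starts or ends, which is the property asked for here.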

To be clear: the problem is that all events, including those we want to get rid of, are multiline events with different start and end markers, and the date for each event type appears in a different location and format. Hence the exclusion must occur before merging takes place.

Is this possible? 

Regards

richgalloway
SplunkTrust

As you know, timestamp extraction and event breaking happen early in the processing pipeline and the order cannot be changed.

Is it possible to break events based on the end of the SQL rather than the beginning of the next SQL?

Consider using Cribl (cribl.io) to filter out unwanted events before they get to Splunk.

---
If this reply helps you, Karma would be appreciated.

PickleRick
SplunkTrust

You could use INGEST_EVAL and/or CLONE_SOURCETYPE.

https://conf.splunk.com/files/2020/slides/PLA1154C.pdf

sini
Explorer

Hi,

Thanks for confirming. I might as well just use different scripted inputs to get exactly what I need. The file isn't written constantly so it's sufficient to parse it once a day and then send the required contents to the Indexer.
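
A rough sketch of what such a scripted input could look like in Python, assuming the script is handed the logfile path as its only argument and that the regex from the original question decides which lines to keep. Splunk indexes whatever a scripted input writes to stdout, so the script only has to print the wanted lines. The function name emit_sql_lines and the command-line handling are illustrative assumptions, not an existing script.

```python
import re
import sys

# Regex from the original question; only matching lines are forwarded.
SQL_LINE = re.compile(
    r"^(\d+|\t+|\s\s+|CREATE|SELECT|DROP|UPDATE|INSERT|FROM|TBLPROPERTIES|\)).*"
)


def emit_sql_lines(src, out):
    """Copy matching lines from `src` to `out`; return how many were written."""
    written = 0
    for line in src:
        # Match without the trailing newline, but write the line unchanged.
        if SQL_LINE.match(line.rstrip("\n")):
            out.write(line)
            written += 1
    return written


# Assumed invocation: python filter_sql.py /path/to/logfile
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as src:
        emit_sql_lines(src, sys.stdout)
```

The file is streamed line by line rather than loaded whole, so a once-a-day run stays cheap even for large files. Timestamp recognition and event breaking for the SQL events can then be configured for the script's sourcetype, since the unwanted entries never reach Splunk.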

Regards
