Complex line breaking configuration help needed

jfaldmomacu
Path Finder

Here is a snippet of a log file that I am trying to configure line breaking for. I want an event break only where one line contains "info]*" and the next line contains "info]Line".

[2019-12-18 07:00:01.070924-07:00|info]Line 3:     :begin
[2019-12-18 07:00:01.070924-07:00|info]Line 4:     
[2019-12-18 07:00:01.070924-07:00|info]Line 5:     WORKINGDIR "C:\Download\Server1"
[2019-12-18 07:00:01.070924-07:00|info]*Working directory: C:\Download\Server1\
[2019-12-18 07:00:01.070924-07:00|info]Line 6:     
[2019-12-18 07:00:01.070924-07:00|info]Line 7:       FTPLOGON "Server1" /timeout=60
[2019-12-18 07:00:01.070924-07:00|info]*Logging on to <server1> as SFTP (SSH File Transfer Protocol)
[2019-12-18 07:00:01.070924-07:00|info]*Logon in progress...
[2019-12-18 07:00:03.055523-07:00|info]*Logon successful.
[2019-12-18 07:00:03.055523-07:00|info]Line 8:       FTPCD "Extracts"
[2019-12-18 07:00:03.164909-07:00|info]*Current FTP site directory: /Extracts/
[2019-12-18 07:00:03.164909-07:00|info]Line 9:       IFERROR= $ERROR_SUCCESS GOTO Operation1
[2019-12-18 07:00:03.164909-07:00|info]Line 21:    :Operation1
[2019-12-18 07:00:03.164909-07:00|info]Line 22:      FTPGETFILE "*na_alert_subs*" /newest
[2019-12-18 07:00:03.164909-07:00|info]*Hint: FTPGETFILE /newest always returns the newest file
[2019-12-18 07:00:03.430561-07:00|info]Line 22:    *%sitefile has been set to: na_alert_subs_20191217.txt
[2019-12-18 07:00:03.446223-07:00|info]Line 23:      RCVFILE %sitefile /delete
[2019-12-18 07:00:03.446223-07:00|info]*Receiving to "C:\Download\Server1\na_alert_subs_20191217.txt"
[2019-12-18 07:00:12.947244-07:00|info]*Complete, received 1394788 bytes in 9 seconds (1513.44K cps)
[2019-12-18 07:00:13.103506-07:00|info]*File deleted on FTP site.
[2019-12-18 07:00:13.103506-07:00|info]*Download complete, 1 file received.

So in that snippet it would break down into five events.

jfaldmomacu
Path Finder

I guess this wasn't as complex as I initially thought. I was getting wrapped up in all the options. In reading the documentation for LINE_BREAKER I was able to get a simple solution. Thank you @to4kawa for prompting me to get to the right answer.

SHOULD_LINEMERGE = false
LINE_BREAKER = .*]\*.*([\r\n]+)\[.*\]Line

I then swapped out Line for [^*], as I saw some edge cases where I needed an event break as well. So I really ended up with this.

SHOULD_LINEMERGE = false
LINE_BREAKER = .*]\*.*([\r\n]+)\[.*\][^\*]
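
For anyone who wants to sanity-check these two patterns outside Splunk, here is a minimal Python sketch. It imitates the documented LINE_BREAKER behaviour: the previous event ends where the first capturing group starts, the next event begins where that group ends, and the captured text is discarded. The names split_events, PATTERNS and SAMPLE are made up for this illustration. Python's re module is not the regex engine Splunk uses, and resuming the scan after each whole match is an assumption, so treat the result as a rough check rather than proof of what an indexer will do.

```python
import re

def split_events(text, line_breaker):
    """Imitate LINE_BREAKER: cut at capture group 1 and drop the captured text.
    Assumption: scanning resumes after each whole match (re.finditer)."""
    events, start = [], 0
    for m in re.finditer(line_breaker, text):
        events.append(text[start:m.start(1)])  # previous event ends here
        start = m.end(1)                       # next event begins here
    events.append(text[start:])
    return events

# The two patterns from this post, copied as written.
PATTERNS = {
    "followed by Line": r".*]\*.*([\r\n]+)\[.*\]Line",
    "followed by non-*": r".*]\*.*([\r\n]+)\[.*\][^\*]",
}

# The 21-line sample from the original question (raw string keeps backslashes).
SAMPLE = r"""[2019-12-18 07:00:01.070924-07:00|info]Line 3:     :begin
[2019-12-18 07:00:01.070924-07:00|info]Line 4:
[2019-12-18 07:00:01.070924-07:00|info]Line 5:     WORKINGDIR "C:\Download\Server1"
[2019-12-18 07:00:01.070924-07:00|info]*Working directory: C:\Download\Server1\
[2019-12-18 07:00:01.070924-07:00|info]Line 6:
[2019-12-18 07:00:01.070924-07:00|info]Line 7:       FTPLOGON "Server1" /timeout=60
[2019-12-18 07:00:01.070924-07:00|info]*Logging on to <server1> as SFTP (SSH File Transfer Protocol)
[2019-12-18 07:00:01.070924-07:00|info]*Logon in progress...
[2019-12-18 07:00:03.055523-07:00|info]*Logon successful.
[2019-12-18 07:00:03.055523-07:00|info]Line 8:       FTPCD "Extracts"
[2019-12-18 07:00:03.164909-07:00|info]*Current FTP site directory: /Extracts/
[2019-12-18 07:00:03.164909-07:00|info]Line 9:       IFERROR= $ERROR_SUCCESS GOTO Operation1
[2019-12-18 07:00:03.164909-07:00|info]Line 21:    :Operation1
[2019-12-18 07:00:03.164909-07:00|info]Line 22:      FTPGETFILE "*na_alert_subs*" /newest
[2019-12-18 07:00:03.164909-07:00|info]*Hint: FTPGETFILE /newest always returns the newest file
[2019-12-18 07:00:03.430561-07:00|info]Line 22:    *%sitefile has been set to: na_alert_subs_20191217.txt
[2019-12-18 07:00:03.446223-07:00|info]Line 23:      RCVFILE %sitefile /delete
[2019-12-18 07:00:03.446223-07:00|info]*Receiving to "C:\Download\Server1\na_alert_subs_20191217.txt"
[2019-12-18 07:00:12.947244-07:00|info]*Complete, received 1394788 bytes in 9 seconds (1513.44K cps)
[2019-12-18 07:00:13.103506-07:00|info]*File deleted on FTP site.
[2019-12-18 07:00:13.103506-07:00|info]*Download complete, 1 file received."""

if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        events = split_events(SAMPLE, pattern)
        # Print how many log lines ended up in each emulated event.
        print(name, [len(e.splitlines()) for e in events])
```

On the sample, both patterns should give five events of 4, 5, 2, 4 and 6 lines.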

woodcock
Esteemed Legend

Like this:

SHOULD_LINEMERGE = false
LINE_BREAKER = [\r\n]+\[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d+-\d{2}:\d{2}\|info]\*[^\r\n]+([\r\n]+)\[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d+-\d{2}:\d{2}\|info\]Line

See here:
https://regex101.com/r/P4LwaF/1
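
If the regex101 link ever goes stale, a short Python sketch can show roughly the same thing: where the capturing group of the stricter pattern falls on a few lines taken from the question. The variable names are invented for this illustration, and Python's re module is only a stand-in for Splunk's regex engine, so this is a rough check rather than a guarantee of indexer behaviour.

```python
import re

# The pattern above, copied verbatim and split across literals for readability.
STRICT = (
    r"[\r\n]+\[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d+-\d{2}:\d{2}\|info]\*[^\r\n]+"
    r"([\r\n]+)"
    r"\[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d+-\d{2}:\d{2}\|info\]Line"
)

# Four lines taken from the sample in the question.
LOGGING_ON = '[2019-12-18 07:00:01.070924-07:00|info]*Logging on to <server1> as SFTP (SSH File Transfer Protocol)'
IN_PROGRESS = '[2019-12-18 07:00:01.070924-07:00|info]*Logon in progress...'
SUCCESSFUL = '[2019-12-18 07:00:03.055523-07:00|info]*Logon successful.'
FTPCD = '[2019-12-18 07:00:03.055523-07:00|info]Line 8:       FTPCD "Extracts"'

# A "*" line followed by a "Line" line: the pattern should match here.
breaking = "\n".join([IN_PROGRESS, SUCCESSFUL, FTPCD])
match = re.search(STRICT, breaking)
before = breaking[:match.start(1)]  # text that stays in the previous event
after = breaking[match.end(1):]     # text that starts the next event

# Three "*" lines in a row: no "Line" line follows, so no match is expected.
non_breaking = "\n".join([LOGGING_ON, IN_PROGRESS, SUCCESSFUL])
no_match = re.search(STRICT, non_breaking)

if __name__ == "__main__":
    print("previous event ends with:", before.splitlines()[-1])
    print("next event starts with:  ", after.splitlines()[0])
    print("consecutive * lines match:", no_match is not None)
```

Because the pattern begins with [\r\n]+, it needs at least one line ahead of the "*" line, which is why the excerpt starts one line earlier.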

to4kawa
Ultra Champion

I see.
When I checked in Splunk Add-on Builder, I needed SHOULD_LINEMERGE = true.
Does your setting break between lines 16 and 17?

to4kawa
Ultra Champion

props.conf

SHOULD_LINEMERGE = true
LINE_BREAKER = Line.*[\r\n]\[.*\]\*.*([\r\n])\[.*\]Line

With SHOULD_LINEMERGE = true, (?msU) is applied implicitly.

jfaldmomacu
Path Finder

With that setting there are no breaks at all, so everything comes through as one event. The source is a bunch of small files, each less than a hundred lines, and each file is now a single event.

woodcock
Esteemed Legend

You got the SHOULD_LINEMERGE wrong...

to4kawa
Ultra Champion

Which line numbers do you want to break at?

jfaldmomacu
Path Finder

The five events in my original post should be lines--
1-4,
5-9,
10-11,
12-15,
16-21.

Or like this--

Event 1
 [2019-12-18 07:00:01.070924-07:00|info]Line 3:     :begin
 [2019-12-18 07:00:01.070924-07:00|info]Line 4:     
 [2019-12-18 07:00:01.070924-07:00|info]Line 5:     WORKINGDIR "C:\Download\Server1"
 [2019-12-18 07:00:01.070924-07:00|info]*Working directory: C:\Download\Server1\

Event 2
 [2019-12-18 07:00:01.070924-07:00|info]Line 6:     
 [2019-12-18 07:00:01.070924-07:00|info]Line 7:       FTPLOGON "Server1" /timeout=60
 [2019-12-18 07:00:01.070924-07:00|info]*Logging on to <server1> as SFTP (SSH File Transfer Protocol)
 [2019-12-18 07:00:01.070924-07:00|info]*Logon in progress...
 [2019-12-18 07:00:03.055523-07:00|info]*Logon successful.

Event 3
 [2019-12-18 07:00:03.055523-07:00|info]Line 8:       FTPCD "Extracts"
 [2019-12-18 07:00:03.164909-07:00|info]*Current FTP site directory: /Extracts/

Event 4
 [2019-12-18 07:00:03.164909-07:00|info]Line 9:       IFERROR= $ERROR_SUCCESS GOTO Operation1
 [2019-12-18 07:00:03.164909-07:00|info]Line 21:    :Operation1
 [2019-12-18 07:00:03.164909-07:00|info]Line 22:      FTPGETFILE "*na_alert_subs*" /newest
 [2019-12-18 07:00:03.164909-07:00|info]*Hint: FTPGETFILE /newest always returns the newest file

Event 5
 [2019-12-18 07:00:03.430561-07:00|info]Line 22:    *%sitefile has been set to: na_alert_subs_20191217.txt
 [2019-12-18 07:00:03.446223-07:00|info]Line 23:      RCVFILE %sitefile /delete
 [2019-12-18 07:00:03.446223-07:00|info]*Receiving to "C:\Download\Server1\na_alert_subs_20191217.txt"
 [2019-12-18 07:00:12.947244-07:00|info]*Complete, received 1394788 bytes in 9 seconds (1513.44K cps)
 [2019-12-18 07:00:13.103506-07:00|info]*File deleted on FTP site.
 [2019-12-18 07:00:13.103506-07:00|info]*Download complete, 1 file received.
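
The grouping above follows mechanically from the rule in my original post (break only between an "info]*" line and a following "info]Line" line). Here is a small Python sketch of that rule, independent of any regex, which can serve as a reference when testing a LINE_BREAKER candidate. The names kind_of, event_ranges and KINDS are invented for this illustration, and KINDS is just my 21 sample lines reduced to one character each.

```python
def kind_of(line):
    """Return '*' for an info]* line and 'L' for anything else (e.g. info]Line)."""
    return "*" if "info]*" in line else "L"

def event_ranges(kinds):
    """Return 1-based inclusive (first, last) line ranges, breaking only
    between a '*' line and an 'L' line that directly follows it."""
    if not kinds:
        return []
    ranges, first = [], 1
    for i in range(1, len(kinds)):
        if kinds[i - 1] == "*" and kinds[i] == "L":
            ranges.append((first, i))  # line i (1-based) closes the event
            first = i + 1
    ranges.append((first, len(kinds)))
    return ranges

# The 21 sample lines, one character per line: L = info]Line, * = info]*
KINDS = "LLL*LL***L*LLL*LL****"

if __name__ == "__main__":
    print(event_ranges(KINDS))
```

This should print the same five ranges listed above: 1-4, 5-9, 10-11, 12-15 and 16-21.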