Go through parsingQueue twice to break files via \n

SplunkTrust
SplunkTrust

Hello,

I have a file that doesn't seem to be breakable via the standard line breaker, since it's one full text file with no \n or \r whatsoever. Using delimiters for lines didn't work, so I want to use SEDCMD on the keywords and insert a \n before each one in order to define the lines. After that I want to send the data back into the parsing queue and tag the whole thing with a new sourcetype, so that settings using \n as the line breaker get applied.

Below is what I have so far. The SEDCMD is working, and so is the redirection to the new sourcetype, but the settings of that sourcetype (test1 below) aren't being applied, so no line breaking happens.

PROPS

[test]
BREAK_ONLY_BEFORE = blabla
CHARSET =
DATETIME_CONFIG =
NO_BINARY_CHECK = true
SEDCMD-replace = s/\sblabla/\nblabla/g
SHOULD_LINEMERGE = false
category = Custom
pulldown_type = true
TRANSFORMS-t1 = redirect,reparse

[test1]
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true

TRANSFORMS

[reparse]
REGEX=(.)
FORMAT=parsingQueue
DEST_KEY=queue

[redirect]
REGEX=(.)
FORMAT = sourcetype::test1
DEST_KEY = MetaData:Sourcetype

example data:

blabla asd asd asd asd as                         asda sdasd asd asd                   blabla asd asd asdaddddddddddddddddddddddddddddddddddddddddddd    ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd           dddddddddddddddddddddddddddddddddddddd                     aaaaaaaweeeeeeeeeeeeeeeeeeeeeeee bla blabla assssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss

thanks for the help,
David

1 Solution

SplunkTrust
SplunkTrust

If you want to break on spaces before blabla within a line, you could use something like this for only one parsing iteration:

[test]
SHOULD_LINEMERGE = false
LINE_BREAKER = (\s+)blabla

That will override the default line breaker that breaks on newlines with your custom definition of where a line should end.
Note, this will consume the spaces from the first capturing group - they won't be part of either event before or after the break.

SplunkTrust
SplunkTrust

First, I'd actually send the events to the parsing queue. Your question lists a transform that sends them to the aggregation queue, which is too late for line breaking.

SplunkTrust
SplunkTrust

Yeah ^^ I copy-pasted the wrong transforms, but I did try sending it to the parsing queue... I'm going to fix that in the question 😄

SplunkTrust
SplunkTrust

Well, if you don't use the working LINE_BREAKER from my answer then it certainly won't work 😛

Read up on LINE_BREAKER at http://docs.splunk.com/Documentation/Splunk/6.3.3/Admin/Propsconf. It needs a capturing group to mark the break itself, that is, the bit of text that should be consumed by the line breaker:

* The regex must contain a capturing group -- a pair of parentheses which
  defines an identified subcomponent of the match.
* Wherever the regex matches, Splunk considers the start of the first
  capturing group to be the end of the previous event, and considers the end
  of the first capturing group to be the start of the next event.
* The contents of the first capturing group are discarded, and will not be
  present in any event.  You are telling Splunk that this text comes between
  lines.
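
The three rules quoted above can be sketched outside Splunk. The following is only an illustration in Python, assuming Python's `re` module behaves closely enough to Splunk's regex engine for these simple patterns; `break_events` is a hypothetical helper, not a Splunk API, and the sample string is a shortened stand-in for the data in the question.

```python
import re


def break_events(raw: str, line_breaker: str) -> list[str]:
    """Split raw text following the quoted LINE_BREAKER rules (illustrative only)."""
    pattern = re.compile(line_breaker)
    # Rule 1: the regex must contain a capturing group.
    # Raising an error here is a choice of this sketch, not Splunk's behaviour.
    if pattern.groups < 1:
        raise ValueError("LINE_BREAKER needs at least one capturing group")

    events = []
    start = 0
    for match in pattern.finditer(raw):
        if match.start(1) == -1:
            continue  # first group did not take part in this match
        # Rule 2: the start of group 1 ends the previous event ...
        events.append(raw[start:match.start(1)])
        # ... and the end of group 1 starts the next event.
        # Rule 3: the text inside group 1 is discarded.
        start = match.end(1)
    events.append(raw[start:])
    return events


# Shortened stand-in for the single-line sample data from the question.
sample = "blabla asd asd   blabla qwe      bla blabla zxc"

for event in break_events(sample, r"(\s+)blabla"):
    print(repr(event))
```

Each resulting event starts with `blabla`, and the whitespace matched by the first group appears in neither neighbour. Whitespace that is not directly followed by `blabla` (the run before `bla`) stays inside its event.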

SplunkTrust
SplunkTrust

Ohhh lol, okay! 😄 Good to know 😄 I'll accept your answer, kind sir 😛

And about the second part, any clue how it's done? I had it working once, but I can't seem to get it to work again 😞

SplunkTrust
SplunkTrust

I'd rather figure out why the right approach doesn't work on your end than come up with a convoluted workaround. Besides, by adding a line break to the events and parsing them again you'd also rely on LINE_BREAKER = ([\r\n]+). If that works, different regular expressions will work too.

What happens when you recreate the data upload/preview screen I posted?
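
To make the point about relying on ([\r\n]+) concrete, here is a small Python sketch comparing the two routes on plain text. It only mimics the regex behaviour, not Splunk's queues or pipeline order, and it assumes Python's `re` matches these patterns the same way; the sample string is a shortened stand-in for the data in the question.

```python
import re

# Shortened stand-in for the single-line sample data from the question.
sample = "blabla asd asd   blabla qwe      bla blabla zxc"

# Route 1, step 1: mimic SEDCMD-replace = s/\sblabla/\nblabla/g
with_newlines = re.sub(r"\sblabla", "\nblabla", sample)

# Route 1, step 2: mimic the default breaker ([\r\n]+);
# the newline runs are dropped between events.
two_pass = re.split(r"[\r\n]+", with_newlines)

# Route 2: mimic LINE_BREAKER = (\s+)blabla in a single pass.
# The lookahead keeps "blabla" at the start of the next event
# while the whitespace itself is discarded.
one_pass = re.split(r"\s+(?=blabla)", sample)

print(two_pass)
print(one_pass)
```

Both routes break at the same places. The only difference in this sketch is that the sed route replaces a single whitespace character, so any extra spaces before `blabla` remain as trailing whitespace on the previous event.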

SplunkTrust
SplunkTrust

Hello Martin,
Thanks again for your help.
Adding a newline to the events and using the default line breaker works well, which is why I wanted to add it via sed ^^
After re-testing a couple of times, (\s+)blabla worked, but for some reason "blabla" alone as a line breaker still doesn't. So LINE_BREAKER = (\s+)blabla works but LINE_BREAKER = blabla doesn't ^^
I'm still curious to get the "workaround" working, though, for testing purposes ^^
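
One plausible explanation for the difference, going by the documentation quoted earlier in this thread, is that a bare `blabla` contains no capturing group at all. A quick way to check a candidate pattern is sketched below in Python; `has_capturing_group` is a hypothetical helper, and Python's `re` is assumed to count groups the same way for these simple patterns.

```python
import re


def has_capturing_group(line_breaker: str) -> bool:
    """Return True if the pattern defines at least one capturing group."""
    # Non-capturing groups such as (?:...) are not counted by .groups.
    return re.compile(line_breaker).groups >= 1


for candidate in (r"(\s+)blabla", r"blabla", r"([\r\n]+)"):
    print(f"{candidate!r}: {has_capturing_group(candidate)}")
```

Only the bare `blabla` reports no capturing group, which matches the observation that it is the one variant that does not break the events.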

SplunkTrust
SplunkTrust

Looks fine to me:

[Screenshot: data preview]

The text editor wrapped the lines to fit the window; the file is one long line, as in your example.

SplunkTrust
SplunkTrust

Hmm, I don't know... I might be facing some bug then, because yes, logically what you sent "should" work, but nothing happens. Anyway, regardless of whether it's a bug or not, do you know how to send data back to the parsingQueue? That's what I was trying to do above: 1) add \n via sed, and 2) send the data to the parsingQueue to break lines on \n.

SplunkTrust
SplunkTrust

And thank you for your help so far 🙂

SplunkTrust
SplunkTrust

It doesn't work. I tried all sorts of LINE_BREAKER and BREAK_ONLY_BEFORE options and nothing ^^ That's why I want to go through this kind of loop. Try recreating the same file above and parsing it with this LINE_BREAKER; it won't do anything 😞