Hello everyone,
I have some data in my Splunk server that is not broken into events correctly. I want to split this data into separate lines. Below is a sample of my data:
06/07/15 15:39:11,000 zone CL=Product;AC=resp;MID=AS5952 ;code=57;time=251;mark=samsung;zone CL=Product;AC=resp;MID=AS5952 ;code=03;time=614;mark=Iphone;zone CL=Product;AC=resp;MID=AS5952 ;code=00;time=506;mark=samsung;zone CL=Product;AC=resp;MID=AS5952 ;code=57;time=274;mark=samsung;zone CL=Product;AC=resp;MID=AS5952 ;code=00;time=892;mark=Iphone;zone CL=Product;AC=resp;MID=AS5952 ;code=57;time=256;mark=samsung;zone CL=Product;AC=resp;MID=AS5952 ;code=00;time=623;mark=samsung;zone CL=Product;AC=resp;MID=AS5952 ;code=57;time=281;mark=samsung;
So what I want to do is split this log like:
06/07/15 15:39:11,000 zone CL=Product;AC=resp;MID=AS5952 ;code=57;time=251;mark=samsung;
06/07/15 15:39:11,000 zone CL=Product;AC=resp;MID=AS5952 ;code=03;time=614;mark=Iphone;
06/07/15 15:39:11,000 zone CL=Product;AC=resp;MID=AS5952 ;code=00;time=506;mark=samsung;
06/07/15 15:39:11,000 zone CL=Product;AC=resp;MID=AS5952 ;code=57;time=274;mark=samsung;
Any help please, thank you.
I am assuming that what you posted was one event and that you are successfully sending these "bunched" events into Splunk already; if so, use something like this when you need to break them apart at search time:
... | rex max_match=0 field=_raw "(?<lineData>zone.*?mark=[^;]+)" | mvexpand lineData | fields lineData
If you have your timestamping working correctly, each event will have the correct timestamp.
We also just had this problem. I added SHOULD_LINEMERGE = false to my props.conf, but all earlier events were naturally still mashed together. Based on the previous answer, I wrote an all-purpose query to split the lines, with the added bonus that all field extractions for the sourcetype work fine after putting the split data back into the _raw field.
| rex max_match=0 field=_raw "(?<lineData>[^\n]+)" | mvexpand lineData | eval _raw=lineData
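For reference, the props.conf change mentioned above might look like this (a sketch — the sourcetype name is an assumption; substitute your own):

```
# props.conf — sourcetype name is a placeholder
[my_sourcetype]
SHOULD_LINEMERGE = false
```

This only affects data indexed after the change; already-indexed events stay merged, which is why the search-time query above is still needed for older data.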
Thanks this worked!!
Can I reindex my data without deleting it? Because I can't delete the data already present on the server.
Yes, but you will have to clear the fishbucket first.
Just in case the next question will be "How can I clear the fishbucket?":
Please find the docs on how to remove a file from the fishbucket using btprobe here http://docs.splunk.com/Documentation/Splunk/6.4.0/Troubleshooting/CommandlinetoolsforusewithSupport#... or how to clean the fishbucket here http://docs.splunk.com/Documentation/Splunk/6.4.0/Indexer/RemovedatafromSplunk#Remove_data_from_one_...
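In case it helps, resetting a single file's fishbucket entry with btprobe looks roughly like this (a sketch based on the linked docs — the log file path is a placeholder for your actual monitored file):

```
# reset the fishbucket record for one file so Splunk re-reads it from the start
$SPLUNK_HOME/bin/splunk cmd btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file /var/log/myapp/product.log --reset
```

Note that reindexing the file this way will duplicate the events already in the index unless you delete or age out the old copies.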
cheers, MuS
This one was such a saviour! Thanks for sharing this.
If his application is not line-breaking the event, essentially printing multiple lines into a single line, they would merge. If this is the case, you could use BREAK_ONLY_BEFORE in props.conf.
I used the parameter with this configuration
BREAK_ONLY_BEFORE=zone
but it doesn't work
He is saying that at index time, you can configure Splunk to break events into multiple events so that you do not have to do it at search time. The "problem" with this approach is that each line does not have its own timestamp, so Splunk will issue a warning in the log for each sub-event after the first in a clump, with text like this:
WARN DateParserVerbose - Failed to parse timestamp. Defaulting to timestamp of previous event.
It is harmless, and since it does the correct thing (uses the previous event's timestamp), it will work for you.
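To illustrate, an index-time configuration for this log might look like the sketch below. One caveat: BREAK_ONLY_BEFORE only merges or splits lines that are already broken, so when the whole event arrives as one long line (as in the sample above), LINE_BREAKER is usually the right tool — its first capture group is discarded and replaced by the event boundary. The sourcetype name, the day/month order, and the lookahead value are assumptions based on the sample:

```
# props.conf — a sketch, assuming a sourcetype named "product_log"
[product_log]
SHOULD_LINEMERGE = false
# break before each "zone CL=" occurrence; the captured ";" is dropped,
# so the timestamp stays attached to the first sub-event
LINE_BREAKER = (;)(?=zone CL=)
# assuming day/month order in "06/07/15 15:39:11,000"
TIME_FORMAT = %d/%m/%y %H:%M:%S,%3N
MAX_TIMESTAMP_LOOKAHEAD = 25
```

With this, only the first sub-event carries a timestamp; the rest default to the previous event's timestamp, which produces the harmless warning described above.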
How can I use this parameter BREAK_ONLY_BEFORE based on my log?
Thank you.