Getting Data In

How do I ignore the newline character in the _raw field?

linu1988
Champion

Hello All,
I am forwarding some CSV data into Splunk from a script. The problem is that Splunk appears to be inserting a newline character that breaks a value in the middle. The data is already indexed. Is there a way to capture the value in an extracted field and then strip the newline character into a new field, so that I don't have to use the replace function in a where clause?

Currently I am using a REPORT class to extract the CSV values.

e.g.
Field= ABCD without escape character
Field= ABC D where new line character is included

Thanks in advance


martin_mueller
SplunkTrust

If there are no intended newlines then you could add this:

SEDCMD-newlines = s/[\n\r]+//g
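
As a minimal sketch of where that line would go: SEDCMD settings live in props.conf under the stanza that matches the data, and they are applied at index time, so they only affect events indexed after the change. The stanza name my_csv_sourcetype below is a placeholder and is not taken from this thread; substitute the sourcetype (or source:: / host:: stanza) that matches the script's data.

```ini
# props.conf on the instance that parses the data (indexer or heavy forwarder)
# [my_csv_sourcetype] is a hypothetical stanza name; substitute your own.
[my_csv_sourcetype]
# Strip every run of newline / carriage-return characters from _raw.
# Only safe if no event is supposed to contain a newline.
SEDCMD-newlines = s/[\n\r]+//g
```

Events that are already in the index are not rewritten by this setting.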

martin_mueller
SplunkTrust

Yeah, index-time configuration cannot be refreshed.


linu1988
Champion

It works. I thought a refresh would be enough, but a restart did the trick.


linu1988
Champion

Sorry for the late reply.
20140805140704.650149+120,21240,350,21240,60,0,48,12,313651,7277,69124096,41893 888,dtexec.exe /SQL "\APPLICATION_NAME" /SERVER "SERV ER_NAME"

The above is a sample of the _raw data (a single event). Look at the server name (SERV ER_NAME), which is split in the middle. I can't really post the original set of records, but there is no pattern to where the newline character appears.

Props.conf
DATETIME_CONFIG=CURRENT
MAX_TIMESTAMP_LOOKAHEAD=150
BREAK_ONLY_BEFORE=\d{14}\.\d{6}
NO_BINARY_CHECK=1
REPORT-dwh = package_Usage


martin_mueller
SplunkTrust

Are you running into any limits such as MAX_EVENTS or TRUNCATE?

Without actual sample data and the props.conf settings used to index it I'm just guessing here.
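
For reference, a sketch of how those two limits look in props.conf. The values shown are the documented defaults as far as I know (TRUNCATE = 10000 bytes per line, MAX_EVENTS = 256 lines per merged event), and the stanza name is a placeholder rather than anything from this thread. Raising them only matters if events really are being cut off at those limits.

```ini
# props.conf; [my_csv_sourcetype] is a hypothetical stanza name
[my_csv_sourcetype]
# Maximum bytes per line before the line is truncated; 0 disables truncation.
TRUNCATE = 10000
# Maximum number of input lines merged into one event when line merging is on.
MAX_EVENTS = 256
```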


linu1988
Champion

That is what I was thinking, but there is so much data already indexed. If there is no other option I will have to go with SEDCMD. And I can confirm that the newline is not coming from the script output.


martin_mueller
SplunkTrust

I'd focus on getting rid of the newline character, wherever it's from. Did you try using SEDCMD-foo to remove it during indexing?


linu1988
Champion

I have to use replace in the search, but that will not work well for filtering on a larger data set. I could filter at the source level only if I could remove that newline in an extracted field.


linu1988
Champion

I am using a script to generate the data, which I am extracting at search time.

What Splunk seems to be doing in _raw is inserting a "\n" character at the end of the line when the line is somewhat long.

e.g. my data looks like this

123,123,123,0,12,93,cmd "program_name parameter1 parameter2",234324

What I am doing is:

I extract each field, with the program field coming out as
"program_name parameter1 parameter2"

Many times, though, I get values like these:
"program_name parameter1 para meter2"
"program_name parameter1 paramete r2"
"program_name param eter1 parameter2"


martin_mueller
SplunkTrust

Could you post some sample data?

I'd also be interested in where that newline is coming from - Splunk doesn't just add characters to the data.
