Getting Data In

Line breaks within a CSV field.

ocallender
Explorer

I have a .csv file with several fields. there are many date fields and text fields, but fields are long blobs of text (such as the body of an e-mail) lets call such a field "longtext". The problem is that whenever Splunk encounters a newline character in the longtext field, it interprets it as a line break which throws off the event breaks.

How can I get all of the text in "longtext" to be properly indexed without Splunk interpreting the newline as a line break?

Example:

create_time,request_id,username,longtext,responded_time,closed_time
2013-11-23 11:00,2322,johnsmith,Here is the long blob of text I was talking about. If i have a newline here: <newline> 
Splunk sees it as a break in the log file and doesn't place the rest of this text in the longtext field,2013-11-23 13:43,2013-11-23 14:05

Any ideas?

0 Karma

mrsprague
New Member

If this is a DOS format text file, you should be able break on CR-LF line breaks with LINE_BREAKER (in props.conf)

[yoursourcetype]
LINE_BREAKER=(\r\n)
0 Karma

somesoni2
Revered Legend

Following seems to be working for me, for the sample data you have given (to be added in props.conf)

[yoursourcetype]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
MAX_TIMESTAMP_LOOKAHEAD = 50
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = true
pulldown_type = 1
0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...