I am creating a modular input. My input is a CSV, which I convert to JSON and import as a new event in Splunk. Several of the fields contain newlines in the data. However, once indexed, the newlines are removed. Here is the code that does it:
import calendar
import csv
import json
import time

# Parse the CSV, using the first row as the header
csvdata = [row for row in csv.reader(data.splitlines())]
header = csvdata.pop(0)

for row in csvdata:
    e = {}
    for col, val in zip(header, row):
        col = col.replace(" ", "_")  # field names shouldn't contain spaces
        e[col] = val
    event_time = calendar.timegm(time.strptime(e["timefield"], time_pattern))
    event = helper.new_event(data=json.dumps(e), time=event_time, index=index, unbroken=True)
    ew.write_event(event)
One thing I've tried is setting SHOULD_LINEMERGE = 0 in props.conf, which didn't work. Is there a way to tell Splunk not to remove the newlines from fields?
Thanks!
I'm going to mark this as resolved.
The problem wasn't during indexing. It was actually here:
csvdata = [row for row in csv.reader(data.splitlines())]
It mishandled the newlines. Getting rid of that and splitting on "\r\n" instead solved the problem.
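For anyone hitting the same issue: str.splitlines() breaks on every newline, including the \n characters inside quoted fields, so csv.reader never sees them and the field values come back with the newlines stripped. Splitting only on the "\r\n" row terminator leaves the embedded \n inside each row string, and the reader then keeps it in the parsed field. A minimal sketch with hypothetical sample data:

```python
import csv

# Hypothetical sample: rows end with \r\n, one quoted field contains a bare \n
data = 'id,notes\r\n1,"line one\nline two"\r\n'

# splitlines() also splits on the \n inside the quoted field,
# so the embedded newline is silently dropped
bad = [row for row in csv.reader(data.splitlines()) if row]

# Splitting only on the row terminator keeps the field intact
good = [row for row in csv.reader(data.split("\r\n")) if row]

print(bad[1])   # the embedded newline is gone
print(good[1])  # ['1', 'line one\nline two']
```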