Getting Data In

When importing a TSV file, a field ending with '\' is wrongly merged with the following field

ryu_kahou
Explorer

I'm importing tab-delimited files formatted as follows. The gaps between fields are tabs.

 "field1    field2    field3    field4"

The files are imported with the command "splunk add oneshot" and parsed with the following definition in transforms.conf.

DELIMS = "\t"
FIELDS = "field_name1","field_name2","field_name3","field_name4"

It works fine, but for some lines the values end up under the wrong field names. It took me a long time to find the reason: if a field ends with '\', it is combined with the next field.

For example, "field1 field2\ field3 field4" will be parsed as

field_name1=field1
field_name2=field2\ field3
field_name3=field4
field_name4=

If I remove all the trailing '\' characters it works fine, but I have to import the original data as is. Is there any way to import fields that end with '\'?

richgalloway
SplunkTrust

The backslash is escaping the tab, so the tab is treated as an ordinary character rather than as a separator. One solution is to use a regex. Replace the DELIMS and FIELDS lines in your transforms.conf with this line:

REGEX=^(?<field_name1>.*?[\\]?)\t(?<field_name2>.*?[\\]?)\t(?<field_name3>.*?[\\]?)\t(?<field_name4>.*)$

The pattern is anchored and the last group is greedy so that field_name4 captures the rest of the line; a trailing lazy group with nothing after it would match an empty string.

If the backslash is not part of the field value, use this REGEX string instead:

REGEX=^(?<field_name1>.*?)[\\]?\t(?<field_name2>.*?)[\\]?\t(?<field_name3>.*?)[\\]?\t(?<field_name4>.*)$
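
To check the two patterns outside Splunk, here is a minimal Python sketch. It is an approximation, not Splunk's own extraction: Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the names `KEEP_BACKSLASH`, `DROP_BACKSLASH`, and `extract_fields` are made up for this example. The pattern is anchored and the last group is greedy so that field 4 takes the rest of the line.

```python
import re

# Variant 1: a trailing backslash stays inside the captured field value.
KEEP_BACKSLASH = re.compile(
    r"^(?P<field_name1>.*?[\\]?)\t(?P<field_name2>.*?[\\]?)\t"
    r"(?P<field_name3>.*?[\\]?)\t(?P<field_name4>.*)$"
)

# Variant 2: a trailing backslash is matched outside the group and dropped.
DROP_BACKSLASH = re.compile(
    r"^(?P<field_name1>.*?)[\\]?\t(?P<field_name2>.*?)[\\]?\t"
    r"(?P<field_name3>.*?)[\\]?\t(?P<field_name4>.*)$"
)


def extract_fields(pattern, line):
    """Return a dict of field name -> value, or None if the line does not match."""
    match = pattern.match(line)
    return match.groupdict() if match else None


# The line from the question: field2 ends with a backslash, followed by a tab.
sample = "field1\tfield2\\\tfield3\tfield4"

kept = extract_fields(KEEP_BACKSLASH, sample)
dropped = extract_fields(DROP_BACKSLASH, sample)

print(kept)
print(dropped)
```

In both variants field3 and field4 land under their own names; the only difference is whether field_name2 ends with the backslash.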
---
If this reply helps you, Karma would be appreciated.

ryu_kahou
Explorer

Thank you for your answer.

I think it's a bug in DELIMS. Even a simple split function doesn't have this problem.
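
For comparison, the sketch below contrasts a plain split with a tokenizer that treats a backslash as an escape character. The `escape_aware_split` function is a hypothetical illustration written for this example; that DELIMS works this way internally is only an assumption, although the sketch's output matches the field assignment reported in the question.

```python
def plain_split(line, delim="\t"):
    """Split on every delimiter; backslashes get no special treatment."""
    return line.split(delim)


def escape_aware_split(line, delim="\t", escape="\\"):
    """Hypothetical tokenizer: a character preceded by the escape char is literal."""
    fields = []
    current = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            # Keep the escape char and the next char together, even if it is a tab.
            current.append(ch)
            current.append(line[i + 1])
            i += 2
            continue
        if ch == delim:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


# The line from the question: field2 ends with a backslash, followed by a tab.
sample = "field1\tfield2\\\tfield3\tfield4"

plain = plain_split(sample)
escaped = escape_aware_split(sample)

print(plain)    # four fields, field2 keeps its trailing backslash
print(escaped)  # three fields, field2 and field3 are merged
```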
