DELIMS/FIELDS with a field that has sub fields?

narwhal
Splunk Employee

I have a large CSV data file (CDR) that has some 300 fields. Looks something like:

value1,value2,value3,...,value51,"subvalue52.1,subvalue52.2,...,subvalue52.20",value53,...,value300

The gotcha is field52. field51 is properly extracted, but field52 isn't. I'm not worried yet about the subextraction--right now, I just want field52 to be the whole thing inside the quotes.

from transforms.conf:

[my-report-stanza-name]

DELIMS = ","

FIELDS = f1,f2,f3,...,f300 (where f1-300 are LONG NAMES)

Is it because my f1-f300 are LONG?

Do I have the syntax for DELIMS wrong? (like, is that saying the delim char can be any of " OR , OR ' ?)

Once I do get this right, what's the best way to subextract f52?

adTHANKSvance gang!

-tv

narwhal
Splunk Employee

It appears from my testing that there is a line length limit on the "FIELDS =" definition. So I now extract the fields under short names ("F001", "F002", etc.) and use FIELDALIAS to give them longer names.

Now all fields (including the quoted field that has a CSV list embedded in it) are properly extracted. I then sub-extract the embedded field with a second stanza.

To be more precise:

props.conf:

[myBigCSV]
REPORT-foo = BigCSV, SubCSV
FIELDALIAS-F001 = F001 AS MyFirstBigFieldName
FIELDALIAS-F003c = F003c AS MyThirdSubField

transforms.conf:

[BigCSV]
DELIMS = ","
FIELDS = "F001","F002","F003"

[SubCSV]
SOURCE_KEY = F003
DELIMS = ","
FIELDS = "F003a","F003b","F003c"

A very elegant and easy to maintain config.
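
For anyone who wants to see the data flow outside Splunk, here is a rough Python sketch of the same two-pass idea: split the event on commas while keeping the quoted value whole, split that one value again, then copy selected short names to long ones. This is only an illustration, not how Splunk implements DELIMS/FIELDS; the helper names (split_quoted, extract) and the sample event are made up for the example.

```python
import csv

def split_quoted(line):
    """Split one comma-delimited line, keeping double-quoted values whole."""
    return next(csv.reader([line]))

def extract(event, names, sub_source, sub_names):
    """Hypothetical helper mimicking the two stanzas above."""
    # Pass 1 (like [BigCSV]): name each top-level value.
    fields = dict(zip(names, split_quoted(event)))
    # Pass 2 (like [SubCSV] with SOURCE_KEY): re-split one extracted field.
    fields.update(zip(sub_names, split_quoted(fields[sub_source])))
    return fields

# Made-up sample event with an embedded, quoted CSV list in the third field.
event = 'value1,value2,"sub1,sub2,sub3"'
fields = extract(event,
                 ["F001", "F002", "F003"],
                 "F003",
                 ["F003a", "F003b", "F003c"])

# Alias step (like FIELDALIAS): expose short names under longer ones too.
aliases = {"F001": "MyFirstBigFieldName", "F003c": "MyThirdSubField"}
for short_name, long_name in aliases.items():
    fields[long_name] = fields[short_name]

print(fields)
```

The short names stay available next to the long ones, which matches how an alias adds a second name without removing the original field.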

-tv

emotz
Splunk Employee

You have DELIMS set up correctly, but how are the subfields delimited? Commas?
You will probably have to write a custom field extraction for the big f52, and all of the sub fields too.
That seems like a great data set.
Good luck!
