DELIMS/FIELDS with a field that has sub fields?

narwhal
Splunk Employee

I have a large CSV data file (CDR) with some 300 fields. It looks something like this:

value1,value2,value3,...,value51,"subvalue52.1,subvalue52.2,...,subvalue52.20",value53,...,value300

The gotcha is field52: field51 is extracted properly, but field52 isn't. I'm not worried about the sub-extraction yet; for now, I just want field52 to be everything inside the quotes.

from transforms.conf:

[my-report-stanza-name]

DELIMS = ","

FIELDS = f1,f2,f3,...,f300 (where f1-300 are LONG NAMES)

Is it because my f1-f300 names are long?

Do I have the syntax for DELIMS wrong? (For example, does it mean the delimiter can be any of " or , or ' ?)

Once I do get this right, what's the best way to sub-extract f52?

adTHANKSvance gang!

-tv

narwhal
Splunk Employee

It appears from my testing that there is a line-length limitation in the "FIELDS =" definition. So I am now extracting the fields with short names ("F001", "F002", etc.) and then using FIELDALIASes to give them longer names.

Now all fields (including the CSV embedded in quotes inside another field) are extracted properly. I then sub-extract the embedded field with another stanza.

To be more precise:

props.conf:

[myBigCSV]
REPORT-foo = BigCSV, SubCSV
FIELDALIAS-F001 = F001 AS MyFirstBigFieldName
FIELDALIAS-F003c = F003c AS MyThirdSubField

transforms.conf:

[BigCSV]
DELIMS = ","
FIELDS = "F001","F002","F003"

[SubCSV]
SOURCE_KEY = F003
DELIMS = ","
FIELDS = "F003a","F003b","F003c"

A very elegant and easy to maintain config.
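
If the embedded field has a variable number of subvalues, naming each one in FIELDS may not be practical. One alternative, not tested in this thread, is to leave F003 as a single field and mark it as multivalue in fields.conf. The sketch below assumes the subvalues are separated by commas and contain no commas themselves; the stanza name must match the extracted field name (F003 here, taken from the example above).

```ini
# fields.conf (search-time; a sketch, not part of the original solution)
[F003]
# Each match of the first capture group becomes one value of the
# multivalue field, so F003 is split on commas.
TOKENIZER = ([^,]+)
```

With this in place, the subvalues should be addressable with the usual multivalue functions instead of as separately named fields.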

-tv

emotz
Splunk Employee

You have DELIMS set up correctly, but how are the subfields delimited? Commas?
You will probably have to write a custom field extraction for the big f52, and all of the sub fields too.
That seems like a great data set.
Good luck!
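
For reference, a custom extraction of the kind suggested above might look roughly like the following. This is a minimal sketch under stated assumptions: the first 51 fields are unquoted and contain no commas, the 52nd field is always wrapped in double quotes with no embedded quotes, and the sourcetype is named myBigCSV as in the accepted answer. The stanza name extract-f52, the class name REPORT-f52, and the field name f52 are hypothetical.

```ini
# transforms.conf
[extract-f52]
# Skip 51 comma-terminated fields, then capture everything between
# the double quotes of the 52nd field.
REGEX = ^(?:[^,]*,){51}"([^"]*)"
FORMAT = f52::$1

# props.conf
[myBigCSV]
REPORT-f52 = extract-f52
```

If any of the earlier fields can themselves be quoted or contain commas, the regular expression would need to account for that.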
