Splunk Search

Problem calculating fields at index time when input csv fields are all quoted

cajose3pepe
New Member

I was wondering if anyone knows about the next, and if there’s any solution:

I have tried to calculate two fields at index time when indexing a CSV (a monitored one) following the documentation. I achieved my objective having:

/opt/splunk/import/sftp/mbr/FILE_NAME.CSV
Header1,Header2,Header3
Field1,Field2,Field3
`
INPUTS.CONF

[monitor:///opt/splunk/import/sftp/mbr/FILE_NAME.csv]
index = my_index
disabled = 0
crcSalt = 
sourcetype = my_sourcetype

PROPS.CONF:

[my_sourcetype]
CHARSET = ISO-8859-1
TZ = America/Sao_Paulo
TIME_FORMAT=%s
TIMESTAMP_FIELDS = Header1
MAX_TIMESTAMP_LOOKAHEAD=10
SHOULD_LINEMERGE = false
disabled = false
pulldown_type = true
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
TRANSFORMS-company = company_transform

TRANSFORMS.CONF:

[company_transform]
REGEX = .*FILE_NAME.*
FORMAT = company::"100" companydesc::"MY COMPANY"
SOURCE_KEY = MetaData:Source
WRITE_META = true
*If FILE_NAME in source field, then index those fields

FIELDS.CONF

[company]
INDEXED = True
[companydesc]
INDEXED = True
  • I have seen that this is not necessary (without it everything works fine)

When indexed, I can see ‘company’ and ‘companydesc’ fields indexed and searchable.

The problem comes when the CSV has every field between quotes "fieldN" (to avoid conflicts with text fields that already contains quotes):
/opt/splunk/import/sftp/mbr/FILE_NAME.CSV

"Header1","Header2","Header3"
"Field1", "Field2", "Field3"

When indexing this file, TRANSFORMS-company = company_transform stops working.

Do you know what’s happening?

Many thanks in advance.
Kind Regards

0 Karma

cajose3pepe
New Member

I have found a "solution" that fits for me:

PROPS.CONF
[my_sourcetype]
CHARSET = ISO-8859-1
TZ = America/Sao_Paulo
TIME_PREFIX = \" # Line to get first field as timestamp
TIME_FORMAT=%s
MAX_TIMESTAMP_LOOKAHEAD=10
SHOULD_LINEMERGE = false
disabled = false
pulldown_type = true
KV_MODE = none
NO_BINARY_CHECK = true
PREAMBLE_REGEX = .Header3" #Line to avoid header indexing
SEDCMD-changeeventformat1 = s/(\"[^\"]
\"),(\"[^\"]\"),(\"[^\"]\")/Header1=\1 Header2=\2 Header3=\3/g
SEDCMD-changeeventformat2 = s/ \w+=\"\"//g #This line deletes empty fields
TRANSFORMS-company = company_transform

TRANSFORMS.CONF
[company_transform]
REGEX = .FILE_NAME.
FORMAT = $0 company="100" companydesc="MY COMPANY"
SOURCE_KEY = MetaData:Source
WRITE_META = true
DEST_KEY = _raw

Not a beautiful solution but after hours of tries is the only solution I have found.

Hope is helpful for others.

0 Karma

cajose3pepe
New Member

I have found a solution:

TRANSFORMS.conf

[company_transform]
REGEX = .FILE_NAME.
FORMAT = $0,company="100",companydesc="MY COMPANY"
SOURCE_KEY = MetaData:Source
WRITE_META = true
DEST_KEY = _raw

The only inconvenient is that your indexed event will look like this:

"Field1Value","Field2Value","Field3Value",company="100", companydesc="MY COMPANY"

I have tried to apply a SEDCMD:
SEDCMD-removeEmpty = s/clm_losscompanydesc=//g
to make the event raw look like "Field1Value","Field2Value","Field3Value","100","MY COMPANY"
but doesn't work.

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Announcing Modern Navigation: A New Era of Splunk User Experience

We are excited to introduce the Modern Navigation feature in the Splunk Platform, available to both cloud and ...

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

We are excited to announce that the upcoming releases of Splunk Enterprise 10.2.x and Splunk Cloud Platform ...

Step into “Hunt the Insider: An Splunk ES Premier Mystery” to catch a cybercriminal ...

After a whole week of being on call, you fell asleep on your keyboard, and you hit a sequence of buttons that ...