Getting Data In

CSV Field Extraction Cutting Off at 1,000 Characters

Jornoh
Loves-to-Learn

I am trying to ingest data from a CSV file. One of the columns in the CSV file contain SQL queries. The header has field names that are comma-separated, but the field containing the SQL queries is not being extracted correctly. It seems that if any of the SQL queries are longer than 1,000 characters, Splunk will only extract the first 1,000 characters of the query to the field. How can I make it so that Splunk will extract the entire SQL query from the CSV file?

For example, here is the _raw for one of the rows in the CSV file:

2024-07-15 16:30:12.207504,job_name,24 9 * * *,Warehouse_ABCDEF,indexName,sourcetype,"SELECT Lowest_Tier_Rating, Application_Name, Base_Application_Name, Application_Status, Application_Short_Description, Application_Comments, Application_CDL_Associated_Legal_Entities, Application_Environment, Application_Function, Application_Platform, RegSCI, RegSCI_Indirect, RegSCI_Critical, Application_Title, Application_Support_Group, Application_Info_Classification, Application_Info_Risk, Sr_Systems_Director_Responsible_Name, Sr_Systems_Director_Responsible_ID, Team_Process_Lead_Responsible_Name, Team_Process_Lead_Responsible_ID, Executive_Director_Responsible_Name, Executive_Director_Responsible_ID, Systems_Director_Responsible_Name, Systems_Director_Responsible_ID, Managing_Director_Responsible_Name, Managing_Director_Responsible_ID, Team_Content_Lead_Responsible_Name, Team_Content_Lead_Responsible_ID, Area_Process_Lead_Responsible_Name, Area_Process_Lead_Responsible_ID, Area_Content_Lead_Responsible_Name, Area_Content_Lead_Responsible_ID, manager_Responsible_Name, Manager_Responsible_ID, Infrastructure_Team_Lead_Responsible_Name, Infrastructure_Team_Lead_Responsible_ID, Domain_Lead_Responsible_Name, Domain_Lead_Responsible_ID, ResolvedDeviceName, DeviceName, IP_Address, Common_Name, Host, Valid_From, Valid_To, Source, Validity_Period, Validity_Period_Months, Key_Size, Signature_Algorithm, Organizational_Unit, Issuer, Serial_Number, Contact, Installations, Nickname, Problems, NC, Port, Device, DN, Validity_Status, IsManaged\ FROM ""ab_cd_ef"".""Dashboards"".""ABC_Venafi_Certificate_MappedTo_ABCDEF_Applications"";",0,1

 

And here is the extracted SQL Query:

SELECT Lowest_Tier_Rating, Application_Name, Base_Application_Name, Application_Status, Application_Short_Description, Application_Comments, Application_CDL_Associated_Legal_Entities, Application_Environment, Application_Function, Application_Platform, RegSCI, RegSCI_Indirect, RegSCI_Critical, Application_Title, Application_Support_Group, Application_Info_Classification, Application_Info_Risk, Sr_Systems_Director_Responsible_Name, Sr_Systems_Director_Responsible_ID, Team_Process_Lead_Responsible_Name, Team_Process_Lead_Responsible_ID, Executive_Director_Responsible_Name, Executive_Director_Responsible_ID, Systems_Director_Responsible_Name, Systems_Director_Responsible_ID, Managing_Director_Responsible_Name, Managing_Director_Responsible_ID, Team_Content_Lead_Responsible_Name, Team_Content_Lead_Responsible_ID, Area_Process_Lead_Responsible_Name, Area_Process_Lead_Responsible_ID, Area_Content_Lead_Responsible_Name, Area_Content_Lead_Responsible_ID, manager_Responsible_Name, Manager_Respo

Labels (2)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

And your props for this sourcetype are...?

0 Karma

Jornoh
Loves-to-Learn

Hi, here are the props.conf for the CSV file:

DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
description = This  sourcetype stores all the DB connect information
disabled = false
pulldown_type = true

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Monitoring AI Agents with Splunk Observability Cloud

Let’s say I’m running a travel planning AI app in production. A user asks for three concise hotel options in ...

[Puzzles] Solve, Learn, Repeat: Tiling

This puzzle (first published here) is based on finding groups of tessellated tiles (inspired by floor tiles I ...

SOK it to Me: Top 3 Benefits of Using Splunk Operator on Kubernetes that’ll Make ...

    Thursday, July 9, 2026  |  11:00AM–12:00PM PDT Duration: 1 hour (includes Q&A) Managing can feel like a ...