
Why is Splunk not able to index large CSV files?

premforsplunk
Explorer

I'm trying to index a 3.5 GB CSV file, but Splunk is not reading it. Any clues?


nickhills
Ultra Champion

I had to do this once for a huge CSV, but I also had to prune a few rows and didn't need half the columns (so not exactly the same as your requirement).

I actually opted to do this with a Python scripted input, which allowed me to pre-process the file as it went in and dump it to stdout as key=value pairs.
Once it was complete I disabled the input, but it meant I had the ability to run it again if ever needed.
I never did need it again, but even so, it was time well spent making sure the data was concise when it went in.
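
For illustration, a minimal sketch of that approach might look like the following (this is not my original script; the file path, column names, and pruning rule are placeholders to adapt):

#!/usr/bin/env python
# Sketch of a scripted input: stream a large CSV, keep selected columns,
# drop unwanted rows, and emit key=value pairs on stdout for Splunk.
import csv
import sys

CSV_PATH = "/data/huge_export.csv"                       # placeholder path
KEEP_COLUMNS = ["timestamp", "host", "status", "bytes"]  # placeholder columns

def main():
    with open(CSV_PATH, newline="") as f:
        for row in csv.DictReader(f):
            # Placeholder pruning rule: skip rows with an empty status field
            if not row.get("status"):
                continue
            pairs = " ".join('{}="{}"'.format(c, row.get(c, "")) for c in KEEP_COLUMNS)
            sys.stdout.write(pairs + "\n")

if __name__ == "__main__":
    main()

And a hypothetical inputs.conf stanza to register it, which you can disable once the load is done, as described above:

[script://./bin/csv_loader.py]
# -1 runs the script once at start-up rather than on a schedule
interval = -1
sourcetype = csv_kv
index = main
disabled = 0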

If my comment helps, please give it a thumbs up!

MousumiChowdhur
Contributor

Hi!

You can add the configuration parameters below to the limits.conf file of your app:

[kv]
limit =
maxcols =

The default value for limit is 100 and for maxcols is 512. Try indexing your CSV after increasing these values.
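
For example, a stanza with raised limits might look like this (the values are illustrative only; size them to your data):

[kv]
limit = 1000
maxcols = 1024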

Link for your reference.
https://docs.splunk.com/Documentation/Splunk/7.0.1/Admin/Limitsconf

Thanks.


mayurr98
Super Champion

What configuration have you done to index this CSV?
If connectivity is in place, add the following monitor on your Splunk forwarder using the CLI (the file path is a placeholder for your CSV's location):
./splunk add monitor /path/to/file.csv -index index_name -sourcetype csv

Also, on the indexer, create the same index that you specified as index_name.
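
For reference, one way to create that index from the CLI on the indexer (index_name again being whatever name you chose):

./splunk add index index_name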
