Getting Data In

Ingesting Data into the Correct Splunk Index

Abass42
Communicator

I have had a few issues ingesting data into the correct index. We are deploying an app from the deployment server, and this particular app has two clients. Initially, when I set this app up, I was ingesting data into our o365 index. The data looked something like:

[screenshot: sample events in the o365 index]

We have a team running a script that tracks all deleted files. Each line came in as a separate event, and at the time my inputs.conf looked like:

[monitor://F:\scripts\DataDeletion\SplunkReports]
index=o365
disabled=false
source=DataDeletion

It would ingest all of the CSV files within that DataDeletion directory, everything under it, and this worked.

I changed the index to testing so I could manage the new data a bit better while we were still testing it. One inputs.conf backup shows that I had this at some point:

[monitor://F:\scripts\DataDeletion\SplunkReports\*.csv]
index=testing
disabled=false
sourcetype=DataDeletion
crcSalt = <string>

 

Now, months later, I have changed the inputs.conf to ingest everything into the o365 index, and I have applied that change and pushed it out to the server class using the deployment server, and yet the most recent data looks different. The last events we ingested went into the testing index and looked like:

[screenshots: the most recent events in the testing index, with many lines merged into single events]

This may be due to how the script is sending data into Splunk, but it looks like it's aggregating hundreds of separate lines into one event. My inputs.conf currently looks like this:

[monitor://F:\scripts\DataDeletion\SplunkReports\*]
index = o365
disabled = 0
sourcetype = DataDeletion
crcSalt = <SOURCE>
recursive = true
#whitelist = \.csv


[monitor://F:\SCRIPTS\DataDeletion\SplunkReports\*]
index = o365
disabled = 0
sourcetype = DataDeletion
crcSalt = <SOURCE>
recursive = true
#whitelist = \.csv


[monitor://D:\DataDeletion\SplunkReports\*]
index = o365
disabled = 0
sourcetype = DataDeletion
crcSalt = <SOURCE>
recursive = true
#whitelist = \.csv

 

I am just trying to grab everything under D:\DataDeletion\SplunkReports\ on the new Windows servers, ingest all of the CSV files under there, and break each line of the CSV into a new event. What is the proper syntax for this input, and what am I doing wrong? I have tried a few things and none of them seem to work: I've tried adding a whitelist and adding a blacklist, and I have recursive and crcSalt there just to grab anything and everything. And if the script isn't at fault for sending chunks of data as one event, would adding a props.conf fix how Splunk is ingesting this data? Thanks for any help.
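For reference, my best guess at what the corrected stanza should look like is below. The whitelist regex is just my attempt at restricting the match to CSV files, so I may have it wrong:

[monitor://D:\DataDeletion\SplunkReports]
index = o365
disabled = 0
sourcetype = DataDeletion
whitelist = \.csv$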


Jawahir
Communicator

Use INDEXED_EXTRACTIONS = CSV in props.conf for your sourcetype, and push it to the Universal Forwarder too, along with inputs.conf.

props.conf

[DataDeletion]
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ,
# Replace with your actual field names
FIELD_NAMES = field1, field2, field3, field4
# Adjust based on your timestamp format
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Replace with the actual field containing the timestamp
TIMESTAMP_FIELDS = timestamp_field
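
If events still come in merged together after that, explicitly disabling line merging for the sourcetype is a common extra safeguard. This is a general sketch, assuming each CSV row ends in a newline:

[DataDeletion]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

Since INDEXED_EXTRACTIONS = CSV makes the Universal Forwarder parse the file itself, this props.conf must be deployed to the forwarder, not only to the indexers.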

