We have a distributed Splunk (8.x) environment on-prem, with a CM and 3 peers, 2 SHs, 1 deployment server, and many clients.
On one of my Windows 10 clients, I have a CSV file that gets new data appended to it every 5 minutes by a separate Python script that runs locally. I have verified the appending is working as intended. On the UF, I have set up an app with a monitor input to watch the file. A custom props.conf is located on the indexer peers. I am experiencing some unexpected behavior and am struggling to find a solution.
The first odd thing I have noticed is that when data is appended to the CSV file, either by the Python script or manually, I sometimes get an event ingested and sometimes do not. If I manually add lines 4 or 5 times over a 15-minute period, I might get 1 event. Sometimes I won't get any events at all, but if I do get an event, it's never more than 1.
The second weird thing I noticed is that the event is always the top line of the CSV file, never the line that I added manually or via Python to the bottom of the file. The file is over 2500 lines. I have verified that the lines are actually appended to the bottom, and persist.
I suspect that there might be an issue with the timestamp or possibly the LINE_BREAKER, but I cannot say definitively that is the issue. (Maybe the LINE_BREAKER regex is not actually matching?)
I can take the CSV file and add it without issue using the "Add Data" process in Splunk Web. It breaks the events exactly how I would expect (not just the top line), with correct timestamps, and looks perfect. I copied the props that Add Data generated in the web UI and added them to my app's custom props.conf that is pushed to the peers. I am still left with the same weird behavior I experienced with my own custom props.conf. I am reaching a point where googling is becoming too specific to get me a solid lead on my next troubleshooting steps. Does anyone know what might be causing either of these issues?
Here is a snippet of the csv file (the top 3 lines):
profile_val,start_time,scheduler_tree,bot_name,tree_name,query_val,kill_option,stopped_on,remarks,status_val,ts
Fit,11/03/2022 19:34:00.277,lb_prod_scheduler,spotbot,,%20redacted%20auto,false,11/03/2022 19:40:00.107,BOT did not pinged for more than 300 seconds,failed,2022-11-04 00:40:05
Fit,11/03/2022 19:49:00.143,lb_prod_scheduler,spotbot,,%20redacted%20auto,false,11/03/2022 19:55:00.091,BOT did not pinged for more than 300 seconds,failed,2022-11-04 00:55:05
Here is my custom props.conf that I initially tried:
[bot:logs]
description = BOT Application Status Logs
category = Custom
disabled = false
CHARSET=UTF-8
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT = %m/%d/%Y%t%H:%M:%S.%f
TIME_PREFIX = Fit,
INDEXED_EXTRACTIONS=CSV
FIELD_DELIMITER=,
FIELD_NAMES=profile_val,start_time,scheduler_tree,bot_name,tree_name,query_val,kill_option,stopped_on,remarks,status_val,ts
Here is the "Add Data" props.conf that works in the WebUI but not on the indexers:
[bot:logs]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
EXTRACT-Ex_account=^(?:[^:\n]*:){4}(?P<Ex_account>\d+)
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
description=Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled=false
pulldown_type=true
Hi @calvinmcelroy,
Indexed extractions (INDEXED_EXTRACTIONS = csv) are one of the cases where you also have to deploy props.conf on the forwarder, because the structured parsing is done there and not on the indexers.
Let me know.
About the timestamp, please share a sample of your data and your props.conf.
Ciao.
Giuseppe
OK, now I am feeling pretty foolish. The logs in my CSV file have timestamps from over a month ago. After getting the idea to search over the All Time range, I noticed that the events are in fact being ingested, and the timestamps are coming in correctly. I was fooling myself, thinking that the events would be indexed with the current time. No, they are coming in exactly where they should be, about a month and a half ago.
If anyone sees anything foolish in the props files, I would definitely take some tips if I am missing anything.
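One possible cleanup, offered as a sketch under assumptions and not as a tested config: because INDEXED_EXTRACTIONS is set, the universal forwarder does the structured parsing, so this stanza would live in the app deployed to the UF. It assumes the first line of the file is the header row and that start_time is the column you want as the event time (the original TIME_PREFIX = Fit, points at that column); swap in ts if that is the one you intend. For structured data the timestamp column is normally named with TIMESTAMP_FIELDS, so TIME_PREFIX should not be needed, and %3N is Splunk's notation for millisecond subseconds.

```
# props.conf in the app on the universal forwarder (sketch, untested)
[bot:logs]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ,
# Assumption: line 1 of the file holds the column names,
# so a separate FIELD_NAMES list is not needed
HEADER_FIELD_LINE_NUMBER = 1
# Assumption: start_time is the column to use as the event time
TIMESTAMP_FIELDS = start_time
TIME_FORMAT = %m/%d/%Y %H:%M:%S.%3N
CHARSET = UTF-8
NO_BINARY_CHECK = true
```

Search-time settings such as KV_MODE = none take effect on the search heads, so that line would go in the props.conf deployed there.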
Thank you. I will try this out.