We have a distributed Splunk (8.x) environment on-prem, with a CM and 3 peers, 2 SHs, 1 deployment server, and many clients.
On one of my Windows 10 clients, I have a CSV file that gets new data appended to it every 5 minutes by a separate Python script that runs locally. I have verified that the appending works as intended. On the UF, I have set up an app with a monitor input to watch the file. A custom props.conf is located on the indexer peers. I am experiencing some unexpected behavior and am struggling to find a solution.
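A minimal sketch of an append of this kind, assuming the script uses Python's csv module. The real script is not shown here, and `append_row` and the example path are hypothetical names used only for illustration.

```python
import csv


def append_row(csv_path, row):
    """Append one row to the end of a CSV file.

    Assumption: the file is opened in append mode with newline="" so the
    csv module writes its own line terminator (\r\n by default) and every
    appended row ends with a line break.
    """
    with open(csv_path, "a", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerow(row)


# Hypothetical usage (the path is a placeholder, not the real location):
# append_row(r"C:\example\bot_status.csv", ["Fit", "11/03/2022 19:34:00.277"])
```

Written this way, each appended row is terminated by a line break, so the next append always starts on a fresh line.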
The first odd thing I have noticed is that when data is appended to the CSV file, either by the Python script or manually, an event is sometimes ingested and sometimes not. If I manually add lines 4 or 5 times over a 15-minute period, I might get 1 event. Sometimes I won't get any events at all, and when I do get an event, it's never more than 1.
The second weird thing I noticed is that the event is always the top line of the CSV file, never the line that I added manually or via Python to the bottom of the file. The file is over 2500 lines. I have verified that the lines are actually appended to the bottom, and that they persist.
I suspect that there might be an issue with the timestamp or possibly the LINE_BREAKER, but I cannot say definitively that either is the issue. (Maybe the LINE_BREAKER regex is not actually there?)
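One way to test the line-break suspicion outside Splunk is to apply the same regex to the raw bytes of the file. The sketch below only approximates the behavior with Python's `re.split`; it is not how Splunk itself processes the file, and `describe_line_breaks` is a hypothetical helper name.

```python
import re

# Same pattern as LINE_BREAKER in props.conf, applied to raw bytes.
LINE_BREAKER = re.compile(rb"([\r\n]+)")


def describe_line_breaks(data: bytes) -> dict:
    """Report how the pattern splits the data and which separators occur."""
    # With one capture group, split() returns [text, sep, text, sep, ...].
    parts = LINE_BREAKER.split(data)
    events = [p for p in parts[0::2] if p]
    separators = sorted(set(parts[1::2]))
    return {
        "events": len(events),
        "separators": separators,
        "ends_with_newline": data.endswith((b"\n", b"\r")),
    }


# Hypothetical usage (placeholder path):
# from pathlib import Path
# print(describe_line_breaks(Path(r"C:\example\bot_status.csv").read_bytes()))
```

If `ends_with_newline` is false, or the separators are a mix of `\r\n` and `\n`, that would be worth ruling out before looking further at the props.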
I can take the CSV file and add it without issue using the "Add Data" process in Splunk Web. It breaks the events exactly how I would expect (not just the top line), with correct timestamps, and looks perfect. I copied the props generated by the Add Data method from the web UI and added them to my app's custom props.conf that is pushed to the peers. I am still left with the same weird behavior that I experienced with my original custom props.conf file. I am reaching a point where my searches are becoming too specific to give me a solid lead on my next troubleshooting steps. Does anyone know what might be causing either of these issues?
Here is a snippet of the CSV file (the top 3 lines):
profile_val,start_time,scheduler_tree,bot_name,tree_name,query_val,kill_option,stopped_on,remarks,status_val,ts
Fit,11/03/2022 19:34:00.277,lb_prod_scheduler,spotbot,,%20redacted%20auto,false,11/03/2022 19:40:00.107,BOT did not pinged for more than 300 seconds,failed,2022-11-04 00:40:05
Fit,11/03/2022 19:49:00.143,lb_prod_scheduler,spotbot,,%20redacted%20auto,false,11/03/2022 19:55:00.091,BOT did not pinged for more than 300 seconds,failed,2022-11-04 00:55:05
Here is my custom props.conf that I initially tried:
[bot:logs]
description = BOT Application Status Logs
category = Custom
disabled = false
CHARSET=UTF-8
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT = %m/%d/%Y%t%H:%M:%S.%f
TIME_PREFIX = Fit,
INDEXED_EXTRACTIONS=CSV
FIELD_DELIMITER=,
FIELD_NAMES=profile_val,start_time,scheduler_tree,bot_name,tree_name,query_val,kill_option,stopped_on,remarks,status_val,ts
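To rule out the timestamp settings, the TIME_PREFIX and TIME_FORMAT above can be checked against a sample line outside Splunk. This is a rough sketch under stated assumptions: `%t` is taken to stand for the whitespace between date and time (Python's `strptime` has no `%t`, so a literal space is used), and TIME_PREFIX is treated as a plain prefix at the start of the line rather than a regex. `parse_start_time` is a hypothetical helper name.

```python
from datetime import datetime

# First data line from the CSV snippet above.
SAMPLE = ("Fit,11/03/2022 19:34:00.277,lb_prod_scheduler,spotbot,,"
          "%20redacted%20auto,false,11/03/2022 19:40:00.107,"
          "BOT did not pinged for more than 300 seconds,failed,"
          "2022-11-04 00:40:05")

TIME_PREFIX = "Fit,"
# Python equivalent of %m/%d/%Y%t%H:%M:%S.%f, assuming %t means whitespace.
PY_TIME_FORMAT = "%m/%d/%Y %H:%M:%S.%f"


def parse_start_time(line):
    """Return the parsed start_time, or None if the prefix is absent."""
    if not line.startswith(TIME_PREFIX):
        return None
    field = line[len(TIME_PREFIX):].split(",", 1)[0]
    return datetime.strptime(field, PY_TIME_FORMAT)


print(parse_start_time(SAMPLE))
```

Note that the header line does not start with `Fit,`, so under these assumptions the prefix never matches there and `parse_start_time` returns `None` for it.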
Here is the "Add Data" props.conf that works in the WebUI but not on the indexers:
[bot:logs]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
EXTRACT-Ex_account=^(?:[^:\n]*:){4}(?P<Ex_account>\d+)
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
description=Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled=false
pulldown_type=true