I am facing strange behavior, for which I can't find anything in the docs.
I have a source that generates comma-separated CSV files. They are indexed into a dedicated index with a dedicated sourcetype.
A look-alike example:
id,service_id,product_id,shop_id,user_id,blah_blah,whatever,name,date,client_id
1,34456789,12234,23,4,f,45678,ivan,2022-01-13 07:04:49,1
2,34452789,12134,25,4,f,45678,ivan,2022-01-13 07:14:49,1
3,34451789,12134,27,4,f,45678,ivan,2022-01-13 07:14:49,1
4,34451789,12134,27,4,f,45678,ivan,2022-01-13 07:15:49,1
5,34451789,12133,23,4,f,45678,ivan,2022-01-13 07:15:49,1
6,34456789,12234,23,4,f,45678,ivan,2022-01-13 07:04:49,1
7,34452789,12134,25,4,f,45678,ivan,2022-01-13 07:14:49,1
8,34451789,12134,27,4,f,45678,ivan,2022-01-13 07:14:49,1
9,34451789,12134,27,4,f,45678,ivan,2022-01-13 07:15:49,1
Challenge no. 1 is that the script that generates the CSV can edit lines that already exist in the file.
Challenge no. 2 is that these edits do not always lead to the same behavior.
If a value is changed without changing its length (from 1 to 2, or from ivan to ivag; the important part is that the number of characters stays the same), the change is nowhere to be found in the indexed data. However, if the change alters the number of characters (say ivan becomes johnathan), the whole file is re-indexed with the new value, causing lots of duplicate events.
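To make the two cases concrete, here is a minimal Python sketch of how they can be reproduced on a test file. The file name sample.csv and the helper edit_and_report are made up for illustration; this is not the real generating script, only an approximation of the kind of edit it performs. It shows the one thing that differs between the two cases on disk: the file size.

```python
import os
import tempfile


def edit_and_report(path, old, new):
    """Replace the first occurrence of `old` with `new` in the file.

    Returns (size_before, size_after) in bytes.
    Hypothetical helper, only meant to mimic an in-place edit.
    """
    size_before = os.path.getsize(path)
    with open(path, "r", newline="") as f:
        text = f.read()
    with open(path, "w", newline="") as f:
        f.write(text.replace(old, new, 1))
    return size_before, os.path.getsize(path)


rows = [
    "id,service_id,product_id,shop_id,user_id,blah_blah,whatever,name,date,client_id",
    "1,34456789,12234,23,4,f,45678,ivan,2022-01-13 07:04:49,1",
    "2,34452789,12134,25,4,f,45678,ivan,2022-01-13 07:14:49,1",
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")  # assumed name, not the real file
    with open(path, "w", newline="") as f:
        f.write("\n".join(rows) + "\n")

    # Case 1: same number of characters, so the file size does not change.
    same_len = edit_and_report(path, "ivan", "ivag")
    # Case 2: different number of characters, so the file size changes.
    diff_len = edit_and_report(path, "ivan", "johnathan")

print("same-length edit:", same_len)
print("different-length edit:", diff_len)
```

In the first case the file keeps exactly the same size; in the second it grows. Whether that size difference is what decides the behavior is precisely what I am trying to confirm.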
I am sure this must be documented somewhere, but I cannot find it, so I cannot really understand it.
Does anyone know what is going on? I managed to find something in the community about Splunk checking the first 256 characters of a file to decide whether it has already seen it, but I have tested changes both before and after the 256-character threshold.
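For anyone who wants to repeat the 256-character test, this is a small sketch of how to confirm on which side of the threshold an edit lands. The 256 figure is only what I read in the community; zlib.crc32 is my own stand-in and I am not claiming it is the checksum Splunk actually uses. The names HEAD_LEN, head_crc, replace_at and edit_touches_head are hypothetical.

```python
import zlib

HEAD_LEN = 256  # threshold mentioned in the community post (assumption)


def head_crc(data: bytes, length: int = HEAD_LEN) -> int:
    """CRC32 of the first `length` bytes; a stand-in, not Splunk's own check."""
    return zlib.crc32(data[:length])


def replace_at(data: bytes, offset: int, new: bytes) -> bytes:
    """Overwrite len(new) bytes at `offset`, keeping the total length unchanged."""
    return data[:offset] + new + data[offset + len(new):]


def edit_touches_head(before: bytes, after: bytes, length: int = HEAD_LEN) -> bool:
    """True if the edit changed anything within the first `length` bytes."""
    return head_crc(before, length) != head_crc(after, length)


header = b"id,service_id,product_id,shop_id,user_id,blah_blah,whatever,name,date,client_id\n"
row = b"1,34456789,12234,23,4,f,45678,ivan,2022-01-13 07:04:49,1\n"
sample = header + row * 9

# Same-length edit in the first data row (inside the first 256 bytes).
early = replace_at(sample, sample.find(b"ivan"), b"ivag")
# Same-length edit in the last data row (well past the first 256 bytes).
late = replace_at(sample, sample.rfind(b"ivan"), b"ivag")

print("early edit touches head:", edit_touches_head(sample, early))
print("late edit touches head:", edit_touches_head(sample, late))
```

In my tests both kinds of same-length edit went unnoticed in the index, which is why the 256-character explanation alone does not seem to cover what I am seeing.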