Hello,
I am missing data in my current setup (about 20 to 30%).
All the data from Instance A is arriving perfectly well in /var/log/app.log.
However, some events are missing in Splunk.
Would you have any idea what the issue could be, please?
Thank you very much in advance.
If instance B is a Splunk instance, I would suggest using a TCP input in inputs.conf; there is no need to route through rsyslog.
For example:
[tcp://instanceA:514]
For more information, refer to the TCP section of the inputs.conf spec.
I think he is using UDP 514, in which case rsyslog is a very good idea.
Otherwise you lose data every time Splunk restarts.
I do have to use UDP 514, but all the data is arriving in the file (plus it's just traffic within the same VPC, so there is a very low risk of losing datagrams). So UDP is fine; it's just Splunk not indexing everything from the file.
We have been using both syslog forwarding and the TCP listener provided by Splunk to onboard data from different sources, such as firewalls, which produce large volumes of data. Even so, all of our data gets onboarded.
Can you find any pattern in the data that's missed? Do you have a log rotation policy?
Yes, we do have a log rotation policy. It looks like log rotation is the issue, but I am not entirely sure yet. Is it the same on your side?
Apparently it is the log rotation that breaks everything, even if I try setting a higher initCrcLength or crcSalt=.
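For reference, the monitor stanza being discussed would look roughly like the sketch below. It assumes the file is picked up by a plain monitor input in inputs.conf; the initCrcLength value of 1024 is only an illustration of "higher" and is not taken from the actual setup.

```ini
# inputs.conf: sketch only, the value below is illustrative
[monitor:///var/log/app.log]
# Number of bytes hashed from the start of the file to identify it (default 256)
initCrcLength = 1024
# crcSalt is the other setting mentioned above; the thread does not give
# its value, so it is left out here
```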
Are you sure it is "missing"?
We had multiple problems with time extraction:
- another timestamp in the message was picked up as the event time
- an ID was picked up as epoch time
- wrong cut-off for the timestamp (a timestamp without a year followed by an IP address: the first 2 digits of the IP were used as the year)
To see if the timestamp is wrong, start a real-time search. Just watch the timeline for events popping up in the past.
If they pop up in the past, you have to adjust the time extraction in props.conf.
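As a rough sketch of what such a props.conf change could look like: the stanza below assumes every event starts with a classic syslog timestamp without a year (for example "Mar  3 12:00:01"), and "my_app_syslog" is a made-up sourcetype name. Adjust both to the real data before using it.

```ini
# props.conf: sketch only, "my_app_syslog" is a hypothetical sourcetype name
[my_app_syslog]
# The timestamp is assumed to sit at the very start of each event
TIME_PREFIX = ^
# Classic syslog timestamp without a year, e.g. "Mar  3 12:00:01"
TIME_FORMAT = %b %d %H:%M:%S
# Stop reading after the 15 characters of that timestamp so that a
# following IP address or ID cannot be picked up as part of the date
MAX_TIMESTAMP_LOOKAHEAD = 15
```

Anchoring the timestamp with TIME_PREFIX and limiting the lookahead is meant to address the three cases listed above: a second timestamp later in the message, a numeric ID read as epoch time, and IP digits read as the year.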
I am going to try a real-time search right now, but when I check whether the data is arriving, I do a very general search looking specifically for content in _raw, with no sourcetype or source filter.