Getting Data In

Why are we missing data in Splunk after rsyslog?

Arkon
Explorer

Hello,

I am missing data in my current setup (about 20 to 30%).

  1. Instance A is sending data to Instance B on port 514
  2. Instance B uses rsyslog to get the data and log it into a file called /var/log/app.log
  3. Splunk indexes /var/log/app.log

All the data from Instance A arrives intact in /var/log/app.log.
However, some events are missing in Splunk.

Do you have any idea what the issue could be?
Thank you very much in advance.

hardikJsheth
Motivator

If instance B is a Splunk instance, I would suggest using a TCP input in inputs.conf; there is no need to route the data through rsyslog.
For example:
[tcp://instanceA:514]

For more information, refer to the TCP section of the inputs.conf spec.
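
As a rough sketch of what such an input could look like, assuming instance B runs Splunk and instance A can send over TCP (the sourcetype name app_syslog is a hypothetical placeholder, not something from this thread):

```
# inputs.conf on instance B (illustrative sketch)
# Accept TCP connections on port 514 only from the host named instanceA
[tcp://instanceA:514]
# Hypothetical sourcetype name; use whatever fits your data
sourcetype = app_syslog
# Set the host field from the sender's reverse DNS name
connection_host = dns
```

Listening on a port below 1024 normally requires Splunk to run with root privileges, so a higher port is often used instead.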

teunlaan
Contributor

I think he is using UDP 514, in which case rsyslog is a very good idea.
Otherwise you lose data every time Splunk restarts.

Arkon
Explorer

I do have to use UDP 514, but all the data arrives in the file without problems (and the traffic stays within the same VPC, so the risk of losing datagrams is very low). So UDP is fine; Splunk is just not indexing everything from the file.

hardikJsheth
Motivator

We use both syslog forwarding and the TCP listener provided by Splunk to onboard data from different sources, such as firewalls, which produce large volumes of data, and all of our data gets onboarded.

Can you find any pattern in the data that is missed? Do you have a log rotation policy?

Arkon
Explorer

Yes, we do have a log rotation policy. It looks like log rotation is the issue, but I am not entirely sure yet. Is it the same on your side?
Apparently log rotation is what breaks everything, even if I set a higher initCrcLength or a crcSalt value.
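
For reference, a monitor stanza that tolerates rotation might look like the sketch below. It assumes logrotate renames the file to /var/log/app.log.1 and compresses it only on a later cycle (delaycompress); the sourcetype name is hypothetical. Splunk identifies a file by a CRC of its first initCrcLength bytes (256 by default), so a renamed file it has already read should not be indexed twice, while monitoring the rotated name gives it a chance to read lines written just before the rotation.

```
# inputs.conf (illustrative sketch)
# Match the live file and its rotated copy, e.g. /var/log/app.log.1
[monitor:///var/log/app.log*]
# Ignore compressed archives from older rotations
blacklist = \.gz$
# Hypothetical sourcetype name
sourcetype = app_syslog
```

If logrotate uses copytruncate instead of renaming, Splunk may not have read the tail of the file before it is truncated, so that option is worth checking.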

teunlaan
Contributor

Are you sure it is "missing"?

We had multiple problems with time extraction:
- another timestamp in the message was picked up as the event time
- an ID was picked up as epoch time
- wrong cut-off for the timestamp (a timestamp without a year followed by an IP address: the first 2 digits of the IP were used as the year)

To see if the timestamp is wrong, start a real-time search and watch the timeline for events popping up in the past.

If they pop up in the past, you have to alter props.conf for time extraction.
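
A minimal props.conf sketch for pinning down timestamp extraction, assuming events start with a syslog-style timestamp without a year (for example "Jan 12 08:15:42") and a hypothetical sourcetype named app_syslog:

```
# props.conf (illustrative sketch); app_syslog is a hypothetical sourcetype name
[app_syslog]
# The timestamp is assumed to sit at the very start of the event
TIME_PREFIX = ^
# Assumed format: month abbreviation, day, time, no year
TIME_FORMAT = %b %d %H:%M:%S
# Only look at the first 15 characters after TIME_PREFIX
MAX_TIMESTAMP_LOOKAHEAD = 15
```

With the lookahead limited to the 15 characters of the timestamp, digits from a following IP address or ID cannot be read as a year or epoch time.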

Arkon
Explorer

I am going to try a real-time search right now, but when I check whether the data is arriving, I run a very general search for specific content in _raw, with no sourcetype or source filter.
