Getting Data In

Splunk is generating duplicate events from syslog

rakesh_498115
Motivator

Hi

I have been using syslog-ng to store my server logs, and Splunk monitors the syslog.log file located under /opt/splunk/var/syslog-ng/. While Splunk is monitoring the file, I see duplicate events in my index. When I checked the splunkd log file, I found entries like these at particular timestamps:

06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.

I can see Splunk reading the file twice, hence the duplicate events in my index. Here is a snippet of the splunkd log file:

06-17-2013 07:18:30.689 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:33.690 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:36.690 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:39.690 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:42.690 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:45.692 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:50.551 +0100 INFO  BatchReader - Removed from queue file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:56.561 +0100 INFO  TcpOutputProc - Connected to idx=host1:8089
06-17-2013 07:19:26.563 +0100 INFO  TcpOutputProc - Connected to idx=host2:8089
06-17-2013 07:19:56.576 +0100 INFO  TcpOutputProc - Connected to idx=host3:8089
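
To put numbers on how often the file is re-read, an excerpt like the one above can be tallied with a small script. The following is a minimal Python sketch, not a Splunk tool: the regular expression is an assumption based only on the line format shown in this post, and `offset_zero_reads` is a hypothetical helper name.

```python
import re
from collections import Counter
from datetime import datetime

# Sample lines copied from the splunkd.log excerpt in this post.
LOG = """\
06-17-2013 07:18:42.690 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:45.692 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO  WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:50.551 +0100 INFO  BatchReader - Removed from queue file='/opt/splunk/var/syslog-ng/syslog.log'.
"""

# Assumed line layout: timestamp, UTC offset, level, component, message.
PATTERN = re.compile(
    r"^(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) \S+\s+INFO\s+WatchedFile - "
    r"Will begin reading at offset=0 for file='([^']+)'"
)

def offset_zero_reads(text):
    """Return (timestamp, path) for every 'begin reading at offset=0' line."""
    hits = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%m-%d-%Y %H:%M:%S.%f")
            hits.append((ts, m.group(2)))
    return hits

hits = offset_zero_reads(LOG)
per_file = Counter(path for _, path in hits)          # re-reads per file
same_instant = Counter(hits)                          # identical (timestamp, path) pairs
repeated = {k: n for k, n in same_instant.items() if n > 1}

print(per_file)
print(repeated)
```

On this sample, `per_file` counts four offset=0 reads of syslog.log, and `repeated` holds the single timestamp (07:18:48.691) that was logged twice.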

Can anyone help me understand what is happening here? Why is Splunk reading the file twice and generating duplicate events?

For syslog log rotation I have defined the following configuration in the syslog-ng logrotate file:

//syslog-ng logrotation configuration

/etc/logrotate.d/syslog-ng

/opt/splunk/var/syslog-ng/syslog.log {
        size 30M
        copytruncate
        create 750 splunk splunk
        rotate 500
}

Crontab entry to check the syslog size every 5 minutes and rotate:

// crontab

#Added entry to rotate logs generated from syslog-ng
*/5 * * * * /usr/sbin/logrotate /etc/logrotate.d/syslog-ng
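
For readers unfamiliar with `copytruncate`: logrotate copies the live file and then truncates it in place, and its man page notes that the `create` option has no effect when `copytruncate` is used, because the old file stays where it is. The Python sketch below imitates that on a temporary file and shows how a simple offset-tracking reader reacts. `NaiveTailer` is a hypothetical stand-in written for this illustration; it is not Splunk's file-tracking logic.

```python
import os
import shutil
import tempfile

def copytruncate(path, rotated_path):
    """Mimic logrotate's copytruncate: copy the log, then truncate it in place."""
    shutil.copyfile(path, rotated_path)
    with open(path, "r+") as fh:
        fh.truncate(0)

class NaiveTailer:
    """Hypothetical tailer that remembers one byte offset for one file."""
    def __init__(self, path):
        self.path = path
        self.offset = 0

    def read_new(self):
        size = os.path.getsize(self.path)
        if size < self.offset:
            # The file shrank: assume truncation and start again at offset 0.
            self.offset = 0
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            data = fh.read()
        self.offset += len(data)
        return data.decode()

workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "syslog.log")
rotated = live + ".1"

with open(live, "w") as fh:
    fh.write("event 1\nevent 2\n")

tailer = NaiveTailer(live)
first = tailer.read_new()            # reads both events

inode_before = os.stat(live).st_ino
copytruncate(live, rotated)          # rotation happens here
inode_after = os.stat(live).st_ino

with open(live, "a") as fh:
    fh.write("event 3\n")
second = tailer.read_new()           # restarts at offset 0, reads only the new event

print(repr(first), repr(second), inode_before == inode_after)
```

The live file keeps its inode and the rotated copy holds the same bytes the tailer already read, so any reader that also picked up syslog.log.1 without recognising it would see those events a second time.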

I clearly see duplicates. You can see them in the screenshot below.

[screenshot]

lukejadamec
Super Champion

I had the exact same problem with syslog (and others).

Try changing the inputs.conf for monitor:///var/log (or whichever stanza controls your syslogs):

Blacklist .gz

Whitelist .log$

Basically, you want to ignore all rotated logs, and just tail the current log. It worked for me.

lukejadamec
Super Champion

Try this in your inputs.conf. The blacklist is overkill, but it can't hurt. Also, it looks like you're missing a \

[monitor:///opt/splunk/var/syslog-ng/syslog.log]
blacklist = (\.log\.)
whitelist = (syslog\.log$)
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog

lukejadamec
Super Champion

There was a post recently about a bug that resulted in duplicate events, but I don't think it applies to you: http://answers.splunk.com/answers/100001/504-duplicate-blocks-of-events

rakesh_498115
Motivator

# | _time | indextime | source | _raw
1 | 8/22/13 3:50:44.287 AM | 08/22/2013 03:54:22 | /opt/splunk/var/syslog-ng/syslog.log | 2013-08-22T03:50:44.287+01:00 10.35.90.213 08.22.2013 03:49:07,705 Id-00fbd1d452157c2343408da5 The filter 'Request document:' lXXXXXXXX
2 | 8/22/13 3:50:44.287 AM | 08/22/2013 03:50:46 | /opt/splunk/var/syslog-ng/syslog.log | 2013-08-22T03:50:44.287+01:00 10.35.90.213 08.22.2013 03:49:07,705 Id-00fbd1d452157c2343408da5 The filter 'Request document:' lXXXXXXXX

rakesh_498115
Motivator

I used the query below to check for duplicates. It clearly shows them as duplicates.

index="xmlgapps" xmlg_message="Request document" Id-00fbd1d452157c2343408da5 | convert ctime(_indextime) AS indextime | table _time indextime source _raw

In the output you can see the same source name, the same _time and the same _raw event, but a different _indextime, i.e. Splunk is indexing the same event twice.
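
The same check can be done offline on exported results. This is a minimal Python sketch that groups rows by (source, _time, _raw) and reports any group indexed more than once; the two rows are transcribed from the search output in this thread, and `find_duplicates` is a hypothetical helper name.

```python
from collections import defaultdict

RAW = ("2013-08-22T03:50:44.287+01:00 10.35.90.213 08.22.2013 03:49:07,705 "
       "Id-00fbd1d452157c2343408da5 The filter 'Request document:' lXXXXXXXX")
SOURCE = "/opt/splunk/var/syslog-ng/syslog.log"

# Each row: (_time, indextime, source, _raw), as in the table output of the search.
rows = [
    ("8/22/13 3:50:44.287 AM", "08/22/2013 03:54:22", SOURCE, RAW),
    ("8/22/13 3:50:44.287 AM", "08/22/2013 03:50:46", SOURCE, RAW),
]

def find_duplicates(rows):
    """Map (source, _time, _raw) to its index times, keeping only repeated events."""
    seen = defaultdict(list)
    for event_time, index_time, source, raw in rows:
        seen[(source, event_time, raw)].append(index_time)
    return {key: times for key, times in seen.items() if len(times) > 1}

dups = find_duplicates(rows)
for (source, event_time, _raw), times in dups.items():
    print(source, event_time, "indexed at", times)
```

Here the single event appears with two index times roughly three and a half minutes apart, which points at the file being read again rather than at the sender writing the line twice.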

rakesh_498115
Motivator

By the way, this config is defined on a heavy forwarder, version 4.3.2. Is this a bug in Splunk?

rakesh_498115
Motivator

There is no other input configuration. Only one input configuration exists, and at present it looks like this:

[monitor:///opt/splunk/var/syslog-ng/syslog.log]
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
whitelist = (syslog.log$)

Log rotation is done by the above-mentioned conf /etc/logrotate.d/syslog-ng and the cron job. By default the rotated logs are named syslog.log.1, syslog.log.2, and so on. I can't find any clue as to where it is going wrong. 😞

lukejadamec
Super Champion

Let's back up a bit. Earlier you said you 'removed the tail option from inputs.conf and still saw duplicates'. That is not possible unless there is another input configuration for the syslog folder. It could be a script, or a UDP input. What apps do you have installed on the heavy forwarder?

Also, please post your current input stanza, and how are you changing the log file names when you rotate them?

rakesh_498115
Motivator

Yes, I ran it, and I could see only one monitored file, and only one Splunk instance is reading it. My syslog server has the heavy forwarder installed on it.

lukejadamec
Super Champion

Did you run the btool on the syslog server?
./splunk cmd btool inputs list

rakesh_498115
Motivator

Hmm, I did the same. In the "source" field I see only syslog.log coming in, but even then I am seeing duplicates. Please help. Can you share the log rotation configuration for your syslog-ng? Why is Splunk creating duplicates if they are not in my source file? 😞

lukejadamec
Super Champion

Don't forget to escape the .
whitelist=(syslog\.log$)
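
What the escape changes can be checked outside Splunk. This is a small Python sketch that assumes the whitelist is applied as an unanchored regular expression against the full file path; Python's `re` stands in for Splunk's regex engine, and the `syslog_log` path is a made-up name for illustration.

```python
import re

paths = [
    "/opt/splunk/var/syslog-ng/syslog.log",
    "/opt/splunk/var/syslog-ng/syslog.log.1",
    "/opt/splunk/var/syslog-ng/syslog_log",   # hypothetical file name
]

escaped = re.compile(r"syslog\.log$")    # '.' must be a literal dot
unescaped = re.compile(r"syslog.log$")   # '.' matches any single character

escaped_hits = [p for p in paths if escaped.search(p)]
unescaped_hits = [p for p in paths if unescaped.search(p)]

print("escaped:  ", escaped_hits)
print("unescaped:", unescaped_hits)
```

Both forms exclude syslog.log.1 because of the trailing `$`; the escape only stops the pattern from also accepting names such as syslog_log.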

rakesh_498115
Motivator

I added the whitelist option to the above stanza, but even that didn't work. I used something like this:

[monitor:///opt/software/syslog-ng/syslog.log]
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
whitelist=syslog.log$

lukejadamec
Super Champion

Sorry, the slashes were removed from the post.
You need an escape slash before each '.'
I'll try again.
blacklist = (\.log\.)
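
The missing backslashes matter more for the blacklist than they might seem. The Python sketch below compares the escaped and unescaped patterns against the live and rotated file names from this thread, assuming the blacklist is an unanchored regular expression tested against the full path (Python's `re` is used here as a stand-in for Splunk's regex engine).

```python
import re

live = "/opt/splunk/var/syslog-ng/syslog.log"
rotated = "/opt/splunk/var/syslog-ng/syslog.log.1"

escaped = re.compile(r"\.log\.")   # literal ".log."
unescaped = re.compile(r".log.")   # any char + "log" + any char

# Each tuple is (matches live file, matches rotated file).
results = {
    "escaped": (bool(escaped.search(live)), bool(escaped.search(rotated))),
    "unescaped": (bool(unescaped.search(live)), bool(unescaped.search(rotated))),
}

print(results)
```

With the backslashes, only the rotated name matches. Without them, the pattern also matches the live path (for example on the "slog." inside "syslog.log"), so under the stated assumption an unescaped blacklist could exclude the very file that should be monitored.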

lukejadamec
Super Champion

Try this.
[monitor:///opt/software/syslog-ng/syslog.log]
blacklist = (.log.)
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog

rakesh_498115
Motivator

Rotated log files are named syslog.log.1, syslog.log.2, and so on.

lukejadamec
Super Champion

This is not a script, it is a stanza. And you have it set up to monitor your rotated files. A * is implied after the .log
Try adding the line:
whitelist=(.log$)

lukejadamec
Super Champion

Sure. Can you tell me how you are naming the rotated files?

rakesh_498115
Motivator

Hi lukejadamec,

I am also monitoring only the current log. The monitor script I have used is:

[monitor:///opt/software/syslog-ng/syslog.log]
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog

Can you please tell me, based on the configurations I mentioned above, where the problem could be?

MuS
SplunkTrust

Hi rakesh_498115

Probably some duplicated input/monitor config. Use the btool command-line tool to check for any duplicates:

./splunk cmd btool inputs list

see docs for more information http://docs.splunk.com/Documentation/Splunk/5.0.3/Troubleshooting/CommandlinetoolsforusewithSupport#...
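
Scanning a long btool listing by eye is error-prone, so the overlap check can be scripted. The following is a rough Python sketch that reads stanza headers from btool-style text and lists every monitor stanza that names the target file or one of its parent directories. The sample text, including the directory-level stanza, is invented for illustration, `stanzas_covering` is a hypothetical helper, and wildcards in monitor paths are not handled.

```python
import posixpath

# Hypothetical btool-style output. Real output comes from
# `./splunk cmd btool inputs list` and contains many more keys.
SAMPLE = """\
[monitor:///opt/splunk/var/syslog-ng/syslog.log]
index = xmlgapps
sourcetype = xmlg_syslog
[monitor:///opt/splunk/var/syslog-ng]
index = main
[udp://514]
sourcetype = syslog
"""

def monitor_paths(text):
    """Extract the path from every [monitor://...] stanza header."""
    prefix = "[monitor://"
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix) and line.endswith("]"):
            paths.append(line[len(prefix):-1])
    return paths

def stanzas_covering(text, target):
    """Return monitor paths equal to target or naming one of its parent directories."""
    target = posixpath.normpath(target)
    hits = []
    for path in monitor_paths(text):
        norm = posixpath.normpath(path)
        if target == norm or target.startswith(norm.rstrip("/") + "/"):
            hits.append(path)
    return hits

covering = stanzas_covering(SAMPLE, "/opt/splunk/var/syslog-ng/syslog.log")
print(covering)
```

More than one entry in `covering` would mean the file is reachable through several monitor stanzas, which is the kind of duplicated input config worth ruling out first.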

cheers, MuS
