Splunk Search

big lookup file replication too slow

mataharry
Communicator

I have a search head and several search peers, and I sometimes see this warning in splunkd.log:

DistributedBundleReplicationManager - bundle replication to 2 peer(s) took too long (12943ms), bundle file size=296150KB

My lookup file is > 100 MB, and it seems that it is not replicated properly to all the search peers.

What can I do?
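(For reference: on the search head, the replicated knowledge bundles are kept as tarballs under $SPLUNK_HOME/var/run in a default installation, so a quick listing shows their size:)

    # On the search head: show the size of recent knowledge bundles
    ls -lh $SPLUNK_HOME/var/run/*.bundle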

1 Solution

yannK
Splunk Employee

Your lookup is definitely too big; if possible, reduce its size.

Or, if your lookup file does not change too often, here is a workaround: use local copies of the lookup instead of distributing it.

  1. Prevent the lookup from being replicated/distributed: add it to the replicationBlacklist stanza in distsearch.conf (see the sketch after this list).

  2. Copy the lookup file to the same folder on each of your servers (search peers and the search head), $SPLUNK_HOME/etc/apps//lookup/ (a simple rsync script can handle this).

  3. Change the way the lookups are called so that only the local version of the file is used: add the option local=true to the lookup command in your search (also sketched below). See http://www.splunk.com/base/Documentation/4.2/SearchReference/Lookup
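A minimal sketch of the three steps. The stanza name, the app name myapp, the file huge_lookup.csv, the peer host peer1, and the fields host/site are all hypothetical placeholders, and huge_lookup is assumed to be defined as a lookup table in transforms.conf:

    # Step 1 -- distsearch.conf on the search head:
    # keep the big file out of the replication bundle
    [replicationBlacklist]
    nodistribute = *huge_lookup.csv

    # Step 2 -- push a fresh copy to each search peer whenever the file changes
    rsync -av $SPLUNK_HOME/etc/apps/myapp/lookups/huge_lookup.csv \
        peer1:$SPLUNK_HOME/etc/apps/myapp/lookups/

    # Step 3 -- make the search read each node's local copy
    ... | lookup local=true huge_lookup host OUTPUT site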



yannK
Splunk Employee

Here is the workaround used in 4.1.4 to share the lookup table via network-mounted folders:

http://splunk-base.splunk.com/answers/3436/how-could-i-optimize-distributed-replication-of-large-loo...

jrodman
Splunk Employee

In addition to what yann said: in 4.2, bundle replication should not block search (we transfer asynchronously in another context), so the problems with large bundles are (slightly) mitigated. However, 12 seconds is a fairly long time to delay bundle replication. It works out to a transfer rate of roughly 23MB/s (296150KB in 12.9s), so I'm not sure whether you are on a slightly busy GigE setup or whether we have a bit of overhead in the protocol.

If your bundles are updated frequently, one fairly heavyweight solution is to disable bundle replication and instead use NFS to make the bundles available to all nodes (see the sketch below). At that point, however, this places a strong reliance on NFS health and performance.
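A minimal sketch of that mounted-bundles setup, assuming the search head's bundle directory is exported over NFS and mounted at /mnt/searchhead-bundles on every peer (the mount point and the search head server name are placeholders):

    # distsearch.conf on the search head: stop pushing bundles to the peers
    [distributedSearch]
    shareBundles = false

    # distsearch.conf on each search peer: read the mounted bundles instead
    [searchhead:<searchhead_server_name>]
    mounted_bundles = true
    bundles_location = /mnt/searchhead-bundles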
