Getting Data In

Issues due to network slowness

Influencer

Hi,

We have deployed the Job Scheduler, Indexer, Search Head and Forwarder on virtual machines. We often see issues like:
1. The indexer is down; unable to distribute to a peer.
2. Crash logs on the indexer.
3. Splunk stops running on the Job Scheduler node.
4. Many Splunk helper processes running (the PID count rises drastically to 15, then fluctuates between 5 and 10).

We did not have these issues earlier. Recently the network has been slow and is sometimes unresponsive for a couple of minutes (it takes more than a minute to establish a connection to a VM).

What kind of issues can arise in a Splunk system due to a slow network? Could the issues I have mentioned be caused by the network being slow?

Thanks
Strive.


New Member

There is a Splunk App for Boundary that may be useful:
http://splunk-base.splunk.com/apps/55950/splunk-app-for-boundary
www.boundary.com
Boundary monitors all of the network flows between all of the VMs; it's useful for identifying whether a hotspot is in the app, the VM, or the network.


Champion

OK, there are a few pointers to go through here:

Every time you run a search, Splunk spawns a new splunkd process, and each process can consume a CPU core. From what you've said, you need to be sure you can support 15+ concurrent search processes. The PID growth is most likely caused by users running searches combined with whatever scheduled searches you have.
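To see whether the PID fluctuation lines up with search activity, you can count running splunkd processes over time. This is a generic sketch using standard Unix tools, not a Splunk-provided command; the process name pattern may need adjusting for your install:

```shell
# Count processes whose command name starts with "splunkd".
# (grep -c exits non-zero on zero matches, so swallow that with || true.)
count_splunkd() {
    ps -e -o comm= | grep -c '^splunkd' || true
}

# Sample the count a few times; run this while users are searching and
# compare against your scheduled-search windows.
for i in 1 2 3; do
    echo "$(date '+%H:%M:%S')  splunkd processes: $(count_splunkd)"
    sleep 2
done
```

If the count spikes in step with scheduled searches, staggering their cron schedules is usually the first fix to try.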

The old adage that running Splunk on VMs is bad is a little out of date nowadays. Deployed correctly, you won't experience too many issues; the key problem is ensuring that the indexer can get about 800 (really I'd aim for 1200) IOPS on whatever storage it has, and giving it native read/write access to a disk can help with this. VMs are great for Splunk because they make deploying search heads quite simple: just ensure the VM has some I/O headroom and then load up the CPUs.
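A quick way to sanity-check the storage before reaching for a proper benchmark is a small synchronous write with dd. This is only a rough sketch (dd measures sequential throughput, not the random IOPS figure quoted above; the path and sizes here are examples), and a tool like fio is the right choice for a real IOPS number:

```shell
# Write 40 MB in 4 KB blocks, forcing an fsync at the end so the page
# cache doesn't flatter the result. Point the output file at the same
# volume your Splunk indexes live on (path below is just an example).
dd if=/dev/zero of=/tmp/splunk_io_test bs=4k count=10000 conv=fsync
rm -f /tmp/splunk_io_test
```

If the reported throughput is far below what the underlying disk should deliver, contention from other VMs on the same datastore is a likely suspect.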

Finally, are these all on the same host? How many network ports does the box have? If it's a single port and you've got data flowing over it into the indexer, TCP ACKs coming back out, users connecting to run searches, etc., it's quite possible you're running it near capacity. Have you checked any of the box's performance metrics?
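One way to check whether the shared port is near capacity on a Linux VM is to sample the per-interface byte counters in /proc/net/dev twice and compute the delta. A minimal sketch, assuming a Linux guest and an interface named eth0 (substitute your VM's actual NIC name):

```shell
# Print rx_bytes and tx_bytes for one interface from /proc/net/dev.
IFACE=eth0
sample() { awk -v ifc="$IFACE:" '$1 == ifc {print $2, $10}' /proc/net/dev; }

before=$(sample)   # "rx_bytes tx_bytes"
sleep 5
after=$(sample)

echo "rx/tx bytes before: $before"
echo "rx/tx bytes after:  $after"
# Divide each delta by 5 seconds for bytes/sec; compare against the
# port's line rate (e.g. ~125 MB/s for gigabit) to see how close you are.
```

Tools like sar -n DEV or iftop give the same picture with less arithmetic, if they are installed.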


Influencer

Yes, they are all on the same ESX server. The ESX box has two ports. When I checked the box's performance metrics, they sit around 20% most of the time. Sometimes there is a steep rise to 98.77% memory usage, and that is when I see crash logs on the indexer.
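Since the crashes coincide with the memory spikes, correlating the crash-log timestamps with those spikes is worth doing. Splunk writes crash logs under $SPLUNK_HOME/var/log/splunk; a quick sketch to list the most recent ones (/opt/splunk below is an assumed install path, not necessarily yours):

```shell
# List the five most recent splunkd crash logs by modification time.
SPLUNK_HOME=${SPLUNK_HOME:-/opt/splunk}
ls -lt "$SPLUNK_HOME/var/log/splunk"/crash-*.log 2>/dev/null | head -n 5
```

If the crash times line up with the 98.77% memory readings, the indexer is probably being starved of memory by the other VMs on the host, and a reservation for the indexer VM in ESX would be the next thing to try.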


SplunkTrust

In general, running Splunk in a virtualized environment is not a good idea.
