
Virtualized Splunk causing NetApp latency

rmf185039
New Member

We just purchased Splunk and decided to roll it out in a virtual environment. As we rolled out forwarders to our servers, once we hit about 300 servers we saw latency on our NetApp SANs spike every 40-45 minutes. Since the Splunk rollout was one of the recent changes, we turned off the virtual NIC as part of our troubleshooting, and the spikes stopped. Does anyone have any idea what Splunk process would cause this? I originally thought it was new devices checking in, but the spikes continued for a week after the devices were onboarded.

Thanks,
Ryan

1 Solution

esix_splunk
Splunk Employee

There are a lot of considerations when using virtual instances and shared storage. Can you elaborate on what exactly you mean? For example, what does your topology look like architecturally?

1) Are your indexers all using the shared storage, and are the virtual NICs on them causing the issues?
2) Are your forwarders all running off the NetApp?
3) Linux or Windows?

Thanks

jtacy
Builder

Interesting issue. I'm assuming that by turning off the "virtual nic" you stopped the forwarders from talking to the Splunk indexer(s) and the problem went away. The Splunk servers are going to add some load to the storage and each forwarder will hit the disk just a little bit (reading log files and writing its own logging and state info), but an interval of 40-45 minutes doesn't really stand out as a Splunk activity. Questions that come to mind:

  1. Is the Splunk infrastructure using the same storage as the 300 servers with forwarders installed?
  2. Do the latency spikes appear to correspond to events on your systems that might result in a storm of indexing activity?
  3. What OS do the forwarders run and what are they configured to index?
  4. Did the Splunk server work OK in terms of running searches while this was going on?

If possible, I would probably try to get a fraction of the forwarders sending data again (maybe the ones that you'd expect to be the busiest) and then watch the disk activity (as reported by your VM infrastructure) for the forwarding and Splunk infrastructure hosts. The interesting part is the 40-45 minute periodic activity and my goal would be to find out where that's coming from; I'm guessing it will appear to at least a small extent with fewer forwarders enabled.
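A minimal sketch of how that periodicity check could be done, assuming you can export latency samples as (epoch seconds, latency in ms) pairs from whatever monitoring your VM or storage layer provides. The function name `spike_intervals`, the `factor` threshold, and the `min_gap_s` merge window are all hypothetical choices for illustration, not part of any Splunk or NetApp tooling.

```python
from statistics import median


def spike_intervals(samples, factor=3.0, min_gap_s=300):
    """Return the gaps, in minutes, between successive latency spikes.

    samples: list of (epoch_seconds, latency_ms) tuples, sorted by time.
    A sample counts as a spike when its latency exceeds factor * median.
    Spike samples within min_gap_s seconds of the previous spike sample
    are treated as part of the same event.
    """
    if not samples:
        return []

    baseline = median(lat for _, lat in samples)
    spike_starts = []
    last_spike_ts = None

    for ts, lat in samples:
        if lat > factor * baseline:
            # Start a new event only if the previous spike sample is old enough.
            if last_spike_ts is None or ts - last_spike_ts > min_gap_s:
                spike_starts.append(ts)
            last_spike_ts = ts

    # Differences between consecutive event start times, converted to minutes.
    return [(b - a) / 60 for a, b in zip(spike_starts, spike_starts[1:])]
```

Running this once per host (or per datastore) with a subset of forwarders enabled, and again with them disabled, would show whether the roughly 40-45 minute gaps appear only when the forwarders are sending data.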


rmf185039
New Member

Thanks esix! We did do that. My NetApp team told me afterwards that it had resolved the issue, but it actually hadn't. After I left, the problem persisted even with Splunk effectively shut off, and it turned out to be another system, luckily. When they first told me it was Splunk it made absolutely no sense, but it was one of the only global system changes that had occurred. We figured it out in the end. Thanks for responding though!

-Ryan
