Getting Data In

single indexer operation

surekhasplunk
Communicator

Hi,

It may be a very simple question, but I want to know how indexing actually works when the indexer is down for a few hours or a day. What happens to the data? Will I lose it, or will it get indexed once the indexer is back up and running?

I have 1 indexer, 1 search head, 1 heavy forwarder, and 1 deployment/licensing server.

Also, how do I maintain data resiliency in this scenario?

Thanks

1 Solution

gcusello
SplunkTrust

Hi @surekhasplunk,
when the indexer in your architecture is down, you cannot search data, and indexing stops until the indexer restarts.

If data arrives from Universal or Heavy Forwarders, you won't lose any data, because they cache it locally.
The only data you could lose (if you have them) are syslog and HEC events that arrive directly on the indexer.

In fact, it's a best practice to ingest syslog through two Heavy Forwarders behind a load balancer, so you don't lose logs even if an indexer or one of the Heavy Forwarders fails.

The first thing to analyze is whether you really need HA features.
If you have syslog or HEC inputs and only need to avoid losing logs, you could consider duplicating just the Heavy Forwarder; otherwise you don't need anything more.

As for the expected outage duration (MTBF), you have to define a target value and, if necessary, enlarge the queues on the Universal and Heavy Forwarders; the only limit is the free disk space on the server.
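As a rough sketch of what "enlarging queues" can look like on a forwarder, the in-memory output queue is set in outputs.conf. The group name, server address, and size below are assumptions for illustration, not values from this thread:

```
# Hypothetical outputs.conf on a Universal/Heavy Forwarder:
# a larger maxQueueSize lets the forwarder buffer more data
# in memory while the indexer is unreachable. Sizes are illustrative.
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer1.example.com:9997
maxQueueSize = 512MB
```

Keep in mind the in-memory queue is lost if the forwarder itself restarts; for durability across restarts you need persistent (on-disk) queues on the inputs.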

Ciao.
Giuseppe


Arpit_S
Path Finder

@surekhasplunk , when the indexer in a single-indexer deployment is down, indexing stops. So, if you are using the Splunk Universal Forwarder, you can configure a persistent queue to buffer the data on disk on the UF until the indexer comes back online. Once the indexer is back, the buffered data will be ingested.

NOTE: if the queue fills up to its maximum size, new data will start being dropped.
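A minimal sketch of a persistent queue in inputs.conf, assuming a TCP input on port 9001 (the port and sizes are illustrative). Note that persistent queues apply to network inputs such as [tcp://], [udp://], and [splunktcp://], not to file monitor inputs:

```
# Hypothetical inputs.conf stanza: when the in-memory queue (queueSize)
# fills, events spill to an on-disk persistent queue up to
# persistentQueueSize, surviving forwarder restarts.
[tcp://:9001]
queueSize = 1MB
persistentQueueSize = 5GB
```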



gfreitas
Builder

If you're talking about new data that should have been indexed, it depends on how you're getting that data. For example, if you have UFs sending data to your indexers, they have a default in-memory input queue of 500KB. If you don't change this behaviour, the oldest data is dropped once this queue fills. You will need to configure persistent queues to avoid that. See more information here: https://docs.splunk.com/Documentation/SplunkCloud/8.0.0/Data/Usepersistentqueues.

If you receive data via syslog and Splunk itself is listening for it, I'm sorry, but that data is gone if you're using UDP syslog. For syslog, Splunk recommends having a syslog server/daemon listening for syslog messages and writing them to files, then using a UF or HF to read those files.
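The file-based syslog pattern above can be sketched with a monitor input; the path and index names here are assumptions for illustration, matching wherever your syslog daemon (e.g. rsyslog or syslog-ng) writes its files:

```
# Hypothetical inputs.conf stanza on the UF/HF: read the files the
# local syslog daemon writes, instead of Splunk listening on UDP 514.
# /var/log/remote-syslog and the index name are illustrative.
[monitor:///var/log/remote-syslog/*.log]
sourcetype = syslog
index = network
disabled = false
```

This way, a Splunk outage only delays reading; the syslog daemon keeps writing to disk, so nothing is lost while the indexer is down.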

Hope this helps clear things up.
