Getting Data In

How can we configure what goes to _internal?

danielbb
Motivator

We hit the 0.5 TB limit for _internal in our lower environment, and that holds barely 10 days of data. Unfortunately, we don't have enough disk space to keep the index for 60 days or so.

We wonder if there is a way to retain, let's say, 60 days from the Splunk servers but only 10 or so from the forwarders. Is it possible? Are there other ways to configure what goes to _internal?

gcusello
Legend

Hi @danielbb,

The only way to do what you're asking is to use two different indexes:

  • one for _internal logs from Forwarders,
  • one for _internal from Splunk servers.

This way you can set a different retention period for each index.

To do this, copy the relevant stanza of inputs.conf from $SPLUNK_HOME/etc/system/default to $SPLUNK_HOME/etc/system/local on the Forwarders and change the index name there (don't edit the file in default).
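
A minimal sketch of the two-index setup described above, not a tested configuration. The index name fwd_internal is hypothetical (custom index names cannot begin with an underscore), and the retention values are simply the 10 and 60 days from the question. The monitor stanza is an assumption: default stanzas differ between universal forwarders and full instances, so check what your forwarders actually ship with, for example with `splunk btool inputs list --debug`.

```ini
# --- On the forwarders: $SPLUNK_HOME/etc/system/local/inputs.conf ---
# Assumed stanza name; it must match the stanza found in the forwarder's
# default inputs.conf, otherwise the override has no effect.
[monitor://$SPLUNK_HOME/var/log/splunk]
index = fwd_internal

# --- On the indexers: indexes.conf in a local directory ---
# Hypothetical index that receives the forwarders' internal logs.
[fwd_internal]
homePath   = $SPLUNK_DB/fwd_internal/db
coldPath   = $SPLUNK_DB/fwd_internal/colddb
thawedPath = $SPLUNK_DB/fwd_internal/thaweddb
# 10 days * 86400 s
frozenTimePeriodInSecs = 864000

# Internal logs of the Splunk servers themselves stay in _internal.
[_internal]
# 60 days * 86400 s
frozenTimePeriodInSecs = 5184000
```

Buckets are also frozen when an index exceeds maxTotalDataSizeMB, so that setting has to be large enough for the time-based retention to take effect. Searches and dashboards that look only at index=_internal would no longer see the forwarders' logs once they are routed to the new index.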

Ciao.

Giuseppe

danielbb
Motivator

In ten days, the Splunk servers generated about 312 GB of raw _internal data. I don't know whether this volume makes sense or not, as it's a small environment of three indexers. The major contributor is /opt/apps/splunk/var/log/splunk/metrics.log.

gcusello
Legend

Hi @danielbb,

The retention you need for _internal logs depends on how quickly you react when problems occur.

In other words, if you and your team usually intervene within one or two days when there's a problem, you could also reduce the retention time for the Forwarders' logs.

Otherwise, I suggest enlarging the storage of your indexers. You don't need high-performance storage for cold _internal buckets; older, slower storage is fine, since it's used only when there's a problem.
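
As a rough illustration of putting cold _internal buckets on slower storage, indexes.conf lets an index's coldPath point at a separate volume. The volume name slow_cold, the mount point /mnt/slow_storage/splunk, and the size cap are all placeholders invented for this sketch; only the setting names come from indexes.conf.

```ini
# indexes.conf on the indexers (local directory)

# Hypothetical volume on the slower, cheaper disks.
[volume:slow_cold]
path = /mnt/slow_storage/splunk
# Placeholder cap for everything stored on this volume, in MB.
maxVolumeDataSizeMB = 2000000

[_internal]
# Hot and warm buckets stay on the existing fast storage (homePath unchanged).
# Cold buckets move to the slow volume.
coldPath = volume:slow_cold/_internaldb/colddb
```

Existing cold buckets are not moved automatically when coldPath changes, so they would need to be relocated while Splunk is stopped.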

Ciao.

Giuseppe

danielbb
Motivator

@gcusello - is there a way to limit the metrics flow within _internal, for example by log level (INFO, WARNING, etc.)?

gcusello
Legend

Hi @danielbb,

You could filter events (for more info see https://docs.splunk.com/Documentation/Splunk/8.0.6/Forwarding/Routeandfilterdatad#Filter_event_data_... ), but in my opinion it isn't a good idea, because you might need those logs to debug problems.
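
For reference, event filtering of this kind is usually done by routing unwanted events to the nullQueue with a props.conf / transforms.conf pair. The sketch below is only an example: the transform name and the choice to drop group=per_source_thruput lines are assumptions, and you would need to look at your own metrics.log to decide which groups you can live without. Filtering by severity is unlikely to help much here if, as is typical, metrics.log lines are all logged at INFO.

```ini
# props.conf
# Applies to events whose source path ends in var/log/splunk/metrics.log.
[source::.../var/log/splunk/metrics.log]
TRANSFORMS-drop_metrics = drop_per_source_thruput

# transforms.conf
# Hypothetical transform: discard events matching the regex before indexing.
[drop_per_source_thruput]
REGEX = group=per_source_thruput
DEST_KEY = queue
FORMAT = nullQueue
```

These settings take effect where the data is parsed (an indexer or heavy forwarder), not on a universal forwarder.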

As I said, the best approach is either to add storage, even if it isn't fast, or to reduce the retention time to a few days, as long as it stays longer than your reaction time.

Ciao.

Giuseppe
