Getting Data In

How can we configure what goes to _internal?

danielbb
Motivator

We hit the 0.5 TB limit for _internal in our lower environment and we have barely 10 days of data. Unfortunately, we don’t have enough disk space to have the index for 60 days or so.

We wonder if there is a way to retain, let's say, 60 days of _internal data from the Splunk servers but only 10 or so from the forwarders. Is that possible? Are there other ways to configure what goes to _internal?


gcusello
SplunkTrust
SplunkTrust

Hi @danielbb,

The only way to do what you are asking is to use two different indexes:

  • one for _internal logs from Forwarders,
  • one for _internal from Splunk servers.

That way you can set a different retention period for each index.

To do this, copy the relevant inputs.conf stanza from $SPLUNK_HOME/etc/system/default into local and change the index name.
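As a minimal sketch of that override on the forwarders, assuming a custom index named fwd_internal (a hypothetical name, which must also be defined on the indexers):

```ini
# $SPLUNK_HOME/etc/system/local/inputs.conf on the forwarders
# Overrides the default stanza so forwarder logs land in a
# separate index instead of _internal.
[monitor://$SPLUNK_HOME/var/log/splunk]
index = fwd_internal
```

The Splunk servers keep the default stanza untouched, so their logs continue to go to _internal with its own retention.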

Ciao.

Giuseppe

danielbb
Motivator

In ten days, the Splunk servers generated about 312 GB of _internal raw data. I don't know whether this volume makes sense or not; it's a small environment of three indexers. The major contributor is /opt/apps/splunk/var/log/splunk/metrics.log.


gcusello
SplunkTrust
SplunkTrust

Hi @danielbb,

the retention you need for _internal logs depends on your reaction time when problems occur.

In other words, if you and your team usually intervene within one or two days when there's a problem, you could also reduce the retention time for the forwarder logs.

Otherwise, I suggest enlarging the storage on your indexers, possibly putting cold _internal buckets not on high-performance storage but on older, slower disks that you only touch when troubleshooting.
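Both ideas (shorter retention, cheaper cold storage) can be expressed in indexes.conf on the indexers. A sketch, with hypothetical index name and paths:

```ini
# indexes.conf on the indexers -- names and paths are examples
[fwd_internal]
homePath   = $SPLUNK_DB/fwd_internal/db
# Cold buckets on cheaper, slower storage
coldPath   = /mnt/slow_storage/fwd_internal/colddb
thawedPath = $SPLUNK_DB/fwd_internal/thaweddb
# Roll buckets to frozen (deleted by default) after 10 days:
# 10 days * 86400 seconds = 864000
frozenTimePeriodInSecs = 864000
```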

Ciao.

Giuseppe

danielbb
Motivator

@gcusello - is there a way to limit the metrics flow within _internal, e.g. by log level (INFO, WARNING, etc.)?


gcusello
SplunkTrust
SplunkTrust

Hi @danielbb,

you could filter events (for more info see https://docs.splunk.com/Documentation/Splunk/8.0.6/Forwarding/Routeandfilterdatad#Filter_event_data_... ), but in my opinion it isn't a good idea, because you may need those logs to debug problems.
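If you do go that route, the linked docs describe routing unwanted events to nullQueue. A sketch that drops INFO-level metrics.log events before indexing (the transform name is hypothetical):

```ini
# props.conf on the indexers (or heavy forwarders)
[source::.../var/log/splunk/metrics.log]
TRANSFORMS-drop_info = drop_metrics_info

# transforms.conf
[drop_metrics_info]
# Discard events whose log level is INFO
REGEX = \sINFO\s
DEST_KEY = queue
FORMAT = nullQueue
```

Note that discarded events are gone for good, which is exactly why I'd hesitate to filter _internal.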

As I said, the best approach is to add storage, even slow storage, or to reduce the retention time to a few days, as long as it stays greater than your reaction time.

Ciao.

Giuseppe
