Installation

License usage shows higher volume than the file being monitored

bwouters
Path Finder

Hi community

I'm still a bit confused how Splunk is calculating the volume usage.
So far, I'm using the Free license (up to 500MB) and I tried to optimize my logging file as good as possible.
Today, my log file is 800KB big but when checking the License settings using Splunk Web, I see (Volume used today:) 11 MB?

How is this possible? Where is this extra data coming from?

Tags (1)
0 Karma

Sukisen1981
Champion

Hi - See the attachment I have uploaded
I started with the text at 6:31 (source is a simple .txt file continuously monitored)
@ 6:31
Hello hello
hi
hi again
hi again 2
Now, at 6:33 , I just added anextra hello to line 1 in the .txt file
but see the event being indexed at 6:33 . Splunk indexed ALL the lines again, even though I did not change anything in lines 2,3&4 in the .txt file......

Hello hello hello
hi
hi again
hi again 2

Now again at 6:34 I added a new line to the file hello again 3, this time Splunk indexed ONLY the new line and NOT ALL the lines in the text file.
Now again, at 6:34:31 I added another hello to line 1 , Splunk indexed ALL the lines again from the text file.

See the behavior?

So - Without knowing your file structure , it is difficult to say why the continuous monitoring is affecting your license, but trust me it does. I have seen this behavior before.
I advise can you look at the way your file (which is being continuously monitored) is structured in terms of the data lines ? If possible try to have a sample uploaded under continuously monitoring. And as @FrankVl mentioned, can you share your inputs.conf settings
I am hopeful though that you will be able to figure out a solution from the example I have given and by tinkering with your data file a bit.alt text

0 Karma

bwouters
Path Finder
$ cat ./etc/system/local/inputs.conf
[default]
host = SPLUNK

[monitor://$SPLUNK_HOME/etc/splunk.version]
disabled = 0

[monitor://$SPLUNK_HOME/var/log/splunk]
disabled = 0

[monitor://$SPLUNK_HOME/var/log/splunk/license_usage_summary.log]
disabled = 0

[batch://$SPLUNK_HOME/var/spool/splunk/...stash_new]
disabled = 0

[batch://$SPLUNK_HOME/var/spool/splunk]
disabled = 0

I created the monitor through the Splunk Web

0 Karma

Sukisen1981
Champion

Hi - i am curious to see what your indexed event shows , something like the snapshot I gave . How does your indexed event shows? Are there duplicates, if so from which source?

0 Karma

bwouters
Path Finder

Snippet of 2 events

This is how it looks like, 2 following events.
I removed certain sensitive information but it shouldn't matter to the behavior I see

I monitor this specific file and split the lines (=events) with TRACE

0 Karma

Sukisen1981
Champion

Hi,

Splunk license manager is rarely wrong. What probably has happened - you are doing a continuous (or consuming/monitoring your file at fixed time intervals), when this happens for example say, you use continuous monitoring on your log file at an interval of 1 hour, after 3 hours you would have indexed 800*3=2400MB of data.
The other cause could be some user is running a REST api intergration with some api, when this happens and similar to the above example, the toal indexed data volume (and hence the license) will be the number of times the api is polled * data volume retrieved from the api each time.
Please check carefully.

0 Karma

bwouters
Path Finder

Hello,

Indeed, the file is being continuously monitored but how can I tell Splunk not to process the whole file again and again? Because now the file is small but after some time, the file will grow until 100-200MB and then I'm hitting that limit again.

For instance, by now the file has grown to 2.3MB, so if I follow your example, after a few hours the total used volume will be ~7MB even though the file might have only increased to 2.4MB..

0 Karma

FrankVl
Ultra Champion

You should be able to see in your index whether the file was actually ingested multiple times. Normally this shouldn't happen, but it all depends on how things were configured. Just do some investigation like counting the events in the index(es) against the number of events in the original log file. Or have a look at the license_usage.log in your indexers to see what the license usage is spent on (you can also search that log through the _internal index).

My guess would be that there may be a little overhead in the data that counts against your license and at such tiny numbers that becomes visible. But that really isn't much more than a guess.

0 Karma

bwouters
Path Finder

A breakdown of the indexes and current sizes

_audit: 26 MB
_internal: 106 MB
_introspection: 145 MB
_telemetry: 1 MB
_thefishbucket: 1 MB
history: 1 MB
main: 146 MB
splunklogger: 0 B
summary: 1 MB

while the only monitored file is 2.6 Mb..
I noticed that Splunk is logging certain data up to 14 (!) times over and over again, which also affects the counts on my dashboard. However, I'm not really sure why he does this.

0 Karma

FrankVl
Ultra Champion

If Splunk is indeed ingesting the data from the monitored file over and over again, that's worth investigating.

Can you provide some details on how logs are being written to that file and what the splunk inputs.conf looks like?

0 Karma

bwouters
Path Finder
 $ cat ./etc/system/local/inputs.conf
 [default]
 host = SPLUNK

 [monitor://$SPLUNK_HOME/etc/splunk.version]
 disabled = 0

 [monitor://$SPLUNK_HOME/var/log/splunk]
 disabled = 0

 [monitor://$SPLUNK_HOME/var/log/splunk/license_usage_summary.log]
 disabled = 0

 [batch://$SPLUNK_HOME/var/spool/splunk/...stash_new]
 disabled = 0

 [batch://$SPLUNK_HOME/var/spool/splunk]
 disabled = 0
0 Karma

FrankVl
Ultra Champion

And which files specifically did you check for size and did you see getting duplicated?

This all looks like copies of config that is already in /etc/system/default/inputs.conf anyway, but with some important settings missing (like move_policy = sinkhole for the batch inputs...).

0 Karma

bwouters
Path Finder

This is the configuration of /etc/system/default/inputs.conf

[default]
index         = default
_rcvbuf        = 1572864
host = $decideOnStartup

[blacklist:$SPLUNK_HOME/etc/auth]

[monitor://$SPLUNK_HOME/var/log/splunk]
index = _internal

[monitor://$SPLUNK_HOME/var/log/splunk/license_usage_summary.log]
index = _telemetry

[monitor://$SPLUNK_HOME/etc/splunk.version]
_TCP_ROUTING = *
index = _internal
sourcetype=splunk_version

[batch://$SPLUNK_HOME/var/spool/splunk]
move_policy = sinkhole
crcSalt = <SOURCE>

[batch://$SPLUNK_HOME/var/spool/splunk/...stash_new]
queue       = stashparsing
sourcetype  = stash_new
move_policy = sinkhole
crcSalt     = <SOURCE>


[fschange:$SPLUNK_HOME/etc]
#poll every 10 minutes
pollPeriod = 600
#generate audit events into the audit index, instead of fschange events
signedaudit=true
recurse=true
followLinks=false
hashMaxSize=-1
fullEvent=false
sendEventMaxSize=-1
filesPerDelay = 10
delayInMills = 100

[udp]
connection_host=ip

[tcp]
acceptFrom=*
connection_host=dns

[splunktcp]
route=has_key:_replicationBucketUUID:replicationQueue;has_key:_dstrx:typingQueue;has_key:_linebreaker:indexQueue;absent_key:_linebreaker:parsingQueue
acceptFrom=*
connection_host=ip

[script]
interval = 60.0
start_by_shell = true

And to be honest, the file I'm always referring to is not really mentioned in neither the inputs.conf files. The file I monitor is called 'Splunk.log' and put in Source Type 'SessionLogs' for example

0 Karma

FrankVl
Ultra Champion

If it gets ingested into splunk it must be mentioned in some inputs.conf, so best to find out where. If it is not in system/default or system/local, I guess it is in some specific app under etc/apps?

0 Karma

bwouters
Path Finder

Indeed, it is mentioned in /opt/splunk/etc/apps/search/local

[monitor:///opt/splunk/share/DEMO/logs/Splunk_new.log]
disabled = false
index = main
sourcetype = SessionLogs

You think I should add extra configs to this monitor stanza?

0 Karma

FrankVl
Ultra Champion

That looks fine. How does that file get populated?

0 Karma

bwouters
Path Finder

One of our servers is putting certain data in it depending on the load this server has (amount of session requests). It's hard to predict how often this file gets updated.

Know that I configured this location as a share from our server towards the Splunk Enterprise server.

0 Karma

Sukisen1981
Champion

and when you query this data file source, you get duplicate events?
That is events (lines) already indexed once are being reflected again and again?

0 Karma

bwouters
Path Finder

Yes, but the duplicate events might not be the problem anymore because I also see them in the log files, so it's ok that there are duplicates.
However, I'm still amazed that a small file can use up so much daily volume 😐

0 Karma

Sukisen1981
Champion

Just to mention a couple of things here -
data indexed PER DAY contributes towards license, hence it does not matter if there are duplicates, however if there are duplicates (for eg. file being indexed 4 times a day), then your data consumed towards license will be the sum of the data volume being ingested for each of the 4 times.
Indexes like _audit , _internal DO NOT contribute towards daily licensing UNLESS you are doing a continuous monitoring on those files and ingesting them again and again in a 24 hour time period.

0 Karma

bwouters
Path Finder

Yes, I understand if there are duplicates that this also counts towards the daily volume usage.
I just checked and the file size is now -> 1.1MB while my daily usage is over 10MB.

How can I configure Splunk monitoring in such a way that he only needs to care about the newly added date from this file? Or how can I configure Splunk monitoring to only check this file every 20 minutes and index the data that has been added?

0 Karma
Get Updates on the Splunk Community!

Introducing Ingest Actions: Filter, Mask, Route, Repeat

WATCH NOW Ingest Actions (IA) is the best new way to easily filter, mask and route your data in Splunk® ...

Splunk Forwarders and Forced Time Based Load Balancing

Splunk customers use universal forwarders to collect and send data to Splunk. A universal forwarder can send ...

NEW! Log Views in Splunk Observability Dashboards Gives Context From a Single Page

Today, Splunk Observability releases log views, a new feature for users to add their logs data from Splunk Log ...