Hi all,
My source.log file is approximately 40 GB (syslog format), but Splunk consumes 60-70 GB of license when reading that log.
Can you please help me determine the relationship between data volume and license usage?
Thanks in advance,
Neel
Please check out lguinn's answer here, where she recommends checking the DMC (or installing and using the SOS app on older versions) and then gives a search that may help you discover the cause of the discrepancy.
IMO, woodcock is likely correct (and his comments are completely correct regarding how that situation happens, if that is indeed what is happening.) There are a few other explanations possible, though.
40GB turning into 60GB is a big jump: if it had ended up only somewhat bigger (please use a very loose definition of "somewhat") then it could be tokenization and other things. Some inputs expand somewhat upon ingestion because of what needs to happen to them. I've seen inputs double their size, but IIRC that's not usually what syslog stuff does. Doesn't mean it can't, though.
One other non-impossible reason is that it isn't actually that much bigger. Instead, you are just seeing other inputs that you weren't aware of, had forgotten about or didn't realize were so large. Local OS information perhaps - some of those can be a bit chatty in certain circumstances. The DMC can help a lot to identify this if you dig around in the indexing section, or it may be as simple as clicking into the search app then clicking the "Data Summary" button and looking around in there.
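If you'd rather check the numbers outside the UI, here is a rough Python sketch of the same idea: add up licensed bytes per source and see whether anything other than source.log is contributing. It assumes the usage entries are key=value lines with type=Usage, a source field s and a byte count b, which is how license_usage.log entries typically look; the sample lines, paths and values below are made up for illustration, so adjust the parsing to what your own log contains.

```python
import re
from collections import defaultdict

# Hypothetical sample lines, shaped like "type=Usage" entries; not real data.
SAMPLE = """\
10-30-2015 12:00:00.000 INFO LicenseUsage - type=Usage s="D:/logs/source.log" st="syslog" h="hostA" o="" idx="main" pool="p" b=1000 poolsz=5000
10-30-2015 12:01:00.000 INFO LicenseUsage - type=Usage s="D:/logs/source.log" st="syslog" h="hostA" o="" idx="main" pool="p" b=500 poolsz=5000
10-30-2015 12:02:00.000 INFO LicenseUsage - type=Usage s="WinEventLog:Security" st="WinEventLog" h="hostA" o="" idx="main" pool="p" b=700 poolsz=5000
10-30-2015 12:03:00.000 INFO LicenseUsage - type=RolloverSummary pool="p" b=2200
"""

# Matches key="quoted value" or key=bare_value pairs.
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')

def bytes_by_source(lines):
    """Sum the b (bytes) field per s (source) over type=Usage lines."""
    totals = defaultdict(int)
    for line in lines:
        fields = {key: quoted or bare for key, quoted, bare in PAIR.findall(line)}
        # Skip summaries and anything without a byte count.
        if fields.get("type") != "Usage" or "b" not in fields:
            continue
        totals[fields.get("s", "(unknown)")] += int(fields["b"])
    return dict(totals)

if __name__ == "__main__":
    for source, total in sorted(bytes_by_source(SAMPLE.splitlines()).items()):
        print(source, total)
```

If a source you had forgotten about shows up with a large total, that is your extra license usage.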
Hope this helps!
It has probably rotated (changed names) at least once.
If the name had changed, it would be logged under a different source. The source I see here is the same.
Not at all. You are looking at the file now and ASSUMING that it contains everything that it always has. It is common for systems to rotate the file by renaming (or deleting) it and then reusing the original name (with the contents now gone). You are presuming too much.
But in that case Splunk would capture data from both D:/location/location/location/original_file.log and D:/location/location/location/temporary_file.log, right?
No. Splunk will notice that the renamed file is identical to a file it has already forwarded and will ignore it. Then, when new data begins to arrive under the original name, it will forward that data. So all events will appear to come from one file (because they did).
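To make the arithmetic concrete, here is a toy Python simulation. It is not Splunk's actual monitoring logic, just a sketch of the accounting: everything written under the original name gets counted once, the renamed copy is never counted again, and the file you look at afterwards only holds the latest generation. The function name, file names and sizes are invented for the example.

```python
import os
import tempfile

def simulate(generation_sizes):
    """Toy model of rotate-by-rename under one file name.

    Each entry in generation_sizes is the number of bytes written to the
    live file before it is rotated (the last generation is not rotated).
    Returns (total_bytes_forwarded, size_of_live_file_now).
    """
    forwarded = 0
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "source.log")
        for gen, size in enumerate(generation_sizes):
            # New data arrives under the original name.
            with open(live, "wb") as f:
                f.write(b"x" * size)
            # All of it is forwarded (and licensed) exactly once.
            forwarded += os.path.getsize(live)
            # Rotate every generation except the last; the renamed copy
            # is never forwarded a second time.
            if gen < len(generation_sizes) - 1:
                os.rename(live, live + f".{gen}")
        return forwarded, os.path.getsize(live)

if __name__ == "__main__":
    forwarded, current = simulate([20, 40])
    print(f"forwarded={forwarded} current_file={current}")
```

With generations of 20 and 40 units, 60 units are forwarded while the file on disk shows only 40, and every event still carries the same source name.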
that looks unlikely 😞