Splunk Search

Get Log size

New Member

I want to get the log size in MB and GB. I have used this search:
index=index1 | eval raw_len=(len(_raw)/1028) | stats sum(raw_len) by source


New Member

If you do /1024/1024/1024 on every event, the result will go to 0 for small logs and it won't work. Just reuse the previously calculated value; then you save cycles and data.



Without much context as to why, using len(_raw) is an OK approximation of the size of a log. However, you should know that len() does not actually count bytes; it counts characters. If knowing bytes is crucial, I would refer you to the License Usage Report View, or to simply running ls -l or similar utilities on the box where the log comes from.

To see this in action, I made two files, one that contained words and the other كلمات. I then put both in a directory and indexed them (taking good advantage of my dev-test license). Using len(), both come out to 5, but checking the index usage data, I can see that words equals 5 bytes while كلمات is 10 bytes. (In this case, each character, encoded as UTF-8, is 2 bytes wide.)
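The character-versus-byte gap is easy to reproduce outside Splunk. The sketch below is plain Python, not SPL, and it assumes that Splunk's len() counts characters the same way Python's len() does on a string. The function and variable names are made up for illustration.

```python
# Illustration only: Python, not SPL. Assumes Splunk's len() counts
# characters, as described in the post above.
latin = "words"
arabic = "\u0643\u0644\u0645\u0627\u062a"  # the five-letter Arabic word from the post


def char_and_byte_len(text):
    """Return (character count, UTF-8 byte count) for a string."""
    return len(text), len(text.encode("utf-8"))


for sample in (latin, arabic):
    chars, utf8_bytes = char_and_byte_len(sample)
    print(f"{chars} characters, {utf8_bytes} bytes as UTF-8")
```

Both strings report 5 characters, but only the ASCII one is also 5 bytes; the Arabic one is 10 bytes.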

Now, most system-level logs that you'd aggregate in Splunk tend to be US-ASCII, so each character (in UTF-8) happens to be 1 byte, but this is not universally the case.

EDIT: A bit more of a rabbit hole, but I had one file containing كلمات encoded as UTF-8 (10 bytes long) and another encoded as ISO-8859-6 (a 5-byte file on disk). When ingesting the ISO-8859-6 file with a sourcetype that specifies that encoding (so the text is readable in Splunk), the license impact is still 10 bytes, because translation to UTF-8 happens before license usage is counted.
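The encoding effect can also be checked without Splunk. This is a rough Python sketch, not anything Splunk runs: it models ingestion as "decode with the configured charset, then re-encode as UTF-8", which is my reading of the behaviour described above rather than a documented internal. The variable names are hypothetical.

```python
# Illustration only: the same five-letter Arabic word in two encodings.
word = "\u0643\u0644\u0645\u0627\u062a"

on_disk_8859_6 = word.encode("iso8859_6")  # bytes as stored in the ISO-8859-6 file
on_disk_utf8 = word.encode("utf-8")        # bytes as stored in the UTF-8 file

# Assumed model of ingestion: decode using the sourcetype's charset,
# then store and meter the text as UTF-8.
metered = on_disk_8859_6.decode("iso8859_6").encode("utf-8")

print(len(on_disk_8859_6), len(on_disk_utf8), len(metered))
```

The ISO-8859-6 file is 5 bytes on disk, yet the re-encoded text is 10 bytes, matching the UTF-8 file.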

Esteemed Legend

That should be OK; what's the problem? You just need to divide further, and note that the divisor should be 1024, not 1028.


Ultra Champion

Check this out. Here's a search:

index=index1
| eval raw_len=len(_raw)
| eval raw_len_kb=raw_len/1024
| eval raw_len_mb=raw_len/1024/1024
| eval raw_len_gb=raw_len/1024/1024/1024
| stats sum(raw_len) as Bytes sum(raw_len_kb) as KB sum(raw_len_mb) as MB sum(raw_len_gb) as GB by source
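For anyone who wants to sanity-check the arithmetic outside Splunk, here is a small Python sketch of the same conversion. It sums the raw sizes first and converts the total once, with each unit derived from the previous one. The event sizes and the to_units helper are hypothetical, and the "bytes" are really character counts, as with len(_raw).

```python
# Illustration only: hypothetical per-event sizes, as len(_raw) would report them.
event_sizes = [512, 2048, 300_000]


def to_units(total_bytes):
    """Convert a byte total to KB, MB and GB using 1024 as the divisor."""
    kb = total_bytes / 1024
    mb = kb / 1024  # reuse the KB value instead of dividing from scratch
    gb = mb / 1024
    return {"Bytes": total_bytes, "KB": kb, "MB": mb, "GB": gb}


totals = to_units(sum(event_sizes))
for unit, value in totals.items():
    print(unit, value)
```

Dividing by 1028 instead of 1024 would understate every unit slightly, and the error compounds with each further division.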

Hope it helps.


It worked.


New Member

Thanks for your answer. Let me try it.
