Splunk Search

How to get the bytes of an indexed event

dstuder
Communicator

I'm trying to get the size in bytes of indexed events so I can find out, by event code, how much index volume our Windows Security event log events are taking up. Below is what I have, but I'm not sure it really gives me bytes. Sure, it gives me the relative sizes, but I'm specifically looking for bytes. My hunch is that what I have below isn't totally correct, because len() counts characters and the data could be in ASCII (one byte per character), UTF-8 (one to four bytes per character), UTF-16 (two or four bytes per character), etc. Does Splunk store the actual byte count anywhere, and if not, is there a way to get it to? Thoughts?

 

 

index="wineventlog" source="WinEventLog:Security"
| eval bytes = len(_raw)
| stats sum(bytes) by EventCode
| sort sum(bytes) desc

 

1 Solution

tscroggins
Influencer

@dstuder 

You are correct. The size on disk may not equal the number of characters in _raw.
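As an illustration of why a character count and a byte count can differ, here is a minimal Python sketch. It is not Splunk code, and the helper name `size_report` and the sample strings are made up for the example:

```python
# Illustration only: character count vs. encoded byte count.
# The sample strings are hypothetical, not real event data.

def size_report(text: str) -> dict:
    """Return the character count and the byte counts under two encodings."""
    return {
        "chars": len(text),                            # what a character-based length sees
        "utf8_bytes": len(text.encode("utf-8")),       # 1 to 4 bytes per character
        "utf16_bytes": len(text.encode("utf-16-le")),  # 2 or 4 bytes per character, no BOM
    }

ascii_event = "EventCode=4624 Logon"
mixed_event = "User=José EventCode=4624"

for sample in (ascii_event, mixed_event):
    print(sample, size_report(sample))
```

For pure ASCII text the character count and the UTF-8 byte count agree; as soon as a non-ASCII character such as "é" appears, they diverge.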

You can estimate the number of bytes per raw event using dbinspect:

| dbinspect index=wineventlog
| where eventCount>0 AND rawSize>0
| dedup bucketId
| stats avg(eval(exact(rawSize/eventCount))) as bytes_per_event
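One thing to keep in mind when reading the result: averaging rawSize/eventCount per bucket gives every bucket the same weight, so a small bucket with unusually large events moves the figure as much as a large bucket does. The Python sketch below shows the arithmetic only. The bucket figures and function names are invented for the example and are not real dbinspect output:

```python
# Hypothetical (rawSize, eventCount) pairs, one per bucket.
buckets = [(1_000_000, 2_000), (50_000, 50), (300_000, 1_000)]

def mean_of_bucket_averages(rows):
    """Unweighted mean of per-bucket rawSize/eventCount ratios."""
    ratios = [raw / count for raw, count in rows if raw > 0 and count > 0]
    return sum(ratios) / len(ratios)

def overall_bytes_per_event(rows):
    """Total raw bytes divided by total events, so larger buckets count for more."""
    kept = [(raw, count) for raw, count in rows if raw > 0 and count > 0]
    return sum(raw for raw, _ in kept) / sum(count for _, count in kept)

print(mean_of_bucket_averages(buckets))
print(overall_bytes_per_event(buckets))
```

Both functions assume at least one bucket survives the filter. Which figure is more useful depends on whether you care about a typical bucket or about the index as a whole.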

Also note that event codes are not globally unique. They are unique by event source, e.g. "Microsoft Windows security auditing." or "Eventlog" in the Security event log. The average size of event code 0 from source Foo may not be the same as the average size of event code 0 from source Bar.
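In practice this means any per-event-code total should be keyed on the pair of source and event code rather than on the code alone. A minimal Python sketch of that grouping, assuming made-up records and a hypothetical helper name:

```python
from collections import defaultdict

# Hypothetical (source, event_code, size_in_bytes) records; the values are made up.
events = [("Foo", 0, 120), ("Bar", 0, 900), ("Foo", 0, 80)]

def bytes_by_source_and_code(records):
    """Sum sizes per (source, event_code) pair instead of per event code alone."""
    totals = defaultdict(int)
    for source, code, size in records:
        totals[(source, code)] += size
    return dict(totals)

print(bytes_by_source_and_code(events))
```

Grouping on the code alone would merge the Foo and Bar rows into a single total for event code 0.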
