Getting Data In

Difference in Size Between Events

Path Finder

I have two indexes that contain different sets of events.

Index 1
Event Count – 23,952
Current Size – 19 MB

Index 2
Event Count – 431,026
Current Size – 20 MB

The size is nearly the same, but the number of events is drastically different. That would make sense except that the events in both indexes are generally the same length. Is there any explanation for the difference in size here?
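A quick back-of-envelope check (assuming the sizes above are in MB, as reported by the Indexes view) shows how far apart the per-event footprint of the two indexes really is:

```
| makeresults
| eval index1_bytes_per_event = round((19 * 1024 * 1024) / 23952, 1)
| eval index2_bytes_per_event = round((20 * 1024 * 1024) / 431026, 1)
| table index1_bytes_per_event index2_bytes_per_event
```

That works out to roughly 830 bytes per event in index 1 versus about 49 bytes per event in index 2, a difference of around 17x, so something beyond raw event length must be at play.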

Index 1 - Event Example

    {"time":"Fri Apr 03 17:57:08 CDT 2015","web_request_response_time":"0.45356011390686035","application":"node_count":"1","DataType":"PurepathData","state":"OK","cpu":"0.448837012052536","System Profile":"c_prodissue","breakdown":"CPU: 0.449 ms, Sync: -, Wait: -, Suspension: -","agent":"_JavaApp06_sin@sin:1547","root_path_thread_name":"http-apr-169.97.17.67-11000-exec-2","time":"Fri Apr 03 17:57:08 CDT 2015","response_time":"0.45356011390686035","execsum":"0.45356011390686035","name":"/SUI/monitoring","exec":"0.45361328125"}

     {"time":"Fri Apr 03 17:57:03 CDT 2015","web_request_response_time":"0.5128860473632812","application":"applic","node_count":"1","DataType":"PurepathData","state":"OK","cpu":"0.5083289742469788","System Profile":"_uat_prodissue","breakdown":"CPU: 0.508 ms, Sync: -, Wait: -, Suspension: -","agent":"UAT_JavaApp05_sin@sin:28893","root_path_thread_name":"http-apr-169.97.17.62-11000-exec-17","time":"Fri Apr 03 17:57:03 CDT 2015","response_time":"0.5128860473632812","execsum":"0.5128860473632812","name":"/UI/monitoring","exec":"0.512939453125"}

Index 2 - Event Example

            System_Profile=Monitoring #document dynatrace version=6.1.0.8054 systemprofile capture=true modifiedby=E745984 repositoryaccess=true incidentrules incidentrule flags=1 id=Host Disk Unhealthy incidentdashboardname=Incident Zero Conf Dashboard timeframe=10 actions actionref bundleversion=0.0.0 execution=begin key=com.dynatrace.diagnostics.plugins.EmailNotification refaction=com.dynatrace.diagnostics.plugins.EmailNotification rolekey=com.dynatrace.diagnostics.plugins.EmailNotificationAction roletype=1 severity=informational smartalert=false type=Email Notification property key=from typeid=string value= 

            System_Profile=Monitoring #document dynatrace version=6.1.0.8054 systemprofile capture=true modifiedby=E745984 repositoryaccess=true incidentrules incidentrule flags=1 id=Host Network Unhealthy incidentdashboardname=Incident Zero Conf Dashboard timeframe=10 actions actionref bundleversion=0.0.0 execution=begin key=com.dynatrace.diagnostics.plugins.EmailNotification refaction=com.dynatrace.diagnostics.plugins.EmailNotification rolekey=com.dynatrace.diagnostics.plugins.EmailNotificationAction roletype=1 severity=informational smartalert=false type=Email Notification property key=bcc typeid=string value= 
1 Solution

SplunkTrust

I'm going to guess that your data in index 1 has INDEXED_EXTRACTIONS=json activated in props.conf. Using more space in that case is expected behaviour: that space is traded for speed when searching those fields, especially in tstats situations.
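For reference, the setting in question looks something like this in props.conf (the sourcetype name here is hypothetical):

```
# props.conf -- stanza/sourcetype name is hypothetical
[purepath:json]
INDEXED_EXTRACTIONS = json
```

With this enabled, every JSON field is written into the index's tsidx structures at index time, which costs disk space but makes those fields available to tstats without parsing _raw at search time.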

To further investigate, run these two searches:

| dbinspect index=index1 | eval rawSizeMB = rawSize / 1048576 | table id eventCount rawSizeMB sizeOnDiskMB

| dbinspect index=index2 | eval rawSizeMB = rawSize / 1048576 | table id eventCount rawSizeMB sizeOnDiskMB

That'll give you the event count, the raw size ingested into each bucket for that index, and how much space each bucket occupies on disk. If you have a few huge rogue events, you should see one bucket behaving differently from the others; if my JSON guess is correct, all buckets for an index should look fairly similar.

As for the events themselves, it seems the data in index 1 has more unique tokens - for example, those high-precision numbers. Lots of unique tokens increase the size of the dictionaries, and hence of Splunk's index structures. The index 2 sample events seem to have lots of repeating tokens in the field values and not many unique ones.


SplunkTrust

By default, Splunk will force an event break after 10,000 characters. You can change that in props.conf using the TRUNCATE setting. In the same spirit, the default will break after 256 lines in one event; see MAX_EVENTS in props.conf.

These default limits exist to mitigate misconfigurations and systems throwing unexpected log data.
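If those defaults ever need raising, the overrides live in props.conf on the relevant sourcetype (the stanza name below is hypothetical):

```
# props.conf -- stanza name is hypothetical
[my:verbose:sourcetype]
# Raise the per-event character limit from the 10,000-character default
TRUNCATE = 50000
# Allow multi-line events of up to 1,000 lines (default is 256)
MAX_EVENTS = 1000
```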


SplunkTrust

The configuration reference is here: http://docs.splunk.com/Documentation/Splunk/6.2.3/Admin/Propsconf (search for INDEXED_EXTRACTIONS).
There's a more human-readable walkthrough here: docs.splunk.com/Documentation/Splunk/6.2.3/Data/Extractfieldsfromfileheadersatindextime

Regular searches should run at similar speeds. What benefits most is something like this:

| tstats avg(cpu) avg(web_request_response_time) where index=index1 by _time span=auto prestats=t | timechart avg(cpu) avg(web_request_response_time)

That should be massively faster than trying to pry the cpu and web_request_response_time fields from the JSON at search time.


Path Finder

Your assumption is correct. So you're saying that the data in index 1 can be searched faster?

This data comes from a custom-made script. If the trade-off for the larger file size is quicker results, then I will leave the formatting as is. Otherwise, if there were no upside to formatting the events this way, I would change them to be simpler.

Thanks for the heads up. Are there any reference docs available related to this?


Communicator

Is it possible there are one or two rogue gigantic events in Index 1? I've never used it personally, but I've read of people using "eval esize" to check this kind of thing.
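A sketch of that kind of check, using len(_raw) to approximate each event's size (the field name esize is just a convention, not a built-in):

```
index=index1
| eval esize = len(_raw)
| stats count avg(esize) max(esize) perc95(esize)
```

If max(esize) dwarfs the average, a handful of oversized events could account for the extra space; if the distribution is tight, the cause lies elsewhere.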

Path Finder

I believe there is a character limit for events, so even if there were a handful of rogue events, that still couldn't account for a tenfold size increase.


Communicator

Ah, I didn't know that actually.


Path Finder

Communicator

Yeah, I immediately looked into that as soon as you mentioned it. That post exactly, actually. Thanks!


Esteemed Legend

How are you calculating "size"?


Path Finder

That is coming from the Indexes view in the Splunk Settings. "Current size in MB"


Path Finder

There are more field extractions occurring in the heavier events, so that could well be the case.
