
How to squash host and source fields?

Omar
Explorer

Dear Splunkers, 

 

I am having an issue with squashed fields. When I search for events with no host or source, I don't get any results:

index=<my_index> 
| where isnull(source)

Does Splunk drop events after they are squashed? Logically, there should be events in my index that are missing the host and source fields.

 


gcusello
SplunkTrust

Hi @Omar,

First of all, it is very strange that the source field is null, because every event must have a value in this field.

Anyway, if you want to search for events without a value in the source field, please try this:

index=<my_index> NOT source=*
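
To see at a glance how many events do or do not carry a source value, a variation along these lines might help. It is only a sketch: has_source is a field name made up for illustration, and <my_index> is still the placeholder for your index.

```
index=<my_index>
| eval has_source=if(isnull(source) OR source="", "no", "yes")
| stats count by has_source
```

If only a "yes" row comes back, every event in the searched time range has a source value.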

Ciao.

Giuseppe


Omar
Explorer

Hello @gcusello

Thank you for your response.

 

Actually, yes, this can happen because of squashing: when a certain threshold is reached, indexers drop the (host, source) fields to avoid an explosion in memory and processing overhead.

What confuses me is that I am unable to find those events, so I'm wondering whether Splunk is dropping the entire events or just those fields.

 

The search below shows whether you have this issue. It only works with large indexes:

index=_internal source=*license_usage.log* type="Usage" idx="my_index"
| eval h=if(len(h)=0 OR isnull(h),"(SQUASHED)",h)
| eval s=if(len(s)=0 OR isnull(s),"(SQUASHED)",s)
| eval st=if(len(st)=0 OR isnull(st),"(UNKNOWN)",st)
| fields _time,b,h,st
| bin _time span=1d
| stats sum(b) AS volume by h, _time,st
| stats avg(volume) AS avgVolume max(volume) AS maxVolume by h,st
| eval avgVolumeGB=round(avgVolume/1024/1024/1024,3)
| eval maxVolumeGB=round(maxVolume/1024/1024/1024,3)
| fields h,st, avgVolumeGB, maxVolumeGB
| rename avgVolumeGB AS "average" maxVolumeGB AS "peak",st AS "sourcetype", h AS "hostname"
| sort - average
| head 10
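
A shorter check, as a rough sketch, is to count how many license usage entries look squashed. It assumes, as the search above does, that squashed entries show up with an empty or missing h or s value; squashed, entries and GB are names chosen here for illustration.

```
index=_internal source=*license_usage.log* type="Usage" idx="my_index"
| eval squashed=if(isnull(h) OR len(h)=0 OR isnull(s) OR len(s)=0, "yes", "no")
| stats count AS entries sum(b) AS bytes by squashed
| eval GB=round(bytes/1024/1024/1024,3)
```

A "yes" row with a non-zero count would suggest that some usage entries for the index were reported without host or source.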

  

 


gcusello
SplunkTrust

Hi @Omar,

This is the first time I have seen this behavior. I have found that when there is congestion (full queues), indexing of the internal logs is delayed, but I have never seen the host and source fields being dropped!

Anyway, I suggest analyzing why this congestion occurs and finding a solution: maybe the storage is too slow, maybe your servers need more resources, or maybe there is a queue problem caused by a wrong configuration.

In any case, open a ticket with Splunk Support for this.

Ciao.

Giuseppe
