Archive

Efficiently counting the occurrences of a delimited substring across events.

New Member

In my event data, I have a field called "blocks", the content of that field is a comma separated list of blocks.

For example:
block1,block4,block10
block
1,block3,block4

I want to be able to count the occurrence of the blocks over time.

So from above example, I should get the following

block_1                 2
block_3                 1
block_4                 2
block_10               1

I have this working via the following, however it is very slow.

<my search>  | makemv delim="," blocks| mvexpand blocks| stats count by blocks

Is there a more efficient way to do this?

Tags (1)
0 Karma

Influencer

If you know all the expected values for field blocks then you can do this. Here, block1, block2, block3, block4 and block_10 are expected values of multi-value field blocks.

<my search> 
| foreach block_1,block_2,block_3,block_4,block_10 
    [ eval <<FIELD>>=if(like(blocks, "%<<FIELD>>%"), "yes", null)] 
| stats count(block_*) as block_* 
| transpose column_name=blocks

Here you avoid using mvexpand which is time consuming for large number of events.

0 Karma

New Member

Hi manjunathmeti,

Thanks for the suggestion.

Yes, the expected values for field "blocks" is known, currently there are 48 possible values. I tried your suggestion and ran it over 24 hours of data (~4 million events), the mvexpand approach was faster:
mvexpand approach - 178.78 seconds
foreach/eval approach - 222.54 seconds

I'll need to run this search ad hoc over longer timer periods (30 - 90 days), hence the focus on performance.

Thanks for the help.

0 Karma