Hi, I wonder whether someone may be able to help me please.
I'm using the the search below to return values in a table.
auditSource="cato-filing" auditType="FilingStarted" | dedup "detail.Filing ID" | table "detail.*"
What I'd like to do is make this more efficient by changing the dedup
to stats dc
. Now I have used stats dc before, so I tried auditSource="cato-filing" auditType="FilingStarted" | stats dc(detail.Filing ID) By "detail.*"
, but for the life of me, I can't work out how to return the table of results, because my effort returns no results.
I just wondered whether someone may be able to look at this please and let me know where I've gone wrong.
Many thanks and kind regards
Chris
So just to throw in my two cents:
auditSource="cato-filing" auditType="FilingStarted" | rename detail.* as * | stats dc("Filing ID") as "Filing ID" by CompanyName Department Location
HI @alacercogitatus, thank you for taking the time to reply to my post.
I tried the query you kindly provided, but unfortunately this create a table.
Many thanks and kind regards
Chris
Try something like this
auditSource="cato-filing" auditType="FilingStarted" | table "detail.*" | stats latest(*) as * by "detail.Filing ID"
This will do same function as dedup
Hi @somesoni2, thank you for taking the time to my post.
I've tried the query you kindly provided but it doesn't work. Rather than displaying the "detail.*" fields it just displays the "detail.Filing ID".
Many thanks and kind regards
Chris
Like this:
auditSource="cato-filing" auditType="FilingStarted" | stats values(detail.Filing ID)
To make it more flexible to cover more fields, you can do this:
auditSource="cato-filing" auditType="FilingStarted" | stats values(detail.*)
Hi @woodcock, thank you for taking the time to reply to my post.
I've tried the query you kindly provided, and although it does extract all the fields which is great, rather than each record being on it's own row, they are all contained within one.
Many thanks and kind regards
Chris
OK, then just add this:
... | foreach values* [ mvexpand <<FIELD>> ]
Hi @woodcock, thank you for coming back to me with this, but unfortunately this doesn't change the layout of the results.
I think I'll have to say with the 'dedup' because I think what I want to achieve isn't possible.
Many thanks and kind regards
Chris
There is no doubt that it is possible but if you'd like to keep your original solution, that is up to you. The problem is that your original search implied some things about your dataset that clearly aren't true (else our solutions would have worked). Given this, it is likely that your original search is not doing exactly what you think it is doing either, so beware! It might be best if you back all the way up and show us some sample events including fields:
... | eval raw=_raw | table *
What makes you think that dedup
is inefficient? What makes you think that dedup
is any different from stats latest(_raw) AS _raw
(which will be pretty much the same work as stats dc
)?
dedup
compared to a stats dc by
is indeed less efficient. When using dedup
most (if not all) of the _raw
events must be returned to the search head. When using stats
, only the dc
and the by clause field
are returned to the search head - which gives a better performance when re-assembling on the search head to do the final stats
.
As per my understanding from your question, you could rename before stats something like this.
| rename "detail.Filing ID" as FilingID | stats dc(FilingID)
Hope this helps !
Hi @immortalraghaven,
Thank you for taking the time to reply to my post.
Unfortunately the rename won't work because I'm trying to use a 'stats dc' for multiple fields as per my last post to @MuS.
Many thanks and kind regards
Chris
Hi IRHM73,
two things that I can think of:
auditSource="cato-filing" auditType="FilingStarted" | stats dc(detail.Filing ID) By "detail.*"
will not work because the dc(detail.Filing ID)
contains a space and should be dc("detail.Filing ID")
auditSource="cato-filing" auditType="FilingStarted" | stats dc(detail.Filing ID) By "detail.*"
will not work because by "detail.*"
contains a wild card; take a field which is unique like detail.foo
or detail.baz
You could try auditSource="cato-filing" auditType="FilingStarted" | stats dc("detail.Filing ID") BY "detail.Filing ID" | table "detail.Filing ID"
or just use dedup
.
Hope this helps ... and this is un-tested 😉
cheers, MuS
Hi @MuS thank you for coming back to me with this. I'm trying to get away from dedup because of it's over use of resources.
I have tried the query you kindly provided and I appreciate that this was untested, but unfortunately this doesn't work.
However I've found that rather than using "detail.*" I can use the fields:
detail.CompanyName
detail.Department
detail.Location
I'm not sure whether that helps.
Many thanks and kind regards
Chris
Just for the fun of it, I ran all searches as run everywhere examples with fixed time range and leave you to choose the fastest/best for your use case:
index=_internal earliest=-2d@d latest=-1d@d | dedup sourcetype | table sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 249.892 seconds.
index=_internal earliest=-2d@d latest=-1d@d | stats dc(sourcetype) by sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 229.72 seconds.
index=_internal earliest=-2d@d latest=-1d@d | stats latest(sourcetype) by sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 228.705 seconds.
index=_internal earliest=-2d@d latest=-1d@d | table sourcetype | stats latest(*) as * by sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 382.519 seconds.
index=_internal earliest=-2d@d latest=-1d@d | table sourcetype | stats values(sourcetype)
This search has completed and has returned 1 result by scanning 2,907,591 events in 388.672 seconds.
cheers, MuS
Hi @Mus, thank you very much for this.
It provide some interesting reading.
Kind Regards
Chris
just to be complete, here is the latest command ran on the same server as yesterday:
index=_internal earliest=-3d@d latest=-2d@d | stats values(sourcetype) | foreach values* [ mvexpand <<FIELD>> ]
This search has completed and has returned 8 results by scanning 2,907,591 events in 248.982 seconds.
Hi, thank you very much for this.
I think I'll have to stay with the dedup because what I want to achieve doesn't seem possible.
Kind Regards
Chris