I know there is a syntax difference between:
sourcetype=blah | chart count over foo by bar
and
sourcetype=blah | chart count by foo, bar
But is there any functional difference?
Comparing the performance and request sections of the Job Inspector for those two queries shows a difference of only a couple of milliseconds on a sample dataset.
Are they actually different under the hood or is "over X by Y" just another way of saying "by X, Y"?
On a related note, where is the best place to look to see what a job is actually doing?
Update: added the count keyword to the searches above; I had miscopied them.
No difference between the two.
chart something OVER a BY b
and
chart something BY a b
In both cases, a supplies the rows (its values run down the first column) and the values of b become the column headers.
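To make that layout concrete, here is a rough sketch of what either form of chart appears to be doing, written as a stats followed by xyseries. It reuses the index=_internal search from elsewhere in this thread purely as an illustration. Treat it as an approximation rather than a statement about the internals: I'm assuming chart's extra handling (the default series limit, the OTHER column, and how empty cells are filled) is the only thing that differs.

```spl
index=_internal earliest=-10m@m latest=-2m@m
| stats count by source, sourcetype
| xyseries source sourcetype count
```

Here source plays the role of a (one row per value) and sourcetype plays the role of b (one column per value), which should match the table layout of both chart count over source by sourcetype and chart count by source, sourcetype.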
Going to mark your response as the answer as I'm also pretty sure that the difference is purely cosmetic.
It would be nice if we could get a Splunk developer on here to verify. Maybe post the source code; just kidding 😉
Thanks.
There is no real difference that I've seen so far, except maybe better readability: "chart some statistic over the x-axis field and group by some other field"
That's a matter of personal taste though.
Yes, they definitely give the same result; in both cases the first field, source, acts as the fixed field. I am also trying to find out whether there are any real differences 🙂
Thanks for the answer, strive.
I'm not sure whether the data I'm using is causing any differences, but going along with your example, have you tried the following search?
index=_internal earliest=-10m@m latest=-2m@m | chart count by source, sourcetype
Does that not show you the same visualizations when all other settings are the same? My queries are showing the same exact information for me.
As per my understanding, over is generally used to choose which field goes on the axis.
Let's take an example:
index=_internal earliest=-10m@m latest=-2m@m | chart count over source by sourcetype
For this search, if I choose Column, Line, or Area as the visualization (with stacked mode on), the X axis stays fixed as source. If I choose Bar, source is on the Y axis instead.
Simply put: over fixes one field, and by splits that field further by other dimensions.
by field1, field2 also works in a similar manner...
I would be more than happy to know the real differences 🙂