Solved: how to do distinct count over a multivalue data co...

fere · ‎10-10-2012

Hi,
I have the following search which returns the avg number of "EnterPlace" actions in a session (a transaction = a session) by month.
Each "EnterPlace" action_type also has a place_id.
The problem is that we need to calculate the avg number of EnterPlace actions with unique place_id in a session (ie. if a transaction had two logs with action_type="EnterPlace" and both logs had the same value for the world_id, then the count for that transaction should be 1).
Is there a way to do that?
Really appreciate your help.

This is what I have so far, which needs to be corrected:

endlessSource | transaction user_id, pid keeporphans=f maxspan=24h maxpause=30m mvraw=t delim="," mvlist=t |
bucket _time span=30d |
convert timeformat="%y-%m-%d" ctime(_time) as YMD |
eval this_action_type= mvfilter(match(action_type, "EnterPlace")) |
eval this_action_count=mvcount(this_action_type) |
fillnull this_action_count |
stats avg(this_action_count) as avg_Action_by_session
max(this_action_count) as max_Action_by_session by YMD

the problem is that dc should be on a different field than the mvexpand and only conditionally:
the events of a transaction have different values for action_type.
so, for each transaction, only for the events that action_type=EnterPlace, I need to do dc(world_id). and then average out the dc value over all the transactions.

yannK · ‎10-10-2012

Hi Fere,
My understanding is that after the transaction you want to expand each multivalue of the action_type and world_id (or other fields), in order to do a distinct count of them.

The difficult being that, the fields are multivalue after the transaction, and that you want to distinguish per transaction.

Here is a proposition using :

an eval field for the counter action_type and world_id (action_id)
an eval fields to distinguish each unique transaction (unique_id based on user_id, pid and the _time of the transaction begin, before the bucketing of time)
a mvindex command to turn a multiline in to a single line (they are identical)
a mvexpand to explode each action_id into seperate events in order to count them.

see



endlessSource | eval unique_id=user_id."-".pid

| eval action_id=action_type."-".world_id

| transaction unique_id keeporphans=f maxspan=24h maxpause=30m mvraw=t delim="," mvlist=t

| eval unique_id=_time."-".mvindex(unique_id,0,0)

| bucket _time span=30d

| convert timeformat="%y-%m-%d" ctime(_time) as YMD

| mvexpand action_id

| search action_id="EnterRoom*"

| stats count dc(action_id) AS distinct by unique_id YMD

View solution in original post

yannK · ‎10-10-2012

Hi Fere,
My understanding is that after the transaction you want to expand each multivalue of the action_type and world_id (or other fields), in order to do a distinct count of them.

The difficult being that, the fields are multivalue after the transaction, and that you want to distinguish per transaction.

Here is a proposition using :

an eval field for the counter action_type and world_id (action_id)
an eval fields to distinguish each unique transaction (unique_id based on user_id, pid and the _time of the transaction begin, before the bucketing of time)
a mvindex command to turn a multiline in to a single line (they are identical)
a mvexpand to explode each action_id into seperate events in order to count them.

see



endlessSource | eval unique_id=user_id."-".pid

| eval action_id=action_type."-".world_id

| transaction unique_id keeporphans=f maxspan=24h maxpause=30m mvraw=t delim="," mvlist=t

| eval unique_id=_time."-".mvindex(unique_id,0,0)

| bucket _time span=30d

| convert timeformat="%y-%m-%d" ctime(_time) as YMD

| mvexpand action_id

| search action_id="EnterRoom*"

| stats count dc(action_id) AS distinct by unique_id YMD

how to do distinct count over a multivalue data conditionally?

Prove Your Splunk Prowess at .conf25—No Prereqs Required!

Splunk Observability Cloud's AI Assistant in Action Series: Observability as Code

Splunk Answers Content Calendar, July Edition I

Are you a member of the Splunk Community?

how to do distinct count over a multivalue data conditionally?

Prove Your Splunk Prowess at .conf25—No Prereqs Required!

Splunk Observability Cloud's AI Assistant in Action Series: Observability as Code

Splunk Answers Content Calendar, July Edition I