Splunk Search
Highlighted

Counting duplicate values

Explorer

Situation : I have fields sessionId and personName. This session ID has many-to-may mapping with personName.

Need is : I want the count of personName associated with sessionId.

Query I am using : | table sessionId, personName, it gives following

sessionId personName
1. 1234 Name1
2. 1234 Name 2
3. 1234 Name1
4. 1234 Name3
5. 4321 Name1
6. 4321 Name3
7. 4321 Name3

What I want :
sessionId personName count
1. 1234 Name1 2
2. 1234 Name2 1
3. 1234 Name3 1
4. 4321 Name1 1
5. 4321 Name3 2

0 Karma
Highlighted

Re: Counting duplicate values

SplunkTrust
SplunkTrust

You can just replace | table sessionId, personName with | stats count by sessionId, personName in your search.

Update

User following for your sorting and filter requirement.

your base search

| stats count by sessionId, personName 
| where count>3
| sort -count
Highlighted

Re: Counting duplicate values

Explorer

I also want to have a where clause -- where I want all the sessionId with highest number of occurrence. And this occurrence value should be configurable.

So, lets say, if I want the occurrence threshold is 3 I should all the below three
1. 1234 Name1 20
2. 1234 Name2 15
3. 1234 Name3 1
4. 4321 Name1 2
5. 4321 Name3 6
6. 4321 Name3 9

if I want the occurrence threshold is 2 I should all the below three
1. 1234 Name1 20
2. 1234 Name2 15
3. 4321 Name3 6
4. 4321 Name3 9

So, if there is sessionID (lets say 3243) with just 1 occurrence, then it should not come in the result.

0 Karma
Highlighted

Re: Counting duplicate values

Explorer

I also want to have a where clause -- where I want all the sessionId with highest number of occurrence. And this occurrence value should be configurable.

So, lets say, if I want the occurrence threshold is 3 I should all the below three
1. 1234 Name1 20
2. 1234 Name2 15
3. 1234 Name3 1
4. 4321 Name1 2
5. 4321 Name3 6
6. 4321 Name3 9

if I want the occurrence threshold is 2 I should all the below three
1. 1234 Name1 20
2. 1234 Name2 15
3. 4321 Name3 6
4. 4321 Name3 9

So, if there is sessionID (lets say 3243) with just 1 occurrence, then it should not come in the result.

0 Karma
Highlighted

Re: Counting duplicate values

Explorer

No, this doesn't help -

It gives something like this , assuming the count is more that 10:
1. 1234 Name1 20
2. 4321 Name2 15

3. 1234 Name3 1
4. 4321 Name1 2
5. 4321 Name3 6
6. 4321 Name3 9

0 Karma
Highlighted

Re: Counting duplicate values

SplunkTrust
SplunkTrust

Change the where clause according to the threshold you want to use.

0 Karma
Highlighted

Re: Counting duplicate values

Explorer

this is not helping either. Basically I want the key (sessionId in our case) to be repetitive if there are different personName occurring multiple times

  1. 1234 Name1 20
  2. 1234 Name2 15
  3. 1234 Name3 1
  4. 4321 Name1 2
  5. 4321 Name3 6
  6. 4321 Name3 9
0 Karma
Highlighted

Re: Counting duplicate values

SplunkTrust
SplunkTrust

The query your base search | stats count by sessionId, personName does do that. Could you post the exact query you're using, corresponding output (which I believe have issues) and corresponding expected output?

0 Karma
Highlighted

Re: Counting duplicate values

Explorer

92S38LP31QG4E93Z0A0 NAME1
92S38LP31QG4E93Z0A0 NAME1
92S38LP31QG4E93Z0A0 NAME1
92S38LP31QG4E93Z0A0 NAME2
92S38LP31QG4E93Z0A0 NAME2
92S38LP31QG4E93Z0A0 NAME2
92S38LP31QG4E93Z0A0 NAME2
92S38LP31QG4E93Z0A0 NAME2
92S38LP31QG4E93Z0A0 NAME3
92S38LP31QG4E93Z0A0 NAME3
92S38LP31QG4E93Z0A0 NAME3
92S38LP31QG4E93Z0A0 NAME3
92S38LP31QG4E93Z0A0 NAME3

1826SSQ98518QAID99A NAME1
1826SSQ98518QAID99A NAME2
1826SSQ98518QAID99A NAME3
1826SSQ98518QAID99A NAME4
1826SSQ98518QAID99A NAME3
1826SSQ98518QAID99A NAME2
1826SSQ98518QAID99A NAME2
1826SSQ98518QAID99A NAME3

0LKJ68OP19A8865A92Q NAME1
0LKJ68OP19A8865A92Q NAME1
0LKJ68OP19A8865A92Q NAME2
0LKJ68OP19A8865A92Q NAME2
0LKJ68OP19A8865A92Q NAME4

92S38LP31QG4E93Z0A0 NAME1 3
92S38LP31QG4E93Z0A0 NAME2 5
92S38LP31QG4E93Z0A0 NAME3 5
1826SSQ98518QAID99A NAME1 1
1826SSQ98518QAID99A NAME2 3
1826SSQ98518QAID99A NAME3 3
1826SSQ98518QAID99A NAME4 1
0LKJ68OP19A8865A92Q NAME1 2
0LKJ68OP19A8865A92Q NAME2 2
0LKJ68OP19A8865A92Q NAME4 1

Now, if I say my occurence count is 2 then I should get

92S38LP31QG4E93Z0A0 NAME2 5
92S38LP31QG4E93Z0A0 NAME3 5

1826SSQ98518QAID99A NAME2 3
1826SSQ98518QAID99A NAME3 2

0LKJ68OP19A8865A92Q NAME1 2
0LKJ68OP19A8865A92Q NAME2 2

Instead what I am getting is --> coz it is just checking the count(end column). But I want to check the occurence count of the sessionID
92S38LP31QG4E93Z0A0 NAME1 3
92S38LP31QG4E93Z0A0 NAME2 5
92S38LP31QG4E93Z0A0 NAME3 5

1826SSQ98518QAID99A NAME2 3
1826SSQ98518QAID99A NAME3 3

0LKJ68OP19A8865A92Q NAME1 2
0LKJ68OP19A8865A92Q NAME2 2

And I am using exact same query you mentioned

| stats count by sessionId, personName
| where count>3
| sort -count

0 Karma
Highlighted

Re: Counting duplicate values

Explorer

I also want to have a where clause -- where I want all the sessionId with highest number of occurrence. And this occurrence value should be configurable.

So, lets say, if I want the occurrence threshold is 3 I should all the below three
1. 1234 Name1 20
2. 1234 Name2 15
3. 1234 Name3 1
4. 4321 Name1 2
5. 4321 Name3 6
6. 4321 Name3 9

if I want the occurrence threshold is 2 I should all the below three
1. 1234 Name1 20
2. 1234 Name2 15
3. 4321 Name3 6
4. 4321 Name3 9

So, if there is sessionID (lets say 3243) with just 1 occurrence, then it should not come in the result.

0 Karma