Splunk Search

grouping and delta in search

RobertRi
Communicator

Hi

I have the following problem with a search.

This is my data

01/23/2013 08:00 user=Mimi pieces=23 price=30 region=noe
01/23/2013 09:00 user=Mimi pieces=0 price=33 region=ooe
01/23/2013 10:00 user=Mimi pieces=13 price=30 region=w
01/23/2013 08:00 user=Franz pieces=26 price=23 region=noe
01/23/2013 09:00 user=Franz pieces=21 price=73 region=ooe
01/23/2013 10:00 user=Franz pieces=43 price=12 region=w
01/23/2013 08:00 user=Sandra pieces=565 price=54 region=noe
01/23/2013 09:00 user=Sandra pieces=453 price=12 region=ooe
01/23/2013 10:00 user=Sandra pieces=233 price=21 region=w
01/23/2013 08:00 user=Susi pieces=0 price=320 region=noe
01/23/2013 09:00 user=Susi pieces=5 price=3 region=ooe
01/23/2013 10:00 user=Susi pieces=50 price=33 region=w

Now I want to see, for each user and each time, the delta in how many pieces the user has sold.
If I try it with a single user, it works fine:

sourcetype=delta  user=sandra | reverse | delta pieces as delta | stats avg(pieces) as pieces,avg(delta) as delta by user,_time

If I want to see all users with their deltas and omit user=sandra, I get a list with all users, but the delta is calculated across users, between the last event of one user and the first event of the next.

How can I build the search so that I get the delta values separately for each user, in a list I can chart?

Update:

This search looks better, but how can I add the delta for pieces?

sourcetype=delta  user=* | reverse | chart avg(pieces) as pieces by _time, user

_time   Franz   Mimi    Sandra  Susi
1   1/23/13 8:00:00.000 AM  26.000000   23.000000   565.000000  0.000000
2   1/23/13 9:00:00.000 AM  21.000000   0.000000    453.000000  5.000000
3   1/23/13 10:00:00.000 AM 43.000000   13.000000   233.000000  50.000000

Update:

I think I'm close to the answer.
I can use the parameter p=4 for delta; then Splunk calculates the right values for each user, because delta with p=4 compares each row with the row four positions earlier, which here (with four users, sorted by _time and user) is the same user's previous time slot.

sourcetype=delta | reverse | stats avg(pieces) as pieces by _time, user | delta pieces p=4

_time   user    pieces  delta(pieces)
1   1/23/13 8:00:00.000 AM  Franz   26.000000   
2   1/23/13 8:00:00.000 AM  Mimi    23.000000   
3   1/23/13 8:00:00.000 AM  Sandra  565.000000  
4   1/23/13 8:00:00.000 AM  Susi    0.000000    
5   1/23/13 9:00:00.000 AM  Franz   21.000000   -5.000000
6   1/23/13 9:00:00.000 AM  Mimi    0.000000    -23.000000
7   1/23/13 9:00:00.000 AM  Sandra  453.000000  -112.000000
8   1/23/13 9:00:00.000 AM  Susi    5.000000    5.000000
9   1/23/13 10:00:00.000 AM Franz   43.000000   22.000000
10  1/23/13 10:00:00.000 AM Mimi    13.000000   13.000000
11  1/23/13 10:00:00.000 AM Sandra  233.000000  -220.000000
12  1/23/13 10:00:00.000 AM Susi    50.000000   45.000000

I have tried to combine the search with a subsearch to get the number of distinct users, but I get an error message for the delta command:

sourcetype=delta | reverse | stats avg(pieces) as pieces by _time, user | delta pieces [search sourcetype=delta  | stats dc(user) as tmp | eval tmp= "p=" . tmp]

Thanks
Rob

1 Solution

jonuwz
Influencer

You need to use streamstats to calculate deltas if you need a 'by' clause:

Example :

* earliest=-1h@h
| bin _time span=5m
| stats count by _time, sourcetype
| streamstats window=2 global=f current=f first(count) as p_count by sourcetype
| eval delta=count-p_count

This gives you the change in count every 5 minutes per sourcetype.

Add this if you want to visualise it :

 | xyseries _time,sourcetype,delta
 | makecontinuous _time
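
Applied to the sample data in the question, the same pattern might look something like this (a sketch, assuming the sourcetype=delta events from the question; note that with current=f, window=1 picks up the immediately previous sample per user, whereas window=2 with first() would reach two samples back):

sourcetype=delta
| reverse
| stats avg(pieces) as pieces by _time, user
| streamstats window=1 global=f current=f last(pieces) as p_pieces by user
| eval delta=pieces-p_pieces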


neerajsafenet
New Member

Thanks a lot buddy. This solved my problem. 🙂


jonuwz
Influencer

Your data is already binned into daily chunks (from the look of it), so you probably don't need the | bin _time span=5m part.

xyseries is like chart/timechart, except that you don't need to use an aggregate function (see the docs).

makecontinuous _time looks at the _time field and fills in any missing samples so that the series is continuous. This is needed for Splunk to print the times on the graphs in a readable format. (timechart does this bit automatically.)
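
For the per-user pieces data in this thread, the last two pipes would presumably become:

 | xyseries _time,user,delta
 | makecontinuous _time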


RobertRi
Communicator

Hi jonuwz,

This command looks very difficult to me :-), but the first part works for my data.
I had to change the window= attribute of streamstats to 1 so that I correctly get the immediately previous sample. Now this command works fine:

sourcetype=delta | reverse | bin _time span=5 | stats avg(pieces) as pieces by _time, user | streamstats window=1 global=f current=f first(pieces) as p_pieces by user | eval delta=pieces-p_pieces

The second part is a complete mystery to me. Could you explain in a few words how it works?

Thanks for your help
Rob
