Splunk Search

grouping and delta in search

RobertRi
Communicator

Hi

I have the following problem with a search.

This is my data:

01/23/2013 08:00 user=Mimi pieces=23 price=30 region=noe
01/23/2013 09:00 user=Mimi pieces=0 price=33 region=ooe
01/23/2013 10:00 user=Mimi pieces=13 price=30 region=w
01/23/2013 08:00 user=Franz pieces=26 price=23 region=noe
01/23/2013 09:00 user=Franz pieces=21 price=73 region=ooe
01/23/2013 10:00 user=Franz pieces=43 price=12 region=w
01/23/2013 08:00 user=Sandra pieces=565 price=54 region=noe
01/23/2013 09:00 user=Sandra pieces=453 price=12 region=ooe
01/23/2013 10:00 user=Sandra pieces=233 price=21 region=w
01/23/2013 08:00 user=Susi pieces=0 price=320 region=noe
01/23/2013 09:00 user=Susi pieces=5 price=3 region=ooe
01/23/2013 10:00 user=Susi pieces=50 price=33 region=w

Now I want to see, for each user and each time, the delta in how many pieces the user has sold.
If I try it with a single user, this works fine:

sourcetype=delta  user=sandra | reverse | delta pieces as delta | stats avg(pieces) as pieces,avg(delta) as delta by user,_time

If I omit user=sandra to see all users with their deltas, I get a list of all users, but the delta is calculated across user boundaries (between the last value of one user and the first value of the next).

How can I build the search so that I get the delta values for each user separately, in a list I can chart?

Update:

This search looks better, but how can I add the delta for pieces?

sourcetype=delta  user=* | reverse | chart avg(pieces) as pieces by _time, user

_time                     Franz      Mimi       Sandra      Susi
1/23/13 8:00:00.000 AM    26.000000  23.000000  565.000000  0.000000
1/23/13 9:00:00.000 AM    21.000000  0.000000   453.000000  5.000000
1/23/13 10:00:00.000 AM   43.000000  13.000000  233.000000  50.000000

Update:

I think I'm close to the answer.
If I use the delta parameter p=4, Splunk calculates the right values for each user:

sourcetype=delta | reverse | stats avg(pieces) as pieces by _time, user | delta pieces p=4

_time                     user     pieces      delta(pieces)
1/23/13 8:00:00.000 AM    Franz    26.000000
1/23/13 8:00:00.000 AM    Mimi     23.000000
1/23/13 8:00:00.000 AM    Sandra   565.000000
1/23/13 8:00:00.000 AM    Susi     0.000000
1/23/13 9:00:00.000 AM    Franz    21.000000   -5.000000
1/23/13 9:00:00.000 AM    Mimi     0.000000    -23.000000
1/23/13 9:00:00.000 AM    Sandra   453.000000  -112.000000
1/23/13 9:00:00.000 AM    Susi     5.000000    5.000000
1/23/13 10:00:00.000 AM   Franz    43.000000   22.000000
1/23/13 10:00:00.000 AM   Mimi     13.000000   13.000000
1/23/13 10:00:00.000 AM   Sandra   233.000000  -220.000000
1/23/13 10:00:00.000 AM   Susi     50.000000   45.000000

I have tried to combine the search with a subsearch to get the distinct user count, but I get an error message for the delta function:

sourcetype=delta | reverse | stats avg(pieces) as pieces by _time, user | delta pieces [search sourcetype=delta  | stats dc(user) as tmp | eval tmp= "p=" . tmp]

Thanks,
Rob

1 Solution

jonuwz
Influencer

You need to use streamstats to calculate deltas if you need a 'by' clause:

Example:

* earliest=-1h@h
| bin _time span=5m
| stats count by _time, sourcetype
| streamstats window=2 global=f current=f first(count) as p_count by sourcetype
| eval delta=count-p_count

This gives you the change in count every 5 minutes per sourcetype.

Add this if you want to visualise it:

 | xyseries _time,sourcetype,delta
 | makecontinuous _time
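
Adapted to the sample data from the question, the same pattern would look like this (a sketch, assuming the pieces and user fields from the original post; note that with current=f, window=1 makes first() return the immediately preceding sample per user, while window=2 would reach back two samples):

 sourcetype=delta
 | reverse
 | stats avg(pieces) as pieces by _time, user
 | streamstats window=1 global=f current=f first(pieces) as p_pieces by user
 | eval delta=pieces-p_pieces
 | xyseries _time,user,delta
 | makecontinuous _time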



neerajsafenet
New Member

Thanks a lot buddy. This solved my problem. 🙂


jonuwz
Influencer

Your data is already binned into hourly chunks (from the look of it), so you probably don't need the | bin _time span=5m part.

xyseries is like chart/timechart, except that you don't need to use an aggregate function (see the docs).

makecontinuous _time looks at the _time field and fills in any missing samples so the series is continuous. This is needed for Splunk to print the times on the graph in a readable format (timechart does this automatically).
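
For the fields in the question, the visualisation part would pivot the per-user delta rows into one column per user (a sketch; span=1h is an assumption based on the hourly sample data, and the leading ... stands for the delta calculation above):

 ... | eval delta=pieces-p_pieces
 | xyseries _time,user,delta
 | makecontinuous _time span=1h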


RobertRi
Communicator

Hi jonuwz.

This command looks very difficult to me :-), but the first part works for my data.
I had to change the window= attribute of streamstats to 1 so that I get the immediately preceding sample correctly. Now this command works fine:

sourcetype=delta | reverse | bin _time span=5 | stats avg(pieces) as pieces by _time, user | streamstats window=1 global=f current=f first(pieces) as p_pieces by user | eval delta=pieces-p_pieces

The second part is completely cryptic to me; could you briefly explain how it works?

Thanks for your help
Rob
