Splunk Search

I have few events contains sell_time, based on sell_time I want to calculate sum of "price" column

nagarjuna280
Communicator

I have few events contains sell_time, based on sell_time I want to calculate sum of "price" column

index="example" sourcetype="abc" | eval a=strptime(sell_time ,"%m/%d/%y %H:%M:%S.%3Q") |eval _time=a | timechart per_day(price)

earliest and latest time would get from the query

latesttime is the latest time in the sell_time column, and earliest would be -2d of the latest time

How to do?

Tags (2)
0 Karma
1 Solution

DalJeanis
Legend

Okay, so it sounds like you are not dealing with realtime data (obviously), and I think you mean you only want full day's data for the last two days on file. If you want to change that to some other number of days, just change the number in the head command.

 index="example" sourcetype="abc"
| eval sell_day=strptime(sell_time ,"%m/%d/%y") 

| rename COMMENT as "Put whatever calculations you want right here"
| stats sum(price) as totalprice avg(price) as avgprice by sell_day

| rename COMMENT as "Now we sort it descending, and take the first 2 values, and reformat the date. "
| sort 0 - sell_day
| head 2
| eval sell_day=strftime(sell_day,"%Y-%m-%d")

There are a couple of different options, depending on what you want, how delayed your data might be, and whether this is always the price for a single item or whether there is a "unit count" field that tells you how many items were sold at that price

Let's assume that you always receive your sales data within an hour of the sell_time, and you want to aggregate your data on a full-hour basis. Let's also suppose you have a field "unitssold" that says how many of the thing were sold in that transaction.

earliest=-2d@h-2h  index="example" sourcetype="abc"
| eval min_sell_time = relative_time(now(),"-49h@h")
| eval max_sell_time = relative_time(now(),"-1h@h")
| eval _time=strptime(sell_time ,"%m/%d/%y %H:%M:%S.%3Q") 
| bin _time span=1h
| eval totalprice=unitssold*price
| stats sum(totalprice) as totalprice, avg(price) as avgprice, sum(unitssold) as unitssold by _time 
| eval weightedavgprice = round(totalprice/unitssold,2)
| where _time>=min_sell_time AND _time<max_sell_time

The above gives you, on an hourly basis, the total sale price, total number of units sold, the average price on a transaction basis, and the average price on a unit basis.

Now, if you want to add lines summing it up for each 24 hour period, you can do this

| eval mid_sell_time = (min_sell_time+max_sell_time)/2  
| eval period = if(_time<mid_sell_time,"prior","current")
| appendpipe 
    [
     | stats max(_time) as endtime, sum(totalprice) as totalprice, avg(avgprice) as avgprice, sum(unitssold) as unitssold by period
     | eval weightedavgprice = round(totalprice/unitssold,2
     | eval _time = endtime + 3600) 
    ]
| table _time period avgprice unitssold totalprice weightedavgprice
| sort 0 - period + _time

Now, if you don't care about the hourly fluctuations, then you can mix the above all together and just do this...

earliest=-2d@h-2h  index="example" sourcetype="abc"
| eval min_sell_time = relative_time(now(),"-49h@h")
| eval max_sell_time = relative_time(now(),"-1h@h")
| eval mid_sell_time = (min_sell_time+max_sell_time)/2  
| eval _time=strptime(sell_time ,"%m/%d/%y %H:%M:%S.%3Q") 
| where _time>=min_sell_time AND _time<max_sell_time
| eval period = if(_time<mid_sell_time,"prior","current")
| eval _time  = if(_time<mid_sell_time,mid_sell_time,max_sell_time)
| eval totalprice=unitssold*price
| stats sum(totalprice) as totalprice, avg(price) as avgprice, sum(unitssold) as unitssold max(_time) as _time by period 
| eval weightedavgprice = round(totalprice/unitssold,2)
| sort 0 - period

View solution in original post

DalJeanis
Legend

Okay, so it sounds like you are not dealing with realtime data (obviously), and I think you mean you only want full day's data for the last two days on file. If you want to change that to some other number of days, just change the number in the head command.

 index="example" sourcetype="abc"
| eval sell_day=strptime(sell_time ,"%m/%d/%y") 

| rename COMMENT as "Put whatever calculations you want right here"
| stats sum(price) as totalprice avg(price) as avgprice by sell_day

| rename COMMENT as "Now we sort it descending, and take the first 2 values, and reformat the date. "
| sort 0 - sell_day
| head 2
| eval sell_day=strftime(sell_day,"%Y-%m-%d")

There are a couple of different options, depending on what you want, how delayed your data might be, and whether this is always the price for a single item or whether there is a "unit count" field that tells you how many items were sold at that price

Let's assume that you always receive your sales data within an hour of the sell_time, and you want to aggregate your data on a full-hour basis. Let's also suppose you have a field "unitssold" that says how many of the thing were sold in that transaction.

earliest=-2d@h-2h  index="example" sourcetype="abc"
| eval min_sell_time = relative_time(now(),"-49h@h")
| eval max_sell_time = relative_time(now(),"-1h@h")
| eval _time=strptime(sell_time ,"%m/%d/%y %H:%M:%S.%3Q") 
| bin _time span=1h
| eval totalprice=unitssold*price
| stats sum(totalprice) as totalprice, avg(price) as avgprice, sum(unitssold) as unitssold by _time 
| eval weightedavgprice = round(totalprice/unitssold,2)
| where _time>=min_sell_time AND _time<max_sell_time

The above gives you, on an hourly basis, the total sale price, total number of units sold, the average price on a transaction basis, and the average price on a unit basis.

Now, if you want to add lines summing it up for each 24 hour period, you can do this

| eval mid_sell_time = (min_sell_time+max_sell_time)/2  
| eval period = if(_time<mid_sell_time,"prior","current")
| appendpipe 
    [
     | stats max(_time) as endtime, sum(totalprice) as totalprice, avg(avgprice) as avgprice, sum(unitssold) as unitssold by period
     | eval weightedavgprice = round(totalprice/unitssold,2
     | eval _time = endtime + 3600) 
    ]
| table _time period avgprice unitssold totalprice weightedavgprice
| sort 0 - period + _time

Now, if you don't care about the hourly fluctuations, then you can mix the above all together and just do this...

earliest=-2d@h-2h  index="example" sourcetype="abc"
| eval min_sell_time = relative_time(now(),"-49h@h")
| eval max_sell_time = relative_time(now(),"-1h@h")
| eval mid_sell_time = (min_sell_time+max_sell_time)/2  
| eval _time=strptime(sell_time ,"%m/%d/%y %H:%M:%S.%3Q") 
| where _time>=min_sell_time AND _time<max_sell_time
| eval period = if(_time<mid_sell_time,"prior","current")
| eval _time  = if(_time<mid_sell_time,mid_sell_time,max_sell_time)
| eval totalprice=unitssold*price
| stats sum(totalprice) as totalprice, avg(price) as avgprice, sum(unitssold) as unitssold max(_time) as _time by period 
| eval weightedavgprice = round(totalprice/unitssold,2)
| sort 0 - period

nagarjuna280
Communicator

actually we upload all the data whenever we want statistics on the data, but we upload we need last 2 days data,

Example: today date 10th , I upload data contains last month 10th this month 1st, from that I need last month 20th to this month 1st

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...