Hi
I want to calculate/simulate a data to analysis price difference, my data set in picture,
left is my data set, right table is that i want to produce
Process like this
LOOP
Until data finished
For example
find DIFF> 100
time: 00:02:58 PAR:1655
get 5 minutes after BTC: 1800 as BTC_last
get difference BTC_last - PAR = 1800 -1655 = 145 and continue
like this
find DIFF> 100
time: 00:09:58 PAR:1610
get 5 minutes after BTC: 1510 as BTC_last
get difference BTC_last - PAR = 1510 -1610 = -100
and so on?
How can i produce this process, given in right table using splunk?
Holy crud. I think this is an honest-to-got good use case for transaction
.
your search that gets the events
| fields _time BTC PAR
| rename COMMENT as "Sort into ascending _time order because of how transaction works, calculate DIFF if not already present"
| sort 0 _time
| eval DIFF = BTC-PAR
| rename COMMENT as "Build fields to hold PAR and BTC, which transaction will roll together and sort, so we make sure they will be in _time order"
| eval Time = strftime(_time,"%Y-%m-%d %H:%M:%S")
| eval TimePAR = Time."=".PAR
| eval TimeBTC = Time."=".BTC
| rename COMMENT as "Set a flag at valid places to start a transaction, get rid of fields we wont need, then run transaction."
| rename COMMENT as "max difference between first and last event is 4 minutes, +1s for good measure"
| rename COMMENT as "transaction command counts backward so we tell it to **end** with our start flag."
| eval CanStartHere=case(DIFF>100,"CanStartHere")
| fields - PAR BTC DIFF Time
| transaction maxspan=241s endswith="CanStartHere"
| rename COMMENT as "Get the PAR from the earliest time and the BTC from the latest time"
| eval PAR_first = mvindex(split(mvindex(TimePAR,0),"="),1)
| eval BTC_last = mvindex(split(mvindex(TimeBTC,mvcount(TimeBTC)-1),"="),1)
| eval REAL_DIFF = BTC_last - PAR_first
| table _time PAR_first BTC_last REAL_DIFF
Hey, that's a fun one. Pretty tough, because it needs to use the output of the same decision as the input for the next one. This is probably going to have to be either repetitive (same code repeated multiple times) or recursive.
What is the maximum overall length of time this query might be run against? Is the dataset above typical in terms of how frequently DIFF goes above 100?
As far as I can see, this might end up requiring one pass per 5m period. I can grab 2:58 and 22:58 at the same time by requiring no DIFF>100 within 5 minutes. But, considering that 17:58-22:58 is the longest such period at 5m overall, you're going to need at least 6-8 passes to be reasonably likely to get them all in a one-hour pull. I don't like those odds.
What is the maximum overall length of time this query might be run against? Is the dataset above typical in terms of how frequently DIFF goes above 100?
It works for a day, i have minutes data so 60*24 = 1440 for a day. And frequent is up to market data, i can not estimate.
Is the order time of your data set is same as what's shown in screenshot OR they appear in Splunk in reverse chronological order of _time (descending order of time)?
Although the detailing screenshot is different this is essentially the same question as https://answers.splunk.com/answers/590633/difference-calculation-different-time-in-different.html. You could have just added the screenshot to your previous question and added more details/clarification in the original thread rather than creating a new one.
Yes i want to add it, but splunk answers give error me by saying no enough point to add picture