Combine Sub searches with different time reference...

MarcusBB · ‎03-06-2021

Hey,

right now I am a bit messed up in the mind and not sure if I try to find an overly complicated solution to a maybe fairly simple problem.

Secanrio is; I have a set of historical stock market prices (Ticker, Close-price, Volume). I want to extract several information here:

1. The %Change of a given day compared to the previous (work) day and the respective volume. Additionally the time frame should be defined relatively (set only the date fix, but the range flexible). Table that for the Top10 with their respective volume that day.

2. Use the ouput from above to generate a similar set but now with 1day, week, month later then the given day, but only for the Top10 above and combine that into one result table

Desired outcome:

Ticker	%-Change -1d	Volume	%-Change +1d	%-Change +1w
AAA	5.0	10000	1.0	-8.0
BBB	3.0	50000	4.0	4.5
CCC	1.0	70000	-1.0	3.0

I think I solved part 1 with:

index=market_price Close!=""  
[
search latest="6/23/2020:00:00:00"
| addinfo 
| head 1 
| eval latest=info_max_time+86400
| eval earliest=info_max_time-86400
| fields earliest,latest 
| format "(" "(" "" ")" "OR" ")" 
]
| bin span=1d _time
| stats avg(Close) as Close avg(Volume) as Vol by _time Ticker
| streamstats global=f current=f first(Close) as p_close by Ticker
| eval delta=round(((Close-p_close)/p_close)*100,2)
| where delta!=""
| eval Amount_in_Mio=round((Close*Vol)/1000000,0)
| where Amount_in_Mio>2000
| table Ticker delta Amount_in_Mio
| sort - delta |head 10

Part 2 I can get a set of results for the day after, but not based on the Top 10 from above:

index=market_price Close!=""  
[
search earliest="6/23/2020:00:00:00"
| addinfo 
| head 1 
| eval latest=info_min_time+86400
| eval earliest=info_min_time-86400
| fields earliest,latest 
| format "(" "(" "" ")" "OR" ")" 
]
| bin span=1d _time
| stats avg(Close) as Close avg(Volume) as Vol by _time Ticker
| streamstats global=f current=f first(Close) as p_close by Ticker
| eval delta=round(((Close-p_close)/p_close)*100,2)
| where delta!=""

| table Ticker delta
| sort - delta |head 10

I think right now, I think I would need to combine several sub searches for relative times, the first Top10 and the subsequent datasets.

Not sure If I moved myself into a dead end here, so any suggestions are welcome.

Thanks

MarcusBB · ‎03-07-2021

@tscroggins : awesome. Thanks for the input! I think I'll have an exciting evening today to try it.
I'll let you know how it went.

tscroggins · ‎03-07-2021

Here's a starting point for a dashboard, but you'll need to accommodate market weekends, holidays, etc.

<form>
  <init>
    <set token="_time_tok.earliest">-1mon@d-1d</set>
    <set token="_time_tok.latest">-1mon@d</set>
  </init>
  <label>Nasdaq Quotes</label>
  <fieldset submitButton="false" autoRun="true">
    <input type="time" token="_time_tok" searchWhenChanged="true">
      <label></label>
      <change>
        <condition match="isnum($_time_tok.earliest$) AND isnum($_time_tok.latest$)">
          <eval token="_time_earliest_tok">$_time_tok.earliest$</eval>
          <eval token="_time_latest_tok">$_time_tok.latest$</eval>
          <eval token="_time_minus_1d_earliest_tok">relative_time($_time_tok.earliest$, "-1d")</eval>
          <eval token="_time_minus_1d_latest_tok">relative_time($_time_tok.latest$, "-1d")</eval>
          <eval token="_time_plus_1d_earliest_tok">relative_time($_time_tok.earliest$, "+1d")</eval>
          <eval token="_time_plus_1d_latest_tok">relative_time($_time_tok.latest$, "+1d")</eval>
          <eval token="_time_plus_1w_earliest_tok">relative_time($_time_tok.earliest$, "+1w")</eval>
          <eval token="_time_plus_1w_latest_tok">relative_time($_time_tok.latest$, "+1w")</eval>
          <eval token="_time_plus_1mon_earliest_tok">relative_time($_time_tok.earliest$, "+1mon")</eval>
          <eval token="_time_plus_1mon_latest_tok">relative_time($_time_tok.latest$, "+1mon")</eval>
        </condition>
        <condition>
          <eval token="_time_earliest_tok">relative_time(now(), $_time_tok.earliest$)</eval>
          <eval token="_time_latest_tok">relative_time(now(), $_time_tok.latest$)</eval>
          <eval token="_time_minus_1d_earliest_tok">relative_time(relative_time(now(), $_time_tok.earliest$), "-1d")</eval>
          <eval token="_time_minus_1d_latest_tok">relative_time(relative_time(now(), $_time_tok.latest$), "-1d")</eval>
          <eval token="_time_plus_1d_earliest_tok">relative_time(relative_time(now(), $_time_tok.earliest$), "+1d")</eval>
          <eval token="_time_plus_1d_latest_tok">relative_time(relative_time(now(), $_time_tok.latest$), "+1d")</eval>
          <eval token="_time_plus_1w_earliest_tok">relative_time(relative_time(now(), $_time_tok.earliest$), "+1w")</eval>
          <eval token="_time_plus_1w_latest_tok">relative_time(relative_time(now(), $_time_tok.latest$), "+1w")</eval>
          <eval token="_time_plus_1mon_earliest_tok">relative_time(relative_time(now(), $_time_tok.earliest$), "+1mon")</eval>
          <eval token="_time_plus_1mon_latest_tok">relative_time(relative_time(now(), $_time_tok.latest$), "+1mon")</eval>
        </condition>
      </change>
      <default>
        <earliest>-1mon@d-1d</earliest>
        <latest>-1mon@d</latest>
      </default>
    </input>
  </fieldset>
  <row>
    <panel>
      <title>Top 10 by Volume</title>
      <table>
        <search>
          <query>index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv ((earliest=$_time_minus_1d_earliest_tok$ latest=$_time_minus_1d_latest_tok$) OR (earliest=$_time_earliest_tok$ latest=$_time_latest_tok$) OR (earliest=$_time_plus_1d_earliest_tok$ latest=$_time_plus_1d_latest_tok$) OR (earliest=$_time_plus_1w_earliest_tok$ latest=$_time_plus_1w_latest_tok$) OR (earliest=$_time_plus_1mon_earliest_tok$ latest=$_time_plus_1mon_latest_tok$))
| rex field=source "(?:.*/)?(?&lt;Ticker&gt;[^\.]+)" 
| eval Close_Last=ltrim(Close_Last, "$$") ``` my currency is USD, and my locale uses the $$ prefix ```
| rename Close_Last as Close
| eventstats values(eval(case(_time==$_time_minus_1d_earliest_tok$, Close))) as Close-1d  values(eval(case(_time==$_time_plus_1d_earliest_tok$, Close))) as Close+1d  values(eval(case(_time==$_time_plus_1w_earliest_tok$, Close))) as Close+1w  values(eval(case(_time==$_time_plus_1mon_earliest_tok$, Close))) as Close+1mon by Ticker
| where _time==$_time_earliest_tok$
| sort 10 - Volume
| eval "%-Change - 1d"=round(100 * (Close - 'Close-1d') / 'Close-1d', 1)
| eval "%-Change + 1d"=round(100 * ('Close+1d' - Close) / Close, 1)
| eval "%-Change + 1w"=round(100 * ('Close+1w' - Close) / Close, 1)
| eval "%-Change + 1m"=round(100 * ('Close+1mon' - Close) / Close, 1)
| table Ticker "%-Change - 1d" Volume "%-Change + 1d" "%-Change + 1w" "%-Change + 1m"</query>
          <earliest>0</earliest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">100</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">none</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">true</option>
      </table>
    </panel>
  </row>
</form>

tscroggins · ‎03-07-2021

@MarcusBB

Here are two examples, one search using multiple join commands with subsearches and one search using no joins. Both produce the same output but require pre-calculating time ranges. In a Simple XML dashboard, the time ranges can be calculated in change events linked to a time input.

I've used historical Nasdaq data from https://www.nasdaq.com/market-activity/quotes/historical for fifteen symbols: AAPL, AMD, AMZN, CSCO, DELL, FB, IBM, MSFT, ORCL, QCOM, SBUX, SPLK, TSLA, TWTR, ZNGA. (Note that I'm a freelance Splunker, not a stock trader!)

``` my data for 2020-09-01 in America/New_York https://en.wikipedia.org/wiki/List_of_tz_database_time_zones ```
``` -1d earliest=1598846400 latest=1598932800 ```
``` -0d earliest=1598932800 latest=1599019200 ```
``` +1d earliest=1599019200 latest=1599105600 ```
``` +1w earliest=1599537600 latest=1599624000 ```
``` +1mon earliest=1601524800 latest=1601611200 ```
index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv earliest=1598932800 latest=1599019200 
| rex field=source "(?:.*/)?(?<Ticker>[^\.]+)" 
| eval Close_Last=ltrim(Close_Last, "$") ``` my currency is USD, and my locale uses the $ prefix ```
| rename Close_Last as Close
``` end of my data ``` 
| sort 10 - Volume 
| table Ticker Close Volume
| join type=left Ticker 
    [ search index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv earliest=1598846400 latest=1598932800 
    | rex field=source "(?:.*/)?(?<Ticker>[^\.]+)" 
    | eval Close_Last=ltrim(Close_Last, "$")
    | table Ticker Close_Last 
    | rename Close_Last as Close-1d ] 
| join type=left Ticker 
    [ search index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv earliest=1599019200 latest=1599105600 
    | rex field=source "(?:.*/)?(?<Ticker>[^\.]+)" 
    | eval Close_Last=ltrim(Close_Last, "$")
    | table Ticker Close_Last 
    | rename Close_Last as Close+1d ] 
| join type=left Ticker 
    [ search index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv earliest=1599537600 latest=1599624000 
    | rex field=source "(?:.*/)?(?<Ticker>[^\.]+)" 
    | eval Close_Last=ltrim(Close_Last, "$")
    | table Ticker Close_Last 
    | rename Close_Last as Close+1w ] 
| join type=left Ticker 
    [ search index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv earliest=1601524800 latest=1601611200 
    | rex field=source "(?:.*/)?(?<Ticker>[^\.]+)" 
    | eval Close_Last=ltrim(Close_Last, "$")
    | table Ticker Close_Last 
    | rename Close_Last as Close+1mon ]
| eval "%-Change - 1d"=round(100 * (Close - 'Close-1d') / 'Close-1d', 1)
| eval "%-Change + 1d"=round(100 * ('Close+1d' - Close) / Close, 1)
| eval "%-Change + 1w"=round(100 * ('Close+1w' - Close) / Close, 1)
| eval "%-Change + 1m"=round(100 * ('Close+1mon' - Close) / Close, 1)
| table Ticker "%-Change - 1d" Volume "%-Change + 1d" "%-Change + 1w" "%-Change + 1m"

Ticker	%-Change - 1d	Volume	%-Change + 1d	%-Change + 1w	%-Change + 1m
AAPL	4.0	152470100	-2.1	-15.9	-13.0
TSLA	-4.7	90119420	-5.8	-30.5	-5.7
AMD	1.5	56117100	-2.1	-14.6	-7.9
MSFT	0.8	25791240	1.9	-10.8	-6.5
CSCO	-0.5	23344890	0.9	-4.8	-7.7
FB	0.8	17320870	2.4	-8.2	-9.8
ZNGA	0.1	12708640	-0.1	-8.3	4.6
TWTR	1.4	10478790	6.1	-7.2	13.5
ORCL	0.8	10170910	2.4	-4.1	3.5
QCOM	2.4	9101301	1.0	-10.0	-2.0

Joins are great for modeling searches, but they have a bad reputation, sometimes deserved, sometimes not. In my small, standalone environment, this search executes in about 1.5 seconds. That's a bit slow.

Here's a similar search without joins that executes in about 0.5 seconds.

``` my data for 2020-09-01 in America/New_York https://en.wikipedia.org/wiki/List_of_tz_database_time_zones ```
``` -1d earliest=1598846400 latest=1598932800 ```
``` -0d earliest=1598932800 latest=1599019200 ```
``` +1d earliest=1599019200 latest=1599105600 ```
``` +1w earliest=1599537600 latest=1599624000 ```
``` +1mon earliest=1601524800 latest=1601611200 ```
index=nasdaq_market_activity sourcetype=nasdaq_quotes_csv ((earliest=1598846400 latest=1598932800) OR (earliest=1598932800 latest=1599019200) OR (earliest=1599019200 latest=1599105600) OR (earliest=1599537600 latest=1599624000) OR (earliest=1601524800 latest=1601611200))
| rex field=source "(?:.*/)?(?<Ticker>[^\.]+)" 
| eval Close_Last=ltrim(Close_Last, "$") ``` my currency is USD, and my locale uses the $ prefix ```
| rename Close_Last as Close
| eventstats values(eval(case(_time==1598846400, Close))) as Close-1d  values(eval(case(_time==1599019200, Close))) as Close+1d  values(eval(case(_time==1599537600, Close))) as Close+1w  values(eval(case(_time==1601524800, Close))) as Close+1mon by Ticker
| where _time==1598932800
| sort 10 - Volume
| eval "%-Change - 1d"=round(100 * (Close - 'Close-1d') / 'Close-1d', 1)
| eval "%-Change + 1d"=round(100 * ('Close+1d' - Close) / Close, 1)
| eval "%-Change + 1w"=round(100 * ('Close+1w' - Close) / Close, 1)
| eval "%-Change + 1m"=round(100 * ('Close+1mon' - Close) / Close, 1)
| table Ticker "%-Change - 1d" Volume "%-Change + 1d" "%-Change + 1w" "%-Change + 1m"

Ticker	%-Change - 1d	Volume	%-Change + 1d	%-Change + 1w	%-Change + 1m
AAPL	4.0	152470100	-2.1	-15.9	-13.0
TSLA	-4.7	90119420	-5.8	-30.5	-5.7
AMD	1.5	56117100	-2.1	-14.6	-7.9
MSFT	0.8	25791240	1.9	-10.8	-6.5
CSCO	-0.5	23344890	0.9	-4.8	-7.7
FB	0.8	17320870	2.4	-8.2	-9.8
ZNGA	0.1	12708640	-0.1	-8.3	4.6
TWTR	1.4	10478790	6.1	-7.2	13.5
ORCL	0.8	10170910	2.4	-4.1	3.5
QCOM	2.4	9101301	1.0	-10.0	-2.0

Whether you do or don't use joins depends on your environment. In my case, I would use the search without joins, as each of the join subsearches returns about the same number of events as the simplified search, but the simplified search does away with the join overhead (subsearch startup and teardown, set operations, etc.).

Combine Sub searches with different time references

join

lookup

subsearch

Tech Talk Recap | Mastering Threat Hunting

Observability for AI Applications: Troubleshooting Latency

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

Are you a member of the Splunk Community?