Splunk Search

Tstats query and dashboard optimization

Justinboucher0
Path Finder

I'm looking for assistance in optimizing a dashboard where we use tstats as a base search. Our Splunk systems have more than enough resources and there hasn't been any signs of degraded performance on them either. However, this dashboard takes an average of 237 seconds to finish all the queries. We have optimized the queries as much as possible, but this time has not diminished. We really need some help getting this time down. Based on our customer's requirements we also needed to have just about every field in the data filterable as well. Currently there are only about 134K events per hour, but we know that this will increase in the future, so we need to make this dashboard as quick as possible.

I'm going to post the code in its entirety so maybe someone can find something I missed when optimizing this dashboard, but before I do here are some things that we have already done:

  • Accelerated data models with 90 day summarization
  • Base search broken into two parts
  • Disabled Autorun of the dashboard
  • Optimized post-process searches

    <form stylesheet="arm_tables.css" script="arm_tables.js">
        <label>AHLTA Response Measurements</label>
        <description>*NOTE: Some ORD values, such as Local Workflow Transactions, do not have ORD Values defined.
        </description>
        <search id="transactionsBase">
            <query>| tstats prestats=f local=f summariesonly=t count AS transaction_count from datamodel=arm.transactions
                where transactions.region="$region$"
                groupby transactions.sitename transactions.response_time transactions.start_time
                transactions.end_time transactions.objective transactions.service transactions.username
                transactions.transaction_name
                transactions.workstation_name transactions.transaction_status transactions.threshold
                transactions.transaction_mode _time | rename transactions.* AS *
            </query>
            <earliest>$search_time.earliest$</earliest>
            <latest>$search_time.latest$</latest>
            <sampleRatio>1</sampleRatio>
        </search>
        <search base="transactionsBase" id="filteredSearch">
            <query>| search sitename IN ( $sitename$ ) service IN (
                $service$ ) transaction_status IN ( $transaction_status$ ) transaction_name IN (
                $transaction_name$ ) username IN ( $username$ ) workstation_name IN ( $workstation_name$)
                transaction_mode IN ( $transaction_mode$ ) threshold IN ( $threshold$ ) $bool_workflow$
            </query>
        </search>
        <fieldset submitButton="true" autoRun="false">
            <input type="time" token="search_time" searchWhenChanged="false">
                <label></label>
                <default>
                    <earliest>-60m@m</earliest>
                    <latest>now</latest>
                </default>
            </input>
            <input type="dropdown" token="region" searchWhenChanged="false">
                <label>Region</label>
                <choice value="*">ALL</choice>
                <default>*</default>
                <initialValue>*</initialValue>
                <fieldForLabel>region</fieldForLabel>
                <fieldForValue>region</fieldForValue>
                <search>
                    <query>| `DHAIO_LIST_MTF_REGIONS`</query>
                    <earliest>-24h@h</earliest>
                    <latest>now</latest>
                </search>
            </input>
            <input type="multiselect" token="sitename" searchWhenChanged="false">
                <label>Sitename</label>
                <fieldForLabel>sitename</fieldForLabel>
                <fieldForValue>sitename</fieldForValue>
                <search>
                    <query>| `DHAIO_LIST_MTF_REGION_SITES($region$)`</query>
                    <earliest>-24h@h</earliest>
                    <latest>now</latest>
                </search>
                <delimiter>,</delimiter>
                <valuePrefix>"</valuePrefix>
                <valueSuffix>"</valueSuffix>
            </input>
            <input type="multiselect" token="service">
                <label>Service</label>
                <fieldForLabel>service</fieldForLabel>
                <fieldForValue>service</fieldForValue>
                <search>
                    <query>| `LIST_SERVICE`</query>
                    <earliest>-24h@h</earliest>
                    <latest>now</latest>
                </search>
                <initialValue>Army,Air Force,Navy,VA/DOD</initialValue>
                <delimiter>,</delimiter>
                <valuePrefix>"</valuePrefix>
                <valueSuffix>"</valueSuffix>
            </input>
            <input type="dropdown" token="bool_workflow">
                <label>Show Local Workflow Transactions?</label>
                <choice value="*">Yes</choice>
                <choice value="| search transaction_name!=Workflow*">No</choice>
                <default>*</default>
                <initialValue>*</initialValue>
            </input>
            <input type="multiselect" token="transaction_name">
                <label>Transaction Name</label>
                <choice value="*">ALL</choice>
                <default>*</default>
                <initialValue>*</initialValue>
                <fieldForLabel>transaction_name</fieldForLabel>
                <fieldForValue>transaction_name</fieldForValue>
                <search base="transactionsBase">
                    <query>| search transaction_mode
                        IN ( $transaction_mode$ ) $bool_workflow$ | stats count by transaction_name | fields - count
                    </query>
                </search>
                <delimiter>,</delimiter>
                <valuePrefix>"</valuePrefix>
                <valueSuffix>"</valueSuffix>
            </input>
            <input type="multiselect" token="username" searchWhenChanged="true">
                <label>Username</label>
                <choice value="*">ALL</choice>
                <default>*</default>
                <initialValue>*</initialValue>
                <valuePrefix>"</valuePrefix>
                <valueSuffix>"</valueSuffix>
                <delimiter>,</delimiter>
                <fieldForLabel>username</fieldForLabel>
                <fieldForValue>username</fieldForValue>
                <search base="transactionsBase">
                    <query>| stats count by
                        username | fields - count
                    </query>
                </search>
                <prefix>"</prefix>
                <suffix>"</suffix>
            </input>
            <input type="multiselect" token="workstation_name" searchWhenChanged="true">
                <label>Workstation Name</label>
                <choice value="*">ALL</choice>
                <default>*</default>
                <initialValue>*</initialValue>
                <valuePrefix>"</valuePrefix>
                <valueSuffix>"</valueSuffix>
                <delimiter>,</delimiter>
                <fieldForLabel>workstation_name</fieldForLabel>
                <fieldForValue>workstation_name</fieldForValue>
                <search base="transactionsBase">
                    <query>| stats count by
                        workstation_name | fields - count
                    </query>
                </search>
            </input>
            <input type="multiselect" token="threshold">
                <label>ORD Value:</label>
                <choice value="1">1</choice>
                <choice value="5">5</choice>
                <choice value="6">6</choice>
                <choice value="25">25</choice>
                <choice value="100">Not Defined*</choice>
                <default>1,5,6,25,100</default>
                <initialValue>1,5,6,25,100</initialValue>
                <delimiter>,</delimiter>
            </input>
            <input type="multiselect" token="transaction_status">
                <label>Transaction Status:</label>
                <choice value="ARM_GOOD">Success</choice>
                <choice value="ARM_FAIL">Failure</choice>
                <default>ARM_GOOD,ARM_FAIL</default>
                <initialValue>ARM_GOOD,ARM_FAIL</initialValue>
                <delimiter>,</delimiter>
            </input>
            <input type="multiselect" token="transaction_mode">
                <label>Show Transaction Modes in:</label>
                <choice value="NM1">Normal</choice>
                <choice value="FM1">Failure</choice>
                <choice value="UK1">Unknown</choice>
                <default>NM1,FM1,UK1</default>
                <initialValue>NM1,FM1,UK1</initialValue>
                <delimiter>,</delimiter>
            </input>
        </fieldset>
        <row>
            <panel>
                <single>
                    <title>Total Transactions</title>
                    <search base="filteredSearch">
                        <query>| stats count(transaction_count)
                        </query>
                    </search>
                    <option name="drilldown">none</option>
                    <option name="numberPrecision">0</option>
                    <option name="rangeColors">["0x65a637","0xf7bc38","0xd93f3c"]</option>
                    <option name="rangeValues">[5,10]</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="trellis.enabled">0</option>
                    <option name="trellis.scales.shared">1</option>
                    <option name="trellis.size">small</option>
                    <option name="trellis.splitBy">transaction_name</option>
                    <option name="useColors">0</option>
                </single>
                <single>
                    <title>Distinct Workstations</title>
                    <search base="filteredSearch">
                        <query>| stats dc(workstation_name)
                        </query>
                    </search>
                    <option name="drilldown">none</option>
                    <option name="numberPrecision">0</option>
                    <option name="rangeColors">["0x65a637","0xf7bc38","0xd93f3c"]</option>
                    <option name="rangeValues">[5,10]</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="useColors">0</option>
                </single>
                <single>
                    <title>Distinct Users</title>
                    <search base="filteredSearch">
                        <query>| stats dc(username)
                        </query>
                    </search>
                    <option name="drilldown">none</option>
                    <option name="numberPrecision">0</option>
                    <option name="rangeColors">["0x65a637","0xf7bc38","0xd93f3c"]</option>
                    <option name="rangeValues">[5,10]</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="useColors">0</option>
                </single>
                <single>
                    <title>Successful Transactions</title>
                    <search base="filteredSearch">
                        <query>| where transaction_status=="ARM_GOOD" | stats count</query>
                    </search>
                    <option name="colorBy">trend</option>
                    <option name="colorMode">none</option>
                    <option name="drilldown">none</option>
                    <option name="numberPrecision">0</option>
                    <option name="rangeColors">["0x65a637","0x6db7c6","0xf7bc38","0xf58f39","0xd93f3c"]</option>
                    <option name="rangeValues">[0,30,70,100]</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="showSparkline">1</option>
                    <option name="showTrendIndicator">1</option>
                    <option name="trellis.enabled">0</option>
                    <option name="trellis.scales.shared">1</option>
                    <option name="trellis.size">large</option>
                    <option name="trendColorInterpretation">standard</option>
                    <option name="trendDisplayMode">absolute</option>
                    <option name="trendInterval">auto</option>
                    <option name="unitPosition">after</option>
                    <option name="useColors">0</option>
                    <option name="useThousandSeparators">1</option>
                </single>
                <single>
                    <title>Failed Transactions</title>
                    <search base="filteredSearch">
                        <query>| where transaction_status=="ARM_FAIL" | stats count</query>
                    </search>
                    <option name="colorBy">trend</option>
                    <option name="drilldown">none</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="trendColorInterpretation">inverse</option>
                    <option name="trendInterval">auto</option>
                    <option name="useColors">0</option>
                </single>
            </panel>
        </row>
        <row>
            <panel>
                <title>Transactions by Response Time (secs)</title>
                <input type="link" token="needed">
                    <label></label>
                    <choice value="rt_overview">Overview</choice>
                    <choice value="rt_ord">Split by ORD</choice>
                    <default>rt_overview</default>
                    <change>
                        <condition value="rt_overview">
                            <set token="showOverview">true</set>
                            <unset token="showORD"></unset>
                        </condition>
                        <condition value="rt_ord">
                            <unset token="showOverview"></unset>
                            <set token="showORD">true</set>
                        </condition>
                    </change>
                </input>
                <chart depends="$showOverview$">
                    <search base="filteredSearch">
                        <query>| where threshold!=100 | timechart count as transactions avg(response_time) AS response_time
                            | eval response_time=round(response_time,2)
                            | fields _time transactions response_time
                        </query>
                    </search>
                    <!-- Secondary search that drives the annotations -->
                    <option name="charting.axisLabelsX.majorLabelStyle.overflowMode">ellipsisNone</option>
                    <option name="charting.axisLabelsX.majorLabelStyle.rotation">0</option>
                    <option name="charting.axisTitleX.visibility">visible</option>
                    <option name="charting.axisTitleY.visibility">visible</option>
                    <option name="charting.axisTitleY2.visibility">visible</option>
                    <option name="charting.axisX.abbreviation">none</option>
                    <option name="charting.axisX.scale">linear</option>
                    <option name="charting.axisY.abbreviation">none</option>
                    <option name="charting.axisY.scale">linear</option>
                    <option name="charting.axisY2.abbreviation">none</option>
                    <option name="charting.axisY2.enabled">1</option>
                    <option name="charting.axisY2.scale">inherit</option>
                    <option name="charting.chart">column</option>
                    <option name="charting.chart.bubbleMaximumSize">50</option>
                    <option name="charting.chart.bubbleMinimumSize">10</option>
                    <option name="charting.chart.bubbleSizeBy">area</option>
                    <option name="charting.chart.nullValueMode">zero</option>
                    <option name="charting.chart.overlayFields">response_time</option>
                    <option name="charting.chart.showDataLabels">none</option>
                    <option name="charting.chart.sliceCollapsingThreshold">0.01</option>
                    <option name="charting.chart.stackMode">default</option>
                    <option name="charting.chart.style">shiny</option>
                    <option name="charting.drilldown">none</option>
                    <option name="charting.layout.splitSeries">0</option>
                    <option name="charting.layout.splitSeries.allowIndependentYRanges">0</option>
                    <option name="charting.legend.labelStyle.overflowMode">ellipsisMiddle</option>
                    <option name="charting.legend.mode">standard</option>
                    <option name="charting.legend.placement">right</option>
                    <option name="charting.lineWidth">2</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="trellis.enabled">0</option>
                    <option name="trellis.scales.shared">1</option>
                    <option name="trellis.size">medium</option>
                </chart>
                <chart depends="$showORD$">
                    <search base="filteredSearch">
                        <query>| eval threshold=if(threshold==100,"Not Defined",threshold)
                            | timechart count as transactions avg(response_time) AS response_time by threshold
                        </query>
                    </search>
                    <option name="charting.axisLabelsX.majorLabelStyle.overflowMode">ellipsisNone</option>
                    <option name="charting.axisLabelsX.majorLabelStyle.rotation">0</option>
                    <option name="charting.axisTitleX.visibility">collapsed</option>
                    <option name="charting.axisTitleY.visibility">collapsed</option>
                    <option name="charting.axisTitleY2.visibility">visible</option>
                    <option name="charting.axisX.abbreviation">none</option>
                    <option name="charting.axisX.scale">linear</option>
                    <option name="charting.axisY.abbreviation">none</option>
                    <option name="charting.axisY.scale">linear</option>
                    <option name="charting.axisY2.abbreviation">none</option>
                    <option name="charting.axisY2.enabled">0</option>
                    <option name="charting.axisY2.scale">inherit</option>
                    <option name="charting.chart">column</option>
                    <option name="charting.chart.bubbleMaximumSize">50</option>
                    <option name="charting.chart.bubbleMinimumSize">10</option>
                    <option name="charting.chart.bubbleSizeBy">area</option>
                    <option name="charting.chart.nullValueMode">gaps</option>
                    <option name="charting.chart.overlayFields">response_time</option>
                    <option name="charting.chart.showDataLabels">none</option>
                    <option name="charting.chart.sliceCollapsingThreshold">0.01</option>
                    <option name="charting.chart.stackMode">default</option>
                    <option name="charting.chart.style">shiny</option>
                    <option name="charting.drilldown">none</option>
                    <option name="charting.layout.splitSeries">0</option>
                    <option name="charting.layout.splitSeries.allowIndependentYRanges">0</option>
                    <option name="charting.legend.labelStyle.overflowMode">ellipsisMiddle</option>
                    <option name="charting.legend.mode">standard</option>
                    <option name="charting.legend.placement">none</option>
                    <option name="charting.lineWidth">2</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="trellis.enabled">1</option>
                    <option name="trellis.scales.shared">0</option>
                    <option name="trellis.size">medium</option>
                </chart>
            </panel>
        </row>
        <row>
            <panel>
                <title>Transaction Responses by ORD</title>
                <table id="response_time_highlight_cell">
                    <search base="filteredSearch">
                        <query>| search service IN (
                            $service$ ) transaction_status IN ( $transaction_status$ ) transaction_name IN (
                            $transaction_name$ ) username IN ( $username$ ) workstation_name IN ( $workstation_name$)
                            transaction_mode IN ( $transaction_mode$ ) threshold IN ( $threshold$ ) $bool_workflow$ | stats
                            count avg(response_time) AS response_time BY transaction_name objective threshold
                            | where threshold!=100
                            | eval threshold=round(threshold,2)
                            | eval objective=round(objective,2)
                            | eval response_time=round(response_time,2)
                            | fields transaction_name count response_time objective threshold
                            | eval "Average Response Time SLA"=case(response_time>=threshold,"severe",
                            response_time<threshold AND response_time>=objective,"elevated",objective==100 AND
                            threshold==100,"info",true(),"low")
                            | rename response_time AS "Average Response Time (secs)" transaction_name AS "Transaction Name"
                            objective as Objective threshold as Threshold count as "Transaction Count"
                            | sort +transaction_name
                        </query>
                    </search>
                    <option name="count">20</option>
                    <option name="dataOverlayMode">none</option>
                    <option name="drilldown">none</option>
                    <option name="percentagesRow">false</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="rowNumbers">true</option>
                    <option name="totalsRow">false</option>
                    <option name="wrap">true</option>
                </table>
            </panel>
        </row>
        <row>
            <panel>
                <title>Transactions by Operating Mode</title>
                <chart>
                    <search base="filteredSearch">
                        <query>| eval mode_name=substr(transaction_name, len(transaction_name)-2, 3)
                            | eval modes=case(mode_name=="FM1","Failure Mode",mode_name=="NM1","Normal
                            Mode",mode_name=="UK1", "Unknown")
                            | stats count by modes
                        </query>
                    </search>
                    <option name="charting.axisLabelsX.majorLabelStyle.overflowMode">ellipsisNone</option>
                    <option name="charting.axisLabelsX.majorLabelStyle.rotation">0</option>
                    <option name="charting.axisTitleX.visibility">visible</option>
                    <option name="charting.axisTitleY.visibility">visible</option>
                    <option name="charting.axisTitleY2.visibility">visible</option>
                    <option name="charting.axisX.abbreviation">none</option>
                    <option name="charting.axisX.scale">linear</option>
                    <option name="charting.axisY.abbreviation">none</option>
                    <option name="charting.axisY.scale">linear</option>
                    <option name="charting.axisY2.abbreviation">none</option>
                    <option name="charting.axisY2.enabled">0</option>
                    <option name="charting.axisY2.scale">inherit</option>
                    <option name="charting.chart">pie</option>
                    <option name="charting.chart.bubbleMaximumSize">50</option>
                    <option name="charting.chart.bubbleMinimumSize">10</option>
                    <option name="charting.chart.bubbleSizeBy">area</option>
                    <option name="charting.chart.nullValueMode">gaps</option>
                    <option name="charting.chart.showDataLabels">none</option>
                    <option name="charting.chart.sliceCollapsingThreshold">0.01</option>
                    <option name="charting.chart.stackMode">default</option>
                    <option name="charting.chart.style">shiny</option>
                    <option name="charting.drilldown">none</option>
                    <option name="charting.layout.splitSeries">0</option>
                    <option name="charting.layout.splitSeries.allowIndependentYRanges">0</option>
                    <option name="charting.legend.labelStyle.overflowMode">ellipsisMiddle</option>
                    <option name="charting.legend.mode">standard</option>
                    <option name="charting.legend.placement">right</option>
                    <option name="charting.lineWidth">2</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="trellis.enabled">0</option>
                    <option name="trellis.scales.shared">1</option>
                    <option name="trellis.size">medium</option>
                </chart>
            </panel>
            <panel>
                <title>Users in Failure Mode</title>
                <table>
                    <search base="filteredSearch">
                        <query>| eval mode_name=substr(transaction_name, len(transaction_name)-2, 3)
                            | eval modes=case(mode_name=="FM1","Failure Mode",mode_name=="NM1","Normal
                            Mode",mode_name=="UK1", "Unknown")
                            | where mode_name=="FM1"
                            | stats count as "transaction count" by username workstation_name
                            | rename workstation_name as "workstation name"
                        </query>
                    </search>
                    <option name="count">20</option>
                    <option name="dataOverlayMode">none</option>
                    <option name="drilldown">none</option>
                    <option name="percentagesRow">false</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="rowNumbers">true</option>
                    <option name="totalsRow">false</option>
                    <option name="wrap">true</option>
                </table>
            </panel>
        </row>
        <row>
            <panel>
                <title>Transaction Workflow by User</title>
                <viz type="timeline_app.timeline">
                    <search base="filteredSearch">
                        <query>| eval start=strptime(start_time,"%Y-%m-%d %H:%M:%S.%N")
                            | eval end=strptime(end_time,"%Y-%m-%d %H:%M:%S.%N")
                            | eval duration = tostring((end - start), "duration")
                            | table start transaction_name username duration
                            | reverse
                        </query>
                    </search>
                    <option name="drilldown">none</option>
                    <option name="height">375</option>
                    <option name="refresh.display">progressbar</option>
                    <option name="timeline_app.timeline.axisTimeFormat">SECONDS</option>
                    <option name="timeline_app.timeline.colorMode">categorical</option>
                    <option name="timeline_app.timeline.maxColor">#DA5C5C</option>
                    <option name="timeline_app.timeline.minColor">#FFE8E8</option>
                    <option name="timeline_app.timeline.numOfBins">9</option>
                    <option name="timeline_app.timeline.tooltipTimeFormat">SUBSECONDS</option>
                    <option name="timeline_app.timeline.useColors">1</option>
                    <option name="trellis.enabled">0</option>
                    <option name="trellis.scales.shared">1</option>
                    <option name="trellis.size">medium</option>
                </viz>
            </panel>
        </row>
    </form>
    

Thank you in advance for any assistance you can provide.

0 Karma

DEAD_BEEF
Builder

Why not just save these searches as reports and then populate the dashboard with the report? Schedule it to run once every 10 minutes, that way the data is at worst "10 minutes old".

0 Karma
Get Updates on the Splunk Community!

New in Observability - Improvements to Custom Metrics SLOs, Log Observer Connect & ...

The latest enhancements to the Splunk observability portfolio deliver improved SLO management accuracy, better ...

Improve Data Pipelines Using Splunk Data Management

  Register Now   This Tech Talk will explore the pipeline management offerings Edge Processor and Ingest ...

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud?

Register Join this Tech Talk to learn how unique features like Service Centric Views, Tag Spotlight, and ...