Splunk Search

What's the best way to improve the performance of my search?

greenwayb
Explorer

I have a report based on a data model, and I'd like to know how best to tune it and improve its performance.

The Report is:

| datamodel AuditIIStoSTM AuditModelSearchTXNs search | rename AuditModelSearchTXNs.* AS *

That doesn't give much away; the data model is built on this base search:

`index-selector` | join type=inner CID [search `index-selector` eventtype="AuditMarketResults" (date_hour>`hour-start` AND date_hour<`hour-end`) | where CEVENT="SENT_TO_MARKET" | table CID] | transaction CID maxspan=10m | where CEVENT="IIS-START" OR CEVENT="CTS_START" | rex max_match=0 CLABEL=(?\w+)

[index-selector is a macro, which basically translates to index=myspecialindex]

The idea behind the search above is to find a target event, identified by SENT_TO_MARKET, and use it to build a list of transactions based on its ID (CID). I'm not interested in any other events, i.e. those that never got to market.

The data model then extracts 8 items by regular expression; a typical example is:
securityCode='(?[A-Z]{0,6})

Here is an example of a result, when looking at 1 whole day's (yesterday's) data:

This search has completed and has returned 4,437 results by scanning 1,231,789 events in 90.819 seconds.

(SID: 1427296598.141) search.log
Execution costs
Duration (seconds)      Component   Invocations     Input count     Output count
    0.58    command.eval    124     4,674   4,674
    0.12    command.fields  123     1,231,789   1,231,789
    11.16   command.join    124     1,231,789   225,540
    0.25    command.pretransaction  248     676,620     676,620
    0.49    command.rename  496     18,696  18,696
    47.17   command.rex     1,116   42,066  42,066
    56.59   command.search  247     4,674   1,236,463
    0.33    command.search.index    123     -   -
    0.12    command.search.filter   124     -   -
    0.12    command.search.fieldalias   118     1,231,789   1,231,789
    0.12    command.search.calcfields   118     1,231,789   1,231,789
    0.00    command.search.index.usec_1_8   26  -   -
    0.00    command.search.index.usec_64_512    1   -   -
    0.00    command.search.index.usec_8_64  220     -   -
    26.68   command.search.typer    118     1,231,789   1,231,789
    16.62   command.search.kv   118     -   -
    9.19    command.search.rawdata  118     -   -
    2.15    command.search.tags     118     1,231,789   1,231,789
    0.12    command.search.lookups  118     1,231,789   1,231,789
    0.03    command.search.summary  123     -   -
    7.89    command.transaction     124     225,540     7,377
    0.22    command.where   124     7,354   4,674
    0.01    dispatch.check_disk_usage   9   -   -
    0.04    dispatch.createdSearchResultInfrastructure  1   -   -
    1.12    dispatch.evaluate   1   -   -
    1.07    dispatch.evaluate.join  1   -   -
    0.05    dispatch.evaluate.search    2   -   -
    0.01    dispatch.evaluate.rex   9   -   -
    0.00    dispatch.evaluate.rename    4   -   -
    0.00    dispatch.evaluate.eval  1   -   -
    0.00    dispatch.evaluate.transaction   1   -   -
    0.00    dispatch.evaluate.where     1   -   -
    12.38   dispatch.fetch  124     -   -
    0.00    dispatch.localSearch    1   -   -
    13.56   dispatch.parserThread   122     -   -
    0.04    dispatch.preview    36  -   -
    1.12    dispatch.readEventsInResults    1   -   -
    2.96    dispatch.results_combiner   124     -   -
    0.00    dispatch.stream.local   1   -   -
    56.47   dispatch.stream.remote  122     -   1,735,505,283
    34.34   dispatch.stream.remote.splunk1-2.management     74  -   1,045,830,596
    22.12   dispatch.stream.remote.splunk1-1.management     46  -   689,667,013
    0.00    dispatch.stream.remote.splunk2-1.management     1   -   3,837
    0.00    dispatch.stream.remote.splunk2-2.management     1   -   3,837
    4.57    dispatch.timeline   124     -   -
    0.19    dispatch.writeStatus    99  -   -
    0.09    startup.configuration   5   -   -
    5.09    startup.handoff     

masonmorales
Influencer

Scanning 1.2 million events is likely the cause of your performance issue. Can you add constraints to your base search that would reduce the data set? If not, you may want to consider scaling your architecture out to more powerful hardware on your indexers (faster disk, more memory, more cores, etc.)

Your distributed search times are quite long due to the number of events being scanned...

 56.47     dispatch.stream.remote     122     -     1,735,505,283
 34.34     dispatch.stream.remote.splunk1-2.management     74     -     1,045,830,596
 22.12     dispatch.stream.remote.splunk1-1.management     46     -     689,667,013
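
As a rough sketch of what "adding constraints" could look like for the subsearch in the question: the CEVENT test can move out of the `where` clause and into the search terms, so events are filtered as they are retrieved instead of afterwards. The field, eventtype, and macro names are copied from the question; writing the macros with backticks, and the assumption that CEVENT always appears in exactly this upper-case form, are mine.

```
`index-selector` eventtype="AuditMarketResults" CEVENT="SENT_TO_MARKET" date_hour>`hour-start` date_hour<`hour-end`
| fields CID
```

One difference to keep in mind: `search` compares field values case-insensitively, while `where` is case-sensitive, so the two forms only match the same events if the case of CEVENT is consistent in your data. Compare the job inspector output before and after to see whether it actually helps.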

rlough
Path Finder

Check out these two questions:
1. http://answers.splunk.com/answers/211761/is-there-an-alternative-to-subsearch-or-a-way-to-r.html
2. http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-joi...

I personally avoid using subsearches like they are the plague. They would take ages to parse, longer to run, and often I would be blocked from getting useful results because of subsearch's limit on events. I would highly suggest refactoring your query to stop using subsearch and instead use a command such as eventstats to really speed up your query.

I hope this helps!
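
To give an idea of what I mean, here is an untested sketch of the base search from the question with the join and subsearch swapped for eventstats. The field and macro names come from the question; `sent_count` is a made-up helper field, and I have left out the eventtype and date_hour conditions from the original subsearch, so it is not an exact equivalent.

```
`index-selector`
| eventstats count(eval(CEVENT="SENT_TO_MARKET")) AS sent_count BY CID
| where sent_count > 0
| fields - sent_count
| transaction CID maxspan=10m
```

The eventstats line tags every event with the number of SENT_TO_MARKET events that share its CID, and the `where` keeps only the CIDs that reached market. The rest of the original pipeline (the `where` on CEVENT and the `rex`) would follow unchanged. It is not guaranteed to be faster on every data set, so check it against the job inspector numbers.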

greenwayb
Explorer

Also, almost everything I do will tie back to the CID field. Is there a way I should be optimising the application to leverage this (for example, custom fields at index time)?
