Search performance and optimization

Builder

When I search with the query below:

sourcetype=my_log UUID="3fc5e6c2-57b4-4e59-a3c0-8115f5ec74a1"

the result appears within one second, amazingly fast 🙂
The log data in question is more than one month old.

But when I search with this query:

sourcetype=my_log | transaction startswith=log_begin endswith=log_end | where UUID="3fc5e6c2-57b4-4e59-a3c0-8115f5ec74a1"

it takes 8 to 10 minutes to display the result 😞, which is extremely slow.

Now I have two questions:

  1. How can I improve the performance of this search with transaction?
  2. How do I stop the search after the first result? After it finds that result, Splunk keeps searching, even though I know there are no more results.

Re: Search performance and optimization

SplunkTrust

The first query is fast because Splunk can use index data to narrow down the events that need to be loaded.
The second query is slow because Splunk has to push every my_log event into the transaction command, which is itself slow because it can't handle large (in Splunk terms) amounts of data.

One way to speed things up is to narrow down the time range that needs to be searched.
Other ways depend on your data and what you do with it.
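
As a rough sketch of the time range idea, assuming the events of interest fall somewhere between 40 and 30 days ago (those two values are made up for illustration, adjust them to wherever your data actually sits), the original search could be restricted with the earliest and latest time modifiers:

```
sourcetype=my_log earliest=-40d@d latest=-30d@d
| transaction startswith=log_begin endswith=log_end
| where UUID="3fc5e6c2-57b4-4e59-a3c0-8115f5ec74a1"
```

transaction still sees every my_log event inside that window, so the narrower the window, the less work it has to do. Picking the same range in the time range picker should have the same effect.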

Re: Search performance and optimization

Motivator

Does the UUID field exist in all of the events you are interested in? As martinmueller said, the first search is fast because index data is used to narrow down your search results. The second search is very slow because it handles so much data: if I understand the search pipeline correctly, it takes the entire contents of `my_log` and tries to apply the `transaction` command to it before narrowing it down again with the `where` command. `transaction` is an intensive operation, so you'll want to narrow down your search results as much as possible before piping to it. Additionally, if there is a field that uniquely identifies log entries as part of a transaction, you should pass it in the optional field list of the `transaction` command, which makes it easier for `transaction` to group events together. Would a search like the following accomplish what you need?

sourcetype=my_log UUID="3fc5e6c2-57b4-4e59-a3c0-8115f5ec74a1" | transaction UUID startswith=log_begin endswith=log_end

Re: Search performance and optimization

Builder

No, the UUID appears only once in a transaction. I understand the reason, but 8 minutes is too long for searching the log. Is there any alternative, e.g. displaying x lines before the event with the UUID field and y lines after it?

Re: Search performance and optimization

Legend

8 minutes is understandable since you're telling Splunk to retrieve all events from disk before really doing anything.

You might want to look into the localize command: http://docs.splunk.com/Documentation/Splunk/5.0.1/SearchReference/Localize

Re: Search performance and optimization

Communicator

So I realize I'm way late to the party here, but what about using a subsearch? Assuming that there is a field in your log data (let's call it myTransactionID) that can be used to uniquely identify a transaction, you could do something like:

sourcetype=my_log [search sourcetype=my_log UUID="3fc5e6c2-57b4-4e59-a3c0-8115f5ec74a1" | dedup myTransactionID | fields myTransactionID] | transaction startswith=log_begin endswith=log_end

Essentially, what the subsearch does is find the initial log with the specified UUID value, obtain the value of myTransactionID, and then pass that as an argument to the main search so that it only returns events with the matching transaction ID. Normally subsearches aren't particularly fast, so as a general rule I wouldn't be suggesting them for optimization, but it will be far better than letting transaction operate on every single event with the my_log sourcetype.
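
A possible variant, still assuming the hypothetical myTransactionID field exists in every event of the transaction: the same field can also be given to transaction as its field list, and a trailing head 1 may let Splunk finish the search as soon as the first transaction has been returned instead of scanning on. I haven't tested this against your data, so treat it as a sketch:

```
sourcetype=my_log
    [ search sourcetype=my_log UUID="3fc5e6c2-57b4-4e59-a3c0-8115f5ec74a1"
      | dedup myTransactionID
      | fields myTransactionID ]
| transaction myTransactionID startswith=log_begin endswith=log_end
| head 1
```

The subsearch runs first and is replaced by its result, a condition of the form myTransactionID="<value>", so the outer search only loads events with that ID before transaction groups them.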
