Getting Data In

Is the search I'm giving my transaction too broad? I get results with an additional filter on the initial search, but no results if I remove the filter.

MCD
Engager

I'm trying to identify all log messages that are part of our application's startup for each host,source pair. I've identified the messages that mark the start and end of the startup.

Here's the part that's confusing me: if I filter the search to contain only those messages and then try to make transactions, it works fine and I get results. If I remove the filter, since I want all messages (not just the start/end messages), no results are returned.

Constrained search, which returns the expected three transactions:

index=my_index host=prod* ("init_startup_message" OR "complete_startup_message") 
| transaction host,source startswith="init_startup_message" endswith="complete_startup_message"

Unconstrained search, which returns no transactions (without the transaction command, the search returns ~700k results):

index=my_index host=prod* 
| transaction host,source startswith="init_startup_message" endswith="complete_startup_message"

What's going on?!

1 Solution

lguinn2
Legend

The transaction command needs to bring all the events into memory and examine them to create the transactions. This happens on the search head, and it is quite costly.

In fact, if the command runs out of resources, it will fail or give partial results.

Restricting the input is a key technique for using the transaction command. You can shorten the time range or filter the number of events (as you did).
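
As a rough illustration of restricting the input, the search from the question could be bounded with a time modifier, plus a maxspan on the transaction so that a candidate transaction is abandoned once it runs too long. The 4-hour window and the 10-minute span are assumed values, not taken from the question; choose ones that match how long a startup really takes:

```
index=my_index host=prod* earliest=-4h
| transaction host,source startswith="init_startup_message" endswith="complete_startup_message" maxspan=10m
```

This is only a sketch. Whether it is enough depends on how many events per host,source pair fall inside the window.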

Also, limits.conf sets explicit limits for transactions, quoted below:

[transactions]
maxopentxn = <integer>
* Specifies the maximum number of not yet closed transactions to keep in the open pool before starting to evict transactions.
* Defaults to 5000.

maxopenevents = <integer>
* Specifies the maximum number of events (which are) part of open transactions before transaction eviction starts happening, using LRU policy.
* Defaults to 100000.

You might also want to read about transactions in the Search Manual. The Search Manual also discusses using stats instead of transaction if possible. In my experience, the stats command will be orders of magnitude faster, but may be difficult to use in your specific case.
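
For what it's worth, here is one way a stats-style version might look. It is an untested sketch and assumes that start and end markers strictly alternate for each host,source pair and that no two events share a timestamp. The field names is_start, is_end, starts_seen, ends_seen, startup_began, startup_finished, event_count and completed are made up for the example:

```
index=my_index host=prod*
| eval is_start=if(searchmatch("init_startup_message"), 1, 0)
| eval is_end=if(searchmatch("complete_startup_message"), 1, 0)
| sort 0 host source _time
| streamstats sum(is_start) as starts_seen sum(is_end) as ends_seen by host source
| where starts_seen = ends_seen + 1 OR (is_end = 1 AND starts_seen = ends_seen)
| stats min(_time) as startup_began max(_time) as startup_finished count as event_count sum(is_end) as completed by host source starts_seen
| where completed = 1
```

The running counts number each startup per host,source pair. The first where keeps only events between a start marker and its matching end marker, and the final where drops startups that never reached the end marker. Note that sort 0 still has to order every event, so this is not free either.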

somesoni2
Revered Legend

The 700K events may be too much for the transaction command. It's quite an expensive command to run on a large data set. Is there a way you could add a filter to reduce the number of events that are passed to transaction?

MCD
Engager

Okay, that was my gut feeling based on the fact that the transaction works if I significantly restrict the input. It's a bummer that the output isn't more informative than "0 results found", though!
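
One thing that might make the failure more visible is the transaction command's keepevicted option, which keeps evicted transactions in the output and marks them with closed_txn=0 (closed ones get closed_txn=1). A sketch, assuming eviction really is the cause here:

```
index=my_index host=prod*
| transaction host,source startswith="init_startup_message" endswith="complete_startup_message" keepevicted=true
| stats count by closed_txn
```

If nearly everything comes back with closed_txn=0, the transactions are being evicted before the start and end markers can be paired.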

woodcock
Esteemed Legend

For each search, go to Activity -> Jobs, find the job, click Inspect, and look at the normalized search. Any difference between the two searches should tell most of the tale.

MCD
Engager

I'm not seeing any smoking guns--it looks like the only difference is the filter search going into the transaction. Here are the respective normalized searches:

litsearch index=report_public host=prod* source=*ngny* ( "oejs.Server:jetty" OR "Started SelectChannelConnector" ) | fields keepcolorder=t "*" "_bkt" "_cd" "_si" "host" "index" "linecount" "source" "sourcetype" "splunk_server" | eval _txn_starts_with=if(searchMatch("oejs.Server:jetty"), 1,0) | eval _txn_ends_with=if(searchMatch("Started SelectChannelConnector"), 1,0) | pretransaction host,source startswith="oejs.Server:jetty" endswith="Started SelectChannelConnector"

litsearch index=report_public host=prod* source=*ngny* | fields keepcolorder=t "*" "_bkt" "_cd" "_si" "host" "index" "linecount" "source" "sourcetype" "splunk_server" | eval _txn_starts_with=if(searchMatch("oejs.Server:jetty"), 1,0) | eval _txn_ends_with=if(searchMatch("Started SelectChannelConnector"), 1,0) | pretransaction host,source startswith="oejs.Server:jetty" endswith="Started SelectChannelConnector"