I have a search that uses email data to find communications from/to a specific organization in the email logs and lists all the required attributes. This search needs to run at the end of every month and generate a report. The scheduled search takes roughly 25 hours to run.
index=cisco_esa sourcetype=cisco:esa:textmail mail_logs
| transaction internal_message_id maxspan=300s
| search recipient="*@abc.com" OR sender="*@abc.com"
| table _time internal_message_id sender recipient field2 field3
| outputlookup abc_esa_summary.csv
Can somebody please suggest some improvements to this search to make it run faster?
Hi @dm1,
transaction is a very slow command, you can replace it using stats, something like this:
index=cisco_esa sourcetype=cisco:esa:textmail mail_logs
| stats earliest(_time) AS _time values(sender) AS sender values(recipient) AS recipient values(field2) AS field2 values(field3) AS field3 BY internal_message_id
| search recipient="*@abc.com" OR sender="*@abc.com"
| table _time internal_message_id sender recipient field2 field3
| outputlookup abc_esa_summary.csv
If this solution isn't acceptable for you (maybe the maxspan limit is relevant!), you could schedule your search to run every night with a timeframe of 24 hours, saving the results in a summary index (with the collect command); then every month you can take all the results from there.
Ciao.
Giuseppe
Hi @dm1,
good for you, see you next time!
Ciao and happy splunking.
Giuseppe
P.S.: Karma Points are appreciated by all the contributors 😉
@gcusello Thanks! I just tested your search and it seems to give what I want, noticeably faster. Still testing it.
Can you please elaborate on the following? I have never tried this:
"saving results in a summary index (with the collect command) and every month you can take all the results."
Hi @dm1,
the approach is the following:
you modify your search, replacing the last row with "collect index=my_summary_index";
you schedule your search to run every night over the last 24 hours;
in this way you save the results in a summary index, and your monthly search can run on the summary index, giving you all the results in a table.
This way you have a very quick search that you can also run every day.
For more info, see https://docs.splunk.com/Documentation/Splunk/8.1.3/Knowledge/Usesummaryindexing
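Putting the pieces together, a minimal sketch of the two searches (the index name my_summary_index is an assumption from the example above; it must exist before you collect into it, and field names are taken from the original search). The nightly scheduled search, run over the last 24 hours (e.g. earliest=-24h@h latest=@h):

index=cisco_esa sourcetype=cisco:esa:textmail mail_logs
| stats earliest(_time) AS _time values(sender) AS sender values(recipient) AS recipient values(field2) AS field2 values(field3) AS field3 BY internal_message_id
| search recipient="*@abc.com" OR sender="*@abc.com"
| table _time internal_message_id sender recipient field2 field3
| collect index=my_summary_index

Then the monthly report search only has to read the already-summarized events:

index=my_summary_index
| table _time internal_message_id sender recipient field2 field3
| outputlookup abc_esa_summary.csv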
Ciao.
Giuseppe
P.S.: if this answer solves your need, please accept it for the other people of the Community; Karma Points are appreciated by all the contributors 😉
Any specific reason to use the transaction command?
Meanwhile, you can try adding a fields command before transaction, so only the needed fields are read from disk:
index=cisco_esa sourcetype=cisco:esa:textmail mail_logs
| fields internal_message_id sender recipient field2 field3
| transaction internal_message_id maxspan=300s
@kamlesh_vaghela Because the transaction command groups all messages that share the same unique message identifier, internal_message_id.