I am searching through postfix email logs and trying to put all the relevant log events together for each email. I am also setting up the search in a view so that our email admin can just type in a search string and find an email.
The first search I came up with is as follows. This search worked well but was very slow for searches spanning 24 hours or more (we log about 500,000 emails a day).
<row>
<chart>
<title>Number of Messages over Time</title>
<searchTemplate>sourcetype=postfix_syslog | transaction keepevicted=true message_pid | search to=*$Username$* | timechart count by host</searchTemplate>
<option name="charting.chart">column</option>
<option name="charting.primaryAxisTitle.text">Timeline</option>
<option name="charting.secondaryAxisTitle.text">Messages</option>
<option name="charting.legend.placement">right</option>
</chart>
</row>
<row>
<event>
<title>Message Logs</title>
<searchTemplate>sourcetype=postfix_syslog | transaction keepevicted=true message_pid | search to=*$Username$* OR orig_to=*$Username$*</searchTemplate>
<option name="count">20</option>
<option name="showPager">true</option>
</event>
</row>
I then changed the search to the following. It runs a lot faster, but it no longer displays a progress bar, so our email admins keep clicking because they think it has locked up.
<row>
<chart>
<title>Number of Messages over Time</title>
<searchTemplate>sourcetype=postfix_syslog [ search sourcetype=postfix_syslog *$Username$* | dedup message_pid | fields message_pid ] | transaction keepevicted=true fields=message_pid maxspan=3m maxpause=1m | timechart count by host</searchTemplate>
<option name="charting.chart">column</option>
<option name="charting.primaryAxisTitle.text">Timeline</option>
<option name="charting.secondaryAxisTitle.text">Messages</option>
<option name="charting.legend.placement">right</option>
</chart>
</row>
<row>
<event>
<title>Message Logs</title>
<searchTemplate>sourcetype=postfix_syslog [ search sourcetype=postfix_syslog *$Username$* | dedup message_pid | fields message_pid ] | transaction keepevicted=true fields=message_pid maxspan=3m maxpause=1m</searchTemplate>
<option name="count">20</option>
<option name="showPager">true</option>
</event>
</row>
How do I get a progress bar back for the last search, and why did I lose it?
Anyone else working on postfix email logs?
---- Kirk
The progress bar went away because it only shows progress for the main search pipeline.
In the rewritten version, the subsearch is doing most of the work and the outer search is comparatively zippy, so the JobProgressIndicator only appears at the end for a very brief time.
You can probably confirm this by running them separately in the charting view, i.e. run
sourcetype=postfix_syslog *$Username$*
vs
sourcetype=postfix_syslog (message_pid=<pidA> OR message_pid=<pidB> OR message_pid=<pidC> ...) | transaction keepevicted=true
(It's the leading and trailing wildcards on Username that make it expensive, because the search has to get all of the events off of disk and then scan them in memory.)
I don't think there's any way to get the main job to reflect the progress of the subsearch job, and that JobProgressIndicator definitely only responds to the main job.
One quite different solution you might try:
a) extract the username field if it isn't already.
b) create a summary index search that runs every 10 minutes or so and maps usernames to message_pid values.
sourcetype=postfix_syslog | stats count by username, message_pid
c) then you can search for this
sourcetype=postfix_syslog [ search index=summary username="*$Username$*" | dedup message_pid | fields message_pid ] | transaction keepevicted=true fields=message_pid maxspan=3m maxpause=1m | timechart count by host
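For step b, the scheduled search could be set up roughly like this in savedsearches.conf. This is only a sketch: the stanza name is made up, it assumes the username field from step a is already extracted, and it assumes the results go to the default index named summary. Adjust the time window if your events arrive late.

```ini
# Hypothetical stanza name; pick whatever fits your naming scheme.
[postfix_username_to_pid]
# Map each username to the message_pid values seen in the window.
# Assumes "username" is an extracted field (step a).
search = sourcetype=postfix_syslog | stats count by username, message_pid
# Run every 10 minutes over the previous 10 minutes, snapped to the minute.
enableSched = 1
cron_schedule = */10 * * * *
dispatch.earliest_time = -10m@m
dispatch.latest_time = @m
# Write the results into the summary index (assumed to be named "summary").
action.summary_index = 1
action.summary_index._name = summary
```

The same thing can be set up from the saved search's scheduling options in the UI by enabling summary indexing, if you'd rather not edit the conf file directly.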
Of course, removing the asterisks around Username will probably make this problem go away as well...
Can you tell us more about why you're using transaction at all? Do the message_pid values repeat a lot? Seems like "sourcetype=postfix_syslog | dedup message_pid | timechart count by host" or just "sourcetype=postfix_syslog | timechart dc(message_pid) by host" might work and they'd be a lot simpler...