Splunk Search

How to optimize an automated correlation search for process monitoring?

jrnortonjr
New Member

I am using a correlation search to schedule the delivery of application performance metrics for running processes on remote hosts. A good enough starting point is whether the host has reported within a given amount of time, either through the winhostmon:process stanza on Windows or through ps on *nix boxes.

I have been tasked with creating a template to monitor processes in our enterprise. We want one search that we can use with any process and any OS and we want to generate an event when the process is broken and when the process returns to "good". My attempt follows. Please recommend a better way to accomplish this or how I can solve the existing problems with my solution.

I have managed to cobble together a query that gives results good enough for us to use. However, there is one problem: one column can become ridiculously large.

earliest=-10m ((sourcetype=WinHostMon source=Process) OR sourcetype=ps)
| rex field=_raw "CommandLine=(?<CmdLine>.+[^\n])"
| eval full_command=coalesce(CmdLine,app), Process=coalesce(Name,process_name)
| search [| inputlookup Customer_test_processes.csv]
| stats latest(_time) AS last_reported by host Process full_command source
| eval age = now() - last_reported
| search age > 300
| sort - age
| convert ctime(last_reported)
| eval timestamp=now()
| convert ctime(timestamp)
| eval Category="Process"
| eval Severity="INFO"
| eval Value="5"
| eval message="The Splunk Process heartbeat for ".host." ".Process." is Unreachable. Last reported at: ".last_reported
| eval event_title=host."-".Process."-Heartbeat"
| table timestamp host Category Process Severity full_command Value source message event_title

The lookup table read by inputlookup looks like this:

host,Process,full_command
XXXYYYZZZ,mdrv,custom_commandline
XXXYYYZZZ,SiteScope,-service
XXXYYYZZZ,splunkd,*
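
For anyone following along, the subsearch in the query above turns each row of this table into search terms. A quick way to inspect that expansion is to run the lookup through the format command. This is only a sketch, and it assumes the CSV has been uploaded as a lookup table file that is visible from the app where the search runs:

```
| inputlookup Customer_test_processes.csv
| format
```

The single result should carry a search field in which the rows are ORed together, each row being an AND of its host, Process and full_command terms. The * in the splunkd row is treated as a wildcard there, so any command line matches for that process.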


The problem stems from the resulting "full_command" field. I would prefer that the output of this field not be so big. In some cases, for example *nix boxes running Apache Tomcat or JBoss servers, the process to monitor would be java, but the full command would be an ungodly number of classpath entries and Java flags.

I would like to know if there is a way to get the "full_command" as it appears in the lookup table to marry up with the corresponding search result that is returned from the query as opposed to the entire command.


valiquet
Contributor

Add this at the end of your search:
| lookup mylookup host AS host, Process AS Process OUTPUT full_command AS short_command
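
To try the idea in isolation, here is a minimal sketch that fakes one result row with makeresults and then swaps the long command for the short one from the table. Several things here are assumptions rather than part of the original posts: the long command line is made up, the lookup is referenced directly by its CSV file name (Customer_test_processes.csv) instead of a lookup definition called mylookup, and the coalesce step that overwrites full_command is just one way to use the new field.

```
| makeresults
| eval host="XXXYYYZZZ", Process="mdrv"
| eval full_command="mdrv custom_commandline plus a very long, hypothetical list of flags"
| lookup Customer_test_processes.csv host Process OUTPUT full_command AS short_command
| eval full_command=coalesce(short_command, full_command)
| table host Process full_command
```

If the host and Process values match a row in the table, full_command should come back as the short value from the CSV (custom_commandline for this row). When no row matches, coalesce leaves the original command in place. Matching against a CSV file is case-sensitive by default, so the host and Process values need to be written the same way in the events and in the table.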
