<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to cherry pick values from different sources? in All Apps and Add-ons</title>
    <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278284#M32733</link>
    <description>&lt;P&gt;@woodcock's introduction of &lt;CODE&gt;coalesce&lt;/CODE&gt; prompted me to look for an alternative statement of the problem.  Here is one clunky solution:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype="top" OR sourcetype=ps)
| eval pctCPU=sourcetype.pctCPU
| bucket _time span=1m
| stats values(pctCPU) as pctCPU latest(eval(if(sourcetype="ps",app,COMMAND))) as app
 by _time PID host
| eval pctCPU=replace(if(match(pctCPU,"top"),mvfilter(match(pctCPU,"top")),pctCPU),"[stop]+","")
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Effectively, this labels &lt;CODE&gt;pctCPU&lt;/CODE&gt; by source, filters the desired values by label according to the pseudo code, and strips the label last. (&lt;CODE&gt;(ps|top)&lt;/CODE&gt; would be more efficient, but [&lt;EM&gt;stop&lt;/EM&gt;]+ or [&lt;EM&gt;tops&lt;/EM&gt;]+ makes the better sound bite.)&lt;/P&gt;

&lt;P&gt;The code is inefficient, and &lt;CODE&gt;span=1m&lt;/CODE&gt; is a very crude approximation. (There should be better ways to tidy up a small stagger.)  I hope for something better, but I'll take this for the time being.&lt;/P&gt;</description>
    <pubDate>Wed, 21 Oct 2015 18:29:00 GMT</pubDate>
    <dc:creator>yuanliu</dc:creator>
    <dc:date>2015-10-21T18:29:00Z</dc:date>
    <item>
      <title>How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278280#M32729</link>
      <description>&lt;P&gt;Given &lt;CODE&gt;sourcetype=ps&lt;/CODE&gt; and &lt;CODE&gt;sourcetype=top&lt;/CODE&gt;, both of which contain &lt;CODE&gt;pctCPU&lt;/CODE&gt;, how do I use &lt;CODE&gt;pctCPU&lt;/CODE&gt; from &lt;EM&gt;top&lt;/EM&gt; only while still using fields unique to &lt;EM&gt;ps&lt;/EM&gt;? (Despite the identical field name, the values in these two sources represent very different things.)&lt;/P&gt;

&lt;P&gt;In Splunk Add-on for *Nix, for example, &lt;EM&gt;ps&lt;/EM&gt; and &lt;EM&gt;top&lt;/EM&gt; both contain fields &lt;CODE&gt;PID&lt;/CODE&gt;, &lt;CODE&gt;COMMAND&lt;/CODE&gt; and &lt;CODE&gt;pctCPU&lt;/CODE&gt;.  (They share some other field names of interest which I will not use in this example.)  As @Paolo Prigione pointed out many years ago, pctCPU in &lt;EM&gt;ps&lt;/EM&gt; is not useful for monitoring. (&lt;A href="https://answers.splunk.com/answers/27398/is-nix-sourcetype-ps-pctcpu-really-suitable-for-charting-ootb.html"&gt;https://answers.splunk.com/answers/27398/is-nix-sourcetype-ps-pctcpu-really-suitable-for-charting-ootb.html&lt;/A&gt;)  In the simplest use case, pctCPU in &lt;EM&gt;top&lt;/EM&gt; would give the instantaneous CPU usage of each process.  However, &lt;CODE&gt;COMMAND&lt;/CODE&gt; in &lt;EM&gt;top&lt;/EM&gt; only gives a simple program name, which is insufficient for my purposes. (In the old *nix for Splunk, &lt;EM&gt;ps&lt;/EM&gt;' &lt;CODE&gt;COMMAND&lt;/CODE&gt; includes full arguments; in Splunk Add-on for *Nix, &lt;EM&gt;ps&lt;/EM&gt; has a separate &lt;CODE&gt;ARGS&lt;/CODE&gt; field.)&lt;/P&gt;

&lt;P&gt;Conceivably I can associate &lt;EM&gt;top&lt;/EM&gt;'s &lt;CODE&gt;pctCPU&lt;/CODE&gt; values with &lt;EM&gt;ps&lt;/EM&gt;' &lt;CODE&gt;app&lt;/CODE&gt; (combination of &lt;CODE&gt;COMMAND&lt;/CODE&gt; and &lt;CODE&gt;ARGS&lt;/CODE&gt; in the new Splunk Add-on for *Nix) by joining a &lt;EM&gt;top&lt;/EM&gt; search with a &lt;EM&gt;ps&lt;/EM&gt; search.  This looks very wasteful, however.  So I thought I would tackle it by a simple search, then eliminate values from &lt;EM&gt;ps&lt;/EM&gt;.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype=ps OR sourcetype=top)
| bucket _time span=1m
| stats values(eval(if(sourcetype="ps",app,COMMAND))) as app values(eval(if(sourcetype="top",pctCPU,null()))) as pctCPU by _time PID
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(&lt;CODE&gt;bucket _time&lt;/CODE&gt; is necessary because, though launched with the same frequency, the two sources often have sub-minute stagger.)  This works for all processes output from &lt;EM&gt;ps&lt;/EM&gt;.  However, &lt;EM&gt;ps&lt;/EM&gt; and &lt;EM&gt;top&lt;/EM&gt; do not always survey the same processes, even when they are launched less than a second apart, so some processes captured by &lt;EM&gt;ps&lt;/EM&gt; will not show up in &lt;EM&gt;top&lt;/EM&gt; for the same time interval, and vice versa.  As a result, the above strategy gives null values when the process is in &lt;EM&gt;ps&lt;/EM&gt; only.  I want to fill these gaps with values from &lt;EM&gt;ps&lt;/EM&gt;, because for these extremely momentary processes, &lt;CODE&gt;pctCPU&lt;/CODE&gt; from &lt;EM&gt;ps&lt;/EM&gt; has the same significance as that from &lt;EM&gt;top&lt;/EM&gt;.&lt;/P&gt;

&lt;P&gt;In other words, I want to eliminate the value of &lt;CODE&gt;pctCPU&lt;/CODE&gt; from &lt;EM&gt;ps&lt;/EM&gt; when &lt;EM&gt;top&lt;/EM&gt; is available, but use the value from &lt;EM&gt;ps&lt;/EM&gt; when it is not. (The first term in the example, &lt;CODE&gt;values(eval(if(sourcetype="ps",app,COMMAND))) as app&lt;/CODE&gt;, is a much more sophisticated macro output in reality.  That output can cause gaps when a process is only in &lt;EM&gt;ps&lt;/EM&gt; but missing from &lt;EM&gt;top&lt;/EM&gt;.)&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 05:31:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278280#M32729</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T05:31:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278281#M32730</link>
      <description>&lt;P&gt;Like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype=ps OR sourcetype=top)
| bucket _time span=1m
| chart latest(pctCPU) over _time by sourcetype
| eval pctCPU=coalesce(top, ps)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;At this point, each value for _time (each minute) has a value for pctCPU that uses sourcetype &lt;CODE&gt;top&lt;/CODE&gt; in preference to sourcetype &lt;CODE&gt;ps&lt;/CODE&gt;.  Tack on the rest of what you need after that.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 16:01:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278281#M32730</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-10-21T16:01:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278282#M32731</link>
      <description>&lt;P&gt;@woodcock Thanks for the reply.  I need the result by PID so I can show consumption of each process over time.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 16:55:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278282#M32731</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T16:55:54Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278283#M32732</link>
      <description>&lt;P&gt;An alternative statement of the problem could be: How to ask Splunk to perform the following pseudo code:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;discard &lt;CODE&gt;pctCPU&lt;/CODE&gt; from &lt;CODE&gt;sourcetype=ps&lt;/CODE&gt; IF output from &lt;CODE&gt;sourcetype=top&lt;/CODE&gt; exists for that &lt;CODE&gt;PID&lt;/CODE&gt; in that sample period (every 5 minutes, but the timing wavers from period to period and from sourcetype to sourcetype)&lt;/LI&gt;
&lt;LI&gt;discard &lt;CODE&gt;COMMAND&lt;/CODE&gt; from &lt;CODE&gt;sourcetype=top&lt;/CODE&gt; IF output from &lt;CODE&gt;sourcetype=ps&lt;/CODE&gt; exists for that &lt;CODE&gt;PID&lt;/CODE&gt; in that sample period&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Wed, 21 Oct 2015 17:09:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278283#M32732</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T17:09:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278284#M32733</link>
      <description>&lt;P&gt;@woodcock's introduction of &lt;CODE&gt;coalesce&lt;/CODE&gt; prompted me to look for an alternative statement of the problem.  Here is one clunky solution:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype="top" OR sourcetype=ps)
| eval pctCPU=sourcetype.pctCPU
| bucket _time span=1m
| stats values(pctCPU) as pctCPU latest(eval(if(sourcetype="ps",app,COMMAND))) as app
 by _time PID host
| eval pctCPU=replace(if(match(pctCPU,"top"),mvfilter(match(pctCPU,"top")),pctCPU),"[stop]+","")
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Effectively, this labels &lt;CODE&gt;pctCPU&lt;/CODE&gt; by source, filters the desired values by label according to the pseudo code, and strips the label last. (&lt;CODE&gt;(ps|top)&lt;/CODE&gt; would be more efficient, but [&lt;EM&gt;stop&lt;/EM&gt;]+ or [&lt;EM&gt;tops&lt;/EM&gt;]+ makes the better sound bite.)&lt;/P&gt;
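
&lt;P&gt;The same pseudo code can probably also be expressed without string labels, by splitting the overloaded fields per sourcetype inside &lt;CODE&gt;stats&lt;/CODE&gt; and then applying &lt;CODE&gt;coalesce&lt;/CODE&gt;.  This is only a sketch: the intermediate names &lt;CODE&gt;top_pctCPU&lt;/CODE&gt;, &lt;CODE&gt;ps_pctCPU&lt;/CODE&gt;, &lt;CODE&gt;ps_app&lt;/CODE&gt; and &lt;CODE&gt;top_command&lt;/CODE&gt; are made up for illustration, and it assumes &lt;CODE&gt;app&lt;/CODE&gt; is already present on &lt;EM&gt;ps&lt;/EM&gt; events.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype=top OR sourcetype=ps)
| bucket _time span=1m
| stats latest(eval(if(sourcetype="top",pctCPU,null()))) as top_pctCPU
    latest(eval(if(sourcetype="ps",pctCPU,null()))) as ps_pctCPU
    latest(eval(if(sourcetype="ps",app,null()))) as ps_app
    latest(eval(if(sourcetype="top",COMMAND,null()))) as top_command
  by _time PID host
| eval pctCPU=coalesce(top_pctCPU, ps_pctCPU), app=coalesce(ps_app, top_command)
| fields _time PID host app pctCPU
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In this form &lt;CODE&gt;pctCPU&lt;/CODE&gt; is never concatenated with a label, so no regex clean-up is needed afterwards.  It still relies on the same &lt;CODE&gt;span=1m&lt;/CODE&gt; bucketing.&lt;/P&gt;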

&lt;P&gt;The code is inefficient, and &lt;CODE&gt;span=1m&lt;/CODE&gt; is a very crude approximation. (There should be better ways to tidy up a small stagger.)  I hope for something better, but I'll take this for the time being.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 18:29:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278284#M32733</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T18:29:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278285#M32734</link>
      <description>&lt;P&gt;OK,  then do this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype=ps OR sourcetype=top)
| bucket _time span=1m
| chart latest(pctCPU) over _time by sourcetype PID
| eval pctCPU=coalesce(top, ps)
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 21 Oct 2015 18:37:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278285#M32734</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-10-21T18:37:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278286#M32735</link>
      <description>&lt;P&gt;I mean, Splunk won't allow two groupings in &lt;CODE&gt;chart&lt;/CODE&gt; when &lt;CODE&gt;over&lt;/CODE&gt; is used.  I have already permuted through these.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 19:21:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278286#M32735</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T19:21:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278287#M32736</link>
      <description>&lt;P&gt;Have you considered the Nmon app? You may be able to accomplish what you're looking for and more vs the nix app.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 20:03:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278287#M32736</guid>
      <dc:creator>stmyers7941</dc:creator>
      <dc:date>2015-10-21T20:03:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278288#M32737</link>
      <description>&lt;P&gt;Thanks for the suggestion, @stmyers7941.  Though I am keenly aware of the pains induced by the *nix app, the option is not mine to pick.  That said, the general method could have other use cases when field name overload happens.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 20:16:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278288#M32737</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T20:16:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278289#M32738</link>
      <description>&lt;P&gt;The above works well as a solution to the stated generalised question.  But there's a big caveat as to its suitability for fixing the *nix app.  In GNU &lt;EM&gt;top&lt;/EM&gt;, the default (which is how &lt;EM&gt;top.sh&lt;/EM&gt; calls it) is to use &lt;EM&gt;Irix mode&lt;/EM&gt;, in which the percentage is calculated against a single core.  For this data to be useful, therefore, one must divide the number by the number of cores.  But then, I haven't determined how GNU &lt;EM&gt;ps&lt;/EM&gt; handles pcpu.  Is it calibrated against a single core or against all cores?  I'll post the outcome in &lt;A href="https://answers.splunk.com/answers/27398/is-nix-sourcetype-ps-pctcpu-really-suitable-for-charting-ootb.html"&gt;the other thread&lt;/A&gt;.  In any case, I would really like to see the *nix app fixed at the source, as I suggested in &lt;A href="https://answers.splunk.com/answers/117872/for-splunk-add-on-for-linux-why-do-we-need-both-ps-and-top.html"&gt;https://answers.splunk.com/answers/117872/for-splunk-add-on-for-linux-why-do-we-need-both-ps-and-top.html&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2015 23:58:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278289#M32738</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-21T23:58:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278290#M32739</link>
      <description>&lt;P&gt;OK, then try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os (sourcetype=ps OR sourcetype=top)
| bucket _time span=1m
| stats latest(pctCPU) AS pctCPU by sourcetype PID _time
| eval combo=sourcetype . ":" . PID
| xyseries _time combo pctCPU
| foreach top* [ eval "pctCPU&lt;&lt;MATCHSTR&gt;&gt;"=coalesce('&lt;&lt;FIELD&gt;&gt;', 'ps&lt;&lt;MATCHSTR&gt;&gt;') ]
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 22 Oct 2015 02:09:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278290#M32739</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-10-22T02:09:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278291#M32740</link>
      <description>&lt;P&gt;If you are going with this answer (note that I modified my solution yet again), then you should click "Accept".&lt;/P&gt;</description>
      <pubDate>Thu, 22 Oct 2015 13:09:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278291#M32740</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-10-22T13:09:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to cherry pick values from different sources?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278292#M32741</link>
      <description>&lt;P&gt;@woodcock I'm going with this.  After some investigation, I realise that field name overload is a cardinal sin that we shouldn't commit in the first place.  So I'm really trying to solve an artificial problem.  Still, your methods really expanded my Splunk vocabulary. (&lt;CODE&gt;xyseries&lt;/CODE&gt; is something I have wanted for some other problems.)&lt;/P&gt;</description>
      <pubDate>Mon, 26 Oct 2015 18:19:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-cherry-pick-values-from-different-sources/m-p/278292#M32741</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2015-10-26T18:19:05Z</dc:date>
    </item>
  </channel>
</rss>

