<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sorting inquiry in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423699#M121589</link>
    <description>&lt;P&gt;@richgalloway @niketnilay Thank you so much for your advice!! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; It was really helpful! &lt;/P&gt;

&lt;P&gt;Also, @Sukisen1981 thank you so much as well! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 02 Aug 2019 09:18:59 GMT</pubDate>
    <dc:creator>chinkeeparco</dc:creator>
    <dc:date>2019-08-02T09:18:59Z</dc:date>
    <item>
      <title>Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423692#M121582</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I need help to further sort the following data. In the sample data in the screenshot, I want to group the password-related entries together.&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/7436i875B3EFFF749E1D9/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;The output should look like&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;problem abstract&lt;/STRONG&gt;         &lt;STRONG&gt;count&lt;/STRONG&gt;&lt;BR /&gt;
SAP Password Reset                  27&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 06:11:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423692#M121582</guid>
      <dc:creator>chinkeeparco</dc:creator>
      <dc:date>2019-08-01T06:11:09Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423693#M121583</link>
      <description>&lt;P&gt;You don't have a field called password to sort on.&lt;BR /&gt;
Try&lt;BR /&gt;
| sort problem_abstract&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 12:35:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423693#M121583</guid>
      <dc:creator>chinmoya</dc:creator>
      <dc:date>2019-08-01T12:35:12Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423694#M121584</link>
      <description>&lt;P&gt;You do that by normalizing the data.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | eval problem_abstract=case(problem_abstract="SAP Password Reset", problem_abstract, problem_abstract="Reset SAP Password", "SAP Password Reset", problem_abstract="SAP Reset Password", "SAP Password Reset", 1=1, problem_abstract) | ...
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;but that means having a case entry for each possible problem.&lt;/P&gt;
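
&lt;P&gt;One way to keep that mapping out of the search itself is a lookup. The following is only a sketch: the lookup file &lt;CODE&gt;problem_abstract_map.csv&lt;/CODE&gt; and its columns &lt;CODE&gt;raw_abstract&lt;/CODE&gt; and &lt;CODE&gt;normalized_abstract&lt;/CODE&gt; are hypothetical names that you would have to create yourself, and the final &lt;CODE&gt;stats&lt;/CODE&gt; assumes the events have not been aggregated yet. Values without a row in the lookup are left unchanged.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | lookup problem_abstract_map.csv raw_abstract AS problem_abstract OUTPUT normalized_abstract
| eval problem_abstract=coalesce(normalized_abstract, problem_abstract)
| stats count by problem_abstract
&lt;/CODE&gt;&lt;/PRE&gt;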

&lt;P&gt;Letting Splunk do that for you may work better, depending on your data.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | cluster showcount=true countfield=count field=problem_abstract match=termset | sort - count | head 10 | table problem_abstract count
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 01 Aug 2019 12:39:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423694#M121584</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2019-08-01T12:39:31Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423695#M121585</link>
      <description>&lt;P&gt;@richgalloway that is exactly where I was going with the cluster command. In fact, if the patterns are not known in advance, there are other options such as TFIDF, NLP, etc.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| fields - _time 
| eval problem_abstract="SAP reset Password=10,reset SAP Password=20,Password reset SAP=20,Other=100,Something Else=50" 
| makemv problem_abstract delim="," 
| mvexpand problem_abstract 
| makemv problem_abstract delim="="
| eval count=mvindex(problem_abstract,1),problem_abstract=mvindex(problem_abstract,0)
| table problem_abstract count

| cluster field=problem_abstract t=0.3 
| fields - cluster_label
&lt;/CODE&gt;&lt;/PRE&gt;
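
&lt;P&gt;As far as I can tell, the search above keeps one representative row per cluster, so the per-row &lt;CODE&gt;count&lt;/CODE&gt; values are not added together. A possible variant, assuming you want the counts summed per cluster, is to replace the last two lines with the following. &lt;CODE&gt;labelonly=true&lt;/CODE&gt; keeps every row and only annotates it with &lt;CODE&gt;cluster_label&lt;/CODE&gt;. The &lt;CODE&gt;t&lt;/CODE&gt; and &lt;CODE&gt;match&lt;/CODE&gt; values are just starting points, and which rows end up grouped together depends on them and on your data.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | cluster field=problem_abstract labelonly=true match=termset t=0.3
| stats sum(count) as count first(problem_abstract) as problem_abstract by cluster_label
| sort - count
| table problem_abstract count
&lt;/CODE&gt;&lt;/PRE&gt;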

&lt;P&gt;Experiment with &lt;CODE&gt;t&lt;/CODE&gt;, the similarity threshold, to control how strictly &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Cluster"&gt;clusters&lt;/A&gt; are formed for your data. Refer to the cluster command documentation for details.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 15:05:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423695#M121585</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2019-08-01T15:05:32Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423696#M121586</link>
      <description>&lt;P&gt;@richgalloway and @niketnilay - clustering is definitely an interesting option. It has to use termset or ngramset though; termlist, which is the default for the 'match' parameter, will yield inferior results.&lt;BR /&gt;
But there is a risk. I tested with "reset sap password" and "sap password reset", plus text like "i care" and "i don't care" as dummy entries. It works well with termset and ngramset. But then I added a fourth line/phrase, "please reset my sap password". Now the game changes, and the clustering fails to yield proper results.&lt;BR /&gt;
@chinkeeparco - please go ahead with the clustering as suggested by Rich and Niket. You will have to experiment with the t value and the match option to see what suits you best.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 17:17:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423696#M121586</guid>
      <dc:creator>Sukisen1981</dc:creator>
      <dc:date>2019-08-01T17:17:04Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423697#M121587</link>
      <description>&lt;P&gt;@Sukisen1981 yes indeed, I mentioned that TFIDF and NLP could be tried as well. But as @richgalloway has mentioned, the solution should be chosen according to the use case.&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;The simplest use case is where we know all possible groups of the field problem_abstract. The case statement provided by Rich can also be driven by a lookup, where such static combinations can be stored and updated.&lt;/LI&gt;
&lt;LI&gt;The cluster command can do an initial clustering based on strict or lenient pattern matching, controlled by the &lt;CODE&gt;t&lt;/CODE&gt; option.&lt;/LI&gt;
&lt;LI&gt;ML is the solid choice for free-form text pattern matching like this, where we do not know the possible combinations of text in advance. &lt;A href="https://docs.splunk.com/Documentation/MLApp/latest/User/Algorithms#Feature_Extraction"&gt;TFIDF or HashingVectorizer&lt;/A&gt; for feature extraction with less compute, and NLP for &lt;A href="https://www.splunk.com/blog/2019/04/11/let-s-talk-about-text-baby.html"&gt;Natural Language Processing&lt;/A&gt;.&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 01 Aug 2019 20:34:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423697#M121587</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2019-08-01T20:34:33Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423698#M121588</link>
      <description>&lt;P&gt;I did explicitly mention &lt;CODE&gt;match=termset&lt;/CODE&gt; in my answer as well as "depending on your data".  You may have to combine the two approaches I offered - normalize some outliers then let &lt;CODE&gt;cluster&lt;/CODE&gt; do the rest.  Then again, some experimenting with various &lt;CODE&gt;cluster&lt;/CODE&gt; options may yield acceptable results.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 21:14:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423698#M121588</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2019-08-01T21:14:09Z</dc:date>
    </item>
    <item>
      <title>Re: Sorting inquiry</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423699#M121589</link>
      <description>&lt;P&gt;@richgalloway @niketnilay Thank you so much for your advice!! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; It was really helpful! &lt;/P&gt;

&lt;P&gt;Also, @Sukisen1981 thank you so much as well! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 02 Aug 2019 09:18:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sorting-inquiry/m-p/423699#M121589</guid>
      <dc:creator>chinkeeparco</dc:creator>
      <dc:date>2019-08-02T09:18:59Z</dc:date>
    </item>
  </channel>
</rss>

