<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Rank data from web access files in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Rank-data-from-web-access-files/m-p/98099#M25287</link>
    <description>&lt;P&gt;I have web content (articles, stories) where each article is grouped in a category such as NEWS, STORY, etc. Website visitors are grouped by region. In each region, I want to be able to rank each category by the number of site visitors who read articles in a category.&lt;/P&gt;

&lt;P&gt;I can get a count by region and category.&lt;/P&gt;

&lt;P&gt;I can get a count by Region, VisitorID, Category.&lt;/P&gt;

&lt;P&gt;However, I want to know how many site visitors had CAT1 as their most-read category.  How many had CAT1 as their second most-read category? How many had CAT2 as their most-read category?&lt;/P&gt;

&lt;P&gt;Here's an example:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Region    VID      Category   # Visitors who ranked this 1st
NY           87        STORY               10
NY           44        STORY                9
LA           98        NEWS                 4
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 09 May 2011 19:01:13 GMT</pubDate>
    <dc:creator>ndoshi</dc:creator>
    <dc:date>2011-05-09T19:01:13Z</dc:date>
    <item>
      <title>Rank data from web access files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Rank-data-from-web-access-files/m-p/98099#M25287</link>
      <description>&lt;P&gt;I have web content (articles, stories) where each article is grouped in a category such as NEWS, STORY, etc. Website visitors are grouped by region. In each region, I want to be able to rank each category by the number of site visitors who read articles in a category.&lt;/P&gt;

&lt;P&gt;I can get a count by region and category.&lt;/P&gt;

&lt;P&gt;I can get a count by Region, VisitorID, Category.&lt;/P&gt;

&lt;P&gt;However, I want to know how many site visitors had CAT1 as their most-read category.  How many had CAT1 as their second most-read category? How many had CAT2 as their most-read category?&lt;/P&gt;

&lt;P&gt;Here's an example:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Region    VID      Category   # Visitors who ranked this 1st
NY           87        STORY               10
NY           44        STORY                9
LA           98        NEWS                 4
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 09 May 2011 19:01:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Rank-data-from-web-access-files/m-p/98099#M25287</guid>
      <dc:creator>ndoshi</dc:creator>
      <dc:date>2011-05-09T19:01:13Z</dc:date>
    </item>
    <item>
      <title>Re: Rank data from web access files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Rank-data-from-web-access-files/m-p/98100#M25288</link>
      <description>&lt;P&gt;Try aggregating by region, visitor ID, and classification. Then use streamstats to number each visitor's classifications from most hits to fewest, which gives you the rank field.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="mydata" geo_region=XXX
| eval Classification=case(
    like(pagename, "home%"), "HOME",
    like(pagename, "%news%") AND like(pagename, "%story%"), "STORY",
    like(pagename, "%markets%"), "MARKETS",
    like(pagename, "%personalfinance%"), "PERSONALFINANCE",
    like(pagename, "%search%"), "SEARCH",
    like(pagename, "%news%"), "NEWS")
| stats count as hitcount by geo_region, visid, Classification
| sort geo_region,visid,-hitcount
| streamstats count as rank by visid
| stats count as rankcount by geo_region, Classification, rank
&lt;/CODE&gt;&lt;/PRE&gt;
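
&lt;P&gt;Note that case() returns the value for the first condition that matches, so the combined news-and-story test has to come before the plain news test. As a rough illustration only, here is a Python sketch of the same first-match logic. The page names are made up, and substring checks stand in for like(), so treat it as an approximation rather than Splunk's exact matching rules.&lt;/P&gt;

```python
# First-match classification, mirroring the order of the case() arguments.
# Each rule is (predicate, label); rules are checked top to bottom.
RULES = [
    (lambda p: p.startswith("home"), "HOME"),
    (lambda p: "news" in p and "story" in p, "STORY"),
    (lambda p: "markets" in p, "MARKETS"),
    (lambda p: "personalfinance" in p, "PERSONALFINANCE"),
    (lambda p: "search" in p, "SEARCH"),
    (lambda p: "news" in p, "NEWS"),
]


def classify(pagename):
    """Return the label of the first rule that matches, or None if none do."""
    for matches, label in RULES:
        if matches(pagename):
            return label
    return None  # case() yields NULL when no condition matches


# Hypothetical page names, for illustration only.
samples = ["homepage", "news/story/123", "news/world", "about"]
labels = [classify(name) for name in samples]
```

&lt;P&gt;If the plain news rule were listed first, "news/story/123" would be labelled NEWS and STORY would never be assigned. Pages that match nothing get no Classification, and the stats command then leaves them out because the by field is missing.&lt;/P&gt;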

&lt;P&gt;Breaking down the search command, here are the results from each section:&lt;/P&gt;

&lt;P&gt;stats count as hitcount by geo_region, visid, Classification&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;geo_region      visid   Classification          hitcount
CA              100     HOME                    5
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              200     NEWS                    15
CA              200     HOME                    10
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Next, sort each visitor's rows by hit count, from most to fewest:&lt;BR /&gt;
sort geo_region,visid,-hitcount&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;geo_region      visid   Classification          hitcount
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              100     HOME                    5
CA              200     NEWS                    15
CA              200     HOME                    10
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Use streamstats to keep a running count of rows for each visitor ID. Because the count restarts at 1 for each visitor and the rows are already sorted by hitcount, the running count serves as the rank field.&lt;/P&gt;

&lt;P&gt;streamstats count as rank by visid &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;geo_region      visid   Classification          hitcount        rank
CA              100     NEWS                    10              1
CA              100     MARKETS                 7               2
CA              100     HOME                    5               3
CA              200     NEWS                    15              1
CA              200     HOME                    10              2
&lt;/CODE&gt;&lt;/PRE&gt;
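
&lt;P&gt;If it helps to see the sort and streamstats steps outside of Splunk, here is a minimal Python sketch of the same idea. It assumes the sample rows shown above, and rank_rows is just an illustrative helper name, not anything Splunk provides.&lt;/P&gt;

```python
from collections import Counter

# Sample rows from the tables above:
# (geo_region, visid, Classification, hitcount)
rows = [
    ("CA", 100, "HOME", 5),
    ("CA", 100, "NEWS", 10),
    ("CA", 100, "MARKETS", 7),
    ("CA", 200, "NEWS", 15),
    ("CA", 200, "HOME", 10),
]


def rank_rows(rows):
    """Sort by region, visitor, descending hitcount, then number rows per visitor."""
    # Equivalent of: sort geo_region,visid,-hitcount
    ordered = sorted(rows, key=lambda r: (r[0], r[1], -r[3]))
    # Equivalent of: streamstats count as rank by visid
    seen = Counter()
    ranked = []
    for row in ordered:
        seen[row[1]] += 1  # running count for this visid
        ranked.append(row + (seen[row[1]],))
    return ranked


ranked = rank_rows(rows)
for row in ranked:
    print(row)
```

&lt;P&gt;Like the search, this keeps one running count per visid, so a visitor ID that shows up in more than one region would carry its count across regions. Adding geo_region to the streamstats by clause would restart the rank per region, if that matters for your data.&lt;/P&gt;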

&lt;P&gt;Finally, count the number of rows for each region, Classification, and rank:&lt;/P&gt;

&lt;P&gt;stats count as rankcount by geo_region, Classification, rank&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;geo_region      Classification          rank            rankcount
CA              NEWS                    1               2
CA              MARKETS                 2               1
CA              HOME                    2               1
CA              HOME                    3               1
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 09 May 2011 19:05:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Rank-data-from-web-access-files/m-p/98100#M25288</guid>
      <dc:creator>eelisio2</dc:creator>
      <dc:date>2011-05-09T19:05:44Z</dc:date>
    </item>
  </channel>
</rss>

