<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why is my subquery returning duplicate values? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437360#M76286</link>
    <description>&lt;P&gt;Thanks, woodcock, but I am still getting the duplicate values.&lt;/P&gt;</description>
    <pubDate>Thu, 31 Jan 2019 06:04:54 GMT</pubDate>
    <dc:creator>abouttathagata</dc:creator>
    <dc:date>2019-01-31T06:04:54Z</dc:date>
    <item>
      <title>Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437356#M76282</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am trying to run a query that returns only the results not returned by the inner query. Any &lt;CODE&gt;userid&lt;/CODE&gt; can have &lt;CODE&gt;url="/data/a.jsp"&lt;/CODE&gt; and also &lt;CODE&gt;url="^/data/abc.* "&lt;/CODE&gt;. I want every &lt;CODE&gt;userid&lt;/CODE&gt; that has &lt;CODE&gt;url="/data/a.jsp"&lt;/CODE&gt; to be excluded from the search for &lt;CODE&gt;url="(^/data/abc.*) ................"&lt;/CODE&gt;. Here is the main query:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="hostname" sourcetype="source_type" NOT 
[search  host="hostname" sourcetype="source_type" | search url = "/data/a.jsp" | fields userid] | 
search userid!="-" | regex url="(^/data/abc.*) |(^/data/def.*)|(^/data/ghi.*)|(^/data/klm.*)" |
dedup url | eval user_status = "no" | dedup userid| 
lookup main_data userid OUTPUT userid, first_name,last_name| table userid, first_name, last_name, user_status 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I have tried several approaches, but the duplicate &lt;CODE&gt;userid&lt;/CODE&gt; values still appear. Please help me out. Thanks in advance.&lt;/P&gt;

&lt;P&gt;Regards,&lt;BR /&gt;
Arka&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jan 2019 17:07:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437356#M76282</guid>
      <dc:creator>abouttathagata</dc:creator>
      <dc:date>2019-01-30T17:07:53Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437357#M76283</link>
      <description>&lt;P&gt;Try this to filter userids in the subsearch.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="hostname" sourcetype="source_type" NOT 
[ search host="hostname" sourcetype="source_type" | search url = "/data/a.jsp" userid!="-" | stats count by userid | fields userid | format ]
| regex url="(^\/data\/abc\.)|(^\/data\/def\.)|(^\/data\/ghi\.)|(^\/data\/klm\.)" 
| dedup userid | eval user_status = "no"
| lookup main_data userid OUTPUT userid, first_name,last_name
| table userid, first_name, last_name, user_status
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 30 Jan 2019 17:58:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437357#M76283</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2019-01-30T17:58:55Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437358#M76284</link>
      <description>&lt;P&gt;Thanks for your quick response, but the result is still the same. The userid matched by &lt;CODE&gt;search url = "/data/a.jsp"&lt;/CODE&gt; still appears. I am not sure, but it looks like the inner query is not returning anything, although it runs fine when I run it on its own.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jan 2019 18:27:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437358#M76284</guid>
      <dc:creator>abouttathagata</dc:creator>
      <dc:date>2019-01-30T18:27:16Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437359#M76285</link>
      <description>&lt;P&gt;Try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=YouShouldAlwaysSpecifyIndexValues host="hostname" sourcetype="source_type" userid!="-"
NOT [search  host="hostname" sourcetype="source_type" url = "/data/a.jsp" | stats count BY userid | table userid]
| regex url="(^/data/abc.*)|(^/data/def.*)|(^/data/ghi.*)|(^/data/klm.*)" 
| dedup userid
| lookup main_data userid OUTPUT userid first_name last_name
| eval user_status = "no" 
| table userid first_name last_name user_status 
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 30 Jan 2019 19:41:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437359#M76285</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-01-30T19:41:31Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437360#M76286</link>
      <description>&lt;P&gt;Thanks, woodcock, but I am still getting the duplicate values.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 06:04:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437360#M76286</guid>
      <dc:creator>abouttathagata</dc:creator>
      <dc:date>2019-01-31T06:04:54Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437361#M76287</link>
      <description>&lt;P&gt;Hello @abouttathagata &lt;/P&gt;

&lt;P&gt;Does the output of this query also contain duplicate userid values?&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; host="hostname" sourcetype="source_type" NOT 
 [search  host="hostname" sourcetype="source_type" | search url = "/data/a.jsp" | fields userid] | 
 search userid!="-" | regex url="(^/data/abc.*) |(^/data/def.*)|(^/data/ghi.*)|(^/data/klm.*)" |
 dedup url | eval user_status = "no" | dedup userid
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 31 Jan 2019 06:28:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437361#M76287</guid>
      <dc:creator>vishaltaneja070</dc:creator>
      <dc:date>2019-01-31T06:28:23Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437362#M76288</link>
      <description>&lt;P&gt;Yes, it is the same query, so it gives the same duplicate userid values.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 08:19:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437362#M76288</guid>
      <dc:creator>abouttathagata</dc:creator>
      <dc:date>2019-01-31T08:19:42Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437363#M76289</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/161338"&gt;@abouttathagata&lt;/a&gt; &lt;/P&gt;

&lt;P&gt;If &lt;CODE&gt;dedup userid&lt;/CODE&gt; is at the end of the query and you still see duplicate userid values, then I think the issue is with the data. The same userid probably appears with different letter case or with extra whitespace in the value.&lt;/P&gt;
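
&lt;P&gt;As a rough sketch of how to test that idea (assuming the field is named &lt;CODE&gt;userid&lt;/CODE&gt; as in the query above; &lt;CODE&gt;userid_norm&lt;/CODE&gt;, &lt;CODE&gt;variants&lt;/CODE&gt; and &lt;CODE&gt;raw_values&lt;/CODE&gt; are names made up for this example), the search below lowercases and trims each userid, groups by the normalized value, and keeps only the groups that contain more than one raw spelling:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="hostname" sourcetype="source_type" userid!="-"
| eval userid_norm = lower(trim(userid))
| stats dc(userid) AS variants values(userid) AS raw_values BY userid_norm
| where variants > 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If this returns no rows, case and whitespace differences are probably not the cause, and the duplicates are more likely introduced later in the pipeline.&lt;/P&gt;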

&lt;P&gt;To check whether the same userid occurs more than once, try running this query:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="hostname" sourcetype="source_type" NOT 
  [search  host="hostname" sourcetype="source_type" | search url = "/data/a.jsp" | fields userid] | 
  search userid!="-" | regex url="(^/data/abc.*) |(^/data/def.*)|(^/data/ghi.*)|(^/data/klm.*)" |
  dedup url | eval user_status = "no" | stats count by userid
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:01:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437363#M76289</guid>
      <dc:creator>vishaltaneja070</dc:creator>
      <dc:date>2020-09-29T23:01:40Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437364#M76290</link>
      <description>&lt;P&gt;No luck; I still get the same result. I don't think the data is the problem.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 10:21:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437364#M76290</guid>
      <dc:creator>abouttathagata</dc:creator>
      <dc:date>2019-01-31T10:21:02Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437365#M76291</link>
      <description>&lt;P&gt;Could you post two of the duplicate rows you get when running the above query?&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 10:31:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437365#M76291</guid>
      <dc:creator>vishaltaneja070</dc:creator>
      <dc:date>2019-01-31T10:31:47Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437366#M76292</link>
      <description>&lt;P&gt;&lt;STRONG&gt;I am using the following query to get the data for user_status = yes&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="hostname" sourcetype="source_type" | search userid!="-" 
| search url="/data/a.jsp" | eval user_status="yes" | dedup userid 
| lookup main_data userid OUTPUT userid, first_name,last_name
| table userid, first_name, last_name, user_status, url
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Result&lt;/STRONG&gt;:&lt;/P&gt;

&lt;P&gt;userid = sam01&lt;BR /&gt;
first_name=sam&lt;BR /&gt;
last_name=Rogers&lt;BR /&gt;
user_status=yes&lt;BR /&gt;
url=/data/a.jsp&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;I am using the following query to get the data for user_status = no&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="hostname" sourcetype="source_type" NOT 
  [search host="hostname" sourcetype="source_type" | search url = "/data/a.jsp" | fields userid] 
| search userid!="-" | regex url="(^\/data\/abc.*)|(^\/data\/def.*)|(^\/data\/ghi.*)|(^\/data\/klm.*)" 
| dedup url | eval user_status = "no" | dedup userid
| lookup main_data userid OUTPUT userid, first_name,last_name
| table userid, first_name, last_name, user_status, url
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Result&lt;/STRONG&gt;:&lt;/P&gt;

&lt;P&gt;userid = sam01&lt;BR /&gt;
first_name=sam&lt;BR /&gt;
last_name=Rogers&lt;BR /&gt;
user_status=no&lt;BR /&gt;
url=/data/*&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:01:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437366#M76292</guid>
      <dc:creator>abouttathagata</dc:creator>
      <dc:date>2020-09-29T23:01:45Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my subquery returning duplicate values?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437367#M76293</link>
      <description>&lt;P&gt;The problem is this line:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; | lookup main_data userid OUTPUT userid first_name last_name
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Because that file contains duplicate &lt;CODE&gt;userid&lt;/CODE&gt; values AND because you are outputting &lt;CODE&gt;userid&lt;/CODE&gt; again (which is pretty silly), it is doing exactly what you are telling it to do and outputting them all on each line.  First, fix your lookup file like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| inputlookup main_data
| dedup userid
| outputlookup main_data
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 12 Feb 2019 16:32:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-subquery-returning-duplicate-values/m-p/437367#M76293</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-02-12T16:32:19Z</dc:date>
    </item>
  </channel>
</rss>

