<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Getting duplicate record after uploading json (even dedup not working) in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372357#M67573</link>
    <description>&lt;P&gt;Hi @sawgata12345,&lt;BR /&gt;
If you have indexed the file twice and it is possible to clean the index, clean it; then, when indexing the new file, add crcSalt in inputs.conf so that Splunk won't index the duplicate file.&lt;/P&gt;</description>
    <pubDate>Thu, 04 Jan 2018 10:54:49 GMT</pubDate>
    <dc:creator>nikita_p</dc:creator>
    <dc:date>2018-01-04T10:54:49Z</dc:date>
    <item>
      <title>Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372355#M67571</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;
I have uploaded a JSON file to Splunk and am using the spath command to get the output, but the output shows two rows for a single record.&lt;BR /&gt;
The JSON file sample is:&lt;BR /&gt;
[&lt;BR /&gt;
  {&lt;BR /&gt;
    "id": 707860,&lt;BR /&gt;
    "name": "Hurzuf",&lt;BR /&gt;
    "country": "UA",&lt;BR /&gt;
    "timestamp":"2017-09-02T06:44:14,799 MDT",&lt;BR /&gt;
    "coord": {&lt;BR /&gt;
      "lon": 34.283333,&lt;BR /&gt;
      "lat": 44.549999&lt;BR /&gt;
    },&lt;BR /&gt;
    "ports":[&lt;BR /&gt;
    {&lt;BR /&gt;
      "port": 1,&lt;BR /&gt;
      "utilization": 140,&lt;BR /&gt;
      "error": {&lt;BR /&gt;
         "tx": 1000.00,&lt;BR /&gt;
             "rx": 500&lt;BR /&gt;
      }&lt;BR /&gt;
    },&lt;BR /&gt;
    {&lt;BR /&gt;
      "port": 2,&lt;BR /&gt;
      "utilization": 110,&lt;BR /&gt;
      "error": {&lt;BR /&gt;
         "tx": 1002.00,&lt;BR /&gt;
         "rx": 420&lt;BR /&gt;
      }&lt;BR /&gt;
    }&lt;BR /&gt;
   ]&lt;BR /&gt;
  },&lt;BR /&gt;
  {&lt;BR /&gt;
    "id": 519188,&lt;BR /&gt;
    "name": "Novinki",&lt;BR /&gt;
    "country": "RU",&lt;BR /&gt;
    "timestamp":"2017-09-03T06:50:14,799 MDT",&lt;BR /&gt;
    "coord": {&lt;BR /&gt;
      "lon": 37.666668,&lt;BR /&gt;
      "lat": 55.683334&lt;BR /&gt;
    },&lt;BR /&gt;
    "ports":[&lt;BR /&gt;
    {&lt;BR /&gt;
      "port": 1,&lt;BR /&gt;
      "utilization": 120,&lt;BR /&gt;
      "error": {&lt;BR /&gt;
         "tx": 1020.00,&lt;BR /&gt;
         "rx": 400&lt;BR /&gt;
      }&lt;BR /&gt;
    },&lt;BR /&gt;
    {&lt;BR /&gt;
      "port": 2,&lt;BR /&gt;
      "utilization": 120,&lt;BR /&gt;
      "error": {&lt;BR /&gt;
         "tx": 1002.00,&lt;BR /&gt;
             "rx": 400&lt;BR /&gt;
      }&lt;BR /&gt;
    }&lt;BR /&gt;
    ]&lt;BR /&gt;
  }]&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/4095iCC2730FE05C3D8CB/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt; &lt;/P&gt;</description>
      <pubDate>Thu, 04 Jan 2018 10:25:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372355#M67571</guid>
      <dc:creator>sawgata12345</dc:creator>
      <dc:date>2018-01-04T10:25:30Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372356#M67572</link>
      <description>&lt;P&gt;Have you uploaded the file twice? Maybe previously as well? Have you tried this query: &lt;CODE&gt;index="newjson1" sourcetype="_json" | dedup _raw | spath .....&lt;/CODE&gt;?&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jan 2018 10:46:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372356#M67572</guid>
      <dc:creator>harsmarvania57</dc:creator>
      <dc:date>2018-01-04T10:46:55Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372357#M67573</link>
      <description>&lt;P&gt;Hi @sawgata12345,&lt;BR /&gt;
If you have indexed the file twice and it is possible to clean the index, clean it; then, when indexing the new file, add crcSalt in inputs.conf so that Splunk won't index the duplicate file.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jan 2018 10:54:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372357#M67573</guid>
      <dc:creator>nikita_p</dc:creator>
      <dc:date>2018-01-04T10:54:49Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372358#M67574</link>
      <description>&lt;P&gt;Hi @sawgata12345,&lt;/P&gt;

&lt;P&gt;I think each of your events contains multiple records, and each record has its own &lt;CODE&gt;timestamp&lt;/CODE&gt;, &lt;CODE&gt;country&lt;/CODE&gt;, &lt;CODE&gt;name&lt;/CODE&gt;, and &lt;CODE&gt;id&lt;/CODE&gt; fields; moreover, the &lt;CODE&gt;ports&lt;/CODE&gt; field also contains multiple values, like &lt;CODE&gt;port: 1&lt;/CODE&gt; &amp;amp; &lt;CODE&gt;port: 2&lt;/CODE&gt;. So you have to write a search that maintains the relation between each port field and its relative &lt;CODE&gt;id&lt;/CODE&gt;.&lt;/P&gt;

&lt;P&gt;Your Sample Event:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[{
    "id": 707860,
    "name": "Hurzuf",
    "country": "UA",
    "timestamp": "2017-09-02T06:44:14,799 MDT",
    "coord": {
        "lon": 34.283333,
        "lat": 44.549999
    },
    "ports": [{
        "port": 1,
        "utilization": 140,
        "error": {
            "tx": 1000.00,
            "rx": 500
        }
    },
    {
        "port": 2,
        "utilization": 110,
        "error": {
            "tx": 1002.00,
            "rx": 420
        }
    }]
},
{
    "id": 519188,
    "name": "Novinki",
    "country": "RU",
    "timestamp": "2017-09-03T06:50:14,799 MDT",
    "coord": {
        "lon": 37.666668,
        "lat": 55.683334
    },
    "ports": [{
        "port": 1,
        "utilization": 120,
        "error": {
            "tx": 1020.00,
            "rx": 400
        }
    },
    {
        "port": 2,
        "utilization": 120,
        "error": {
            "tx": 1002.00,
            "rx": 400
        }
    }]
}]
&lt;/CODE&gt;&lt;/PRE&gt;
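
&lt;P&gt;Roughly, what the first stages of the search do (a sketch, assuming default &lt;CODE&gt;spath&lt;/CODE&gt; behaviour): &lt;CODE&gt;spath {} output=even&lt;/CODE&gt; extracts each element of the top-level JSON array into the multivalue field &lt;CODE&gt;even&lt;/CODE&gt;, and &lt;CODE&gt;stats count by even&lt;/CODE&gt; then produces one row per distinct record, which also drops exact duplicates:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="newjson1" sourcetype="_json"
| spath {} output=even
| stats count by even
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For the two-record sample above, this should give two rows, one per &lt;CODE&gt;id&lt;/CODE&gt; (707860 and 519188), each holding that record's JSON object in &lt;CODE&gt;even&lt;/CODE&gt;.&lt;/P&gt;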

&lt;P&gt;Can you please try the search below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="newjson1" sourcetype="_json"
| spath {} output=even 
| stats count by even 
| eval _raw=even 
| spath 
| rename coord.lat as coord_lat coord.lon as coord_lon ports{}.error.rx as ports_error_rx ports{}.error.tx as ports_error_tx ports{}.port as ports_port ports{}.utilization as ports_utilization 
| eval tempField=mvzip(mvzip(mvzip(ports_error_tx,ports_error_rx),ports_port),ports_utilization) 
| stats count by tempField timestamp country name id 
| eval ports_error_tx=mvindex(split(tempField,","),0), ports_error_rx=mvindex(split(tempField,","),1),ports_port=mvindex(split(tempField,","),2), ports_utilization=mvindex(split(tempField,","),3) | sort timestamp id
| table timestamp country name id ports_error_tx ports_error_rx ports_port ports_utilization
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;My Sample Search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| eval _raw="[ { \"id\": 707860, \"name\": \"Hurzuf\", \"country\": \"UA\", \"timestamp\":\"2017-09-02T06:44:14,799 MDT\", \"coord\": { \"lon\": 34.283333, \"lat\": 44.549999 }, \"ports\":[ { \"port\": 1, \"utilization\": 140, \"error\": { \"tx\": 1000.00, \"rx\": 500 } }, { \"port\": 2, \"utilization\": 110, \"error\": { \"tx\": 1002.00, \"rx\": 420 } } ] }, { \"id\": 519188, \"name\": \"Novinki\", \"country\": \"RU\", \"timestamp\":\"2017-09-03T06:50:14,799 MDT\", \"coord\": { \"lon\": 37.666668, \"lat\": 55.683334 }, \"ports\":[ { \"port\": 1, \"utilization\": 120, \"error\": { \"tx\": 1020.00, \"rx\": 400 } }, { \"port\": 2, \"utilization\": 120, \"error\": { \"tx\": 1002.00, \"rx\": 400 } } ] }]" 
| spath {} output=even 
| stats count by even 
| eval _raw=even 
| spath 
| rename coord.lat as coord_lat coord.lon as coord_lon ports{}.error.rx as ports_error_rx ports{}.error.tx as ports_error_tx ports{}.port as ports_port ports{}.utilization as ports_utilization 
| eval tempField=mvzip(mvzip(mvzip(ports_error_tx,ports_error_rx),ports_port),ports_utilization) 
| stats count by tempField timestamp country name id 
| eval ports_error_tx=mvindex(split(tempField,","),0), ports_error_rx=mvindex(split(tempField,","),1),ports_port=mvindex(split(tempField,","),2), ports_utilization=mvindex(split(tempField,","),3) | sort timestamp id
| table timestamp country name id ports_error_tx ports_error_rx ports_port ports_utilization
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/4094iD10BEB4C960E6AD7/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Happy Splunking&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;Thanks &lt;BR /&gt;
Kamlesh&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jan 2018 12:36:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372358#M67574</guid>
      <dc:creator>kamlesh_vaghela</dc:creator>
      <dc:date>2018-01-04T12:36:07Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372359#M67575</link>
      <description>&lt;P&gt;@nikita_p&lt;BR /&gt;
I cleared the index, uploaded fresh data, and added crcSalt in inputs.conf. Even then the rows appear twice. This could be because of the multiple port details per JSON event, but in that case, if there are 50 port details, the row gets repeated 50 times. That's not good, right?&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jan 2018 06:53:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372359#M67575</guid>
      <dc:creator>sawgata12345</dc:creator>
      <dc:date>2018-01-05T06:53:21Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372360#M67576</link>
      <description>&lt;P&gt;@harsmarvania57&lt;BR /&gt;
I cleaned the index and uploaded the file fresh once, but even then it's the same. Even dedup _raw before the |spath gives the same result. Is it because of the multiple port details in each single JSON event?&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jan 2018 06:55:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372360#M67576</guid>
      <dc:creator>sawgata12345</dc:creator>
      <dc:date>2018-01-05T06:55:55Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372361#M67577</link>
      <description>&lt;P&gt;@kamlesh&lt;/P&gt;

&lt;P&gt;The query you suggested gives me "No result" after executing.&lt;BR /&gt;
What is this _raw=even? (According to your query, you have stored the whole JSON content in "_raw", but for me the data is already uploaded and indexed.)&lt;/P&gt;

&lt;P&gt;index="newjson1" sourcetype="_json"&lt;BR /&gt;
 | spath {} output=even &lt;BR /&gt;
 | stats count by even &lt;BR /&gt;
 | eval _raw=even &lt;BR /&gt;
 | spath &lt;BR /&gt;
 | rename coord.lat as coord_lat coord.lon as coord_lon ports{}.error.rx as ports_error_rx ports{}.error.tx as ports_error_tx ports{}.port as ports_port ports{}.utilization as ports_utilization &lt;BR /&gt;
 | eval tempField=mvzip(mvzip(mvzip(ports_error_tx,ports_error_rx),ports_port),ports_utilization) &lt;BR /&gt;
 | stats count by tempField timestamp country name id &lt;BR /&gt;
 | eval ports_error_tx=mvindex(split(tempField,","),0), ports_error_rx=mvindex(split(tempField,","),1),ports_port=mvindex(split(tempField,","),2), ports_utilization=mvindex(split(tempField,","),3) | sort timestamp id&lt;BR /&gt;
 | table timestamp country name id ports_error_tx ports_error_rx ports_port ports_utilization&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 17:29:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372361#M67577</guid>
      <dc:creator>sawgata12345</dc:creator>
      <dc:date>2020-09-29T17:29:49Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372362#M67578</link>
      <description>&lt;P&gt;Found the issue: remove &lt;CODE&gt;| spath&lt;/CODE&gt; from your query and it will display only a single value, not duplicates.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jan 2018 09:11:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372362#M67578</guid>
      <dc:creator>harsmarvania57</dc:creator>
      <dc:date>2018-01-05T09:11:56Z</dc:date>
    </item>
    <item>
      <title>Re: Getting duplicate record after uploading json (even dedup not working)</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372363#M67579</link>
      <description>&lt;P&gt;Hi @sawgata12345,&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; | eval _raw=even 
 | spath
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm overwriting &lt;CODE&gt;_raw&lt;/CODE&gt; with the newly generated event from the &lt;CODE&gt;even&lt;/CODE&gt; field, so the next &lt;CODE&gt;spath&lt;/CODE&gt; statement will execute as expected. This is needed because your given event contains multiple records.&lt;/P&gt;

&lt;P&gt;Have you tried to execute this search in parts?&lt;/P&gt;

&lt;P&gt;Like:&lt;BR /&gt;
1) &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="newjson1" sourcetype="_json"
| spath {} output=even
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;2)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="newjson1" sourcetype="_json"
| spath {} output=even
| stats count by even
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;3)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="newjson1" sourcetype="_json"
| spath {} output=even
| stats count by even
| eval _raw=even
| spath
&lt;/CODE&gt;&lt;/PRE&gt;
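
&lt;P&gt;What I would expect from each step (a sketch, not verified against your data): step 1 shows the original events with &lt;CODE&gt;even&lt;/CODE&gt; as a multivalue field holding one JSON object per record; step 2 splits that into one row per record; step 3 re-extracts the fields (&lt;CODE&gt;id&lt;/CODE&gt;, &lt;CODE&gt;name&lt;/CODE&gt;, &lt;CODE&gt;ports{}.port&lt;/CODE&gt;, ...) from each row. For your sample data, step 2 should look roughly like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;even                                                count
{"id":519188,"name":"Novinki","country":"RU",...}   1
{"id":707860,"name":"Hurzuf","country":"UA",...}    1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If step 1 already returns no results, the problem is in how the events were indexed rather than in the search.&lt;/P&gt;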

&lt;P&gt;Can you please execute the above searches and give me the output?&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jan 2018 13:24:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-duplicate-record-after-uploading-json-even-dedup-not/m-p/372363#M67579</guid>
      <dc:creator>kamlesh_vaghela</dc:creator>
      <dc:date>2018-01-05T13:24:47Z</dc:date>
    </item>
  </channel>
</rss>

