<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Duplicate entries in index with JSON and missing values in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302756#M57177</link>
    <description>&lt;P&gt;Screenshot no. 3: I started a new index, and it shows that each file is indexed twice.&lt;/P&gt;</description>
    <pubDate>Thu, 11 Jan 2018 21:10:57 GMT</pubDate>
    <dc:creator>rbruinsma</dc:creator>
    <dc:date>2018-01-11T21:10:57Z</dc:date>
    <item>
      <title>Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302753#M57174</link>
      <description>&lt;P&gt;When I index JSON files I get duplicate entries in the Splunk index, and some values are not indexed at all.&lt;/P&gt;

&lt;P&gt;Example of the JSON files:&lt;/P&gt;

&lt;P&gt;{&lt;BR /&gt;
    "State":  "value",&lt;BR /&gt;
    "TimeStarted":  "03-jan-2018 10:13:29",&lt;BR /&gt;
    "RBName":  "Value",&lt;BR /&gt;
    "Tower":  "Value",&lt;BR /&gt;
    "RBType":  "Value",&lt;BR /&gt;
    "ManualTimeToExecute":  20,&lt;BR /&gt;
    "RefGUID":  "cad8efd8-58c4-4924-add7-78c8f9768b83",&lt;BR /&gt;
    "TicketDetails":  {&lt;BR /&gt;
                          "TimeData":  "03-jan-2018 10:13:30",&lt;BR /&gt;
                          "Description":  "Value",&lt;BR /&gt;
                          "TicketNo":  "Value",&lt;BR /&gt;
                          "TimeCreated":  "03-jan-2018 10:13:12",&lt;BR /&gt;
                          "ShortDescription":  "Value",&lt;BR /&gt;
                          "State":  "Value",&lt;BR /&gt;
                          "ClientRefNumber":  "Value"&lt;BR /&gt;
                      },&lt;BR /&gt;
    "Activities":  [&lt;BR /&gt;
                       {&lt;BR /&gt;
                           "LogLevel":  "Information",&lt;BR /&gt;
                           "LogTime":  "03-jan-2018 10:13:31",&lt;BR /&gt;
                           "Completion":  "Success",&lt;BR /&gt;
                           "Severity":  "GOOD",&lt;BR /&gt;
                           "ImpactedUser":  "Value",&lt;BR /&gt;
                           "Condition":  "GOOD",&lt;BR /&gt;
                           "LogMessage":  " Value",&lt;BR /&gt;
                           "ActionTaskName":  "Value"&lt;BR /&gt;
                       },&lt;BR /&gt;
                  ],&lt;BR /&gt;
    "Comment":  "Value",&lt;BR /&gt;
    "Completion":  "Success",&lt;BR /&gt;
    "Condition":  "BAD",&lt;BR /&gt;
    "EndTime":  "03-jan-2018 10:13:57",&lt;BR /&gt;
    "Severity":  "WARNING"&lt;BR /&gt;
}&lt;/P&gt;

&lt;P&gt;Each JSON file contains one array, which can hold up to 30 items, and the file name of each JSON file is unique.&lt;/P&gt;
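&lt;P&gt;One thing worth ruling out before blaming Splunk: as pasted, the sample is not strict JSON (the Activities array has a trailing comma after its last element), and malformed JSON can make field extraction stop partway through an event. A quick validity check, sketched in Python with hypothetical inlined stand-ins for the files:&lt;/P&gt;

```python
import json

def is_strict_json(text):
    """Return True when text parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Minimal stand-ins for the structure above (values hypothetical).
ok_doc = '{"State": "Value", "Activities": [{"Completion": "Success"}]}'

# Same document with a trailing comma after the last array element,
# mirroring the pasted sample; a strict parser rejects it.
bad_doc = '{"State": "Value", "Activities": [{"Completion": "Success"},]}'

print(is_strict_json(ok_doc))   # True
print(is_strict_json(bad_doc))  # False
```

&lt;P&gt;Running a check like this over all 78 files would show whether the missing values start at the files themselves or in Splunk's parsing limits.&lt;/P&gt;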

&lt;P&gt;The result of indexing the JSON files:&lt;BR /&gt;
&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/4134i5B102C19A2C1E87F/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;I use Splunk version 7.1 and the default _json source type to index the files. The JSON files are hosted in a folder on the same server where Splunk is installed.&lt;BR /&gt;
&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/4135i52F1F95728FD57ED/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Any idea how to fix the duplicate entries in the index and why some values are not indexed at all?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 19:03:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302753#M57174</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-11T19:03:32Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302754#M57175</link>
      <description>&lt;P&gt;Are you sure the duplicate RefGUIDs are incorrect?  That would make it sound like events were indexed twice, instead of parsed twice.&lt;/P&gt;

&lt;P&gt;And the Condition may not necessarily be wrong either.  Do you see a single event that has the &lt;CODE&gt;Condition&lt;/CODE&gt; value in the JSON but not parsed out by Splunk?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 20:19:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302754#M57175</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-11T20:19:42Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302755#M57176</link>
      <description>&lt;P&gt;It really looks like all files are indexed twice rather than parsed twice. I started over with a clean index, and right after indexing starts you can see that the same file is indexed twice (check the file-name GUID in screenshot 3).&lt;/P&gt;
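&lt;P&gt;For readers with the same symptom: a quick way to confirm that events really are duplicated in the index (rather than rendered twice in the UI) is to count events per source file. A sketch in SPL, with the index name as an assumption:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=main sourcetype=_json
| stats count BY source
| where count > 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With one event per file, any source listed here was ingested more than once.&lt;/P&gt;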

&lt;P&gt;Yes, I checked the original JSON files and they all contain a value in the Condition field.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 21:10:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302755#M57176</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-11T21:10:00Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302756#M57177</link>
      <description>&lt;P&gt;Screenshot no. 3: I started a new index, and it shows that each file is indexed twice.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 21:10:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302756#M57177</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-11T21:10:57Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302757#M57178</link>
      <description>&lt;P&gt;Check the output of &lt;CODE&gt;splunk list monitor&lt;/CODE&gt; to see if the file somehow shows up twice.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 21:32:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302757#M57178</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-11T21:32:51Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302758#M57179</link>
      <description>&lt;P&gt;The output lists exactly the 78 JSON files that are in the folder.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 21:41:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302758#M57179</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-11T21:41:44Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302759#M57180</link>
      <description>&lt;P&gt;Will you add the output of:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;splunk btool props list _json --debug
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(From your screenshot it looks like the sourcetype is &lt;CODE&gt;_json&lt;/CODE&gt;)&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 21:48:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302759#M57180</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-11T21:48:20Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302760#M57181</link>
      <description>&lt;P&gt;Here is the output:&lt;BR /&gt;
&lt;IMG src="http://idocs.info/wp-content/uploads/2018/01/splunk4.png" alt="alt text" /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 22:00:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302760#M57181</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-11T22:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302761#M57182</link>
      <description>&lt;P&gt;There is certainly nothing in there that I'd expect to be causing this.  Can you also send the inputs.conf responsible for this data?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2018 22:02:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302761#M57182</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-11T22:02:47Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302762#M57183</link>
      <description>&lt;P&gt;I am testing on a clean install of Splunk.&lt;/P&gt;

&lt;P&gt;Inputs.conf in splunk\etc\apps\search\default:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Version 7.0.1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Inputs.conf in splunk\etc\system\default:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Version 7.0.1
# DO NOT EDIT THIS FILE!
# Changes to default files will be lost on update and are difficult to
# manage and support.
# Please make any changes to system defaults by overriding them in
# apps or $SPLUNK_HOME/etc/system/local
# (See "Configuration file precedence" in the web documentation).
# To override a specific setting, copy the name of the stanza and
# setting to the file where you wish to override it.
# This file contains possible attributes and values you can use to
# configure inputs, distributed inputs and file system monitoring.

[default]
index = default
_rcvbuf = 1572864
host = $decideOnStartup
evt_resolve_ad_obj = 0
evt_dc_name =
evt_dns_name =

[blacklist:$SPLUNK_HOME\etc\auth]

[monitor://$SPLUNK_HOME\var\log\splunk]
index = _internal

[monitor://$SPLUNK_HOME\var\log\splunk\license_usage_summary.log]
index = _telemetry

[monitor://$SPLUNK_HOME\etc\splunk.version]
_TCP_ROUTING = *
index = _internal
sourcetype = splunk_version

[batch://$SPLUNK_HOME\var\spool\splunk]
move_policy = sinkhole
crcSalt = &amp;lt;SOURCE&amp;gt;

[batch://$SPLUNK_HOME\var\spool\splunk\...stash_new]
queue = stashparsing
sourcetype = stash_new
move_policy = sinkhole
crcSalt = &amp;lt;SOURCE&amp;gt;

[fschange:$SPLUNK_HOME\etc]
# poll every 10 minutes
pollPeriod = 600
# generate audit events into the audit index, instead of fschange events
signedaudit = true
recurse = true
followLinks = false
hashMaxSize = -1
fullEvent = false
sendEventMaxSize = -1
filesPerDelay = 10
delayInMills = 100

[udp]
connection_host = ip

[tcp]
acceptFrom = *
connection_host = dns

[splunktcp]
route = has_key:_replicationBucketUUID:replicationQueue;has_key:_dstrx:typingQueue;has_key:_linebreaker:indexQueue;absent_key:_linebreaker:parsingQueue
acceptFrom = *
connection_host = ip

[script]
interval = 60.0
start_by_shell = false

[SSL]
# SSL settings
# The following provides modern TLS configuration that guarantees forward-
# secrecy and efficiency. This configuration drops support for old Splunk
# versions (Splunk 5.x and earlier).
# To add support for Splunk 5.x set sslVersions to tls and add this to the
# end of cipherSuite:
# DHE-RSA-AES256-SHA:AES256-SHA:DHE-RSA-AES128-SHA:AES128-SHA
# and this, in case Diffie Hellman is not configured:
# AES256-SHA:AES128-SHA
sslVersions = tls1.2
cipherSuite = ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
ecdhCurves = prime256v1, secp384r1, secp521r1
allowSslRenegotiation = true
sslQuietShutdown = false

[script://$SPLUNK_HOME\bin\scripts\splunk-wmi.path]
disabled = 0
interval = 10000000
source = wmi
sourcetype = wmi
queue = winparsing
persistentQueueSize = 200MB

# default single instance modular input restarts
[admon]
interval = 60
baseline = 0

[MonitorNoHandle]
interval = 60

[WinEventLog]
interval = 60
evt_resolve_ad_obj = 0
evt_dc_name =
evt_dns_name =

[WinNetMon]
interval = 60

[WinPrintMon]
interval = 60

[WinRegMon]
interval = 60
baseline = 0

[perfmon]
interval = 300

[powershell]
interval = 60

[powershell2]
interval = 60&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Sep 2020 17:33:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302762#M57183</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2020-09-29T17:33:43Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302763#M57184</link>
      <description>&lt;P&gt;I tried a complete reinstall of Splunk, same results. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;BR /&gt;
What I noticed is that the JSONs with missing values are all indexed only for roughly the first 160 lines; somehow Splunk doesn't index the complete JSON file. Is there a limit somewhere that I need to increase? Some of the JSONs are up to 500-600 lines long.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jan 2018 08:38:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302763#M57184</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-12T08:38:24Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate entries in index with JSON and missing values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302764#M57185</link>
      <description>&lt;P&gt;I fixed the missing values by adding the following settings to the _json source type:&lt;BR /&gt;
- TRUNCATE = 0&lt;BR /&gt;
- MAX_EVENTS = 1000&lt;BR /&gt;
Now the complete JSONs get indexed, but still twice.&lt;/P&gt;
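&lt;P&gt;For anyone landing here later, those two settings translate to a props.conf stanza along these lines (the local path is an assumption; put the settings wherever your sourcetype is defined):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# $SPLUNK_HOME/etc/system/local/props.conf  (assumed location)
[_json]
# 0 disables line-length truncation entirely
TRUNCATE = 0
# allow multi-line events of up to 1000 lines
MAX_EVENTS = 1000
&lt;/CODE&gt;&lt;/PRE&gt;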

&lt;P&gt;Any idea how to get rid of the twice-indexed JSONs?&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jan 2018 10:41:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Duplicate-entries-in-index-with-JSON-and-missing-values/m-p/302764#M57185</guid>
      <dc:creator>rbruinsma</dc:creator>
      <dc:date>2018-01-12T10:41:22Z</dc:date>
    </item>
  </channel>
</rss>

