<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Duplicate logs. in Monitoring Splunk</title>
    <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466867#M8224</link>
    <description>&lt;P&gt;You are not monitoring a log in Splunk's sense of monitoring. You are just GETting a file and indexing it. That's why you get the duplicated logs. When you are in fact monitoring (e.g. a log file available locally), Splunk is able to establish where it stopped indexing in the file and restart from there.&lt;BR /&gt;
Usually with REST you should be able to use the query string to pass additional parameters (e.g. start date, end date, etc.) to filter the data before indexing.&lt;/P&gt;</description>
    <pubDate>Sun, 01 Sep 2019 11:08:52 GMT</pubDate>
    <dc:creator>diogofgm</dc:creator>
    <dc:date>2019-09-01T11:08:52Z</dc:date>
    <item>
      <title>Duplicate logs.</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466866#M8223</link>
      <description>&lt;P&gt;Hi Splunkers,&lt;/P&gt;

&lt;P&gt;Splunk is monitoring logs from a URL via the REST API, but we noticed that there are duplicate logs, I mean Splunk reads the same event more than once. You can see below the inputs.conf I used:&lt;/P&gt;

&lt;P&gt;[rest://port scanner from cloud ps.log]&lt;BR /&gt;
auth_type = none&lt;BR /&gt;
endpoint = &lt;A href="http://95.177.216.188/ps.log" target="_blank"&gt;http://95.177.216.188/ps.log&lt;/A&gt;&lt;BR /&gt;
host = 95.177.216.188&lt;BR /&gt;
http_method = GET&lt;BR /&gt;
http_proxy = &lt;BR /&gt;
index = ps&lt;BR /&gt;
index_error_response_codes = 0&lt;BR /&gt;
response_handler = DefaultResponseHandler&lt;BR /&gt;
response_type = json&lt;BR /&gt;
sequential_mode = 0&lt;BR /&gt;
sourcetype = ps:ports&lt;BR /&gt;
streaming_request = 0&lt;BR /&gt;
polling_interval = 420&lt;/P&gt;

&lt;P&gt;Please help me with this...&lt;/P&gt;

&lt;P&gt;BR;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:58:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466866#M8223</guid>
      <dc:creator>aalhabbash1</dc:creator>
      <dc:date>2020-09-30T01:58:18Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate logs.</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466867#M8224</link>
      <description>&lt;P&gt;You are not monitoring a log in Splunk's sense of monitoring. You are just GETting a file and indexing it. That's why you get the duplicated logs. When you are in fact monitoring (e.g. a log file available locally), Splunk is able to establish where it stopped indexing in the file and restart from there.&lt;BR /&gt;
Usually with REST you should be able to use the query string to pass additional parameters (e.g. start date, end date, etc.) to filter the data before indexing.&lt;/P&gt;
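&lt;P&gt;As a rough illustration of that checkpoint idea, here is a minimal Python sketch (the state-file name and the forwarding step are assumptions, not something the REST input does for you): it remembers how far into ps.log it has already read, so each poll only forwards the lines added since the previous poll, which is roughly what Splunk's file monitor does for a local file.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import os
import requests

# Hypothetical sketch: URL taken from the thread; the state-file name is an assumption.
URL = "http://95.177.216.188/ps.log"
STATE_FILE = "ps_log.offset"

def read_offset():
    # Offset (in characters) of the data already forwarded; 0 on the first run.
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return int(f.read().strip() or 0)
    return 0

def write_offset(offset):
    with open(STATE_FILE, "w") as f:
        f.write(str(offset))

def poll_once():
    body = requests.get(URL, timeout=30).text
    offset = read_offset()
    # Everything before 'offset' was seen on a previous poll; only the tail is new.
    for line in body[offset:].splitlines():
        if line.strip():
            print(line)  # forward to Splunk (e.g. via HEC) instead of printing
    write_offset(len(body))

if __name__ == "__main__":
    poll_once()&lt;/CODE&gt;&lt;/PRE&gt;</description>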
      <pubDate>Sun, 01 Sep 2019 11:08:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466867#M8224</guid>
      <dc:creator>diogofgm</dc:creator>
      <dc:date>2019-09-01T11:08:52Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate logs.</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466868#M8225</link>
      <description>&lt;P&gt;Hmm, the issue is with the polling. You are polling at an interval of 420 s, but it is still pulling the historical data. Before we get into the complexities of checkpoints, is it possible for you to tune your endpoint so that when it is queried it only returns the 'delta' data? In most APIs it is possible to set some sort of start/end timings, and you can probably do a smoke test by manually changing/modifying the start/end time values in the API endpoint URL itself.&lt;/P&gt;
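&lt;P&gt;A quick way to run that smoke test, sketched in Python ('start' and 'end' are hypothetical parameter names; check the API documentation for the real ones): request the file once without parameters and once with a narrow time window, then compare the sizes. If the windowed response is much smaller, the endpoint can return only the delta and the input URL can be adjusted accordingly; if both sizes match, the parameters are being ignored.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import requests

# 'start' and 'end' are hypothetical parameter names; depending on how the API
# was written they may be different or not exist at all.
BASE_URL = "http://95.177.216.188/ps.log"

full = requests.get(BASE_URL, timeout=30).text
windowed = requests.get(
    BASE_URL,
    params={"start": "2019-09-01 10:00:00", "end": "2019-09-01 11:00:00"},
    timeout=30,
).text

# If the endpoint honours the parameters, the windowed response should be much
# smaller than the full file; if both sizes match, the parameters are ignored.
print(len(full), len(windowed))&lt;/CODE&gt;&lt;/PRE&gt;</description>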
      <pubDate>Sun, 01 Sep 2019 11:13:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466868#M8225</guid>
      <dc:creator>Sukisen1981</dc:creator>
      <dc:date>2019-09-01T11:13:32Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate logs.</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466869#M8226</link>
      <description>&lt;P&gt;Thank you &lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/90723"&gt;@diogofgm&lt;/a&gt; and &lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/182782"&gt;@Sukisen1981&lt;/a&gt; for the replies;&lt;/P&gt;

&lt;P&gt;These logs contain a timestamp and I set the timestamp configuration in props.conf, as you can see below. That means Splunk should look at the timestamp and only read the new incoming logs based on their timestamp, so why does it read the historical logs?&lt;/P&gt;

&lt;P&gt;[ps:ports]&lt;BR /&gt;
category = Application&lt;BR /&gt;
description = Ports logs produced by ps&lt;BR /&gt;
pulldown_type = true&lt;/P&gt;

&lt;P&gt;SHOULD_LINEMERGE = false&lt;BR /&gt;
CHECK_FOR_HEADER = false&lt;BR /&gt;
LINE_BREAKER = ({)\'timestamp&lt;BR /&gt;
MAX_TIMESTAMP_LOOKAHEAD = 19&lt;BR /&gt;
NO_BINARY_CHECK = true&lt;BR /&gt;
TIME_FORMAT = %Y-%m-%d %H:%M:%S&lt;BR /&gt;
TIME_PREFIX = \'timestamp\':\s\'&lt;BR /&gt;
TZ = UTC&lt;/P&gt;

&lt;P&gt;And you can see a sample of these logs below:&lt;/P&gt;

&lt;P&gt;'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.28', 'port': '80', 'proto': 'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.11', 'port': '443', 'proto':&lt;BR /&gt;
'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.13', 'port': '443', 'proto': 'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.49', 'port': '80', 'proto':&lt;/P&gt;

&lt;P&gt;Then, how can I add parameters?&lt;/P&gt;

&lt;P&gt;BR&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:58:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466869#M8226</guid>
      <dc:creator>aalhabbash1</dc:creator>
      <dc:date>2020-09-30T01:58:20Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate logs.</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466870#M8227</link>
      <description>&lt;P&gt;Hi @aalhabbash1,&lt;BR /&gt;
None of the time parameters in your props.conf has anything to do with skipping the historical data.&lt;BR /&gt;
What you are showing in the logs is an example of what your endpoint retrieves; what we are asking about is something like this - &lt;CODE&gt;&lt;A href="http://95.177.216.188/ps.log/start?=xxxxx" target="_blank"&gt;http://95.177.216.188/ps.log/start?=xxxxx&lt;/A&gt; end?=yyyy&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;We are referring to the query string that you use to make the GET request against your endpoint.&lt;BR /&gt;
My strong suggestion is to make changes in your GET endpoint rather than trying to use Splunk to filter things out.&lt;/P&gt;</description>
      <pubDate>Sun, 01 Sep 2019 14:27:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466870#M8227</guid>
      <dc:creator>Sukisen1981</dc:creator>
      <dc:date>2019-09-01T14:27:12Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate logs.</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466871#M8228</link>
      <description>&lt;P&gt;Do you own the API endpoint or know who does? Do you have any documentation regarding the API? If so, check if there’s a way to filter the data you are retrieving, like I said, by passing parameters in the query string along the lines of &lt;CODE&gt;&lt;A href="http://yoururl.com/apiendpoint?start=2019-08-29&amp;amp;end=2019-08-30"&gt;http://yoururl.com/apiendpoint?start=2019-08-29&amp;amp;end=2019-08-30&lt;/A&gt;&lt;/CODE&gt;&lt;BR /&gt;
Depending on how the API was coded, these will probably be different or even non-existent.&lt;/P&gt;</description>
      <pubDate>Sun, 01 Sep 2019 21:32:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Duplicate-logs/m-p/466871#M8228</guid>
      <dc:creator>diogofgm</dc:creator>
      <dc:date>2019-09-01T21:32:07Z</dc:date>
    </item>
  </channel>
</rss>

