Duplicate logs.

aalhabbash1
Path Finder

Hi Splunkers,

Splunk is pulling logs from a URL through the REST API input, but we noticed duplicate logs: Splunk indexes the same event more than once. Below is the inputs.conf I am using:

[rest://port scanner from cloud ps.log]
auth_type = none
endpoint = http://95.177.216.188/ps.log
host = 95.177.216.188
http_method = GET
http_proxy =
index = ps
index_error_response_codes = 0
response_handler = DefaultResponseHandler
response_type = json
sequential_mode = 0
sourcetype = ps:ports
streaming_request = 0
polling_interval = 420

Please help me with this.

BR;

Sukisen1981
Champion

Hmm, the issue is with the polling. You are polling every 420 s, but each poll still pulls the historical data. Before we get into the complexities of checkpoints, is it possible to tune your endpoint so that when it is queried it returns only the 'delta' data? Most APIs let you set some sort of start/end time, and you can probably do a smoke test by manually changing the start/end values in the API endpoint URL itself.

diogofgm
SplunkTrust

You are not monitoring a log in Splunk's sense of monitoring. You are just GETting a file and indexing it, which is why you get the duplicated logs: every poll returns the whole file again. When you are in fact monitoring (e.g. a log file available locally), Splunk is able to establish where it stopped indexing in the file and resumes from there.
Usually with REST you should be able to use the query string to pass additional parameters (e.g. start date, end date, etc.) to filter the data before indexing.
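
To make the difference concrete, here is a minimal Python sketch of the checkpoint idea. It is only an illustration, not how Splunk or the REST input is implemented: the function name `new_events` and the choice of the latest timestamp as the checkpoint are assumptions, and it assumes each line of the response is a Python-style dict literal with a 'timestamp' field in the format used in this thread.

```python
import ast
from datetime import datetime

# Assumed timestamp layout, matching the TIME_FORMAT discussed in this thread.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def new_events(payload, checkpoint):
    """Return (records newer than checkpoint, updated checkpoint).

    payload    -- full text returned by one poll, one record per line
    checkpoint -- datetime of the newest record already handled, or None
    """
    fresh = []
    latest = checkpoint
    for line in payload.splitlines():
        line = line.strip()
        if not line:
            continue
        # The records use single quotes, so they are Python literals, not JSON.
        record = ast.literal_eval(line)
        ts = datetime.strptime(record["timestamp"], TIME_FORMAT)
        # Keep only records strictly newer than what was already handled.
        if checkpoint is None or ts > checkpoint:
            fresh.append(record)
            if latest is None or ts > latest:
                latest = ts
    return fresh, latest
```

Calling `new_events` twice with the same payload returns every record the first time and an empty list the second time, which is the behaviour a plain GET of the whole file does not give you. Note that the strict comparison would skip records that arrive later with the same second as the checkpoint, so a real implementation would need extra bookkeeping for that boundary.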

------------
Hope I was able to help you. If so, some karma would be appreciated.

aalhabbash1
Path Finder

Thank you @diogofgm and @Sukisen1981 for the replies.

These logs contain a timestamp, and I configured timestamp extraction in props.conf, as you can see below. Shouldn't that mean Splunk looks at the timestamp and reads only the new logs, based on the new timestamps? Why does it read the historical logs?

[ps:ports]
category = Application
description = Ports logs produced by ps
pulldown_type = true

SHOULD_LINEMERGE = false
CHECK_FOR_HEADER = false
LINE_BREAKER = ({)\'timestamp
MAX_TIMESTAMP_LOOKAHEAD = 19
NO_BINARY_CHECK = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = \'timestamp\':\s\'
TZ = UTC

And here is a sample of these logs:

'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.28', 'port': '80', 'proto': 'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.11', 'port': '443', 'proto':
'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.13', 'port': '443', 'proto': 'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.49', 'port': '80', 'proto':

Then, how can I add parameters?

BR

diogofgm
SplunkTrust

Do you own the API endpoint or know who does? Do you have any documentation regarding the API? If so, check whether there's a way to filter the data you are retrieving. Like I said, by passing parameters in the query string, along the lines of http://yoururl.com/apiendpoint?start=2019-08-29&end=2019-08-30
Depending on how the API was coded, these parameters will probably be different or may not exist at all.
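
As a rough sketch of what "passing parameters in the query string" looks like in code, the Python helper below appends a time window to an endpoint URL. The helper name `with_time_window` is made up for illustration, and `start` and `end` are placeholder parameter names: they only work if the API behind the URL actually understands them.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_time_window(endpoint, start, end):
    """Return endpoint with start/end appended to its query string.

    'start' and 'end' are hypothetical parameter names; use whatever
    the API documentation specifies.
    """
    parts = urlsplit(endpoint)
    window = urlencode({"start": start, "end": end})
    # Preserve any query parameters already present on the URL.
    query = parts.query + "&" + window if parts.query else window
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )
```

For example, `with_time_window("http://yoururl.com/apiendpoint", "2019-08-29", "2019-08-30")` builds the URL shown above. Each poll would then compute a fresh window instead of requesting the whole file.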

Sukisen1981
Champion

Hi @aalhabbash1
None of the time parameters in your props.conf has anything to do with purging the historical data.
What you are showing in the logs is an example of the data your endpoint returns. What we are asking for is something like this - http://95.177.216.188/ps.log?start=xxxxx&end=yyyy

We are referring to the query string that you use to make the GET request to your endpoint.
My strong suggestion is to make the change in your GET endpoint rather than trying to use Splunk to filter things out.
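
To illustrate the point about props.conf, here is a small Python simulation. It is not Splunk's code: `event_time` and `index_poll` are invented names, and the list standing in for the index is an assumption. It only mimics what TIME_PREFIX, MAX_TIMESTAMP_LOOKAHEAD and TIME_FORMAT do, which is to find the timestamp inside an event and assign it as the event time.

```python
import re
from datetime import datetime, timezone

# Values copied from the props.conf stanza posted in this thread.
TIME_PREFIX = re.compile(r"\'timestamp\':\s\'")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
MAX_TIMESTAMP_LOOKAHEAD = 19


def event_time(raw_event):
    """Extract the event time from one raw event, or None if not found."""
    match = TIME_PREFIX.search(raw_event)
    if match is None:
        return None
    # Read at most MAX_TIMESTAMP_LOOKAHEAD characters after the prefix.
    text = raw_event[match.end():match.end() + MAX_TIMESTAMP_LOOKAHEAD]
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def index_poll(index, payload):
    """Append every event in one poll response to the stand-in index."""
    for raw in payload.splitlines():
        if raw.strip():
            # The timestamp is attached to the event; nothing is compared
            # against what the index already contains.
            index.append((event_time(raw), raw))
```

Running `index_poll` twice on the same response stores every event twice, each with the same extracted time. The timestamp settings decide when an event is dated, not whether it has been seen before.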
