Hi Splunkers,
Splunk is pulling logs from a URL through the REST API, but we noticed duplicate logs: Splunk indexes the same event more than once. Below is the inputs.conf I used:
[rest://port scanner from cloud ps.log]
auth_type = none
endpoint = http://95.177.216.188/ps.log
host = 95.177.216.188
http_method = GET
http_proxy =
index = ps
index_error_response_codes = 0
response_handler = DefaultResponseHandler
response_type = json
sequential_mode = 0
sourcetype = ps:ports
streaming_request = 0
polling_interval = 420
Please help me with this.
BR;
Hmm, the issue is with the polling. You are polling at an interval of 420 s, but each poll still pulls the historical data. Before we get into the complexities of checkpoints, is it possible to tune your endpoint so that, when it is queried, it returns only the 'delta' data? Most APIs let you set some sort of start/end time, and you can probably do a smoke test by manually changing the start/end values in the API endpoint URL itself.
You are not monitoring a log in Splunk's sense of monitoring. You are just GETting a file and indexing it, and each poll returns the whole file again. That's why you get the duplicated logs. When you are in fact monitoring (e.g. a log file available locally), Splunk is able to establish where it stopped indexing in the file and resumes from there.
Usually with REST you should be able to use the query string to pass additional parameters (e.g. start date, end date, etc.) to filter the data before indexing.
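To illustrate the difference, here is a rough Python sketch of the "remember where you stopped" idea that file monitoring relies on. It is a simplified illustration only, not Splunk's actual implementation; the function name `read_new_data` and the in-memory `checkpoint` dict are made up for the example, and it assumes the remote file is append-only.

```python
def read_new_data(body: str, checkpoint: dict) -> str:
    """Return only the part of `body` that was not seen on the previous poll.

    `checkpoint` is a hypothetical dict holding the offset reached last time.
    Assumes the remote file only ever grows (append-only).
    """
    offset = checkpoint.get("offset", 0)
    if offset > len(body):
        # The file shrank (rotated or truncated), so start over from the top.
        offset = 0
    new_data = body[offset:]
    checkpoint["offset"] = len(body)
    return new_data


checkpoint = {}
first_poll = "event 1\nevent 2\n"
second_poll = "event 1\nevent 2\nevent 3\n"

print(repr(read_new_data(first_poll, checkpoint)))   # 'event 1\nevent 2\n'
print(repr(read_new_data(second_poll, checkpoint)))  # 'event 3\n'
```

A plain GET with no such state behaves like calling this with a fresh, empty checkpoint on every poll, so everything comes back each time.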
Thank you @diogofgm and @Sukisen1981 for the replies.
These logs contain a timestamp, and I configured timestamp extraction in props.conf, as shown below. I expected Splunk to use the timestamp to read only the new logs, so why does it read the historical logs again?
[ps:ports]
category = Application
description = Ports logs produced by ps
pulldown_type = true
SHOULD_LINEMERGE = false
CHECK_FOR_HEADER = false
LINE_BREAKER = ({)\'timestamp
MAX_TIMESTAMP_LOOKAHEAD = 19
NO_BINARY_CHECK = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = \'timestamp\':\s\'
TZ = UTC
Here is a sample of these logs:
'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.28', 'port': '80', 'proto': 'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.11', 'port': '443', 'proto':
'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.13', 'port': '443', 'proto': 'tcp'}]}\n{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', 'host': '82.147.220.49', 'port': '80', 'proto':
Then, how can I add parameters?
BR
Do you own the API endpoint, or know who does? Do you have any documentation for the API? If so, check whether there’s a way to filter the data you are retrieving, as I said, by passing parameters in the query string along the lines of http://yoururl.com/apiendpoint?start=2019-08-29&end=2019-08-30
Depending on how the API was coded, these parameters will probably be different, or may not exist at all.
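For a quick smoke test, a URL like that can be built in Python roughly as follows. This is only a sketch: the parameter names `start` and `end` are placeholders taken from the example URL above, and your API may use different names or none at all. `build_url` is a made-up helper name.

```python
from urllib.parse import urlencode


def build_url(base_url: str, start: str, end: str) -> str:
    """Append hypothetical `start` and `end` filters as a query string."""
    query = urlencode({"start": start, "end": end})
    return f"{base_url}?{query}"


print(build_url("http://yoururl.com/apiendpoint", "2019-08-29", "2019-08-30"))
# http://yoururl.com/apiendpoint?start=2019-08-29&end=2019-08-30
```

`urlencode` also escapes spaces and colons, which matters if the endpoint expects full timestamps rather than dates.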
Hi @aalhabbash1
None of the time parameters in your props.conf has anything to do with skipping historical data. They only tell Splunk how to extract the timestamp from each event.
What you are showing in the logs is an example of what your endpoint returns. What we are asking about is something like this: http://95.177.216.188/ps.log?start=xxxxx&end=yyyy
We are referring to the query string that you use to make the GET request to your endpoint.
My strong suggestion is to make the change in your GET endpoint rather than try to use Splunk to filter things out.
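If you do control the endpoint, the filtering on that side could be as small as something like the sketch below. It assumes each line of ps.log is a Python-style dict as in the sample posted above, and that the endpoint receives a `start` value in the same format as the log timestamps. The function name `lines_after` and the parameter name are hypothetical, and the second timestamp in the sample data is invented for the illustration.

```python
import ast
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # same format as TIME_FORMAT in props.conf


def lines_after(log_text: str, start: str) -> list:
    """Return the log lines whose 'timestamp' is strictly later than `start`."""
    cutoff = datetime.strptime(start, TIME_FORMAT)
    kept = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The lines use single quotes, so they are not strict JSON;
        # ast.literal_eval parses them as Python literals instead.
        event = ast.literal_eval(line)
        if datetime.strptime(event["timestamp"], TIME_FORMAT) > cutoff:
            kept.append(line)
    return kept


sample = (
    "{'timestamp': '2019-09-01 10:00:56', 'data': [{'status': 'open', "
    "'host': '82.147.220.28', 'port': '80', 'proto': 'tcp'}]}\n"
    "{'timestamp': '2019-09-01 10:07:56', 'data': [{'status': 'open', "
    "'host': '82.147.220.13', 'port': '443', 'proto': 'tcp'}]}\n"
)

print(len(lines_after(sample, "2019-09-01 10:00:56")))  # 1
```

Note that a strict `>` can miss events that share the cutoff second; whether to use `>=` and accept some overlap depends on how the log is written.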