Hi,
I have written Python code to download data from Splunk for a given search and date range, but the date range does not seem to work: I can see logs from outside the dates I entered.
Here is my code snippet:
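import requests

s = requests.Session()
# Stream the export so large results are not held in memory
r = s.post(url_path, auth=auth, data=data, stream=True, verify=self.verify_cert)
r.raise_for_status()
# The with block closes the file, so no explicit f.close() is needed
with open(output_file_path, 'wb') as f:
    for chunk in r.iter_content(chunk_size=512):
        if chunk:  # skip empty keep-alive chunks
            f.write(chunk)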
Here is my URL and JSON data object:
URL https://example-zone-ms.compnay.com:8089/services/search/jobs/export
{'search': 'search source=*FOO_access* http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv', 'earliest': '08/22/2019:0:0:0', 'latest': '08/22/2019:23:59:59'}
I don't think the dates will be taken literally; you have to use epoch conversions:
strptime(earliest,"%m/%d/%Y"), or pass the numeric representation of the dates in your time fields.
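A minimal Python sketch of the epoch conversion suggested above. The helper name to_epoch and the default of UTC are assumptions for illustration only; the thread does not say which timezone the Splunk server uses, and the resulting number changes with it.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"  # date format used for earliest/latest in this thread


def to_epoch(date_str, tz=timezone.utc):
    """Parse a date string and return epoch seconds (hypothetical helper).

    tz is the timezone the wall-clock time is assumed to be in. UTC is
    only a default for this sketch, not something the thread confirms.
    """
    naive = datetime.strptime(date_str, TIME_FORMAT)
    return int(naive.replace(tzinfo=tz).timestamp())


earliest_epoch = to_epoch("08/22/2019:0:0:0")
latest_epoch = to_epoch("08/22/2019:23:59:59")
print(earliest_epoch, latest_epoch)
```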
I've updated the code to use datetime objects as below, but I am still not getting data in my date range. I noticed the results cover the last 7 days from now.
earliest = datetime.strptime(self.earliest, "%m/%d/%Y:%H:%M:%S")
latest = datetime.strptime(self.latest, "%m/%d/%Y:%H:%M:%S")
Here is my JSON data:
{'search': 'search source=kong_access http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv', 'earliest': datetime.datetime(2019, 8, 15, 0, 0), 'latest': datetime.datetime(2019, 8, 15, 23, 59, 59)}
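As a rough illustration of why datetime objects in the payload may not help: when a form body is URL-encoded, non-string values are converted with str(). The sketch below uses the standard library's urllib.parse.urlencode rather than requests itself, so treating it as representative of what requests sends is an assumption.

```python
from datetime import datetime
from urllib.parse import urlencode

TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"  # date format used elsewhere in this thread

earliest = datetime(2019, 8, 15, 0, 0)

# A datetime value is stringified as "2019-08-15 00:00:00" before encoding.
encoded_object = urlencode({"earliest": earliest})

# Formatting explicitly keeps the string in the format used in the thread.
encoded_string = urlencode({"earliest": earliest.strftime(TIME_FORMAT)})

print(encoded_object)
print(encoded_string)
```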
Hi, just to test, can you hardcode the time as a relative value, say -1h?
No change, same result as before. Here is the hardcoded info:
{'search': 'search source=*kong_access* http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv', 'earliest': '-3h', 'latest': '-1h'}
Request and kwargs is ........
https://MACHINENAME:8089/services/search/jobs/export
{'response': []}
search=search+source%3D%2Akong_access%2A+http_apikey+%7C+fields+-+host%2Csource%2Csourcetype%2C+splunk_server%2C+_time%2C+index%2C+_serial&output_mode=csv&earliest=-3h&latest=-1h
{'timeout': None, 'allow_redirects': True, 'verify': False, 'proxies': OrderedDict(), 'stream': True, 'cert': None}
Hi, can you remove latest and just use this:
'earliest': '1567017000.000000'
Check the space and colon in the JSON format.
Same result, no change: I am getting data from Aug 24 to Aug 30.
{'search': 'search source=*kong_access* http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv', 'earliest': '1567017000.000000'}
Request and kwargs is ........
https://MACHINENAME:8089/services/search/jobs/export
{'response': []}
search=search+source%3D%2Akong_access%2A+http_apikey+%7C+fields+-+host%2Csource%2Csourcetype%2C+splunk_server%2C+_time%2C+index%2C+_serial&output_mode=csv&earliest=1567017000.000000
{'timeout': None, 'allow_redirects': True, 'verify': False, 'proxies': OrderedDict(), 'stream': True, 'cert': None}
OK, one last try from me:
{'search': 'search source=*FOO_access* earliest=-3d http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv'}
It seems to be working, which means earliest and latest should be part of the search string.
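A small sketch of that finding: put the time modifiers into the search string itself, before the first pipe. The helper add_time_range is a hypothetical name, and splitting on the first "|" is a simplification that would break if a pipe appeared inside a quoted term. Separately, Splunk's REST API reference lists earliest_time and latest_time as request parameters for search jobs, which may be why plain earliest/latest form fields had no effect; that alternative was not tested in this thread.

```python
def add_time_range(search, earliest=None, latest=None):
    """Insert earliest=/latest= modifiers before the first pipe (hypothetical helper)."""
    head, sep, tail = search.partition("|")
    modifiers = [
        f"{name}={value}"
        for name, value in (("earliest", earliest), ("latest", latest))
        if value is not None
    ]
    head = " ".join([head.strip(), *modifiers])
    return f"{head} {sep} {tail.strip()}" if sep else head


base = "search source=*kong_access* http_apikey | fields - host,source,sourcetype"

# Approach that worked in this thread: time modifiers inside the search string.
payload = {
    "search": add_time_range(base, "08/24/2019:0:0:0", "08/28/2019:23:59:59"),
    "output_mode": "csv",
}

# Untested alternative (assumption): separate request parameters.
alt_payload = {
    "search": base,
    "output_mode": "csv",
    "earliest_time": "-3h",
    "latest_time": "-1h",
}

print(payload["search"])
```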
Now the only issue I am facing is timezone related.
When I send:
earliest=08/24/2019:0:0:0 --> logs start at 08/24/2019:07:00:00
latest=08/28/2019:23:59:59 --> logs end at 08/29/2019:07:00:00
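A 7-hour shift like this is what one would see if the time modifiers were interpreted in a UTC-7 timezone while the log timestamps are written in UTC. That is only a guess: the thread does not state either timezone. Under that assumption, a short Python sketch reproduces the observed boundaries (as_utc and ASSUMED_SERVER_TZ are hypothetical names):

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"

# Assumption for illustration only: the search time range is read as UTC-7.
ASSUMED_SERVER_TZ = timezone(timedelta(hours=-7))


def as_utc(date_str, tz=ASSUMED_SERVER_TZ):
    """Interpret date_str as wall-clock time in tz and convert it to UTC."""
    local = datetime.strptime(date_str, TIME_FORMAT).replace(tzinfo=tz)
    return local.astimezone(timezone.utc)


start_utc = as_utc("08/24/2019:0:0:0").strftime(TIME_FORMAT)
end_utc = as_utc("08/28/2019:23:59:59").strftime(TIME_FORMAT)
print(start_utc)
print(end_utc)
```

If this is the cause, sending epoch values computed for the intended timezone would remove the ambiguity.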
Writing logs to file: /Users/i844276/Kong_log_08_01_2019_08_08_2019/kong_access_log_PROD_US_08_01_2019_0_0_0_08_08_2019_23_59_59.csv
Inside session py
{'search': 'search source=*kong_access* earliest=08/01/2019:0:0:0 latest=08/08/2019:23:59:59 http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv'}
Request and kwargs is ........
https://MACHINENAME:8089/services/search/jobs/export
{'response': []}
search=search+source%3D%2Akong_access%2A+earliest%3D08%2F01%2F2019%3A0%3A0%3A0++latest%3D08%2F08%2F2019%3A23%3A59%3A59+http_apikey+%7C+fields+-+host%2Csource%2Csourcetype%2C+splunk_server%2C+_time%2C+index%2C+_serial&output_mode=csv
{'timeout': None, 'allow_redirects': True, 'verify': False, 'proxies': OrderedDict(), 'stream': True, 'cert': None}
Hi @kotak86,
Yes, and I am sorry, I should have spotted that in the first instance. It was Friday night and late here...hehe
Now, coming to the remainder of your issue: it is strange at first glance. Run the query in the Splunk UI first and verify that the output is correct, for example: source=kong_access earliest=08/01/2019:0:0:0 latest=08/08/2019:23:59:59
Do you see any entries before 7 AM?
Consider using the number format. For example, if I run eval x=strptime("08/27/2019:07:59:59","%m/%d/%Y:%H:%M:%S"),
x comes out as 1566872999.000000, so the time modifier would be earliest="1566872999.000000". Also consider using quotes around the earliest and latest values.
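The epoch that strptime returns depends on the timezone in which the wall-clock string is interpreted. As a sketch, the Python below compares the string and the epoch quoted above to recover the UTC offset they imply; implied_utc_offset is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"


def implied_utc_offset(date_str, epoch):
    """Return the UTC offset under which date_str corresponds to epoch."""
    as_if_utc = datetime.strptime(date_str, TIME_FORMAT).replace(tzinfo=timezone.utc)
    return timedelta(seconds=as_if_utc.timestamp() - epoch)


offset = implied_utc_offset("08/27/2019:07:59:59", 1566872999)
print(offset)
```

If the server or the user's timezone preference differs from that offset, the same date string maps to a different epoch, which is one possible reason date strings and epoch values select different ranges.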
I am not getting consistent results; it looks very weird...
Sending the time as an epoch value ("1566872999.000000") and as a date string ("08/27/2019:07:59:59", format "%m/%d/%Y:%H:%M:%S") both work, but I am getting a different start boundary (the logs in the two files start at different times) for the following two examples:
{'search': 'search source=*kong_access* earliest=08/01/2019:0:0:0 latest=08/01/2019:23:59:59 http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv'}
{'search': 'search source=*kong_access* earliest=08/01/2019:0:0:0 latest=08/08/2019:23:59:59 http_apikey | fields - host,source,sourcetype, splunk_server, _time, index, _serial', 'output_mode': 'csv'}