I'm not quite sure how to ask my question, and I'm not sure what data would be relevant to share in order to help solve the problem. First I'll explain that I have Splunk Enterprise 6.3 installed on one Search Head, one Indexer and another server acting as the License manager/Distributed Management Console. All three are virtual machines on different hosts with plenty of resources allotted to them to perform the job they are being asked to do.
After setting up my Splunk instance I installed the Splunk Add-on for Okta and configured it to use the API key I created in my company's Okta instance. It pulls in the maximum number of events (1,000) every time it connects. The problem is that it doesn't seem to pull any data from more than 24 hours after the start time I entered in the Data Inputs configuration. I have let it run for more than a week with the interval set to 120 seconds to get as much data indexed as quickly as possible, but the only data I can search is from the date I set as the "Start Time".
The data I can search looks valid and the numbers are accurate, but if I'm only ever going to be able to index 24 hours' worth of data, then I don't see the point of this add-on/app.
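For reference, this is roughly how I would expect a polling input to move forward in time: each run asks for events published at or after a saved checkpoint, then moves the checkpoint just past the newest event it received. This is only a minimal sketch under that assumption, not the add-on's actual code. `fetch_page` and `poll_once` are hypothetical names, and the "API" is an in-memory list.

```python
from datetime import datetime, timedelta

def fetch_page(events, start, limit):
    """Stand-in for an events API call: return up to `limit` events
    published at or after `start`. `events` must be sorted by time."""
    return [e for e in events if e["published"] >= start][:limit]

def poll_once(events, checkpoint, limit=1000):
    """Run one polling interval and return (page, new_checkpoint)."""
    page = fetch_page(events, checkpoint, limit)
    if page:
        # Move just past the newest event so the next poll does not
        # request the same page again.
        checkpoint = page[-1]["published"] + timedelta(milliseconds=1)
    return page, checkpoint

# Simulate 2,500 events, one per minute, starting at the configured start time.
start_time = datetime(2015, 11, 19)
all_events = [{"published": start_time + timedelta(minutes=i)} for i in range(2500)]

checkpoint = start_time
page_sizes = []
for _ in range(4):
    page, checkpoint = poll_once(all_events, checkpoint)
    page_sizes.append(len(page))
```

If the checkpoint were never saved, or were reset to the configured start time on every run, each poll would return the same first 1,000 events, which would look a lot like what I'm seeing.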
I'm sure there is something I haven't set up properly, or something in the logs I'm missing that would clue me in on what is wrong, but I'm very new to Splunk and only self-taught. My company hasn't purchased it yet, so no professional services are available, and it's my job to get usable data out of Splunk to justify purchasing the product.
If anyone could help, please let me know which log files, configuration settings, or even Python scripts you need to look at (I'm not going to even attempt to say that I'm savvy with Python).
I have the exact same issue!
I have configured my input with no start date (which defaults to starting from the previous 30 days), and the only data I can see on my dashboards covers 24 hours starting from a month back! So basically, I see data for only one day (Jan 2nd; I configured it today, Feb 2nd) and nothing after that.
Elias or rdunlap - were you able to fix this issue? Please let me know too! Thanks!
I turned debugging on in the okta.conf file and I see log entries like this:
2017-02-02 10:04:06,102 ERROR pid=27732 tid=MainThread file=okta_rest_client.py:_log_api_error:117 | Failed to connect https://XYZABC.okta.com/api/v1/groups?after=00g41dbnnnazNSPql0x7&limit=200, code=E0000047, reason="API call exceeded rate limit due to too many requests."
Is there some rate limiting on the Okta side that needs to be configured, or is there a timeout value that needs to be increased on the Splunk app side?
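As far as I understand it, E0000047 is Okta's rate-limit error and it comes back as an HTTP 429, so the client is expected to wait and retry rather than fail outright. Below is a minimal sketch of that idea, not what the add-on does. `call_with_rate_limit_retry` and `do_request` are hypothetical names, and I'm assuming the response carries an `X-Rate-Limit-Reset` header holding an epoch time in seconds. If the header is absent, the sketch falls back to exponential backoff.

```python
import time

def call_with_rate_limit_retry(do_request, max_attempts=5,
                               sleep=time.sleep, now=time.time):
    """Call do_request() until it stops returning HTTP 429.

    do_request() must return a (status, headers, body) tuple.
    Returns (status, body) of the first response that is not a 429.
    """
    delay = 1.0
    for _ in range(max_attempts):
        status, headers, body = do_request()
        if status != 429:
            return status, body
        reset = headers.get("X-Rate-Limit-Reset")  # assumed header name
        if reset is not None:
            # Wait until the time the server says the limit resets.
            wait = max(0.0, float(reset) - now())
        else:
            # No hint from the server: back off exponentially.
            wait = delay
            delay *= 2
        sleep(wait)
    raise RuntimeError("still rate limited after %d attempts" % max_attempts)
```

The `sleep` and `now` parameters exist only so the function can be exercised without actually waiting.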
I can't seem to find a setting option to set it to debug. There doesn't seem to be a debug option within the okta.conf file and there's no option to set debug within the Web Console.
When I do the following search:
index=_internal source=*ta_okta* NOT INFO
I get the following events:
2015-11-20 06:39:43,903 ERROR pid=2776 tid=MainThread file=okta_rest_client.py:request:90 | Failed to connect https://company.okta.com/api/v1/events?limit=1000&filter=published+ge+%222015-11-19T00%3A00%3A00.000..., reason=Traceback (most recent call last):
  File "S:\Splunk\etc\apps\Splunk_TA_okta\bin\okta_rest_client.py", line 78, in request
    headers=headers)
  File "S:\Splunk\etc\apps\Splunk_TA_okta\bin\splunktalib\httplib2\__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "S:\Splunk\etc\apps\Splunk_TA_okta\bin\splunktalib\httplib2\__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "S:\Splunk\etc\apps\Splunk_TA_okta\bin\splunktalib\httplib2\__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "S:\Splunk\Python-2.7\Lib\httplib.py", line 1073, in getresponse
    response.begin()
  File "S:\Splunk\Python-2.7\Lib\httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "S:\Splunk\Python-2.7\Lib\httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "S:\Splunk\Python-2.7\Lib\socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "S:\Splunk\Python-2.7\Lib\ssl.py", line 714, in recv
    return self.read(buflen)
  File "S:\Splunk\Python-2.7\Lib\ssl.py", line 608, in read
    v = self._sslobj.read(len or 1024)
SSLError: ('The read operation timed out',)
The most recent event was on the 20th of November, and that error has cleared up on its own. I'm not sure why, but I have only seen that error the first time the add-on tries to connect after I change the data input settings and re-add the API key. After the failure it re-attempts the connection and succeeds, returning the requested number of events.
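In case it helps anyone working with a copy of ta_okta.log outside of Splunk: the first line of each event has a fixed prefix (timestamp, level, pid, tid, file:function:line, then the message after a pipe), so it is easy to count events by level and code location and see which ones dominate. This is just a sketch based on the log lines pasted in this thread. `LINE_RE` and `summarize` are my own names, and the pattern may not cover every line the add-on writes.

```python
import re
from collections import Counter

# Pattern inferred from the ta_okta.log lines quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) pid=(?P<pid>\d+) tid=(?P<tid>\S+) "
    r"file=(?P<file>[^:\s]+):(?P<func>[^:\s]+):(?P<line>\d+) \| (?P<msg>.*)$"
)

def summarize(lines):
    """Count log events by (level, source file, function).

    Lines that do not match the prefix (for example traceback
    continuation lines) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("level"), match.group("file"), match.group("func"))] += 1
    return counts

sample = [
    '2017-02-02 10:04:06,102 ERROR pid=27732 tid=MainThread file=okta_rest_client.py:_log_api_error:117 | Failed to connect',
    '2015-11-25 09:37:34,426 INFO pid=4920 tid=MainThread file=okta_base.py:write:123 | Traceback (most recent call last):',
    '  File "okta_base.py", line 121, in write',
]
summary = summarize(sample)
```

To run it over a real file, something like `summarize(open(path))` followed by `summary.most_common(10)` should show the ten most frequent combinations.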
Thanks for the reply.
Since I have the logging for the Okta Add-on set to 'INFO', the search command in step one returns 335,472 events.
With over 300,000 events returned, do you have an idea of what type of event I should look for so I can paste it in this thread?
I've pasted the first returned event as an example.
2015-11-25 09:37:34,426 INFO pid=4920 tid=MainThread file=okta_base.py:write:123 | Traceback (most recent call last):
  File "S:\Splunk\etc\apps\Splunk_TA_okta\bin\okta_base.py", line 121, in write
    os.rename(self._fname, self._fname + ".old")
WindowsError: [Error 2] The system cannot find the file specified
host = SPLUNK-SEARCH
source = S:\Splunk\var\log\splunk\ta_okta.log
sourcetype = ta_okta-4
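From what I can tell, "[Error 2] The system cannot find the file specified" is what `os.rename` raises on Windows when the source file doesn't exist yet. I don't know which file okta_base.py is renaming there, so the following is only a sketch of a rename that tolerates a missing source, not the add-on's code. `rotate` is a hypothetical name, and it uses `os.replace`, which needs Python 3 (the add-on runs on Splunk's bundled Python 2.7).

```python
import os

def rotate(fname):
    """Move fname to fname + '.old'.

    Returns True if the file was moved, False if fname did not exist.
    os.replace overwrites an existing '.old' file, which plain
    os.rename refuses to do on Windows.
    """
    try:
        os.replace(fname, fname + ".old")
        return True
    except FileNotFoundError:
        return False
```

On the first run, when nothing has been written yet, `rotate` simply returns False instead of raising an error that ends up in the log.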