We ended up working this issue through support, and it turned out to be related to specific configurations in the customer's environment. If the customer wishes to share our findings, they can note them here.
That said, under normal circumstances, using wget to validate connectivity from the SH to the source is a good first step toward understanding why the download is failing.
Hmm, this indicates you are a cloud customer. If that is the case, email me your info at jwelch @ splunk dot com.
I will take a look for you.
Otherwise, if I am missing something here, we log the success or failure of each download in threatlist.log in /opt/splunk/var/log/splunk. You can search it with:
index=_internal source=*threatlist.log alexa
This could be a previous failure that has since succeeded, in which case you are hitting the bug I was talking about, which I did not think affected 4.2.0.
Or it really is failing, and I need to see why from the backend.
If you are not a cloud customer, you could run this from your SH to determine whether the download succeeds:
wget https://s3.amazonaws.com/alexa-static/top-1m.csv.zip
Let me know here or via email how I can help.
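If you have shell access to the search head (so, not a cloud stack), the threatlist.log check can also be done against the file directly. This is only a rough sketch: the path is the default /opt/splunk location mentioned above, the helper name `matching_lines` is made up for illustration, and it does a plain case-insensitive substring match rather than anything Splunk-specific.

```python
from pathlib import Path

# Default location mentioned in this thread; adjust if Splunk is installed elsewhere.
LOG_PATH = Path("/opt/splunk/var/log/splunk/threatlist.log")


def matching_lines(lines, keyword="alexa"):
    """Return the lines that contain keyword, ignoring case (hypothetical helper)."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


if __name__ == "__main__":
    # Print the most recent 20 log lines that mention the Alexa list.
    with LOG_PATH.open(encoding="utf-8", errors="replace") as handle:
        for line in matching_lines(handle)[-20:]:
            print(line)
```

The last few matching lines should show whether the most recent download attempt succeeded or failed.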
Can you access the URL https://s3.amazonaws.com/alexa-static/top-1m.csv.zip? This is where the Alexa Top Million is hosted. I can reach it myself, so I know they haven't shut down the Alexa Top Million (as happened a few months back, and presumably will happen again). It's possible that your Splunk ES Search Head itself can't access that URL because it is blocked by a content filter or web proxy somewhere in your network. If you don't use the Alexa Top Million, you could just disable the input.
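If wget is not available on the search head, a rough equivalent of the connectivity test can be sketched in Python. This is an illustration only, not how ES itself performs the download: `check_download` is a hypothetical helper, it sends a HEAD request so the zip is not actually fetched, and it assumes the server answers HEAD requests. Like wget, urllib picks up the http_proxy/https_proxy environment variables of the shell you run it from, which may differ from what the Splunk process uses.

```python
import urllib.error
import urllib.request

# URL quoted in this thread for the Alexa Top Million download.
ALEXA_URL = "https://s3.amazonaws.com/alexa-static/top-1m.csv.zip"


def check_download(url, timeout=10.0):
    """Return (ok, detail) describing whether url is reachable from this host."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server (or a proxy in between) answered, but with an error status.
        return False, f"HTTP {exc.code} {exc.reason}"
    except urllib.error.URLError as exc:
        # DNS failure, refused connection, TLS problem, and similar.
        return False, f"connection failed: {exc.reason}"
    except OSError as exc:
        # Timeouts and other socket-level errors raised outside URLError.
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    ok, detail = check_download(ALEXA_URL)
    print("reachable" if ok else "NOT reachable", "-", detail)
```

An HTTP error status suggests the request got out but was rejected, possibly by a content filter, while a connection failure points more toward DNS, firewall, or proxy configuration.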