What is the best way to monitor a web page contain...

landen99 · ‎05-26-2015

I want to monitor a web page that links to XML RSS feeds so that those feeds are indexed into Splunk in real time. Let's use the following website as an example: https://spotcrime.com/rss.php

What is the best way (or ways) to monitor that site for indexing? I want to test the method on my Windows machine first. I am interested in learning more about PowerShell or Python scripted inputs if that is the best approach.

LukeMurphey · ‎05-26-2015

If you want to get all of those XML feeds into Splunk without entering them manually, writing a custom script might be the best approach. I have been considering writing a search command that will let you scrape web pages recursively. I'll take a look at doing that soon (perhaps in the next couple of days); you can monitor progress on that here.

For reference, there are a couple of apps that may be useful to you:

- Web Input: this app will allow you to scrape web pages using a modular input. You could use this to get a list of the RSS feeds from https://spotcrime.com/rss.php.
- Syndication Input: this app is similar to the RSS scripted input except that it offers a simple user interface for making inputs and it supports more than just RSS.
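
For anyone who wants to try the custom-script route mentioned above, here is a minimal Python sketch using only the standard library. It is not how either app works; it just pulls the links off the index page, fetches each feed, and prints one key="value" line per item to stdout, which is what a scripted input expects. The function names, the User-Agent string, and the guess that feed links end in `.xml` or `.rss` are all assumptions, so check them against the real page before relying on this.

```python
import sys
import urllib.request
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_feed_links(html, base_url, suffixes=(".xml", ".rss")):
    """Return absolute, de-duplicated URLs of links that look like feeds.

    The suffix filter is an assumption; adjust it to the real link pattern.
    """
    parser = LinkCollector()
    parser.feed(html)
    seen, feeds = set(), []
    for href in parser.hrefs:
        url = urljoin(base_url, href)
        if url.lower().split("?")[0].endswith(suffixes) and url not in seen:
            seen.add(url)
            feeds.append(url)
    return feeds


def rss_items(xml_text):
    """Yield a dict of common fields for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        yield {
            field: (item.findtext(field) or "").strip()
            for field in ("title", "link", "pubDate", "description")
        }


def format_event(feed_url, item):
    """Render one item as a single key="value" line for Splunk to index."""
    pairs = [("feed", feed_url)] + list(item.items())
    return " ".join(
        '{}="{}"'.format(key, " ".join(value.split()).replace('"', "'"))
        for key, value in pairs
    )


def fetch(url, timeout=30):
    """Download a URL and return the raw response body as bytes."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "rss-scripted-input-sketch"}  # hypothetical
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def main(index_url):
    html = fetch(index_url).decode("utf-8", errors="replace")
    for feed_url in extract_feed_links(html, index_url):
        try:
            for item in rss_items(fetch(feed_url)):
                print(format_event(feed_url, item))
        except Exception as exc:  # one bad feed should not stop the rest
            print("feed error {}: {}".format(feed_url, exc), file=sys.stderr)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

To test it outside Splunk first, save it under a name of your choice (say `rss_feeds.py`) and run `python rss_feeds.py https://spotcrime.com/rss.php`. Once the output looks right, it can be registered as a scripted input with a `[script://...]` stanza and an `interval` in inputs.conf. Note that this sketch does not remember which items it has already printed, so every run re-emits the full contents of each feed; you would need to track seen links or deduplicate at search time.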

landen99 · ‎05-26-2015

I found a couple of links on the subject, but I don't understand specifically what I should do to make it work: https://splunkbase.splunk.com/app/278/ and http://blogs.splunk.com/2012/03/14/indexing-feeds/ Also, some of these articles are a bit old (2012, etc.), so I am unsure how much has changed since then and whether there are now better ways to do it.

What is the best way to monitor a web page containing links to xml to index RSS feeds in Splunk?
