Getting Data In

Monitor:// for a file that has a FOOTER

narwhal
Splunk Employee

I'm trying to monitor an HTML table in an HTML file that is updated regularly. The catch is that about 15 lines at the bottom of the file close out the page. I already know how to use props/transforms to keep only the entries I want, but I'm worried that Splunk will see the file grow and then just eat the footer again and again. Is there some way, ideally without a preprocessor (though that isn't out of the question), to get Splunk not only to notice that the file has more data, but also to index the new entries rather than the same footer as it gets pushed deeper into the file?

1 Solution

narwhal
Splunk Employee

I have taken the preprocessor route on this issue. Again, I have a programmatically generated HTML file with 202 lines of HTML gunk at the top and 17 lines at the bottom. I want to strip those off so that only the "table" HTML with its "rows" gets indexed into Splunk. So I now have a Linux cron job that fires every minute, uses a combination of head and tail to clean up the file and write a new one, and I monitor that new file.

My script loops and does this for each file, but the important part is how to use head and tail to accomplish the goal:

head -n -17 filename.html | tail -n +203 > filename.html.table
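
For anyone who wants the loop as well, here is a rough sketch of how the cron script could look. The function names `strip_table` and `refresh_tables` and the temp-file naming are made up for illustration and are not taken from my actual script. Two details to watch: `tail -n +K` starts output at line K, so skipping N header lines means passing N + 1, and the negative count in `head -n -N` is a GNU coreutils extension, so this assumes a Linux box.

```shell
#!/bin/sh
# Hypothetical helper: strip_table HEADER_LINES FOOTER_LINES FILE
# Prints FILE without its first HEADER_LINES and last FOOTER_LINES lines.
strip_table() {
    header=$1
    footer=$2
    file=$3
    # tail -n +K begins AT line K, so skip N lines with +(N + 1).
    # head -n -N (drop the last N lines) requires GNU head.
    tail -n +"$((header + 1))" "$file" | head -n -"$footer"
}

# Hypothetical loop: rebuild NAME.html.table for every NAME.html in a directory.
# 202 and 17 are the header/footer sizes from this post; adjust for your file.
refresh_tables() {
    dir=$1
    for src in "$dir"/*.html; do
        [ -f "$src" ] || continue   # the glob matched nothing
        tmp="$src.table.tmp"
        # Write to a temp file, then rename, so the monitored file is
        # never seen half-written.
        strip_table 202 17 "$src" > "$tmp" && mv "$tmp" "$src.table"
    done
}
```

Cron would then call something like `refresh_tables /path/to/reports` every minute (the path is a placeholder), and the monitor stanza would point at the resulting `.table` files.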

Hope that helps someone...
