I've spent a few hours with Splunk and have a few different inputs being piped into it. Unfortunately, I cannot for the life of me figure out how to get it to process that information in a meaningful way that can then be represented statistically. I've tried reading the getting-started material that Splunk has published, but I feel like it makes a lot of assumptions about the user's knowledge level that are beyond me.
Basically, what I would like to do is take a tab-delimited log file and pull information out of it. For example, I am using DansGuardian as a content filter, whose log format looks like this:
TimeStamp - Source IP - Attempted Site Visit - Result - Stuff
(Tab is used in log to separate fields, not spaces)
I'd like to be able to pull that information out and statistically analyze it and present it in those pretty charts and graphs that Splunk teases me with.
Where can I go to get an understanding of this stuff, or who is willing to walk me through it?
index = main
or
index = main Source_IP = * | stats count by Source_IP
or
index = main Source_IP = * OR Attempted_Site_Visit = * | stats count by Source_IP Attempted_Site_Visit
or
index = main Source_IP = * | stats count by Source_IP Attempted_Site_Visit Result Stuff
or
index = main | stats count by Stuff
or
index = main | stats count by Attempted_Site_Visit Source_IP
All of these things are called fields. If your logs are not extracted as fields automatically, then focus on that first. We can help with field extraction.
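If the tab-delimited fields are not being extracted automatically, one option is to pull them out at search time with the rex command. This is only a sketch: it assumes the events really have five tab-separated columns in the order you described, it reuses the field names from the searches above, and log_time is just a name I made up for the first column (Splunk keeps its own timestamp in _time).

```spl
index=main
| rex "^(?<log_time>[^\t]+)\t(?<Source_IP>[^\t]+)\t(?<Attempted_Site_Visit>[^\t]+)\t(?<Result>[^\t]+)\t(?<Stuff>.*)$"
| stats count by Source_IP Result
```

If that returns a table, the same stats count by ... pattern works for any of the other fields, and the result can be turned into a chart from the visualization options.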
Great idea, Iguinn.
Run a search that shows the router logs. If you don't know the source, then use this search:
index=main | dedup source | table source
This will show you all of the sources. Find the router source and then run this search:
index=main source=routersource
This will give you the routersource data. From there you can use the Field Extractor.
Use the Interactive Field Extractor!
See http://www.splunk.com/view/SP-CAAADUY for a video, or just Google "Splunk Interactive Field Extractor".
Well, I've got two routers reporting, and their logs have very different structures. I am interested in the source, destination, port, protocol, whether the packet was dropped or allowed, and the timestamp. Here is one event from each:
Mar 11 22:44:56 10.50.25.1 Mar 11 21:44:56 rv180w KERNEL [Kernel] [638811.520000] LOG_PACKET[DROP]IN=eth1 OUT= DST MAC=78:da:6e:e6:3b:7d SRC MAC=78:cd:8e:4b:1f:a2 PAYLOAD TYPE=08:00 SRC=97.92.215.221 DST=24.182.130.162 LEN=40 TOS=0x00 PREC=0x00 TTL=118 ID=639 DF PROTO=TCP SPT=49489 DPT=8000 WINDOW=0 RES=0x00 ACK URGP=0
Mar 11 19:24:03 24.182.134.18 Mar 11 18:27:09 10.50.22.1 [Access Log ] Deny TCP Packet - 10.50.22.3:46440 --> 10.50.25.4:25
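For the first router's format (the rv180w event), most of what you want is already written as KEY=value pairs. A rough sketch, assuming Splunk's automatic key/value extraction is picking up SRC, DST, PROTO and DPT under those exact names (check the fields sidebar to confirm), and using rex only for the DROP part inside the brackets. The field name action is my own choice, not something Splunk provides:

```spl
index=main "LOG_PACKET"
| rex "LOG_PACKET\[(?<action>\w+)\]"
| stats count by action SRC DST PROTO DPT
```

If SRC and the others do not show up as fields, then they need their own extractions as well.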
Yes, but it is often much better to configure Splunk to automatically extract them, so you just call the field names in the search.
Can you post some event data? That way we can give you specifics.
index=main | table _raw
Just a few events of the type that contain the data you want to analyze.
Awesome. Yes, my fields are way off or completely irrelevant. How can I modify or create my own fields using regex?
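As a starting point for regex-based fields, here is a sketch of a search-time extraction with rex, written against the "Access Log" style event posted in this thread (Deny TCP Packet - 10.50.22.3:46440 --> 10.50.25.4:25). The field names action, protocol, src, src_port, dst and dst_port are names I picked for illustration, and the pattern assumes every event of this type follows that exact layout:

```spl
index=main "Access Log"
| rex "(?<action>\w+) (?<protocol>\w+) Packet - (?<src>[\d.]+):(?<src_port>\d+) --> (?<dst>[\d.]+):(?<dst_port>\d+)"
| stats count by action protocol src dst dst_port
```

Each (?<name>...) group becomes a field called name for the rest of the search. Once the pattern looks right, it can be saved as a permanent field extraction so you do not have to repeat the rex in every search.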
Have you gone through the Search Tutorial? It contains a sample data set that has some similarities to yours, and includes search and reporting examples. The Exploring Splunk book also contains numerous search recipes and examples.
I also recommend the Exploring Splunk book. It is free in several electronic formats, or you can pay money for a hard copy.
I HAVE gone through that but found it really difficult to follow any of what was being talked about... though it was also around 1-2 am, so there's a good chance I was suffering some learning impairment, lol. I will go back through it again to see if I can pick up what I am looking for.
From what I remember, though, it assumes that you know how to use wildcards and Boolean operators, which I have very limited knowledge of, let alone how to combine several of them in a single search.
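For what it's worth, wildcards and Boolean operators in a Splunk search are fairly compact. A small sketch, assuming the fields from the DansGuardian searches above are already extracted; the values (10.50.*, DENIED, blocked, example.com) are made up purely for illustration:

```spl
index=main Source_IP=10.50.* (Result=*DENIED* OR Result=*blocked*) NOT Attempted_Site_Visit=*example.com*
| stats count by Source_IP
```

The * matches any run of characters, terms placed side by side are implicitly ANDed, OR and NOT have to be written in uppercase, and parentheses group the OR so it does not swallow the other conditions.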