So, a question about pulling useful data out of AFP (Apple Filing Protocol) logs on the server.
A line in the log looks like:
IP 123.456.789.101 - - [12/Jan/2011:09:23:06 -0800] "Login user" 0 0 0
So basically, it would be cool to be able to say "this is how many unique users logged in today."
Is this easily doable?
Probably a noobish question, but...
You should set up a sourcetype if you don't have one already. This is where you will set up some field extractions to extract the values you are looking for. Once that is done, building the specific search you are looking for is pretty easy as you get familiar with the tool.
Let's say your log file is located in /var/log/afl.log and you want your sourcetype to be called afl. Set up the following entries in props.conf:
[source::.../afl.log]
sourcetype = afl

[afl]
TIME_PREFIX = \[
TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z
MAX_TIMESTAMP_LOOKAHEAD = 128
SHOULD_LINEMERGE = False
EXTRACT-fields = ^(?<clientip>\d+\.\d+\.\d+\.\d+) \S+ \S+ \[[^\]]+\] "(?<user>[^"]+)" ...
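If you want to sanity-check the timestamp settings before indexing anything, here is a rough sketch in Python (not splunk itself). It assumes Python's strptime directives behave the same as the ones in TIME_FORMAT for this particular format string, and that month names are in English. The helper name parse_event_time is made up for illustration; it mimics TIME_PREFIX by starting right after the first "[".

```python
from datetime import datetime, timedelta

# Sample event copied from the question.
SAMPLE = 'IP 123.456.789.101 - - [12/Jan/2011:09:23:06 -0800] "Login user" 0 0 0'

def parse_event_time(line, time_format="%d/%b/%Y:%H:%M:%S %z"):
    """Parse the text between the first '[' and the following ']'.

    Hypothetical helper: it only imitates the idea behind TIME_PREFIX = \\[
    and TIME_FORMAT; it is not how splunk parses timestamps internally.
    """
    start = line.index("[") + 1
    end = line.index("]", start)
    return datetime.strptime(line[start:end], time_format)

ts = parse_event_time(SAMPLE)
print(ts.isoformat())
```

If strptime raises a ValueError on one of your real log lines, the format string probably needs adjusting before it goes into the config.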
Once that's set up, you can restart splunkd and set up splunk to index your afl log file. Once you have events in splunk, you should be able to see them with a search like so:
sourcetype=afl
Assuming that's working, you can move on to getting the report you're looking for like so:
sourcetype=afl | top user
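To make clear what kind of report that is: a top-style report tallies events per user, while "how many unique users" is a single distinct count. A minimal Python sketch of the difference, outside splunk, assuming the user field has already been extracted. The user names below are made up for illustration.

```python
from collections import Counter

# Hypothetical values of the extracted "user" field for one day's login events.
users = ["alice", "bob", "alice", "carol", "bob", "alice"]

def logins_per_user(values):
    """Tally events per user, most frequent first (roughly a 'top'-style report)."""
    return Counter(values).most_common()

def unique_user_count(values):
    """Number of distinct users, i.e. 'how many unique users logged in'."""
    return len(set(values))

print(logins_per_user(users))
print(unique_user_count(users))
```

The tally tells you who logs in most often; the distinct count is the single number the question asks for.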
BTW, I haven't actually tested the regex shown above, and the "..." is there because I have no clue what those last fields are for (it's also a valid regex. 😉 Yeah, I'm a geek.) If you post a few example events and indicate what the fields are for, I or someone else can help you out with the regex. You can use the IFX (interactive field extractor) for this too, but IMHO, for strictly formatted events like this you're better off writing a single all-in-one regex than creating many single-field regexes, as would happen with IFX.
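Since the regex is untested, one quick way to try it against the posted sample line is outside splunk. This is only a sketch in Python, with some assumptions: Python's re module wants (?P<name>...) instead of (?<name>...), so the group syntax is translated; the trailing "..." is left out because those fields are unknown; and PREFIXED_PATTERN is my own variant that assumes the literal "IP " at the start of the posted line really appears in the log.

```python
import re

# Sample event copied from the question.
SAMPLE = 'IP 123.456.789.101 - - [12/Jan/2011:09:23:06 -0800] "Login user" 0 0 0'

# The regex from the answer, translated to Python's named-group syntax,
# with the trailing "..." dropped.
ANSWER_PATTERN = r'^(?P<clientip>\d+\.\d+\.\d+\.\d+) \S+ \S+ \[[^\]]+\] "(?P<user>[^"]+)"'

# Assumed variant: same regex, but allowing for a literal "IP " prefix.
PREFIXED_PATTERN = r'^IP (?P<clientip>\d+\.\d+\.\d+\.\d+) \S+ \S+ \[[^\]]+\] "(?P<user>[^"]+)"'

def extract(pattern, line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = re.match(pattern, line)
    return match.groupdict() if match else None

print(extract(ANSWER_PATTERN, SAMPLE))
print(extract(PREFIXED_PATTERN, SAMPLE))
```

Against this sample, the anchored regex does not match because of the leading "IP ", while the prefixed variant does. Note that the user group captures the whole quoted string ("Login user"), not just a user name, so it may need splitting once the real field layout is known.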
Yes, only new events would be assigned to this sourcetype. (However, you can always reuse the sourcetype name splunk already assigned; I chose "afl" only as an example.) If you do use the same sourcetype that splunk used automatically, then the field extractions will work for old and new events, because EXTRACT- is a search-time setting, whereas TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, and SHOULD_LINEMERGE are index-time settings. (If you're new to splunk, it will pay off long-term to become familiar with the difference between search-time and index-time settings. I remember being confused at first; fortunately the docs are clearer on this point now.)
So, assuming I already have a bunch of data in splunk from this log, would adding a sourcetype apply only to new data, or to old data as well?
Would what you put above work for what I posted? I'm no expert on regex, but it looks like it's formatted basically correctly. The timestamp looks right.
So if the log is called "AppleFileServer.log" would that mean the [source::.../AppleFileServer.log] would look like that?