Hi, I've been combing through the documentation and the answers site, but so far I've been unable to come up with a solution to the problem in the title. I'd appreciate any advice!
I'm working with archived data from remote systems that includes the output of the unix/linux-style "iptables -L" command. We want to search the data for ACCEPTs, source addresses, and so on.
Individual lines in the data don't carry date/time info or "chain" names, so I wrote a Python script that reads stdin and outputs lines with a date/time stamp and a series of name=value pairs. I hoped to get this working from props.conf with a stanza that looks roughly like this:
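Since the script itself isn't shown in the post, here's a minimal sketch of what such a stdin-to-stdout filter might look like. The field names (target, prot, src, dst) are my assumptions based on the columns "iptables -L -n" typically prints, not the poster's actual code:

```python
#!/usr/bin/env python
"""Hypothetical sketch of a filter like interpret-iptables-eventlog.py:
read iptables -L output on stdin, emit timestamped name=value events."""
import sys
import time


def to_kv(line):
    """Turn one iptables rule line into a timestamped name=value event."""
    parts = line.split()
    # Skip blank lines, "Chain ..." banners, and the column-header row.
    if len(parts) < 5 or parts[0] in ("Chain", "target"):
        return None
    target, prot, _opt, src, dst = parts[:5]
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "%s target=%s prot=%s src=%s dst=%s" % (stamp, target, prot, src, dst)


def main():
    for line in sys.stdin:
        event = to_kv(line)
        if event:
            sys.stdout.write(event + "\n")


if __name__ == "__main__":
    main()
```

Anything of this shape works as an `unarchive_cmd`-style filter: plain text in on stdin, one event per line out on stdout.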
[source::.../iptables-log*]
sourcetype = iptables-trafficlog

[iptables-trafficlog]
invalid_cause = archive
unarchive_cmd = python interpret-iptables-eventlog.py
That didn't seem to work 😞 My hypothesis right now is that input processing isn't finding either the Python interpreter or my script. My questions: (1) Is what I'm attempting supposed to work? (2) Where do I deploy my script, and how do I specify its invocation within props.conf? (3) Is there a much simpler or more obvious solution that I've overlooked?
thanks so much for your time and attention! --A Newbie
Since posting this query I've had the chance to try a number of variations on the setup: "wrapping" the python command in an executable script, prepending full path specs to ensure the files can be found, and so on. The result is no joy: it appears that the "unarchive_cmd" I specified is never invoked, which suggests I may be taking the wrong approach. I "know" that my monitored data files contain (within some layers of zip/Z/tgz and so on) the iptables-log* contents, because this configuration:
[source::.../iptables-log*]
sourcetype = iptables-trafficlog
results in many records
So sorry about the terrible formatting of the code in that comment :-(. It's just two lines: one naming the source (file name) and one assigning the sourcetype.
When that's in my props.conf, I am able to search for the relevant sourcetype, but what comes back is one very big event containing the entire (hundred-plus-line) listing from iptables -L. Not what I was hoping for. Any suggestions on an approach are welcome!
Hello everybody! Actually, I'm not clear whether anyone but me has looked at this question. Is anybody out there?
Intuition suggests that invoking a little bespoke preprocessing on data at input time would be a very common need for managers of real system deployments, so it's hard for me to believe there isn't some sort of "standard" answer to the problem I've posed. But about two weeks after posting I've seen no response at all. Is my situation so unusual?
I don't see much in the way of debugging output here. How do you know it isn't working? What warnings or errors are you seeing? Did you enable the script in your management console? Did you put it in etc/apps/search/bin?
Thanks for your attention John.
I know it isn't working because (1) the data don't get indexed, and (2) I added a line to my script that writes a diagnostic message to a fully-qualified path whenever the script is invoked, and that message never appears.
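For anyone wanting to try the same debugging trick, a minimal sketch of that kind of invocation marker might look like this (the path and function name here are made up for illustration):

```python
# Hypothetical sketch of the diagnostic described above: append a timestamped
# line to an absolute path the moment the script starts, so an empty or
# missing file proves the script was never run at all.
import time

DEBUG_LOG = "/tmp/interpret-iptables-eventlog.debug"  # assumed path


def mark_invocation(path=DEBUG_LOG):
    with open(path, "a") as fh:
        fh.write("invoked at %s\n" % time.strftime("%Y-%m-%d %H:%M:%S"))
```

Calling `mark_invocation()` as the very first statement of the script keeps the check independent of anything that might fail later (argument parsing, stdin reads, etc.).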
I'm not getting warnings or errors; indexing works as expected, but the data in sources identified as iptables-log* in props.conf are ignored. So that suggests the "invalid_cause" spec is working. Previously the data in question had been indexed into one large multi-line event (unusable).
I didn't enable the script in the management console. Where would that be done?
I fully specified the path name. Does it need to be in etc/apps/search/bin to be invoked?
I struggled with this way too long also.
I have a custom access log format that is gzipped. It needs to be gunzipped and then piped through a custom converter to get to NCSA format (access_combined). No matter what I did, my log format would get unarchived but never passed through my converter (even though Splunk seemed to be honoring my source:: spec).
I had to do this in local/props.conf:
[source::/path/to/my/special/logs/.../*]
unarchive_cmd = gunzip | my_custom_converter
unarchive_sourcetype = access_combined
NO_BINARY_CHECK = true
priority = 10
The key here seems to be the priority keyword. I believe it was necessary to override the default gzip unarchiver, which seemed to take precedence over whatever custom sourcetype I defined.
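For illustration only, since my_custom_converter isn't shown: a converter of that shape reads lines on stdin and writes NCSA combined-style lines on stdout. Everything below is invented — the input is assumed to be a hypothetical tab-separated format of ip, timestamp, method, path, status, and bytes:

```python
#!/usr/bin/env python
# Purely illustrative stand-in for my_custom_converter (the real converter
# and the real custom log format aren't shown in the answer). Maps an
# assumed tab-separated record to an NCSA combined-style access log line.
import sys


def to_ncsa(line):
    """Convert one tab-separated record into an NCSA combined-style line."""
    ip, ts, method, path, status, size = line.rstrip("\n").split("\t")
    return '%s - - [%s] "%s %s HTTP/1.0" %s %s "-" "-"' % (
        ip, ts, method, path, status, size)


def main():
    for line in sys.stdin:
        if line.strip():
            sys.stdout.write(to_ncsa(line) + "\n")


if __name__ == "__main__":
    main()
```

Because the unarchive_cmd value is run as a shell pipeline, gunzip decompresses the stream and the converter only ever sees plain text on stdin.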
I'm sure there is a way to see how this is getting parsed and processed, but it's not really obvious. Full disclosure: I am a complete Splunk newbie.
I believe they did have .gz extensions.