Getting Data In

Setting up filters properly

infinitiguy
Path Finder

Hi,
I'm new to Splunk. I got through the initial setup and forwarding syslog - cool stuff.

What I want to do next is properly set up a filter to look at jboss error logs, but only pull stuff that matches a regex, and a few lines below the matched string if possible.

What I'm planning - or hoping to do - is set up inputs.conf on the universal forwarder node to look at [monitor:///opt/jboss/default/logs/error.log]. This forwards to my splunk server over tcp (outputs.conf).

I then think I would use props.conf to do the filtering, but I don't know how that would work in this case: I'm sending the log from the end node directly to Splunk over TCP, so the file is never actually written locally on the Splunk server. Is this possible in this setup? If not, what approach is recommended?

1 Solution

infinitiguy
Path Finder

I finally have a workable (but not ideal) solution.

[root@splunkserver local]# cat props.conf

#Sourcetype
[jb_log4j]
###### breaks log4j into separate lines #######
# Break multiline events into single-line events at each newline character
LINE_BREAKER = (?m)(\n)
SHOULD_LINEMERGE=false
###### breaks log4j into separate lines #######

###### set null queue #######
# applies the transform named "setnull" (defined in transforms.conf)
TRANSFORMS-null = setnull
###### set null queue #######

At this point, before the transforms are applied, a Java stack trace gives me 80+ separate events - boo.

[root@splunkserver local]# cat transforms.conf

##### tries to match starting with a space as null #######
[setnull]
# matches any line starting with whitespace (i.e. the noisy part of a Java stack trace)
REGEX = ^\s
DEST_KEY = queue
FORMAT = nullQueue
##### tries to match starting with a space as null #######

It's not ideal: each stack trace that might have 2 lines of interest now shows up as 2 events... but that's far better than a single event that is 87 lines long 🙂
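The [setnull] regex can be sanity-checked outside Splunk. A small Python sketch (the sample lines are made up; the pattern is the same ^\s used above):

```python
import re

# Hypothetical sample from a log4j error.log: one first line, followed by
# stack-trace continuation lines that begin with whitespace (tab or spaces).
lines = [
    "2011-07-12 10:15:01 ERROR [org.jboss.ejb] NullPointerException in handler",
    "\tat com.example.Handler.process(Handler.java:42)",
    "\tat com.example.Dispatcher.run(Dispatcher.java:17)",
    "        ... 84 more",
]

# Same pattern as the [setnull] transform: any line starting with whitespace.
setnull = re.compile(r"^\s")

kept = [l for l in lines if not setnull.match(l)]      # survives to the index
dropped = [l for l in lines if setnull.match(l)]       # goes to nullQueue
```

Only the timestamped first line survives; the three continuation lines match and get nulled.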


infinitiguy
Path Finder

Also, to clarify: much of my confusion was around the fact that nullQueue is for sending entire events to the null queue, not part of a single event, which I wasn't clear about. Once I was able to split my stack trace into separate events, it became clear how to send some of them to null. With a little more work I think I can get single events that don't contain noise.


infinitiguy
Path Finder

I tried the props and transforms suggestions, but I can't seem to get them to work. I still end up with the whole stack trace being indexed 😞


infinitiguy
Path Finder

Spoke too soon. It looks like MAX_EVENTS = 5 is taking my 87-line log entry and chopping it into blocks of 5 instead of truncating it. I just added another server that had a far noisier error.log (100 MB worth), and it looks like I'm getting events that are pieces of a Java exception.
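The block arithmetic works out to exactly what I'm seeing. A quick Python sketch (the 87-line count is from my example; the line contents are made up):

```python
# Models the behavior described above: instead of truncating an 87-line
# log entry to 5 lines, the entry gets re-broken into events of up to
# 5 lines each: 17 full blocks plus a 2-line remainder, 18 events total.
lines = [f"line {i}" for i in range(1, 88)]            # an 87-line stack trace
events = [lines[i:i + 5] for i in range(0, len(lines), 5)]
```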


infinitiguy
Path Finder

I think I found something that works.
client:
inputs.conf

[default]
host = n1a901
[monitor:///opt/jboss/server/default/log/error.log*]
sourcetype = med_log4j

server:
props.conf

[med_log4j]
MAX_EVENTS = 5

This results in search output that only has 5 lines, compared to the 87 I normally see. I assume only 5 lines were kept because I no longer see "show all 87 lines". Can anyone confirm that MAX_EVENTS actually truncates the extra lines rather than indexing them, and that it's not just some sort of GUI filter? 🙂


infinitiguy
Path Finder

Thinking about it more, I think I can go about this a different way - not completely sure how yet.

I set up a UDP port (514) for syslog, and the only other logs I'm interested in are the jboss logs, which will be read locally and sent via a forwarder. The first 5 lines of each "event" are the only ones I'm interested in; typical jboss events seem to have 80+ lines. I tried adding TRUNCATE=128 to props.conf, but that limited each of the 87 lines to 128 characters. I was hoping it would limit the event itself.

Is there an easy way to say I don't care what the event is, please just limit it to 5 lines of data?


infinitiguy
Path Finder

yes - the jboss error.log is full of java stack traces.

Currently I have the clients set up with an inputs.conf of:
[dmurphy@n1a901 local]$ sudo cat inputs.conf

[default]
host = n1a901
[monitor:///var/log/messages*]
sourcetype = syslog
[monitor:///opt/jboss/server/default/log/error.log*]
sourcetype = jb_log4j

outputs.conf looks like:
[dmurphy@n1a901 local]$ sudo cat outputs.conf

[tcpout]
autoLB = true
maxQueueSize = 500KB
forwardedindex.0.whitelist = .*
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = _audit
forwardedindex.filter.disable = false
indexAndForward = false
autoLBFrequency = 30
blockOnCloning = true
compressed = false
disabled = false
dropClonedEventsOnQueueFull = 5
dropEventsOnQueueFull = -1
heartbeatFrequency = 30
maxFailuresPerInterval = 2
secsInFailureInterval = 1
maxConnectionsPerIndexer = 2
sendCookedData = true
connectionTimeout = 20
readTimeout = 300
writeTimeout = 300
useACK = false
defaultGroup = n1s901.domain.com_9009

[tcpout:n1s901.domain.com_9009]
server = n1s901.domain.com:9009

[tcpout-server://n1s901.domain.com:9009]

Since the files are local to the box, I'm hoping to have Splunk look at the sourcetype for different filtering: syslog I don't want filtered at all, but the jb_log4j sourcetype I do want filtered.

Instead of [source::/opt/jboss/default/logs/error.log], could I do something like [source::sourcetype=jb_log4j]?

The end result we want to work toward is having Splunk be "the" syslog server. Can Splunk be configured to listen for syslog traffic directly? I.e., if I have Splunk listening on 514, can I have syslog just forward to the Splunk IP?

I want to avoid having forwarders on all of my hosts just for syslog traffic. The jboss hosts (only 3) will have forwarders. I also don't want syslog writing events to a local messages file on the Splunk server just for Splunk to read. If it has to be set up that way, that's OK - but I'm hoping to have native syslog send over the network directly to Splunk.

Is that possible?
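For reference, the kind of network-input stanza I'd expect to try for this looks like the following (a sketch only; the port and the connection_host setting are assumptions to verify against the docs):

```ini
# inputs.conf on the Splunk server - untested sketch
[udp://514]
sourcetype = syslog
# set each event's host field from the sending IP (assumption; check the docs)
connection_host = ip
```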


lguinn2
Legend

You will use props.conf and transforms.conf to do the filtering. You will filter on the indexer(s), so that's where the props.conf and transforms.conf files need to be. (Note that you can also have a props.conf on the forwarder, to do other things. But let's ignore that for now.)

#$SPLUNK_HOME/etc/system/local/props.conf
[source::/opt/jboss/default/logs/error.log]
SHOULD_LINEMERGE=true
TRANSFORMS-jboss1 = filter_jboss
TRANSFORMS-jboss2 = cut_line

#$SPLUNK_HOME/etc/system/local/transforms.conf
[filter_jboss]
REGEX=regexToNOTMatch
DEST_KEY=queue
FORMAT=nullQueue

[cut_line]
REGEX=(regexToMatch)(.{200})
DEST_KEY=_raw
FORMAT=$1$2

The first stanza in transforms.conf identifies events that you do NOT want to index and sends them to the "nullQueue". Events that match the regex in filter_jboss will not appear in your Splunk index at all.

The second stanza is where you identify exactly what you want to index. If you want to index the entire event, you don't need the "cut_line" stanza at all. You also said that you want a few of the following lines; the regex here grabs the next 200 characters of the event. That's sort of lame, but you get the idea: you can write a regex that picks up as much data as you want.
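The _raw rewrite can be previewed with an ordinary regex engine. A Python sketch (the "ERROR" marker and the sample event are made up; the capture-group rewrite mirrors cut_line):

```python
import re

# A hypothetical 500+ character event; "ERROR" stands in for regexToMatch.
event = "ERROR something went wrong: " + "x" * 500

# Like the cut_line transform: capture the match plus the next 200
# characters, then rewrite the event (Splunk's _raw) to just $1$2.
m = re.match(r"(ERROR)(.{200})", event, re.DOTALL)
if m:
    event = m.group(1) + m.group(2)
```

The rewritten event keeps the 5-character match plus 200 trailing characters; everything after that is discarded.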

Here are some references to the docs: Route and filter data and Anonymize data

BTW, I am assuming that the jboss data is multiple lines per event. If the data is single-line, then this won't work...
