Hello fellow-splunkers!
Problem Statement
- My logs have INFO, WARNING and DEBUG log entries. The DEBUG log entries have customer-specific information which I wouldn't want to expose to a wider audience.
- I want only specific users on the team to have access to the DEBUG log entries. Others shouldn't be able to access them.
My Solution
- Create 2 indexes. 'index-normal' and 'index-debug'.
- Create roles and users so that access to these indexes is granted accordingly. Easy, and manageable.
- At the forwarder, I have 2 stanzas in inputs.conf, each monitoring the same log and sending it to a different index. Note that I am attempting to bypass props.conf and transforms.conf at the indexer by setting queue = indexQueue in one of the stanzas.
[monitor:///mypath/abc.log]
disabled = false
index = index-normal
sourcetype = mysourcetype
[monitor:///mypath/abc.log]
disabled = false
index = index-debug
sourcetype = mysourcetype
queue = indexQueue
props.conf:
[mysourcetype]
TRANSFORMS-null = setnull
NO_BINARY_CHECK = 1
pulldown_type = 1
transforms.conf:
[setnull]
REGEX = DEBUG
DEST_KEY = queue
FORMAT = nullQueue
Needless to say, this isn't working.
Questions
- Is this the best way to handle this situation? I am trying to index the same log twice (and maybe that's not happening). Is there a better approach using some logic at the indexer end?
- If this is the approach which is to be used, where am I going wrong?
Thanks!
Just found CLONE_SOURCETYPE today in transforms.conf.spec:
http://docs.splunk.com/Documentation/Splunk/latest/admin/Transformsconf
Sounds like it might be what you need (see excerpts below):
CLONE_SOURCETYPE = <string>
* If CLONE_SOURCETYPE is used as part of a transform, the transform will
create a modified duplicate event, for all events that the transform is
applied to via normal props.conf rules.
* Use this feature if you need to store both the original and a modified
form of the data in your system, or if you want to send the original and a
modified form to different outbound systems.
* A typical example would be to retain sensitive information according to
one policy and a version with the sensitive information removed
according to another policy. For example, some events may have data
that you must retain for 30 days (such as personally identifying
information) and only 30 days with restricted access, but you need that
event retained without the sensitive data for a longer time with wider
access.
Then in the examples:
[hide-ip-address]
# Make a clone of an event with the sourcetype masked_ip_address. The clone
# will be modified; its text changed to mask the ip address.
# The cloned event will be further processed by index-time transforms and
# SEDCMD expressions according to its new sourcetype.
# In most scenarios an additional transform would be used to direct the
# masked_ip_address event to a different index than the original data.
REGEX = ^(.*?)src=\d+\.\d+\.\d+\.\d+(.*)$
FORMAT = $1src=XXXXX$2
DEST_KEY = _raw
CLONE_SOURCETYPE = masked_ip_addresses
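Applied to the setup in the question, a rough and untested sketch might look like the following. It assumes the transforms run where parsing happens (an indexer or heavy forwarder, not a universal forwarder) and that the file is monitored once, into index-normal. The clone sourcetype mysourcetype_full and the transforms clone_all and route_clone_to_debug are hypothetical names. Whether a clone-only transform needs more than REGEX and CLONE_SOURCETYPE is worth checking against transforms.conf.spec for your version.

```ini
# props.conf (on the parsing tier)
[mysourcetype]
# Order matters: clone every event first, then drop DEBUG from the original.
TRANSFORMS-clone_then_filter = clone_all, setnull

# Hypothetical sourcetype that only the clones carry.
[mysourcetype_full]
TRANSFORMS-route = route_clone_to_debug

# transforms.conf
[clone_all]
# Assumption: a match-anything REGEX is enough to clone every event.
REGEX = .
CLONE_SOURCETYPE = mysourcetype_full

[setnull]
# Same filter as in the question: DEBUG originals are discarded.
REGEX = DEBUG
DEST_KEY = queue
FORMAT = nullQueue

[route_clone_to_debug]
# Send every clone (INFO, WARNING and DEBUG) to the restricted index.
REGEX = .
DEST_KEY = _MetaData:Index
FORMAT = index-debug
```

Because the clone carries a different sourcetype, the setnull filter attached to mysourcetype is not applied to it, so index-debug would receive the unfiltered log while index-normal keeps only INFO and WARNING. The trade-off is that every non-DEBUG event is stored twice.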
#2 will work. However, how are you receiving this data? Is it coming in via syslog-ng and written to a file, or some other way? Is there any way to break the data into separate files? That way you could just have 2 input stanzas, each watching a separate file. If not, then you might have to do a transform on a subset of the data.
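A transform on a subset of the data could be sketched as below: events matching DEBUG are rewritten to land in index-debug, and everything else stays in the index set in inputs.conf. This would take the place of the setnull transform from the question. The transform name route_debug is hypothetical, and the bare DEBUG regex is taken from the question; a real log format would likely want a tighter pattern.

```ini
# props.conf (on the indexer or heavy forwarder)
[mysourcetype]
TRANSFORMS-route_debug = route_debug

# transforms.conf
[route_debug]
# Only events containing DEBUG are redirected; the rest keep their
# original index (index-normal in the question's inputs.conf).
REGEX = DEBUG
DEST_KEY = _MetaData:Index
FORMAT = index-debug
```

With this routing each event is indexed once, and the DEBUG entries end up only in index-debug.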
Thanks for your help, but unfortunately it doesn't quite help my situation.
The solution you outlined in #2 would basically redirect the log entries identified by a REGEX to a different index. However, in my case, I need that index (here, index-debug) to be populated not only with the DEBUG log entries, but also with the INFO and WARNING log entries: basically, the unfiltered log.
I would also need the filtered log (without the DEBUG entries) to go to a different index (index-normal).
To your other point, again it's a good suggestion. However, I don't think we would be able to get the DEBUG log entries written to a separate file. Technically we could, but I don't think the team would be receptive to this approach.
I'm not quite understanding why you need the same data in multiple indexes. Why not just control the permissions in such a way that everyone has access to the general info, and then grant a few access to the debug logs?
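Assuming DEBUG events are routed to index-debug and everything else to index-normal (the approach referred to as #2 above), the permissions side might look like this in authorize.conf. The role names are hypothetical; srchIndexesAllowed and srchIndexesDefault are the settings that control which indexes a role may search and which it searches by default.

```ini
# authorize.conf
# Hypothetical role for the wider audience: INFO and WARNING only.
# Deliberately no importRoles from a built-in role here, because
# imported roles pass on their index access as well; capabilities
# would need to be granted to this role separately.
[role_log_reader]
srchIndexesAllowed = index-normal
srchIndexesDefault = index-normal

# Hypothetical role for the few who may see customer-specific DEBUG entries.
[role_log_debug_reader]
importRoles = log_reader
srchIndexesAllowed = index-normal;index-debug
srchIndexesDefault = index-normal;index-debug
```

Members of the second role could then search index=index-normal OR index=index-debug to see the complete log, without any event being indexed twice.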