props.conf

yuwtennis · ‎12-11-2013

Hi!

I am considering implementing two separate indexes, one containing
non-anonymized data and the other containing anonymized data.

The input data looks like the following:

a,b,c
1,2,3
4,5,6
7,8,9

I have configured props.conf and transforms.conf as follows:

props.conf

[hoge]
SHOULD_LINEMERGE = False
REPORT-1 = searchext
TRANSFORMS-1 = setnull
TRANSFORMS-2 = indexrouting1
TRANSFORMS-3 = anonymize
TRANSFORMS-4 = indexrouting2

transforms.conf

[searchext]
DELIMS = ","
FIELDS = "a","b","c"

[setnull]
REGEX = a
DEST_KEY = queue
FORMAT = nullQueue

[anonymize]
REGEX = (\d),(\d),(\d)
FORMAT = $1,###,$2
DEST_KEY = _raw

[indexrouting2]
REGEX = ###
DEST_KEY = _MetaData:Index
FORMAT = indexB

It seems that the data is only going into indexB.
However, I want it to go into both indexA and indexB.

I would appreciate it if someone could verify this.

Thanks,
Yu

kristian_kolb · ‎12-11-2013

Normally, you'd write your transforms like this:

TRANSFORMS-blah = transform1, transform2, transform3

This means that each event to be transformed will go through all three transforms before returning to the pipeline for further processing (i.e. indexing). However, that will not let you create multiple copies of the events into different indexes.

One way of doing it is to index the events normally into indexA, and then have a scheduled search that changes the events and populates another index (indexB):

index=indexA earliest=-1h@h latest=@h | replace "SecretStuff" with #### in _raw | collect index=indexB

or

index=indexA earliest=-1h@h latest=@h | rex field=_raw mode=sed "s/<some_regex>/###/" | collect index=indexB

Setting this to run 5 minutes past every hour will ensure that all events are collected only once into indexB. Restrict access to indexA and allow general access to indexB. Adjust the time ranges and scheduling to your needs.
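As a rough sketch, the hourly schedule described above could be expressed in savedsearches.conf along these lines. The stanza name is hypothetical, `<some_regex>` is left as a placeholder, and the time range is moved into the dispatch settings; treat it as a starting point to test rather than a finished configuration.

```
# savedsearches.conf (sketch; the stanza name is a made-up example)
[anonymize_indexA_to_indexB]
# Same pipeline as the rex/sed example; <some_regex> must be filled in
search = index=indexA | rex field=_raw mode=sed "s/<some_regex>/###/" | collect index=indexB
# Cover the previous full hour
dispatch.earliest_time = -1h@h
dispatch.latest_time = @h
# Run at 5 minutes past every hour
cron_schedule = 5 * * * *
enableSched = 1
```

Because each run covers exactly one completed hour, consecutive runs should not overlap, which is what keeps events from being collected twice.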

Hope this helps,

k

kristian_kolb · ‎12-12-2013

...| collect index=dummy file=apa spool=f

Then set up a [monitor] stanza to watch this file in SPLUNK_HOME/var/run/splunk.

Set the source and sourcetype to match the original data in inputs.conf.

At the head of the file that is created there will be a line containing *** SPLUNK *** and some info regarding the index you set in the collect command. This line will be indexed as a separate event, but you can probably remove it through a nullQueue transform.
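A minimal sketch of what that might look like, assuming the file ends up as `apa` in that directory and that the original sourcetype is `hoge` as in the question. The transform name `drop_collect_header` and the target index are assumptions, and the regex is only a guess at the header format, so check it against the actual file first.

```
# inputs.conf (sketch; path, index and sourcetype are assumptions)
[monitor://$SPLUNK_HOME/var/run/splunk/apa]
index = indexB
sourcetype = hoge

# props.conf
[hoge]
TRANSFORMS-dropheader = drop_collect_header

# transforms.conf (hypothetical transform name)
[drop_collect_header]
# Match the header line that begins with *** SPLUNK ***
REGEX = ^\*\*\*\s*SPLUNK\s*\*\*\*
DEST_KEY = queue
FORMAT = nullQueue
```

Note that a transform attached to the `hoge` sourcetype would also apply to the original input, so a separate sourcetype for the monitored file may be the cleaner choice.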

kristian_kolb · ‎12-12-2013

The collect command allows you to specify an alternate location of where to write the file. Thus you should be able to set up a [monitor] stanza in inputs.conf with the correct source/sourcetype (so that the field extractions are automatically applied). I'm sorry, but I haven't played with this feature a lot. You'll need to do a bit of your own testing.

yuwtennis · ‎12-12-2013

Hello Kristian.

I got it working, but it seems that the field extraction for the stash sourcetype extracts the raw data itself into a field.

I would like to disable the stash sourcetype extraction. Do you know of any way to do this?

Thanks,
Yu

yuwtennis · ‎12-12-2013

Hello Kristian.

Thank you for the reply.

This sounds good!

I will give it a try.

Thanks,
Yu

Data anonymizing and index routing
