According to the documentation I'm reading, permanently purging selected data (events matching one or more search filters) doesn't appear to be possible.
I'm wondering whether there is a workaround or procedure of sorts.
For example, is it possible to swap the data, much like swapping partitions in a relational DB:
1. Extract / move the data I want to keep from the "PROD" index to a "TEMP" index (stash)
ex: index=PROD source=something other_filters | collect index=TEMP
2. Clean the "PROD" index
ex: splunk clean eventdata -index index_name
3. Re-inject the data from the "TEMP" index back into "PROD"
ex: index=TEMP | collect index=PROD
Thoughts / guidance would be greatly appreciated.
regards
Seb
IMHO your approach will work.
However, consider a couple of things:
1. If you use a sourcetype other than stash, the re-indexed data will count against your license.
2. Some search-time extractions might not work, because the events will have a new sourcetype.
3. If you have anything that references your original index, you'll have to modify it, and then modify it again.
4. It will be very tough (if not impossible) to use if you have an indexer cluster architecture.
Hope it helps.
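For reference, the three-step swap from the question can be laid out as a dry run. This is only a sketch: purge_plan is a hypothetical helper name, it prints the commands rather than running them, and it assumes the splunk CLI is on the PATH and that splunkd must be stopped before clean eventdata is run. Search time ranges and authentication are not shown.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for the swap procedure instead of running them.
# purge_plan is a hypothetical helper; arguments: PROD index, TEMP index, keep-filter.
purge_plan() {
    prod="$1"; temp="$2"; keep="$3"
    # 1. Copy the events to keep into the temporary index.
    printf "splunk search 'index=%s %s | collect index=%s'\n" "$prod" "$keep" "$temp"
    # 2. Clean the original index (assumes splunkd has to be stopped first).
    printf 'splunk stop\n'
    printf 'splunk clean eventdata -index %s\n' "$prod"
    printf 'splunk start\n'
    # 3. Copy the kept events back into the original index.
    printf "splunk search 'index=%s | collect index=%s'\n" "$temp" "$prod"
}

purge_plan PROD TEMP 'source=something other_filters'
```

Printing the plan first makes it easy to review the index names before anything destructive is run.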
I'll give it a shot and see what happens.
Regarding your considerations:
1. I was under the impression that "collect" automatically sets the sourcetype to 'stash'. Is that not true?
2. Can you give an example of which search extractions might not work, and what exactly are you referring to when you say "as you have a new sourcetype"?
3. I assume that I would need to put the index in 'single-user' mode so as to prevent reads, and then pause the data connectors to stop any writes.
4. I don't have a clustered architecture; I have a small dev implementation.
Thanks for the reply.
regards
Indeed, collect sets the sourcetype to stash.
Most of the time, users tie search-time extractions, as well as search filters, to a sourcetype. That means that if you have a search like index=a sourcetype=b, you'll have to change it to index=c sourcetype=stash.
Also, if you have field extractions based on sourcetype b, you will have to modify them.
I'm not sure what single-user mode means as described. However, definitely stop any incoming data, as the collect command will only operate on the data fetched at search / execution time.
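Since collect only copies what the search returned, one possible safety check before cleaning is to compare event counts. A minimal sketch, assuming you have already obtained the two counts yourself (for example by appending | stats count to each search); safe_to_clean is a hypothetical helper and the numbers below are placeholders:

```shell
#!/bin/sh
# safe_to_clean is a hypothetical helper: it takes two event counts
# (events matching the keep-filter in PROD, and events now in TEMP)
# and succeeds only when both are equal and non-zero.
safe_to_clean() {
    kept="$1"; stashed="$2"
    [ "$kept" -gt 0 ] && [ "$kept" -eq "$stashed" ]
}

# Placeholder counts for illustration only.
if safe_to_clean 1200 1200; then
    echo "counts match: OK to clean"
else
    echo "counts differ: do not clean"
fi
```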
good luck with your purge!
If it answers your questions, kindly accept the answer so others know it worked for you.
Single-user mode as in preventing other accounts from searching the index while I'm running through the procedure.
Thanks, I'll run through the procedure and update on what I find.
You have been very helpful and insightful. I wanted to award you some points, but the system won't allow me; it says I don't have enough Karma.