Re-index data

TheFlash
Path Finder

How do I re-index data into the same sourcetype after deleting it with the delete command?

For example, let's say I ran this search:

        index=demo sourcetype=db_demo | delete

Correct me if I am wrong: my "db_demo" data is now marked as deleted, so it is unsearchable, but it has not been removed from disk.

My question is: without cleaning the index, how can I re-index (that is, monitor again) my "db_demo" data without changing the sourcetype? I don't want to change the sourcetype "db_demo" to something else.

Is there a way?

1 Solution

gcusello
SplunkTrust

Hi @TheFlash,

Yes, the data you deleted is still physically in your index, but it is unsearchable.

To re-index it, I need to know what kind of logs they are:

  • from DB Connect,
  • from files,
  • from syslog?

If they arrive from syslog, it's not possible to re-index them.

If they arrive from DB Connect, it's a little difficult but possible: you have to manually modify the rising column checkpoints of the input, which (from DB Connect 3 onward) are stored in $SPLUNK_HOME/var/lib/splunk/modinputs/server/splunk_app_db_connect.

If they are from files, you have to identify the sources to re-index. If there are only a few, load them manually through the guided procedure; if there are many, modify your inputs.conf by adding this option to the relevant stanza:

crcSalt = <SOURCE>
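To show where the crcSalt option sits, here is a minimal sketch that prints a complete monitor stanza you could review and paste into inputs.conf. The file path /var/log/db_demo/export.log is a hypothetical placeholder; the index and sourcetype are taken from the question. Note that `<SOURCE>` is typed literally, it is not a placeholder to fill in.

```shell
#!/bin/sh
# Print a monitor stanza that includes crcSalt = <SOURCE>.
# Arguments: $1 = monitored path, $2 = index, $3 = sourcetype.
# All values passed below are illustrative; adjust them to your environment.
monitor_stanza() {
  printf '[monitor://%s]\n' "$1"
  printf 'index = %s\n' "$2"
  printf 'sourcetype = %s\n' "$3"
  # The literal string <SOURCE> tells Splunk to mix the source path into the CRC.
  printf 'crcSalt = <SOURCE>\n'
}

# Hypothetical path; index and sourcetype as in the question.
monitor_stanza /var/log/db_demo/export.log demo db_demo
```

Because the salt changes the checksum Splunk keeps for each file, every file matched by that stanza is likely to be read again, not only the ones you deleted, so it is worth trying this on a narrow stanza first.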

Ciao.

Giuseppe 


hnorvik
Explorer

If you want to have the deleted data reappear for searching without actually re-indexing the data, you can do the following:

  • Stop Splunk
  • In the folder for the index, find the buckets by UTC timestamp where you want to recover the deleted data.
  • Within the bucket's rawdata folder you will find a folder called deletes containing one or more csv.gz files. Remove the deletes folder. 
  • Start Splunk

A side effect of this is that ALL deleted data will then reappear - not just your sourcetype=db_demo.
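The steps above can be sketched as a small helper that only lists the deletes folders, so you can review them before removing anything. This is a sketch under assumptions: the index directory shown in the usage comment ($SPLUNK_HOME/var/lib/splunk/demo/db) is a guess at a default layout, so check the homePath of your index in indexes.conf, and run it only while Splunk is stopped and preferably on a copy of the index.

```shell
#!/bin/sh
# List every "deletes" folder that sits inside a bucket's rawdata folder.
# Argument: $1 = directory that holds the buckets of one index.
# This function only lists; it does not remove anything.
list_deletes_dirs() {
  find "$1" -type d -path '*/rawdata/deletes' | sort
}

# Usage (assumed path, adjust to your install):
#   list_deletes_dirs "$SPLUNK_HOME/var/lib/splunk/demo/db"
#
# After reviewing the list, the selected folders could be removed with:
#   list_deletes_dirs "$SPLUNK_HOME/var/lib/splunk/demo/db" |
#     while IFS= read -r d; do rm -r "$d"; done
```

Keeping the listing and the removal separate makes it easier to restrict the recovery to the buckets whose time range you actually care about.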

Since you now have access to all the data that was present in the index, you can use any other export / re-index methods on the data. By exporting _raw records to a CSV file you can also use monitor / file upload again if you need to test your indexing process. 

I recommend making a copy of your original index demo to work on rather than working on the original. If something during this process fails you can always return to start.  Also - do this in a lab before working on the real dataset. I am not sure if this causes any issues around the tsidx files or the .bucketmanifest, but this worked for me when I needed to restore some "lost" data.

Regards, H.


richgalloway
SplunkTrust

The process that got the db_demo data into the demo index in the first place must be repeated.

If the data came from a file, then Splunk will not re-process it because it remembers reading it before. You'll have to tell Splunk to "forget" that file by resetting its entry in the fishbucket. To do that, run this CLI command:

splunk cmd btprobe -d /opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db --file <foo> --reset

Replace "<foo>" with the name of the file you wish to re-index.
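If several files need to be reset, a minimal sketch like the one below prints one btprobe command per file so the list can be reviewed before anything is run. It reuses the command from this post unchanged; the FISHBUCKET variable and the function name are hypothetical helpers, and the default path is the forwarder path shown above, which may differ on your install.

```shell
#!/bin/sh
# Print (do not run) one btprobe reset command per file given as an argument.
# FISHBUCKET is a hypothetical variable; its default is the path from the post.
FISHBUCKET="${FISHBUCKET:-/opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db}"

print_reset_commands() {
  for f in "$@"; do
    # Paths are wrapped in double quotes so names with spaces survive copy and paste.
    printf 'splunk cmd btprobe -d "%s" --file "%s" --reset\n' "$FISHBUCKET" "$f"
  done
}

# Usage (hypothetical file names):
#   print_reset_commands /var/log/db_demo/a.log /var/log/db_demo/b.log
```

Printing the commands first is a deliberate choice: a reset cannot be undone, and each file that is reset will be indexed again in full.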

 

---
If this reply helps you, Karma would be appreciated.

gcusello
SplunkTrust

Hi @TheFlash,

Good for you, see you next time!

Ciao and happy splunking.

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉
