I've got an error like this:
ERROR IndexProcessor - caught exception for index=indexname during initialzation: 'Splunk has detected that a directory has been manually copied into its database, causing id conflicts [/opt/splunk/var/lib/splunk/indexname/db/db_epoch_epoch_1, /opt/splunk/var/lib/splunk/indexname/db/hot_v1_1].'.Disabling the index, please fix-up and run splunk enable index.
I tried to fix it using the instructions here:
http://splunk-base.splunk.com/answers/23536/moving-indexes-to-a-new-splunk-server
But I keep finding more conflicts. How can I find all the bucket id conflicts and fix them?
Go to your $SPLUNK_DB directory
# cd $SPLUNK_DB
Run the following one liner
(The earliest and latest timestamps are stripped from the bucket names in the output; the buckets themselves are not modified.)
# find . -maxdepth 3 -mindepth 3 -type d | grep -P "db_\d{10}_|hot_" | sed -e 's/\(db_\)[0-9]*_[0-9]*_\([0-9]*$\)/\1\2/' -e 's/\(hot\)_v1_\([0-9]*$\)/db_\2 \1/' | awk '{a[$1]++} END { for ( i in a ) print a[i], "\t", i}' | sort -rn | grep -P "^([2-9] |[1-9][0-9]+)"
Look for the listed id(s) in the index database(s). They could be hot, warm, or cold bucket ids.
Here is an output example. I used var/lib/splunk as $SPLUNK_DB.
# find var/lib/splunk -maxdepth 3 -mindepth 3 -type d | grep -P "db_\d{10}_|hot_" | sed -e 's/\(db_\)[0-9]*_[0-9]*_\([0-9]*$\)/\1\2/' -e 's/\(hot\)_v1_\([0-9]*$\)/db_\2 \1/' | awk '{a[$1]++} END { for ( i in a ) print a[i], "\t", i}' | sort -rn | grep -P "^([2-9] |[1-9][0-9]+)"
2 var/lib/splunk/os/db/db_167
2 var/lib/splunk/os/db/db_111
2 var/lib/splunk/defaultdb/db/db_9
2 var/lib/splunk/defaultdb/db/db_7
2 var/lib/splunk/defaultdb/db/db_6
2 var/lib/splunk/defaultdb/db/db_4
2 var/lib/splunk/defaultdb/db/db_3
2 var/lib/splunk/defaultdb/db/db_2
2 var/lib/splunk/defaultdb/db/db_1
2 var/lib/splunk/defaultdb/db/db_0
Of course, you cannot run this on Windows.
There are a couple of escape chars missing in step 2 of the accepted answer.
This updated version ran without error for me.
find . -maxdepth 3 -mindepth 3 -type d | grep -P "db_\d{10}_|hot_" | sed -e 's/\(db_\)[0-9]*_[0-9]*_\([0-9]*$\)/\1\2/' -e 's/\(hot\)_v1_\([0-9]*$\)/db_\2 \1/' | awk '{a[$1]++} END { for ( i in a ) print a[i], "\t", i}' | sort -rn | grep -P "^([2-9] |[1-9][0-9]+)"
Here is an easy way to look for duplicates on Linux:
cd (directory where all the indexes live)
ls -R | cut -d'_' -f4 | sort -n | uniq -c | grep -v "^ *1 "
Note that with a clustered index, you'll have to take some other things into account. Each clustered indexer starts counting at 0 for new buckets, so you might have legitimate overlap. The bucket name also includes the source server GUID in fifth position (cut -d'_' -f5).
http://docs.splunk.com/Documentation/Splunk/6.0/Indexer/HowSplunkstoresindexes
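The naming scheme described above can be pulled apart with a small shell helper. This is only a sketch: the function name bucket_id_and_guid is made up for illustration, and it assumes warm/cold buckets are named db_<newest>_<oldest>_<id> with an optional trailing _<guid> on clustered indexers, and hot buckets are named hot_v1_<id>.

```shell
# bucket_id_and_guid NAME
# Prints "<id> <guid>" for a bucket directory name; <guid> is "-" when absent.
# Hypothetical helper, not a Splunk tool. Assumed layouts:
#   db_<newest>_<oldest>_<id>          non-clustered warm/cold bucket
#   db_<newest>_<oldest>_<id>_<guid>   clustered warm/cold bucket
#   hot_v1_<id>                        hot bucket
bucket_id_and_guid() {
    name=$(basename "$1")
    case "$name" in
        hot_v1_*)
            id=${name#hot_v1_}
            guid=-
            ;;
        db_*)
            # Field 4 is the bucket id, field 5 (if present) the source GUID.
            id=$(printf '%s\n' "$name" | cut -d'_' -f4)
            guid=$(printf '%s\n' "$name" | cut -d'_' -f5)
            [ -n "$guid" ] || guid=-
            ;;
        *)
            # Not a bucket directory name we recognise.
            return 1
            ;;
    esac
    printf '%s %s\n' "$id" "$guid"
}
```

On a clustered indexer, two buckets only conflict when both the id and the GUID match, so comparing the full "<id> <guid>" pair avoids flagging the legitimate overlap mentioned above.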
Also, when I have a list of conflicts in splunkd.log on Linux, I use Splunk to generate my script to fix them (by incrementing the bucket ids and moving the buckets).
Example with an increment of 100 to the bucket id:
index=_internal source=*splunkd.log* "conflicts [" | rex "conflicts \[(?<path>[^,]+)," | rex field=path "(?<shortpath>.*)_\d+$" | rex field=path "_(?<id>\d+)$" | convert num(id) | eval id=id+100 | eval _raw="mv ".path." ".shortpath."_".id
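The same renaming rule can be sketched directly in the shell, which may help when checking the generated mv lines by hand. The function name bumped_bucket_path is hypothetical, and the sketch assumes the bucket id is the number after the last underscore, so clustered bucket names that end in a GUID are refused rather than renamed.

```shell
# bumped_bucket_path PATH OFFSET
# Prints PATH with its trailing numeric bucket id increased by OFFSET.
# Hypothetical helper: it only computes the new name and moves nothing.
bumped_bucket_path() {
    path=$1
    offset=$2
    id=${path##*_}        # text after the last underscore
    prefix=${path%_*}     # everything before the last underscore
    case "$id" in
        ''|*[!0-9]*) return 1 ;;   # refuse names that do not end in a number
    esac
    printf '%s_%s\n' "$prefix" "$(( $id + $offset ))"
}

# Example: print (not run) the mv command for one conflicting bucket.
p=/opt/splunk/var/lib/splunk/indexname/db/hot_v1_1
echo "mv $p $(bumped_bucket_path "$p" 100)"
```

Printing the command instead of running it leaves room to review the list, and Splunk should be stopped before any bucket directory is actually moved.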
JBSplunk,
It looks like Masa contributed a great solution, but I wanted to share what I generally use as well, since it is shorter syntax (less to remember/paste).
I use Larry Wall's "rename" Perl script. (Remember, Larry Wall is the father of Perl.) This rename comes stock on Debian/Ubuntu, but it is NOT the same as the CentOS/RHEL rename. To use it on CentOS/RHEL, I download a fresh copy from the Internet.
Larry Wall's "rename" takes sed-style matches, so the following will work:
Sean
To manually merge buckets from multiple legacy indexers onto one new indexer, I used these commands which work on RHEL7/8:
1. on the legacy indexers, in indexes.conf, set maxVolumeDataSizeMB=400 for the warm volume to force all buckets to roll to cold
2. on the legacy indexers, in /etc/sudoers add
johndoe ALL=NOPASSWD:/usr/bin/rsync
to enable passwordless sudo with rsync
3. on the new indexer, run this command to rsync cold buckets from each legacy indexer to a subfolder on the new indexer:
sudo rsync --delete --compress-level=0 -aPe ssh --rsync-path="sudo rsync" johndoe@192.168.1.108:/splunk_cold/ /splunk_cold/idx8
in the above case, 192.168.1.108 is the address for legacy indexer#8
4. on the new indexer, run this command to renumber the buckets from each indexer
for i in /splunk_cold/idx8/*/colddb ; do echo $i ; cd $i ; ls ; for f in ` ls -rtd db_* `; do jj=` echo $f | cut -d "_" -f 4 `; kk=$(($jj + 8000)) ; ff=` echo $f | sed -e "s/_$jj\$/_$kk/" ` ; mv $f $ff ; done ; done
in the above case, each bucket from indexer#8 with 3-digit id=xxx is changed to id=8xxx
5. on the new indexer, run this command to merge each subfolder to the parent folder
cd /splunk_cold/idx8 ; find -type d -exec mkdir -vp "/splunk_cold"/{} \; -or -exec mv -nv {} "/splunk_cold"/{} \;
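Step 5 merges by directory name, so two buckets that share an id but carry different timestamps would both land in the parent folder and trigger the id conflict error. A quick check before merging could look like the sketch below. The function names are hypothetical, and the example paths assume an index called main, which is not from the steps above.

```shell
# bucket_ids_in DIR
# Prints the distinct bucket ids (4th underscore-separated field) of the
# db_* directories directly under DIR, one per line.
bucket_ids_in() {
    for d in "$1"/db_*; do
        [ -d "$d" ] || continue      # skips the literal pattern when nothing matches
        basename "$d" | cut -d'_' -f4
    done | sort -u
}

# colliding_bucket_ids SRC_DIR DST_DIR
# Prints every bucket id that appears in both directories.
colliding_bucket_ids() {
    bucket_ids_in "$1" | while read -r id; do
        if bucket_ids_in "$2" | grep -qx "$id"; then
            echo "$id"
        fi
    done
}

# Example (hypothetical index name "main"):
# colliding_bucket_ids /splunk_cold/idx8/main/colddb /splunk_cold/main/colddb
```

Empty output means no id in the sub-folder is already used in the parent folder for that index; any id printed should be renumbered with a different offset before the merge.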
Masa,
Good catch. Thanks.
Sean:
Thanks for sharing the alternative way. Just like Perl, there is more than one way to do it.
ls -rtd db_*
This needs to be in the specific directory like defaultdb/db or defaultdb/colddb. In such case, it'd be a lot easier. The one liner is to go through all the index database directories.
Thanks, this is exactly what I needed!