How to configure a new Splunk instance to search previously indexed data stored on S3?

Log_wrangler
Builder

I have previously indexed data uploaded to an s3 bucket.

I installed Splunk (full version) on an EC2 (RHEL7).
I (persistently) mounted the s3 bucket to the EC2 instance (with FUSE).
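For reference, a persistent s3fs mount is usually an /etc/fstab entry along these lines (the bucket name, mount point, and auth options here are examples, not the poster's actual values):

```
# /etc/fstab: mount the bucket via s3fs at boot
my-bucket /my_s3fs_mount_directory fuse.s3fs _netdev,allow_other,iam_role=auto 0 0
```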
I can see all the data when I change to my_s3fs_mount_directory,

(e.g. /my_s3fs_mount_directory/index_name/db_1234567_123456_1234/rawdata/journal.gz)

My question is: how should I edit indexes.conf so that my new indexer sees this data without accidentally overwriting the existing data in my path?

Here is what I have so far (in /opt/splunk/etc/system/local/)

[myindex]

homePath = /my_s3fs_mount_directory/index_name/db
coldPath = /my_s3fs_mount_directory/index_name/colddb
thawedPath = /my_s3fs_mount_directory/index_name/thaweddb
maxDataSize = 10000
maxHotBuckets = 10

The index is visible, but no data appears in search results.

Is there anything else I need to do or another conf I would also need to edit?

Any advice is appreciated.
Thank you

1 Solution

nickhills
Ultra Champion

S3 over FUSE is S. L. O. W., as well as being a fake filesystem.

I would mount an EBS volume and copy the data from S3 to the EBS before doing anything else.
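A rough sketch of that approach, assuming an EBS volume already attached as /dev/xvdf and an example bucket name (all paths and names here are illustrative):

```shell
# Format and mount the EBS volume
sudo mkfs -t xfs /dev/xvdf
sudo mkdir -p /splunkdata
sudo mount /dev/xvdf /splunkdata

# Pull the bucket contents down with the AWS CLI
# (much faster than copying through the s3fs mount)
aws s3 sync s3://my-bucket/index_name /splunkdata/index_name
```

Then point homePath/coldPath/thawedPath in indexes.conf at /splunkdata/index_name instead of the FUSE mount.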

If my comment helps, please give it a thumbs up!


Log_wrangler
Builder

Your suggestion is probably the best solution at this point.

My current scenario was a test to see if it would read the data, and apparently it will not (as you mentioned, s3fs is slow, object-based, and not listed as supported).

For those interested I started another thread (title of question below) to see if Splunk 7.0 remotePath may be a solution.

"has anyone successful setup the remotePath option in indexes.conf in Splunk 7.0 to work with indexed data in s3?"
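For anyone following that thread: remote storage configuration (SmartStore, which became generally available in Splunk 7.2) looks roughly like the sketch below. Note this is a hedged example with placeholder bucket names, and SmartStore manages its own copies of buckets in the remote store; it does not directly read classic buckets that were simply copied to S3.

```ini
# indexes.conf (SmartStore sketch; volume name and bucket are examples)
[volume:remote_store]
storageType = remote
path = s3://my-bucket/splunk-indexes

[myindex]
homePath   = $SPLUNK_DB/myindex/db
coldPath   = $SPLUNK_DB/myindex/colddb
thawedPath = $SPLUNK_DB/myindex/thaweddb
remotePath = volume:remote_store/$_index_name
```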


Log_wrangler
Builder

FYI, I was able to read a test file.txt from the /s3fs directory, but as a data input.

I could read the file.txt via data inputs > files & directories > new (then select the /s3fs/file.txt)

Of course this would need to be automated to input loads of files... I have not worked that out yet, but any suggestions are appreciated.
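To automate that, the usual route is a monitor stanza in inputs.conf, which watches a directory (recursively) and ingests new files. Note this re-indexes the raw data rather than reading the old buckets; the path, index, and sourcetype below are examples:

```ini
# inputs.conf: watch a directory on the s3fs mount (illustrative values)
[monitor:///my_s3fs_mount_directory/incoming]
index = myindex
sourcetype = my_sourcetype
disabled = false
```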


nickhills
Ultra Champion

I saw it!
I too am super interested in this, but as I note, I suspect it will only be for archive data.

If my comment helps, please give it a thumbs up!

nickhills
Ultra Champion

Is the data a 'copy' of the indexes which you have uploaded to s3, or was the data frozen?

If the data was frozen, you need to copy the buckets to the thawed directory - not the hot/cold db
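For frozen buckets, the standard procedure is to copy each bucket into the index's thaweddb directory and then rebuild it so Splunk regenerates the searchable metadata. A sketch, reusing the bucket name from the question (the destination path is an example for a default install):

```shell
# Copy a frozen bucket into thaweddb (paths are illustrative)
cp -r /my_s3fs_mount_directory/index_name/frozendb/db_1234567_123456_1234 \
      /opt/splunk/var/lib/splunk/index_name/thaweddb/

# Rebuild the bucket so it becomes searchable again
/opt/splunk/bin/splunk rebuild \
      /opt/splunk/var/lib/splunk/index_name/thaweddb/db_1234567_123456_1234
```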

If my comment helps, please give it a thumbs up!

Log_wrangler
Builder

It was a copy of warm and cold.


micahkemp
Champion

One thing you want to be very careful with is making sure you get your frozenTimePeriodInSecs and maxTotalDataSizeMB correct before you point Splunk at an existing index location. If either is wrong, you risk Splunk deciding data needs to be frozen (which, in most cases, really means deleted).
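A hedged example of those settings added to the stanza from the question (the values are illustrative; size them well above the age of your oldest bucket and the total data already on disk):

```ini
# indexes.conf: retention safety settings (example values)
[myindex]
homePath   = /my_s3fs_mount_directory/index_name/db
coldPath   = /my_s3fs_mount_directory/index_name/colddb
thawedPath = /my_s3fs_mount_directory/index_name/thaweddb
# Retention window longer than the oldest bucket, or Splunk freezes (deletes) it
frozenTimePeriodInSecs = 188697600
# Total size cap larger than what is already on disk
maxTotalDataSizeMB = 500000
```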


Log_wrangler
Builder

After reviewing some other posts...

It is quite possible that the S3 object-based data is just not compatible with Splunk without some custom code making it readable.

I am using an old version of Splunk (i.e. 5.x).

I am thinking that I will try Splunk 7.x and see if it can read indexed data from a remote s3 location.

Please advise if you have any more insight on this. If/when I get results, I plan to share lessons learned.

Thank you


micahkemp
Champion

Is this a standalone Splunk instance (or are you trying to search directly from the instance that has the data mounted)?

Can you post the output of splunk btool indexes list --debug?


Log_wrangler
Builder

Sorry, sec-policy does not permit me to post actual data. Thanks.


Log_wrangler
Builder

This is a standalone splunk instance on RHEL7 on the EC2 AWS instance.

I created a custom index which points to the s3 path...

When I restarted after creating the indexes.conf file for this index, I got this error

error message for /my_s3fs_mount_directory/...

homePath '/my_s3fs_mount_directory/index_name/db' is in a filesystem that Splunk cannot use. (index=index_name)

Checking indexes...
homePath '//my_s3fs_mount_directory/index_name/db' is in a filesystem that Splunk cannot use. (index=index_name)
Validating databases (splunkd validatedb) failed with code '1'.
