Cloud Storage Bucket Input Using the Splunk Add-on for Google Cloud Platform

a805555
New Member

I successfully installed and configured the "Splunk Add-on for GCP", version 3.0.2, to access XML data files stored in a bucket.

I use it for two GCP buckets (DEV and PROD).

It works well in DEV, with a dedicated bucket that holds hundreds of files directly in the root.

But it does not work well with the PROD bucket (a larger one, with thousands of files in a directory tree). It seems to keep reading the same files in the first directory, and it does not index them because their type is unsupported.

I don't understand why it does not scan the entire tree, and why it does not report any error in the process. Why is the message always "Files to be ingested: 978" when there are 1916 files in the first directory, called cdp?
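One thing that may be worth ruling out (this is an assumption on my side, the log does not confirm it): the Cloud Storage JSON API returns object listings in pages, with a recommended maximum of 1000 entries per page, and each page carries a `nextPageToken` when more results exist. A client that reads only the first page would see fewer objects than the bucket holds. The sketch below simulates this with a fake in-memory bucket; `fetch_page` and `make_fake_bucket` are hypothetical helpers, not part of the add-on or of any Google library.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]


def list_all_objects(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every object name, following nextPageToken until no token is left."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        for item in page.get("items", []):
            yield item["name"]
        token = page.get("nextPageToken")
        if not token:
            break


def make_fake_bucket(names: List[str], page_size: int) -> Callable[[Optional[str]], Page]:
    """Hypothetical stand-in for a paginated listing call (no network access)."""
    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        chunk = names[start:start + page_size]
        page: Page = {"items": [{"name": n} for n in chunk]}
        if start + page_size < len(names):
            page["nextPageToken"] = str(start + page_size)
        return page
    return fetch_page


# 1916 made-up object names under cdp/, listed 1000 per page.
names = [f"cdp/file_{i:04d}.xml" for i in range(1916)]
fetch = make_fake_bucket(names, page_size=1000)

first_page_only = len(fetch(None)["items"])
all_objects = len(list(list_all_objects(fetch)))
print(first_page_only, all_objects)
```

If the real listing behaves like the first-page-only case, the count would stay capped regardless of how many files the directory contains. Whether the add-on actually stops after one page is not something I can tell from the log.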

I did not find a way to filter, for example by specifying a path so that only that path is analyzed instead of the complete bucket.
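To make the kind of filter I am looking for concrete: object names in a bucket are flat strings, so a "directory" is just a name prefix, and the Cloud Storage list API itself accepts a `prefix` parameter. I do not know whether the add-on exposes it. The sketch below only shows the selection logic on a list of names; `select_objects` is a hypothetical helper, and the two `.xml` names are made up for illustration.

```python
from typing import Iterable, List, Tuple


def select_objects(
    names: Iterable[str],
    prefix: str,
    extensions: Tuple[str, ...] = (".xml",),
) -> List[str]:
    """Keep only names under `prefix` whose extension is in `extensions`."""
    return [
        name
        for name in names
        if name.startswith(prefix) and name.lower().endswith(extensions)
    ]


sample = [
    # Taken from the log extract in this post.
    "cdp/f006006102/processing/InternalTranscodifications_f006006102_161839.avro",
    # Hypothetical names, for illustration only.
    "cdp/f006006102/processing/example_a.xml",
    "other/example_b.xml",
]

selected = select_objects(sample, prefix="cdp/")
print(selected)
```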

 

Does anybody have any ideas?

 

Thanks in advance.

 

The following is an extract of the log file splunk_ta_google_cloudplatform_google_cloud_bucket_metadata__1.log:

2021-07-26 10:53:10,700 level=INFO pid=34200 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:ingest_data:107 |  | message="-----Data Ingestion begins-----"

2021-07-26 10:53:36,829 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:ingest_data:107 |  | message="-----Data Ingestion begins-----"

2021-07-26 10:53:45,848 level=WARNING pid=15708 tid=MainThread logger=googleapiclient.discovery_cache pos=__init__.py:autodetect:44 | file_cache is unavailable when using oauth2client >= 4.0.0

Traceback (most recent call last):

  File "D:\SPLUNK\etc\apps\Splunk_TA_google-cloudplatform\bin\3rdparty\googleapiclient\discovery_cache\__init__.py", line 41, in autodetect

    from . import file_cache

  File "D:\SPLUNK\etc\apps\Splunk_TA_google-cloudplatform\bin\3rdparty\googleapiclient\discovery_cache\file_cache.py", line 41, in <module>

    'file_cache is unavailable when using oauth2client >= 4.0.0')

ImportError: file_cache is unavailable when using oauth2client >= 4.0.0

2021-07-26 10:53:46,118 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:get_metadata:264 |  | message="Successfully obtained bucket metadata for prd-europe-west1-archiving"

2021-07-26 10:53:46,259 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:get_metadata:269 |  | message="Successfully obtained object information present in the bucket prd-europe-west1-archiving."

2021-07-26 10:53:47,107 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:get_list_of_files_to_be_ingested:352 |  | message="Files to be ingested: 978 files"

2021-07-26 10:53:47,224 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:ingest_file_content:396 |  | message="Cannot ingest contents of cdp/f006006102/processing/InternalTranscodifications_f006006102_161839.avro, file with this extention is not yet supported in the TA"

2021-07-26 10:53:47,361 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:ingest_file_content:396 |  | message="Cannot ingest contents of cdp/f006006102/processing/InternalTranscodifications_f006006102_161916.avro, file with this extention is not yet supported in the TA"
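For anyone who wants to check how many files are skipped and for which extensions, here is a minimal sketch that tallies the "Cannot ingest contents of" messages by file extension. It assumes the message format shown in the extract above; `count_unsupported` is a hypothetical helper, and the two sample lines are copied from that extract.

```python
import posixpath
import re
from collections import Counter
from typing import Iterable

# Matches the object path inside the add-on's "Cannot ingest contents of" message.
SKIPPED = re.compile(r'message="Cannot ingest contents of (?P<path>.+?), file with')


def count_unsupported(lines: Iterable[str]) -> Counter:
    """Count skipped files per extension, based on the log message format above."""
    counts: Counter = Counter()
    for line in lines:
        match = SKIPPED.search(line)
        if match:
            # Object names use "/" as separator, so posixpath is used on purpose.
            _, ext = posixpath.splitext(match.group("path"))
            counts[ext.lower() or "(none)"] += 1
    return counts


sample_lines = [
    '2021-07-26 10:53:47,224 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:ingest_file_content:396 |  | message="Cannot ingest contents of cdp/f006006102/processing/InternalTranscodifications_f006006102_161839.avro, file with this extention is not yet supported in the TA"',
    '2021-07-26 10:53:47,361 level=INFO pid=15708 tid=MainThread logger=splunk_ta_gcp.modinputs.bucket_metadata pos=bucket_metadata.py:ingest_file_content:396 |  | message="Cannot ingest contents of cdp/f006006102/processing/InternalTranscodifications_f006006102_161916.avro, file with this extention is not yet supported in the TA"',
]

counts = count_unsupported(sample_lines)
print(dict(counts))
```

To run it against the real file, pass an open file handle for the log instead of `sample_lines`.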


rsaliou
Engager

The behavior is still the same with the latest version, 3.2.0.
