The REST API input that I set up in Splunk calls this REST endpoint and receives a zip file. The zip file contains 2 CSV files, but I only need to index 1 of them (in this case, the file named ENDPOINT_CDR_DETAIL_ALL_CSV). However, I only see 3 options for the response type: text, xml, and json. Does Splunk have an option to set, for example, a response handler that unzips the file and indexes only 1 of the 2 files?
The name and form of the file:
Content inside the zip file:
In rest_ta/bin/responsehandlers.py, add a custom response handler. Pseudo example:

import io
import re
import zipfile

class ZipFileResponseHandler:
    def __init__(self, **args):
        # Regex naming the zip member(s) to index, supplied as a handler argument
        self.csv_file_to_index = args['csv_file_to_index']

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        # Open the downloaded zip archive in memory
        archive = zipfile.ZipFile(io.BytesIO(response_object.content))
        for info in archive.infolist():
            # Index only the members whose filename matches the regex
            if re.match(self.csv_file_to_index, info.filename):
                filecontent = archive.read(info)
                print_xml_stream(filecontent)

In your config stanza, apply this handler:
The csv_file_to_index parameter value in this example is a Python regex, such as:
ENDPOINT_CDR_DETAIL_ALL_CSV\.csv
for an exact filename to extract from the zip, or
.*CDR_DETAIL.*\.csv$
for a pattern for the filename(s) to extract from the zip.
This is my version of the code:
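import io
import zipfile

class ZipFileResponseHandler:
    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        # Open the downloaded zip archive in memory
        archive = zipfile.ZipFile(io.BytesIO(response_object.content))
        for name in archive.namelist():
            # Only members whose name contains "ENDPOINT" are indexed
            if "ENDPOINT" in name:
                # Assumes the CSV is UTF-8 encoded
                data = archive.read(name).decode('utf-8')
                # Skip the CSV header row and emit each remaining row as an event
                for element in data.splitlines()[1:]:
                    print_xml_stream(element)
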
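A quick way to check the unzip-and-filter logic outside Splunk is to run it against a zip built in memory. This is only a sketch: the second file name (ENDPOINT_CDR_SUMMARY_CSV.csv), the sample rows, and the helper names build_sample_zip and extract_matching_rows are made up for illustration and are not part of the REST API Modular Input. It also shows that re.match anchors the pattern only at the start of the filename.

```python
import io
import re
import zipfile


def build_sample_zip():
    """Create an in-memory zip holding two small CSV files (made-up data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("ENDPOINT_CDR_DETAIL_ALL_CSV.csv", "id,duration\n1,30\n2,45\n")
        # Hypothetical name for the second file that should NOT be indexed.
        archive.writestr("ENDPOINT_CDR_SUMMARY_CSV.csv", "id,total\n1,75\n")
    return buffer.getvalue()


def extract_matching_rows(zip_bytes, name_pattern, skip_header=True):
    """Return the rows of every archive member whose name matches name_pattern."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # re.match anchors only at the start of the filename.
            if re.match(name_pattern, info.filename):
                lines = archive.read(info).decode("utf-8").splitlines()
                rows.extend(lines[1:] if skip_header else lines)
    return rows


if __name__ == "__main__":
    sample = build_sample_zip()
    for row in extract_matching_rows(sample, r".*CDR_DETAIL.*\.csv$"):
        print(row)
```

Inside a handler, the returned rows would be passed to print_xml_stream one by one instead of being printed.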
I suggest using the REST API Modular Input and plugging in a custom response handler to perform the unzipping for you, along with any other preprocessing you require.
Could you give me more information on how to make the handler pass the specific file to the indexer?
Hi,
Can you please let me know how you call the REST API: using a script or something else?
I was able to download the REST API from Splunk, but for now I'm not using any script. Do you think I could do this by writing a script that runs every minute and calls the API URL? Again, only if the script allows me to unzip the file and pick which file I want. Thanks!
Yes, you can create a scripted input which downloads and extracts the files for you.
Create inputs.conf in your app and put the configuration below in the file.
[script:///opt/splunk/etc/apps/yourapp/bin/scriptedfile.py]
disabled = 0
interval = 60
This will run the script every 60 seconds. You can change the interval as per your requirement.
Create bin/scriptedfile.py and write the code that calls the REST API (file download) and extracts the files.
Scripted Input docs:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/AdvancedDev/ScriptedInputsIntro
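As a rough sketch of what bin/scriptedfile.py could look like, assuming Python 3 and an endpoint that needs no authentication: the script downloads the zip, keeps only the wanted CSV, and writes its lines to stdout, which is what Splunk indexes for a scripted input. The CDR_ZIP_URL environment variable and the function names are hypothetical; replace them with however you prefer to configure the URL.

```python
import io
import os
import sys
import zipfile
from urllib.request import urlopen

# Hypothetical setting: the URL is read from an environment variable of our own choosing.
ZIP_URL = os.environ.get("CDR_ZIP_URL", "")
# File name prefix taken from the question.
WANTED_PREFIX = "ENDPOINT_CDR_DETAIL_ALL_CSV"


def wanted_lines(zip_bytes, prefix=WANTED_PREFIX):
    """Yield the text lines of each archive member whose name starts with prefix."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.startswith(prefix):
                # Assumes the CSV is UTF-8 encoded.
                for line in archive.read(name).decode("utf-8").splitlines():
                    yield line


def main():
    if not ZIP_URL:
        sys.stderr.write("CDR_ZIP_URL is not set; nothing to download\n")
        return 1
    # Download the whole zip into memory; fine for small archives.
    with urlopen(ZIP_URL, timeout=30) as response:
        zip_bytes = response.read()
    for line in wanted_lines(zip_bytes):
        # Whatever a scripted input writes to stdout is handed to Splunk.
        print(line)
    return 0


if __name__ == "__main__":
    main()
```

The header row is printed along with the data here; drop the first line in wanted_lines if you do not want it indexed.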
Or, if the REST API can't do this, is there any alternative way?