Splunk Search

How to convert splunk api response to dataframe pandas?

bijodev1
Communicator

Hi Everyone, thanks to "kamlesh_vaghela" for helping me with importing the userid into the search query. But I am having trouble to convert the data to dataframe pandas, because of jsondecode error 

 

 

import requests
import json
import requests
import urllib3
from base64 import b64encode
import urllib.parse
from csv import reader

urllib3.disable_warnings()

data=[]
def fetch_data_using_userid(userid):
    url = "https://localhost:8089/servicesNS/admin/search/search/jobs/export"
    payload = {
        'search': f'search index=_internal earliest=-1h sourcetype="{userid}" | stats count by sourcetype',
        'output_mode': 'json'
    }
    safe_payload = urllib.parse.urlencode(payload)

    userAndPass = b64encode(b"admin:admin123").decode("ascii")
    headers = {
        'Authorization': 'Basic %s' % userAndPass,
        'Content-Type': 'application/x-www-form-urlencoded'
    }
    response = requests.request("POST", url, headers=headers, data=safe_payload, verify=False)
    return response.text



# open file in read mode
with open('user_data.csv', 'r') as read_obj:
    # pass the file object to reader() to get the reader object
    csv_reader = reader(read_obj)
    header = next(csv_reader)
    if header is not None:
        # Iterate over each row in the csv using reader object
        for row in csv_reader:
            # row variable is a list that represents a row in csv
            print(row)
            response_text = fetch_data_using_userid(row[0])
            data.append(response_text)
            print(response_text)

 

 

 

the result which I am appending is not the the correct json format somehow, and because of that while converting it to dataframe, I am unable to separate the cells into different columns.  Has someone already worked with this, can suggest some ideas. thanks.

Labels (1)
0 Karma
1 Solution

bijodev1
Communicator

This is the solution, if someone out there needs.  The code will transform the splunk api invalid json to pandas dataframe into rows and columns.

 

 

 

import pandas as pd
import requests
import json
import urllib3
from base64 import b64encode
import urllib.parse
from csv import reader


urllib3.disable_warnings()

data=[]
output1 = pd.DataFrame()

def fetch_data_using_userid(userid):
    url = "https://localhost:8000/services/search/jobs/export"
    payload = {
        'search': f'search  earliest = -40h index=* user_id="{userid}" | stats count by user user_ip ',
        'output_mode':'json'
    }
    safe_payload = urllib.parse.urlencode(payload)

    userAndPass = b64encode(b"admin:pass").decode("ascii")
    headers = {
        'Authorization': 'Basic %s' % userAndPass,
        'Content-Type': 'application/json; charset=UTF-8'
    }
    response = requests.request("POST", url, headers=headers, data=safe_payload, verify=False)
    return response.text


# open file in read mode
with open('/Users/Downloads/user_data.csv', 'r') as read_obj:
    # pass the file object to reader() to get the reader object
    csv_reader = reader(read_obj)
    header = next(csv_reader)
    if header is not None:
        # Iterate over each row in the csv using reader object
        for row in csv_reader:
            # row variable is a list that represents a row in csv
            print(row)
            response_text = fetch_data_using_userid(row[0])
            data = response_text
            with open('data.txt', 'w') as wf:
                wf.write(data)
            with open('data.txt') as rf:
                for entry in rf:
                    LDict = json.loads(entry)
                    output1 = output1.append(LDict,ignore_index=True)
            print(response_text)
#  result1 = pd.json_normalize(output1['result'])        

 

 

 

 

thanks @kamlesh_vaghela for all the help.

View solution in original post

0 Karma

bijodev1
Communicator

This is the solution, if someone out there needs.  The code will transform the splunk api invalid json to pandas dataframe into rows and columns.

 

 

 

import pandas as pd
import requests
import json
import urllib3
from base64 import b64encode
import urllib.parse
from csv import reader


urllib3.disable_warnings()

data=[]
output1 = pd.DataFrame()

def fetch_data_using_userid(userid):
    url = "https://localhost:8000/services/search/jobs/export"
    payload = {
        'search': f'search  earliest = -40h index=* user_id="{userid}" | stats count by user user_ip ',
        'output_mode':'json'
    }
    safe_payload = urllib.parse.urlencode(payload)

    userAndPass = b64encode(b"admin:pass").decode("ascii")
    headers = {
        'Authorization': 'Basic %s' % userAndPass,
        'Content-Type': 'application/json; charset=UTF-8'
    }
    response = requests.request("POST", url, headers=headers, data=safe_payload, verify=False)
    return response.text


# open file in read mode
with open('/Users/Downloads/user_data.csv', 'r') as read_obj:
    # pass the file object to reader() to get the reader object
    csv_reader = reader(read_obj)
    header = next(csv_reader)
    if header is not None:
        # Iterate over each row in the csv using reader object
        for row in csv_reader:
            # row variable is a list that represents a row in csv
            print(row)
            response_text = fetch_data_using_userid(row[0])
            data = response_text
            with open('data.txt', 'w') as wf:
                wf.write(data)
            with open('data.txt') as rf:
                for entry in rf:
                    LDict = json.loads(entry)
                    output1 = output1.append(LDict,ignore_index=True)
            print(response_text)
#  result1 = pd.json_normalize(output1['result'])        

 

 

 

 

thanks @kamlesh_vaghela for all the help.

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@bijodev1 

Can you please share sample output from below variable? 

response_text

 

May be I can help you on this.

KV

0 Karma

bijodev1
Communicator

@kamlesh_vaghela 

 

['{"preview":false,"lastrow":true}\n',
 '{"preview":true,"offset":0,"lastrow":true,"result”:{“user”:”abc1213”,”ip":"10.1.1.1","path”:”/home/search/“,”url_status”:”200”,”count":"7"}}\n{"preview":true,"offset":0,"lastrow":true,"result":{"user”:”abc1213”,”ip”:”10.1.1.1”,”path”:”/browse/contract/payment”,”url_status":"400","count":"7"}}\n{"preview":false,"offset":0,"lastrow":true,"result":{"user”:”abc”123,”ip”:”10.1.1.1”,”path”:”/home/data/contact“,”url_status”:”200”,”count":"7"}}\n']

 

This is the sample output which I am getting

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...