Hi,
I'm using the below line, while creating custom command for key value field extraction. My events are in json format. I would like to know the format in which the events are retrieved while the below line is given (string) or type of json_event.
json_events = splunk.Intersplunk.readResults(None, None, True)
I have also attached the entire script for your reference. I think I'm having problems while loading the json event
import json
import sys
import csv
import splunk.Intersplunk
(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)
json_events = splunk.Intersplunk.readResults(None, None, True)
for json_event in json_events:
json_data = json.load(json_event)
fields = (json_data['Fields'])
for f in fields:
if not f['values']:
json_event[f['Name']] = 'null'
elif 'value' not in f['values'][0]:
json_event[f['Name']] = 'null'
else:
json_event[f['Name']] = f['values'][0]['value']
splunk.Intersplunk.outputResults(json_events)
Thanks in advance
This will work, provided the events are valid JSON
import json
import sys
import csv
import splunk.Intersplunk as si
import logging as logger
import os
logger.basicConfig(level=logger.INFO, format='%(asctime)s %(levelname)s %(message)s',filename=os.path.join('/home','flynt','test.log'),filemode='a')
(isgetinfo, sys.argv) = si.isGetInfo(sys.argv)
json_events = si.readResults(None, None, True)
events=[]
row={}
for json_event in json_events:
for f in json_event:
if f=="_raw":
try:
json_event_raw=json.loads(json_event[f])
except:
json_event_raw={"Fields": [{"values": [{"value":"THIS IS NOT A VALID JSON EVENT"}],"Name": "ERROR"}]}
row[f]=json_event[f]
for fields in json_event_raw['Fields']:
if fields['values']!=[]:
row[fields['Name']]=fields['values'][0]['value']
else:
row[fields['Name']]="Null"
events.append(row)
row={}
si.outputResults(events)
If the events are not valid, a new field called ERROR will appear explaining so.
Basically we just take the _raw (JSON event) field from the results, parse and then output our newly created fields. Note that we are picking the first value of the values list. If you expect to have multiple values, you'll need to adjust accordingly.
This will work, provided the events are valid JSON
import json
import sys
import csv
import splunk.Intersplunk as si
import logging as logger
import os
logger.basicConfig(level=logger.INFO, format='%(asctime)s %(levelname)s %(message)s',filename=os.path.join('/home','flynt','test.log'),filemode='a')
(isgetinfo, sys.argv) = si.isGetInfo(sys.argv)
json_events = si.readResults(None, None, True)
events=[]
row={}
for json_event in json_events:
for f in json_event:
if f=="_raw":
try:
json_event_raw=json.loads(json_event[f])
except:
json_event_raw={"Fields": [{"values": [{"value":"THIS IS NOT A VALID JSON EVENT"}],"Name": "ERROR"}]}
row[f]=json_event[f]
for fields in json_event_raw['Fields']:
if fields['values']!=[]:
row[fields['Name']]=fields['values'][0]['value']
else:
row[fields['Name']]="Null"
events.append(row)
row={}
si.outputResults(events)
If the events are not valid, a new field called ERROR will appear explaining so.
Basically we just take the _raw (JSON event) field from the results, parse and then output our newly created fields. Note that we are picking the first value of the values list. If you expect to have multiple values, you'll need to adjust accordingly.
Hi Flynt,
still no luck. When try executing the command it just gives me an error
External search command 'jsonfields' returned error code 1.
I think there's no problem with the JSON. When I tried to run in pycharm with the sample json event as a File. I got it without any errors
import json
import sys
import csv
json_event = open(r'C:\Users\rodney.jerome.samuel\Desktop\sample_json.json')
json_data = json.load(json_event)
fields = (json_data['Fields'])
extracted_fields = {}
for f in fields:
if not f['values']:
extracted_fields[f['Name']] = 'null'
elif 'value' not in f['values'][0]:
extracted_fields[f['Name']] = 'null'
else:
extracted_fields[f['Name']] = f['values'][0]['value']
print(extracted_fields)
This is the sample output i got as dictionary. So I don't think there is any problem with json event.
{'user-template-29': 'null', 'has-change': 'null', 'user-template-28': 'MPT', 'user-template-05': 'SF'}
There might be a slight difference in your sample than what is posted here. (You can copy paste straight into jsonlint to see what I mean - http://jsonlint.com/). Additionally, the job inspector will help you figure out what error you are getting (it could be as simple as indentation). If you click the job inspector in Splunk then check the search.log, the output of your script error should be there.
At any rate, feel free to drop me a mail at fdeboer@splunk.com and we can look at this together. Once we figure out your solution we can come back here to share it with the community.
That really helped. Apparently there were some errors in python scripts that I need to take care of. The logic works perfectly fine. Thanks.
You might consider adding a logger so you'll know what event object is returned. For instance if I want to log it in my home directory -
import logging as logger
import os
logger.basicConfig(level=logger.INFO, format='%(asctime)s %(levelname)s %(message)s',filename=os.path.join('/home','flynt','test.log'),filemode='a')
Then add
logger.info(json_events)
To check what the json_events look like. In my own testing, I didn't see the fieldnames, etc, but simply the json object of the results.
Using your script as an example I was able to get what I think is your goal this way -
import json
import sys
import csv
import splunk.Intersplunk as si
import logging as logger
import os
logger.basicConfig(level=logger.INFO, format='%(asctime)s %(levelname)s %(message)s',filename=os.path.join('/home','flynt','test.log'),filemode='a')
(isgetinfo, sys.argv) = si.isGetInfo(sys.argv)
json_events = si.readResults(None, None, True)
events=[]
row={}
for json_event in json_events:
for f in json_event:
#logger.info(f)
if f=="Name":
if json_event[f] is None or json_event[f]=="":
row[f]="null"
else:
row[f]=str(json_event[f])
else:
row[f]=json_event[f]
events.append(row)
row={}
si.outputResults(events)
I used logger.info with a tail on /home/flynt/test.log to check the outputs.
It's not working. Below is my sample json input. I want the fields extracted
for example :
ut2: null
ut3:SF
VS:2
I need the python script for extracting the fields in such a way.
{
"Type": "defect",
"Fields": [
{
"values": [],
"Name": "ut2"
},
{
"values": [
{
"value": "SF"
}
],
"Name": "ut3"
},
{
"values": [
{
"value": "2"
}
],
"Name": "vs"
},
{
"values": [
{
"value": "N"
}
],
"Name": "attached"
},
{
"values": [
{
"value": "vh"
}
],
"Name": "pri"
},
],
}
I'm just getting the fields that are automatically extracted if I try your script. Not any change. Kindly help. I guess the json_events will have csv. Need to get _raw from that and then perform the json operations
any suggestions?
Can you explain how you are calling this? How are the events in Splunk itself? Can you give the output of the logger for the results? I can see the input but how Splunk presents them to your command is important for the parsing.
logger.info(json_events)
Also, linting this JSON input gives me an error.
Error: Parse error on line 26:
... "Name": "pri" }, ], }
----------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[', got ']'
There shouldn't be ending commas. For example, this json is valid -
{
"Type": "defect",
"Fields": [{
"values": [],
"Name": "ut2"
}, {
"values": [{
"value": "SF"
}],
"Name": "ut3"
}, {
"values": [{
"value": "2"
}],
"Name": "vs"
}, {
"values": [{
"value": "N"
}],
"Name": "attached"
}, {
"values": [{
"value": "vh"
}],
"Name": "pri"
}]
}