Hi Team,
I need your help. I am ingesting data with a Python script, i.e. a scripted input. The script itself populates the timestamp field correctly, but once the data is ingested into Splunk the timestamp field gets an extra value of none.
Hi
we need some more information to help you!
r. Ismo
The script is listed under Data inputs --> Scripts, so it should be a scripted input.
It is running on the HF.
There is no props.conf in local, only the default one.
I tested it using .. cmd python: the timestamp field comes out fine and there is no none value.
You probably need a props.conf for it on the HF to get the timestamp parsed correctly.
Could you show your inputs.conf, where you have defined the input, and also example output from the script?
My inputs.conf looks like this:
[script://./bin/networker_alerts.py]
disabled = 0
index = test
interval = 4-59/5 * * * *
source = test_1
sourcetype = _json
and the output of the script looks like this:
{"category": "disk space", "message": "'xxx' host '/nsr' disk path occupied with '92.42%' of disk space. Free up the space.", "priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}
In the script output the timestamp field contains only the time value, but once the data is in Splunk the field shows an extra value of none.
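For comparison, a scripted input that emits events of the shape shown above could be sketched roughly as follows. This is not the poster's networker_alerts.py (which is not shown in the thread); the function name format_event and the example values are hypothetical. The point is only that each event is written to stdout as one JSON object per line, with a timezone-aware ISO 8601 timestamp.

```python
import json
from datetime import datetime, timezone


def format_event(category, message, priority, when):
    """Return one event as a single-line JSON string.

    Hypothetical helper: refuses naive datetimes so the timestamp
    always carries a UTC offset such as +02:00.
    """
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    event = {
        "category": category,
        "message": message,
        "priority": priority,
        # isoformat() on an aware datetime yields e.g. 2023-07-03T08:51:25+02:00
        "timestamp": when.isoformat(timespec="seconds"),
    }
    return json.dumps(event)


if __name__ == "__main__":
    # Local time with its UTC offset attached.
    now = datetime.now(timezone.utc).astimezone()
    # One JSON object per line on stdout is what Splunk reads from a scripted input.
    print(format_event("disk space", "example message", "warning", now), flush=True)
```

Building the whole event as a dict and serializing it once makes it easy to check that nothing other than the JSON line reaches stdout.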
You should always define your own sourcetype instead of using _json. So change your inputs.conf like this:
[script://./bin/networker_alerts.py]
disabled = 0
index = test
interval = 4-59/5 * * * *
source = script:networker_alerts.py
sourcetype = json:with:timestamp
Then you should define props.conf for that sourcetype on that HF. Please create a separate app for it.
[json:with:timestamp]
SHOULD_LINEMERGE=true
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
category=Structured
description=Your own JSON definition for networker_alerts.py script
disabled=false
pulldown_type=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
Then you must restart that HF so that it reads that props.conf.
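Before restarting, it can help to confirm that the values the script emits really have the shape that TIME_FORMAT expects. The sketch below does that check in Python, as an approximation only: Splunk's %:z has no identical directive in Python, so the code assumes Python 3.7+, where %z also accepts an offset written with a colon (+02:00). The helper name matches_expected_format is hypothetical.

```python
from datetime import datetime

# Python approximation of the Splunk TIME_FORMAT %Y-%m-%dT%H:%M:%S%:z
# (assumption: Python 3.7+, where %z accepts "+02:00").
PY_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def matches_expected_format(value):
    """Return True if value parses as a timestamp with a UTC offset."""
    try:
        parsed = datetime.strptime(value, PY_FORMAT)
    except (TypeError, ValueError):
        # TypeError covers a real None, ValueError covers strings like "None".
        return False
    return parsed.tzinfo is not None


if __name__ == "__main__":
    for candidate in ("2023-07-03T08:51:25+02:00", "2023-07-03T08:51:25", None):
        print(repr(candidate), matches_expected_format(candidate))
```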
The none value is gone from the timestamp field, but now I am getting duplicate values in the timestamp field for a single event.
See below. Please help to resolve this.
Usually duplicate values "exist" when you have both KV_MODE=json and INDEXED_EXTRACTIONS=json defined.
@isoutamo, what would be the solution? Please help.
What kind of environment do you have? A single node that is indexer and search head and also runs this input, or a distributed environment with separate nodes for the different roles?
If the latter, are you sure that you don't have KV_MODE=json on the search head? You cannot use it, as you already have INDEXED_EXTRACTIONS=json for your scripted input on the HF/indexer.
When I removed INDEXED_EXTRACTIONS from props.conf, I started getting the same data as at the beginning again:
the category, priority and message fields have a single value, but the timestamp field again has 2 values, none and the timestamp.
If there were a KV_MODE=json on any node, such as the search head, the data would not have gone back to the old format;
it would still have had duplicate values.
On the HF/IDX (where the scripted input is running) you should use INDEXED_EXTRACTIONS=json, and on the SH (I suppose that this is a different node) you must have KV_MODE=none. Or vice versa, but both cannot be active at the same time. If you are not using INDEXED_EXTRACTIONS=json, you must take care of the timestamp -> _time conversion in some other way.
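The interaction described above can be illustrated with a toy model. This is not how Splunk is implemented internally; it only mimics the visible effect, under the assumption that index-time extraction and search-time extraction each contribute one value per JSON key. The function name search_view and its flags are hypothetical.

```python
import json


def search_view(raw, indexed_extractions, kv_mode_json):
    """Toy model: collect the field values a search would show for one event."""
    fields = {}
    parsed = json.loads(raw)
    if indexed_extractions:
        # Values stored with the event at index time (INDEXED_EXTRACTIONS=json).
        for key, value in parsed.items():
            fields.setdefault(key, []).append(value)
    if kv_mode_json:
        # Values extracted again from _raw at search time (KV_MODE=json).
        for key, value in parsed.items():
            fields.setdefault(key, []).append(value)
    return fields


if __name__ == "__main__":
    raw = '{"priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}'
    print(search_view(raw, indexed_extractions=True, kv_mode_json=True))
    print(search_view(raw, indexed_extractions=True, kv_mode_json=False))
```

With both flags on, every field ends up with two identical values; with only one of them on, each field has a single value.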
I have a distributed environment. The indexers are clustered, the search heads are not in a search head cluster, and there is also a heavy forwarder.
I was testing this script in the dev environment before making changes in prod, so I placed the scripted input on the search head, not on the HF.
props.conf is also on the search head, with the settings below, as you described in the post above:
[json_scripted_input]
SHOULD_LINEMERGE=true
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json [on search head]
KV_MODE=none [on search head]
category=Structured
description=Your own JSON definition for networker_alerts.py script
disabled=false
pulldown_type=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
#AUTO_KV_JSON=false
INDEXED_EXTRACTIONS=json: when I disable it, the category, message and priority fields are fine with a single value, but timestamp gets 2 values, i.e. none and the timestamp.
When I enable it, all fields have duplicate values; timestamp also has duplicate values, but without none.
Please suggest.
Also, let me know how I can convert timestamp into _time if I am not using INDEXED_EXTRACTIONS=json.
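One common approach when INDEXED_EXTRACTIONS is not used is to let Splunk find the timestamp in the raw text, using the props.conf settings TIME_PREFIX, TIME_FORMAT and MAX_TIMESTAMP_LOOKAHEAD. The sketch below only checks in Python that a candidate prefix regex lands on the right spot in the sample event and what epoch value would result. The regex and the lookahead of 25 characters are assumptions for illustration, not settings taken from this thread, and Python's %z stands in for Splunk's %:z (Python 3.7+).

```python
import re
from datetime import datetime

# Candidate TIME_PREFIX regex (assumption): match up to the opening quote of the value.
TIME_PREFIX = r'"timestamp":\s*"'
# Candidate MAX_TIMESTAMP_LOOKAHEAD (assumption): length of 2023-07-03T08:51:25+02:00.
LOOKAHEAD = 25


def extract_epoch(raw):
    """Return the epoch seconds found after TIME_PREFIX, or None if no match."""
    match = re.search(TIME_PREFIX, raw)
    if match is None:
        return None
    candidate = raw[match.end():match.end() + LOOKAHEAD]
    return datetime.strptime(candidate, "%Y-%m-%dT%H:%M:%S%z").timestamp()


if __name__ == "__main__":
    sample = '{"priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}'
    print(extract_epoch(sample))
```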
In a distributed on-prem environment you shouldn't use the SH as a HF to run that script if you can use a separate HF.
Can you run this on the SH's command line:
splunk btool props list json_scripted_input --debug | egrep '(INDEXED_EXTRACTIONS|KV_MODE|AUTO_KV_JSON)'
This will show what those values are on the SH.
Do you want me to place the scripted input on the HF, with KV_MODE=none on the SH? One doubt: does that change anything? Will it solve the problem of duplicate values?
Please find below the output of the above command on the SH node.
splunk btool props list json_scripted_input --debug | egrep '(INDEXED_EXTRACTIONS|KV_MODE|AUTO_KV_JSON)'
/opt/splunk/etc/system/default/props.conf AUTO_KV_JSON = true
/opt/splunk/etc/apps/networker_inputs/local/props.conf INDEXED_EXTRACTIONS = json
/opt/splunk/etc/apps/networker_inputs/local/props.conf KV_MODE = none
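To see at a glance which file supplies each setting, output like the above can be turned into a small lookup table. This is a minimal sketch that assumes each line has the layout shown here (file path, whitespace, then key = value); the function name parse_btool_debug is hypothetical and lines without an equals sign are skipped.

```python
def parse_btool_debug(output):
    """Map each setting name to a (value, source_file) tuple."""
    settings = {}
    for line in output.splitlines():
        parts = line.split(None, 1)  # path, then the rest of the line
        if len(parts) != 2 or "=" not in parts[1]:
            continue  # blank lines, stanza headers, anything unexpected
        path, assignment = parts
        key, value = assignment.split("=", 1)
        settings[key.strip()] = (value.strip(), path)
    return settings


if __name__ == "__main__":
    sample = """\
/opt/splunk/etc/system/default/props.conf AUTO_KV_JSON = true
/opt/splunk/etc/apps/networker_inputs/local/props.conf INDEXED_EXTRACTIONS = json
/opt/splunk/etc/apps/networker_inputs/local/props.conf KV_MODE = none
"""
    for name, (value, path) in parse_btool_debug(sample).items():
        print(f"{name} = {value}  (from {path})")
```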
In a distributed environment the HF is the place where all scripted inputs, modular inputs etc. belong. So let's move it there and then check what happens on the SH when you set INDEXED_EXTRACTIONS=none on it. This shouldn't change anything, as that setting applies at ingest time, but let's try it. Basically you should remove that networker_inputs app from the SH.
The btool output looks reasonable on the SH side.
I have moved the scripted input app to the HF, but nothing changed. The same duplicate values are still coming.
props.conf is also there on the HF. Do I need to configure props.conf on the search head as well?
I have placed the networker_inputs app in both places, SH and HF, under /apps.
I have disabled the data input on the SH and it is enabled on the HF.
HF props.conf
[json_scripted_input]
INDEXED_EXTRACTIONS=json
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
DATETIME_CONFIG =
SH props.conf
[json_scripted_input]
#INDEXED_EXTRACTIONS=json
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
DATETIME_CONFIG =
KV_MODE = none
Please check and let me know whether this is fine.
After adding INDEXED_EXTRACTIONS=json, I am getting duplicate field values.