Getting Data In

Scripted input produces a "none" value in the timestamp field

anilkapoor123
Explorer

Hi Team,

I need your help. I am ingesting data with a Python script (a scripted input), and the timestamp field is getting a none value. The script itself populates the data correctly, but once the data is ingested into Splunk, the timestamp field carries an extra value of none.


isoutamo
SplunkTrust

Hi,

We need some more information to help you:

  • Is this a scripted input or a modular input?
  • If it is a scripted input, can you show your inputs.conf and the output of the script?
  • Have you tested it with "splunk cmd python <your script>"?
  • What is your props.conf for this sourcetype/source?
  • Are you running this on a UF, an HF, or somewhere else?

r. Ismo
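As a rough aid for the "splunk cmd python" check in the list above, the script's output can be validated line by line before Splunk ever sees it. The sketch below is only an illustration: the function name check_event_line is hypothetical, and it assumes the script prints one JSON object per line with a "timestamp" key, as in the sample shown later in this thread.

```python
import json

def check_event_line(line):
    """Return a list of problems found in one line of script output."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    value = event.get("timestamp")
    # Flag a missing key, a JSON null, an empty string, or a literal "none".
    if value is None or str(value).strip().lower() in ("", "none"):
        problems.append("timestamp is missing, empty, or 'none'")
    return problems

sample = '{"category": "disk space", "priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}'
print(check_event_line(sample))  # prints []
print(check_event_line('{"timestamp": null}'))
```

If every line of the script's output comes back with an empty list, the none value is more likely introduced by the parsing configuration than by the script.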

 

anilkapoor123
Explorer

The script is under Data inputs --> Scripts, so it should be a scripted input.

It is running on an HF.

There is no props.conf in local, only the default one.

I tested it with "splunk cmd python": the data comes through fine in the timestamp field, and there is no none value.

isoutamo
SplunkTrust

You probably need a props.conf for it on the HF to get the timestamp parsed correctly.

Could you show the inputs.conf where you defined the input, and also some example output from the script?


anilkapoor123
Explorer

My inputs.conf looks like this:

[script://./bin/networker_alerts.py]
disabled = 0
index = test
interval = 4-59/5 * * * *
source = test_1
sourcetype = _json

and the output of the script looks like this:

{"category": "disk space", "message": "'xxx' host '/nsr' disk path occupied with '92.42%' of disk space. Free up the space.", "priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}

Here the timestamp field contains only the time value, but when the data is indexed in Splunk it shows up like this:

anilkapoor123_0-1688470377100.png
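The actual networker_alerts.py is not shown in this thread, so the following is only a sketch of how a script could emit events in the shape above: one JSON object per line, with an ISO 8601 timestamp whose UTC offset contains a colon. The function name build_event and the message text are assumptions made for illustration.

```python
import json
from datetime import datetime, timedelta, timezone

def build_event(category, message, priority, when):
    """Serialize one alert as a single-line JSON object."""
    return json.dumps({
        "category": category,
        "message": message,
        "priority": priority,
        # isoformat() on a timezone-aware datetime writes the offset as +HH:MM.
        "timestamp": when.isoformat(timespec="seconds"),
    })

# Fixed example time with a +02:00 offset, matching the sample event above.
when = datetime(2023, 7, 3, 8, 51, 25, tzinfo=timezone(timedelta(hours=2)))
line = build_event("disk space", "example message", "warning", when)
print(line)
```

Printing exactly one object per line keeps event boundaries unambiguous for a line-based LINE_BREAKER.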

 

isoutamo
SplunkTrust

You should always define your own sourcetype instead of using _json. Change your inputs.conf like this:

[script://./bin/networker_alerts.py]
disabled = 0
index = test
interval = 4-59/5 * * * *
source = script:networker_alerts.py
sourcetype = json:with:timestamp

Then define a props.conf for that sourcetype on that HF. Please create a separate app for it:

[json:with:timestamp]
SHOULD_LINEMERGE=true
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
category=Structured
description=Your own JSON definition for networker_alerts.py script
disabled=false
pulldown_type=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp

Then you must restart the HF so that it reads the new props.conf.
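One way to sanity-check that the TIME_FORMAT above fits the sample event is to parse the same string outside Splunk. The sketch below uses Python's strptime as a stand-in; it is not Splunk's parser. Note the spelling difference: Splunk's %:z denotes an offset with a colon, while Python's %z accepts that form directly (Python 3.7 or later is assumed).

```python
from datetime import datetime, timedelta

# Python counterpart of TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
PY_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

def parse_timestamp(value):
    """Parse a timestamp string; raises ValueError if it does not fit the format."""
    return datetime.strptime(value, PY_FORMAT)

parsed = parse_timestamp("2023-07-03T08:51:25+02:00")
print(parsed.isoformat())
print(parsed.utcoffset() == timedelta(hours=2))
```

If the script ever changes its timestamp layout, for example by adding fractional seconds, this parse fails with a ValueError, which is a hint that TIME_FORMAT needs the same adjustment.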

anilkapoor123
Explorer

The none value is gone from the timestamp field, but now I am getting duplicate values in the timestamp field for a single event.

See below. Please help me resolve this.

anilkapoor123_0-1688478287829.png

 

isoutamo
SplunkTrust

Duplicate values usually appear when you have both KV_MODE=json and INDEXED_EXTRACTIONS=json defined.


anilkapoor123
Explorer

@isoutamo, what would be the solution? Please help.

isoutamo
SplunkTrust

What kind of environment do you have? Is it a single node that acts as indexer and search head and also runs this input, or is it a distributed environment with separate nodes for the different roles?

If it is the latter, are you sure that you don't have KV_MODE=json on the search head? You cannot use it, as you already have INDEXED_EXTRACTIONS=json for your scripted input on the HF/indexer.


anilkapoor123
Explorer

When I removed INDEXED_EXTRACTIONS from props.conf, I started getting the same data as at the beginning: the category, priority and message fields have a single value, but the timestamp field again has two values, none and the timestamp.

If there were a KV_MODE=json on any node, such as the search head, the data would not have gone back to the old format. It would still have shown duplicate values.

 

isoutamo
SplunkTrust

On the HF/IDX (where the scripted input is running) you should use INDEXED_EXTRACTIONS=json, and on the SH (I suppose this is a different node) you must have KV_MODE=none. Or vice versa, but both cannot be active at the same time. If you are not using INDEXED_EXTRACTIONS=json, you must handle the timestamp -> _time conversion some other way.
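The rule described above can be pictured with a deliberately simplified model: if JSON fields are extracted once at index time and once more at search time, each field ends up with two values. The sketch below is a toy illustration only and does not reflect how Splunk is implemented; the function name extract_fields and its parameters are hypothetical.

```python
import json

def extract_fields(raw_event, index_time_json, search_time_json):
    """Toy model: every enabled JSON extraction pass adds one value per field."""
    fields = {}
    for enabled in (index_time_json, search_time_json):
        if not enabled:
            continue
        for key, value in json.loads(raw_event).items():
            fields.setdefault(key, []).append(value)
    return fields

raw = '{"priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}'
both = extract_fields(raw, index_time_json=True, search_time_json=True)
one = extract_fields(raw, index_time_json=True, search_time_json=False)
print(both["timestamp"])  # two identical values
print(one["timestamp"])   # a single value
```

In this model, turning off either pass is enough to get back to one value per field, which is why the advice is to keep exactly one of the two extraction mechanisms active.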


anilkapoor123
Explorer

I have a distributed environment. The search heads and indexers are part of a cluster, but the search heads are not in a search head cluster. There is also a heavy forwarder.

I was testing this script in a dev environment before making changes in prod, so I placed the scripted input on a search head, not on an HF.

The props.conf is also on the search head and contains the configuration below, as you suggested in the post above:

[json_scripted_input]
SHOULD_LINEMERGE=true
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
# INDEXED_EXTRACTIONS and KV_MODE are both set on the search head
INDEXED_EXTRACTIONS=json
KV_MODE=none
category=Structured
description=Your own JSON definition for networker_alerts.py script
disabled=false
pulldown_type=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
#AUTO_KV_JSON=false

When I disable INDEXED_EXTRACTIONS=json, the category, message and priority fields each have a single value, but timestamp has two values (none and the timestamp).

When I enable it, all fields have duplicate values. The timestamp is also duplicated, but without none.

Please suggest a fix.

Also, if I do not use INDEXED_EXTRACTIONS=json, how can I convert timestamp into _time?
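Without INDEXED_EXTRACTIONS, timestamp recognition in props.conf is normally driven by TIME_PREFIX, MAX_TIMESTAMP_LOOKAHEAD and TIME_FORMAT, which locate and parse the timestamp in the raw event text. The thread does not show such a configuration, so the following is only a Python sketch of that idea. The prefix regex, the lookahead of 25 characters and the function name extract_time are assumptions chosen to fit the sample event.

```python
import re
from datetime import datetime

# Assumed regex for the text that immediately precedes the timestamp value.
TIME_PREFIX = r'"timestamp":\s*"'
# Length of a value such as 2023-07-03T08:51:25+02:00.
LOOKAHEAD = 25

def extract_time(raw_event):
    """Find the timestamp after the prefix and parse it, or return None."""
    match = re.search(TIME_PREFIX, raw_event)
    if match is None:
        return None
    candidate = raw_event[match.end():match.end() + LOOKAHEAD]
    # Python's %z accepts +02:00 (Python 3.7+); Splunk writes this as %:z.
    return datetime.strptime(candidate, "%Y-%m-%dT%H:%M:%S%z")

raw = '{"category": "disk space", "priority": "warning", "timestamp": "2023-07-03T08:51:25+02:00"}'
print(extract_time(raw))
```

The prefix pins the search to the timestamp key, so a date that happens to appear inside the message text would not be picked up by mistake.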

isoutamo
SplunkTrust

In a distributed on-prem environment you shouldn't use an SH as an HF to run that script if you can use a separate HF.

Can you run this on the SH's command line:

splunk btool props list json_scripted_input --debug | egrep '(INDEXED_EXTRACTIONS|KV_MODE|AUTO_KV_JSON)'

This will show what those values are on the SH.


anilkapoor123
Explorer

Do you want me to place the scripted input on the HF and set KV_MODE=none on the SH? One doubt: will that change anything? Will it solve the problem of duplicate values?

Please find below the output of the above command on the SH node.

 

splunk btool props list json_scripted_input --debug | egrep '(INDEXED_EXTRACTIONS|KV_MODE|AUTO_KV_JSON)'


/opt/splunk/etc/system/default/props.conf AUTO_KV_JSON = true
/opt/splunk/etc/apps/networker_inputs/local/props.conf INDEXED_EXTRACTIONS = json
/opt/splunk/etc/apps/networker_inputs/local/props.conf KV_MODE = none
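Output in this shape can also be checked programmatically for the conflicting combination discussed earlier in the thread. The sketch below assumes each line is "path key = value" with no spaces in the path, as in the lines above; the helper names parse_btool_line and has_double_extraction are hypothetical.

```python
def parse_btool_line(line):
    """Split one 'path key = value' line into (path, key, value)."""
    path, setting = line.strip().split(None, 1)
    key, _, value = setting.partition("=")
    return path, key.strip(), value.strip()

def has_double_extraction(lines):
    """True if both index-time and search-time JSON extraction are enabled."""
    settings = {key: value for _, key, value in map(parse_btool_line, lines)}
    return (settings.get("INDEXED_EXTRACTIONS", "").lower() == "json"
            and settings.get("KV_MODE", "").lower() == "json")

btool_output = [
    "/opt/splunk/etc/system/default/props.conf AUTO_KV_JSON = true",
    "/opt/splunk/etc/apps/networker_inputs/local/props.conf INDEXED_EXTRACTIONS = json",
    "/opt/splunk/etc/apps/networker_inputs/local/props.conf KV_MODE = none",
]
print(has_double_extraction(btool_output))  # prints False
```

For the output posted here the check returns False, consistent with the observation that the settings on the SH look reasonable.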

isoutamo
SplunkTrust

In a distributed environment the HF is the place where all scripted inputs, modular inputs, etc. belong. So let's move it there and then check what happens on the SH when you set INDEXED_EXTRACTIONS=none on it. This shouldn't change anything, as that setting applies at ingestion time, but let's try it. Basically you should remove the networker_inputs app from the SH.

The btool output looks reasonable on the SH side.


anilkapoor123
Explorer

I have moved the scripted input app to the HF, but nothing changed. The same duplicate values are still coming in.

The props.conf is also there on the HF. Do I need to configure props.conf on the search head as well?

 

isoutamo
SplunkTrust
Do you also have "INDEXED_EXTRACTIONS = json" there?

You should have a dedicated development/test server where you test and do all onboarding. Once events are shown correctly in that environment, create TAs/apps for those configurations and deploy them to the correct places.

anilkapoor123
Explorer

@isoutamo 

I have placed the networker_inputs app in both places, on the SH and on the HF, under /apps.

I have disabled the data input on the SH, and it is enabled on the HF.

HF props.conf 

[json_scripted_input]
INDEXED_EXTRACTIONS=json
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
DATETIME_CONFIG =

SH props.conf

[json_scripted_input]
#INDEXED_EXTRACTIONS=json
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%:z
TIMESTAMP_FIELDS=timestamp
DATETIME_CONFIG =
KV_MODE = none

Please check and let me know whether this is fine.

isoutamo
SplunkTrust
On the SH side you need only KV_MODE. Otherwise those look OK.

anilkapoor123
Explorer

After adding INDEXED_EXTRACTIONS = json, I am getting duplicate field values.
