Getting Data In

Yet Another Duplicated JSON Field Data in Table

dbray_sd
Path Finder

I have seen a lot of similar questions and solutions for this aggravating issue, none of which work.

 

I'm trying to pull RabbitMQ API (JSON) data into Splunk. A bash script creates /opt/data/rabbitmq-queues.json:

curl -s -u test:test https://localhost:15671/api/queues | jq > /opt/data/rabbitmq-queues.json

 

The Universal Forwarder has the following props.conf on the RabbitMQ server:

[rabbitmq:queues:json]
AUTO_KV_JSON = false
INDEXED_EXTRACTIONS = JSON
KV_MODE = none

 And the inputs.conf: 

[batch:///opt/data/rabbitmq-queues.json]
disabled = false
index = rabbitmq
sourcetype = rabbitmq:queues:json
move_policy = sinkhole
crcSalt = <SOURCE>
initCrcLength = 1048576
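One side note on this kind of setup (not the cause of the duplication here): a `batch` input with `move_policy = sinkhole` consumes the file as soon as it appears, so if the collection script writes /opt/data/rabbitmq-queues.json in place, the forwarder can pick up a half-written file. A common workaround is to write to a temp file and rename it into place. A minimal sketch, with a hypothetical helper name:

```python
import json
import os
import tempfile

def write_atomic(path, events):
    """Serialize events as JSON to a temp file in the same directory,
    then rename it into place. os.replace is atomic on POSIX, so a
    reader (like the forwarder) sees either the old file or the
    complete new one, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(events, f)
    os.replace(tmp, path)
```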

We run btool on the Universal Forwarder to verify the settings are being applied correctly:

sudo /opt/splunkforwarder/bin/splunk btool props list --debug "rabbitmq:queues:json"

/opt/splunkforwarder/etc/apps/RabbitMQ_Settings/local/props.conf         [rabbitmq:queues:json]
/opt/splunkforwarder/etc/system/default/props.conf                       ADD_EXTRA_TIME_FIELDS = True
/opt/splunkforwarder/etc/system/default/props.conf                       ANNOTATE_PUNCT = True
/opt/splunkforwarder/etc/apps/RabbitMQ_Settings/local/props.conf         AUTO_KV_JSON = false
/opt/splunkforwarder/etc/system/default/props.conf                       BREAK_ONLY_BEFORE =
/opt/splunkforwarder/etc/system/default/props.conf                       BREAK_ONLY_BEFORE_DATE = True
/opt/splunkforwarder/etc/system/default/props.conf                       CHARSET = UTF-8
/opt/splunkforwarder/etc/system/default/props.conf                       DATETIME_CONFIG = /etc/datetime.xml
/opt/splunkforwarder/etc/system/default/props.conf                       DEPTH_LIMIT = 1000
/opt/splunkforwarder/etc/system/default/props.conf                       DETERMINE_TIMESTAMP_DATE_WITH_SYSTEM_TIME = false
/opt/splunkforwarder/etc/system/default/props.conf                       HEADER_MODE =
/opt/splunkforwarder/etc/apps/RabbitMQ_Settings/local/props.conf         INDEXED_EXTRACTIONS = JSON
/opt/splunkforwarder/etc/apps/RabbitMQ_Settings/local/props.conf         KV_MODE = none
/opt/splunkforwarder/etc/system/default/props.conf                       LB_CHUNK_BREAKER_TRUNCATE = 2000000
/opt/splunkforwarder/etc/system/default/props.conf                       LEARN_MODEL = true
/opt/splunkforwarder/etc/system/default/props.conf                       LEARN_SOURCETYPE = true
/opt/splunkforwarder/etc/system/default/props.conf                       LINE_BREAKER_LOOKBEHIND = 100
/opt/splunkforwarder/etc/system/default/props.conf                       MATCH_LIMIT = 100000
/opt/splunkforwarder/etc/system/default/props.conf                       MAX_DAYS_AGO = 2000
/opt/splunkforwarder/etc/system/default/props.conf                       MAX_DAYS_HENCE = 2
/opt/splunkforwarder/etc/system/default/props.conf                       MAX_DIFF_SECS_AGO = 3600
/opt/splunkforwarder/etc/system/default/props.conf                       MAX_DIFF_SECS_HENCE = 604800
/opt/splunkforwarder/etc/system/default/props.conf                       MAX_EVENTS = 256
/opt/splunkforwarder/etc/system/default/props.conf                       MAX_TIMESTAMP_LOOKAHEAD = 128
/opt/splunkforwarder/etc/system/default/props.conf                       MUST_BREAK_AFTER =
/opt/splunkforwarder/etc/system/default/props.conf                       MUST_NOT_BREAK_AFTER =
/opt/splunkforwarder/etc/system/default/props.conf                       MUST_NOT_BREAK_BEFORE =
/opt/splunkforwarder/etc/system/default/props.conf                       SEGMENTATION = indexing
/opt/splunkforwarder/etc/system/default/props.conf                       SEGMENTATION-all = full
/opt/splunkforwarder/etc/system/default/props.conf                       SEGMENTATION-inner = inner
/opt/splunkforwarder/etc/system/default/props.conf                       SEGMENTATION-outer = outer
/opt/splunkforwarder/etc/system/default/props.conf                       SEGMENTATION-raw = none
/opt/splunkforwarder/etc/system/default/props.conf                       SEGMENTATION-standard = standard
/opt/splunkforwarder/etc/system/default/props.conf                       SHOULD_LINEMERGE = True
/opt/splunkforwarder/etc/system/default/props.conf                       TRANSFORMS =
/opt/splunkforwarder/etc/system/default/props.conf                       TRUNCATE = 10000
/opt/splunkforwarder/etc/system/default/props.conf                       detect_trailing_nulls = false
/opt/splunkforwarder/etc/system/default/props.conf                       maxDist = 100
/opt/splunkforwarder/etc/system/default/props.conf                       priority =
/opt/splunkforwarder/etc/system/default/props.conf                       sourcetype =
/opt/splunkforwarder/etc/system/default/props.conf                       termFrequencyWeightedDist = false
/opt/splunkforwarder/etc/system/default/props.conf                       unarchive_cmd_start_mode = shell

 

 On the local Search Head we have the following props.conf:

[rabbitmq:queues:json]
KV_MODE = none
INDEXED_EXTRACTIONS = json
AUTO_KV_JSON = false

We run btool on the Search Head to verify the settings are being applied correctly:

sudo -u splunk /opt/splunk/bin/splunk btool props list --debug "rabbitmq:queues:json"

/opt/splunk/etc/apps/RabbitMQ_Settings/local/props.conf           [rabbitmq:queues:json]
/opt/splunk/etc/system/default/props.conf                         ADD_EXTRA_TIME_FIELDS = True
/opt/splunk/etc/system/default/props.conf                         ANNOTATE_PUNCT = True
/opt/splunk/etc/apps/RabbitMQ_Settings/local/props.conf           AUTO_KV_JSON = false
/opt/splunk/etc/system/default/props.conf                         BREAK_ONLY_BEFORE =
/opt/splunk/etc/system/default/props.conf                         BREAK_ONLY_BEFORE_DATE = True
/opt/splunk/etc/system/default/props.conf                         CHARSET = UTF-8
/opt/splunk/etc/system/default/props.conf                         DATETIME_CONFIG = /etc/datetime.xml
/opt/splunk/etc/system/default/props.conf                         DEPTH_LIMIT = 1000
/opt/splunk/etc/system/default/props.conf                         DETERMINE_TIMESTAMP_DATE_WITH_SYSTEM_TIME = false
/opt/splunk/etc/system/default/props.conf                         HEADER_MODE =
/opt/splunk/etc/apps/RabbitMQ_Settings/local/props.conf           INDEXED_EXTRACTIONS = json
/opt/splunk/etc/apps/RabbitMQ_Settings/local/props.conf           KV_MODE = none
/opt/splunk/etc/system/default/props.conf                         LB_CHUNK_BREAKER_TRUNCATE = 2000000
/opt/splunk/etc/system/default/props.conf                         LEARN_MODEL = true
/opt/splunk/etc/system/default/props.conf                         LEARN_SOURCETYPE = true
/opt/splunk/etc/system/default/props.conf                         LINE_BREAKER_LOOKBEHIND = 100
/opt/splunk/etc/system/default/props.conf                         MATCH_LIMIT = 100000
/opt/splunk/etc/system/default/props.conf                         MAX_DAYS_AGO = 2000
/opt/splunk/etc/system/default/props.conf                         MAX_DAYS_HENCE = 2
/opt/splunk/etc/system/default/props.conf                         MAX_DIFF_SECS_AGO = 3600
/opt/splunk/etc/system/default/props.conf                         MAX_DIFF_SECS_HENCE = 604800
/opt/splunk/etc/system/default/props.conf                         MAX_EVENTS = 256
/opt/splunk/etc/system/default/props.conf                         MAX_TIMESTAMP_LOOKAHEAD = 128
/opt/splunk/etc/system/default/props.conf                         MUST_BREAK_AFTER =
/opt/splunk/etc/system/default/props.conf                         MUST_NOT_BREAK_AFTER =
/opt/splunk/etc/system/default/props.conf                         MUST_NOT_BREAK_BEFORE =
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION = indexing
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-all = full
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-inner = inner
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-outer = outer
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-raw = none
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-standard = standard
/opt/splunk/etc/system/default/props.conf                         SHOULD_LINEMERGE = True
/opt/splunk/etc/system/default/props.conf                         TRANSFORMS =
/opt/splunk/etc/system/default/props.conf                         TRUNCATE = 10000
/opt/splunk/etc/system/default/props.conf                         detect_trailing_nulls = false
/opt/splunk/etc/system/default/props.conf                         maxDist = 100
/opt/splunk/etc/system/default/props.conf                         priority =
/opt/splunk/etc/system/default/props.conf                         sourcetype =
/opt/splunk/etc/system/default/props.conf                         termFrequencyWeightedDist = false
/opt/splunk/etc/system/default/props.conf                         unarchive_cmd_start_mode = shell

 

However, even with all that in place, we're still seeing duplicate values when using tables:

index="rabbitmq"
| table _time messages state

[screenshot dbray_sd_1-1734100914033.png: table output showing duplicated values in the messages and state columns]

 

 


isoutamo
SplunkTrust
You have INDEXED_EXTRACTIONS = json twice. Please remove one of them.

dbray_sd
Path Finder

From which one, the Search Head or the Universal Forwarder? Because I can already tell you, neither fixes the issue.

Removed from Search Head, issue remains.

Put it back on the Search Head, removed it from the Universal Forwarder, and the issue changes: the data is no longer parsed as JSON, so it loses all formatting and the table is blank.


isoutamo
SplunkTrust
It depends. In most cases I prefer the SH side, but it depends on your JSON and your needs.
Can you show your raw event on disk before indexing?

dbray_sd
Path Finder

Prefer the SH side for what? Having or not having INDEXED_EXTRACTIONS?

Depends on what? The needs are simple: we want Splunk to stop showing duplicated fields when rendering JSON data in tables, and to be consistent.

What is really confusing (and aggravating) about this is that we have other JSON feeds coming in, and those work just fine. However, each config appears to be different: sometimes the UF has the props.conf, sometimes the SH does; sometimes the props.conf has "INDEXED_EXTRACTIONS = JSON", sometimes it doesn't. It's really confusing why Splunk sometimes works properly and sometimes doesn't.

The JSON file is rather large, so put it on pastebin:
https://pastebin.com/VwkcdLLA

 

Appreciate your attention with this confusing issue. Let me know if you have any other questions.


isoutamo
SplunkTrust

In your case (since you have multiple multiline events in one JSON file), you should use INDEXED_EXTRACTIONS = json on the UF side. So remove it from the SH side.

If I read this file correctly, it contains 25 events?

Unfortunately I don't have a suitable environment to test the UF -> IDX -> SH path, but just leave INDEXED_EXTRACTIONS in the UF's props.conf (restart it after that) and remove it from the SH (and the IDX side, if you have it there too). Then it should work.

Usually props.conf should/must be on the indexer, or on the first full Splunk Enterprise instance on the path from UF to IDX. You could/should also put it on the SH when some runtime definitions are needed there. Only a few settings must be on the UF side. This https://www.aplura.com/assets/pdf/where_to_put_props.pdf describes when and where you should put each setting when ingesting events. You can find more instructions on Lantern and docs.splunk.com.

BTW, why are you using jq to pretty-print that JSON file? It adds lots of extra spaces, newline characters, and other unnecessary bytes to your input file. Those characters just increase your license consumption!
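The size difference is easy to see. A minimal sketch with invented queue records (the field names mirror the ones queried later in the thread): pretty-printing with an indent is roughly what `curl ... | jq` writes to disk, while compact serialization is what `jq -c` would produce instead.

```python
import json

# Invented sample resembling two RabbitMQ queue records.
queues = [
    {"name": "orders", "messages": 12, "state": "running"},
    {"name": "audit", "messages": 0, "state": "running"},
]

pretty = json.dumps(queues, indent=2)                # like `curl ... | jq`
compact = json.dumps(queues, separators=(",", ":"))  # like `curl ... | jq -c`

# Same events, fewer ingested (and licensed) bytes.
print(len(pretty), len(compact))
```

Switching the collection one-liner from `jq` to `jq -c` keeps the events identical while dropping the whitespace.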


dbray_sd
Path Finder

Well, unfortunately, as I stated above, "neither fixes the issue." No matter how I configure the UF props.conf or the SH props.conf, Splunk refuses to parse the JSON properly, even though other JSON data feeds work just fine. Guess I'll have to open another ticket with Splunk.


dbray_sd
Path Finder

Well, right as I went to create a ticket, I stumbled onto an old note that had the fix. I had to do the following on the SH:

sudo -u splunk mkdir -p /opt/splunk/etc/apps/RabbitMQ_Settings/metadata

sudo -u splunk vim /opt/splunk/etc/apps/RabbitMQ_Settings/metadata/local.meta

[]
access = read : [ * ], write : [ admin ]
export = system

 

I recalled something about permissions or some other weird Splunk requirement. Aggravating, but at least it's working as expected now.


PickleRick
SplunkTrust

OK. Two things.

1. @isoutamo it doesn't matter how many times the same setting is specified; only the "last one" is effective. So, on its own, specifying INDEXED_EXTRACTIONS twice doesn't do anything. Of course, which setting is the "last one" depends on the settings precedence.

2. @dbray_sd It makes sense. If you simply run btool on a SH without providing an app context, you get the effective settings flattened using "normal" setting precedence, without taking app contexts into account (as if all settings were specified in the global system context). For quite a few versions now, btool has supported an --app=something argument so you can evaluate your settings in an app context. But, as far as I remember, it still won't check user settings, and I'm not 100% sure whether it properly checks permissions.

So yes, your solution makes sense. If you haven't explicitly exported your app's contents they'll only be usable in that app's context.
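Point 1 above can be illustrated with a toy model (this is not Splunk's actual resolver, just the layering idea): each higher-precedence layer overwrites earlier ones key by key, so a setting defined twice simply collapses to the winning copy.

```python
# Toy illustration of layered config precedence: later (higher-precedence)
# layers overwrite earlier ones, so a duplicated setting is harmless on
# its own -- the winning copy is all that remains.
layers = [
    {"INDEXED_EXTRACTIONS": "json", "KV_MODE": "auto"},  # lower precedence
    {"INDEXED_EXTRACTIONS": "json", "KV_MODE": "none"},  # higher precedence
]

effective = {}
for layer in layers:
    effective.update(layer)

print(effective)
```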

isoutamo
SplunkTrust
Maybe this app https://splunkbase.splunk.com/app/6368 will help you see what you have in props.conf in your search context?