Hi all,
I'm new to Splunk.
I'm monitoring a path that points to a JSON file. inputs.conf is set up to monitor the file path as shown below, and I'm using the sourcetype _json:
[monitor://<windows path to the file>\*.json]
disabled = false
index = index_name
sourcetype = _json
Below is the error I'm getting in the internal Splunk logs:
11-12-2018 23:04:02.745 +0000 ERROR JsonLineBreaker - JSON StreamId:11440680656958819810 had parsing error:Unexpected character while looking for value: ']' - data_source="<path of windows monitoring\>cucumber.json", data_host="<hostname>", data_sourcetype="_json"
Is there a way to resolve this? Sometimes the data gets indexed, but sometimes it is missing. The Splunk logs show the error above on most days, so I'm not sure it's working as expected.
I'd appreciate any help with this.
Sample JSON added:
[
{
"line": 1,
"elements": [
{
"before": [
{
"result": {
"duration": 3007379010,
"status": "passed"
},
"match": {
"location": "Hook.InitializeTest()"
}
}
],
"line": 3,
"name": "Verify the health of XXXXXX Application",
"description": "",
"id": "XXXXXX-application;verify-the-health-of-XXXXXX-application",
"after": [
{
"result": {
"duration": 1506065506,
"status": "passed"
},
"match": {
"location": "Hook.TearDownTest(Scenario)"
}
}
],
"type": "scenario",
"keyword": "Scenario",
"steps": [
{
"result": {
"duration": 1499020197,
"status": "passed"
},
"line": 4,
"name": "launch the XXXXXX application URL",
"match": {
"location": "XXXXXXStep.user_launches_the_XXXXXX_application_URL()"
},
"keyword": "When "
},
{
"result": {
"duration": 47893377,
"status": "passed"
},
"line": 5,
"name": "the XXXXXX application launched successfully",
"match": {
"location": "XXXXXXStep.the_XXXXXX_application_launched_successfully()"
},
"keyword": "Then "
},
{
"result": {
"duration": 3762996694,
"status": "passed"
},
"line": 6,
"name": "login into XXXXXX application",
"match": {
"location": "XXXXXXStep.login_into_XXXXXX_application()"
},
"keyword": "When "
},
{
"result": {
"duration": 279313222,
"status": "passed"
},
"line": 7,
"name": "verify page title displayed as \"XXXXXX Dashboard\"",
"match": {
"arguments": [
{
"val": "XXXXXX Dashboard",
"offset": 32
}
],
"location": "GenericStep.verify_page_title_displayed_as(String)"
},
"keyword": "Then "
},
{
"result": {
"duration": 18273214543,
"status": "passed"
},
"line": 8,
"name": "Create XXXXXX with \"default checkbox\" checked",
"match": {
"arguments": [
{
"val": "default checkbox",
"offset": 20
}
],
"location": "XXXXXXStep.Create_XXXXXX_with_checked(String)"
},
"keyword": "When "
},
{
"result": {
"duration": 336633357,
"status": "passed"
},
"line": 9,
"name": "verify XXXXXX is displayed successfully with name \"Client Rate\"",
"match": {
"arguments": [
{
"val": "Client Rate",
"offset": 51
}
],
"location": "XXXXXXStep.verify_XXXXXX_is_displayed_successfully(String)"
},
"keyword": "Then "
},
{
"result": {
"duration": 18389039668,
"status": "passed"
},
"line": 10,
"name": "Create XXXXXX with \"clean growth\" checked",
"match": {
"arguments": [
{
"val": "clean growth",
"offset": 20
}
],
"location": "XXXXXXStep.Create_XXXXXX_with_checked(String)"
},
"keyword": "When "
},
{
"result": {
"duration": 327113991,
"status": "passed"
},
"embeddings": [
{
"data": "some very big data here",
"mime_type": "image/png"
}
],
"line": 7,
"name": "Health of EUR Forwards is ok",
"match": {
"location": "YYYYYYYStep.health_of_EUR_Forwards_is_ok()"
},
"keyword": "Then "
}
],
"tags": [
{
"line": 2,
"name": "@Ready2"
}
]
}
],
"name": "YYYYYYY Application EUR Fwds Check",
"description": "",
"id": "YYYYYYY-application-eur-fwds-check",
"keyword": "Feature",
"uri": "YYYYYYYEURFwdsCheck.feature"
}
]
Try:
LINE_BREAKER = "uri":+[^\}]+\}(,[\r\n]+)
Or:
LINE_BREAKER = \}(,[\r\n\s]+)\{[\r\n\s]+"line":\s1
I tried both of them, but it's still not splitting. Below is the props.conf I'm using:
[sourcetype]
TRUNCATE = 0
KV_MODE = json
#INDEXED_EXTRACTIONS = json
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
LINE_BREAKER = "uri":+[^\}]+\}(,[\r\n]+)
#MUST_BREAK_AFTER =
DATETIME_CONFIG =
#pulldown_type = 1
Well, then I can only think of 2 things: either the sample data you're posting here is not representative, or your props.conf is not deployed correctly.
Have you tried running btool to see if Splunk reads the config properly? (e.g. ./splunk cmd btool props list yoursourcetype --debug)
Have you restarted splunk after making these props.conf changes?
On what instance have you deployed this props.conf? It must be on the first Splunk Enterprise instance that processes the data (so either a heavy forwarder if one is involved or on your indexer(s)).
Thanks a lot FrankVI, the problem was that the props.conf was deployed on the UF but not on the indexer. I have created the entries in /etc/system/local and it started to work.
Right, yeah, UF doesn't do linebreaking, that happens on the indexer, so that config needs to be there.
I'll convert my comment with the working linebreaking setting to an answer, so you can mark that as accepted.
I have tried this in the UF props.conf. Earlier, when I used the built-in _json sourcetype, I could not figure out why the entries were not getting indexed. Now I can see the events; I just need help with the line breaker for this JSON sourcetype:
[sourcetype]
TRUNCATE = 0
KV_MODE = json
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
#MUST_BREAK_AFTER = "uri":+\s+"+\w+.+\w+"+\s+\},
LINE_BREAK = "uri":+\s+"+\w+.+\w+"+\s+\},
DATETIME_CONFIG =
I'm planning to break events at this point in the sample, but it isn't working: instead of breaking after this pattern, it breaks on every line.
"uri": "YYYYYYYEURFwdsCheck.feature"
}
That's because your linebreaker is invalid. The setting is called LINE_BREAKER, and a LINE_BREAKER regex must include a capture group that captures the characters between the end of the previous event and the start of the next event (e.g. a newline).
Assuming what you have now is the end of an event, try this:
LINE_BREAKER = "uri":+\s+"+\w+.+\w+"+\s+\}(,[\r\n]+)
If that doesn't work, please provide a sample that contains multiple events (perhaps strip out the middle part of each event, to keep the sample small).
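To see what the capture group does, here is a rough Python sketch of the behaviour described above. It is only an approximation: Splunk uses PCRE inside its own parsing pipeline, while this uses Python's re module, and split_events and SAMPLE are names made up for the illustration. The text matched by the first capture group is thrown away, and whatever sits before and after it becomes separate events.

```python
import re

# Shortened, hypothetical trim of a multi-event cucumber.json file.
SAMPLE = """[
{
"line": 1,
"elements": [
],
"keyword": "Feature",
"uri": "AppFeature.feature"
},
{
"line": 1,
"elements": [
],
"keyword": "Feature",
"uri": "AppFwdsDeskCheck.feature"
}
]"""


def split_events(raw, line_breaker):
    """Approximate LINE_BREAKER: cut at the first capture group and discard it."""
    pattern = re.compile(line_breaker)
    if pattern.groups < 1:
        # Mirrors the rule that the regex must contain a capture group.
        raise ValueError("LINE_BREAKER needs a capture group")
    events = []
    start = 0
    for match in pattern.finditer(raw):
        events.append(raw[start:match.start(1)])  # event ends where group 1 starts
        start = match.end(1)                      # next event starts where group 1 ends
    events.append(raw[start:])
    return events


events = split_events(SAMPLE, r'"uri":+\s+"+\w+.+\w+"+\s+\}(,[\r\n]+)')
for number, event in enumerate(events, start=1):
    print("--- event", number)
    print(event)
```

With this sample the first event still carries the leading "[" and the last one the trailing "]", because the regex only removes the comma and newline between objects. Passing a pattern without a capture group raises the ValueError instead of splitting anything.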
I have tried the line breaker, but it's still not breaking as expected.
Below is the sample:
[
{
"line": 1,
"elements": [
],
"name": "App Application EUR Fwds Check",
"description": "",
"id": "App-application-eur-fwds-check",
"keyword": "Feature",
"uri": "AppFeature.feature"
},
{
"line": 1,
"elements": [
],
"name": "App Application Fwds Desk Check",
"description": "",
"id": "App-application-fwds-desk-check",
"keyword": "Feature",
"uri": "AppFwdsDeskCheck.feature"
},
{
"line": 1,
"elements": [
],
"name": "App Application GBP Fwds Check",
"description": "",
"id": "App-application-gbp-fwds-check",
"keyword": "Feature",
"uri": "AppGBPFwdsCheck.feature"
},
{
"line": 1,
"elements": [
],
"name": "App Application Liabilities Check",
"description": "",
"id": "App-application-liabilities-check",
"keyword": "Feature",
"uri": "AppLiabilitiesCheck.feature"
},
{
"line": 1,
"elements": [
],
"name": "App Application Spot Check",
"description": "",
"id": "App-application-spot-check",
"keyword": "Feature",
"uri": "AppSpotCheck.feature"
},
{
"line": 1,
"elements": [
],
"name": "App Application USD Fwds Check",
"description": "",
"id": "App-application-usd-fwds-check",
"keyword": "Feature",
"uri": "AppUSDFwdsCheck.feature"
}
]
Some sample data would indeed help.
Also: you might want to consider defining your own sourcetype rather than relying on the built-in _json sourcetype. That not only makes it easier to recognize your data at search time, but also allows you to configure line breaking and timestamping settings specific to your data. That may or may not resolve your issue (corrupt JSON data would still cause issues when applying INDEXED_EXTRACTIONS = json), but it would at least give you more control, take out some of the guesswork for Splunk and, as a result, also significantly improve the performance of index-time processing (linebreaking, timestamping).
I have added a sample JSON that is having issues in Splunk. All I could see in the internal indexes was a parsing error, but it was not clear where the error is.
I have added a props.conf on the UF and on the search head, and changed inputs.conf to use this custom sourcetype rather than the built-in _json.
On the UF:
[sourcetype]
DATETIME_CONFIG =
NO_BINARY_CHECK = true
INDEXED_EXTRACTIONS = JSON
category = Custom
pulldown_type = 1
On the search head:
[sourcetype]
KV_MODE = none
AUTO_KV_JSON = false
It could be a problem with the monitored file. Can you post what's inside the JSON?
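One way to check the file itself, independent of Splunk, is to run it through a JSON parser and see where it fails. The sketch below uses Python's standard json module; find_json_error is a made-up helper name, and the trailing-comma input is only a hypothetical example of malformed JSON, not something confirmed about the cucumber.json in this thread.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None


# Hypothetical broken input: a trailing comma before the closing bracket.
broken = '[\n{"uri": "A.feature"},\n]'
print(find_json_error(broken))

# A well-formed array parses cleanly.
print(find_json_error('[{"uri": "A.feature"}]'))

# To check a real file, read it first (the path is a placeholder):
# with open("cucumber.json", encoding="utf-8") as handle:
#     print(find_json_error(handle.read()))
```

If the file parses cleanly here but Splunk still logs parsing errors, the problem is more likely in how the data is being read or broken into events than in the JSON itself.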