Hi, I am trying to come up with a regex that would give me the entire JSON from the log event. Here is what my log looks like:
TIMESTAMP CHARS {
"a": "1",
"b": {
"c": "2",
"d": "3",
"e": {
"f": "4",
"g": "5",
"h": "6",
"i": "7"
},
"j": "8",
"k": "9"
}
}
The regex I came up with,
search | rex "(?<jsonData>{[^}]+})" | spath input=jsonData
drops all data after the first closing }. Any suggestions on how to fix this, please?
Try this
search | rex "^[^\{]+(?m)(?<jsonData>.+)" | spath input=jsonData
This did not work, but your suggestion helped me modify my existing pattern. This one:
rex "(?<jsonData>{[^}].+})"
works for me.
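For anyone wondering why the first pattern stops early: [^}]+ can never cross a closing brace, so the match ends at the first }, while a greedy .+ runs to the end of the text and backtracks to the last }. Below is a minimal Python sketch of that difference, not Splunk itself. Python's re module writes named groups as (?P<name>...) rather than (?<name>...), the sample event is a shortened version of the one in the question, and re.DOTALL is my assumption for an event that spans several lines (the inline PCRE equivalent is (?s)).

```python
import json
import re

# Shortened, multi-line version of the event from the question.
event = (
    'TIMESTAMP CHARS {\n"a": "1",\n"b": {\n"c": "2",\n'
    '"e": {\n"f": "4"\n},\n"j": "8"\n}\n}'
)

# Original pattern: [^}]+ matches "{" and newlines but never "}",
# so the capture ends at the first closing brace.
short = re.search(r"(?P<jsonData>\{[^}]+\})", event).group("jsonData")

# Greedy pattern: with DOTALL, ".+" also matches newlines, so the
# capture extends to the LAST closing brace in the event.
full = re.search(r"(?P<jsonData>\{.+\})", event, re.DOTALL).group("jsonData")

# Only the greedy capture is balanced JSON.
parsed = json.loads(full)
```

The short capture opens three braces and closes only one, which is why spath had nothing valid to parse. The greedy form assumes there is exactly one JSON object per event and nothing containing a } after it.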
Does the search result show the JSON fields in hierarchical notation? If not, Splunk might not be treating the data as JSON. Your example has characters BEFORE the JSON string, which can cause the JSON parsing to fail. If Splunk is not treating the data as JSON, then that is your (first) problem. Remove the non-JSON characters from the front of the event and it will likely work (no guarantees, as there could be some other problem, but that is a highly likely cause of @somesoni2's suggestion not working).
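To illustrate the idea of dropping the leading non-JSON characters before parsing, here is a rough Python sketch. It is not a Splunk feature: extract_json is a hypothetical helper name, the timestamp and id in the sample event are made up, and it assumes the first { in the event is where the JSON object starts.

```python
import json


def extract_json(event: str) -> dict:
    """Parse the first JSON object in an event, ignoring any text before it."""
    # Assumption: the prefix itself never contains "{".
    start = event.find("{")
    if start == -1:
        raise ValueError("no JSON object found in event")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it, so trailing text does not cause an error.
    obj, _end = json.JSONDecoder().raw_decode(event, start)
    return obj


# Made-up prefix standing in for whatever the log source prepends.
sample = '2024-01-01T00:00:00Z abc123 {"a": "1", "b": {"c": "2"}}'
result = extract_json(sample)
```

Unlike a regex, the parser tracks nesting itself, so nested braces and braces inside string values do not confuse it once the starting position is right.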
I don't know about the original poster, but in my case, Cloudwatch is prepending data to what would otherwise be pure JSON. The characters BEFORE the JSON string are not content that we are specifically logging; they are a byproduct of the Cloudwatch log.
Any suggestions on how to remove the non-JSON string characters from the front of an event logged via Cloudwatch?