Your data illustration strongly suggests that it is part of a JSON event like

{"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}", "some_field":"somevalue", "some_other_field": "morevalue"}

In this case, Splunk should have given you a field named "message" that has this value:

"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}"

What the developer is trying to do is embed more data in this field, partially also in JSON. For long-term maintainability, it is best not to treat that as text, either. This means that regex is not the right tool for the job. Instead, try to get the embedded JSON first.

There is just one problem (in addition to missing a closing double quote for the time value): the string \xxxxy is illegal in JSON, because \x is not a valid JSON escape sequence. If this is the real data, Splunk would have bailed and NOT given you a field named "message". In that case, you will have to deal with that first. Let's explore how later.

For now, suppose your data is actually

{"message":"sypher:[tokenized] build successful -\\\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}", "some_field":"somevalue", "some_other_field": "morevalue"}

As such, Splunk would have given you a value for message like this:

sypher:[tokenized] build successful -\xxxxy {"data":{"account_id":"ABC123XYZ","activity":{"time":"2024-05-31T12:37:25Z"}}

Consequently, all you need to do is strip everything before the first opening brace and hand the remainder to spath:

| eval jmessage = replace(message, "^[^{]+", "")
| spath input=jmessage

You will get the following fields:

data.account_id   data.activity.time     some_field   some_other_field
ABC123XYZ         2024-05-31T12:37:25Z   somevalue    morevalue

Here is an emulation of the "correct" data you can play with and compare with real data:

| makeresults
| eval _raw = "{\"message\":\"sypher:[tokenized] build successful -\\\xxxxy {\\\"data\\\":{\\\"account_id\\\":\\\"ABC123XYZ\\\",\\\"activity\\\":{\\\"time\\\":\\\"2024-05-31T12:37:25Z\\\"}}\", \"some_field\":\"somevalue\", \"some_other_field\": \"morevalue\"}"
| spath
``` data emulation above ```

Now, if your raw data indeed contains \xxxxy inside a JSON block, you can still rectify that with text manipulation so that you get legal JSON. But you have to tell your developer that they are logging bad JSON. (Recently there was a case where an IBM mainframe plugin sent Splunk bad data like this. It is best for the developer to fix this kind of problem.)
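To make the repair logic concrete, here is a minimal sketch in Python rather than SPL, meant only for checking the idea outside Splunk. It assumes the only defect is a backslash followed by a character that JSON does not accept as an escape; it does not handle a malformed \u sequence. The function names repair_escapes and extract_embedded are hypothetical, and the sample event is a shortened version of the one above with the embedded braces balanced so that a strict parser accepts it.

```python
import json
import re

# Characters that may legally follow a backslash inside a JSON string.
VALID_ESCAPES = '"\\/bfnrtu'


def repair_escapes(raw: str) -> str:
    """Double any backslash that does not start a legal JSON escape."""
    def fix(match: re.Match) -> str:
        if match.group(1) in VALID_ESCAPES:
            return match.group(0)          # legal pair, keep as is
        return "\\\\" + match.group(1)     # illegal: \x becomes \\x
    # Matching backslash + next char as a unit keeps an existing \\ pair intact.
    return re.sub(r"\\(.)", fix, raw)


def extract_embedded(message: str) -> dict:
    """Drop everything before the first '{' and parse the rest as JSON."""
    return json.loads(re.sub(r"^[^{]+", "", message))


# Hypothetical raw event (assumption: embedded braces are balanced).
raw = (r'{"message":"sypher:[tokenized] build successful -\xxxxy '
       r'{\"data\":{\"account_id\":\"ABC123XYZ\",'
       r'\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}}", '
       r'"some_field":"somevalue"}')

try:
    json.loads(raw)
    parsed_unrepaired = True
except json.JSONDecodeError:
    parsed_unrepaired = False              # strict parser rejects \x

event = json.loads(repair_escapes(raw))
embedded = extract_embedded(event["message"])
print(embedded["data"]["account_id"], embedded["data"]["activity"]["time"])
```

The same two steps apply in Splunk: first make the raw event legal JSON, then strip the text prefix from message and parse what remains.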