How to properly parse my JSON input?

a212830
Champion

Hi,

I have a JSON input file, and am having two issues. First, I can't seem to get the timestamp to map appropriately, and second, the events don't appear as proper JSON events within Splunk.

Here's a sample event:

[
    {
        "PSComputerName":  "testaaaaaaaa",
        "RunspaceId":  "c98aff32-7a72-4",
        "PSShowComputerName":  false,
        "RecordType":  "SharePointFileOperation",
        "CreationDate":  "\/Date(1489501679000)\/",
        "UserIds":  "srvfp123@mycompany.com",
        "Operations":  "FileAccessed",
        "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Id\":\"20187-f36f-bc-a7cb-050e2\",\"Operation\":\"FileAccessed\",\"OrganizationId\":\"75cbc-a68c-41e5-b95-1cfzzz6dd19\",\"RecordType\":6,\"UserKey\":\"i:0h.f|membership|10lskdjflkj90892a46@live.com\",\"UserType\":0,\"Version\":1,\"Workload\":\"SharePoint\",\"ClientIP\":\"1.12.25.1\",\"ObjectId\":\"https:\\/\\/sp.cloud.com\\/sites\\/workbench\\/pi\\/Topics\\/Concept8972e-af4d-4bc-8361-647d9b49cc7e.xml\",\"UserId\":\"srvfp2spo@.com\",\"EventSource\":\"SharePoint\",\"ItemType\":\"File\",\"ListId\":\"12ffce27-9e06-4672-8079-41d9ad911255\",\"ListItemUniqueId\":\"5a61cb68-01bb-43ff-a83b-cc6aafc325ca\",\"Site\":\"b9738191-350f-4d0e-8bd0-8be1dd1ec55a\",\"UserAgent\":\"\",\"WebId\":\"49b2d22c-c0f8-4d8d-b4ad-de22a35d8d57\",\"SourceFileExtension\":\"xml\",\"SiteUrl\":\"https:\\/\\/sp.fmrcloud.com\\/sites\\/workbench\\/\",\"SourceFileName\":\"Concept89eab72e-af4d-49bc-8361-647d9b49cc7e.xml\",\"SourceRelativeUrl\":\"\\/sites\\/workbench\\/pi\\/Topics\\/Concept89eab72e-af4d-49bc-8361-647d9b49cc7e.xml\"}",
        "ResultIndex":  1,
        "ResultCount":  3295,
        "Identity":  "2ca27-f36f-48bc-a7cb-08d0e2",
        "IsValid":  true,
        "ObjectState":  "Unchanged"
    },
    {
        "PSComputerName":  "mail-nam.mcld.oud.com",
        "RunspaceId":  "cff32-7a72-4213-8760-e55469e",
        "PSShowComputerName":  false,
        "RecordType":  "SharePointFileOperation",
        "CreationDate":  "\/Date(1489501679000)\/",
        "UserIds":  "z524@company.com",
        "Operations":  "FileAccessed",
        "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Id\":\"c8c8eb-9ed2-4a48-934a-08e65072\",\"Operation\":\"FileAccessed\",\"OrganizationId\":\"75bc-a68c-41e5-a3455-1cf830619\",\"RecordType\":6,\"UserKey\":\"i:0h.f|membership|10033fff9b1ba6ce@lze.com\",\"UserType\":0,\"Version\":1,\"Workload\":\"SharePoint\",\"ClientIP\":\"137.199.241.16\",\"ObjectId\":\"https:\\/\\/sp.cloud.com\\/sites\\/workbench\\/pi\\/Maps\\/36d42faf-d405-480f-8e28-9c8db9e7e.xml\",\"UserId\":\"z98824@company.com\",\"EventSource\":\"SharePoint\",\"ItemType\":\"File\",\"ListId\":\"34409-7160-425b-8a46-d5af7b3\",\"ListItemUniqueId\":\"b656-1242-43a3-aa7c-169e910a\",\"Site\":\"b9738191-350f-4d0e-80-8be1dec55a\",\"UserAgent\":\"Mozilla\\/5.0 (Windows NT 6.1; WOW64; Trident\\/7.0; rv:11.0) like Gecko\",\"WebId\":\"492c-c0f8-4d8d-b4ad-de5d8d57\",\"SourceFileExtension\":\"xml\",\"SiteUrl\":\"https:\\/\\/sp.cloud.com\\/sites\\/workbench\\/\",\"SourceFileName\":\"3xxfaf-d405-480f-8e28-9c8cb9e7e.xml\",\"SourceRelativeUrl\":\"pi\\/Maps\"}",
        "ResultIndex":  2,
        "ResultCount":  3295,
        "Identity":  "z23ta28eb-9ed2-4a48-934a-08072",
        "IsValid":  true,
        "ObjectState":  "Unchanged"
    },

Here is my props.conf:

BREAK_ONLY_BEFORE_DATE = false
LINE_BREAKER = (,[\r\n]+\s+)\{
KV_MODE=json
TZ=UTC
TIME_PREFIX = \"CreationTime\":\s*\"
MAX_TIMESTAMP_LOOKAHEAD = 35
KV_MODE=json
TZ = UTC

sloshburch
Splunk Employee

Ok, second time's the charm. I got it.

[new_sourcetype]
DATETIME_CONFIG = 
KV_MODE = json
LINE_BREAKER = \}(\,?[\r\n]+)\{?
MAX_TIMESTAMP_LOOKAHEAD = 25
NO_BINARY_CHECK = true
TIME_PREFIX = CreationTime\D+
TZ = UTC
category = Custom
disabled = false
pulldown_type = true
SHOULD_LINEMERGE = false
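
For anyone wanting to see what the LINE_BREAKER and TIME_PREFIX settings above do to the sample, here is a minimal Python sketch. It only approximates the documented behavior (the text matched by the first capture group is discarded, and the timestamp is searched for within MAX_TIMESTAMP_LOOKAHEAD characters after the prefix); it is not how Splunk is implemented. The helper names split_events and extract_timestamp and the trimmed SAMPLE are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Regexes and lookahead copied from the props.conf stanza above.
LINE_BREAKER = re.compile(r"\}(\,?[\r\n]+)\{?")
TIME_PREFIX = re.compile(r"CreationTime\D+")
MAX_TIMESTAMP_LOOKAHEAD = 25

# Trimmed, hypothetical version of the posted sample (most fields removed).
SAMPLE = r'''[
    {
        "CreationDate":  "\/Date(1489501679000)\/",
        "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Operation\":\"FileAccessed\"}",
        "ResultIndex":  1
    },
    {
        "CreationDate":  "\/Date(1489501679000)\/",
        "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Operation\":\"FileAccessed\"}",
        "ResultIndex":  2
    },
'''

def split_events(raw):
    """Split raw text at each LINE_BREAKER match, dropping capture group 1."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # text before the group ends an event
        start = m.end(1)                      # text after the group starts the next
    tail = raw[start:]
    if tail.strip():
        events.append(tail)
    return events

def extract_timestamp(event):
    """Find an ISO-like timestamp within the lookahead window after TIME_PREFIX."""
    m = TIME_PREFIX.search(event)
    if m is None:
        return None
    window = event[m.end():m.end() + MAX_TIMESTAMP_LOOKAHEAD]
    ts = re.search(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", window)
    if ts is None:
        return None
    parsed = datetime.strptime(ts.group(0), "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)  # mirrors TZ = UTC

for event in split_events(SAMPLE):
    print(repr(event[:12]), extract_timestamp(event))
```

In this sketch the closing brace stays with each event and the comma and newline are dropped, while the first event still carries the leading bracket from the top of the file.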

As far as I understand, the syntax won't be pretty-printed automatically (that's available in the UI per event) because the JSON already has formatting applied to it, with whitespace and carriage returns. My guess is that if Splunk sees single-line JSON, it pretty-prints it, but if you added your own spacing, it honors your intentions and displays it that way.

Lastly, and probably most importantly, the AuditData field has its own JSON payload. To extract the fields inside it, add this to your search: | spath input=AuditData
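
The reason a second extraction pass is needed is that the AuditData value is a JSON string, not a nested object. A rough Python analogy (this is not SPL, and the event below is a trimmed, hypothetical version of the sample):

```python
import json

# Trimmed, hypothetical event: AuditData holds JSON encoded as a string.
event = r'''{
    "Operations":  "FileAccessed",
    "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Workload\":\"SharePoint\",\"RecordType\":6}"
}'''

outer = json.loads(event)

# After the first pass, AuditData is still one opaque string.
audit_raw = outer["AuditData"]
print(type(audit_raw).__name__)

# A second decode, analogous in spirit to `spath input=AuditData`,
# exposes the inner fields.
inner = json.loads(audit_raw)
print(inner["CreationTime"], inner["Workload"], inner["RecordType"])
```

The outer parse alone never sees CreationTime or Workload as separate fields, which is why the extra spath step matters.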

BTW, I see the example you provided leads off with an open bracket [. Is that for real in the data? If so, you might want to scrub that out in the sourcetype.
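
To illustrate why the leading bracket matters, here is a small Python sketch, assuming the first event ends up with the array's opening bracket attached. The helper strip_array_wrapper and the trimmed event are hypothetical. In Splunk itself this kind of scrubbing would be done at index time in props.conf (for example with a SEDCMD setting), and the exact expression would need testing against the real data.

```python
import json
import re

# Hypothetical first event: the array's opening bracket came along with it.
first_event = '[\n    {\n        "Operations":  "FileAccessed",\n        "ResultIndex":  1\n    }'

def strip_array_wrapper(raw_event):
    """Drop a leading '[' and a trailing ',' or ']' plus surrounding whitespace."""
    cleaned = re.sub(r"^\s*\[\s*", "", raw_event)
    return re.sub(r"\s*[,\]]\s*$", "", cleaned)

# With the bracket in place the event is an unterminated array.
try:
    json.loads(first_event)
    parsed_as_is = True
except json.JSONDecodeError:
    parsed_as_is = False

cleaned_event = json.loads(strip_array_wrapper(first_event))
print(parsed_as_is, cleaned_event)
```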

a212830
Champion

Oh, YOU are THE MAN. Thanks!

sloshburch
Splunk Employee

Still working on it, but here's what I have so far:

DATETIME_CONFIG = NONE
KV_MODE = json
LINE_BREAKER = },(\s+)
MAX_TIMESTAMP_LOOKAHEAD = 25
NO_BINARY_CHECK = true
TIME_FORMAT = %FT%T
TIME_PREFIX = CreationTime\W+
TZ = UTC
category = Custom
disabled = false
pulldown_type = true
SHOULD_LINEMERGE = false
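
As a sanity check on the timestamp settings, here is a short Python sketch. In strftime notation, %F is shorthand for %Y-%m-%d and %T for %H:%M:%S; Python's strptime does not accept the shorthand, so the expanded form is used. It also compares the result with the epoch-milliseconds value in the sample's CreationDate field, on the assumption that both fields describe the same instant.

```python
import re
from datetime import datetime, timezone

# Values copied from the first sample event.
creation_time = "2017-03-14T14:27:59"           # inside AuditData
creation_date = r"\/Date(1489501679000)\/"      # top-level field, epoch milliseconds

# Expanded equivalent of TIME_FORMAT = %FT%T, interpreted as UTC (TZ = UTC).
parsed = datetime.strptime(creation_time, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)

# Pull the digits out of the \/Date(...)\/ wrapper and convert from milliseconds.
epoch_ms = int(re.search(r"Date\((\d+)\)", creation_date).group(1))
from_epoch = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)

print(parsed.isoformat())
print(from_epoch.isoformat())
print(parsed == from_epoch)
```

The two values line up when CreationTime is read as UTC, which is consistent with the TZ = UTC setting.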

a212830
Champion

Thanks. Didn't work, unfortunately.

a212830
Champion

Anyone? More concerned with the date than the json format at this point...

sloshburch
Splunk Employee
Splunk Employee

I just noticed that there are commas between the events in the sample provided, which causes a JSON parsing error once each event is treated on its own. I'm no JSON expert, but I'm inclined to think there shouldn't be a comma after a standalone JSON object, only between the field/value pairs inside it (or between the items of an array).
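
A quick way to check the comma question outside Splunk is a few lines of Python (the tiny payloads below are made up for illustration). Commas between items are fine while the objects sit inside an array, but an object that has been split off on its own with a trailing comma no longer parses.

```python
import json

def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

as_array = '[{"ResultIndex": 1}, {"ResultIndex": 2}]'   # whole file as one array
single_with_comma = '{"ResultIndex": 1},'               # one event, comma left on
single_clean = '{"ResultIndex": 1}'                     # one event, comma removed

for label, text in [("array", as_array),
                    ("event + comma", single_with_comma),
                    ("event", single_clean)]:
    print(label, is_valid_json(text))
```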

bshuler_splunk
Splunk Employee

So, the message you posted isn't valid JSON. I validate json format using https://jsonformatter.curiousconcept.com

But, my bet is that the message is valid json, but you didn't paste the full message.

Splunk is probably truncating the message.

If you are certain that this will always be valid data, set the following in props.conf (a value of 0 disables truncation):
TRUNCATE = 0
http://docs.splunk.com/Documentation/Splunk/6.5.2/Admin/Propsconf

a212830
Champion

Those are two events within the file. I couldn't post the whole file - it's huge. I don't want one huge file as the event - separate events within the file.
