Getting Data In

How to properly parse my JSON input?

a212830
Champion

Hi,

I have a JSON input file, and am having two issues. First, I can't seem to get the timestamp to map appropriately, and second, the events don't appear as proper JSON events within Splunk.

Here's a sample event:

[
    {
        "PSComputerName":  "testaaaaaaaa",
        "RunspaceId":  "c98aff32-7a72-4",
        "PSShowComputerName":  false,
        "RecordType":  "SharePointFileOperation",
        "CreationDate":  "\/Date(1489501679000)\/",
        "UserIds":  "srvfp123@mycompany.com",
        "Operations":  "FileAccessed",
        "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Id\":\"20187-f36f-bc-a7cb-050e2\",\"Operation\":\"FileAccessed\",\"OrganizationId\":\"75cbc-a68c-41e5-b95-1cfzzz6dd19\",\"RecordType\":6,\"UserKey\":\"i:0h.f|membership|10lskdjflkj90892a46@live.com\",\"UserType\":0,\"Version\":1,\"Workload\":\"SharePoint\",\"ClientIP\":\"1.12.25.1\",\"ObjectId\":\"https:\\/\\/sp.cloud.com\\/sites\\/workbench\\/pi\\/Topics\\/Concept8972e-af4d-4bc-8361-647d9b49cc7e.xml\",\"UserId\":\"srvfp2spo@.com\",\"EventSource\":\"SharePoint\",\"ItemType\":\"File\",\"ListId\":\"12ffce27-9e06-4672-8079-41d9ad911255\",\"ListItemUniqueId\":\"5a61cb68-01bb-43ff-a83b-cc6aafc325ca\",\"Site\":\"b9738191-350f-4d0e-8bd0-8be1dd1ec55a\",\"UserAgent\":\"\",\"WebId\":\"49b2d22c-c0f8-4d8d-b4ad-de22a35d8d57\",\"SourceFileExtension\":\"xml\",\"SiteUrl\":\"https:\\/\\/sp.fmrcloud.com\\/sites\\/workbench\\/\",\"SourceFileName\":\"Concept89eab72e-af4d-49bc-8361-647d9b49cc7e.xml\",\"SourceRelativeUrl\":\"\\/sites\\/workbench\\/pi\\/Topics\\/Concept89eab72e-af4d-49bc-8361-647d9b49cc7e.xml\"}",
        "ResultIndex":  1,
        "ResultCount":  3295,
        "Identity":  "2ca27-f36f-48bc-a7cb-08d0e2",
        "IsValid":  true,
        "ObjectState":  "Unchanged"
    },
    {
        "PSComputerName":  "mail-nam.mcld.oud.com",
        "RunspaceId":  "cff32-7a72-4213-8760-e55469e",
        "PSShowComputerName":  false,
        "RecordType":  "SharePointFileOperation",
        "CreationDate":  "\/Date(1489501679000)\/",
        "UserIds":  "z524@company.com",
        "Operations":  "FileAccessed",
        "AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Id\":\"c8c8eb-9ed2-4a48-934a-08e65072\",\"Operation\":\"FileAccessed\",\"OrganizationId\":\"75bc-a68c-41e5-a3455-1cf830619\",\"RecordType\":6,\"UserKey\":\"i:0h.f|membership|10033fff9b1ba6ce@lze.com\",\"UserType\":0,\"Version\":1,\"Workload\":\"SharePoint\",\"ClientIP\":\"137.199.241.16\",\"ObjectId\":\"https:\\/\\/sp.cloud.com\\/sites\\/workbench\\/pi\\/Maps\\/36d42faf-d405-480f-8e28-9c8db9e7e.xml\",\"UserId\":\"z98824@company.com\",\"EventSource\":\"SharePoint\",\"ItemType\":\"File\",\"ListId\":\"34409-7160-425b-8a46-d5af7b3\",\"ListItemUniqueId\":\"b656-1242-43a3-aa7c-169e910a\",\"Site\":\"b9738191-350f-4d0e-80-8be1dec55a\",\"UserAgent\":\"Mozilla\\/5.0 (Windows NT 6.1; WOW64; Trident\\/7.0; rv:11.0) like Gecko\",\"WebId\":\"492c-c0f8-4d8d-b4ad-de5d8d57\",\"SourceFileExtension\":\"xml\",\"SiteUrl\":\"https:\\/\\/sp.cloud.com\\/sites\\/workbench\\/\",\"SourceFileName\":\"3xxfaf-d405-480f-8e28-9c8cb9e7e.xml\",\"SourceRelativeUrl\":\"pi\\/Maps\"}",
        "ResultIndex":  2,
        "ResultCount":  3295,
        "Identity":  "z23ta28eb-9ed2-4a48-934a-08072",
        "IsValid":  true,
        "ObjectState":  "Unchanged"
    },

Here is my props.conf:

BREAK_ONLY_BEFORE_DATE = false
LINE_BREAKER = (,[\r\n]+\s+)\{
KV_MODE=json
TZ=UTC
TIME_PREFIX = \"CreationTime\":\s*\"
MAX_TIMESTAMP_LOOKAHEAD = 35
KV_MODE=json
TZ = UTC

sloshburch
Splunk Employee

Ok, second time's the charm. I got it.

[new_sourcetype]
DATETIME_CONFIG = 
KV_MODE = json
LINE_BREAKER = \}(\,?[\r\n]+)\{?
MAX_TIMESTAMP_LOOKAHEAD = 25
NO_BINARY_CHECK = true
TIME_PREFIX = CreationTime\D+
TZ = UTC
category = Custom
disabled = false
pulldown_type = true
SHOULD_LINEMERGE = false
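
To see where the LINE_BREAKER above would split the sample, here is a rough Python sketch. It assumes the documented behavior that the event boundary falls at the first capturing group and that the text matched by that group is discarded. The helper break_events and the shortened sample are made up for illustration, and Python's re module only approximates the regex engine Splunk uses.

```python
import re

# Same pattern as the LINE_BREAKER setting above.
LINE_BREAKER = re.compile(r"\}(\,?[\r\n]+)\{?")

def break_events(raw):
    """Hypothetical helper: split raw text at the first capturing group."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # text before group 1 closes the event
        start = m.end(1)                      # group 1 itself is dropped
    tail = raw[start:]
    if tail.strip():
        events.append(tail)
    return events

# Shortened stand-in for the sample file (field name taken from the post).
sample = '[\n    {\n        "ResultIndex":  1\n    },\n    {\n        "ResultIndex":  2\n    }\n]'

events = break_events(sample)
for e in events:
    print(repr(e))
```

In this sketch the first event keeps the leading [ and the closing ] ends up as an event of its own, so the array brackets are worth checking in the real data.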

As far as I understand, the syntax won't be pretty-printed automatically (that's available in the UI per event) because the JSON already has formatting applied to it with whitespace and carriage returns. I guess if Splunk sees single-line JSON, it pretty-prints it, but if you added your own spacing, it honors your intentions and displays it that way.

Lastly, and probably most importantly, the AuditData field has its own JSON payload. To get at it, add this to your search: | spath input=AuditData
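
As an analogy only (this is not how Splunk implements it), the two-step parse looks like this in Python. AuditData arrives as a string inside the outer JSON, so it has to be decoded a second time. The trimmed-down event reuses field names and values from the sample in the question.

```python
import json

# Trimmed-down event; AuditData is itself a JSON document stored as a string.
raw_event = json.dumps({
    "Operations": "FileAccessed",
    "AuditData": json.dumps({
        "CreationTime": "2017-03-14T14:27:59",
        "Workload": "SharePoint",
        "ClientIP": "1.12.25.1",
    }),
})

outer = json.loads(raw_event)           # first pass: the outer event's fields
audit = json.loads(outer["AuditData"])  # second pass: roughly the role of spath input=AuditData

print(type(outer["AuditData"]).__name__)
print(audit["ClientIP"])
```

After the first pass AuditData is still a single string, which is why its inner fields are not available until the second pass.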

BTW, I see the example you provided leads off with an open bracket [. Is that really in the data? If so, you might want to scrub it out in the sourcetype.
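
A minimal sketch of the scrubbing idea, written in Python so the pattern can be checked outside Splunk. In props.conf this kind of index-time cleanup is usually done with a SEDCMD-<class> setting; the pattern below is an untested assumption for this data, and the names BRACKETS and scrub are hypothetical.

```python
import json
import re

# Hypothetical cleanup pattern: a leading "[" or a trailing "]" plus surrounding whitespace.
BRACKETS = re.compile(r"^\s*\[\s*|\s*\]\s*$")

def scrub(event):
    """Remove array brackets left over at the edges of an event."""
    return BRACKETS.sub("", event)

# First event of the file as it might look after line breaking (shortened).
first_event = '[\n    {\n        "ResultIndex":  1,\n        "IsValid":  true\n    }'

cleaned = scrub(first_event)
print(json.loads(cleaned))
```

With the bracket gone, the first event parses as a JSON object on its own.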


a212830
Champion

Oh, YOU are THE MAN. Thanks!

sloshburch
Splunk Employee

Still working on it, but this is what I have so far:

DATETIME_CONFIG = NONE
KV_MODE = json
LINE_BREAKER = },(\s+)
MAX_TIMESTAMP_LOOKAHEAD = 25
NO_BINARY_CHECK = true
TIME_FORMAT = %FT%T
TIME_PREFIX = CreationTime\W+
TZ = UTC
category = Custom
disabled = false
pulldown_type = true
SHOULD_LINEMERGE = false
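
The TIME_PREFIX and TIME_FORMAT idea can be sanity-checked outside Splunk. The Python sketch below does not replicate Splunk's timestamp processor; it only tests the regexes and the format string against the raw text. It also shows one plausible reason the TIME_PREFIX from the question never matched: in the raw file the quotes inside AuditData are backslash-escaped, and that pattern expects a bare quote right after CreationTime. The helper extract_epoch is hypothetical.

```python
import re
from datetime import datetime, timezone

# Raw text as it sits in the file: quotes inside AuditData are backslash-escaped.
raw = r'"AuditData":  "{\"CreationTime\":\"2017-03-14T14:27:59\",\"Operation\":\"FileAccessed\"}"'

original_prefix = re.compile(r'\"CreationTime\":\s*\"')  # TIME_PREFIX from the question
relaxed_prefix = re.compile(r'CreationTime\W+')          # TIME_PREFIX from this attempt

def extract_epoch(prefix, text, lookahead=25):
    """Hypothetical helper: find the prefix, then parse what follows as a UTC timestamp."""
    m = prefix.search(text)
    if m is None:
        return None
    window = text[m.end():m.end() + lookahead]  # mimics MAX_TIMESTAMP_LOOKAHEAD
    stamp = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", window)
    if stamp is None:
        return None
    # %Y-%m-%dT%H:%M:%S is the long form of %FT%T
    parsed = datetime.strptime(stamp.group(), "%Y-%m-%dT%H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

print(extract_epoch(original_prefix, raw))
print(extract_epoch(relaxed_prefix, raw))
```

The second call yields the same instant as the CreationDate value in the sample, \/Date(1489501679000)\/, which is expressed in milliseconds.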

a212830
Champion

Thanks. Didn't work, unfortunately.

a212830
Champion

Anyone? More concerned with the date than the json format at this point...

sloshburch
Splunk Employee

I just noticed that there are commas between the events in the sample provided, which is causing a JSON parsing error. I'm no JSON expert, but I'm inclined to think that there shouldn't be commas between JSON items, only between the JSON field/attribute pairs.

bshuler_splunk
Splunk Employee

So, the message you posted isn't valid JSON. I validate JSON using https://jsonformatter.curiousconcept.com

But my bet is that the full message is valid JSON and you just didn't paste all of it.

Splunk is probably truncating the message.

If you are certain that this will always be valid data, set the following in props.conf:
TRUNCATE = 0
http://docs.splunk.com/Documentation/Splunk/6.5.2/Admin/Propsconf
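
The validity check mentioned above can also be done locally. This is a small Python sketch using only the standard library; the three test strings are simplified stand-ins for the file, not the real data. It illustrates that a complete array with commas between its objects is valid, while a cut-off excerpt, or a single event that still carries its trailing comma, is not.

```python
import json

def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

whole_array = '[{"ResultIndex": 1}, {"ResultIndex": 2}]'  # complete file
cut_off = '[{"ResultIndex": 1}, {"ResultIndex": 2},'      # pasted excerpt, no closing bracket
one_event = '{"ResultIndex": 1},'                         # single event with its trailing comma

for label, text in [("whole array", whole_array), ("cut off", cut_off), ("one event", one_event)]:
    print(label, is_valid_json(text))
```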

a212830
Champion

Those are two events within the file. I couldn't post the whole file - it's huge. I don't want the whole file indexed as one huge event; I want separate events within the file.
