Getting Data In

How do I fine-tune JSON extraction from inside a log file using the "Add Data" wizard?

tmaire2
New Member

Hello everyone,

I have a log file with JSON embedded in it, like this:

12:48:12.3194 Info {"message":"Test ListOfEmails execution started","level":"Information","logType":"Default","timeStamp":"2018-11-12T12:48:12.0992011+01:00","fingerprint":"fingerprintID","windowsIdentity":"WindowsIdentity_name","machineName":"machine_name","processName":"Test ListOfEmails","processVersion":"1.0.0.0","jobId":"name_of_the_job","robotName":"name_of_the_robot","machineId":44111,"fileName":"Main"}

When I imported this file (manually) with the Splunk "Add Data" wizard, it didn't auto-discover the fields in the JSON part. So I tried to use "Extract Fields" to extract my fields. It works for some of the fields but not for all of them (like "machineId" and "fileName"). Whether I try to extract multiple fields at once or one field at a time, I get the same result; it throws this error:

"The extraction failed. If you are extracting multiple fields, try removing one or more fields. Start with extractions that are embedded within longer text strings."

Then I tried writing my own regex:

^(?:[^ \n]* ){2}\{"\w+":"(?P<message>[^"]+)[^:\n]*:"(?P<level>[^"]+)[^:\n]*:"(?P<logType>\w+)(?:[^"\n]*"){8}(?P<fingerprint>[^"]+)[^:\n]*:"(?P<windowsIdentity>[^"]+)[^:\n]*:"(?P<machineName>[^"]+)[^:\n]*:"(?P<processName>[^"]+)[^:\n]*:"(?P<processVersion>[^"]+)[^:\n]*:"(?P<jobId>[^"]+)[^:\n]*:"(?P<robotName>[^"]+)[^:\n]*:(?P<machineId>[^",]+)[^:\n]*:"(?P<fileName>[^"]+)

It works (it extracts all my fields, except for some events with very long messages) until I add the last part for "fileName", at which point it gives me this error:

Error in 'rex' command: regex="(?ms)^(?:[^ \n]* ){2}\{"\w+":"(?P<message>[^"]+)[^:\n]*:"(?P<level>[^"]+)[^:\n]*:"(?P<logType>\w+)(?:[^"\n]*"){8}(?P<fingerprint>[^"]+)[^:\n]*:"(?P<windowsIdentity>[^"]+)[^:\n]*:"(?P<machineName>[^"]+)[^:\n]*:"(?P<processName>[^"]+)[^:\n]*:"(?P<processVersion>[^"]+)[^:\n]*:"(?P<jobId>[^"]+)[^:\n]*:"(?P<robotName>[^"]+)[^:\n]*:(?P<machineId>[^",]+)[^:\n]*:"(?P<fileName>[^"]+)" has exceeded configured match_limit, consider raising the value in limits.conf

Afterward, I tried removing the "12:48:12.3194 Info " part so that only the JSON remained, and field auto-discovery worked like a charm (no need to use "Extract Fields").

Is there a way in the "Add Data" wizard to remove the "12:48:12.3194 Info " part so that only the JSON is kept? Is that a good approach? Or maybe there is another way to transform my logs that I didn't think of?

Thank you in advance for your replies,

Regards,
Thibaut

1 Solution

skalliger
SplunkTrust

Hi,

nope, there is no way to tune the JSON discovery. However, you can cut the _raw before the fields get extracted.
You would want to do something like this in your props.conf and transforms.conf:

props.conf:

[your_sourcetype]
# call the class whatever you like (TRANSFORMS-example)
TRANSFORMS-json = json_cut

transforms.conf:

[json_cut]
DEST_KEY = _raw
REGEX = (?:^(?:\d+:){2}\d+\.\d+\s\w+\s)(?<json>[^\}]+\})
FORMAT = $1

You may want to tune this RegEx. I just took your example event and matched it.
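
If you want to sanity-check the regex at search time before touching any conf files, a quick search along these lines should work (a sketch only; the index and sourcetype names are placeholders, and the rex pattern just mirrors the transform above):

index=your_index sourcetype=your_sourcetype
| rex "^(?:\d+:){2}\d+\.\d+\s\w+\s(?<json>\{[^\}]+\})"
| spath input=json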

Skalli


tmaire2
New Member

Hello Skalliger, Thank you for your help 🙂

Thank you, it works better, although I still have some events that are not captured. But I found the problem: when I import my data with the "Add Data" wizard and leave the "Line break" setting on auto, I get the same number of events as when I import the file with the configuration files, but some events are merged together, so I don't get all of them.

Still in the "Add Data" wizard, if I select "every line" instead of "auto" for "Line break", it works (all my events are separated correctly). So how can I translate this setting into the config files? I guess the modification goes in transforms.conf?

edit: I found it: add SHOULD_LINEMERGE = false to props.conf, as in the sketch below.
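
For reference, a minimal props.conf stanza combining this with Skalli's transform might look roughly like this (the sourcetype name your_sourcetype is just a placeholder):

[your_sourcetype]
SHOULD_LINEMERGE = false
TRANSFORMS-json = json_cut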

Thank you again for your time.
Thibaut


skalliger
SplunkTrust

Thanks for the feedback. It would be nice, however, if you could accept my answer as the answer to the question. I'm trying to get a free .conf pass next year. 🙂

Skalli


tmaire2
New Member

Thank you for your time and help :). I accepted your answer; is that OK now?

Thibaut


tmaire2
New Member

It's working, but not completely; some of the events are not present.

It works for:

    09:54:34.1821 Info {"message":"UiPath_REFrameWork_UiDemo execution started","level":"Information","logType":"Default","timeStamp":"2018-10-08T09:54:34.0170959+02:00","fingerprint":"0fcfd8d0-ad31-47fd-b240-c1ddc9fd4169","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"252fbec2-83d3-4f01-b165-5c728b850989","robotName":"DCPJQQ2","machineId":44772,"fileName":"System1_login"}

But not for:

09:55:11.0503 Info {"message":"UiPath_REFrameWork_UiDemo execution started","level":"Information","logType":"Default","timeStamp":"2018-10-08T09:55:10.9611418+02:00","fingerprint":"41543e91-d14f-48d3-ac9a-d53b3a3c33da","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"54bea4fe-da6c-4c55-aec3-019bd57b037b","robotName":"DCPJQQ2","machineId":44772,"fileName":"InitAllApplications"}

Or:

14:11:05.6823 Info {"message":"UiPath_REFrameWork_UiDemo execution ended","level":"Information","logType":"Default","timeStamp":"2018-10-08T14:11:05.6874037+02:00","fingerprint":"325d2ba7-f8a2-440d-9e8a-70bf6103008a","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"4f6ac200-c4cc-4562-953d-33c7f1e3b00e","robotName":"DCPJQQ2","machineId":44772,"totalExecutionTimeInSeconds":1,"totalExecutionTime":"00:00:01","fileName":"Main"}

Or:

09:54:34.6757 Error {"message":"Invoke Workflow File: Cannot create unknown type '{http://schemas.uipath.com/workflow/activities}GetSecureCredential'.","level":"Error","logType":"Default","timeStamp":"2018-10-08T09:54:34.6747442+02:00","fingerprint":"1953d68a-44f3-4b9f-b10d-df026d4b941e","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"252fbec2-83d3-4f01-b165-5c728b850989","robotName":"DCPJQQ2","machineId":44772,"fileName":"System1_login"}

The last one, I guess, is a regex problem because of the "}" inside "message", but for the others I don't know why Splunk doesn't take them, because they are very similar to the first one.

Thanks for the help


skalliger
SplunkTrust

Oh, I didn't see those extra braces. Sorry, then let's make it a little bit easier:

(?=\{)(?<json>\{[^(\n||\r\n)]+)

This will match until the end of the line (\n or \r\n), because your JSON should end there with a closing brace.
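
Dropped into the transforms.conf stanza from before, that would look something like this (still just a sketch matched against your sample events; you may need to widen the character class if your messages ever contain parentheses or pipe characters, since those are excluded by it):

[json_cut]
DEST_KEY = _raw
REGEX = (?=\{)(?<json>\{[^(\n||\r\n)]+)
FORMAT = $1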

Skalli



tmaire2
New Member

Hello Skalliger,

Thanks for your response!

Do I modify these files in the default or the local directory? (Sorry, I'm quite new to these config files.) And after that, how can I see these modifications (select my new sourcetype) in the "Add Data" wizard? Even when I modify props.conf and transforms.conf in the local or default directory, I still can't see my new sourcetype.
I must be doing something wrong.

Thank you,
Thibaut


skalliger
SplunkTrust

You want to make your modifications inside the local directory. If the files don't exist yet, create them.
The JSON must be read from somewhere, for example by a monitor input of a Universal Forwarder or something else. When defining your inputs.conf to get your data in, you should always define an index and a sourcetype.
That sourcetype is what we refer to from props.conf and transforms.conf.
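
For example, a minimal monitor stanza could look something like this (the path, index name and sourcetype name are placeholders for whatever you choose):

inputs.conf:

[monitor:///path/to/your/logs]
index = your_index
sourcetype = your_sourcetype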

Did that answer your question?

Skalli


tmaire2
New Member

Thank you Skalliger, it's very clear with a UF. The thing is, we don't have a Splunk infrastructure yet (I use the free license on my machine, without any UF or HF), so for now I just want to understand how to properly get data in. All my logs are on my computer and I import them with the "Add Data" wizard.

So, if I'm right, I first need to create an index (or can I use the default one?) and a sourcetype in inputs.conf on the machine where Splunk is installed, then modify the props and transforms files so they reference the sourcetype declared in inputs.conf. After that, will I see my sourcetype in the "Add Data" wizard with the correct transformation applied to my logs?

Thanks for your time,
Thibaut


skalliger
SplunkTrust

As mentioned before, you always want to set an index and a sourcetype. You don't want to use the main index. 🙂

That's correct: define the data inputs in inputs.conf, create an index in indexes.conf, and there you go.
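
For the index, a bare-bones indexes.conf stanza is usually enough to get started (again, your_index is just a placeholder name):

indexes.conf:

[your_index]
homePath = $SPLUNK_DB/your_index/db
coldPath = $SPLUNK_DB/your_index/colddb
thawedPath = $SPLUNK_DB/your_index/thaweddb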

Skalli
