How to extract multiple fields from a search result?

hamishcross
Engager

Hi All

I am trying to extract the values that follow the keys context, userid, username, and groupid.

Sample partial event

 

{ "type": "login","context": "Rsomeserver:8877-T1670321752-P18407-T030-C000025-S38","sequence": 998,"message": { "state": "ok","agent": true,"userid": "User0000000949","loginid": "somelogin101","ownerid": "system","username": "John Smith","cssurl": "[\"/css/somepage.css\",\"/branding/\"]","groupid": "Group0000000945","windows": [ {"name":"something","id":"someid","url":"/someurl//

 

 

I started with this approach

 

 

"context": "(?<SessionID>[^\"]*)".*?"username"+: "(?<Username>[^\"]*)"

 

This compiles on regex101, but in rex it throws an error:

 

Error in 'SearchParser': Missing a search command before '^'. Error at position '141' of search query 'search index=<removed> ("\"login\"\,\"contex...{snipped} {errorcontext = ?<userid>[^\"]*)"}'.

 

My aim is to then use this data to join on the context value with another search, but I'm looking for help on where I'm going wrong with my rex.

As the JSON seems to be truncated, I don't think I can treat it as JSON, so any help with a REX extraction would be greatly appreciated.


bowesmana
SplunkTrust

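As a further comment: the join command is rarely the right way to combine two searches in Splunk.

Its subsearch is subject to result-count and run-time limits, and when one of them is hit the results are truncated without an error, so join can silently give you the wrong results.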

It's best to start looking at solving a join issue with stats, e.g. a typical starting point is

(search data_set_1) OR (search data_set_2)
| get_session_id_from_data_here
| stats values(*) as * by sessionId

How you get the session id depends on the data set it comes from. Typically it looks something like this:

| rex field=data_set_1_field "(?<id_1>session id from here)"
| rex field=data_set_2_field "(?<id_2>session id from here)"
| eval sessionId=coalesce(id_1, id_2)
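
As a rough illustration of why stats can stand in for join, here is a self-contained sketch that can be pasted into a search bar. The makeresults/append rows, the session ids S1 and S2, and the type values are made up for the demo; only the final stats line is the pattern itself.

```spl
| makeresults | eval sessionId="S1", type="login", username="John Smith"
| append [| makeresults | eval sessionId="S1", type="event"]
| append [| makeresults | eval sessionId="S1", type="event"]
| append [| makeresults | eval sessionId="S2", type="event"]
| stats values(username) AS username count(eval(type="event")) AS event_count by sessionId
```

This should return one row per sessionId, with the username carried over from the login row wherever one exists. Sessions without a login row (S2 here) still appear, just with an empty username.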

bowesmana
SplunkTrust

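The quotes inside your rex string need to be escaped. The regex is passed to rex as a double-quoted string, so the first unescaped quote ends that string and the parser tries to read the rest as search syntax, which is why the error points at '^'.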

| rex "context\":\s\"(?<SessionID>[^\"]*)\".*?\"username\"+:\s\"(?<Username>[^\"]*)"
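
To check the escaping without touching real data, something like the following should work. The event text is a shortened copy of the sample from the question, built with makeresults and eval purely for testing. The stray + after \"username\" is dropped here, since it only repeats the closing quote and is not needed.

```spl
| makeresults
| eval _raw="{ \"type\": \"login\",\"context\": \"Rsomeserver:8877-T1670321752-P18407-T030-C000025-S38\",\"sequence\": 998,\"message\": { \"state\": \"ok\",\"userid\": \"User0000000949\",\"username\": \"John Smith\",\"groupid\": \"Group0000000945\""
| rex "\"context\":\s\"(?<SessionID>[^\"]*)\".*?\"username\":\s\"(?<Username>[^\"]*)\""
| table SessionID Username
```

If the escaping is right, the search parses and the table shows the context string in SessionID and John Smith in Username.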

hamishcross
Engager

 

My aim is to run the search below, which should count the events logged for a given "context". The context (ID) is another name for a session.

index=myindex ("events") OR ("events2") | rex "context.{3}\"(?<context>.[a-zA-Z0-9_:-]+)" | stats count by context

 

I'd then like to join the context above to the search below, so that I can get the user details related to the results above.

index=myindex ("\"login\"\,\"context\"") AND ("username") | rex "context\":\s\"(?<context>[^\"]*)\".*?\"userid\"+:\s\"(?<userid>[^\"]*)\".*?\"username\"+:\s\"(?<username>[^\"]*)\".*?\"groupid\"+:\s\"(?<groupid>[^\"]*)" | table context userid groupid username

 

I'd then like to only show unique rows based on the userid

 

And finally, I'd then like to be able to show a count of the unique rows above

 


yuanliu
SplunkTrust

First, if you have any influence at all on the developers, persuade, plead with, or beg them to make the logs complete. Second, because you are confident that groupid is always included in the login event, I would recommend mending the partial JSON into conformant objects, like this:

 

| rex mode=sed "s/(\"groupid\": *\"[^\"]+\"),.*/\1}}/"
```| eval valid = if(json_valid(_raw), "yes", "no")```
| spath

 

Your sample input now becomes

field             value
context           Rsomeserver:8877-T1670321752-P18407-T030-C000025-S38
message.agent     true
message.cssurl    ["/css/somepage.css","/branding/"]
message.groupid   Group0000000945
message.loginid   somelogin101
message.ownerid   system
message.state     ok
message.userid    User0000000949
message.username  John Smith
sequence          998
type              login

This would be much easier to handle.
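
A quick way to try the mend on a single event is sketched below. The event text is a shortened, still-truncated copy of the sample from the question, built with makeresults only for this test. It also assumes a Splunk version that provides the json_valid eval function.

```spl
| makeresults
| eval _raw="{ \"type\": \"login\",\"context\": \"Rsomeserver:8877-T1670321752-P18407-T030-C000025-S38\",\"sequence\": 998,\"message\": { \"state\": \"ok\",\"userid\": \"User0000000949\",\"username\": \"John Smith\",\"groupid\": \"Group0000000945\",\"windows\": [ {\"name\":\"something\""
| rex mode=sed "s/(\"groupid\": *\"[^\"]+\"),.*/\1}}/"
| eval valid=if(json_valid(_raw), "yes", "no")
| spath
| table valid context message.userid message.username message.groupid
```

The sed expression cuts everything after the groupid pair and closes the two open braces, so valid should come back as yes and spath should then populate the message.* fields.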

To achieve your combined search, you want to retrieve all events from both searches and then perform stats on them, like this:

 

index=myindex ("events" OR "events2" OR ("\"login\"\,\"context\"" AND "username"))
| rex mode=sed "s/(\"groupid\": *\"[^\"]+\"),.*/\1}}/" ``` you can design another rex to make "events or events2" conformant ```
| spath
| rename message.* AS *
| rex "\"context\"\s*:\s*\"(?<context>[^\"]+)" | rex "\"type\"\s*:\s*\"(?<type>[^\"]+)" ``` unnecessary if "events or events2" are already mended ```
| stats values(username) AS username values(userid) AS userid values(groupid) AS groupid dc(type) AS type_count count by context
| where type_count > 1
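
For the last two parts of the question (unique rows per userid, and a count of those rows), one possible continuation is sketched below. It assumes the result rows carry a userid field as in the search above, and unique_users is just an illustrative field name.

```spl
| dedup userid
| eventstats dc(userid) AS unique_users
```

dedup keeps the first row it sees for each userid, and eventstats adds the distinct count to every remaining row. If only the number is needed, replacing both lines with | stats dc(userid) AS unique_users would return it as a single row.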

 

 
