Splunk Search

Not all fields showing up

Santosh2
Path Finder
index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
|rex field=_raw "DIP:\s+\[(?<dip>[^\]]+)."
|rex field=_raw "ACTION:\s+(?<actions>\w+)"
|rex dield=_raw "SERVICE:\s+(?<services>\S+)"
|search actions= start OR actions=done NOT service="null"
|eval split=services.":".actions
|timechart span=1d count by split
|eval _time=strftime(_time, "%d/%m/%Y")
|table _time *start *done

 

When we run the above query, not all services are captured in the results, even though the data is there. I have attached a screenshot (the highlighted ones are missing). Can anyone tell me what is wrong with the query?


0 Karma

PickleRick
SplunkTrust

We don't know your data, we don't know what you're getting, we don't know if you match your data properly or extract the fields properly. We don't know anything except a search and some excel table.

0 Karma

Santosh2
Path Finder
Sample logs:
{"date": "1/2/2022 00:12:22,124"  DATA [http:nio-12567-exec-44] DIP: [675478-7655a-56778d-655de45565] Data: [7665-56767ed-5454656] MIM: [483748348-632637f-38648266257d] FLOW: [NEW] { SERVICE: AAP | Applicationid: iis-675456 | ACTION: START | REQ: GET data published/data/ui } DADTA -:TIME:<TIMESTAMP> (0) 1712721546785 to 1712721546885 ms GET /v8/wi/data/*, GET data/ui/wi/load/success, "tags": {"host": "GTU5656", "insuranceid": "8786578896667", "lib": "app"}}
Sample logs:
{"date": "1/2/2022 00:12:22,124"  DATA [http:nio-12567-exec-44] DIP: [675478-7655a-56778d-655de45565] Data: [7665-56767ed-5454656] MIM: [483748348-632637f-38648266257d] FLOW: [NEW] { SERVICE: AAP | Applicationid: iis-675456 | ACTION: DONE | REQ: GET data published/data/ui } DADTA -:TIME:<TIMESTAMP> (0) 1712721546785 to 1712721546885 ms GET /v8/wi/data/*, GET data/ui/wi/load/success, "tags": {"host": "GTU5656", "insuranceid": "8786578896667", "lib": "app"}}

 

Hi @PickleRick, I have added sample logs. Let me know if you need any other details.

0 Karma

yuanliu
SplunkTrust

Are you sure that your raw event is not a valid JSON closer to

 

{"date": "1/2/2022 00:12:22,124",  "DATA": "[http:nio-12567-exec-44] DIP: [675478-7655a-56778d-655de45565] Data: [7665-56767ed-5454656] MIM: [483748348-632637f-38648266257d] FLOW: [NEW] { SERVICE: AAP | Applicationid: iis-675456 | ACTION: START | REQ: GET data published/data/ui } DADTA -:TIME:<TIMESTAMP> (0) 1712721546785 to 1712721546885 ms GET /v8/wi/data/*, GET data/ui/wi/load/success", "tags": {"host": "GTU5656", "insuranceid": "8786578896667", "lib": "app"}}

 

instead? In other words, do you not already have a field named "DATA"? I ask because the overall structure of your sample is very close to valid JSON.

Assuming you have a field named DATA, a better strategy is to reconstruct the structure your developers intended rather than extract individual tidbits as if they were random text, because your developers have clearly put thought into the data structure within DATA. I would propose something like

 

index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
| rex field=DATA mode=sed "s/ *[\|}\]]/\"/g s/: *\[*/=\"/g"
| rename _raw as temp
| rename DATA AS _raw
| kv
| rename temp as _raw

 

Your sample data should give you

ACTION | Applicationid | DIP | Data | FLOW | MIM | REQ | SERVICE | date | http | tags.host | tags.insuranceid | tags.lib
START | iis-675456 | 675478-7655a-56778d-655de45565 | 7665-56767ed-5454656 | NEW | 483748348-632637f-38648266257d | GET data published/data/ui | AAP | 1/2/2022 00:12:22,124 | nio-12567-exec-44 | GTU5656 | 8786578896667 | app
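
If the extraction gives you SERVICE and ACTION as in the result above, the daily per-service counts from the original question could be built on top of it. This is only a sketch: it assumes DATA really exists as a field, and the upper() call, the split filter, and limit=0 are my additions for illustration, not something verified against your data.

index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
| rex field=DATA mode=sed "s/ *[\|}\]]/\"/g s/: *\[*/=\"/g"
| rename _raw as temp
| rename DATA AS _raw
| kv
| rename temp as _raw
``` assumption: kv produced SERVICE and ACTION; upper() makes the column names predictable ```
| eval split=SERVICE.":".upper(ACTION)
| search split="*:START" OR split="*:DONE"
``` limit=0 keeps every series instead of folding the smaller ones into OTHER ```
| timechart span=1d limit=0 count by split

Because the series names are built from already-normalized values, there is no need for a trailing table command with case-dependent wildcards.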

Here is an emulation that results in my hypothesized raw log:

 

| makeresults
| eval _raw = "{\"date\": \"1/2/2022 00:12:22,124\",  \"DATA\": \"[http:nio-12567-exec-44] DIP: [675478-7655a-56778d-655de45565] Data: [7665-56767ed-5454656] MIM: [483748348-632637f-38648266257d] FLOW: [NEW] { SERVICE: AAP | Applicationid: iis-675456 | ACTION: START | REQ: GET data published/data/ui } DADTA -:TIME:<TIMESTAMP> (0) 1712721546785 to 1712721546885 ms GET /v8/wi/data/*, GET data/ui/wi/load/success\", \"tags\": {\"host\": \"GTU5656\", \"insuranceid\": \"8786578896667\", \"lib\": \"app\"}}"
| spath
``` the above emulates
index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
```

 

Play with the emulation and compare with real data.

Note: In the unimaginable case where your developers try really hard to mess with everybody's mind and inject a semblance of JSON compliance while violating common sense, you can still apply the same principle against _raw. Like this:

 

index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
| rex mode=sed "s/ *[\|}\]]/\"/g s/: *\[*/=\"/g"
| kv

 

This is what the output would look like:

ACTION | Applicationid | DATA | DIP | Data | FLOW | MIM | REQ | SERVICE | host
START | iis-675456 | http= | 675478-7655a-56778d-655de45565 | 7665-56767ed-5454656 | NEW | 483748348-632637f-38648266257d | GET data published/data/ui | AAP | 

Without a better structure, you won't get subnodes embedded in tags; but your original question does not seem to care about tags.

Here is an emulation that resembles the actual sample you posted:

 

| makeresults
| eval _raw = "{\"date\": \"1/2/2022 00:12:22,124\",  DATA: [http:nio-12567-exec-44] DIP: [675478-7655a-56778d-655de45565] Data: [7665-56767ed-5454656] MIM: [483748348-632637f-38648266257d] FLOW: [NEW] { SERVICE: AAP | Applicationid: iis-675456 | ACTION: START | REQ: GET data published/data/ui } DADTA -:TIME:<TIMESTAMP> (0) 1712721546785 to 1712721546885 ms GET /v8/wi/data/*, GET data/ui/wi/load/success\", \"tags\": {\"host\": \"GTU5656\", \"insuranceid\": \"8786578896667\", \"lib\": \"app\"}}"
``` the above emulates
index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
```

 

0 Karma

Santosh2
Path Finder

We have around 10 services. With the query below I get 8 of them; the other 2 are not displayed in the table, although we can see them in the events. Field extraction is working correctly.
I am not sure why the other 2 services are not showing up in the table. Please find the output below.

index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
|rex field=_raw "DIP:\s+\[(?<dip>[^\]]+)."
|rex field=_raw "ACTION:\s+(?<actions>\w+)"
|rex dield=_raw "SERVICE:\s+(?<services>\S+)"
|search actions= start OR actions=done NOT service="null"
|eval split=services.":".actions
|timechart span=1d count by split
|eval _time=strftime(_time, "%d/%m/%Y")
|table _time *start *done


Current output (the DCC:DONE and PIP:DONE fields are missing):

_timeAAP:STARTACC:STARTABB:STARTDCC:STARTPIP:STARTAAP:DONEACC:DONEABB:DONE
1/2/20221100110011661
2/2/202250503303
3/2/20221001008708
4/2/2022100110019780180
5/2/20220505350040

 

Expected output:

_timeAAP:STARTACC:STARTABB:STARTDCC:STARTPIP:STARTAAP:DONEACC:DONEABB:DONEDCC:DONEPIP:DONE
1/2/20221100110011661991
2/2/20225050330302
3/2/2022100100870803
4/2/2022100110019780180190
5/2/202205053500405200

 

0 Karma

yuanliu
SplunkTrust

Treating structured data as pure text is doomed to be unstable.  Have you tried my suggestion of reconstructing events based on inherent structure?

0 Karma

PickleRick
SplunkTrust

As a side note - I suppose this is some sort of a typo and your search contains "search actions=start", not "search actions= start" (notice the space in the middle). Assuming that...

That's a bit strange because assuming all your events follow the same syntax, the search looks relatively sound.

The normal approach to debugging a search is one of two things. Either start from the beginning and check that each step gives you the desired results, adding one command at a time until you see where it stops doing what you want; or cut commands from the end until the remaining part of the pipeline works properly.

I'd cut back to just after the rex commands and search for events that should match those results you lack in your final results.

Then add one command after another and see.
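
For example, a stripped-down check could look like the sketch below. It keeps only the base search and the two extractions that feed the final table, then lists every services/actions combination with its exact case. The stats step is my addition for inspection, and I am assuming "dield" in the posted search is a typo for "field"; everything else comes from the original search.

index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
| rex field=_raw "ACTION:\s+(?<actions>\w+)"
| rex field=_raw "SERVICE:\s+(?<services>\S+)"
``` inspection step, not in the original search: one row per combination, case preserved ```
| stats count by services actions

If DCC and PIP show up here with a DONE row, the extraction is fine and the problem is further down the pipeline.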

Two possible culprits:

1) default limit of results for timechart (but that's kinda unlikely because you'd get 10 results and "OTHER" by default, not 8 results)

2) case of field names - field names are case sensitive whereas field values in the search command are not, so if your actions field contains "done" in most events but "DONE" in the missing ones, the resulting whatever:DONE fields would _not_ be matched by the *done wildcard in the table command.
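
A sketch that rules out both at once: limit=0 lifts the timechart series limit, and upper() forces a single case so the wildcards in table can be written to match it. Both are guesses at what might be wrong, not a confirmed diagnosis. The sketch also assumes that "dield=" and "service=" in the posted search were meant to be "field=" and "services=".

index=test-index (data loaded) OR ("GET data published/data/ui" OR "GET /v8/wi/data/*" OR "GET data/ui/wi/load/success")
| rex field=_raw "ACTION:\s+(?<actions>\w+)"
| rex field=_raw "SERVICE:\s+(?<services>\S+)"
``` assumption: normalize case so every series name ends in START or DONE ```
| eval actions=upper(actions)
| search (actions=START OR actions=DONE) NOT services="null"
| eval split=services.":".actions
``` limit=0 disables the default series limit, so nothing is folded into OTHER ```
| timechart span=1d limit=0 count by split
| eval _time=strftime(_time, "%d/%m/%Y")
| table _time *START *DONE

If the two missing columns appear with this version, put the original pieces back one at a time to see which change made the difference.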

0 Karma