I have two searches over two separate log files that share a Txid (an outer join might work):
In the first search I get the Txid from source file A along with the duration of that transaction.
The second search (I used the Drilldown Editor to create a click event --> Set TxnId=$click.value$) retrieves the appname, the columns from a SQL statement, and the host, filtered by the selected Txnid.
I'd like to combine these two outputs into one result. How do I do it? The exact searches I used are as follows:
index="IDX" (host="PRhosts") source="WS.webapi.log" "Controller.Post" "- End" | rex field=_raw "s/^.* {/{/" mode=sed | spath output=status path=stat |rex field=_raw "\s+T+\s(?<txid>.*?)\s+Controller\\.Post\s\\-\s(?<duration>.*?)\s\\-\s+End" |sort - duration |table txid duration
index="IDX" (host="PRhosts") source="*WS.Business.Milestones.log" |rex field=_raw "s/^.* {/{/" mode=sed |spath output=nv path=flds{}.nv |spath output=status path=stat |spath output=tid path=tid |spath output=fn path=flds{}.fn | search tid=$Txnid$ | table fn nv host status tid
WS.Webapi.log raw data looks like the single line below (you can assume there is a "- Begin" line somewhere above it, but no duration is recorded there):
08/10/22 19:21:18.33 p06712 [00017] T M2kYTm7ywE6RFEnqc9m_1g Controller.Post - 00:00:00:270 - End
WS.Business.Milestones.log raw data looks like the following:
08/10/22 19:26:03.44 p08604 [00106] T {"tid":"H2R2JPpkiECRHW5hEszG3Q","sid":"T1-COOLSECURITY:CSAPPAUTH-{E7690AF7-D1F0-4A84-A612-7E47C9F07679}","stat":"Success","sf":"EmployeeLogic","sm":"GetAsync","dt":"2022-08-10T23:26:03.4462133Z","flds":[{"fn":"username","nv":"HostedRedirGlobalEmployeeWS_PR"},{"fn":"dbQueries","nv":"SQL_QUERIES=SELECT emp.EMP_ID, emp.REPORTS_TO_SCID, emp.DEPT_CODE , emp.EMP_ID\n FROM coolemp.SHIPS_COOL2 emp\n WHERE ((UPPER(emp.SYSTEM_PERSON_TYPE) != UPPER('Pending Worker'))) AND ((UPPER(emp.USER_SID) = UPPER(:emp_userSid)))"}]}
So I'd like to know how to join the above two results into one, so I can show the duration together with the fn and nv values whose SQL contains the field "emp.Last_Updated_Date".
The first search is not right.
You need to repeat the fields for each side of the OR condition and use brackets where necessary; maybe this is what you intended:
(index="IDX" host="PRHosts") ((source="WS.Business.Milestones.log") OR (source="WS.webapi.log" "EmployeeController.Post" "- End"))
AND is implicit between search terms.
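For example, these two base searches are equivalent; the second just spells out the ANDs that the first leaves implicit (a sketch using the values from your search):

```spl
index="IDX" host="PRHosts" source="WS.webapi.log" "EmployeeController.Post" "- End"

index="IDX" AND host="PRHosts" AND source="WS.webapi.log" AND "EmployeeController.Post" AND "- End"
```

That is why the OR in your combined search needs explicit brackets: without them, the surrounding terms are ANDed in ways you may not intend.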
The duration is not captured. It works fine in search 1 when only the webapi log is used; after combining, the txid is captured but the duration field is null. Why?
| rex field=_raw "\s+T+\s(?<txid>.*?)\s+Controller\\.Post\s\\-\s(?<duration>.*?)\s\\-\s+End"
The final part of your search is out of order.
You are using dedup on column/username; if the data from search 1 does not have those fields, those events will disappear. Also, you are searching status=Success, which will only appear in the search 2 data, so you should finish the search something like this:
| fields - fn nv fieldname
...
| eval txid=coalesce(txid, tid)
| stats values(*) as * by txid
``` Filter success items ```
| where status="Success"
``` Break out columns ```
| rex field=dbQueries "(?i)select\s+(?<columns>.*)\s+from"
| eval column=split(columns,", ")
``` Now get the column we want ```
| eval column = mvfilter(column="emp.last_updated_date")
| dedup column username
That is, there is no point in using mvexpand and then searching for last_updated_date to remove all the other expanded columns; all you need to do is filter for the one you want.
dedup may or may not be necessary.
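Putting the corrected base search together with this reordered tail, the whole pipeline might look something like the following. This is a sketch assembled from the pieces already discussed, untested against your data; the fn/nv evals recreate the dbQueries field before the stats:

```spl
(index="IDX" host="PRHosts") ((source="WS.Business.Milestones.log") OR (source="WS.webapi.log" "EmployeeController.Post" "- End"))
| rex field=_raw "\s+T+\s(?<txid>.*?)\s+Controller\\.Post\s\\-\s(?<duration>.*?)\s\\-\s+End"
| rex field=_raw "s/^.* {/{/" mode=sed
| spath output=status path=stat
| spath output=nv path=flds{}.nv
| spath output=tid path=tid
| spath output=fn path=flds{}.fn
| eval fieldname=mvindex(fn,0)
| eval {fieldname}=mvindex(nv,0)
| eval fieldname=mvindex(fn,1)
| eval {fieldname}=mvindex(nv,1)
| fields - fn nv fieldname
``` Join the two data sets on the common id ```
| eval txid=coalesce(txid, tid)
| stats values(*) as * by txid
``` Filter success items and break out the SQL columns ```
| where status="Success"
| rex field=dbQueries "(?i)select\s+(?<columns>.*)\s+from"
| eval column=split(columns,", ")
| eval column=mvfilter(column="emp.last_updated_date")
```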
This new search did not produce any errors, but it returned no results in a time frame where I knew 18 results should come back. What did I do wrong? I believe the OR is the way to combine the raw data from the two log files, but perhaps it's fundamentally incorrect?
index="IDX" (host="PRHosts") source="WS.Business.Milestones.log" OR "WS.webapi.log" "EmployeeController.Post" "- End"
| rex field=_raw "\s+T+\s(?<txid>.*?)\s+Controller\\.Post\s\\-\s(?<duration>.*?)\s\\-\s+End"
| rex field=_raw "s/^.* {/{/" mode=sed
| spath output=status path=stat
| spath output=nv path=flds{}.nv
| spath output=status path=stat
| spath output=tid path=tid
| spath output=fn path=flds{}.fn | eval fieldname=mvindex(fn,0)
| eval {fieldname}=mvindex(nv,0)
| eval fieldname=mvindex(fn,1)
| eval {fieldname}=mvindex(nv,1)
| fields - fn nv fieldname
| rex field=dbQueries "(?i)select\s+(?<columns>.*)\s+from"
| eval column=split(columns,", ")
| mvexpand column
| dedup column username | search status=Success column="emp.last_updated_date"
| eval txid=coalesce(txid, tid)
| stats values(*) as * by txid
No need for the drilldown any more; that was just my initial approach to display the results in a dashboard via a drilldown.
I meant to join the search results by txid. I will try your syntax and then extract emp.Last_Updated_Date from the combined data.
Sorry, my examples were only meant to illustrate the record formats; they don't share the same txid.
The main point about combining these into a single search is to do things the 'Splunk' way, i.e. not to use the join command, which has limitations, but instead to join using the stats command.
If you come from SQL, it's good to remember that in Splunk the first option is NOT the join command.
With the drilldown, if you go into the XML source, you can set any token for any field in the table:
<drilldown>
  <set token="Txnid">$row.txid$</set>
  <set token="Duration">$row.duration$</set>
</drilldown>
i.e. your drilldown setting is 'row'.
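Those tokens can then drive another panel's search, e.g. your Milestones search filtered by the clicked transaction (a sketch reusing the search and token name from above):

```spl
index="IDX" (host="PRhosts") source="*WS.Business.Milestones.log"
| rex field=_raw "s/^.* {/{/" mode=sed
| spath output=tid path=tid
| search tid=$Txnid$
```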
Do you actually want to do this with a drilldown, or to have a single table with all the information in it? See this example:
| makeresults
| eval r=split("08/10/22 19:21:18.33 p06712 [00017] T M2kYTm7ywE6RFEnqc9m_1g Controller.Post - 00:00:00:270 - End###08/10/22 19:26:03.44 p08604 [00106] T {\"tid\":\"H2R2JPpkiECRHW5hEszG3Q\",\"sid\":\"T1-COOLSECURITY:CSAPPAUTH-{E7690AF7-D1F0-4A84-A612-7E47C9F07679}\",\"stat\":\"Success\",\"sf\":\"EmployeeLogic\",\"sm\":\"GetAsync\",\"dt\":\"2022-08-10T23:26:03.4462133Z\",\"flds\":[{\"fn\":\"username\",\"nv\":\"HostedRedirGlobalEmployeeWS_PR\"},{\"fn\":\"dbQueries\",\"nv\":\"SQL_QUERIES=SELECT emp.EMP_ID, emp.REPORTS_TO_SCID, emp.DEPT_CODE , emp.EMP_ID\n FROM coolemp.SHIPS_COOL2 emp\n WHERE ((UPPER(emp.SYSTEM_PERSON_TYPE) != UPPER('Pending Worker'))) AND ((UPPER(emp.USER_SID) = UPPER(:emp_userSid)))\"}]}", "###")
| mvexpand r
| rename r as _raw
``` Above is setting up your example data ```
| rex field=_raw "\s+T+\s(?<txid>.*?)\s+Controller\\.Post\s\\-\s(?<duration>.*?)\s\\-\s+End"
| rex field=_raw "s/^.* {/{/" mode=sed
| spath output=status path=stat
| spath output=nv path=flds{}.nv
| spath output=status path=stat
| spath output=tid path=tid
| spath output=fn path=flds{}.fn
| eval txid=coalesce(txid, tid)
| stats values(*) as * by txid
You would search both sources with (search1) OR (search2), then create the common field txid from the two data sets; stats will then 'join' them together. Note that the sample data above has two different txids, though.
As for emp.Last_Updated_Date, if that data exists anywhere, you can then extract it as needed.
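For example, reusing the column extraction from earlier, something along these lines. This is a sketch that assumes dbQueries has been built via the fn/nv evals shown previously; the final where is my addition, keeping only transactions whose SQL actually selects that column. Note that the eval string comparison inside mvfilter is case-sensitive, so adjust the literal to match the actual casing in your SQL:

```spl
| rex field=dbQueries "(?i)select\s+(?<columns>.*)\s+from"
| eval column=split(columns,", ")
| eval column=mvfilter(column="emp.last_updated_date")
| where isnotnull(column)
```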