Splunk Search

Splunk Search Query to fill empty columns of data with data from other rows

wm
Loves-to-Learn Everything

I'll first paste my whole Splunk search query, then show what it's currently returning and what the expected result is.


index=sss sourcetype=sss_trade_www_timestamp 
| rex field=_raw "trade_id=\"(?<trade_id>\X+)\", event_id" 
| rex field=_raw "mx_status=\"(?<status>\X+)\", operation" 
| rex field=_raw "operation=\"(?<operation>\X+)\", action" 
| rex field=_raw " action=\"(?<action>\X+)\", tradebooking_sgp" 
| rex field=_raw " eventtime_sgp=\"(?<booking_mq_timestamp>\X+)\", sky_to_mq" 
| rex field=_raw "mq_latency=\"(?<mq_latency>[^\"]+)\".*\bportfolio_name=\"(?<portfolio_name>[^\"]+)\".*\bportfolio_entity=\"(?<portfolio_entity>[^\"]+)\".*\btrade_type=\"(?<trade_type>[^\"]+)"

| join event_id 
    [ search index=sss sourcetype=Sss_Www_EP_Logs "Successfully processed event" 
    | rex field=_raw "INFO: (?<booking_ep_timestamp>\d{8} \d{2}:\d{2}:\d{2}.\d{3})" 
    | rex field=_raw "Successfully processed event: (?<event_id>\X+), action" 
    | eval booking_ep_timestamp = strftime(strptime(booking_ep_timestamp."+0800", "%Y%d%m %H:%M:%S.%N%z"), "%Y/%m/%d %H:%M:%S")] 

    
| join type=left sss_id 
    [ search index=sss "New trades in amendment" "*pnl*"
    | rex "Trade Events (?<trades>.*)"
    | rex max_match=0 field=trades "(?<both_id>\d+:\d+)"
    | mvexpand both_id
    | rex field=both_id ":(?<sky_id>\d+)" 
    | rex max_match=1 field=_raw "(?<booking_pnl_timestamp>\d{4}+-\d{2}+-\d{2} \d{2}:\d{2}:\d{2},\d{3})"] 

| rex field=tradebooking_sgp "(?<booking_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
| rex field=booking_mq_timestamp "(?<booking_mq_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
| eval booking_pnl_timestamp = booking_pnl_timestamp."+0800" 

| eval ep_latency = strptime(booking_ep_timestamp, "%Y-%m-%d %H:%M:%S.%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 
| eval pnl_latency = strptime(booking_pnl_timestamp, "%Y-%m-%d %H:%M:%S,%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 

| search trade_id = "*" 
| search sss_id = "*" 
| search event_id = "*" 
| search action = "*" 
| search mx_status = "live" 
| search operation = "*" 


| table trade_id, sss_id, event_id, booking_timestamp, booking_mq_timestamp, booking_ep_timestamp, mx_status, operation, action, mq_latency, ep_latency, portfolio_name, portfolio_entity
| sort booking_ep_timestamp
| join type=left sss_id
       [ search index=sss sourcetype=sss_cashfx_catchup_logs "[Www] - Done incremental update" 
| rex field=_raw "Max Ssslib TradeID: (?<sss_id>\d+)"
| rex field=_raw "^(?<catchup_updated_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
| eval catchup_updated_time = strftime(strptime(catchup_updated_time, "%Y-%m-%d %H:%M:%S"), "%Y/%m/%d %H:%M:%S")
| table sss_id, catchup_updated_time, _raw, 
]
| eval booking_timestamp_unix = strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval catchup_unix_time = strptime(catchup_updated_time, "%Y/%m/%d %H:%M:%S") 
| eval www_to_sss_latency = round(catchup_unix_time - booking_timestamp_unix, 0)
| eval booking_mq_timestamp_unix = strptime(booking_mq_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval mq_latency = round(booking_mq_timestamp_unix - booking_timestamp_unix, 0)
| eval booking_ep_timestamp_unix = strptime(booking_ep_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval ep_latency = round(booking_ep_timestamp_unix - booking_mq_timestamp_unix, 0)
| eval mq_to_sss_update_latency = round(catchup_unix_time - booking_mq_timestamp_unix, 0)
| table trade_id, portfolio_name, portfolio_entity, sss_id, event_id, booking_timestamp, booking_mq_timestamp, booking_ep_timestamp, mq_latency, ep_latency, catchup_updated_time, www_to_sss_latency, mq_to_sss_update_latency, mx_status, operation, action,
| dedup sss_id
| sort booking_timestamp


It gives me this table, but as I can't show all the table's rows, I'll show the relevant ones:

| trade_id | sss_id | booking_timestamp | booking_mq_timestamp | booking_ep_timestamp | mq_latency | ep_latency | catchup_updated_time | www_to_sss_latency | mq_to_sss_update_latency |
|---|---|---|---|---|---|---|---|---|---|
| abc 123 | 597616519 | 2024/06/15 09:22:37 | 2024/06/15 09:24:16 | 2024/06/15 09:24:16 | 99 | 0 | 2024/06/15 09:24:26 | 109 | 10 |
| abc 34 | 597616518 | | | | | | | | |
| abc 21893 | 597616517 | | | | | | | | |
| abc 120 | 597614937 | 2024/06/15 07:50:14 | 2024/06/15 07:51:12 | 2024/06/15 07:51:12 | 58 | 0 | 2024/06/15 07:51:19 | 65 | 7 |
| abcc 219 | 597614936 | | | | | | | | |
| abc 219 | 597614935 | | | | | | | | |


Just assume the booking_timestamp, booking_mq_timestamp, booking_ep_timestamp, mq_latency, and ep_latency columns are all filled.

OK, but since my catchup_updated_time is taken from a log entry, it is populated (e.g. 2024-06-15 10:57:03,114 [Www] - Done incremental update. Max SSSSSS TradeID: 597618769), while the rest of the rows/columns are not populated.

I want to highlight this specific row, since it is taken from the logs, and also fill the empty catchup_updated_time values: 597616518 and 597616517 should take the catchup_updated_time, latency, etc. of 597616519, since their IDs come before it, and 597616519 is the max ID taken from the logs, so its row should be highlighted. Hence anything before (smaller than) 597616519 should have the same catchup_updated_time. However, not past 597614937, as it already has a catchup_updated_time taken from the logs. The same goes for the rest of the rows. Is this complicated?
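To make the rule concrete, here is a small Python sketch of the desired fill (not SPL; the row data comes from the sample table above, and the `filldown` helper name is just illustrative). Sorted by sss_id descending, an empty catchup_updated_time inherits the value of the nearest higher-id row that has one, stopping at the next populated row:

```python
# Rows sorted by sss_id descending; values taken from the sample table.
rows = [
    (597616519, "2024/06/15 09:24:26"),  # max id from the logs: the "source" row
    (597616518, None),
    (597616517, None),
    (597614937, "2024/06/15 07:51:19"),  # already populated from the logs
    (597614936, None),
    (597614935, None),
]

def filldown(rows):
    """Carry the last non-empty value forward, like SPL's filldown command."""
    out, last = [], None
    for sss_id, ts in rows:
        if ts is not None:
            last = ts
        out.append((sss_id, last))
    return out

filled = filldown(rows)
```

This is effectively what `| sort - sss_id | filldown catchup_updated_time` would do in SPL, assuming the rows are already ordered by ID descending.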

Please let me know if you need more info 


yuanliu
SplunkTrust

Your question would be much easier to understand if you skip the complex SPL, first give sample data (anonymized as needed), illustrate the desired output, then explain the logic between the illustrated data sample and the desired output without SPL. To diagnose your attempted SPL, you should also illustrate the actual output from that SPL, and explain how the actual output differs from the desired output if that is not painfully obvious. (Remember: what is "obvious" to you is not always obvious to volunteers who lack intimate knowledge of your dataset and your use case.)

As a side note, the illustrated SPL implies that your sourcetype=sss_trade_www_timestamp contains snippets like "trade_id=foo",  "mx_status=bar", and so on.  If so, Splunk would have extracted trade_id, status, etc. without your rex.  Is there any reason Splunk is not giving you those?


wm
Loves-to-Learn Everything

Ah okay, thanks, gotcha.

The current output is like the table I've shown, and the desired output is the table after that.

catchup_updated_time is not populated for some rows since it's taken from another sourcetype, which is www_cattchup_Logs.

Please let me know if further info is required.


ITWhisperer
SplunkTrust

Will filldown do the trick? E.g.

| filldown catchup_updated_time

wm
Loves-to-Learn Everything

Hi @ITWhisperer, it works, but my other columns, which are a calculation of that column, don't get populated:

| eval distributor_to_abc_latency = catchup_unix_time - CIMsendingTime_unix



Since the column was empty and was filled using filldown, the other columns don't get filled.


ITWhisperer
SplunkTrust

Move the filldown to before the calculations. Splunk is not Excel (or another spreadsheet application): the calculations are not dynamic formulae held in cells!
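The ordering matters because an SPL eval runs once per event as the pipeline passes, and is never recomputed. A toy Python sketch (field names mirror the thread; the unix-time numbers are invented) shows the difference between compute-then-fill and fill-then-compute:

```python
# Invented sample values; the second row's catchup_unix_time is empty,
# as in the thread.
rows = [
    {"catchup_unix_time": 1000.0, "CIMsendingTime_unix": 900.0},
    {"catchup_unix_time": None,   "CIMsendingTime_unix": 940.0},
]

def compute(rows):
    """The `eval distributor_to_abc_latency = ...` step: runs once per row."""
    for r in rows:
        r["latency"] = (None if r["catchup_unix_time"] is None
                        else r["catchup_unix_time"] - r["CIMsendingTime_unix"])
    return rows

def fill(rows):
    """The `filldown catchup_unix_time` step: copy the last seen value down."""
    last = None
    for r in rows:
        if r["catchup_unix_time"] is None:
            r["catchup_unix_time"] = last
        else:
            last = r["catchup_unix_time"]
    return rows

# compute-then-fill: the second row's latency stays empty (the reported problem)
wrong = compute([dict(r) for r in rows])
# fill-then-compute: the second row's latency is populated
right = compute(fill([dict(r) for r in rows]))
```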


wm
Loves-to-Learn Everything

Noted, thanks @ITWhisperer.

However, it looks like it's not working as expected.
This is before filldown:

[screenshot: table before filldown]


This is after filldown:

[screenshot: table after filldown]

Why is it not populating 2024/09/04 07:54:20.445 from the rows below? Instead it is filling with 2024/09/04 07:54:52.137.

 


ITWhisperer
SplunkTrust

Please share your search / SPL, preferably in a code block, not a picture


wm
Loves-to-Learn Everything
index=abc sourcetype=abc_trade_wss_timestamp 
| rex field=_raw "trade_id=\"(?<trade_id>\X+)\", event_id" 
| rex field=_raw "mx_status=\"(?<status>\X+)\", operation" 
| rex field=_raw "operation=\"(?<operation>\X+)\", action" 
| rex field=_raw " action=\"(?<action>\X+)\", tradebooking_sgp" 
| rex field=_raw " eventtime_sgp=\"(?<distributor_timestamp>\X+)\", sky_to_mq" 
| rex field=_raw "distributor_latency=\"(?<distributor_latency>[^\"]+)\".*\bportfolio_name=\"(?<portfolio_name>[^\"]+)\".*\bportfolio_entity=\"(?<portfolio_entity>[^\"]+)\".*\btrade_type=\"(?<trade_type>[^\"]+)"

| join event_id 
    [ search index=sky sourcetype=Sky_WSS_EP_Logs "Successfully processed event" 
    | rex field=_raw "INFO: (?<ep_timestamp>\d{8} \d{2}:\d{2}:\d{2}.\d{3})" 
    | rex field=_raw "Successfully processed event: (?<event_id>\X+), action" 
    | eval ep_timestamp = strftime(strptime(ep_timestamp."+0800", "%Y%d%m %H:%M:%S.%N%z"), "%Y-%m-%d %H:%M:%S.%3N")]
    
| join type=left sky_id 
    [ search index=sky "New trades in amendment" "*pnl*"
    | rex "Trade Events (?<trades>.*)"
    | rex max_match=0 field=trades "(?<both_id>\d+:\d+)"
    | mvexpand both_id
    | rex field=both_id ":(?<sky_id>\d+)" 
    | rex max_match=1 field=_raw "(?<booking_pnl_timestamp>\d{4}+-\d{2}+-\d{2} \d{2}:\d{2}:\d{2},\d{3})"] 

| rex field=tradebooking_sgp "(?<booking_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
| rex field=distributor_timestamp "(?<distributor_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"


| eval booking_pnl_timestamp = booking_pnl_timestamp."+0800" 

| eval ep_latency = strptime(ep_timestamp, "%Y-%m-%d %H:%M:%S.%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 
| eval pnl_latency = strptime(booking_pnl_timestamp, "%Y-%m-%d %H:%M:%S,%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 

| search trade_id = "*" 
| search sky_id = "*" 
| search event_id = "*" 
| search action = "*" 
| search mx_status = "live" 
| search operation = "*" 


| table trade_id, sky_id, event_id, booking_timestamp, distributor_timestamp, ep_timestamp, mx_status, operation, action, distributor_latency, ep_latency, portfolio_name, portfolio_entity
| join type=left sky_id
       [ search index=sky sourcetype=sky_cashfx_catchup_logs "[WSS] - Trade Store has been updated"
| rex field=_raw "Max Skylib TradeID: (?<sky_id>\d+)"
| rex field=_raw "^(?<catchup_updated_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
| eval catchup_updated_time = strftime(strptime(catchup_updated_time, "%Y-%m-%d %H:%M:%S,%3N"), "%Y/%m/%d %H:%M:%S.%3N")
| dedup sky_id sortby +_time 
| table sky_id, catchup_updated_time, _raw, 
]
| eval booking_timestamp_unix = strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S") 
```| eval wss_to_sky_latency = catchup_unix_time - booking_timestamp_unix```



| eval mq_to_sky_update_latency = catchup_unix_time - distributor_timestamp_unix
| eval ep_timestamp = strftime(strptime(ep_timestamp, "%Y-%m-%d %H:%M:%S.%3N"), "%Y/%m/%d %H:%M:%S.%3N")
| eval distributor_timestamp = strftime(strptime(distributor_timestamp, "%Y/%m/%d %H:%M:%S.%4N"), "%Y/%m/%d %H:%M:%S.%4N")

| table trade_id, portfolio_name, portfolio_entity, sky_id, event_id, booking_timestamp, booking_timestamp_unix, distributor_timestamp, distributor_timestamp_unix, ep_timestamp, distributor_latency, ep_latency, catchup_updated_time, wss_to_sky_latency, mq_to_sky_update_latency, mx_status, operation, action, catchup_unix_time
| rex field=trade_id "^\w+ (?<dealnumber>\d+)$"
| join type=left dealnumber
        [ search index=wss "Sending message" source="/proj/flowfx/wss/FFXWS01P/log/MQ1.log"
    ```Exclude Far Legs of Swap Trades for first Iteration of Dash```
    NOT "<swap_leg>2</swap_leg>" 
    ```Exclude Cancels, Amends, Auxiliaries, Allocations, Blocks - allocated ```
    NOT "<status>"
    ```Exclude MM Deals ```
    NOT "<WSSMMTRADE>" 
| rex "\<transaction\>(?P<tid>.*?)\<\/transaction\>" 
| rex "\<deal_number\>(?P<dealnumber>.*?)\<\/deal_number\>" 
| rex "\<external_deal\>(?P<sourcesystemid>.*?)\<\/external_deal\>" 
| rex "\<cust_type\>(?P<custType>.*?)\<\/cust_type\>" 
| eval region=case(host == "pffxa01z", "Global", host == "pffxa02z", "China") 
| eval wssSendingTime=strftime(_time,"%Y-%m-%d %H:%M:%S.%Q") 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
| join type=left tid 
    [ search index=wss source="/proj/flowfx/ffx/log/flowfx-processor.log" "INFO SLA FFX-Processor received" 
    | rex "transactionId\=(?P<tid>.*?)\," 
    | eval flowfxincomingtime=strftime(_time,"%Y-%m-%d %H:%M:%S.%Q") 
    | table flowfxincomingtime,tid, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
        ] 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime,flowfxincomingtime 
| eval wssSendingTimeUnix=strptime(wssSendingTime,"%Y-%m-%d %H:%M:%S.%Q") 
| eval flowfxincomingtimeUnix=strptime(flowfxincomingtime,"%Y-%m-%d %H:%M:%S.%Q") 
| eval timebetweenWssFlowfx = flowfxincomingtimeUnix - wssSendingTimeUnix 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime,flowfxincomingtime,timebetweenWssFlowfx,wssSendingTimeUnix,flowfxincomingtimeUnix, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
| join type=left tid 
    [ search index=wss source="/proj/flowfx/ffx/log/flowfx-trade-sender-cim.log" "INFO SLA FFX-Trade-Sender sent" 
    | rex "nearTransactionId\=(?P<tid>.*?)\," 
    | eval CIMsendingTime=strftime(_time,"%Y/%m/%d %H:%M:%S.%Q") 
    | eval MQ_available_time=strftime(_time - 7200, "%Y-%m-%d %H:%M:%S.%Q") 
    | table CIMsendingTime,tid,MQ_available_time,booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix
        ] 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime,flowfxincomingtime,timebetweenWssFlowfx,wssSendingTimeUnix,flowfxincomingtimeUnix,CIMsendingTime, MQ_available_time, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
]
| table trade_id, portfolio_name, sky_id, booking_timestamp,CIMsendingTime, distributor_timestamp, ep_timestamp, catchup_updated_time, cim_latency, distributor_latency, ep_latency, catchup_latency, wss_to_sky_latency, distributor_to_sky_latency
|sort - sky_id
| join type=left sky_id
 [ search index=sky sourcetype=sky_webservices_logs source="D:\\SkyNet\\SkyWebService\\logs\\live-risk-stomp-broadcast.log" "maxskyid"
| where maxskyid > 0
| dedup maxskyid
| rename maxskyid as sky_id
| eval sky_ui_timestamp=strftime(_time, "%Y/%m/%d %H:%M:%S.%3N")
| table sky_id host sky_ui_timestamp
        ]
| sort -sky_id
| filldown catchup_updated_time, sky_ui_timestamp
| eval mq_to_sky_update_latency = round(mq_to_sky_update_latency * 1000,0)
| eval sky_ui_unix_time = strptime(sky_ui_timestamp, "%Y/%m/%d %H:%M:%S.%3N") 
| eval catchup_unix_time = strptime(catchup_updated_time, "%Y/%m/%d %H:%M:%S.%3N") 
| eval booking_timestamp_unix = strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval wss_to_sky_latency = sky_ui_unix_time - booking_timestamp_unix
| eval wss_to_sky_latency = round(wss_to_sky_latency * 1000,0)
| eval CIMsendingTime_unix = strptime(CIMsendingTime, "%Y/%m/%d %H:%M:%S.%3Q") 
| eval distributor_to_sky_latency = sky_ui_unix_time - CIMsendingTime_unix
| eval distributor_to_sky_latency = round(distributor_to_sky_latency * 1000,0)
| eval cim_latency = CIMsendingTime_unix - booking_timestamp_unix
| eval cim_latency = round(cim_latency * 1000,0)
| eval distributor_timestamp_unix = strptime(distributor_timestamp, "%Y/%m/%d %H:%M:%S.%3N") 
| eval distributor_latency = distributor_timestamp_unix - CIMsendingTime_unix
| eval distributor_latency = round(distributor_latency * 1000,0)
| eval ep_timestamp_unix = strptime(ep_timestamp, "%Y/%m/%d %H:%M:%S.%3N") 
| eval ep_latency = ep_timestamp_unix - distributor_timestamp_unix
| eval ep_latency = round(ep_latency * 1000,0)
| eval catchup_latency = catchup_unix_time - ep_timestamp_unix
| eval catchup_latency = round(catchup_latency * 1000,0)
| eval ui_latency = sky_ui_unix_time - catchup_unix_time
| eval ui_latency = round(ui_latency * 1000,0)
| table trade_id, portfolio_name, sky_id, booking_timestamp,CIMsendingTime, distributor_timestamp, ep_timestamp, catchup_updated_time, sky_ui_timestamp, cim_latency, distributor_latency, ep_latency, catchup_latency, ui_latency, wss_to_sky_latency, distributor_to_sky_latency
| dedup sky_id
| search portfolio_name = $portfolio$

| where len(CIMsendingTime) > 0

ITWhisperer
SplunkTrust

There is no guarantee that the first event for each sky_id has a value in catchup_updated_time, so the filldown can be pulling any value from the previous sky_id down. When the dedup is done, only the first event for each sky_id is kept (which could have the wrong catchup_updated_time). Try either

| sort -sky_id catchup_updated_time
| filldown catchup_updated_time, sky_ui_timestamp

or

| sort -sky_id
| eventstats values(catchup_updated_time) as catchup_updated_time, values(sky_ui_timestamp) as sky_ui_timestamp by sky_id
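For intuition, here is a toy Python comparison of the two suggestions (the IDs and times are invented, not from the thread): eventstats only shares values among events with the *same* sky_id, while sort plus filldown carries a value across different sky_ids.

```python
from collections import defaultdict

events = [  # already sorted by sky_id descending
    {"sky_id": 30, "catchup_updated_time": "10:02:43"},
    {"sky_id": 20, "catchup_updated_time": None},   # different id: eventstats won't fill it
    {"sky_id": 10, "catchup_updated_time": "10:01:00"},
]

# | eventstats values(catchup_updated_time) as catchup_updated_time by sky_id
vals = defaultdict(list)
for e in events:
    if e["catchup_updated_time"] is not None:
        vals[e["sky_id"]].append(e["catchup_updated_time"])
eventstats = [dict(e, catchup_updated_time=vals[e["sky_id"]] or None)
              for e in events]

# | filldown catchup_updated_time
filled, last = [], None
for e in events:
    ts = e["catchup_updated_time"] or last
    last = ts
    filled.append(dict(e, catchup_updated_time=ts))
```

So eventstats alone fixes a wrong-value-within-one-id problem, while filldown alone propagates across IDs; combining them (as suggested later in the thread) covers both.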

wm
Loves-to-Learn Everything

Hi, thanks for helping!

I still got this:

[screenshot: table with the unexpected catchup_updated_time]

That first entry of 10:03:16... shouldn't be the case; it should be 10:02:43.



wm
Loves-to-Learn Everything

FYI, this was before the | filldown catchup_updated_time sky_ui_timestamp:

[screenshot: table before filldown]

 


ITWhisperer
SplunkTrust
SplunkTrust

So what happens if you use eventstats instead of filldown?


wm
Loves-to-Learn Everything

Nothing happens when I use that; it doesn't fill the rows below. The issue is happening for both catchup_updated_time and sky_ui_timestamp.

| sort -sky_id
| eventstats values(catchup_updated_time) as catchup_updated_time, values(sky_ui_timestamp) as sky_ui_timestamp by sky_id

 


ITWhisperer
SplunkTrust

Where would 10:02:43 come from as all these sky_id's are different?


wm
Loves-to-Learn Everything

It comes from this part:

| join type=left sky_id
[ search index=sky sourcetype=sky_cashfx_catchup_logs ....
....
| table sky_id, catchup_updated_time, _raw
]



So yes, once it has that part, it should fill down everything below until a populated catchup_updated_time field is met, after sorting by sky_id descending. Then, once a populated catchup_updated_time field is met, it fills down until another populated field, and the same goes for sky_ui_timestamp. This is working, but randomly not.



ITWhisperer
SplunkTrust

Try both

| eventstats values(catchup_updated_time) as catchup_updated_time, values(sky_ui_timestamp) as sky_ui_timestamp by sky_id
| sort -sky_id catchup_updated_time
| filldown catchup_updated_time, sky_ui_timestamp

wm
Loves-to-Learn Everything

Thank you. On that note, how can I highlight in red (or something) the specific timestamp that was used to fill down the rest of the rows below, i.e. a way to differentiate it from the rest, which were filled using filldown?


ITWhisperer
SplunkTrust

That's a whole different ball game! Are you using Studio or Classic Simple XML dashboards?


wm
Loves-to-Learn Everything

Thanks! I am using a Classic Simple XML dashboard.


ITWhisperer
SplunkTrust

This can be done with CSS, which you can define in your Simple XML in an HTML panel. Essentially, what you need to do is turn the columns into multi-value fields, with an indicator for the format option to pick up and change the background colour.

For the indicator in your case, you could check whether the previous date is different, i.e. whether it is the first in a series of copied dates. You can use streamstats to do this, something like this:

| streamstats values(catchup_updated_time) as previous_catchup_updated_time window=1 current=f
| eval catchup_updated_time=if(isnull(previous_catchup_updated_time) OR catchup_updated_time != previous_catchup_updated_time, mvappend(catchup_updated_time,"RED"), catchup_updated_time)
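As a rough Python sketch of that indicator logic (the values and run boundaries are invented): compare each value with the previous row's value, which is what `streamstats window=1 current=f` provides, and append a "RED" marker only at the first value of each run, i.e. the row that seeded the filldown.

```python
# A run of identical catchup_updated_time values, as produced by filldown.
rows = ["09:24:26", "09:24:26", "09:24:26", "07:51:19", "07:51:19"]

tagged, previous = [], None
for ts in rows:
    if previous is None or ts != previous:
        tagged.append([ts, "RED"])  # source row: gets the colour indicator
    else:
        tagged.append([ts])         # copied by filldown: no indicator
    previous = ts
```

The two-element lists mirror the multi-value field built by mvappend; the CSS then hides the second value and uses it to colour the cell.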

You then use CSS to hide (display: none) the second value in the multi-value field. See this answer (and similar) for clues  https://community.splunk.com/t5/Splunk-Search/How-to-color-the-columns-based-on-previous-column-valu...

 
