Splunk Search Query to fill empty columns of data ...

wm · ‎06-14-2024

I'll first insert my whole splunk search query and show whats it showing and whats the expected result

index=sss sourcetype=sss_trade_www_timestamp 
| rex field=_raw "trade_id=\"(?<trade_id>\X+)\", event_id" 
| rex field=_raw "mx_status=\"(?<status>\X+)\", operation" 
| rex field=_raw "operation=\"(?<operation>\X+)\", action" 
| rex field=_raw " action=\"(?<action>\X+)\", tradebooking_sgp" 
| rex field=_raw " eventtime_sgp=\"(?<booking_mq_timestamp>\X+)\", sky_to_mq" 
| rex field=_raw "mq_latency=\"(?<mq_latency>[^\"]+)\".*\bportfolio_name=\"(?<portfolio_name>[^\"]+)\".*\bportfolio_entity=\"(?<portfolio_entity>[^\"]+)\".*\btrade_type=\"(?<trade_type>[^\"]+)"

| join event_id 
    [ search index=sss sourcetype=Sss_Www_EP_Logs "Successfully processed event" 
    | rex field=_raw "INFO: (?<booking_ep_timestamp>\d{8} \d{2}:\d{2}:\d{2}.\d{3})" 
    | rex field=_raw "Successfully processed event: (?<event_id>\X+), action" 
    | eval booking_ep_timestamp = strftime(strptime(booking_ep_timestamp."+0800", "%Y%d%m %H:%M:%S.%N%z"), "%Y/%m/%d %H:%M:%S")] 

    
| join type=left sss_id 
    [ search index=sss "New trades in amendment" "*pnl*"
    | rex "Trade Events (?<trades>.*)"
    | rex max_match=0 field=trades "(?<both_id>\d+:\d+)"
    | mvexpand both_id
    | rex field=both_id ":(?<sky_id>\d+)" 
    | rex max_match=1 field=_raw "(?<booking_pnl_timestamp>\d{4}+-\d{2}+-\d{2} \d{2}:\d{2}:\d{2},\d{3})"] 

| rex field=tradebooking_sgp "(?<booking_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
| rex field=booking_mq_timestamp "(?<booking_mq_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
| eval booking_pnl_timestamp = booking_pnl_timestamp."+0800" 

| eval ep_latency = strptime(booking_ep_timestamp, "%Y-%m-%d %H:%M:%S.%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 
| eval pnl_latency = strptime(booking_pnl_timestamp, "%Y-%m-%d %H:%M:%S,%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 

| search trade_id = "*" 
| search sss_id = "*" 
| search event_id = "*" 
| search action = "*" 
| search mx_status = "live" 
| search operation = "*" 


| table trade_id, sss_id, event_id, booking_timestamp, booking_mq_timestamp, booking_ep_timestamp, mx_status, operation, action, mq_latency, ep_latency, portfolio_name, portfolio_entity
| sort booking_ep_timestamp
| join type=left sss_id
       [ search index=sss sourcetype=sss_cashfx_catchup_logs "[Www] - Done incremental update" 
| rex field=_raw "Max Ssslib TradeID: (?<sss_id>\d+)"
| rex field=_raw "^(?<catchup_updated_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
| eval catchup_updated_time = strftime(strptime(catchup_updated_time, "%Y-%m-%d %H:%M:%S"), "%Y/%m/%d %H:%M:%S")
| table sss_id, catchup_updated_time, _raw, 
]
| eval booking_timestamp_unix = strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval catchup_unix_time = strptime(catchup_updated_time, "%Y/%m/%d %H:%M:%S") 
| eval www_to_sss_latency = round(catchup_unix_time - booking_timestamp_unix, 0)
| eval booking_mq_timestamp_unix = strptime(booking_mq_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval mq_latency = round(booking_mq_timestamp_unix - booking_timestamp_unix, 0)
| eval booking_ep_timestamp_unix = strptime(booking_ep_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval ep_latency = round(booking_ep_timestamp_unix - booking_mq_timestamp_unix, 0)
| eval mq_to_sss_update_latency = round(catchup_unix_time - booking_mq_timestamp_unix, 0)
| table trade_id, portfolio_name, portfolio_entity, sss_id, event_id, booking_timestamp, booking_mq_timestamp, booking_ep_timestamp, mq_latency, ep_latency, catchup_updated_time, www_to_sss_latency, mq_to_sss_update_latency, mx_status, operation, action,
| dedup sss_id
| sort booking_timestamp

It gives me this table but as I cant show all the tables row ill show the relevant ones

trade_id	sss_id	booking_timestamp	booking_mq_timestamp	booking_ep_timestamp	mq_latency	ep_latency	catchup_updated_time	www_to_sss_latency	mq_to_sss_update_latency
abc 123	597616519	2024/06/15 09:22:37	2024/06/15 09:24:16	2024/06/15 09:24:16	99	0	2024/06/15 09:24:26	109	10
abc 341	597616518
abc 218931	597616517
abc 1201	597614937	2024/06/15 07:50:14	2024/06/15 07:51:12	2024/06/15 07:51:12	58	0	2024/06/15 07:51:19	65	7
abcc 219	597614936
abc 219	597614935

just assume the booking_timestamp, booking_mq_timestamp, booking_ep_timestamp, mq_latency, ep_latency are all filled

Ok but since my catchup_updated_time is taking from a log entry its populated (eg. 2024-06-15 10:57:03,114 [Www] - Done incremental update. Max SSSSSS TradeID: 597618769), but the rest of the rows/columns are not populated.

I want to highlight this specific row since its taking from logs and also fill the empty catchup_updated_time such that 597616518 and 597616517 should take the catchup_updated_time, latency etc of 597616519 as their id is before and 597616519 is the max id taken from logs its row should be highlighted. hence anything before or smaller than 597616519 should have same catchup_updated_time. However not until 597614937 as it already has a catchup_updated_time taken from logs. So same for the rest of the rows. Is this complicateD?

Please let me know if you need more info

yuanliu · ‎06-14-2024

Your question would be much easier to understand if you skip the complex SPL, first give sample data (anonymize as needed), illustrate desired output, then explain the logic between illustrated data sample and desired output without SPL. To diagnose your attempted SPL, you also illustrate the actual output from that SPL, and explain how actual output is different from desired output if that is not painfully obvious. (Remember: What is "obvious" to you is not always obvious to volunteers who lack intimate knowledge about your dataset and your use case.)

As a side note, the illustrated SPL implies that your sourcetype=sss_trade_www_timestamp contains snippets like "trade_id=foo", "mx_status=bar", and so on. If so, Splunk would have extracted trade_id, status, etc. without your rex. Is there any reason Splunk is not giving you those?

wm · ‎06-16-2024

ah okay thanks gotcha.

Current output is like the table i shown and desired output is the table after that.

catchup_updated_time is not populated for some as its taken from another sourcetype which is www_cattchup_Logs

Please let me know if futher info is required

ITWhisperer · ‎06-16-2024

Will filldown do the trick? E.g.

| filldown catchup_updated_time

wm · ‎09-04-2024

hi @ITWhisperer , it works but my other columns which are a calculation of that column don't get populated

| eval distributor_to_abc_latency = catchup_unix_time - CIMsendingTime_unix

since the column was empty and was fillled using filldown the other columns dont get filled

ITWhisperer · ‎09-04-2024

Move the filldown to before the calculations (Splunk is not Excel (or other spreadsheet applications) - the calculations are not dynamic formulae held in cells!)

wm · ‎09-05-2024

Noted thanks @ITWhisperer

However looks like its not working as expected
This is before filldown

This is after filldown

Why is it not populating 2024/09/04 07:54:20.445 from the rows below instead it is filling with 2024/09/04 07:54:52.137

ITWhisperer · ‎09-05-2024

Please share your search / SPL, preferably in a code block, not a picture

wm · ‎09-05-2024

index=abc sourcetype=abc_trade_wss_timestamp 
| rex field=_raw "trade_id=\"(?<trade_id>\X+)\", event_id" 
| rex field=_raw "mx_status=\"(?<status>\X+)\", operation" 
| rex field=_raw "operation=\"(?<operation>\X+)\", action" 
| rex field=_raw " action=\"(?<action>\X+)\", tradebooking_sgp" 
| rex field=_raw " eventtime_sgp=\"(?<distributor_timestamp>\X+)\", sky_to_mq" 
| rex field=_raw "distributor_latency=\"(?<distributor_latency>[^\"]+)\".*\bportfolio_name=\"(?<portfolio_name>[^\"]+)\".*\bportfolio_entity=\"(?<portfolio_entity>[^\"]+)\".*\btrade_type=\"(?<trade_type>[^\"]+)"

| join event_id 
    [ search index=sky sourcetype=Sky_WSS_EP_Logs "Successfully processed event" 
    | rex field=_raw "INFO: (?<ep_timestamp>\d{8} \d{2}:\d{2}:\d{2}.\d{3})" 
    | rex field=_raw "Successfully processed event: (?<event_id>\X+), action" 
    | eval ep_timestamp = strftime(strptime(ep_timestamp."+0800", "%Y%d%m %H:%M:%S.%N%z"), "%Y-%m-%d %H:%M:%S.%3N")]
    
| join type=left sky_id 
    [ search index=sky "New trades in amendment" "*pnl*"
    | rex "Trade Events (?<trades>.*)"
    | rex max_match=0 field=trades "(?<both_id>\d+:\d+)"
    | mvexpand both_id
    | rex field=both_id ":(?<sky_id>\d+)" 
    | rex max_match=1 field=_raw "(?<booking_pnl_timestamp>\d{4}+-\d{2}+-\d{2} \d{2}:\d{2}:\d{2},\d{3})"] 

| rex field=tradebooking_sgp "(?<booking_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
| rex field=distributor_timestamp "(?<distributor_timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"


| eval booking_pnl_timestamp = booking_pnl_timestamp."+0800" 

| eval ep_latency = strptime(ep_timestamp, "%Y-%m-%d %H:%M:%S.%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 
| eval pnl_latency = strptime(booking_pnl_timestamp, "%Y-%m-%d %H:%M:%S,%N%z") - strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S.%N%z") 

| search trade_id = "*" 
| search sky_id = "*" 
| search event_id = "*" 
| search action = "*" 
| search mx_status = "live" 
| search operation = "*" 


| table trade_id, sky_id, event_id, booking_timestamp, distributor_timestamp, ep_timestamp, mx_status, operation, action, distributor_latency, ep_latency, portfolio_name, portfolio_entity
| join type=left sky_id
       [ search index=sky sourcetype=sky_cashfx_catchup_logs "[WSS] - Trade Store has been updated"
| rex field=_raw "Max Skylib TradeID: (?<sky_id>\d+)"
| rex field=_raw "^(?<catchup_updated_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
| eval catchup_updated_time = strftime(strptime(catchup_updated_time, "%Y-%m-%d %H:%M:%S,%3N"), "%Y/%m/%d %H:%M:%S.%3N")
| dedup sky_id sortby +_time 
| table sky_id, catchup_updated_time, _raw, 
]
| eval booking_timestamp_unix = strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S") 
```| eval wss_to_sky_latency = catchup_unix_time - booking_timestamp_unix```



| eval mq_to_sky_update_latency = catchup_unix_time - distributor_timestamp_unix
| eval ep_timestamp = strftime(strptime(ep_timestamp, "%Y-%m-%d %H:%M:%S.%3N"), "%Y/%m/%d %H:%M:%S.%3N")
| eval distributor_timestamp = strftime(strptime(distributor_timestamp, "%Y/%m/%d %H:%M:%S.%4N"), "%Y/%m/%d %H:%M:%S.%4N")

| table trade_id, portfolio_name, portfolio_entity, sky_id, event_id, booking_timestamp, booking_timestamp_unix, distributor_timestamp, distributor_timestamp_unix, ep_timestamp, distributor_latency, ep_latency, catchup_updated_time, wss_to_sky_latency, mq_to_sky_update_latency, mx_status, operation, action, catchup_unix_time
| rex field=trade_id "^\w+ (?<dealnumber>\d+)$"
| join type=left dealnumber
        [ search index=wss "Sending message" source="/proj/flowfx/wss/FFXWS01P/log/MQ1.log"
    ```Exclude Far Legs of Swap Trades for first Iteration of Dash```
    NOT "<swap_leg>2</swap_leg>" 
    ```Exclude Cancels, Amends, Auxiliaries, Allocations, Blocks - allocated ```
    NOT "<status>"
    ```Exclude MM Deals ```
    NOT "<WSSMMTRADE>" 
| rex "\<transaction\>(?P<tid>.*?)\<\/transaction\>" 
| rex "\<deal_number\>(?P<dealnumber>.*?)\<\/deal_number\>" 
| rex "\<external_deal\>(?P<sourcesystemid>.*?)\<\/external_deal\>" 
| rex "\<cust_type\>(?P<custType>.*?)\<\/cust_type\>" 
| eval region=case(host == "pffxa01z", "Global", host == "pffxa02z", "China") 
| eval wssSendingTime=strftime(_time,"%Y-%m-%d %H:%M:%S.%Q") 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
| join type=left tid 
    [ search index=wss source="/proj/flowfx/ffx/log/flowfx-processor.log" "INFO SLA FFX-Processor received" 
    | rex "transactionId\=(?P<tid>.*?)\," 
    | eval flowfxincomingtime=strftime(_time,"%Y-%m-%d %H:%M:%S.%Q") 
    | table flowfxincomingtime,tid, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
        ] 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime,flowfxincomingtime 
| eval wssSendingTimeUnix=strptime(wssSendingTime,"%Y-%m-%d %H:%M:%S.%Q") 
| eval flowfxincomingtimeUnix=strptime(flowfxincomingtime,"%Y-%m-%d %H:%M:%S.%Q") 
| eval timebetweenWssFlowfx = flowfxincomingtimeUnix - wssSendingTimeUnix 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime,flowfxincomingtime,timebetweenWssFlowfx,wssSendingTimeUnix,flowfxincomingtimeUnix, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
| join type=left tid 
    [ search index=wss source="/proj/flowfx/ffx/log/flowfx-trade-sender-cim.log" "INFO SLA FFX-Trade-Sender sent" 
    | rex "nearTransactionId\=(?P<tid>.*?)\," 
    | eval CIMsendingTime=strftime(_time,"%Y/%m/%d %H:%M:%S.%Q") 
    | eval MQ_available_time=strftime(_time - 7200, "%Y-%m-%d %H:%M:%S.%Q") 
    | table CIMsendingTime,tid,MQ_available_time,booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix
        ] 
| table tid,dealnumber,region,custType,sourcesystemid,wssSendingTime,flowfxincomingtime,timebetweenWssFlowfx,wssSendingTimeUnix,flowfxincomingtimeUnix,CIMsendingTime, MQ_available_time, booking_timestamp, booking_timestamp_unix, distributor_timestamp_unix, catchup_unix_time
]
| table trade_id, portfolio_name, sky_id, booking_timestamp,CIMsendingTime, distributor_timestamp, ep_timestamp, catchup_updated_time, cim_latency, distributor_latency, ep_latency, catchup_latency, wss_to_sky_latency, distributor_to_sky_latency
|sort - sky_id
| join type=left sky_id
 [ search index=sky sourcetype=sky_webservices_logs source="D:\\SkyNet\\SkyWebService\\logs\\live-risk-stomp-broadcast.log" "maxskyid"
| where maxskyid > 0
| dedup maxskyid
| rename maxskyid as sky_id
| eval sky_ui_timestamp=strftime(_time, "%Y/%m/%d %H:%M:%S.%3N")
| table sky_id host sky_ui_timestamp
        ]
| sort -sky_id
| filldown catchup_updated_time, sky_ui_timestamp
| eval mq_to_sky_update_latency = round(mq_to_sky_update_latency * 1000,0)
| eval sky_ui_unix_time = strptime(sky_ui_timestamp, "%Y/%m/%d %H:%M:%S.%3N") 
| eval catchup_unix_time = strptime(catchup_updated_time, "%Y/%m/%d %H:%M:%S.%3N") 
| eval booking_timestamp_unix = strptime(booking_timestamp, "%Y/%m/%d %H:%M:%S") 
| eval wss_to_sky_latency = sky_ui_unix_time - booking_timestamp_unix
| eval wss_to_sky_latency = round(wss_to_sky_latency * 1000,0)
| eval CIMsendingTime_unix = strptime(CIMsendingTime, "%Y/%m/%d %H:%M:%S.%3Q") 
| eval distributor_to_sky_latency = sky_ui_unix_time - CIMsendingTime_unix
| eval distributor_to_sky_latency = round(distributor_to_sky_latency * 1000,0)
| eval cim_latency = CIMsendingTime_unix - booking_timestamp_unix
| eval cim_latency = round(cim_latency * 1000,0)
| eval distributor_timestamp_unix = strptime(distributor_timestamp, "%Y/%m/%d %H:%M:%S.%3N") 
| eval distributor_latency = distributor_timestamp_unix - CIMsendingTime_unix
| eval distributor_latency = round(distributor_latency * 1000,0)
| eval ep_timestamp_unix = strptime(ep_timestamp, "%Y/%m/%d %H:%M:%S.%3N") 
| eval ep_latency = ep_timestamp_unix - distributor_timestamp_unix
| eval ep_latency = round(ep_latency * 1000,0)
| eval catchup_latency = catchup_unix_time - ep_timestamp_unix
| eval catchup_latency = round(catchup_latency * 1000,0)
| eval ui_latency = sky_ui_unix_time - catchup_unix_time
| eval ui_latency = round(ui_latency * 1000,0)
| table trade_id, portfolio_name, sky_id, booking_timestamp,CIMsendingTime, distributor_timestamp, ep_timestamp, catchup_updated_time, sky_ui_timestamp, cim_latency, distributor_latency, ep_latency, catchup_latency, ui_latency, wss_to_sky_latency, distributor_to_sky_latency
| dedup sky_id
| search portfolio_name = $portfolio$

| where len(CIMsendingTime) > 0

ITWhisperer · ‎09-05-2024

There is no guarantee that the first event for each sky_id has a value in catchup_updated_time, so the filldown can be pulling any value from the previous sky_id down. When the dedup is done, only the first event for each sky_id is kept (which could have the wrong catchup_updated_time. Try either

| sort -sky_id catchup_updated_time
| filldown catchup_updated_time, sky_ui_timestamp

or

| sort -sky_id
| eventstats values(catchup_updated_time) as catchup_updated_time, values(sky_ui_timestamp) as sky_ui_timestamp by sky_id

wm · ‎09-05-2024

Hi thanks for helping!

I got this still

that first entry for 10:03:16... shouldnt be the cause and should be 10:02:43

wm · ‎09-05-2024

FYI this was before the | filldown catchup_updated_time sky_ui_timestamp

ITWhisperer · ‎09-05-2024

So what happens if you use eventstats instead of filldown?

wm · ‎09-06-2024

nothing happens when i use
it doesnt fill the rows below when i use that. issue is happening for both catchup_updated_time and sky_ui_timestamp

| sort -sky_id
| eventstats values(catchup_updated_time) as catchup_updated_time, values(sky_ui_timestamp) as sky_ui_timestamp by sky_id

ITWhisperer · ‎09-06-2024

Where would 10:02:43 come from as all these sky_id's are different?

wm · ‎09-06-2024

it comes from this part

| join type=left sky_id
[ search index=sky sourcetype=sky_cashfx_catchup_logs ....
....
| table sky_id, catchup_updated_time, _raw
]

So yes once it has that part it should filldown everything below until a populated field of catchup_updated_time after sorting by sky_id descending. then once a populated field of catchup_updated_time is met it fills down until another populated field, same for sky_ui_timestamp. This is working but randomly not

ITWhisperer · ‎09-06-2024

Try both

| eventstats values(catchup_updated_time) as catchup_updated_time, values(sky_ui_timestamp) as sky_ui_timestamp by sky_id
| sort -sky_id catchup_updated_time
| filldown catchup_updated_time, sky_ui_timestamp

wm · ‎09-08-2024

Thank you. On that note how can I highlight in red or something that specific timestamp that was used to fill down the rest of the rows below etc. i.e a way to differentiate it from the rest which was filled using filldown?

ITWhisperer · ‎09-08-2024

That a whole different ball game! Are you using Studio or Classic Simple XML dashboards?

wm · ‎09-08-2024

Thanks all team i am using simple xml classic dasboard

ITWhisperer · ‎09-08-2024

This can be done with CSS which you can define in your SimpleXML in a html panel. Essentially, what you need to do is to turn columns into a multi-value fields with an indicator for the format option to pick up and change the background colour.

For the indicator in your case, you could check whether the previous date is different i.e. it is the first in a series of copied dates. You can use streamstats to do this, something like this

| streamstats values(catchup_updated_time) as previous_catchup_updated_time window=1 current=f
| eval catchup_updated_time=if(isnull(previous_catchup_updated_time) OR catchup_updated_time != previous_catchup_updated_time, mvappend(catchup_updated_time,"RED"), catchup_updated_time)

You then use CSS to hide (display: none) the second value in the multi-value field. See this answer (and similar) for clues https://community.splunk.com/t5/Splunk-Search/How-to-color-the-columns-based-on-previous-column-valu...

Splunk Search Query to fill empty columns of data with data from other rows

eval

join

rex

search job inspector

stats

subsearch

table

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Automating Threat Operations and Threat Hunting with Recorded Future

Join the Conversation

Splunk Search Query to fill empty columns of data with data from other rows

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.