Splunk Search

How to get compare latest values across an index with a lookup and display the latest?

hummingbird81
Explorer

Hi All, looking for some advice as in how to take the latest values from 2 datasets .  We have a base search that pulls user details like name, start_date, end_date, title, location etc from an index =okta. 

namestart_dateend_datetitleuser_id
John Smith2021-06-28T23:59:59.00+05:302025-06-28T23:59:59.00+05:30Consultant001


The above index has the most current data of a user.

 Next we have another master lookup file (identities.csv)  where we maintain all user details from past few years.   This master lookup also contains same fields as the above index.  For example:

namestart_dateend_datetitleuser_id
John Smith2021-06-28T23:59:59.00+05:302022-06-28T23:59:59.00+05:30Administrator001


Notice the end _date and title are different in the lookup.

Below is our current search that compares the 2 datasets.  We want it to update the date fields or any other field  whichever is the latest  but at the moment it does NOT update the fields even if any field like end_date or title is modified under index.  

index=okta 
stats latest(_time) as _time,  values(profile.title) as title, values(profile.email) as email, values(profile.startDate) as start_Date,values(profile.endDate) as end_Date, values(profile.Name) as Name by user_id
| append [|inputlookup identities.csv]
|stats latest(_time) as _time,  latest(profile.title) as title, latest(profile.email) as email, latest(profile.startDate) as start_Date,latest(profile.endDate) as end_Date, latest(profile.Name) as Name by user_id
| table Name title start_date end_date user_id

 Running the above query still shows the old info which has the old end_date and title  even though i am using |stats latest()  .   Pls advise how to retrieve the latest be it date format or be it string format which is "title"

namestart_dateend_datetitle
John Smith2021-06-28T23:59:59.00+05:302022-06-28T23:59:59.00+05:30Administrator
Labels (3)
0 Karma
1 Solution

kiran_panchavat
Champion

@hummingbird81 

I tested this using makeresults with dummy data. Copy and paste this query into your Splunk search bar to run it. It doesn’t depend on your actual index or CSV, so it's safe for testing.

Dummy data:- 

| makeresults 
| eval _time=strptime("2025-03-01T12:00:00.00+05:30", "%Y-%m-%dT%H:%M:%S.%2Q%z"), 
       user_id="001", 
       Name="John Smith", 
       title="Consultant", 
       email="john.smith@example.com", 
       start_Date="2021-06-28T23:59:59.00+05:30", 
       end_Date="2025-06-28T23:59:59.00+05:30", 
       source="okta", 
       mod_time=_time
| fields user_id, Name, title, email, start_Date, end_Date, mod_time, source
| append [
    | makeresults 
    | eval _time=strptime("2022-06-01T12:00:00.00+05:30", "%Y-%m-%dT%H:%M:%S.%2Q%z"), 
           user_id="001", 
           Name="John Smith", 
           title="Administrator", 
           email="john.smith@example.com", 
           start_Date="2021-06-28T23:59:59.00+05:30", 
           end_Date="2022-06-28T23:59:59.00+05:30", 
           source="csv", 
           mod_time=if(isnull(_time), strptime(end_Date, "%Y-%m-%dT%H:%M:%S.%2Q%z"), _time)
    | fields user_id, Name, title, email, start_Date, end_Date, mod_time, source
]
| sort 0 -mod_time
| dedup user_id
| table Name, title, start_Date, end_Date, user_id

 

kiran_panchavat_1-1741781556996.png

 

You can try this:-

index=okta 
| eval source="okta", mod_time=_time
| fields user_id, Name, title, email, start_Date, end_Date, mod_time, source
| append [ 
    | inputlookup identities.csv 
    | eval source="csv", mod_time=if(isnull(_time), strptime(end_Date, "%Y-%m-%dT%H:%M:%S.%2Q%z"), _time) 
    | fields user_id, Name, title, email, start_Date, end_Date, mod_time, source 
]
| sort 0 -mod_time  /* Sort by mod_time descending to prioritize latest */
| dedup user_id     /* Keep only the first (latest) record per user_id */
| table Name, title, start_Date, end_Date, user_id

 

Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!

View solution in original post

hummingbird81
Explorer

can you pls confirm why are you doing a " mod_time=if(isnull(_time), strptime(end_Date, "%Y-%m-%dT%H:%M:%S.%2Q%z"), _time)"  in the append portion ?  What is the need to do a strptime here ?

0 Karma

kiran_panchavat
Champion

@hummingbird81 

  • Why the if(): To handle cases where _time might be missing in the CSV, using end_Date as a fallback.
  • Why strptime: To convert end_Date from a string to a numeric epoch timestamp for proper sorting.
  • Why in append: Because the CSV’s timestamp situation is less certain than Okta’s
Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!
0 Karma

hummingbird81
Explorer

Thank you so much. Let me test it out in detail but it looks promising.

kiran_panchavat
Champion

@hummingbird81 Sounds good! Please test it and let me know. If everything works fine, kindly accept the solution.

Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!
0 Karma

kiran_panchavat
Champion

@hummingbird81 

I tested this using makeresults with dummy data. Copy and paste this query into your Splunk search bar to run it. It doesn’t depend on your actual index or CSV, so it's safe for testing.

Dummy data:- 

| makeresults 
| eval _time=strptime("2025-03-01T12:00:00.00+05:30", "%Y-%m-%dT%H:%M:%S.%2Q%z"), 
       user_id="001", 
       Name="John Smith", 
       title="Consultant", 
       email="john.smith@example.com", 
       start_Date="2021-06-28T23:59:59.00+05:30", 
       end_Date="2025-06-28T23:59:59.00+05:30", 
       source="okta", 
       mod_time=_time
| fields user_id, Name, title, email, start_Date, end_Date, mod_time, source
| append [
    | makeresults 
    | eval _time=strptime("2022-06-01T12:00:00.00+05:30", "%Y-%m-%dT%H:%M:%S.%2Q%z"), 
           user_id="001", 
           Name="John Smith", 
           title="Administrator", 
           email="john.smith@example.com", 
           start_Date="2021-06-28T23:59:59.00+05:30", 
           end_Date="2022-06-28T23:59:59.00+05:30", 
           source="csv", 
           mod_time=if(isnull(_time), strptime(end_Date, "%Y-%m-%dT%H:%M:%S.%2Q%z"), _time)
    | fields user_id, Name, title, email, start_Date, end_Date, mod_time, source
]
| sort 0 -mod_time
| dedup user_id
| table Name, title, start_Date, end_Date, user_id

 

kiran_panchavat_1-1741781556996.png

 

You can try this:-

index=okta 
| eval source="okta", mod_time=_time
| fields user_id, Name, title, email, start_Date, end_Date, mod_time, source
| append [ 
    | inputlookup identities.csv 
    | eval source="csv", mod_time=if(isnull(_time), strptime(end_Date, "%Y-%m-%dT%H:%M:%S.%2Q%z"), _time) 
    | fields user_id, Name, title, email, start_Date, end_Date, mod_time, source 
]
| sort 0 -mod_time  /* Sort by mod_time descending to prioritize latest */
| dedup user_id     /* Keep only the first (latest) record per user_id */
| table Name, title, start_Date, end_Date, user_id

 

Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...