Getting Data In

How can I update a lookup from a scheduled search by appending new data or editing existing rows, rather than replacing the entire lookup?

Cuyose
Builder

I can't seem to find this scenario documented anywhere, which is odd. Basically, I want to maintain a list of usernames: run an initial search over a time frame to create the lookup, then run a scheduled search that updates it, either by appending new rows or editing existing rows, using one column (userid) as the primary key.

I DO NOT WANT the entire CSV replaced, or duplicate userids, which seems to be what all the documentation describes. Does Splunk support this?


MuS
SplunkTrust

Hi Cuyose,

Yes, you can do such a thing in Splunk. Here is an example that adds all hosts found within the last hour to a lookup file:

index=_internal earliest=-1h 
| eval server_name=host 
| table server_name 
| append [| inputcsv server_down ] 
| dedup server_name 
| outputcsv server_down

The second example uses the same lookup file and removes every host that appears both in the events from the last hour and in the lookup file:

index=_internal earliest=-1h 
| eval server_name=host 
| table server_name 
| append [| inputcsv server_down ] 
| stats count by server_name 
| eval server_name=case(count = 2,null() , count=1,server_name) 
| table server_name 
| outputcsv server_down

To test and verify this, run the first search with the second line replaced by | eval server_name="foo" to add a dummy host.
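For reference, the test variant of the first search would look like this (the host name "foo" is just a placeholder for a dummy entry):

index=_internal earliest=-1h 
| eval server_name="foo" 
| table server_name 
| append [| inputcsv server_down ] 
| dedup server_name 
| outputcsv server_down

Running this once should add "foo" to server_down.csv; running the second search with the same | eval line afterwards should remove it again, since "foo" then appears in both the events and the lookup file.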

Hope this helps ...

cheers, MuS


Cuyose
Builder

Since the subsearch will run over a much more limited time frame, I should be good.

The issue I am currently having is that I can't write the UPDATED value back after doing latest(UPDATED) to filter out the duplicates.


Cuyose
Builder

I took a slightly different route. I think this should work; I'm testing it now, but would welcome any input, since it seems like it could be optimized.

abc.csv is generated by:

base search
| stats  latest(_time) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| convert ctime(UPDATED)
| table  UPDATED LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR

Update (scheduled search)

| inputlookup abc.csv
| append [ search {base search} 
    | stats latest(_time) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
    | convert ctime(UPDATED) 
    | table UPDATED LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR ]
|stats latest(UPDATED) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR

So basically this runs a subsearch and appends all of its rows to the current lookup CSV, which has an UPDATED column. It then keeps only the latest UPDATED value per key combination to get rid of the duplicates.
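One thing worth noting: as written, the scheduled search never writes the merged result back to the lookup file. A sketch of the full scheduled search, assuming the lookup is named abc.csv as above, would end with outputlookup:

| inputlookup abc.csv 
| append [ search {base search} 
    | stats latest(_time) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
    | convert ctime(UPDATED) ] 
| stats latest(UPDATED) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| outputlookup abc.csv

Also be aware that after the stats/convert steps the rows no longer carry _time, so latest(UPDATED) here effectively keeps the last row seen (the appended subsearch rows, which come after the existing lookup rows) rather than comparing the ctime strings chronologically.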


MuS
SplunkTrust

Nice - just be aware of the subsearch limits: http://docs.splunk.com/Documentation/Splunk/6.4.2/Search/Aboutsubsearches#Subsearch_performance
But as long as you don't come close to those limits and are happy with the performance, use it 😉
