Solved: Changing _time in a savedsearch

robertgiffin · ‎08-03-2023

I have a set of data that I upload into Splunk every morning as a .csv file because the tool doesn't feed that particular data automatically. It is a list of agents installed on assets. I use a savedsearch to query the data because I run stats with latest() on every field to make sure I get the latest record in the database; it's a pretty big query.

I am interested in forcing the data to look like it was just ingested every time it is queried (when the savedsearch is executed). I tried adding a field to the savedsearch called _time (also tried timestamp) and setting it to now(). That worked for the display, but my records are still going stale, so my assumption is that Splunk is using the original timestamp for these records (and I am ASSUMING I cannot change that).

The query works for the 24-hour timeframe, or if I set the timeframe to the current day (until the day passes). But if I shorten the timeframe to last 15 minutes, last 60 minutes, or last 4 hours, I get nothing from the query, which makes sense because that data was timestamped outside the range. But I need the timestamp to look like right now.

The data is timestamped when I upload it using the current date/time. I want the last upload to ALWAYS be the current set of records (including the time). It should work with any timeframe. In a perfect world I would like the timestamp to be when the query is executed, so it is always now(), making the data 'look' like it is fresh. Is there a way to do this?

yuanliu · ‎08-03-2023

Splunk is a time series database, so a correct _time is essential. Splunk has a number of ways to try to get time correct. Just like in real life, you are correct to assume that you cannot change history.

Can you explain what problem you are trying to solve here? What does a saved search have to do with needing to change history? If you are doing something to ingest data every morning, would earliest=-25h give you that set of data?
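
A minimal sketch of that suggestion, assuming the uploads land in an index called agent_uploads with sourcetype agent_csv and fields machine_name and agent_status (all hypothetical names, none of them from this thread). Time modifiers written into the search string take precedence over the time range picker, so the window stays at 25 hours even if the picker is set to the last 15 minutes:

```spl
index=agent_uploads sourcetype=agent_csv earliest=-25h latest=now
| stats latest(_time) as last_upload latest(agent_status) as agent_status by machine_name
```

This does not change _time on any event; it only fixes the window the search looks at, so the result still goes empty if an upload is missed for more than 25 hours.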

robertgiffin · ‎08-04-2023

A savedsearch isn't really the issue other than the fact that that's what I'm using. I kind of figured that was the case, no getting around the time thing. I'm accustomed to dealing with databases that aren't time based (unless I make the schema support it), like SQL databases.

In this case, I query assets from the M365 API, which is of course time based, and then my list of agents is JOINed to that by the machine name. The list of agents is what is uploaded every morning. If I forget to upload it, it of course goes stale and the agent side of the JOIN shows no agents (of course, because the data is 'stale').

If I were doing this in a SQL database, I would have a job that injects the data from the M365 API every so often and replaces the existing records with the current ones (or I could time base it myself with a timestamp so I could just query the 'latest'). The agent data would be injected every morning, again, and would be the current 'state', regardless of time, until I injected new data, because I would replace the agent data based on the ID. I would query that data set for the JOIN. That data set would not be time based, although I might save a timestamp with it.

Changing history, interesting take on this. What I am looking for is a static set of data that represents the current 'state' of things until I change it with a new upload, regardless of the time. I COULD use a lookup for this instead of a JOIN but I also have a dashboard where I show the status of these agents so I need to be able to just query the data like any other Splunk data. Plus, maintaining lookups is kind of a drag, especially if they change daily. I also can't do an outer JOIN type of thing on a lookup, I would have to execute that differently. I do that to list the machines that do not have an agent on them.

Hope this clarifies things. Thanks for your answer, it is what I expected. Things are working, just trying to bend the rules a bit. I didn't think it was actually possible, knowing how Splunk works, but I just wanted to verify for myself.

PickleRick · ‎08-04-2023

If you are uploading a list of somethings (in your case agents), as I understand, you want to have the list. That's the whole point, right?

Why not just use a lookup then (a csv-based lookup or kvstore one)?

If you don't want to fiddle with replacing the lookup contents manually, that's fine. You can upload your data as events and then have a scheduled search which will create such a lookup for you.
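
One way such a scheduled search could look, as a sketch only. The index agent_uploads, the sourcetype agent_csv, the fields machine_name, agent_version and agent_status, and the lookup file agent_inventory.csv are all hypothetical names, not taken from this thread:

```spl
index=agent_uploads sourcetype=agent_csv earliest=-25h
| stats latest(_time) as last_upload latest(agent_version) as agent_version latest(agent_status) as agent_status by machine_name
| outputlookup agent_inventory.csv
```

Scheduled shortly after the morning upload, this rewrites the lookup with the latest record per machine. A lookup has no time range of its own, so it stays the current state until the next run. One caveat: if an upload is skipped, the 25-hour window may find no events, and by default outputlookup replaces the file even when the search returns nothing (its override_if_empty option controls that behavior).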

robertgiffin · ‎08-04-2023

That's a great idea. I'll have to look into that. The reason I am not using a lookup is that I dashboard the agent status. But you CAN dashboard a lookup file. Thanks for the suggestion! This could solve a couple of issues I am having.
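
For the dashboard side, a sketch of querying a lookup directly, assuming a lookup file named agent_inventory.csv with fields machine_name and agent_status (hypothetical names):

```spl
| inputlookup agent_inventory.csv
| stats count by agent_status
```

The outer-JOIN case, listing machines that have no agent, might be handled by enriching the asset events from the lookup and keeping only the rows where nothing matched. The index m365_assets is again an assumed name:

```spl
index=m365_assets earliest=-24h
| stats latest(_time) as last_seen by machine_name
| lookup agent_inventory.csv machine_name OUTPUT agent_status
| where isnull(agent_status)
```

Because the lookup is not time based, neither search depends on when the agent list was uploaded; only the asset events are limited by the time range.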
