Hi,
I have a file containing hostnames, and I need to find the servers that were newly added to it. When I use the set diff command, it shows the difference between today's log and the previous week's log. The problem is that the difference may contain either newly added servers or decommissioned servers. How can I find only the newly added servers?
Thanks in advance.
One way is to use metasearch. This checks all indexes for all hosts and finds the earliest and latest times that each host appears on any index, and tells you which indexes it is on. I assumed that you were looking for new hosts overall, not necessarily when a host first reported to a specific index. In the search below, the threshold is 86400*30 seconds, i.e. hosts that first reported in the last 30 days.
| metasearch index=* host=*
| stats earliest(_time) as firsttime, latest(_time) as lasttime, values(index) as index by host
| addinfo
| eval testtime=info_search_time-86400*30
| where firsttime>testtime
| eval firstseen=strftime(firsttime,"%Y-%m-%d %H:%M:%S"),lastseen=strftime(lasttime,"%Y-%m-%d %H:%M:%S"),testseen=strftime(testtime,"%Y-%m-%d %H:%M:%S")
| table host index firstseen lastseen testseen
A second way, if you have weekly lists, is to use a left join from the current list to the prior list. When the prior list does not return anything, then the host is presumably new.
(this week's list)
| table host foo bar TheDate
| join type=left host [|inputlookup hostlist | table host foo bar TheDate | rename TheDate as PriorDate]
| where isnull(PriorDate)
The second method could also be used to feed the list of new hosts into a map command, to do more extensive reporting on them.
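As a rough sketch of that idea (not tested against your data), the search below pipes the new-host rows into map, which runs one follow-up search per row. The follow-up search, its 7-day window, and maxsearches=25 are assumptions for illustration only; swap in whatever report you need.

```
(this week's list)
| table host foo bar TheDate
| join type=left host [| inputlookup hostlist | table host foo bar TheDate | rename TheDate as PriorDate]
| where isnull(PriorDate)
| map maxsearches=25 search="search index=* host=\"$host$\" earliest=-7d | stats count by host, index, sourcetype"
```

Keep in mind that map stops after maxsearches rows (the default is 10), so raise it if you expect more new hosts than that.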
One approach is to use a lookup to hold the state of servers you have seen before. I'm going to build up to the solution, so read along carefully. Let's imagine you can do this:
index=myindex | stats min(_time) as oldest max(_time) as newest by host
If you run this over a (say) 7 day period, then you can get an idea of hosts that are both "new" and "missing" in that time period. Hosts will be in one of three states:
This is good but not optimal. So let's make a slightly different variant:
index=myindex | stats min(_time) as oldest, max(_time) as newest by host
| outputlookup myindex_host_status.csv
This doesn't provide any new logic, but merely persists the data made by the search out to a lookup file. Now, let's use that lookup file to provide context in a slightly more complex search:
index=myindex | stats min(_time) as oldest, max(_time) as newest by host
| inputlookup append=true myindex_host_status.csv
| stats min(oldest) as oldest, max(newest) as newest by host
| outputlookup myindex_host_status.csv
We can now take this search and run it every day over the past 24 hours, or every hour over the past hour, or whatever suits. It winds up keeping for us, over an indefinitely long period of time, the first timestamp and last timestamp for each host. It keeps this even if the original data ages off. The scheduled maintenance search runs and maintains the lookup holding state for us. Now, we can use that state:
| inputlookup myindex_host_status.csv | where oldest > now() - (86400 * 3)
Giving us a list of hosts that first sent data within the past three days. Or
| inputlookup myindex_host_status.csv | where newest < now() - (86400 * 7)
Giving us a list of every host that has not sent any new data in the past 7 days.
The trick here is that we're using lookups to hold the long-term state, and taking advantage of how Splunk stores _time as an integer value that increases as time goes on. Every day is exactly 86,400 seconds, and bigger numbers are later times. Simple mathematical functions like min() and max() work to compute earliest and latest times.
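To make the arithmetic visible, here is a small sketch that reads the same lookup, computes the three-day cutoff in epoch seconds, and renders the stored values as readable timestamps. The field names cutoff, first_seen, last_seen and is_new are made up for illustration.

```
| inputlookup myindex_host_status.csv
| eval cutoff=now() - (86400 * 3)
| eval first_seen=strftime(oldest, "%Y-%m-%d %H:%M:%S"), last_seen=strftime(newest, "%Y-%m-%d %H:%M:%S")
| eval is_new=if(oldest > cutoff, "yes", "no")
| table host first_seen last_seen is_new
```

Because oldest and cutoff are both plain numbers of seconds, the comparison is an ordinary numeric one; strftime is only there for display.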
Thanks for your time and help. But here the hosts do not send data to Splunk themselves. The Remedy tool sends data to Splunk, and hostname is a field in that data. I now need to find the newly added hostnames and the decommissioned or deleted hostnames from the Remedy log.
You should be able to adapt this to work in the exact same way.
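For example, assuming the Remedy events land in an index called remedy (a placeholder, use your own) and carry the field hostname, the maintenance search might look like the sketch below. The lookup name remedy_host_status.csv is hypothetical as well.

```
index=remedy
| stats min(_time) as oldest, max(_time) as newest by hostname
| inputlookup append=true remedy_host_status.csv
| stats min(oldest) as oldest, max(newest) as newest by hostname
| outputlookup remedy_host_status.csv
```

The lookup file has to exist before this runs, so create it once with just the first two lines followed by the outputlookup. After that, the lookup can be queried for hostnames first seen in the past week (change the where clause to newest < now() - (86400 * 7) for hostnames that have stopped appearing):

```
| inputlookup remedy_host_status.csv
| where oldest > now() - (86400 * 7)
```

This only makes sense if Remedy keeps reporting every live hostname regularly; if a hostname is logged only once, "newest" will not tell you much about decommissioning.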
Have you tried using set diff between the last two weeks' data and last week's data?
Yes, I have tried it. I mentioned that in my question.
I meant a slightly different diff from the one you mentioned: a diff between the last TWO weeks and just the previous week. It is akin to (A U B) - B in set notation, where A = the current week's hosts, B = the previous week's hosts, and A U B = hosts seen in the last two weeks. (A U B) - B = hosts in A that weren't present in B. Because B is entirely contained in A U B, the diff cannot return decommissioned hosts, only new ones. Hope this works out for you.
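A sketch of that in SPL, using the index=myindex and host names from the earlier answer (substitute your Remedy index and hostname field). The day-aligned time windows are an assumption. Each subsearch is reduced to the single host field, since set compares whole result rows.

```
| set diff
    [search index=myindex earliest=-14d@d latest=@d | stats count by host | fields host]
    [search index=myindex earliest=-14d@d latest=-7d@d | stats count by host | fields host]
```

The first subsearch is the two-week list and the second is the earlier week on its own, so what remains should be the hosts that appear only in the most recent seven days. Subsearch result limits apply, so this suits modest host counts.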