Splunk Search

Input lookup and metadata truncates the logs

diirn
Explorer

Hi All,

Can you please help me with my problem? I would like to check all the hosts in a CSV file, but the results are truncated because there are too many records. I have modified a search that some good soul provided in another post 🙂

Here is my search:

 

 

| inputlookup my_lookup_definition
| join type=left [metadata type=hosts]
| dedup host lastTime firstTime
| eval age = now()-lastTime
| convert ctime(lastTime)
| eval field_in_ddhhmmss=tostring((age) , "duration")
| rename field_in_ddhhmmss as "Time Offline" lastTime as "Last Time"
| sort + "lastTime"
| table host "Time Offline" "Last Time"

 

 



My main goal is to take all hosts from the CSV file and check which of them are still reporting to Splunk and which have stopped. The above search would do the trick, but the results are truncated 😞 Is there any other way to achieve my goal without modifying the config files?

 

 

 

[subsearch]: Subsearch produced 100000 results, truncating to maxout 50000.
[subsearch]: Metadata results may be incomplete: 100000 entries have been received from all peers (see parameter maxcount under the [metadata] stanza in limits.conf), and this search will not return metadata information for any more entries.

 

 

 

I would be very grateful for your assistance here.  

 

Kind regards,

Diirn

bowesmana
SplunkTrust

@diirn 

Using subsearches and joins is always challenging with large data sets. What is the number of hosts in the CSV?

What I suggest is first collecting the hosts with metadata, then appending the CSV and combining the two with stats, not join.

| metadata type=hosts 
| eval age = now()-lastTime 
| eval "Time Offline"=tostring(age, "duration") 
| table host lastTime "Time Offline"
| inputlookup append=t my_lookup_definition
| stats values(*) as * by host
| sort 0 + lastTime
| convert ctime(lastTime) 
| rename lastTime as "Last Time"

This will be faster than doing the join anyway.

Note that you can then do 

| where isnull('Last Time')

to just show those that have no last time.

I don't know what the contents of your inputlookup are, so you may need to adjust the search above to handle the data that is in the lookup. If it contains the previous last times, then you will have to do some testing to handle that, but I would need more info on the contents.

 

diirn
Explorer

Hi @bowesmana,

Thank you very much for your swift response and valuable feedback. There are around 20k hosts in this CSV file. I just need the hostnames from that file.

I have tried your suggestion and the results are also truncated. 

 

Kind regards,

Diirn


bowesmana
SplunkTrust

OK, your issue is that it's difficult to control the output from metadata. Even if you run metadata over the last 1 second, you will get more than just the hosts seen in that second. Therefore, you will most likely always exceed the 100,000-row output limit. I have had this problem in the past and found no simple solution.

You are better off approaching this using tstats, as that will honour the time picker settings. You will not get the 'first seen' values, but it looks like you're only after last seen, and you can always save last seen into your lookup to calculate age.

So, replace metadata type=hosts with

| tstats latest_time(host) as lastTime where index=* OR index=_* by host

and then use the rest of my search from after metadata.
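
Put together, the whole search could look something like the sketch below. It assumes the lookup exposes a field named host and has no lastTime field of its own; in_csv is a made-up helper field that is only there to keep the hosts that appear in the CSV. Adjust the names to your data.

```spl
| tstats latest_time(host) as lastTime where index=* OR index=_* by host
| inputlookup append=t my_lookup_definition
| eval in_csv = if(isnull(lastTime), 1, 0)
| stats max(lastTime) as lastTime max(in_csv) as in_csv by host
| where in_csv=1
| eval age = now() - lastTime
| eval "Time Offline" = tostring(age, "duration")
| sort 0 + lastTime
| convert ctime(lastTime)
| rename lastTime as "Last Time"
| table host "Time Offline" "Last Time"
```

Hosts from the CSV that were not seen in the search window end up with an empty "Last Time", so adding | where isnull('Last Time') narrows the list down to those.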

The lastTime will be empty for those hosts not seen in your time search window, but you could collect the last time regularly in a saved search that runs periodically and save that in the lookup, so you always have an updated last seen time.

You could then use that as part of the search.
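
As a rough sketch of that idea, a scheduled search along these lines could keep a running last seen time per host. The lookup file name host_last_seen.csv is hypothetical, and the file probably needs to exist (even if empty apart from the host and lastTime headers) before the first run.

```spl
| tstats latest_time(host) as lastTime where index=* OR index=_* by host
| inputlookup append=t host_last_seen.csv
| stats max(lastTime) as lastTime by host
| outputlookup host_last_seen.csv
```

The report itself could then start from your CSV and pick up the stored time with the lookup command, so no subsearch or join is involved. Again, this assumes the CSV has a field named host:

```spl
| inputlookup my_lookup_definition
| lookup host_last_seen.csv host OUTPUT lastTime
| eval age = now() - lastTime
| eval "Time Offline" = tostring(age, "duration")
| sort 0 + lastTime
| convert ctime(lastTime)
| rename lastTime as "Last Time"
| table host "Time Offline" "Last Time"
```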

 

diirn
Explorer

Thank you so much for your kind assistance! That query worked like a charm!
