I'm building out a search to look through email logs. The main search is fine, but I'd like to add fields showing when an email domain was first seen on our network, whether that be yesterday or three years ago.
I was initially considering some kind of subsearch, but I'm not sure how much that would impact my search time, since it would scan several years of data every time I run the search. The only fields I care about are the domain field and the _time field, so I could cut out the rest, but I don't think that would be enough.
In this instance, would it be better to set up an accelerated data model instead and have it update at an interval (once every four hours, maybe)? Or some kind of lookup table, perhaps? I also considered summary indexing, but I don't know enough about the specifics of that feature to draw any conclusions.
Just looking to see what my best option is. I plan to pass this search to SOC analysts to help them search through email, so it will be run frequently.
I ended up choosing a little of all options! Created an accelerated data model to gather the data then used tstats in a report to actually process the data. That report uses outputlookup to create a lookup file from the report and is scheduled to run every 15 minutes and update the CSV with new data.
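For anyone wanting to try the same pattern, a scheduled report along these lines might look roughly like the sketch below. This is not the exact search from the post: the data model name Email_Logs, its root dataset Mail, and the lookup file name domain_first_seen.csv are all placeholders assumed for illustration, and it assumes domain is a field in the data model.

```spl
| tstats summariesonly=true min(_time) as first_seen from datamodel=Email_Logs.Mail by Mail.domain
| rename Mail.domain as domain
| inputlookup append=true domain_first_seen.csv
| stats min(first_seen) as first_seen by domain
| outputlookup domain_first_seen.csv
```

The idea is that each scheduled run only needs a short time window (a bit longer than the 15-minute schedule), because the inputlookup append=true step merges the existing CSV back in and the final stats keeps the oldest timestamp per domain. The lookup file has to exist before the merge step will work, so the very first run would need to cover all time with the inputlookup line left out.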
I think the easiest approach would be to use a summary index or a lookup table.
E.g., for a lookup table:
index=email
| stats earliest(_time) as first_seen by domain
| outputlookup domain_first_seen.csv
Then, in your main search, add the lookup command (or use appendcols/join if you are using a summary index):
index=email
| lookup domain_first_seen.csv domain OUTPUT first_seen
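If you go the summary index route instead, a rough sketch could look like the following. The index name email_domain_summary is a made-up placeholder (the index has to be created first), and the daily window is just one possible schedule. First, a scheduled search that writes each day's earliest sighting per domain into the summary index:

```spl
index=email earliest=-1d@d latest=@d
| stats min(_time) as first_seen by domain
| collect index=email_domain_summary
```

Then the main search can pull the overall first-seen time back in with a join:

```spl
index=email
| join type=left domain
    [ search index=email_domain_summary earliest=0
      | stats min(first_seen) as first_seen by domain ]
```

Keep in mind that join runs a subsearch, so the usual subsearch result and runtime limits apply. With a large number of domains, the lookup approach is likely to hold up better.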
Regards,
Prewin
Building a lookup is perhaps the easiest. Assuming domain is already extracted, you could do:
sourcetype = mailstuff domain=* earliest=0
| stats min(_time) as first_seen by domain
| outputlookup DomainFirstAppeared
Of course, you need to define the lookup DomainFirstAppeared first.
After this, you can add the first_seen field to any search using the lookup command, like this:
sourcetype = mailstuff
| lookup DomainFirstAppeared domain
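Since first_seen is stored as an epoch timestamp, analysts will probably want it in a readable form, and possibly an age they can filter on. One way to do that, as a sketch (the field names first_seen_readable and domain_age_days are just illustrative):

```spl
sourcetype=mailstuff
| lookup DomainFirstAppeared domain OUTPUT first_seen
| eval domain_age_days=round((now() - first_seen) / 86400, 1)
| eval first_seen_readable=strftime(first_seen, "%Y-%m-%d %H:%M:%S")
```

With that in place, something like | where domain_age_days < 7 would surface mail from domains first seen in the last week.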