Is it possible to preserve sourcetype, host, and source when using the collect command?
We have an index with access logs from multiple hosts and systems with different sourcetypes.
When I try to add information from a dynamic lookup to events and save them in a summary index with the collect
command, I can't preserve the original source, sourcetype, and host, because the collect command's arguments take literal text values, not field values.
For example, search:
index=access sourcetype=*_type_access |
lookup xxx AS yyy |
collect index=enriched_access sourcetype=sourcetype
saves the results with sourcetype set to the literal string "sourcetype", not the original sourcetype.
When I try to rename sourcetype, the result is the same.
Where am I going wrong?
Hey!
Since I was searching for this topic/solution, I'll just add what I think is the right solution for this case.
To preserve the _time, host, source and sourcetype:
(...)
| collect index=main output_format=hec
------------
If this was helpful, some karma would be appreciated.
Copy the sourcetype value into another field, such as "origSourceType", and write that field to the summary index. You can then search the summary index by the origSourceType field.
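A minimal sketch of that idea, reusing the index and lookup names from the question. The field names origHost and origSource are hypothetical, added here for illustration alongside origSourceType. Whether fields created with eval actually end up in the summary index when the events still carry _raw is an assumption worth checking in your own environment before relying on it.

```spl
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| eval origSourceType=sourcetype, origHost=host, origSource=source
| collect index=enriched_access
```

If the copied fields are retained, a search such as `index=enriched_access origSourceType=*_type_access` should then find the events by their original sourcetype.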

Totally different approach: keep the lookup data in the lookup, enrich at search time, and skip indexing things twice through collect?
What you're doing feels quite wrong, considering collect would index _raw while the lookup is just adding fields - have you checked that those lookup output fields are actually retained in the second index?
That being said, https://answers.splunk.com/answers/88926/modify-raw-collect-into-second-index-how-to-best-retain-hos...
Since there are perhaps several sourcetypes, I would try the map command:
| metasearch index=access sourcetype=*_type_access | stats count by sourcetype | map [ search index=access sourcetype=$sourcetype$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$ ]
At least that works in theory; I haven't tested it. I used the metasearch command for speed, and the stats command is just there to get the unique list of sourcetypes. tstats might be a hair faster still, but I'm not spun up on that one /shrug. There are folks who are kinda anti-map, but it is a tool in the tool chest. For each result row from the initial search, map passes the sourcetype as a token to the enclosed search.
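For what it's worth, a tstats variant of the same idea might look like the sketch below. It is equally untested, and it assumes sourcetype can be grouped on directly with tstats, since it is an indexed field. The maxsearches=50 value is an arbitrary choice: map runs at most 10 searches by default, so with more than 10 distinct sourcetypes the limit has to be raised.

```spl
| tstats count where index=access sourcetype=*_type_access by sourcetype
| map maxsearches=50 search="search index=access sourcetype=$sourcetype$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$"
```

Keep in mind that this runs one full search per sourcetype, and that collecting with a sourcetype other than the default stash counts against license usage.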

I tried this out with "host=$host$" in my collect statement, and no dice.
Any other ideas?
