Is it possible to preserve sourcetype, host, and source when using the collect command?

gots
Path Finder

We have an index with access logs from multiple hosts and systems with different sourcetypes.
When I try to add information from a dynamic lookup to events and save them in a summary index with the collect command, I can't keep the original source, sourcetype, and host, because the collect command's arguments take literal text rather than field values.

For example, search:

 index=access sourcetype=*_type_access | 
 lookup xxx AS yyy |
 collect index=enriched_access sourcetype=sourcetype

saves results with the sourcetype set to the literal string "sourcetype", not the original sourcetype.
When I try renaming sourcetype, the result is the same.

Where am I going wrong?

glc_slash_it
Path Finder

Hey!

Since I was searching for this topic/solution, I'll just add what I think is the right solution for this case.

To preserve the _time, host, source and sourcetype:

(...)

| collect index=main output_format=hec
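
Applied to the search from the question, this might look like the sketch below. It assumes a Splunk version whose collect command supports the output_format argument, and it reuses the index and lookup names from the original post; I have not run it against that data.

```
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| collect index=enriched_access output_format=hec
```

Note that no sourcetype= argument is passed to collect here; the intent is for each event to carry its own metadata into the summary index.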

 


jvishwak
Path Finder

Copy the sourcetype value into another field, such as "origSourceType", and push that value into the summary index. In the summary index you can then search on the origSourceType field.
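
A rough, untested sketch of that idea, using the search from the question. The field names origSourceType and origHost are only illustrative. As far as I understand, collect writes the existing _raw as-is when events already have one, so this sketch appends the values to _raw instead of relying on plain eval fields being carried over; treat that as an assumption to verify.

```
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| eval _raw=_raw . " origSourceType=\"" . sourcetype . "\" origHost=\"" . host . "\""
| collect index=enriched_access
```

If the key="value" pairs are extracted at search time in the summary index, a search such as index=enriched_access origSourceType=*_type_access should then find the events by their original sourcetype.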


martin_mueller
SplunkTrust

Totally different approach: Keep the lookup data in the lookup, enrich at search time, skip indexing things twice through collect?

What you're doing feels quite wrong, considering collect would index _raw while the lookup is just adding fields - have you checked that those lookup output fields are actually retained in the second index?

That being said, https://answers.splunk.com/answers/88926/modify-raw-collect-into-second-index-how-to-best-retain-hos...


Runals
Motivator

Since there are perhaps several sourcetypes, I would try the map command:

| metasearch index=access sourcetype=*_type_access | stats count by sourcetype | map [ search index=access sourcetype=$sourcetype$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$ ]

At least that works in theory; I haven't tested it. It should work, though. I used the metasearch command for speed, and the stats command is just there to get the unique list of sourcetypes. tstats might be a hair faster still, but I'm not spun up on that one /shrug. There are folks who are kinda anti-map, but it is a tool in the tool chest. What this does is pass the sourcetype from each result line of the initial search as a token to the included search.


gurlest
Path Finder

I tried this out with "host=$host$" in my collect statement, and no dice.

Any other ideas?
