What is the best way to enrich events from another search?

I have two data sources.
Source A
Fields: time, action, src_ip, session, user
- "action" is one of logon, logoff, or relogon
- "session" contains a randomly generated session ID that is unique and does not appear to be reused
- "user" is the userid
Source B
Fields: time, session
What is the most efficient way of enriching the event data in Source B with the user that matches the session from source A?
To give you an idea of the data-set size:
source="Source A" | stats values(user) as user by session
returns 17,000-odd unique tuples from around a million events, and the job completes in under a second.
Source B contains over 100 million events.
I was thinking of running a regularly scheduled search to maintain a CSV of user,session pairs and then setting up a calculated field that performs a lookup (with the expectation that the newest events will not be enriched with a user).
Suggestions anyone?

When we need to do something like match a recycled IP with the user/session/MAC that obtained it, we take the obtaining/identifying dataset and use a Scheduled Search to create/trim/update a time-based lookup, and then use that lookup (which can be set up as an Automatic Lookup) to enrich the other dataset:
https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Defineatime-basedlookupinSplunkWeb
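As a rough sketch of how that could look for the data in the question: since the sessions are described as unique and not reused, a plain (non-time-based) CSV lookup may be enough. The file name session_user.csv is a hypothetical placeholder, and the source names are taken from the question. A scheduled search along these lines would rebuild the lookup:

```
source="Source A" session=* user=*
| stats latest(user) as user by session
| outputlookup session_user.csv
```

The other dataset could then be enriched at search time, either with an explicit lookup as below or by configuring the same lookup as an Automatic Lookup:

```
source="Source B"
| lookup session_user.csv session OUTPUT user
```

Sessions that started after the last scheduled run will have no user until the lookup is rebuilt.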

Something like:
index=sourceA OR index=SourceB
| stats min(_time) as start, max(_time) as end, values(action) as action, values(user) as user, values(src_ip) as src_ip by session
You could also use something like a left join; however, join is subject to the same limits as a subsearch (10K results), so the subsearch output may be truncated:
index=SourceB
| join type=left session [ search index=sourceA | fields session, user, action ]
I would need more specifics to say more.

What is the problem you are trying to solve? What does the "Source B" data look like? Which fields or values match Source A?

I don't want to build a dashboard; I want to enrich the event data so that an investigator can search for a user and find the events associated with that user. The events in Source B do not include a user field within the data, just a session.
Source A maps a session to a user when the user logs on.
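To illustrate the kind of search the investigator would run, here is a minimal sketch that uses a subsearch instead of an enriched field. It assumes the source names from the original question, and jsmith is a placeholder userid. The subsearch finds that user's sessions in Source A and returns them as a session filter for Source B:

```
source="Source B"
    [ search source="Source A" user="jsmith"
      | stats count by session
      | fields session ]
```

This only filters the Source B events; it does not add a user field to them, and it relies on one user having few enough sessions to stay within the subsearch limits.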
