Joining data from 2 data sources in Splunk

wlouisharris
New Member

I am trying to join data from two data sources. The first contains events (source=events); the second contains service tickets (source=tickets). I need to display event data alongside the matching ticket data. The event source holds the ticket number in a field called scnumber, while the ticket source holds it in a field called NUMBER. I want to join on scnumber and NUMBER where the values are equal; they represent the same data under different field names. I need several fields from each source. From events: node, summary, scnumber. From tickets: assignment group, class, and sub-class.

I thought the query would be one of these two:

sourcetype=event | rename scnumber as NUMBER | join type=outer NUMBER [search sourcetype=ticket] | table node summary NUMBER class sub-class

or

sourcetype=event OR sourcetype=ticket | eval ticket=coalesce(scnumber,NUMBER) | table node summary NUMBER class sub-class

So far I'm having no luck. Any help would be appreciated.

gkanapathy
Splunk Employee

I would suggest:

sourcetype=event OR sourcetype=ticket
| eval ticket = coalesce(scnumber,TICKET)
| stats first(node) as node 
        first(summary) as summary
        first(assignment) as assignment
        first(class) as class
        first(sub_class) as sub_class
  by ticket

as the most efficient (in terms of performance and scalability). The implementations of join and transaction are generally less efficient and scalable than stats. If there are field name conflicts between the sourcetypes and those matter, you can eval/rename them before stats, or you can do things like:

... first(eval(if(sourcetype=="event",node,null()))) as node

This ensures that if a field "node" exists for a ticket number in both "event" and "ticket", the value from the "event" sourcetype is selected. Otherwise, Splunk just picks the first (most recent) value of "node" it finds for the ticket, regardless of sourcetype (which may be okay for you).
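
For reference, here is a rough sketch of how the full search might look with the per-sourcetype selection applied to every field. It assumes the ticket-side field is NUMBER, as described in the question, and it reuses the field names assignment and sub_class from the search above; those two names are assumptions and may not match what your data actually contains. Treat it as a starting point, not a tested query.

```
sourcetype=event OR sourcetype=ticket
| eval ticket = coalesce(scnumber, NUMBER)
| stats first(eval(if(sourcetype=="event", node, null()))) as node
        first(eval(if(sourcetype=="event", summary, null()))) as summary
        first(eval(if(sourcetype=="ticket", assignment, null()))) as assignment
        first(eval(if(sourcetype=="ticket", class, null()))) as class
        first(eval(if(sourcetype=="ticket", sub_class, null()))) as sub_class
  by ticket
| table ticket node summary assignment class sub_class
```

The coalesce gives every event a common ticket key whichever field it arrived with, and stats then collapses each ticket to one row. The conditional form is only needed for fields that can appear in both sourcetypes; for the others a plain first(field) should behave the same.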

alexandermunce
Communicator

@gkanapathy

In the second line of your code above:
| eval ticket = coalesce(scnumber,TICKET)

Is the second argument for the coalesce function correct?

If so, what does TICKET refer to exactly?

lguinn2
Legend

BTW, the field name sub-class violates Splunk's field naming rules. It may work in many commands, but I would change it.

I assume that you can have multiple events per ticket? Or should there be an exact 1-1 match?

sourcetype=event OR sourcetype=ticket
| rename scnumber as NUMBER
| transaction NUMBER 
| table node summary NUMBER class sub-class
| sort NUMBER

Try not to use join in Splunk unless absolutely necessary. Since Splunk will retrieve data across sources, sourcetypes, hosts, etc. in a normal search, it is "joining" data already. This is a big difference between how Splunk works vs. a relational database.

BTW, the transaction command may be overkill here. If I could see exactly what you want for output, there might be a better option.
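
As one possible lighter-weight option, a stats-based version could look something like the sketch below. It assumes there can be several events per ticket, so it collects all distinct values with values() instead of picking a single one, and it renames sub-class to sub_class (a name chosen here only for illustration). It also uses coalesce so that events carrying either scnumber or NUMBER end up with a NUMBER value. I have not run this against your data.

```
sourcetype=event OR sourcetype=ticket
| eval NUMBER = coalesce(NUMBER, scnumber)
| rename "sub-class" as sub_class
| stats values(node) as node
        values(summary) as summary
        values(class) as class
        values(sub_class) as sub_class
  by NUMBER
| sort NUMBER
```

If there is exactly one event per ticket, each values() column will simply hold a single value, so the output looks like an ordinary joined table.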
