Splunk Search

Join/Transaction with field comparison operators

richard_whiffen
Explorer

I have some data sources in splunk that are XML formated. The initial request:

<query id=12345-54321>
<Request_1 initialQueryTime="1294214640000"><SearchParms startNum="1" numResults="14">
<TextQuery type="Title">Children's Hour</TextQuery>
</Request_1>
</query>

The Response:

<query id=c12345-54321>
<?xml version="1.0" encoding="ISO-8859-1"?>
<Response_1>
<NumberOfMatches>0</NumberOfMatches>
<SearchResults>
</SearchResults>
</Response_1>
</query>

or:

<tvsquery id=12345-54321>
<?xml version="1.0" encoding="ISO-8859-1"?>
<Response_1>
<NumberOfMatches>2</NumberOfMatches>
<SearchResults>
<VODShow closedCaptioning="Y" dolby="Y" hd="Y" entitled="Y">
<ResultTitle>Graffiti Bridge HD</ResultTitle>
<AssetID>INTL0609000006823480</AssetID>
</VODShow>
<VODShow closedCaptioning="Y" dolby="Y" entitled="Y">
<ResultTitle>Graffiti Bridge HD</ResultTitle>
<AssetID>INTL0609000005278648</AssetID
</VODShow>
</SearchResults>
</Response_1>
</query>

So I have a field id that should be a transaction. If I do a simple search:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" | transaction id

The transactions marry up correctly by id. If I try to look for transactions with specific attributes, I don't get the results I"m expecting. I've tried several variations of the following (with both the transaction command and join command:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" NumberOfMatches < 0 | transaction id [search host="srchhost*" src="/path/to/tomcat/log/qlogs/*" "Request_1" | top TextQuery ]

I think I'm not looking at the problem the right way and am hoping someone out there can put me on the right path. I have a field id that is common to the search and the result set. I want to know what TextQuery people used that returned zero NumberOfMatches. So I ask for a result set of id values for zero matches. I then want to return the TextQuery field for every one of those IDs.

The individual queries work. I can get a list of the zero result IDs. I can get the TextQuery values for a given ID. I just can't do it in a single search pipeline.

Cheers, Rich

Tags (2)
0 Karma
1 Solution

sideview
SplunkTrust
SplunkTrust

Yep. This is a very common stumbling block when you're first getting started. You have to get away from old SQL-ish ideas of joining two different datasets. Instead you just mash them together from the very beginning with a big OR:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" ( NumberOfMatches<0 OR Request_1 )

and then you pipe to transaction or to stats to put the mess together sensibly:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" ( NumberOfMatches<0 OR Request_1 ) | transaction id

1) Transaction

the transaction command is good for the use cases where, at the end of the day you need the actual multiline raw text of the various events all stuck together. Usually however you find that's not what you need and rather you just need various statistics and bits of text from the overall set of conjoined/conglommified events. In those cases the stats command is a much better tool.

2) stats

Here's an example that hopefully gives you the idea of how you could use stats:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" ( NumberOfMatches<0 OR Request_1 )
| stats first(AssetId) as AssetId first(TextQuery) as TextQuery sum(NumberOfMatches) as totalMatches by id

3) join

As for the 'join' command, it's mostly evil. I'd avoid it unless you really really cant do it with stats or transaction.

Further reading:

http://answers.splunk.com/questions/8694/joining-two-log-files-that-have-a-common-field/8708#8708

http://answers.splunk.com/questions/822/simulating-a-sql-join-in-splunk/1717#1717

View solution in original post

sideview
SplunkTrust
SplunkTrust

Yep. This is a very common stumbling block when you're first getting started. You have to get away from old SQL-ish ideas of joining two different datasets. Instead you just mash them together from the very beginning with a big OR:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" ( NumberOfMatches<0 OR Request_1 )

and then you pipe to transaction or to stats to put the mess together sensibly:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" ( NumberOfMatches<0 OR Request_1 ) | transaction id

1) Transaction

the transaction command is good for the use cases where, at the end of the day you need the actual multiline raw text of the various events all stuck together. Usually however you find that's not what you need and rather you just need various statistics and bits of text from the overall set of conjoined/conglommified events. In those cases the stats command is a much better tool.

2) stats

Here's an example that hopefully gives you the idea of how you could use stats:

host="srchhost*" src="/path/to/tomcat/log/qlogs/*" ( NumberOfMatches<0 OR Request_1 )
| stats first(AssetId) as AssetId first(TextQuery) as TextQuery sum(NumberOfMatches) as totalMatches by id

3) join

As for the 'join' command, it's mostly evil. I'd avoid it unless you really really cant do it with stats or transaction.

Further reading:

http://answers.splunk.com/questions/8694/joining-two-log-files-that-have-a-common-field/8708#8708

http://answers.splunk.com/questions/822/simulating-a-sql-join-in-splunk/1717#1717

Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Observability Simplified: Combining User Experience, Application Performance & ...

Tech Talk Observability Simplified: Combining User Experience, Application Performance & Network ...

Event Series May & June: From Network Visibility to Service Intelligence

Unifying the Network: Moving from Alert Noise to Service Intelligence with Splunk ITSI In today’s hybrid ...

Global Splunk User Group Events: May + June 2026

Your Splunk Community Awaits: Discover Upcoming User Group Events Worldwide    Staying ahead in the fast-paced ...