<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Optimize join for audit logs in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352078#M104192</link>
    <description>&lt;P&gt;Assuming that the timestamps are exactly the same for the events that need to be connected, this is a perfect use case for &lt;CODE&gt;selfjoin&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os sourcetype=linux_audit AND ((type=SYSCALL AND key=pci) OR type=CWD)
| selfjoin msg
| table _time, host, exe, comm, success, auid, cwd
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Sun, 22 Apr 2018 16:20:54 GMT</pubDate>
    <dc:creator>woodcock</dc:creator>
    <dc:date>2018-04-22T16:20:54Z</dc:date>
    <item>
      <title>Optimize join for audit logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352076#M104190</link>
      <description>&lt;P&gt;I have a search that returns correct results. However, the join subsearch keeps hitting the 50,000-result limit. I'd like to run it over a larger time range so I can produce a weekly report, but right now I have to keep the time range small to get any results.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os sourcetype=linux_audit type=SYSCALL key=pci
| join msg [search index=os sourcetype=linux_audit type=CWD]
| table _time, host, exe, comm, success, auid, cwd
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The field I want to use within the join is the msg field. Is there a way to pass the msg value in the join to speed up the search? &lt;/P&gt;

&lt;P&gt;Some sample data from the log messages:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;    type=SYSCALL msg=audit(1524096248.939:201277):  success=yes pid=6561 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=33671 comm="rm" exe="/bin/rm" key="pci"
 type=CWD msg=audit(1524096248.939:201277):  cwd="/home/user"

    type=SYSCALL msg=audit(1524096249.335:201280): success=yes pid=6561 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=33671 comm="rm" exe="/bin/rm" key="pci"
type=CWD msg=audit(152409649.335:201280):  cwd="/home/user"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The expected results pair the events whose msg fields have the same contents:&lt;BR /&gt;
&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/4802i56E2BB56C5808067/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;None of the provided answers seems to be what I need. Anyone else able to answer this?&lt;/P&gt;</description>
      <pubDate>Sun, 22 Apr 2018 00:04:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352076#M104190</guid>
      <dc:creator>BrandonKeep</dc:creator>
      <dc:date>2018-04-22T00:04:57Z</dc:date>
    </item>
    <item>
      <title>Re: Optimize join for audit logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352077#M104191</link>
      <description>&lt;P&gt;@BrandonKeep, the exact query depends on your sample data and on how the fields from the two event types correlate, but you can avoid the join by searching for both event types at once and correlating them with stats:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;  index=os (sourcetype=linux_audit type=SYSCALL key=pci) OR (index=os sourcetype=linux_audit type=CWD)
 | stats count as eventCount values(type) as types earliest(_time) as EarliestTime latest(_time) as LatestTime by msg
 | search eventCount&amp;gt;1 types="SYSCALL" AND types="linux_audit"
&lt;/CODE&gt;&lt;/PRE&gt;
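
&lt;P&gt;As a minimal sketch of the same approach, the query below carries the fields from the table command in the question through stats. It assumes that msg is extracted identically for the SYSCALL and CWD events and that each msg value pairs one SYSCALL event with one CWD event; values() is just one possible aggregation, so adjust it to your data.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os sourcetype=linux_audit ((type=SYSCALL key=pci) OR type=CWD)
| stats min(_time) as _time values(type) as types values(host) as host values(exe) as exe values(comm) as comm values(success) as success values(auid) as auid values(cwd) as cwd by msg
| search types="SYSCALL" types="CWD"
| table _time, host, exe, comm, success, auid, cwd
&lt;/CODE&gt;&lt;/PRE&gt;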

&lt;P&gt;PS: Add the other fields you need (such as exe, comm, and success) to the stats command above, using whichever aggregation fits how they should be correlated.&lt;/P&gt;</description>
      <pubDate>Sun, 22 Apr 2018 05:26:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352077#M104191</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2018-04-22T05:26:13Z</dc:date>
    </item>
    <item>
      <title>Re: Optimize join for audit logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352078#M104192</link>
      <description>&lt;P&gt;Assuming that the timestamps are exactly the same for the events that need to be connected, this is a perfect use case for &lt;CODE&gt;selfjoin&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=os sourcetype=linux_audit AND ((type=SYSCALL AND key=pci) OR type=CWD)
| selfjoin msg
| table _time, host, exe, comm, success, auid, cwd
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 22 Apr 2018 16:20:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352078#M104192</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2018-04-22T16:20:54Z</dc:date>
    </item>
    <item>
      <title>Re: Optimize join for audit logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352079#M104193</link>
      <description>&lt;P&gt;While this is a cool command that I didn't know existed, it doesn't give me the results that I need. I end up with over a million results. My posted search gives me two results. I will update the initial question with some sample data and expected results.&lt;/P&gt;

&lt;P&gt;Thanks for the attempt.&lt;BR /&gt;
Regards&lt;/P&gt;</description>
      <pubDate>Sun, 22 Apr 2018 16:35:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352079#M104193</guid>
      <dc:creator>BrandonKeep</dc:creator>
      <dc:date>2018-04-22T16:35:47Z</dc:date>
    </item>
    <item>
      <title>Re: Optimize join for audit logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352080#M104194</link>
      <description>&lt;P&gt;Thanks for the reply. I have added some sample log data and a screenshot of the expected output. Can you clarify your example a bit? It isn't clear to me how to get my expected output from it.&lt;/P&gt;

&lt;P&gt;Regards,&lt;/P&gt;</description>
      <pubDate>Sun, 22 Apr 2018 16:57:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Optimize-join-for-audit-logs/m-p/352080#M104194</guid>
      <dc:creator>BrandonKeep</dc:creator>
      <dc:date>2018-04-22T16:57:09Z</dc:date>
    </item>
  </channel>
</rss>

