<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sequence of activities at index time in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692389#M115107</link>
    <description>You probably used the raw endpoint on HEC?</description>
    <pubDate>Thu, 04 Jul 2024 08:39:13 GMT</pubDate>
    <dc:creator>isoutamo</dc:creator>
    <dc:date>2024-07-04T08:39:13Z</dc:date>
    <item>
      <title>Sequence of activities at index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692381#M115106</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I have a new question about the sequence of activities at index time.&lt;BR /&gt;I have a data flow arriving via HEC on an HF that I need to process, because the data come from a concentrator and belong to many different data flows (linux, oracle, etc.). So I have to assign the correct sourcetype to these data, and I have to process the logs because they are modified by securelog: the original logs are inserted into a field of a JSON document, with some metadata added.&lt;/P&gt;&lt;P&gt;I configured the following flow:&lt;/P&gt;&lt;P&gt;in props.conf:&lt;/P&gt;&lt;P&gt;[source::http:logstash*]&lt;BR /&gt;TRANSFORMS-000 = global_set_metadata&lt;BR /&gt;TRANSFORMS-001 = set_sourcetype_by_regex&lt;BR /&gt;TRANSFORMS-001 = set_index_by_sourcetype&lt;/P&gt;&lt;P&gt;in transforms.conf:&lt;/P&gt;&lt;P&gt;[global_set_metadata]&lt;BR /&gt;INGEST_EVAL = host := coalesce(json_extract(_raw, "host.name"), json_extract(_raw, "host.hostname")), relay_hostname := json_extract(_raw, "hub"), source := "http:logstash".coalesce("::".json_extract(_raw, "log.file.path"), "")&lt;BR /&gt;&lt;BR /&gt;[set_sourcetype_by_regex]&lt;BR /&gt;INGEST_EVAL = sourcetype := case(searchmatch("/var/log/audit/audit.log"), "linux_audit", true(), "logstash")&lt;/P&gt;&lt;P&gt;[set_index_by_sourcetype]&lt;BR /&gt;INGEST_EVAL = index:=case(sourcetype=linux, "index_linux", sourcetype=logstash, "index_logstash")&lt;/P&gt;&lt;P&gt;in which:&lt;BR /&gt;the first transformation extracts (using INGEST_EVAL) metadata such as host, source and relay_hostname (the concentrator from which the logs arrive),&lt;BR /&gt;the second one assigns the correct sourcetype based on a regex,&lt;BR /&gt;the third one assigns the correct index based on the sourcetype, using INGEST_EVAL to avoid re-running a regex.&lt;BR /&gt;The first two transformations are executed correctly, but the third doesn't use the sourcetype assigned by the second one.&lt;/P&gt;&lt;P&gt;I also tried a different approach using CLONE_SOURCETYPE in the second one (instead of INGEST_EVAL) and it works, but I'm checking whether the above flow can work, because it's more linear and should be lighter on the system.&lt;/P&gt;&lt;P&gt;Where should I look for the issue?&lt;BR /&gt;Is there something wrong in the activity flow?&lt;/P&gt;&lt;P&gt;Thank you all.&lt;BR /&gt;Ciao.&lt;BR /&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jul 2024 06:26:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692381#M115106</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-07-04T06:26:58Z</dc:date>
    </item>
    <item>
      <title>Re: Sequence of activities at index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692389#M115107</link>
      <description>You probably used the raw endpoint on HEC?</description>
      <pubDate>Thu, 04 Jul 2024 08:39:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692389#M115107</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2024-07-04T08:39:13Z</dc:date>
    </item>
    <item>
      <title>Re: Sequence of activities at index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692458#M115113</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/214410"&gt;@isoutamo&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;nice to hear from you!&lt;/P&gt;&lt;P&gt;Yes, I'm using HEC on-premises, so I cannot use Edge.&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jul 2024 06:15:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692458#M115113</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-07-05T06:15:39Z</dc:date>
    </item>
    <item>
      <title>Re: Sequence of activities at index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692485#M115117</link>
      <description>But are you using HEC's raw endpoint instead of the event endpoint?&lt;BR /&gt;&lt;BR /&gt;Also, you have two TRANSFORMS entries with the same name:&lt;BR /&gt;TRANSFORMS-001 = set_sourcetype_by_regex&lt;BR /&gt;TRANSFORMS-001 = set_index_by_sourcetype&lt;BR /&gt;&lt;BR /&gt;This means that only one of them is used!</description>
      <pubDate>Fri, 05 Jul 2024 14:12:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692485#M115117</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2024-07-05T14:12:03Z</dc:date>
    </item>
    <item>
      <title>Re: Sequence of activities at index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692594#M115131</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/214410"&gt;@isoutamo&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;thank you for your support.&lt;/P&gt;&lt;P&gt;That was a typo; the actual issue was that the searchmatch() function doesn't work in INGEST_EVAL. Using the match() function instead, my INGEST_EVAL is working.&lt;/P&gt;&lt;P&gt;Thank you again for your support.&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jul 2024 06:03:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Sequence-of-activities-at-index-time/m-p/692594#M115131</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-07-08T06:03:06Z</dc:date>
    </item>
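    Putting the two points from the thread together (unique TRANSFORMS class names, and match() in place of searchmatch()), the configuration might look roughly like the sketch below. It has not been checked against a running instance. The TRANSFORMS-002 key name, the bracketed dot in the regex, the quoted string comparisons, and the mapping of linux_audit to index_linux are assumptions made for illustration; the original post compared against an unquoted linux value. The global_set_metadata stanza is unchanged from the question and is omitted here.

```ini
# props.conf
# Each TRANSFORMS key in a stanza gets its own class suffix; with a repeated
# key, only one of the two entries is used (as noted in the thread).
[source::http:logstash*]
TRANSFORMS-000 = global_set_metadata
TRANSFORMS-001 = set_sourcetype_by_regex
TRANSFORMS-002 = set_index_by_sourcetype

# transforms.conf
[set_sourcetype_by_regex]
# match() is used because the thread reports that searchmatch() does not work in INGEST_EVAL.
INGEST_EVAL = sourcetype := case(match(_raw, "/var/log/audit/audit[.]log"), "linux_audit", true(), "logstash")

[set_index_by_sourcetype]
# Assumption: string literals are quoted and linux_audit events go to index_linux.
INGEST_EVAL = index := case(sourcetype == "linux_audit", "index_linux", sourcetype == "logstash", "index_logstash")
```

    Transforms listed under separate TRANSFORMS keys are applied in the order of their class names, which is why the index assignment is placed under a key that sorts after the sourcetype assignment.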
  </channel>
</rss>

