<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to change segmenters to make my data working with PREFIX directive in tstats? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/653897#M110876</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm trying to use the PREFIX directive in tstats (documented here:&amp;nbsp;&lt;A href="https://docs.splunk.com/Documentation/Splunk/9.1.0/SearchReference/Tstats#Use_PREFIX.28.29_to_aggregate_or_group_by_raw_tokens_in_indexed_data" target="_blank" rel="noopener"&gt;https://docs.splunk.com/Documentation/Splunk/9.1.0/SearchReference/Tstats#Use_PREFIX.28.29_to_aggregate_or_group_by_raw_tokens_in_indexed_data&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;The docs say that it works with data that &lt;STRONG&gt;does not&lt;/STRONG&gt; contain major breakers such as spaces.&lt;/P&gt;&lt;P&gt;My data contains spaces, so I tried changing the major breakers this way:&lt;/P&gt;&lt;P&gt;&lt;U&gt;props.conf:&lt;/U&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[test_sourcetype]
SEGMENTATION = test_segments&lt;/LI-CODE&gt;&lt;P&gt;&lt;U&gt;segmenters.conf:&lt;/U&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[test_segments]
MAJOR = \t
MINOR = / : = @ . - $ # % \\ _ [ ] &amp;lt; &amp;gt; ( ) { } | ! ; , ' " * \n \r \s &amp;amp; ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520 %5D %5B %3A %0A %2C %28 %29&lt;/LI-CODE&gt;&lt;P&gt;This way, only the tab (\t) is treated as a major breaker.&lt;/P&gt;&lt;P&gt;I applied this, restarted, and tried to ingest a log line with the sourcetype "test_sourcetype".&lt;/P&gt;&lt;P&gt;Unfortunately, segmenters.conf does not seem to take effect: the data is still broken on spaces, for example.&lt;/P&gt;&lt;P&gt;I also tried removing all MINOR breakers and keeping only MAJOR, but no luck:&lt;/P&gt;&lt;P&gt;MAJOR = \t&lt;BR /&gt;MINOR =&lt;/P&gt;&lt;P&gt;Have I made a mistake? Is it possible to do what I want? I think so, because this .conf presentation (&lt;A href="https://conf.splunk.com/files/2020/slides/PLA1089C.pdf" target="_blank" rel="noopener"&gt;https://conf.splunk.com/files/2020/slides/PLA1089C.pdf&lt;/A&gt;) mentions it briefly (page 37).&lt;/P&gt;&lt;P&gt;Should I also use&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;SEGMENTATION-&amp;lt;segment selection&amp;gt; = &amp;lt;segmenter&amp;gt;&lt;/PRE&gt;&lt;P&gt;in props.conf? The docs say it is for Splunk Web, but I am considering all options...&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Fri, 11 Aug 2023 07:39:37 GMT</pubDate>
    <dc:creator>cdaviet</dc:creator>
    <dc:date>2023-08-11T07:39:37Z</dc:date>
    <item>
      <title>How to change segmenters to make my data working with PREFIX directive in tstats?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/653897#M110876</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm trying to use the PREFIX directive in tstats (documented here:&amp;nbsp;&lt;A href="https://docs.splunk.com/Documentation/Splunk/9.1.0/SearchReference/Tstats#Use_PREFIX.28.29_to_aggregate_or_group_by_raw_tokens_in_indexed_data" target="_blank" rel="noopener"&gt;https://docs.splunk.com/Documentation/Splunk/9.1.0/SearchReference/Tstats#Use_PREFIX.28.29_to_aggregate_or_group_by_raw_tokens_in_indexed_data&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;The docs say that it works with data that &lt;STRONG&gt;does not&lt;/STRONG&gt; contain major breakers such as spaces.&lt;/P&gt;&lt;P&gt;My data contains spaces, so I tried changing the major breakers this way:&lt;/P&gt;&lt;P&gt;&lt;U&gt;props.conf:&lt;/U&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[test_sourcetype]
SEGMENTATION = test_segments&lt;/LI-CODE&gt;&lt;P&gt;&lt;U&gt;segmenters.conf:&lt;/U&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[test_segments]
MAJOR = \t
MINOR = / : = @ . - $ # % \\ _ [ ] &amp;lt; &amp;gt; ( ) { } | ! ; , ' " * \n \r \s &amp;amp; ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520 %5D %5B %3A %0A %2C %28 %29&lt;/LI-CODE&gt;&lt;P&gt;This way, only the tab (\t) is treated as a major breaker.&lt;/P&gt;&lt;P&gt;I applied this, restarted, and tried to ingest a log line with the sourcetype "test_sourcetype".&lt;/P&gt;&lt;P&gt;Unfortunately, segmenters.conf does not seem to take effect: the data is still broken on spaces, for example.&lt;/P&gt;&lt;P&gt;I also tried removing all MINOR breakers and keeping only MAJOR, but no luck:&lt;/P&gt;&lt;P&gt;MAJOR = \t&lt;BR /&gt;MINOR =&lt;/P&gt;&lt;P&gt;Have I made a mistake? Is it possible to do what I want? I think so, because this .conf presentation (&lt;A href="https://conf.splunk.com/files/2020/slides/PLA1089C.pdf" target="_blank" rel="noopener"&gt;https://conf.splunk.com/files/2020/slides/PLA1089C.pdf&lt;/A&gt;) mentions it briefly (page 37).&lt;/P&gt;&lt;P&gt;Should I also use&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;SEGMENTATION-&amp;lt;segment selection&amp;gt; = &amp;lt;segmenter&amp;gt;&lt;/PRE&gt;&lt;P&gt;in props.conf? The docs say it is for Splunk Web, but I am considering all options...&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2023 07:39:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/653897#M110876</guid>
      <dc:creator>cdaviet</dc:creator>
      <dc:date>2023-08-11T07:39:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to change segmenters to make my data working with PREFIX directive in tstats?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/653995#M110884</link>
      <description>&lt;P&gt;I am not an expert on this, but one thing to try is running btool to make sure you are getting the settings you think you are.&lt;/P&gt;&lt;P&gt;A few months ago I did a quick test to remove double quotes so that I could use tstats.&lt;BR /&gt;&lt;BR /&gt;props.conf&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[my_sourcetype]
SEGMENTATION = no_double_quotes&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;segmenters.conf&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[no_double_quotes]
MAJOR = [ ] &amp;lt; &amp;gt; ( ) { } | ! ; , ' * \n \r \s \t &amp;amp; ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520 %5D %5B %3A %0A %2C %28 %29&lt;/LI-CODE&gt;&lt;P&gt;Then I could search with tstats on events where I had myfield="123":&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| tstats count where index=myindex by PREFIX(myfield=)&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 11 Aug 2023 00:02:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/653995#M110884</guid>
      <dc:creator>burwell</dc:creator>
      <dc:date>2023-08-11T00:02:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to change segmenters to make my data working with PREFIX directive in tstats?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/654020#M110888</link>
      <description>&lt;P&gt;Hi burwell, thanks for taking the time to answer.&lt;/P&gt;&lt;P&gt;I ran the same test on Splunk Enterprise (I had been doing it on our Splunk Cloud production environment) and it works!&lt;/P&gt;&lt;P&gt;So it looks like Splunk Cloud does not allow changing this kind of parameter... That's sad.&lt;/P&gt;&lt;P&gt;Anyway, thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 11 Aug 2023 07:25:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-change-segmenters-to-make-my-data-working-with-PREFIX/m-p/654020#M110888</guid>
      <dc:creator>cdaviet</dc:creator>
      <dc:date>2023-08-11T07:25:16Z</dc:date>
    </item>
  </channel>
</rss>

