<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Create field extraction before linebreaks and apply it to broken-out sections after? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159847#M32410</link>
    <description>&lt;P&gt;I do not think this can be done at index-time without pre-processing the file yourself and copying the line to every section, but you can do it easily enough at search-time like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | eventstats first(ID_Number) AS ID_Number by source | ...
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Wed, 17 Jun 2015 15:27:50 GMT</pubDate>
    <dc:creator>woodcock</dc:creator>
    <dc:date>2015-06-17T15:27:50Z</dc:date>
    <item>
      <title>Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159846#M32409</link>
      <description>&lt;P&gt;Let's say I'm doing extractions on a really big file, thousands of lines, that looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Section1
ID_Number: 12345
lots and lots of text
@@@
Section2
lots and lots of text
@@@
Section3
lots and lots of text
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;So I've set up props and transforms to handle it so that it breaks at the @@@, and each section is a sourcetype -- sourcetype=Section1, sourcetype=Section2, etc.&lt;/P&gt;

&lt;P&gt;My question is this: that ID_Number in the first section is important, is there any way to extract it and add it to each section/sourcetype as a field so that it's not stuck only in Section1?&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 14:54:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159846#M32409</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-17T14:54:38Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159847#M32410</link>
      <description>&lt;P&gt;I do not think this can be done at index-time without pre-processing the file yourself and copying the line to every section, but you can do it easily enough at search-time like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | eventstats first(ID_Number) AS ID_Number by source | ...
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 17 Jun 2015 15:27:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159847#M32410</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-17T15:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159848#M32411</link>
      <description>&lt;P&gt;The sections are broken out at index time, so I can't grab the ID from Section1 if I'm searching on Section3. Currently I lose the relationship between sections, which is why I'm trying to put a unique identifier in all sections so they can be tracked after they're broken out.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 15:43:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159848#M32411</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-17T15:43:25Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159849#M32412</link>
      <description>&lt;P&gt;I am pretty sure it is impossible without pre-processing.  Did you try my search-time solution?  It should work just fine.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 17:51:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159849#M32412</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-17T17:51:40Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159850#M32413</link>
      <description>&lt;P&gt;As I mentioned, I'm dividing the files at index time so if I'm searching sourcetype=Section2 or sourcetype=Section3, the ID number isn't in the current search results to run eventstats on. The files are ~15,000 lines long and contain 20 or so sections, so I can't really manage them at search time in their original format.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 18:03:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159850#M32413</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-17T18:03:00Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159851#M32414</link>
      <description>&lt;P&gt;I don't think you understand my answer, probably because you may not appreciate how &lt;CODE&gt;eventstats&lt;/CODE&gt; works.  Just pretend the field exists and write your search.  Then insert my &lt;CODE&gt;eventstats&lt;/CODE&gt; solution at the very beginning of the command chain and it will work as you would expect.  The only thing is that you need to be sure NOT to discriminate out &lt;CODE&gt;sourcetype=section1&lt;/CODE&gt; until after adding in the solution, even if it means doing a broader search than you need at first.  Just give it a try.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 18:16:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159851#M32414</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-17T18:16:12Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159852#M32415</link>
      <description>&lt;P&gt;If this is right, I don't think I understand how eventstats works. This may get complicated.&lt;/P&gt;

&lt;P&gt;So I made my sample look very generic. My actual ID looks like (sorry for the regex):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;System Type:       \w+-\w+
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(I'm also looking to do similar searches filtering out by MAC addresses and some other things)&lt;/P&gt;

&lt;P&gt;I'm pretty sure I can't just ask it to eventstats first("System Type") and expect it to pick up what I want -- also I tried it, so I'm even more pretty sure there.  Will I need to build in an auto field extraction or is there a better way around this?&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 18:44:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159852#M32415</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-17T18:44:20Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159853#M32416</link>
      <description>&lt;P&gt;Let us assume you have forwarded 2 files as follows:&lt;/P&gt;

&lt;P&gt;File1:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Section1
ID_Number: 12345
lots and lots of text
@@@
Section2
lots and lots of text
@@@
Section3
lots and lots of text
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;File2:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Section1
ID_Number: 98765
lots and lots of text
@@@
Section2
lots and lots of text
@@@
Section4
lots and lots of text
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Then you do a search like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myindex | rex "Section(?&amp;lt;Section&amp;gt;\d+)" | rex "ID_Number:\s*(?&amp;lt;ID_Number&amp;gt;\d+)" | eventstats first(ID_Number) AS ID_Number by source | table Section, ID_Number, sourcetype, source
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Then you will get data like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;1,12345,Section1,File1
2,12345,Section2,File1
3,12345,Section3,File1
1,98765,Section1,File2
2,98765,Section2,File2
4,98765,Section4,File2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;So &lt;CODE&gt;ID_Number&lt;/CODE&gt; has been associated with every &lt;CODE&gt;Section&lt;/CODE&gt; (sourcetype/event) within each file/source, which is what you said you needed.&lt;/P&gt;
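
&lt;P&gt;For the real data mentioned earlier in the thread, where the identifier is a &lt;CODE&gt;System Type:&lt;/CODE&gt; line rather than &lt;CODE&gt;ID_Number&lt;/CODE&gt;, the same pattern might look like the sketch below. The field name &lt;CODE&gt;System_Type&lt;/CODE&gt;, the index name and the final &lt;CODE&gt;sourcetype&lt;/CODE&gt; filter are assumptions for illustration only. The point is that &lt;CODE&gt;eventstats&lt;/CODE&gt; can only copy a field that has already been extracted, so the &lt;CODE&gt;rex&lt;/CODE&gt; has to come first, and any filtering down to a single section has to come last:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myindex | rex "System Type:\s+(?&amp;lt;System_Type&amp;gt;\w+-\w+)" | eventstats first(System_Type) AS System_Type by source | search sourcetype=Section3
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This only ties sections together when each original file arrives with its own &lt;CODE&gt;source&lt;/CODE&gt; value.&lt;/P&gt;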

&lt;P&gt;But there is nothing that can make this automatic; you just have to do the &lt;CODE&gt;eventstats&lt;/CODE&gt; on every search.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 19:30:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159853#M32416</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-17T19:30:01Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159854#M32417</link>
      <description>&lt;P&gt;That doesn't work. It just extracts from the very first event that matches the rex for System Type and applies that to every result in the search. Since each section was broken into its own event at index time and there's no overlapping unique information, I don't have any relational linkage between sections.&lt;/P&gt;

&lt;P&gt;EDIT: Additionally, the files themselves are too big not to break into smaller events. I still get some truncation issues on subsections, even.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 19:43:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159854#M32417</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-17T19:43:13Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159855#M32418</link>
      <description>&lt;P&gt;You DO have relational linkage: the &lt;CODE&gt;source&lt;/CODE&gt; field!  You are totally mistaken: JUST TRY THE SEARCH!  My search does EXACTLY what I say it does for the example files and data that I showed.  It most certainly DOES NOT "just extract from the very first event that matches the rex for System Type and apply that to every result in the search".  If I had used &lt;CODE&gt;| eventstats first(ID_Number) AS ID_Number&lt;/CODE&gt; then it would, but I DID NOT; I used &lt;CODE&gt;| eventstats first(ID_Number) AS ID_Number by source&lt;/CODE&gt;, which picks the first value separately within each &lt;CODE&gt;source&lt;/CODE&gt;.  I have never worked so hard to help a person to just try an answer before!  If you would just TRY IT you would see that it works and that you have had a workable solution (the only one, mind you) since the very first comment.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jun 2015 20:03:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159855#M32418</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-17T20:03:12Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159856#M32419</link>
      <description>&lt;P&gt;I'm sorry if I wasn't clear -- I did try it, and it acted as I stated in my previous comment. The Source field is not unique.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jun 2015 14:09:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159856#M32419</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-18T14:09:51Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159857#M32420</link>
      <description>&lt;P&gt;I know that the &lt;CODE&gt;source&lt;/CODE&gt; field is not unique (i.e. you have more than 1 file, each of which has a different &lt;CODE&gt;ID_Number&lt;/CODE&gt;); that is the WHOLE POINT, right?  Unless there is something totally crazy that you are not divulging, I stand 100% by this answer.  I will not believe that it doesn't work until you show me the output and point out how it is wrong (please do so):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myindex | rex "Section(?&amp;lt;Section&amp;gt;\d+)" | rex "ID_Number:\s*(?&amp;lt;ID_Number&amp;gt;\d+)" | eventstats first(ID_Number) AS ID_Number by source | table Section, ID_Number, sourcetype, source
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 18 Jun 2015 14:21:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159857#M32420</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-18T14:21:31Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159858#M32421</link>
      <description>&lt;P&gt;No, the source field is not unique in that source = source = source = source = source for most files, unless you're literally using individual text files as your inputs. If you're gathering data from a TCP source, source is always going to equal TCP:500 or whatever the port is. That means when you do eventstats by source and none of your sources are unique, the ID number applied to every event is going to be the same one.&lt;/P&gt;

&lt;P&gt;This isn't crazy, and there are limited instances in which your solution would work, but this isn't one of them.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jun 2015 14:35:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159858#M32421</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-18T14:35:48Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159859#M32422</link>
      <description>&lt;P&gt;Perhaps the confusion is in your phrase &lt;CODE&gt;I'm dividing the files at index time&lt;/CODE&gt;.  I took this to mean that you are doing the &lt;CODE&gt;event linebreaking&lt;/CODE&gt; at index-time, but &lt;EM&gt;perhaps&lt;/EM&gt; what you mean is that you are pre-processing the files outside of Splunk and breaking up each big file into many smaller files such that only the first file has &lt;CODE&gt;Section1&lt;/CODE&gt; with the &lt;CODE&gt;ID_Number&lt;/CODE&gt;.  Even if that's the case, just name each split-off file like "file1.1", "file1.2", etc. and we can still use the &lt;CODE&gt;source&lt;/CODE&gt; field (with a bit of a tweak to ignore everything after the last period) as I described.  Other than this, I cannot imagine how it is that we are so misunderstanding and confusing one another.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jun 2015 14:42:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159859#M32422</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-18T14:42:41Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159860#M32423</link>
      <description>&lt;P&gt;I think we broke the comment section. Moving up here.&lt;/P&gt;

&lt;P&gt;I'm doing event linebreaking at index time, but I'm not actually working with a unique source field per original file since I'm getting data over TCP and UDP. I'm using files too, but not exclusively, and file inputs are the only case in which your solution would really work.  I'm not pre-processing outside of Splunk; I'm using props.conf and transforms.conf stanzas to handle the event linebreaking at index time. What I'm really looking for, and what I think would be most useful, is a way to extract a value from the input &lt;EM&gt;before&lt;/EM&gt; event linebreaking and then embed it in each new event &lt;EM&gt;after&lt;/EM&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jun 2015 14:54:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159860#M32423</guid>
      <dc:creator>willial</dc:creator>
      <dc:date>2015-06-18T14:54:43Z</dc:date>
    </item>
    <item>
      <title>Re: Create field extraction before linebreaks and apply it to broken-out sections after?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159861#M32424</link>
      <description>&lt;P&gt;OK, now that I know the full input/forwarding pipeline (TCP was not originally mentioned; only files were), I think I have a solution for you.  You can use &lt;CODE&gt;netcat&lt;/CODE&gt; to do some quick-and-dirty man-in-the-middle scripting for this.  Write a script that reads from STDIN and writes to STDOUT, watches for any "ID_Number" line, and re-emits the last-seen "ID_Number" line after every "Section" line it sees.  I whipped up an &lt;CODE&gt;awk&lt;/CODE&gt; script for your sample data that you can use as a start.  Then redirect your Splunk forwarder to read from a different unused port (say port 1313) and use &lt;CODE&gt;netcat&lt;/CODE&gt; on the old port like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;nc -l 540 | awk '{if ( $1 ~ /ID_Number/ ) {ID_Number = $0; print $0} else {if ( $1 ~ /Section/ &amp;amp;&amp;amp; $1 !~ /^Section1$/ ) {print $0 "\n" ID_Number} else {print $0}}}' | nc localhost 1313
&lt;/CODE&gt;&lt;/PRE&gt;
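
&lt;P&gt;Before wiring this into the live ports, the filter can be dry-run on the sample data with no &lt;CODE&gt;netcat&lt;/CODE&gt; involved. This is only a sketch: it pipes a shortened copy of the sample through the same &lt;CODE&gt;awk&lt;/CODE&gt; logic using &lt;CODE&gt;printf&lt;/CODE&gt;, and it uses an anchored &lt;CODE&gt;Section1&lt;/CODE&gt; test (&lt;CODE&gt;/^Section1$/&lt;/CODE&gt;), which is an assumption made here so that sections such as &lt;CODE&gt;Section10&lt;/CODE&gt; would also receive the ID line:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Dry run of the filter on a shortened sample; no ports or netcat involved
printf 'Section1\nID_Number: 12345\nlots and lots of text\n@@@\nSection2\nlots and lots of text\n' | awk '{if ( $1 ~ /ID_Number/ ) {ID_Number = $0; print $0} else {if ( $1 ~ /Section/ &amp;amp;&amp;amp; $1 !~ /^Section1$/ ) {print $0 "\n" ID_Number} else {print $0}}}'
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In the output, the &lt;CODE&gt;Section2&lt;/CODE&gt; line should be followed by a copy of &lt;CODE&gt;ID_Number: 12345&lt;/CODE&gt;, while &lt;CODE&gt;Section1&lt;/CODE&gt; passes through unchanged.&lt;/P&gt;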

&lt;P&gt;Every good Linux man should know about &lt;CODE&gt;netcat&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jun 2015 15:09:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Create-field-extraction-before-linebreaks-and-apply-it-to-broken/m-p/159861#M32424</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-18T15:09:34Z</dc:date>
    </item>
  </channel>
</rss>

