About ITWhisperer

PickleRick

Then all you can rely on is the event order. But this obviously raises questions about abnormal situations (for example, whether and how the source side handles errors: does it just drop a request, or does it reissue one?). Generally, you can use filldown (or streamstats) to populate a field based on a previous event's value, as has already been shown in this thread. Just remember that Splunk by default returns events in reverse chronological order. So if you want to rely on a request being _before_ the response, you need to re-sort your results to have the older ones first.
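As a language-neutral illustration of the point above, here is a minimal Python sketch of "sort oldest first, then fill down". The event shape and the field name request_id are hypothetical and are not taken from the thread; this is not how Splunk implements filldown.

```python
def fill_down(events, key="request_id"):
    """Sort events oldest first, then copy the last seen value of `key` forward."""
    ordered = sorted(events, key=lambda e: e["_time"])
    last_seen = None
    filled = []
    for event in ordered:
        value = event.get(key)
        if value is None:
            value = last_seen      # response: inherit from the preceding request
        else:
            last_seen = value      # request: remember its value
        filled.append({**event, key: value})
    return filled


# Events as a search would return them by default: newest first.
events = [
    {"_time": 4, "type": "response", "request_id": None},
    {"_time": 3, "type": "request", "request_id": "B"},
    {"_time": 2, "type": "response", "request_id": None},
    {"_time": 1, "type": "request", "request_id": "A"},
]
filled = fill_down(events)
```

Without the sort, each response would inherit the value of the request that came after it, which is the pitfall described above.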

KJ10

Basically, we are inserting data using the REST API. At one-hour intervals our stream events get called and dump all the data; to avoid this, we use a lookup before insertion. On the UI, if we remove duplicates, it works as expected, but in the events there are a lot of duplicate values, which take up a lot of space and slow performance.

muzicman0

Not what I wanted to do, but due to other reasons, I moved the objects into the canvas (instead of the default Above Canvas) and was able to get them lined up there.

ITWhisperer

Events will be processed in reverse chronological order, so unless you re-sort them, you might want first rather than last:

| streamstats time_window=24h first(FIELD1) as prev_field_value by FIELD2,FIELD3,FIELD4,FIELD5

ITWhisperer

This challenge was first posted on the Slack #puzzles channel.

For a previous puzzle, I needed a set of fixed-length pipe-delimited events, so I took a public domain data set, which happened to be in XML format, and converted it to the required format. The overall aim of this puzzle is to convert XML events to fixed-length events, and it has been split into multiple parts. The first part was about preparing the field template by dereferencing the field names, so that their positions could be compared. This second part is about an alternative approach to the field template process. To that end, the challenge for this part is to take some XML events and determine the correct order that the fields appear in by using nested loops to process each sequence segment against all the other sequences, and merging or joining the sequence segments until the whole sequence is determined.

Using the same example set of events from the previous part:

<row num="1600"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>
<row num="1625"><Jupiter>97</Jupiter></row>
<row num="1675"><Saturn>274</Saturn></row>
<row num="1800"><Saturn>274</Saturn><Uranus>29</Uranus></row>
<row num="1850"><Saturn>274</Saturn><Neptune>16</Neptune></row>
<row num="1875"><Uranus>29</Uranus></row>
<row num="1900"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="1950"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>
<row num="1975"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="2000"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>

Develop a process to join sequences where one ends and another starts with the same planet, and expand sequences where another sequence has one or more planets between a pair of planets which are consecutive in the sequence. Create a tilde-delimited template for the fields in this set of XML events.

This article contains spoilers!
In fact, most of this article is a spoiler as it contains partial solutions to the puzzle. If you are trying to solve the puzzle yourself and just want some pointers to get you started, stop reading when you have enough, and return if you get stuck again, or just want to compare your solution to mine!

Field sequences

This puzzle has been split into multiple parts. This second part is about an alternative approach to the field template process. Using the same example set of events from the previous part:

<row num="1600"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>
<row num="1625"><Jupiter>97</Jupiter></row>
<row num="1675"><Saturn>274</Saturn></row>
<row num="1800"><Saturn>274</Saturn><Uranus>29</Uranus></row>
<row num="1850"><Saturn>274</Saturn><Neptune>16</Neptune></row>
<row num="1875"><Uranus>29</Uranus></row>
<row num="1900"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="1950"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>
<row num="1975"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="2000"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>

We can see that row 1600 ends with Earth, and row 1900 starts with Earth. These can be combined to have Mercury -> Venus -> Earth -> Mars -> Jupiter -> Saturn. Furthermore, row 1850 shows Saturn -> Neptune; this can be expanded to be Saturn -> Uranus -> Neptune, because row 2000 shows Jupiter -> Saturn -> Uranus -> Neptune.

To that end, the challenge for this part is to find all the possible sequences of the field names without compromising the overall integrity of the series of sequences, and combine and expand overlaps.
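The two operations described above (joining on a shared end/start planet, and expanding a consecutive pair) can be sketched outside SPL. The following Python is only an illustration of the idea under my own naming (join_sequences and expand_sequence are hypothetical helpers); it is not the SPL solution developed in the rest of the article.

```python
def join_sequences(left, right):
    """Join two sequences when `left` ends with the field that `right` starts with."""
    if left and right and left[-1] == right[0]:
        return left + right[1:]
    return None


def expand_sequence(current, other):
    """Expand `current` where the first and last fields of `other` are adjacent in it."""
    if len(other) < 3:
        return None  # nothing to insert between a pair
    for i in range(len(current) - 1):
        if current[i] == other[0] and current[i + 1] == other[-1]:
            return current[:i] + other + current[i + 2:]
    return None


# Row 1600 ends with Earth and row 1900 starts with Earth.
joined = join_sequences(
    ["Mercury", "Venus", "Earth"],
    ["Earth", "Mars", "Jupiter", "Saturn"],
)

# Row 1850 (Saturn -> Neptune) expanded using the Saturn..Neptune segment of row 2000.
expanded = expand_sequence(
    ["Saturn", "Neptune"],
    ["Saturn", "Uranus", "Neptune"],
)
```

Both helpers return None when the rule does not apply, so a caller can tell "no change" apart from a new sequence.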
Planets in order

As we did in the previous part, using the planet data, we can create a template for the fields present in each event:

| makeresults format=csv data="row
<row num=\"1600\"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>
<row num=\"1625\"><Jupiter>97</Jupiter></row>
<row num=\"1675\"><Saturn>274</Saturn></row>
<row num=\"1800\"><Saturn>274</Saturn><Uranus>29</Uranus></row>
<row num=\"1850\"><Saturn>274</Saturn><Neptune>16</Neptune></row>
<row num=\"1875\"><Uranus>29</Uranus></row>
<row num=\"1900\"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num=\"1950\"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>
<row num=\"1975\"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num=\"2000\"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>"
``` Find all relevant field names (as in the previous part) ```
| rex field=row max_match=0 "\<(?<fields>\w+)\>"
``` Create a sequence of fields used in the event ```
| eval fieldnames=mvjoin(fields,"~")

Field combinations

Ideally, we would like to process each field name (planet) combined with every other field name (planet) to check if this combination is found in the data. If this was written in pseudo-code, it might look something like this:

For x in planets
    For y in planets
        Does x followed by y exist in the data

The closest thing to this (without writing a custom search command) is to use the foreach command. This command requires a field list, so let us generate a set of fields for the planets. Perhaps the simplest way to do this is with the chart command:

``` Create a set of fields for the planets ```
| chart values(eval(0)) by fieldnames fields

This gives us a zero every time the planet represented by the field is used in the field name sequence, but more importantly, it gives us a field named for each planet.
Now we can try to process each planet against every other planet:

``` Create a cross-product of the planets ```
| foreach *
    [| eval planet_one="<<FIELD>>"
    | foreach *
        [| eval planet_two="<<FIELD>>"
        | eval cross_product=mvappend(cross_product,planet_one."~".planet_two)]
    ]

As you can see, this does not work as we would have liked. There are two problems: firstly, the fieldnames field has been processed; secondly, and more frustratingly, the nested foreach does not override the value of the <<FIELD>> variable (you might call this scope-bleed or globalisation of the <<FIELD>> variable?). The following SPL is a simple test to demonstrate what is going on:

``` Create a cross-product of the planets ```
| foreach *
    [| eval planet_one="<<FIELD>>"
    | foreach 1 2 3
        [| eval planet_two="<<FIELD>>"
        | eval cross_product=mvappend(cross_product,planet_one."~".planet_two)]
    ]

There are three repetitions for each of the fields with both planet_one and planet_two being identical (and using the value of the <<FIELD>> variable from the outer foreach).

Nested loops

Since we want to process all the other planets, perhaps it would be better to replace the zero with a complete list of the planets:

``` Create a set of fields for the planets ```
| eventstats values(fields) as field
| chart values(field) by fieldnames fields

This gives us a list of all the planets every time the planet represented by the field is used in the field name sequence. Now we can use foreach in multivalue mode to process each field against all the other fields.

``` Rename fieldnames so it is not picked up by foreach ```
| rename fieldnames as _fieldnames
``` For each field ... ```
| foreach *
    [
    ``` (Nested) for each value in the multivalue field ... ```
    | foreach mode=multivalue <<FIELD>>
        [

Now we can check for sequences of field names and add them to a list of sequences, if either the field names directly or indirectly follow each other.
``` Build a list of field name sequences present in the current sequence with the following conditions:
    i) outer fieldname does not match inner fieldname, and
        a) outer fieldname is directly followed by inner fieldname (add direct pairing to the list), or
        b) outer fieldname is indirectly followed by inner fieldname (add all intervening fieldnames) ```
| eval sequence=if("<<FIELD>>"=<<ITEM>>, sequence, if(match(_fieldnames,"<<FIELD>>"."~".<<ITEM>>), mvappend(sequence,"<<FIELD>>"."~".<<ITEM>>), if(match(_fieldnames,"<<FIELD>>"."~[\w~]+~".<<ITEM>>), mvappend(sequence,mvjoin(mvindex(split(_fieldnames,"~"),mvfind(split(_fieldnames,"~"),"<<FIELD>>"),mvfind(split(_fieldnames,"~"),<<ITEM>>)),"~")), sequence)))

Note that, as we discovered earlier, the variable <<FIELD>> in the inner foreach refers to the field from the outer foreach. Also note that the <<ITEM>> variable is already a string value, whereas <<FIELD>> is replaced by the field name, so left unquoted it refers to the field's value and needs to be double-quoted to be used as a string. For clarity, this could be rewritten as:

``` For each field ... ```
| foreach *
    [
    | eval field_one="<<FIELD>>"
    ``` (Nested) for each value in the multivalue field ... ```
    | foreach mode=multivalue <<FIELD>>
        [
        ``` Build a list of field name sequences present in the current sequence with the following conditions:
            i) outer fieldname does not match inner fieldname, and
                a) outer fieldname is directly followed by inner fieldname (add direct pairing to the list), or
                b) outer fieldname is indirectly followed by inner fieldname (add all intervening fieldnames) ```
        | eval field_two=<<ITEM>>, sequence=if(field_one=field_two,sequence,if(match(_fieldnames,field_one."~".field_two),mvappend(sequence,field_one."~".field_two),if(match(_fieldnames,field_one."~[\w~]+~".field_two),mvappend(sequence,mvjoin(mvindex(split(_fieldnames,"~"),mvfind(split(_fieldnames,"~"),field_one),mvfind(split(_fieldnames,"~"),field_two)),"~")),sequence)))
        ]
    ]

Note that foreach mode=multivalue only allows a single command to be used; however, multiple fields can be evaluated in a single eval command by separating them with commas.

Overlaps and expansions

Now that we have a list of field sequences, we need to look for opportunities to expand sequences where they end where another one starts, or start where another one ends, or where they contain a consecutive sequence which matches the start and end of another sequence.

``` Create a list of unique sequences ```
| stats count by sequence
``` Make complete list available to each sequence ```
| eventstats values(sequence) as sequences

Move the current sequence out of the way so we can create a new set of sequences:

``` Move existing sequence out of the way so we can create a new list ```
| rename sequence as _sequence

For each of the sequences, create a new list of sequences containing the current sequence, plus expanded sequences where the start of the other matches the end of the current sequence, or where the end of the other matches the start of the current sequence, or where the start and end of the other sequence match a consecutive pair in the current sequence.
``` For each sequence (event), compare all the other sequences ```
| foreach mode=multivalue sequences
    [
    | eval sequence=if(_sequence=<<ITEM>>,mvappend(sequence,_sequence),if(mvindex(split(<<ITEM>>,"~"),0)=mvindex(split(_sequence,"~"),-1),mvappend(sequence,_sequence."~".mvjoin(mvindex(split(<<ITEM>>,"~"),1,-1),"~")),if(mvindex(split(<<ITEM>>,"~"),-1)=mvindex(split(_sequence,"~"),0),mvappend(sequence,mvjoin(mvindex(split(<<ITEM>>,"~"),0,-2),"~")."~"._sequence),if(match(_sequence,mvindex(split(<<ITEM>>,"~"),0)."~".mvindex(split(<<ITEM>>,"~"),-1)),mvappend(sequence,replace(_sequence,mvindex(split(<<ITEM>>,"~"),0)."~".mvindex(split(<<ITEM>>,"~"),-1),<<ITEM>>)),sequence))))
    ]

Since we do not appear to have found the full sequence yet, I will leave you to work out which steps need to be repeated, and how many times, until the complete sequence is discovered.

Have questions or thoughts? Comment on this article or in the Slack #puzzles channel. Whichever you prefer.

ITWhisperer

It is working for me in versions 9.3.5, 9.4.3 and 10.0.x by editing web-features.conf, although the setting doesn't always appear under server settings in the GUI.

ITWhisperer

On prem works too although you may need at least 10.0

PickleRick

Please don't share "ready to use" scripts based on serious assumptions without at least explaining what those assumptions are. Here, your quite strong assumption is that there would be no files matching firewall-*.log.gz coming from other sources (possibly in other subdirectories). Also, you're mixing -name with -iname.

ITWhisperer

This challenge was first posted on the Slack #puzzles channel.

For a previous puzzle, I needed a set of fixed-length pipe-delimited events, so I took a public domain data set, which happened to be in XML format, and converted it to the required format. The overall aim of this puzzle is to convert XML events to fixed-length events, and it has been split into multiple parts. This first part is about preparing the field template so that it can be used to place the data in the correct order in the fixed-length (and pipe-delimited) events. To that end, the challenge for this part is to take some XML events and determine the correct order that the fields appear in. The approach requires determining, for each event, where each field exists in the sequence and comparing it with the position of every other field by dereferencing the field names to find their positions.

An example would be the following set of events:

<row num="1600"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>
<row num="1625"><Jupiter>97</Jupiter></row>
<row num="1675"><Saturn>274</Saturn></row>
<row num="1800"><Saturn>274</Saturn><Uranus>29</Uranus></row>
<row num="1850"><Saturn>274</Saturn><Neptune>16</Neptune></row>
<row num="1875"><Uranus>29</Uranus></row>
<row num="1900"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="1950"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>
<row num="1975"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="2000"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>

For example, Mercury has a lower index in the sequence, i.e. is to the left of Venus, which has a lower index in the sequence, i.e. is to the left of Earth, and, in a different sequence, Jupiter has a lower index, i.e. is to the left of Saturn.

Create a tilde-delimited template for all the fields in this set of XML events.
For a partial example:

Mercury~Venus~Earth

An optional bonus question is to explain what the data represents.

This article contains spoilers!

In fact, most of this article is a spoiler as it contains partial solutions to the puzzle. If you are trying to solve the puzzle yourself and just want some pointers to get you started, stop reading when you have enough, and return if you get stuck again, or just want to compare your solution to mine!

Preparing the field template

This puzzle has been split into multiple parts. This first part is about preparing the field template so that it can be used to place the data in the correct order in the fixed-length (and pipe-delimited) events. To that end, the challenge for this part is to take some XML events and determine the correct order that the fields appear in.

A simple example would be the following set of events:

<row num="1"><john>L</john><paul>M</paul><ringo>S</ringo></row>
<row num="2"><john>L</john><george>H</george><ringo>S</ringo></row>
<row num="3"><john>L</john><paul>M</paul><george>H</george></row>

Here, john has a lower index in the sequence, i.e. is to the left of paul, and paul has a lower index in the sequence, i.e. is to the left of ringo.
A more complex example would be the following set of events:

<row num="1600"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>
<row num="1625"><Jupiter>97</Jupiter></row>
<row num="1675"><Saturn>274</Saturn></row>
<row num="1800"><Saturn>274</Saturn><Uranus>29</Uranus></row>
<row num="1850"><Saturn>274</Saturn><Neptune>16</Neptune></row>
<row num="1875"><Uranus>29</Uranus></row>
<row num="1900"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="1950"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>
<row num="1975"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num="2000"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>

For example, Mercury has a lower index in the sequence, i.e. is to the left of Venus, which has a lower index in the sequence, i.e. is to the left of Earth, and Jupiter has a lower index in the sequence, i.e. is to the left of Saturn.

Create a tilde-delimited template for the fields in these sets of XML events.

Field names

By inspecting the data set, we can see that all the first level fields that we are interested in have simple word character names. This means that they can be extracted using a regular expression.

| rex max_match=0 "\<(?<fields>\w+)\>"

This creates a multi-value field (fields) with the fieldnames in the order they are found in the event. By joining these values (using a neutral delimiter, I chose a tilde (~) so it does not cause complications with regular expressions further down the line), we can create an initial template for each event.

| eval fieldnames=mvjoin(fields,"~")

Looking at these fieldname templates, you can see that they are not all the same. This is because not all fields are present in all the events. Fortunately, fieldnames are not repeated in any single template, and there are enough examples of different templates to be able to determine the full and correct order.
This is not always guaranteed, as duplicated fieldnames and/or a smaller data set could present problems. For example, consider the following two events:

<row num="1"><john>L</john><paul>M</paul><ringo>S</ringo></row>
<row num="2"><john>L</john><george>H</george><ringo>S</ringo></row>

These would have the following templates:

john~paul~ringo
john~george~ringo

Trying to combine these sequences, which of these would represent the full template?

john~paul~george~ringo
john~george~paul~ringo

With these two events, there is insufficient information to determine this. But add a further event example:

<row num="3"><john>L</john><paul>M</paul><george>H</george></row>

With the following template:

john~paul~george

The full template can now be determined, even when none of the individual events has this complete template:

john~paul~george~ringo

Planets in order

Turning to Splunk now, by taking the planet data as an example, we can create a template for the fields present in each event:

| makeresults format=csv data="row
<row num=\"1600\"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>
<row num=\"1625\"><Jupiter>97</Jupiter></row>
<row num=\"1675\"><Saturn>274</Saturn></row>
<row num=\"1800\"><Saturn>274</Saturn><Uranus>29</Uranus></row>
<row num=\"1850\"><Saturn>274</Saturn><Neptune>16</Neptune></row>
<row num=\"1875\"><Uranus>29</Uranus></row>
<row num=\"1900\"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num=\"1950\"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>
<row num=\"1975\"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>
<row num=\"2000\"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>"
``` Find all relevant field names ```
| rex field=row max_match=0 "\<(?<fields>\w+)\>"
``` Create a sequence of fields used in the event ```
| eval fieldnames=mvjoin(fields,"~")

This contains the following information:

fieldnames                     fields
Mercury~Venus~Earth            Mercury Venus Earth
Jupiter                        Jupiter
Saturn                         Saturn
Saturn~Uranus                  Saturn Uranus
Saturn~Neptune                 Saturn Neptune
Uranus                         Uranus
Earth~Mars~Jupiter~Saturn      Earth Mars Jupiter Saturn
Jupiter~Uranus~Neptune         Jupiter Uranus Neptune
Jupiter~Saturn                 Jupiter Saturn
Jupiter~Saturn~Uranus~Neptune  Jupiter Saturn Uranus Neptune

To get the names of the planets as fields, place this information in a chart with an initial value of zero:

``` Chart the fields used in the fieldname sequences ```
| eval minimum=0
| chart max(minimum) by fieldnames fields

fieldnames                     Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus
Earth~Mars~Jupiter~Saturn      0      0        0                       0
Jupiter                               0
Jupiter~Saturn                        0                                0
Jupiter~Saturn~Uranus~Neptune         0                       0        0       0
Jupiter~Uranus~Neptune                0                       0                0
Mercury~Venus~Earth            0                     0                                 0
Saturn                                                                 0
Saturn~Neptune                                                0        0
Saturn~Uranus                                                          0       0
Uranus                                                                         0

Find the index of each named field in each field sequence:

``` Find the index of each field named in each sequence ```
| eval fields=split(fieldnames,"~")
| rename fieldnames as _fieldnames, fields as _fields
| foreach *
    [| eval <<FIELD>>=mvfind(_fields,"<<FIELD>>")]

Note the rename of fieldnames and fields, so that they do not get picked up by the * on the foreach command.

Earth Jupiter Mars Mercury Neptune Saturn Uranus Venus
0 2 1 3 0 0 1 0 3 1 2 0 2 1 2 0 1 0 1 0 0 0

Notice that some planets have multiple entries and that not all the entries for a planet are the same. This is because the planet appears in more than one sequence and there may be different numbers of planets appearing before it in the sequence. So, which is right? In fact, nearly all of them are wrong. How do we find the correct order?

Finding the right order

To find the right order, we can start with the maximum value for each planet, i.e. the one representing where it is furthest to the right in any sequence.

``` Find the maximum index of each field named in each sequence i.e.
the furthest to the right that the field appears in any sequence ```
| eventstats max(*) as *

Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus
2      2        1     0        3        3       2       1

Rows removed from tables to save space.

So far, we have been dealing with the sequences as a whole; now we need to deal with each field in each sequence separately. What we would really like to do is find the value of the index for the first field named in the sequence, and add 1 to it for each of the remaining fields listed in the sequence.

Reverse dereferencing

Unfortunately, SPL does not have a direct way to dereference the field name to find the corresponding field value. So, we will have to try a different approach. Start by expanding out the field names used in each sequence so we can process them separately:

``` Split out the fieldnames used in each sequence ```
| rename _fieldnames as fieldnames, _fields as fields
| mvexpand fields

fieldnames                 Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus  fields
Earth~Mars~Jupiter~Saturn  2      2        1     0        3        3       2       1      Earth
Earth~Mars~Jupiter~Saturn  2      2        1     0        3        3       2       1      Mars
Earth~Mars~Jupiter~Saturn  2      2        1     0        3        3       2       1      Jupiter
Earth~Mars~Jupiter~Saturn  2      2        1     0        3        3       2       1      Saturn

Find the position of each field in its sequence (as an index):

``` Find the current position of the field in its sequence ```
| streamstats count as _position by fieldnames
``` Convert to index ```
| eval _position=_position-1

Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus  fields   position
2      2        1     0        3        3       2       1      Earth    0
2      2        1     0        3        3       2       1      Mars     1
2      2        1     0        3        3       2       1      Jupiter  2
2      2        1     0        3        3       2       1      Saturn   3

For each sequence, find the current rightmost position of the starting field (effectively, dereferencing the first named planet field):

``` For each sequence, find the current rightmost position of the start ```
| rename fieldnames as _fieldnames, fields as _fields
| fields - _start
| foreach *
    [| eval _start=if("<<FIELD>>"=_fields AND _position=0,<<FIELD>>,_start)]
| eventstats min(_start) as _start by _fieldnames
Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus  fields   position  start
2      2        1     0        3        3       2       1      Earth    0         2
2      2        1     0        3        3       2       1      Mars     1         2
2      2        1     0        3        3       2       1      Jupiter  2         2
2      2        1     0        3        3       2       1      Saturn   3         2

Calculate the new position based on the relative position in the sequence and the rightmost position of the first field:

``` Calculate the new position based on the relative position in the sequence and the rightmost position of the first field ```
| eval _new_position=_start+_position

Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus  fields   new position  position  start
2      2        1     0        3        3       2       1      Earth    2             0         2
2      2        1     0        3        3       2       1      Mars     3             1         2
2      2        1     0        3        3       2       1      Jupiter  4             2         2
2      2        1     0        3        3       2       1      Saturn   5             3         2

When the new position is greater than the field's current rightmost position, update the field's position:

``` When the new position is greater than the field's current rightmost position, update the field's position ```
| foreach *
    [| eval <<FIELD>>=if("<<FIELD>>"=_fields AND _new_position><<FIELD>>,_new_position,<<FIELD>>)]

Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus  fields   new position  position  start
2      2        1     0        3        3       2       1      Earth    2             0         2
2      2        3     0        3        3       2       1      Mars     3             1         2
2      4        1     0        3        3       2       1      Jupiter  4             2         2
2      2        1     0        3        5       2       1      Saturn   5             3         2

Find the new maximum index of each field named in each sequence:

``` Find the new maximum index of each field named in each sequence ```
| eventstats max(*) as *

Earth  Jupiter  Mars  Mercury  Neptune  Saturn  Uranus  Venus  fields   new position  position  start
2      4        3     0        5        5       4       1      Earth    2             0         2
2      4        3     0        5        5       4       1      Mars     3             1         2
2      4        3     0        5        5       4       1      Jupiter  4             2         2
2      4        3     0        5        5       4       1      Saturn   5             3         2

Are we done yet?

There are a couple of ways to tell whether we have found the correct order yet (which can be done visually, or as a stretch exercise). Firstly, more than one planet has the same position as another, and secondly, the maximum index position is less than 7 (with eight fields, the final indexes should run from 0 to 7). Since we do not appear to have finished, I will leave you to work out which steps need to be repeated, and how many times.

Generating the final template?
Once all the necessary repetitions have been done, we need to convert the information into a template. This can be done by finding the final position for each field and sorting them.

``` Find the final position for each field and sort them ```
| rename _fieldnames as fieldnames
| untable fieldnames field position
| stats values(position) as position by field
| sort 0 position

Create a template from the sorted list of fields:

``` Create a template from the sorted list of fields ```
| stats list(field) as fields
| eval fields=mvjoin(fields,"~")

You should now have a process for generating the template required for the planets data set.

Have questions or thoughts? Comment on this article or in the Slack #puzzles channel. Whichever you prefer.
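For readers who find it easier to follow the position-update step outside SPL, here is a minimal Python sketch of a single pass. It mirrors the rex extraction, the initial maximum index per field, and one round of "rightmost position of the first field plus offset". The function names (field_sequences, initial_positions, relax_once) are my own for illustration, and the sketch deliberately performs only one pass, leaving the number of repetitions as the exercise the article sets.

```python
import re

ROWS = [
    '<row num="1600"><Mercury>0</Mercury><Venus>0</Venus><Earth>1</Earth></row>',
    '<row num="1625"><Jupiter>97</Jupiter></row>',
    '<row num="1675"><Saturn>274</Saturn></row>',
    '<row num="1800"><Saturn>274</Saturn><Uranus>29</Uranus></row>',
    '<row num="1850"><Saturn>274</Saturn><Neptune>16</Neptune></row>',
    '<row num="1875"><Uranus>29</Uranus></row>',
    '<row num="1900"><Earth>1</Earth><Mars>2</Mars><Jupiter>97</Jupiter><Saturn>274</Saturn></row>',
    '<row num="1950"><Jupiter>97</Jupiter><Uranus>29</Uranus><Neptune>16</Neptune></row>',
    '<row num="1975"><Jupiter>97</Jupiter><Saturn>274</Saturn></row>',
    '<row num="2000"><Jupiter>97</Jupiter><Saturn>274</Saturn><Uranus>29</Uranus><Neptune>16</Neptune></row>',
]


def field_sequences(rows):
    """Ordered field names of each event (opening tags made only of word characters)."""
    return [re.findall(r"<(\w+)>", row) for row in rows]


def initial_positions(sequences):
    """Largest index at which each field appears in any single sequence."""
    positions = {}
    for seq in sequences:
        for index, name in enumerate(seq):
            positions[name] = max(positions.get(name, 0), index)
    return positions


def relax_once(sequences, positions):
    """One pass: push each field right to (position of first field + offset)."""
    updated = dict(positions)
    for seq in sequences:
        start = positions[seq[0]]  # read from the snapshot, not the updated values
        for offset, name in enumerate(seq):
            updated[name] = max(updated[name], start + offset)
    return updated


sequences = field_sequences(ROWS)
first = initial_positions(sequences)
second = relax_once(sequences, first)
```

After one pass, second holds the same values as the last table in the walkthrough (Earth 2, Jupiter 4, Mars 3, Mercury 0, Neptune 5, Saturn 5, Uranus 4, Venus 1), so the two "are we done yet?" checks still fail at this point.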

ddrillic

I would probably create a python script that does the renaming and sanity checks on the incoming csv file(s).

dtaylor

Apologies it took so long to get back to this question. Thank you both for your enlightening responses. Fortunately, after reading through them, I managed to come across a working solution.

| eval Parent_User=coalesce(ParentUser,null())
| eval user=coalesce(User,Account_Name)
| eval parent_image=coalesce(ParentImage, Creator_Process_Name)
| eval ParentCMD=coalesce(ParentCommandLine, null())
| eval parent_pid=coalesce(ParentProcessId, tonumber(Creator_Process_ID, 16))
| eval process_pid=coalesce(ProcessId, tonumber(New_Process_ID, 16))
| eval process_image=coalesce(Image,New_Process_Name)
| eval command=coalesce(CommandLine, Process_Command_Line, script_content)
| eval processInfo = '_time' + "|,|" + 'parent_image' + "|,|" + 'parent_pid' + "|,|" + 'command'
| convert timeformat="%F %T" ctime(_time) AS time
| table *
| outputlookup tempEvents.csv create_empty=true allow_updates=false output_format=splunk_mv_csv

I already posted this above, but ultimately, all I'm doing is sending all my events to a lookup table before filtering it down. The important part is the processInfo field, which is a basic concatenation of the above fields, separated by a unique delimiter. The lookup overwrites itself each run.
| lookup tempEvents.csv ComputerName AS ComputerName process_pid AS parent_pid process_image AS parent_image OUTPUT processInfo AS grand_processInfo _time AS grand_time
| eval min_grand_time = mvindex(mvsort(mvmap(grand_time, _time - grand_time)), 0)
| eval grand_processInfo = mvdedup(grand_processInfo)
| eval grand_processInfo = mvappend(grand_processInfo, "")
| eval true_grandparent=null()
| foreach mode=multivalue grand_processInfo
    [ eval split_mv = split(<<ITEM>>, "|,|"), true_grandparent = if((_time - tonumber(mvindex(split_mv, 0))) = min_grand_time, mvappend(true_grandparent, tostring(<<ITEM>>)), true_grandparent)]
| rex field=true_grandparent "^\d+\|,\|(?<grandparent_image>.+?)\|,\|(?<grandparent_pid>.+?)\|,\|(?<parent_commandline>.+$)"

Here, the lookup command re-adds events from the lookup table by matching the event's ComputerName to the lookup's ComputerName, the event's parent_pid to the lookup's process_pid, and the event's parent_image to the lookup's process_image. The original issue was caused by the fact that the lookup command would add multiple events where there should ideally only be a single event which matches on all three of those fields (ComputerName, process_pid, and process_image). This is just a consequence of how Windows uses process IDs. They're only unique for as long as a process is open. As soon as a process is ended, its process ID can be recycled. As such, my solution hinged on comparing the _time field for the search events and the lookup events.

In the first line after the lookup command, I declare a new field called min_grand_time and use mvmap to iterate over the grand_time field from the lookup table. I subtract each value in grand_time from the current search event's _time field to get a positive number (time doesn't move backwards, after all). The resulting mv field is then sorted using mvsort (this isn't actually a sort based on number values, but it works out regardless).
After the sort, I can use mvindex to return the value at the first index to get the value closest to 0. The next line is a dedup which I noticed was needed in grand_processInfo. I only realized after I added the dedup that the extra events are caused by me querying both sysmon and windows_event logs (a log from each exists for each process created on a system). I'll adjust the search later by changing out the first table command (just prior to the initial outputlookup) and using stats to filter it down before sending it to the lookup table.

After the dedup, grand_processInfo is a single value field. However, I need it to register as an mv field for the purpose of the following foreach command. To do so without actually adding anything, I use mvappend to add an empty string. I then create a new field called true_grandparent for use in the foreach command. I may not need to declare a field prior to using it in a foreach, but I did so anyway.

The foreach is where the magic happens, which I realized I could use thanks to ITWhisperer (thank you for that). Using it, I iterate over the grand_processInfo field (which is why I needed to use mvappend earlier in case grand_processInfo was a single value field). In the loop, I declare a new field named split_mv where I use the split function to split the current <<ITEM>> along the delimiter which I created at the start of the search. After it's been split into a new mv field with the _time, parent_image, parent_pid, and command values, I set true_grandparent: if the _time for a search event minus the _time for a lookup event equals the value found in the field min_grand_time which I declared earlier, then I use mvappend to add the current <<ITEM>> to true_grandparent. Otherwise, it simply stays the same. By running this foreach over each value of grand_processInfo, I'm guaranteed to get only one value appended to true_grandparent.
Finally, I simply use a rex command on true_grandparent to split out the fields again. It isn't shown in the above code snippets, but I also add a fillnull command to 'fill in' the blank spots that could occur if a process's grandparent process isn't found in the lookup table.
| fillnull value="N/A" great_grandparent_image, grandparent_image, great_grandparent_pid, grandparent_pid, grandparent_commandline, parent_commandline
Hopefully the above explanation makes sense. I've been testing over the past few days, and fortunately, it does seem to work exactly as intended. With any luck, others attempting to do the same thing will be able to follow some variation of the above steps to achieve the same result. Thank you both, ITWhisperer and PickleRick, for your expertise.
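For readers who want to check the selection logic outside Splunk, here is a minimal Python sketch of the same idea: among several lookup candidates that share a recycled PID, keep the one whose timestamp is closest to (and not after) the event. It is an illustration under stated assumptions, not the search above: the function names and sample values are hypothetical, the comparison is numeric (unlike mvsort), and skipping candidates that are newer than the event is an added assumption.

```python
DELIM = "|,|"  # same delimiter as the processInfo strings in the post


def pick_closest_candidate(event_time, candidates):
    """Return the candidate string whose leading timestamp is closest to,
    but not after, event_time. Returns None if nothing qualifies."""
    best, best_gap = None, None
    # dict.fromkeys de-duplicates while keeping order (the mvdedup step)
    for item in dict.fromkeys(candidates):
        if not item:
            continue  # ignore the empty padding value added by mvappend
        ts_text, _, _ = item.partition(DELIM)
        gap = event_time - float(ts_text)
        if gap < 0:
            continue  # assumption: an ancestor cannot start after the event
        if best_gap is None or gap < best_gap:
            best, best_gap = item, gap
    return best


def split_candidate(item):
    """Split 'time|,|image|,|pid|,|commandline' into named parts (the rex step)."""
    ts_text, image, pid, commandline = item.split(DELIM, 3)
    return {"time": float(ts_text), "image": image, "pid": pid,
            "commandline": commandline}


# Hypothetical sample: PID 4321 was used by two different processes over time.
candidates = [
    r"1000|,|C:\Windows\explorer.exe|,|4321|,|explorer.exe",
    r"1700|,|C:\Windows\System32\cmd.exe|,|4321|,|cmd.exe /c build.bat",
    "",
]
best = pick_closest_candidate(1750, candidates)
print(split_candidate(best)["image"])  # C:\Windows\System32\cmd.exe
```

Comparing the gaps as numbers avoids relying on how a lexicographic sort happens to order the differences.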

premadhas

Hi @Maheswari1812, were you able to figure out the solution for this? If so, can you please share it? I am also experiencing the same issue. In my case, I want to set up a monitor to track latency for individual endpoints. However, when I search for the metric http.server.request.duration in the Observability Metric Finder, I see it listed (as shown in the screenshot), but when I click on it, it does not show the direct metric data. Instead, it only displays the aggregated metrics such as sum, count, min, max, and bucket, which are derived from http.server.request.duration. My requirement is to access http.server.request.duration itself so that I can configure a monitor based directly on the latency duration. Could you please guide me on how to access this metric or suggest the correct metric to use for endpoint-level latency monitoring?

livehybrid

Hi @akathuria I would recommend Analysis Of SplunkBase Apps for Splunk (https://splunkbase.splunk.com/app/2919) which is created internally at Splunk and published as Splunk Works. This has a bunch of info about compatibility such as below: 🌟 Did this answer help you? If so, please consider: Adding karma to show it was useful Marking it as the solution if it resolved your issue Commenting if you need any clarification Your feedback encourages the volunteers in this community to continue contributing

yh

I tried looking but I can't really find it. If a transform were affecting it, shouldn't it have affected all events? Why do only about 1 in 30 events, for example, show the additional spaces at times? I am using the default Windows TA, but I suppose my local props and transforms should have overridden those.

yuanliu

As you will find everywhere in this forum, map is usually not the solution; subsearch also should not be the default go-to. What you ought to ask yourself is: What is my input? What is my desired output? What is the logical relationship between input and desired output? Once you can verbalize these, you will find that stats is often all you need. But in your case, you are not interested in source1 at all. So, subsearch could be of good use - but not with map or append.

Let me see if I get your question correctly. You have two sources: source1 has two fields of interest, empid and domain; source2 has three, userid, domain, and status. All you want is a list of unique values of status - like 200, 203, 400, 404, 500, 501 - as long as they are from events in index=source2 that match any allowable combination of empid and domain from index=source1, where userid in source2 equals an allowable empid in source1. If the above is correct, the simplest approach would be

index = source2 | tojson output_field=user_domain userid domain | search [search index = source1 | fields empid domain | rename empid as userid | stats count by userid domain ``` can be simplified to dedup but performance may suffer ``` | fields - count | tojson output_field=user_domain userid domain] | stats values(status)

Here, subsearch is used as a filter, not going through append. So, the 50K limit doesn't apply. A more traditional approach with no subsearch could be

index IN (source1, source2) | eval userid = coalesce(userid, empid) | stats values(status) as status values(index) as sources by userid domain | where sources = "source1" ``` make sure to only count empid-domain combo that do appear in source1 ``` | stats values(status)

Hope this helps.
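The input/output relationship described above can be stated compactly outside Splunk. The following Python sketch is only an illustration of that relationship, assuming each event is a plain dict; the function name and sample rows are hypothetical and it says nothing about how Splunk executes either search.

```python
def allowed_statuses(source1_rows, source2_rows):
    """Unique status values from source2 rows whose (userid, domain) pair
    appears as an (empid, domain) pair in source1."""
    # Build the set of allowable pairs once (the role of the subsearch filter)
    allowed = {(row["empid"], row["domain"]) for row in source1_rows}
    return sorted({
        row["status"]
        for row in source2_rows
        if (row["userid"], row["domain"]) in allowed
    })


# Hypothetical sample data
source1 = [
    {"empid": "u1", "domain": "corp"},
    {"empid": "u2", "domain": "lab"},
]
source2 = [
    {"userid": "u1", "domain": "corp", "status": 200},
    {"userid": "u1", "domain": "corp", "status": 404},
    {"userid": "u1", "domain": "lab", "status": 500},   # pair not in source1
    {"userid": "u3", "domain": "corp", "status": 501},  # user not in source1
    {"userid": "u2", "domain": "lab", "status": 200},
]
print(allowed_statuses(source1, source2))  # [200, 404]
```

Matching on the pair, rather than on userid and domain separately, is what keeps the u1/lab row out of the result.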

ITWhisperer

This is a strange one, and given that it seems to work in every other environment, it sounds like there is nothing wrong with the process you used. Is it possible for you to compare your config files on a file-by-file basis between a working environment and a non-working environment to determine what the significant difference might be?

livehybrid

Hi @PoojaDevi Based on that HTML response you are hitting the Web UI port for Splunk and not the management port. Typically this runs on 8089 but as you are running port 8001 not 8000 for the Web port there is every chance yours is also different. Are you running this locally or in a Docker container? Are there any local firewall restrictions that could be blocking 8089 when you try to use it? Also - For Splunk Cloud the recommendation is to use Admin Config Service (ACS) - see https://help.splunk.com/en/splunk-cloud-platform/administer/admin-config-service-manual/9.2.2406/administer-splunk-cloud-platform-using-the-admin-config-service-acs-api/manage-http-event-collector-hec-tokens-in-splunk-cloud-platform

PickleRick

Depends how you're using the inputlookup. A "nice" number suggests you're hitting one of Splunk's limits, which I suspect comes from using inputlookup within a subsearch. You can use inputlookup with append=t but not every use case can be expressed this way.

spisiakmi

Hi ITWhisperer, you are unbelievable. It is working. Even though I don't show the Link anywhere, it is hidden behind the Label. Thank you very much.

robertlynch2020

Hi Thanks for this - this worked very well. Cheers for the help

PrewinThomas

@Anders333 The lookup file is created with the fields _time and test; you then run stats values(test) as testing, which produces a new field, testing, in the search results. Splunk lookup files are schema‑flexible. If later commands introduce new fields, Splunk adds them as new columns, even if they’re empty for existing rows. If you need only the testing field, put your outputlookup command after your stats. Eg: | makeresults | eval test = "this is a testing thing" | stats values(test) as testing | outputlookup append=false test.csv Regards, Prewin 🌟If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!

PickleRick

Let me add a different perspective on this riddle. Not a different solution, but maybe a slightly different way of getting there. First thing, which might not be obvious for newcomers to regexes - while a natural initial thought would be to match the whole street part (the part between the third and fourth pipe delimiters), you can't just do that because the replacement string must be of fixed width. You could of course replace the whole street part with a string of asterisks but that's not what we're after. We want to replace only letters. That means we need to replace only a single letter each time. That is the core of this challenge - how to match a single letter but only in a specific context? Well, this is where lookaheads and lookbehinds come into play. With them we can actually try to "check" the surroundings of our match without actually consuming the search space. And that's what we need here. We need to match our non-whitespace character which is not a pipe either (this part is easy) [^\s|] but make it match only if it's in the street part of our event. As a side note - strictly formally, the result will not be a regular expression in language theory terms. But we have PCRE at our disposal and PCRE has some useful constructs and at this point we don't care whether they are "formally" regular expressions 😉 So, circling back to our lookaheads and lookbehinds - we hit another brick wall trying to intuitively anchor our match to the beginning of the event.
It would be very easy to use lookbehind and match only after a string containing three pipe characters (or even better, we could use the fact that the fields are of constant width and just count the characters up to the third pipe) and we'd end up with something like (?<=^.{50}\|[^|]+)[^\s|] (the 50 should be adjusted to the actual number of characters; I was too lazy to count 😁) - the lookbehind would match the part up to the pipe and additional non-pipe characters after that and we'd accept any single non-space character after that. And it would be a very good idea but there is one issue with it - it won't work. Why is that? Because lookbehinds must have a constant width. This way we could only match the first character in the street part. So that's not the way to go. Luckily the lookahead does not have the fixed-width limitation. So we can use the fact that after our matching letter we need to have a string with well-defined contents. The initial naive approach of just counting pipe characters for the remaining fields gets us to this lookahead matching anything up to our first pipe character and then 16 more pipes because we have that many more fields. (?=[^|]*\|([^|]+\|){16}$) This approach will work but it will be very inefficient (it requires almost 142k steps to match our sample data). But we can use the fact that the fields are of constant size. Simply replacing the 16 repetitions of the pipe-ending field with a constant number of characters does wonders. (?=[^|]*\|.{135}$) This lookahead drops us down to 16173 steps. So the final pattern to match and replace with a single asterisk would be [^| ](?=[^|]*\|.{135}$)
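To see the fixed-width lookahead at work on something small, here is a minimal Python sketch using the standard re module (which, like PCRE, rejects variable-width lookbehinds but allows variable-width lookaheads). The record layout, field widths, and function name are hypothetical and much shorter than the puzzle data, so the tail length is 12 here rather than 135.

```python
import re


def mask_street(record, tail_len):
    """Replace every non-space, non-pipe character in the street field with '*'.

    tail_len is the number of characters that follow the pipe closing the
    street field, so the lookahead only succeeds inside that field.
    """
    pattern = re.compile(r"[^|\s](?=[^|]*\|.{%d}$)" % tail_len)
    return pattern.sub("*", record)


# Hypothetical fixed-width layout, every field terminated by a pipe:
# id(4) | first(6) | last(6) | street(16) | city(8) | country(2) |
record = "0001|Alice |Smith |12 High Street  |London  |UK|"

# Characters after the street's closing pipe: each later field plus its pipe
widths_after_street = [8, 2]
tail_len = sum(width + 1 for width in widths_after_street)  # 12

masked = mask_street(record, tail_len)
print(masked)  # 0001|Alice |Smith |** **** ******  |London  |UK|
```

Note that, as with the final pattern above, digits in the street field are masked too, and the record length is unchanged because each match is replaced by exactly one asterisk.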

livehybrid · ‎11-11-2025

Hi @lady_bl00dst0n3 Are you referring to when you are writing a search in the search bar? Unfortunately I'm not aware of any settings that can be changed to prevent this auto-running like you can on a dashboard.

ITWhisperer · ‎11-10-2025

Welcome to the Splunk Puzzle Playground

If you are anything like me, you love to solve problems, and what better way to do it than with Splunk! Expand your Splunkiverse by learning and using lesser known/used commands, techniques, and data analysis insights to solve innovative puzzles and challenges.

The Slack channel

The Slack #puzzles channel has been specifically set up as an arena for having fun both with setting and solving puzzles. It is a fun channel for anyone to post puzzles etc. and for others to respond with solutions, queries and spoilers, which demonstrate some of the fun things that can be done with Splunk. Think about it as being mini (or maxi) versions of B.O.R.E. (Boss Of Regular Expressions) and/or SPLing Bee and/or B.O.T.S. (Boss Of The SOC) all year round, not just for the few hours we get at .conf!

All levels

The puzzles are open to all levels, and, depending on your level of expertise, they could take anything from just a few minutes (which you might do in your breaks - to relax?) to something more challenging (which may take multiple breaks over several days to completely nail!). You do not need to be an expert with Splunk, and if you get stuck, or want some pointers, just ask; there is no shame in the willingness to learn!

Monthly puzzles

I aim to have a new puzzle every month and would welcome contributions from others in this regard. You do not necessarily need to have a solution yourself (although it might be helpful if you did), as it could be a problem you are struggling with and want some help. If you need help framing your puzzle, please reach out to me and I will do what I can to help. Please bear in mind that data sets (where needed for the puzzle) would have to be non-NDA, non-proprietary, non-confidential, basically public-domain and freely available e.g. Splunk Tutorial, Montgomery County, github, etc., or generated, e.g. eventgen, makeresults, gentimes, etc. or posted in regex101.com, for example.

Answers and solutions

The puzzles can be answered and commented on in the Slack channel, for example, if you would like some hints, or have some hints for others. I will also usually try to have a Community Blog post published after about a week, giving hints and sometimes partial solutions to help guide people to solving the puzzles. Have questions or thoughts? Comment on this article or in Slack #puzzles channel. Whichever you prefer.

_guy · ‎11-10-2025

Thanks for the suggestion richgalloway, but yes, I tried that and it resulted in this nav_chart_mode="\2"

Posts	11493
Solutions	2320
Karma Given	80
Karma Received	3460
Member Since	‎2020-08-18

[Puzzles] Solve, Learn, Repeat: Nested loops in Ev...

[Puzzles] Solve, Learn, Repeat: Dereferencing XML ...

[Puzzles] Solve, Learn, Repeat: Character substitu...

Solve, Learn, Repeat: New Puzzle Channel Now Live

Fun with Regular Expression - multiples of nine

BORE at .conf25

Dashboards: Hiding charts while search is being ex...

Buttercup Games: Further Dashboarding Techniques (...

Buttercup Games: Further Dashboarding Techniques (...

Buttercup Games: Further Dashboarding Techniques (...

Re: Group events based on order of occurance of th...

Re: best way to check data exists before insert

Re: Text box and Drop down alignment issues

Re: Streamstats with rex and multiple "by" fields

[Puzzles] Solve, Learn, Repeat: Nested loops in Ev...

Re: How do I disable redirection warning?

Re: Is it possible to create re-usable inputs and ...

Re: script

[Puzzles] Solve, Learn, Repeat: Dereferencing XML ...

Re: How to rename dynamic fields ?

Re: Help Getting Grandparent Processes added to ev...

Re: APM -http.server.request.duration metrics comi...

Re: List of All Apps on Splunkbase

Re: Occasional trailing white space in created fie...

Re: How to finetune subsearch

Re: Extraction does not take place in raw events

Re: Create a HEC Token For splunk Enterprise & Spl...

Re: Limits of events returned

Re: link from lookup table into drilldown new page

Re: How to Graph white space before and after my d...

Re: Outputlookup followed by stats command causes ...

Re: [Puzzles] Solve, Learn, Repeat: Character subs...

Re: disable automatic search execution when time r...

Solve, Learn, Repeat: New Puzzle Channel Now Live

Re: classic xml - token eval using replace functio...
