Splunk Search

Can someone give me an incredibly simple custom streaming Splunk command?

Builder

All of the examples I've seen are too advanced or don't describe the code line by line.

Can someone take the time to write up a custom command for something like this:

Supply a numerical field from the events. Add a new field called "the_total" to each event, where "the_total" is the sum of the digits in that field.
Example: phone_number=5557884432, the_total=51

I don't really need code to add up the values. I need code for ingesting the data and sending it back to Splunk; that is the part I don't understand. I'm also looking for beginner tutorials.

1 Solution

Builder

I finally figured it out. I didn't know how to test the script because it imports a Splunk library that is only available on the Splunk servers, so I debugged it and worked out the event structure by repeatedly updating and uploading the script and doing a debug refresh on the servers. To see the data I needed, I wrote it to a new field in each event. Crude, I know.

Anyway, here is the script, commented line by line, for anyone else who was as clueless as I was about how to ingest events and send the results back to Splunk. It takes in one field, computes its entropy, and places that value back into the event as a new field:

import math
import csv
import sys
import re
import time
import splunk.Intersplunk

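def entropy(string):
  # Probability of each distinct character in the string
  prob = [ float(string.count(c)) / len(string) for c in dict.fromkeys(list(string)) ]
  # Shannon entropy in bits: the sum of -p * log2(p) over those probabilities
  # (stored in "value" so the local name does not shadow the function itself)
  value = - sum([ p * math.log(p) / math.log(2.0) for p in prob ])
  return value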

# A basic shell for any custom streaming command. Just pass the events to it
def customcommand(results, settings):
  try:
    # Get the command's arguments (the value(s) that are passed after the command in the Splunk query)
    fields, argvals = splunk.Intersplunk.getKeywordsAndOptions()
    # We'll use the first parameter as the field we want to run the entropy math on
    # We don't actually use any args in this one (for commands that would pass something like blah=true)
    entropy_field = fields[0]

    # Set a default return value
    entropy_value = "Field does not exist"

    # If the parameter provided exists as a field in the event, run the entropy math on its value
    for result in results:
      # If field exists in event
      if entropy_field in result:
        # Get the field's actual value
        entropy_value = result[entropy_field]
        # Create the new field we'll place into the events
        newfield = entropy_field + "_entropy"
        # Finally, run the math on the field's value and place it into the newfield we just created
        result[newfield] = entropy(entropy_value)

    # Let the modified events flow back into the search results
    splunk.Intersplunk.outputResults(results)

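  except Exception:
    # Send the traceback back to Splunk as an error result instead of silently discarding it
    import traceback
    stack = traceback.format_exc()
    results = splunk.Intersplunk.generateErrorResults("Error : Traceback: " + str(stack))
    splunk.Intersplunk.outputResults(results)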



#####Start######

# Get the events from splunk
results, dummyresults, settings = splunk.Intersplunk.getOrganizedResults()
# Send the events to be worked on (customcommand outputs the results itself and returns nothing)
customcommand(results, settings)

How it works:

  1. The script takes in the events flowing from Splunk with splunk.Intersplunk.getOrganizedResults().
  2. It sends those events to customcommand(results, settings), which reads the field name passed to the command (example: | entropy host) and checks whether that field exists in each event.
  3. If the field exists in the event, it runs the entropy function on the field's value and places the result into the event as a new field with "result[newfield] = entropy(entropy_value)".
  4. Finally, it sends those events back to Splunk to be displayed in the search results with "splunk.Intersplunk.outputResults(results)".
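
One way to avoid the upload-and-refresh loop described earlier is to keep the per-event logic in a plain function that never touches splunk.Intersplunk, and exercise it locally with hand-made events. This is only a minimal sketch: it assumes, as the script above does, that each result behaves like a dict of field name to string value. The names add_entropy and events are hypothetical and not part of any Splunk library.

```python
import math

def entropy(string):
    # Shannon entropy in bits, same formula as the script above
    prob = [float(string.count(c)) / len(string) for c in dict.fromkeys(string)]
    return -sum(p * math.log(p) / math.log(2.0) for p in prob)

def add_entropy(results, field):
    # Hypothetical helper that mirrors the loop inside customcommand()
    for result in results:
        if field in result:
            result[field + "_entropy"] = entropy(result[field])
    return results

# Fake events standing in for what getOrganizedResults() would hand over
# (assumption: dict-like rows with string values)
events = [{"host": "aaaa"}, {"host": "ab"}, {"source": "no host field"}]
for event in add_entropy(events, "host"):
    print(event)
```

On the Splunk side, customcommand() would then only need to read the field name, call the helper, and pass the result to splunk.Intersplunk.outputResults().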

Usage:

index=* sourcetype=* | entropy myfield

That will return a new field called myfield_entropy.
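
The same loop can be adapted to the original question (a "the_total" field holding the sum of the digits in a numeric field). Here is a rough sketch of just the per-event part, again assuming dict-like events with string values. The names digit_total and add_digit_total are hypothetical; in a real command this would sit between getOrganizedResults() and outputResults(), just as the entropy loop does.

```python
def digit_total(value):
    # Sum every decimal digit in the value, ignoring any other characters
    return sum(int(ch) for ch in str(value) if ch in "0123456789")

def add_digit_total(results, field):
    # Add "the_total" to each event that has the requested field
    for result in results:
        if field in result:
            result["the_total"] = digit_total(result[field])
    return results

# Sample event taken from the example in the question
events = [{"phone_number": "5557884432"}]
print(add_digit_total(events, "phone_number"))
```

For the phone number in the question this gives the_total=51, matching the example.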

Influencer

I don't quite understand - you want the data taken out of splunk and then reindexed?

Your actual example is quite simple to do within Splunk. Take the following run-anywhere example (gentimes just provides some data to work with):

|gentimes start=-1 | fields starttime | eval the_total=split(starttime,"") | mvexpand the_total | stats values(starttime) sum(the_total)

Is that what you're after?

I reckon the best Splunk tutorial is the official one: see http://docs.splunk.com/Documentation/Splunk/latest/SearchTutorial/WelcometotheSearchTutorial

0 Karma

Builder

No re-indexing or anything. I just want a new field in each event that comes back in my search. In reality, I want to do things to the values that are not possible in Splunk's query language (entropy math).

It should look like this:

index=* sourcetype=* | entropy domain_name | table domain_name entropy_value

domain_name     entropy_value
asdasdas.com    3.4
google.com      2.8
aaa             0.0

0 Karma

Influencer

Have you tried this app? https://splunkbase.splunk.com/app/2734/#/overview

Pretty sure it will do what you need

0 Karma

Builder

The domain name was just an example. I'd like to be able to supply a field and value (string) and get an entropy value back.

0 Karma