Splunk Search

Special fillnull? Polulate null values with evaluated calculations

morethanyell
Builder

This is my code

 index="google_apis" source="https://www.googleapis.com/youtube"
 | timechart span=1h avg(subCount) by channel

the results looks something like this

alt text

Now, my problem is I can't seem to find a way on how to fill the null values with this formula:

"average of the field" + ("stdev of the field" * random(-3, 3))

My intention is to fill the null values with psuedo values that is 3 sigmas away (below or above) from the mean of the fields. I can get the avg and stdev by the eventstats command. I can also get random number between -3 to 3. But I can't put them to the right place.

Thanks in advance

0 Karma
1 Solution

morethanyell
Builder

Below code is working and possibly the most efficient way of doing it. I'm accepting this as the correct answer but it was only because of @VatsalJagani 's answer that I was able to work it out.

index="google_apis" source="https://www.googleapis.com/youtube" 
| timechart span=1h avg(subCount) by channel 
| fields - NULL 
| eventstats avg as a_*, stdev as s_* 
| eval rand = tostring("0." . random())
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| foreach *
    [ eval "<<FIELD>>" = floor(ifnull('<<FIELD>>', 'a_<<FIELD>>' + ('s_<<FIELD>>' * rand), '<<FIELD>>'))] 
| fields - a_*, s_*, rand

View solution in original post

0 Karma

morethanyell
Builder

Below code is working and possibly the most efficient way of doing it. I'm accepting this as the correct answer but it was only because of @VatsalJagani 's answer that I was able to work it out.

index="google_apis" source="https://www.googleapis.com/youtube" 
| timechart span=1h avg(subCount) by channel 
| fields - NULL 
| eventstats avg as a_*, stdev as s_* 
| eval rand = tostring("0." . random())
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| foreach *
    [ eval "<<FIELD>>" = floor(ifnull('<<FIELD>>', 'a_<<FIELD>>' + ('s_<<FIELD>>' * rand), '<<FIELD>>'))] 
| fields - a_*, s_*, rand
0 Karma

VatsalJagani
SplunkTrust
SplunkTrust

Hi @morethanyell,

Check below query, it is really easy to understand whats happening.

| makeresults | eval a="12", b="13" | append [| makeresults | eval a="13"] | append [| makeresults | eval b="3"] | append [| makeresults | eval a="2", b="1"] 
| appendpipe [| stats stdev(*) as stdev_*, avg(*) as avg_*
| foreach avg_* 
    [ eval r_<<MATCHSTR>>=random() % 2 
    | eval r_<<MATCHSTR>>=if(r_<<MATCHSTR>>==0,1,-1) 
    | eval r_<<MATCHSTR>>=(random() % 4) * r_<<MATCHSTR>>
    | eval replace_<<MATCHSTR>> = <<FIELD>> + (stdev_<<MATCHSTR>> * r_<<MATCHSTR>>)] | table replace_*]
| eventstats avg(replace_*) as replace_*
| foreach replace_* [eval <<MATCHSTR>>=coalesce(<<MATCHSTR>>, <<FIELD>>)]
| fields - replace_*

Above query will work, just replace first line with your query in question. Below is just understanding how the query is working, if you are interested.

   | makeresults | eval a="12", b="13" | append [| makeresults | eval a="13"] | append [| makeresults | eval b="3"] | append [| makeresults | eval a="2", b="1"] 

is just to make similar results that you are having.

| appendpipe [| stats stdev(*) as stdev_*, avg(*) as avg_*

Finds the stdev and avg.

| foreach avg_* 
    [ eval r_<<MATCHSTR>>=random() % 2 
    | eval r_<<MATCHSTR>>=if(r_<<MATCHSTR>>==0,1,-1) 
    | eval r_<<MATCHSTR>>=(random() % 4) * r_<<MATCHSTR>>
    | eval replace_<<MATCHSTR>> = <<FIELD>> + (stdev_<<MATCHSTR>> * r_<<MATCHSTR>>)]  | table replace_*]

Finds the random number between (-3,3) and find the value that can be replaced where NULL is present. (replace_* fields)

| eventstats avg(replace_*) as replace_*
| foreach replace_* [eval <<MATCHSTR>>=coalesce(<<MATCHSTR>>, <<FIELD>>)]

Replace NULL values with replace_* field values with coalesce command.

| fields - replace_*

Just to remove unnecessary field replace_*

Hope this helps!!! BTW @morethanyell this is really nice question.

morethanyell
Builder

Amazing work. But how can I capture the fields with spaces?
EDIT: i worked it out. thank you very much!

morethanyell
Builder

Brute force method I used:

index="google_apis" source="https://www.googleapis.com/youtube"
| timechart span=4h avg(subCount) by channel 
| fields - NULL 
| eval rand = tonumber("0." . random()) 
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| eventstats avg as a_*, stdev as s_* 
| eval fnv = 'a_Making it happen Vlog' + (rand * 's_Making it happen Vlog') 
| eval "Making it happen Vlog" = floor(coalesce('Making it happen Vlog', fnv))
| eval fnv = 'a_The Juicy Vlog' + (rand * 's_The Juicy Vlog') 
| eval "The Juicy Vlog" = floor(coalesce('The Juicy Vlog', fnv))
| eval fnv = a_Morgz + (rand * s_Morgz) 
| eval Morgz = floor(coalesce(Morgz, fnv))
| eval fnv = a_MrBeast + (rand * s_MrBeast) 
| eval MrBeast = floor(coalesce(MrBeast, fnv))
| eval fnv = a_PewDiePie + (rand * s_PewDiePie) 
| eval PewDiePie = floor(coalesce(PewDiePie, fnv))
| eval fnv = a_Music + (rand * s_Music) 
| eval Music = floor(coalesce(Music, fnv))
| eval fnv = a_TSeries + (rand * s_TSeries) 
| eval TSeries = floor(coalesce(TSeries, fnv))
| fields - a_*, s_*, rand, fnv
| fields _time, *
0 Karma
Get Updates on the Splunk Community!

Federated Search for Amazon S3 | Key Use Cases to Streamline Compliance Workflows

Modern business operations are supported by data compliance. As regulations evolve, organizations must ...

New Dates, New City: Save the Date for .conf25!

Wake up, babe! New .conf25 dates AND location just dropped!! That's right, this year, .conf25 is taking place ...

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud  In today’s fast-paced digital ...