Splunk Search

Special fillnull? Polulate null values with evaluated calculations

morethanyell
Builder

This is my code

 index="google_apis" source="https://www.googleapis.com/youtube"
 | timechart span=1h avg(subCount) by channel

the results looks something like this

alt text

Now, my problem is I can't seem to find a way on how to fill the null values with this formula:

"average of the field" + ("stdev of the field" * random(-3, 3))

My intention is to fill the null values with psuedo values that is 3 sigmas away (below or above) from the mean of the fields. I can get the avg and stdev by the eventstats command. I can also get random number between -3 to 3. But I can't put them to the right place.

Thanks in advance

0 Karma
1 Solution

morethanyell
Builder

Below code is working and possibly the most efficient way of doing it. I'm accepting this as the correct answer but it was only because of @VatsalJagani 's answer that I was able to work it out.

index="google_apis" source="https://www.googleapis.com/youtube" 
| timechart span=1h avg(subCount) by channel 
| fields - NULL 
| eventstats avg as a_*, stdev as s_* 
| eval rand = tostring("0." . random())
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| foreach *
    [ eval "<<FIELD>>" = floor(ifnull('<<FIELD>>', 'a_<<FIELD>>' + ('s_<<FIELD>>' * rand), '<<FIELD>>'))] 
| fields - a_*, s_*, rand

View solution in original post

0 Karma

morethanyell
Builder

Below code is working and possibly the most efficient way of doing it. I'm accepting this as the correct answer but it was only because of @VatsalJagani 's answer that I was able to work it out.

index="google_apis" source="https://www.googleapis.com/youtube" 
| timechart span=1h avg(subCount) by channel 
| fields - NULL 
| eventstats avg as a_*, stdev as s_* 
| eval rand = tostring("0." . random())
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| foreach *
    [ eval "<<FIELD>>" = floor(ifnull('<<FIELD>>', 'a_<<FIELD>>' + ('s_<<FIELD>>' * rand), '<<FIELD>>'))] 
| fields - a_*, s_*, rand
0 Karma

VatsalJagani
SplunkTrust
SplunkTrust

Hi @morethanyell,

Check below query, it is really easy to understand whats happening.

| makeresults | eval a="12", b="13" | append [| makeresults | eval a="13"] | append [| makeresults | eval b="3"] | append [| makeresults | eval a="2", b="1"] 
| appendpipe [| stats stdev(*) as stdev_*, avg(*) as avg_*
| foreach avg_* 
    [ eval r_<<MATCHSTR>>=random() % 2 
    | eval r_<<MATCHSTR>>=if(r_<<MATCHSTR>>==0,1,-1) 
    | eval r_<<MATCHSTR>>=(random() % 4) * r_<<MATCHSTR>>
    | eval replace_<<MATCHSTR>> = <<FIELD>> + (stdev_<<MATCHSTR>> * r_<<MATCHSTR>>)] | table replace_*]
| eventstats avg(replace_*) as replace_*
| foreach replace_* [eval <<MATCHSTR>>=coalesce(<<MATCHSTR>>, <<FIELD>>)]
| fields - replace_*

Above query will work, just replace first line with your query in question. Below is just understanding how the query is working, if you are interested.

   | makeresults | eval a="12", b="13" | append [| makeresults | eval a="13"] | append [| makeresults | eval b="3"] | append [| makeresults | eval a="2", b="1"] 

is just to make similar results that you are having.

| appendpipe [| stats stdev(*) as stdev_*, avg(*) as avg_*

Finds the stdev and avg.

| foreach avg_* 
    [ eval r_<<MATCHSTR>>=random() % 2 
    | eval r_<<MATCHSTR>>=if(r_<<MATCHSTR>>==0,1,-1) 
    | eval r_<<MATCHSTR>>=(random() % 4) * r_<<MATCHSTR>>
    | eval replace_<<MATCHSTR>> = <<FIELD>> + (stdev_<<MATCHSTR>> * r_<<MATCHSTR>>)]  | table replace_*]

Finds the random number between (-3,3) and find the value that can be replaced where NULL is present. (replace_* fields)

| eventstats avg(replace_*) as replace_*
| foreach replace_* [eval <<MATCHSTR>>=coalesce(<<MATCHSTR>>, <<FIELD>>)]

Replace NULL values with replace_* field values with coalesce command.

| fields - replace_*

Just to remove unnecessary field replace_*

Hope this helps!!! BTW @morethanyell this is really nice question.

morethanyell
Builder

Amazing work. But how can I capture the fields with spaces?
EDIT: i worked it out. thank you very much!

morethanyell
Builder

Brute force method I used:

index="google_apis" source="https://www.googleapis.com/youtube"
| timechart span=4h avg(subCount) by channel 
| fields - NULL 
| eval rand = tonumber("0." . random()) 
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| eventstats avg as a_*, stdev as s_* 
| eval fnv = 'a_Making it happen Vlog' + (rand * 's_Making it happen Vlog') 
| eval "Making it happen Vlog" = floor(coalesce('Making it happen Vlog', fnv))
| eval fnv = 'a_The Juicy Vlog' + (rand * 's_The Juicy Vlog') 
| eval "The Juicy Vlog" = floor(coalesce('The Juicy Vlog', fnv))
| eval fnv = a_Morgz + (rand * s_Morgz) 
| eval Morgz = floor(coalesce(Morgz, fnv))
| eval fnv = a_MrBeast + (rand * s_MrBeast) 
| eval MrBeast = floor(coalesce(MrBeast, fnv))
| eval fnv = a_PewDiePie + (rand * s_PewDiePie) 
| eval PewDiePie = floor(coalesce(PewDiePie, fnv))
| eval fnv = a_Music + (rand * s_Music) 
| eval Music = floor(coalesce(Music, fnv))
| eval fnv = a_TSeries + (rand * s_TSeries) 
| eval TSeries = floor(coalesce(TSeries, fnv))
| fields - a_*, s_*, rand, fnv
| fields _time, *
0 Karma
Get Updates on the Splunk Community!

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...