Splunk Search

Special fillnull? Polulate null values with evaluated calculations

morethanyell
Builder

This is my code

 index="google_apis" source="https://www.googleapis.com/youtube"
 | timechart span=1h avg(subCount) by channel

the results looks something like this

alt text

Now, my problem is I can't seem to find a way on how to fill the null values with this formula:

"average of the field" + ("stdev of the field" * random(-3, 3))

My intention is to fill the null values with psuedo values that is 3 sigmas away (below or above) from the mean of the fields. I can get the avg and stdev by the eventstats command. I can also get random number between -3 to 3. But I can't put them to the right place.

Thanks in advance

0 Karma
1 Solution

morethanyell
Builder

Below code is working and possibly the most efficient way of doing it. I'm accepting this as the correct answer but it was only because of @VatsalJagani 's answer that I was able to work it out.

index="google_apis" source="https://www.googleapis.com/youtube" 
| timechart span=1h avg(subCount) by channel 
| fields - NULL 
| eventstats avg as a_*, stdev as s_* 
| eval rand = tostring("0." . random())
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| foreach *
    [ eval "<<FIELD>>" = floor(ifnull('<<FIELD>>', 'a_<<FIELD>>' + ('s_<<FIELD>>' * rand), '<<FIELD>>'))] 
| fields - a_*, s_*, rand

View solution in original post

0 Karma

morethanyell
Builder

Below code is working and possibly the most efficient way of doing it. I'm accepting this as the correct answer but it was only because of @VatsalJagani 's answer that I was able to work it out.

index="google_apis" source="https://www.googleapis.com/youtube" 
| timechart span=1h avg(subCount) by channel 
| fields - NULL 
| eventstats avg as a_*, stdev as s_* 
| eval rand = tostring("0." . random())
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| foreach *
    [ eval "<<FIELD>>" = floor(ifnull('<<FIELD>>', 'a_<<FIELD>>' + ('s_<<FIELD>>' * rand), '<<FIELD>>'))] 
| fields - a_*, s_*, rand
0 Karma

VatsalJagani
SplunkTrust
SplunkTrust

Hi @morethanyell,

Check below query, it is really easy to understand whats happening.

| makeresults | eval a="12", b="13" | append [| makeresults | eval a="13"] | append [| makeresults | eval b="3"] | append [| makeresults | eval a="2", b="1"] 
| appendpipe [| stats stdev(*) as stdev_*, avg(*) as avg_*
| foreach avg_* 
    [ eval r_<<MATCHSTR>>=random() % 2 
    | eval r_<<MATCHSTR>>=if(r_<<MATCHSTR>>==0,1,-1) 
    | eval r_<<MATCHSTR>>=(random() % 4) * r_<<MATCHSTR>>
    | eval replace_<<MATCHSTR>> = <<FIELD>> + (stdev_<<MATCHSTR>> * r_<<MATCHSTR>>)] | table replace_*]
| eventstats avg(replace_*) as replace_*
| foreach replace_* [eval <<MATCHSTR>>=coalesce(<<MATCHSTR>>, <<FIELD>>)]
| fields - replace_*

Above query will work, just replace first line with your query in question. Below is just understanding how the query is working, if you are interested.

   | makeresults | eval a="12", b="13" | append [| makeresults | eval a="13"] | append [| makeresults | eval b="3"] | append [| makeresults | eval a="2", b="1"] 

is just to make similar results that you are having.

| appendpipe [| stats stdev(*) as stdev_*, avg(*) as avg_*

Finds the stdev and avg.

| foreach avg_* 
    [ eval r_<<MATCHSTR>>=random() % 2 
    | eval r_<<MATCHSTR>>=if(r_<<MATCHSTR>>==0,1,-1) 
    | eval r_<<MATCHSTR>>=(random() % 4) * r_<<MATCHSTR>>
    | eval replace_<<MATCHSTR>> = <<FIELD>> + (stdev_<<MATCHSTR>> * r_<<MATCHSTR>>)]  | table replace_*]

Finds the random number between (-3,3) and find the value that can be replaced where NULL is present. (replace_* fields)

| eventstats avg(replace_*) as replace_*
| foreach replace_* [eval <<MATCHSTR>>=coalesce(<<MATCHSTR>>, <<FIELD>>)]

Replace NULL values with replace_* field values with coalesce command.

| fields - replace_*

Just to remove unnecessary field replace_*

Hope this helps!!! BTW @morethanyell this is really nice question.

morethanyell
Builder

Amazing work. But how can I capture the fields with spaces?
EDIT: i worked it out. thank you very much!

morethanyell
Builder

Brute force method I used:

index="google_apis" source="https://www.googleapis.com/youtube"
| timechart span=4h avg(subCount) by channel 
| fields - NULL 
| eval rand = tonumber("0." . random()) 
| eval rand =
    case(
    (rand > 0.00 AND rand < 0.16), -3,
    (rand > 0.16 AND rand < 0.32), -2,
    (rand > 0.32 AND rand < 0.48), -1,
    (rand > 0.48 AND rand < 0.64), 1,
    (rand > 0.64 AND rand < 0.80), 2,
    (rand > 0.80 AND rand < 1.00), 3,
    true(), 1) 
| eventstats avg as a_*, stdev as s_* 
| eval fnv = 'a_Making it happen Vlog' + (rand * 's_Making it happen Vlog') 
| eval "Making it happen Vlog" = floor(coalesce('Making it happen Vlog', fnv))
| eval fnv = 'a_The Juicy Vlog' + (rand * 's_The Juicy Vlog') 
| eval "The Juicy Vlog" = floor(coalesce('The Juicy Vlog', fnv))
| eval fnv = a_Morgz + (rand * s_Morgz) 
| eval Morgz = floor(coalesce(Morgz, fnv))
| eval fnv = a_MrBeast + (rand * s_MrBeast) 
| eval MrBeast = floor(coalesce(MrBeast, fnv))
| eval fnv = a_PewDiePie + (rand * s_PewDiePie) 
| eval PewDiePie = floor(coalesce(PewDiePie, fnv))
| eval fnv = a_Music + (rand * s_Music) 
| eval Music = floor(coalesce(Music, fnv))
| eval fnv = a_TSeries + (rand * s_TSeries) 
| eval TSeries = floor(coalesce(TSeries, fnv))
| fields - a_*, s_*, rand, fnv
| fields _time, *
0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...