Splunk Search

Adding two indexes to one search again

the_dude
Engager
index=jedi   domain="jedi.lightside.com" (master!="yoda" AND master!="mace" AND master="Jinn")
| table saber_color, Jname, strengths, mentor, skill, domain, mission

index-=sith  broker sithlord!=darth_maul
| table saber_color, Sname, strength, teacher, actions

I need to list where Jname=Sname, but I need to list all columns

The third one is

where the Jname!=Sname

The caveat is I cannot use the join for this query.

This helped however I am unable to utilize the index drill down for each in the search otherwise the query is 75% white noise.

index=jedi OR index=sith
| eval name=coalesce(Jname, Sname)
| stats values(name) as names by saber_color strengths
| where mvcount(names)=1

Please help.

Labels (3)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

I'm still not sure what are the source datasets and what should be the result. I see some attempts at solving this riddle in the thread but I'm not 100% sure we're all on the same page regarding what we're working with and what we want to achieve in the end.

Could you please post samples of your data and what the result should look like?

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @the_dude,

your search is correct and I like that you don't use join.

but what's te issue?

You can use the name value for the drilldown, remembering that the filed name is different in the two indexes:

so the drilldown search could be:

index=jedi OR index=sith
| eval name=coalesce(Jname, Sname)
| search name=$name$
| table ...

Ciao.

Giuseppe

0 Karma

the_dude
Engager

I need to filter out the white noise.

index-=sith broker sithlord!=darth_maul OR index=jedi domain="jedi.lightside.com" (master!="yoda" AND master!="mace" AND master="Jinn")

 

If I use the correct one, 75% of it is white noise. I need to use the above to isolate what I need specifically from the indexes. I need to show a 1:1 match plus a few other columns. I was asked to get the former but now they need more.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @the_dude,

your above search permits you to find the values common to both the indexes.

The last search gives you (in drilldown), the informations, from both the indexes relative to the choosed value.

So, what do you want to filter?

Ciao.

Giuseppe

0 Karma

the_dude
Engager

This was the concept:

index-=sith broker sithlord!=darth_maul OR index=jedi domain="jedi.lightside.com" (master!="yoda" AND master!="mace" AND master="Jinn")
| where Jname=Sname
| table Jname, Sname, strengths, mentor, skill, domain, mission, strength, teacher, actions

And

index-=sith broker sithlord!=darth_maul OR index=jedi domain="jedi.lightside.com" (master!="yoda" AND master!="mace" AND master="Jinn")
| where Jname!=Sname
| table Jname, Sname, strengths, mentor, skill, domain, mission, strength, teacher, actions

 

I am trying to get the results where Jname=Sname are the same plus all following columns. This is for a comparison for our analysts and they want the first two columns to match plus following columns. Then a report where they are not matching.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

First, it seems to me that (master!="yoda" AND master!="mace" AND master="Jinn") and master="Jinn" are semantically identical.  Is this correct? (I'm unfamiliar with the Jedi lore.)  I'll assume it to be true in the following.

Second, what is preventing you from doing, for example

index=sith broker sithlord!=darth_maul OR index=jedi domain="jedi.lightside.com" master="Jinn"
| eval name=coalesce(Jname, Sname)
| stats values(name) as names by saber_color strengths
| where mvcount(names)=1

or even

index=sith broker sithlord!=darth_maul OR index=jedi domain="jedi.lightside.com" master="Jinn"
| eval name=coalesce(Jname, Sname)
| stats values(*) as * by saber_color strengths
| where mvcount(names)=1

This way, you will have all columns preserved.

Third, could you explain "unable to utilize the index drill down for each in the search otherwise the query is 75% white noise?"  Are you trying to use "Automatic" in drilldown action?  Anything "automatic" is really Splunk's guess.  If you have something specific in mind. you will want to write custom drilldown instead.

0 Karma

dtburrows3
Builder

Would an aggregated resulting dataset be sufficient for your ask?
I tried to do what I think you are asking by utilizing a stats command to aggregate data from the two indexes together but has just a compressed overview of the analysis.

Example of output with simulation data:

dtburrows3_0-1702583556619.png


To achieve this with the base searches you provided would look like this.

(index=sith broker sithlord!=darth_maul) OR (index=jedi domain="jedi.lightside.com" (master!="yoda" AND master!="mace" AND master="Jinn"))
    | fields + _time, index, Jname, saber_color, domain, master, strengths, mentor, skill, mission, Sname, strength, teacher, actions
    | tojson str(saber_color) str(domain) str(master) str(actions) str(mentor) str(mission) str(skill) str(strength) str(strengths) str(teacher) output_field=unique_field_combos_json
    | fields + _time, index, Jname, Sname, unique_field_combos_json
    | eval
        name=coalesce('Jname', 'Sname')
    | stats
        min(_time) as earliest_event,
        max(_time) as latest_event,
        count as total_count,
        count(eval('index'=="jedi")) as jedi_count,
        count(eval('index'=="sith")) as sith_count,
        values(index) as indexes,
        dc(index) as dc_indexes,
        latest(eval(case('index'=="jedi", unique_field_combos_json))) as jedi_unique_field_combos_json,
        latest(eval(case('index'=="sith", unique_field_combos_json))) as sith_unique_field_combos_json
            by name
    | eval
        scenario=if(
            'dc_indexes'==1,
                case(
                    'indexes'=="jedi", "Jedi Only",
                    'indexes'=="sith", "Sith Only"
                    ),
                "Jedi and Sith"
            )
    | foreach *_unique_field_combos_json
        [
            | eval
                unique_field_combos_json=if(
                    isnotnull('<<FIELD>>'),
                        mvappend(
                            'unique_field_combos_json',
                            json_set('<<FIELD>>', "type", "<<MATCHSTR>>")
                            ),
                        'unique_field_combos_json'
                    )
            ]
    | fields - *_unique_field_combos_json
    | mvexpand unique_field_combos_json
    | fromjson unique_field_combos_json
    | fields - unique_field_combos_json
    | fields + name, type, scenario, total_count, jedi_count, sith_count, saber_color, domain, master, actions, mentor, mission, skill, strength, strengths, teacher
    | stats
        values(*) as *
            by name
    | fields + name, type, scenario, *_count, saber_color, domain, master, actions, mentor, mission, skill, strength, strengths, teacher
    | eval
        scenario_sort=case(
            'scenario'=="Jedi and Sith", 1,
            'scenario'=="Jedi Only", 2,
            'scenario'=="Sith Only", 3
            )
    | sort 0 +scenario_sort
    | fields - scenario_sort

 

To generate the simulation data was a doozy since I dont have a datagen setup right now but was able to put something together using build in splunk commands. SPL used to simulate for reference.

| makeresults count=1000
    | eval
        low=1,
        high=[ 
            | makeresults
                | eval
                    index="sith",
                    fields_to_gen=split("Sname|saber_color|strength|teacher|actions", "|")
                | append
                    [
                        | makeresults
                            | eval
                                index="jedi",
                                fields_to_gen=split("Jname|saber_color|strengths|mentor|skill|mission|master|domain", "|")
                        ]
                | mvexpand fields_to_gen
                | fields - _time
                | eval
                    value_format=if(
                        match('fields_to_gen', "^[A-Z]name$"), 
                            "name",
                            'fields_to_gen'
                        )
                | rename
                    fields_to_gen as fieldname
                | tojson str(fieldname) str(value_format) output_field=field_format_json
                | fields + index, field_format_json
                | stats
                    values(field_format_json) as field_format_json
                        by index
                | eval
                    field_format_json_array="[".mvjoin(field_format_json, ",")."]"
                | fields - field_format_json
                | streamstats
                    count as index_number_assignment
                | stats
                    max(index_number_assignment) as index_count
                | return $index_count
            ],
        rand=round(((random()%'high')/'high')*('high'-'low')+'low')
    | fields - low, high
    | rename
        rand as index_number_assignment
    ``` distribute timestamps ```
    | streamstats
        count as iter
    | eval
        _time=now()-('iter'/10)
    
    
    | join type=left index_number_assignment
        [
            | makeresults
                | eval
                    index="sith",
                    fields_to_gen=split("Sname|saber_color|strength|teacher|actions", "|")
                | append
                    [
                        | makeresults
                            | eval
                                index="jedi",
                                fields_to_gen=split("Jname|saber_color|strengths|mentor|skill|mission|master|domain", "|")
                        ]
                | mvexpand fields_to_gen
                | fields - _time
                | eval
                    value_format=if(
                        match('fields_to_gen', "^[A-Z]name$"), 
                            "name",
                            'fields_to_gen'
                        )
                | rename
                    fields_to_gen as fieldname
                | tojson str(fieldname) str(value_format) output_field=field_format_json
                | fields + index, field_format_json
                | stats
                    values(field_format_json) as field_format_json
                        by index
                | tojson str(index) str(field_format_json) output_field=json
                | streamstats
                    count as index_number_assignment
                | fields + index_number_assignment, json
            ]
    | fromjson json
    | fields - json, index_number_assignment
    
    | eval
        json=json_object()
    | foreach mode=multivalue field_format_json
        [
            | eval
                fieldname=spath('<<ITEM>>', "fieldname"),
                json=json_set(json, 'fieldname', spath('<<ITEM>>', "value_format")."_")
            ]
    | fields - field_format_json
    | spath input=json
    | fields - json, fieldname
    | fields + index, *
    | foreach *name
        [
            | eval
                low=1, 
                high=5, 
                rand=round(((random()%'high')/'high')*('high'-'low')+'low'),
                <<FIELD>>='<<FIELD>>'.'rand'
                | fields - low, high, rand
            ]
    | foreach *
        [
            | eval
                low=1, 
                nested_high=10,
                nested_rand=round(((random()%'nested_high')/'nested_high')*('nested_high'-'low')+'low'),
                high='nested_rand', 
                rand=round(((random()%'high')/'high')*('high'-'low')+'low'),
                <<FIELD>>=if(
                    NOT match("<<FIELD>>", "[A-Z]name$") AND NOT "<<FIELD>>"=="index",
                        '<<FIELD>>'.'rand',
                        '<<FIELD>>'
                    )
                | fields - low, high, rand, nested_high, nested_rand
            ]  
    | eval
        Jname=if(
            'index'=="jedi" AND 'Jname'=="name_1",
                "name_unique_jedi",
                'Jname'
            ),
        Sname=if(
            'index'=="sith" AND 'Sname'=="name_2",
                "name_unique_sith",
                'Sname'
            )
            
            
            
            
```
(index-=sith broker sithlord!=darth_maul) OR (index=jedi domain="jedi.lightside.com" (master!="yoda" AND master!="mace" AND master="Jinn"))
    | fields + _time, index, Jname, saber_color, domain, master, strengths, mentor, skill, mission, Sname, strength, teacher, actions
```
    | tojson str(saber_color) str(domain) str(master) str(actions) str(mentor) str(mission) str(skill) str(strength) str(strengths) str(teacher) output_field=unique_field_combos_json
    | fields + _time, index, Jname, Sname, unique_field_combos_json
    | eval
        name=coalesce('Jname', 'Sname')
    | stats
        min(_time) as earliest_event,
        max(_time) as latest_event,
        count as total_count,
        count(eval('index'=="jedi")) as jedi_count,
        count(eval('index'=="sith")) as sith_count,
        values(index) as indexes,
        dc(index) as dc_indexes,
        latest(eval(case('index'=="jedi", unique_field_combos_json))) as jedi_unique_field_combos_json,
        latest(eval(case('index'=="sith", unique_field_combos_json))) as sith_unique_field_combos_json
            by name
    | eval
        scenario=if(
            'dc_indexes'==1,
                case(
                    'indexes'=="jedi", "Jedi Only",
                    'indexes'=="sith", "Sith Only"
                    ),
                "Jedi and Sith"
            )
    | foreach *_unique_field_combos_json
        [
            | eval
                unique_field_combos_json=if(
                    isnotnull('<<FIELD>>'),
                        mvappend(
                            'unique_field_combos_json',
                            json_set('<<FIELD>>', "type", "<<MATCHSTR>>")
                            ),
                        'unique_field_combos_json'
                    )
            ]
    | fields - *_unique_field_combos_json
    | mvexpand unique_field_combos_json
    | fromjson unique_field_combos_json
    | fields - unique_field_combos_json
    | fields + name, type, scenario, total_count, jedi_count, sith_count, saber_color, domain, master, actions, mentor, mission, skill, strength, strengths, teacher
    | stats
        values(*) as *
            by name
    | fields + name, type, scenario, *_count, saber_color, domain, master, actions, mentor, mission, skill, strength, strengths, teacher
    | eval
        scenario_sort=case(
            'scenario'=="Jedi and Sith", 1,
            'scenario'=="Jedi Only", 2,
            'scenario'=="Sith Only", 3
            )
    | sort 0 +scenario_sort
    | fields - scenario_sort
0 Karma

the_dude
Engager

It is correct somewhat, I'm trying to 1:1 for the two specific columns. The use it just to start with the two columns matching at first, then another where they do not. Where Qui-gonn Jinn is in both Sith and Jedi indexes and listed in both columns. For some reason I thinking I might be making this more difficult than it needs to be. If the two IDs match in both columns then they are listed with the rest. Hopefully that clear is up. I am still trying to relearn the whole search in Splunk currently so I do apologize.

0 Karma
Get Updates on the Splunk Community!

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...

.conf24 | Learning Tracks for Security, Observability, Platform, and Developers!

.conf24 is taking place at The Venetian in Las Vegas from June 11 - 14. Continue reading to learn about the ...

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...