Splunk Search

append data from two indexes

mcohen13
Loves-to-Learn

i have two indexes:
index#1 contain raw event log.
from this event log i calc for every domain the number of events so that i have:
domain_name description Event_count
in this index i look at time span in the time span selection in search

index#2 contain aggregated information on domain meaning for each domain i have query_count on each day
in this index i calc more data for every domain_name
in this index i look at time span of last 30 days

what i want at the end of the day is that the query will return for me a table that will contain:
event_domain Event_count Dates_Count SumQueries MaxQueries avg30Days

i tried to use join but some domain that appear in index2 don't appear after the join

the query i use:
index="event_raw_data" | join event_domain [search index="domain_agg_info" earliest=-30d | eval epoch33days_ago=relative_time(now(), "-33d@d" ) | eval epochEventDays = strptime(date,"%Y-%m-%d") | where epochEventDays > epoch33days_ago | eventstats dc(date) as "Dates_Count" by event_domain| eventstats count(date) as "Record_count" by event_domain | eventstats max(query_count) as "MaxQueries" by date | eventstats max(Dates_Count) as "MaxDatesCount"| eventstats sum(query_count) as "SumQueries" by event_domain | eventstats avg(customer_count) as "AvgCustomerCount" by event_domain | eval AvgCustomerCount=round(AvgCustomerCount,0)| eval avg30Days=round(if(Record_count < 30,SumQueries/MaxDatesCount,SumQueries/Record_count)) | eval avg30Days=avg30Days+1 | eventstats max(query_count) as "MaxQueries" by event_domain | eval Ratio = round(MaxQueries/avg30Days,3) | where Ratio <= 5 ] | eventstats count as "Event_count" by event_domain | table event_domain,Event_count,Dates_Count,AvgCustomerCount,Ratio,SumQueries,MaxQueries,avg30Days | dedup event_domain,Event_count,Dates_Count,AvgCustomerCount,Ratio,SumQueries,MaxQueries,avg30Days | sort by Event_count desc | head 10

what i need to change to get this data for every domain on index1?

Tags (1)
0 Karma

Sukisen1981
Champion

hi, the default settings of join is inner, which is to say if you have 2 indexes A and B, you will get A intersection B if you use default join.
If you want ALL values for the first index you need to specify | join type=left.
I would advise you to look at other options in the join docs - http://docs.splunk.com/Documentation/Splunk/7.1.1/SearchReference/Join
AND
do you really need a join? it has its own limitations

0 Karma

mcohen13
Loves-to-Learn

thats not answering my problem
i am familiar with join type=left but that also doesn't work properly.
some domain names that are in both indexes (i checked) doesn't appear after the join (no matter which type of join)

0 Karma

Sukisen1981
Champion

how many rows of data is the join running on?
Also, you have a head10 at the end, could it be that the domain name you are expecting is getting trimmed by the head command?
Suggest, run the query without head and see the job inspector, if the join is not able to pick data due to large volumes the job inspector will state the same.
It will be helpful to get a mock of your data and expected output

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Community Content Calendar, September edition

Welcome to another insightful post from our Community Content Calendar! We're thrilled to continue bringing ...

Splunkbase Unveils New App Listing Management Public Preview

Splunkbase Unveils New App Listing Management Public PreviewWe're thrilled to announce the public preview of ...

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...