How to combine 2 csv with 3 csv to get the correct data

aditsss · 05-17-2021

Hi Everyone,

I have a requirement to combine 3 stats csv files with 3 master csv files, one pair each for environments E1, E2 and E3. I wrote the following query:

| inputlookup JOB_MDJX_CS_STATS_2.csv
| append [ inputlookup JOB_MDJX_CS_STATS_2_E2.csv ]
| append [ inputlookup JOB_MDJX_CS_STATS_2_E3.csv ]
| join type=outer JOBFLOW_ID [ inputlookup JOB_MDJX_CS_MASTER.CSV ]
| join type=outer JOBFLOW_ID [ inputlookup JOB_MDJX_CS_MASTER_E2.csv ]
| join type=outer JOBFLOW_ID [ inputlookup JOB_MDJX_CS_MASTER_E3.csv ]
| where Environment="E3"
| eval y=20
| eval Run_date1=y."".RUNDATE2
| eval Run_Date=strftime(strptime(Run_date1,"%Y%m%d"),"%d/%m/%Y")
| eval nowdate=strftime(relative_time(now(), "-7d@d"), "%d/%m/%Y")
| stats sum(JOB_EXEC_TIME) as TotalExecTime by JOBFLOW_ID
| eval TotalExecTime=round(TotalExecTime,2)
| sort -TotalExecTime limit=10

The problem is that the query returns data only for E3, even when I set Environment to E1 or E2, although there is data for E1 and E2.

Can someone tell me what is wrong with my query?

Thanks in advance

ITWhisperer · 05-17-2021

Is JOBFLOW_ID unique across all three environments, i.e. does each JOBFLOW_ID exist in only one environment (and therefore in at most the two corresponding csv files)?

The where clause is restricting the Environment to E3 - are you saying that you only get E3 data even if this where clause is changed to E2 etc?
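
One way to check the uniqueness question, as a rough sketch: tag each master file with a label and count how many files each JOBFLOW_ID appears in. The field names src, env_count and envs are placeholders introduced here and are not fields from your lookups. The first file name uses the lower-case .csv spelling and may need adjusting to match the actual file.

```
| inputlookup JOB_MDJX_CS_MASTER.csv
| eval src="E1"
| append [| inputlookup JOB_MDJX_CS_MASTER_E2.csv | eval src="E2" ]
| append [| inputlookup JOB_MDJX_CS_MASTER_E3.csv | eval src="E3" ]
| stats dc(src) as env_count values(src) as envs by JOBFLOW_ID
| where env_count > 1
```

Any rows returned are JOBFLOW_ID values that occur in more than one master file. No rows would suggest the IDs are unique per environment.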

aditsss · 05-17-2021

@ITWhisperer

JOBFLOW_ID is common to all three environments.

I am getting correct data when I combine like this:

But when I combine 3 more csv files, the data is not correct.

Is there a problem with the append?

| inputlookup JOB_MDJX_CS_STATS_2.csv
| append [ inputlookup JOB_MDJX_CS_STATS_2_E2.csv ]
| append [ inputlookup JOB_MDJX_CS_STATS_2_E3.csv ]
| join type=outer JOBFLOW_ID [ inputlookup JOB_MDJX_CS_MASTER.CSV ]
| join type=outer JOBFLOW_ID [ inputlookup JOB_MDJX_CS_MASTER_E2.csv ]
| join type=outer JOBFLOW_ID [ inputlookup JOB_MDJX_CS_MASTER_E3.csv ]
| where Environment="E2"
| eval y=20
| eval Run_date1=y."".RUNDATE2
| eval Run_Date=strftime(strptime(Run_date1,"%Y%m%d"),"%d/%m/%Y")
| eval nowdate=strftime(relative_time(now(), "-7d@d"), "%d/%m/%Y")
| stats sum(JOB_EXEC_TIME) as TotalExecTime by JOBFLOW_ID
| eval TotalExecTime=round(TotalExecTime,2)
| sort -TotalExecTime limit=10

ITWhisperer · 05-17-2021

By unique across environments I mean: do all the values of JOBFLOW_ID in JOB_MDJX_CS_STATS_2.csv exist only in JOB_MDJX_CS_MASTER.csv, all the values of JOBFLOW_ID in JOB_MDJX_CS_STATS_2_E2.csv exist only in JOB_MDJX_CS_MASTER_E2.csv, and all the values of JOBFLOW_ID in JOB_MDJX_CS_STATS_2_E3.csv exist only in JOB_MDJX_CS_MASTER_E3.csv?
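
If the IDs do overlap, bear in mind that join overwrites same-named fields with the subsearch values by default (overwrite=true), so with three chained joins on JOBFLOW_ID the values from the last master file can replace those from the earlier ones. Assuming Environment comes from the master files (the thread does not show this), a sketch that pairs each stats file with its own master file before appending might look like this:

```
| inputlookup JOB_MDJX_CS_STATS_2.csv
| join type=outer JOBFLOW_ID [| inputlookup JOB_MDJX_CS_MASTER.csv ]
| append
    [| inputlookup JOB_MDJX_CS_STATS_2_E2.csv
     | join type=outer JOBFLOW_ID [| inputlookup JOB_MDJX_CS_MASTER_E2.csv ] ]
| append
    [| inputlookup JOB_MDJX_CS_STATS_2_E3.csv
     | join type=outer JOBFLOW_ID [| inputlookup JOB_MDJX_CS_MASTER_E3.csv ] ]
| where Environment="E2"
| stats sum(JOB_EXEC_TIME) as TotalExecTime by JOBFLOW_ID
| eval TotalExecTime=round(TotalExecTime,2)
| sort 10 -TotalExecTime
```

Change the where clause to E1 or E3 as needed. The Run_Date and nowdate evals are left out here because the stats command discards them anyway. This is untested against your lookups, so treat it as a starting point rather than a confirmed fix.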

aditsss · 05-17-2021

@ITWhisperer

Yes, that is correct:

By unique across environments I mean: do all the values of JOBFLOW_ID in JOB_MDJX_CS_STATS_2.csv exist only in JOB_MDJX_CS_MASTER.csv, all the values of JOBFLOW_ID in JOB_MDJX_CS_STATS_2_E2.csv exist only in JOB_MDJX_CS_MASTER_E2.csv, and all the values of JOBFLOW_ID in JOB_MDJX_CS_STATS_2_E3.csv exist only in JOB_MDJX_CS_MASTER_E3.csv?
