Solved: How to compare these two CSV files?

Rithekakan · ‎02-10-2022

I have reports Quarter1.csv and Quarter2.csv.
after I upload these two csv report I got

host="***" source="****"  sourcetype="***" and those fields  IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit,Synopsis,Description,Solution,See_Also,CVSS_V2_Base_Score,CVE,Plugin.

I want 3 reports base on joining with these 6 files:

IP_Address, Plugin_Name, Severity, Protocol, Port, Exploit,
| table IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit,Synopsis,Description,Solution,See_Also,CVSS_V2_Base_Score,CVE,Plugin, status

First report: - if the event are in Quarter1.csv and Quarter2.csv. show status as "Active Vulnerability"

Second report:- if the event are in Quarter1.csv but not in Quarter2.csv. show status as "Fixed"

Third report:- if the event are not in Quarter1.csv but there are in Quarter2.csv. show status as "New Active Vulnerability"

yuanliu · ‎02-10-2022

I assume that you ingested these files as events, not upload as CSV lookups, i.e., respective contents can be accessed with source=Quarter1.csv and source=Quarter2.csv. You can do something like

| fields IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit,Synopsis,Description,Solution,See_Also,CVSS_V2_Base_Score,CVE,Plugin source
| stats values(*) as * by Plugin,IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit
| eval status = case(mvcount(source)>1,"Active Vulnerability", source=="Quarter1.csv","Fixed", true(), "New Active Vulnerability")

View solution in original post

Rithekakan · ‎02-19-2022

Hi PickleRick, Thanks for your query, It help me for my report, but something is missing.

I want a report with these fields.
| table IP_Address, device, Plugin_Name, Severity, model, Protocol, Port, Exploit, Synopsis, Description, Solution, See_Also, CVSS_V2_Base_Score, CVE,Plugin

with join only 6 fields in the | stats sum " IP_Address,Plugin_Name,Plugin,Severity,Protocol,Port. "

| stats sum(picker) as.....

How eve if I add all 13 fields in the | stats sum(picker)as ..........It provide the uncorrected report becuse of some fields have duplicate data.
Note: There are two even (device and model) are lookup from other Inventory CSV file.

Could you pls help with this reequipment? I will be appreciate for your help.

Regards,
Ritheka Kan

yuanliu · ‎02-10-2022

I assume that you ingested these files as events, not upload as CSV lookups, i.e., respective contents can be accessed with source=Quarter1.csv and source=Quarter2.csv. You can do something like

| fields IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit,Synopsis,Description,Solution,See_Also,CVSS_V2_Base_Score,CVE,Plugin source
| stats values(*) as * by Plugin,IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit
| eval status = case(mvcount(source)>1,"Active Vulnerability", source=="Quarter1.csv","Fixed", true(), "New Active Vulnerability")

Rithekakan · ‎02-11-2022

Hi yuanliu , I appreciate your helpful Splunk query. It is 100% correct and fit my query. Thanks for your help.

Regards,

Rithekakan

Rithekakan · ‎02-19-2022

Hi yuanliu , I 've reply to accept your query in the previous time. It is 100% correct.
How ever the management needs two more fields (device, model) which it is lookup from the CVS inventory file.
and the report is something look like this.
| table IP_Address, device, Plugin_Name, Plugin, Severity, model, Protocol, Port, Exploit, Synopsis, Description, Solution, See_Also, CVSS_V2_Base_Score, CVE,

Could you pls help me edit your query you've provided to make it work with the lookup CSV file.

I have this lookup fils.
| lookup ABLNInventory.csv ip_address as IP_Address output device, model
| table IP_Address, device, Plugin_Name, Severity, model, Protocol, Port, Exploit, Synopsis, Description, Solution, See_Also, CVSS_V2_Base_Score, CVE, Plugin

I will appreciate for your help.
Best regards,
Ritheka Kan

yuanliu · ‎02-20-2022

The way I see it, device and model are not used as criteria. So, simply add it.

| fields IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit,Synopsis,Description,Solution,See_Also,CVSS_V2_Base_Score,CVE,Plugin source
| stats values(*) as * by Plugin,IP_Address,Plugin_Name,Severity,Protocol,Port,Exploit
| eval status = case(mvcount(source)>1,"Active Vulnerability", source=="Quarter1.csv","Fixed", true(), "New Active Vulnerability")
| lookup ABLNInventory.csv ip_address as IP_Address output device, model

After this, you can arrange the table however you like.

Rithekakan · ‎02-21-2022

Hi yuanliu,

I got it now. Thanks for your solution.

Regards,

Ritheka Kan

PickleRick · ‎02-10-2022

With two data sets to compare I prefer the "summing" approach.

sourcetype=A OR sourcetype=B
| eval picker=if(sourcetype="A",1,2)
| stats sum(picker) as picker by whatever fields you need
| eval status=case(picker=1,"only A",picker=2,"only B",picker=3,"both",1=1,"something wrong")

Yes, I know OP wanted three separate reports, but that's the general approach.

How to compare these two CSV files?

Can’t make it to .conf25? Join us online!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Unlock What’s Next: The Splunk Cloud Platform at .conf25

Are you a member of the Splunk Community?

How to compare these two CSV files?

Can’t make it to .conf25? Join us online!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Unlock What’s Next: The Splunk Cloud Platform at .conf25