Hi,
I'm trying to compare events from two sources to show where the outliers are (they "should" be the same but we know that there are discrepancies.
I can compare "number of rows/events" easily with a "chart count by source" command, but I also want to check the integrity of the field values.
Basically, each event has 10 fields (same ten fields in each source). How do I check that they are the same, and return some kind of message/raw event/field value if they are not the same?
Thanks.
May be something like this
source=source1 | eval source1Data=Field1."##".Field2."##"...<<all 10 fields concatenated>>."##".Field10 | appendcols [search source=source2 | eval source2Data=Field1."##".Field2."##"...<<all 10 fields concatenated>>."##".Field10]
| eval result=if(source1Data=source2Data,"Matched","Unmatched")
Would it be possible to do this with raw data? Meaning using the field "_raw"? This runs, but results are incorrect due to the sources being different..Here's that example:
source="D:\Bluesheets\ExtractFromFES.csv" eval FESdata=_raw | appendcols [search source="D:\Bluesheets\SentToReg.csv" | eval Regdata=_raw] | eval result=if(FESdata=Regdata,"Matched","Unmatched") | table result
Ok it's maybe a bit late … but for future searchers the other answers are to complicated, no help or wrong
```
index=main sourcetype="test2" | stats values(source) as sources by _raw | eval sources=if(mvcount(sources)>1,"match","no match")
```
Beware of linecounts>1 in the main search, this could create false "no match"
Can you post your query? and may be some sample data?
This keeps telling me I have mismatched "]" but I checked multiple times to ensure it's correct. Could the fact that my fields contain "." and "-" have anything to do with this?
Since I used appendcols, it will compare source1 event1 with source2 event1. It would fail for the cases no of rows differ in the sources.
This looks like it will work. Will provide an update tomorrow. Will this know to compare source1 event1 with source2 event1?
This is for reporting to regulators, so everything should be EXACTLY the same. Same timestamp, same field order, etc. I want to be able to check if any fieldname is different (I can pivot on field TRANSACTION_ID) for everything else. All field names static, and yes, If row numbers are the same (which they should be) I should be able to compare row1.source1 to row1.source2
Does both sources have timestamp and do they differ? What should be the order of rows/events for field comparison;first row of source1 with first row of source2?? Are field names static?