I have a question about filtering data. We have a customer who is requesting a set of fields to be sent in from O365. The issue is that we can't modify what we pull in, because we are using an API, not the universal forwarder. Currently I am trying to test a search query to confirm that I am only pulling in the events that have those fields. The O365 data pulls in 400+ fields, and we want about 40 of those fields for a specific use case. My question is: what is the correct Splunk syntax to search for only those fields?

Original query that brings in about 400+ fields:

index=o365

New query for about 35 fields:

index=o365 "Operation"="*" OR "LabelAction"="*" OR "LabelAppliedDateTime"="*" OR "LabelIid"="*" OR "abelName"="*" OR "DlpAuditEventMetadata.DlpPolicyMatchId"="*" OR "DlpAuditEventMetadata.EvaluationTime"="*" OR "DlpOriginalFilePath"="*" OR "IrmContentId"="*" OR "PolicyMatchInfo.PolicyId"="*" OR "PolicyMatchInfo.PolicyName"="*" OR "PolicyMatchInfo.RuleId"="*" OR "PolicyMatchInfo.RuleName"="*" OR "ProtectionEventData.IsProtected"="*" OR "ProtectionEventData.IsProtectedBefore"="*" OR "ProtectionEventData.ProtectionEventType"="*" OR "ProtectionEventData.ProtectionOwner"="*" OR "ProtectionEventData.ProtectionType"="*" OR "ProtectionEventData.TemplateId"="*" OR "ProtectionEventType"="*" OR "RMSEncrypted"="*" OR "SensitiveInfoTypeData{}.Confidence"="*" OR "SensitiveInfoTypeData{}.Count"="*" OR "SensitiveInfoTypeData{}.SensitiveInfoTypeId"="*" OR "SensitiveInfoTypeData{}.SensitiveInfoTypeName"="*" OR "SensitiveInfoTypeData{}.SensitiveInformationDetailedClassificationAttributes{}.Confidence"="*" OR "SensitiveInfoTypeData{}.SensitiveInformationDetailedClassificationAttributes{}.Count"="*" OR "SensitivityLabelEventData.ActionSource"="*" OR "SensitivityLabelEventData.ActionSourceDetail"="*" OR "SensitivityLabelEventData.ContentType"="*" OR "SensitivityLabelEventData.JustificationText"="*" OR "SensitivityLabelEventData.LabelEventType"="*" OR "SensitivityLabelEventData.OldSensitivityLabelId"="*" OR "SensitivityLabelEventData.SensitivityLabelId"="*" OR "SensitivityLabelEventData.SensitivityLabelPolicyId"="*" OR "LabelName"="*" | fields Operation,LabelAction,LabelAppliedDateTime,LabelIid,abelName,DlpAuditEventMetadata.DlpPolicyMatchId,DlpAuditEventMetadata.EvaluationTime,DlpOriginalFilePath,IrmContentId,PolicyMatchInfo.PolicyId,PolicyMatchInfo.PolicyName,PolicyMatchInfo.RuleId,PolicyMatchInfo.RuleName,ProtectionEventData.IsProtected,ProtectionEventData.IsProtectedBefore,ProtectionEventData.ProtectionEventType,ProtectionEventData.ProtectionOwner,ProtectionEventData.ProtectionType,ProtectionEventData.TemplateId,ProtectionEventType,RMSEncrypted,SensitiveInfoTypeData{}.Confidence,SensitiveInfoTypeData{}.Count,SensitiveInfoTypeData{}.SensitiveInfoTypeId,SensitiveInfoTypeData{}.SensitiveInfoTypeName,SensitiveInfoTypeData{}.SensitiveInformationDetailedClassificationAttributes{}.Confidence,SensitiveInfoTypeData{}.SensitiveInformationDetailedClassificationAttributes{}.Count,SensitivityLabelEventData.ActionSource,SensitivityLabelEventData.ActionSourceDetail,SensitivityLabelEventData.ContentType,SensitivityLabelEventData.JustificationText,SensitivityLabelEventData.LabelEventType,SensitivityLabelEventData.OldSensitivityLabelId,SensitivityLabelEventData.SensitivityLabelId,SensitivityLabelEventData.SensitivityLabelPolicyId,LabelName

Basically, from my understanding and my research, if you just append a specific string, inside or outside of quotes, Splunk searches all events for that string and pulls in the matches. For example:

index=Test field1 field2 field3

That would bring in only events with field1 or field2 or field3 in them. Adding quotes, such as:

index=Test "field1"="*" "field2"="*" "field3"="*"

should filter the same way. I have tested it both ways, with double quotes surrounding the field names and with no quotes.
I'm also using | fields, which should only bring those fields in, but I don't know whether it is only showing those fields while still bringing in ALL of the events. My question is, is this correct?

With the base searches I've been testing with, searching all of the events in o365 for one day (a full 24 hours) brings in 23,410,064 events. Filtering with the query I pasted above, for the same 24 hours, brings in 23,409,887 events. I've tested this a couple of ways, and each time, searching over the same time period, the filtering query brings in about 1k fewer events. But I can still only view the first 1k events, 20 pages' worth. That may be another question, though.

My long-winded question boils down to: am I searching this data correctly? I know it's a heavy index with millions of events, but filtering down to only 40 or so fields, some of which only appear 0.6% of the time, still brings in millions of events. Is there a way to fully validate it?
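One check I am considering, as a rough sketch rather than a confirmed method, is to count how many events actually carry each field over the same time range. It relies on stats count(<field>) counting only events where that field has a value. The three fields shown are just a sample from the list above, and the total_events and has_* names are labels I made up for illustration.

```
index=o365
| stats count AS total_events,
        count(Operation) AS has_Operation,
        count(LabelAction) AS has_LabelAction,
        count(RMSEncrypted) AS has_RMSEncrypted
```

If has_Operation came back close to total_events, that would suggest a field like Operation is present on nearly every event, which might explain why the OR filter removes so few of them. I have not confirmed that this is the case in our data.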