Splunk Search

How to combine disparate log data into a single time chart?

dgillam
Engager

I have mail processing log lines I need to combine and report on.

One type of log line contains strings like "cloned from Aggressive", "cloned from Blocklist", etc.

Another type of log line contains a field "classification=". This field has values like "Zero-Hour", "Spam-Clean", "Spam-Confirmed", "Passed", etc.

The various needed log lines do not share a common field name.

I need a report that combines all these disparate data, to show a stacked column of all email, colored as to its classification and "cloned from" counts by time interval.

I can get a report on classifications, but it drops the other two types of data. I can get a report on the other types of data (separately), but they drop the classification type, and so on.

How do I formulate the search/report to combine all these into a single chart?

0 Karma

dgillam
Engager

Thank you to all. I believe I have a working solution to this:

index=myindex AND classification=* | timechart count by classification
| append [search index=myindex AND "cloned from" | timechart count AS Reputation]
| append [search index=myindex AND "User unknown" | timechart count AS "User Unknown"]
| append [search index=myindex AND "stat=Sent" | timechart count AS "Sent"]
| sort _time

Creating a stacked column (combined) chart from this gets me essentially what I need. Each subsearch is a different column, but I can live with that, I think.

0 Karma

aweitzman
Motivator

I thought @martin_mueller's answer was better generally, as it avoids subsearches, which is why I erased mine. But I'm glad this works too.

If you want one combined column per time period, replace "| sort _time" with:

| transaction _time | fields - linecount _raw closed_txn duration eventcount field_match_sum | sort _time
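Another way to collapse the appended rows into one row per time period, offered as a sketch rather than a tested solution: sum every column by `_time`. This assumes each `timechart` uses the same span so the buckets line up; the `span=1h` below is an illustrative choice and is not from the thread.

```spl
index=myindex classification=* | timechart span=1h count by classification
| append [search index=myindex "cloned from" | timechart span=1h count AS Reputation]
| append [search index=myindex "User unknown" | timechart span=1h count AS "User Unknown"]
| append [search index=myindex "stat=Sent" | timechart span=1h count AS "Sent"]
| stats sum(*) AS * by _time
```

Because `stats` outputs rows ordered by the `by` field, the separate `sort _time` should no longer be needed.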

martin_mueller
SplunkTrust

So... an event in one log file doesn't have anything to do with an event from the other log file?

Extract the reasons from the first file into a field called classification and run this:

sourcetype=st1 OR sourcetype=st2 | timechart count by classification

martin_mueller
SplunkTrust

You can still extract the fields. For example, for the "cloned from" log lines:

\[(?<classification>cloned from \w+)\]

That way you get a classification field in each source and hence can do a count by classification.
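A single-pass search along these lines might look like the sketch below. The field names `cloned_from` and `category` are hypothetical, and the labels "Reputation" and "User Unknown" are borrowed from the append-based solution earlier in the thread. It assumes the "cloned from" text always appears in square brackets, as in the posted sample line.

```spl
index=myindex (classification=* OR "cloned from" OR "User unknown")
| rex "\[cloned from (?<cloned_from>\w+)\]"
| eval category=case(isnotnull(classification), classification, isnotnull(cloned_from), "Reputation", like(_raw, "%User unknown%"), "User Unknown")
| timechart count by category
```

Since every event ends up with one `category` value, a stacked column chart of this result shows a single column per time interval with no subsearches involved.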

0 Karma

dgillam
Engager

Same log file. Different email filtering components all log to the same log facility/file, and each uses a different syntax. I tried the route of eval and renaming fields, etc., but some components log in such a way that there are essentially no field names.

0 Karma

martin_mueller
SplunkTrust

If the field names don't match you can define field aliases or choose matching field names in your extractions or use rename or eval in the search to make the names match.
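As a minimal sketch of the eval route, assuming a second component logged its result in a field called `verdict` (a hypothetical name, not taken from the posted data), `coalesce` picks whichever field is present on each event:

```spl
index=myindex
| eval classification=coalesce(classification, verdict)
| timechart count by classification
```

A field alias defined in the sourcetype's configuration would achieve the same thing without repeating the `eval` in every search.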

0 Karma

dgillam
Engager

All the data are the same sourcetype (mail log), but only one type of line has a classification field, only one has "User unknown", and only one has "cloned from". Even though they are the same sourcetype, they have no intersecting field names.

0 Karma

dgillam
Engager

I need to tally all of the above (count(reject), count("cloned from"), count(classification) by classification) all on the same chart, so we have something like:

classification-1, classification-2, classification-3, classification-4, user-unknown, Bad-Reputation

with their individual tallies, charted as a stacked column over time.

0 Karma

dgillam
Engager

Apr 22 03:03:31 host.com MM: [Jilter Processor 3 - Async Jilter Worker 37 - 127.0.0.1:40909-s3M33SSv011875] INFO user.log - AntiSpam.Log.Header.Debug: classification=Cloudmark, cloudmark_spam_score=100.00, cloudmark_content_score=100.00, cloudmark_ip_score=0.00, cloudmark_sender_score=0.00, cloudmark_analysis="v=2.1 cv=XMMJF2RE c=1 sm=1 tr=0 p=pKOSPnCJtLv9pbStFNYA:9 p=WthgjtGrYmcLPO50j_8A:9 a=XWQSJyLHRzquKgEqAPxMQA==:117 a=XWQSJyLHRzquKgEqAPxMQA==:17 a=aoWKRLlwSNoA:10 a=-N4dak_cAAAA:8 a=KGjhK52YXX0A:10 a=awlg0vDVAAAA:8 a=3fMtmCSMTM1j8r91:21 a=

0 Karma

dgillam
Engager

Apr 22 03:00:41 host.com sm-mta[10912]: s3M30dPD010912: Milter: to=user@domain.com, reject=550 5.1.1 User unknown

Apr 22 03:01:19 host.com flow-control[16526]: something.com: selected class something.com [cloned from Moderate]

(more coming)

0 Karma

martin_mueller
SplunkTrust

Do post some actual (anonymized) data from both sources.

0 Karma

aweitzman
Motivator

While there is no common field name, there must be some bit of common information across the lines of data that identifies a single piece of email. Otherwise you'll have no way to do this.

Or do you just want counts of two different things in one graph?

0 Karma