I'm working with email response data which comes into my index as individual log messages. A single email can have more than 100 entries in the index.
I'm trying to build tables to make the data easy to read.
This is what some simplified and sanitized results from my search look like:
[01:00:22.164297] x=ABC mod=mail cmd=msg rule=ruleQ subject="Test 123" size=8583
[01:00:22.136496] x=ABC mod=spam cmd=run rule=notspam
[01:00:22.106325] x=ABC mod=spam cmd=run policy=outbound
[01:00:22.067675] x=ABC mod=mail cmd=attachment file=text.html size=3347
[01:00:22.039732] x=ABC mod=mail cmd=attachment file=text.txt size=2093
[01:00:22.010986] x=ABC mod=session cmd=data rcpt=[email protected]
[01:00:22.010986] x=ABC mod=session cmd=data rcpt=[email protected]
[01:00:22.000234] x=ABC mod=mail sender=[email protected]
Here are the same events as a table, showing only the columns I care about:
╔═══════════════════╦═════╦══════════╦═══════════╦════════════════════╦═══════════════════╦══════╦═════════╗
║ time              ║ x   ║ subject  ║ file      ║ sender             ║ rcpt              ║ size ║ rule    ║
╠═══════════════════╬═════╬══════════╬═══════════╬════════════════════╬═══════════════════╬══════╬═════════╣
║ [01:00:22.164297] ║ ABC ║ Test 123 ║           ║                    ║                   ║ 8583 ║ ruleQ   ║
║ [01:00:22.136496] ║ ABC ║          ║           ║                    ║                   ║      ║ notspam ║
║ [01:00:22.106325] ║ ABC ║          ║           ║                    ║                   ║      ║         ║
║ [01:00:22.067675] ║ ABC ║          ║ text.html ║                    ║                   ║ 3347 ║         ║
║ [01:00:22.039732] ║ ABC ║          ║ text.txt  ║                    ║                   ║ 2093 ║         ║
║ [01:00:22.010986] ║ ABC ║          ║           ║                    ║ [email protected] ║      ║         ║
║ [01:00:22.010986] ║ ABC ║          ║           ║                    ║ [email protected] ║      ║         ║
║ [01:00:22.000234] ║ ABC ║          ║           ║ [email protected]  ║                   ║      ║         ║
╚═══════════════════╩═════╩══════════╩═══════════╩════════════════════╩═══════════════════╩══════╩═════════╝
This is what I'd like to get back:
╔═══════════════════╦═════╦══════════╦═══════════╦════════════════════╦═══════════════════╦══════╦═════════╗
║ time              ║ x   ║ subject  ║ file      ║ sender             ║ rcpt              ║ size ║ rule    ║
╠═══════════════════╬═════╬══════════╬═══════════╬════════════════════╬═══════════════════╬══════╬═════════╣
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.html ║ [email protected]  ║ [email protected] ║ 3347 ║ notspam ║
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.txt  ║ [email protected]  ║ [email protected] ║ 2093 ║ notspam ║
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.html ║ [email protected]  ║ [email protected] ║ 3347 ║ notspam ║
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.txt  ║ [email protected]  ║ [email protected] ║ 2093 ║ notspam ║
╚═══════════════════╩═════╩══════════╩═══════════╩════════════════════╩═══════════════════╩══════╩═════════╝
As you can see, the transformations I want for the data include:
Creating a unique row for each person receiving each attachment
Using the attachment's size as the size value, and dropping the size of the whole message
Using the time from the entry which contains the subject for every row
Filling the rule column for every row with the rule from the mod=spam entry where rule is not null, and ignoring the rule from the line which contains the subject
Copying the subject, sender and rule to every row
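To pin down the logic I'm after, here is a rough Python sketch of the same transformation. It is not SPL and not a proposed solution, only an illustration of the rules above. The addresses are made-up stand-ins for the redacted ones, the `sender=` and `rcpt=` field names are assumed from the table column headings, and `parse` and `build_rows` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Sample events from the post. The addresses are hypothetical stand-ins for
# the redacted ones, and the sender=/rcpt= field names are assumed from the
# table's column headings.
RAW = '''\
[01:00:22.164297] x=ABC mod=mail cmd=msg rule=ruleQ subject="Test 123" size=8583
[01:00:22.136496] x=ABC mod=spam cmd=run rule=notspam
[01:00:22.106325] x=ABC mod=spam cmd=run policy=outbound
[01:00:22.067675] x=ABC mod=mail cmd=attachment file=text.html size=3347
[01:00:22.039732] x=ABC mod=mail cmd=attachment file=text.txt size=2093
[01:00:22.010986] x=ABC mod=session cmd=data rcpt=rcpt1@example.com
[01:00:22.010986] x=ABC mod=session cmd=data rcpt=rcpt2@example.com
[01:00:22.000234] x=ABC mod=mail sender=sender@example.com
'''

# key=value pairs, where the value is either "quoted text" or a bare token
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse(line):
    """Split one log line into a dict of its key=value fields plus 'time'."""
    time, _, rest = line.partition(" ")
    fields = {key: value.strip('"') for key, value in PAIR.findall(rest)}
    fields["time"] = time
    return fields


def build_rows(raw):
    """Return one row per (recipient, attachment) pair for each message id x."""
    by_msg = defaultdict(list)
    for line in raw.splitlines():
        if line.strip():
            event = parse(line)
            by_msg[event["x"]].append(event)

    rows = []
    for x, events in by_msg.items():
        # The entry carrying the subject supplies the time and the subject.
        subj = next((e for e in events if "subject" in e), {})
        # Only the rule from mod=spam counts; the subject line's rule is ignored.
        rule = next((e["rule"] for e in events
                     if e.get("mod") == "spam" and "rule" in e), "")
        sender = next((e["sender"] for e in events if "sender" in e), "")
        rcpts = [e["rcpt"] for e in events if "rcpt" in e]
        attachments = [e for e in events if e.get("cmd") == "attachment"]
        # Cross product: every recipient gets a row for every attachment.
        for rcpt in rcpts:
            for att in attachments:
                rows.append({
                    "time": subj.get("time", ""),
                    "x": x,
                    "subject": subj.get("subject", ""),
                    "file": att["file"],
                    "sender": sender,
                    "rcpt": rcpt,
                    "size": att["size"],  # attachment size, not message size
                    "rule": rule,
                })
    return rows
```

Calling `build_rows(RAW)` on the sample gives four rows (two recipients times two attachments), which is the shape of the result table I'm trying to get from my search.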