Harvester Operation and Data Logging

When the Harvester searches mailboxes for attachments eligible for storage management, it divides the analysis into time slices and examines each mailbox for eligible attachments within one slice at a time. For example, if your organization stores attachments older than six months, the Harvester would establish time slices such as 6-9 months old, 9-12 months old, and 12-15 months old. It would search all applicable mailboxes for eligible mail in the 6-9 month slice, then search all mailboxes for the 9-12 month slice, and so on. It would not search a single mailbox for all eligible attachments from 6-15 months old in one pass.
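The slice-by-slice ordering described above can be illustrated with a minimal Python sketch. This is not the Harvester's implementation: the function names, the 180-day threshold, the 90-day slice width, and the slice count are assumptions chosen to mirror the six-month example.

```python
from datetime import datetime, timedelta

def time_slices(now, min_age_days=180, slice_days=90, count=3):
    """Yield (start, end) windows, newest first, beginning at the age threshold.

    All three numeric defaults are illustrative assumptions, not product settings.
    """
    end = now - timedelta(days=min_age_days)
    for _ in range(count):
        start = end - timedelta(days=slice_days)
        yield start, end
        end = start  # the next, older slice ends where this one began

def scan_order(mailboxes, slices):
    """Return (slice, mailbox) pairs in slice-major order.

    Every mailbox is visited for one slice before the next slice begins.
    """
    return [(s, m) for s in slices for m in mailboxes]

slices = list(time_slices(datetime(2020, 1, 1)))
order = scan_order(["mailbox-a", "mailbox-b"], slices)
```

In `order`, both mailboxes appear for the newest slice before either appears for the next, older slice.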

During a scan of an individual mailbox, the Harvester gathers up to 100 messages at a time and processes them together. If the Harvester encounters an error when connecting to a mailbox, it pauses for five minutes, then tries to connect again. If it fails four times, the Harvester moves on to the next mailbox, and the retry attempts are listed in the summary log as faults.
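The batching and retry behavior can be sketched as follows. This is an illustration only: the function names are hypothetical, the connection error type is an assumption, and the sketch reads "fails four times" as four connection attempts in total.

```python
import time

MAX_ATTEMPTS = 4           # from the text: after four failures, move on
RETRY_PAUSE_SECONDS = 300  # from the text: five-minute pause between attempts
BATCH_SIZE = 100           # from the text: up to 100 messages at a time

def connect_with_retry(connect, sleep=time.sleep):
    """Return (connection, faults); connection is None if every attempt failed.

    `connect` is a hypothetical callable that raises ConnectionError on failure.
    """
    faults = 0
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return connect(), faults
        except ConnectionError:
            faults += 1
            if attempt < MAX_ATTEMPTS:
                sleep(RETRY_PAUSE_SECONDS)  # pause before the next attempt
    return None, faults

def batches(messages, size=BATCH_SIZE):
    """Split a list of messages into consecutive groups of at most `size`."""
    for i in range(0, len(messages), size):
        yield messages[i:i + size]
```

Passing `sleep` as a parameter lets the pause be replaced in a test, so the retry logic can be exercised without waiting five minutes.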

Harvester data is collected in a file called HarvesterAudit.log (located in C:\ by default). This log provides administrators with high-level information on the completion of Harvester tasks. It provides:

Storage activity per user, per time slice.

If there were no attachments eligible for storage within the time slice, no data is logged. If a user had attachments eligible for storage in the time slice, the Harvester logs a message such as:

Stubbing scan of messages from 10/24/2019 9:50:21 PM to 1/22/2020 9:50:21 PM for user cn=test208,cn=recipients,ou=first administrative group,o=acme demo and test completed: stubbed 2 attachments (4.0 MB) in 2 messages; sent 2 messages to data center; 0 errors.
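Administrators who want to extract fields from these per-user entries could use a regular expression along these lines. The pattern is derived only from the single sample message above, so it is an assumption about the general format; it also assumes the entry appears on one line and that dates use the US month/day/year form shown.

```python
import re
from datetime import datetime

# Pattern inferred from the one sample entry; other entries may differ.
USER_LINE_RE = re.compile(
    r"Stubbing scan of messages from (?P<start>.+?) to (?P<end>.+?) "
    r"for user (?P<user>.+) completed: "
    r"stubbed (?P<attachments>\d+) attachments? \((?P<size>[^)]+)\) "
    r"in (?P<messages>\d+) messages?; "
    r"sent (?P<sent>\d+) messages? to data center; "
    r"(?P<errors>\d+) errors?"
)
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # assumed from the sample timestamps

def parse_user_line(line):
    """Return a dict of fields from one per-user entry, or None if it does not match."""
    m = USER_LINE_RE.search(line)
    if m is None:
        return None
    return {
        "start": datetime.strptime(m["start"], DATE_FORMAT),
        "end": datetime.strptime(m["end"], DATE_FORMAT),
        "user": m["user"],
        "size": m["size"],
        "attachments": int(m["attachments"]),
        "messages": int(m["messages"]),
        "sent": int(m["sent"]),
        "errors": int(m["errors"]),
    }

SAMPLE = (
    "Stubbing scan of messages from 10/24/2019 9:50:21 PM to 1/22/2020 9:50:21 PM "
    "for user cn=test208,cn=recipients,ou=first administrative group,"
    "o=acme demo and test completed: stubbed 2 attachments (4.0 MB) in 2 messages; "
    "sent 2 messages to data center; 0 errors."
)
record = parse_user_line(SAMPLE)
```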

If the Harvester has to pause and then reconnect to the mailbox, the final log entry contains information about each of the interrupted attempts. Processing status definitions are provided later in this section.

Summary information for each run

The Harvester processes each Storage Management policy separately. It logs messages recording the starting time and the name of the policy. If the task is interrupted (due to an error, or if the amount of allotted scanning time expires, for example), the Harvester resumes the task on the next restart. When the Harvester has finished processing all time slices, it generates the summary for the run. A sample summary looks like this:

Stubbing scan for storage management policy Admin completed at 2:30:10 AM on 8/30/2007 Processed 10 of 10 mailboxes: 10 succeeded, 0 had faults. Stubbed 20 attachments (7.5 MB) in 20 messages; sent 20 messages to data center
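A script that monitors runs might pull the mailbox counts out of the summary to flag runs with faults. The following sketch is based solely on the sample summary above; the pattern and the function name are assumptions, and real summaries may be laid out differently.

```python
import re

# Pattern inferred from the single sample summary; treat it as an assumption.
SUMMARY_RE = re.compile(
    r"Stubbing scan for storage management policy (?P<policy>.+?) "
    r"completed at (?P<time>.+?) on (?P<date>\S+)\s+"
    r"Processed (?P<processed>\d+) of (?P<total>\d+) mailboxes: "
    r"(?P<succeeded>\d+) succeeded, (?P<faults>\d+) had faults\."
)

def parse_summary(text):
    """Return policy name and mailbox counts, or None if the text does not match."""
    m = SUMMARY_RE.search(text)
    if m is None:
        return None
    result = {"policy": m["policy"], "time": m["time"], "date": m["date"]}
    for key in ("processed", "total", "succeeded", "faults"):
        result[key] = int(m[key])
    # A run is "clean" here if every mailbox was processed and none had faults.
    result["clean"] = result["processed"] == result["total"] and result["faults"] == 0
    return result

SAMPLE = (
    "Stubbing scan for storage management policy Admin completed at 2:30:10 AM "
    "on 8/30/2007 Processed 10 of 10 mailboxes: 10 succeeded, 0 had faults. "
    "Stubbed 20 attachments (7.5 MB) in 20 messages; sent 20 messages to data center"
)
summary = parse_summary(SAMPLE)
```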

The summary report also lists any errors encountered during processing.

User Summary Table in CSV, if enabled

The Harvester can provide a per-user summary table in CSV format. To enable this report, edit the HarvesterAudit.log4net.config file and turn on debug-level logging. This report can generate a large amount of data, so enabling it by default is not recommended.

A sample of the CSV file looks like this:

User,Status,Messages Stubbed,Attachments Stubbed,Stubbed Attachment Size (bytes),Messages Imported,Errors

"cn=seight,cn=recipients,ou=first administrative group,o=testcompanion",Completed,2,2,785741,0,0

"cn=sfive,cn=recipients,ou=first administrative group,o=testcompanion",Completed,2,2,785741,0,0
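Because the user field contains commas, the table should be read with a CSV parser that honors quoting rather than by splitting on commas. A minimal sketch using Python's standard csv module follows; the sample text is typed with straight quotes, and the `summarize` function and its output keys are hypothetical.

```python
import csv
import io

SAMPLE = (
    "User,Status,Messages Stubbed,Attachments Stubbed,"
    "Stubbed Attachment Size (bytes),Messages Imported,Errors\n"
    '"cn=seight,cn=recipients,ou=first administrative group,o=testcompanion",'
    "Completed,2,2,785741,0,0\n"
    '"cn=sfive,cn=recipients,ou=first administrative group,o=testcompanion",'
    "Completed,2,2,785741,0,0\n"
)

def summarize(csv_text):
    """Total the stubbed bytes and list users whose row needs a closer look."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total_bytes = sum(int(r["Stubbed Attachment Size (bytes)"]) for r in rows)
    needs_attention = [
        r["User"] for r in rows
        if r["Status"] != "Completed" or int(r["Errors"]) > 0
    ]
    return {
        "users": len(rows),
        "total_bytes": total_bytes,
        "needs_attention": needs_attention,
    }

report = summarize(SAMPLE)
```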

List of statuses and their definitions:

Status: Definition

Not Started: The mailbox has not yet been scanned by the Harvester.

In Progress: The mailbox is currently being scanned. This status appears when the task is interrupted, either because the allotted time expired or because there is no more space on the disk for sending messages to the data center.

Completed with Faults: Some messages from the mailbox have been processed, but the Harvester had to pause and restart at least once.

Completed: The mailbox was successfully examined for the time slice.

Failed: All attempts to connect to the mailbox failed; the mailbox was not scanned.

NOTE

HarvesterAudit.log does not rotate like other log files. You can manually rotate the file whenever the storage management task is not running.
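One way to rotate the file manually is to rename it with a timestamp suffix. The sketch below assumes that the storage management task is confirmed to be stopped beforehand and that the Harvester creates a fresh log file on its next run; neither is verified by this code, and the function name and naming scheme are hypothetical.

```python
from datetime import datetime
from pathlib import Path

def rotate_log(log_path, now=None):
    """Rename the log with a timestamp suffix.

    Returns the new path, or None if the file is missing or empty.
    The caller must make sure the storage management task is not running.
    """
    log_path = Path(log_path)
    if not log_path.exists() or log_path.stat().st_size == 0:
        return None
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    rotated = log_path.with_name(f"{log_path.stem}.{stamp}{log_path.suffix}")
    log_path.rename(rotated)
    return rotated

# Example call for the default location named in this section:
# rotate_log(r"C:\HarvesterAudit.log")
```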