Archiving Services For Domino

Chapter 12-17

Introduction

The Archiving Services API allows you to:

    • Stream Notes document and attachment data from a Notes database directly to an archive repository.
    • Restore Notes document and attachment data back to its original state with no fidelity loss.
    • Generate index data based on exported content.
    • Optimize storage of redundant data such as message body and attachments.


Exporting Documents

The ArchiveExportDatabase function copies notes and their related attachments from a database and passes the exported data to callers via a set of callback functions. Attachments are decompressed so that they may be consumed by programs that understand their native format. Notes documents are encoded into a canonical format that is opaque to the caller but can be manipulated in a limited way by the Archiving Services API.

ArchiveExportDatabase can copy many notes and attachments in a single client/server interaction, which reduces network overhead, but it requires the caller to implement four callback functions that represent the state of the note streaming operation. The signature of the ArchiveExportDatabase function:

STATUS LNPUBLIC ArchiveExportDatabase(
DBHANDLE hDb,
DHANDLE hIDTable,
DWORD Flags,
NOTEINITCALLBACK NoteInitCallback,
ARCHIVEATTACHINIT AttachInitCallback,
ARCHIVEATTACHOUTPUT AttachOutputCallback,
ARCHIVEDOCUMENTCALLBACK ArchiveDocumentCallback,
KFHANDLE hKFC,
void *pUserCtx);

The first two parameters indicate which database and which documents are to be exported. Flags is currently unused (see the API reference guide for the latest details).
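Because the same pUserCtx pointer is handed to all four callbacks, it is a natural place to track where the streaming operation currently stands. The following is a minimal sketch of such a context, not part of the API: the ExportCtx struct, the ExportState enum, and the On... helpers are hypothetical names, and each helper is assumed to be called from the callback of the same name.

```c
/* Hypothetical per-export context passed as pUserCtx; not part of the API. */
typedef enum { ST_IDLE, ST_IN_NOTE, ST_IN_ATTACHMENT } ExportState;

typedef struct {
    ExportState state;            /* where the streaming operation stands */
    unsigned notesStarted;        /* NoteInitCallback invocations */
    unsigned attachmentsStarted;  /* AttachInitCallback invocations */
    unsigned documentsDone;       /* ArchiveDocumentCallback invocations */
} ExportCtx;

/* Called from NoteInitCallback: a new note begins. */
static void OnNoteInit(ExportCtx *ctx)
{
    ctx->state = ST_IN_NOTE;
    ctx->notesStarted++;
}

/* Called from AttachInitCallback: attachment data is about to arrive. */
static void OnAttachInit(ExportCtx *ctx)
{
    ctx->state = ST_IN_ATTACHMENT;
    ctx->attachmentsStarted++;
}

/* Called from AttachOutputCallback: the last buffer ends the attachment. */
static void OnAttachOutput(ExportCtx *ctx, int lastBuffer)
{
    if (lastBuffer)
        ctx->state = ST_IN_NOTE;
}

/* Called from ArchiveDocumentCallback: the note is complete. */
static void OnArchiveDocument(ExportCtx *ctx)
{
    ctx->state = ST_IDLE;
    ctx->documentsDone++;
}
```

Keeping the state in one struct means each callback only needs to cast pUserCtx back to ExportCtx * to know whether it is being invoked in the expected order.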

NoteInitCallback

NoteInitCallback is called for every note that is copied by ArchiveExportDatabase. This function marks the start of the copy process for every document and is typically used to initialize some sort of storage location for the exported document. The signature of NoteInitCallback:

typedef STATUS (LNCALLBACKPTR NOTEINITCALLBACK)
(
NOTEHANDLE hNote,
STATUS retError,
void *pUserCtx
);

hNote is a handle to an open note (assuming retError is NOERROR) that can be used to gather metadata useful for archiving. Do not call any function from within this callback that would cause a remote call to a server; doing so results in a panic in the code that invokes the callback. Implementers should check retError, as it indicates the error status of the streaming operation. The hIDTable passed to ArchiveExportDatabase may reference a note that doesn't exist or has since been deleted, in which case retError will be ERR_NOEXIST or ERR_NOTE_DELETED respectively. Implementers can return NOERROR in these cases to keep processing the remaining notes in hIDTable.
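The retError handling described above can be sketched as a small helper that a NoteInitCallback might delegate to. This is an illustration under stated assumptions: the typedef and the numeric error values below are stand-ins so the fragment compiles without the Notes headers (real code would use the definitions from the API headers), and NoteStats and HandleNoteInit are hypothetical names.

```c
/* Stand-ins so the sketch compiles without the Notes C API headers. */
typedef unsigned short STATUS;
#define NOERROR 0
/* Placeholder values only; the real codes come from the API headers. */
#define ERR_NOEXIST      0x0001
#define ERR_NOTE_DELETED 0x0002

/* Hypothetical counters kept in the user context. */
typedef struct {
    unsigned started;   /* notes whose export began normally */
    unsigned skipped;   /* notes that were missing or deleted */
} NoteStats;

/* Decide what NoteInitCallback should return for a given retError. */
static STATUS HandleNoteInit(STATUS retError, NoteStats *stats)
{
    if (retError == NOERROR) {
        stats->started++;
        return NOERROR;
    }
    if (retError == ERR_NOEXIST || retError == ERR_NOTE_DELETED) {
        stats->skipped++;
        return NOERROR;     /* keep processing the remaining notes */
    }
    return retError;        /* propagate any other error to the caller */
}
```

Counting skipped notes separately makes it possible to report, after the export, how many entries in the ID table no longer referred to a live note.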

AttachInitCallback

If there are attachments in a note, AttachInitCallback will be invoked next. AttachInitCallback indicates that attachment data is about to be sent. The signature of the callback:

typedef STATUS (LNCALLBACKPTR ARCHIVEATTACHINIT)
(
const char *szFileName,
DWORD dwFlags,
DWORD dwDupIdx,
STATUS retError,
void *pUserCtx
);

szFileName is the original filename of the attachment when it was added to the database. Implementers should check dwFlags, as the flags indicate important aspects of the data that will be sent. There are currently three flags that can be passed: ARCHIVE_ATTACH_ENCRYPTED, ARCHIVE_ATTACH_MACBIN_RAW, and ARCHIVE_ATTACH_RAW.

ARCHIVE_ATTACH_ENCRYPTED indicates that an attachment is going to be passed encrypted and therefore cannot be consumed by its originating application without being restored back to a Notes database.

ARCHIVE_ATTACH_MACBIN_RAW indicates that the attachment is a Mac binary attachment as it is stored in NSF. It is therefore not consumable as a Mac binary file unless it is restored to a Notes database and extracted with NSFNoteExtractFile (or with the Notes client).

ARCHIVE_ATTACH_RAW indicates that the file is an object stored by NSF that is only known to Notes. Implementers must store this object and return it to the database when restoring the associated note.

dwDupIdx is used to differentiate attachments that have the same file name.

Implementers should examine retError for possible error conditions that may occur during the streaming operation. While there are no known recoverable errors, it will be considerably easier to track down issues if the calling code is aware of when an error condition arises and reports it accordingly. Returning a non-zero error status will cause ArchiveExportDatabase to stop.
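As an illustration of how dwFlags and dwDupIdx might be used inside an AttachInitCallback, the sketch below classifies an attachment and builds a storage name that stays unique when two attachments share a file name. The flag bit values are placeholders (the real ones are defined by the API headers), and IsNativelyReadable, MakeStorageName, and the "index_filename" naming scheme are assumptions made for this example only.

```c
#include <stdio.h>

/* Stand-in so the sketch compiles without the Notes C API headers. */
typedef unsigned int DWORD;
/* Placeholder bit values; the real flags are defined by the API headers. */
#define ARCHIVE_ATTACH_ENCRYPTED  0x1
#define ARCHIVE_ATTACH_MACBIN_RAW 0x2
#define ARCHIVE_ATTACH_RAW        0x4

/* Returns 1 if the data can be opened by its native application,
   0 if it is only usable after being restored to a Notes database. */
static int IsNativelyReadable(DWORD dwFlags)
{
    DWORD notesOnly = ARCHIVE_ATTACH_ENCRYPTED
                    | ARCHIVE_ATTACH_MACBIN_RAW
                    | ARCHIVE_ATTACH_RAW;
    return (dwFlags & notesOnly) == 0;
}

/* Build a storage key that stays unique when file names repeat. */
static void MakeStorageName(char *out, size_t outLen,
                            const char *szFileName, DWORD dwDupIdx)
{
    snprintf(out, outLen, "%u_%s", (unsigned) dwDupIdx, szFileName);
}
```

An archive that also wants to index attachment content could, for example, skip indexing whenever IsNativelyReadable returns 0 while still storing the bytes for later restoration.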

AttachOutputCallback

Attachment data is passed via the AttachOutputCallback function provided by the caller. The signature of the callback:

typedef STATUS (LNCALLBACKPTR ARCHIVEATTACHOUTPUT)
(
const BYTE *Buffer,
DWORD BufferSize,
BOOL bLastBuffer,
STATUS retError,
void *pUserCtx
);

Data is passed through the Buffer parameter, and the amount of data is indicated by BufferSize. When bLastBuffer is TRUE, implementers should clean up any resources allocated to receive the data. Implementers should examine retError to ensure that there are no errors in the streaming operation. One recoverable error can be passed in retError: ERR_ARCHIVE_DECOMPRESSION_RETRY. It indicates that there was a problem decompressing the data passed since the last call to AttachInitCallback. It is recommended that implementers delete any resources created with this data and return NOERROR, which allows the streaming operation to retry decompression with a different algorithm. There can be up to 2 retries, after which the attachment is considered unreadable. Note that AttachInitCallback is not called again before each retry.
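The retry behaviour is easy to get wrong because AttachInitCallback is not invoked again, so the output callback itself has to discard what it has stored so far. The sketch below shows one way to structure that logic using an in-memory sink in place of a file or repository object. The typedefs, the error code value, the AttachSink struct, and SinkWrite are all stand-ins or hypothetical names introduced for this example.

```c
#include <string.h>

/* Stand-ins so the sketch compiles without the Notes C API headers. */
typedef unsigned short STATUS;
typedef unsigned int   DWORD;
typedef unsigned char  BYTE;
typedef int            BOOL;
#define NOERROR 0
/* Placeholder value; the real code is defined by the API headers. */
#define ERR_ARCHIVE_DECOMPRESSION_RETRY 0x7001

/* Hypothetical in-memory sink standing in for a file or repository object. */
typedef struct {
    BYTE  data[256];
    DWORD used;       /* bytes stored for the current attachment */
    int   complete;   /* set once the last buffer has arrived */
    int   retries;    /* decompression retries seen so far */
} AttachSink;

/* Logic an AttachOutputCallback could delegate to. */
static STATUS SinkWrite(AttachSink *sink, const BYTE *Buffer, DWORD BufferSize,
                        BOOL bLastBuffer, STATUS retError)
{
    if (retError == ERR_ARCHIVE_DECOMPRESSION_RETRY) {
        sink->used = 0;       /* discard everything received so far */
        sink->complete = 0;
        sink->retries++;
        return NOERROR;       /* let the export retry decompression */
    }
    if (retError != NOERROR)
        return retError;      /* not recoverable: report and stop */
    if (BufferSize > sizeof sink->data - sink->used)
        return 1;             /* sketch only: sink full, arbitrary non-zero status */
    if (BufferSize > 0)
        memcpy(sink->data + sink->used, Buffer, BufferSize);
    sink->used += BufferSize;
    if (bLastBuffer)
        sink->complete = 1;
    return NOERROR;
}
```

With a file-backed sink, the reset step would typically truncate or recreate the file instead of zeroing a counter.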

ArchiveDocumentCallback

Once all attachments have been received, the notes document in its archived form is passed to the caller. The signature of the callback:

typedef STATUS (LNCALLBACKPTR ARCHIVEDOCUMENTCALLBACK)
(
HARCHIVEDOCUMENT hArchDoc,
void *pUserCtx
);

hArchDoc is a handle to an Archive Document object which provides the caller with a few functions to implement common archiving and storage management operations.

Callers should free the hArchDoc object when they are done with it using ArchiveDocumentDestroy.

Encryption/Decryption Support

Documents and their related attachments that are encrypted can be exported in their encrypted form and later returned to a Notes database without fidelity loss. It is possible to access the non-encrypted parts of a document (for example, sender and recipients), but the encrypted portions remain inaccessible and can only be transferred to and from a database in their encrypted state.

It is possible to decrypt documents and attachments by passing a non-NULL KFHANDLE to ArchiveExportDatabase, but the resulting exported document cannot be restored to a database. The most likely scenario for this decryption method is when the mail journal is used as a source for archiving common parts of a message (for example, the body item and attachments). These parts can be reassembled into a complete document that may have originated from, for example, a user's mail file.

Archive Document

The ArchiveDocument abstraction gives users of the API the ability to access the archived data to produce textual renderings of selected fields and optimize storage by allowing a document to be decomposed into parts.

Callers are passed an ArchiveDocument handle via the ARCHIVEDOCUMENTCALLBACK of ArchiveExportDatabase (example below in section Database Level Export). One can also create an ArchiveDocument from a previously exported document using the ArchiveDocumentImport function:

HARCHIVEDOCUMENT hArchDoc;
ArchiveDocumentImport(0, NoteImportCallback, pUserCtx, &hArchDoc);

Extracting/Inserting Items

Items can be extracted from an ArchiveDocument before it is exported. A placeholder is left in the export stream to ensure that the document is complete before attempting to restore it to a database. The data format of the extracted item is internal but it can be compared to previously extracted items from other documents for equality. If there are multiple items with the same name, all of the items are concatenated to a single data stream and extracted.

Extracting:
// Define a program context struct
typedef struct {
    FILE *bodyfile;
} CTX;

{
    CTX myctx;

    // Create a file to accept the data
    myctx.bodyfile = fopen("body.dat", "wb");

    // Extract the item named "Body" from the archive
    // document.
    ArchiveDocumentExtractItem(hArchDoc, "Body", strlen("Body"), ItemExtractCallback, &myctx);
}

// The callback function that accepts the data from
// ArchiveDocumentExtractItem
STATUS far PASCAL ItemExtractCallback(const BYTE *Buffer, DWORD BufLen, BOOL bLastBuffer, STATUS retError, void *pUserCtx)
{
    CTX *pMyCtx = (CTX *) pUserCtx;

    // If retError is non-zero, it indicates that there was
    // an internal error in processing.
    if (retError)
    {
        fclose(pMyCtx->bodyfile);
        return retError;
    }

    fwrite(Buffer, BufLen, 1, pMyCtx->bodyfile);
    if (bLastBuffer)
        fclose(pMyCtx->bodyfile);

    return NOERROR;
}
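Since extracted item data can be compared with items extracted from other documents, an archive can avoid storing the same body twice. One possible approach, sketched below, is to fold each buffer received by the extract callback into a running digest and use it to find candidate duplicates. FNV-1a is chosen here only for brevity and is not something the API prescribes; ItemDigest and the Digest... helpers are hypothetical names, and matching digests should still be confirmed with a byte-for-byte comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical running digest kept in the user context; not part of the API. */
typedef struct {
    uint64_t hash;     /* FNV-1a 64-bit hash of all bytes seen so far */
    uint64_t length;   /* total number of bytes seen so far */
} ItemDigest;

static void DigestInit(ItemDigest *d)
{
    d->hash = 14695981039346656037ULL;   /* FNV-1a 64-bit offset basis */
    d->length = 0;
}

/* Fold one buffer into the digest; call once per extract callback invocation. */
static void DigestUpdate(ItemDigest *d, const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        d->hash ^= buf[i];
        d->hash *= 1099511628211ULL;     /* FNV-1a 64-bit prime */
    }
    d->length += len;
}

/* Equal digests only suggest equal items; confirm with a byte comparison. */
static int DigestMatches(const ItemDigest *a, const ItemDigest *b)
{
    return a->hash == b->hash && a->length == b->length;
}
```

Because the digest is updated incrementally, the result does not depend on how the item data happens to be split across callback invocations.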

Inserting:
{
    CTX ctx;
    HARCHIVEDOCUMENT hArchDoc;

    // Open the file where the body item was stored
    ctx.bodyfile = fopen("body.dat", "rb");

    // Create the archive document from its serialized form
    // See samples for more details on importing
    // (pCtx is the caller's context for NoteImportCallback)
    ArchiveDocumentImport(0, NoteImportCallback, pCtx, &hArchDoc);

    // Put the body field back
    // Note that the name of the item is stored
    // in the extracted item data so it does not
    // need to be specified.
    ArchiveDocumentInsertItem(hArchDoc, ItemInsertCallback, &ctx);

    fclose(ctx.bodyfile);
}

// The callback function that ArchiveDocumentInsertItem
// will use to read data
DWORD far PASCAL ItemInsertCallback(BYTE *pBuffer, DWORD MaxToRead, void *pUserCtx)
{
    CTX *pCtx = (CTX *) pUserCtx;

    // The return value is assumed here to be the number of
    // bytes placed in pBuffer; see the API reference.
    return (DWORD) fread(pBuffer, 1, MaxToRead, pCtx->bodyfile);
}

Text Conversion

It is possible to convert items of an Archive Document to text (for supported types, see NSFItemConvertToText). In addition, all MIME subtypes of type text (e.g., text/plain, text/html) are returned when an item contains MIME.
See ArchiveDocumentGetText in the API reference for details.

Importing Documents

There are two functions provided for returning an exported document back to the Notes environment. ArchiveRestoreDocument writes a note and its related attachments to the caller-specified database replacing any note that may currently exist with the same UNID. ArchiveRestoreDocumentToNote returns the exported note data to an open note supplied by the caller. It does not restore attachments and thus care needs to be taken to ensure that either the note is not saved to the database or attachments are restored and references to attachments in the note are adjusted accordingly.

Restoration happens in two steps. First the ArchiveDocument must be imported from its external storage (vendor specific):

HARCHIVEDOCUMENT hArchDoc;
// Create an archive document from its serialized format
ArchiveDocumentImport(0, NoteImportCallback, pUserCtx, &hArchDoc);

Then ArchiveRestoreDocument is called to return the document and its attachments to the database.

// Pass the imported ArchiveDocument to ArchiveRestoreDocument
STATUS error;
error = ArchiveRestoreDocument(
    hDb,
    0, // No flags
    hArchDoc,
    AttachImportCallback,
    pUserCtx,
    &hNote);

Callers pass a handle to the target database and a pointer to a callback function that will be used to stream attachment data back to the database. The imported document will replace any document with the same UNID that exists in the target database.

See the samples for more details on importing documents.

Import/Export fidelity

A note that is exported by ArchiveExportDatabase is restored to a database exactly as it existed when it was exported, with a few exceptions. Some fields are updated by NSF internals and must change whenever a note is written to a database. These are $Revisions and $UpdatedBy. Signatures, however, are preserved, as is all rich text formatting.

Trace/Debug

The archiving functions can output trace information to the file specified in the DEBUG_OUTFILE notes.ini setting. Tracing is enabled via the LOG_ARCHSVC notes.ini variable. Three flags enable various aspects of tracing:
1 - Enables tracing on all error returns so that callers can see the origin of a low-level API error.
2 - Enables the debug dump of key data structures during the export/import process.
4 - Logs the entry/exit of all archiving service functions along with argument and return values. This helps support pinpoint the cause of any errors that may occur.

Flags can be combined to yield the desired amount of trace information.
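As a small worked example of combining the trace flags, the sketch below ORs two of the values together and formats the resulting notes.ini line. The enum names and FormatTraceSetting are illustrative and not defined by the API, and combining by bitwise OR is an assumption based on the flag values being powers of two; only the numeric values and the LOG_ARCHSVC variable name come from the text.

```c
#include <stdio.h>

/* Illustrative names; only the numeric values come from the documentation. */
enum {
    ARCHSVC_TRACE_ERRORS = 1,   /* trace all error returns */
    ARCHSVC_TRACE_DUMP   = 2,   /* dump key data structures */
    ARCHSVC_TRACE_CALLS  = 4    /* log function entry/exit */
};

/* Format the notes.ini line for a combination of trace flags.
   Returns the number of characters snprintf would have written. */
static int FormatTraceSetting(char *out, size_t outLen, unsigned flags)
{
    return snprintf(out, outLen, "LOG_ARCHSVC=%u", flags);
}
```

For instance, passing ARCHSVC_TRACE_ERRORS | ARCHSVC_TRACE_CALLS yields the line LOG_ARCHSVC=5, which would enable error tracing and entry/exit logging without the data structure dumps.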