Black Loyalists in New Brunswick, 1783-1854: Behind the Scenes

Behind the Scenes

The process of digitizing archival documents for presentation on the web requires a diverse team of skilled individuals to photograph, transcribe, and encode each document.

Digital Imaging

The digital imaging team images the collections following industry standards for the digital capture, processing, and archiving of archival documents.

Master Archival Images

Master archival image files are created in full colour (24-bit RGB) at a resolution of 300 dots per inch (dpi). Tonal scale and colour balance controls are set prior to image capture to create digital surrogates that are true to the appearance of the original documents. Where needed, files are sharpened during image processing with an unsharp mask algorithm so that they approximate the appearance of the original. Master image files are stored as uncompressed TIFF files (Intel byte order, header version 6).
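The unsharp mask step mentioned above can be illustrated with a minimal sketch. It assumes a greyscale image held as a grid of 0-255 values and uses a simple box blur in place of the Gaussian blur that imaging software typically applies. The function names and the amount, radius, and threshold parameters are illustrative only and do not describe the settings used by the project.

```python
def box_blur(pixels, radius=1):
    """Blur a 2-D grid of greyscale values with a simple box filter.

    Assumption: a box filter stands in for a Gaussian blur. Edge pixels
    average only the neighbours that actually exist.
    """
    height, width = len(pixels), len(pixels[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                pixels[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def unsharp_mask(pixels, amount=1.0, radius=1, threshold=0):
    """Sharpen by adding back the difference between the image and its blur."""
    blurred = box_blur(pixels, radius)
    result = []
    for row, blurred_row in zip(pixels, blurred):
        out_row = []
        for value, soft in zip(row, blurred_row):
            detail = value - soft  # positive on the bright side of an edge
            if abs(detail) < threshold:
                out_row.append(value)  # leave low-contrast areas untouched
            else:
                sharpened = value + amount * detail
                # Clamp to the valid 8-bit range before rounding.
                out_row.append(int(round(min(255, max(0, sharpened)))))
        result.append(out_row)
    return result


# Hypothetical one-row image with a single dark-to-light edge.
edge = [[50, 50, 200, 200]]
sharpened_edge = unsharp_mask(edge)
```

In this sketch the pixels on either side of the edge are pushed further apart, which increases local contrast, while flat regions are left unchanged.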

Web Surrogate Images

To improve networked access to the images, reduced-resolution web surrogates are derived from the master archival TIFF files. Thumbnail- and full-size web surrogates are created at a resolution of 72 dpi and stored as full colour (24-bit RGB) JPEG files.
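The effect of the reduction in resolution can be worked through with a short calculation. The sketch below assumes that the physical dimensions of the page are preserved when a surrogate is derived, so that pixel dimensions scale by 72/300. The letter-sized page (8.5 by 11 inches) is a hypothetical example and is not taken from the collection.

```python
def surrogate_pixel_size(master_width, master_height, master_dpi=300, web_dpi=72):
    """Return the (width, height) in pixels of a reduced-resolution surrogate.

    Assumption: the surrogate keeps the same physical size as the master,
    so each pixel dimension scales by web_dpi / master_dpi.
    """
    scale = web_dpi / master_dpi
    return round(master_width * scale), round(master_height * scale)


def uncompressed_rgb_bytes(width, height):
    """Size of the raw pixel data for a 24-bit RGB image (3 bytes per pixel)."""
    return width * height * 3


# Hypothetical page: 8.5 x 11 inches captured at 300 dpi.
master_size = (2550, 3300)
web_size = surrogate_pixel_size(*master_size)

master_bytes = uncompressed_rgb_bytes(*master_size)
web_bytes = uncompressed_rgb_bytes(*web_size)
```

Under these assumptions the surrogate measures 612 by 792 pixels and holds roughly 6% of the pixel data of the master before JPEG compression is applied.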

Image Archiving

Master images (TIFFs) are archived to CD-R, while web surrogates (JPEGs) are uploaded to a Unix/Apache web server that is backed up nightly.

Metadata Creation

Descriptive metadata is created at the document and component image levels according to the Electronic Text Centre's extended Dublin Core metadata schema. The schema combines a Dublin Core framework with relevant terminology standards and controlled vocabularies to produce rich, highly portable metadata records.
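A document-level and an image-level record of this general kind might be sketched as follows. The example uses only elements from the standard Dublin Core element set and terms from the DCMI Type Vocabulary ("Text", "StillImage"). The extensions in the Electronic Text Centre's schema are not shown here, and the record wrapper, titles, and identifiers are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build a simple record from (element name, value) pairs.

    Assumption: a plain <record> wrapper is used for illustration; the
    project's own schema may wrap and extend these elements differently.
    """
    record = ET.Element("record")
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Hypothetical document-level description.
document_record = build_record([
    ("title", "Example petition (placeholder title)"),
    ("type", "Text"),
    ("identifier", "example-doc-001"),
])

# Hypothetical description of one component image of that document.
image_record = build_record([
    ("title", "Example petition, page 1 (placeholder title)"),
    ("type", "StillImage"),
    ("format", "image/jpeg"),
    ("identifier", "example-doc-001-p1"),
    ("relation", "example-doc-001"),
])

document_xml = ET.tostring(document_record, encoding="unicode")
image_xml = ET.tostring(image_record, encoding="unicode")
```

Drawing the type values from a controlled vocabulary, as in this sketch, is one way such records stay consistent and portable between systems.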

Text Transcription and Encoding

Transcription

Using a word processing application, transcribers type document text following editorial guidelines developed to assist automatic XML encoding of the resulting transcriptions. Transcriptions are proofread using the two-person, read-aloud technique.

Text Encoding

All texts are encoded in the eXtensible Markup Language (XML) according to guidelines developed by the Text Encoding Initiative (TEI) consortium for the representation of texts in digital form.

The initial encoding of document transcriptions is automated with a Perl script that maps textual structures and features identified in the transcription text to TEI elements and then applies the corresponding TEI markup to the transcribed text.
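The general idea of mapping transcription conventions to TEI elements can be sketched as follows. The sketch is written in Python and is not the project's Perl script. The transcription conventions it recognises (a "[page N]" line for a page break, underscores around underlined words, and blank lines between paragraphs) are assumptions made for illustration, since the project's editorial guidelines are not reproduced here.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical transcription conventions, not the project's actual guidelines.
PAGE_BREAK = re.compile(r"^\[page (\d+)\]$")
UNDERLINED = re.compile(r"_([^_]+)_")


def encode_transcription(text):
    """Convert a plain-text transcription into a fragment of TEI markup."""
    parts = []
    # Blank lines separate structural blocks in the transcription.
    for block in re.split(r"\n\s*\n", text.strip()):
        # Join the physical lines of a block into one logical line.
        block = " ".join(line.strip() for line in block.splitlines())
        match = PAGE_BREAK.match(block)
        if match:
            parts.append(f'<pb n="{match.group(1)}"/>')
            continue
        # Escape &, < and > before adding markup so the output stays well-formed.
        body = escape(block)
        body = UNDERLINED.sub(r'<hi rend="underline">\1</hi>', body)
        parts.append(f"<p>{body}</p>")
    return "\n".join(parts)


sample = (
    "[page 1]\n"
    "\n"
    "To His Excellency\n"
    "the _Governor_ & Council\n"
    "\n"
    "Second paragraph."
)
encoded = encode_transcription(sample)
```

A fragment produced this way would still need to be placed inside a full TEI document and checked by an encoder, which corresponds to the proofreading stage described next.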

Project encoders then proofread and edit the initial XML encoding using oXygen XML Editor. In addition to correcting errors and omissions in the document text or markup, encoders have several main tasks: