
Scanning
The plan for this project was to outsource the scanning of microfilm.
The objective was to create images in TIFF4 format, which would be converted
by the vendor to the GIF or JPEG images required by the CONTENT program.
Because the interior pages of each issue of the Morning Leader were undated
and unpaginated, we specified that each image would be tagged to identify
date and page.
This was the most time-consuming and problematic aspect of the project.
In the end, we opted for sample scans from the "gold standard" vendor
and a local vendor in order to compare image quality and ease of uploading.
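One way the date-and-page tagging requirement described above could be met is by encoding both values in each image's file name. The following is only a minimal sketch: the naming pattern, the "leader" prefix, and the function names are assumptions made for illustration, not the scheme specified to either vendor.

```python
import re
from datetime import date

# Hypothetical naming scheme: <prefix>_<YYYY-MM-DD>_p<NN>.tif
# The pattern and the default prefix are illustrative assumptions.
_NAME_RE = re.compile(
    r"^(?P<prefix>[a-z]+)_(?P<date>\d{4}-\d{2}-\d{2})_p(?P<page>\d{2,})\.tif$"
)


def image_name(issue_date: date, page: int, prefix: str = "leader") -> str:
    """Build a file name that encodes the issue date and page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return f"{prefix}_{issue_date.isoformat()}_p{page:02d}.tif"


def parse_image_name(name: str) -> tuple[date, int]:
    """Recover (issue date, page) from a name built by image_name."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized image name: {name!r}")
    return date.fromisoformat(match.group("date")), int(match.group("page"))
```

With names like these, an undated interior page can still be traced back to its issue, and the files sort by date and page in an ordinary directory listing.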
Local vendor experience:
A list of possible vendors was obtained from the ReferenceUSA business
directory (available online at the library through the Statewide Database
Licensing program). Based on company descriptors and number of employees,
Northwest Center Document Management of Seattle was contacted for a price
quote. A copy of the museum microfilm was sent for sample scans, which
could be transmitted back to the library via attachment to electronic
mail. Unfortunately, the converted JPEG files could not be opened
with software available on Port Townsend Library computers. After several
conversations with NW Center and UW CONTENT administration staff, we decided
to purchase the Adobe Photoshop 5.5 software for resizing images and changing
file format. Unfortunately, the JPEG images were still incompatible with the
software. Finally, we determined the best course of action was to create
TIFF images which could be converted to JPEG with a "batch action"
process in Photoshop (with thanks to Jim Gossett at UW for showing me
how to do this). In the meantime, I contacted other local vendors, who
either answered with a machine or neglected to follow up on my inquiry.
The staff at Northwest Center tried very hard to make the scanning work;
the problem was that
their image software was less commonly used. This whole process took a
lot of time because of shipping of microfilm back and forth, vacations,
voice mail phone tag, and so forth. Complicating matters, the director
of the museum was unwilling to loan the microfilm from the Jefferson County
Historical Society library for the extended time needed to make the test
scans. We ended up borrowing film from the Washington State Library on
interlibrary loan.
Out-of-state vendor:
At the suggestion of Geri Bunker Ingram, Digital Projects Coordinator
for UW Libraries, I contacted Preservation Resources in Bethlehem, Pennsylvania,
a division of OCLC (commonly known as PresRes). This is an organization
with much experience in digitization of library materials, and they began
the process with a lengthy questionnaire regarding the specifications
of the microfilm source and the digital product. Not surprisingly, the
price quote was considerably higher than the one from NW Center. In conversations
with the representative from PresRes, we decided to purchase a clean microfilm
negative from Bell & Howell and have it shipped to PresRes, who would
provide a scan of a sample of the issues of the newspaper on the roll.
Installation of CONTENT
Joe Tavares from UW and Tamara Georgick from the Washington State Library
came to the library in March to install the CONTENT acquisitions module.
This went smoothly; trying to download Internet Explorer from the web
did not (some of the workstations only had Netscape; CONTENT requires
IE). Eventually one computer's hard drive had to be completely done over
(many problems unrelated to this project), and we reinstalled CONTENT
ourselves with no problem. The acquisitions module was also installed
on a computer purchased with grant funds to be on site at the museum. Because
of constraints of space and wiring, we decided to keep the computer in
a room at the adjacent City Hall.
Uploading of images
Several versions of images have been uploaded to CONTENT (though not vast
quantities of images at this point). There is no image resizing capability
in the CONTENT software, so I resized images in Photoshop and checked
how legible they were in CONTENT (balancing type size against how
much scrolling was required to read an article).
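The trade-off between type size and scrolling can be estimated before any resizing is done. The sketch below is an illustration only: the function names are hypothetical, and the pixel dimensions a caller passes in would come from the actual scans and monitors, none of which are recorded here.

```python
def fit_to_width(width: int, height: int, target_width: int) -> tuple[int, int]:
    """Scale (width, height) to target_width, preserving the aspect ratio."""
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("dimensions must be positive")
    scale = target_width / width
    return target_width, max(1, round(height * scale))


def screens_to_scroll(image_height: int, viewport_height: int) -> float:
    """Number of viewport heights a reader scrolls through to see the page."""
    if image_height <= 0 or viewport_height <= 0:
        raise ValueError("heights must be positive")
    return image_height / viewport_height
```

A wider target keeps the type larger but increases the value returned by screens_to_scroll, which is the balance described above.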
Metadata field structure
The CONTENT program is based on the Dublin Core metadata structure for
indexing and retrieval of digital images. Based on the content of the
local newspaper at the time, I added several searchable fields keyed to
the DC title field, such as advertisement and vessel (there was a lot
of shipping news at the time). This is still under development.
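To illustrate what keying local fields to a Dublin Core element might look like, here is a small sketch. It does not use CONTENT's own configuration or file formats; the FIELD_MAP dictionary, the function name, and the sample record are assumptions made up for the example.

```python
# Hypothetical field map: local field name -> Dublin Core element it is keyed to.
# "advertisement" and "vessel" come from the text; the rest are illustrative.
FIELD_MAP = {
    "title": "title",
    "advertisement": "title",
    "vessel": "title",
    "subject": "subject",
    "date": "date",
}


def to_dublin_core(record: dict[str, list[str]]) -> dict[str, list[str]]:
    """Collapse a record's local fields onto their Dublin Core elements."""
    collapsed: dict[str, list[str]] = {}
    for name, values in record.items():
        if name not in FIELD_MAP:
            raise KeyError(f"no Dublin Core mapping for field {name!r}")
        collapsed.setdefault(FIELD_MAP[name], []).extend(values)
    return collapsed


# Placeholder record; the values are invented for the example.
sample = {
    "title": ["Example page"],
    "vessel": ["Example Vessel"],
    "date": ["1900-01-02"],
}
print(to_dublin_core(sample))
```

The local fields stay separately searchable, while a system that understands only Dublin Core would see their values under title.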
Indexing
This is probably the most rewarding part of the project, in part because
it's what brings history alive. I am still working on this part. One of
the challenges of indexing a daily newspaper is deciding how
much coverage you want. One of the nice features of CONTENT is that the
full page of the newspaper is available for viewing, so that if at some
point a reader feels that a certain article wasn't properly indexed, we
can always change it.
I discovered that the OCLC/WLN Lasercat database is a great source for
LC subject headings for the articles in the paper. I've also discovered
some useful Internet resources. For example, one article was about
U.S. Senator Ankeny; he
wasn't mentioned in the 1994 edition of the Encyclopedia of Washington
(we now have an updated edition which does cover former legislators),
but I did find him at politicalgraveyard.com, "the web site that
tells where the dead politicians are buried."