HARDWARE & SOFTWARE REQUIREMENTS
Issues of importance
Capture Hardware requirements:
Flatbed scanners perform well for small digitization efforts of under
5,000 documents. If microforms are to be created, planetary cameras can
produce preservation-quality microforms. Whether to use microfilm or
microfiche is a further decision to be made.
Software requirements:
Capture software may be a simple Windows- or Mac-based application for small,
local collections. For any significant collection (over 5,000 images),
production software is recommended. Using a vendor to complete the capture
process should also be investigated. Scanning, indexing, and Web-based
retrieval tools such as Alchemy, CONTENTdm, and Imara are available.
Standards for scanning hardware and software developed by the Western States
Digital Imaging Working Group are referenced here for your convenience.
Resources: Western States Digital Imaging Best Practices, Version 1.0, January 2003, http://www.cdpheritage.org/digital/scanning/documents/WSDIBP_v1.pdf
Storage Hardware requirements:
Digitized files must be retained on a durable storage device with procedures
for backup and recovery. This creates a demand for redundancy and for
isolating the data from the applications that use it. The net effect is to
require significant storage capacity for digitized files.
- For the purpose of estimating storage volumes, we will consider the demands of storing simple to complex files.
- Simple digital file of textual material (ASCII) on an 8 by 11-inch document: 2,000 bytes. (A 100,000-document collection, compressed, with index and mirrored drives, requires about 3,000 MB or 3 GB.)
- Simple digital file of an image of an 8 by 11-inch text document using 8-bit color and 75 dots per inch resolution: 50,000 bytes. (A 100,000-image collection, compressed, with index and mirrored drives, requires about 40,000 MB or 40 GB.)
- Medium resolution image of an 8 by 11-inch document using 8-bit color and 200 dots per inch resolution: 300,000 bytes. (A 100,000-image collection, compressed, with index and mirrored drives, requires about 100,000 MB or 100 GB.)
- High resolution image of an 8 by 11-inch document using 16-bit color and 600 dots per inch resolution: 6,500,000 bytes. (A 100,000-image collection, compressed, with index and mirrored drives, requires about 500,000 MB or 500 GB.)
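The arithmetic behind estimates like these can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the figures above: the function names, the 10% index overhead, and the two-copy (mirrored) default are all assumptions. The raw sizes it prints are roughly ten times the per-file figures in the list, which suggests those figures already assume compression; that reading is an inference, not something the guideline states.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw raster size: pixel count times bytes per pixel."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


def collection_bytes(per_file_bytes, file_count, copies=2, index_overhead=0.10):
    """Rough collection total: files plus an index, times the number of copies.

    copies=2 models mirrored drives; index_overhead is a hypothetical
    fraction, not a value taken from the guideline.
    """
    return round(per_file_bytes * file_count * (1 + index_overhead) * copies)


# Scan settings taken from the list above: (dots per inch, bits per pixel).
profiles = {
    "low (75 dpi, 8-bit)": (75, 8),
    "medium (200 dpi, 8-bit)": (200, 8),
    "high (600 dpi, 16-bit)": (600, 16),
}

for name, (dpi, bits) in profiles.items():
    raw = uncompressed_bytes(8, 11, dpi, bits)
    print(f"{name}: {raw:,} bytes uncompressed per page")
```

Passing one of the per-file figures from the list to `collection_bytes` gives a rough collection total. The result will not match the totals quoted above exactly, because the overhead and compression assumptions behind those totals are not given.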
As you can observe, the storage size increases dramatically with increases in
the quality of the image. Collections of images of letter-size materials can
tax the largest computer system. Even with file compression, the database
of digitized image files can be quite large. For example, at the Washington
State Department of Labor and Industries, a highly compressed collection
of three million medium resolution images requires over 700,000,000,000
bytes of storage (700 GB). With optical drive storage devices (jukeboxes)
the physical facility is manageable, but the software needed to manage the
files is substantial and expensive.
Collections under 50,000 images can be accommodated using small removable
disk recorders/readers. These small jukeboxes can be attached to servers for
Web browsing.
Growth potential:
After the initial capture of objects into digital form, it is likely
that additional material will be added. The decision to digitize
collections generates a prioritized list. Once the top items have been
digitized, the next objects on the list become "top". As funds
become available, it is natural to expect that digitization will continue.
It is important to set a target capacity for the system that allows for
further digitization.
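One way to turn this into a number is to project the collection forward and add headroom. The sketch below is only illustrative: the function name, the yearly intake, the five-year horizon, and the 25% headroom are hypothetical planning inputs, not recommendations from this guideline.

```python
def target_capacity_bytes(initial_files, files_added_per_year, years,
                          bytes_per_file, headroom=0.25):
    """Projected storage need after `years` of continued digitization.

    headroom is an assumed safety margin expressed as a fraction of the
    projected total.
    """
    total_files = initial_files + files_added_per_year * years
    return round(total_files * bytes_per_file * (1 + headroom))


# Example: 100,000 images now, 20,000 more per year for 5 years,
# at 300,000 bytes per medium resolution image.
capacity = target_capacity_bytes(100_000, 20_000, 5, 300_000)
print(f"Target capacity: {capacity / 1e9:.0f} GB")
```

Mirrored drives or backup copies would multiply this figure again.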
Obsolescence factor:
Any investment in software or hardware must be weighed against the understanding
that the technology will become obsolete in the near future. For small
digitization projects, outsourcing production may be justified to
avoid a one-time purchase of equipment that may soon become obsolete.
Options to consider
Migration strategies:
The best migration strategy for smaller institutions is to employ
the highest quality standards for image capture that you can afford when
doing the project. Also document thoroughly the steps in the image capture
process, especially information about the scanning equipment, where and how
the original documents are stored and preserved, and where the digital
master file is kept.
Obsolescence of hardware and software:
Today's "state of the art" is inevitably tomorrow's dinosaur. It used to be
thought that a durable process could be employed
that would guarantee usefulness for a century or longer. With today's
rapidly changing technical environment, shorter-term goals are appropriate.
The most wasteful digitization effort is one that creates a product that
cannot be migrated to the next technical environment. This is true for
file formats, for software for storage, retrieval, and viewing, and for
hardware that becomes an orphan. Consider the constant march of processing
capacities, image file formats, operating systems, and viewing resolutions.
Conversion to future media:
Given the effort required to capture digital images from source objects,
it is not reasonable to expect rescanning with each new level of technology.
Adopting popular standards and aggressively migrating to proven, stable
environments and media will give the digitized collection an extended
life.
Research and Development:
It is not practical to establish a process that endures through technical
evolution. The nature of digitization is constantly changing. Staying current
requires attention to the field, including participation in conferences,
subscriptions to topical journals, and dialog with other practitioners. As the
tools improve, consider migrating the digitization process to them.
Choose formats that fit your need and budget:
Text documents can be offered as images, but costs rise if you use
OCR or other interpretive tools. Color images cost more and require far
more storage than black and white images. Materials that can
be fed through a sheet scanner cost less to digitize than those that require
flatbed scanning. Scanning from archival quality microfilm may be better
for rare and delicate materials than digitizing directly.
Project Checklist
These guidelines are from the Collaborative Digitization Program:
- Scanning at the highest resolution appropriate to the informational content of the originals.
- Scanning at an appropriate level of quality to avoid rescanning and re-handling of the originals in the future: scan once.
- Creating and storing a master image file that can be used to produce derivative image files and serve a variety of current and future user needs.
- Using system components that are non-proprietary.
- Using image file formats and compression techniques that conform to industry standards.
- Creating backup copies of all files on a stable medium.
- Creating meaningful metadata for image files or collections.
- Storing media in an appropriate environment.
- Monitoring and recopying data as necessary.
- Outlining a migration strategy for transferring data across generations of technology.
- Anticipating and planning for future technological developments.
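The checklist does not say how to monitor stored files. One common approach, offered here only as an illustrative sketch, is to record a checksum for every master file and compare against that record later. The function names and the choice of SHA-256 are assumptions, not part of the Collaborative Digitization Program guidelines.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large image masters need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(root)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built at capture time and kept apart from the files can be checked on a schedule. Any file that `changed_files` reports is a candidate for recopying from a backup.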