TECHNICAL NOTES

Equipment
The Service Center uses three scanners. A Kodak Photo CD Imaging Workstation 4220 scans 35mm, 4 in. x 5 in., and 120-format transparencies and negatives. It consists of a PCD Scanner 4045; a UNIX-based Sun SPARCstation data manager; a PCD Writer 225, which writes scanned images to a Kodak Photo CD; and a Kodak XLS 8650 Color Printer, which prints index prints to insert in CD jewel cases. All software is proprietary to Kodak.

A Xerox DocuImage 620s flatbed scanner with an automatic sheet feeder scans text or grayscale images up to 11.5 in. x 17 in. Proprietary XDOD software, requiring Windows NT, scans to TIFF format through a Hewlett-Packard Vectra PC.

An Epson 836XL flatbed scanner scans color or grayscale images up to 11.5 in. x 17 in. An adapter for transparencies allows scanning of negatives or positive transparencies from 35mm to 11.5 in. x 17 in. Epson Twain Pro software allows scanning into any TWAIN-compliant application. LaserSoft SilverFast, an Adobe Photoshop plugin, provides a scanning interface within Photoshop.

The Service Center has two PCs. The HP Vectra is a 266 MHz Pentium II with 128MB RAM, an 8GB hard drive, and an ultrawide SCSI board. It drives both the DocuImage and Epson scanners by dual-booting into either Windows 95 or Windows NT. A Dell Dimension 450 Pentium II with 128MB RAM and a 16.8GB hard drive handles other computing under Windows 98. Neither PC is used as the Web site server.

A Hewlett-Packard Color LaserJet 5 printer is networked to both PCs.

A Hewlett-Packard CD Writer Plus 7200 is used for storing archival digital files on CD.

The managers of the California State Library Web site have offered web server space on a temporary basis.

Scanning
Several 35mm slides and 4 in. x 5 in. negatives are scanned on the Kodak Workstation. Black and white photos, line drawings, and text are scanned on the DocuImage scanner. Large transparencies, glass lantern slides, 4 in. x 5 in. negatives, and original broadsides and lithographs are scanned on the Epson scanner. This selection of formats allows a good comparison of scanner operation, requirements, and results.

In some cases the same image is scanned from different media, because negatives, prints, and 35mm slides all exist for it. In these cases the basic database search terms are the same for each version, so a single query yields all versions for online comparison. For the most part, images are not enhanced to "improve on" the original documents, although in some cases spots or creases are removed during the manipulation process.

The estimated time spent scanning and manipulating images, formatting three sizes, naming files, and storing TIFF files on CDs is about 8 hours for 30 images. This varies, of course, with frequent changes in the format or condition of the source documents.
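The estimate above works out to roughly 16 minutes per image. The following is a minimal sketch of that arithmetic for planning larger batches. The function names are illustrative only, and the calculation assumes the rate stays constant, which the note above warns is not always the case.

```python
# Planning arithmetic based on the stated estimate: about 8 hours per 30 images.
# Assumption: the rate is constant across formats and source conditions.
HOURS_PER_BATCH = 8
IMAGES_PER_BATCH = 30


def minutes_per_image() -> float:
    """Average handling time for one image, in minutes."""
    return HOURS_PER_BATCH * 60 / IMAGES_PER_BATCH


def estimate_hours(n_images: int) -> float:
    """Estimated hours needed to scan and process n_images."""
    return n_images * HOURS_PER_BATCH / IMAGES_PER_BATCH


if __name__ == "__main__":
    print(f"{minutes_per_image():.0f} minutes per image")
    print(f"{estimate_hours(100):.1f} hours for 100 images")
```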

Software
The scanning software used is the package supplied with each scanner, as indicated above.

For image manipulation and formatting, Adobe Photoshop 5.0 is used to crop, clean, and balance each image. The original TIFF image is then converted from a bitmapped TIFF to a grayscale or RGB JPEG (8-bit grayscale or 24-bit color) and resized to about 580 x 360 pixels (depending on the original image size) at 72 dpi. It is then saved using a compression ratio of 7 for black and white images and 4 for color images. This yields a file size between 75K and 200K, depending on the original image size, which is suitable for Web publication. The Web image is then reduced again to thumbnail size. For conformity, all thumbnails are 144 pixels wide (2 inches at 72 dpi).
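The size arithmetic in this step can be sketched as follows. This is only an illustration, not the Photoshop procedure itself: it assumes that "about 580 x 360" means fitting the image inside a 580 x 360 pixel box while preserving its proportions, and the function names are hypothetical.

```python
# Sketch of the Web-image and thumbnail size calculations.
# Assumption: the Web image is scaled to fit inside a 580 x 360 pixel box
# with its aspect ratio preserved, and is never enlarged.

def fit_within(width, height, max_width=580, max_height=360):
    """Return (width, height) scaled to fit inside the box."""
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def thumbnail_size(width, height, thumb_width=144):
    """Return the size of a thumbnail that is thumb_width pixels wide."""
    scale = thumb_width / width
    return thumb_width, max(1, round(height * scale))


def inches_at(pixels, dpi=72):
    """Nominal size in inches of a pixel dimension at the given resolution."""
    return pixels / dpi


if __name__ == "__main__":
    web = fit_within(3000, 2000)      # a hypothetical 3000 x 2000 pixel scan
    thumb = thumbnail_size(*web)
    print(web, thumb, inches_at(thumb[0]))
```

At 72 dpi, a 144-pixel thumbnail corresponds to the 2-inch width given above.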

The descriptive data is maintained in a Microsoft Access database. A query program written with Cold Fusion allows dynamic searching on the Web site. 
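The kind of search described here can be sketched as follows. This is not the project's Cold Fusion program or its Access database: it substitutes Python's built-in sqlite3 module, and the table name, column names, and sample rows are all hypothetical, invented only to show how one search term can return every version of an image scanned from different media.

```python
import sqlite3


def find_images(conn, term):
    """Return (filename, source_medium) rows whose search terms contain term."""
    cur = conn.execute(
        "SELECT filename, source_medium FROM images "
        "WHERE search_terms LIKE ? ORDER BY filename",
        (f"%{term}%",),
    )
    return cur.fetchall()


# Hypothetical schema and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "filename TEXT PRIMARY KEY, library TEXT, "
    "search_terms TEXT, source_medium TEXT)"
)
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?, ?)",
    [
        ("demo001.jpg", "Example Library", "state capitol; sacramento", "35mm slide"),
        ("demo002.jpg", "Example Library", "state capitol; sacramento", "4x5 negative"),
        ("demo003.jpg", "Example Library", "gold rush; broadside", "original print"),
    ],
)
conn.commit()

if __name__ == "__main__":
    for row in find_images(conn, "capitol"):
        print(row)
```

Because both versions of the first image share the same search terms, a single query returns both for side-by-side comparison.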

Filenames and Directories
After scanning is completed, the resulting digital files are named. Filenames with some mnemonic connection to their origins seem more useful than a strictly numeric or chronological system. Various conventions have emerged, and the Service Center filenames are loosely based on the original collections. Each image file has a distinct filename, generally referring to the owning library or collection. The exceptions to this convention are the images scanned on the Kodak scanner directly to a CD, whose names refer to the Kodak-assigned CD numbers and image numbers.

Files for each library are grouped in a separate disk directory, so they can easily be found when needed.

TIFF image files are stored on CDs by owning library. The information database was constructed to accommodate the various data fields maintained by each library.

All thumbnail files bear the same names as the larger images followed by "a".
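The thumbnail naming rule can be sketched as follows. It assumes that the "a" is appended to the base name, before the file extension; the function name and the sample filename are hypothetical.

```python
from pathlib import PurePosixPath


def thumbnail_name(image_name: str) -> str:
    """Return the thumbnail filename: the image's base name followed by 'a'.

    Assumption: the 'a' goes before the extension, so that
    'demo001.jpg' becomes 'demo001a.jpg'.
    """
    path = PurePosixPath(image_name)
    return str(path.with_name(path.stem + "a" + path.suffix))


if __name__ == "__main__":
    print(thumbnail_name("demo001.jpg"))
```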

Project Staff
Staffing for this project consists of a Project Manager working about 100 hours per month, a student Scanning Assistant who works about half-time, and a student programmer who works on a consulting basis.

The Scanning Assistant scans material on the appropriate scanner, manipulates the digital images for archiving and for Web access, and names image files in an orderly fashion. A student Web designer designed the Web page and wrote the database query program. He continues to maintain the Web site by adding images and data, and consults on refining the appearance of the Web pages.

The Project Manager, in addition to providing general oversight of the project, maintains the database and makes arrangements with participating libraries for scanning their materials.