How to Build a Digital Library.
Title:
How to Build a Digital Library.
Author:
Witten, Ian H.
ISBN:
9780080890395
Edition:
2nd ed.
Physical Description:
1 online resource (655 pages)
Contents:
Front cover -- Half title page -- How to Build a Digital Library -- Copyright page -- Table of Contents -- Preface -- The Greenstone Software -- Updated and Revised Content -- How the Book Is Organized -- What the Book Covers -- About the Web Site -- Acknowledgments -- Part I Principles and Practices -- Chapter 1 Orientation -- Example One: Supporting Human Development -- Example Two: Pushing on the Frontiers of Science -- Example Three: Preserving a Traditional Culture -- Example Four: Exploring Popular Music -- The scope of digital libraries -- 1.1 Libraries and Digital Libraries -- 1.2 The Changing Face of Libraries -- In the beginning -- The information explosion -- The Alexandrian principle -- Early technodreams -- The library catalog -- The changing nature of books -- 1.3 Searching for Sophocles -- 1.4 Digital Libraries in Developing Countries -- Disseminating humanitarian information -- Disaster relief -- Preserving indigenous culture -- Locally produced information -- The technological infrastructure -- 1.5 The Pen Is Mighty: Wield It Wisely -- Copyright law -- The public domain -- Relinquishing copyright -- Digital rights management -- Copyright and digitization -- Collecting from the Web -- Illegal and harmful material -- Cultural sensitivity -- 1.6 Planning a Digital Library -- 1.7 Implementing a Digital Library: The Greenstone Software -- 1.8 Notes and Sources -- Chapter 2 People in digital libraries -- 2.1 Roles -- Global users -- Roles of librarians -- Change -- 2.2 Identity -- Anonymous use -- Authenticated use -- Recording usage data -- 2.3 Help and User Support Services -- 2.4 Working with Digital Collections -- Using information from digital libraries -- Referring to objects in a digital library -- Berry-picking -- 2.5 User Contributions -- Annotations -- Keywords -- Ratings -- Corrections -- New documents.

Partial and fluid documents -- 2.6 Notes and Sources -- Chapter 3 Presentation -- From People to Presentation -- 3.1 Presenting Textual Documents -- Documents, chapters, sections -- Unstructured text documents -- Page images -- Images with text -- Realistic books -- 3.2 Presenting Multimedia Documents -- Sound and pictures -- Video -- Music -- 3.3 Document Surrogates -- Metadata -- Multimedia surrogates -- 3.4 Searching -- Types of queries -- Case-folding and stemming -- Phrase searching -- Query interfaces -- Searching multimedia -- Searching music -- Searching images -- 3.5 Metadata Browsing -- Lists -- Dates -- Hierarchies -- Facets -- 3.6 Putting It All Together -- An institutional repository -- 3.7 Notes and Sources -- Chapter 4 Textual documents -- 4.1 Representing Textual Documents -- ASCII -- Unicode -- Plain text -- Indexing -- Word segmentation -- 4.2 Textual Images -- Scanning -- Optical character recognition -- Acquisition, cleanup, and page analysis -- Recognition -- Checking and saving -- Page handling -- Planning an image digitization project -- Inside an OCR shop -- An example project -- 4.3 Web Documents: HTML and XML -- Markup and stylesheet languages -- Basic HTML -- Using HTML in a digital library -- Basic XML -- Parsing XML -- Using XML in a digital library -- 4.4 Presenting Web Documents: CSS and XSL -- CSS -- Cascading style sheets -- Context- and media-dependent formatting -- Extensible stylesheet language -- Using Formatting Objects -- Context- and media-dependent formatting -- Processing in XSL -- 4.5 Page Description Languages: PostScript and PDF -- PostScript fundamentals -- The language -- Evolution -- Encapsulated PostScript -- Fonts -- Font formats -- Composite fonts -- Compatibility with Unicode -- Text extraction -- A simple text extraction program -- Improving the output -- Using PostScript in a digital library.

Portable Document Format: PDF -- Inside a PDF file -- Features of PDF -- Linearized PDF -- Security and PDF documents -- PDF and PostScript -- 4.6 Word-Processor Documents -- Rich Text Format: RTF -- Basic types -- Backward compatibility -- File structure -- Other features -- Using RTF in a digital library -- Native Word formats -- Using native Word in a digital library -- Office Open XML: OOXML -- Open Document format: ODF -- Open Document files -- Formatting -- Using ODF in a digital library -- Scientific documents: LaTeX -- Using LaTeX in a digital library -- 4.7 Other Documents -- Spreadsheets and presentation files -- E-mail -- 4.8 Notes and Sources -- Chapter 5 Multimedia -- 5.1 Introducing Compression and Transforms -- Basic compression techniques -- Transforms -- The Fourier transform -- 5.2 Audio -- Pulse code modulation: PCM -- Variants of PCM -- Early formats: WAV, AIFF, AU -- MPEG audio: MP3 and its siblings -- Post-MP3 formats: AAC, Ogg Vorbis, FLAC -- Replaying audio -- An audio digital library -- 5.3 Images -- Lossless compression: GIF and PNG -- Lossy compression: JPEG -- Progressive refinement -- Archiving images: JPEG 2000 and TIFF -- A digital library of photographs -- Vector graphics images -- 5.4 Video -- Codecs -- Multimedia compression: MPEG -- Inside MPEG -- MPEG-1 -- Mixing media -- MPEG-2 -- MPEG-4 -- High Definition Digital Television -- Proprietary formats -- Streaming -- Ogg Theora -- Using multimedia in a digital library -- A video digital library -- Reflection -- 5.5 Rich Media -- Synchronized Multimedia Integration Language: SMIL -- Adobe Flash -- 5.6 Music -- Musical Instrument Digital Interface: MIDI -- Channel Events -- Meta Events -- System Exclusive Events -- Digital music libraries -- 5.7 Notes and Sources -- Audio -- Images -- Video -- Rich Media -- Music -- Chapter 6 Metadata -- 6.1 Characteristics of Metadata.

6.2 Bibliographic Metadata -- MARC -- MARCXML -- Dublin Core: DC -- Qualified Dublin Core -- Metadata Object Description Schema: MODS -- BibTeX -- EndNote -- 6.3 Metadata for Multimedia -- Image metadata: TIFF -- Image metadata: EXIF, XMP, IPTC, and MIX -- Audio metadata -- Video metadata -- Multimedia metadata: MPEG-7 -- Multimedia application metadata: MPEG-21 -- 6.4 Metadata for Compound Objects -- Resource Description Framework: RDF -- Metadata Encoding and Transmission Standard: METS -- Collection-level metadata -- Open Archives Initiative Object Reuse and Exchange: OAI-ORE -- Metadata for education: LOM and SCORM -- Metadata for eResearch -- 6.5 Metadata Quality -- Authority control: Names -- Authority control: Subjects -- Controlling metadata values -- Metadata tools -- 6.6 Extracting Metadata -- Extracting document metadata -- Generic entity extraction -- Bibliographic references -- Language identification -- Acronym extraction -- Key-phrase metadata -- Key-phrase extraction -- Key-phrase indexing -- 6.7 Notes and Sources -- Chapter 7 Interoperability -- 7.1 Z39.50 Protocol -- 7.2 Open Archives Initiative -- OAI Protocol for Metadata Harvesting: OAI-PMH -- Serving -- Harvesting -- 7.3 Object Identification -- Handles -- Digital object identifiers: DOIs -- OpenURLs -- Persistence -- 7.4 Web Services -- Search/Retrieval via URL: SRU -- 7.5 Authentication and Security -- 7.6 DSpace and Fedora -- DSpace -- Fedora -- 7.7 Notes and Sources -- Chapter 8 Internationalization -- 8.1 Multilingual interfaces and documents -- 8.2 Unicode -- Composite and combining characters -- Unicode character encodings -- UTF-32 -- UTF-16 -- UTF-8 -- Using Unicode in a digital library -- 8.3 Hindi and indic scripts -- ISCII: Indian Script Code for Information Interchange -- Unicode for Indic scripts -- Problems with the adoption of Unicode.

8.4 Word segmentation and sorting -- Segmenting words -- Segmenting words in Thai/Khmer/Lao -- Sorting Chinese text -- 8.5 Notes and sources -- Chapter 9 Visions -- 9.1 Libraries of the future -- Today's visions -- Tomorrow's visions -- Working inside the digital library -- 9.2 Preserving the past -- The problem of preservation -- A sorry tale -- The Domesday Project -- Demise -- Resurrection -- The digital dark ages -- Preservation strategies -- Preservation in practice -- The Internet Archive -- 9.3 Trends in digital libraries -- Mobility: Portable collections -- Knowledge-based information retrieval -- 9.4 Digital libraries for oral cultures -- 9.5 Notes and sources -- Part II Greenstone Digital Library Software -- Chapter 10 Building collections -- 10.1 The Reader's Interface -- The Greenstone digital library -- Exploring the Demo collection -- Browsing -- Searching -- Preferences -- 10.2 The Librarian Interface -- Users and functions -- A walk-through -- Getting started -- Assembling the source material -- Enriching the documents -- Designing the collection -- Building the collection -- Formatting the pages -- Previewing -- Help -- 10.3 Working with Documents -- HTML documents -- Word and PDF files -- Enriching with metadata -- Designing the collection -- Changing the format -- Enhanced Word document handling -- Extracting document structure -- Detecting user-defined styles -- Extracting document properties -- Enhanced PDF document handling -- Trouble-shooting PDF collections -- Switching modes in the Librarian interface -- Splitting PDF documents into sections -- Converting PDF documents to page images -- Working with mixed PDF collections -- Highlighting search terms -- Enhanced HTML document handling -- Extracting metadata -- Extracting document structure -- Metadata for hierarchical documents -- Scaling up -- Examining different file types.

Adding metadata.
Abstract:
How to Build a Digital Library reviews the knowledge and tools needed to construct and maintain a digital library, regardless of its size or purpose. It is a resource for individuals, agencies, and institutions wishing to put this powerful tool to work in their burgeoning information treasuries. The Second Edition reflects developments in the field as well as in the Greenstone Digital Library open source software. In Part I, the authors have added an entirely new chapter on user groups, user support, collaborative browsing, user contributions, and so on. There is also new material on content-based queries, map-based queries, and cross-media queries. Multimedia receives increased emphasis, with a "digitizing" section added to each major media type. A new chapter on "internationalization" addresses Unicode standards, multi-language interfaces and collections, and issues with non-European languages (Chinese, Hindi, etc.). Part II, the software tools section, has been completely rewritten to reflect new developments in the Greenstone Digital Library Software, an internationally popular open source software tool with a comprehensive graphical facility for creating and maintaining digital libraries. The book outlines the history of libraries, both traditional and digital. It is written for both technical and non-technical audiences and covers the entire spectrum of media, including text, images, audio, video, and related XML standards. It is web-enhanced with software documentation, color illustrations, full-text index, source code, and more.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.