Title:

Solr 1.4 Enterprise Search Server.

Author:

Smiley, David.

ISBN:

9781847195890

Personal Author:

Smiley, David.

Edition:

1st ed.

Physical Description:

1 online resource (391 pages)

Contents:

Solr 1.4 Enterprise Search Server -- Table of Contents -- Solr 1.4 Enterprise Search Server -- Credits -- About the Authors -- About the Reviewers -- Preface -- What this book covers -- Who this book is for -- Conventions -- Reader feedback -- Customer support -- Downloading the example code for the book -- Errata -- Piracy -- Questions -- 1. Quick Starting Solr -- An introduction to Solr -- Lucene, the underlying engine -- Solr, the Server-ization of Lucene -- Comparison to database technology -- Getting started -- The last official release or fresh code from source control -- Testing and building Solr -- Solr's installation directory structure -- Solr's home directory -- How Solr finds its home -- Deploying and running Solr -- A quick tour of Solr! -- Loading sample data -- A simple query -- Some statistics -- The schema and configuration files -- Solr resources outside this book -- Summary -- 2. Schema and Text Analysis -- MusicBrainz.org -- One combined index or multiple indices -- Problems with using a single combined index -- Schema design -- Step 1: Determine which searches are going to be powered by Solr -- Step 2: Determine the entities returned from each search -- Step 3: Denormalize related data -- Denormalizing-"one-to-one" associated data -- Denormalizing-"one-to-many" associated data -- Step 4: (Optional) Omit the inclusion of fields only used in search results -- The schema.xml file -- Field types -- Field options -- Field definitions -- Sorting -- Dynamic fields -- Using copyField -- Remaining schema.xml settings -- Text analysis -- Configuration -- Experimenting with text analysis -- Tokenization -- WordDelimiterFilterFactory -- Stemming -- Synonyms -- Index-time versus Query-time, and to expand or not -- Stop words -- Phonetic sounds-like analysis -- Partial/Substring indexing -- N-gramming costs -- Miscellaneous analyzers.

Summary -- 3. Indexing Data -- Communicating with Solr -- Direct HTTP or a convenient client API -- Data streamed remotely or from Solr's filesystem -- Data formats -- Using curl to interact with Solr -- Remote streaming -- Sending XML to Solr -- Deleting documents -- Commit, optimize, and rollback -- Sending CSV to Solr -- Configuration options -- Direct database and XML import -- Getting started with DIH -- The DIH development console -- DIH DataSources of type JdbcDataSource -- DIH documents, entities -- DIH fields and transformers -- Importing with DIH -- Indexing documents with Solr Cell -- Extracting binary content -- Configuring Solr -- Extracting karaoke lyrics -- Indexing richer documents -- Summary -- 4. Basic Searching -- Your first search, a walk-through -- Solr's generic XML structured data representation -- Solr's XML response format -- Parsing the URL -- Query parameters -- Parameters affecting the query -- Result paging -- Output related parameters -- Diagnostic query parameters -- Query syntax -- Matching all the documents -- Mandatory, prohibited, and optional clauses -- Boolean operators -- Sub-expressions (aka sub-queries) -- Limitations of prohibited clauses in sub-expressions -- Field qualifier -- Phrase queries and term proximity -- Wildcard queries -- Fuzzy queries -- Range queries -- Date math -- Score boosting -- Existence (and non-existence) queries -- Escaping special characters -- Filtering -- Sorting -- Request handlers -- Scoring -- Query-time and index-time boosting -- Troubleshooting scoring -- Summary -- 5. Enhanced Searching -- Function queries -- An example: Scores influenced by a lookupcount -- Field references -- Function reference -- Mathematical primitives -- Miscellaneous math -- ord and rord -- An example with scale() and lookupcount -- Using logarithms -- Using inverse reciprocals.

Using reciprocals and rord with dates -- Function query tips -- Dismax Solr request handler -- Lucene's DisjunctionMaxQuery -- Configuring queried fields and boosts -- Limited query syntax -- Boosting: Automatic phrase boosting -- Configuring automatic phrase boosting -- Phrase slop configuration -- Boosting: Boost queries -- Boosting: Boost functions -- Min-should-match -- Basic rules -- Multiple rules -- What to choose -- A default search -- Faceting -- A quick example: Faceting release types -- MusicBrainz schema changes -- Field requirements -- Types of faceting -- Faceting text -- Alphabetic range bucketing (A-C, D-F, and so on) -- Faceting dates -- Date facet parameters -- Faceting on arbitrary queries -- Excluding filters -- The solution: Local Params -- Facet prefixing (term suggest) -- Summary -- 6. Search Components -- About components -- The highlighting component -- A highlighting example -- Highlighting configuration -- Query elevation -- Configuration -- Spell checking -- Schema configuration -- Configuration in solrconfig.xml -- Configuring spellcheckers (dictionaries) -- Processing of the q parameter -- Processing of the spellcheck.q parameter -- Building the dictionary from its source -- Issuing spellcheck requests -- Example usage for a misspelled query -- An alternative approach -- The more-like-this search component -- Configuration parameters -- Parameters specific to the MLT search component -- Parameters specific to the MLT request handler -- Common MLT parameters -- MLT results example -- Stats component -- Configuring the stats component -- Statistics on track durations -- Field collapsing -- Configuring field collapsing -- Other components -- Terms component -- termVector component -- LocalSolr component -- Summary -- 7. Deployment -- Implementation methodology -- Questions to ask -- Installing into a Servlet container.

Differences between Servlet containers -- Defining solr.home property -- Logging -- HTTP server request access logs -- Solr application logging -- Configuring logging output -- Logging to Log4j -- Jetty startup integration -- Managing log levels at runtime -- A SearchHandler per search interface -- Solr cores -- Configuring solr.xml -- Managing cores -- Why use multicore -- JMX -- Starting Solr with JMX -- Take a walk on the wild side! Use JRuby to extract JMX information -- Securing Solr -- Limiting server access -- Controlling JMX access -- Securing index data -- Controlling document access -- Other things to look at -- Summary -- 8. Integrating Solr -- Structure of included examples -- Inventory of examples -- SolrJ: Simple Java interface -- Using Heritrix to download artist pages -- Indexing HTML in Solr -- SolrJ client API -- Indexing POJOs -- When should I use Embedded Solr -- In-Process streaming -- Rich clients -- Upgrading from legacy Lucene -- Using JavaScript to integrate Solr -- Wait, what about security? -- Building a Solr powered artists autocomplete widget with jQuery and JSONP -- SolrJS: JavaScript interface to Solr -- Accessing Solr from PHP applications -- solr-php-client -- Drupal options -- Apache Solr Search integration module -- Hosted Solr by Acquia -- Ruby on Rails integrations -- acts_as_solr -- Setting up MyFaves project -- Populating MyFaves relational database from Solr -- Build Solr indexes from relational database -- Complete MyFaves web site -- Blacklight OPAC -- Indexing MusicBrainz data -- Customizing display -- solr-ruby versus rsolr -- Summary -- 9. Scaling Solr -- Tuning complex systems -- Using Amazon EC2 to practice tuning -- Firing up Solr on Amazon EC2 -- Optimizing a single Solr server (Scale High) -- JVM configuration -- HTTP caching -- Solr caching -- Tuning caches -- Schema design considerations.

Indexing strategies -- Disable unique document checking -- Commit/optimize factors -- Enhancing faceting performance -- Using term vectors -- Improving phrase search performance -- The solution: Shingling -- Moving to multiple Solr servers (Scale Wide) -- Script versus Java replication -- Starting multiple Solr servers -- Configuring replication -- Distributing searches across slaves -- Indexing into the master server -- Configuring slaves -- Distributing search queries across slaves -- Sharding indexes -- Assigning documents to shards -- Searching across shards -- Combining replication and sharding (Scale Deep) -- Summary -- Index.

Abstract:

Enhance your search with faceted navigation, result highlighting, fuzzy queries, ranked scoring, and more.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

Subject Term:

Client/server computing.

Data mining.

Electronic books. -- local.

Open source software.

Search engines -- Programming.

