Apache Solr Beginner’s Guide.
Title:
Apache Solr Beginner’s Guide.
Author:
Serafini, Alfredo.
ISBN:
9781782162537
Physical Description:
1 online resource (435 pages)
Contents:
Apache Solr Beginner's Guide -- Table of Contents -- Apache Solr Beginner's Guide -- Credits -- About the Author -- Acknowledgments -- About the Reviewers -- www.PacktPub.com -- Support files, eBooks, discount offers and more -- Why Subscribe? -- Free Access for Packt account holders -- Preface -- What this book covers -- What you need for this book -- Who this book is for -- Conventions -- Time for action - heading -- What just happened? -- Pop quiz - heading -- Reader feedback -- Customer support -- Downloading the example code -- Errata -- Piracy -- Questions -- 1. Getting Ready with the Essentials -- Understanding Solr -- Learning the powerful aspects of Solr -- Working with Java installation -- Downloading and installing Java -- Configuring CLASSPATH and PATH variables for Java -- Installing and testing Solr -- Time for action - starting Solr for the first time -- What just happened? -- Taking a glance at the Solr interface -- Time for action - posting some example data -- What just happened? -- Time for action - testing Solr with cURL -- What just happened? -- Who uses Solr? -- Resources on Solr -- How will we use Solr? -- Pop quiz -- Summary -- 2. Indexing with Local PDF Files -- Understanding and using an index -- Posting example documents to the first Solr core -- Analyzing the elements we need in Solr core -- Time for action - configuring Solr Home and Solr core discovery -- What just happened? -- Knowing the legacy solr.xml format -- Time for action - writing a simple solrconfig.xml file -- What just happened? -- Time for action - writing a simple schema.xml file -- What just happened? -- Time for action - starting the new core -- What just happened? -- Time for action - defining an example document -- What just happened? -- Time for action - indexing an example document with cURL -- What just happened?.

Executing the first search on the new core -- Adding documents to the index from the web UI -- Time for action - updating an existing document -- What just happened? -- Time for action - cleaning an index -- What just happened? -- Creating an index prototype from PDF files -- Time for action - defining the schema.xml file with only dynamic fields and tokenization -- What just happened? -- Time for action - writing a simple solrconfig.xml file with an update handler -- What just happened? -- Testing the PDF file core with dummy data and an example query -- Defining a new tokenized field for fulltext -- Time for action - using Tika and cURL to extract text from PDFs -- What just happened? -- Using cURL to index some PDF data -- Time for action - finding copies of the same files with deduplication -- What just happened? -- Time for action - looking inside an index with SimpleTextCodec -- What just happened? -- Understanding the structure of an inverted index -- Understanding how optimization affects the segments of an index -- Writing the full configuration for our PDF index example -- Writing the solrconfig.xml file -- Writing the schema.xml file -- Summarizing some easy recipes for the maintenance of an index -- Pop quiz -- Summary -- 3. Indexing Example Data from DBpedia - Paintings -- Harvesting paintings' data from DBpedia -- Analyzing the entities that we want to index -- Analyzing the first entity - Painting -- Writing Solr core configurations for the first tests -- Time for action - defining the basic solrconfig.xml file -- What just happened? -- Looking at the differences between commits and soft commits -- Time for action - defining the simple schema.xml file -- What just happened? -- Introducing analyzers, tokenizers, and filters -- Thinking fields for atomic updates -- Indexing a test entity with JSON -- Understanding the update chain.

Using the atomic update -- Understanding how optimistic concurrency works -- Time for action - listing all the fields with the CSV output -- What just happened? -- Defining a new Solr core for our Painting entity -- Time for action - refactoring the schema.xml file for the paintings core by introducing tokenization and stop words -- What just happened? -- Using common field attributes for different use cases -- Testing the paintings schema -- Collecting the paintings data from DBpedia -- Downloading data using the DBpedia SPARQL endpoint -- Creating Solr documents for example data -- Indexing example data -- Testing our paintings core -- Time for action - looking at a field using the Schema browser in the web interface -- What just happened? -- Time for action - searching the new data in the paintings core -- What just happened? -- Using the Solr web interface for simple maintenance tasks -- Pop quiz -- Summary -- 4. Searching the Example Data -- Looking at Solr's standard query parameters -- Adding a timestamp field for tracking the last modified time -- Sending Solr's query parameters over HTTP -- Testing HTTP parameters on browsers -- Choosing a format for the output -- Time for action - searching for all documents with pagination -- What just happened? -- Time for action - projecting fields with fl -- What just happened? -- Introducing pseudo-fields and DocTransformers -- Adding a constant field using transformers -- Time for action - adding a custom DocTransformer to hide empty fields in the results -- What just happened? -- Looking at core parameters for queries -- Using the Lucene query parser with defType -- Time for action - searching for terms with a Boolean query -- What just happened? -- Time for action - using q.op for the default Boolean operator -- What just happened? -- Time for action - selecting documents with the filter query.

What just happened? -- Time for action - searching for incomplete terms with the wildcard query -- What just happened? -- Time for action - using the Boost options -- What just happened? -- Understanding the basic Lucene score -- Time for action - searching for similar terms with fuzzy search -- What just happened? -- Time for action - writing a simple phrase query example -- What just happened? -- Time for action - playing with range queries -- What just happened? -- Time for action - sorting documents with the sort parameter -- What just happened? -- Playing with the request -- Time for action - adding a default parameter to a handler -- What just happened? -- Playing with the response -- Summarizing the parameters that affect result presentation -- Analyzing response format -- Time for action - enabling XSLT Response Writer with Luke -- What just happened? -- Listing all fields names with CSV output -- Listing all field details for a core -- Exploring Solr for Open Data publishing -- Publishing results in CSV format -- Publishing results with an RSS feed -- Good resources on Solr Query Syntax -- Pop quiz -- Summary -- 5. Extending Search -- Looking at different search parsers - Lucene, Dismax, and Edismax -- Starting from the previous core definition -- Time for action - inspecting results using the stats and debug components -- What just happened? -- Looking at Lucene and Solr query parsers -- Time for action - debugging a query with the Lucene parser -- What just happened? -- Time for action - debugging a query with the Dismax parser -- What just happened? -- Using an Edismax default handler -- Time for action - executing a nested Edismax query -- What just happened? -- A short list of search components -- Adding the blooming filter and real-time Get -- Time for action - executing a simple pseudo-join query -- What just happened?.

Highlighting results to improve the search experience -- Time for action - generating highlighted snippets over a term -- What just happened? -- Some idea about geolocalization with Solr -- Time for action - creating a repository of cities -- What just happened? -- Playing more with spatial search -- Looking at the new Solr 4 spatial features - from points to polygons -- Time for action - expanding the original data with coordinates during the update process -- What just happened? -- Performing editorial correction on boosting -- Introducing the spellcheck component -- Time for action - playing with spellchecks -- What just happened? -- Using a file to spellcheck against a list of controlled words -- Collecting some hints for spellchecking analysis -- Pop quiz -- Summary -- 6. Using Faceted Search - from Searching to Finding -- Exploring documents suggestion and matching with faceted search -- Time for action - prototyping an auto-suggester with facets -- What just happened? -- Time for action - creating wordclouds on facets to view and analyze data -- What just happened? -- Thinking about faceted search and findability -- Faceting for narrowing searches and exploring data -- Time for action - defining facets over enumerated fields -- What just happened? -- Performing data normalization for the keyword field during the update phase -- Reading more about Solr faceting parameters -- Time for action - finding interesting topics using faceting on tokenized fields with a filter query -- What just happened? -- Using filter queries for caching filters -- Time for action - finding interesting subjects using a facet query -- What just happened? -- Time for action - using range queries and facet range queries -- What just happened? -- Time for action - using a hierarchical facet (pivot) -- What just happened? -- Introducing group and field collapsing.

Time for action - grouping results.
Abstract:
Written in a friendly, example-driven format, the book includes plenty of step-by-step instructions and examples that are designed to help you get started with Apache Solr. This book is an entry-level introduction to Apache Solr. It centers on a couple of simple projects, such as setting up Solr and customizing the Solr schema and configuration. It is for developers who want to start using Apache Solr but are stuck or intimidated by the difficulty of setting it up and using it. For anyone wanting to embed a search engine in their site to help users navigate the large amount of data available, this book is an ideal starting point. Moreover, if you are a data architect or a project manager and want to make some key design decisions, you will find that every example included in the book contains ideas usable in real-world contexts.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.