Automatic quote detection from literary work
by
 
Altıntaş, Aybike Güzel, author.

Title
Automatic quote detection from literary work

Author
Altıntaş, Aybike Güzel, author.

Added Author
Altıntaş, Aybike Güzel, author.

Physical Description
ix, 60 leaves: charts; + 1 computer laser optical disc.

Abstract
Literature inspires readers, and readers tend to share quotes from literary works. A reader underlines quotes in a book and shares them on social media or on an online platform for book readers. A quote is defined as a span of written text that many readers find interesting and that can be reused in different contexts. This study proposes a novel task in the field of Natural Language Processing: the Quote Detection Task. An original dataset was also built from the Goodreads and Project Gutenberg websites with web scraping. The quotes are Goodreads data sourced from Kaggle, and only those voted for by 10 or more users were selected. These quotes were validated against the books on the Project Gutenberg website. The final dataset consists of 4554 rows, each containing a quote together with its book span. The span of a quote consists of the 10 sentences preceding the quote, the quote itself, and the 10 sentences following it. The Quote Detection Task is a span detection task that can be modeled with sequence labeling solutions and neural extractive summarization systems from the literature. Two baselines were run: the statistics-based Conditional Random Field (CRF), which treats the task as a sequence tagging problem, and Extractive Summarization as Text Matching (MatchSum). These baselines obtained ROUGE-1 scores of 27.24% and 40.54%, respectively.
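The span construction described in the abstract (10 sentences before the quote, the quote, 10 sentences after) can be illustrated with a minimal sketch. This is not the thesis code. The function names `build_span` and `bio_tags`, the whitespace tokenization, the BIO tag names, and the assumption that a quote occupies exactly one sentence are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def build_span(sentences: List[str], quote_idx: int,
               window: int = 10) -> Tuple[List[str], int]:
    """Return the context window around a quote and the quote's position in it.

    Assumption: the quote is exactly one sentence, located at `quote_idx`.
    The window is clipped at the start and end of the book.
    """
    start = max(0, quote_idx - window)
    end = min(len(sentences), quote_idx + window + 1)
    return sentences[start:end], quote_idx - start


def bio_tags(span: List[str], quote_pos: int) -> List[Tuple[str, str]]:
    """Tag whitespace tokens for sequence labeling: B/I inside the quote, O elsewhere."""
    tagged = []
    for i, sentence in enumerate(span):
        for j, token in enumerate(sentence.split()):
            if i != quote_pos:
                tag = "O"
            elif j == 0:
                tag = "B-QUOTE"
            else:
                tag = "I-QUOTE"
            tagged.append((token, tag))
    return tagged


# Toy "book" of 30 placeholder sentences; sentence 15 plays the role of the quote.
book = [f"Sentence number {k}." for k in range(30)]
span, pos = build_span(book, quote_idx=15)
tags = bio_tags(span, pos)
```

Token-level tags of this kind are one common input format for a CRF tagger. Whether the thesis labels tokens or whole sentences is not stated in the abstract.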

Subject Heading
Natural language processing (Computer science)

Added Author
Tekir, Selma.

Added Corporate Author
İzmir Institute of Technology. Computer Engineering.

Uniform Title
Thesis (Master)--İzmir Institute of Technology: Computer Engineering.
 
İzmir Institute of Technology: Computer Engineering--Thesis (Master).

Electronic Access
Access to Electronic Version.


Library        Material Type   Item Number   Call Number            Status/Due Date
IYTE Library   Thesis          T002673       QA76.9.N38 A46 2022    Thesis Collection