STTR Phase I: Book Discovery through Literary DNA
Phone: (415) 205-8331
Contact: Thamar Solorio
Type: Nonprofit college or university
The broader impact/commercial potential of this Small Business Technology Transfer (STTR) Phase I project will be to bring modern data analytics to the book publishing industry and, for the first time in history, to apply machine learning to extract and articulate the human emotion involved in reading literature. This innovation will dramatically change the way books are discovered, resulting in the first commercial book recommendation system based on the experiential reading value of books. With approximately 1.4 million new books published each year, it is extremely difficult for authors to connect with readers and for readers to find the book that is just right for them. Current recommendation systems are based on purchase history or social networks and fail to provide what readers told us are the most important factors in their reading satisfaction: writing style and how a book will make them feel. The proposed STTR project will lead to a commercially marketable product that deeply personalizes the book discovery process and promotes literacy. Not only will the innovation help authors and readers connect but, on an even greater scale, it will affect the way books are written, acquired, distributed and sold.
This Small Business Technology Transfer (STTR) Phase I project proposes to tackle the next challenge in text classification: the higher-level experience of reading a book. The computational model of books will learn the relationship between content, genre, the author's writing style, and the mixture of sentiments in the book that, together, define how a book will make a reader feel. The opportunity is to extend research beyond what is already possible, namely analyzing thematic content in texts and the stylistic marks that characterize an author's writeprint, to systems that can also understand and articulate the reading experience itself.
The knowledge derived from the successful completion of this research represents a new frontier in natural language processing and machine learning akin to machine reading. To classify the reading experience of books with supervised learning, the project proposes to develop a large corpus of human-annotated books for training, development and evaluation of the approaches examined. The goal is initially to use multiple human annotators to create the training set from which the machine learning system will be trained. The trained system will then be applied to 19 million current books to generate deeply personalized book recommendations.
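The annotation-then-training workflow described above can be illustrated with a small sketch. It is not the project's method: the abstract does not name a classifier, so a multinomial naive Bayes model over words is used here purely as a stand-in, and the passages, the labels ("tense", "cozy", "dark") and all function names are hypothetical. Labels from several annotators are merged by majority vote, and items without a clear majority are left out of the training set.

```python
import math
from collections import Counter, defaultdict


def majority_label(labels):
    """Return the label most annotators chose, or None on a tie."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority; leave the item out of training
    return counts[0][0]


def train_naive_bayes(examples):
    """Fit a multinomial naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)
        # Add-one smoothing over the shared vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: each passage carries labels from several annotators.
annotated = [
    ("the storm closed in and nobody slept", ["tense", "tense", "dark"]),
    ("shadows moved behind the locked door", ["tense", "dark", "tense"]),
    ("they laughed over tea in the garden", ["cozy", "cozy", "cozy"]),
    ("warm bread and laughter filled the kitchen", ["cozy", "cozy", "tense"]),
    ("a quiet road at dusk", ["cozy", "tense"]),  # tie between annotators
]

training_set = []
for text, votes in annotated:
    label = majority_label(votes)
    if label is not None:
        training_set.append((text, label))

model = train_naive_bayes(training_set)
prediction = predict(model, "laughter in the garden")
print(prediction)
```

In a real setting the labeled corpus would also be divided into separate training, development and evaluation portions, as the abstract proposes, and annotator disagreement would likely be handled more carefully than by dropping tied items.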
* Information listed above is at the time of submission. *