Software

Various books in the Software genre

Building a Better World with our Information

William Jones

Personal Information Management (PIM) is the art of getting things done in our lives through information. How do we – can we better – manage our information at home, at school, at work, at play and "@large" in a global community? How do we use information not only to know but also to represent, communicate and effect useful change in the world around us? In the study of PIM, does the search for practical methods with practical impact lead to methods that are "massive open on-line"? Can the ancient practice of storytelling help us better to weave our fragmented information together? In the practice of PIM, how can our information best serve as "near knowledge" – close at hand and, through our information tools, serving in practical ways to extend the knowledge that's "in the head"? If attempts to multitask lead to ineffective, even dangerous, instances of task switching and divided attention, can better PIM help us to realize, instead, opportunities for "multi-goaling" where the same time and effort accomplishes not just one but several goals? These and other questions are addressed in this third and final book to conclude the series on "The Future of Personal Information Management."

Part 1, "Our Information, Always and Forever," covered the fundamentals of PIM and then explored the seismic shift, already well underway, towards a world where our information is always at hand (thanks to our devices) and "forever" on the web. Part 2, "Transforming Technologies to Manage Our Information," provided a more focused look at technologies for managing information. The opening chapter discussed "natural interface" technologies of input/output to free us from keyboard, screen and mouse. Successive chapters then explored technologies to save, search and structure our information. A concluding chapter introduced the possibility that we may see dramatic reductions in the "clerical tax" we pay as we work with our information.
Now in Part 3, "Building a Better World with Our Information," the focus shifts to the practical present and to the near future. Part 3 is in three chapters:
• Group information management and the social fabric in PIM. How do we preserve and promote our PIM practices as we interact with others at home, at work, at play and in wider, even global, communities? (Chapter 10)
• Designing for PIM in the development of tools and in the selection of teachable (learnable) "better practices" of PIM. (Chapter 11)
• "To Each of Us, Our Own" concludes with an exploration of the ways each of us, individually, can develop better practices for the management of our information in service of the lives we wish to live and towards a better world we all must share. (Chapter 12)

Designing for Digital Reading

Harold Thimbleby

Reading is a complex human activity that has evolved, and co-evolved, with technology over thousands of years. Mass printing in the fifteenth century firmly established what we know as the modern book, with its physical format of covers and paper pages, and now-standard features such as page numbers, footnotes, and diagrams. Today, electronic documents are enabling paperless reading supported by eReading technologies such as Kindles and Nooks, yet a high proportion of users still opt to print on paper before reading. This persistent habit of "printing to read" is one sign of the shortcomings of digital documents – although the popularity of eReaders is one sign of the shortcomings of paper. How do we get the best of both worlds?
The physical properties of paper (for example, it is light, thin, and flexible) contribute to the ease with which physical documents are manipulated; but these properties have a completely different set of affordances to their digital equivalents. Paper can be folded, ripped, or scribbled on almost subconsciously – activities that require significant cognitive attention in their digital form, if they are even possible. The nearly subliminal interaction that comes from years of learned behavior with paper has been described as lightweight interaction, which is achieved when a person actively reads an article in a way that is so easy and unselfconscious that they are not apt to remember their actions later.
Reading is now in a period of rapid change, and digital text is fast becoming the predominant mode of reading. As a society, we are merely at the start of the journey of designing truly effective tools for handling digital text.
This book investigates the advantages of paper, how the affordances of paper can be realized in digital form, and what forms best support lightweight interaction for active reading. To understand how to design for the future, we review the ways reading technology and reader behavior have both changed and remained constant over hundreds of years. We explore the reasoning behind reader behavior and introduce and evaluate several user interface designs that implement these lightweight properties familiar from our everyday use of paper.
We start by looking back, reviewing the development of reading technology and the progress of research on reading over many years. Drawing key concepts from this review, we move forward to develop and test methods for creating new and more effective interactions for supporting digital reading. Finally, we lay down a set of lightweight attributes which can be used as evidence-based guidelines to improve the usability of future digital reading technologies. By the end of this book, then, we hope you will be equipped to critique the present state of digital reading, and to better design and evaluate new interaction styles and technologies.
Table of Contents: Preface / Acknowledgments / Figure Credits / Introduction / Reading Through the Ages / Key Concepts / Lightweight Interactions / Improving Digital Reading / Bibliography / Authors' Biographies

Provenance

Luc Moreau

The World Wide Web is now deeply intertwined with our lives, and has become a catalyst for a data deluge, making vast amounts of data available online at the click of a button. With Web 2.0, users are no longer passive consumers, but active publishers and curators of data. Hence, from science to food manufacturing, from data journalism to personal well-being, from social media to art, there is a strong interest in provenance, a description of what influenced an artifact, a data set, a document, a blog, or any resource on the Web and beyond. Provenance is a crucial piece of information that can help a consumer make a judgment as to whether something can be trusted. Provenance is no longer seen as a curiosity in art circles, but is regarded as pragmatically, ethically, and methodologically crucial for our day-to-day data manipulation and curation activities on the Web.
Following the recent publication of the PROV standard for provenance on the Web, which the two authors actively helped shape in the Provenance Working Group at the World Wide Web Consortium, this Synthesis lecture is a hands-on introduction to PROV aimed at Web and linked data professionals. By means of recipes, illustrations, a website at www.provbook.org, and tools, it guides practitioners through a variety of issues related to provenance: how to generate provenance, publish it on the Web, make it discoverable, and utilize it. Equipped with this knowledge, practitioners will be in a position to develop novel applications that can bring openness, trust, and accountability.
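To give a flavor of the standard, here is a minimal provenance record in the PROV-N notation, asserting that a chart was generated by a composition activity and is attributed to its author; the identifiers are invented for illustration, in the spirit of the examples in the W3C PROV documents:

```
document
  prefix ex <http://example.org/>

  entity(ex:chart1)
  activity(ex:compose1)
  agent(ex:derek)

  wasGeneratedBy(ex:chart1, ex:compose1, -)
  wasAssociatedWith(ex:compose1, ex:derek, -)
  wasAttributedTo(ex:chart1, ex:derek)
endDocument
```

The same assertions can equally be published as RDF using the PROV Ontology, which is how provenance is typically exposed as linked data on the Web.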
Table of Contents: Preface / Acknowledgments / Introduction / A Data Journalism Scenario / The PROV Ontology / Provenance Recipes / Validation, Compliance, Quality, Replay / Provenance Management / Conclusion / Bibliography / Authors' Biographies / Index

Library Linked Data in the Cloud

Shenghui Wang

This book describes OCLC’s contributions to the transformation of the Internet from a web of documents to a Web of Data. The new Web is a growing ‘cloud’ of interconnected resources that identify the things people want to know about when they approach the Internet with an information need.
The linked data architecture has achieved critical mass just as it has become clear that library standards for resource description are nearing obsolescence. Working for the world’s largest library cooperative, OCLC researchers have been active participants in the development of next-generation standards for library resource description. By engaging with an international community of library and Web standards experts, they have published some of the most widely used RDF datasets representing library collections and librarianship.
This book focuses on the conceptual and technical challenges involved in publishing linked data derived from traditional library metadata. This transformation is a high priority because most searches for information start not in the library, nor even in a Web-accessible library catalog, but elsewhere on the Internet. Modeling data in a form that the broader Web understands will project the value of libraries into the Digital Information Age.
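As a sketch of what such modeling can look like, the following Turtle fragment describes a book using the Schema.org vocabulary that the broader Web understands; the identifiers and values below are invented for illustration:

```turtle
@prefix schema: <http://schema.org/> .

<http://www.worldcat.org/oclc/0000000>
    a schema:Book ;
    schema:name "An Example Title" ;
    schema:author <http://viaf.org/viaf/0000000> ;
    schema:datePublished "2015" .
```

Descriptions in this form can be crawled and interpreted by search engines and other Web agents, rather than only by library systems that understand legacy bibliographic formats.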
The exposition is aimed at librarians, archivists, computer scientists, and other professionals interested in modeling bibliographic descriptions as linked data. It aims to achieve a balanced treatment of theory, technical detail, and practical application.

Transforming Technologies to Manage Our Information

William Jones

With its theme, "Our Information, Always and Forever," Part 1 of this book covers the basics of personal information management (PIM), including six essential activities of PIM and six (different) ways in which information can be personal to us. Part 1 then goes on to explore key issues that arise in the "great migration" of our information onto the Web and into a myriad of mobile devices.
Part 2 provides a more focused look at technologies for managing information that promise to profoundly alter our practices of PIM and, through these practices, the way we lead our lives.
Part 2 is in five chapters:
– Chapter 5. Technologies of Input and Output. Technologies in support of gesture, touch, voice, and even eye movements combine to support a more natural user interface (NUI). Technologies of output include glasses and "smart" watches. Output will also increasingly be animated with options to "zoom."
– Chapter 6. Technologies to Save Our Information. We can opt for «life logs» to record our experiences with increasing fidelity. What will we use these logs for? And what isn’t recorded that should be?
– Chapter 7. Technologies to Search Our Information. The potential for personalized search is enormous and mostly yet to be realized. Persistent searches, situated in our information landscape, will allow us to maintain a diversity of projects and areas of interest without a need to continually switch from one to another to handle incoming information.
– Chapter 8. Technologies to Structure Our Information. Structure is key if we are to keep, find, and make effective use of our information. But how best to structure? And how best to share structured information between the applications we use, with other people, and also with ourselves over time? What lessons can we draw from the failures and successes in web-based efforts to share structure?
– Chapter 9. PIM Transformed and Transforming: Stories from the Past, Present and Future. Part 2 concludes with a comparison between Licklider's world of information in 1957 and our own world of information today. And then we consider what the world of information is likely to look like in 2057. Licklider estimated that he spent 85% of his "thinking time" in activities that were clerical and mechanical and might (someday) be delegated to the computer. What percentage of our own time is spent with the clerical and mechanical? What about in 2057?
Table of Contents: Technologies of Input and Output / Technologies to Save Our Information / Technologies to Search Our Information / Technologies to Structure Our Information / PIM Transformed and Transforming: Stories from the Past, Present, and Future

Incentive-Centric Semantic Web Application Engineering

Elena Simperl

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. This has led to an increasing demand for powerful software tools to help people analyze and manage vast amounts of text data effectively and efficiently. Unlike data generated by a computer system or sensors, text data are usually generated directly by humans and are semantically rich. As such, text data are especially valuable for discovering knowledge about human opinions and preferences, in addition to many other kinds of knowledge that we encode in text. In contrast to structured data, which conform to well-defined schemas and are thus relatively easy for computers to handle, text has less explicit structure and requires computer processing to understand the content it encodes. Current natural language processing technology has not yet reached the point of enabling a computer to precisely understand natural language text, but a wide range of statistical and heuristic approaches to the analysis and management of text data have been developed over the past few decades. They are usually very robust and can be applied to text data in any natural language and on any topic.

This book provides a systematic introduction to these approaches, with an emphasis on the most useful knowledge and skills required to build a variety of practically useful text information systems. The focus is on text mining applications that help users analyze patterns in text data to extract and reveal useful knowledge. Information retrieval systems, including search engines and recommender systems, are also covered as supporting technology for text mining applications.
The book covers the major concepts, techniques, and ideas in text data mining and information retrieval from a practical viewpoint, and includes many hands-on exercises designed with a companion software toolkit (i.e., MeTA) to help readers learn how to apply techniques of text mining and information retrieval to real-world text data and how to experiment with and improve some of the algorithms for interesting application tasks. The book can be used as a textbook for a computer science undergraduate course or a reference book for practitioners working on relevant problems in analyzing and managing text data.

Sentiment Analysis and Opinion Mining

Bing Liu

Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining. In fact, this research has spread outside of computer science to the management sciences and social sciences due to its importance to business and society as a whole. The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. For the first time in human history, we now have a huge volume of opinionated data recorded in digital form for analysis.
Sentiment analysis systems are being applied in almost every business and social domain because opinions are central to almost all human activities and are key influencers of our behaviors. Our beliefs and perceptions of reality, and the choices we make, are largely conditioned on how others see and evaluate the world. For this reason, when we need to make a decision we often seek out the opinions of others. This is true not only for individuals but also for organizations.
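At its simplest, a sentiment analysis system can be sketched as a lexicon-based classifier that counts opinion words. The toy Python sketch below (with an invented five-word lexicon) illustrates only the basic idea; it is far cruder than the techniques surveyed in the book:

```python
# Toy lexicon-based sentiment classifier: a sketch of the simplest
# approach the field builds on, not the book's own implementation.
POSITIVE = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def sentiment(text: str) -> str:
    """Label text by the balance of positive and negative lexicon words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("The battery life is great and the screen is excellent"))  # positive
```

Real systems must also handle negation ("not great"), comparative opinions, aspect extraction, and domain-specific lexicon generation, which is exactly where the book's coverage begins.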
This book is a comprehensive introductory and survey text. It covers all important topics and the latest developments in the field with over 400 references. It is suitable for students, researchers and practitioners who are interested in social media analysis in general and sentiment analysis in particular. Lecturers can readily use it in class for courses on natural language processing, social media analysis, text mining, and data mining. Lecture slides are also available online.
Table of Contents: Preface / Sentiment Analysis: A Fascinating Problem / The Problem of Sentiment Analysis / Document Sentiment Classification / Sentence Subjectivity and Sentiment Classification / Aspect-Based Sentiment Analysis / Sentiment Lexicon Generation / Opinion Summarization / Analysis of Comparative Opinions / Opinion Search and Retrieval / Opinion Spam Detection / Quality of Reviews / Concluding Remarks / Bibliography / Author Biography

Resource-Oriented Architecture Patterns for Webs of Data

Brian Sletten

The surge of interest in the REpresentational State Transfer (REST) architectural style, the Semantic Web, and Linked Data has resulted in the development of innovative, flexible, and powerful systems that embrace one or more of these compatible technologies. However, most developers, architects, Information Technology managers, and platform owners have only been exposed to the basics of resource-oriented architectures. This book is an attempt to catalog and elucidate several reusable solutions that have been seen in the wild, in the now increasingly familiar "patterns book" style. These are not turnkey implementations, but rather useful strategies for solving certain problems in the development of modern, resource-oriented systems, both on the public Web and within an organization's firewalls.
Table of Contents: List of Figures / Informational Patterns / Applicative Patterns / Procedural Patterns

Recognizing Textual Entailment

Ido Dagan

In the last few years, a number of NLP researchers have developed and participated in the task of Recognizing Textual Entailment (RTE). This task encapsulates Natural Language Understanding capabilities within a very simple interface: recognizing when the meaning of one text snippet is contained in the meaning of a second piece of text. This simple abstraction of an exceedingly complex problem has broad appeal partly because it can also be conceived of as a component in other NLP applications, from Machine Translation to Semantic Search to Information Extraction. It also avoids commitment to any specific meaning representation and reasoning framework, broadening its appeal within the research community. This level of abstraction also facilitates evaluation, a crucial component of any technological advancement program.
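As an illustration of how simple the RTE interface is (though not of how hard the problem is), here is a toy word-overlap baseline in Python; the function name and threshold are invented for this sketch:

```python
def entails(text: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """Guess that the text entails the hypothesis when most hypothesis
    words also appear in the text. Real RTE systems use alignment,
    classification, and knowledge resources rather than raw overlap."""
    text_words = set(text.lower().split())
    hyp_words = set(hypothesis.lower().split())
    coverage = len(hyp_words & text_words) / len(hyp_words)
    return coverage >= threshold

print(entails("John bought a new car in Paris", "John bought a car"))  # True
print(entails("John bought a new car in Paris", "John sold a car"))    # False
```

Baselines like this are easy to evaluate against shared RTE benchmark pairs, which is precisely why the task's simple interface has been so productive for the field.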
This book explains the RTE task formulation adopted by the NLP research community, and gives a clear overview of research in this area. It draws out commonalities in this research, detailing the intuitions behind dominant approaches and their theoretical underpinnings.
This book has been written with a wide audience in mind, but is intended to inform all readers about the state of the art in this fascinating field, to give a clear understanding of the principles underlying RTE research to date, and to highlight the short- and long-term research goals that will advance this technology.
Table of Contents: List of Figures / List of Tables / Preface / Acknowledgments / Textual Entailment / Architectures and Approaches / Alignment, Classification, and Learning / Case Studies / Knowledge Acquisition for Textual Entailment / Research Directions in RTE / Bibliography / Authors' Biographies

Visual Information Retrieval using Java and LIRE

Mathias Lux

Visual information retrieval (VIR) is an active and vibrant research area that aims to provide means for organizing, indexing, annotating, and retrieving visual information (images and videos) from large, unstructured repositories.
The goal of VIR is to retrieve matches ranked by their relevance to a given query, which is often expressed as an example image and/or a series of keywords. During its early years (1995–2000), research efforts were dominated by content-based approaches contributed primarily by the image and video processing community. During the past decade, it was widely recognized that the challenges imposed by the lack of coincidence between an image's visual contents and its semantic interpretation, also known as the semantic gap, required a clever use of textual metadata (in addition to information extracted from the image's pixel contents) to make image and video retrieval solutions efficient and effective. The need to bridge (or at least narrow) the semantic gap has been one of the driving forces behind current VIR research. Additionally, other related research problems and market opportunities have started to emerge, offering a broad range of exciting problems for computer scientists and engineers to work on.
In this introductory book, we focus on a subset of VIR problems where the media consists of images, and the indexing and retrieval methods are based on the pixel contents of those images – an approach known as content-based image retrieval (CBIR). We present an implementation-oriented overview of CBIR concepts, techniques, algorithms, and figures of merit. Most chapters are supported by examples written in Java, using Lucene (an open-source Java-based indexing and search implementation) and LIRE (Lucene Image REtrieval), an open-source Java-based library for CBIR.
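The core CBIR loop (extract a feature vector from each image's pixels, then rank the collection by distance to the query's vector) can be sketched in a few lines. This Python toy uses a coarse intensity histogram in place of the real descriptors and the Java/LIRE machinery covered in the book, and its data is invented for illustration:

```python
def histogram(pixels, bins=4):
    """Normalized coarse intensity histogram over 0-255 pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def l1_distance(a, b):
    """City-block distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def rank(query_pixels, collection):
    """Order image names by feature distance to the query, nearest first."""
    q = histogram(query_pixels)
    return sorted(collection,
                  key=lambda name: l1_distance(q, histogram(collection[name])))

images = {"dark.png": [10, 20, 30, 40], "bright.png": [200, 210, 220, 230]}
print(rank([15, 25, 35, 45], images))  # ['dark.png', 'bright.png']
```

Swapping in richer descriptors (color, texture, shape) and an inverted index changes the quality and scale of the system but not this basic extract-compare-rank structure, which is the structure LIRE implements on top of Lucene.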