
Semantics in Adaptive and Personalised Systems: Methods, Tools and Applications - Pasquale Lops

Visit to download the full and correct content document:
https://textbookfull.com/product/semantics-in-adaptive-and-personalised-systems-methods-tools-and-applications-pasquale-lops/

More digital products (pdf, epub, mobi) available for instant download that may interest you ...

Crop Breeding: Genetic Improvement Methods - Pasquale Tripodi
https://textbookfull.com/product/crop-breeding-genetic-improvement-methods-pasquale-tripodi/

Production Management: Advanced Models, Tools, and Applications for Pull Systems - Yacob Khojasteh
https://textbookfull.com/product/production-management-advanced-models-tools-and-applications-for-pull-systems-yacob-khojasteh/

Complex Dynamical Systems in Education: Concepts, Methods and Applications, 1st Edition - Matthijs Koopmans
https://textbookfull.com/product/complex-dynamical-systems-in-education-concepts-methods-and-applications-1st-edition-matthijs-koopmans/

Artificial Adaptive Systems Using Auto Contractive Maps: Theory, Applications and Extensions, 1st Edition - Paolo Massimo Buscema
https://textbookfull.com/product/artificial-adaptive-systems-using-auto-contractive-maps-theory-applications-and-extensions-1st-edition-paolo-massimo-buscema/

Multicomponent and Multiscale Systems: Theory, Methods and Applications in Engineering, 1st Edition - Juergen Geiser (Auth.)
https://textbookfull.com/product/multicomponent-and-multiscale-systems-theory-methods-and-applications-in-engineering-1st-edition-juergen-geiser-auth/

Quantum Systems in Chemistry and Physics: Progress in Methods and Applications, 1st Edition - Erkki J. Brändas (Auth.)
https://textbookfull.com/product/quantum-systems-in-chemistry-and-physics-progress-in-methods-and-applications-1st-edition-erkki-j-brandas-auth/

Advances in Hybridization of Intelligent Methods: Models, Systems and Applications, 1st Edition - Ioannis Hatzilygeroudis
https://textbookfull.com/product/advances-in-hybridization-of-intelligent-methods-models-systems-and-applications-1st-edition-ioannis-hatzilygeroudis/

Systems Modeling: Methodologies and Tools - Antonio Puliafito
https://textbookfull.com/product/systems-modeling-methodologies-and-tools-antonio-puliafito/

Adaptive Resonance Theory in Social Media Data Clustering: Roles, Methodologies and Applications - Lei Meng
https://textbookfull.com/product/adaptive-resonance-theory-in-social-media-data-clustering-roles-methodologies-and-applications-lei-meng/
Pasquale Lops
Cataldo Musto
Fedelucio Narducci
Giovanni Semeraro

Semantics in
Adaptive and
Personalised
Systems
Methods, Tools and Applications
Semantics in Adaptive and Personalised Systems
Pasquale Lops • Cataldo Musto • Fedelucio Narducci • Giovanni Semeraro


Semantics in Adaptive
and Personalised Systems
Methods, Tools and Applications

Pasquale Lops
Dipartimento di Informatica
Università di Bari Aldo Moro
Bari, Italy

Cataldo Musto
Dipartimento di Informatica
Università di Bari Aldo Moro
Bari, Italy

Fedelucio Narducci
Dipartimento di Informatica
Università di Bari Aldo Moro
Bari, Italy

Giovanni Semeraro
Dipartimento di Informatica
Università di Bari Aldo Moro
Bari, Italy
ISBN 978-3-030-05617-9
ISBN 978-3-030-05618-6 (eBook)


https://doi.org/10.1007/978-3-030-05618-6
© Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To my kids Giuseppe and Annapaola,
love of my life.
Pasquale Lops

To those who taught me about the importance of the “meaning”. In every thing.
Cataldo Musto

To women and men who change our lives for the better.
Fedelucio Narducci

In memoriam Aaron Hillel Swartz and Enzo Roberto Tangari.
Giovanni Semeraro
Foreword

Web search engines and recommender systems are among today’s most visible and
successful applications of Artificial Intelligence technology in practice. We rely on
such systems every day when we search for information, when we shop online, or
when we stream videos. Without such systems, it almost seems impossible to find
things that interest us within the huge amounts of information that are offered online
today.
Research in the area of Information Filtering and Information Retrieval dates
back to the 1970s or even earlier. One central task of such systems then and today is
to estimate to what extent a given document or web page is relevant for a given
query by the user. Although the final ranking of the relevant documents is often
influenced by other factors, e.g., the general popularity of a web page, any search
system at some stage makes inferences about what each indexed web page is about,
i.e., about its content.
Over the decades, these forms of reasoning have become more and more
sophisticated. On the one hand, different internal document representations were
developed, from simple term-counting approaches, through latent semantic approaches,
to embedding models, which implicitly encode semantic relationships between terms.
At the same time, more and more structured or unstructured external knowledge
sources have become available, e.g., in the form of Linked Data, which allow search
and information filtering systems to make inferences using explicitly given semantic
relations between the concepts that appear in queries and in documents.
The same set of techniques can also be applied in the field of recommender
systems, which is the main focus of this book. Here, the input to the ranking task is not
an individual query, but a user profile that the system has learned from past user
interactions over time. Accordingly, such content-based or semantics-based
recommenders are able to personalize the ranking on the basis of the assumed user
interests.
In the traditional categorization of recommendation techniques, content-based
methods (here, the term content also covers metadata and other side information)
are often considered as an alternative to collaborative filtering approaches. This
latter class of systems, which base their recommendations on behavioral patterns of

a larger user community, dominates the research landscape. However, pure collaborative
approaches can have a number of limitations. It is, for example, difficult
to ensure that a set of recommendations is diverse when we do not know anything
about the similarity of two items. Likewise, explaining recommendations to users
can be a challenge when we cannot inform users how the attributes of a recommended
item relate to their preferences. In a number of application domains, it is
therefore favorable to design a hybrid system that combines knowledge about the
items with collaborative information.
The literature on semantics-based or hybrid recommender systems is actually
quite rich, but unfortunately also scattered. Today, relevant works appear in
publication outlets of different communities, e.g., Information Retrieval, Semantic
Web, or Recommender Systems. This book therefore fills an existing gap in the
literature. It first provides an introduction to the basic concepts of content
representations, then discusses approaches for semantic analysis, and covers today’s
external knowledge sources that can be leveraged for information filtering and
recommendation. Based on these foundations, it then reviews use cases of how rich
content information can be used to build better recommenders for the future.

Klagenfurt, Austria
June 2019
Dietmar Jannach
Preface

The human desire to make machines ever smarter has been the driving force
behind all the research in the Artificial Intelligence (AI) area.
Generally speaking, what makes a system intelligent is the capability of
understanding signals coming from the environment and of correctly adapting its
behavior accordingly. Such a capability is strictly related to the definition and the
design of specific techniques for interpreting messages generated by the users.
Some years ago, when we typed on Google the query How tall is the Eiffel
Tower?, the system answered with a set of documents, some of them including the
information we were seeking, but without a precise identification of the correct
answer. Today, this is no longer the case, since intelligent assistants like Siri, Alexa,
or Google Assistant, and the Google search engine itself, are able to provide the
exact answer the user is looking for, that is, in the case of the Eiffel Tower, 300 m.
Without any doubt, we can state that semantics represents the theoretical
foundation to implement models and technologies that allow machines to
interpret and understand information provided in natural language. Indeed, thanks
to semantics, it is possible to give meaning to documents, sentences, and
questions expressed in natural language and to create a bridge between the
information needs of a user and the answers to those needs.
Such an intuition is currently implemented in several tools and platforms, such as
search engines, recommender systems, and digital assistants, and contributes to the
tangible improvement in accuracy and effectiveness we have recently been witnessing.
We hope this book will become a reference point in the panorama of adaptive
and personalized systems exploiting semantics. The book is organized into three
main parts. First, we motivate the need to exploit textual content in intelligent
information access systems, and then we give an overview of the basic methodologies
to process and represent content-based features. Next, we thoroughly
describe state-of-the-art methodologies and techniques to enrich textual content
representation by introducing semantics. Finally, the last part of the book provides a
more practical perspective and discusses several applications that exploit the
techniques introduced and described in the previous chapters.


We would like to sincerely thank everyone who contributed to this book, and the
various people who provided us with comments and suggestions and encouraged us
to summarize years of work in a single book. We thank, in particular, Nancy
Wade-Jones from Springer, who supported us throughout the editorial process.
We are very grateful to the people of the Semantic Web Access and
Personalization—SWAP research group,1 who contributed to most of the work
cited and described in this book. We would like to thank Marco de Gemmis, who
started to investigate how Natural Language Processing techniques could be
adopted to devise a new generation of content-based recommender systems;
Pierpaolo Basile, who made available his great expertise on Word Sense
Disambiguation and Distributional Semantics Models, which were successfully
used in complex recommendation environments; and Annalina Caputo, a former
member of the research group, who worked on semantic information retrieval methods.
We would also like to thank all the other collaborators, Ph.D. students, and research
fellows of the SWAP research group, in particular, Leo Iaquinta, Andrea Iovine,
Piero Molino, Marco Polignano, Gaetano Rossiello, Lucia Siciliani, and Vincenzo
Tamburrano, each giving a specific contribution to the ideas, systems, and research
presented in this book.

Bari, Italy
July 2019
Pasquale Lops
Cataldo Musto
Fedelucio Narducci
Giovanni Semeraro

1
http://www.di.uniba.it/%7Eswap/.
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Data Explosion and Information Overload . . . . . . . . . . . . . . . . . . 2
1.2 Intelligent Information Access . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Information Retrieval and Information Filtering . . . . . . . . . 5
1.2.2 Recommender Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Why Do We Need Content? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3.1 Tackling the Issues of Collaborative Filtering . . . . . . . . . . 11
1.3.2 Feed and Follow Recent Trends . . . . . . . . . . . . . . . . . . . . 15
1.4 Why Do We Need Semantics? . . . . . . . . . . . . . . . . . . . . . . . . . . 16
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2 Basics of Content Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1 Pipeline for Text Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.1 Lexical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1.2 Syntactic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Vector Space Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3 Semantics-Aware Content Representation . . . . . . . . . . . . . . . . . . . 40
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3 Encoding Endogenous Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.1 Distributional Semantics Models . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2 Word Embedding Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.2.1 Latent Semantic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2.2 Random Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.2.3 Word2Vec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3 Explicit Semantic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66


4 Encoding Exogenous Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71


4.1 Overview of Structured Knowledge Sources . . . . . . . . . . . . . . . . . 72
4.1.1 WordNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.1.2 BabelNet: An Encyclopedic Dictionary . . . . . . . . . . . . . . . 75
4.1.3 Linked Open Data and DBpedia . . . . . . . . . . . . . . . . . . . . 78
4.1.4 Wikidata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2 Linking Item Features to Concepts . . . . . . . . . . . . . . . . . . . . . . . 84
4.2.1 Word Sense Disambiguation . . . . . . . . . . . . . . . . . . . . . . . 85
4.2.2 Entity Linking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.3 Linking Items to a Knowledge Graph . . . . . . . . . . . . . . . . . . . . . 94
4.3.1 Use of Ontologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.3.2 Use of Linked Open Data . . . . . . . . . . . . . . . . . . . . . . . . 98
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5 Adaptive and Personalized Systems Based on Semantics . . . . . . . . . 105
5.1 Semantics-Aware Recommender Systems . . . . . . . . . . . . . . . . . . . 105
5.1.1 Approaches Based on Endogenous Semantics . . . . . . . . . . 106
5.1.2 Approaches Based on Exogenous Semantics . . . . . . . . . . . 112
5.1.3 Semantics-Aware User Profiling Techniques . . . . . . . . . . . 125
5.2 Semantics-Aware Social Media Analysis . . . . . . . . . . . . . . . . . . . 129
5.2.1 L’Aquila Social Urban Network . . . . . . . . . . . . . . . . . . . . 132
5.2.2 The Italian Hate Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.3 New Trends and Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.3.1 Cross-Lingual Recommendations . . . . . . . . . . . . . . . . . . . 138
5.3.2 Conversational Recommender Systems . . . . . . . . . . . . . . . 144
5.3.3 Explaining Recommendations . . . . . . . . . . . . . . . . . . . . . . 149
5.3.4 Serendipitous Recommendations . . . . . . . . . . . . . . . . . . . . 158
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6 Conclusions and Future Challenges . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.2 Future Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

Appendix: Available Tools and Resources . . . . . . . . . . . . . . . . . . . . . . . . 173


Acronyms

AI Artificial Intelligence
BNC British National Corpus
BOW Bag of Words
CBOW Continuous Bag of Words
CBRS Content-Based Recommender System
CF Collaborative Filtering
CL-ESA Cross-Language Explicit Semantic Analysis
CoRS Conversational Recommender System
DM Dialog Manager
DSM Distributional Semantics Model
EL Entity Linking
EPG Electronic Program Guides
ER Entity Recognition
ESA Explicit Semantic Analysis
GDPR General Data Protection Regulation
IDF Inverse Document Frequency
IF Information Filtering
IMDB Internet Movie Database
IR Information Retrieval
KB Knowledge Base
LDA Latent Dirichlet Allocation
LOD Linked Open Data
LSA Latent Semantic Analysis
LSI Latent Semantic Indexing
MSS Most Specific Subsumer
MUC Message Understanding Conference
NER Named Entity Recognition
NLP Natural Language Processing
NMF Nonnegative Matrix Factorization
NNDB Notable Names Database


NP Noun Phrases
OWL Ontology Web Language
PCA Principal Component Analysis
pLSA Probabilistic Latent Semantic Analysis
PMI Pointwise Mutual Information
POS Part of Speech
QA Question Answering
RDF Resource Description Framework
RI Random Indexing
RP Random Projection
RS Recommender Systems
SA Sentiment Analyzer
SG Skip-Gram
SPARQL SPARQL Protocol and RDF Query Language
SUN Social Urban Network
SVD Singular Value Decomposition
TF Term Frequency
TR-ESA Translation-based Explicit Semantic Analysis
URI Uniform Resource Identifier
VP Verb Phrases
VSM Vector Space Model
WSD Word Sense Disambiguation
List of Figures

Fig. 1.1 Workflow carried out by a generic search engine . . . . . . . . . . .. 6


Fig. 1.2 Workflow carried out by a generic information filtering
tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 6
Fig. 1.3 Workflow carried out by a recommender system . . . . . . . . . . .. 8
Fig. 1.4 Toy example of a data model for a collaborative
recommender system, based on the user–item matrix . . . . . . . .. 12
Fig. 1.5 Issues of collaborative recommender systems: sparsity
and new item problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 13
Fig. 1.6 A content-based recommendation pipeline . . . . . . . . . . . . . . . .. 14
Fig. 1.7 Ambiguity in user modeling and recommendations . . . . . . . . .. 16
Fig. 1.8 Limits of keyword-based representation in recommendation
tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Fig. 1.9 Vocabulary mismatch problem . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Fig. 2.1 The natural language processing pipeline . . . . . . . . . . . . . . . . . . 23
Fig. 2.2 Named entities recognized in the text fragment . . . . . . . . . . . . . 29
Fig. 2.3 Penn treebank tagset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Fig. 2.4 POS tagging for a text fragment . . . . . . . . . . . . . . . . . . . . . . . . . 31
Fig. 2.5 Chunking for a text fragment. NP = noun phrase,
VP = verb phrase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 32
Fig. 2.6 A parse tree for a text fragment . . . . . . . . . . . . . . . . . . . . . . . .. 34
Fig. 2.7 The term–document matrix reporting the number of times
each word (row) occurs in each document (column). . . . . . . . .. 36
Fig. 2.8 The term–document matrix reporting the TF-IDF weight
for each word (row) in each document (column) . . . . . . . . . . . . 38
Fig. 2.9 A graphical representation of cosine similarity . . . . . . . . . . . . . . 39
Fig. 2.10 Items and user profiles represented in a vector space . . . . . . . . . 40
Fig. 2.11 Classification of semantic representation techniques . . . . . . . . . . 41
Fig. 3.1 Similar terms share similar usages . . . . . . . . . . . . . . . . . . . . . . . 45
Fig. 3.2 The term–context matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Fig. 3.3 A two-dimensional representation of a WordSpace . . . . . . . . . 47
Fig. 3.4 An example of term–sentence matrix . . . . . . . . . . . . . . . . . . . . . 48


Fig. 3.5 An example of term–term matrix . . . . . . . . . . . . . . . . . . . . . . .. 48


Fig. 3.6 An example of SVD applied to the term–document
matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 52
Fig. 3.7 A visual explanation of the Johnson–Lindenstrauss
lemma. Z is the nearest point to X in the reduced vector space,
as in the original space, even though the numerical value
of their pairwise similarity is different . . . . . . . . . . . . . . . . . . .. 54
Fig. 3.8 Context vectors of dimension k = 8 . . . . . . . . . . . . . . . . . . . . .. 56
Fig. 3.9 The vector space representation of a term obtained
by summing the context k-dimensional index vectors
the term co-occurs with . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Fig. 3.10 Uniform representation of WordSpace and DocSpace . . . . . . 57
Fig. 3.11 Structure of the network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Fig. 3.12 Continuous Bag-of-Words methodology . . . . . . . . . . . . . . . . . . . 59
Fig. 3.13 Skip-Gram methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Fig. 3.14 The ESA matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Fig. 3.15 Semantics of the word cat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Fig. 3.16 Attribute vector of the Panthera Wikipedia article . . . . . . . . . . . 63
Fig. 3.17 Semantic relatedness between semantic interpretation
vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 64
Fig. 3.18 An example of semantic interpretation vector of a text
fragment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 65
Fig. 3.19 The semantic interpretation vector of The Matrix . . . . . . . . . . .. 65
Fig. 4.1 The hierarchy of sense 1 of the word “bat” obtained
from WordNet (version 2.1) . . . . . . . . . . . . . . . . . . . . . . . . . . .. 74
Fig. 4.2 Apple in the sense of fruit in BabelNet. . . . . . . . . . . . . . . . . . .. 76
Fig. 4.3 Apple in the sense of multinational corporation
in BabelNet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 77
Fig. 4.4 The Linked Open Data cloud. Each bubble represents
a dataset (a set of RDF statements). Datasets encoding
similar or related information are represented with
the same colors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 79
Fig. 4.5 The so-called Semantic Web Cake. Each element in the cake
represents a formalism or a technology that was necessary
to enable the vision of the Semantic Web. . . . . . . . . . . . . . . . .. 80
Fig. 4.6 An example of RDF triple, encoding the information that
Keanu Reeves has acted in The Matrix. The URI dbr:Keanu_Reeves
is an abbreviation for
http://dbpedia.org/resource/Keanu_Reeves . . . . . .. 80
Fig. 4.7 Data Mapping between Wikipedia and DBpedia. . . . . . . . . . . .. 81
Fig. 4.8 A (tiny) portion of the properties, available in the LOD cloud,
that describe the band The Coldplay . . . . . . . . . . . . . . . . . . . . .. 82
Fig. 4.9 A portion of the data available in Wikidata that describe
the band The Coldplay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 83

Fig. 4.10 The preprocessing of sentence “The white cat is hunting


the mouse.” Each token is labeled with a tag describing its
lexical role in the sentence. NN = noun, singular—VB = verb,
base form—JJ = adjective . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 87
Fig. 4.11 A fragment of the WordNet taxonomy . . . . . . . . . . . . . . . . . . .. 88
Fig. 4.12 Similarities between synsets . . . . . . . . . . . . . . . . . . . . . . . . . . .. 88
Fig. 4.13 The synset–document matrix reporting the number of times
each synset (row) occurs in each document (column) . . . . . . . .. 91
Fig. 4.14 An example of entity linking performed by Tagme . . . . . . . . .. 93
Fig. 4.15 An example of entity linking performed by Babelfy . . . . . . . . .. 93
Fig. 4.16 An example of entity linking performed by DBpedia
Spotlight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Fig. 4.17 A tiny portion of the Movie Ontology. . . . . . . . . . . . . . . . . . . . . 96
Fig. 4.18 A tiny portion of the Movie Ontology with instances . . . . . . . . . 96
Fig. 4.19 Genre class hierarchy in Movie Ontology . . . . . . . . . . . . . . . . . . 97
Fig. 4.20 An excerpt of the Quickstep research paper topic ontology . . . . 98
Fig. 4.21 An example of SPARQL query . . . . . . . . . . . . . . . . . . . . . . . . . 99
Fig. 4.22 Mapping between items and DBpedia URIs . . . . . . . . . . . . . . . . 100
Fig. 4.23 A data model including the features extracted from
the Linked Open Data cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Fig. 5.1 An example of enrichment by ESA . . . . . . . . . . . . . . . . . . . . . . 111
Fig. 5.2 An example of a keyword-based profile . . . . . . . . . . . . . . . . . . . 114
Fig. 5.3 An example of a synset-based profile . . . . . . . . . . . . . . . . . . . . . 114
Fig. 5.4 Neighborhood formation from clustered partitions . . . . . . . . . . . 115
Fig. 5.5 Sample RDF graph extracted from DBpedia
and LinkedMDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Fig. 5.6 Matrix representation of property resource indexes . . . . . . . . . . . 117
Fig. 5.7 Vector space model for LOD . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Fig. 5.8 Basic bipartite graph representing users, items, and their
preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Fig. 5.9 Tripartite graph representing users, items, and information
coming from the LOD cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Fig. 5.10 Examples of user profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Fig. 5.11 Workflow for Holistic user modeling carried out by Myrror . . . . 128
Fig. 5.12 Semantic Tag cloud showing users’ interests . . . . . . . . . . . . . . . 129
Fig. 5.13 The architecture of the semantic content analysis
framework. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Fig. 5.14 Example of data visualizations available in CrowdPulse . . . . . . . 133
Fig. 5.15 Social capital indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Fig. 5.16 Example of ambiguous Tweets . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Fig. 5.17 Entity-based representation of a Tweet . . . . . . . . . . . . . . . . . . . . 134
Fig. 5.18 An example of the workflow carried out by the content
scoring and classification module of L’Aquila social urban
network project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

Fig. 5.19 An example of the output returned by the pipeline


of algorithms implemented in the Italian hate map project . . . . . 136
Fig. 5.20 Alignment of synsets in different languages
in MultiWordNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Fig. 5.21 Example of synset-based document representation
in different languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Fig. 5.22 Multilingual document representation using translation-based
ESA (TR-ESA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Fig. 5.23 Multilingual document representation using cross-language
ESA (CL-ESA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Fig. 5.24 Wikipedia cross-language links (in the red box) . . . . . . . . . . . . . 143
Fig. 5.25 The general architecture of a conversational recommender
system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Fig. 5.26 The Bot workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Fig. 5.27 A screenshot of the Bot during the training phase
in typing mode (a) and the recommendation phase (b) . . . . . . . . 150
Fig. 5.28 The histogram with grouping interface, which performed
best in the study of Herlocker et al. [46] . . . . . . . . . . . . . . . . . . 152
Fig. 5.29 Personalized tag cloud as defined in [37] . . . . . . . . . . . . . . . . . . 152
Fig. 5.30 Workflow carried out by our framework . . . . . . . . . . . . . . . . . . . 154
Fig. 5.31 Toy example related to the modeling of direct connections . . . . 157
Fig. 5.32 Toy example related to modeling of indirect connections . . . . . . 158
Fig. 5.33 Fragment of the row of the correlation matrix for the movie
Star Wars. Each cell reports the correlation index between
Star Wars and the movie on the column, and the set
of plot keywords which match the new keywords produced
by the KI process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Chapter 1
Introduction

Adaptive and personalized systems play an increasingly important role in our daily
lives, since we rely more and more on systems that tailor their behavior to our
preferences and needs, and support us in a broad range of heterogeneous
decision-making tasks.
As an example, we rely on Spotify to get the best music playlist for our
workout, we ask Netflix to suggest a movie to watch on a rainy night at home, and
we use Amazon to get recommendations about items to buy. Even complex tasks,
such as identifying the best location for our summer holidays or tailoring financial
investments to our needs and plans, are now often tackled through personalized and
adaptive systems.
The rise of such a user-centric vision has made technologies such as personalized
search engines, recommender systems, and intelligent personal assistants very popular
and essential. However, these technologies could never have existed in the absence
of the main fuel that feeds them: the data. These platforms tremendously need data
to carry out a broad set of tasks, ranging from modeling users’ needs and preferences
to training complex machine learning models that make inferences and predictions.
Without such data, most of the intelligent systems we discuss in this book
would never have become so popular.
Accordingly, it is straightforward to imagine that the recent growth of such
technologies goes hand in hand with the recent growth of online (personal) data
spread through social networks, collaborative platforms, and personal devices.
The more data that are available about a person, the more effective the personalization
and adaptation process can be. In turn, such a scenario fueled two different research
trends. On one side, such a unique availability of data, typically referred to as Data
Explosion [31], emphasizes the so-called problem of Information Overload and
encourages the development and the design of systems able to support the users in
sifting through this huge flow of data, such as Information Retrieval and Information
Filtering systems. On the other side, all the personal data that are now available on
the Web and on social networks (what we like, who are our friends, which places
we often visit, etc.) also fostered research in the area of user modeling, since
the acquisition and the processing of these data contribute to the definition of a very
precise representation of the person, which in turn enables accurate personalization
and adaptation mechanisms.
In this chapter, we present and discuss both aspects, since they represent the
main motivations that led us to write this book.
First, we discuss how we can effectively cope with the surplus of information
by developing technologies for intelligent information access. Next, we deepen the
discussion and we show how the available data can be used to feed intelligent
information access systems by providing a very precise and fine-grained modeling of
users’ interests and needs. Specifically, we will pay particular attention to investigating
and discussing the importance of content-based and textual information sources in such
a scenario.
To sum up, this chapter is organized as follows: First, we introduce the concepts
of Data Explosion and Information Overload. Next, we focus our attention on the
available strategies to effectively tackle this issue, such as the exploitation of Information
Retrieval and Information Filtering methodologies to develop tools for intelligent
information access. Finally, we discuss the role of data in such a scenario, by
emphasizing the importance of gathering and modeling content-based information and
by showing that the injection of semantics into content representation can lead to even
more precise user models and more effective personalization algorithms.

1.1 Data Explosion and Information Overload

The concept of Data Explosion (or Information Explosion) was recently introduced
to refer to the growth of the information spread through the Web and through the
Internet of Things. A primary cause of this uncontrolled increase of the available data
is the recent rise of collaborative platforms and social networks [13]
such as Wikipedia, YouTube, Facebook, Instagram, and so on, which made the authoring
of content easier and easier.
As shown by several studies, Web users are making the most of this opportunity,
since a tremendous amount of information is produced and generated through these
platforms1: As an example, 481,000 Tweets and 46,000 posts are published every
minute on Twitter and Instagram, respectively. Similarly, 973,000 people access
Facebook every minute to produce information by posting material or by leaving
comments on public pages. Messaging services such as WhatsApp are involved in this
scenario as well, since 38 million messages are sent through the app every minute.
Such systems upset very stable Web dynamics, since they replaced the original
dichotomy between producers and consumers of information, typical of the “old”

1 https://www.visualcapitalist.com/internet-minute-2018/.
Web, with a new and more “democratic” vision where each user can act at the same
time as both producer and consumer of information.
This phenomenon, already prophesied by Alvin Toffler in the early 80s,2 has two
main consequences: on one side, it gives great opportunities to the users, since content
can be authored and published more easily than 5 or 10 years ago. On the other, it is
unfortunately also leading to the diffusion of an amount of information that
is objectively unmanageable, with no control over the quality and reliability of
the produced content. As stated by recent analyses,3 every day 2.5 quintillion bytes of
data are produced, and the pace is further accelerating with the growth of the Internet
of Things.
Two main questions arise from this scenario:
1. Can we effectively deal with such a huge amount of information?
2. Is there any opportunity resulting from this surplus of (personal) data?
The answer to the first question is equally simple and straightforward: no, we
can’t, and some data immediately confirm this intuitive idea.
Given that 300 hours of videos are uploaded to YouTube every minute,4 it would
take around 800 years of nonstop watching to watch each and every video uploaded
in the last year. Moreover, the spread of mobile devices makes it even more difficult
to follow the flow of information (there are currently 1.3 billion gigabytes of data
traffic on mobile networks alone5), even though we spend about 22% of our navigation
time on social networks, as shown by an analysis carried out by Nielsen.6
The inability to deal with all the available online information is also confirmed
by the studies carried out by Adrian Ott. As shown in [24], our brain has a physiological
limit, since it can process 126 bits of information per second. Unfortunately, the
amount of information we have to deal with in our daily navigation on the Web is
equal to 393 bits per second, and thus the information we should process every day
is three times the amount of information we can process in an effective way.7 Such
a state of things is typically referred to as Information Overload.
Even if the amount of available information grew significantly in the last few
years, the concept of Information Overload has older origins, and it is not even strictly
related to the Web. Indeed, this term was first mentioned in 1852 by the Secretary
of the Smithsonian Institution in Washington. Later, during the 1948 Royal Society’s
Influential Scientific Information Conference, Information Overload began to be
labeled as a “problem” [3]. In the literature [37], Alvin Toffler used this term to describe

2 Alvin Toffler first proposed the portmanteau “prosumers” in the book “Third Wave” in 1980.
3 https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read.
4 https://merchdope.com/youtube-stats/.
5 http://www.mobithinking.com/mobile-marketing-tools/latest-mobile-stats.
6 http://blog.nielsen.com/nielsenwire/social/.
7 The study was carried out in 2010. It is likely that the amount of information available today would make this ratio even higher.



a prophetic scenario where the rapid technological growth of society (he called
it the super-industrial society) caused stress and confusion in individuals.
As established by some research [38], this dystopian scenario is close to coming
true, since several works showed that Information Overload is a problem that can
decrease both the productivity and the quality of life of individuals, leading in the
worst case to attention deficits, anxiety, cyberchondria, and so on [11].
Currently, there is not a single, universally accepted definition of this issue. In
general, Information Overload tends to be described as a state of affairs in which efficiency
is jeopardized by the amount of available information [8]. More precisely, humans are
placed at overload when information is presented at a rate too fast to be processed
by a single person [33]. Nowadays, we experience Information Overload in
several daily activities we perform on the Web: scrolling through the huge number of search
results returned by a search engine, browsing a large set of items in a catalogue, or
just filtering a news feed to drop the things we are not interested in.
Unfortunately, as we previously stated, data quickly spread through the Web
at a pace that is only going to increase. This is an irreversible process we need to
master in order to create some opportunities from the huge flow of data people
have to face every day.
A possible direction to effectively tackle the problem of Information Overload
is proposed by Shirky [34], who emphasized that the abundance of data does not
represent a problem by itself. According to Shirky, the main issue concerns the absence
of appropriate filters that compensate for the physiological limits of our brain and help us
in selecting the most important pieces of information among the available ones.
In other terms, humans have to develop effective strategies to filter the information
in a proper way, rather than simply reducing or avoiding the production of data. The
view spread by Shirky fostered a huge research effort, since the uncontrolled
growth of information—beyond being considered as a problem—also triggers a lot
of opportunities for researchers and practitioners aiming to tame the flow of
information through the development of new and better filters.
Indeed, most of the data available nowadays are first of all personal data, since they
somehow concern the person who produced them. What we write, the places where we have
been, the people we follow, our emotions: all these signals provide heterogeneous
and important information about our preferences and our needs that a personalized
filter has to take into account.
As we will show throughout this book, the development of proper filters is a very
effective way to tackle the problem of Data Explosion and Information Overload.
This is confirmed by several success stories in the area of search engines, such as Google,
and in the area of personalized systems, such as Amazon, Netflix, and YouTube.
Generally speaking, the development of such systems has two requirements: first,
a precise and fine-grained description of all the aspects that characterize the target user,
who is supposed to exploit the filter (typically referred to as the user model); and second,
a precise description of what the filter should do. Accordingly, one of the goals of
this book is to provide an overview of the most effective methodologies to address
both requirements and to develop very effective intelligent information access
systems.

1.2 Intelligent Information Access

In recent years, several methodologies to effectively cope with Information Overload
have been proposed in the research literature. These approaches rely on the idea of filters we
previously introduced, since they aim to provide tools and techniques that give
users personalized and intelligent access to textual and multimedia information
sources.
When we talk about “having intelligent access to information sources,” we refer
to concrete tasks, such as performing searches, filtering results, aggregating similar
information, interpreting data, and so on. Usually, all these tasks need to handle
large collections of semi-structured or even unstructured data, such as a catalogue of
items (as in Amazon or Netflix) or a huge set of documents (as for Google). To make
information access more “intelligent”, or, in general, more efficient, it is necessary
to introduce some techniques to facilitate these activities, in order to decrease the
time needed to perform each task and to increase both overall accuracy and user
satisfaction.
In general, Intelligent Information Access [5] encompasses a wide group of technologies,
spanning Information Retrieval, Information Extraction, Text Clustering,
Information Filtering, and so on [16]. In this book, we will focus on two main
classes of intelligent information access techniques: Information Retrieval and
Information Filtering methods, with a specific focus on recommender systems, the main
technology that nowadays implements the principles of information filtering.

1.2.1 Information Retrieval and Information Filtering

Information Retrieval (IR) concerns finding relevant information in a collection
of data (usually unstructured text) [30]. Search engines, such as Google and
Bing, are typical examples of IR applications. A formal characterization of an IR
model is given by Baeza-Yates and Ribeiro-Neto [1]. Generally speaking, the goal
of IR systems is to tackle the problem of information overload by driving the user to
those documents that will satisfy her information needs. User information needs
are usually represented by means of a query, expressed in a language understood by
the system.
In the typical workflow of an IR system, the query is submitted to the search
engine, whose goal is to understand the meaning of the user’s request and to identify the
most relevant pieces of information among those stored in the collection.
Next, a ranking function orders the documents by descending relevance
and the top entries are finally returned to the user. An example of such a workflow
is provided in Fig. 1.1.
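
To make this workflow concrete, the following minimal sketch in Python (not part of the original text, and built over a toy, hypothetical three-document collection) indexes the documents with TF-IDF weights and ranks them against a query by cosine similarity. Real search engines rely on far richer signals, so this is only an illustration of the indexing and ranking loop, not a production implementation.

import math
from collections import Counter

# Toy, hypothetical document collection standing in for an indexed corpus.
docs = {
    "d1": "the eiffel tower is 300 metres tall",
    "d2": "the leaning tower of pisa attracts many tourists",
    "d3": "paris hosts the eiffel tower and the louvre",
}

n_docs = len(docs)
# Document frequency: in how many documents each term appears.
df = Counter(term for text in docs.values() for term in set(text.split()))

def vectorize(text):
    # Sparse TF-IDF vector (term -> weight); terms unseen at indexing time are ignored.
    tf = Counter(text.split())
    return {t: freq * math.log(n_docs / df[t]) for t, freq in tf.items() if t in df}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Indexing: every document is turned into a weighted vector once.
index = {doc_id: vectorize(text) for doc_id, text in docs.items()}

def search(query, k=2):
    # Ranking: order documents by descending relevance to the query vector.
    q_vec = vectorize(query)
    return sorted(docs, key=lambda d: cosine(index[d], q_vec), reverse=True)[:k]

print(search("how tall is the eiffel tower"))  # 'd1' is ranked first

On the query about the Eiffel Tower, the document that actually contains the answer is ranked first, which is precisely the behavior the ranking function is expected to approximate.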
In 2009, Google enhanced the original paradigm of IR by introducing8 mechanisms to
also take personal and contextual information into account in its search algorithms.

8 http://googleblog.blogspot.com/2009/12/personalized-search-for-everyone.html.

Fig. 1.1 Workflow carried out by a generic search engine

As an example, by storing users’ clicks in previous searches, the algorithm can better
identify users’ preferred topics, and this information can be used to better rank search
results by moving up the pages the user is more likely to be interested in. In this way,
the plethora of personal information about the users can be exploited to improve the
ranking of the returned results.
However, even if this choice started the evolution of the classical “search”
paradigm toward its personalized and contextual variants, the current paradigm still
requires an explicit query that expresses and describes the information needs of the
user. As stated by Ingwersen and Willett [14], this is a very challenging and problematic
task, since the users have to model their needs by relying on a limited, keyword-based
vocabulary, which is far from the one they typically use.
To effectively tackle this issue, alternative methodologies for Intelligent Information
Access emerged. As an example, Information Filtering (IF) techniques were
introduced to provide users with the information they want without the need for an
explicit query that triggers the whole process [12].
Even if both IR and IF have the goal of optimizing the access to unstructured
information sources, there is a clear methodological difference between them: First, IF does not
exploit an explicit query of the user, but it relies on some filtering criterion that
triggers the whole process. Moreover, IF systems are not designed to find relevant
pieces of information, but rather to filter out the noise from a generic information
flow according to such a criterion. An example of the workflow carried out by an
Information Filtering tool is reported in Fig. 1.2. The close relationship between
the two areas was already discussed by Belkin and Croft [4], who defined IF and IR as
two sides of the same coin.

Fig. 1.2 Workflow carried out by a generic information filtering tool



As already pointed out by O’Brien, the development of these systems is a first step
in shifting the classical search paradigm toward discovery,9 that is to say, a scenario
where information is automatically pushed to the users instead of being pulled
through an explicit query.
Malone et al. [22] classify the approaches to Information Filtering into
three different categories, according to the filtering criterion they
implement:
• Cognitive filtering: It is based on content analysis;
• Social filtering: It is based on individual judgments of quality, communicated
through personal relationships;
• Economic filtering: It is based on estimated search cost and benefits.
Typically, cognitive filtering is carried out by simply analyzing the content associated
with each informative item: depending on whether or not it contains specific
features, the item is filtered out. A typical scenario where this filtering
approach is applied is spam detection in e-mail clients: If an e-mail contains specific
terms, it is labeled as “spam” and the mail is filtered out. Similar methodologies are
also implemented to identify relevant articles or posts in a news feed:
The larger the overlap between the keywords describing an article (e.g., sports, politics,
etc.) and those appearing in the articles the target user was previously interested
in, the higher the likelihood that the user will be interested in reading that article.
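
The keyword-overlap criterion can be illustrated with a short sketch: each incoming article is scored by the Jaccard overlap between its keywords and those harvested from the articles the user previously liked, and articles falling below a threshold are filtered out. Keyword sets and threshold are purely hypothetical.

# Hypothetical keyword sets describing articles the user liked in the past.
liked_articles = [
    {"politics", "election", "parliament"},
    {"politics", "economy", "budget"},
]
# The user profile is simply the union of the keywords of the liked articles.
profile_keywords = set().union(*liked_articles)

def overlap_score(item_keywords, profile):
    # Jaccard overlap between the item's keywords and the profile.
    if not item_keywords:
        return 0.0
    return len(item_keywords & profile) / len(item_keywords | profile)

# Incoming news feed, each item already tagged with descriptive keywords.
incoming_feed = [
    {"id": "a1", "keywords": {"politics", "budget", "tax"}},
    {"id": "a2", "keywords": {"football", "league", "goal"}},
]

THRESHOLD = 0.2  # hypothetical cut-off: items scoring below it are filtered out
kept = [article["id"] for article in incoming_feed
        if overlap_score(article["keywords"], profile_keywords) >= THRESHOLD]
print(kept)  # only the politics-related article 'a1' survives the filter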
Conversely, social filtering complements the cognitive approach by focusing on
the characteristics of the users. As an example, some features describing the user
or some explicit relationships (e.g., an e-mail message received from the supervisor
has to be considered as relevant) can be used as a signal to filter the information
flow. Similarly, users’ behavior can be analyzed to bring out similarities or patterns
that can be exploited to forecast their future behavior (e.g., if all the users in
my group—the so-called neighborhood—liked a specific song or movie,
it might be relevant for me, too).
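
A minimal sketch of this neighborhood intuition, over a hypothetical user-item matrix of unary likes, is the following: the items appreciated by the target user's closest neighbor are suggested, provided the target has not consumed them yet. It is only the simplest possible instance of social (collaborative) filtering.

# Hypothetical user-item matrix of unary "likes".
likes = {
    "alice": {"song_a", "song_b", "song_c"},
    "bob":   {"song_a", "song_b", "song_d"},
    "carol": {"song_e", "song_f"},
}

def jaccard(u, v):
    # Similarity between two users as the overlap of the items they liked.
    return len(u & v) / len(u | v) if u | v else 0.0

def recommend(target, k_neighbours=1):
    # Find the most similar users (the neighborhood) ...
    others = [(user, jaccard(likes[target], items))
              for user, items in likes.items() if user != target]
    neighbourhood = sorted(others, key=lambda pair: pair[1], reverse=True)[:k_neighbours]
    # ... and suggest the items they liked that the target has not seen yet.
    candidates = set().union(*(likes[user] for user, _ in neighbourhood)) - likes[target]
    return sorted(candidates)

print(recommend("alice"))  # bob is the closest neighbour, so 'song_d' is suggested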
Finally, the economic filtering approach relies on various types of cost–benefit
assessments and explicit or implicit pricing mechanisms. A simple cost-versus-value
heuristic is the length of an e-mail message.

1.2.2 Recommender Systems

The set of techniques that can be exploited to filter the information flow is very wide,
ranging from simple heuristics to complex statistical models and machine learning
methodologies. When a filtering system also takes into account, as its filtering criterion,
some information about a specific user (namely, a profile), it is commonly referred to
as a personalized system [23].

9 http://archive.fortune.com/magazines/fortune/fortune_archive/2006/11/27/8394347/index.htm.

A typical example of personalized systems is represented by Recommender Systems
(RS) [28], one of the most disruptive technologies to have appeared on the scene in
the last decades, as stated by Jannach et al. [15]. Such systems typically acquire
information about user needs, interests, and preferences, and tailor their behavior on
the basis of such information by personalizing the user experience and by supporting
people in several decision-making tasks.
The workflow carried out by a recommender system is provided in Fig. 1.3. As
shown in the picture, an RS represents a specialization of a generic IF tool, where a
representation of the user (a profile) is exploited as the filtering criterion and documents
are replaced by items, since RS can also provide suggestions in the absence of textual
information.10
It is acknowledged that RS have an enormous influence on consumers’ behavior,
since many people use these systems to buy products on Amazon, to listen to music
on Spotify, to choose restaurants picked by Foursquare, or even to read the posts
Facebook has ranked at the top of their feeds.
These systems typically work by exploiting various knowledge sources to obtain
information about both the users and the available items. A typical information source
is represented by the historical interactions of the users (e.g., which items a person
previously bought, which songs she previously played, and so on), which are used
to determine the preferences of the target user and, in turn, which items she will be
interested in in the future.
Accordingly, RS are based on the assumption that user preferences stay stable
and do not change over time. This is a very strong assumption, but the application of
recommendation technologies in many domains [20] has already shown that such an
assumption effectively holds in several real-world scenarios [25]. The effectiveness of
RS has also been confirmed by several articles [19], which discussed the significant
impact of such algorithms on both sales volumes and click-through rates. As an
example, 35% of Amazon’s revenues are generated through its recommendation

Fig. 1.3 Workflow carried out by a recommender system

10 In most of the scenarios that will be discussed in this book documents and items can be considered

as synonyms, since we will always describe the items by providing them with some descriptive
features. However, it is necessary to state that this is not a constraint, and RS can also work without
exploiting content.

engine,11 and many companies frequently claim that RS contribute from
10% to 30% of their total revenues [10].
Regardless of the specific methodology adopted to generate the recommendations,
an RS basically carries out the following three steps:
1. Training: First, the system needs to acquire information about a target user
(what she knows, what she likes, the task to be accomplished, demographical or
contextual information, and so on). This step can be accomplished in an explicit
or implicit way. In the former case, the user explicitly expresses her preferences (by
means of a numeric scale, for example) on randomly chosen items, while in the
latter case user preferences are gathered by analyzing her transactional or behavioral
data (for example, clicking a link or reading a news article could be considered
as a clue of user interest in that item).
2. User Modeling: In general, the concept of personalization implies the presence
of something describing and identifying the user that interacts with the system.
So, the information extracted is usually modeled and stored in a user profile.
Modeling the user profile is a central step of the pipeline, since it is the component
that triggers the whole recommendation process. The choices about which
information has to be stored and the way the user profile is built, updated, and
maintained are generally strictly related to the specific filtering model implemented
by the system.
3. Filtering and Recommendation: Finally, the information flow is filtered by
exploiting the data stored in the user profile. The goal of this step is to rank the items
according to a relevance criterion and to provide the user with a list of the most
relevant items, in order to let her express her feedback on the proposed
ones. Formally, at the end of the filtering step, the system returns a subset of
items ranked in descending relevance order.
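
The three steps can be instantiated in a compact content-based sketch, assuming a hypothetical catalogue whose items are described by weighted genre features: implicit feedback is collected (Training), the profile is built as the centroid of the vectors of the consumed items (User Modeling), and the unseen items are ranked by cosine similarity to the profile (Filtering and Recommendation). This is a toy instantiation of the pipeline, not the only possible one.

import math

# Hypothetical catalogue: items described by weighted genre features.
catalogue = {
    "matrix":       {"sci-fi": 1.0, "action": 0.8},
    "inception":    {"sci-fi": 0.9, "thriller": 0.7},
    "interstellar": {"sci-fi": 1.0, "drama": 0.5},
    "notebook":     {"romance": 1.0, "drama": 0.6},
    "titanic":      {"romance": 0.9, "drama": 0.8},
}

# Step 1 - Training: implicit feedback, i.e., the items the user interacted with.
watched = ["matrix", "inception"]

# Step 2 - User Modeling: the profile is the centroid of the watched items' vectors.
profile = {}
for item in watched:
    for feature, weight in catalogue[item].items():
        profile[feature] = profile.get(feature, 0.0) + weight / len(watched)

def cosine(u, v):
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

# Step 3 - Filtering and Recommendation: rank unseen items by relevance to the profile.
unseen = [item for item in catalogue if item not in watched]
ranking = sorted(unseen, key=lambda item: cosine(profile, catalogue[item]), reverse=True)
print(ranking)  # 'interstellar' comes first, matching the sci-fi oriented profile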
As reported in [6], recommender system techniques can be classified on the basis
of the different approaches they adopt for the user modeling step as well as for the
filtering and recommendation one.
1. Content-Based Recommender Systems (CBRS): This class of RS suggests
items similar to those preferred in the past by the user.
2. Collaborative Recommender Systems: This class of RS suggests items preferred
by users with similar needs or preferences.
3. Demographic Recommender Systems: This class of RS suggests items on the
ground of the demographic profile of the user.
4. Knowledge-Based Recommender Systems: This class of RS suggests items
whose features meet user needs and preferences according to specific domain
knowledge.
5. Community-Based Recommender Systems: This type of system recommends
items based on the preferences of the user’s friends.

11 http://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers.

6. Hybrid Recommender Systems: This class of RS combines two or more
recommendation techniques to overcome the typical drawbacks of each approach.
To sum up, the scenario recommender systems deal with is very natural and
common, since people get advice all the time, especially when they have to
choose among different alternatives and the knowledge to critically discern them is
not enough. This is a very typical situation. Sometimes we need suggestions about
what movie to watch on a rainy night or which is the best paper to read about a
specific research topic, and the scenario extends to many other domains:
music to listen to, books, web pages or news to read, electronic
devices to buy, restaurants to try, and so on. The list could be infinite.
People use a lot of strategies to tackle decision-making problems [29]. Sometimes,
human recommendations are adequate to effectively face the problem (for example,
to find the best restaurants in our own city), but usually they are not, because the
information held by the people in our network is not enough to provide good suggestions
in all cases.
It is not by chance that these systems have gained a lot of popularity in the last few years, since decision-making processes are much more difficult in the Big Data era. Indeed, as the number of possible choices (for example, which digital camera to buy in an online store or which movie to watch on Netflix) increases, the difficulty of evaluating the overwhelming number of alternatives increases as well, and this leads to the need for aids that guide us in sifting through this flow of choices.
This is a well-known problem, usually referred to as the Paradox of Choice [32]. Indeed, when the number of possible alternatives is too large with respect to the knowledge that individuals hold about a problem, choosing becomes difficult and it is common to fall into the so-called paradox of Buridan [27], namely, the inability to make a decision when too many alternatives are available. As stated by Leibniz, in things which are absolutely indifferent, there can be no choice and consequently no option or will, since choice must have some reason or principle. In other terms, when knowledge is not enough, individuals need to be assisted in decision-making processes. Information Filtering (IF) systems like RS are the tools that can best cope with these tasks.
The choice among the different recommendation paradigms (which one best fits our scenario?) is not trivial, and usually there is no recommendation paradigm which is universally acknowledged as the best. In this book, we will mostly discuss the paradigms that have gained the most popularity in the last few years, namely, Collaborative RS, Hybrid RS, and Content-Based RS.
However, we will focus most on the latter class, by showing how the exploitation of (semantic) content-based information in CBRS can lead to a very precise representation of user interests and can produce very accurate recommendations.
The next pages of this chapter will be devoted to discussing these aspects, which represent the cornerstones that supported the writing of this book. We will first explain the importance of introducing content-based information in intelligent information access scenarios, and then we will emphasize the advantages that follow from the adoption of semantics-aware representation strategies in these systems.

1.3 Why Do We Need Content?

As previously explained, this book introduces and discusses several strategies to effectively exploit content-based information and textual data in intelligent information access platforms.
The first question that may arise in such a context is simple: Why do we need
content? Why is it so important to handle and process textual information to develop
effective filters and provide users with intelligent information access?
For some specific tools, the need for textual information is quite obvious. As an example, search engines simply cannot work in the absence of content-based information. As previously shown in Fig. 1.1, the typical pipeline implemented in search engines relies on a query and a set of textual documents: In the absence of content, no keywords describe the available items, so every query that is run returns no results. As a consequence, the proper design of a search engine cannot disregard a proper modeling of content-based and textual information.
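This point can be illustrated with a toy keyword-matching search engine (again, just an illustrative sketch with hypothetical names and data): if the document collection carries no textual content, no query can ever match anything.

def search(query, documents):
    """Returns the ids of the documents containing all query keywords."""
    keywords = set(query.lower().split())
    results = []
    for doc_id, text in documents.items():
        if keywords and keywords.issubset(set(text.lower().split())):
            results.append(doc_id)
    return results

docs_with_content = {1: "semantic recommender systems", 2: "adaptive personalised systems"}
docs_without_content = {1: "", 2: ""}

print(search("semantic systems", docs_with_content))     # [1]
print(search("semantic systems", docs_without_content))  # [] -> no results without content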

1.3.1 Tackling the Issues of Collaborative Filtering

Conversely, the usefulness of injecting content-based data in recommendation tasks is not that trivial and needs to be properly discussed and justified. Indeed, as shown in Fig. 1.3, a recommender system requires a very generic input, that is to say, a set of items and a user profile.
It is straightforward to see that, unlike search engines, both of them can be modeled and represented even in the absence of textual data. As an example, a user can simply be represented as a set of preferences, as the set of ratings she expressed, or through the set of items she is interested in. Moreover, in some scenarios it is very hard to provide a textual description or to identify the best descriptive features for some specific classes of items, such as financial portfolios. As a consequence, it is possible to implement recommendation algorithms that do not take content-based features into account.
Such an intuition is implemented by collaborative recommendation algorithms
[18], which provide effective suggestions by just exploiting the ratings expressed by
the users.
A typical example of the workflow carried out by collaborative recommender systems is provided in Fig. 1.4: In this toy example, we have five different users and five different items. The preference of a specific user toward a specific item is encoded through a checkmark. If we want to provide u5 with a recommendation, we first look for a user who shared similar preferences with her. In this case, u1 can be labeled as a neighbor,12 since both u1 and u5 liked Matrix (the first column) and V for Vendetta (the second column). Once the list of neighbors has been built,13 collaborative recommendations are generated by looking for items the neighbors already liked and the target user has not yet enjoyed. In this case, the recommendation would have been Cloud Atlas (the fourth column of the matrix), since u1, who shared the same preferences as the target user, liked that movie.

12 In collaborative filtering methodologies, two users sharing similar preferences are labeled as neighbors.

Fig. 1.4 Toy example of a data model for a collaborative recommender system, based on the user–item matrix
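The neighborhood-based reasoning of the toy example can be sketched in a few lines of Python. Only part of the user–item matrix is actually described in the text, so the remaining item titles and the preferences of the other users below are invented for illustration purposes; the similarity measure (a simple count of shared likes) is also a simplification of the metrics commonly used in collaborative filtering.

import numpy as np

# Binary user-item matrix: a checkmark in Fig. 1.4 becomes a 1.
items = ["Matrix", "V for Vendetta", "Titanic", "Cloud Atlas", "Inception"]
R = np.array([
    [1, 1, 0, 1, 0],   # u1
    [0, 1, 1, 0, 0],   # u2
    [1, 0, 0, 0, 1],   # u3
    [0, 0, 1, 1, 0],   # u4
    [1, 1, 0, 0, 0],   # u5 (target user)
])

target = 4                                # row index of u5
similarities = R.dot(R[target])           # shared likes with every user
similarities[target] = -1                 # exclude the target user herself
neighbor = int(np.argmax(similarities))   # index 0, i.e., u1

# Recommend items the neighbor liked that the target has not enjoyed yet
suggestions = [items[j] for j in range(len(items))
               if R[neighbor, j] == 1 and R[target, j] == 0]
print(suggestions)                        # ['Cloud Atlas']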
As shown in this example, collaborative filtering can provide suggestions by relying only on user ratings (or user preferences, generally speaking), and thus the usefulness of exploiting content-based data in this scenario is not that straightforward. Moreover, some work [26] further emphasized these aspects by showing that in some particular scenarios even a few ratings are more valuable than item metadata, and thus the usage of textual information can even be counterproductive.
However, collaborative filtering algorithms are not free from problems. One of the main issues that affect collaborative recommendation algorithms is typically referred to as "sparsity". These algorithms suffer from sparsity when the number of available ratings is very small and most of the cells in the user–item matrix are empty. Why is sparsity a relevant problem?
In the worst case, sparsity can make the recommendation algorithm unable to generate suggestions. This scenario inevitably occurs when a collaborative filtering algorithm is first deployed: the user–item matrix is completely empty, since no interaction has been stored in the data model yet.14 In this case, no recommendation can be returned.
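To make the notion of sparsity concrete, it can be measured as the fraction of empty cells in the user–item matrix. The short sketch below (reusing the illustrative matrix introduced above) shows that a freshly deployed system has sparsity 1.0, the extreme case in which no recommendation can be computed.

import numpy as np

def sparsity(ratings):
    """Fraction of empty (zero) cells in the user-item matrix."""
    return 1.0 - np.count_nonzero(ratings) / ratings.size

# Freshly deployed system: no interactions stored yet (cold start)
R_new = np.zeros((5, 5))
print(sparsity(R_new))                 # 1.0 -> no recommendation possible

# The illustrative matrix used above is already rather sparse
R_toy = np.array([[1, 1, 0, 1, 0],
                  [0, 1, 1, 0, 0],
                  [1, 0, 0, 0, 1],
                  [0, 0, 1, 1, 0],
                  [1, 1, 0, 0, 0]])
print(round(sparsity(R_toy), 2))       # 0.56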
Similarly, when just a few ratings are available, it is not easy to calculate the neighborhood of the target user. An example of such a problem is reported in Fig. 1.5. Who

13 In a real collaborative filtering scenario, the neighborhood typically consists of tens or hundreds of users.
14 This problem is typically referred to as cold start.