Professional Documents
Culture Documents
Learning Microsoft Cognitive Services Leverage Machine Learning Apis To Build Smart Applications Second Edition. Edition Larsen
Learning Microsoft Cognitive Services Leverage Machine Learning Apis To Build Smart Applications Second Edition. Edition Larsen
https://textbookfull.com/product/learning-microsoft-cognitive-
services-larsen/
https://textbookfull.com/product/machine-learning-refined-
foundations-algorithms-and-applications-second-edition-borhani/
https://textbookfull.com/product/machine-learning-with-r-
cookbook-second-edition-analyze-data-and-build-predictive-models-
bhatia/
https://textbookfull.com/product/predictive-analytics-with-
microsoft-azure-machine-learning-barga/
Machine Learning Refined: Foundations, Algorithms, and
Applications Second Edition Jeremy Watt
https://textbookfull.com/product/machine-learning-refined-
foundations-algorithms-and-applications-second-edition-jeremy-
watt/
https://textbookfull.com/product/python-deeper-insights-into-
machine-learning-leverage-benefits-of-machine-learning-
techniques-using-python-a-course-in-three-modules-hearty/
https://textbookfull.com/product/learn-pyspark-build-python-
based-machine-learning-and-deep-learning-models-1st-edition-
pramod-singh/
https://textbookfull.com/product/foundations-of-machine-learning-
second-edition-mehryar-mohri/
BIRMINGHAM - MUMBAI
Learning Microsoft Cognitive Services
Second Edition
All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, without the prior written
permission of the publisher, except in the case of brief quotations embedded in
critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of
the information presented. However, the information contained in this book is sold
without warranty, either express or implied. Neither the author, nor Packt Publishing,
and its dealers and distributors will be held liable for any damages caused or alleged
to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
ISBN 978-1-78862-302-5
www.packtpub.com
Credits
Author
Copy Editor
Sameen Siddiqui
Leif Larsen
Reviewer
Project Coordinator
Vaidehi Sawant
Abhishek Kumar
Commissioning Editor
Proofreader
Safis Editing
Richa Tripathi
Acquisition Editor Indexer
You can find out more about him by checking his blog (http://blog.leiflarsen.org/) and
following him on Twitter (@leif_larsen) and LinkedIn (lhlarsen).
Writing a book requires a lot of work from a team of people. I would like to give a
huge thanks to the team at Packt Publishing, who have helped make this book a
reality. Specifically, I would like to thank Rohit Kumar Singh and Pavan
Ramchandani, for excellent guidance and feedback for each chapter, and Denim
Pinto and Chaitanya Nair, for proposing the book and guiding me through the start.
I also need to direct a thanks to Abhishek Kumar for providing good technical
feedback.
Also, I would like to say thanks to my friends and colleagues who have been
supportive and patient when I have not been able to give them as much time as they
deserve.
Thanks to my mom and my dad for always supporting me.
Thanks to my sister, Susanne, and my friend Steffen for providing me with ideas from
the start, and images where needed.
I need to thank John Sonmez and his great work, without which, I probably would
not have got the chance to write this book.
Last, and most importantly, I would like to thank my friend, Miriam, for supporting
me through this process, for pushing me to work when I was stuck, and being there
when I needed time off. I could not have done this without her.
About the Reviewer
Abhishek Kumar works as a consultant with Datacom, New Zealand, with more
than 9 years of experience in the field of designing, building, and implementing
Microsoft Solution. He is a coauthor of the book Robust Cloud Integration with
Azure, Packt Publishing.
Abhishek is a Microsoft Azure MVP and has worked with multiple clients
worldwide on modern integration strategies and solutions. He started his career in
India with Tata Consultancy Services before taking up multiple roles as consultant at
Cognizant Technology Services and Robert Bosch GmbH.
He has published several articles on modern integration strategy over the Web and
Microsoft TechNet wiki. His areas of interest include technologies such as Logic
Apps, API Apps, Azure Functions, Cognitive Services, PowerBI, and Microsoft
BizTalk Server.
I would like to thank the people close to my heart, my mom, dad, and elder bothers,
Suyasham and Anket, for the their continuous support in all phases of life.
I would also like to take this opportunity to thank Datacom and my manager, Brett
Atkins, to for their guidance and support throughout our write-up journey.
www.PacktPub.com
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF
and ePub files available? You can upgrade to the eBook version at www.PacktPub.com
and as a print book customer, you are entitled to a discount on the eBook copy. Get
in touch with us at service@packtpub.com for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up
for a range of free newsletters and receive exclusive discounts and offers on Packt
books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all
Packt books and video courses, as well as industry-leading tools to help you plan
your personal development and advance your career.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Customer Feedback
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our
editorial process. To help us improve, please leave us an honest review on this
book's Amazon page at https://www.amazon.com/dp/1788623029.
If you'd like to join our team of regular reviewers, you can e-mail us
at customerreviews@packtpub.com. We award our regular reviewers with free eBooks and
videos in exchange for their valuable feedback. Help us be relentless in improving
our products!
Table of Contents
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Started with Microsoft Cognitive Services
Cognitive Services in action for fun and life-changing purposes
Setting up boilerplate code
Detecting faces with the Face API
An overview of what we are dealing with
Vision
Computer Vision
Emotion
Face
Video
Video Indexer
Content Moderator
Custom Vision Service
Speech
Bing Speech
Speaker Recognition
Custom Recognition
Translator Speech API
Language
Bing Spell Check
Language Understanding Intelligent Service (LUIS)
Linguistic Analysis
Text Analysis
Web Language Model
Translator Text API
Knowledge
Academic
Entity Linking
Knowledge Exploration
Recommendations
QnA Maker
Custom Decision Service
Search
Bing Web Search
Bing Image Search
Bing Video Search
Bing News Search
Bing Autosuggest
Bing Entity Search
Getting feedback on detected faces
Summary
2. Analyzing Images to Recognize a Face
Learning what an image is about using the Computer Vision API
Setting up a chapter example project
Generic image analysis
Recognizing celebrities using domain models
Utilizing Optical Character Recognition
Generating image thumbnails
Diving deep into the Face API
Retrieving more information from the detected faces
Deciding whether two faces belong to the same person
Finding similar faces
Grouping similar faces
Adding identification to our smart-house application
Creating our smart-house application
Adding people to be identified
Identifying a person
Automatically moderating user content
The content to moderate
Image moderation
Text moderation
Moderation tools
Using the review tool
Other tools
Summary
3. Analyzing Videos
Knowing your mood using the Emotion API
Getting images from a web camera
Letting the smart-house know your mood
Diving into the Video API
Video operations as common code
Getting operation results
Wiring up the execution in the ViewModel
Detecting and tracking faces in videos
Detecting motion
Stabilizing shaky videos
Generating video thumbnails
Analyzing emotions in videos
Unlocking video insights using Video Indexer
General overview
Typical scenarios
Key concepts
Breakdowns
Summarized insights
Keywords
Sentiments
Blocks
How to use Video Indexer
Through a web portal
Video Indexer API
Summary
4. Letting Applications Understand Commands
Creating language-understanding models
Registering an account and getting a license key
Creating an application
Recognizing key data using entities
Understanding what the user wants using intents
Simplifying development using prebuilt models
Prebuilt domains
Training a model
Training and publishing the model
Connecting to the smart-house application
Model improvement through active usage
Visualizing performance
Resolving performance problems
Adding model features
Adding labeled utterances
Looking for incorrect utterance labels
Changing the schema
Active learning
Summary
5. Speaking with Your Application
Converting text to audio and vice versa
Speaking to the application
Letting the application speak back
Audio output format
Error codes
Supported languages
Utilizing LUIS based on spoken commands
Knowing who is speaking
Adding speaker profiles
Enrolling a profile
Identifying the speaker
Verifying a person through speech
Customizing speech recognition
Creating a custom acoustic model
Creating a custom language model
Deploying the application
Summary
6. Understanding Text
Setting up a common core
New project
Web requests
Data contracts
Correcting spelling errors
Natural Language Processing using the Web Language Model
Breaking a word into several words
Generating the next word in a sequence of words
Learning if a word is likely to follow a sequence of words
Learning if certain words are likely to appear together
Extracting information through textual analysis
Detecting language
Extracting key phrases from text
Learning if a text is positive or negative
Exploring text using linguistic analysis
Introduction to linguistic analysis
Analyzing text from a linguistic viewpoint
Summary
7. Extending Knowledge Based on Context
Linking entities based on context
Providing personalized recommendations
Creating a model
Importing catalog data
Importing usage data
Building a model
Consuming recommendations
Recommending items based on prior activities
Summary
8. Querying Structured Data in a Natural Way
Tapping into academic content using the Academic API
Setting up an example project
Interpreting natural language queries
Finding academic entities from query expressions
Calculating the distribution of attributes from academic entities
Entity attributes
Creating the backend using the Knowledge Exploration Service
Defining attributes
Adding data
Building the index
Understanding natural language
Local hosting and testing
Going for scale
Hooking into Microsoft Azure
Deploying the service
Answering FAQs using QnA Maker
Creating a knowledge base from frequently asked questions
Training the model
Publishing the model
Improving the model
Summary
9. Adding Specialized Searches
Searching the web from the smart-house application
Preparing the application for web searches
Searching the web
Getting the news
News from queries
News from categories
Trending news
Searching for images and videos
Using a common user interface
Searching for images
Searching for videos
Helping the user with auto suggestions
Adding Autosuggest to the user interface
Suggesting queries
Search commonalities
Languages
Pagination
Filters
Safe search
Freshness
Errors
Summary
10. Connecting the Pieces
Connecting the pieces
Creating an intent
Updating the code
Executing actions from intents
Searching news on command
Describing news images
Real-life applications using Microsoft Cognitive Services
Uber
DutchCrafters
CelebsLike.me
Pivothead - wearable glasses
Zero Keyboard
The common theme
Where to go from here
Summary
11. LUIS Entities and Additional Information on Linguistic Analysis
LUIS pre-built entities
Part-of-speech tags
Phrase types
12. License Information
Video Frame Analyzer
OpenCvSharp3
Newtonsoft.Json
NAudio
Definitions
Grant of Rights
Conditions and Limitations
Preface
Artificial intelligence and machine learning are complex topics, and adding such
features to applications has historically required a lot of processing power, not to
mention tremendous amounts of learning. The introduction of Microsoft Cognitive
Service gives developers the possibility to add these features with ease. It allows us
to make smarter and more human-like applications.
This book aims to teach you how to utilize the APIs from Microsoft Cognitive
Services. You will learn what each API has to offer and how you can add it to your
application. We will see what the different API calls expect in terms of input data
and what you can expect in return. Most of the APIs in this book are covered with
both theory and practical examples.
This book has been written to help you get started. It focuses on showing how to use
Microsoft Cognitive Service, keeping current best practices in mind. It is not
intended to show advanced use cases, but to give you a starting point to start playing
with the APIs yourself.
What this book covers
Chapter 1, Getting Started with Microsoft Cognitive Services, introduces Microsoft
Cognitive Services by describing what it offers and providing some basic examples.
Chapter 2, Analyzing Images to Recognize a Face, covers most of the image APIs,
introducing face recognition and identification, image analysis, optical character
recognition, and more.
Chapter 5, Speaking with Your Application, dives into different speech APIs, covering
text-to-speech and speech-to-text conversions, speaker recognition and
identification, and recognizing custom speaking styles and environments.
Chapter 8, Querying Structured Data in a Natural Way, deals with the exploration of
academic papers and journals. Through this chapter, we look into how to use the
Academic API and set up a similar service ourselves.
Chapter 9, Adding Specialized Search, takes a deep dive into the different search APIs
from Bing. This includes news, web, image, and video search as well as auto
suggestions.
Chapter 10, Connecting the Pieces, ties several APIs together and concludes the book
by looking at some natural steps from here.
Appendix B, License Information, presents relevant license information for all third-
party libraries used in the example code.
What you need for this book
To follow the examples in this book you will need Visual Studio 2015 Community
Edition or later. You will also need a working Internet connection and a subscription
to Microsoft Azure; a trial subscriptions is OK too.
To get the full experience of the examples, you should have access to a web camera
and have speakers and a microphone connected to the computer; however, neither is
mandatory.
Who this book is for
This book is for .NET developers with some programming experience. It is assumed
that you know how to do basic programming tasks as well as how to navigate in
Visual Studio. No prior knowledge of artificial intelligence or machine learning is
required to follow this book.
Code words in text, database table names, folder names, filenames, file extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows:
"With the top emotion score selected, we go through a switch statement, to find the
correct emotion."
When we wish to draw your attention to a particular part of a code block, the
relevant lines or items are set in bold:
private BitmapImage _imageSource;
public BitmapImage ImageSource
{
set
{
_imageSource = value;
RaisePropertyChangedEvent("ImageSource");
}
}
New terms and important words are shown in bold. Words that you see on the
screen, for example, in menus or dialog boxes, appear in the text like this: "In order
to download new modules, we will go to Files | Settings | Project Name | Project
Interpreter."