CODE Magazine, 2019 November/December
NOV/DEC 2019
codemag.com - THE LEADING INDEPENDENT DEVELOPER MAGAZINE - US $ 8.95 Can $ 11.95
What’s Your Superpower?

GET THE INSIDER VIEW

SCOTT GUTHRIE: Executive Vice President, Cloud + AI Platform, Microsoft
ERIC BOYD: Corporate Vice President, Microsoft AI Platform, Microsoft
SCOTT HANSELMAN: Principal Program Manager, Web Platform Team, Microsoft
SCOTT HUNTER: Director of Program Management .NET, Microsoft
JEFF FRITZ: Senior Program Manager, Microsoft
JOHN PAPA: Principal Developer Advocate, Microsoft

REGISTER by JANUARY 13 for a WORKSHOP PACKAGE and receive a choice of hardware or hotel gift card! Shown are samples of past hardware choices: Surface Go, Xbox One S, Xbox One X, Surface Headphones.

Powered by DEVintersection.com
203-264-8220, M–F, 9–4 EST
TABLE OF CONTENTS

Features

8  Enhance Your Search Applications with Artificial Intelligence
   Search is everywhere. But unless you add it to your app, you won’t find it there! Sahil examines the various search tools in the Microsoft ecosystem and shows you how to make the most of them.
   Sahil Malik

14 Synchronizing the In-Browser Database with the Server
   Craig shows you how to gracefully resolve conflicts and synchronization issues with disconnected databases.
   Craig Shoemaker

22 Get Started with Serverless Azure Functions
   Azure Functions take care of most of the server-related problems tied to hosting. Julie shows you how to integrate them with your own app and then monitor the results.
   Julie Lerman

28 If you’ve been wondering how to use it, you’ll be fascinated by what Jeannine serves up.
   Jeannine Takaki-Nelson

50 POURing Over Your Website: An Introduction to Digital Accessibility
   Everyone knows that there are standards when it comes to building apps. And most people know that there are standards for accessibility. But did you know that writing accessible apps is better for everyone? Ashleigh shows you what to think about the next time you sit down to create something.
   Ashleigh Lodge

54 Best Practices for Data Visualizations: A Recipe for Success
   Helen shows you the ins and outs of creating really useful charts and graphs with Tableau. You’ll never make a boring old pie chart again.
   Helen Wall

Departments
US subscriptions are US $29.99 for one year. Subscriptions outside the US pay US $49.99. Payments should be made in US dollars drawn on a US bank. American Express,
MasterCard, Visa, and Discover credit cards are accepted. Bill Me option is available only for US subscriptions. Back issues are available. For subscription information,
send e-mail to subscriptions@codemag.com or contact Customer Service at 832-717-4445 ext. 9.
Subscribe online at www.codemag.com
CODE Component Developer Magazine (ISSN # 1547-5166) is published bimonthly by EPS Software Corporation, 6605 Cypresswood Drive, Suite 425, Spring, TX 77379 U.S.A.
POSTMASTER: Send address changes to CODE Component Developer Magazine, 6605 Cypresswood Drive, Suite 425, Spring, TX 77379 U.S.A.
…and the other shoe. That’s how I put on my socks and shoes.

The next day, I was at the acupuncture clinic, and I noticed someone putting on his shoes: both socks first, then both shoes, starting on the right both times.

I asked a few people, and everyone seemed to have really strong opinions about the correct order of such things. Several people explained that they put on both socks first “in case there was a fire.” (Wouldn’t you rather have one shoe than no shoes in a fire? What if you needed to stomp the fire out?) People were very committed to the “correct” order of things.

It got me thinking about what other things we do in a certain order by rote: brushing teeth, brushing hair, starting the car and putting on a seat belt. I walk every day and always take the same route or some version of the same route. I told myself it was because I loved the walk so much. So I decided to prove it. I walked the route in the opposite direction.

A walk that normally takes an hour and a half took two and a quarter hours! I had trouble getting across a major intersection—normally I crossed there an hour later, when the traffic had died down. People I routinely nodded and smiled at and said good morning to didn’t recognize me (nor I them, until I realized that I wasn’t paying attention). I noticed houses and shops that I’d never seen before, I got a whole different view of some massive construction, and I began to tire right where I usually got to my happy place.

I know that I’m a creature of many habits, so I decided that it was clearly time to shake things up. I sat to my work in a different place, listened to different music, practiced my music at a different time of day, watched different things on television, wore different clothes than usual, did my shopping and laundry on different days, went to bed at a different time—even brushed my teeth in a different order. Some of the changes felt refreshing, some were annoying or felt like interference, and some, like the walk reversal, were interesting and informative.

The most useful thing was what I found energizing—working in a different place, starting at the other end of the To Do List, changing the music I listen to when I’m working.

It seems logical that shaking up the usual order of things would be useful in a larger project with more staff, too. Not change for the sake of change if it’s disruptive, but maybe the alphabet could start in the middle every now and then.

Back in the day, when we used to edit on printed-out copies, I learned a neat trick for catching things like repeated words or extra spaces—things that are hard to catch under normal circumstances. Turn the page upside-down. As it happens, I have some peculiarity in my brain that lets me read and write upside-down with great facility (I can’t write in cursive upside-down, only printing, which makes me wonder if I learned this trick before I learned cursive), but even so, I found more errors like that than I did when editing right-side up.

Another trick is for when I get stuck while writing. Write it out of order! You don’t have to start at the beginning of the article/chapter/book/paper/story when you’re writing any more than you have to start at the beginning when you’re reading. Write the bits that interest you most or that are easiest to write, and then go back and introduce them. Or, if you’re really stuck, write each heading/chapter title/topic on three-by-five cards and toss them in the air. Start by writing the cards that land right-side-up. Or the upside-down ones. Boom! Responsibility for writing in order is over. Writing this way guarantees that you’ll go back and read from end-to-end to make sure it all gels. (Believe me, your editor knows that you didn’t reread your article a single time because of all the stupid typos, but also, things written in haste read like things written in haste.)

Software development projects are commonly done out of order. Sometimes you need to test the viability of a project by tackling the most difficult bits first. Sometimes you start with the output of a project: reports and user interfaces come to mind. Sometimes it’s just how you delegate responsibilities for features. Assigning people to the various tasks in a large project gives you the opportunity for different views about how to proceed. Even the act of delegating tasks gives you different perspectives on timing, tooling, and approaches.

There are many ways to shake up your process, and I plan to be more interested in them, and to take note when I’ve fallen into a habit. Even so, it’s just wrong to put on both socks and then both shoes. That’s just crazy talk.

Melanie Spiller
ONLINE QUICK ID 1911021
Sahil Malik
www.winsmarts.com
@sahilmalik
Sahil Malik is a Microsoft MVP, INETA speaker, a .NET author, consultant, and trainer. Sahil loves interacting with fellow geeks in real time. His talks and trainings are full of humor and practical nuggets. His areas of expertise are cross-platform mobile app development, Microsoft anything, and security and identity.

It’s very difficult to implement. We all know it’s more than just simple text matching. Even simple text matching isn’t easy. Those of us with database backgrounds know that searching for “prefix*” is a lot easier than searching for “*suffix”. And users want to do all sorts of weird searches like “*run*”, which should match ran, or shrunken, or brunt, or—you get the idea. Quick search results and performance are important, as are accuracy and ranking. You almost have to read the user’s mind. And then there’s the whole idea of keeping your search results fresh. Not an easy task, is it?

What’s amazing is that all that complexity barely scratches the surface of the endless possibilities. About 70% of the data on the Internet is visual: photos and videos. Another big part is audio. Wouldn’t it be useful to be able to search through audio and video as well?

Have you ever found yourself asking a question such as, “I have this tune stuck in my head, what song is that?” Yes, we know there are apps that’ll do that on your phone. But what if that power was brought to your corporate world? Say, “Someone said xyz in a meeting or perhaps an email, or maybe it was a document; I wish I could find out easily where xyz was said and by whom.” Personally, I struggle with this deluge of information every day. Finding that needle in the haystack when my boss is on the call with me is something I deal with far too often.

Search is incredibly powerful. It saves the user’s time. In this article, I’ll show you how you can build an application, with a very gentle learning curve, that offers such functionality and more.

But first, let’s start by clarifying the various search products available in the Microsoft space.

Search Products
In the Microsoft ecosystem, there are multiple search products with overlapping names. The Microsoft Department of Confusing Naming can sometimes do a great job, so it’s best to clarify them first.

Microsoft has three search products.

The first is Bing, which you can find at www.bing.com. It’s an Internet-facing search engine, it’s free to use, and searches execute against the open, anonymous Internet.

The second is search under Cognitive Services, not to be confused with cognitive search under Azure Search, which is an entirely different product. You can read more about Cognitive Services search at https://azure.microsoft.com/en-us/services/cognitive-services/directory/search/. But, to put it simply, this is your way to tap into the power of Bing, to create an ad-free search experience, completely brandable to your requirements, available as a paid offering.

Finally, there’s Azure Search, which is the focus of this article. Azure Search is one of the products under the Azure umbrella. It allows you to create your own private search corpus in the cloud. It’s best viewed as a cloud-hosted, Internet-scale, search-as-a-service solution. It allows you to search your data, in an index you define, with documents you put in the index, at a schedule you define. All this, but with none of the complexity that’s typically paired with an enterprise-class search product. Microsoft Azure manages all of the infrastructure complexity, and, as I mentioned earlier, I assure you, the learning curve here is indeed quite gentle.

One pretty amazing capability of Azure Search is the ability to enhance it with the power of AI using the cognitive capabilities of Azure Search. The typical process of search is to define the index, import data, and execute queries. Cognitive capabilities allow you to make further sense of the imported data. For instance, a video could be further deciphered into the people appearing in the video, and speech-to-text capabilities could make the spoken text in the video searchable. Or you could use OCR capabilities to make the text in images searchable. I’ll show you how to do all of this in this article.

Again, I assure you, the learning curve is quite gentle.

Create a Simple Search Engine
The best way to learn how to swim is to dive in. Without much further ado, let’s go ahead and build a simple search engine. I’ll do it using Azure Search, and I’ll explain the important concepts as I go along.

The first thing I’ll need is the data I wish to search. There are two ways to put data in an Azure Search index: push and pull.

Push Data into Azure Search
The first way to get data into your Azure Search index is by pushing data into it. Azure Search comes with a REST API, as well as .NET and Java SDKs. You can choose to push any searchable data into the index using this push-based mechanism. Certainly, this has its advantages. You can make almost anything searchable, as long as you can programmatically push the data in. Also, you control how and when new data becomes searchable. This means that if you have a specific requirement where new data must become searchable with a very short latency, the push-based mechanism is what you need.

At a high level, the process of pushing data involves defining an index first. When you define an index, you get to define a lot of details, such as which columns in the entity are searchable, which columns are retrievable in search results, which you can perform facets on, etc. Once you define such an index, you can push in documents that match that data structure.
Although these operations give you some flexibility, they won’t be as efficient as just pushing a document into an index, like the push mechanism allows you to do. That’s because when kick-starting an index and getting the status of an index, it still has to find all the new changes and then pull them in, one by one. But hey, it’s a good middle ground between keeping indexing results current and not having to write a lot of code.

There is another tier, the S3HD tier. S3HD is designed for multi-tenant environments, and it has a feature-set difference: indexers are not available in S3HD.

For the purposes of this article, go ahead and provision an instance of Azure Search under the free tier.

Once inside Azure Search, you’ll see a number of interesting things. Right through the portal, you can choose to scale it; because you went with the free tier, this will be disabled. All tiers, including the free tier, give you access to keys. There are two kinds of keys: admin and query. The admin key can be used to programmatically affect search service configuration. You can create up to a maximum of two equivalent admin keys, or you can create up to 50 query keys, which only let you query data. An admin key also lets you query data, but an admin key is a lot more powerful than a query key; therefore, you should use query keys for pure querying functions.

Set Up a Data Source
For the purposes of this article, I’ll index the Northwind database. Yes, that old, tired, boring Northwind database. You can grab the script for the Northwind database from https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases/northwind-pubs. Why did I pick Northwind? Well, I didn’t have to; I just want a data source. Feel free to target any other similar content source.

Once I set up the data source, I see the usual Northwind tables. One of those tables is the Customers table that I…
Figure 1: The import data button at the top of your search instance overview page
…the queries. To integrate searches within your applications, you need to make a REST call to a request URL, with the api-key header. The value of the header is the query key.

Here’s a little tip. You’ll pay for data egress costs, but you only pay for what leaves the data center. So, if you have a Web front-end for the search results, place it in the same data center as your search instance. That way, you only pay for the data egress once.

Let’s execute some search queries now. Under the search explorer button, as can be seen in Figure 1, type in a search query. For instance, I’m trying to search for “London.” This can be seen in Figure 4.

That’s fantastic! Just like that, I was able to search for all customers that had the word “London” anywhere in their entity.

I can even do some wildcard searches; for instance, try searching for AR*. You’ll see that all of the objects returned have “ar” somewhere in the object. Also note that the returned object, as can be seen in Figure 4, contains all of the columns that you marked as retrievable when defining the index.

Remember that the Country field was special? You made it filterable. Can you search for AR* in just the UK? Sure, just use a search query like this:

AR*&$filter=Country eq 'UK'

This simple query should now show you the customers with the pattern AR only from the UK. Integrating this within your application is also quite trivial. All you need to do is pick the request URL from Figure 4 and execute a simple REST call to that URL with your query key in the api-key header.

Congratulations, you’ve just made yourself a neat little search engine with your data.

Leveraging the Power of AI
You have a neat little search engine, and it wasn’t too hard to create. Everything I showed via the Azure portal can also be built using the REST API or the .NET or Java SDKs. And remember, the example that I showed used an indexer to query the Northwind database. What if you don’t have an indexer for the objects you wish to have made searchable? For instance, what if the data resides in an ERP system that has a weird, arcane Web API? You can still push the objects in, in a neat and clean JSON format that matches your index.

“Neat and clean JSON format.” Did that make you hiccup? We all know that the real world is hardly neat and clean. The real world is messy. So in my next example, I’m going to leverage the power of AI to make sense of unstructured data via search.

In order to do so, I’ll use a fantastic capability of search called Cognitive Search. Put simply, Cognitive Search is a bunch of skillsets that leverage the power of AI to make sense out of unstructured data. For instance, you can OCR text out of images and make those images searchable. You can submit a bunch of pictures and have AI recognize celebrities in those pictures. Or you can do speech-to-text, and so much more. Where the out-of-the-box abilities fall short, you’re welcome to write your own skill.

For this part of the article, I had a hard time coming up with a good example, so I just took a screenshot of this article I am writing. Seriously, the text you see here, unedited so far by the editor, is a screenshot I took of it and decided to make searchable. The goal is, via OCR, I want to be able to search through the text of this article. You’re welcome to make this more compelling by uploading pictures of other kinds, such as landmarks, celebrities, your dog—whatever floats your boat.

Back in the search service instance, go ahead and delete the previous index. I’m doing that just to keep my search results clean.
Click Next to add cognitive skills. Here’s where things become interesting. Under the Add enrichments section, choose to Enable OCR and merge all text into the merged_content field, as shown in Figure 5.

Notice the other capabilities you can tap into, as shown in Figure 6.

I know my data is a simple screenshot of this article with just text, so I’ll skip checking all those textboxes.

Finally, choose to create the index.

Clicking the submit button causes the search engine to crawl the document. In my case, it’s a simple screenshot of one page of text, so it shouldn’t take too long. You can click the Refresh button you see in Figure 1 to keep up to date with your progress.

Once the search crawling is done, visit the search explorer and execute a search. For instance, earlier I used the phrase “Seriously, the text you see here”, so let me just search for the word Seriously.

The search results can be seen in Figure 8.

This is truly mind-blowing. I just did an OCR search of an image. And it doesn’t even begin to scratch the surface of the possibilities here. For instance, if you were a media company and all your photos, audio, and video were in Azure BLOB storage, just through a simple point-and-click, you could make all that media searchable.

Then you could issue a search query such as “Picture of Satya Nadella” and it’ll show pictures of Satya Nadella, assuming your media library had such pictures. Or you could search “a dog lying on grass” and it’ll match the pictures. Or you could even issue queries in English, and it’ll match non-English documents via the magical powers of AI-powered language translation.

…service solution, hosted in the cloud. This means that you can now bring the power of search into your applications with ease. Did you notice that there was no code in this article? Well, everything I showed can be done via the SDK or the REST APIs. That’s the gentle learning curve of Azure Search, putting all that power in your hands with such ease.

And then you add AI to the mix and the power multiplies exponentially. Now you can search any kind of content. You can issue search queries in various languages. You can make sense of completely unstructured data. Have you ever run into a law firm saying, “we have so many documents and we wish we could search through them easily,” and they wish to keep their data private?

Azure Search is your answer, and it’s an answer to many other commonly heard problems.

How will you use Azure Search? Do let me know.
Craig Shoemaker
craigshoemaker.net
@craigshoemaker
Craig Shoemaker is a developer, author, speaker, and Senior Content Developer for Microsoft on the Azure Functions team. From building samples and internal tools to writing articles, Craig helps developers around the world learn to build serverless applications. As a Pluralsight author, Craig specializes in teaching JavaScript, HTML5, and IndexedDB. In the future, Craig wants to learn how to tell a joke.

…disconnected databases. The examples explored in this article demonstrate how to work with the PouchDB API (Listing 1) as well as how to create a to-do list application that synchronizes with the server (Listing 2 and Listing 3). Figure 1 shows a screenshot of the running application. The application is available on GitHub at https://github.com/craigshoemaker/synchronize-dbs-demo.

Different Databases in Different Contexts
CouchDB (http://couchdb.apache.org) is a server-side multi-master document database that seamlessly synchronizes data among disconnected database instances. As data changes, a complete revision history for each document is stored, giving CouchDB the context to handle synchronization and resolve conflicts. As databases are synchronized, the revision history is used to decide which revisions prevail among the different versions. When dealing with conflicts, the revision information is used to allow users to select winning revisions.

A core aspect of CouchDB known as “eventual consistency” means that changes are incrementally replicated across the network. This same principle is at work when dealing with databases found inside a Web browser.

PouchDB (https://pouchdb.com) is a browser-based database interface that’s tailor-made to synchronize with CouchDB. In the same fashion as with multiple server instances of CouchDB, data from PouchDB synchronizes with server-side databases. This means that data manipulated in a disconnected state from the server can seamlessly flow up to the server.

PouchDB is a JavaScript implementation of CouchDB that uses IndexedDB and, on rare occasion, Web SQL. The following similarities exist between PouchDB and CouchDB:

• The APIs are consistent. Although not identical, much of the code you write for PouchDB works directly against CouchDB.
• PouchDB implements CouchDB’s replication algorithm. The same rules are enforced on the client as exist on the server that decide how data is synchronized across multiple database instances.
As the document changes, the prefix is incremented by 1 and a new GUID is generated. Therefore, when you update the document, the revision ID prefix advances from 1 to 2.

2-def

Revision IDs are updated in this way in concert with any data changes. Even if you delete the document from the database at this point, the revision ID advances to 3 and the document metadata is marked as deleted. Tracking with revision IDs allows the database to maintain a full revision history of each document. By sustaining a running revision history for every document, the database has the context necessary to replicate changes among different database instances.

Working with PouchDB
To begin working with a database in the browser, you first need to reference the pouchdb.js script in your HTML page.

<script src="scripts/pouchdb.js"></script>

Next, inside a script tag or in a separate JavaScript file, create a new instance of PouchDB. The constructor accepts the database name.

As you create a new instance of PouchDB, the resulting object either points to an existing database or it creates a new database for you. In this case, a new IndexedDB database is created in the browser. PouchDB uses one of a series of adapters to interface with different databases. If you inspect the localDB instance in the browser console, notice that the adapter, as shown in Figure 2, is set as idb. This alludes to the fact that in the browser, PouchDB is using the IndexedDB adapter.

PouchDB is architected with a Promise-based API that provides an opportunity to use JavaScript’s async/await syntax when calling methods. The following snippet demonstrates how to add a new object to the database by calling the put method.

const add = async () => {
  const person = {
    _id: 'craigshoemaker',
    name: 'Craig Shoemaker',
    twitter: 'craigshoemaker'
  };

  const response = await localDB.put(person);
  console.log(response);
};

The result returned from the database resembles an HTTP response code. When successful, the response from PouchDB returns ok: true, the document’s unique identifier, and the revision ID value.

{
  ok: true,
  id: "craigshoemaker",
  rev: "1-747b2b81bf8ef992e8ec1f44aa737c48"
}

Once you have the identifier and revision ID, you can access and manipulate the data as you wish. To retrieve a record from the database, you pass the document ID to the get method.

const person =
  await localDB.get('craigshoemaker');

console.log(person);

The response from the database includes the full document data, including the unique identifier and revision ID.

{
  _id: "craigshoemaker",
  _rev: "1-747b2b81bf8ef992e8ec1f44aa737c48",
  name: "Craig Shoemaker",
  twitter: "craigshoemaker"
}
Removing a document from the database also requires reference to the unique identifier and latest revision ID values. The best way to get the latest values is to call get immediately before attempting to remove the document from the database.

The revision ID starts with a 4 instead of a 1, even though a new document is inserted into the database. Building on these API basics, you can begin synchronizing data between two databases.

Managing Conflicts
Dealing with conflicting data sits at the heart of any attempt to synchronize databases. Embracing the inevitability of conflicts, …

The easiest way to resolve this conflict is to fetch the document’s latest version, update the required values, and then attempt to save the document again.
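The fetch-the-latest-revision-first pattern described above reads like this. This is a sketch assuming a PouchDB instance named localDB; a tiny in-memory stand-in is included so the snippet runs anywhere, but with the real library localDB would come from the PouchDB constructor:

```javascript
// Stand-in with PouchDB-like get/remove so this sketch is runnable
// anywhere; with the real library, localDB would be `new PouchDB(...)`.
const docs = new Map([['craigshoemaker',
  { _id: 'craigshoemaker', _rev: '2-abc', name: 'Craig Shoemaker' }]]);
const localDB = {
  get: async id => docs.get(id),
  remove: async doc => { docs.delete(doc._id); return { ok: true }; }
};

// Fetch the latest revision immediately before removing, so the _rev
// passed to remove() is current and the delete isn't rejected.
const removeDoc = async id => {
  const latest = await localDB.get(id); // latest _id and _rev
  return localDB.remove(latest);        // remove needs both values
};

removeDoc('craigshoemaker').then(r => console.log(r)); // { ok: true }
```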
Julie Lerman
thedatafarm.com/blog
@julielerman
Julie Lerman is a Microsoft Regional Director, Docker Captain, and a long-time Microsoft MVP who now counts her years as a coder in decades. She makes her living as a coach and consultant to software teams around the world. You can find Julie presenting on Entity Framework, Domain-Driven Design, and other topics at user groups and conferences around the world. Julie blogs at thedatafarm.com/blog, is the author of the highly acclaimed “Programming Entity Framework” books, the MSDN Magazine Data Points column, and popular videos on Pluralsight.com.

The functions are event-driven and have a beautiful way to orchestrate a variety of services through configurable triggers and bindings, reducing the amount of code you have to write. You can just focus on the logic you’re trying to achieve, not on the effort of wiring up the orchestrations.

HTTP requests, like a Web service, are just one type of event that can trigger your function to run. You can also wire functions up to other triggers, such as listening for changes in an Azure Cosmos DB database or a message queue. Other tasks you might want to perform can also be specified through configurable bindings that also don’t require code. For example, you can use an input binding to retrieve some data from a database that your function needs to process. Output bindings—again defined with configurations, not code—let you send results to another service. Your function only needs to create the results in code, and the binding will ensure that those results get passed on to their destination. A single function could be triggered by a write to your Cosmos DB database, then use an input binding to gather relevant data from the database, and then use a message queue output binding to update a cache. If you don’t need any additional logic, the function is totally defined by the trigger and bindings.

All of these bindings and triggers remove many of the redundant tasks that you might otherwise have to perform, and they allow you to focus on the logic of the function. And the Azure Functions service takes care of all of the server-related problems tied to hosting. Integration with Application Insights lets you monitor your functions to observe how they are performing and being used.

The Structure of Azure Functions
The structure of Azure Functions is defined by a Function App that hosts one or more related functions. The app has its own settings that are secure by default. This is a good place to store details like connection strings, credentials, and more. Then each function within the app is a self-contained set of triggers and bindings with its own additional settings. The only things the functions share are the subdomain URL and the app settings. Figure 1 shows some of the Function Apps in my subscription. I’ve expanded the DataAPINode app so you can also see the three functions I created for that app.

Preparing Your Environment
Although it’s possible to create functions directly in the portal, both Visual Studio and Visual Studio Code have extensions that make it easy to develop, debug, and deploy Azure Functions. I’ll use VS Code and its Azure Functions extension.

I found an easy inspiration for a new function. I often need to know how many words I’ve written for things like conference abstract submissions, etc. I find myself copying the text into Microsoft Word to get that count. You’ll get to create your own function that returns character and word counts for a given bit of text.

I’ll be using Visual Studio Code along with its Azure Functions extension and a few other related extensions. If you haven’t used Visual Studio Code before, I invite you to install it (for free on macOS, Linux, or Windows) to try it out as you follow along. VS Code is cross-platform and is a breeze to install. (Go to code.visualstudio.com to install and learn more.) You can use Visual Studio 2017 or 2019, which has a similar extension built into the Azure workload. The VS extension doesn’t have the same workflow, however. You can see how to get started with that in the docs and then come back to walk through the functions built in this article.

In VS Code, start by installing the Azure Functions extension through the Extensions icon in VS Code’s Activity Bar (along the left side of the IDE). A prerequisite of the Azure Functions extension is that you install the Azure Functions Core Tools. There are links to the OS-specific installers in the Prerequisites section of the extension details (marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-azurefunctions). I’ll warn you now that even for Windows, it is an npm install, but it’s quick and painless.

You don’t need an account to use this extension unless you plan to follow the deployment test later on. Note that if you have a Visual Studio Subscription, you have an Azure account. If you have neither, you can get a free account at https://azure.microsoft.com/en-us/free/.

Creating Your First Function
I began with a parent folder named Code Functions, in case I want to add more functions later. Then I created a subfolder called WordCount. Open VS Code in the WordCount folder.

Next, you’ll use the Azure Functions extension to turn the WordCount folder into an Azure Functions project.

Click the Azure icon in the Activity Bar to show the Azure explorer. You’ll see a window for Azure Functions. If you’re logged into an Azure account, the Azure Functions explorer will also show you the Function Apps in your account (Figure 2). I have a lot of demo function apps in my subscription. To save resources in my account, I’ve stopped them all while they’re not actively in use.
That's it. The extension will then build out the assets needed for this project.

Now it's time for the logic that reads the text and creates an output with the character and word counts. Rather than creating yet another Azure Function to perform that task, I first added a class called DocObject to encapsulate the results of the analysis.
Figure 3: Initial result of creating a function project using the Azure Functions extension

private static DocObject AnalyzeText(string text)
{
    var charsLen = text.Length;
    if (charsLen == 0) return null;
    // ... (remainder of the listing not shown)
}
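The C# listing above is truncated by the page layout. As a rough sketch of the same counting logic (the function name, sample string, and result shape here are my own, not the article's full implementation), in Python:

```python
def analyze_text(text: str) -> dict:
    """Count total characters, non-space characters, and words,
    mirroring the article's AnalyzeText/DocObject idea."""
    return {
        "chars": len(text),
        "charsNoSpaces": len(text.replace(" ", "")),
        "words": len(text.split()),
    }

result = analyze_text("Hello from my first Azure Functions")
```

For that sample string, the counts come out to 35 characters, 30 without spaces, and 6 words, the same kind of result the article reports in Figure 6.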
Figure 4: The Azure Functions logo displayed in the terminal when the SDK is running properly
result shown in Figure 6 tells me that my text has 35 characters, 30 without spaces, and the word count is six.

Deploying My New Function to the Cloud
For my new function to be truly useful, I'll need to deploy it to Azure. Even with 30 years of software experience, the term "deploy" still makes my heart rate go up a little. Luckily, both Visual Studio and VS Code make it easy. Remember the "upload" icon in the Azure Functions explorer shown in Figure 2? As long as you're logged into your account, there's not much to deploying this function. In my case, I'll need to ensure that the Function App is created first and then the WordCount function inside of it. Also, keep in mind those local settings I pointed out earlier. For this beginner function, I didn't do much that involved settings, such as defining bindings or providing connection strings or credentials. You'll get a chance to upload your local settings at the end of the deployment process.

Go ahead and click the upload button. You'll follow a series of prompts as you did when creating the function. First, create a new Function App in Azure (as opposed to adding it to an existing Function App). You'll see that there's also an Advanced version of this option, but choose the simple version for this demo. You'll need to provide a new name for the Function App. The prompt asks for a "globally unique" name. That doesn't just mean unique to your account, but to anyone's account in the whole world. That's because the final URI will be a subdomain of azurewebsites.net. I chose CodeMagFunctions for this example. The simple version of "Create new Function App" doesn't give you the chance to select a resource group or choose the location. The advanced option lets you specify these and additional settings for the new Function App. You can also modify settings in the Azure Portal after the fact.

After the extension creates the new Function App, it zips up the compiled function and pushes it up to Azure. You'll get some status reports and then a notification when it's done. At the end, you'll be prompted to upload the settings. I didn't need to do that, so I just closed that prompt window.

Because it's a zip file deployment, the function code will be read-only in the portal. You'll get a message about that with guidance to change an app setting if you want to edit directly in the portal. Essentially, if you're creating these in VS Code or VS, the assumption is that you will make any changes in your IDE and then re-deploy the updated function.
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Anonymous,
        "get", "post", Route = null)]
    HttpRequest req,
    [CosmosDB(
        databaseName: "WordCounts",
        collectionName: "Items",
        ConnectionStringSetting = "CosmosDBConnection",
        CreateIfNotExists = true)]
    ICollector<DocObject> docs)
{
    // ... (function body not shown)
}
There are three ways to ensure that the data intended for the output binding is discovered by the binding. One is to create an out parameter in the method signature. Because I'm using an asynchronous method, that's not possible in C#.
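The article's signature uses an ICollector for exactly this reason. As a language-neutral illustration of how a collector-style output binding behaves (this is a toy Python stand-in I wrote for this article, not the Azure SDK): the host hands the function a collector, the function adds items to it, and the host flushes the staged items to the bound service after the function returns.

```python
# Toy stand-in for a collector-style output binding (NOT the Azure SDK).
class Collector:
    def __init__(self, sink):
        self._staged = []   # items the function has produced
        self._sink = sink   # stands in for the bound Cosmos DB collection

    def add(self, item):
        # the function only stages items; it never touches the sink directly
        self._staged.append(item)

    def flush(self):
        # the Functions host performs this step after the function returns
        self._sink.extend(self._staged)
        self._staged.clear()


def run(text, docs):
    # stage the analysis result for the output binding
    docs.add({"text": text, "wordCount": len(text.split())})


cosmos = []                    # pretend Cosmos DB collection
collector = Collector(cosmos)
run("count these words", collector)
collector.flush()              # host-side flush, not the function's job
```

The point of the pattern is that the function body stays free of any persistence code; the binding declaration alone decides where the added items end up.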
Sumeya Block
Sumeya is a passionate writer, lover of creative expression, and a recent Teen Tix press corps writer. She's currently in her sophomore year of high school and spends most of her time going to poetry slams, writing art reviews, and speaking at events. She's been published in The Evergrey, the Teen Tix blog, and in the Poetry on Busses contest. She has also presented at CppCon and .NET Fringe. When Sumeya isn't running around, she enjoys bingeing Netflix, reading books, and attending social events.

Sara Chipps
sarajchipps@gmail.com
Sara's an engineering manager at Stack Overflow and the cofounder of http://Jewelbots.com. Sara was formerly the CTO of http://FlatironSchool.com and in 2010, she cofounded Girl Develop It, a non-profit focused on helping women become software developers.

advice, and work, and how what she had to say was inspiring to Sumeya—and can be to all of us. And we can also pick up that sense of wonder and excitement from Sumeya's infectious interest. Here's Sumeya's introduction to the interview.

"When I think of inspiring women who are making a difference in the tech world, a few women come to mind. One is Sara Chipps, JavaScript lover and co-founder of Girl Develop It, where women can learn computer programming skills online. She's currently at Jewelbots (which she also co-founded). Jewelbots launched on Kickstarter just over six years ago and, since then, has been committed to getting girls interested in STEM fields. Jewelbots currently sells two products. One is a JewelBits science kit that sparks creativity through DIY neon-colored light-up signs. The second is a programmable friendship bracelet that can be used to talk to friends through Morse code. It can light up when paired with other bracelets and do even more as the users develop their coding skills.

"When I called Sara on a rainy New York night, she talked admiringly about the women she works with and mentors. She talked about how she and others (not just other women) can support their female coworkers by stressing the importance of reaching out and sharing opportunities. Talking to Sara, I learned about why she continues working to create spaces for girls and women to learn about STEM. I learned more about her own encounters, advice, and work. And what she had to say was inspiring."

Sara's Answers as Asked by Sumeya

When did you first become interested in tech and was there a moment where you knew you were going to be a computer programmer?

I was around 11 or 12. This was before the Internet existed, and there were these things called BBSs (bulletin board systems) that were linked to your computer and were like early chat rooms. I used to hang out on those a bunch and realized how much I loved computing and computers because they could make communities happen. I knew I was going to be a computer programmer in my senior year of high school. I took a C++ class with a teacher named Mrs. Gaul, and for the first time, I felt like the computers thought the same way that I thought—very logically.

Do you still see sexism and discrimination in the workplace?

I definitely experienced it when I first started my career, and I know a lot of women who ended up leaving the industry because of it. I think the positive thing now, being on this side of my career, is that I can help mentor younger women and I can step in. Now that I'm older, I can step in when I see it happening to other women. I think it's important that we're all aware of and keep our eyes on these things.

Before you started Jewelbots, you were in Girl Develop It. How does your work in both organizations help to encourage more diversity in STEM fields?

The thing they both have in common is helping to teach women and girls that coding isn't something that's impossible to achieve. It can be something that's fun and powerful. Often, I heard from women in Girl Develop It classes that they didn't know what an engineer was until they got to college, and by then they felt it was too late to learn or take the classes needed. The interesting thing about what I do now with Jewelbots is to help encourage younger girls.

Would you say that the environment has changed since those first girls became women? Is it the same for kids now?

I've really seen a push to get more girls involved at a younger age, and I think that's really important. It's important that we help girls understand that this is something that is for them, it's something that can really help their lives, that it's something that they can really have fun doing.

Jewelbots has really changed and it's developed into a really great community of girls. For me, that was something I always loved when I was first learning. As the CEO, how does the process of developing a product and working with beta testers like me change how you work?

I learned a ton! It really gave me respect for people who do product management and things that aren't strictly engineering. Something I learned really early on is that a lot of assumptions that you make about a product can be wrong. Just because I happen to remember what it's like to be a girl doesn't mean that now, 20 years later, I understand what girls want today. One thing that it's really taught me is to not make assumptions or pretend that I can understand what someone else might be facing just because I think I can imagine it or I think I can remember it from a long time ago. No matter what, it's really important to talk to people and see what you are building with a product.
That sounds really fun! Are these lessons also being taught in science classes? What's special about using these boxes, as opposed to learning about it in school?

That's a really good question. Often, these are things that people are learning in science class. The difference here is learning through play, something we at Jewelbots believe in a lot. I was an okay student, but when I really cared about something and it was something that I could play with and have fun with, those are the things that I really remember now, 20 years later. The goal is more education and a better grasp and understanding of concepts through play.

Jewelbots has its own YouTube channel. In your opinion, why are these videos important and how do they create a community? Why is a community so important?

Community really helps with learning, whether it's the YouTube community or any other type of on- or off-line community. People really respond when they see other people their own age doing the same things they are. I think that's a really neat thing about kids and girls.

On the Jewelbots website, it's stated that your age market is 8-14-year-old girls. What's so special about this particular age group? Why is Jewelbots targeted at this audience?

How should people reach out?

Anyone can reach out and just say, "how can I be helpful?" or ask, "are you job hunting?" "Are you practicing for interviews?" "Are you facing anything at work that you could use some help with from someone with more experience?" Just making yourself available and saying, "how can I help?" is a great way to do it.

Sumeya's Answers to Sara's Questions

Why is coding important to you?

It's really important to me because I know that in this society, coding has definitely become prominent in general. Especially in my age group, technology is just so prominent, and I really like coding as a great way to be creative. I haven't really been able to do a lot of it since I started high school because high school is very demanding. But what I've always loved is the creativity about it. The community of getting to share the things you've done with other people. I think it's important to know for the future because, like I said, it's just so integral.

What's the first thing you ever coded?

I'm pretty sure the first thing I coded was with my dad. We programmed this series of colors with the Jewelbots bracelet.
nalism. I really love journalism. I think that I'll be in college and definitely advocating for women, and I hope that I'll be going to conventions, continuing my path of learning about coding because I think it's important that everyone learns about it. And I also hope to continue my work with activism, empowering women and empowering all people, making sure to amplify the voices to whom injustices happen so they can be heard and their needs met.

The people at CODE Magazine want to know: What is the best way to inspire and excite young women to code?

Let's see, something that I really love is just seeing these opportunities that I'm getting and that my friends are getting, and I really hope that's happening across the nation. I think when code is talked about and celebrated, it's definitely more exciting. Girls who learn about coding are more like, "Oh, this is like a real career." I don't think everybody knows exactly what coding is. I've talked to people who don't understand how much of a great opportunity it is. I think one way that people who read CODE Magazine can help to empower girls and women is to see if young people can come in and learn about coding and learn about what you do at your workplace. Some of the things that hinder people are not knowing what coding is and thinking that they're not empowered to do it. And when you don't learn, you might be like, "Oh, this is really cool. But it feels kind of out of reach." Going to someone's workplace and seeing that they're coding and they're doing all these things, working or running a business, I think it definitely shows that you can do it and you can be inspired to do it. The best thing people can do is just inspire us all to work really hard—all young people, not just women and girls—to get to our goals and to learn about coding.

Sumeya Block, Sara Chipps
Jeannine Takaki-Nelson
j.takaki@live.com
@jrrnt
While at Microsoft, Jeannine worked as a tester and wrote technical documentation for machine learning products, including SQL Server Data Mining, SQL Server Machine Learning, and Azure Machine Learning Studio. She's currently retired, which gives her more time to read about data science and run really inefficient code. She's grateful to the many writers in the R-blogger and SQL Server community for their excellent examples and gentle explanations.

T-SQL scripts or Integration Services packages, expanding the capabilities of ETL and database scripting.

What has this to do with stone soup, you ask? It's a metaphor, of course, but one that captures the essence of why SQL Server works so well with Python and R. To illustrate the point, I'll provide a simple walkthrough of data exploration and modeling combining SQL and Python, using a food and nutrition analysis dataset from the US Department of Agriculture. Let's get cooking!

The article is targeted at the developer with an interest in machine learning (ML), who's concerned about the complexity of ML and is looking for an easier way to incorporate ML with other services and processes. I've chosen "stone soup" as a metaphor to describe the process of collaboration between data scientists and database professionals to brew up the perfectly performant ML solution.

Machine Learning, from Craft to Pro
You might have heard that data science is more of a craft than a science. Many ingredients have to come together efficiently to process intake data and generate models and predictions that can be consumed by business users and end customers.

However, what works well at the level of "craftsmanship" often has to change at commercial scale. Much like the home cook who has ventured out of the kitchen into a restaurant or food factory, big changes are required in the roles, ingredients, and processes. Moreover, cooking can no longer be a "one-man show;" you need the help of professionals with different specializations and their own tools to create a successful product or make the process more efficient. These specialists include data scientists, data developers and taxonomists, SQL developers, DBAs, application developers, and the domain specialists or end users who consume the results.

Any kitchen would soon be chaos if the tools used by each professional were incompatible with each other, or if processes had to be duplicated and slightly changed at each step. What restaurant would survive if carrots chopped up at one station were unusable at the next? Unfortunately, the variety (and sometimes incompatibility) of tools used in data science means that a lot of work has had to be reinvented or created ad hoc and left unmanaged. For example, ETL processes often create data slices that are too big for analysis or they change core aspects of the data irreparably.

The core business proposition of integrating Python and R with SQL and the RDBMS is to end such duplication of effort by creating commercial-strength bridges among all the tools and processes:

• Your data science tools can connect securely to the database to develop models without duplicating or compromising data.
• You can save trained models to a database and generate predictions using customer data and leave optimization to your DBA.
• You can build predictive or analytical capacity into your ETL processes using embedded R or Python scripts.

Let's look at how it works and how the integration makes it easier to combine tools as needed.

Security Architecture
First off, let's be clear about the priorities in this platform: security, security, and security. Also, accountability and management at scale.

Data science, like cooking, can be tremendous fun when you're experimenting in your own kitchen. Remove variables, mash data into new formats, and no one cares if the result is half-baked. But once you move into production, or use secure data, the stakes go up. You don't want someone contaminating the ingredients that go into a recipe or spying on your data and production processes. So how do you control who's allowed in the kitchen, when you can't have just anyone involved in preparing your food or touching your data?

With ML in SQL Server, security and management is enforced at four layers (see Figure 1):

• Isolation of Python or R processes: When you install the ML service, the database server gets its own local instance of Python (or R). Only a database administrator or someone with the appropriate permissions can run scripts or modify installed packages. (No more installing packages from the Internet on a whim.)
• Secure lockdown of the Python launcher: The stored procedure that calls the Python (or R) runtime is not enabled by default; after the feature has been installed, an administrator must enable external code execution

32 Stone Soup: Cooking Up Custom Solutions with SQL Server Machine Learning codemag.com
at the server level, and then assign specific users the permissions to access data and run the stored procedure.
• Data access: Managed by traditional SQL Server security. To get access to data, you must have database permissions, either via SQL login or Windows authentication. You can run Python or R entirely in the server context and return the results back to SQL Server tables. If you need more flexibility, data scientists with permission to connect to the database can also connect from a remote client, read data from text files stored on the local computers, and use the XDF file format to make local copies of models or intermediate data.
• Internal data movement and data storage: The SQL Server process manages all connections to the server and manages hand-offs of data from the database to the Python or R processes. Data is transferred between SQL Server and the local Python (or R) process via a compressed, optimized data stream. Interim data is stored in a secure file directory accessible only by the server admin.

Whereas data science used to be a headache for control-minded DBAs, the integrated ML platform in SQL Server provides room for growth, as well as all the monitoring and management required in a commercial solution. Compare this to the old paradigm of exporting customer data to a data scientist to build a model on an unsecured laptop. Add in the SQL Server infrastructure that supports monitoring—who viewed what data, who ran which job, and for how long—infrastructure that would be complex to implement in an all-Python or R environment.

From the standpoint of the DBA, drawbacks include not just the crazy data scientists asking for Python installs, but new workloads. The administrator must allocate server resources to support ML workloads, which can have very, very different performance profiles. ML also uses new database and server roles to control script execution, as well as the ability to install Python or R packages. Other new tasks for the DBA include backing up your ML data, along with your data science users and their installed libraries.

The SQL Server development team put a lot of effort into figuring out workflows that support data science without burdening the DBA too much. However, data scientists who lack familiarity with the SQL Server security model might need help to use the features effectively.

Package Installation and Management
Security is great, but the data scientist needs to be able to install open source Python or R packages. Moreover, they expect to install those new packages and all their dependencies straight from the Internet. How does this even work in a secured environment?

First off, the default installation includes the most popular packages used for data science, including nltk, scikit-learn, numpy, etc. SQL Server also supports installing new packages and sharing packages among a group of data scientists. However, the package installation process is restricted to admins and super users. This is understandable, because new Python or R libraries can be a security risk. Also, if you install version x.n of a Python package, you risk breaking the work of everyone who's been using a different version of the package.

Resources:
• Introduction to the extensibility framework: https://docs.microsoft.com/sql/advanced-analytics/concepts/extensibility-framework?view=sql-server-2017
• Network protocols and how Python is called from SQL Server: https://docs.microsoft.com/sql/advanced-analytics/concepts/extension-python?view=sql-server-2017
• Security details at the database level: https://docs.microsoft.com/sql/advanced-analytics/concepts/security?view=sql-server-2017
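Before requesting anything new, a data scientist can at least inspect what's already installed in the server's dedicated Python instance. A stdlib-only sketch (my own illustration, requires Python 3.8+ and must be run under the same Python instance that SQL Server uses):

```python
from importlib import metadata

def installed_packages():
    """Map each distribution visible to this Python to its version string."""
    return {dist.metadata["Name"]: dist.version
            for dist in metadata.distributions()}

pkgs = installed_packages()
# e.g., check whether part of the bundled data science stack is present
has_numpy = "numpy" in {name.lower() for name in pkgs if name}
```

Running this on the server instance (for example, via the external script mechanism itself) shows exactly which package versions your embedded scripts will get, which is the information the DBA needs before approving an upgrade.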
Therefore, a database administrator typically has to perform or approve the installation. You can install new packages on the server directly if an admin gives you permissions to install packages. After that, installation is as easy or hard as any other Python install, assuming the server can access the Internet. Whoops. Fortunately, there are workarounds for that too.

The SQL Server product team has thought long and hard about how to enable additional packages without breaking the database, annoying the DBA, or blocking data scientists. Package management features in SQL Server 2017 let the DBA control package permissions at the server and database level. Typically, a super user with the correct database role installs needed packages and shares them with a team. The package management features also help the DBA back up and restore a set of Python libraries and their users. Remote installation is also supported for R packages.

Because this feature is complex, I won't provide more details here. Just know that in a secure server, there are necessarily restrictions on package installation. Table 2 lists some great resources.

Some caveats before I move on:

• Azure SQL DB uses a different method for managing packages. Because multiple databases can run in a container, stricter control is applied. For example, the SQL Server ML development team has tested and "whitelisted" R packages for compatibility and use in Azure SQL DB. At this time, the R language is the only one supported for Azure SQL DB.
• There is no comparable "whitelist" of Python packages that are safe to run on SQL Server. ML in the Linux edition of SQL Server is also still in preview. Watch the documentation for more details.

Management, Optimization, and Monitoring
If you're a database professional, you already know how to optimize server performance and have experienced the challenges of balancing multiple server workloads. For ML, you'll want to make full use of your DBA's knowledge in this area and think hard about server allocation. But you'll also need to lean hard on your data scientist.

Let's start with the basics. Calling Python (or R) does add processing time. Like any other service, you'll notice the lag the first time the executable is called, or the first time a model is loaded from a table. Successive processing is much faster, and SQL Server keeps models in cache to improve scoring performance.

If you set up some event traces, you might also detect small effects on performance from factors such as:

• Moving data, plots, or models between a remote client and the server
• Moving data between SQL Server and the Python or R executables
• Converting text or numeric data types as required by Python, R, or the RevoScale implementations

(For the nitty-gritty details of performance, I strongly recommend the blog series by SQL Server MVP Niels Berglund on Machine Learning Services internals: https://nielsberglund.com/2018/05/20/sp_execute_external_script-and-sql-compute-context---i/)

Considered as a platform, SQL Server Machine Learning offers a lot of options for optimization. Several of the most important use cases have been baked into the platform. For example, native scoring uses C++ libraries in T-SQL (https://docs.microsoft.com/sql/advanced-analytics/sql-native-scoring?view=sql-server-2017) to generate predictions from a stored model very fast. Optimized correctly, this feature can generate as many as a million predictions per second (see One million predictions per second: https://blogs.technet.microsoft.com/machinelearning/2016/09/22/predictions-at-the-speed-of-data/).

Resources:
• Package management roles: https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2017/05/11/enterprise-grade-r-package-management-made-easy-in-sql-server/
• Using sqlmlutils to install packages remotely: https://docs.microsoft.com/sql/advanced-analytics/package-management/install-additional-r-packages-on-sql-server?view=sql-server-2017
Task: Optimize Windows server and SQL Server
Although this case study was originally for R, most of the tips apply to Python models as well. The experiment compares a solution before and after server optimizations such as use of NUMA and maximum parallelism.
https://docs.microsoft.com/sql/advanced-analytics/r/sql-server-configuration-r-services?view=sql-server-2017
Be sure to catch this part of the series, which covers use of compression and columnstore indexes:
https://docs.microsoft.com/sql/advanced-analytics/r/sql-server-r-services-performance-tuning?view=sql-server-2017

Task: Optimize for concurrent execution
The Microsoft Tiger Team captures real-world customer problems and periodically distills them into useful blogs.
https://blogs.msdn.microsoft.com/microsoftrservertigerteam/2016/09/20/tips-sql-r-services-optimization-for-concurrent-execution-of-in-database-analytics-using-sp_execute_external_script/

Task: Choose models and data processing methods
There are many ways that the RevoScale platform can improve performance: enhanced numerical computation, streaming and batching, parallel algorithms, and pretrained algorithms. This guide to distributed and parallel computing provides a high-level introduction to the types of distributed computing provided by the RevoScale algorithms.
https://docs.microsoft.com/machine-learning-server/r/how-to-revoscaler-distributed-computing

Task: Use pretrained models
Talk about a shortcut—the pretrained models in microsoftml (for Python and R) support sentiment analysis and image recognition, two areas where it would be impossible for most users to get and use enough training data.
https://docs.microsoft.com/sql/advanced-analytics/install/sql-pretrained-models-install?view=sql-server-2017

Task: Manage SQL resources
Resource Governance is an awesome feature for helping manage ML workloads, although it's available only with Enterprise Edition.
https://docs.microsoft.com/sql/advanced-analytics/administration/resource-governance?view=sql-server-2017

Task: Optimize for specific high-priority tasks
As noted earlier, fast scoring is particularly important for enterprise customers. There are lots of ways to accomplish this based on whether you are using a single server or distributed servers and even a Web farm.
https://docs.microsoft.com/sql/advanced-analytics/r/how-to-do-realtime-scoring?view=sql-server-2017
https://docs.microsoft.com/machine-learning-server/operationalize/concept-what-are-web-services
The key to success is capturing a baseline so that your DBA and your data scientist can begin the process of optimization: figuring out which processes are taking the most time, and how you can adjust your data sources and code to streamline performance. The goal here is simply to provide a set of starter resources that you can use to begin to optimize a workload in SQL Server Machine Learning.

Python and SQL: A Walkthrough
Let's get cooking! For this walkthrough, the goal is simple: to get the server and client tools set up and learn the basics of the stored procedure sp_execute_external_script.

The first two sections cover basic setup of Machine Learning Services on SQL Server, as well as setup of R or Python client tools. If you already have SQL Server 2017 installed, including the ML features, you can skip the first part.

In the third and fourth sections, I'll explore a simple data set. The data was obtained from the US Department of Food

Prepare the Environment
For specific setup steps, see the Microsoft documentation. Links to all the pertinent setup docs are provided in Table 5, near the end of this section. Setup of the server takes about an hour, and ditto for the client tools.

Choosing which features to install, and which version, is the first step. What features you install depends on what version is available and what you will be doing with ML. Figure 3 summarizes the versions of SQL Server that support ML.

For this demo, I installed Developer Edition for SQL Server 2017, because Python became available starting in 2017. You can use a laptop or any other personal computer with sufficient memory and processing speed, or you can create an Azure Windows VM. Remember, you need to meet at least the minimum SQL Server requirements and then have extra memory to support the Python or R workloads. Such an environment will let you try out all the features, including passing data between R and Python.
codemag.com Stone Soup: Cooking Up Custom Solutions with SQL Server Machine Learning 35
Setup type / Link

Set up SQL Server with Python
https://docs.microsoft.com/sql/advanced-analytics/install/sql-machine-learning-services-windows-install?view=sql-server-2017

Python client
https://docs.microsoft.com/sql/advanced-analytics/python/setup-python-client-tools-sql?view=sql-server-2017

R client
https://docs.microsoft.com/sql/advanced-analytics/r/set-up-a-data-science-client?view=sql-server-2017

Troubleshooting and known issues
https://docs.microsoft.com/sql/advanced-analytics/known-issues-for-sql-server-machine-learning-services?view=sql-server-2017
https://docs.microsoft.com/sql/advanced-analytics/common-issues-external-script-execution?view=sql-server-2017

What's different between the versions?
https://docs.microsoft.com/azure/sql-database/sql-database-machine-learning-services-differences
That said, not everyone will need to set up a client, and such software might not be allowed in certain highly secured environments. If you can accept the limitations around debugging and viewing charts, you can develop and run everything in SQL Server.
-- check Revoscalepy version
EXECUTE sp_execute_external_script
    @language = N'Python',
    @script = N'
import revoscalepy
import sys
print(revoscalepy.__version__)
print(sys.version)'

The following code merges two views as inputs and returns some subset from Python.

EXECUTE sp_execute_external_script
    @language = N'Python'
    , @input_data_1 = N'
    (SELECT * FROM [dbo].[vw_allmidwest])
    UNION (SELECT * FROM [dbo].[vw_allsouth])'
    , @input_data_1_name = N'SummaryByRegion'
    , @script = N'
import revoscalepy
import pandas as pd
df = pd.DataFrame(SummaryByRegion)
Midwest = df[df.Region == "midwest"]
OutputDataSet = Midwest'
WITH RESULT SETS ((col1 nvarchar(50),
    col2 varchar(50), Rank1 int,
    Amt1 float, Pct1 float, Rank2 int,
    Amt2 float, Pct2 float))

It's not a practical example, but it demonstrates some key aspects of running Python (or R) code in SQL Server:

SQL Server and Python are kind of like a chain saw and a food processor. Both can process huge amounts of data, but they differ in the way they chop it up.

Such differences can break your code if you aren't aware of them or don't prepare your code to account for the differences. Be sure to review the data type conversion topics listed in Table 5, as well as the Known Issues in Books Online, before you port over existing code.
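If you want to prototype the body of an @script outside SQL Server, a plain pandas session behaves the same way. Here is a minimal sketch of that idea; the input name SummaryByRegion and the column values below are illustrative stand-ins, not the article's actual views or data:

```python
import pandas as pd

# Stand-in for the rows SQL Server would pass in via @input_data_1;
# the contents here are made up for illustration.
SummaryByRegion = pd.DataFrame({
    "Region":   ["midwest", "south", "midwest"],
    "SNAP_Pct": [41.0, 38.5, 44.2],
})

df = pd.DataFrame(SummaryByRegion)
Midwest = df[df.Region == "midwest"]   # same filter as the @script body
OutputDataSet = Midwest                # what SQL Server would stream back

print(len(OutputDataSet))              # 2
```

Once the logic looks right locally, you can paste the same lines into the @script parameter largely unchanged.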
Resource / Description

https://docs.microsoft.com/sql/advanced-analytics/python/python-libraries-and-data-types?view=sql-server-2017
Data type mismatches and other warnings for SQL to Python conversion

https://docs.microsoft.com/sql/advanced-analytics/r/r-libraries-and-data-types?view=sql-server-2017
Data type mismatches and other warnings for SQL to R conversion

https://docs.microsoft.com/sql/advanced-analytics/r-script-execution-errors?view=sql-server-2017
Issues that apply only to R scripts
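One conversion gotcha worth seeing firsthand: a SQL NULL arriving in a numeric column becomes NaN on the Python side, and because classic pandas has no integer NaN, the whole column is silently upcast to float. A minimal illustration in plain pandas, outside SQL Server:

```python
import pandas as pd

# A SQL integer column containing a NULL: the None forces pandas to
# upcast the entire column from integer to float64, with NaN for the gap.
df = pd.DataFrame({"Amt": [1, None, 3]})

print(df["Amt"].dtype)          # float64
print(df["Amt"].isna().sum())   # 1
```

Code that assumes integer arithmetic or exact equality on such a column can break after porting, which is exactly why the topics above are worth a read.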
Figure 5: Comparison of major food purchases per region
The principal finding by the USDA was that there were no major differences in the spending patterns of households that used food assistance vs. those that do not use food assistance. For example, both types of households spend about 40 cents of every dollar of food expenditures on basic items such as meat, fruits, vegetables, milk, eggs, and bread. However, the study found some ways to improve access to healthy food choices for SNAP participants. For example, authorized retailers were required to offer a larger inventory and variety of healthy food options, and convenience-type stores were asked to stock at least seven varieties of dairy products, including plant-based alternatives.

Figure 6: Differences between SNAP and non-SNAP households for infant formula

I'll do some easy data exploration to find additional insights into food consumption by food stamp users:

• Differences between regions in terms of seasonal vegetable consumption, or meat purchases
• Top commodities for each region and for all regions
• Differences based on age of the head of household and poverty level of the surrounding community

Such descriptive statistics have long been the domain of traditional BI, and there are lots of tools for displaying nice summary charts, from Power BI to Reporting Services and plain old T-SQL, to the many graphing libraries in Python and R. Fortunately, in the SQL Server 2017 environment, you're not constrained to any one tool and can use whatever gets the job done.

I decided that it would be interesting to compare the three regions included in the report. A bar chart might work, but radar charts are also handy for displaying complex differences at a glance. A radar chart doesn't show much detail, but it does highlight the similarities, and might suggest some areas a nutritionist might want to dig into, such as the heavy use of frozen prepared foods. See Figure 5 for the summary of purchases by food type, per region.

The cool graphic was produced not in Python at all, but in Excel. The Python library matplotlib includes a function for creating a radar chart, but it was pretty complex, and I'm a Python amateur, whereas creating a radar chart in Excel takes only a few clicks. That's right; I don't particularly care which tool I use, as long as I can explore the data interactively. I don't
and the answer was clearly negative. That’s okay. We tend to
expect that brilliant insights will emerge from every set of
data and forget that the original goal of statistical analysis
is to disprove that an effect exists.
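For readers who do want to try the matplotlib route for the radar chart mentioned earlier: the chart is just a line plot on polar axes, and the only geometry you need is a set of evenly spaced spoke angles. This sketch computes them with the standard library so the idea stands on its own (matplotlib itself is deliberately left out):

```python
import math

def radar_angles(n_categories):
    """Evenly spaced spoke angles (radians) for a radar chart,
    one per category, starting at 0 and going counterclockwise."""
    return [2 * math.pi * i / n_categories for i in range(n_categories)]

# e.g. eight food-purchase categories, as in a regional comparison
angles = radar_angles(8)
print(len(angles))   # 8
print(angles[4])     # pi: the fifth spoke points straight left
```

On matplotlib's polar projection you would then plot each region's values against these angles, repeating the first point to close the polygon.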
SQL Server, and open a command prompt (not a Python interactive window) to run pip as follows:

python -m pip install wordcloud

Step 2. Prepare text data used for the word cloud. There are many ways to create and provide weights to a word cloud. To simplify the demo, I used Python to process the list of top commodities for each region and wrote that data back to a table.

Data preparation is another area where the SQL Server platform gives you the ability to use the most convenient, fastest tool for the job. This data set had very short text entries, so I merely concatenated the text and removed nulls, but you can imagine text data sources where the ability to process data in Python's nltk and return the tokenized text to SQL Server would be useful. On a later iteration, I'll probably add a stopword list or expand abbreviations.

INSERT [dbo].[WesternFoods]
EXECUTE sp_execute_external_script
    @language = N'Python'
    , @input_data_1 = N'(SELECT Region,
    [Subcommodity], [CompositeSubcat], [OtherSubcat],
    [SNAP_Pct] FROM [dbo].[vw_FoodListWest])'
    , @script = N'
import revoscalepy
import pandas as pd
df = pd.DataFrame(InputDataSet)
# prevent Python from inserting None
df = df.fillna("")
df["mergedtext"] = df["Subcommodity"].map(str) + " " + df["CompositeSubcat"].map(str)
print(list(df.columns.values))
OutputDataSet = df[["Region", "SNAP_Pct", "mergedtext"]]'

Note: There's some slight cost incurred when moving data between SQL Server and Python, but the pipeline is highly compressed and optimized; certainly, it's faster than moving data across the network to a Python client.

Step 3. Create the plot. Having extracted the table of words and weights, it's simple to input the data to sp_execute_external_script as a view or query, and build a word cloud using Python or R. The Python script has these basic steps:

1. Import the required libraries.
2. Put word data from the SQL query into a data frame.
3. Create a word cloud using Python libraries.
4. Dump the plot objects as a serialized variable using pickle.
5. Save the variable as an output to SQL Server.

You can see the full text of the stored procedure in Listing 1, but this excerpt shows the key steps:

from wordcloud import WordCloud, ImageColorGenerator

# Handle and prepare data
df = pd.DataFrame(WesternFoods)
descriptors = df.MergedText.values
text = descriptors[0]
wordcloud = WordCloud().generate(text)
plot0 = pd.DataFrame(data=[pickle.dumps(wordcloud)], columns=["plot"])

The pattern of creating a complex object and saving it to a binary data stream is standard for handling complex structures like plots or predictive models. SQL Server can't understand or display them, so you generate the object, save it as a binary data stream, and then pass the variable to another SQL statement or client, or save it to a table.

In the case of a predictive model, you'll generally save the model to a table. That way you can save and manage models, add metadata about when the model was run, on how many rows of data, and which prediction runs it was used for. To see an example of this process for models used in production, I recommend this tutorial from the Microsoft data science team: Python for SQL developers (https://docs.microsoft.com/sql/advanced-analytics/tutorials/sqldev-py3-explore-and-visualize-the-data?view=sql-server-2017).
For plots, you need some way to view the object, and there
are several options:
• Save the plot to a local image file, and then copy the
file elsewhere to view it. That way, you aren’t inviting
people to open image files on the server.
• Save the binary object to a table.
• Pass the binary variable to a Python client so that it
can be read and displayed.
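The first option, saving the binary to a local image file and copying it elsewhere to view, amounts to a straight bytes write. A small sketch, using placeholder bytes rather than a real serialized plot:

```python
import os
import tempfile

# Placeholder bytes standing in for a serialized plot from SQL Server
blob = b"\x89PNG\r\n\x1a\n...not a real image..."

# Write the binary to a local file, exactly as you would with a plot image
fd, path = tempfile.mkstemp(suffix=".png")
with os.fdopen(fd, "wb") as f:
    f.write(blob)

# The copied file carries the same bytes, ready to open elsewhere
with open(path, "rb") as f:
    roundtrip = f.read()
os.remove(path)

print(roundtrip == blob)   # True
```

With a real plot, the bytes would come from the varbinary column (or from the serialized plot variable), and the copied file opens in any image viewer.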
required in scenarios where performance and handling of large data is required, such as these:

• Creating models that can be processed in parallel using revoscalepy, RevoScaleR, or microsoftml algorithms
• Saving models to a table, which you can then reuse in any server that supports native scoring
• Loading pretrained models from disk and caching for subsequent predictions
• Scaling predictions from a saved model to multiple servers
• Embedding Python or R scripts in ETL tasks, using an Execute SQL task in Integration Services

Food Is Big Business
Enrollment in food assistance programs grew from nine percent of the U.S. population in 2007, to about 15% (47.6 million Americans) in 2013. After the recession, participation has gradually declined, to 43.4 million people as of July 2016 and 40.3 million in 2018.

Go Pro in the Kitchen with Team Data Science
Going pro with data science is more complicated than getting bigger data or moving the data to a server from a file share. Scaling up requires fundamental changes in the way you work.

Back to the cooking metaphor: imagine a pastry chef who has crafted an elaborate French pastry. Like the data scientist who has painstakingly selected and prepared data and fine-tuned the results using feature selection and parameters, the result is a one-off masterpiece.

Now imagine that chef being asked to turn that delightful recipe into a commodity at the pace of several hundred thousand per day. The problem is no longer one of taste and invention, but of scale and process. And because a lot of money rests on the results, consistency and guaranteed results are critical, as well as accountability for preparation and cooking time and ingredient cost.

Data scientists often find themselves scrambling to efficiently productize and hand over the perfect mix of data and algorithm. Tasks include documenting what was done and why, ensuring that results are repeatable, changing the recipe as needed to support scale and cost reduction, and tracking the consistency and quality of results.

The good news is that there's help from the Team Data Science Process (TDSP). TDSP is a solution created by Microsoft data science teams to guide a team through development, iteration, and production management of a large data science project. You can read more here: https://docs.microsoft.com/azure/machine-learning/team-data-science-process/overview.

Based loosely on the CRISP-DM standard, TDSP provides a set of templates for reproducible, well-managed data mining projects. The templates apply to multiple products, not solely SQL Server, and provide a structure around key data science tasks that you can use to organize a project or communicate with a client, such as:

• Formally defining the business problem and the data required
• Defining the scope and lifecycle of the data science project, and describing the people who are required in a large data science project and their roles
• Providing to partners the detailed requirements in terms of packages, server resources, data types, data SLAs, etc.
• Specifying ownership and SLAs for related operations such as data cleansing, scoring pipelines, backups, etc.

In case you're thinking "this is all too complex for my little project," consider how many applications have started as demo projects but ended up in production and ran for years with scant documentation. Given that data science projects typically entail massive amounts of data that change frequently, with small tweaks to algorithms that few people understand, it's best to start documenting early!

Scaling Up = Changing Your Recipe
A core value proposition for the integration of SQL Server with an open source language like Python (or R) is to increase the limited processing capacity of Python and R to consume more data, and to build models on wider data sets. Revolution Analytics did the pioneering work on scaling R to meet enterprise needs, and their acquisition by Microsoft led to the incorporation of R (and later Python) into the SQL Server platform.

Other solutions exist to support scaling ML, of course: distributed cloud computing, specialized chipsets such as FPGAs, use of GPUs, and very large VMs customized specifically to support data science. However, the SQL Server platform has the advantage of being both ubiquitous and accessible by most developers, and it offers a well-tested security model.

Here are some challenges of scaling data science, and solutions in the SQL Server Machine Learning platform:

Scaling up is rarely a linear process. This applies to cooking as well as ML. A recipe is not a formula, and a model that runs on your laptop will not magically scale to millions of rows on the server. The training time could be quadratic in the number of points, depending on the type of model and data. The problem is not just the size of the data, or even the number of input features, that can blow you out of the water. Even algorithms widely known for speed and tractability with large datasets can include features that greatly increase the number of computations, and thus the time your model churns in memory.

There are different ways to address computation complexity. Is the model big because it uses a lot of data, or because it's complex, with many columns and different types of features? In the first case, SQL Server Machine Learning might be the solution because it supports algorithm optimizations and parallel processing that allow distributed processing. In the second case, SQL Server and Machine Learning Server offer ways to chunk and stream data, to process far more data than is possible with native R or Python. You might also collaborate with your DBA and ETL folks to ensure that the data is available for model training and that the workload can be scheduled.

Refactoring processes takes time but saves time in the long run. Functions such as data cleansing or feature engineering that were run in-line as part of the model-building process now might need to be offloaded to an ETL process. Reporting and exploration are other areas where the typical workflow of the data scientist might need drastic change. For example, rather than display a chart in your Python client, push results to a table so that they can be presented in interactive reports or consuming applications.

Scoring (prediction) is a business priority. Scoring is the process of doing work with a model, of serving up predictions to users. Optimizing this process can be either a "nice to have" or a showstopper for your ML project. For example, real-time constraints on recommendation systems mean that you must provide a user with a list of "more items like this" within a second or they leave the page. Retail stores must obtain a credit score for a new customer instantly or risk losing the customer.

For this reason, SQL Server Machine Learning has placed great emphasis on scoring optimization. Models might be retrained infrequently but called often, using millions of rows of input data. Several features are provided to optimize scoring:

• Parallel processing from SQL Server
• Native scoring, in which the model "rules" are written to C, so that predictions can be generated without ever loading R or Python. Native scoring from a stored model is extremely fast and generally can make maximum use of SQL Server parallelism. Native scoring can also run in Azure SQLDB. (There are some restrictions on the type of models that support native scoring.)
• Distributed scoring, in which a model is saved to a table in another server that can run predictions with or without R/Python

Parameter Help
If you find any part of the parameters mysterious, I highly recommend the series by SQL Server MVP Niels Berglund on the mechanics of sp_execute_external_script: https://nielsberglund.com/2018/03/07/microsoft-sql-server-r-services---sp_execute_external_script---i/

Data Science for the DBA
Want to know the secret cook in the data science kitchen? It's your DBA. They don't just slave away on your behalf, optimizing queries and preventing resource contention. They also do their own analyses, proactively finding and fending off problems and even intrusions.

To do all this, developers often have code or tools of their own running on a local SQL Server instance. However, rather than install some other tool on the server and pass data through the SQL Server security boundary, doesn't it make more sense to use the Python or R capabilities provided by Machine Learning Services? Simply by enabling the external script execution feature, the DBA gains the ability to push event logs to Python inside the SQL Server security boundary and do some cool analytics. Like, for example:

lm(y ~ x1 + x2)

If you like a challenge, you could always implement it entirely in T-SQL. But using R or Python sure is a shortcut! To see other examples of what you can do with a few lines of R, look up "R one-liners."

For some additional ideas of how a DBA might have fun with R, I recommend this book by a long-time SQL MVP and a Microsoft PM: SQL Server 2017 Machine Learning with R: Data Exploration, Modeling and Advanced Analytics by Tomaž Kaštrun and Julie Koesmarno. The book is a departure from the usual "data science"-centered discussions of Python and R and is written with the database pro in mind. It includes multiple scenarios where R is applied to typical DBA tasks.

Conclusion
My goal was to demonstrate that running Python (or R) in SQL Server is a fun, flexible, and extensible way to do machine learning. Moving from the kitchen to the factory is a true paradigm shift that requires coordination as well as innovation, and flexibility in the choice of tools and processes. Here's how the new process-oriented, scalable, commercial data science kitchen works:

• Your data scientist contributes code that has been developed and tested in Python, then optimized by the new options in revoscalepy.
• Your DBA brings to the table the ability to keep data optimized through the model training and scoring processes, and guarantees the security of your product.
• Your data architect is busy cooking up new ideas for using R and Python in ETL and reporting.

Stone soup? Sure, the combination of ingredients, SQL Server plus an open source language, might seem like an odd one, but in fact they complement each other well, and the results improve with the contribution of each tool, cook, or bit of data.

Jeannine Takaki-Nelson
ONLINE QUICK ID 1911081
Emotional Code
I’ve been paid to program since 1979 and for most of that time, I’ve been working with other people’s code. At first it was “add
this little feature to something we already have.” These days, it’s “how can we be better” and “is this code worth keeping?” Reading
code has always been a huge part of my job, and so I care a lot about the kind of code I (and the people I work with) write.
Kate Gregory
kate@gregcons.com
www.gregcons.com/kateblog
@gregcons

Kate Gregory has been using C++ since before Microsoft had a C++ compiler, and has been paid to program since 1979. She loves C++ and believes that software should make our lives easier. That includes making the lives of developers easier! She'll stay up late arguing about deterministic destruction or how modern C++ is not the C++ you remember.

Kate runs a small consulting firm in rural Ontario and provides mentoring and management consulting services, as well as writing code every week. She has spoken all over the world, written over a dozen books, and helped thousands of developers to be better at what they do. Kate is a Visual C++ MVP, an Imagine Cup judge and mentor, and an active contributor to StackOverflow and other StackExchange sites. She develops courses for Pluralsight, primarily on C++ and Visual Studio. Since its founding in 2014, she has served on the Planning and Program committees for CppCon, the largest C++ conference ever held, where she also delivers sessions.

Of course, I want it to be fast; I'm a C++ programmer after all. It also needs to be correct, yes. But there's more to it than those two things: I want code that's readable, understandable, sensible, and even pleasant.

I've put a lot of work into looking at code and seeing how it could be better. Often, I recommend making it better by using things (language keywords, library functionality, etc.) that we've added to C++ in this century or even this decade. I show people how to write code that does the same thing but is clearer, shorter, more transparent, or better encapsulated. Until recently, I didn't spend a lot of time thinking about why people wrote that code the way they did, even when there were things they could have done at the time that were clearly better. I just made the code better. In this article, I want to talk about that why factor, and about the humans who write the code you read and maintain.

What Are Emotions?
Because I mention Emotions in the title, it's probably a good idea to discuss what they are. I think of emotions as out-of-band interrupts. Emotions deliver a conclusion without all the supporting evidence being clearly listed. For instance, you're walking down the street, or negotiating to buy a car, or on a date, and suddenly your brain tells you something like:

• Get out!
• Trust this person.
• Smile, relax, everything's fine.
• Fight, yell, hit, and scream!

These reactions may be right (you may be talking to a wonderful person you can trust) or wrong (the sales rep may be trying to flatter you into paying too much for the car). But the point is that at the moment the message arrives, you don't have a nice clear list of reasons to feel that way. Your brain has done some pattern matching and delivered a conclusion. You can act on it or not.

Some people don't like it when other people take actions based on emotions. If you become angry and leave a situation, you may not be able to explain the precise details that led to your conclusion. You may not be able to prove that leaving was right. I've been told that relying on emotional reactions to make decisions is lazy, non-rigorous, or cutting corners.

No Emotions Allowed
People who feel that emotional reactions are inappropriate are especially common in the field of software development. They ban the disruptive out-of-band interrupts that emotions are. They insist that you must win arguments with logic and not with feeling strongly about things. Just the pure crystalline logic of the 1s and 0s of the matrix. A lot of people really feel this way and try to suppress emotions in themselves and in others.

To a certain extent, they have a point. If we're arguing about how many parameters a function takes, I don't want to hear that you feel five is the right number: "It just makes me happy seeing it like that." I want you to persuade me with logic. But for many cases, the quick overview conclusion delivered by emotions is super useful, not for winning arguments, but for doing my work. I look over an API, a whole collection of functions and all their parameters, and something in me says EEEWWWW. I don't know exactly what is so yucky, but it draws me in there to give a closer look and to see what I rationally think about that part of the code. I'm not going to say, "pay me to re-architect your whole system because it feels kind of gross and wrong," but that first emotional response had great value in bringing my attention to a place that needed it. I've learned to value those signals a great deal.

But not everyone does. When I tell them "I don't know what specifically I dislike about this API right now; at first glance, my gut has a problem with it and I know it needs to be looked at so I'll write you a summary this afternoon," they may reject that because there's nothing to argue with; they have to just trust me. They may reject it because they think I'm being weird or emotional instead of using my experience. (This is odd, because emotional and intuitive reactions to situations are how your experience generally shows itself to you.) Perhaps they may just be in the habit of telling other people not to feel emotions: not to get really happy about things, or really upset either. We have a whole strain of humor about this: "Oh, the humanity" and other memes where people are mocked for being upset over relatively small things or for being happy.

Mocking people for their "first-world problems" or replying "oh, the humanity" to them when they're getting worked up is our way of enforcing a social norm within the programming community, especially programming communities that have roots in the 20th century rather than the 21st. The social norm says "don't express emotions to me," and ideally, don't even have emotions. But here's the thing: Programmers are human beings and human beings have emotions. Therefore, whether you like it or not, programmers have emotions.

In reality, emotions are a big part of software development. It's a lot more than writing and debugging code, and the not-code parts of it are FULL of emotion: getting users to tell you
This developer hasn’t decided how to calculate the limit, It’s easy to say that training people and doing code reviews
isn’t prepared to stand behind that calculation, is telling will teach them that delete on a null pointer is a no-op. But
you “hey, if you have any issues with this line of code, don’t what if the developer is afraid their coworkers will let them
talk to me, go and find Bill, it was all Bill’s idea.” And it’s not down? They check the same conditions repeatedly because
just that they aren’t confident how to calculate the limit, it’s they can’t be sure the conditions are always met. “That was
also that they’re worried someone is going to object to this in a team-mate’s code, they might have changed it without
line and they want to defend themselves against hostile re- telling me.” Here’s a place where a thorough test suite, and
views. Somehow the environment has left people unable to running those tests after every change, can improve your
feel confident about even the simplest calculations. runtime performance. If you can confidently write the code
knowing you’re getting valid parameters from the other
Fear is also why people don’t delete variables they’re not us- parts of the application, think how many of those runtime
ing. They leave in lines of code calculating something that’s checks (make sure x is positive, check that y is not over the
never used after it’s calculated. Functions that are never limit, and so on) could be dropped!
Sometimes this sort of “I’m better than anyone else” is Selfish code has short and opaque names because the de-
what drives developers to use raw loops instead of some- veloper didn’t bother thinking of good ones. It uses magic
thing from a library, to write their own containers, and so numbers instead of constants with names. There are side ef-
on. They perhaps ran into a performance problem a decade fects and consequences everywhere, like using public vari-
or two ago in one popular library and concluded that they ables because it’s quicker, or mutable global state because
would always be better than those library writers. Although it’s quicker. Sure, it might be slower next time, but next
“it ain’t bragging if you can do it,” very few developers can time is someone’s else problem, right?
actually outperform those who concentrate on a specific li-
brary all day long. Maybe they measured the performance of Selfishness also leads to information hoarding. My job is
their hand-rolled solution with their unusual data against safe if nobody else can do this. If I explain this to others, I’ll
the library solution, but then again, maybe they didn’t. It’s be less valuable because I’ll be replaceable. (As an outside
worth checking. consultant, my usual reaction on hearing that someone is
irreplaceable is to change that first. It’s not good for teams
and it doesn’t lead to good code in most cases.) This devel-
oper sees their coworkers as competitors and doesn’t want
Malware has more swear words to help them. That’s not how good teams work.
in the code than non-malware.
You’d be surprised (I no longer am) how often I find sneering comments and names in live code. People who say lusers, PEBCAK, and RTFM in emails and Slack say it in their code too. They say it in their commit messages too: April Wensel found a pile of hits for “stupid user” in GitHub commit comments. Obviously, those comments are public (she found them). How do you imagine users would feel discovering a commit message that was nothing more than “Be nicer to the stupid user” when learning about a product they used? And that committer needs to try harder at “be nicer,” by the way, because that comment shows a distinct lack of nice. Steve Miller searches executables for swear words: Malware has more of them than non-malware does.

I’ve also seen function names and variable names that drip with disdain and contempt for the work being done. No, I don’t mean calling a variable “dummy.” I mean things like putting a coworker’s name in a function to show you

Laziness
Not all programmers are selfish, of course. Some of them just can’t be bothered. “Whatever; it works. Mostly. We have testers, right?” They don’t use libraries because they resist learning new things or finding new ways to do things. They’re busy typing code. Or copying and pasting code. “Abstraction? Sounds like work to me!” When you suggest that they add some testing, or build automation or scripts, you’re likely to hear, “If you think that matters, you do it.” They don’t show any kind of commitment to the quality of the code, the user’s experience, deadlines, their own future, the success of the company, or the goals of the team. They just want to come in, type without thinking a lot, and go home, having been paid for the day regardless of what was actually accomplished. And it shows in the code!

It’s not just that they haven’t refactored, haven’t spotted useful abstractions, haven’t given things good names. You see repetition, really long functions, a mishmash of naming conventions—all things that are easy to clean up on a slow day when you don’t want to be thinking about new code.
Sometimes code just strikes you as brilliant. It’s clearly and obviously right, and dramatically easy to grasp. I don’t mean that it’s clever. I mean that it’s obvious. Consider this C#:

var creditScore = application.CreditScore;
switch (creditScore)
{
    case int n when (n <= 50):
        return false;
    case int n when (n > 50):
        return true;
    default:
        return false;
}

This code isn’t wrong. It compiles without any warnings, and it implements the logic that’s needed. But there are three cases (two ranges and a default) and lots of places to make mistakes. Someone might return true in two of these return statements.

Choose to Show Positive Emotions
So sure, your code could show fear, selfishness, laziness, and arrogance. But why not show confidence, generosity, humility, and how hard-working you are? Your code will be easier to read and maintain. You’ll enjoy reading and maintaining it more, and your reputation will improve as other people realize they can understand what you write, that it’s easy to change when life changes, and that it’s generally better code. Even if the code isn’t better, there’s a lot to be gained from writing this way. But it probably will be better.

I want you to care about those who wrote the code you maintain and those who maintain the code you write. When you find crummy code, fix it. Show your confidence. Clean up. Make it right. Name things well. You’re going to show emotions in your code and they might as well be positive ones!

Kate Gregory
Having a message on your site that it only works in the latest version of Chrome doesn’t do anyone any good. When thinking about inclusive design as well as accessibility, consider the user who is at work and doesn’t have permission to install the latest version of Chrome, as well as the disabled user who can’t use Chrome with their assistive technology.

When you’re thinking about which library or framework to use; when you’re considering performance enhancements; when you’re writing a spec doc, designing a mockup, or writing a test script—just put the word accessibility out there. Maybe it’ll force you to reconsider some assumptions. Maybe you’ll realize that you don’t need a library at all, because there’s a native HTML element that does what you need. And you know what? Maybe nothing will change. But sometimes it will. In the future, you’ll consider another aspect of accessibility. Build an accessibility toolbox, just as you would with any skill.

Role
Semantic and interactive HTML elements have a role based on the type of control—button, checkbox, heading, etc. Non-interactive HTML elements can be given a role via ARIA, and the default role of a semantic element can be overridden. Consider this snippet of code:

<button role="heading" aria-level="1">
    My Button?
</button>

Here, I’ve overridden the role of a semantic element (in this case a button), so the accessibility tree believes it’s a Level 1 heading. You can see why it would be better to have no ARIA at all, rather than misleading and just plain wrong ARIA.
Relationship
ARIA can be used to describe the relationship between two elements. This is similar to setting the for attribute on a label, which links the label and the control together, as far as the browser is concerned.

If you think about the tab example again, you can use ARIA to link the first element in a list (i.e., the “tab”) to the appropriate div. This tells the browser—and the user, ultimately—that clicking on a specific tab activates or displays a specific div.

Another use for ARIA relationships would be for a description of a chart’s data. In this case, you’d display your chart, and then have a visually hidden div or span linked via the aria-describedby attribute. When a screen reader user reaches the chart, the contents of the div will be read out because the two elements have been linked together in the accessibility tree.

<canvas aria-describedby="chartDesc"></canvas>
<div id="chartDesc" style="display:none;">
    This is a description of the data
    in the chart.
</div>

Maybe you’re thinking that you don’t have time for accessibility. However, using properly semantic HTML is much quicker than using custom elements, so you’ll likely be able to gain some time that way. Then think about all the times you’ve given an estimate for finishing a task and completed it quicker than expected. Now you have an extra half hour or more to dedicate to accessibility, and tasks like checking the contrast ratio, adding alt text to images, and verifying keyboard navigation take very little time.

Don’t forget about your internal tools and systems: Just because you’re not selling something doesn’t mean that accessibility doesn’t matter. Could you hire a visually impaired developer tomorrow and have them use the same tools as your current team?

I’ll also ask you to think about physical accessibility, which is still a problem and concern for many disabled people. When you go to meetups or conventions, ask if they’re wheelchair accessible (and make sure they’re truly accessible, not “oh, there’s just one small step, I’m sure it’s fine”). Ask if they have accessible and gender-neutral washrooms. Ask if there are accommodations for visually impaired, blind, or deaf users. If they’re serving food, ask if there’s a process for handling allergies and sensitivities.
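The tab-to-panel relationship described above can be sketched the same way. This is an illustrative fragment, not code from the article; the element IDs and labels are hypothetical, and it uses the standard aria-controls/aria-labelledby pairing:

```html
<!-- Hypothetical tab markup: aria-controls links each tab to the panel
     it toggles, and aria-labelledby links the panel back to its tab. -->
<div role="tablist" aria-label="Settings">
  <button role="tab" id="tab-general" aria-controls="panel-general" aria-selected="true">
    General
  </button>
</div>
<div role="tabpanel" id="panel-general" aria-labelledby="tab-general">
  General settings go here.
</div>
```

With these attributes in place, the accessibility tree knows which panel each tab activates, so a screen reader can announce the connection rather than leaving the user to infer it.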
many of us easily fixate ourselves on the potential capabilities for creating queries, developing custom programming code, or creating extensive calculations on the back-end behind the scenes. However, the majority of business users use these data visualization platforms to dynamically interact with this top layer of the application: the dashboard. To develop these user-friendly dashboards, we need to think more like designers instead of programmers.

Helen Wall
www.helendatadesign.com

Helen Wall is a power user of Microsoft Power BI, Excel, and Tableau. The primary driver behind working in these tools is finding the point where data analytics meets design principles, thus making data visualization platforms both an art and a science. She considers herself both a lifelong teacher and learner. She is a LinkedIn Learning instructor for Power BI courses that focus on all aspects of using the application, including data methods, dashboard design, and programming in DAX and M formula language. Her work background includes an array of industries and numerous functional groups, including actuarial, financial reporting, forecasting, IT, and management consulting. She has a double bachelor’s degree from the University of Washington, where she studied math and economics, and was also a Division I varsity rower. On a note about brushing with history, the real-life characters from the book The Boys in the Boat were also Husky rowers who came before her. She also has a master’s degree in financial management from Durham University (in the United Kingdom).

The Importance of Dashboard Design
The best-designed and implemented dashboards appear effortless; we like them more, yet can’t quite explain exactly why. Their likeability comes not by accident, but by mindfully following design principles implemented with the end viewer in mind. Good design isn’t an accidental result, but rather a strategic decision to make the small design choices that have a huge impact on the end result.

Nudging
In order to maximize the likelihood that users make your dashboard part of their everyday processes, you need to design dashboards that guide users through not only key visuals and figures, but also how they collectively interact with each other. A dashboard can check all of the user’s requirements, yet still not yield much value to the user because there’s a dissonance between what they tell you they would like and how their actual behavior responds to the dashboard. How do you bridge this gap? Let’s put this discussion in the context of the nudging proposition.

Nudging is defined as tiny prompts that alter our behavior—specifically social behavior. Richard Thaler extensively examined nudge theory in his book Nudge. Examples of nudging techniques include:

• Encouraging recycling by placing a larger recycling bin in a more prominent location than a smaller garbage bin in many cities and businesses.
• Sending out electricity bills that compare usage to that of neighboring housing units to encourage users to limit their electricity usage.
• Charging even the nominal amount of a few cents for single-use plastic bags to encourage people to bring their own reusable bags when shopping, or not use one at all.

Much like Ikea furniture comes in a box with pre-cut pieces and instructions for assembly, you want to present your dashboards to the user in a similar manner. Think of the instruction manual as the nudging component of the product. This translates to techniques in designing data visualizations (which I’ll discuss later) such as:

• Choosing visuals to convey strategic points
• Positioning and the number of charts/visuals
• Instruction prompts
• Colors to point out key trends

Users want you to do the analysis of the components beforehand, but they want to interact with the data themselves using the instructions you provide. If the pieces don’t fit together, or if the instructions don’t make sense, building a successful final product becomes much more difficult.

The Starting Dashboard
I chose Tableau for this example because I think it enables a focus on making changes with best design practices in mind, given that the tool focuses on the visuals themselves. It also works on Apple products at the time of publication of this article, and Microsoft Power BI, unfortunately, does not. You can use different versions of Tableau, but for the purposes of this article, I’ll use Tableau Public Desktop, which is free to download. You can download the starting file from Tableau Public Online to follow along with the changes I make in this article, and then save it by uploading it to your own Tableau Public Online account.

I obtained the infant mortality data from the impressive data section of the Gapminder website developed by the late Hans Rosling. The data set contains the infant mortality rates by country and by year from 1800 until 2015. Note that there are incomplete data sets because some countries may not have data (or at least useable data) for all of these years. I categorized each country into its own region of the world, as you can see in the region mapping key file in Figure 1.

The Gapminder site defines infant mortality as the number of deaths in the first two years of life for every 1000 live births. Although you may think this project is taking a morbid direction, you’ll see that the Tableau dashboard helps communicate a much more positive outcome. A decrease in these rates means that survival rates are increasing and global health outcomes are improving.

To walk through the steps of applying best visualization design practices, let’s begin with a less-than-optimal Tableau dashboard I created, as seen in Figure 1. I can make strategic design and formatting changes that transform it into a more effective dashboard built with the end user in mind.

You will need several components to try this out on your own, including the Tableau Public dashboard links below and the two Excel files and the PNG image that are available on the CODE Magazine page associated with this article:
• Starting Tableau dashboard: https://public.tableau.com/views/VisualizationBestPracticesstartingfile/Dashboard1?:embed=y&:display_count=yes&:origin=viz_share_link
• The Excel file from Gapminder data with the infant mortality rates
• The Excel file for country to region mapping
• The Gapminder logo
• Ending Tableau dashboard: https://public.tableau.com/shared/8TZ3K2TWW?:display_count=yes&:origin=viz_share_link

You need to update the Tableau file to point to the Excel files on your own computer:

1. Download the Tableau Desktop Public application (free version), or use Tableau Desktop if you already have it.
2. Download both Excel files and the Gapminder logo to a folder in your own desktop or documents folder.
3. Open the Tableau link for the starting dashboard with your own Tableau application.
4. Go into the Data Source tab of the Tableau file, click on each of the connections for the rates and region key, and set the folder connection to the path on your own desktop.

After you update the sources, the rest of the visualization will update as well, and you can begin the transformation process.

The “Ikea Effect”
Looking at Figure 1, can you tell at first glance what the initial dashboard is analyzing? Inconspicuous legends and axis labels serve as the only indications that you’re studying infant mortality rates. The user shouldn’t have to guess at what you’re trying to do.

Building successful data visualizations involves striking a balance between giving dynamic options to the viewer and your own design and analysis process. When viewers feel like they do most of the work by interacting with the dashboard and they enjoy the process, you’re telling them that you value their ability to analyze the data trends and take ownership in this process. Psychologists define this as the “Ikea effect” (https://www.bbc.com/worklife/article/20190422-how-the-ikea-effect-subtly-influences-how-you-spend), where the customers (in this case dashboard viewers or users) feel they achieve the greatest value for their investment. This is the Holy Grail for many businesses.

On the flip side, you need to do a lot of work on your end to get the users to feel this empowerment and ownership in the process. Making the process easy for the user involves putting yourself into their thought patterns to analyze the unknowns and numbers before they even see them in the dashboard. These areas for you to analyze beforehand for them include:

• The meaning and magnitude of data set numbers
• The relationship between data points and data fields
• Optimal ways to see the data in visuals and charts

How do you want to measure the infant mortality rate? Does a higher rate indicate a better or worse metric? You need to establish that you want to see lower infant mortality numbers because this means that more babies are surviving out of infancy, which also indicates improving public health outcomes. You can’t assume that the reader already knows this, and you need to explicitly say what the numbers mean in the context of the bigger picture.

Furthermore, you also need to indicate how you’re aggregating these infant mortality numbers. Each point in the data source represents the infant mortality for a given year and country. If you wanted to determine the global infant mortality rate for 1990, for example, you need to analyze the data points for all of the countries that year. It doesn’t make any sense to sum them together because they represent rates and not absolute values. It makes more sense to average all of these data points to get the global mortality rate for the years and countries we want to see as an aggregated number.
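To make the sum-versus-average point concrete, here is a small Python sketch. The numbers are invented for illustration; the real figures come from the Gapminder file:

```python
# Each row: (country, year, infant deaths in the first two years
# per 1000 live births). Hypothetical values, for illustration only.
rows = [
    ("Country A", 1990, 10.0),
    ("Country B", 1990, 20.0),
    ("Country C", 1990, 30.0),
    ("Country A", 1991, 9.0),
]

# A global rate for 1990 is the AVERAGE of the per-country rates for
# that year. Summing them (10 + 20 + 30 = 60) would be meaningless,
# because these are rates, not absolute counts of deaths.
rates_1990 = [rate for (_country, year, rate) in rows if year == 1990]
global_rate_1990 = sum(rates_1990) / len(rates_1990)
print(global_rate_1990)  # 20.0
```

The same reasoning applies to any per-region or per-decade rollup of the rates: average within the group, never sum.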
If you buy furniture from Ikea, you assemble it yourself from pre-cut pieces that come in a box that designers planned out and tested ahead of time. Similarly, in dashboards, you want to analyze the data and plan out the dashboards before passing it off to the user to interact with in a pre-packaged box in the form of a dashboard. If you don’t include a necessary piece, or if the sizing doesn’t work, neither you nor the viewer gets the desired finished product or result.

Good Design Isn’t an Accident
How do you approach designing your best possible version of the dashboard? What do you consider for the job at hand? You want to analyze the data initially to create components for the dashboard that fit together with each other. Much like an Ikea furniture pack, you design and test the pieces to make sure they fit together beforehand.

When you like the way something looks, you can’t always quite explain why. The “why” comes through your decision to strategically apply design principles to it. Designing an effective dashboard that users embrace interacting with is not an accident, but rather a well-planned approach that keeps the user in mind.

Choose the Right Visuals for the Job at Hand
The first step in this process is selecting visuals that represent the data correctly, and also effectively communicate the results and trends in the data. There’s no one chart that works for all data and no one data set that works for all charts. I encourage you to experiment with charts within the data visualization application to compare how they represent the data and what visual works best for your intended result. Effective chart options include:

• Bar charts
• Line charts
• Scatter plots
• Box and whisker plots
• KPI metrics
• Heat maps
• Highlight tables

You can see the infant mortality rates by region represented as a pie chart in the upper left-hand corner of Figure 1. This chart presents two big issues when you view it:

• You can’t easily distinguish between the slices of the pie because you have to guess the angles rather than actually measuring the numbers.
• The chart represents the infant mortality rate as a sum of the rates, which can mislead the audience because regions with more data points and complete data will have a bigger slice, even if they have lower infant mortality rates.

The bar chart (Figure 2) does a much more effective job of showing the average infant mortality rate by region (changed from the sum aggregation in the pie chart), and you can easily rank and compare these rates between regions directly within the chart.

To change a pie chart into a horizontal bar chart:

1. Move the “Region” dimension to rows.
2. Change the chart from a pie chart to a horizontal bar chart in the Show Me options menu.
3. Change the aggregation of the infant mortality rate from Sum to Average.

You also lose the time dimension this way, because you’re measuring the average infant mortality rate for all years for countries within each region in Figure 2 rather than in a certain year.

Because you want to measure a time value, you can use a line graph or a bar graph. What if you took this infant mortality rate and showed it by year as a stacked bar chart?
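The second bullet above—that summing rates rewards regions with more data points—can be sketched in a few lines of Python. The numbers are hypothetical, chosen only to show the effect:

```python
# "North" has many (country, year) data points with LOW rates;
# "South" has few data points with HIGH rates. Values are made up.
region_rates = {
    "North": [10.0] * 12,   # sum = 120, average = 10
    "South": [50.0] * 2,    # sum = 100, average = 50
}

sums = {region: sum(r) for region, r in region_rates.items()}
averages = {region: sum(r) / len(r) for region, r in region_rates.items()}

# A pie of sums gives North the bigger slice (120 > 100), even though
# South's infant mortality rate is five times worse on average.
print(sums, averages)
```

Averaging flips the comparison to the one that actually matters, which is exactly why the bar chart switches the aggregation from Sum to Average.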
You can’t stack up the region bars in Figure 3 because you want to average rather than sum the rates. Showing the average infant mortality rates by region as a line chart mitigates the size and readability issues that you encounter in this scenario with bar charts.

The line chart in Figure 4 allows you to easily rank the regions for each year, and you can see the infant mortality rate trends by region because each point in the graph joins to the point for the next year and the previous year, and so on. More importantly, you can also easily see that the rates trend downward across all regions, which means that global health outlooks are improving, even if you continue to see disparity among the regions.

To change from a bubble chart (Figure 1) into a shaded filled chart (Figure 5):

1. Select the Show Me option menu and pick the maps chart icon.
2. Change the aggregation from Sum to Average for the Color Marks card option.

Now I’m going to tackle the two charts you saw in Figure 1 that show the infant mortality rates by country as a bar chart and the average infant mortality rate by year as a data table, by combining them into a single visual rather than two, which makes it easier to read.

Selecting an aggregation option
allows you to analyze trends in the data set.
The visual shows both the rate as a text value and a color, and you can see that the color effectively illustrates an improved infant survival rate in recent years for all countries in this view.

Now change it into a single table (Figure 6) with rows and columns combined with the data in the chart, and then into a highlight table (Figure 7):

1. Move the Year dimension from rows to columns.
2. Add the Country dimension to the rows, so that you now have a data table with years in the column labels and countries in the row labels.
3. In the values area (or the Text Marks card), you have an average of the infant mortality rate. It doesn’t matter if you use sum or average here, because for each row and column coordinate in the data table you only have a single corresponding value, but it makes the most sense to select the average aggregation to line up with the other visuals. If you inspected the Excel file, you may remember that this is what the data table looks like.
4. To convert to a highlight table, select the highlight table icon from the Show Me menu. If it switches the rows and columns when you convert the visual type, just move them back into the correct positions.

To sort the country order:

1. Go to the last column, where the 2010 through 2015 years aggregate, and click on the header. You’ll see the sorting icon that looks like a little bar chart appear.
2. Click on the little horizontal bar chart icon once, where you see the highest rate for Angola, then click on it again to see it listed from the lowest infant mortality rate.

Aggregating the years into ten-year segments has two advantages:

• If a country has gaps in a time range, averaging out the rates within a ten-year segment allows you to smooth out those inconsistencies.
• It also makes it possible to see the entire time range in a single view, as you can see in Figure 8. I chose to use ten years because it allows me to have enough aggregated rates within a reasonable time range without missing too many data points.
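The ten-year aggregation amounts to mapping each year to its decade and averaging within the bin. A rough Python equivalent of that idea (the sample points are hypothetical, including a deliberate gap at 1992):

```python
def decade_bin(year):
    """Map a year to the start of its ten-year bin, e.g., 1994 -> 1990."""
    return year - (year % 10)

# Hypothetical (year, rate) points with a missing year; binning
# smooths over the gap by averaging whatever points the decade has.
points = [(1990, 12.0), (1991, 10.0), (1993, 8.0)]

bins = {}
for year, rate in points:
    bins.setdefault(decade_bin(year), []).append(rate)

binned_average = {decade: sum(r) / len(r) for decade, r in bins.items()}
print(binned_average)  # {1990: 10.0}
```

This is the same smoothing the highlight table gets from the year bins: incomplete countries still produce one comparable number per decade.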
To create bins for the years and update the highlight table (Figure 8):

1. Add a newly calculated field for the year and enter the formula Year (YEAR) = YEAR([Year]) (see Figure 9).

Make the Labels Easy to Read
If you cut off the label names in a visual, do you expect the viewer to fill in the missing letters and guess the name? In Figure 11, you can’t see the entire country name for the healthiest country. Even if you know to add an “in” to complete “Liechtenstein,” that may not be an option if you have more than two hundred or so options (country names) to guess the finished name outcome, which Figure 12 will spare you the pain of having to do.

Figure 11: A table sorting by lowest mortality rate in 2010 to 2015 to highest rate

To expand the size of the country column:

1. Hover over the border between the country name and the values section until a double arrow appears.
2. Drag the arrow until the column width expands to comfortably fit the country names in the immediate view, as you see in Figure 12.

You can also wrap the text fields, but I wouldn’t recommend that approach for the highlight table because it increases the height of these wrapped fields, throws off the sizing for the entire visual, and can make it more difficult to read.

Use Filters Within Visuals
The line chart you created in Figure 4 looks like colored spaghetti lines over a two-hundred-year time frame, which can make it slightly difficult to analyze. Because incomplete data drives much of this fluctuation, if you want to use this line chart visual as an effective analysis tool, it seems sensible to filter down the chart to only show the trends from 1950 onward, as you see in Figure 13.

To adjust the line chart:

1. Go to Sheet 1 and add Years to the Filters Marks card.
2. Select Years and set the condition as greater than or equal to the 1st of January in 1950, as you see in Figure 14.
3. Filter out the null region from the table (remember, these are countries that you don’t have a matching region for because they are so small) by dragging the Region dimension to the Filter card.
4. Also filter out nulls by excluding them from the data by clicking on the null values at the bottom of the chart and selecting Filter data.

You also add the year filter to the filled map chart, as you can see in Figure 15, with the option to select multiple or all years within the view rather than showing the average infant mortality rate across all available years. This allows the viewer to create a custom view of the map based on their selected year, and dynamically update the colors on the map that represent the infant mortality rates averaged across the selected year or years.

To add the Years filter:

1. Go into the map worksheet and put a filter on the filters shelf.
2. Select all the years, and then select to show the filter.
3. Set this up as a drop-down list. It has many benefits, including taking up less space; I’d recommend it because the single list takes up a great deal of space and doesn’t do much for you.

The drop-down list is shown in Figure 16. I’ll revisit the filter options in much more detail later when you set up the dashboard.

Applying Color Effectively
In Tableau and many other data visualization platforms, the application automatically assigns a color palette to the chart. However, using the default option may not present the best color scheme. To leverage color effectively, you want to limit the color scheme you use (as contradictory as that sounds), because strategically applying just a few colors not only makes the visuals easier to read, but also allows users to focus on key trends and numbers.

Color-blindness is a visual disability that affects the eyesight abilities of one in ten men and a smaller group of women. You may be color-blind yourself or work with someone who is, or you may not even realize that this impairment is among your
colleagues and peers. If you look at the diagram in Figure 17, you can see how orange and blue navigate around issues with vision impairment pretty easily. Green and red, on the other hand, look the same for those with color-blind impairments. Also, remember that, like many other disabilities, it occurs on a spectrum rather than as an absolute impact.

Accounting for those with color-blindness can serve as a starting point to selecting your own color palette. Tableau has a color palette that you can see in Figure 18, specifically offering ten color options to choose from. You want to give a unique color to each region so that even those who are colorblind can distinguish the regions in the line charts, as seen in Figure 19.

You update the filled map in Figure 15 to use a diverging Orange-Blue color scale with a very light gray serving as the color representing the midpoint. Although blue represents lower infant mortality rates and orange represents higher infant mortality rates, the key part about setting up this color scale is selecting what value to use as the midpoint (Figure 20).

In 2015, the country of Angola experienced the highest infant mortality rate of all the countries, at 96 deaths in the first two years of a baby’s life for every 1000 live births. I decided to set the center or midpoint of the color scale to this rate because it allows the viewer to put in context how historical infant mortality rates across all countries compare to today, and to see that Angola unfortunately still lags behind other countries in population health in today’s world.

To set up the diverging color scale:

1. Go to Sheet 2 and click on the Color Marks card.
2. A dialog box opens up, where you select the Orange-Blue color scheme, and then select to reverse the colors so that blue indicates lower rates and orange indicates higher rates (Figure 20).
3. To change the midpoint that the color scale uses, go to the Advanced options and put a check mark next to the center option, where you can now type in 96 as the value to center the color scale.

As you see in Figure 22, although all the countries in the most recent time frame of 2010 to 2015 have better infant survival rates than Angola (indicated with the blue cell color), when you look back at historical trends for these rates, some of the most developed countries today, like Japan and Singapore, had higher infant mortality rates only fifty years ago than Angola does today. Even developed European countries, like France, Germany, and Austria, also have much lower rates.
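A diverging scale like this one just measures each value’s distance from the chosen midpoint. Here is a simplified sketch of that idea in Python—generic math for illustration, not Tableau’s actual implementation, and the low/high bounds are hypothetical:

```python
def diverging_position(value, low, mid, high):
    """Map value to [-1.0, 1.0], with `mid` (here, Angola's 96) at 0."""
    if value <= mid:
        return (value - mid) / (mid - low)   # negative half: shown in blue
    return (value - mid) / (high - mid)      # positive half: shown in orange

# With low=0 and mid=96, a modern rate of 48 sits halfway into the blue half.
print(diverging_position(48, 0, 96, 200))  # -0.5
```

Centering on 96 means every blue cell can be read as “better than the worst country today,” which is exactly the comparison the highlight table invites.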
the Year of Year from the list, where you now see the filter on the far right.

7. Select this filter container, click on the down arrow, and choose Multiple Values (dropdown). Then click on the down arrow again and select Floating, which means that you can move the filter over the map to select the year; you can drag it to the bottom of the map, where it doesn’t directly sit on top of any countries.

Adding Elements of Analysis
In the line chart in Figure 19, you saw that since 1950, average infant mortality rates across all regions trend down, which indicates an improved population health outlook. Similar to the way you set up the blue-orange color scale for the filled map and highlight tables in Figure 21 and Figure 22, with the highest mortality rate in 2015 as the midpoint in the diverging color scale, you can also use a reference line as another way for the viewer to analyze these rate trends. In 1950, you can see that the healthiest region, Europe, had an infant mortality rate of 58.9, as you see in Figure 27. By setting a constant reference line at this point on the y-axis, you can see that although other regions may lag behind Europe in terms of relative improved health outcomes, you can use this number as a benchmark to show what you could call a time delay in this trend.
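The “time delay” reading of the reference line can be made concrete: for each region, find the first year its average rate drops below Europe’s 1950 value of 58.9. A sketch in Python, with an invented region series for illustration:

```python
EUROPE_1950 = 58.9  # the benchmark rate from Figure 27

def first_year_below(yearly_rates, threshold):
    """Return the first year whose rate is under the threshold, or None."""
    for year in sorted(yearly_rates):
        if yearly_rates[year] < threshold:
            return year
    return None

# Hypothetical region series: crosses the benchmark in 1985,
# i.e., a 35-year delay relative to Europe's 1950 level.
region = {1950: 130.0, 1965: 95.0, 1985: 55.0, 2000: 30.0}
delay_years = first_year_below(region, EUROPE_1950) - 1950
print(delay_years)  # 35
```

That delay, in years, is what the constant reference line lets viewers eyeball directly on the chart.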
9. Make sure that you can see all your visuals. The best way to do this is to make sure to leave enough white space between the legend fields and the visuals, and then upload to Tableau Public Online to make sure it fits as you anticipated. If it doesn't, go back to the dashboard and make adjustments based on what you saw pushed out of place. This may take a little bit of practice!

Now that you've set up the dashboard, you can put yourself in the position of the user and test it out. In Figure 32, you see Asia selected as the region in the line chart legend. This creates a new view of the data that highlights key trends and analysis for the Asia region. You can also select a single year in the map to see the Asian countries' health for that year. Choosing Asia filters all three charts, and you can see in the highlight table the disparity between the rankings within the Asian countries.
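Under the hood, that kind of dashboard action boils down to filtering one table of (region, country, year, rate) records and feeding every chart the same filtered subset. A minimal plain-Python sketch of the idea, with an invented toy record set standing in for the article's data:

```python
# Toy records standing in for the infant-mortality data set:
# (region, country, year, rate). Values are illustrative only.
records = [
    ("Asia",   "Japan",  2015,  2.0),
    ("Asia",   "India",  2015, 37.9),
    ("Europe", "France", 2015,  3.5),
    ("Asia",   "Japan",  1950, 50.0),
    ("Africa", "Angola", 2015, 96.0),
]

def filter_records(records, region=None, year=None):
    """Keep only records matching the selected region and/or year —
    the same narrowing a dashboard filter action applies to every
    chart at once."""
    return [
        r for r in records
        if (region is None or r[0] == region)
        and (year is None or r[2] == year)
    ]

# Selecting Asia and a single year leaves just the Asian
# countries for that year, for all charts simultaneously.
asia_2015 = filter_records(records, region="Asia", year=2015)
print([r[1] for r in asia_2015])  # ['Japan', 'India']
```

Because every chart draws from the same filtered subset, one selection in the legend or map keeps the line chart, filled map, and highlight table in sync.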
You maximize your dashboard's influence by taking the initial Tableau dashboard and strategically making changes to update the design, formatting, and interactivity. This, in turn, increases the user's understanding of and interactions with the dashboard interface, and the likelihood they will use it, by letting them take ownership in changing the views. Designing an effective dashboard gives flexibility.

I encourage you to take these best-practices techniques and use some creativity to set up and experiment with visual options, then see how the users respond and go from there!

Helen Wall
(Continued from 74)

…should consider the kind of example you're setting for your daughter."

And so Maria heads off with Carol and the rest of the gang. That's a clear-cut case of a manager allowing her people to "lead up," to make the big decision. For growth to happen, those at every level need to exert influence on the people above them in the organizational chart, and managers can help them do that by responding favorably to their ideas.

When You Fall Down, Get Back Up

Failure hurts. In Carol's case, that includes crashing a go-kart as a kid, falling off a climbing rope as a young woman, and putting up with another pilot—this one male—in a bar, over a beer, telling her she's a "decent pilot, but [she's] too emotional."

Later on, these scenes re-emerge, but this time we get to see how those scenes play out, with Carol picking herself up after every failure. Sure, it's a montage just like the ones Nike feeds us, but those commercials get a bazillion views because they work. Managers don't give up; they get up.

"Failure, failure, failure, failure," said Staci. "You keep getting up and that will get you closer toward your goal."

…with household chores. And like them, managers should always look for opportunities to do the grunt work, if only to remember how hard and mind-numbing it can sometimes be.

Plus, said Suzanne, an administrative services officer for a county government (and my wife), "Doing tasks with people can sometimes lead to greater strength of relationships for the people you need to have follow you."

Compile Your Playlist

Heck, yeah, few jobs are as hard as managing people. But music can help you, like nothing else, push through the limitations, recommit to the mission, and inspire your team to keep up.

What's on Captain Marvel's playlist? For a few, try Heart's "Crazy on You," No Doubt's "Just A Girl," and Des'ree's "You Gotta Be."

With tunes like that and a little work honing that inspirational stance, Danvers, I'd think about signing onto your team.

Dian Schaffhauser (schaffhauser@gmail.com): Her management days over, these days Dian Schaffhauser prefers to go it alone as a freelance reporter covering business and technology from Northern California.

Nov/Dec 2019, Volume 20, Issue 6
Group Publisher: Markus Egger
Associate Publisher: Rick Strahl
Editor-in-Chief: Rod Paddock
Managing Editor: Ellen Whitney
Content Editor: Melanie Spiller
Editorial Contributors: Otto Dobretsberger, Jim Duffy, Jeff Etter, Mike Yeager
Writers In This Issue: Sumeya Block, Sara Chipps, Kate Gregory, Julie Lerman, Ashleigh Lodge, Sahil Malik, Jeannine Takaki-Nelson, Dian Schaffhauser, Craig Shoemaker, Helen Wall
Technical Reviewers: Markus Egger, Rod Paddock
Production: Franz Wimmer, King Laurin GmbH, 39057 St. Michael/Eppan, Italy
Printing: Fry Communications, Inc., 800 West Church Rd., Mechanicsburg, PA 17055
After all, who of us alive will ever forget the outstretched arms and taunting stance the purple-haired Rapinoe displayed every time she scored during the Women's World Cup? (Alas, the movie appeared months before the American win in France.)

Pose aside, however, there was plenty more signaling that "Vers" was somebody worth following. Recently, a small corps of friends gathered in front of my flatscreen to rewatch the movie (they'd all seen it in the theater), finish off a couple of bottles of California vino and share what struck them about Captain Marvel's management qualities, viewed from their own perspectives as managers.

Warning! Yes, the rest of this column contains a mighty collection of what some might consider spoilers, but that I prefer to call "preparatory notes." I enjoyed my second viewing of Captain Marvel more than my first, not only because I knew what was coming but also because I understood things better. (The wine helped too.) No need to thank me.

It's Not Always Bad to Be Driven by Your Emotions

During a sparring match, when mentor Yon-Rogg has our hero Carol pinned to the floor and her fists begin to sizzle orange in frustration, he tells her, "There is nothing more dangerous to a warrior than emotion." (Oh, how often have we heard, "Don't let your emotions get the best of you"?) Yet leaders know that the passion behind the emotion can drive them and their staff to keep going when things are looking down. On top of that, having a grasp of emotional intelligence—being able to understand your own emotions and influence the emotions of those around you—will get you much further than sheer technical skill. Understanding how people work and what motivates them emotionally is critically important to pulling them in to help you achieve your goals and is far more effective than sending out yet another directive-by-Slack.

Stop Apologizing

Carol has just spent the last five minutes chasing down a continually shape-shifting Skrull through a moving Metro Rail train. Imagine trying to hunt down a prey that can take the form of anybody around you. How do you identify the enemy? But because she's our hero, she has an innate ability to pick out the bad guy, which becomes more obvious when he blasts his way through the top of a railcar and they take the fight up on the roof. As 1990s Los Angeles flashes by in the background, Carol gives as good as she gets until they head into a tunnel. Suddenly, she can't see anything until they stop at the station.

She leaps out of the car to join the surge of people disembarking and spies the shapeshifter walking away, still in the image of the man whose form he last took. She grabs him from behind and he turns, ready to receive the blow. But one look tells her this is the authentic person, not the form stolen by the shapeshifter. Her fist drops and she moves on—with no apology.

Later on, after she's insulted Tom, the guy who lives next door to her BFF Maria, the same thing happens.

As my friend Donna, a college instructor, pointed out, "Look at that! She didn't apologize. Women apologize far too freakin' much."

Sure, a good manager is capable of saying, "I'm sorry," but not every time she makes a blunder. So when does an apology pass Carol's lips? Only when she finds out just why Skrull General Talos has been trying to capture her. And then it's an authentic apology, born of self-awareness. When a manager uses "sorry" too much, it loses impact and can be perceived as weakness.

Jump on the Problem

Carol and S.H.I.E.L.D. agent Fury are in a hidden government mountain bunker hunting down information about the mysterious Dr. Lawson, whom Carol believes holds the key to stopping the Skrulls from taking over the universe. When the pair tell officials why they're there, they're locked into a nameless office. After a half-hearted attempt to get free, Fury pulls out his "state-of-the-art two-way pager" and sends a furtive message to his work partner, Agent Coulson: "Detained with target. Need backup."

Carol expresses curiosity about the "communicator" and Fury reassures her that he's only texting his mom.

When they finally do escape, Carol comes under attack again from what we believe at first to be a S.H.I.E.L.D. team; Fury has led them right to her. He quickly realizes, however, that appearances can be deceptive and rejoins Carol in her attempt to leave the bunker, this time via fighter jet. They barricade themselves on the bunker's flight deck, and moments from certain doom, Carol holds her hand out. She wants Fury to give her the communicator. Now. As she tells him, "You obviously can't be trusted with it."

"She jumps right on it," observes my friend Staci, firefighter and forest aviation officer. "She doesn't let it fester." She takes care of what she views as a problem immediately. Good managers don't avoid conflict.

Likewise, they don't hold grudges. That takes too much energy.

Let Your People Lead Up

When Carol's friend Maria is invited to join the mission as a co-pilot to track down Dr. Lawson's ship in a jerry-rigged plane, she begs off. As a single mom, she reminds Carol that she can't leave her daughter Monica. "There's no way I'm going, baby," she says. "It's too dangerous."

But Monica won't have any of that. "Testing brand new aerospace tech is dangerous. Didn't you use to do that?" she suggests. Besides, she adds, she'll stay with her grandparents.

Maria turns to Carol, who's listening in on the conversation. "Your plan is to leave the atmosphere in a craft not designed for the journey, and you anticipate hostile encounters with a technologically superior foreign enemy. Correct?"

Carol doesn't say a word; just shrugs. But Monica speaks up: "That's what I'm saying. You have to go." Besides, she adds, "I just think that you …

(Continued on page 73)