DOI 10.1007/s12599-014-0344-2
The Authors

Stefan Debortoli, M.Sc.
Dr. Oliver Müller
Prof. Dr. Jan vom Brocke
Institute of Information Systems
University of Liechtenstein
Fürst-Franz-Josef-Strasse
9490 Vaduz
Principality of Liechtenstein
stefan.debortoli@uni.li
oliver.mueller@uni.li
jan.vom.brocke@uni.li
url: http://www.uni.li/iwi

Received: 2013-10-31
Accepted: 2014-05-07
Accepted after two revisions by the editors of the special focus.

This article is also available in German in print and via http://www.wirtschaftsinformatik.de: S Debortoli, O Müller, J vom Brocke (2014) Vergleich von Kompetenzanforderungen an Business-Intelligence- und Big-Data-Spezialisten. Eine Text-Mining-Studie auf Basis von Stellenausschreibungen. WIRTSCHAFTSINFORMATIK. doi: 10.1007/s11576-014-0432-4.

© Springer Fachmedien Wiesbaden 2014

1 Introduction

Big data and big data analytics are among today’s most frequently discussed topics in research and practice (Buhl et al. 2013). In loose terms, big data refers to data sets that are too large and complex to be processed using traditional storage (e.g., relational database management systems) and analysis technologies (e.g., packaged software for statistical analysis). More specifically, researchers and practitioners use the term “big data” to refer to the ongoing expansion of data in terms of volume, variety, velocity (Laney 2001), and veracity (IBM 2012).

Given the current excitement around big data, critical voices question whether big data is “really something new or [. . .] just new wine in old bottles” (Buhl et al. 2013) or postulate that we should “forget big data [because] small data is the real revolution” (Polock 2013). Others, such as Chen et al. (2012) and Golden (2013), argue that big data is not a revolution but an evolution of traditional business intelligence (BI). According to this view, big data analytics widens the scope of BI, which focuses on integrating and reporting structured data residing in company-internal databases, by seeking to extract value from semi-structured and unstructured data originating in data sources like the web, mobile devices, and sensor networks that are external to the company.

Big data offers enormous opportunities for businesses but also poses many challenges (Buhl 2013). A survey of nearly 3000 executives, managers, and analysts from more than 30 industries and 100 countries conducted by MIT Sloan Management Review and the IBM Institute for Business Value finds that top-performing organizations use analytics five times more often than lower performers do (LaValle et al. 2011), yet not all corporate big data initiatives are successful. Research shows that “inadequate staffing and skills are the leading barriers to Big Data Analytics” (Russom 2011), and a study by the McKinsey Global Institute states that “[t]he United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyze big data and make decisions based on their findings” (Manyika et al. 2011, p. 3).

Given these figures, we academics have to ask ourselves to what degree current research agendas and curricula satisfy industry’s growing demand for competence in the areas of big data and analytics. Against this background, the objective of this paper is to clarify the competency requirements of the emerging field of big data (BD) and compare them to the requirements of the established field of BI. In particular, we seek to (1) identify and categorize competency requirements for BD professionals and BI professionals from a practitioner’s point of view and (2) highlight these requirements’ similarities and differences.

The current literature contains only a few contributions on the topic of BI and BD competencies, so we collected and analyzed empirical data from the BI and BD job market. Following the logic of extant studies on information systems competency requirements (e.g., Gallivan et al. 2004; Litecky and Aken 2010; Todd et al. 1995), we used online job advertisements as a data source and performed a quantitative content analysis of 1357 BI-related and 450 BD-related job advertisements using a text-mining technique called latent semantic analysis (LSA).

Our analysis revealed fifteen distinct areas of competency for BI professionals and fifteen distinct areas of competency for BD professionals. On the most abstract level, these areas of competency can be classified into business competencies and IT competencies. The business
competencies can be further sub-divided into management and domain competencies, and the IT competencies can be further sub-divided into methodological, conceptual, and product-specific competencies. Comparing and contrasting the competency requirements for BI and BD professionals shows areas of overlap, especially regarding IT concepts and methods and the business domain, as well as clear differences when it comes to IT competencies. While BI requires skills in the area of commercial software platforms, BD largely relies on software engineering, statistics skills, and open-source products.

Our empirically grounded frameworks of BI and BD competencies contribute to the IS body of knowledge by (1) helping professionals to assess and advance their individual competencies, (2) guiding organizations in composing effective portfolios of BI and BD professionals, and (3) informing the development of academic and professional education programs.

The remainder of this paper is structured as follows. The next section provides research background on the topic of BI and BD competencies. Then we introduce our methodology and explain our data-collection and analysis processes. Next, we present our results and discuss our findings against the background of related work. We close by pointing out the limitations of our work and implications for future research.

2 Research Background

The resource-based view (RBV) of the firm, especially the framework by Melville et al. (2004), can be used to evaluate BI/BD implementations’ generation of business value and to assess which resources and competencies are required and may lead to competitive advantage. In the focal firm, IT business value is generated by the deployment of IT and complementary organizational resources (Melville et al. 2004). However, IT affects organizational performance only via intermediate business processes. Melville et al. (2004) operationalize IT based on Barney’s (1991) classification of firm resources into physical capital (technological IT resources or TIR, i.e., infrastructure and business applications), human capital (human IT resources or HIR, i.e., technical skills and managerial skills), and organizational capital resources (e.g., organizational structures, policies and rules, workplace practices, culture). Section 2.1 elaborates on the technological IT resources associated with BI and BD, Sects. 2.2 and 2.3 discuss required human IT resources, and Sect. 2.4 addresses complementary organizational capital resources.

2.1 Business Intelligence and Big Data

Howard Dresner of the Gartner Group introduced the term “business intelligence” in 1989, describing “a set of concepts and methods to improve business decision making by using fact-based support systems” (Power 2007). The first productive BI systems were implemented at large consumer goods manufacturers like Procter & Gamble and retailers like Wal-Mart for the purpose of analyzing sales data (Power 2007). Although Dresner’s original definition of BI, as well as more recent definitions from analysts like Gartner, Forrester, and TDWI, are broad in scope, most practitioners associate the term with a narrow set of capabilities, such as extraction, transformation, and loading (ETL); data warehousing; on-line analytical processing (OLAP); and reporting (Davenport 2006). The focus of these traditional BI solutions is on analyzing historical data in order to answer questions like “how much did we sell in a certain region?” and “how much profit did we make last quarter?”

At the end of the 1990s, the term “big data” started to appear in the scientific literature, referring to data sets that were too large to fit into main memory or even local disks (Cox and Ellsworth 1997; Forbes 2013). The first publications about big data originated from the field of scientific computing, but in 2001 Doug Laney, an analyst with the Meta Group, transferred the concept to the business domain and coined the term “the 3Vs” to stand for volume, velocity, and variety, which quickly became the constituting dimensions of big data (Laney 2001). After the mid-2000s, fueled by Davenport’s (2006) seminal article “Competing on Analytics,” businesses became increasingly interested in big data, and the focus shifted from technical issues around the storage of big data to its analysis. Internet-based businesses like Google, Amazon, and Facebook were among the first to exploit big data by applying sophisticated data mining and machine learning techniques. What differentiates today’s big data analytics applications from traditional business intelligence applications is not only the breadth and depth of the data processed, but also the types of questions they answer. While BI traditionally focuses on using a consistent set of metrics to measure past business performance (Davenport 2006), big data applications emphasize exploration, discovery, and prediction. As Dhar (2013) states, “Big data makes it feasible for a machine to ask and validate interesting questions humans might not consider.”

2.2 Business Intelligence Competencies

As we found no literature that studies individual BI competencies, we gained an overview of individual BI competency requirements by consulting extant work on BI maturity/capability models, reviews of the BI literature, and panel reports.

Both research and practice have engaged in developing BI maturity/capability models. (For an overview, see, e.g., Russell et al. 2010.) The general purpose of such models is to systematize organizational capabilities and outline pathways for advancing them. Models that originate from industry include the TDWI Business Intelligence Maturity Model (Eckerson 2004), Gartner’s Maturity Model for Business Intelligence and Performance Management (Hostmann and Hagerty 2010), Gartner’s Magic Quadrant for Business Intelligence Platforms (Schlegel et al. 2013), and Logica’s Capability/Maturity Model (Van Roekel et al. 2009). Lahrmann et al. (2011), Dinter (2012), and Cates et al. (2005) provide examples of academic BI maturity models. Industry maturity models tend to focus on technological capabilities that BI platforms should provide (Russell et al. 2010). For example, Gartner lists thirteen essential capabilities, including reporting, OLAP, and visualization (Schlegel et al. 2013). Such functional IT capabilities provide some guidance for assessing and developing individual-level BI competencies but largely neglect the business-related aspects of BI, such as project management and domain skills. By contrast, the academic models provide a high-level view of strategic BI capabilities like architecture planning, IT-business alignment, and generation of business value. While these topics are key to engaging effectively in BI on an organizational level, we believe that they are too abstract to be useful in assessing and developing individual-level BI competencies.
The purpose of literature reviews is to analyze and synthesize the academic body of knowledge, so it is reasonable to expect that reviews can provide insight into competency requirements by, for example, outlining curricula. We identified one review in the area of BI that explicitly comments on aspects of education. Based on market research results from Gartner, Chen et al. (2012) perform a bibliometric study of academic and industry publications on business intelligence and analytics and structure the business intelligence and analytics (BI&A) discipline into three evolutionary waves, namely BI&A 1.0 (database-based, structured content), BI&A 2.0 (web-based, unstructured content), and BI&A 3.0 (mobile and sensor-based content), and five emerging research areas: big data analytics, text analytics, web analytics, network analytics, and mobile analytics. Chen et al. (2012) also outline and map the competency requirements for each of these fields and advocate that higher education should consider these competencies in its curricula. Examples of the competencies Chen et al. (2012) name include relational database management systems (RDBMS), data warehousing, ETL, data mining, statistical analysis, web crawling, recommender systems, social network theories, smartphone platforms, machine learning, process mining, in-memory DBMS, cloud computing, sentiment analysis, and web visualization.

The panel report by Wixom et al. (2011) notes that industry trends raise concerns that “academia may be behind the curve in delivering effective Business Intelligence programs and course offerings to students.” Based on surveys conducted at BI practitioner events, Wixom et al. (2011) formulate four academic BI best practices that would close the gap between BI market needs and the content of IS education programs: (1) provide a broader range of BI skills, (2) take an interdisciplinary approach to BI programs, (3) develop reusable teaching resources, and (4) align with practice. Besides arguing for the need for technical skills, Wixom et al. (2011) argue that a deep understanding of business subjects (e.g., finance, marketing) and strong communication skills are required.

2.3 Big Data Competencies

No scientific literature on the topic of BD competencies has yet been published, although a number of articles and web resources anecdotally describe the profile of BD specialists or similar jobs, such as those of data scientists.

In an influential Harvard Business Review article, Davenport and Patil (2012) describe a data scientist as “a hybrid of data hacker, analyst, communicator, and trusted adviser” (p. 73) and call the job of the data scientist “the sexiest job of the 21st century” (p. 70). Likewise, Hammerbacher, who created the first data science team at Facebook, portrays a data scientist as “a team member [who] could author a multistage processing pipeline in Python, design a hypothesis test, perform a regression analysis over data samples with R, design and implement an algorithm for some data-intensive product or service in Hadoop, or communicate the results of our analyses to other members of the organization” (as cited in Loukides 2012).

These characterizations seem to call for a hybrid of a computer scientist and statistician, yet many more business-related authors state that, in the world of big data, one cannot separate data processing from analysis or from domain knowledge (e.g., Chen et al. 2012; Davenport and Patil 2012; Loukides 2012; Provost and Fawcett 2013; Waller and Fawcett 2013). Hence, BD specialists must have substantial industry knowledge in order to make sense of statistical analyses and communicate effectively with business colleagues.

2.4 Organizational Setup of Business Intelligence and Big Data Teams

The differences between BI and BD also have consequences for how they are organized. Traditionally, BI teams are located in internal consulting organizations, centers of excellence, or IT departments, where they provide managers and executives with reports for their well-defined and stable information needs (Burton et al. 2006; Davenport et al. 2012; Varon 2012). However, since most BD initiatives lack predefined questions and are much more experimental in nature (Casey et al. 2013), BD specialists must be organized so they are close to products and processes in organizations, that is, co-located with business units (Davenport et al. 2012).

3 Methodology

While the literature provides first insights into the topic of BI and BD competencies, it is not grounded in empirical data. Therefore, we study the competencies required of BI and BD professionals by performing an automated content analysis of job ads using a text mining technique called latent semantic analysis (LSA), a quantitative method for analyzing qualitative data. LSA extracts word usage patterns and their meaning through statistical computations (Landauer et al. 1998) based on the idea that the contexts (e.g., documents, paragraphs, sentences) in which a word appears or does not appear largely determine the word’s meaning. LSA is based on the classical vector space model (Salton et al. 1975), in which documents are represented as vectors of terms, and a collection of documents is represented as a term-document matrix that contains the number of times each term appears in each document (Manning et al. 2008). In a fashion similar to exploratory factor analysis, LSA performs a matrix operation called singular value decomposition (SVD) on the term-document matrix in order to reduce its dimensionality. The latent semantic factors that are extracted during this process can be interpreted as topics running through the collection of documents analyzed. LSA has received growing attention in the IS discipline for quantitative content analysis of academic papers (e.g., Larsen et al. 2008; Sidorova et al. 2008), social media posts (e.g., Evangelopoulos and Visinescu 2012), sustainability reports (e.g., Reuter et al. 2014), vendor case studies (e.g., Herbst et al. 2014), and customer feedback (e.g., Coussement and Poel 2008).

A typical LSA comprises three phases. (For a more detailed introduction and numerical examples, see Landauer et al. 1998 and Evangelopoulos et al. 2012.) In the first phase, a collection of documents is transformed into a term-document matrix. This step typically requires pre-processing of documents (e.g., removing irrelevant or duplicate documents) and terms (e.g., uni- and bi-gram tokenization, filtering out uninformative terms, weighting terms according to their relative importance).

In the second phase, the term-document matrix undergoes SVD, which reduces its dimensionality without losing essential information by identifying groups of highly correlated terms (i.e., terms that co-occur in documents) and highly correlated documents (i.e., documents that contain similar terms). The result of the SVD is a set of factors (topics) with associated high-loading terms
Factor ID (% of jobs) | Factor label | High-loading descriptive terms (excerpt) | Titles of high-loading job ads (excerpt)
BI15.01 (9 %)a | Healthcare | care, health, systems, reporting, information, analysis | Business Analyst Regulatory Healthcare, Report Writer Business Analyst, Manager Clinical Decision Support
BI15.02 (8 %) | Sales and Business Development | sales, business development, executive, legal, sales team | Legal Sales Executive, Business Development Manager, Sales Executive Business Intelligence, Sales Manager Business Intelligence
BI15.03 (16 %) | BI Platforms (Microsoft) | sql server, ssis, ssrs, ssas, microsoft, microsoft bi, reporting services, etl | BI Developer SSIS SSAS SSRS SQL, BI Data Warehouse Developer SQL Server, SQL Server Developer, ETL Developer Business Intelligence SSIS SQL SSRS
BI15.04 (5 %) | BI Platforms (SAP) | sap, sap bi, hana, business objects, sap bw, erp, consultant, business analyst, crystal | SAP BI Principal Consultant, SAP BI Senior Technical Consultant, SAP BI Report Analyst Developer, Senior Business Objects Consultant
BI15.05 (6 %) | Digital Marketing | marketing, digital, campaigns, product, analytics, segmentation, customer | Senior Marketing Executive Online Data Solutions Job, Marketing Database Analyst, Email Marketing Manager, Digital Relationship Marketing Manager
BI15.06 (2 %) | Database Administration | dba, database administrator, sql server, oracle, sql, production, developer, tuning | Oracle DBA SQL Server Database Administrator, Senior DBA SQL Server Database Administrator, SQL DBA with BI Business Intelligence, MS-SQL Server DBA
BI15.07 (3 %) | BI Platforms (SAS) | sas, studio, analytics, statistical, mining, olap, data mining, data analytics | SAS BI Analyst, Data Analytics Business Intelligence Consultant, Senior SAS Developer, SAS Consultant
BI15.08 (4 %) | Software Engineering | java, eclipse, apache, web, linux, engineer, software, javascript, developer, big data | Senior Java Consultant, Senior Java Technical Consultant, Mobile Developer Java jQuery HTML5, Front End Engineer, Senior Backend Engineer
BI15.09 (5 %) | BI Architecture | bi developer, etl, developer, bi stack, organization, report | Business Intelligence Developer, ETL Business Intelligence Developer, BI Developer Excel Microsoft BI SQL Server, Senior BI Developer Architect
BI15.10 (5 %) | Project Management | project manager, project, management, head, client, change, agile, planning | Senior Project Manager, Technical Project Manager, BI Project Manager Data Warehouse Implementations, Sr Project Manager Business Intelligence
BI15.11 (3 %) | Web Portals (Microsoft) | sharepoint, .net, server, microsoft, administrator, software, web, application | SharePoint Developer, SharePoint 2007–2010 Developer, SharePoint Administrator SharePoint 2010 Server, SharePoint Consultant, SharePoint Architect
BI15.12 (5 %) | BI Platforms (IBM) | cognos, studio, manager, report, framework, developer, ibm, query, analyst, etl | Cognos BI Developer, Cognos BI Manager, Cognos Designer, MIS Manager with Cognos BI experience, Cognos 10 Consultant Developer
BI15.13 (15 %) | BI Platforms (QlikView, Microstrategy, OBIEE) | qlikview, microstrategy, oracle, obiee, warehouse, etl, architect, consultant | MicroStrategy Business Intelligence Analyst, MicroStrategy Developer, Senior QlikView Developer, BI Visualization Consultant, ETL Specialist
BI15.14 (6 %) | Business Analysis | business analyst, data analyst, reporting, excel, organization, specialist, pivot | Business Analyst, Business Analyst SAP APO Excel Expert, Data Analyst, Reporting Data Analyst, BI Report Analyst, Technical Business Analyst
BI15.15 (7 %) | Business Development (Consultancy) | consultancy, business development, sales, development manager, account, market | Business Development Manager & Market Intelligence Consultancy, Sales Business Development Manager, Sales Account Manager Research Consultancy

a Even though we retained the top-1/k term and document loadings and set the computed threshold value accordingly, we followed Evangelopoulos et al. (2012) in double-checking and manually selecting a threshold for each factor separately based on domain knowledge. As a result, we had to reduce the number of jobs that loaded on the first factor (BI15.01).
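
To make the mechanics behind such a factor table more concrete, the following minimal sketch builds a weighted term-document matrix and extracts two latent factors with their high-loading terms and documents. It is an illustration under stated assumptions, not the procedure used in the study: the four toy "job ads" are hypothetical, TF-IDF stands in for the unspecified term weighting, the helper names are invented for this example, and a plain power iteration with deflation replaces a library SVD routine so that the example needs only the Python standard library.

```python
import math
from collections import Counter

# Hypothetical toy "job ads"; the study itself analyzed 1357 BI and 450 BD ads.
ADS = [
    "hadoop hive pig java nosql",
    "java hadoop nosql cassandra",
    "sales marketing digital advertising",
    "digital marketing sales manager",
]


def term_document_matrix(docs):
    """Build a TF-IDF weighted term-document matrix (rows = terms, columns = docs)."""
    counts = [Counter(doc.split()) for doc in docs]
    terms = sorted(set().union(*counts))
    matrix = []
    for term in terms:
        doc_freq = sum(1 for c in counts if term in c)
        idf = math.log(len(docs) / doc_freq) + 1.0  # assumed weighting scheme
        matrix.append([c[term] * idf for c in counts])
    return terms, matrix


def top_factors(matrix, k, iterations=200):
    """Rank-k truncated SVD via power iteration with deflation."""
    a = [row[:] for row in matrix]  # work on a copy
    n_terms, n_docs = len(a), len(a[0])
    factors = []
    for _factor in range(k):
        v = [1.0 + 0.1 * j for j in range(n_docs)]  # deterministic start vector
        for _step in range(iterations):
            u = [sum(a[i][j] * v[j] for j in range(n_docs)) for i in range(n_terms)]
            v = [sum(a[i][j] * u[i] for i in range(n_terms)) for j in range(n_docs)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        u = [sum(a[i][j] * v[j] for j in range(n_docs)) for i in range(n_terms)]
        sigma = math.sqrt(sum(x * x for x in u))
        u = [x / sigma for x in u]
        factors.append((sigma, u, v))  # singular value, term loadings, doc loadings
        # Deflate: subtract the extracted factor before searching for the next one.
        for i in range(n_terms):
            for j in range(n_docs):
                a[i][j] -= sigma * u[i] * v[j]
    return factors


def high_loading(labels, loadings, top_n):
    """Return the labels with the largest absolute loadings on one factor."""
    ranked = sorted(zip(labels, loadings), key=lambda p: abs(p[1]), reverse=True)
    return [label for label, _ in ranked[:top_n]]


terms, tdm = term_document_matrix(ADS)
factors = top_factors(tdm, k=2)
for number, (sigma, term_loads, doc_loads) in enumerate(factors, start=1):
    print(number, round(sigma, 2), high_loading(terms, term_loads, 3),
          [round(abs(x), 2) for x in doc_loads])
```

On this toy corpus, one factor gathers the technology-oriented ads and the other the sales-oriented ads, which mirrors how a factor is labeled by reading its high-loading terms and job titles. For a real corpus, a tested SVD implementation from a numerical library would be the more sensible choice.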
NoSQL.” In contrast, the top five descriptive terms for the second topic were “digital,” “sales,” “manager,” “advertising,” and “marketing,” and frequent job titles included “Digital Sales Executive,” “Sales Manager Big Data,” and “Digital Relationship Marketing Manager.” The examination of the highest-loading terms and job titles for both factors suggests that the first factor describes jobs related to the development of BD solutions (big data developers), while the second factor refers to the use of BD in marketing and sales (big data users).

Table 3 provides an overview of the results of the fifteen-factor solution and shows exemplary high-loading terms and job titles, as well as the manually assigned labels for each of the extracted factors. The inspection of the identified areas of competency shows that, just as for BI jobs, competencies can be clustered into business competencies and IT competencies. The IT competency area can be further broken down into generic concepts and methods like quantitative analysis, machine learning, and database administration, and products for developing big data solutions (i.e., a variety of programming languages and NoSQL databases). The group of business-oriented competencies is made up of domain competencies in the areas of life sciences and digital marketing, as well as managerial competencies in sales and business development and working in start-up companies. Figure 3 summarizes these findings in a big data competency taxonomy.

In contrast to the BI competencies, we find no factors related to the technologies of commercial vendors, yet many conceptual and methodological competencies, as well as programming skills in various languages, are required. In the factor representing competency in NoSQL (BD15.01), not a single product or technology name of one of the big commercial database vendors appears. Instead, terms referring to open-source technologies from the Apache Foundation dominate the descriptions (e.g., “hadoop,” “hive,” “pig,” “cassandra”). Furthermore, conceptual and methodological IT skills like quantitative analysis (BD15.03), machine learning (BD15.05), database administration (BD15.10), and software engineering and testing (BD15.13, BD15.14) are in high demand. These findings suggest that the field of BD is not (yet) dominated by big vendors’ standard software but (still) relies largely on open-source technologies and custom-made software solutions.

Comparing the relative demand between business and IT competencies reveals that almost 70 percent of the posted BD-related job ads seek technical skills. NoSQL databases, software engineering, and programming are the most highly demanded areas of technical competency. Digital marketing, business development, and sales constitute highly demanded business competencies.

5.4 Comparison

We identified a number of similarities between the fields of BI and BD. Especially when it comes to generic IT concepts and methods and business skills, we observed a considerable overlap between BI and BDA (cf. Fig. 4). For example, working in either field requires a certain amount of software engineering and database competency. Sales and business development skills for managing BI and BD solutions also overlap. Finally, domain knowledge overlaps in healthcare/life sciences and digital marketing, domains known to be especially data-driven. The absence of other domain skills is a result of the level of analysis we chose; a more granular LSA on BI and BD job ads (e.g., 50 instead of 15 factors) would reveal the additional domains of banking, finance, insurance, and supply chain management.

The major differences between BI and BD competencies are discussed in the next section.

6 Discussion

Our research revealed highly demanded BI and BD skills in at least two areas, business and IT. This first finding empirically grounds the ongoing discussion about business knowledge being as important as technical skills for working successfully on BI and BD initiatives (e.g., Chen et al. 2012; de Lange 2013; Waller and Fawcett 2013; Wixom
Factor ID (% of jobs) | Factor label | High-loading descriptive terms (excerpt) | Titles of high-loading job ads (excerpt)
BD15.01 (17 %) | NoSQL Databases | hadoop, nosql, java, hive, scripting, distributed, database, apache, mapreduce, hbase, pig, cassandra | Java Hadoop Developer, Big Data Solutions Architect, Big Data Consultant, Database Architect, Big Data Scientist, Chief Architect Big Data Guru
BD15.02 (6 %) | Sales | digital, sales, advertising, manager, media, forecasting, presentation, platforms | Junior Digital Sales Manager, Digital Agency Sales Manager, New Business Sales Manager
BD15.03 (2 %) | Quantitative Analysis | quantitative, risk, analyst, models, modeling, matlab, java, algorithms, physics, phd, financial, mathematics, data analyst | Senior Quantitative Analyst, Quantitative Analyst Financial Risk Management, Big Data Business Systems Analyst, Sr Data Analyst
BD15.04 (8 %) | Programming (Java) | java developer, junit, tdd, hadoop, maven, git, nosql, hibernate, eclipse, agile, hive, mongodb, apache, pig | Experienced Java Developer Java Multi Thread JUnit TDD, Java Developer Big Data, Java Hadoop Developer, Senior Java Architects Developers Core Java Programming, Senior Java Consultant Java Spring Hibernate Maven
BD15.05 (7 %) | Machine Learning | data scientist, machine learning, visualization, statistical, algorithms, mining, predictive, analysis, science, mathematics | Data Scientist Machine Learning C Java Python, Software Engineer Data Scientist Machine Learning, Security Cleared Big Data Scientist, Big Data Architect Hadoop R Machine Learning
BD15.06 (2 %) | Startup | startup, sales, analytics platforms, market, data analytics, applications, solutions, information, enterprise, consultant | Sales Big Data Software, Front End Developer for Big Data Startup, Big Data Analytics Sales Consultant, Junior Python Developer Big Data Tech Startup, Java Software Engineers High Profit Big Data Startup
BD15.07 (2 %) | Programming (.NET) | net, sql server, microsoft, visual, developer, warehouse, api, front end, scrum, mvc, agile, project, online | High Paid Junior C# ASP.NET Developer, Developer .NET MVC API, C# .NET Developer SQL Server Senior Software Engineer Big Data, Junior C# ASP.NET Developer
BD15.08 (2 %) | Life Science | sciences, life, medical, visualization, revenue, health, care, project management, industries, consulting | Strategic Account Manager Big Data Life Sciences, Big Data Engineer Exciting Start Up, Revenue Analyst, Project Manager Big Data, Solutions Consultant Big Data Analytics Visualisation
BD15.09 (8 %) | Programming (PHP/JavaScript) | developer, php, web, javascript, front end, user, css, agile, html, jquery, web services, api, mysql, open source, mongodb | Lead PHP Ninja, Front End Developer Start Up, Senior PHP Developer, UI Developer Big Data JavaScript, Front End Developer HTML5 CSS3 JavaScript, PHP Web Developer OOP LAMP
BD15.10 (4 %) | Database Administration | dba, mysql, oracle, high availability, sql server, linux, database, senior, consultancy | MySQL DBA Big Data High Availability Replication, DBA Data Modeler, Junior Database Administrator, DBA Systems Engineer MySQL NoSQL Big Data Unix Linux
BD15.11 (10 %) | Digital Marketing | marketing, digital, analytics, media, insights, information, social, research, strategy | Associate Director Digital Media Analytics, Marketing Director, Senior Analyst Big Data Digital Media, Digital Relationship Marketing Manager
BD15.12 (11 %) | Business Development | sales, customer, revenue, account, management, executive, business development, marketing, relationships | Business Development Manager Big Data Technology, Business Development Manager Cloud Computing
BD15.13 (9 %) | Software Engineering | software engineer, linux, data engineer, online, professional, open, product, natural language, systems, agile, distributed | Senior Engineer Big Data, Principle Software Engineer Head of Software Development, Senior Big Data Engineer, Senior Software Engineer
BD15.14 (9 %) | Software Testing | testing, software, engineer, product, machine learning, automated, development, open source, agile, building | Python Test Engineer, Software Test Engineer, Java Tester, Software Design Engineer in Test, Test Lead
BD15.15 (3 %) | Data Warehousing | etl, data warehouse, business intelligence, manager, project, technical, modeling, agile | Data Warehouse Product Owner, Director of Data Engineering, Snr Business Intelligence Developer, ETL Engineer, DWH Delivery Manager
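
The statement that almost 70 percent of the BD-related job ads seek technical skills can be retraced from the percentages in the table above. The short sketch below does this arithmetic under an assumed grouping: Sales, Startup, Life Science, Digital Marketing, and Business Development are counted as business competencies, and all remaining factors, including Data Warehousing (BD15.15), are counted as IT competencies. This grouping is an assumption made for this example based on the surrounding text.

```python
# Share of BD job ads per factor (percent), copied from the table above.
BD_SHARES = {
    "BD15.01": 17, "BD15.02": 6, "BD15.03": 2, "BD15.04": 8, "BD15.05": 7,
    "BD15.06": 2, "BD15.07": 2, "BD15.08": 2, "BD15.09": 8, "BD15.10": 4,
    "BD15.11": 10, "BD15.12": 11, "BD15.13": 9, "BD15.14": 9, "BD15.15": 3,
}

# Assumed grouping: these factors count as business, all others as IT.
BUSINESS_FACTORS = {"BD15.02", "BD15.06", "BD15.08", "BD15.11", "BD15.12"}

business_share = sum(BD_SHARES[f] for f in BUSINESS_FACTORS)
it_share = sum(share for f, share in BD_SHARES.items() if f not in BUSINESS_FACTORS)

print(f"IT: {it_share} %, business: {business_share} %")  # IT: 69 %, business: 31 %
```

Under this grouping the IT factors add up to 69 percent, which is consistent with the "almost 70 percent" reported in the text.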
and track their development over time, we plan to repeat the study presented here regularly in the future. Second, our data analysis used job advertisements to elaborate on the differences between BI and BD competencies, as it is reasonable to assume that job advertisements act as proxies for a demand for human capital in industry and that they can provide insights into competency requirements. However, one must be aware that job ads do not always reflect an employer’s true requirements, as the employer may ask for more competencies than can be reasonably expected from an applicant, or they may use a specific vocabulary to polish job ads so they are appealing to a certain group of candidates. Such may be the case especially in the area of BI and BD, which lacks clear-cut definitions and is full of industry jargon. While we acknowledge that such biases may exist in our data, we believe that the number of job ads that we examined should be sufficient to minimize the effect of biases in a few ads. The processing of such a broad data source as that used in this research gives a particular advantage to the approach we used over other research methods, such as interviews, because it diminishes the risk of biases caused by specific contextual backgrounds. Third, our findings are limited to job markets in English-speaking countries because of the nature of the text mining technique we applied, which cannot process multilingual texts. Future studies may look at job markets in other major language regions (e.g., Spanish, French, Portuguese, German, Russian, Hindustani, Mandarin Chinese). Finally, our study is inductive and exploratory in nature, so future confirmatory research (e.g., surveys) is needed in order to test and refine our results.

Abstract

Stefan Debortoli, Oliver Müller, Jan vom Brocke

Comparing Business Intelligence and Big Data Skills
A Text Mining Study Using Job Advertisements

While many studies on big data analytics describe the data deluge and potential applications for such analytics, the required skill set for dealing with big data has not yet been studied empirically. The difference between big data (BD) and traditional business intelligence (BI) is also heavily discussed among practitioners and scholars. We conduct a latent semantic analysis (LSA) on job advertisements harvested from the online employment platform monster.com to extract information about the knowledge and skill requirements for BD and BI professionals. By analyzing and interpreting the statistical results of the LSA, we develop a competency taxonomy for big data and business intelligence. Our major findings are that (1) business knowledge is as important as technical skills for working successfully on BI and BD initiatives; (2) BI competency is characterized by skills related to commercial products of large software vendors, whereas BD jobs ask for strong software development and statistical skills; (3) the demand for BI competencies is still far bigger than the demand for BD competencies; and (4) BD initiatives are currently much more human-capital-intensive than BI projects are. Our findings can guide individual professionals, organizations, and academic institutions in assessing and advancing their BD and BI competencies.

Keywords: Big data, Business intelligence, Competencies, Latent semantic analysis, Text mining

References

Barney J (1991) Firm resources and sustained competitive advantage. J Manage 17:99–120
Buhl HU (2013) Interview with Martin Petry on “Big data”. Bus Inf Syst Eng 5:101–102
Buhl HU, Röglinger M, Moser F, Heidemann J (2013) Big data. Bus Inf Syst Eng 5:65–69
Burton B, Geishecker L, Hostmann B et al (2006) Organizational structure: business intelligence and information management, pp 1–11
Casey T, Krishnamurthy K, Abezgauz B (2013) Who should own big data? strategy+business
Cates J, Gill S, Zeituny N (2005) The ladder of business intelligence (LOBI): a framework for enterprise IT planning and architecture. Int J Bus Inf Syst 1:220–238
Chen H, Chiang R, Storey V (2012) Business intelligence and analytics: from big data to big impact. MIS Q 36:1165–1188
Coussement K, Van den Poel D (2008) Improving customer complaint management by automatic email classification using linguistic style features as predictors. Decis Support Syst 44:870–882
Cox M, Ellsworth D (1997) Application-controlled demand paging for out-of-core visualization. In: Proc 8th conf vis, pp 235–244
Davenport TH (2006) Competing on analytics. Harv Bus Rev 84:98–107
Davenport TH, Patil D (2012) Data scientist. Harv Bus Rev 90:70–76
Davenport TH, Barth P, Bean R (2012) How “big data” is different. MIT Sloan Manag Rev 54:22–24
de Lange C (2013) So you want to be a data scientist? Nat Jobs Blog http://blogs.nature.com/naturejobs/2013/03/18/so-you-want-to-be-a-data-scientist. Accessed, 2013–05–2013-03
Dhar V (2013) Data science and prediction. Commun ACM 56:64–73
Dinter B (2012) The maturing of a business intelligence maturity model. In: Proc Am conf inf syst, Seattle, pp 1–10
Eckerson W (2004) Gauge your data warehouse maturity. http://www.information-management.com/issues/20041101/1012391-1.html. Accessed 2013-04-29
Evangelopoulos N, Visinescu L (2012) Text-mining the voice of the people. Commun ACM 55:62–69
Evangelopoulos N, Zhang X, Prybutok VR (2012) Latent semantic analysis: five methodological recommendations. Eur J Inf Syst 21:70–86
Forbes (2013) A very short history of big data. http://www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-big-data/. Accessed 2014-03-23
Gallivan MJ, Truex DP, Kvasny L (2004) Changing patterns in IT skill sets: a content analysis of classified advertising. Data Base Adv Inf Syst 35:64–87
Golden B (2013) Does big data spell the end of business intelligence as we know it? In: CIO. http://www.cio.com/article/print/730774. Accessed 2013-10-04
Herbst A, Simons A, vom Brocke J et al (2014) Identifying and characterizing topics in enterprise content management: a latent semantic analysis of vendor case studies. In: Proc 22nd Eur conf inf syst
Hostmann B, Hagerty J (2010) IT-score overview for business intelligence and performance management. http://www.gartner.com/id=1433813. Accessed 2013-04-29
IBM (2012) Analytics: the real-world use of big data. http://www-935.ibm.com/services/us/gbs/thoughtleadership/ibv-big-data-at-work.html. Accessed 2013-10-04
Lahrmann G, Marx F, Winter R, Wortmann F (2011) Business intelligence maturity: development and evaluation of a theoretical model. In: Proc Hawaii int conf syst sci, Koloa, pp 1–10
Landauer TK, Foltz PW, Laham D (1998) An introduction to latent semantic analysis. Discourse Process 25:259–284
Laney D (2001) 3D data management: controlling data volume, velocity, and variety. In: META Gr. http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf. Accessed 2013-10-04
Larsen KR, Monarchi DE, Hovorka DS, Bailey CN (2008) Analyzing unstructured text data: using latent categorization to identify intellectual communities in information systems. Decis Support Syst 45:884–896
LaValle S, Lesser E, Shockley R et al (2011) Big data, analytics and the path from insights to value. MIT Sloan Manag Rev 52:21–31
Litecky C, Aken A (2010) Mining for computing jobs. IEEE Softw 27:78–85
Loukides M (2012) What is data science? O’Reilly Media, Sebastopol
Manning CD, Raghavan P, Schutze H (2008) Introduction to information retrieval. Cambridge University Press, New York
Manyika J, Chui M, Brown B et al (2011) Big data: the next frontier for innovation, competition, and productivity, pp 1–156
Marchand D, Peppard J (2013) Why IT fumbles analytics. Harv Bus Rev 91:104–112
Melville N, Kraemer KL, Gurbaxani V (2004) Review: information technology and organizational performance: an integrative model of IT business value. MIS Q 28:283–322
Polock R (2013) Forget big data, small data is the real revolution. Open Knowl Found Blog http://blog.okfn.org/2013/04/22/forget-big-data-small-data-is-the-real-revolution/. Accessed 2013-10-04
Power D (2007) A brief history of decision support systems
Provost F, Fawcett T (2013) Data science and its relationship to big data and data-driven decision making. Big Data 1:51–59
Reuter N, Vakulenko S, vom Brocke J et al (2014) Identifying the role of information systems in achieving energy-related environmental sustainability using text mining. In: Proc 22nd Eur conf inf syst
Van Roekel H, Linders J, Raja K et al (2009) The BI framework: how to turn information into a competitive asset. http://www.tdwi.eu/wissen/whitepaper/?no_cache=1&tx_mwknowledgebase_pi1[showUid]=45. Accessed 2013-04-29
Russell S, Haddad M, Bruni M, Granger M (2010) Organic evolution and the capability maturity of business intelligence. In: Proc Am conf inf syst
Russom P (2011) Big data analytics. TDWI Res
Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18:613–620
SAP (2014) SAP HANA integrates predictive analytics, text and big data in a single package. http://www.sap.com/pc/tech/in-memory-computing-hana/software/analytics/big-data.html. Accessed 2014-03-23
Schlegel K, Sallam RS, Yuen D, Tapadinhas Y (2013) Magic quadrant for business intelligence platforms. http://www.gartner.com/technology/reprints.do?id=1-1DZLPEP&ct=130207&st=sb. Accessed 2013-04-29
Sidorova A, Evangelopoulos N, Valacich JS, Ramakrishnan T (2008) Uncovering the intellectual core of the information systems discipline. MIS Q 32:467–482
Todd PA, McKeen JD, Gallupe RB (1995) The evolution of IS job skills: a content analysis of IS job advertisements from 1970 to 1990. MIS Q 19:1–27
Varon E (2012) Rethink your org chart for big data analytics teams. http://data-informed.com/rethink-your-org-chart-for-big-data-analytics-teams/. Accessed 2014-03-13
vom Brocke J, Debortoli S, Müller O, Reuter N (2014) How in-memory technology can create business value: insights from the Hilti case. Commun Assoc Inf Syst 34:151–168
Waller MA, Fawcett SE (2013) Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management. J Bus Logist 34:77–84
Wixom B, Ariyachandra T, Goul M et al (2011) The current state of business intelligence in academia. Commun Assoc Inf Syst 29:299–312