

Designed by Cleve Moler

Developer MathWorks

OS Cross-platform
Filename extension(s) .m
Written in C, C++, Java, MATLAB, Fortran
Operating system Cross-platform: Microsoft Windows, Linux,
and Mac OS X

MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and fourth-generation
programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting
of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs
written in other languages, including C, C++, Java, and Fortran.

Stable release 8.3 / 2014


MATLAB is a high-level language and interactive environment for numerical computation, visualization, and programming. Using
MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math
functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming
languages, such as C/C++ or Java.

Advantages & disadvantages:
Why should we use MATLAB (Matrix Laboratory)?

MATLAB has several advantages over other methods or languages:

Its basic data element is the matrix. A simple integer is considered a matrix of one row and one column. Several
mathematical operations that work on arrays or matrices are built into the MATLAB environment, for example
cross products, dot products, determinants, and inverse matrices.
Vectorized operations. Adding two arrays together needs only one command, instead of a for or while loop.
The graphical output is optimized for interaction. You can plot your data very easily, and then change colors, sizes,
scales, etc., by using the interactive graphical tools.
MATLAB's functionality can be greatly expanded by the addition of toolboxes. These are sets of specific functions that
provide more specialized functionality. For example, Excel Link allows data to be written in a format recognized by Excel, and
the Statistics Toolbox allows more specialized statistical manipulation of data (ANOVA, basic fits, etc.).
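
The matrix operations named above can be made concrete with a small sketch. This is Python rather than MATLAB code, and the helper names (mat_add, dot, cross, det2) are hypothetical, written only for this illustration. Each helper spells out what a single MATLAB command such as A + B, dot(a, b), cross(a, b) or det(A) computes, so the user never writes the loop.

```python
def mat_add(a, b):
    """Element-wise sum of two equally sized matrices given as lists of rows."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Cross product of two 3-element vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
five = [[5]]  # a scalar viewed as a matrix of one row and one column

print(mat_add(A, B))                 # [[11, 22], [33, 44]]
print(mat_add(five, [[1]]))          # [[6]]
print(dot([1, 2, 3], [4, 5, 6]))     # 32
print(cross([1, 0, 0], [0, 1, 0]))   # [0, 0, 1]
print(det2(A))                       # -2
```

In MATLAB itself none of these helpers would be needed, which is the point of the advantage described above.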

There are also disadvantages:

It uses a large amount of memory, so it is very hard to use on slow computers.
It runs on top of the operating system (for example, Windows), getting only as much CPU time as the operating system
allows. This makes real-time applications very complicated.

Developer(s) University of Waikato
Stable release 3.6.11 (book 3rd ed.), 3.7.11 (development) / 2014/04/24
Written in Java
Operating system Cross-platform
Type Machine Learning
License GNU General Public License

Weka is a collection of machine learning algorithms for data mining tasks. The
algorithms can either be applied directly to a dataset or called from your own Java
code. Weka contains tools for data pre-processing, classification, regression,
clustering, association rules, and visualization. It is also well-suited for developing new
machine learning schemes.
Weka is open source software issued under the GNU General Public License.
Yes, it is possible to apply Weka to big data!
Collections of Datasets
Some example datasets are included in the Weka distribution.
Available separately:
A jarfile containing 37 classification problems, originally obtained from the UCI
repository (datasets-UCI.jar, 1,190,961 Bytes).
A jarfile containing 37 regression problems, obtained from various
sources (datasets-numeric.jar, 169,344 Bytes).
A jarfile containing 6 agricultural datasets obtained from agricultural researchers in
New Zealand (agridatasets.jar, 31,200 Bytes).
A jarfile containing 30 regression datasets collected by Luis Torgo (regression-
datasets.jar, 10,090,266 Bytes).
A gzip'ed tar containing UCI and UCI KDD datasets (uci-20070111.tar.gz,
17,952,832 Bytes)
A gzip'ed tar containing StatLib datasets (statlib-20050214.tar.gz, 12,785,582 Bytes)
A gzip'ed tar containing ordinal, real-world datasets donated by Dr. Arie Ben
David (Holon Inst. of Technology/Israel) (datasets-arie_ben_david.tar.gz,
11,348 Bytes)
A zip file containing 19 multi-class (1-of-n) text datasets donated by George
Forman/Hewlett-Packard Labs (14,084,828 Bytes)
A bzip'ed tar file containing the Reuters21578 dataset split into separate files
according to the ModApte split (reuters21578-ModApte.tar.bz2, 81,745,032 Bytes)
A zip file containing 41 drug design datasets formed using the Adriana.Code
software, donated by Dr. M. Fatih Amasyali (Yildiz Technical University)
(11,376,153 Bytes)
Attribute-Relation File Format (ARFF)
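
For readers who have not seen an ARFF file, the sketch below shows the basic layout: a header with @relation and @attribute lines, followed by an @data section of comma-separated rows. The toy dataset and the parse_arff helper are hypothetical, written only for this illustration. The parser handles only this simple case; it does not cover quoted values, sparse rows, or date attributes.

```python
ARFF_TEXT = """\
% Toy dataset invented for this example (lines starting with % are comments)
@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}

@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def parse_arff(text):
    """Return (relation name, list of (attribute, type) pairs, list of data rows)."""
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword = line.lower()  # header keywords are case-insensitive
        if keyword.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif keyword.startswith("@attribute"):
            _, name, type_spec = line.split(None, 2)
            attributes.append((name, type_spec))
        elif keyword.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
for name, type_spec in attributes:
    print(name, "->", type_spec)
print(len(rows), "instances")
```

Values are left as strings here; a fuller reader would convert them according to each attribute's declared type.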

Advantages & disadvantages:
The Main Advantages Of Weka Data Mining
Data mining with Weka can help an enterprise reach its full potential. It is a way to
evaluate how a business is affected by particular factors, and it can help business
owners improve their earnings and avoid costly mistakes later on.
Fundamentally, a company using this process analyzes specific information from
different perspectives to obtain a well-rounded view of how the business is
performing. Business owners can get a broad perspective on matters such as customer trends,
where they are losing money, and where they are making money. This knowledge can also
reveal ways to cut unnecessary costs and increase
overall income.


Developer(s) RapidMiner
Stable release 6.0 / 21 November 2013
Operating system Cross-platform
Type Statistical analysis, data mining, predictive analytics
License AGPL (Affero General Public License)/Proprietary

RapidMiner, which presents itself as the leader in predictive analytics, delivers an easy-to-use desktop-to-cloud
solution designed for collaborative teams of data scientists and business leaders. RapidMiner empowers such a team
to execute on smarter business decisions by using predictive intelligence and "predactions" (prediction-based
actions) to improve operations across the organization.
RapidMiner provides software, solutions, and services in the field of advanced analytics, including predictive
analytics, data mining, and text mining. It automatically and intelligently analyzes data, including databases and
text, on a large scale.

RapidMiner is a software platform developed by the company of the same name that provides an integrated
environment for machine learning, data mining, text mining, predictive analytics and business analytics. It is
used for business and industrial applications as well as for research, education, training, rapid prototyping, and
application development, and it supports all steps of the data mining process, including results visualization,
validation and optimization.
RapidMiner is developed on a business source model, which means the core and earlier versions of the software
are available under an OSI-certified open source license. A Starter Edition is
available for free download, a Personal Edition is offered for US$999, a Professional Edition is $2,999, and
pricing for the Enterprise Edition is available from the developer.
RapidMiner uses a client/server model with the server offered as Software as a Service or in the cloud.

Easy-to-use visual environment for predictive analytics. No programming required.
Open and extensible.
Advanced analytics at every scale, well suited to big data.
Runs on all major platforms and operating systems.

Developer(s) AG
Stable release 2.9 / December 6, 2013
Written in Java
Operating system Linux, OS X, Windows
Type Enterprise Reporting / Business Intelligence / Data mining
License GNU General Public License

KNIME, the Konstanz Information Miner, is an open source data analytics, reporting and integration platform.
KNIME integrates various components for machine learning and data mining through its modular data pipelining
concept. A graphical user interface allows assembly of nodes for data preprocessing (ETL: Extraction,
Transformation, Loading), for modeling and data analysis and visualization.
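
The pipelining idea can be sketched in a few lines of Python. This is not KNIME's API: KNIME workflows are assembled graphically, and every name below (extract, drop_missing, to_numbers, mean_spend, run_pipeline) as well as the sample data is hypothetical. The sketch only shows the concept of small nodes, each taking the output of the previous one, chained into a workflow.

```python
from functools import reduce


def extract():
    """Source node: returns raw rows (invented sample data)."""
    return [
        {"customer": "a", "spend": "120"},
        {"customer": "b", "spend": ""},
        {"customer": "c", "spend": "80"},
    ]


def drop_missing(rows):
    """Preprocessing node: remove rows with an empty 'spend' field."""
    return [row for row in rows if row["spend"] != ""]


def to_numbers(rows):
    """Transformation node: convert 'spend' from text to float."""
    return [{**row, "spend": float(row["spend"])} for row in rows]


def mean_spend(rows):
    """Analysis node: average spend over the remaining rows."""
    return sum(row["spend"] for row in rows) / len(rows)


def run_pipeline(source, nodes):
    """Feed the source output through each node in order."""
    return reduce(lambda data, node: node(data), nodes, source())


result = run_pipeline(extract, [drop_missing, to_numbers, mean_spend])
print(result)  # 100.0
```

Because each node only depends on the data handed to it, nodes can be swapped or reordered without touching the others, which is what makes the modular approach attractive.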
KNIME is used in pharmaceutical research, CRM customer data analysis, business intelligence and financial
data analysis.
KNIME describes itself as the leading open platform for data-driven innovation, helping organizations to stay ahead of change. Innovative
organizations use its open-source, enterprise-grade analytics platform to discover the potential hidden in their data, mine for fresh
insights, or predict new futures.
Quick to deploy, easy to scale and intuitive to use, KNIME is used in over 60 countries on data of every kind: from numbers to
images, molecules to humans, signals to complex networks, from kilo- to petabytes, or simple reports to complex analyses.
KNIME is developed and supported by AG.
KNIME stands for KoNstanz Information MinEr.