Talend Real Time Scenario
Overview of Real-time Big Data Sandbox
Pre-requisites to run Sandbox
Sandbox Setup & Configuration
Obtaining a Talend License
Demo (Scenario)
Talend Real-Time Big Data Sandbox
Big Data Insights Cookbook
Using the Talend Real-Time Big Data Platform, this Cookbook provides step-by-step instructions to build and run an end-to-end integration scenario.

The demo is built on a real-world use case in the Retail industry and demonstrates how Talend, Spark, NoSQL and real-time messaging can be easily used together to provide real-time “offers” as part of an online shopping experience.

Whether batch, streaming or real-time integration, understand how Talend can be used to address your big data challenges and move you into and beyond the sandbox stage.
About Talend
At Talend, it’s our mission to connect the data-driven enterprise, so our customers can operate in real-time with new insight about their
customers, markets and business.
Virtual Environment
The Talend Real-Time Big Data Sandbox is a virtual environment that combines the Talend Real-Time Big Data Platform with some sample scenarios pre-built and ready-to-run.

Sandbox Examples
See how Talend can turn data into real-time decisions through sandbox examples that integrate Apache Kafka, Spark, Spark Streaming, Hadoop and NoSQL.
Talend Platform for Big Data includes a graphical IDE (Talend Studio),
teamwork management, data quality, and advanced big data features.
To see a full list of features please visit Talend’s Website: http://www.talend.com/products/platform-for-big-data

You will need a Virtual Machine player such as VMware, which can be downloaded from the VMware Player site.
Memory: 8GB
Disk Space: 20GB (10GB is for the image download)
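The memory and disk figures above can be checked before downloading the image. The following is a minimal sketch, not part of the Sandbox itself: the function names `check_requirements` and `free_disk_gb` are hypothetical, and the amount of memory is passed in by hand because the Python standard library has no portable way to read it.

```python
import shutil

# Requirements as stated in this Cookbook: 8GB memory, 20GB disk space.
REQUIRED_MEMORY_GB = 8
REQUIRED_DISK_GB = 20


def check_requirements(memory_gb, free_disk_gb):
    """Return a list of problems; an empty list means the host qualifies."""
    problems = []
    if memory_gb < REQUIRED_MEMORY_GB:
        problems.append(
            f"memory: {memory_gb:.1f}GB available, {REQUIRED_MEMORY_GB}GB required"
        )
    if free_disk_gb < REQUIRED_DISK_GB:
        problems.append(
            f"disk: {free_disk_gb:.1f}GB free, {REQUIRED_DISK_GB}GB required"
        )
    return problems


def free_disk_gb(path="."):
    """Free space on the volume holding `path`, in GB (1GB = 1024**3 bytes here)."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    # memory_gb is entered manually (assumption: you know your host's RAM).
    for problem in check_requirements(memory_gb=8, free_disk_gb=free_disk_gb()):
        print("Requirement not met:", problem)
```

Run it from the directory where the Virtual Machine image will be stored, so that the disk check looks at the right volume.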
Download the Sandbox Virtual Machine file at www.talend.com/talend-big-data-sandbox.

You will receive an email with a license key attachment and a second email with a list of support resources and videos.
5. Click on “Import”.
Note: The Username/Sudo Username = talend, Password = talend

Having trouble with Sandbox configuration settings? See the troubleshooting guide.
8. Start the VM
You should have been provided a license file by your Talend representative or
by an automatic email from the Talend Real-time Big Data Sandbox program.
This license file is required to open the Talend Studio and must reside within the VM.
Important Notes:
“For VirtualBox users, there is a known issue with drag-and-drop functionality. The easiest way to get the Talend license file onto the VM is to save it to a cloud storage site such as Dropbox.com, or to send it to a web-based email account that you have access to (such as Gmail, Yahoo or Hotmail), and then navigate to that location from the web browser inside the Virtual Machine to download the file.”
The following Demo will help you see the value that using Talend can bring to your big data projects:
The Real-time Recommendation Demo is designed to illustrate the simplicity and flexibility Talend brings to using Spark in your Big Data Architecture.
Create a Kafka Topic to Produce and Consume real-time streaming data
Create a Spark recommendation model based on specific user actions
See live streaming recommendations to a Cassandra NoSQL database for “Fast Data” access for a WebUI
If you are familiar with the ALS model, you can update the ALS parameters to enhance the model or just leave the default values.
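For readers who are not familiar with ALS (alternating least squares), the sketch below shows what its parameters control. It is a pure-Python illustration restricted to a single latent factor, not the Spark implementation used by the demo, and the function name `als_rank1` and the sample ratings are hypothetical. The Cookbook does not list the parameter names exposed in the Talend job; in Spark MLlib the usual ones are `rank` (number of latent factors), `maxIter` (number of alternations) and `regParam` (regularization strength), and the last two correspond to `iterations` and `reg` here.

```python
def als_rank1(ratings, iterations=10, reg=0.1):
    """Rank-1 ALS sketch. `ratings` maps (user, item) -> observed rating.

    Returns (user_factors, item_factors, loss_history). loss_history holds the
    regularized squared error before training and after each iteration.
    `reg` must be > 0 so that every update has a non-zero denominator.
    """
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    x = {u: 1.0 for u in users}  # one factor per user
    y = {i: 1.0 for i in items}  # one factor per item

    def loss():
        err = sum((r - x[u] * y[i]) ** 2 for (u, i), r in ratings.items())
        penalty = reg * (sum(v * v for v in x.values())
                         + sum(v * v for v in y.values()))
        return err + penalty

    history = [loss()]
    for _ in range(iterations):
        # Hold item factors fixed; each user factor has a closed-form optimum.
        for u in users:
            seen = [(y[i], r) for (uu, i), r in ratings.items() if uu == u]
            x[u] = sum(r * yi for yi, r in seen) / (reg + sum(yi * yi for yi, _ in seen))
        # Hold user factors fixed; solve for each item factor the same way.
        for i in items:
            seen = [(x[u], r) for (u, ii), r in ratings.items() if ii == i]
            y[i] = sum(r * xu for xu, r in seen) / (reg + sum(xu * xu for xu, _ in seen))
        history.append(loss())
    return x, y, history


if __name__ == "__main__":
    # Made-up ratings for illustration only.
    sample = {("u1", "a"): 5, ("u1", "b"): 3, ("u2", "a"): 4,
              ("u2", "c"): 1, ("u3", "b"): 2, ("u3", "c"): 5}
    x, y, history = als_rank1(sample, iterations=10, reg=0.1)
    print("predicted rating for (u1, c):", x["u1"] * y["c"])
    print("loss before and after:", history[0], history[-1])
```

More iterations let the factors settle further, and a larger `reg` shrinks the factors to limit overfitting on sparse ratings. A higher rank, which this sketch omits, would let the model capture more than one taste dimension per user.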
1. From the Desktop, double-click on the “Start_Kafka” icon. If prompted for a password, enter talend.
2. You can stop Kafka at any time by double-clicking on “Stop_Kafka”. If prompted for a password, enter talend.
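To confirm that the start or stop step took effect, you can check whether anything is listening on the broker port. The sketch below assumes the Sandbox broker runs on `localhost` at Kafka's default port 9092, which this Cookbook does not state; the function name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host="localhost", port=9092, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"Kafka broker port 9092 is {state}")
```

An open port only shows that some process is listening there. It does not prove that the process is Kafka or that the demo topic exists.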
Conclusion
What are your next steps?
Now that you understand how you can address your big data opportunities using Talend...

Let’s take one final look at how Talend will help you…
Talend vastly simplifies big data integration
First, Talend vastly simplifies big data integration, allowing you to leverage in-house resources to use Talend's rich graphical tools that generate big data code (Spark, MapReduce, Pig, Java) for you.
Talend is based on standards such as Eclipse, Java, and SQL, and is backed by a large collaborative community.
So you can upskill existing resources instead of finding new resources.

Talend is built for batch and real-time big data
Second, Talend is built for batch and real-time big data. Unlike other solutions that “map” to big data or support a few components, Talend is the first data integration platform built on Spark with over 100 Spark components.
Whether integrating batch (MapReduce, Spark), streaming (Spark), NoSQL, or in real-time, Talend provides a single tool for all your integration needs.
Talend’s native Hadoop data quality solution delivers clean and consistent data at infinite scale.

Talend lowers operations costs
And third, Talend lowers operations costs.
Talend’s zero footprint solution takes the complexity out of… integration deployment, management, maintenance.
A usage-based subscription model provides a fast return on investment without large upfront costs.