This presentation gives an overview of the Apache Phoenix project. It explains Phoenix in terms of its architecture, environment, ETL, SQL, UDFs and transactions.
Links for further information and connecting
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/
● Supports OLTP for Hadoop
● Uses Apache HBase as its backing store
● Open source / Apache 2.0 license
● Written in Java and SQL
● ACID (atomicity, consistency, isolation, durability)
– Via Apache Tephra integration
Phoenix SQL Support
● Accepts SQL queries
● Compiles them to HBase scans
● Orchestrates the running of those scans
● Produces regular JDBC result sets
● Gains performance by using
– the HBase API, coprocessors and custom filters
● Typical query response times
– Milliseconds for small queries
– Seconds for tens of millions of rows
● See phoenix.apache.org for full syntax support
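As an illustrative sketch (the table and column names here are assumptions), Phoenix SQL looks like standard SQL, with UPSERT taking the place of INSERT/UPDATE:

```sql
-- Create a table backed by an HBase table of the same name
CREATE TABLE IF NOT EXISTS example (
    id    BIGINT NOT NULL PRIMARY KEY,
    name  VARCHAR,
    value DECIMAL(10, 2)
);

-- Phoenix uses UPSERT rather than separate INSERT/UPDATE
UPSERT INTO example (id, name, value) VALUES (1, 'first', 10.50);

-- Queries are compiled to HBase scans and return JDBC result sets
SELECT name, value FROM example WHERE id = 1;
```

These statements can be run from the sqlline.py client or through the Phoenix JDBC driver (connection URL of the form jdbc:phoenix:&lt;zookeeper quorum&gt;).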
Phoenix Environment
Phoenix Bulk Loading
● Bulk load data via
● Single-threaded CSV loading via psql, i.e.
– bin/psql.py -t EXAMPLE localhost data.csv
– Loads the EXAMPLE table
– For HBase on the local machine
● MapReduce-based loading for CSV and JSON
– See next slide
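As a sketch of the MapReduce-based loader (the input path and client jar name are assumptions; the tool class is Phoenix's CsvBulkLoadTool), the load is submitted with hadoop jar:

```shell
# MapReduce-based CSV bulk load into the EXAMPLE table
# (jar name and HDFS input path are assumptions)
hadoop jar phoenix-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool \
  --table EXAMPLE \
  --input /data/example.csv \
  --zookeeper localhost:2181
```

This requires a running Hadoop/HBase cluster; a corresponding JsonBulkLoadTool covers the JSON case.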
● Bulk load example for MapReduce
– For CSV and JSON loads
– Using the Phoenix MapReduce library
– Against the EXAMPLE table
Phoenix Performance
Phoenix User-defined Functions (UDFs)
● Create temporary or permanent UDFs
– Temporary UDFs last for the session only
● Use UDFs in SQL and indexes
● Permanent UDFs are stored in SYSTEM.FUNCTION
● Tenant-specific UDF usage is supported
● UDF jar files must be placed on HDFS
● UDF jar updates are not currently possible
– (without a cluster bounce)
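As a hedged sketch (the function, class and jar names below are hypothetical), a permanent UDF is registered with CREATE FUNCTION, pointing at a jar on HDFS:

```sql
-- Register a permanent UDF, recorded in SYSTEM.FUNCTION
-- (class name and HDFS jar path are hypothetical)
CREATE FUNCTION reverse_str(VARCHAR) RETURNS VARCHAR
    AS 'com.example.udf.ReverseFunction'
    USING JAR 'hdfs:/phoenix/udfs/example-udfs.jar';

-- The UDF can then be used in SQL (and in indexes)
SELECT reverse_str(name) FROM example;
```

Adding the TEMPORARY keyword instead creates a session-only function that is not persisted.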
Phoenix Transactions
● Cross-row and cross-table ACID support using Apache Tephra
● Transactional functionality is currently beta
● Enable transactions and the snapshot directory in hbase-site.xml
● Also set a transactional timeout value
● Start Tephra
● Create tables with the flag TRANSACTIONAL=true
● Transactions then behave as follows
– Start with a statement against the table
– End with a commit or rollback
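A sketch of the hbase-site.xml settings described above (the values shown are examples, not recommendations):

```xml
<!-- hbase-site.xml: enable Phoenix transactions (values are examples) -->
<property>
  <name>phoenix.transactions.enabled</name>
  <value>true</value>
</property>
<property>
  <name>data.tx.snapshot.dir</name>
  <value>/tmp/tephra/snapshots</value>
</property>
<property>
  <name>data.tx.timeout</name>
  <value>60</value>
</property>
```

With Tephra running, a transactional table (name assumed) is then created with the flag from the slide:

```sql
CREATE TABLE IF NOT EXISTS tx_example (
    id   BIGINT NOT NULL PRIMARY KEY,
    name VARCHAR
) TRANSACTIONAL=true;
```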
Available Books
● See “Big Data Made Easy”
– Apress, Jan 2015
● See “Mastering Apache Spark”
– Packt, Oct 2015
● See “Complete Guide to Open Source Big Data Stack”
– Apress, Jan 2018
● Find the author on Amazon
– www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
– open-source-systems.blogspot.com/
● I am always interested in
– New technology
– Opportunities
– Technology-based issues
– Big data integration