Apache Tajo is an open-source distributed data warehouse framework for Hadoop. Tajo was initially started by Gruter, a Hadoop-based infrastructure company in South Korea. Later, experts from Intel, Etsy, NASA, Cloudera, and Hortonworks also contributed to the project. The name Tajo means "ostrich" in Korean. In March 2014, Tajo became a top-level Apache open-source project. This tutorial covers the basics of Tajo, then explains cluster setup, the Tajo shell, SQL queries, and integration with other big data technologies, and concludes with some examples.
Before proceeding with this tutorial, you should have a sound knowledge of core Java, any Linux operating system, and DBMS concepts.
This tutorial has been prepared for professionals aspiring to make a career in big data analytics. It will give you a basic working understanding of Apache Tajo.