SADGURU TECHNOLOGIES
Ph: 040-40154733, Mob: 8179736190
www.sadgurutechnologies.com
1. Intro to Hadoop and Big Data
• What is Big Data?
• Parallel Computing vs. Distributed Computing
• Brief history of Hadoop
• RDBMS/SQL vs. Hadoop
• Scaling with Hadoop
• Intro to the Hadoop ecosystem
• Optimal hardware and network configurations for Hadoop
2. HDFS – Hadoop Distributed File System
• Linux File system options
• NameNode architecture
• Secondary NameNode architecture
• DataNode architecture
• Heartbeats, Rack Awareness, Health Check
• Exploring the HDFS Web UI
LAB #2: HDFS Command Line
3. Beginning MapReduce
• MapReduce Architecture
• JobTracker/TaskTracker
• Combiner
• Partitioner
• Shuffle and Sort
• Exploring the MapReduce Web UI
• Walkthrough of a simple Java MapReduce example
• Use case: Word Count in MapReduce
LAB #3: Running MapReduce in Java
4. Advanced MapReduce
• Data Types and File Formats
• Driver, Mapper & Reducer Class Code
• Building Map & Reduce programs using Eclipse
• Serialization and File-Based Data Structures
• Input/output formats
• Counters
• Running MapReduce locally and on a cluster
LAB #4: Java MapReduce API
5. Hive for Structured Data
• Hive architecture
• Hive vs. RDBMS
• HiveQL and Hive Shell
• Managing tables
• Data types and schemas
• Querying data
• Partitions and Buckets
• Intro to User Defined Functions
LAB #5: Exploring Hive Commands
6. Overview of NoSQL and HBase
• Introduction to NoSQL
• CAP Theorem
• HBase architecture
• HBase versions and origins
• HBase vs. RDBMS
• Data Modeling
• Column Families and Regions
LAB #6: Intro to HBase Command Line
7. Working with Sqoop
• Introduction to Sqoop
• Import Data
• Export Data
• Sqoop Syntax
• Database Connection
LAB #7: Hands-on exercise with Sqoop and a MySQL database
SADGURU TECHNOLOGIES
H. No: 7-1-621/10, Flat No: 102, Sai Manor Apartment, S.R. Nagar Main Road,
Hyderabad-500038, Landmark: Beside Umesh Chandra Statue, Approach Road Parallel to Main Road
Mob: 91-8179736190, Ph: 040-40154733
USA: +1 (701) 660-0529