BIG DATA ANALYTICS
Unit - I Introduction to Big Data Analytics and Data Architecture
1.4 Big Data Types
Unit - II Introduction to Hadoop and MapReduce
2.6 MapReduce : Map Tasks, Key-Value Pair, Grouping by Key, Partitioning, Combiners, Reduce Tasks, Details of MapReduce Processing Steps
Unit - III NoSQL Databases and Big Data Management
3.1 Introduction to NoSQL in Big Data
3.2 NoSQL Data Store : NoSQL, CAP theorem, Schema-less Models
3.3 NoSQL Data Architecture Patterns : Key-Value Store, Document Store, Tabular Data, Object Data Store, Graph Database
3.4 NoSQL to manage Big Data
3.5 MongoDB Database
Unit - IV Hive and Pig
4.1 Introduction to Hive : Hive Characteristics, Limitations
4.2 Hive Architecture
4.3 Hive Data Types and File Formats
4.4 Hive Integration and Workflow Steps
4.5 Hive Built-in functions
4.6 HiveQL : HiveQL DDL, HiveQL DML, HiveQL for Querying the Data
4.7 Introduction to Pig : Applications of Apache Pig, Features of Pig, Compare Pig with SQL, MapReduce, and Hive
4.8 Pig Architecture
4.9 Pig Latin Data Model
Unit - V Spark and Real-Time Analytics
5.1 Introduction to Big Data tool Spark : Main components of Spark Architecture, Features of Spark, Spark Software Stack
5.2 Introduction to Data Analysis with Spark : Spark SQL
5.3 Programming with RDDs and Machine Learning with MLlib
5.4 Data ETL (Extract, Transform and Load) Process : Composing Spark Program Steps for ETL
5.5 Analytics, Reporting and Visualization
5.6 Apache Spark Streaming Platform : Spark Streaming Architecture, Spark Streaming vs Structured Streaming, Internal Working of Spark Streaming
5.7 Spark Streaming Characteristics : Scalable, Fault Tolerance and Load Balancing