315320: Big Data Technology Syllabus for Cloud Computing & Big Data 5th Sem K Scheme MSBTE PDF

The detailed Big Data Technology syllabus for Cloud Computing & Big Data (BD), K scheme, has been taken from the MSBTE official website and is presented here for diploma students. For the subject code, subject name, lectures, tutorials, practical/drawing hours, credits, theory (max & min) marks, practical (max & min) marks, total marks, and other information, visit the full semester subjects post given below.

For all other MSBTE Cloud Computing & Big Data 5th Sem K Scheme syllabus PDFs, visit MSBTE Cloud Computing & Big Data 5th Sem K Scheme Syllabus PDF Subjects. The detailed syllabus for Big Data Technology is as follows.

Rationale

For the complete syllabus, results, class timetable, and many other features, kindly download the iStudy App.
It is a lightweight, easy-to-use platform with no images and no PDFs, meant to make students’ lives easier.
Get it on Google Play.

Course Outcomes:

Students will be able to achieve and demonstrate the following COs on completion of course-based learning:

  1. Illustrate characteristics and technologies of Big Data.
  2. Apply relevant Big Data Storage techniques for a given application.
  3. Use the Big Data technologies Hive and Pig to handle data of different natures.
  4. Implement Big Data Processing Technique for a given application.
  5. Apply Data Visualization Techniques for a given application.

Unit I

Introduction to Big Data Technologies
1.1 Introduction: Definition of Big Data, characteristics of big data, types of big data
1.2 Applications of big data
1.3 Difference between big data and traditional data in terms of volume, velocity, variety, veracity, value, scalability
1.4 Big data technology: definition, types (operational and analytical)
1.5 Classification of big data technologies based on purpose: Data Storage, Data Mining, Data Analytics, Data Visualization

Suggested Learning Pedagogies
Lecture Using Chalk-Board, Videos, Collaborative Learning

Unit II


Unit III

Handling Data and Queries with Hive and Pig
3.1 Apache Hive: Hive architecture, warehouse directory and metastore, Hive data types, loading data into tables, Hive file formats, Hive built-in functions, querying data, sorting, aggregation
3.2 Apache Pig: Architecture, execution modes, built-in functions, Pig operations: group and join, combine and split, filter
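
The querying, sorting, and aggregation topics listed for Hive can be tried out before a Hive installation is available. The sketch below is only a stand-in under stated assumptions: it uses Python's built-in sqlite3 module, not Hive, and the `sales` table, its columns, and its rows are invented for illustration. The SELECT with GROUP BY and ORDER BY is ordinary SQL of the kind HiveQL also supports, but nothing Hive-specific (warehouse directory, metastore, file formats) is modeled here.

```python
import sqlite3

# Hypothetical sample data: (region, amount). Not taken from the syllabus.
ROWS = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 60.0),
    ("east", 200.0),
    ("south", 40.0),
]


def sales_summary(rows):
    """Return (region, total, average) per region, sorted by total descending."""
    conn = sqlite3.connect(":memory:")  # in-memory SQLite database, not Hive
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) AS total, AVG(amount) AS average "
            "FROM sales GROUP BY region ORDER BY total DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total, average in sales_summary(ROWS):
        print(region, total, average)
```

The same query shape (aggregate per group, then sort) is what the Hive querying experiments practice; only the engine and the way data is loaded differ.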

Suggested Learning Pedagogies
Lecture Using Chalk-Board, Video Demonstrations, Collaborative Learning, Hands-on

Unit IV

Big Data Processing Tools
4.1 Apache Spark: Need for Apache Spark, Apache Spark architecture, Spark components, Resilient Distributed Datasets (RDD) transformations
4.2 Spark SQL, Spark Session, DataFrame operations
4.3 Apache Kafka: Need, benefits, Kafka messaging system: point-to-point and publish-subscribe messaging systems, Kafka cluster architecture
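
The two messaging models named under Apache Kafka differ in who receives a message. The sketch below illustrates that difference in plain Python, with no Kafka client involved; the class names `PointToPointQueue` and `PubSubTopic` are hypothetical and are not part of any Kafka API. Kafka's own concepts (brokers, partitions, offsets, consumer groups) are deliberately not modeled.

```python
from collections import deque


class PointToPointQueue:
    """Point-to-point model: each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Publish-subscribe model: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []

    def publish(self, message):
        # Deliver a copy to every subscriber registered at publish time.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def messages_for(self, name):
        return list(self._inboxes[name])


if __name__ == "__main__":
    queue = PointToPointQueue()
    queue.send("order-1")
    print(queue.receive())  # first receiver takes the message
    print(queue.receive())  # nothing is left for a second receiver

    topic = PubSubTopic()
    topic.subscribe("billing")
    topic.subscribe("shipping")
    topic.publish("order-1")
    print(topic.messages_for("billing"), topic.messages_for("shipping"))
```

In the queue, a message disappears once one receiver has taken it; in the topic, both subscribers see the same message independently.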

Suggested Learning Pedagogies
Lecture Using Chalk-Board, Video Demonstrations, Collaborative Learning, Hands-on

Unit V


List of Experiments:

  1. *Write a Python program for analyzing the characteristics of any large dataset case study
  2. *Install Single-node Hadoop cluster
  3. *Write HDFS commands for file operations
  4. *Write MapReduce program for word count operation
  5. Write MapReduce program for matrix addition
  6. Write MapReduce program to find max temperature
  7. *Write MapReduce program to sort array elements
  8. Write basic queries for CRUD operations in MongoDB
  9. *Hive installation and Queries
  10. Create Hive table with different storage format specification
  11. Apply Hive QL clauses
  12. Apache Pig: *a) Install Apache Pig *b) Use Pig operators
  13. *Develop Spark program to perform basic operations on DataFrames using Python
  14. Write a program to read and write data stored in Apache Hive through Spark SQL
  15. Develop a program to perform word count operation
  16. *Write Program to perform RDD transformations
  17. *Develop program to produce and consume message and build simple pipeline
  18. *Write program to visualize data with various plots using Matplotlib
  19. *Write program to visualize data with various plots using Seaborn
  20. Develop program to create dashboard using Plotly
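
As a warm-up for the MapReduce word count experiment, the three phases can be traced in plain Python before moving to Hadoop. This is a minimal single-process sketch, not a Hadoop program: the function names `map_phase`, `shuffle_phase`, and `reduce_phase` are illustrative assumptions, and the grouping that Hadoop performs across the cluster is imitated here with a dictionary.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, imitating the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data", "Big Data tools"]  # hypothetical input lines
    print(word_count(sample))
```

The max-temperature experiment follows the same pattern, with the reducer taking the maximum of the grouped values instead of their sum.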

Self Learning

Micro Project

  • Analyze an agricultural dataset to gain insights into crop production, seasonal trends, and regional performance with Apache Hive.
  • Analyze Amazon customer reviews to gain insights into product ratings, review trends, and customer sentiment with Apache Spark.
  • Generate a report showing total sales and average sales per region using Hive for SQL-like querying.
  • Create meaningful visualizations of the like, share, and comment counts in a Facebook dataset using Tableau or Power BI.

Other

  • Complete any one course related to big data technologies on Infosys Springboard or any other MOOCs Platform.

Assignment

  • Select any one big data technology:
    1. Perform a hands-on exercise.
    2. Write down the steps and challenges faced during the exercise.
  • NoSQL database:
    1. Compare and contrast different NoSQL databases.
    2. Write down the scenarios where NoSQL databases are more suitable than traditional relational databases.

Laboratory Equipment


Learning Materials

  1. M. Vijayalakshmi, Radha Shankarmani, Big Data Analytics, 2nd ed., Wiley, New Delhi, c2017, 2022. ISBN: 9788126565757
  2. Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia, Learning Spark: Lightning-Fast Big Data Analysis, O’Reilly Media, January 28, 2015. ISBN-10: 1449358624, ISBN-13: 978-1449358624
  3. Pramod J. Sadalage, Martin Fowler, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, Addison-Wesley, August 10, 2012. ISBN: 978-0321826626
  4. Tom White, Hadoop: The Definitive Guide, 4th ed., O’Reilly Media, Inc., April 2015. ISBN: 9781491901632
  5. Kieran Healy, Data Visualization: A Practical Introduction, Princeton University Press, March 31, 2018. ISBN: 978-0691181622

Learning Websites

  1. https://intellipaat.com/blog/big-data-tutorial-for-beginners Basic Concepts of big data & its Technologies
  2. https://www.edureka.co/blog/big-data-tutorial Big Data Technologies
  3. https://www.javatpoint.com/hadoop-tutorial Hadoop Ecosystem
  4. https://www.tutorialspoint.com/apache_spark/index.htm Apache Spark
  5. https://www.bigdataelearning.com/blog/apache-hive-beginners-guide Apache Hive
  6. https://issuu.com/melissamatiasf/docs/ebook-data-visualization-en Data Visualization
  7. https://www.w3schools.com/python/matplotlib_intro.asp Data Visualization: Matplotlib
  8. https://www.w3schools.com/python/numpy/numpy_random_seaborn.asp Data Visualization: Seaborn

For the detailed syllabus of all other Cloud Computing & Big Data subjects, K scheme, visit Cloud Computing & Big Data 5th Sem Syllabus for K scheme.

For all Cloud Computing & Big Data results, visit MSBTE Cloud Computing & Big Data all semester results direct links.
