5278: Big Data Lab Syllabus for Cloud Computing & Big Data 5th Sem 2021 Revision SITTTR

The detailed Big Data Lab syllabus for Cloud Computing & Big Data (CC), 2021 revision curriculum, has been taken from the SITTTR official website and is presented here for Cloud Computing & Big Data students. For the course code, course name, number of credits, and other scheme-related information, see the full semester subjects post.

For the Cloud Computing & Big Data 5th Sem scheme and its subjects, see the Cloud Computing & Big Data (CC) 5th Sem 2021 revision scheme. The detailed syllabus of the Big Data Lab is as follows.

Course Objectives:

  • Optimize decisions and improve efficacy in business administration with big data analytics.
  • Practice the Java concepts required for developing MapReduce programs.
  • Impart the architectural concepts of Hadoop and introduce the MapReduce paradigm.
  • Familiarize students with the programming tools Pig and Hive in the Hadoop ecosystem.
  • Implement best practices for Hadoop development in real-world problems.

Course Outcomes:

On completion of the course, the student will be able to:

  1. Demonstrate the installation of VMware and the Hadoop operating modes.
  2. Practice the basic commands of the Linux operating system and file management in Hadoop.
  3. Illustrate the MapReduce paradigm and its programming applications in solving real-time problems.
  4. Perform the installation and application of programming tools such as Pig and Hive in the real-world Hadoop ecosystem.

Module 1:

Installation of VMware to set up the Hadoop environment and its ecosystem; setting up and installing Hadoop in its three operating modes (standalone, pseudo-distributed, and fully distributed); using web-based tools such as Spark, Impala, Mahout, Chukwa, Apache HBase, and Ambari to monitor the Hadoop setup.

Module 2:

Use commands such as pwd, mkdir, rmdir, ls, cd, touch, cat, rm, cp, and mv to practice basic operations on the Linux operating system. File management tasks in Hadoop: create a directory in HDFS at the given path(s); list the contents of a directory; upload and download a file in HDFS; view the contents of a file; copy a file from source to destination; copy a file between the local file system and HDFS in either direction; move a file from source to destination; remove a file or directory in HDFS; display the last few lines of a file; rename a file or directory; display the aggregate length of a file.
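The file management tasks above correspond to subcommands of the `hdfs dfs` shell. The Python sketch below only builds the argument lists and does not run them, so it works without a Hadoop installation. The helper name `hdfs_cmd`, the task labels, and the example paths are illustrative assumptions and are not part of the syllabus.

```python
# Map each Module 2 task to the `hdfs dfs` flags that perform it.
# The task labels on the left are made up for this sketch.
HDFS_TASKS = {
    "create_dir": ["-mkdir", "-p"],
    "list": ["-ls"],
    "upload": ["-put"],
    "download": ["-get"],
    "view": ["-cat"],
    "copy": ["-cp"],
    "copy_from_local": ["-copyFromLocal"],
    "copy_to_local": ["-copyToLocal"],
    "move": ["-mv"],            # also used to rename a file or directory
    "remove": ["-rm", "-r"],
    "tail": ["-tail"],
    "aggregate_length": ["-du", "-s"],
}


def hdfs_cmd(task, *paths):
    """Return the argv list for one task; raises KeyError for unknown tasks."""
    return ["hdfs", "dfs", *HDFS_TASKS[task], *paths]


if __name__ == "__main__":
    # Hypothetical paths, printed as the command lines a student would type.
    print(" ".join(hdfs_cmd("create_dir", "/user/student/lab")))
    print(" ".join(hdfs_cmd("upload", "notes.txt", "/user/student/lab")))
```

On a machine where `hdfs` is on the PATH, each list could be passed to `subprocess.run` to execute the command.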

Module 3:

Set up the MapReduce paradigm to run a basic word count, mine weather data, and perform matrix multiplication and K-means clustering.
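As a rough illustration of the word count exercise, the sketch below simulates the map, shuffle, and reduce phases in plain Python. It does not use the Hadoop API (lab programs would normally be written in Java against Hadoop), and the function names and sample lines are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data lab", "big data tools"]  # made-up input
    print(word_count(sample))
```

Keeping the three phases as separate functions mirrors how the mapper and reducer are written independently in a MapReduce job, with the grouping step handled by the framework.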

Sample Open Ended Experiments
(Not evaluated for End Semester Examination but to be included in Continuous Internal Evaluation.) Students can do open-ended real-time projects/experiments in groups of 2-3. Experiments should not be duplicated between groups. Consider the following sample situations:

  • Twitter Data Sentiment Analysis
  • Wiki Page Ranking with Hadoop
  • Health Care Data Management using Apache Hadoop Ecosystem
  • Retail Data Analysis Using Big Data
  • Aadhaar-Based Analysis Using Hadoop
  • Climatic Data Analysis Using Hadoop
  • Flight History Analysis
  • Pseudo-Distributed Hadoop Cluster in Script
  • Sensex log data processing using Big Data tools

Students should gather the required information, develop algorithms, and map them to the Hadoop ecosystem for implementation.

Module 4:

Set up and install Pig and Hive. Write Pig Latin scripts to sort, group, join, project, and filter data. Demonstrate how to (a) run a Pig Latin script to find the word count and (b) run a Pig Latin script to find the maximum temperature for each year. Use Hive to create, alter, and drop databases, tables, views, functions, and indexes.
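The maximum-temperature exercise reduces to a group-by-year step followed by a maximum over each group, which in Pig Latin would typically be expressed with GROUP and MAX. The plain-Python sketch below shows only that logic; it is not Pig Latin, and the function name and sample readings are invented for illustration.

```python
from collections import defaultdict


def max_temp_per_year(records):
    """Group (year, temperature) records by year and keep the maximum."""
    by_year = defaultdict(list)
    for year, temp in records:
        by_year[year].append(temp)  # grouping step
    # Aggregation step: one maximum per year.
    return {year: max(temps) for year, temps in by_year.items()}


if __name__ == "__main__":
    # Hypothetical readings as (year, temperature) pairs.
    readings = [(1949, 111), (1949, 78), (1950, 0), (1950, 22), (1950, -11)]
    print(max_temp_per_year(readings))
```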

Sample Open Ended Experiments
(Not evaluated for End Semester Examination but to be included in Continuous Internal Evaluation.) Students can do open-ended real-time projects/experiments in groups of 2-3. Experiments should not be duplicated between groups. Consider the following sample scenarios:

  • Corporate data integration.
  • A use case for scalability.
  • Facebook Data Analysis Using Hadoop and Hive
  • Link prediction for social media sites.
  • Document analysis application.
  • Specialized analytics.
  • Streaming analytics.
  • Web-Based Data Management of Apache Hive

Students should gather the required information, develop algorithms, and map them to the Hadoop ecosystem for implementation.

Text Books:

  1. Jay Liebowitz, “Big Data and Business Analytics Laboratory”, CRC Press.
  2. Seema Acharya, Subhashini Chellappan, “Big Data Analytics”, Wiley, 2015

Reference Books:

  1. Tom White, “Hadoop: The Definitive Guide”, Third Edition, O’Reilly Media, 2012

Online Resources

  1. https://www.vmware.com/tryvmware/?p=workstation-w
  2. https://www.tutorialspoint.com/apache_pig/index.htm
  3. https://www.tutorialspoint.com/hive/index.htm
  4. http://apache.mirrors.lucidnetworks.net/hadoop/
  5. https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

