IT803SE-05: Web and Text Mining Syllabus for IT 8th Sem 2020-21 DBATU (Elective-XII)

The detailed Web and Text Mining syllabus scheme for Information Technology (IT), 2020-21 onwards, has been taken from the DBATU official website and is presented here for Bachelor of Technology students. For the Subject Code, Course Title, Lectures, Tutorials, Practice, Credits, and other information, visit the full semester subjects post given below.

For the 8th Sem Scheme of Information Technology (IT), 2020-21 Onwards, visit IT 8th Sem Scheme, 2020-21 Onwards. For the Elective-XII scheme of 8th Sem 2020-21 onwards, refer to IT 8th Sem Elective-XII Scheme 2020-21 Onwards. The detailed syllabus for Web and Text Mining is as follows.

Web and Text Mining Syllabus for Information Technology (IT) 4th Year 8th Sem 2020-21 DBATU

Web And Text Mining

Course Objectives:

For the complete syllabus, results, class timetable, and many other features, kindly download the iStudy App.
It is a lightweight, easy-to-use platform with no images and no PDFs, meant to make students' lives easier.
Get it on Google Play.

Course Outcomes:

After completing the course, students should be able:

  1. To examine the types of data to be mined and present a general classification of the tasks and primitives needed to integrate a data mining system.
  2. To explore data warehouses (DWH) and OLAP, and devise efficient and cost-effective methods for maintaining DWHs.
  3. To discover interesting patterns in large amounts of data, and to analyze and extract those patterns to solve problems and predict outcomes.
  4. To comprehend the roles that data mining plays in various fields and apply different data mining techniques.
  5. To systematically evaluate supervised and unsupervised models and algorithms with respect to their accuracy.

Unit I

Introduction to information retrieval, inverted indices and Boolean queries. Query optimization. The nature of unstructured and semi-structured text.
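
As an illustration of the inverted index and Boolean query topics in this unit, the following is a minimal sketch and not part of the official syllabus. It assumes whitespace tokenization and lowercasing; the function names (`build_inverted_index`, `intersect`, `boolean_and`) and the sample documents are hypothetical. Processing the shortest postings lists first is one simple form of the query optimization mentioned above.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Assumption: tokens are lowercased, whitespace-separated words.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def intersect(p1, p2):
    """Merge two sorted postings lists, keeping IDs present in both."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result

def boolean_and(index, terms):
    """Answer a conjunctive query, intersecting the shortest lists first."""
    postings = sorted((index.get(t, []) for t in terms), key=len)
    if not postings:
        return []
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
    return result

# Hypothetical sample collection.
docs = [
    "web mining and text mining",
    "text retrieval with inverted indices",
    "boolean queries over text",
]
index = build_inverted_index(docs)
print(boolean_and(index, ["text", "mining"]))
```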

Unit II

Unit III

Index compression: lexicon compression and postings list compression. Gap encoding, gamma codes, Zipf's Law. Blocking. Extreme compression. Query expansion: spelling correction and synonyms. Wild-card queries, permuterm indices, n-gram indices. Edit distance, soundex, language detection. Index construction. Postings size estimation, merge sort, dynamic indexing, positional indexes, n-gram indexes, real-world issues.
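
To make the gap encoding and gamma code topics concrete, here is a minimal sketch and not part of the official syllabus. It assumes document IDs start at 1 (gamma codes cannot represent 0) and represents bit strings as Python strings of "0" and "1" for readability; the function names are hypothetical. A gamma code writes the length of the offset in unary, followed by the offset, which is the binary form of the number without its leading 1.

```python
def to_gaps(postings):
    """Replace each doc ID (after the first) by its distance from the previous one."""
    if not postings:
        return []
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]

def gamma_encode(n):
    """Gamma code of a positive integer, returned as a string of bits."""
    if n < 1:
        raise ValueError("gamma codes are defined for integers >= 1")
    offset = bin(n)[3:]  # binary digits of n with the leading 1 removed
    return "1" * len(offset) + "0" + offset  # unary length, then offset

def gamma_decode(bits):
    """Decode a concatenation of gamma codes back into a list of integers."""
    numbers = []
    i = 0
    while i < len(bits):
        length = 0
        while bits[i] == "1":  # read the unary length
            length += 1
            i += 1
        i += 1  # skip the terminating 0
        numbers.append(int("1" + bits[i:i + length], 2))
        i += length
    return numbers

postings = [3, 7, 20]  # hypothetical postings list, doc IDs >= 1
gaps = to_gaps(postings)
encoded = "".join(gamma_encode(g) for g in gaps)
print(gaps, encoded, gamma_decode(encoded))
```

Small gaps get short codes, which is why gap encoding pays off for frequent terms whose postings lie close together.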

Unit IV

Parametric or fielded search, Document zones, the vector space retrieval model, tf.idf weighting, Scoring documents, Vector space scoring, the cosine measure, Efficiency considerations, Nearest neighbor techniques, reduced dimensionality approximations, random projection. Results summaries: static and dynamic, Evaluating search engines. User happiness, precision, recall, F-measure, Creating test collections: kappa measure, interjudge agreement. Relevance, approximate vector retrieval.
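
The vector space scoring topics in this unit can be sketched as follows. This is an illustrative example, not part of the official syllabus. It assumes one common weighting variant, raw term frequency times log(N/df), with whitespace tokenization; the function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return sparse tf-idf vectors (dicts) for the documents, plus the idf table."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / c) for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf

def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, docs):
    """Return document IDs ordered by decreasing cosine similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    counts = Counter(query.lower().split())
    # Terms unseen in the collection get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in counts.items()}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]

# Hypothetical sample collection.
docs = [
    "text mining methods",
    "web search engines",
    "text retrieval and web mining",
]
print(rank("search engines", docs))
```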

Unit V

Unit VI

Introduction to the problem, Partitioning methods, K-means clustering, Mixture of Gaussians model, clustering versus classification, Hierarchical agglomerative clustering, clustering terms using documents, Labelling clusters, Evaluating clustering, Text-specific issues, Reduced dimensionality/spectral methods, Latent semantic indexing (LSI), Applications to clustering and to information retrieval.
Vector space classification using hyperplanes; centroids; k-nearest neighbors, support vector machine classifiers, Kernel functions, Text classification, Exploiting text-specific features, Feature selection, Evaluation of classification, Micro- and macro-averaging, comparative results.
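
As an illustration of the K-means clustering topic in this unit, the following is a minimal sketch and not part of the official syllabus. It assumes points are numeric tuples of equal dimension and, to stay deterministic, uses the first k points as initial centroids (real implementations usually choose them randomly). The names `sq_dist` and `kmeans` and the sample points are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100):
    """Cluster points into k groups; returns (assignments, centroids)."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assignments, centroids

# Hypothetical sample data: two well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
assignments, centroids = kmeans(points, k=2)
print(assignments, centroids)
```

For text, the points would typically be tf-idf document vectors rather than 2D coordinates.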

Text Books:

  1. Michael Geatz and Richard Roiger, Data Mining: A Tutorial Based Primer, Pearson Education
  2. Thomas W. Miller, Data and Text Mining: A Business Applications Approach, Pearson Education
  3. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Pearson Education

Reference Books:

  1. R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Pearson Education, 1999
  2. D.A. Grossman, O. Frieder, Information Retrieval: Algorithms and Heuristics, Springer, 2004.
  3. W. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures and Algorithms, Pearson Education, 1st Edition.

For the detailed syllabus of all subjects of Information Technology (IT) 8th Sem 2020-21 onwards, visit IT 8th Sem Subjects of 2020-21 Onwards.
