| Start Date: | 2025-01-13 | Course Code: | CS 332 | L-T-P-C: | 3-0-0 |
|---|---|---|---|---|---|
| Course Name: | Natural Language Processing | Semester: | 6th Semester (Elective) | Course Faculty: | Partha Pakray |
Course Plan
Natural Language Processing (CS 332)
Professional Core Elective - I (6th Semester, CSE)
Course Details
Course Code: CS 332
Start Date: 13.01.2025
Course Faculty: Dr. Partha Pakray
Associate Professor, Department of Computer Science & Engineering
National Institute of Technology Silchar, Assam, INDIA
Textbooks
- Jurafsky D., Martin J. H., Speech and Language Processing, Prentice Hall.
- Manning C., Schütze H., Foundations of Statistical Natural Language Processing, MIT Press.
Course Outcomes (COs)
- Understand basic concepts in linguistics.
- Learn fundamental mathematical models and algorithms in NLP.
- Apply these models and algorithms in software design for NLP.
- Understand theoretical underpinnings of NLP in linguistics and formal language theory.
| Unit | Topic | Hours | Content |
|---|---|---|---|
| Unit 1 | Introduction to NLP | 1 | Overview, applications, challenges |
| | Regular Expressions & Text Normalization | 2 | Tokenization, case folding, stemming, lemmatization |
| | Edit Distance | 1 | Levenshtein distance, applications in NLP |
| | N-gram Language Models | 2 | Smoothing, perplexity, applications |
| | Ambiguity, Naive Bayes, and Sentiment Classification | 1 | Ambiguity in language, Naive Bayes for text classification, sentiment analysis |
| | Vector Semantics | 1 | Word embeddings, cosine similarity |
| Unit 2 | Neural Networks and Neural Language Models | 2 | Feedforward networks, Word2Vec, GloVe |
| | RNN, LSTM, GRU | 2 | Recurrent architectures, handling long-term dependencies |
| | Part-of-Speech Tagging | 1 | Definition, applications, tagsets (Penn Treebank) |
| | HMM and Maximum Entropy Models | 2 | Probabilistic sequence models, applications |
| | CRF (Conditional Random Fields) | 1 | Overview, usage in sequence labeling |
| | Sequence Processing with Recurrent Networks | 2 | Applications of RNNs and LSTMs in tagging and entity recognition |
| Unit 3 | Formal Grammars of English | 1 | CFGs, derivations, basic structures |
| | Treebanks as Grammars | 1 | Penn Treebank, constituency structures |
| | Syntactic Parsing | 2 | Top-down, bottom-up parsing |
| | Statistical Parsing and PCFG | 2 | Probabilistic CFGs, statistical approaches |
| | Dependency Parsing | 2 | Dependency grammars, transition-based and graph-based parsing |
| Unit 4 | The Representation of Sentence Meaning | 2 | Logical forms, semantic representation, challenges |
| | Word Sense Disambiguation (WSD) | 1 | Supervised and unsupervised methods, Lesk algorithm |
| | Information Extraction | 2 | Named entity recognition, relation extraction |
| | Semantic Role Labeling | 1 | Predicate-argument structure, FrameNet |
| | Lexicons for Sentiment and Discourse Coherence | 2 | Sentiment lexicons, discourse parsing |
| Unit 5 | Machine Translation | 2 | Statistical, rule-based, neural machine translation |
| | Question Answering | 1 | QA systems, applications in NLP |
| | Dialog Systems and Chatbots | 1 | Architecture, types, conversational AI |
| | Speech Recognition and Synthesis | 2 | ASR systems, TTS systems, deep learning techniques |
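As a small taste of the Unit 1 material, the edit-distance topic can be sketched with a standard dynamic-programming implementation of Levenshtein distance (this is an illustrative sketch, not course-provided code; unit costs for insertion, deletion, and substitution are assumed):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    m, n = len(a), len(b)
    # prev[j] holds the distance between a[:i-1] and b[:j];
    # curr[j] is being filled for a[:i].
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

print(levenshtein("intention", "execution"))  # → 5
```

The "intention" → "execution" pair is the classic worked example from Jurafsky and Martin's textbook; with unit substitution cost the distance is 5.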
Resources
Class PPTs and Notes
- Introduction to NLP
- Tokenization
- Lemmatization and Tokenization
- Porter Stemmer
- Regular Expression
- POS Tagging
- Language Model
- Probabilistic Language Model
- Recurrent Neural Network (RNN) Model
Attendance
Shared via Google Sheets.
Course Evaluation
- End Semester: 50
- Mid Semester: 30
- Assignment + Tutorials: 10
- Minor Test: 10
Course Feedback
Feedback link to be shared later.
Note:
For any additional information, refer to the resources shared in class.

Natural Language Processing (CS 332): Lab Experiments
Guessing Game
- Guess Number
- Guess Word