Text Retrieval and Search Engines
University of Illinois at Urbana-Champaign via Coursera
Overview
This course covers search engine technologies, which play an important role in any data mining application involving text data, for two reasons. First, while the raw data for a particular problem may be large, often only a relatively small subset of it is relevant, and a search engine is an essential tool for quickly finding that subset in a large text collection. Second, search engines help analysts interpret patterns discovered in the data by letting them examine the relevant original text. You will learn the basic concepts, principles, and major techniques of text retrieval, which is the underlying science of search engines.
Syllabus
- Orientation
- You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.
- Week 1
- During this week's lessons, you will learn about natural language processing techniques, which are the foundation for all kinds of text-processing applications, the concept of a retrieval model, and the basic idea of the vector space model.
- Week 2
- In this week's lessons, you will learn how the vector space model works in detail, the major heuristics used in designing a retrieval function for ranking documents with respect to a query, and how to implement an information retrieval system (i.e., a search engine), including how to build an inverted index and how to score documents quickly for a query.
- Week 3
- In this week's lessons, you will learn how to evaluate an information retrieval system (a search engine), including the basic measures for evaluating a set of retrieved results and the major measures for evaluating a ranked list, including the average precision (AP) and the normalized discounted cumulative gain (nDCG), and practical issues in evaluation, including statistical significance testing and pooling.
- Week 4
- In this week's lessons, you will learn probabilistic retrieval models and statistical language models, particularly the details of the query likelihood retrieval function with two specific smoothing methods, and how the query likelihood retrieval function is connected with the retrieval heuristics used in the vector space model.
- Week 5
- In this week's lessons, you will learn feedback techniques in information retrieval, including the Rocchio feedback method for the vector space model, and a mixture model for feedback with language models. You will also learn how web search engines work, including web crawling, web indexing, and how links between web pages can be leveraged to score web pages.
- Week 6
- In this week's lessons, you will learn how machine learning can be used to combine multiple scoring factors to optimize ranking of documents in web search (i.e., learning to rank), and learn techniques used in recommender systems (also called filtering systems), including content-based recommendation/filtering and collaborative filtering. You will also have a chance to review the entire course.
Taught by
ChengXiang Zhai
Related Courses
- Text Mining and Analytics. University of Illinois at Urbana-Champaign. Rating: 3.7
- Discrete Optimization. University of Melbourne. Rating: 4.3
- Solving Algorithms for Discrete Optimization. University of Melbourne, The Chinese University of Hong Kong.
- Data Mining. University of Illinois at Urbana-Champaign. Rating: 5.0
- Machine Learning: Clustering & Retrieval. University of Washington. Rating: 4.8
- Introduction to Search Engine Optimization. University of California, Davis. Rating: 3.8
Reviews
3.3 rating, based on 14 reviews
- Gregory J Hamel (Life Is Study) completed this course and found the course difficulty to be medium.
  Text Retrieval and Search Engines is the second course in Coursera's new data mining specialization offered by the University of Illinois at Urbana-Champaign. The course covers a variety of topics in text data mining and natural language processing including...
- Marianne Cardwell completed this course.
  I've taken a number of courses on Coursera and have thoroughly enjoyed some of them, but it's clear that the quality varies. I was very disappointed in this course. Having applied to the University of Illinois' Master of Computer Science - Data Science,...
- Anonymous is taking this course right now.
  I was initially excited for this course as it seemed a good dive into unstructured text data. But now I'd say: *skip this course*. I think the instructor is okay and presents the material well enough to get a decent grasp of it. The...
- Anonymous completed this course.
  Great class with a nice mix of theoretical and practical lessons. There was a competition at the end of the course which pushed us to come up with new ideas.
- Anonymous completed this course.
  Precise and clear explanation of the concepts. This course focuses entirely on text retrieval, with a strong intro on what text retrieval is and what challenges it faces, and it gives insight into various models and improvements in the field. It is therefore mostly for people interested in information retrieval.
- Anonymous completed this course.
  Pretty good.
- Lien Block completed this course, spending 2 hours a week on it and found the course difficulty to be medium.
  The course is not very organised, and even though they share a lot of information, it's not really very useful for someone who wants to get his/her hands dirty and really learn NLP/text retrieval.
  (+ Instructor is sometimes very hard to understand)
- Kristina Šekrst completed this course and found the course difficulty to be medium.
  I'd encourage more programming assignments dealing with NLP, a bit less focus on C++, and more R/Python support. It was a fun experience, and I hope that the theoretical approach will slowly turn into a combination of theory and practice.
- Anonymous completed this course.
  It's not complete, but it's a good starting point for anyone who wants to learn more about information retrieval. Great course. I recommend it.
- Colin Khein completed this course.
- Basil Rormose completed this course.
- Mike Rocke completed this course.
- Deepak Jois is taking this course right now and found the course difficulty to be medium.
- Rafael Prados completed this course.