CS246, Mining Massive Data Sets (Stanford)

The course covers data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis is on MapReduce and Spark as tools for building parallel algorithms that scale to data of that size.
Topics include: frequent itemsets and association rules, near-neighbor search in high-dimensional data, locality-sensitive hashing (LSH), dimensionality reduction, recommendation systems, clustering, link analysis, large-scale supervised machine learning, data streams, mining the Web for structured data, and Web advertising.


-iceberg_online(data) 2022-11-30
Lecture 01


-iceberg_online(data) 2022-11-30
Slide:
http://web.stanford.edu/class/cs246/slides/01-intro.pdf -iceberg_online(data) 2022-12-26
Suggested Readings - Chapter 1: Data Mining
http://infolab.stanford.edu/~ullman/mmds/ch1n.pdf -iceberg_online(data) 2022-12-27
Very good. I actually managed to watch this whole lecture, though the lecturer's cough was rather alarming.
-nodream(~~~) 2022-12-27
Suggested Readings - Chapter 2: Large-Scale File Systems and Map-Reduce
http://infolab.stanford.edu/~ullman/mmds/ch2n.pdf -iceberg_online(data) 2023-1-7
Lecture 02


-iceberg_online(data) 2022-11-30
Lecture 03


-iceberg_online(data) 2022-11-30