CSE 450: Web Mining Seminar
Fall 2005 Reading Schedule

Date Paper(s) Presenter Critic Best Review Class Consensus
Tuesday
Aug 30
  • Michael J. Hanson and Dylan J. McNamee. Efficient Reading of Papers in Science and Technology, 2000.
  • Roy Levin and David D. Redell. An Evaluation of the Ninth SOSP Submissions -or- How (and How Not) to Write a Good Systems Paper. Operating Systems Review, 17(3):35-40, July 1983.
  • Alan Jay Smith. The Task of the Referee. IEEE Computer, 23(4):65-71, April 1990.
  • Prof. Davison N/A N/A N/A
    Thursday
    Sep 1
  • K. Efe, V. Raghavan, and A. Lakhotia. Content and Link Structure Analysis for Searching the Web. Chapter from Computational Web Intelligence: Intelligent Technology for Web Applications, 2004.
  • Nie Khan N/A 3.5
    Discuss Reviewing Davison N/A N/A N/A
    Tuesday
    Sep 6
  • Vannevar Bush. As we may think. Atlantic Monthly, July 1945.
  • Hookway Goel Goel 4.0
    Discuss projects Davison N/A N/A N/A
    Thursday
    Sep 8
  • Trawling the web for emerging cyber-communities, Kumar, Raghavan, Rajagopalan, Tomkins, 1999.
  • Qi Hogg Goel 4.0
  • Data mining for hypertext: A tutorial survey, Chakrabarti, 2000.
  • N/A N/A Qi 4.0
    Tuesday
    Sep 13
  • Focused crawling: a new approach to topic-specific Web resource discovery, Chakrabarti, van den Berg, and Dom, 1999.
  • Hogg Nie Hookway 4.0
  • Web Mining Research: A Survey, Kosala and Blockeel, 2000.
  • N/A N/A Hogg 2.5
    Thursday
    Sep 15
  • MapReduce: Simplified Data Processing on Large Clusters, Dean and Ghemawat, 2004. Also skim background papers: The Google File System, Ghemawat et al., 2003, and Web search for a planet: The Google cluster architecture, Barroso et al., 2003.
  • Wu Khan Hogg 3.5
  • Interpreting the Data: Parallel Analysis with Sawzall, Pike, Dorward, Griesemer and Quinlan, 2005.
  • Hookway Qi Wu 3.0
    Tuesday
    Sep 20
  • What is this Page Known for? Computing Web Page Reputations, Rafiei and Mendelzon, 2000.
  • Khan Wu Goel 3.5
    Present project ideas Davison N/A N/A N/A
    Thursday
    Sep 22
  • Detecting Phrase-Level Duplication on the World Wide Web, Fetterly, Manasse, and Najork, 2005.
  • Goel Hookway Wu 3
    Present project ideas Davison N/A N/A N/A
    Tuesday
    Sep 27
  • The Structure of Broad Topics on the Web, Chakrabarti, Joshi, Punera, Pennock, 2002.
  • Qi Hogg Nie 4
    Thursday
    Sep 29
  • Using ODP Metadata to Personalize Search, Chirita, Nejdl, Paiu, and Kohlschuetter, 2005.
  • Nie Qi Wu 2.5
    Two-page Project Proposals due Davison N/A N/A N/A
    Tuesday
    Oct 4
  • Efficient Identification of Web Communities, Flake, Lawrence and Giles, 2000.
  • Wu Khan Hogg 3
    Thursday
    Oct 6
  • When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics, Bharat and Mihaila, 2001.
  • Khan Hookway Nie 3.5
    Tuesday
    Oct 11
    Pacing Break
    Thursday
    Oct 13
  • Propagation of Trust and Distrust, Guha, Kumar, Raghavan, and Tomkins, 2004.
  • Hogg Wu Hogg 4
    Tuesday
    Oct 18
  • Exploiting the hierarchical structure for link analysis, Xue, Yang, Zeng, Yu, and Chen, 2005.
  • Qi Nie Goel 4
    Wednesday
    Oct 19
  • Mapping the Internet and Intranets (9am in PL303 for a discussion of the mechanics of mapping, 3:30pm in lobby for refreshments, and 4pm talk in PL416)
  • Cheswick N/A N/A N/A
    Thursday
    Oct 20
  • Mining the Semantic Web: Requirements for Machine Learning, Ciravegna and Chapman, 2005. Also skim background paper: The Semantic Web, Berners-Lee, Hendler and Lassila, 2001.
  • Hookway Goel Hogg 2
    Tuesday
    Oct 25
  • A Unified Probabilistic Framework for Web Page Scoring Systems, Diligenti, Gori, and Maggini, 2004.
  • Khan Hogg N/A 3.5
    Thursday
    Oct 27
  • Accurately Interpreting Clickthrough Data as Implicit Feedback, Joachims, Granka, Pan, Hembrooke, and Gay, 2005. Also skim background paper: Optimizing Search Engines using Clickthrough Data, Joachims, 2002.
  • Goel Wu N/A 4
    Tuesday
    Nov 1
  • The evolving mSpace platform: leveraging the semantic web on the trail of the memex, Schraefel, Smith, Owens, Russell, Harris, and Wilson, 2005.
  • Hogg Hookway N/A 3
    Thursday
    Nov 3
  • Searching the Workplace Web, Fagin, Kumar, McCurley, Novak, Sivakumar, Tomlin, and Williamson, 2003.
  • Khan Goel N/A 3.5
    Tuesday
    Nov 8
  • Block-level link analysis, Cai, He, Wen, and Ma, 2004.
  • Nie Qi N/A 4
    Thursday
    Nov 10
  • Ranking Definitions with Supervised Learning Methods, Xu, Cao, Li, and Zhao, 2005.
  • Wu Khan N/A 3.5
    Tuesday
    Nov 15
  • Query type classification for web document retrieval, Kang and Kim, 2003.
  • Qi Goel N/A 3
    Thursday
    Nov 17
  • Thwarting the Nigritude Ultramarine: Learning to Identify Link Spam, Drost and Scheffer, 2005.
  • Hookway Nie N/A 2.5
    Tuesday
    Nov 22
  • Is Question Answering an Acquired Skill?, Ramakrishnan, Chakrabarti, Paranjpe, and Bhattacharyya, 2004.
  • Goel Hogg N/A 4
    Thursday
    Nov 24
    Thanksgiving Break
    Tuesday
    Nov 29
  • Undue Influence: Eliminating the Impact of Link Plagiarism on Web Search Rankings, Wu and Davison, 2006.
  • Wu Khan N/A 3.5
    Thursday
    Dec 1
  • Anti-aliasing on the Web, Novak, Raghavan, and Tomkins, 2004.
  • Hogg Nie N/A 3
    Tuesday
    Dec 6
  • Opinion observer: analyzing and comparing opinions on the Web, Liu, Hu, and Cheng, 2005.
  • Goel Hookway N/A 3
    Thursday
    Dec 8
    Discussion Davison N/A N/A N/A
    Friday
    Dec 16
    Final Exam slot 7:10-10:00pm for Presentations in MG110 Everyone N/A N/A N/A


    This page is http://www.cse.lehigh.edu/~brian/course/webmining/schedule.html
    Last revised: 8 December 2005.