Fernando Diaz

Principal Research Manager
Microsoft Research Montreal

diazf [at] acm [dot] org

Introduction

My primary research interest is information retrieval, the formal study of searching large collections of data for small bits of information. The most familiar instance of information retrieval is web search, where users search a collection of webpages for one or a few relevant webpages. Information retrieval, however, goes beyond web search and includes topics such as cross-lingual retrieval, personalization, desktop search, and interactive retrieval. My research experience includes distributed information retrieval approaches to web search, interactive and faceted retrieval, mining of temporal patterns from news and query logs, cross-lingual information retrieval, graph-based retrieval methods, and exploiting information from multiple corpora. In my dissertation work, I studied the relationship between document clustering and document scoring for retrieval using methods from machine learning and statistics. As a result, I developed an algorithm for system self-assessment and self-tuning that significantly improves the performance of retrieval algorithms across a variety of corpora.

Detailed information can be found in my curriculum vitae.

Publications

Organization

Conferences

ACM International Conference on Web Search and Data Mining (WSDM 2014)

Workshops

Workshop on Fairness, Accountability, and Transparency on the World Wide Web (FATWEB)
2017

WSDM Workshop on the Ethics of Online Experimentation
2016

SIGIR Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR)
2015

Social Web for Disaster Management (SWDM)
2015, 2016

SIGIR Workshop on Time-Aware Information Access
2012, 2013, 2014

ACM Workshop on Social Web Search and Mining: Analysis of User Generated Content Under Crisis
2011

TREC

Real-Time Summarization
2016

Temporal Summarization
2013, 2014, 2015

Web
2013, 2014

Teaching

Web Search Engines
Department of Computer Science
Courant Institute of Mathematical Sciences
NYU
Spring 2013, Fall 2014, Fall 2016

Experimental Design for Information Systems
University of Trento
Summer 2012

Advanced Information Retrieval and Databases
Department of Computer Science
School of Engineering
NYU
Spring 2011

Code

indri
A clone of indri-5.12 with minor customizations.

trec-data
Scripts to download and standardize TREC query and document sets.

latex-dependencies
Generate .tex dependencies for a root LaTeX file.

latex-merge
Merge a collection of LaTeX source files into a single LaTeX file.

kstem
Standalone Krovetz stemmer.