The AOL guys who published the search logs also wrote a paper. The data download includes a readme which says:
Please reference the following publication when using this collection: G. Pass, A. Chowdhury, C. Torgeson, "A Picture of Search", The First International Conference on Scalable Information Systems, Hong Kong, June 2006.

That paper is readily available online: Google search, ACM citation, author PDF download. The first author is Greg Pass, an AOL employee. I can't find a web page for him, but here are some papers he's published. The second author is Dr. Abdur Chowdhury, "AOL Chief Architect for Research at AOL". Ouch. The third author is Cayley Torgeson of Raybeam Solutions.

I gave the paper a quick read. It's an analysis of usage patterns of search engines: query frequencies, user behaviour, scalability requirements, etc. I didn't see any particularly surprising analysis, but it's a summary of a lot of interesting, hard-to-come-by data.

I have a feeling that the AOL employees who released the search logs were honestly just trying to be good researchers and share their data. Only in this case they blew it. Now I feel a little bad for them.

"The goal of this collection is to provide real query log data that is based on real users. It could be used for personalization, query reformulation or other types of search research."