Analyzing Web Search Behavior for Software Engineering Tasks

Published at IEEE International Conference on Big Data (IEEE BigData), 2020

Web search plays an integral role in software engineering (SE), supporting tasks such as finding documentation, debugging, and installation. In this work, we present the first large-scale analysis of web search behavior for SE tasks using search query logs from Bing, a commercial web search engine. First, we use distant supervision techniques to build a machine learning classifier that extracts SE search queries with an F1 score of 93%. We then analyze one million search sessions to understand how SE-related queries and sessions differ from other queries and sessions. Subsequently, we propose a taxonomy of intents to identify the various contexts in which web search is used in software engineering. Lastly, we analyze millions of SE queries to understand the distribution, search metrics, and trends across these SE search intents. Our analysis shows that SE-related queries form a significant portion of overall web search traffic. Additionally, we find that there are six major intent categories for which web search is used in software engineering. These techniques and insights can not only help improve existing tools but also inspire the development of new tools that aid in finding information for SE-related tasks.
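
To make the distant supervision step more concrete, here is a minimal sketch of the general idea: noisy labels are derived from a cheap heuristic rather than manual annotation, a classifier is trained on them, and F1 is the metric used to score it. The abstract does not describe the actual supervision signal, features, or model, so everything here is an assumption made for illustration: the clicked-domain heuristic, the seed domains in SE_DOMAINS, the toy log, and the choice of a small Naive Bayes model.

```python
import math
from collections import Counter

# Hypothetical seed domains used as the weak-labeling signal (an assumption,
# not taken from the paper).
SE_DOMAINS = {"stackoverflow.com", "github.com"}


def weak_label(clicked_domain):
    """Return 1 (SE) if the clicked result is on a seed domain, else 0."""
    return 1 if clicked_domain in SE_DOMAINS else 0


def train_naive_bayes(queries, labels):
    """Collect per-class token counts for a multinomial Naive Bayes model."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for query, label in zip(queries, labels):
        word_counts[label].update(query.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, query):
    """Pick the class with the higher log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in query.lower().split():
            score += math.log((word_counts[label][token] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (SE) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy (query, clicked domain) pairs standing in for a search log.
LOG = [
    ("python list index out of range", "stackoverflow.com"),
    ("git merge conflict resolve", "github.com"),
    ("java null pointer exception fix", "stackoverflow.com"),
    ("cheap flights to paris", "example-travel.com"),
    ("chocolate cake recipe", "example-food.com"),
    ("weather forecast tomorrow", "example-weather.com"),
]

queries = [query for query, _ in LOG]
labels = [weak_label(domain) for _, domain in LOG]
model = train_naive_bayes(queries, labels)
```

Calling predict(model, "python exception fix") classifies an unseen query using only the weakly labeled examples. In a realistic setting, f1_score would be computed against a small manually labeled test set, not against the noisy labels themselves.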

The paper has been accepted for virtual presentation at IEEE BigData 2020. [Acceptance Rate ≈ 15.5%]

Collaborators: Chetan Bansal, Dr. Tom Zimmermann, Dr. Ahmed Hassan Awadallah, Dr. Nachi Nagappan

Please find all the relevant resources below:

  1. Preprint on arXiv.
  2. Conference paper.