Natural Language Processing (NLP) techniques and tools have become very powerful and are applicable in many domains. In Software Engineering (SE), there are many promising opportunities to apply NLP to improve theory and practice. Recently, investigations have begun to unravel the extent to which large code corpora retrieved from GitHub, StackOverflow, etc. are amenable to analysis using statistical NLP models and algorithms, so that the revolutionary advances in speech recognition, translation, comprehension, etc. can be applied in SE.
This workshop will bring together an international group of researchers in Statistical NLP, Programming Languages, Software Engineering and related fields for an intensive period of discussion and presentation of results in the area. We invite a range of researchers with both NLP and SE backgrounds to come together, discuss their research, establish datasets, tasks, and baselines, and generally help the field build momentum.
The workshop will be held on Sunday, November 13th, co-located with FSE 2016.
Sunday, November 13
- 9:00am – 9:15am
- 9:15am – 10:30am
- 10:30am – 11:00am
- coffee break
- 11:00am – 12:30pm
Session 1: Coding Style
- Learning to Name Code Identifiers by Miltiadis Allamanis and Charles Sutton.
- On the Use of Statistical Machine Translation for Code Beautification and Refactoring by Bogdan Vasilescu.
Session 2: Tracing & Translating
- Augmenting Natural Language Analysis with Trace Links to Mine Domain Facts from Software Requirements by Jin Guo and Jane Cleland-Huang.
- Learning to Translate Docstrings to Function Representations in Standard Library Documentation by Kyle Richardson.
- Using Natural Language Processing to Translate Software Project Queries into Structured Form by Jane Cleland-Huang, Jin Guo, Natawut Monaikul, Sugandha Lohar, William Goss and Alexander Rasin.
- 12:30pm – 2:00pm
- 2:00pm – 3:30pm
Technical Briefing: Statistical NLP for Software
- 3:30pm – 4:00pm
- coffee break
- 4:00pm – 5:30pm
Session 3: Language Models and Code Cloning
- Entropy Guided Spectrum Based Testing by Matthew Irvine and Baishakhi Ray.
- A deep language model for software code by Hoa Khanh Dam, Truyen Tran and Trang Pham.
- Can I Copy this Code? Extracting Norms from Software Licenses using Frame Semantics by Sayonnha Mandal, Robin Gandhi and Harvey Siy.
Session 4: Search & Retrieval
- Towards Improving Q&A Forum Search and Mining: Automatic Identification of Developer Goal and Symptom by Zachary R. Senzer, Lori Pollock and K. Vijay-Shanker.
- Finding Similar Projects in GitHub using Word2Vec and WMD by Md Masudur Rahman.
We gratefully acknowledge sponsorship from NSF. A limited amount of funding is available to support participant travel. As per sponsor guidance, priority for travel funding will be given to natural language processing researchers and their students, and secondarily to software engineering students and faculty, with special consideration for people from under-represented groups and those without other funding sources.
Call for participation
We invite short position papers of at most 4 pages. Submissions will be reviewed primarily for relevance, will not appear in the ACM Digital Library, and may subsequently be published elsewhere. A few of the submissions will be invited for presentation.
Please submit your paper here:
- Aug 8, 2016
- 1-4 page paper due
- Nov 13, 2016
- Workshop date
|Prem Devanbu|University of California, Davis|
|Baishakhi Ray|University of Virginia|
|Abram Hindle|University of Alberta|
|Charles Sutton|University of Edinburgh|
|Christian Bird|Microsoft Research, Redmond|
|Dana Movshovitz-Attias|Google Research|
|Denys Poshyvanyk|College of William & Mary|
|Earl Barr|University College London|
|Tien N. Nguyen|Iowa State University|
|Vladimir Filkov|University of California, Davis|
|Zhendong Su|University of California, Davis|
- Prem Devanbu (UC Davis)
- Tien N. Nguyen (Iowa State University)
- Baishakhi Ray (University of Virginia)
- Earl Barr (University College London)
- Christian Bird (Microsoft Research)