Sowmya Vajjala

Roles and responsibilities

I am a research officer in the Multilingual Text Processing team at the Digital Technologies Research Centre (DT), National Research Council of Canada (NRC-CNRC). My primary work is research in Natural Language Processing. I have also provided research guidance to Canadian companies through the NRC Industrial Research Assistance Program (IRAP), and I regularly mentor co-op students.

Current research and/or projects

- Named Entity Recognition

- Automatic Readability Assessment and Text Simplification

- Interface of NLP research with other disciplines (Education, Language Assessment, Economics, etc.)

Research and/or project statements

I am broadly interested in NLP research and its relevance to other disciplines and industry practice. Specifically, my current research interests are in information extraction and text classification, and in evaluating NLP performance beyond comparing various models. Apart from research, I am also interested in communicating about NLP to various audiences, including students and researchers from other disciplines who are starting out with NLP. 


Education

PhD in Computational Linguistics, Eberhard Karls University of Tuebingen, Germany, 2015 (Summa Cum Laude).

M.S. in Computer Science and Engineering, International Institute of Information Technology, Hyderabad, India, 2009.

B.Eng. in Electronics and Communications Engineering, Osmania University, Hyderabad, India, 2005.

Professional activities/interests

- "Beyond the state of the art models: What is complex text, and what are we simplifying?", Invited talk at the EMNLP 2022 Workshop on Text Accessibility, Readability and Simplification, December 2022

- "NLP Evaluation Beyond a standard test set", Invited talk at Gojek Tech, August 2022

- "NLP without an annotated dataset", 90-minute tutorial delivered at the Toronto Machine Learning Summit (2021) and the Open Data Science Conference (2021) (Slides, Code and Other Materials)

- "NLP beyond NLPers: The many faces of NLP in academia and the real-world", Plenary talk at the 46th Conference of the Japan Association of English Corpus Studies (JAECS), 2020 (Slides)


Association for Computational Linguistics (ACL)

Association for Computing Machinery (ACM)

Key publications

"Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems", a book I co-authored with Bodhisattwa Majumder, Anuj Gupta and Harshit Surana, published by O'Reilly Media in 2020.

A full list of publications can be found on my Google Scholar profile (Sowmya Vajjala).

Previous work experience

- 2018-19: Senior Data Scientist at "The Globe and Mail" (Toronto), and "AbacusNext" (Toronto)

    * Building data science teams: hiring and mentoring

    * Building research prototypes for various NLP use cases

    * Talking to various stakeholders to gather requirements

    * Working with DevOps teams to deploy applications

International experience and/or work

- January 2016-April 2018: Assistant Professor (Tenure Track), Iowa State University, USA

   * Teaching courses in data science, programming, natural language processing, and technical communication

   * Mentoring students

   * Conducting research and writing grant applications

Sowmya Vajjala

Associate Research Officer
Digital Technologies
1200 Montreal Road
Ottawa, Ontario K1A 0R6
Preferred language: English
Other(s): English, Telugu, Hindi

Information Technology, Artificial Intelligence, Machine Learning, Natural Language Processing, Data Science, Modelling