Text analytics

As the volume of data grows exponentially, professionals who deal with vast amounts of information can struggle to keep up. The National Research Council of Canada's (NRC) text analytics team conceives, builds, and evaluates tools that help extract information from textual data: explicit, implicit, and inferred information; composite information (summaries); subtext information (sentiment, sarcasm, metaphor); and meta-information.

The team collaborates with government, industry, and academia to drive text analytics technologies for a range of problem-driven and data-driven applications. We have been working in areas such as situational awareness, clinical research informatics, sentiment and emotion analysis, and intelligence.

What we offer

Housed within the NRC's Digital Technologies Research Centre, the team's core competencies include:

  • computational linguistics
  • data visualization
  • deep learning
  • document meta-analysis
  • emotion analysis
  • information extraction
  • information retrieval
  • medical and biomedical language analytics
  • neural networks
  • sentiment analysis
  • situational awareness
  • summarization
  • time reasoning and event detection
  • unsupervised learning algorithms

International competitions and shared tasks

  • Top-ranking scores in clinical text processing in the i2b2 Clinical NLP challenges in 2010, 2012, and 2017
  • Top-ranking scores, as well as organizing multiple tracks, in SemEval Shared Tasks from 2012 to the present

Software and applications

Why work with us

Our experts have extensive experience with and knowledge of the latest text analytics approaches, and know how and when to apply the appropriate techniques. Our experience ranges from tried-and-trusted statistical algorithms to state-of-the-art artificial neural networks and deep learning approaches.

Our innovative capacity and technical competency in machine learning and big data methods have resulted in world-leading research. We have collected multiple first-place rankings in text analysis research challenges such as i2b2 NLP for Clinical Data and the Sentiment Analysis track of SemEval.

Public confidence and trust are key for companies and organizations responsible for the sound management and use of data. Our team is committed to the ethical development and use of text analytics technologies. We regularly convene thought leaders in this area to share their expertise with the research and policy community, and we build ethical evaluation processes into our projects from the outset in collaboration with our partners.

Global Public Health Intelligence Network

The Global Public Health Intelligence Network (GPHIN), headquartered at the Public Health Agency of Canada (PHAC), is an early warning system used to identify potential public health threats worldwide, including outbreaks such as avian influenza and SARS (Severe Acute Respiratory Syndrome).

Between 2016 and 2018, PHAC commissioned the National Research Council of Canada (NRC) to build a replacement multilingual text analytics software application for the GPHIN system. Under the contract, the NRC replaced the legacy GPHIN software application with an integrated suite of tools built to specification.

The NRC maintains the GPHIN software application as part of ongoing technical service but does not play a role in decisions made regarding public health threats.

For more information about GPHIN, please visit the Public Health Agency of Canada website.