As the volume of data increases exponentially, professionals who deal with vast amounts of information can struggle to keep up. The National Research Council of Canada's (NRC) text analytics team conceives, builds, and evaluates tools that help extract information from textual data: explicit, implicit, and inferred information, composite information (summaries), subtext information (sentiment, sarcasm, metaphor), and meta-information.
The team collaborates with government, industry, and academia to drive text analytics technologies for a range of problem-driven and data-driven applications. We have been working in areas such as situational awareness, clinical research informatics, sentiment and emotion analysis, and intelligence.
What we offer
Housed within the NRC's Digital Technologies Research Centre, the team's core competencies include:
- computational linguistics
- data visualization
- deep learning
- document meta-analysis
- emotion analysis
- information extraction
- information retrieval
- medical and biomedical language analytics
- neural networks and deep learning
- sentiment analysis
- situational awareness
- summarization
- time reasoning and event detection
- unsupervised learning algorithms
International competitions and shared tasks
- Top-ranking scores in clinical text processing in the i2b2 Clinical NLP challenges in 2010, 2012, and 2017
- Top-ranking scores, as well as organizing multiple tracks, in SemEval Shared Tasks from 2012 to the present
Software and applications
- ExaCT Demo Clinical Information Extraction System
- Global Public Health Intelligence Network (GPHIN)
- Sentiment and emotion lexicons
Why work with us
Our experts have extensive experience and knowledge of the latest text analytics approaches, and know how and when to best apply the appropriate techniques. Our experience spans from tried-and-trusted statistical algorithms to state-of-the-art artificial neural networks and deep learning approaches.
Our innovative capacity and technical competency in machine learning and big-data-oriented methods have resulted in world-leading research. We have collected multiple first-place rankings in text analysis research challenges such as i2b2 NLP for Clinical Data and the Sentiment Analysis track of SemEval.
Public confidence and trust are key for companies and organizations responsible for the sound management and use of data. Our team is committed to the ethical development and use of text analytics technologies. We regularly convene thought leaders in this area to share their expertise with the research and policy community, and we build ethical evaluation processes into our projects from the outset in collaboration with our partners.
Global Public Health Intelligence Network
The Global Public Health Intelligence Network (GPHIN), headquartered at the Public Health Agency of Canada (PHAC), is an early warning system used to identify potential public health threats worldwide, including outbreaks such as avian influenza and SARS (Severe Acute Respiratory Syndrome).
Between 2016 and 2018, PHAC commissioned the National Research Council of Canada (NRC) to build the replacement multilingual text analytics software application for the GPHIN system. Under the contract, the NRC replaced the legacy GPHIN software application with an integrated suite of tools completed to specifications.
The NRC maintains the GPHIN software application as part of ongoing technical service but does not play a role in decisions made regarding public health threats.
For more information about GPHIN, please visit the Public Health Agency of Canada.
Contact us
Interested in applying our text analytics expertise to your project? Contact our experts today!
Berry de Bruijn
Team Leader, Text analytics
Telephone: 613-993-0604
Email: Berry.DeBruijn@nrc-cnrc.gc.ca
Publications
- Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010
- The unreasonable effectiveness of word representations for Twitter named entity recognition
- Examining gender and race bias in two hundred sentiment analysis systems
- NRC-Canada: building the state-of-the-art in sentiment analysis of tweets
- ExaCT: automatic extraction of clinical trial characteristics from journal publications
- Sentiment analysis of short informal texts
- Capturing reliable fine-grained sentiment associations by crowdsourcing and best–worst scaling
- Crowdsourcing a word-emotion association lexicon
- Using hashtags to capture fine emotion categories from tweets