What Is NLP? Natural Language Processing Explained
Natural language processing is a branch of AI that enables computers to understand, process, and generate language just as people do — and its use in business is rapidly growing.
Natural language processing definition

Natural language processing (NLP) is the branch of artificial intelligence (AI) that deals with training computers to understand, process, and generate language. Search engines, machine translation services, and voice assistants are all powered by NLP.
While the term originally referred to a system's ability to read, it's since become a colloquialism for all computational linguistics. Subcategories include natural language generation (NLG) — a computer's ability to create communication of its own — and natural language understanding (NLU) — the ability to understand slang, mispronunciations, misspellings, and other variants in language.
The introduction of transformer models in the 2017 paper "Attention Is All You Need" by Google researchers revolutionized NLP, leading to the creation of generative AI models such as Bidirectional Encoder Representations from Transformers (BERT) and its smaller, faster, and more efficient derivative DistilBERT, Generative Pre-trained Transformer (GPT), and Google Bard.
How natural language processing works

NLP leverages machine learning (ML) algorithms trained on unstructured data, typically text, to analyze how elements of human language are structured together to impart meaning. Phrases, sentences, and sometimes entire books are fed into ML engines where they're processed using grammatical rules, people's real-life linguistic habits, and the like. An NLP algorithm uses this data to find patterns and extrapolate what comes next. For example, a translation algorithm that recognizes that, in French, "I'm going to the park" is "Je vais au parc" will learn to predict that "I'm going to the store" also begins with "Je vais au." All the algorithm then needs is the word for "store" to complete the translation task.
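The pattern-extrapolation idea above can be sketched with a toy next-word predictor: count which word follows each context in training text, then predict the most frequent continuation. The tiny corpus and function names here are invented for illustration; real systems use statistical or neural models over vastly more data.

```python
from collections import defaultdict, Counter

# Toy training corpus (made up for the example)
corpus = [
    "je vais au parc",
    "je vais au magasin",
    "je vais au cinema",
]

# Count which word follows each left-context
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 1):
        context = " ".join(words[: i + 1])
        follows[context][words[i + 1]] += 1

def predict_next(context: str) -> str:
    """Return the word most often seen after this context."""
    return follows[context].most_common(1)[0][0]

print(predict_next("je vais"))  # "au" — seen after "je vais" in every sentence
```

Having learned that "je vais" is reliably followed by "au," the model only needs the word for "store" to finish the new translation, just as the paragraph describes.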
NLP applications

Machine translation is a powerful NLP application, but search is the most used. Every time you look something up in Google or Bing, you're helping to train the system. When you click on a search result, the system interprets it as confirmation that the results it has found are correct and uses this information to improve search results in the future.
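The click-feedback loop described above can be sketched as a hypothetical ranker that nudges a (query, document) relevance score upward on each click. Every name here is invented for illustration; production search engines use far more elaborate learning-to-rank models.

```python
from collections import defaultdict

# Hypothetical relevance scores per (query, document) pair
scores = defaultdict(float)

def record_click(query: str, doc_id: str, boost: float = 0.1) -> None:
    """Treat a click as weak confirmation that the result was relevant."""
    scores[(query, doc_id)] += boost

def rank(query: str, doc_ids: list) -> list:
    """Order candidate documents by accumulated click feedback."""
    return sorted(doc_ids, key=lambda d: scores[(query, d)], reverse=True)

record_click("nlp tutorial", "doc_b")
record_click("nlp tutorial", "doc_b")
record_click("nlp tutorial", "doc_a")
print(rank("nlp tutorial", ["doc_a", "doc_b", "doc_c"]))
# ['doc_b', 'doc_a', 'doc_c'] — the twice-clicked result rises to the top
```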
Chatbots work the same way. They integrate with Slack, Microsoft Messenger, and other chat programs where they read the language you use, then turn on when you type in a trigger phrase. Voice assistants such as Siri and Alexa also kick into gear when they hear phrases like "Hey, Alexa." That's why critics say these programs are always listening; if they weren't, they'd never know when you need them. Unless you turn an app on manually, NLP programs must operate in the background, waiting for that phrase.
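The trigger-phrase behavior can be sketched as a function that buffers transcribed speech and activates only when a wake phrase appears. The wake-phrase list is illustrative, not the real one any vendor uses, and real assistants do this matching on audio with dedicated on-device models rather than on text.

```python
# Illustrative wake phrases (not the actual vendor lists)
WAKE_PHRASES = ("hey alexa", "hey siri")

def is_wake_phrase(transcript: str) -> bool:
    """Activate only when the transcript starts with a known wake phrase."""
    text = transcript.lower().strip()
    return any(text.startswith(phrase) for phrase in WAKE_PHRASES)

print(is_wake_phrase("Hey Alexa, what's the weather?"))  # True
print(is_wake_phrase("Let's go to the park"))            # False
```

Until that check passes, nothing else runs, which is why the program must keep listening in the background for the phrase.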
Transformer models take applications such as language translation and chatbots to a new level. Innovations such as the self-attention mechanism and multi-head attention enable these models to better weigh the importance of various parts of the input, and to process those parts in parallel rather than sequentially.
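The self-attention mechanism mentioned above can be shown in a few lines of NumPy: every token is projected into query, key, and value vectors, and each position computes a weighted mix over all positions at once (the parallel processing the paragraph refers to). Weights and inputs here are random toys; real models learn the projection matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8
# Learned in a real model; random here for illustration
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))

def self_attention(x: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention. x: (seq_len, d_model)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv             # project to query/key/value
    scores = q @ k.T / np.sqrt(x.shape[-1])      # how much each token weighs the others
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ v                           # weighted mix of value vectors

x = rng.standard_normal((5, d_model))  # 5 tokens, all processed in parallel
out = self_attention(x)
print(out.shape)  # (5, 8) — one contextualized vector per input token
```

Multi-head attention simply runs several such projections in parallel and concatenates the results, letting the model weigh different aspects of the input at once.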
Rajeswaran V, senior director at Capgemini, notes that OpenAI's GPT-3 model has mastered language without using any labeled data. By relying on morphology — the study of words, how they are formed, and their relationship to other words in the same language — GPT-3 can perform language translation much better than existing state-of-the-art models, he says.
NLP systems that rely on transformer models are especially strong at NLG.
Natural language processing examples

Data comes in many forms, but the largest untapped pool of data consists of text — and unstructured text in particular. Patents, product specifications, academic publications, market research, news, not to mention social media feeds, all have text as a primary component, and the volume of text is constantly growing. Apply the technology to voice and the pool gets even larger. Here are three examples of how organizations are putting the technology to work:
Whether you're building a chatbot, voice assistant, predictive text application, or other application with NLP at its core, you'll need tools to help you do it. According to Technology Evaluation Centers, the most popular software includes:
There's a wide variety of resources available for learning to create and maintain NLP applications, many of which are free. They include:
Here are some of the most popular job titles related to NLP and the average salary (in US$) for each position, according to data from PayScale.
Studies In Natural Language Processing
Sentiment analysis is the computational study of people's opinions, sentiments, emotions, moods, and attitudes. This fascinating problem offers numerous research challenges, but promises insight useful to anyone interested in opinion analysis and social media analysis. This comprehensive introduction to the topic takes a natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs commonly used to express opinions, sentiments, and emotions. The book covers core areas of sentiment analysis and also includes related topics such as debate analysis, intention mining, and fake-opinion detection. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences.

In addition to traditional computational methods, this second edition includes recent deep learning methods to analyze and summarize sentiments and opinions, and also new material on emotion and mood analysis techniques, emotion-enhanced dialogues, and multimodal emotion analysis.
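The simplest entry point into the sentiment analysis the book describes is a lexicon-based scorer: count words from positive and negative word lists and take the difference. The tiny lexicon below is a toy; the traditional and deep learning methods the book covers go far beyond this.

```python
# Toy sentiment lexicon (illustrative only)
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "poor"}

def sentiment_score(text: str) -> int:
    """Positive minus negative word counts; >0 positive, <0 negative."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment_score("I love this great phone"))  # 2
print(sentiment_score("terrible battery and bad screen"))  # -2
```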
Natural Language Processing Creates An Audit Trail For Risk Adjustment
The best way for insurers to make sure they're in compliance with the mandates of risk adjustment is to use natural language processing for accurate documentation and auditing, according to Dr. Calum Yacoubian, director of Healthcare Strategy for Linguamatics, an IQVIA company that offers an NLP-based AI platform.
Last week's publication of the final rule for risk adjustment data validation (RADV) comes after increasingly high-profile instances of apparent overcoding by Medicare Advantage Organizations, Yacoubian said.
There must be an audit trail, he said.
NLP identifies gaps in care from unstructured notes in the clinical record. It enables the creation of a longitudinal patient record from multiple providers.
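The gap-finding idea can be sketched as follows: scan free-text notes for condition mentions and compare them against the patient's coded diagnoses. All terms, codes, and notes below are invented for illustration; production platforms such as Linguamatics use full NLP pipelines, not keyword matching.

```python
import re

# Hypothetical mapping of condition phrases to diagnosis codes
CONDITION_TERMS = {
    "diabetes": "E11",
    "chronic kidney disease": "N18",
    "heart failure": "I50",
}

def find_documentation_gaps(note: str, coded: set) -> list:
    """Conditions mentioned in the note but absent from the coded diagnoses."""
    gaps = []
    for term, code in CONDITION_TERMS.items():
        if re.search(r"\b" + re.escape(term) + r"\b", note.lower()) and code not in coded:
            gaps.append((term, code))
    return gaps

note = "Patient with long-standing diabetes, now showing signs of heart failure."
print(find_documentation_gaps(note, coded={"E11"}))
# [('heart failure', 'I50')] — mentioned in the note but never coded
```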
WHY THIS MATTERS
In value-based care arrangements, payers need accurate risk adjustment to ensure they are properly compensated for assuming greater financial risk for patients. These savings are shared with providers.
When payers don't capture the full spectrum of a patient's diagnosis, they may be at risk for cost overruns associated with treating those unidentified conditions.
"There is a huge amount of medical record review for risk adjustment, to look at missed diagnoses," Yacoubian said.
The coding must be correct, as payment amounts are determined by risk scores associated with Hierarchical Condition Categories (HCCs), groups of medical codes linked to specific clinical diagnoses.
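The arithmetic behind those risk scores is roughly a base demographic factor plus a coefficient for each documented HCC. The coefficients below are invented for illustration, not real CMS values.

```python
# Hypothetical HCC coefficients (not actual CMS figures)
HCC_COEFFICIENTS = {
    "HCC18": 0.302,  # e.g., diabetes with complications
    "HCC85": 0.331,  # e.g., congestive heart failure
}

def risk_score(demographic_factor: float, hccs: list) -> float:
    """Base demographic factor plus the coefficient of each documented HCC."""
    return round(demographic_factor + sum(HCC_COEFFICIENTS[h] for h in hccs), 3)

print(risk_score(0.45, ["HCC18", "HCC85"]))  # 1.083
```

A missed or unsupported diagnosis shifts this score directly, which is why both undercoding and overcoding carry financial consequences.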
"As the population continues to age, the Medicare Advantage population, and therefore burden of care, is also increasing – and only set to get larger," Yacoubian said. "For the payers who are claiming appropriately, these new rules pose an increased burden upon them to ensure their submissions are audit-proof."
NLP can also be used for other predictive risk modeling, such as identifying patients at risk for hospital admission or readmission, he said.
NLP has gone from something relatively niche and research-focused to being used by more than 50% of healthcare organizations in the United States, Yacoubian said.
THE LARGER TREND
On January 30, the Centers for Medicare and Medicaid Services finalized risk adjustment policies in a final rule to prevent overpayments to Medicare Advantage Organizations.
The Medicare Advantage Risk Adjustment Data Validation program is CMS's primary audit and oversight tool of MAO program payments.
As required by law, CMS' payments to MAOs are adjusted based on the health status of enrollees, as determined through medical diagnoses.
Studies and audits done separately by CMS and the Health and Human Services Office of Inspector General have shown that Medicare Advantage enrollees' medical records do not always support the diagnoses reported by MAOs, which leads to billions of dollars in overpayments to plans and increased costs to the Medicare program as well as taxpayers, CMS said.
The Risk Adjustment Data Validation final rule holds insurers accountable.
Twitter: @SusanJMorse
Email the writer: [email protected]