What Is NLP? Natural Language Processing Explained

Natural language processing is a branch of AI that enables computers to understand, process, and generate language just as people do — and its use in business is rapidly growing.

Natural language processing definition

Natural language processing (NLP) is the branch of artificial intelligence (AI) that deals with training computers to understand, process, and generate language. Search engines, machine translation services, and voice assistants are all powered by the technology.

While the term originally referred to a system's ability to read, it's since become a colloquialism for all computational linguistics. Subcategories include natural language generation (NLG) — a computer's ability to create communication of its own — and natural language understanding (NLU) — the ability to understand slang, mispronunciations, misspellings, and other variants in language.

The introduction of transformer models in the 2017 paper "Attention Is All You Need" by Google researchers revolutionized NLP, leading to language models such as Bidirectional Encoder Representations from Transformers (BERT) and its smaller, faster, more efficient successor DistilBERT, the Generative Pre-trained Transformer (GPT) series, and Google Bard.

How natural language processing works

NLP leverages machine learning (ML) algorithms trained on unstructured data, typically text, to analyze how elements of human language are structured together to impart meaning. Phrases, sentences, and sometimes entire books are fed into ML engines where they're processed using grammatical rules, people's real-life linguistic habits, and the like. An NLP algorithm uses this data to find patterns and extrapolate what comes next. For example, a translation algorithm that recognizes that, in French, "I'm going to the park" is "Je vais au parc" will learn to predict that "I'm going to the store" also begins with "Je vais au." All the algorithm then needs is the word for "store" to complete the translation task.
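
In practice, that pattern learning is packaged into pretrained models you can call directly. Here's a minimal sketch using the Hugging Face transformers library and the small t5-small checkpoint (both are illustrative choices, not something the article prescribes):

```python
# Minimal machine-translation sketch (assumes: pip install transformers torch
# sentencepiece; "t5-small" is an illustrative pretrained model, not the only option).
from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="t5-small")

for sentence in ["I'm going to the park.", "I'm going to the store."]:
    print(sentence, "->", translator(sentence)[0]["translation_text"])
```

Both outputs should begin with the "Je vais" pattern the model absorbed from its training data.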

NLP applications

Machine translation is a powerful NLP application, but search is the most used. Every time you look something up in Google or Bing, you're helping to train the system. When you click on a search result, the system interprets it as confirmation that the results it has found are correct and uses this information to improve search results in the future.

Chatbots work the same way. They integrate with Slack, Microsoft Messenger, and other chat programs where they read the language you use, then turn on when you type in a trigger phrase. Voice assistants such as Siri and Alexa also kick into gear when they hear phrases like "Hey, Alexa." That's why critics say these programs are always listening; if they weren't, they'd never know when you need them. Unless you turn an app on manually, NLP programs must operate in the background, waiting for that phrase.
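
The trigger-phrase mechanism itself is simple to picture. Here's a toy Python sketch of wake-phrase gating; the phrase and the transcript stream are made-up stand-ins, not any vendor's pipeline:

```python
# Toy wake-phrase gate: stays dormant until the trigger appears, then treats
# the next utterance as a command (the transcripts are illustrative stand-ins).
WAKE_PHRASE = "hey assistant"

def handle(transcripts):
    armed = False
    for text in transcripts:
        text = text.lower().strip()
        if not armed:
            armed = WAKE_PHRASE in text   # ignore everything until the wake phrase
        else:
            print("command received:", repr(text))
            armed = False                 # go back to dormant after one command

handle(["background chatter", "hey assistant", "turn up the heat"])
```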

Transformer models take applications such as language translation and chatbots to a new level. Innovations such as the self-attention mechanism and multi-head attention enable these models to better weigh the importance of various parts of the input, and to process those parts in parallel rather than sequentially.
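
At the core of this is scaled dot-product attention: every token scores every other token, and the output mixes their value vectors by those scores. A minimal NumPy sketch (dimensions and random weights are illustrative only):

```python
# Scaled dot-product self-attention in NumPy (illustrative shapes; real
# transformer layers add learned projections per head, masking, and more).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # token-vs-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                          # each token mixes all values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                    # 5 tokens, 16-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)      # (5, 16)
```

Multi-head attention runs several such projections side by side and concatenates the results, and because the matrix products cover all token pairs at once, the sequence is processed in parallel rather than step by step.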

Rajeswaran V, senior director at Capgemini, notes that OpenAI's GPT-3 model has mastered language without using any labeled data. By relying on morphology — the study of words, how they are formed, and their relationship to other words in the same language — GPT-3 can perform language translation much better than existing state-of-the-art models, he says.

NLP systems that rely on transformer models are especially strong at NLG.

Natural language processing examples

Data comes in many forms, but the largest untapped pool of data consists of text — and unstructured text in particular. Patents, product specifications, academic publications, market research, and news, not to mention social media feeds, all have text as a primary component, and the volume of text is constantly growing. Apply the technology to voice and the pool gets even larger. Here are three examples of how organizations are putting the technology to work:

  • Edmunds drives traffic with GPT: The online resource for automotive inventory and information has created a ChatGPT plugin that exposes its unstructured data — vehicle reviews, ratings, editorials — to the generative AI. The plugin enables ChatGPT to answer user questions about vehicles with its specialized content, driving traffic to its website.
  • Eli Lilly overcomes translation bottleneck: With global teams working in a variety of languages, the pharmaceutical firm developed Lilly Translate, a home-grown NLP solution, to help translate everything from internal training materials to formal, technical communications to regulatory agencies. Lilly Translate uses NLP and deep learning language models trained with life sciences and Lilly content to provide real-time translation of Word, Excel, PowerPoint, and text for users and systems.
  • Accenture uses NLP to analyze contracts: The company's Accenture Legal Intelligent Contract Exploration (ALICE) tool helps the global services firm's legal organization of 2,800 professionals perform text searches across its million-plus contracts, including searches for contract clauses. ALICE uses "word embedding" to go through contract documents paragraph by paragraph, looking for keywords to determine whether the paragraph relates to a particular contract clause type.
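
Accenture hasn't published ALICE's internals, but a toy sketch of the word-embedding idea might look like this (the vectors below are tiny made-up stand-ins for a trained embedding model):

```python
# Toy embedding-based clause flagging in the spirit of ALICE (not Accenture's
# code; the vectors are made-up stand-ins for trained word embeddings).
import numpy as np

EMB = {
    "terminate":   np.array([0.90, 0.10, 0.00]),
    "termination": np.array([0.85, 0.15, 0.05]),
    "liability":   np.array([0.10, 0.90, 0.10]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mentions_clause(paragraph, keyword, threshold=0.9):
    # Flag the paragraph if any of its words embeds close to the clause keyword.
    words = (w.strip(".,;").lower() for w in paragraph.split())
    return any(cosine(EMB[w], EMB[keyword]) >= threshold
               for w in words if w in EMB)

print(mentions_clause("Either party may terminate this agreement.", "termination"))
```
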
Natural language processing software

Whether you're building a chatbot, voice assistant, predictive text application, or other application with NLP at its core, you'll need tools to help you do it. According to Technology Evaluation Centers, the most popular software includes:

  • Natural Language Toolkit (NLTK), an open-source framework for building Python programs to work with human language data. It was developed in the Department of Computer and Information Science at the University of Pennsylvania and provides interfaces to more than 50 corpora and lexical resources, a suite of text processing libraries, wrappers for natural language processing libraries, and a discussion forum. NLTK is offered under the Apache 2.0 license.
  • Mallet, an open-source, Java-based package for statistical NLP, document classification, clustering, topic modeling, information extraction, and other ML applications to text. It was primarily developed at the University of Massachusetts Amherst.
  • SpaCy, an open-source library for advanced natural language processing explicitly designed for production use rather than research. Released under the MIT license, spaCy was made with high-level data science in mind and allows deep data mining; see the short example after this list.
  • Amazon Comprehend. This Amazon service doesn't require ML experience. It's intended to help organizations find insights from email, customer reviews, social media, support tickets, and other text. It uses sentiment analysis, part-of-speech extraction, and tokenization to parse the intention behind the words.
  • Google Cloud Translation. This API uses NLP to examine a source text to determine its language, then uses neural machine translation to dynamically translate the text into another language. The API allows users to integrate the functionality into their own programs.
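
For a quick taste of these toolkits, here's a minimal spaCy sketch covering tokenization, part-of-speech tagging, and named-entity recognition (it assumes the small English model has been downloaded):

```python
# Minimal spaCy sketch (assumes: pip install spacy and
# python -m spacy download en_core_web_sm; the sample sentence is made up).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Edmunds built a ChatGPT plugin to answer questions about vehicles.")

for token in doc:
    print(token.text, token.pos_)       # each token with its part-of-speech tag

for ent in doc.ents:
    print(ent.text, ent.label_)         # named entities spaCy recognized
```
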
Natural language processing courses

There's a wide variety of resources available for learning to create and maintain NLP applications, many of which are free. They include:

  • NLP – Natural Language Processing with Python from Udemy. This course provides an introduction to natural language processing in Python, building to advanced topics such as sentiment analysis and the creation of chatbots. It consists of 11.5 hours of on-demand video, two articles, and three downloadable resources. The course costs $94.99, which includes a certificate of completion.
  • Data Science: Natural Language Processing in Python from Udemy. Aimed at NLP beginners who are conversant with Python, this course involves building a number of NLP applications and models, including a cipher decryption algorithm, spam detector, sentiment analysis model, and article spinner. The course consists of 12 hours of on-demand video and costs $99.99, which includes a certificate of completion.
  • Natural Language Processing Specialization from Coursera. This intermediate-level set of four courses is intended to prepare students to design NLP applications such as sentiment analysis, translation, text summarization, and chatbots. It includes a career certificate.
  • Hands On Natural Language Processing (NLP) using Python from Udemy. This course is for individuals with basic programming experience in any language, an understanding of object-oriented programming concepts, knowledge of basic to intermediate mathematics, and knowledge of matrix operations. It's completely project-based and involves building a text classifier for predicting the sentiment of tweets in real time, and an article summarizer that can fetch articles and produce summaries. The course consists of 10.5 hours of on-demand video and eight articles, and costs $19.99, which includes a certificate of completion.
  • Natural Language Processing in TensorFlow from Coursera. This course is part of Coursera's TensorFlow in Practice Specialization, and covers using TensorFlow to build natural language processing systems that can process text and input sentences into a neural network. Coursera says it's an intermediate-level course and estimates it will take four weeks of study at four to five hours per week to complete.
NLP salaries

Here are some of the most popular job titles related to NLP and the salary range (in US$) for each position, according to data from PayScale.

  • Computational linguist: $60,000 to $126,000
  • Data scientist: $79,000 to $137,000
  • Data science director: $107,000 to $215,000
  • Lead data scientist: $115,000 to $164,000
  • Machine learning engineer: $83,000 to $154,000
  • Senior data scientist: $113,000 to $177,000
  • Software engineer: $80,000 to $166,000

Natural Language Processing

Your hands are filthy from working on your latest project and you need to run the water to wash them. But you don't want to get the taps filthy too. Wouldn't it be nice if you could just tell them to turn on hot or cold? Or if the water's too cold, you could tell them to make it warmer. [Vije Miller] did just that: he added servo motors to his kitchen tap and enlisted an AI to interpret his voice commands.

Look closely at the photo and you can guess that he started with a single-lever tap, the kind that can be worked with an elbow, so this project was probably just for fun, and judging by his video below, he does have a sense of humor. But the idea is practical for dual taps with rotating knobs. He did realize, however, that in future versions he should move the servo motor openings from the top plate to the bottom to keep water out. A NodeMCU ESP8266 ESP-12E board handles communication with the speech recognition side, but other than the name, JacobAI, he's keeping the speech part to himself. We secretly suspect that he has a friend named Jacob.
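
[Vije Miller] hasn't shared his control code, but the servo side of such a build could be as small as this MicroPython sketch for the ESP8266 (the GPIO pin, duty mapping, and command names are all assumptions for illustration):

```python
# Illustrative MicroPython servo control for an ESP8266 tap (pin, duty range,
# and command names are assumptions, not [Vije Miller]'s actual code).
from machine import Pin, PWM

servo = PWM(Pin(14), freq=50)            # hobby servos expect ~50 Hz PWM

def set_angle(degrees):
    # Map 0-180 degrees onto roughly a 1-2 ms pulse (ESP8266 duty: 0-1023).
    servo.duty(int(40 + degrees / 180 * 75))

TAP_POSITIONS = {"hot": 180, "warm": 90, "cold": 0, "off": 45}

def on_voice_command(word):
    # The speech side ("JacobAI") would deliver `word` over the network.
    if word in TAP_POSITIONS:
        set_angle(TAP_POSITIONS[word])
```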

However, we can think of a number of options for it, such as DeepSpeech and Wit.ai, which we covered when talking about natural language phone bots, and the ubiquitous Alexa, as used here with another NodeMCU for turning on Christmas tree lights.



Illinois Adopts Natural Language Processing Tech For Child Welfare

The Illinois Department of Children and Family Services (DCFS) is now using natural language processing (NLP) software to better serve children's needs through enhanced data insights.

The agency has started using Augintel's NLP software to help child welfare caseworkers, supervisors, and private agency provider staff better interpret and leverage narrative data about cases.

The new tech is saving caseworkers an estimated 20 percent of the time they spend on administrative tasks.

Julie Barbosa, chief deputy director of strategy and performance execution, explained in a written response that the software allows staff to more easily pull information from narrative fields.

The tool is accessible to caseworkers and their supervisors. Those in other roles, such as quality assurance specialists, clinical staff, and data stewards, will have levels of access to the tool that are consistent with their access to the information in the case management system.

The tool can help users in a variety of ways depending on their role. For example, caseworkers could use it to get up to date on family cases that have been transferred to them, or to write court reports prior to hearings. Supervisors can identify trends in cases across workers. Another notable example: the statewide administrator for deaf and blind services can identify families that need support rather than waiting to be notified by caseworkers.

Before using this technology, DCFS relied on a combination of analyzing administrative data and reviewing a sample of cases for qualitative case reviews.

"Reading all of the notes word for word is very time consuming, especially when the reader is trying to find a specific piece of information," Barbosa said.

The effort is part of a larger agencywide modernization initiative to increase efficiency. Barbosa noted that the tech works with the state's current Statewide Automated Child Welfare Information System and will be able to integrate with the new Illinois Connect system.

The agency was willing to implement a tool while in the midst of modernization efforts because the software required minimal time and effort to integrate into the state's legacy case management system, Barbosa explained.

The change management plan for the tool involved about six months of training to establish the Illinois Model, which meant teaching the system the state's terminology, acronyms, and common phrases, Barbosa said.

This training process was followed by a rollout to around 7,500 users over seven months. Champions were identified from initial groups to provide ground support, obtain user feedback, and work with Augintel staff as needed. Notably, the tool does not require any additional data entry. The state is still in the process of fully implementing the tool.

As Barbosa explained, she initially thought the tool would primarily be used to search for specific words or terms within a case, but she was "pleasantly surprised" to learn about additional functions. For example, the search function helps with day-to-day work, while the queries function helps assess system-level trends or changes for things like case planning.

To measure the impact of the NLP software, DCFS will track tool use and assess trends over time, both positive and negative. Barbosa noted that additional metrics may be developed upon fuller implementation.

Editor's note: Julie Barbosa's name and title were corrected.

Julia Edinger is a staff writer for Government Technology. She has a bachelor's degree in English from the University of Toledo and has since worked in publishing and media. She's currently located in Southern California.







