
The Top 5 Machine Learning Certifications



Natural Language Processing

The Natural Language Processing Research Group, established in 1993, is one of the largest and most successful language processing groups in the UK and has a strong global reputation.

Natural Language Processing (NLP) is an interdisciplinary field that uses computational methods:

To investigate the properties of written human language and to model the cognitive mechanisms underlying the understanding and production of written language (scientific focus)

To develop novel practical applications involving the intelligent processing of written human language by computer (engineering focus) 

Research Themes

Information Access

Building applications to improve access to information in massive text collections, such as the web, newswires and the scientific literature

Language Resources and Architectures for NLP

Providing resources - both data and processing resources - for research and development in NLP. Includes platforms for developing and deploying real world language processing applications, most notably GATE, the General Architecture for Text Engineering.

Machine Translation

 Building applications to translate automatically between human languages, allowing access to the vast amount of information written in foreign languages and easier communication between speakers of different languages.

Human-Computer Dialogue Systems

Building systems to allow spoken language interaction with computers or embodied conversational agents, with applications in areas such as keyboard-free access to information, games and entertainment, and artificial companions.

Detection of Reuse and Anomaly

Investigating techniques for determining when texts or portions of texts have been reused or where portions of text do not fit with surrounding text. These techniques have applications in areas such as plagiarism and authorship detection and in discovery of hidden content.
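The core intuition behind reuse detection can be illustrated with a toy sketch (not the group's actual method): represent each text as a set of word n-grams and measure their overlap, so heavily reused passages score far higher than independent ones.

```python
# Illustrative sketch: flag possible text reuse via word n-gram overlap
# (Jaccard similarity). Real reuse-detection systems are far more robust.

def ngrams(text, n=3):
    """Return the set of word n-grams in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reuse_score(text_a, text_b, n=3):
    """Jaccard similarity between the n-gram sets of two texts (0..1)."""
    a, b = ngrams(text_a, n), ngrams(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

original = "the quick brown fox jumps over the lazy dog"
suspect = "the quick brown fox leaps over the lazy dog"
unrelated = "machine translation maps text between human languages"

print(reuse_score(original, suspect) > reuse_score(original, unrelated))  # True
```

Production systems add normalisation, hashing for scale, and statistical models of chance overlap, but the overlap signal itself is the usual starting point.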

Foundational Topics

Developing applications with human-like capabilities for processing language requires progress in foundational topics in language processing. Areas of interest include: word sense disambiguation, semantics of time and events.
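Word sense disambiguation, one of the foundational topics above, can be illustrated with a minimal Lesk-style sketch: pick the sense whose dictionary gloss shares the most words with the word's surrounding context. The senses and glosses below are invented for illustration; real systems draw on resources such as WordNet and far richer features.

```python
# Toy word-sense disambiguation in the spirit of the Lesk algorithm:
# choose the sense whose gloss overlaps most with the context.
# The sense inventory below is invented for illustration only.

SENSES = {
    "bank": {
        "finance": "an institution that accepts deposits and lends money",
        "river": "the sloping land alongside a river or stream",
    }
}

def disambiguate(word, context):
    """Return the sense of `word` whose gloss best overlaps the context."""
    context_words = set(context.lower().split())

    def overlap(sense):
        return len(context_words & set(SENSES[word][sense].split()))

    return max(SENSES[word], key=overlap)

print(disambiguate("bank", "she sat on the bank of the river"))  # river
```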

NLP for social media

Social Media, Online Disinformation, and Elections: A Quantitative, "Big Data" Perspective. 

Biomedical Text Processing

GATE in Biomedical Text Processing

Core members

Academic staff

Senior research staff

Research staff
  • Ibrahim Abu Farha
  • Mehmet Bakir
  • Dr Emma Barker
  • Amit Gajibhiye
  • Dr Mark Greenwood
  • Wei He
  • Mali Jin
  • Tashin Khan
  • Yue Li
  • Yida Mu
  • Mugdha Pandya
  • Muneerah Patel
  • Olesya Razuvayevskaya
  • Ian Roberts
  • Iknoor Singh
  • Jake Vasilakes
  • Ahmad Zareie
  • Cass Zhao
  • Visiting staff

Publications

Academic articles

    Here you can find research publications for the Natural Language Processing Research Group, listed by academic.  The head link navigates to the official web page for the relevant academic (with highlighted favourite publications).  The remaining links navigate to their DBLP author page, their Google Scholar citations page and optionally a self-maintained publications page.

    Academic staff  

    The Key Technologies Fuelling Chatbot Evolution

    Most of us are familiar with chatbots through customer service portals, government websites, and services like Google Bard and OpenAI's ChatGPT. They are convenient, easy to use, and always available, leading to their growing adoption for a diverse range of applications across the web.

    Unfortunately, most current chatbots are limited by their reliance on static training data. Their outputs can be stale, preventing users from getting real-time answers to their queries. They also struggle with contextual understanding, factual inaccuracies, complex queries, and adapting to users' evolving needs.

    To overcome these issues, advanced techniques like Retrieval-Augmented Generation (RAG) have emerged. By leveraging external information sources, including real-time data collected from the open web, RAG systems can augment their knowledge base on the fly, providing more accurate and contextually relevant responses to users' queries.

    Chatbots: challenges and limitations

    Current chatbots employ various technologies to handle training and inference tasks, including natural language processing (NLP) techniques, machine learning algorithms, neural networks, and frameworks like TensorFlow or PyTorch. They rely on rule-based systems, sentiment analysis, and dialog management modules to interpret user input, generate appropriate responses, and maintain the flow of conversation. 

    However, as mentioned previously, these chatbots face several challenges. Limited contextual understanding often results in generic or irrelevant responses because static training datasets may fail to capture the diversity of real-world conversations. 

    Furthermore, without real-time data integration, chatbots may experience "hallucinations" and inaccuracies. They also struggle with handling complex queries that require deeper contextual understanding and lack adaptability to open knowledge, evolving trends, and user preferences.

    Improving the chatbot experience with RAG 

    RAG merges generative AI with information retrieval from external sources such as the open web. This approach significantly improves the contextual understanding, accuracy, and relevance of AI models. Moreover, the information in a RAG system's knowledge base can be updated dynamically, making such systems highly adaptable and scalable.

    RAG utilises various technologies, which can be categorised into distinct groups: frameworks and tools, semantic analysis, vector databases, similarity search, and privacy/security applications. Each of these components plays a crucial role in enabling RAG systems to effectively retrieve and generate contextually relevant information while maintaining privacy and security measures. 

    By leveraging a combination of these technologies, RAG systems can enhance their capabilities in understanding and responding to user queries with accuracy and efficiency, thereby facilitating more engaging and informative interactions.
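As a rough illustration of the retrieve-then-generate idea (the frameworks below hide these steps behind much richer APIs), a minimal sketch might score documents against the query and stuff the best matches into a prompt. The documents, the bag-of-words cosine scoring, and the stubbed generation step are all simplifications: production systems use neural embeddings and pass the assembled prompt to an LLM.

```python
# Minimal retrieval-augmented generation loop: score documents against the
# query with bag-of-words cosine similarity, then stuff the best matches
# into a prompt. A real system would embed text with a neural model and
# send the prompt to an LLM; both are stubbed here for illustration.
import math
import re
from collections import Counter

DOCS = [
    "RAG systems retrieve external documents to ground model answers.",
    "Vector databases store embeddings for fast similarity search.",
    "Chatbots trained on static data can give outdated answers.",
]

def vectorize(text):
    """Bag-of-words term counts (stand-in for a neural embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k=2):
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    return sorted(DOCS, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

def build_prompt(query):
    """Assemble the augmented prompt a generator model would receive."""
    context = "\n".join(retrieve(query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("why do chatbots give outdated answers?"))
```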

    The frameworks and tools  

    Frameworks and associated tools provide a structured environment for developing and deploying retrieval-augmented generation models efficiently. They offer pre-built modules and tools for data retrieval, model training, and inference, streamlining the development process and reducing implementation complexity. 

    Additionally, frameworks facilitate collaboration and standardisation within the research community, enabling researchers to share models, reproduce results, and advance the field of RAG more rapidly.

    Some frameworks currently in use include: 

  • LangChain: A framework for building LLM applications that integrates generative AI with data retrieval techniques, making it well suited to Retrieval-Augmented Generation (RAG).
  • LlamaIndex: A specialised tool created for RAG applications that facilitates efficient indexing and retrieval of information from a vast number of knowledge sources.
  • Weaviate: One of the more popular vector databases; it offers a modular RAG application called Verba, which integrates the database with generative AI models.
  • Chroma: An open-source embedding database that offers features such as client initialisation, data storage, querying, and manipulation.
    Vector databases for quick data retrieval

    Vector databases efficiently store high-dimensional vector representations of public web data, enabling fast and scalable retrieval of relevant information. By organising text data as vectors in a continuous vector space, vector databases facilitate semantic search and similarity comparisons, enhancing the accuracy and relevance of generated responses in RAG systems.

    Additionally, vector databases support dynamic updates and adaptability, allowing RAG models to continuously integrate new information from the web and improve their knowledge base over time.

    Some popular vector databases are Pinecone, Weaviate, Milvus, Neo4j, and Qdrant. They can process high-dimensional data for RAG systems that require complex vector operations.
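Conceptually, a vector database answers one core query: given a vector, return the stored items nearest to it. A toy in-memory version (illustrative only; real engines replace the linear scan with approximate indexes such as HNSW) might look like this:

```python
# Toy in-memory "vector store": exact nearest-neighbour search by cosine
# similarity. Production vector databases (Pinecone, Milvus, Qdrant, ...)
# use approximate indexes rather than this brute-force linear scan.
import math

class VectorStore:
    def __init__(self):
        self.items = []  # list of (id, vector) pairs

    def add(self, item_id, vector):
        """Store a vector under an identifier."""
        self.items.append((item_id, vector))

    def query(self, vector, k=1):
        """Return the ids of the k stored vectors most similar to `vector`."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.hypot(*a) * math.hypot(*b)
            return dot / norm if norm else 0.0

        ranked = sorted(self.items, key=lambda it: cos(vector, it[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:k]]

store = VectorStore()
store.add("doc-a", (1.0, 0.0, 0.0))
store.add("doc-b", (0.0, 1.0, 0.0))
store.add("doc-c", (0.7, 0.7, 0.0))
print(store.query((0.9, 0.1, 0.0)))  # ['doc-a']
```

In a real deployment the vectors would come from an embedding model, and dynamic updates simply mean adding or replacing entries in the index as new web data arrives.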

    Semantic analysis, similarity search, and security  

    Semantic analysis and similarity search enable RAG systems to understand the context of user queries and retrieve relevant information from vast datasets. By analysing the meaning and relationships between words and phrases, semantic analysis tools ensure that RAG applications generate contextually relevant responses. Similarity search algorithms, meanwhile, identify the documents or data fragments that give the LLM wider context, helping it answer the query more accurately.

    Semantic analysis and similarity search tools used in RAG systems include: 

  • Semantic Kernel: Provides advanced semantic analysis capabilities, aiding in understanding and processing complex language structures.
  • FAISS (Facebook AI Similarity Search): A library developed by Facebook AI Research for efficient similarity search and clustering of high-dimensional vectors.
    Last but not least, privacy and security tools are essential for RAG systems to protect sensitive user data and maintain trust in AI. By incorporating privacy-enhancing technologies like encryption and access controls, RAG systems can safeguard user information during data retrieval and processing.

    Additionally, robust security measures prevent unauthorised access or manipulation of RAG models and the data they handle, mitigating the risk of data breaches or misuse.

  • Skyflow GPT Privacy Vault: Provides tools and mechanisms to ensure privacy and security in RAG applications.
  • Javelin LLM Gateway: An enterprise-grade LLM gateway that lets enterprises apply policy controls, adhere to governance measures, and enforce comprehensive security guardrails, including data leak prevention, for safe and compliant model use.
    Embracing emerging technology in future chatbots

    Emerging technologies used by RAG systems mark a notable leap forward in the use of responsible AI, aiming to enhance chatbot functionality significantly. By seamlessly integrating web data collection and generation capabilities, RAG facilitates superior contextual understanding, real-time web data access, and adaptability in responses.

    This integration holds promise in revolutionising interactions with AI-powered systems, promising more intelligent, context-aware, and dependable experiences as RAG continues to evolve and refine its capabilities.


    Patronus AI Secures $17M To Tackle AI Hallucinations And Copyright Violations, Fuel Enterprise Adoption

    Join us in returning to NYC on June 5th to collaborate with executive leaders in exploring comprehensive methods for auditing AI models regarding bias, performance, and ethical compliance across diverse organizations. Find out how you can attend here.

    As companies race to implement generative AI, concerns about the accuracy and safety of large language models (LLMs) threaten to derail widespread enterprise adoption. Stepping into the fray is Patronus AI, a San Francisco startup that just raised $17 million in Series A funding to automatically detect costly — and potentially dangerous — LLM mistakes at scale.

    The round, which brings Patronus AI's total funding to $20 million, was led by Glenn Solomon at Notable Capital, with participation from Lightspeed Venture Partners, former DoorDash executive Gokul Rajaram, Factorial Capital, Datadog, and several unnamed tech executives. 

    Founded by former Meta machine learning (ML) experts Anand Kannappan and Rebecca Qian, Patronus AI has developed a first-of-its-kind automated evaluation platform that promises to identify errors like hallucinations, copyright infringement and safety violations in LLM outputs. Using proprietary AI, the system scores model performance, stress-tests models with adversarial examples and enables granular benchmarking — all without the manual effort required by most enterprises today.

    Exposing the dark side of generative AI: hallucinations, copyright violations and safety risks

    "There's a range of things that our product is actually really good at being able to catch, in terms of mistakes," said Kannappan, CEO of Patronus AI, in an interview with VentureBeat. "It includes things like hallucinations, and copyright and safety related risks, as well as a lot of enterprise-specific capabilities around things like style and tone of voice of the brand."


    The emergence of powerful LLMs like OpenAI's GPT-4o and Meta's Llama 3 has set off an arms race in Silicon Valley to capitalize on the technology's generative abilities. But as hype cycles accelerate, so too have high-profile model failures, from news site CNET publishing error-riddled AI-generated articles to drug discovery startups retracting research papers based on LLM-hallucinated molecules.

    These public missteps only scratch the surface of broader issues endemic to the current crop of LLMs, Patronus AI claims. The company's previously published research, including the "CopyrightCatcher" API released three months ago and the "FinanceBench" benchmark unveiled six months ago, reveals startling deficiencies in leading models' ability to accurately answer questions grounded in fact.

    FinanceBench and CopyrightCatcher: Patronus AI's groundbreaking research reveals LLM deficiencies

    For its "FinanceBench" benchmark, Patronus tasked models like GPT-4 with answering financial queries based on public SEC filings. Shockingly, the best performing model answered only 19% of questions correctly after ingesting an entire annual report. A separate experiment with Patronus' new "CopyrightCatcher" API found open-source LLMs reproducing copyrighted text verbatim in 44% of outputs.
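Patronus AI's evaluation models are proprietary, but the kind of verbatim-reproduction check CopyrightCatcher performs can be sketched generically: flag a model output when it shares a sufficiently long contiguous word span with a protected source text. The threshold and texts below are illustrative assumptions.

```python
# Generic sketch of a verbatim-reproduction check (not Patronus AI's
# actual method): flag an output that shares a long contiguous word span
# with a protected source text.
from difflib import SequenceMatcher

def longest_shared_span(source, output):
    """Length in words of the longest contiguous run shared by both texts."""
    a, b = source.lower().split(), output.lower().split()
    match = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(0, len(a), 0, len(b))
    return match.size

def flags_verbatim(source, output, threshold=5):
    """Flag outputs reproducing `threshold` or more consecutive source words."""
    return longest_shared_span(source, output) >= threshold

source = "it was the best of times it was the worst of times"
copied = "the model wrote: it was the best of times it was the worst of times"
fresh = "the model paraphrased the opening line about good and bad times"
print(flags_verbatim(source, copied), flags_verbatim(source, fresh))  # True False
```

A benchmark like the 44% figure above can then be computed by running such a check over many model outputs and reporting the fraction flagged.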

    "Even state-of-the-art models were hallucinating and only got like 90% of responses correct in finance settings," explained Qian, who serves as CTO. "Our research has shown that open source models had over 20% unsafe responses in many high priority areas of harm. And copyright infringement is a huge risk — large publishers, media companies, or anyone using LLMs needs to be concerned."

    While a handful of other startups like Credo AI, Weights & Biases and Robust Intelligence are building tools for LLM evaluation, Patronus believes its research-first approach leveraging the founders' deep expertise sets it apart. The core technology is based on training dedicated evaluation models that reliably surface edge cases where a given LLM is likely to fail.

    "No other company right now has the research and technology at the level of depth that we have as a company," Kannappan said. "What's really unique about how we've approached everything is our research-first approach — that's in the form of training evaluation models, developing new alignment techniques, publishing research papers."

    This strategy has already gained traction with several Fortune 500 companies spanning industries like automotive, education, finance and software using Patronus AI to deploy LLMs "safely within their organizations," per the startup, though it declined to name specific customers. With the fresh capital, Patronus plans to scale up its research, engineering and sales teams while developing additional industry benchmarks.

    If Patronus achieves its vision, rigorous automated evaluation of LLMs could become table stakes for enterprises looking to deploy the technology, in the same way security audits paved the way for widespread cloud adoption. Qian sees a future where testing models with Patronus is as commonplace as unit-testing code.

    "Our platform is domain-agnostic and so the evaluation technology that we build can be extended to any domain, whether that's legal, healthcare or others," she said. "We want to enable enterprises across every industry to leverage the power of LLMs while having assurance the models are safe and aligned with their specific use case requirements." 

    Still, given the black-box nature of foundation models and near-endless space of possible outputs, conclusively validating an LLM's performance remains an open challenge. By advancing the state-of-the-art in AI evaluation, Patronus aims to accelerate the path to accountable real-world deployment.

    "Measuring LLM performance in an automated way is really difficult and that's just because there's such a wide space of behavior, given that these models are generative by nature," acknowledged Kannappan. "But through a research-driven approach, we're able to catch mistakes in a very reliable and scalable way that manual testing fundamentally cannot."

    This post first appeared on Autonomous AI, please read the original post: here
