Get Even More Visitors To Your Blog, Upgrade To A Business Listing >>

Looking inside ROScribe and the idea of LLM-based robotic platform

Posted on Oct 23 ROScribeROScribe is an open source tool that uses LLM for Software generation in robotics within ROS (Robot Operating System) framework. Roscribe supports both ROS 1 and ROS 2 with python implementations. In this post, I want to dive deep into how ROScribe works under the hood, and explain some high level ideas behind this project. Further, I want to discuss how using LLM for code generation in a specific domain compares to a generic domain code generation. Robot Operating System (ROS)ROS is not an operating system; it is an open-source software framework for robot software development. It provides hardware abstraction, device drivers, libraries of commonly-used functionalities, visualizers, message-passing, package management, and more [1, 2].ROS was designed to be open source, so that users could choose the configuration of tools and libraries to get their software stacks to fit their robot and application area [1]. There is very little which is core to ROS, beyond the general structure within which programs must exist and communicate. In one sense, ROS is the underlying plumbing behind processes and message passing [1]. However, in reality, ROS is not only that plumbing, but also a rich and mature set of tools, a wide-ranging set of robot-agnostic abilities provided by packages, and a greater ecosystem of additions to ROS [1]. There are two versions of ROS available today (i.e. ROS 1 and ROS 2), each containing several distributions (e.g. ROS Noetic, ROS 2 humble, ROS 2 Iron). A comprehensive collection of repositories and packages across different ROS versions and distributions can be found in ROS Index [3]. ROS graphROS graph is a graph representation of all processes involved in the robot and the communication among them. ROS graph is a directed bipartite graph (a.k.a. bigraph), meaning that there are two disjoint sets of vertices in the graph: ROS nodes, and ROS topics. ROS nodes abstract hardware components such as sensors, actuators, controllers, processors, etc., that run ROS-based processes. ROS topics are the information communicated by the ROS nodes. Figure 1 shows an example of a ROS graph. A ROS node can publish to a topic (when the node is the source of the data) or subscribe to it (when the node is the destination of the data). The edges of the ROS graph represent the publish and subscribe actions. Figure 1. An example of a ROS graph; ovals represent the ROS nodes (i.e. the processes) and rectangles represent the ROS topics (i.e. the data); the direction of the edges indicate the flow of data.Adopting ROSROS is the most common framework for robot softwares; it is growing in popularity in robotics industry and it’s finding its way to other industries such as IoT (Internet of Things). However, if you are new to ROS, whether you are a robotics enthusiast, a college student, or a professional engineer, you might find it intimidating at first, and you might hesitate to adopt ROS for your robotic project. This skill barrier sometimes forces the beginner roboticist to give up on ROS altogether and opt out for non-standard options. ROScribe is designed to help you overcome the skill barrier so that you can adopt ROS for your robotic project. ROScribe creates the ROS graph for your project, and implements all ROS nodes, and takes care of the ROS-specific installation and launch files. If you are working on a new robotic solution (e.g. a new mapping algorithm, a new clustering solution), you can focus on your own component and let ROScribe take care of everything else that is needed to be connected to your component to have a functioning robot. In other words, you can use ROScribe to create a blueprint for your software; ROScribe gives you a complete software that you can use as an initial draft for your project; then, you can replace parts of the generated code (e.g. one of the ROS nodes) with the specific code that you have manually developed for your component of interest. We believe ROScribe helps beginners to better learn ROS and adopt it for their projects, while helping advanced users to quickly create a comprehensive blueprint of the robot software for their project.Using LLMs to generate robot softwareLLMs (Large Language Models) belong to a class of AI specialized in processing human languages. This raises the question: how well an AI made for human language can perform when dealing with programming languages such as Python and C? In [4], we discussed how LLMs fare in code generation for a general-purpose software. It is notable that the power of the LLM is mainly in understanding the software spec (which is written in human language) rather than in code generation itself. Please refer to [4] for a discussion on how LLMs can be used to generate a general-purpose software, and how computer programming paradigm could evolve in the presence of LLMs.In ROScribe, we are not dealing with software generation in general, but rather, software generation in a special case of robotics. More specifically, we use LLMs to generate robot softwares within ROS framework.This limits the scope of the software design, and narrows down the LLM’s attention to a smaller task, resulting in lower likelihood of hallucination and higher quality outcome. Like humans, LLMs have limited attention span; having a smaller design space, when dealing with a more specific task of robot software, yields a better outcome compared to when dealing with a broader design space for a generic software design. Prompt engineering techniques, fine tuning, and priming could further help to limit the design space. In the next section, we will explain how ROScribe breaks down the given task to smaller pieces to implement the software through divide and conquer. To improve the quality of the generated code, and increase the efficiency of the LLM, other methods such as RAG (Retrieval Augmented Generation) can be used. Having a well-structured scope for designing robot softwares in ROS makes it easier to effectively integrate RAG in ROScribe and further improve the quality of the generated robot software. The current version of ROScribe (v0.0.3) doesn’t use RAG, and therefore we save the RAG discussion for future articles. In the current version of ROScribe, the entire robot software is generated by the LLM. How ROScribe worksThe process of robot software generation in ROScribe can be explained in three steps: ROS graph synthesis First, ROScribe captures a high-level description of the robotic task from the user in the initial prompt, and asks the user to specify which version of ROS to use (i.e. ROS 1 or ROS 2). Then, ROScribe asks a series of high-level questions about the overall design of the system. The questioning process continues until ROScribe gathers enough information for generating the ROS graph. The process of generating follow-up questions is illustrated in Figure 2. ROScribe uses the chat history of all previous questions and answers, as well as the first prompt about the task and ROS version, to generate a prompt that will be fed to the LLM to create the follow-up question. Sometimes the follow-up question is accompanied by a suggested answer, or viable options, to help guide the user to the right direction.Our intention here is to keep the human in the driver’s seat and let him make all decisions on how the robotic system should be designed. ROScribe uses the LLM as a robotics expert who knows what questions to ask to get the necessary information from the human designer.When there is no further question, the ROS graph can be generated. ROScribe uses the entire chat history, the initial prompt and ROS version, and some internal formatting instructions to generate a prompt and feed it to LLM to create the ROS graph (Figure 3). ROScribe drafts a visual representation of the ROS graph in RQT-style, and allows the user to modify it by adding or removing ROS nodes. Figure 2. ROS graph synthesis; generating follow-up questionsFigure 3. ROS graph generationROS node synthesis For every ROS node identified and finalized in the previous step, ROScribe captures the spec from the user; only when the spec is completely clear, it implements that node. The task of capturing the spec involves an elaborate process of prompt engineering that we call prompt synthesis. This is the heart of ROScribe where the LLM’s strong suit is used in processing conversations and generating questions all in natural language. Figure 4 shows the process of prompt synthesis in which ROScribe uses a summary of the chat history plus the top-level information about the task, the ROS version, and the ROS graph to generate a prompt that will be fed to the LLM to create a follow-up question. This process will continue in a loop until the spec is clear and the user has provided the necessary details about the design.Finally, when the spec is clear, and all the design details are resolved, ROScribe generates the code for that ROS node and moves on to the next node. Figure 5 illustrates the ROS node generation step.Throughout the entire process of code generation, ROScribe keeps the user involved. The idea is to walk the user across the design space and shed light on different subjects that need clarification, and let the user make the decision on each subject. Ultimately, it is not the tool that invents the software; it is the user utilizing the tool who is in charge of the project.Figure 4. ROS node specification using prompt synthesisFigure 5. ROS node generationROS-specific installation scripts generationAt the final step of the software generation process, ROScribe creates the files needed for installation and launch of the generated ROS packages; the generated files include package.xml, CMakeLists.txt, and launchfile.launch (Figure 6).Figure 6. ROS-specific installation filesLessons we learned from ROScribeThe following remarks summarize the lessons we learned from development of ROScribe:About ROScribeWe made ROScribe open source hoping that it would benefit robotics students and engineers who want to speed up the process of robot software generation. We encourage all of you to check out this tool, and give us feedback here, or by filing issues on our github. If you like ROScribe, or the ideas behind it, please star our repository to give it more recognition and let others know about it. We plan to keep maintaining and updating this tool, and we welcome all of you to participate in this open source project.About RoboCoachWe are a small early-stage startup company based in San Diego, California. We are exploring the applications of LLMs in software generation in general, and in robot software generation specifically. You can learn more about our products in our github [5, 6].References[1] https://en.wikipedia.org/wiki/Robot_Operating_System[2] http://wiki.ros.org/Documentation[3] https://index.ros.org/[4] https://dev.to/robocoach/looking-inside-gpt-synthesizer-and-the-idea-of-llm-based-code-generation-41d9[5] https://github.com/RoboCoachTechnologies/GPT-Synthesizer [6] https://github.com/RoboCoachTechnologies/ROScribe Templates let you quickly answer FAQs or store snippets for re-use. Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment's permalink. Hide child comments as well Confirm For further actions, you may consider blocking this person and/or reporting abuse Timonwa Akintokun - Oct 11 Andreas Riedmüller - Sep 30 BekahHW - Oct 10 Joel Kang - Sep 7 Once suspended, robocoach will not be able to comment or publish posts until their suspension is removed. Once unsuspended, robocoach will be able to comment and publish posts again. Once unpublished, all posts by robocoach will become hidden and only accessible to themselves. If robocoach is not suspended, they can still re-publish their posts from their dashboard. Note: Once unpublished, this post will become invisible to the public and only accessible to RoboCoach. They can still re-publish the post if they are not suspended. Thanks for keeping DEV Community safe. Here is what you can do to flag robocoach: robocoach consistently posts content that violates DEV Community's code of conduct because it is harassing, offensive or spammy. Unflagging robocoach will restore default visibility to their posts. DEV Community — A constructive and inclusive social network for software developers. With you every step of your journey. Built on Forem — the open source software that powers DEV and other inclusive communities.Made with love and Ruby on Rails. DEV Community © 2016 - 2023. We're a place where coders share, stay up-to-date and grow their careers.



This post first appeared on VedVyas Articles, please read the originial post: here

Share the post

Looking inside ROScribe and the idea of LLM-based robotic platform

×

Subscribe to Vedvyas Articles

Get updates delivered right to your inbox!

Thank you for your subscription

×