
Using Amazon Alexa To Run AWS CLI Commands via Large Language Models (LLMs)

By James Matson, in Better Programming

The other night I had a fleeting idea that turned into a fun technical challenge. I have them all the time. The fleeting ideas, that is. They rarely turn into actual technical challenges that I complete, thanks to life's myriad other distractions (like season 2 of Foundation, or my cat giving me feline stank-eye because he wants to be fed). But this particular night, the idea just wouldn't leave my head until I sat down at my computer and began to stitch code and cloud components together until I had something working.

The idea? Could I create a solution where I could execute AWS CLI commands on my AWS account using only my voice, via Amazon Alexa, without having to know the actual syntax required, relying instead on natural language?

The answer is yes! And this article will show you how (with a bonus video at the end showing the solution in action).

To follow along at home, you'll need the following:

- An Amazon developer account for building the Alexa skill (no physical Alexa device required)
- An AWS account to host the Lambda functions and the WebSocket API
- Access to an LLM chat completions endpoint (I'm using Azure OpenAI with the gpt-35-turbo-16k model)
- Visual Studio with the AWS Toolkit, for the C# projects
- A local machine with PowerShell and the AWS CLI installed and configured

Our entry point for the application is going to be Amazon Alexa. Why? Because Alexa provides an easy-to-use development experience for creating voice skills, and as you'll see from my example, you don't need to finalise and publish a skill to experiment with it, nor do you need an actual Alexa device like an Echo.

The voice skill I've set up is called, rather unimaginatively, VoiceToCommand, and as far as Alexa skills go it's really very simple. That's because most of the brains of our solution will come when we plug in the chat completions API that exposes our large language model. But more on that later.

For now, let's take a look at our Alexa skill. We've set up a very simple launch phrase, 'voice command', and we've created a custom slot type called NaturalLanguageCommand. A slot of this type will be filled with our natural language request (for example, "show me my s3 buckets").

We've added some sample slot values, but these don't actually matter much. It wouldn't be possible to cover all the ways we might phrase the things we want from the AWS CLI, so they're really just there so we don't have zero slot values, which can cause issues when saving your skill. Also, make sure you turn Multi-Value on. This will ensure our entire natural language request (typically a short sentence) is passed along to our yet-to-be-built backend.

Now, to give Alexa an idea of how we might start our natural language command, we've created an intent called ExecuteCommand. We've then combined that intent with some sample utterances covering typical ways to introduce a request for the AWS CLI, each ending with the {CommandSlot} slot. And voila! We have the starting point of our skill.
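Pulled together, the skill's interaction model looks something like the following. The specific sample utterances and slot values here are illustrative stand-ins rather than the exact ones from my skill:

    {
      "interactionModel": {
        "languageModel": {
          "invocationName": "voice command",
          "intents": [
            {
              "name": "ExecuteCommand",
              "slots": [
                {
                  "name": "CommandSlot",
                  "type": "NaturalLanguageCommand",
                  "multipleValues": { "enabled": true }
                }
              ],
              "samples": [
                "to {CommandSlot}",
                "run the command {CommandSlot}",
                "I want you to {CommandSlot}"
              ]
            }
          ],
          "types": [
            {
              "name": "NaturalLanguageCommand",
              "values": [
                { "name": { "value": "show me my s3 buckets" } },
                { "name": { "value": "list my lambda functions" } }
              ]
            }
          ]
        }
      }
    }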
A quick test in the developer console, and we're ready to move on.

To take our spoken words and do something useful with them, we need to create a backend to accept the {CommandSlot}. For this, we'll keep things in the Amazon family and create an AWS Lambda function using C#.

We'll name this Lambda VoiceToCommand, but it can really be thought of as our orchestrator. It's responsible for receiving our voice command, sending it to our LLM endpoint to convert it into a usable AWS CLI command, sending that command on to our local computer by way of a WebSocket server (don't worry, we'll get to that too!), and finally responding to Alexa that the command has been executed. Put simply, our Lambda, and the services it talks to, will turn a spoken request like "show me my s3 buckets" into a real command (something like aws s3 ls) running on my laptop.

We'll put the Lambda function code here, all three methods, then go through them one at a time (note we're introducing a couple of NuGet packages, namely Alexa.NET and Azure.AI.OpenAI).
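Treat the listing below as a representative sketch rather than a drop-in listing: the environment variable names, the spoken responses, and the WebSocket payload shape are placeholders you'd adapt to your own setup, and the Azure.AI.OpenAI calls match the beta versions of that SDK current at the time of writing.

    using System;
    using System.Net.WebSockets;
    using System.Text;
    using System.Text.Json;
    using System.Threading;
    using System.Threading.Tasks;
    using Alexa.NET;
    using Alexa.NET.Request;
    using Alexa.NET.Request.Type;
    using Alexa.NET.Response;
    using Amazon.Lambda.Core;
    using Azure;
    using Azure.AI.OpenAI;

    [assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]

    namespace VoiceToCommand;

    public class Function
    {
        // Placeholder configuration: supply your own endpoint, key, and WebSocket URL.
        private static readonly string OpenAiEndpoint = Environment.GetEnvironmentVariable("OPENAI_ENDPOINT")!;
        private static readonly string OpenAiKey = Environment.GetEnvironmentVariable("OPENAI_KEY")!;
        private static readonly string WebSocketUrl = Environment.GetEnvironmentVariable("WEBSOCKET_URL")!;

        public async Task<SkillResponse> FunctionHandler(SkillRequest input, ILambdaContext context)
        {
            // We only care about the ExecuteCommand intent; everything else gets a polite brush-off.
            if (input.Request is IntentRequest intentRequest && intentRequest.Intent.Name == "ExecuteCommand")
            {
                var naturalLanguageRequest = intentRequest.Intent.Slots["CommandSlot"].Value;
                var awsCliCommand = await ConstructAwsCommand(naturalLanguageRequest);
                await SubmitCommandToWebSocketServer(awsCliCommand);
                return ResponseBuilder.Tell("Okay, I've sent your command for execution.");
            }

            return ResponseBuilder.Tell("Sorry, I didn't understand that request.");
        }

        private static async Task<string> ConstructAwsCommand(string command)
        {
            var client = new OpenAIClient(new Uri(OpenAiEndpoint), new AzureKeyCredential(OpenAiKey));

            // The constrained system message stops the model from wrapping the command in fluff.
            var options = new ChatCompletionsOptions
            {
                Messages =
                {
                    new ChatMessage(ChatRole.System,
                        "You are an AI assistant that converts natural language requests to perform " +
                        "AWS actions into executable AWS CLI commands that are suitable to be run by " +
                        "other processes. You will only ever respond with the command itself. No other " +
                        "text should be included in your response."),
                    new ChatMessage(ChatRole.User,
                        $"What is the command for: {command}. Please provide only the command itself " +
                        "in your response and make sure the output is suitable to be passed as a " +
                        "string literal. Make sure that your JMESPath expression has balanced and " +
                        "correctly matched single quotes ('). If you have any literal strings or " +
                        "conditions in your expression, enclose them with single quotes, and ensure " +
                        "that they are properly closed.")
                }
            };

            var response = await client.GetChatCompletionsAsync("gpt-35-turbo-16k", options);
            return response.Value.Choices[0].Message.Content.Trim();
        }

        private static async Task SubmitCommandToWebSocketServer(string awsCliCommand)
        {
            using var ws = new ClientWebSocket();
            await ws.ConnectAsync(new Uri(WebSocketUrl), CancellationToken.None);

            // The field names here need to match whatever route your WebSocket server expects.
            var payload = JsonSerializer.Serialize(new { message = "sendmessage", data = awsCliCommand });
            var bytes = new ArraySegment<byte>(Encoding.UTF8.GetBytes(payload));
            await ws.SendAsync(bytes, WebSocketMessageType.Text, endOfMessage: true, CancellationToken.None);
            await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "Done", CancellationToken.None);
        }
    }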
So, let's break down our function and its three key methods.

FunctionHandler is our main Lambda handler. It intercepts our SkillRequest from Alexa and determines whether it's the one we're interested in: ExecuteCommand. If it is, it grabs the CommandSlot value (our natural language request, like 'tail the log group for my lambda called x') and passes it to our ConstructAwsCommand method.

The ConstructAwsCommand method is where things get more interesting. Here we use prompt engineering to construct just the right kind of prompt to elicit a suitable response from our LLM endpoint (in this case, a chat completion endpoint featuring the gpt-35-turbo-16k model).

It took several iterations to get the prompt working well for this situation. Typically it's beneficial to receive a well-rounded response from an LLM, including introductory text and explanations. That isn't what we're after here, though, because we want the exact command to be executed. To this end, I set up the system message to create a sort of 'constrained' personality for the model, instructing it that:

“You are an AI assistant that converts natural language requests to perform AWS actions into executable AWS CLI commands that are suitable to be run by other processes. You will only ever respond with the command itself. No other text should be included in your response.”

This helped prevent 'fluff' from being sent back in the response. I still found cases where the command came back with special characters or string-literal-breaking characters that caused problems when passing it anywhere else as JSON. To that end, I helped the model along with a fairly lengthy user prompt:

“What is the command for: {command}. Please provide only the command itself in your response and make sure the output is suitable to be passed as a string literal. Make sure that your JMESPath expression has balanced and correctly matched single quotes (‘). If you have any literal strings or conditions in your expression, enclose them with single quotes, and ensure that they are properly closed.”

With that done, and with the expectation that we'd be getting back suitable commands, I now needed a way to get the command from the VoiceToCommand Lambda in my AWS account down to my local laptop environment, where it could be executed. There are a few ways to do this, but I settled on a WebSocket server. WebSocket is a great protocol for establishing a client/server connection where you want to broadcast data in near real time to and from multiple clients. In this case, the VoiceToCommand Lambda is client 1, and my laptop is client 2.

That's where the final method, SubmitCommandToWebSocketServer, comes in. It establishes a connection to our yet-to-be-created WebSocket server and broadcasts our command from the VoiceToCommand Lambda, and the laptop will listen for that same message.

So, with the orchestrator Lambda taken care of, it was time to set up that WebSocket server. Now, full disclosure: I took a very neat shortcut to get my WebSocket server set up (and don't we love working smarter, not harder?). As primarily a C# developer and a heavy Visual Studio user, I have access to all kinds of really useful boilerplate templates for various programs. One of those happens to be a simple WebSocket server built on Lambda, with a Serverless Application Model (SAM) template.

Creating a project from this blueprint gives us a ready-to-go Lambda-based WebSocket server and a SAM template that creates the connection, disconnection, and message Lambdas, an API Gateway configured for WebSocket communication, and a DynamoDB table to store the sessions of the clients that connect. I won't go into the code or the template here; you can check them out for yourself. The important thing to note is that once deployed to your AWS account, you'll have a WebSocket server designed to send and receive messages in the following format:
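    {
        "message": "sendmessage",
        "data": "{command}"
    }

(The exact field names can differ between blueprint versions; whatever route key the generated template defines is what both clients need to use.)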
That {command} bit is where we're going to put our AWS CLI command. It'll be sent to the WebSocket server to be broadcast to the other clients, and our laptop will be listening for just such a message.

Deploy the complete stack to your AWS account and take note of the API Gateway WSS URL it produces; you'll want to plug that URL into your VoiceToCommand Lambda and into the console application on your local machine. (A note: the blueprint deploys an unauthenticated WebSocket server. That's fine for experimentation, but if you have any thoughts about using this in a production setting, make sure you apply proper authentication and authorisation to anything that can be reached over the internet.)

Now for the final part of our puzzle: the engine for executing our command locally. For this, we need a simple C# console application that's set up to listen to our WebSocket server, receive our AWS CLI command, and execute it locally by wrapping it in a PowerShell execution. Let's take a quick look at the code:
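Again, this is a sketch under a few assumptions: the wss:// URL is a placeholder for your own API Gateway endpoint, the server is assumed to relay the command as plain text, and the progress animation is left out for brevity.

    using System;
    using System.Diagnostics;
    using System.Net.WebSockets;
    using System.Text;
    using System.Threading;

    // Placeholder endpoint: use the WSS URL output by your own deployment.
    var webSocketUrl = new Uri("wss://your-api-id.execute-api.your-region.amazonaws.com/Prod");

    using var ws = new ClientWebSocket();
    await ws.ConnectAsync(webSocketUrl, CancellationToken.None);
    Console.WriteLine("Connected. Waiting for commands...");

    var buffer = new byte[8192];
    while (ws.State == WebSocketState.Open)
    {
        // Assumes each command arrives as a single text frame.
        var result = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        if (result.MessageType == WebSocketMessageType.Close) break;

        var command = Encoding.UTF8.GetString(buffer, 0, result.Count);
        Console.WriteLine($"Executing: {command}");

        // Wrap the AWS CLI command in a PowerShell execution and echo its output.
        var psi = new ProcessStartInfo
        {
            FileName = "powershell.exe",
            Arguments = $"-NoProfile -Command \"{command}\"",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        };
        using var process = Process.Start(psi)!;
        Console.WriteLine(await process.StandardOutput.ReadToEndAsync());
        Console.WriteLine(await process.StandardError.ReadToEndAsync());
        process.WaitForExit();
    }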
At its simplest, the application launches a command window, opens an always-on connection to our WebSocket server, and listens for the all-important AWS CLI command. Once one is received, a nifty little progress animation plays (just for effect), our command is shown and run, and hopefully some meaningful output is produced!

So, after all that, what have we ended up with? Let's take a look at the overall architecture.
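Traced end to end, the whole thing looks like this:

    Spoken request ("show me my s3 buckets")
      -> Alexa skill (ExecuteCommand intent fills {CommandSlot})
      -> VoiceToCommand Lambda (the orchestrator)
      -> LLM chat completions endpoint (natural language in, AWS CLI command out)
      -> WebSocket server (API Gateway + Lambda + DynamoDB)
      -> console app on the laptop
      -> PowerShell runs the AWS CLI command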
So, did it actually work? After all that code and tomfoolery, was the hypothesis proven? I'm happy to report that yes, it did indeed work. Granted, it's a little bit shaky, and it needs work in both the skill space and the prompt engineering space to scale beyond simple commands, but the core of the idea works! And that's awesome. For proof, here's a short video of me testing a couple of commands.

I hope you enjoyed this little journey from ideation to solution, and as always, feel free to reach out via the comments if you have any questions :)

James Matson is a DevOps Lead, C# and Python enthusiast, writer, AWS Community Builder, Microsoft PowerApps Champion, and AI/ML tinkerer.