In the previous post, we discussed how the RAG pattern alone isn’t enough to answer complex user questions. We also saw how using Semantic Kernel to orchestrate AI calls allows the AI to generate its own plan for answering questions from various data sources.
In my GitHub repo, I have a sample application that demonstrates this idea.
As a reminder, we are building a customer support chatbot for our fictional outdoor sporting equipment company. We want to be able to answer common customer support questions by utilizing our internal data sources.
Will my sleeping bag work for my trip to Patagonia next month?
Fictional data sources
Sample 3rd party APIs (data sources)
Looking at the data source API implementations, we can see sample implementations that return dummy data. In the real world, these would be more complex services (and might not even be owned by the team building the chatbot).
In order to allow the Semantic Kernel (and by extension, the Azure OpenAI service) to call 3rd party APIs, you need to register “plugins” with the kernel. These plugins are delegates that wrap API calls to external systems. This abstracts the OpenAI service from the implementation details of how to call APIs (authorization, URIs, etc.).
When you register the APIs, you provide a “Description” attribute decorator, which is a human-readable description of what the plugin does. This description instructs the Semantic Kernel when to call your API to retrieve a piece of data. For more information, see the docs for the “SKFunctionAttribute”.
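As a sketch of what such a plugin looks like, here is a hypothetical Order History plugin decorated with the `SKFunction` and `Description` attributes. The class name, method name, and return payload are illustrative (not from the repo), and the attribute namespaces match the 2023 Semantic Kernel preview releases, which may differ from newer versions.

```csharp
using System.ComponentModel;
using System.Threading.Tasks;
using Microsoft.SemanticKernel.SkillDefinition; // attribute namespace in the 2023 previews

// Hypothetical plugin class — names are illustrative.
public class OrderHistoryPlugin
{
    [SKFunction, Description("Gets the order history for the given username, " +
        "including the ProductID of each item purchased.")]
    public async Task<string> GetOrderHistoryAsync(
        [Description("The username of the customer")] string username)
    {
        // In the real plugin this would call the Order History API;
        // here we return dummy data to keep the sketch self-contained.
        await Task.CompletedTask;
        return @"[{ ""productId"": ""SB-100"", ""name"": ""Alpine Sleeping Bag"" }]";
    }
}
```

The `Description` text on both the method and its parameters is what the planner reads when deciding whether (and how) to call this function.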
Much of the “prompt engineering” work lies in explaining to the OpenAI service when it should call your API to retrieve part of the data needed to answer a query. I had to run the example several times, observe the common errors and mistakes, and adjust the descriptions to influence the model to choose the right API, pass in the right data, and call things in the right order.
Order History plugin
Here is the implementation of the Order History API.
The most important piece of data in this API is the ProductID, as it is needed to call the ProductCatalog API. Part of the magic of the planner is that it can parse the inputs and outputs of the plugins and pull out the relevant pieces of information.
In all of my example plugin implementations, I use Dapr to abstract the service-to-service invocations. This makes it easy to run the code both locally and in Azure Container Apps without worrying about port numbers, URLs, etc. For instance, when running locally, each service needs a different port number. Dapr hides this deployment detail from my code. My code only needs to know “call the service with the service tag order-history” and Dapr will figure out how to route the request correctly.
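With the Dapr .NET SDK, that invocation looks roughly like the sketch below. The `orders/{username}` route and the `Order` record are illustrative assumptions; only the `order-history` app ID comes from the post.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using Dapr.Client;

var username = "jane.doe"; // illustrative

var dapr = new DaprClientBuilder().Build();

// Dapr resolves the "order-history" app ID to the right host and port,
// whether we are running locally or in Azure Container Apps.
var orders = await dapr.InvokeMethodAsync<List<Order>>(
    HttpMethod.Get, "order-history", $"orders/{username}");

public record Order(string ProductId, string Name);
```

The key point is that the code never mentions a hostname or port — only the logical service tag.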
Historical Weather Lookup plugin
Here is the implementation of the Historical Weather Lookup API.
The most important piece of data is the lowest expected temperature the sleeping bag can support.
Semantic Kernel implementation
Setting up the kernel
Using the Semantic Kernel inside your application is straightforward.
First, we need to initialize the kernel.
We need to give the kernel both the endpoint of our Azure OpenAI service & the credentials needed to authenticate with it.
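A minimal initialization sketch, using the builder API from the 2023 Semantic Kernel previews (~0.2x); newer releases renamed several of these methods. The deployment name and endpoint are placeholders.

```csharp
using System;
using Microsoft.SemanticKernel;

// Builder method names match the 2023 previews; they differ in SK 1.0+.
var kernel = new KernelBuilder()
    .WithAzureChatCompletionService(
        "gpt-4",                                // deployment name (illustrative)
        "https://my-openai.openai.azure.com/",  // Azure OpenAI endpoint (illustrative)
        Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY"))
    .Build();
```

In production you would use a managed identity or Key Vault rather than an environment variable, but an API key keeps the sketch simple.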
Now we need to register our native plugins (the function delegates that can call out to 3rd party APIs).
These “plugin” classes contain public methods decorated with the SKFunction attribute. This indicates to the Semantic Kernel that these are valid functions to call.
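Registration then looks something like the following. In the 2023 previews this method was named `ImportSkill` and later `ImportFunctions` (it changed again in SK 1.0), and the plugin class names here are illustrative.

```csharp
// Register each plugin class under a name the planner can reference.
// Method name matches the ~0.2x previews; newer releases use ImportPluginFrom* methods.
kernel.ImportFunctions(new OrderHistoryPlugin(), "OrderHistory");
kernel.ImportFunctions(new ProductCatalogPlugin(), "ProductCatalog");
kernel.ImportFunctions(new HistoricalWeatherPlugin(), "HistoricalWeather");
```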
Invoking the planner
Finally, we can write the code to respond to each user query (in the
src/RecommendationApi/Services/RecommendationService.cs file). Here is the whole implementation.
Breaking it down, first we need to add some additional context variables that provide useful information to the Semantic Kernel (such as the username of the user that is making the request & the current date). In a real-world example, you could pull this data from the JWT or call the Graph API.
Then, we can set up a new context for each query.
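A sketch of that setup, assuming the preview-era `ContextVariables` type; the variable names are illustrative.

```csharp
using System;
using Microsoft.SemanticKernel.Orchestration; // ContextVariables in the 2023 previews

var username = "jane.doe"; // in a real app, pulled from the JWT or the Graph API

// Give the planner context it cannot infer on its own.
var context = new ContextVariables();
context.Set("username", username);
context.Set("current_date", DateTime.UtcNow.ToString("yyyy-MM-dd"));
```

Without the current date, for example, the model has no way to reason about “my trip next month”.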
But the 3rd party APIs were not explicitly built with this exact usage pattern in mind (they are built & run by independent teams). Therefore, the Semantic Kernel planner needs to incrementally call them, analyze the result and then call them again if there is a mistake (bad input data format, for example).
Now we can initialize the StepwisePlanner. This is the key part of making the orchestration happen. There are several planners available, but the most sophisticated one is the StepwisePlanner. Its magic is that it can perform a step, analyze the result, and then take the next step. This helps it deal with the fact that the APIs have to be called in a certain way and in a certain order to produce a correct answer. We can also limit the number of iterations the planner goes through to answer the question.
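Initialization looks roughly like this, using the planner types from the 2023 previews (the `StepwisePlanner` was removed in later SK releases in favor of function-calling planners); the iteration cap is an illustrative value.

```csharp
using Microsoft.SemanticKernel.Planning; // planner namespace in the 2023 previews

// Cap the think–act–observe loop so a confused plan cannot run forever.
var planner = new StepwisePlanner(kernel, new StepwisePlannerConfig
{
    MaxIterations = 15, // illustrative limit
});
```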
Next, we provide the “system” prompts to explain to the OpenAI service how we want it to behave.
Finally, we execute the planner & wait for the response.
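End to end, the final call looks something like the sketch below. The exact shape of the result object varied across preview releases, so reading the answer out of it is hedged here to a comment.

```csharp
var question = "Will my sleeping bag work for my trip to Patagonia next month?";

// Build a plan for this specific question, then run it with our context variables.
var plan = planner.CreatePlan(question);
var result = await kernel.RunAsync(context, plan);

// Depending on the preview release, the final answer is read from the returned
// context/result object (e.g. its string value).
```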
The user-facing web app is a React app that makes a Fetch API call to the RecommendationApi ASP.NET Core web API.
In part 3, I will show you the demo app.