Introduction
In the beginning, we just wanted to create a chatbot that would reliably answer questions based on the resources of our project, nestforms.com.
Ideally, we wanted a resources section at the bottom of each answer, linking to relevant resources (such as help page links).
Our data comes from several types of content: FAQs, help pages, frequent email responses, form templates, case studies and other blog pages. All of it, obviously, in different formats; roughly 10 MB of text in total.
We had this in mind for some time, and in September 2024 we agreed to go ahead. We are developers, so we did not want to simply order a service; we wanted to test the options ourselves.
We did not know enough about the options. We only knew that there are different models available within ChatGPT (which we were already using), some options from Google and Anthropic's Claude (no experience with either), and some AI services on Amazon AWS, which we use as our cloud platform (but had no experience with those either).
Starting
We started with ChatGPT, as it was the most obvious option. We had the $20 paid plan, which let us create a custom model. We prepared our data as a JSON file and uploaded it into the model together with a textual description of the data in the JSON.
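For illustration, the uploaded file looked roughly like this (the field names here are hypothetical, not our exact schema, and the content is elided):

```json
[
  {
    "type": "help_page",
    "title": "Sharing a form",
    "url": "https://www.nestforms.com/help/...",
    "text": "..."
  },
  {
    "type": "faq",
    "question": "...",
    "answer": "..."
  }
]
```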
Great news – ChatGPT was able to answer some of the questions.
It was quite hard to get the resources section right, as ChatGPT was quite unreliable and sometimes hallucinated the URLs. But otherwise this worked quite well within the online environment.
Surprise number 1: There is no way to connect to the ChatGPT custom model.
Unfortunately, we could not figure out any way to connect to our custom ChatGPT model via an API. This only became clear after hours and hours of searching the internet and discussing it with ChatGPT itself.
So we decided to try OpenAI, the company behind ChatGPT, which does offer an API you can connect to.
Surprise number 2: OpenAI custom models require different data.
Unfortunately, they needed the data in a different format: everything as JSONL (different from ChatGPT), and in a rigid question-and-answer structure.
So we had to completely rework our resources into the question/answer format.
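For illustration, converting question/answer pairs into the JSONL chat format that OpenAI fine-tuning expects can be sketched like this in Node.js (the system prompt and record field names are our own illustrative choices, not the exact data we used):

```javascript
// Convert an array of { question, answer } records into JSONL:
// one JSON object per line, each in the "messages" chat format
// used by OpenAI fine-tuning.
function toFineTuningJsonl(pairs) {
  return pairs
    .map((p) =>
      JSON.stringify({
        messages: [
          { role: "system", content: "You answer questions about NestForms." },
          { role: "user", content: p.question },
          { role: "assistant", content: p.answer },
        ],
      })
    )
    .join("\n");
}

// Example usage: write the result to a .jsonl file and upload it
// as the training file for a fine-tuning job.
const jsonl = toFineTuningJsonl([
  { question: "How do I share a form?", answer: "..." },
]);
```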
Surprise number 3: An OpenAI custom model does not work the same way as a ChatGPT custom model.
When we delivered the data, the model started to learn. It was a different process than with ChatGPT and took much longer, because this time the model really trained the neural network, a process called fine-tuning.
After the training finished, we tested the responses and were shocked. The model answered in a completely different way, even linking to our competitors' projects. And there was no way to get the resources section at the end of the result.
When we dug into why, the difference turned out to be this: the custom model in ChatGPT searches within our resources, finds the most suitable texts and then tries to build a response from them. The OpenAI custom model, however, merges the knowledge base into the model itself and cannot source any data directly at all.
We searched again for better options and checked whether we had made a mistake in our setup, but we could not improve it in any real way.
So we had to give up on working with OpenAI and ChatGPT, as they simply do not offer what we need.
Moving to Amazon AWS
As mentioned earlier, our projects are already hosted on Amazon, so we decided to try this platform.
While ChatGPT was very clear and straightforward, Amazon AWS is the complete opposite. The reason is that Amazon AWS provides just the infrastructure and is not user friendly at all. There are many, many options that you do not understand at the beginning, so you just pick "something".
Playing with Amazon Bedrock
After some searching, we started creating a Bedrock knowledge base. We prepared the resources in a fixed JSON structure (different from ChatGPT and OpenAI, again), uploaded the data to S3 and tried to connect it.
As we dug into more and more details, we found out that our best option was to create a vector store database from our data.
Surprise number 4: We cannot use the JSON format.
Within ChatGPT, the setup was quite intuitive and you could describe the structure of the JSON file in plain text, but we could not manage this within the Bedrock knowledge base. So after many tries, we gave up on the JSON format and used CSV instead.
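For illustration, the CSV we ended up with looked something like this (the column names are illustrative, not our exact schema, and the contents are elided):

```csv
question,answer,url
"How do I share a form?","...","https://www.nestforms.com/help/..."
"Can I collect data offline?","...","https://www.nestforms.com/help/..."
```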
And now we could finally see the knowledge base answering with data from our resources.
Unfortunately, we still could not get any links to our resources, because the knowledge base did not recognise the data in the individual columns correctly. Digging again, we found that there is an option to fix this: you can create a .metadata.json file that defines what type of data is in which column. This allowed us to specify the column with the resource links. And hurray, we were now getting resource links.
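For CSV sources, our understanding is that this metadata file sits next to the data file in S3 (e.g. resources.csv.metadata.json) and looks roughly like the sketch below. The "answer" and "url" field names are our own illustrative column names; verify the exact schema against the current Bedrock Knowledge Bases documentation:

```json
{
  "metadataAttributes": {},
  "documentStructureConfiguration": {
    "type": "RECORD_BASED_STRUCTURE_METADATA",
    "recordBasedStructureMetadata": {
      "contentFields": [
        { "fieldName": "answer" }
      ],
      "metadataFieldsSpecification": {
        "fieldsToInclude": [
          { "fieldName": "url" }
        ],
        "fieldsToExclude": []
      }
    }
  }
}
```

The "url" column marked as a metadata field is what later comes back alongside each retrieved chunk, which is what made our resource links section possible.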
We had already tested some API communication with ChatGPT, and it was quite straightforward. With Amazon, we hit issues again. Because there are so many options, the documentation is also more complicated and nothing is as straightforward; part of the problem is the complicated permissions. So it again took much longer.
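To give an idea of the permissions involved, the calling role needs roughly a policy like this (a hedged sketch: the ARNs are placeholders and the exact action list depends on which APIs you call):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:Retrieve",
        "bedrock:RetrieveAndGenerate"
      ],
      "Resource": "arn:aws:bedrock:eu-west-1:123456789012:knowledge-base/EXAMPLEID"
    },
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "arn:aws:bedrock:*::foundation-model/*"
    }
  ]
}
```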
Good news: in the end, we were able to connect to our Amazon knowledge base.
Surprise number 5: No streaming with Bedrock Flow, Agents or Prompt Management
Now we wanted to polish it and retrieve the data in the ideal format, with our resource links.
We started to play with Amazon Bedrock Flows, Amazon Bedrock Agents and Amazon Bedrock Prompt Management, tools that are all loosely connected to the knowledge base. But any time we made a valuable step forward in terms of the result, the service turned out to be unable to stream the response. We had to wait for the whole response and then display it at once, which took a very long time and completely lost the nice effect of the chatbot typing its answer. So we always reverted back to the plain knowledge base.
After many, many tests, we found that we could use RetrieveAndGenerateStreamCommand, a special command in the Amazon Bedrock package for Node.js. It allowed us to stream the response and also receive the additional information required for our resource links section, as defined in the metadata file. Unfortunately, it does not stream list items correctly: when the AI suggests a bullet-point list, the user must wait until the whole list is completed. We could not set this up in any better way.
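The way we consume the stream can be sketched as below. The SDK call in the comment and the event shapes reflect our understanding of @aws-sdk/client-bedrock-agent-runtime (text chunks arrive as "output" events, sources as "citation" events); the "url" metadata field is our own column name, so treat the field names as assumptions to verify against the current SDK reference:

```javascript
// Roughly how the stream is obtained (not executed here):
// const client = new BedrockAgentRuntimeClient({ region: "eu-west-1" });
// const res = await client.send(new RetrieveAndGenerateStreamCommand({ ... }));
// const state = newState();
// for await (const event of res.stream) handleEvent(state, event);

function newState() {
  return { text: "", links: [] };
}

// Accumulates streamed text chunks and collects unique source links
// from citation events, so the links section can be rendered at the end.
function handleEvent(state, event) {
  if (event.output && typeof event.output.text === "string") {
    state.text += event.output.text; // forward this chunk to the browser
  }
  if (event.citation) {
    for (const ref of event.citation.retrievedReferences || []) {
      const url = ref.metadata && ref.metadata.url; // hypothetical metadata column
      if (url && !state.links.includes(url)) state.links.push(url);
    }
  }
  return state;
}
```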
So we now receive the streamed content from the knowledge base, along with additional metadata that lets us generate the resource links section manually and place it at the end of the response.
Surprise number 6: Amazon pricing
We are a small service with fewer than a thousand question requests per month, so we needed to review the pricing.
When everything was set up, we had enabled several services with many settings, and it turned out that Amazon had provisioned 10 instances for search (and was charging for them). So we found the setting and limited this to 2 instances, which is the minimum.
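For context, these "search instances" are OpenSearch Serverless capacity units (OCUs), and the maximum is an account-level setting that can be lowered, e.g. through the aws opensearchserverless update-account-settings CLI call with a payload along these lines (a sketch based on our understanding of the API; verify the field names against the current CLI reference):

```json
{
  "capacityLimits": {
    "maxIndexingCapacityInOCU": 2,
    "maxSearchCapacityInOCU": 2
  }
}
```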
Additional price optimisation
The service was now running as expected, and Amazon was charging us about $400 per month. As this is just an additional service and not crucial, we wondered whether it could be even cheaper. After a lot of digging, we found that it might be possible to replace the OpenSearch service with an RDS database that can be paused.
So we created an RDS cluster for our knowledge base and set it to stop after 15 minutes of inactivity. This allows the database to stop overnight and start again when somebody actually asks to talk to our chatbot.
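Assuming the cluster is Aurora Serverless v2 (the variant that can scale down to zero capacity), the pause behaviour can be configured roughly like this via aws rds modify-db-cluster; the field names follow our understanding of the API, with SecondsUntilAutoPause: 900 corresponding to the 15 minutes:

```json
{
  "ServerlessV2ScalingConfiguration": {
    "MinCapacity": 0,
    "MaxCapacity": 1,
    "SecondsUntilAutoPause": 900
  }
}
```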
We had to update the way our service connects to the knowledge base, but it works quite well. The user needs to wait several more seconds while the database wakes up, but when you handle this reasonably in the interface, it is not too dramatic.
Now we are under $50 per month.
Conclusion
We finally have a chatbot that does exactly what we want: it answers only from our resources and provides a reliable list of resource links.
If you want to discuss any of the steps above with us, or you are interested in deeper knowledge, contact us and we can agree on some consultancy.
We finished this in March 2025. Since then, we have applied the same process to one of our clients, the in-el.cz website, for internal users only.
Technical background
The website is in PHP, which does not handle response streaming well, so we also created a Node.js service connected to the website via a websocket. This Node.js service then communicates with the Amazon AWS knowledge base. We have also applied rate limits, so that the service cannot be overused. There is also a connection from Node.js back to PHP to identify the type of user (subscribed users have different rate limits), and Node.js can also store the results in PHP for administration purposes.
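The per-user rate limiting in the Node.js service can be sketched as a simple sliding-window counter like the one below; the window size and limits are illustrative, not our production values:

```javascript
// Minimal sliding-window rate limiter, keyed by user id.
class RateLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.hits = new Map(); // userId -> array of request timestamps
  }

  // Returns true if the request is allowed, false if the user is over the limit.
  allow(userId, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(userId) || []).filter((t) => t > cutoff);
    if (recent.length >= this.maxRequests) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}

// Subscribed users could get a higher limit, e.g.:
// const limiter = new RateLimiter(isSubscribed ? 60 : 10, 60_000);
```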
In the Amazon knowledge base, we use Titan Text Embeddings v2 as the embeddings model and Claude 3.5 Sonnet for generating the texts.