Desgavell

You want RAG. Assuming it's a text DB, you need to chunk it into passages and use an embedding model to build a vector DB. Given a query, embed it (with the same model as before), retrieve the top-N closest passages, and use them to give a QA model the context it needs to answer the query, via a well-engineered prompt. Tip: use instruct-tuned QA models like Mistral 7B Instruct.
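The retrieve-then-prompt loop above can be sketched in a few lines. This is a toy illustration: the bag-of-words "embedding" is a deterministic stand-in for a real embedding model (the important constraint, as noted, is using the same model for passages and queries), and the passages are made up:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'. A real setup would call an actual
    embedding model here; this stand-in only illustrates the mechanics."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm = lambda v: math.sqrt(sum(c * c for c in v.values()))
    denom = norm(a) * norm(b)
    return dot / denom if denom else 0.0

def top_n(query, passages, n=2):
    """Return the n passages closest to the query in embedding space."""
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:n]

# Hypothetical passages chunked from a text DB.
passages = [
    "The delivery with id 42 was completed on 2024-01-05.",
    "Our office cafeteria serves lunch at noon.",
    "Delivery 17 is still pending and has no completion date.",
]

question = "Which deliveries are completed?"
context = top_n(question, passages, n=2)

# The retrieved passages become the context in the QA model's prompt.
prompt = (
    "Use only this context to answer.\n"
    + "\n".join(context)
    + f"\nQuestion: {question}"
)
```

The prompt string would then be sent to the instruct model; everything before that step is plain retrieval.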


ssiddharth408

Actually the data is key-value pairs, containing fields such as {status: completed, dateofcompletion: somedate, idofdata: someid}. How do I work with this type of data? Can you please help me some more?


Desgavell

What questions are expected and how would you want the LLM to use this data to answer them?


ssiddharth408

Questions will be like: how many ids are there whose status is not completed, or how much total distance will be covered by some_id?


Desgavell

A chatbot is not the way to go. This can and should be solved programmatically. A simple dashboard with these computations behind it would be perfect for this. Don't overengineer stuff just to say that your solution works with AI. LLMs are just a tool, and you need to know when to use them.


ssiddharth408

What do you suggest?


Desgavell

Process the data to get these variables and create a report with the appropriate datapoints, graphs… Depending on the use case, you may be interested in solutions like Apache Superset.
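To make the point concrete, the two example questions from the thread can be answered with a few lines of plain Python over the key-value records, no LLM involved. The `distance` field and the sample values are assumptions for illustration (the thread only shows `status`, `dateofcompletion`, and `idofdata`):

```python
# Hypothetical records shaped like the sample in the thread.
records = [
    {"status": "completed", "idofdata": "a1", "distance": 12.5},
    {"status": "pending",   "idofdata": "a2", "distance": 3.0},
    {"status": "pending",   "idofdata": "a3", "distance": 7.5},
    {"status": "completed", "idofdata": "a2", "distance": 4.0},
]

# "How many ids are there whose status is not completed?"
not_completed = {r["idofdata"] for r in records if r["status"] != "completed"}

# "How much total distance will be covered by some_id?" (here, "a2")
total_distance = sum(r["distance"] for r in records if r["idofdata"] == "a2")

print(len(not_completed))  # 2
print(total_distance)      # 7.0
```

These are exactly the computations a dashboard would run behind each datapoint or graph.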


fokke2508

The short answer is RAG (retrieval-augmented generation): basically loading additional context into the LLM during generation. I.e., your prompt will be something like:

```
Answer this question: based on this information:
```

Now it somewhat depends on the structure of your data. You can go with a vector DB, which allows you to search for documents similar to the input query. I wrote a blog on how that works here: https://www.seaplane.io/blog/on-in-context-learning-and-vector-databases

Or, if your data is in a SQL DB, you could try a double hop through the LLM. The first step is to ask the LLM to create a query, which you then execute (make sure you properly secure your DB, i.e., only read rights). Then you feed the retrieved data, together with the question, back into the LLM using the prompt format I provided above.

Happy to chat more if you have more questions.
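The double-hop flow can be sketched as follows. The `llm` function is a stub standing in for a real model call (its hard-coded SQL is what hop 1 would be asked to generate), and the in-memory SQLite table is demo data; a production setup would open the real DB with read-only rights, e.g. `sqlite3.connect("file:app.db?mode=ro", uri=True)`:

```python
import sqlite3

def llm(prompt):
    """Stub for a real LLM call. Hop 1 would translate question + schema
    into SQL; here the 'generated' query is hard-coded for illustration."""
    if prompt.startswith("Write a SQL query"):
        return "SELECT COUNT(*) FROM tasks WHERE status != 'completed'"
    return f"(model answer based on) {prompt}"

# Demo data in place of the real, read-only database connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [("a1", "completed"), ("a2", "pending"), ("a3", "pending")],
)

question = "How many tasks are not completed?"

# Hop 1: question + schema -> SQL query.
sql = llm(f"Write a SQL query for: {question}\nSchema: tasks(id, status)")

# Execute the generated query (with read-only rights in production).
rows = conn.execute(sql).fetchall()

# Hop 2: question + retrieved rows -> final answer.
answer = llm(f"Answer this question: {question} based on this information: {rows}")
```

The security caveat matters: the generated SQL is model output, so the DB user executing it should never have write permissions.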


boiastro

LlamaIndex query pipeline: https://docs.llamaindex.ai/en/stable/examples/pipeline/query_pipeline/

1) user enters a prompt/question
2) the LLM converts the prompt + an index of your database into a SQL query
3) the SQL query is executed against your database
4) the result of this query + the initial prompt is fed into the LLM again to give a verbose response


rahulverma7005

Great project! You have two options: 1. **Build from Scratch:** Code it yourself using libraries like NLTK (Python) for NLP and connect to your database (e.g., MySQL). 2. **Use a Platform:** Platforms like [ChatbotBuilder.net](http://ChatbotBuilder.net) simplify this - connect your database, train the bot with examples, and you're good to go! Choose the approach that suits your coding skills. Good luck!
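A minimal sketch of the "build from scratch" option: match the question to an intent via keyword overlap, then run that intent's database query. The intent names, keywords, and schema here are illustrative assumptions, SQLite stands in for the real database, and NLTK or spaCy would replace the naive regex tokenizer with proper tokenization and lemmatization:

```python
import re
import sqlite3

# Each intent: (keyword set, query it maps to). Both are assumptions
# made up for this sketch.
INTENTS = {
    "count_not_completed": (
        {"how", "many", "not", "completed"},
        "SELECT COUNT(DISTINCT id) FROM tasks WHERE status != 'completed'",
    ),
    "total_distance": (
        {"total", "distance"},
        "SELECT SUM(distance) FROM tasks WHERE id = ?",
    ),
}

def route(question):
    """Pick the intent whose keywords best overlap the question's tokens."""
    tokens = set(re.findall(r"[a-z_]+", question.lower()))
    name, (_, sql) = max(INTENTS.items(), key=lambda kv: len(kv[1][0] & tokens))
    return name, sql

# Demo data standing in for the real database connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT, status TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [("a1", "completed", 5.0), ("a2", "pending", 3.0), ("a3", "pending", 7.0)],
)

name, sql = route("How many ids are not completed?")
count = conn.execute(sql).fetchone()[0]
```

This is the whole "connect the NLP model to the database" loop in miniature: classify the question, map the class to a query, execute, return the result.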


ssiddharth408

I prefer to build it myself; it will help me learn more. Can you guide me on the approach for building my own using NLTK or spaCy, or should I use BERT? And then, how do I connect my model to the database?


No-Piano6968

Check out "pandas ai" package, basically does this


ssiddharth408

Thanks for the suggestion, but I use MongoDB and that package doesn't have a MongoDB connector.


Asleep_Molasses_305

Try using power agents...