Start cutting AI costs in just 2 minutes
Prompt compression helps reduce RAG costs by making prompts shorter while retaining their meaning. Using LLUMO AI’s simple API integration, you can easily compress your prompts and get the same output at a much lower cost, with fewer hallucinations and faster inference.
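To make this concrete, here is a minimal sketch in Python of what calling a compression endpoint might look like. The URL, payload fields, and response shape below are illustrative assumptions, not LLUMO AI’s documented API; the Connect API tab in the app has the real instructions.

```python
import requests

# Minimal sketch of a prompt-compression call.
# NOTE: the endpoint URL, payload fields, and response shape are
# ASSUMPTIONS for illustration; copy the real ones from the
# Connect API tab in the LLUMO AI app.
LLUMO_API_KEY = "your-api-key-here"  # placeholder

resp = requests.post(
    "https://api.llumo.ai/compress",  # hypothetical endpoint
    headers={"Authorization": f"Bearer {LLUMO_API_KEY}"},
    json={"prompt": "Give me an answer to the following query: ..."},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # e.g. the compressed prompt and token counts (assumed fields)
```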
RAG_Context and Query are pre-built variables that can be used to build your prompts. To create additional variables, use the “+Variable” format and reference them within your prompt using {{ variable_name }}.
Think of RAG_Context as the “background information” that you give to the AI model to help it understand the situation better.
If you ask an AI, “What should I do to improve my health?”, it can give a much more specific answer when it knows the context. The RAG_Context provides that background information.
RAG_Context Example:
"I am asking for health improvement advice for someone who is looking to lose weight and is currently inactive."
Query is the actual question or request you are asking the AI. It should be clear and direct.
Query: "What are the best exercises to start with for weight loss?"
A prompt combines RAG_Context and Query to form a complete input for the AI.
Imagine you want to ask LLUMO AI for health advice. You would use RAG_Context and Query in a combined format, like this:
"I am asking for health improvement advice for someone who is looking to lose weight and is currently inactive."
"What are the best exercises to start with for weight loss?"
Full Prompt Example:
"Give me an answer to the following query: {{ Query }} using the given context: {{ RAG_Context }}."
Here, the AI understands both the RAG_Context (the background information about losing weight and being inactive) and the Query (the specific question about exercises). This combination leads to a more accurate and relevant response.
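To see the substitution concretely, here is a small, dependency-free Python sketch that fills {{ RAG_Context }} and {{ Query }} into the template above. The render helper is our own illustration; inside LLUMO AI the substitution happens for you.

```python
# Fill {{ variable }} placeholders in a prompt template.
# LLUMO AI performs this substitution for you; this sketch just shows the idea.
def render(template: str, **variables: str) -> str:
    for name, value in variables.items():
        template = template.replace("{{ " + name + " }}", value)
    return template

prompt = render(
    "Give me an answer to the following query: {{ Query }} "
    "using the given context: {{ RAG_Context }}.",
    Query="What are the best exercises to start with for weight loss?",
    RAG_Context="I am asking for health improvement advice for someone "
                "who is looking to lose weight and is currently inactive.",
)
print(prompt)
```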
Why Use Pre-Built Variables?
You can add more specific details by creating custom variables using the “+Variable” format. For example, you might create Location and Age variables:
Location: "I live in a hot, humid climate."
Age: "I am 35 years old."
Full Prompt Example:
"Give me an answer to the following query: {{ Query }} using the given context: {{ RAG_Context }}, considering the location: {{ Location }}, and the age: {{ Age }}."
If you already have your own data ready for testing or compressing, LLUMO AI makes it easy to work with your specific set of information. Here’s a detailed step-by-step guide to help you get started with your data on LLUMO AI:
LLUMO AI offers pre-built variables like RAG_Context and Query, so you don’t have to set them up yourself. They’re ready to use from the start: simply write your RAG_Context and Query, and add any extra variables if needed. This saves you from having to manually define the same details in every prompt.
Let’s say you want to ask the AI for health advice related to weight loss. Here’s how it would look:
"I need health advice for someone trying to lose weight."
"What are the best practices for weight loss?"
So, your full prompt might look like this:
"Give me an answer to the following query: {{ Query }} using the given context: {{ RAG_Context }}."
When the AI processes this, it will take the RAG_Context (“I need health advice for someone trying to lose weight”) and the Query (“What are the best practices for weight loss?”), combine them, and generate a well-informed, accurate response.
Once you’ve written your prompt, the next step is to select the provider and model.
Pick a Provider:
Choose the service that will power the AI for your prompt.
Pick a Model:
After selecting the provider, choose a specific model from that provider’s list.
Once both the provider and model are selected, you’re all set to run your prompt and receive an answer!
Once your prompt is set and you’ve chosen the model, click the “Compress and Run” button. LLUMO AI will automatically compress the prompt, shortening it while preserving its meaning.
After the system processes your compressed prompt, LLUMO AI will show you the results.
Next to the “+Variable” button you’ll also find a history symbol. When you click on it, you can see the RAG_Context and Query values you’ve used before, which makes it easy to check your past inputs without having to enter them again.
To integrate compression into your own system, open the Connect API tab, where you’ll find simple, step-by-step instructions on how to integrate the API with your system.
Follow the guidelines carefully to integrate the API. This may involve copying a key or adding a few lines of code to link LLUMO AI with your project.
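As a sketch of what those few lines of code might look like once you have your key (again, the endpoint and field names are placeholders, not the documented API; copy the exact values from the Connect API tab):

```python
import os
import requests

# Keep the key out of source code; read it from an environment variable.
LLUMO_API_KEY = os.environ["LLUMO_API_KEY"]

def compress_prompt(prompt: str) -> dict:
    """Send a prompt to LLUMO AI for compression.

    The URL and payload/response fields below are illustrative
    placeholders; use the exact values from the Connect API tab.
    """
    resp = requests.post(
        "https://api.llumo.ai/compress",  # hypothetical endpoint
        headers={"Authorization": f"Bearer {LLUMO_API_KEY}"},
        json={"prompt": prompt},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```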
By following these three simple steps, you’ll be set up to use prompt compression and enhance your workflow in just a few minutes!
What is prompt compression, and how does it help with AI?
Prompt compression means shortening the text you give to the AI while keeping the same meaning. This saves money because the AI doesn’t need to process as much information, and it also speeds up the AI’s response time, making everything run faster and more efficiently.
How do I use RAG_Context and Query in LLUMO AI?
LLUMO AI gives you two pre-made variables called RAG_Context and Query. RAG_Context is the background information, and Query is the question you want answered. Instead of typing everything out, you just use these variables, which makes creating prompts much quicker and easier.
Can I create my own variables in LLUMO AI?
Yes! If you need extra information in your prompts, you can create your own custom variables with “+Variable.” For example, if you need a variable for someone’s location, you can call it Location and then reference it in your prompt as {{ Location }}, just like the pre-built variables.
How does prompt compression make things cheaper and faster?
By shortening your prompts, you reduce the number of tokens, which lowers the cost for using AI. Plus, smaller prompts mean the AI can process them faster, so you get quicker answers. It’s a simple way to save money and time.
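As a back-of-the-envelope illustration (the token counts and per-token price here are made-up example numbers, not LLUMO AI benchmarks):

```python
# Illustrative cost arithmetic with made-up example numbers.
price_per_1k_tokens = 0.01     # assumed input price in USD
original_tokens = 1_000        # assumed prompt size before compression
compressed_tokens = 400        # assumed size after compression

original_cost = original_tokens / 1000 * price_per_1k_tokens
compressed_cost = compressed_tokens / 1000 * price_per_1k_tokens
savings = 1 - compressed_cost / original_cost

print(f"Original:   ${original_cost:.4f} per call")
print(f"Compressed: ${compressed_cost:.4f} per call")
print(f"Savings:    {savings:.0%}")  # 60% in this example
```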
Can I view my old RAG_Context and Query values?
If you want to see the RAG_Context and Query you’ve used before, you can do that easily. Just click on the symbol next to “+variable,” and it will show you all your past inputs. This way, you don’t have to re-enter them every time.