Set up the Prompt Compression API
Start cutting AI cost in just 2 minutes
Getting Started with Prompt Compression API
This guide provides step-by-step instructions for setting up the Prompt Compression API, which compresses input prompts before passing them to LLMs, helping you save cost and reduce inference latency.
1. Open compressor setup modal
The “Compressor setup modal” provides your essential connection details: API endpoint, key, and sample request bodies. This ensures a smooth setup for compressing your prompts.
Log in and get access
If you have already completed a demo call and received access to the LLUMO app, this step is complete.
If you are new to LLUMO AI, please refer to Create Account and Activate your account.
Select cost saving journey
After you have access and log in to LLUMO for the first time, you will see a modal asking you to select a journey: Cost Saving or Evaluation. Select the Cost Saving option.
Navigate to Compressor and open setup modal
You should now see the setup modal for the Compressor. If you do not see the modal, contact our support team.
If you close the modal window, you can open it again by clicking the “Setup Now” CTA on the banner at the top of the page.
2. Access your API keys
The API key gives you the credentials needed to connect to the Prompt Compression API, so you can compress your prompts and reduce cost.
Keep your API keys secure. If you notice any misuse, contact our support team.
Click on Connect Now
The setup modal provides an overview of Compressor and a “Connect Now” CTA.
Click “Connect Now” to open the API connection details modal.
Copy API endpoint and API key
- API endpoint: The URL you will use to send requests to the Compressor API.
- API key: Your unique API key, which is required for authentication. You are given one API key by default; to create multiple API keys, refer to managing your keys.
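One way to keep the copied values out of your source code is to read them from environment variables. The following is a minimal sketch, not part of the LLUMO setup modal: the variable names LLUMO_API_ENDPOINT and LLUMO_API_KEY and the helper load_llumo_credentials are hypothetical names chosen for illustration.

```python
import os


def load_llumo_credentials(env=None):
    """Return (endpoint, api_key) read from environment variables.

    The variable names LLUMO_API_ENDPOINT and LLUMO_API_KEY are
    hypothetical; use whatever names fit your project.
    """
    env = os.environ if env is None else env
    endpoint = env.get("LLUMO_API_ENDPOINT", "").strip()
    api_key = env.get("LLUMO_API_KEY", "").strip()
    if not endpoint or not api_key:
        # Fail early with a clear message instead of sending a bad request.
        raise RuntimeError(
            "Set LLUMO_API_ENDPOINT and LLUMO_API_KEY before sending requests."
        )
    return endpoint, api_key
```

Calling load_llumo_credentials() with no arguments reads from the real process environment; passing a dictionary is convenient for testing.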
Add sample code in your editor
Sample code containing request bodies is provided in several languages; you can use it to send a request to the Compressor API. Copy the sample code block and paste it into the editor of your choice. Below is an example of a sample code block in Python.
Understand the request body
The request body includes the following important parameters:
- prompt: (Mandatory) The text prompt you want to compress.
- topic: (Optional) The topic of the prompt. We highly recommend providing a topic for better compression results.
Here is an example of these two parameters.
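As a rough illustration, the two parameters could be assembled into a JSON body like this. The sketch assumes the body is a flat JSON object keyed by prompt and topic; the helper build_request_body and the example values are hypothetical, so check the sample in your setup modal for the exact shape.

```python
import json


def build_request_body(prompt, topic=None):
    """Build a JSON request body from the two parameters described above."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is mandatory and must be a non-empty string")
    body = {"prompt": prompt}
    if topic:  # topic is optional, so include it only when provided
        body["topic"] = topic
    return json.dumps(body)


# Illustrative values only.
print(build_request_body(
    "Summarize the quarterly sales report in three bullet points.",
    topic="sales reporting",
))
```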
For fine-grained control over logging and cost visualization, explore the optional parameters in the request body. These parameters let you tailor metrics to your needs and visualize cost breakdowns on the dashboard. For a detailed breakdown of the API request and response, please refer to the API reference page.
3. Send your first request to the Prompt Compression API
Once your request is set up with a sample body, you can send a request to the Compressor API to compress your own prompts.
Write code with correct request body in editor
- Open the directory where your LLM project resides in the editor of your choice.
- Edit the sample code snippet you copied from the LLUMO setup modal.
- Replace the placeholder values for prompt and topic with your own prompt and topic.
- Include your API key in the header of the request.
- Run the code snippet.
You can find detailed instructions on setting up the API in your code on the dedicated guide page.
Sending request via API testing tools
For a more interactive experience, we recommend using API testing tools like Postman or Thunder Client. The video below explains the complete process step by step.
See the request in logs and visualise metrics on dashboard
After sending the request, you can view the logs and metrics generated on the dashboard to see how the compression process performed. This information can help you optimize your requests for better results.
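To sanity-check what you see, you can compute a simple savings figure yourself. The sketch below is an illustrative calculation, not the dashboard's own metric definition: it assumes you know the token counts before and after compression, and the function name and numbers are hypothetical.

```python
def compression_savings(original_tokens, compressed_tokens):
    """Return the fraction of tokens removed by compression (0.0 to 1.0)."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    if not 0 <= compressed_tokens <= original_tokens:
        raise ValueError("compressed_tokens must be between 0 and original_tokens")
    return 1 - compressed_tokens / original_tokens


# Illustrative numbers only, not measured results.
print(f"{compression_savings(1000, 400):.0%} of tokens removed")
```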
All Done!
Congrats! You’ve set up LLUMO’s Prompt Compression API, and it now saves you cost with each inference!
If you encounter any difficulties during setup, please refer to the troubleshooting section of the guide or contact our support team at connect@llumo.ai.