Setup prompt compression API

Getting Started with Prompt Compression API

This guide provides step-by-step instructions for setting up the Prompt Compression API, which compresses input prompts before they are passed to LLMs, helping you save cost and reduce inference latency. The “Compressor setup modal” provides your essential connection details: API endpoint, API key, and sample request bodies. This ensures a smooth setup for compressing your prompts.

1. Open compressor setup modal

Select cost saving journey

After you get access and log in to LLUMO for the first time, you will see a modal asking you to select a journey: Cost Saving or Evaluation. Select the Cost Saving option.

Navigate to Compressor and open setup modal

You should now see the setup modal for the Compressor. If you cannot see it, contact our support team. If you close the modal window, you can open it again by clicking the “Setup Now” CTA on the banner at the top of the page.

2. Access your API keys

Your API keys are the credentials you need to connect to the Prompt Compression API, so that you can compress your prompts and reduce costs.

Keep your API keys secure. If you notice any misuse, contact our support team.
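One common way to keep a key out of source code is to read it from an environment variable at runtime. The sketch below illustrates that idea; the variable name `LLUMO_API_KEY` and the helper `load_api_key` are assumptions made for this example, not names defined by LLUMO.

```python
import os


def load_api_key(env_var: str = "LLUMO_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return api_key
```

With this approach the key never appears in files you commit; you set the variable once in your shell or deployment environment and call `load_api_key()` where the request headers are built.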

Click on Connect Now

The setup modal provides an overview of the Compressor and a “Connect Now” CTA. Click “Connect Now” to open the API connection details modal.

Copy API endpoint and API key

API endpoint: The URL you will use to send requests to the Compressor API.
API key: Your unique API key, which is required for authentication. You are given one API key by default. To create multiple API keys, refer to managing your keys.

Add sample code in your editor

Sample code containing request bodies is provided in different languages, and you can use it to send a request to the Compressor API. Copy the sample code block and paste it into the editor of your choice. Below is an example of the sample code block in Python.

compressPrompt.py

    import requests

    # API endpoint shown in the setup modal
    url = "https://app.llumo.ai/api/compress"

    # Replace <string> with the prompt you want to compress
    payload = {"prompt": "<string>"}
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer <API_KEY>",  # replace <API_KEY> with your key
    }

    response = requests.post(url, json=payload, headers=headers)

    print(response.text)

Understand the request body

The request body includes the following important parameters:

prompt: (Mandatory) The text prompt you want to compress.
topic: (Optional) The topic of the prompt. We highly recommend providing a topic for better compression results.

Here is an example of these two parameters.

"prompt": "In a forgotten corner of the universe, a nomadic tribe journeys across a desolate wasteland on colossal, bio-engineered sandworms. These colossal creatures, revered as both gods and transportation, are on the verge of extinction due to a mysterious blight. A young rider, ostracized for their curiosity about the world beyond the sand, discovers a hidden cache of ancient technology. This technology offers a glimmer of hope for saving the sandworms, but it also unlocks forbidden knowledge that threatens to shatter the tribe's long-held beliefs. Torn between tradition and the promise of progress, the young rider must embark on a perilous quest to uncover the truth about the blight and the tribe's origins. Their journey will lead them to face colossal sand creatures, navigate treacherous landscapes, and confront a powerful ruling class determined to maintain the status quo. Will they succeed in saving the sandworms and their people, or will they succumb to the dangers that lie hidden beneath the sands? Write a story about their fight for survival and the burden of knowledge."
"topic": "What are the potential consequences of the young rider's choice between tradition and progress in saving the sandworms?"
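As a minimal sketch of how these two parameters might be assembled in code, the helper below builds a request body with the mandatory `prompt` and adds `topic` only when one is given. The function name `build_payload` is hypothetical, and the sample strings are shortened placeholders rather than the full example above.

```python
import json
from typing import Optional


def build_payload(prompt: str, topic: Optional[str] = None) -> dict:
    """Build a request body with a mandatory prompt and an optional topic."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is mandatory and must not be empty")
    payload = {"prompt": prompt}
    if topic:
        # topic is optional, so it is only included when provided
        payload["topic"] = topic
    return payload


example = build_payload(
    prompt="Write a story about a nomadic tribe that rides bio-engineered sandworms.",
    topic="Consequences of choosing between tradition and progress",
)
print(json.dumps(example, indent=2))
```

Validating the prompt before sending avoids a round trip for a request that is missing its only mandatory field.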

For fine-grained control over logging and cost visualization, explore the optional parameters within the request body. These parameters allow you to tailor metrics to your specific needs and visualize cost breakdowns on the dashboard. For a detailed breakdown of the API request and response, please refer to the API reference page.

3. Send first request to Prompt Compression API

Once your request is set up with a sample body, you can send a request to the Compressor API to compress your own prompts.

Write code with correct request body in editor

Open the directory where your LLM project resides in the editor of your choice.
Edit the sample code snippet you copied from the LLUMO setup modal.
Replace the placeholder values for prompt and topic with your specific prompt and topic.
Include your API key in the header of the request.
Run the code snippet.

You can find detailed instructions on setting up the API in your code on the dedicated guide page.
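The steps above can be sketched as a small reusable function. This version uses Python's standard-library `urllib` in place of the `requests` package from the sample code, so it runs without extra dependencies. The endpoint is the one from the sample code; the function names, the `LLUMO_API_KEY` environment variable, and the 30-second timeout are assumptions for illustration. The response is returned as raw text because its structure is described on the API reference page, not here.

```python
import json
import os
import urllib.request

API_URL = "https://app.llumo.ai/api/compress"  # endpoint from the sample code


def build_request(prompt, topic, api_key):
    """Prepare a POST request carrying the prompt, the optional topic, and the key."""
    body = {"prompt": prompt}
    if topic:
        body["topic"] = topic
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def compress_prompt(prompt, topic=None, api_key=None, timeout=30):
    """Send the request and return the raw response text."""
    # Fall back to an environment variable (hypothetical name) if no key is passed.
    api_key = api_key or os.environ["LLUMO_API_KEY"]
    request = build_request(prompt, topic, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `compress_prompt("Your prompt here", topic="Your topic here")` sends one request. Keeping `build_request` separate makes it possible to inspect the headers and body before anything goes over the network.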

Sending request via API testing tools

For a more interactive experience, we recommend using API testing tools like Postman or Thunder Client. Below is a detailed step-by-step video explaining the complete process.

See the request in logs and visualise metrics on dashboard

After sending the request, you can view the logs and metrics generated on the dashboard to see how the compression process performed. This information can help you optimize your requests for better results.

All Done!

Congrats! You’ve set up LLUMO’s Prompt Compression API, and it has started saving you cost with each inference! If you encounter any difficulties during setup, please refer to the troubleshooting section of the guide or contact our support team at connect@llumo.ai.
