Save costs in a Langchain pipeline using the LLUMO Compressor API
Welcome to this Colab Notebook tutorial! In this guide, we will demonstrate how to efficiently incorporate the LLUMO Compressor API into your Langchain pipeline to save costs while utilizing Large Language Models (LLMs). Specifically, we will walk you through the process of extracting answers from a PDF document by integrating Langchain for vector search, OpenAI for generating answers, and LLUMO to compress prompts before sending them to OpenAI.
Using LLMs like OpenAI’s GPT-4 can be expensive, especially when dealing with large documents or frequent queries. The LLUMO Compressor API helps mitigate these costs by compressing prompts before they are sent to the LLM, ensuring you get the same high-quality responses at a lower cost. This tutorial is particularly useful for developers and data scientists who want to optimize their Langchain pipelines by incorporating cost-saving measures without compromising on performance.
By the end of this tutorial, you will have a fully functional pipeline that efficiently extracts and processes information from PDFs, providing accurate answers while minimizing costs through prompt compression.
Let’s get started!
In this first step, we will install several essential Python libraries that are required to build our Langchain pipeline with the LLUMO Compressor API. Each library serves a specific purpose in our workflow, enabling us to handle natural language processing, PDF reading, environment variable management, similarity search, and interaction with OpenAI’s API. Here’s a detailed overview of each library we will be using:
Langchain: This library and its components provide a robust framework for natural language processing tasks. Langchain offers tools for building and orchestrating applications around language models, making it an integral part of our pipeline for vector search and text processing.
PyPDF2: PyPDF2 is a pure Python library that allows us to read and manipulate PDF files. In this tutorial, we will use PyPDF2 to extract text from PDF documents, which will then be processed and analyzed using our natural language processing tools.
python-dotenv: This library is used for managing environment variables in Python. By using python-dotenv, we can securely store and access sensitive information such as API keys and other configuration settings needed for our project.
faiss-cpu: Developed by Facebook AI Research, faiss-cpu is a powerful library for efficient similarity search and clustering of dense vectors. We will use faiss-cpu to perform vector search on the text extracted from PDF documents, enabling us to find relevant passages quickly and accurately.
openai: The openai library allows us to interact with OpenAI’s API, enabling us to use their powerful language models for generating answers. This library will be crucial for sending prompts and receiving responses from OpenAI’s LLM.
requests and json: requests is used for making HTTP requests, and json (part of Python’s standard library, so it needs no installation) handles JSON data. We will use requests to communicate with external APIs, including the LLUMO Compressor API, and json to parse and manipulate the JSON data returned by these APIs.
Let’s start by installing these libraries using the following pip command. This command will ensure that all the necessary libraries are installed and ready to use in our Colab environment.
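One plausible form of the install cell, based on the libraries listed above (langchain-openai and langchain-community are included because we import from them later; the original notebook’s exact package list may differ):

```python
!pip install langchain langchain-openai langchain-community PyPDF2 python-dotenv faiss-cpu openai requests
```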
Before we dive into the specific functions and steps of our pipeline, it’s important to import all the necessary libraries that we’ll be using throughout this notebook.
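A consolidated import cell covering everything used in this notebook might look like this (google.colab is available only inside Colab):

```python
import os
import json
import requests
from getpass import getpass

from PyPDF2 import PdfReader
from google.colab import files  # Colab-only upload helper
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.question_answering import load_qa_chain
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
```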
To interact with OpenAI’s API and the LLUMO Compressor API, we need to provide our unique API keys for authentication. These keys are sensitive pieces of information that should be handled securely. In this step, we will use Python’s getpass module to safely input our API keys and then store them in environment variables for later use.
Importing Required Modules:
- getpass: This module provides a way to securely prompt the user for input without echoing the input back to the screen. This is particularly useful for handling sensitive information like API keys.
- os: This module provides a way to interact with the operating system, including setting environment variables.
Prompting for the OpenAI API Key:
- openai_api_key = getpass("Enter your OpenAI API key: "): This line prompts the user to enter their OpenAI API key. The input is not displayed on the screen for security reasons.
Prompting for the LLUMO API Key:
- llumo_api_key = getpass("Enter your LLUMO API key: "): Similarly, this line prompts the user to enter their LLUMO API key securely.
Storing the API Keys in Environment Variables:
- os.environ['OPENAI_API_KEY'] = openai_api_key: This line stores the OpenAI API key in an environment variable named OPENAI_API_KEY.
- os.environ['LLUMO_API_KEY'] = llumo_api_key: This line stores the LLUMO API key in an environment variable named LLUMO_API_KEY.
Deleting the Variables:
- del openai_api_key: This line deletes the variable openai_api_key from memory to ensure that the API key is not accidentally exposed or misused later in the code.
- del llumo_api_key: This line deletes llumo_api_key for the same reason.
By following these steps, we ensure that our API keys are securely handled and stored, reducing the risk of accidental exposure. This setup is crucial for maintaining the security and integrity of our project.
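Assembled from the lines above, the key-handling cell looks like this:

```python
from getpass import getpass
import os

# Prompt for the keys without echoing them to the screen
openai_api_key = getpass("Enter your OpenAI API key: ")
llumo_api_key = getpass("Enter your LLUMO API key: ")

# Store the keys in environment variables for later use
os.environ['OPENAI_API_KEY'] = openai_api_key
os.environ['LLUMO_API_KEY'] = llumo_api_key

# Delete the local variables so the keys are not accidentally exposed later
del openai_api_key
del llumo_api_key
```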
In this step, we define a function to read a PDF file and extract all its text content. This function, load_pdf, utilizes the PdfReader class from the PyPDF2 library to achieve this. By looping through each page in the PDF and using the extract_text() method, it collects all the text and returns it as a single string. This function is a key component of our pipeline, as it allows us to convert the PDF content into a format that can be further processed and analyzed using natural language processing tools.
Importing the PdfReader Class:
- from PyPDF2 import PdfReader: This line imports the PdfReader class from the PyPDF2 library. PdfReader is used to read and manipulate PDF files.
Defining the load_pdf Function:
- def load_pdf(pdf): This line defines a function named load_pdf that takes a single argument, pdf, which is the path to the PDF file we want to read.
Creating a PdfReader Instance:
- pdf_reader = PdfReader(pdf): This line creates an instance of PdfReader for the given PDF file. This instance allows us to access the contents of the PDF.
Initializing a Text Variable:
- text = "": This line initializes an empty string variable text that will be used to accumulate the extracted text from each page of the PDF.
Looping Through the Pages:
- for page in pdf_reader.pages: This line starts a loop that iterates over each page in the PDF. pdf_reader.pages is a list of all the pages in the PDF.
Extracting Text from Each Page:
- text += page.extract_text(): Within the loop, this line extracts the text content from the current page using the extract_text() method and appends it to the text variable. This method is provided by the PyPDF2 library and returns the text found on the page.
Returning the Extracted Text:
- return text: After the loop has processed all the pages, this line returns the accumulated text as a single string.
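Assembled from these lines, the function looks like this (note that extract_text() may return little or nothing for image-only pages):

```python
from PyPDF2 import PdfReader

def load_pdf(pdf):
    # Read the PDF and concatenate the text of every page
    pdf_reader = PdfReader(pdf)
    text = ""
    for page in pdf_reader.pages:
        text += page.extract_text()
    return text
```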
In this step, we define a function to split a large block of text into smaller, manageable chunks. This is important for efficiently processing the text with natural language processing tools, as working with smaller chunks can improve performance and accuracy. The function split_text uses RecursiveCharacterTextSplitter, imported above, to achieve this.
Importing the RecursiveCharacterTextSplitter Class:
- from langchain.text_splitter import RecursiveCharacterTextSplitter: This line imports the RecursiveCharacterTextSplitter class from the langchain.text_splitter module. This class is designed to split text into smaller chunks based on specified parameters.
Defining the split_text Function:
- def split_text(text): This line defines a function named split_text that takes a single argument, text, which is the large block of text we want to split into smaller chunks.
Creating an Instance of RecursiveCharacterTextSplitter:
- text_splitter = RecursiveCharacterTextSplitter(...): This line initializes an instance of RecursiveCharacterTextSplitter with specific parameters:
  - chunk_size=1000: This parameter sets the maximum size of each text chunk to 1000 characters, ensuring that no chunk is too large to handle efficiently.
  - chunk_overlap=200: This parameter sets the overlap between consecutive chunks to 200 characters. Overlapping chunks can help maintain context between chunks, which is important for tasks like text analysis and question answering.
  - length_function=len: This parameter specifies the function used to measure the length of the text. Here, the built-in len function is used, which measures length in characters.
Splitting the Text:
- return text_splitter.split_text(text=text): This line uses the split_text method of the RecursiveCharacterTextSplitter instance to split the input text into smaller chunks based on the specified parameters. The method returns a list of text chunks.
In this step, we define a function to create embeddings for text chunks and store them in a vector store. This is essential for performing efficient similarity searches and finding relevant text passages. The function load_embeddings uses the OpenAIEmbeddings class and the FAISS vector store.
Importing Necessary Classes:
- from langchain_openai import OpenAIEmbeddings: This import statement brings in the OpenAIEmbeddings class, which is used to generate embeddings for text using OpenAI’s models.
- from langchain_community.vectorstores import FAISS: This import statement brings in the FAISS class, which is used to create and manage a vector store for efficient similarity search.
Defining the load_embeddings Function:
- def load_embeddings(store_name, chunks): This line defines a function named load_embeddings that takes two arguments:
  - store_name: A placeholder for the name of the vector store (not used in this specific function, but useful for future extensions).
  - chunks: A list of text chunks for which embeddings will be generated.
Creating an OpenAIEmbeddings Instance:
- embeddings = OpenAIEmbeddings(): This line initializes an instance of the OpenAIEmbeddings class, which will be used to generate embeddings for the text chunks.
Creating a FAISS Vector Store:
- vector_store = FAISS.from_texts(chunks, embedding=embeddings): This line creates a FAISS vector store from the list of text chunks. The from_texts method generates an embedding for each chunk using the embeddings instance and stores them in the FAISS vector store, which enables efficient similarity search so we can quickly find the most relevant text chunks based on query embeddings.
Returning the Vector Store:
- return vector_store: This line returns the created FAISS vector store, which contains the embeddings for all the text chunks and is ready for similarity search operations.
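Assembled:

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

def load_embeddings(store_name, chunks):
    # store_name is unused for now; kept for future persistence extensions
    embeddings = OpenAIEmbeddings()
    vector_store = FAISS.from_texts(chunks, embedding=embeddings)
    return vector_store
```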
In this step, we define a function to compress text using the LLUMO Compressor API. This function sends a request to the LLUMO API, receives the compressed text, and calculates the compression percentage. The function also handles any errors that might occur during the process. Let’s understand the key steps one by one:
- Retrieving the API key: The function reads the LLUMO API key from the environment using os.getenv(). The API key is essential for authenticating the request to the LLUMO API.
- Setting the request headers: The headers specify the content type (application/json) and the authorization token (Bearer {LLUMO_API_KEY}).
- Handling errors: The raise_for_status() method raises an exception if the request fails.
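Below is a minimal sketch of compress_with_llumo. The endpoint URL, request body, and response field names are hypothetical placeholders, not LLUMO’s documented API; only os.getenv(), the headers, and raise_for_status() come from the steps above, so consult LLUMO’s documentation for the actual contract.

```python
import os
import requests

LLUMO_COMPRESS_URL = "https://app.llumo.ai/api/compress"  # hypothetical endpoint

def compress_with_llumo(text):
    # Read the key stored earlier; it authenticates the request to LLUMO
    api_key = os.getenv("LLUMO_API_KEY")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    try:
        response = requests.post(LLUMO_COMPRESS_URL,
                                 json={"prompt": text},  # assumed request body
                                 headers=headers)
        response.raise_for_status()  # raises an exception if the request failed
        compressed = response.json()["compressedPrompt"]  # assumed response field
        # Compression percentage, measured here in characters
        percentage = (1 - len(compressed) / len(text)) * 100
        return compressed, True, percentage
    except Exception as e:
        # On any failure, fall back to the original text
        print(f"LLUMO compression failed: {e}")
        return text, False, 0.0
```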
We will now combine everything to get the final code. The main function serves as the entry point for our PDF Query Assistant. It integrates all the steps discussed previously, from uploading and processing a PDF file to querying the content and using LLUMO compression to optimize costs. We will go into the details of each part of the main function:
- Uses files.upload() to upload a PDF file.
- Calls the load_pdf function to extract text from the PDF.
- Calls the split_text function to split the extracted text into manageable chunks.
- Calls the load_embeddings function to create a vector store from the text chunks.
- Calls the compress_with_llumo function to compress the context using the LLUMO API.
- Initializes the language model (ChatOpenAI) with specified parameters.
- Generates the answer using a question-answering chain created with load_qa_chain.
We have now integrated all the steps to get the final code in the main function. This comprehensive function handles PDF uploading, text extraction, text chunking, embedding generation, similarity search, text compression with LLUMO, and generating a response to the user’s query using the compressed or original text.
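A sketch of how these pieces might fit together is shown below, ending with the entry-point check discussed next. The model name, temperature, and the three-value return of compress_with_llumo are illustrative assumptions, not values prescribed by the tutorial.

```python
from google.colab import files  # Colab-only upload helper
from langchain_openai import ChatOpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.schema import Document

def main():
    # Upload a PDF in Colab and build the vector store from its text
    uploaded = files.upload()
    pdf_path = next(iter(uploaded))
    chunks = split_text(load_pdf(pdf_path))
    vector_store = load_embeddings(pdf_path, chunks)

    # Retrieve the chunks most relevant to the user's question
    query = input("Enter your question: ")
    docs = vector_store.similarity_search(query)
    context = " ".join(doc.page_content for doc in docs)

    # Compress the retrieved context with LLUMO; fall back to the original text
    context, success, percentage = compress_with_llumo(context)
    if success:
        print(f"Context compressed by {percentage:.1f}%")

    # Answer the question over the (possibly compressed) context
    llm = ChatOpenAI(model="gpt-4", temperature=0)  # illustrative parameters
    chain = load_qa_chain(llm, chain_type="stuff")
    answer = chain.run(input_documents=[Document(page_content=context)],
                       question=query)
    print(answer)

if __name__ == "__main__":
    main()
```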
By including this check, we ensure that the main() function is only executed when the script is run directly, providing flexibility and preventing unintended behavior when the script is imported elsewhere.
When you run it, the script will ask you to upload a PDF and, once the upload completes, prompt you for the question you want answered.