This updated application combines LangChain with a Gemma model served through Groq to summarize content from YouTube videos, websites, audio files, and documents, and provides an interactive chat interface over the summarized data. Users can condense large content and then question it conversationally.
- Summarize YouTube Videos: Extracts transcripts and generates concise summaries of video content.
- Website Summarization: Summarizes text from any web page.
- Audio File Transcription and Summarization: Transcribes and summarizes audio files (e.g., `.mp3`, `.wav`, `.m4a`).
- Document Summarization: Handles `.pdf`, `.docx`, and `.txt` files for summarization.
- Chat with Summarized Data: Enables users to ask questions and interact with the summarized content using a chatbot interface.
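
The four input types listed above imply some form of routing step before summarization. The following is a minimal sketch of how that routing could look, not the app's actual code: the function name `classify_input`, the host list, and the returned labels are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the feature list; the host set is an assumption.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".txt"}
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_input(source: str) -> str:
    """Return 'youtube', 'website', 'audio', or 'document' for a URL or file name."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        return "youtube" if host in YOUTUBE_HOSTS else "website"

    # Not an http(s) URL: treat it as a file name and look at its extension.
    suffix = PurePosixPath(source).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"
    raise ValueError(f"Unsupported input: {source!r}")


if __name__ == "__main__":
    for example in ("https://youtu.be/abc", "https://example.com/post", "talk.mp3", "notes.docx"):
        print(example, "->", classify_input(example))
```

Matching on the extension alone keeps the sketch short; a real upload handler would likely also check the file's content type.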
- LangChain: Provides LLM-based summarization and retrieval capabilities.
- Groq Gemma Model: The language model, accessed through the Groq API, that generates the summaries and chat answers.
- Streamlit: For a dynamic and user-friendly web interface.
- FAISS: For efficient vector-based retrieval in the chatbot.
- Whisper by OpenAI: Transcribes audio files into text.
- Python 3.8 or higher
- A Groq API Key (required for the app)
git clone https://github.com/amitanand983/langchain-summarizer.git
cd langchain-summarizer
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Add Your Groq API Key:
  - Once the app is running, go to the sidebar.
  - Enter your Groq API Key.
- Run the Application: `streamlit run app.py`
- Provide Input:
  - Enter a YouTube or Website URL.
  - Upload an audio file (`.mp3`, `.wav`, `.m4a`).
  - Upload a document (`.pdf`, `.docx`, `.txt`).
- Start the app: Open the app in your browser (default: `http://localhost:8501`).
- Input Content:
  - Enter a YouTube URL for video summarization.
  - Paste a website URL to summarize webpage content.
  - Upload an audio file or document for transcription and summarization.
- Summarize the Content: Click "Summarize the Content". The app processes the input and displays a concise summary.
- Chat with the Data:
  - Use the chat interface to ask questions about the summarized content.
  - View responses with relevant sources displayed for transparency.
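
To illustrate how a chat answer can carry its sources, here is a toy retrieval sketch. It deliberately swaps in bag-of-words cosine similarity from the standard library in place of FAISS and learned embeddings, so it shows only the general idea of ranking chunks and returning their labels. The function `retrieve` and the `chunk-N` labels are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def _vectorize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[Tuple[str, str]], k: int = 2) -> List[Tuple[str, str, float]]:
    """Rank (source_label, text) chunks against the question; return the top k with scores."""
    query = _vectorize(question)
    scored = [(source, text, _cosine(query, _vectorize(text))) for source, text in chunks]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    chunks = [
        ("chunk-0", "FAISS stores vectors for fast similarity search"),
        ("chunk-1", "Streamlit renders the web interface"),
        ("chunk-2", "Whisper transcribes audio files into text"),
    ]
    for source, text, score in retrieve("Which tool transcribes audio?", chunks, k=1):
        print(f"[{source}] {text} (score={score:.2f})")
```

Returning the source label alongside each chunk is what lets an interface show where an answer came from.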
- Token Limit: Very large inputs (e.g., lengthy videos or documents) are split into chunks for processing, so a summary may lose context that spans chunk boundaries.
- YouTube Restrictions: Videos with disabled transcripts cannot be processed.
- Model Dependency: Requires an active Groq API Key for the Gemma model.
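
The chunking mentioned under Token Limit can be sketched as follows. This is an illustrative example only: it splits on characters rather than tokens, and the function name `chunk_text` and the default sizes are assumptions, not the app's actual settings. Overlapping neighbouring chunks is one common way to soften the loss of context at chunk boundaries.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into character chunks of at most chunk_size, sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end, to avoid a redundant trailing chunk.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
    # -> ['abcd', 'cdef', 'efgh', 'ghij']
```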
Contributions are welcome! Feel free to open issues or submit pull requests for new features, bug fixes, or improvements.
- Fork the repository.
- Create a feature branch: `git checkout -b feature-name`
- Commit your changes: `git commit -m "Add new feature"`
- Push to the branch: `git push origin feature-name`
- Open a Pull Request.
This project is licensed under the MIT License.
Developed by Amit Anand.
Feel free to reach out for suggestions or feedback!