5️⃣LangChain: Advanced Techniques

Advanced Techniques

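In LangChain's core modules, how data is chunked and embedded matters a great deal. There are also several approaches to choose from for prompt templates on the query side and for splitters on the retrieval side.

Below, we implement the most commonly used advanced techniques in code.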

import os
from dotenv import load_dotenv

# Write the key to .env first (replace the placeholder), then load it.
!echo "OPENAI_API_KEY=<Your_OpenAI_Key>" >> .env
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

# Download the sample files. The /raw/ URL serves the file contents rather than
# the GitHub HTML page, and -O sets the output file path.
!mkdir -p data
!wget https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/raw/main/data/food.txt -O ./data/food.txt
!wget https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/raw/main/data/founder.txt -O ./data/founder.txt
!wget https://github.com/Coding-Crashkurse/Udemy-Advanced-LangChain/raw/main/data/restaurant.txt -O ./data/restaurant.txt

Chunking

with open("./data/restaurant.txt") as f:
    raw_data = f.read()

Standard Chunking
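
Standard chunking cuts the text into pieces of a fixed maximum size, with some overlap so that a sentence cut at a boundary still appears whole in one of the neighbours. The following is a minimal, library-free sketch of that idea, not LangChain's own splitter: `chunk_text` is a hypothetical helper, the parameter names `chunk_size` and `chunk_overlap` are borrowed from LangChain's text splitters, and the sample string stands in for `raw_data`.

```python
def chunk_text(text: str, chunk_size: int = 200, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Illustrative input; in the notebook you would pass raw_data instead.
sample = "abcdefghij" * 3
chunks = chunk_text(sample, chunk_size=12, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

Real splitters usually prefer to cut at separators such as paragraph or sentence breaks rather than at arbitrary character positions, so treat the fixed window here as the simplest possible baseline.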

Semantic Chunking

  1. standard SemanticChunker

  2. breakpoint_threshold_type=['percentile', 'standard_deviation', 'interquartile']
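
To make the two items above concrete, here is a small sketch of how semantic chunking can work: embed each sentence, measure the distance between neighbouring sentences, and start a new chunk wherever the distance exceeds a threshold derived from `breakpoint_threshold_type`. This is an illustration under stated assumptions, not the `SemanticChunker` implementation: `toy_embed` is a bag-of-words stand-in for a real embedding model, the helper names are hypothetical, and the default amounts (95, 3, 1.5) reflect my understanding of the library defaults and should be checked against your installed version.

```python
import math
import re
import statistics
from collections import Counter

# Assumed default amount per threshold type (verify against your version).
DEFAULT_AMOUNT = {"percentile": 95, "standard_deviation": 3, "interquartile": 1.5}


def toy_embed(sentence: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def percentile(values: list[float], q: float) -> float:
    """Linearly interpolated percentile, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def breakpoint_threshold(distances: list[float], threshold_type: str, amount=None) -> float:
    """Turn the list of neighbour distances into a single cut-off value."""
    if threshold_type not in DEFAULT_AMOUNT:
        raise ValueError(f"unknown threshold type: {threshold_type}")
    if amount is None:
        amount = DEFAULT_AMOUNT[threshold_type]
    if threshold_type == "percentile":
        return percentile(distances, amount)
    if threshold_type == "standard_deviation":
        return statistics.mean(distances) + amount * statistics.pstdev(distances)
    iqr = percentile(distances, 75) - percentile(distances, 25)
    return statistics.mean(distances) + amount * iqr


def semantic_chunks(text: str, threshold_type: str = "percentile", amount=None) -> list[str]:
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    if len(sentences) < 2:
        return [text.strip()]
    vectors = [toy_embed(s) for s in sentences]
    distances = [cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    threshold = breakpoint_threshold(distances, threshold_type, amount)

    chunks, start = [], 0
    for i, distance in enumerate(distances):
        if distance > threshold:  # topic shift between sentence i and i + 1
            chunks.append(" ".join(sentences[start:i + 1]))
            start = i + 1
    chunks.append(" ".join(sentences[start:]))
    return chunks


# Made-up sample text; in the notebook you would pass raw_data instead.
sample = (
    "The pizza is baked in a wood oven. The pizza dough is made fresh daily. "
    "The founder studied law in Rome. The founder moved to Sicily later."
)
by_percentile = semantic_chunks(sample, "percentile")
by_std = semantic_chunks(sample, "standard_deviation")
by_iqr = semantic_chunks(sample, "interquartile")
print(len(by_percentile), len(by_std), len(by_iqr))
```

The three threshold types differ only in how strict the cut-off is. On a short text like this one the percentile rule still splits at the largest jump, while the mean-based rules with their default amounts may not split at all, so the amount usually needs tuning per corpus.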

Huggingface Embeddings

Queries: HYDE_PROMPT
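
HyDE (Hypothetical Document Embeddings) asks a language model to write a plausible answer first, then searches with the embedding of that answer instead of the raw question, since an answer tends to look more like the stored documents than a question does. The sketch below shows only the control flow and makes several assumptions: the wording of `HYDE_PROMPT` is illustrative, `fake_llm` is a hypothetical stand-in that returns a canned passage instead of calling a model, `embed` is a bag-of-words toy rather than a real embedding, and the documents are made up.

```python
import math
import re
from collections import Counter

# Illustrative template; the actual prompt wording is an assumption.
HYDE_PROMPT = (
    "Please write a short passage that answers the question.\n"
    "Question: {question}\n"
    "Passage:"
)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat model; ignores the prompt and returns a canned passage."""
    return "The restaurant was opened by its founder, a chef who trained in Sicily."


def embed(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hyde_retrieve(question: str, documents: list[str], llm=fake_llm, k: int = 1) -> list[str]:
    # 1. Generate a hypothetical answer. 2. Embed it. 3. Rank documents against it.
    hypothetical = llm(HYDE_PROMPT.format(question=question))
    query_vec = embed(hypothetical)
    ranked = sorted(documents, key=lambda d: cosine_similarity(query_vec, embed(d)), reverse=True)
    return ranked[:k]


# Made-up documents for illustration only.
docs = [
    "Our menu features wood-fired pizza and fresh pasta.",
    "The founder is a chef who trained in Sicily before opening the restaurant.",
    "We are open from noon until midnight every day.",
]
question = "How did this place get started?"

direct_scores = [cosine_similarity(embed(question), embed(d)) for d in docs]
top = hyde_retrieve(question, docs)
print(direct_scores)
print(top[0])
```

With this toy embedding the raw question shares no words with any document, so a direct search cannot rank them, while the hypothetical passage overlaps heavily with the founder document. With a real model the gap is less extreme, but the mechanism is the same.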

Retriever: Parent/Child Splitter
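
The parent/child approach indexes small child chunks, which match queries precisely, but returns the larger parent chunk each child came from, so the model receives more context. The following library-free sketch illustrates that lookup under simple assumptions: parents are paragraphs, children are sentences, `embed` is a bag-of-words toy rather than a real embedding, all helper names are hypothetical, and the sample text is made up.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(text: str):
    """Split into parent paragraphs, then index each child sentence with its parent id."""
    parents = [p.strip() for p in text.split("\n\n") if p.strip()]
    child_index = []  # list of (child_text, parent_id)
    for parent_id, parent in enumerate(parents):
        for child in re.split(r"(?<=[.?!])\s+", parent):
            child_index.append((child, parent_id))
    return parents, child_index


def retrieve_parents(query: str, parents, child_index, k: int = 1) -> list[str]:
    """Match the query against children, but return the parents they belong to."""
    query_vec = embed(query)
    ranked = sorted(child_index, key=lambda c: cosine_similarity(query_vec, embed(c[0])), reverse=True)
    results, seen = [], set()
    for _, parent_id in ranked[:k]:
        if parent_id not in seen:  # several children may share one parent
            seen.add(parent_id)
            results.append(parents[parent_id])
    return results


# Made-up sample text; in the notebook you would pass raw_data instead.
sample = (
    "The restaurant opened many years ago. The founder trained as a chef in Sicily.\n\n"
    "The menu changes every season. Wood-fired pizza is the most popular dish."
)
parents, child_index = build_index(sample)
result = retrieve_parents("most popular dish", parents, child_index)
print(result[0])
```

The query matches only the pizza sentence, yet the returned parent also contains the sentence about the seasonal menu. That extra surrounding context is the reason to retrieve parents rather than the matched children.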
