KG Pipelines Flashcards
(9 cards)
1
Q
What are the four main stages of the KG‑RAG pipeline for QA?
A
1) Query Generation 2) Knowledge Retrieval 3) Context Construction 4) Answer Generation
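As a rough illustration, the four stages can be chained as plain functions. This is a minimal sketch, not a reference implementation: `run_kg_rag`, `TOY_KG`, and the lambda stand-ins are hypothetical names chosen here, and the "LLM" is a trivial stub.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def run_kg_rag(
    question: str,
    generate_query: Callable[[str], str],
    retrieve: Callable[[str], List[Triple]],
    build_context: Callable[[List[Triple]], str],
    generate_answer: Callable[[str, str], str],
) -> str:
    """Chain the four stages; each stage is injected as a plain callable."""
    query = generate_query(question)           # 1) Query Generation
    triples = retrieve(query)                  # 2) Knowledge Retrieval
    context = build_context(triples)           # 3) Context Construction
    return generate_answer(question, context)  # 4) Answer Generation


# Toy stand-ins (hypothetical) so the sketch runs without a KG or an LLM.
TOY_KG: List[Triple] = [("Paris", "capitalOf", "France")]

answer = run_kg_rag(
    "What is the capital of France?",
    generate_query=lambda q: "France",
    retrieve=lambda entity: [t for t in TOY_KG if entity in t],
    build_context=lambda ts: "; ".join(" ".join(t) for t in ts),
    generate_answer=lambda q, ctx: ctx.split()[0] if ctx else "unknown",
)
```

Passing the stages in as callables keeps each one swappable, which mirrors how the later cards treat every step as an independent design choice.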
2
Q
Step 1: Query Generation – tasks and common approaches
A
- Tasks: Entity linking, relation extraction, SPARQL/Cypher query construction.
- Approaches:
• Template‑based patterns
• Neural NL→SPARQL models
• LLM‑based query prompting
• Hybrid combinations
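A template-based approach might look like the sketch below: one regular-expression pattern is mapped to one SPARQL shape. The lookup tables `ENTITY_IRIS` and `RELATION_IRIS`, the `example.org` IRIs, and the function name are all assumptions standing in for real entity linking and relation extraction.

```python
import re
from typing import Optional

# Hypothetical lookup tables standing in for entity linking and relation extraction.
ENTITY_IRIS = {"france": "http://example.org/France"}
RELATION_IRIS = {"capital": "http://example.org/capital"}

# One template: "What is the <relation> of <entity>?"
PATTERN = re.compile(r"^what is the (\w+) of ([\w ]+)\?$", re.IGNORECASE)


def question_to_sparql(question: str) -> Optional[str]:
    """Return a SPARQL query string, or None when the template does not apply."""
    match = PATTERN.match(question.strip())
    if match is None:
        return None
    relation = RELATION_IRIS.get(match.group(1).lower())
    entity = ENTITY_IRIS.get(match.group(2).strip().lower())
    if relation is None or entity is None:
        return None  # linking failed; a hybrid system might fall back to an LLM here
    return f"SELECT ?answer WHERE {{ <{entity}> <{relation}> ?answer . }}"
```

Returning `None` on a miss is one way to leave room for a hybrid combination, where unmatched questions are handed to a neural or LLM-based generator.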
3
Q
Step 2: Knowledge Retrieval – strategies and challenges
A
- Strategies:
• Direct SPARQL/Cypher execution
• Vector retrieval via KG embeddings
• Path ranking for multi‑hop
• Graph traversal for complex queries
- Challenges: Ambiguous entities, KG incompleteness, precision vs recall, query optimization
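Graph traversal for multi-hop questions can be sketched as a breadth-first search with a hop limit over an in-memory triple list. The function name and the toy triples are hypothetical, and the sketch follows outgoing edges only; a real system would usually push this work to the graph store.

```python
from collections import deque
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def k_hop_triples(triples: List[Triple], start: str, max_hops: int) -> List[Triple]:
    """Collect triples reachable from `start` within `max_hops` outgoing edges."""
    outgoing: Dict[str, List[Triple]] = {}
    for triple in triples:
        outgoing.setdefault(triple[0], []).append(triple)

    seen = {start}
    frontier = deque([(start, 0)])
    found: List[Triple] = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget used up; do not expand this node
        for triple in outgoing.get(node, []):
            found.append(triple)
            neighbor = triple[2]
            if neighbor not in seen:  # expand each node at most once
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return found


TOY_KG: List[Triple] = [
    ("Marie Curie", "bornIn", "Warsaw"),
    ("Warsaw", "capitalOf", "Poland"),
    ("Poland", "memberOf", "EU"),
]
```

Raising `max_hops` tends to improve recall at the cost of precision and latency, which is the trade-off listed under Challenges.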
4
Q
Step 3: Context Construction – methods and formats
A
- Methods: Subgraph extraction → linearization → enrichment → prioritization
- Formats:
• Triplet lists
• Natural‑language verbalizations
• Structured JSON
• Hybrid
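The first three formats can be illustrated by linearizing the same triples three ways. This is a sketch under simple assumptions: the function names and the per-predicate template dictionary are hypothetical, and real verbalization usually needs richer templates or a model.

```python
import json
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def as_triplet_list(triples: List[Triple]) -> str:
    """One '(s, p, o)' line per triple."""
    return "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)


def as_verbalization(triples: List[Triple], templates: Dict[str, str]) -> str:
    """Render each triple with a per-predicate sentence template, if one exists."""
    sentences = []
    for s, p, o in triples:
        template = templates.get(p, "{s} {p} {o}.")  # fallback: raw triple as a sentence
        sentences.append(template.format(s=s, p=p, o=o))
    return " ".join(sentences)


def as_json(triples: List[Triple]) -> str:
    """Serialize triples as a JSON array of subject/predicate/object objects."""
    return json.dumps(
        [{"subject": s, "predicate": p, "object": o} for s, p, o in triples]
    )
```

A hybrid format could, for example, concatenate the verbalized sentences with the JSON block so the LLM sees both a readable and a structured view.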
5
Q
Step 4: Answer Generation – LLM techniques
A
- Techniques:
• Prompt engineering
• In‑context examples
• Chain‑of‑thought prompting
• Output verification
- Advanced:
• KG‑specific fine‑tuning
• ReAct reasoning
• Self‑consistency checks
• Explainability layers
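Two of these ideas, self-consistency and output verification, can be sketched without a real model. Below, `sample_answer` is a hypothetical callable standing in for one sampled LLM completion; the sketch shows only the majority vote over final answers and a simple check that the answer names an entity present in the retrieved triples.

```python
from collections import Counter
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def self_consistent_answer(
    sample_answer: Callable[[], str], n_samples: int = 5
) -> Tuple[str, float]:
    """Sample several answers and return the most common one with its vote share."""
    answers = [sample_answer().strip().lower() for _ in range(n_samples)]
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer, votes / n_samples


def is_supported(answer: str, triples: List[Triple]) -> bool:
    """Check that the answer equals some subject or object in the retrieved triples."""
    needle = answer.strip().lower()
    return any(needle in (s.lower(), o.lower()) for s, _, o in triples)
```

A low vote share or a failed support check could be used as a signal to abstain or to retry retrieval; the thresholds for doing so are a design choice not covered by these cards.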
6
Q
Designing a KG‑RAG pipeline – key decision factors
A
- Query complexity (simple fact vs multi‑hop reasoning)
- KG size & latency requirements
- Number of hops needed
- Desired response style (concise fact vs conversational explanation)
7
Q
Justifying pipeline step selection
A
- Match query generation approach to NL ambiguity and data availability
- Choose retrieval method based on KG structure, completeness, and scale
- Format context to balance informativeness and prompt length for the LLM
- Apply answer generation techniques to ensure correctness and clarity
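The balance between informativeness and prompt length can be sketched as a greedy selection under a character budget. The relevance scores are assumed to come from some upstream ranker, and `fit_to_budget` is a hypothetical helper; a real pipeline would more likely count tokens than characters.

```python
from typing import List, Tuple


def fit_to_budget(scored_lines: List[Tuple[float, str]], max_chars: int) -> List[str]:
    """Keep the highest-scoring context lines that fit within max_chars."""
    kept: List[str] = []
    used = 0
    for _score, line in sorted(scored_lines, key=lambda pair: pair[0], reverse=True):
        cost = len(line) + 1  # +1 for the newline that would join the lines
        if used + cost > max_chars:
            continue  # this line does not fit; a shorter, lower-ranked one still might
        kept.append(line)
        used += cost
    return kept
```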
8
Q
Potential privacy issues in KG‑RAG pipelines
A
- Exposure of sensitive entities or relationships
- Leakage of user queries via logs or prompts
- Inference of personal data through embeddings
Mitigations: Anonymization, access control, minimal context
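Minimal context and a basic form of anonymization can be combined in one filtering step before triples reach the prompt. The sketch below assumes an allow-list of predicates and a known set of sensitive entity names, both hypothetical inputs; note that replacing names with stable aliases is pseudonymization and may not prevent re-identification on its own.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def minimize_and_pseudonymize(
    triples: List[Triple],
    allowed_predicates: Set[str],
    sensitive_entities: Set[str],
) -> Tuple[List[Triple], Dict[str, str]]:
    """Drop triples with non-allowed predicates and alias sensitive entity names."""
    aliases: Dict[str, str] = {}

    def alias(name: str) -> str:
        if name not in sensitive_entities:
            return name
        if name not in aliases:
            aliases[name] = f"ENTITY_{len(aliases) + 1}"
        return aliases[name]

    safe = [
        (alias(s), p, alias(o))
        for s, p, o in triples
        if p in allowed_predicates  # minimal context: only what the task needs
    ]
    return safe, aliases
```

The returned alias map lets the application restore real names in the final answer for authorized users, so it would itself need access control.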
9
Q
Potential quality issues in KG‑RAG pipelines
A
- KG incompleteness or stale data
- Retrieval precision vs recall trade‑offs
- LLM hallucinations despite correct context
- Context truncation or snippet bias
Mitigations: KG curation, hybrid retrieval, output verification
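The precision vs recall trade-off can be made concrete by scoring a retrieved set against a labeled set of relevant items. This is a small sketch using the standard definitions; the function name and the idea of having gold labels available are assumptions.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant (0.0 when a set is empty)."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

Tracking both numbers while widening or narrowing retrieval (for example, changing the hop limit) shows whether a hybrid retrieval setup is actually improving coverage or only adding noise.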