KG Pipelines Flashcards

(9 cards)

1
Q

What are the four main stages of the KG‑RAG pipeline for QA?

A

1) Query Generation 2) Knowledge Retrieval 3) Context Construction 4) Answer Generation

2
Q

Step 1: Query Generation – tasks and common approaches

A
  • Tasks: Entity linking, relation extraction, SPARQL/Cypher query construction.
  • Approaches:
    • Template‑based patterns
    • Neural NL→SPARQL models
    • LLM‑based query prompting
    • Hybrid combinations
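The template-based approach above can be sketched in a few lines. This is only an illustration: the regex pattern, the toy entity table, and the helper name `generate_query` are assumptions made for the example, with Wikidata-style identifiers standing in for a real entity linker.

```python
import re

# Toy entity linker: maps surface forms to KG identifiers (illustrative only).
ENTITY_IDS = {"france": "wd:Q142", "germany": "wd:Q183"}

# Each template pairs a question pattern with a SPARQL skeleton.
# Doubled braces are literal braces in str.format().
TEMPLATES = [
    (
        re.compile(r"what is the capital of (?P<entity>[\w ]+)\??$", re.I),
        "SELECT ?answer WHERE {{ {entity} wdt:P36 ?answer . }}",
    ),
]

def generate_query(question):
    """Return a SPARQL string, or None if no template or entity matches."""
    for pattern, template in TEMPLATES:
        match = pattern.match(question.strip())
        if match:
            surface = match.group("entity").strip().lower()
            entity_id = ENTITY_IDS.get(surface)
            if entity_id is None:
                return None  # entity linking failed
            return template.format(entity=entity_id)
    return None  # fall back to a neural or LLM-based generator
```

Returning None on a miss is one way to hand off to a neural or LLM-based generator, which is the idea behind hybrid combinations.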
3
Q

Step 2: Knowledge Retrieval – strategies and challenges

A
  • Strategies:
    • Direct SPARQL/Cypher execution
    • Vector retrieval via KG embeddings
    • Path ranking for multi‑hop
    • Graph traversal for complex queries
  • Challenges: Ambiguous entities, KG incompleteness, precision vs recall, query optimization
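Graph traversal for multi-hop retrieval can be illustrated with a breadth-first search over an in-memory triple list. The triples and the function name `retrieve_paths` are made up for this sketch; a real system would run the traversal against a graph store.

```python
from collections import deque

# Toy KG as (subject, predicate, object) triples (illustrative data).
TRIPLES = [
    ("Marie_Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Poland", "member_of", "EU"),
]

def retrieve_paths(start, max_hops):
    """Return every outgoing path from `start` with at most `max_hops` edges."""
    adjacency = {}
    for s, p, o in TRIPLES:
        adjacency.setdefault(s, []).append((p, o))

    paths = []
    queue = deque([(start, [])])  # (current node, path taken so far)
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # hop budget exhausted on this branch
        for pred, obj in adjacency.get(node, []):
            new_path = path + [(node, pred, obj)]
            paths.append(new_path)
            # Skip nodes already expanded on this path to avoid cycles.
            if obj not in {t[0] for t in new_path}:
                queue.append((obj, new_path))
    return paths
```

The `max_hops` cap is where the precision vs recall trade-off shows up: a larger budget finds more candidate paths but also more irrelevant ones.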
4
Q

Step 3: Context Construction – methods and formats

A
  • Methods: Subgraph extraction → linearization → enrichment → prioritization
  • Formats:
    • Triplet lists
    • Natural‑language verbalizations
    • Structured JSON
    • Hybrid
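The three basic formats above differ only in how the same triples are linearized. A minimal sketch, with hypothetical helper names and a deliberately naive verbalizer that just replaces underscores in the predicate:

```python
import json

def to_triple_list(triples):
    """One '(s, p, o)' line per triple."""
    return "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)

def to_sentences(triples):
    """Naive verbalization: turn the predicate into plain words."""
    return " ".join(f"{s} {p.replace('_', ' ')} {o}." for s, p, o in triples)

def to_json(triples):
    """Structured JSON with explicit field names."""
    return json.dumps(
        [{"subject": s, "predicate": p, "object": o} for s, p, o in triples]
    )
```

A hybrid format might, for example, combine the sentence form for the top-ranked facts with the compact triple list for the rest.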
5
Q

Step 4: Answer Generation – LLM techniques

A
  • Techniques:
    • Prompt engineering
    • In‑context examples
    • Chain‑of‑thought prompting
    • Output verification
  • Advanced:
    • KG‑specific fine‑tuning
    • ReAct reasoning
    • Self‑consistency checks
    • Explainability layers
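Chain-of-thought prompting and self-consistency checks can be combined as a majority vote over several sampled completions. In this sketch `sample_fn` is a hypothetical stand-in for a stochastic LLM call, and the prompt wording and the "Answer:" marker are assumptions, not a fixed convention.

```python
from collections import Counter

def build_prompt(context, question):
    """Ground the model in KG facts and ask for step-by-step reasoning."""
    return (
        "Answer using only the facts below. Think step by step, "
        "then give the final answer after 'Answer:'.\n"
        f"Facts:\n{context}\n\nQuestion: {question}\n"
    )

def self_consistent_answer(sample_fn, prompt, n_samples=5):
    """Sample several completions and return the most common final answer."""
    answers = []
    for _ in range(n_samples):
        completion = sample_fn(prompt)  # hypothetical LLM call
        # Keep only the text after the last 'Answer:' marker.
        answers.append(completion.rsplit("Answer:", 1)[-1].strip())
    return Counter(answers).most_common(1)[0][0]
```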
6
Q

Designing a KG‑RAG pipeline – key decision factors

A
  • Query complexity (simple fact vs multi‑hop reasoning)
  • KG size & latency requirements
  • Number of hops needed
  • Desired response style (concise fact vs conversational explanation)
7
Q

Justifying pipeline step selection

A
  • Match query generation approach to NL ambiguity and data availability
  • Choose retrieval method based on KG structure, completeness, and scale
  • Format context to balance informativeness and prompt length for the LLM
  • Apply answer generation techniques to ensure correctness and clarity
8
Q

Potential privacy issues in KG‑RAG pipelines

A
  • Exposure of sensitive entities or relationships
  • Leakage of user queries via logs or prompts
  • Inference of personal data through embeddings
  • Mitigations: Anonymization, access control, minimal context
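Two of these mitigations, anonymization and minimal context, can be sketched as a filter applied before triples reach the prompt. The predicate blocklist, the salt, and the function names are assumptions for illustration. Note that a salted hash is pseudonymization, which is weaker than full anonymization.

```python
import hashlib

# Assumed policy: predicates that must never reach the LLM prompt.
SENSITIVE_PREDICATES = {"has_diagnosis", "home_address"}

def pseudonymize(name, salt="demo-salt"):
    """Replace a name with a stable, salted pseudonym."""
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    return f"person_{digest[:8]}"

def minimal_context(triples, person_entities):
    """Drop sensitive triples and pseudonymize the listed person entities."""
    safe = []
    for s, p, o in triples:
        if p in SENSITIVE_PREDICATES:
            continue  # minimal context: omit the fact entirely
        s = pseudonymize(s) if s in person_entities else s
        o = pseudonymize(o) if o in person_entities else o
        safe.append((s, p, o))
    return safe
```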
9
Q

Potential quality issues in KG‑RAG pipelines

A
  • KG incompleteness or stale data
  • Retrieval precision vs recall trade‑offs
  • LLM hallucinations despite correct context
  • Context truncation or snippet bias
  • Mitigations: KG curation, hybrid retrieval, output verification
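Output verification can start as a cheap grounding check: flag any answer that mentions none of the retrieved entities. This is a deliberately crude sketch with a hypothetical function name. It catches only some hallucinations and would normally sit in front of stronger checks.

```python
def is_grounded(answer, triples):
    """Return True if the answer mentions at least one retrieved entity."""
    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    text = answer.lower()
    # Compare on lowercased names with underscores shown as spaces.
    return any(e.replace("_", " ").lower() in text for e in entities)
```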