4.6 Multi-Instance Review and Output Validation
What Is Multi-Instance Review?
Multi-Instance Review means running several Claude instances on the same input and comparing their results. Consensus or a majority vote is used to improve accuracy, reduce hallucinations, and produce a confidence score. It is like asking several experts for their opinion and checking whether they agree.
Voting/Majority Pattern
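import asyncio
from collections import Counter

from anthropic import AsyncAnthropic

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

async def multi_instance_classify(text, n=3):
    # Run N instances in parallel
    prompt = (
        "Classify the sentiment as positive, negative, or neutral. "
        f"Reply with the label only: {text}"
    )
    tasks = [
        client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=16,  # required by the API; a one-word label is enough
            messages=[{"role": "user", "content": prompt}],
        )
        for _ in range(n)
    ]
    responses = await asyncio.gather(*tasks)
    # Collect votes, normalizing case and whitespace so equal labels match
    votes = [r.content[0].text.strip().lower() for r in responses]
    # Majority vote
    vote_counts = Counter(votes)
    winner = vote_counts.most_common(1)[0]
    return {
        "result": winner[0],
        "confidence": winner[1] / n,  # agreement ratio
        "agreement": winner[1] == n,  # unanimous?
        "all_votes": votes,
    }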
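One detail the vote above leaves open is ties: `Counter.most_common(1)` returns only one of the tied labels (the first one counted), so a 2-2 split looks like a clear winner. The sketch below is one way to surface that, assuming a tie should be treated as no decision. `tally_votes` and its `tie` field are illustrative names, not part of the original code.

```python
from collections import Counter

def tally_votes(votes):
    """Majority vote that reports ties instead of silently picking one label.

    Hypothetical helper for illustration; `votes` is a list of label strings.
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes).most_common()
    top_label, top_count = counts[0]
    # A tie exists when the runner-up has as many votes as the leader.
    tie = len(counts) > 1 and counts[1][1] == top_count
    return {
        "result": None if tie else top_label,
        "confidence": top_count / len(votes),
        "agreement": top_count == len(votes),
        "tie": tie,
    }

print(tally_votes(["positive", "positive", "negative"]))
print(tally_votes(["positive", "negative"]))
```

Using an odd n rules out two-way ties, but three instances can still return three different labels, so the tie check remains useful.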
Adversarial Verification
Have one instance verify or challenge the output of another instance:
# `agent` is a placeholder for your own helper that sends a prompt to a fresh
# Claude instance and, when given a schema, returns a parsed object.
async def verified_answer():
    # Instance 1: Generate answer
    answer = await agent("What are the security risks in this code?")
    # Instance 2: Adversarial verifier (try to refute)
    verification = await agent(f"""
    Try to refute this claim. Default to refuted=true if uncertain:
    Claim: {answer}
    Is this claim correct and well-supported?
    """, schema={"refuted": "boolean", "reason": "string"})
    if not verification.refuted:
        return answer  # Survived verification
    else:
        return None  # Rejected
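The "default to refuted" rule can also be enforced in code, so that a malformed or missing verdict never counts as a pass. This is a minimal sketch, assuming the verifier replies with a JSON object shaped like the schema above. `parse_verdict` is a hypothetical helper, not an SDK function.

```python
import json

def parse_verdict(raw_text):
    """Parse a verifier reply into (refuted, reason), failing closed.

    Assumes the reply is JSON like {"refuted": false, "reason": "..."}.
    Anything unparseable or incomplete is treated as refuted.
    """
    try:
        data = json.loads(raw_text)
    except (json.JSONDecodeError, TypeError):
        return True, "verifier reply was not valid JSON"
    if not isinstance(data, dict) or not isinstance(data.get("refuted"), bool):
        return True, "verifier reply had no boolean 'refuted' field"
    return data["refuted"], str(data.get("reason", ""))

print(parse_verdict('{"refuted": false, "reason": "claim matches the code"}'))
print(parse_verdict("I think it is fine"))
```

Failing closed matches the intent of the prompt: an answer is returned only when the verifier clearly states that it could not refute it.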
Confidence Scoring
async def confident_answer(question, threshold=0.7, n=5):
    results = await multi_instance_classify(question, n=n)
    if results["confidence"] >= threshold:
        return results["result"]
    else:
        # Low confidence: escalate to a human or ask for clarification
        return {
            "status": "uncertain",
            "best_guess": results["result"],
            "confidence": results["confidence"],
            "disagreements": results["all_votes"]
        }
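Because confidence is simply the winning count divided by n, a threshold translates directly into a minimum number of agreeing votes. The helper below is a small sketch of that arithmetic; `votes_needed` is an illustrative name.

```python
def votes_needed(threshold, n):
    """Smallest winning vote count k with k / n >= threshold, or None."""
    for k in range(1, n + 1):
        if k / n >= threshold:
            return k
    return None  # a threshold above 1.0 can never be met

# With the defaults used above (threshold=0.7, n=5), 3/5 = 0.6 falls short
# and 4/5 = 0.8 passes, so at least 4 of 5 instances must agree.
print(votes_needed(0.7, 5))  # 4
# With n=3, 2/3 is about 0.67, so only a unanimous vote clears 0.7.
print(votes_needed(0.7, 3))  # 3
```

Checking this before choosing n and the threshold avoids settings where anything short of unanimity is escalated.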
Output Validation Strategies
Schema Validation
import jsonschema

def validate_output(output, schema):
    try:
        jsonschema.validate(output, schema)
        return True, None
    except jsonschema.ValidationError as e:
        return False, str(e)
Semantic Validation
Use a separate Claude instance to check whether the output makes sense:
validation = await client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,  # required by the API; enough for a verdict and a short reason
    messages=[{"role": "user", "content": f"""
Does this answer make logical sense? Check for:
1. Internal contradictions
2. Factual impossibilities
3. Missing required information
Answer: {original_output}
Valid (yes/no) and reason:
"""}]
)
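The validator's reply is free text, so the caller still has to turn it into a decision. This is a minimal sketch, assuming the reply starts with "yes" or "no" as the prompt requests. `parse_validation_reply` is a hypothetical helper, and anything that is not a clear "yes" is treated as invalid.

```python
import re

def parse_validation_reply(reply_text):
    """Return (is_valid, reason) from a 'yes/no and reason' style reply.

    Fails closed: only a reply whose first word is 'yes' counts as valid.
    """
    text = reply_text.strip()
    match = re.match(
        r"(yes|no)\b[\s:,.\-]*(.*)", text, flags=re.IGNORECASE | re.DOTALL
    )
    if match is None:
        return False, "unrecognized reply: " + text[:80]
    is_valid = match.group(1).lower() == "yes"
    return is_valid, match.group(2).strip()

print(parse_validation_reply("Yes. The answer is internally consistent."))
print(parse_validation_reply("No, it claims two different release dates."))
```

In the snippet above, the text to pass in would be `validation.content[0].text`.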
Cross-Reference Validation
Compare the output against known facts:
# extract_claims, search_source_docs, and flag_unverified are placeholders
# for your own helpers.
# Extract claims from output
claims = extract_claims(output)
# Verify each claim against source
for claim in claims:
    evidence = search_source_docs(claim)
    if not evidence:
        flag_unverified(claim)
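The helpers in the loop above are left undefined. The toy versions below show one way the pieces could fit together. They are stand-ins for illustration only (sentence splitting for claim extraction, word overlap for evidence search, with the source documents passed in explicitly), and a real system would more likely use retrieval or another model call.

```python
import re

def extract_claims(output):
    """Toy claim extractor: treat each sentence as one claim."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", output) if s.strip()]

def search_source_docs(claim, source_docs, min_overlap=0.6):
    """Toy evidence search: return docs sharing most of the claim's words."""
    claim_words = set(re.findall(r"\w+", claim.lower()))
    if not claim_words:
        return []
    evidence = []
    for doc in source_docs:
        doc_words = set(re.findall(r"\w+", doc.lower()))
        if len(claim_words & doc_words) / len(claim_words) >= min_overlap:
            evidence.append(doc)
    return evidence

def find_unverified(output, source_docs):
    """Return the claims that have no supporting source document."""
    return [
        claim
        for claim in extract_claims(output)
        if not search_source_docs(claim, source_docs)
    ]

docs = ["The service listens on port 8080 by default."]
answer = "The service listens on port 8080. It was written in Rust."
print(find_unverified(answer, docs))  # ['It was written in Rust.']
```

Word overlap only shows that a claim resembles the source; it cannot tell a statement from its negation, so flagged and unflagged claims alike may still need review.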
Exam Tips
- Multi-instance = run same prompt N times, use majority vote
- Confidence = agreement ratio (3/3 agree = 1.0, 2/3 agree = 0.67)
- Adversarial verification = second instance tries to REFUTE first instance’s answer
- The exam may ask when to use multi-instance review: high-stakes decisions and classification tasks that require high accuracy
- Trade-off: N instances cost N times as much but give higher reliability
- Threshold-based escalation: low confidence → ask human, don’t guess