🛠️ Practical Sessions
Promptfoo Test Harness
⚙️
Promptfoo Setup & YAML Config
Install promptfoo, configure providers and write your first test suite in YAML
Promptfoo Assertions & Scoring
Master built-in assertions, custom scorers, and threshold-based pass/fail logic
🔴
Promptfoo Red-Teaming
Automated adversarial testing: scan for jailbreaks, PII leaks and prompt injections
🚀
Promptfoo CI/CD Integration
Gate deployments in GitHub Actions with LLM quality checks, pass-rate thresholds and diff reports
LangTest
🔧
LangTest Setup & Harness Config
Install LangTest, configure the Harness, and connect Hugging Face or OpenAI models
💪
LangTest Robustness & NLP Tests
Test model stability under typos, case changes, contractions and entity swaps
⚖️
LangTest Bias & Fairness Tests
Detect demographic, gender, religion and nationality biases in NLP model predictions
📊
Custom Tests & HTML Reports
Write custom test types, set per-category thresholds, and export shareable HTML reports
DeepEval
🔬
DeepEval Setup & Core Metrics
Install DeepEval, write your first test case, and run built-in metrics like answer relevancy and hallucination
🗂️
DeepEval RAG Evaluation
Evaluate RAG pipelines with faithfulness, context recall, context precision and answer relevancy
⚗️
DeepEval Custom Metrics & G-Eval
Build domain-specific metrics with G-Eval and write fully custom scorer classes
RAGAS
📐
RAGAS Setup & Core Metrics
Install RAGAS, build a Dataset, and compute faithfulness, answer relevancy and context metrics
🧬
RAGAS Advanced: Testset Generation & Custom Metrics
Auto-generate evaluation testsets from your documents and write custom RAGAS metrics
📈
RAGAS Production Monitoring
Track RAG quality in production with continuous RAGAS scoring and drift detection
LangSmith
🔭
LangSmith Tracing & Observability
Instrument LangChain apps with LangSmith tracing to capture runs, inspect spans and debug failures
📋
LangSmith Datasets & Evaluation
Create evaluation datasets, run automated evaluators and track quality metrics over time