A/B (x) Testing
A/B (x) Testing compares multiple model outputs and asks experts, diverse users, or both to choose the most effective response.
How do you fine-tune an LLM with A/B (x) Testing?
A/B (x) Testing takes two or more sample answers from your model and asks specialists, a diverse crowd, or both to select the response they prefer.
At Defined.ai, we use this method to make sure your model is not only technically accurate but also aligned with what users truly want. A/B (x) Testing is a great way to gather subjective feedback: people compare different options and share their opinions.
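To make that concrete, here is a minimal sketch of what a single comparison task could look like in code. It assumes you already have candidate answers in hand; the function and field names are illustrative, not a Defined.ai API.

```python
import random

# A minimal sketch of a single A/B (x) comparison task. The function and
# field names here are illustrative assumptions, not a Defined.ai API.

def build_comparison_task(prompt: str, candidates: list[str]) -> dict:
    """Pair candidate answers for one prompt and shuffle them so raters
    cannot infer which model or sampling run produced which option."""
    options = list(candidates)
    random.shuffle(options)
    return {"prompt": prompt, "options": options}

def record_preference(task: dict, chosen_index: int) -> dict:
    """Store the rater's pick alongside the options they passed over."""
    chosen = task["options"][chosen_index]
    rejected = [opt for i, opt in enumerate(task["options"]) if i != chosen_index]
    return {"prompt": task["prompt"], "chosen": chosen, "rejected": rejected}

# Example: two sampled answers to the same question.
task = build_comparison_task(
    "How do I reset my password?",
    [
        "Go to Settings > Security and click 'Reset password'.",
        "Contact support and they will reset it for you.",
    ],
)
vote = record_preference(task, chosen_index=0)
print(vote)
```

Shuffling the options before showing them to raters guards against position bias, where people tend to favor whichever answer appears first.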
How A/B (x) Testing Works
1. Create: Generate two or more comparable model outputs.
2. Collect: Get real human feedback on which answers people prefer.
3. Align: Update your model to provide responses that resonate with your users, as sketched below.
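Once votes are in, the Collect and Align steps meet in the middle. Below is a minimal sketch, assuming the vote format from the earlier example: win rates summarize which answers people preferred, and the flattened (prompt, chosen, rejected) rows are the shape that preference-tuning methods such as DPO typically consume.

```python
from collections import Counter

# A minimal sketch of turning collected votes into (a) per-answer win rates
# and (b) preference pairs for fine-tuning. The vote dicts follow the
# earlier sketch and are an assumed format, not a Defined.ai schema.

def win_rates(votes: list[dict]) -> dict[str, float]:
    """Fraction of comparisons each answer won, keyed by answer text."""
    wins, appearances = Counter(), Counter()
    for vote in votes:
        wins[vote["chosen"]] += 1
        appearances[vote["chosen"]] += 1
        for loser in vote["rejected"]:
            appearances[loser] += 1
    return {answer: wins[answer] / appearances[answer] for answer in appearances}

def to_preference_pairs(votes: list[dict]):
    """Flatten each vote into one (prompt, chosen, rejected) row per loser."""
    for vote in votes:
        for loser in vote["rejected"]:
            yield {"prompt": vote["prompt"], "chosen": vote["chosen"], "rejected": loser}

votes = [
    {
        "prompt": "How do I reset my password?",
        "chosen": "Go to Settings > Security and click 'Reset password'.",
        "rejected": ["Contact support and they will reset it for you."],
    },
]
print(win_rates(votes))
print(list(to_preference_pairs(votes)))
```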
Right on! Your model stands out from the rest.