How to evaluate large language model chatbots: experimenting with Streamlit and Prodigy

Read the full post on Medium.