BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Evaluation with LLMs - Theoretical and Practical Insights - Eyal K
 olman (Microsoft)
DTSTART:20251024T110000Z
DTEND:20251024T120000Z
UID:TALK234793@talks.cam.ac.uk
CONTACT:Suchir Salhan
DESCRIPTION:Abstract:\nAs large language models (LLMs) continue to evolve
 \, the task of assessing their performance becomes increasingly crucial an
 d complex\, and LLMs are being used to evaluate the quality of other model
 s. In this talk\, I will explore LLM-as-a-Judge\, combining theoretical fo
 undations with practical insights from the industry. Topics include benchm
 ark design\, pre-LLM metrics\, common pitfalls illustrated with real examp
 les\, methods for automatic tuning of evaluation metrics\, and the industr
 y-academia gaps. I will conclude with a vision for the future of
  robust and
  meaningful LLM assessment.\n\nBio:\nDr. Eyal Kolman is a Senior Researche
 r at Microsoft and an adjunct lecturer at Tel Aviv University and Bar-Ilan
  University\, where he teaches courses in Deep Learning. He holds a Ph.D. 
 in Electrical Engineering from Tel Aviv University and has over 25 years o
 f experience in machine learning and artificial intelligence. His work spa
 ns evaluation methodologies\, applied AI systems\, and large-scale learnin
 g models. Dr. Kolman has authored numerous research papers\, holds dozens 
 of patents\, and is the author of Knowledge-Based Neurocomputing: A Fuzz
 y Logic Approach.
LOCATION:SS03 Hybrid (In-Person + Online). Google Meet Link:
  https://meet.google.com/yeu-pqce-rsn
END:VEVENT
END:VCALENDAR
