BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//IEEE Toronto Section - ECPv6.15.17//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:IEEE Toronto Section
X-ORIGINAL-URL:https://www.ieeetoronto.ca
X-WR-CALDESC:Events for IEEE Toronto Section
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:UTC
BEGIN:STANDARD
TZOFFSETFROM:+0000
TZOFFSETTO:+0000
TZNAME:UTC
DTSTART:20200101T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=UTC:20211123T170000
DTEND;TZID=UTC:20211123T183000
DTSTAMP:20260530T073906Z
CREATED:20211030T112020Z
LAST-MODIFIED:20211223T084823Z
UID:10000480-1637686800-1637692200@www.ieeetoronto.ca
SUMMARY:Reinforcement Learning Game Tree / Markoff Chains
DESCRIPTION:Prerequisites: You do not need to have attended the earlier talks. If you know zero math and zero machine learning\, then this talk is for you. Jeff will do his best to explain fairly hard mathematics to you. If you know a bunch of math and/or a bunch of machine learning\, then these talks are for you. Jeff tries to spin the ideas in new ways. Longer Abstract: At the risk of being non-standard\, Jeff will tell you the way he thinks about this topic. Both “Game Trees” and “Markoff Chains” represent the graph of states through which your agent traverses a path while completing the task. Suppose we could learn for each such state a value measuring “how good” this state is for the agent. Then completing the task in an optimal way would be easy. If our current state is one in which our agent gets to choose the next action\, then she will choose the action that maximizes the value of our next state. On the other hand\, if our adversary gets to choose\, he will choose the action that minimizes this value. Finally\, if our current state is one in which the universe flips a coin\, then each edge leaving this state is labeled with the probability of taking it. Knowing that this is how the game is played\, we can compute how good each state is. Each state in which the task is complete is worth whatever reward the agent receives in that state. These values then trickle backwards through the graph until we learn the value of the start state. The computational challenge is that there are far more states than we can ever look at. Speaker(s): Prof. Jeff Edmonds. Virtual: https://events.vtools.ieee.org/m/287737
URL:https://www.ieeetoronto.ca/event/reinforcement-learning-game-tree-markoff-chains/
LOCATION:Virtual: https://events.vtools.ieee.org/m/287737
CATEGORIES:Instrumentation & Measurement,Women in Engineering
END:VEVENT
END:VCALENDAR