BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Jessica Sorrell (UCSD)
DTSTART:20220406T170000Z
DTEND:20220406T180000Z
DTSTAMP:20260423T021012Z
UID:TCSPlus/38
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/TCSPlus/38/"
 >Reproducibility in Learning</a>\nby Jessica Sorrell (UCSD) as part of TCS
 +\n\n\nAbstract\nReproducibility is vital to ensuring scientific conclusio
 ns are reliable\, but failures of reproducibility have been a major issue 
 in nearly all scientific areas of study in recent decades. A key issue und
 erlying the reproducibility crisis is the explosion of methods for data ge
 neration\, screening\, testing\, and analysis\, where\, crucially\, only t
 he combinations producing the most significant results are reported. Such 
 practices (also known as p-hacking\, data dredging\, and researcher degree
 s of freedom) can lead to erroneous findings that appear to be significant
 \, but that don’t hold up when other researchers attempt to replicate th
 em.  \n\nIn this talk\, we introduce a new notion of reproducibility for r
 andomized algorithms. This notion ensures that with high probability\, an 
 algorithm returns exactly the same output when run with two samples from t
 he same distribution. Despite the exceedingly strong demand of reproducibi
 lity\, there are efficient reproducible algorithms for several fundamental
  problems in statistics and learning. We show that any statistical query a
 lgorithm can be made reproducible with a modest increase in sample complex
 ity\, and we use this to construct reproducible algorithms for finding app
 roximate heavy-hitters and medians. Using these ideas\, we give the first 
 reproducible algorithm for learning halfspaces via a reproducible weak lea
 rner and a reproducible boosting algorithm. We initiate the study of lower
  bounds and inherent tradeoffs for reproducible algorithms\, giving nearly
  tight sample complexity upper and lower bounds for reproducible versus no
 n-reproducible SQ algorithms. Finally\, we discuss connections to other we
 ll-studied notions of algorithmic stability\, such as differential privacy
 .\n\nJoint work with Russell Impagliazzo (UCSD)\, Rex Lei (UCSD)\, and Ton
 iann Pitassi (Columbia University)\, to appear in STOC 2022.\n
LOCATION:https://researchseminars.org/talk/TCSPlus/38/
END:VEVENT
END:VCALENDAR
