BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Rui Castro (Mathematics Department\, TU Eindhoven)
DTSTART:20230518T160000Z
DTEND:20230518T170000Z
DTSTAMP:20260423T021049Z
UID:MPML/107
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/MPML/107/">A
 nomaly detection for a large number of streams: a permutation/rank-based h
 igher criticism approach</a>\nby Rui Castro (Mathematics Department\, TU E
 indhoven) as part of Mathematics\, Physics and Machine Learning (IST\, Lis
 bon)\n\n\nAbstract\nAnomaly detection when observing a large number of dat
 a streams is essential in a variety of applications\, ranging from epidemi
 ological studies to monitoring of complex systems. High-dimensional scenar
 ios are usually tackled with scan-statistics and related methods\, requiri
 ng stringent distributional assumptions for proper test calibration. In th
 is talk we take a non-parametric stance\, and introduce two variants of th
 e higher criticism test that do not require knowledge of the null distribu
 tion for proper calibration. In the first variant we calibrate the test by
  permutation\, while in the second variant we use a rank-based approach. B
 oth methodologies result in exact tests in finite samples. Our permutation
  methodology is applicable when observations within null streams are indep
 endent and identically distributed\, and we show this methodology is asymp
 totically optimal in the wide class of exponential models. Our rank-based 
 methodology is more flexible\, and only requires observations within null 
 streams to be independent. We provide an asymptotic characterization of th
 e power of the test in terms of the probability of mis-ranking null observ
 ations\, showing that the asymptotic power loss (relative to an oracle tes
 t) is minimal for many common models. As the proposed statistics do not re
 ly on asymptotic approximations\, they typically perform better than popul
 ar variants of higher criticism relying on such approximations. Finally\, 
 we demonstrate the use of these methodologies when monitoring the content 
 uniformity of an active ingredient for a batch-produced drug product\, and
  monitoring the daily number of COVID-19 cases in the Netherlands.\n\nBase
 d on joint work with Ivo Stoepker\, Ery Arias-Castro and Edwin van den Heu
 vel:\nhttps://arxiv.org/abs/2009.03117\n
LOCATION:https://researchseminars.org/talk/MPML/107/
END:VEVENT
END:VCALENDAR
