BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Anders Ståhlberg & Serik Sagitov (Chalmers & University of Gothen
burg)
DTSTART;VALUE=DATE-TIME:20221006T131500Z
DTEND;VALUE=DATE-TIME:20221006T140000Z
DTSTAMP;VALUE=DATE-TIME:20240222T164305Z
UID:gbgstats/2
DESCRIPTION:Title: Counting molecular identifiers in sequencing using a multitype branching
process with immigration\nby Anders Ståhlberg & Serik Sagitov (Chalme
rs & University of Gothenburg) as part of Gothenburg statistics seminar\n\
nLecture held in MVL14.\n\nAbstract\nDetection of extremely rare variant a
lleles\, such as tumour DNA\, within a complex mixture of DNA molecules is
experimentally challenging due to sequencing errors. Barcoding of target
DNA molecules in library construction for next-generation sequencing provi
des a way to identify and bioinformatically remove polymerase-induced erro
rs. During the barcoding procedure involving $t$ consecutive PCR cycles\,
the DNA molecules become barcoded by unique molecular identifiers (UMI). D
ifferent library construction protocols utilise different values of $t$. T
he effect of a larger $t$ and imperfect PCR amplifications is poorly descr
ibed. \n\nThis paper proposes a branching process with growing immigration
as a model describing the random outcome of $t$ cycles of PCR barcoding
. Our model discriminates between five different amplification rates $r_1$
\, $r_2$\, $r_3$\, $r_4$\, $r$ for different types of molecules associated
with the PCR barcoding procedure. We study this model by focussing on $C_
t$\, the number of clusters of molecules sharing the same UMI\, as well
as $C_t(m)$\, the number of UMI clusters of size $m$. Our main finding i
s a remarkable asymptotic pattern valid for moderately large $t$. It turns
out that $E(C_t(m))/E(C_t)\\approx 2^{-m}$ for $m=1\,2\,\\ldots$\, rega
rdless of the underlying parameters $(r_1\,r_2\,r_3\,r_4\,r)$. The knowled
ge of the quantities $C_t$ and $C_t(m)$ as functions of the experimental p
arameters $t$ and $(r_1\,r_2\,r_3\,r_4\,r)$ will help the users to draw mo
re adequate conclusions from the outcomes of different sequencing protocol
s.\n
LOCATION:https://researchseminars.org/talk/gbgstats/2/
END:VEVENT
END:VCALENDAR