BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Matthew Lee (University of Bristol)
DTSTART;VALUE=DATE-TIME:20201015T130000Z
DTEND;VALUE=DATE-TIME:20201015T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/1
DESCRIPTION:Title: EpiViz: an implementation of Circos plots for epidemiologists\nby Matthew Lee (University of Bristol) as part of (ED-3S) Essex Data Sc
ience Seminar Series\n\n\nAbstract\nEpidemiology studies predominantly foc
us on single exposure and single outcome associations. However\, biologica
l pathways involve numerous processes and identifying meaningful intermedi
ate associations that can be taken forward for further analysis is complex
. This is particularly the case for studies involving metabolomics data\,
as effects rarely occur in isolation. Gaining a global overview of hundreds
of exposure/outcome associations may therefore aid downstream analyses. Vi
sual inspection is one of the main modes of understanding global exposure/
outcome associations. EpiViz is a wrapper that makes producing Circos plot
s simple and efficient for those new to programming and data visualisation
.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Godwin Osuntoki (University of Essex)
DTSTART;VALUE=DATE-TIME:20201022T130000Z
DTEND;VALUE=DATE-TIME:20201022T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/2
DESCRIPTION:Title: Bayesian Analysis of chromosomal interactions in Hi-C data using
the hidden Markov random field model\nby Godwin Osuntoki (University
of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstra
ct\nThere are different biological methods that have been developed over t
he years for analysis of the 3D structure of the DNA. Few computational an
d statistical methods have\, however\, been developed to analyze data gen
erated using the Hi-C method. We follow statistical methodology to explore
the Hi-C data. The Hi-C data is well suited to be analyzed using a finite
mixture model. The Potts model\, a hidden Markov random field model\, was
employed to analyze the hidden (latent) components. The hidden components
through the Potts model can be categorized into k components (k = 2\,3…
\,K). Using the Metropolis-within-Gibbs approach to analyze the data\, the
proposed method was able to detect interactions (short and long range) an
d loops. A large part of the significant interactions that we detect are f
ound within Topological Associated Domains\, which is one of the 3D struct
ures known to occur in Hi-C data.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Nosheen Faiz (University of Essex)
DTSTART;VALUE=DATE-TIME:20201105T140000Z
DTEND;VALUE=DATE-TIME:20201105T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/4
DESCRIPTION:Title: Assessing how feature selection and hyper-parameters influence o
ptimal trees ensemble and random projection\nby Nosheen Faiz (Universi
ty of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbs
tract\nOur work investigates the effect of feature selection on three meth
ods: Random Forest (Breiman 2001)\, Optimal Trees Ensemble (Khan et al 201
6) and Random Projection (Canning and Samworth 2017) in high dimensional s
ettings. To this end\, LASSO has been considered for selecting the most im
portant features based on training data for dimension reduction. Additiona
lly\, the influence of various hyper-parameters regulating the three metho
ds has also been assessed. Analysis on several benchmark datasets is given
to illustrate the phenomena. The results reveal that feature selection im
proves the predictive performance of the Random Forest and Random Projecti
on methods in addition to reducing the computational burden. The performan
ce of Optimal Trees Ensemble is less influenced by feature selection.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Peng Liu (University of Essex)
DTSTART;VALUE=DATE-TIME:20201112T140000Z
DTEND;VALUE=DATE-TIME:20201112T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/5
DESCRIPTION:Title: Ordering and Inequalities for Mixtures on Risk Aggregation\n
by Peng Liu (University of Essex) as part of (ED-3S) Essex Data Science Se
minar Series\n\n\nAbstract\nAggregation sets\, which represent model uncer
tainty due to unknown dependence\, are an important object in the study of
robust risk aggregation. In this talk\, we investigate ordering relations
between two aggregation sets for which the sets of marginals are related
by two simple operations: distribution mixtures and quantile mixtures. Int
uitively\, these operations "homogenize" marginal distributions by maki
ng them similar. As a general conclusion from our results\, more "homogen
eous" marginals lead to a larger aggregation set\, and thus more severe mo
del uncertainty\, although the situation for quantile mixtures is much mor
e complicated than that for distribution mixtures. \nWe proceed to study
inequalities on the worst-case values of risk measures in risk aggregatio
n\, which represent conservative calculation of regulatory capital. Among
other results\, we obtain an order relation on VaR under quantile mixture
for marginal distributions with monotone densities. Numerical results are
presented to visualize the theoretical results and further inspire some c
onjectures.\nFinally\, we discuss the connection of our results to joint m
ixability and to merging p-values in multiple hypothesis testing.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Tolulope Fadina (University of Essex)
DTSTART;VALUE=DATE-TIME:20210225T140000Z
DTEND;VALUE=DATE-TIME:20210225T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/6
DESCRIPTION:Title: Symmetric measures of variability induced by risk measures\n
by Tolulope Fadina (University of Essex) as part of (ED-3S) Essex Data Sci
ence Seminar Series\n\n\nAbstract\nGeneral measures of variability induced
by risk measures are investigated for their potential applications to ris
k management. We focus on the three classes of variability measures ge
nerated by the Value-at-Risk\, Expected Shortfall\, and the Expectiles. Th
eir properties are explored\, and we obtain a characterization result on g
eneral model spaces. Convergence properties and asymptotic normality of th
e empirical variability measures estimators are established. An applicatio
n of the variability measures to financial data is also investigated.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ioana Olan (University of Cambridge)
DTSTART;VALUE=DATE-TIME:20201126T140000Z
DTEND;VALUE=DATE-TIME:20201126T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/7
DESCRIPTION:Title: Detecting the hierarchical structure of the cell nucleus\nby
Ioana Olan (University of Cambridge) as part of (ED-3S) Essex Data Scienc
e Seminar Series\n\n\nAbstract\nChromatin consists of DNA wrapped around h
istones and forms complex three-dimensional structures within the cell nuc
leus with various degrees of compaction. Genes have been shown to be repre
ssed by their proximity to the nuclear periphery or activated by being in
contact with special regulatory regions called enhancers. Thus the relativ
e positioning of genes and their interactions with other regions are very
important in determining whether they are expressed or not. Interactions b
etween pairs of genomic regions have been studied using assays such as Hi-
C\, which generate large matrices estimating interaction frequencies. We u
se such interaction estimates as weights in a network whose nodes are equa
lly sized genomic regions and perform nested community detection in order
to resolve the relative positioning of genomic regions of interest and mod
el the interior of the cell nucleus. Our biological model is cellular sene
scence\, a phenotype associated with dramatic changes in its chromatin int
eractions network relative to normal cells. Senescence corresponds to perm
anent cell cycle arrest and has been shown to act as a protective barrier
against tumourigenesis.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Josh Bull (University of Oxford)
DTSTART;VALUE=DATE-TIME:20201203T140000Z
DTEND;VALUE=DATE-TIME:20201203T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/8
DESCRIPTION:Title: Can maths tell us how to win at Fantasy Football?\nby Josh B
ull (University of Oxford) as part of (ED-3S) Essex Data Science Seminar S
eries\n\n\nAbstract\nFantasy Football is an online game played by millions
of people every year\, in which players attempt to predict the outcome of
football matches over the course of a season. To the surprise of everyone
(including myself)\, I was lucky enough to be crowned the winner of the 2
019-20 Fantasy Premier League\, one of the largest competitions in the UK.
As a researcher in Mathematical Oncology at the University of Oxford\, pe
ople have asked me whether I used maths to win – while I followed some s
trategies at the time\, I didn’t have any proof that they were in some s
ense mathematically optimal. However\, mathematical modelling is a tool wh
ich is capable of exploring exactly these kinds of questions: how can we i
dentify the best strategies to tackle complex problems? What types of data
are important to consider\, and how should we use them to inform our deci
sions? In this talk\, I’ll analyse how different quantitative approaches
can be used to tackle key questions in Fantasy Football\, and identify th
e strengths and weaknesses of these frameworks. Finally\, I’ll address t
he question: Can maths tell us how to win at Fantasy Football?\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Osama Mahmoud (University of Essex)
DTSTART;VALUE=DATE-TIME:20210211T140000Z
DTEND;VALUE=DATE-TIME:20210211T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/9
DESCRIPTION:Title: Slope-Hunter: A robust method for index-event bias correction in
genome-wide association studies of conditional analyses\nby Osama Mah
moud (University of Essex) as part of (ED-3S) Essex Data Science Seminar S
eries\n\n\nAbstract\nBackground: Studying genetic associations with progno
sis (e.g. survival\, subsequent events) is problematic due to selection bi
as - also termed index event bias or collider bias - whereby selection on
disease status can induce associations between causes of incidence with pr
ognosis. A current method for adjusting genetic associations for this bias
assumes there is no genetic correlation between incidence and prognosis\,
which may not be a plausible assumption.\n\nMethods: We propose an altern
ative\, the ‘Slope-Hunter’ approach\, which is unbiased even when ther
e is genetic correlation between incidence and prognosis. Our approach has
two stages. First\, we use cluster-based techniques to identify: variants
affecting neither incidence nor prognosis (these should not suffer bias a
nd only a random sub-sample of them are retained in the analysis)\; varian
ts affecting prognosis only (excluded from the analysis). Second\, we fit
a cluster-based model to identify the class of variants only affecting inc
idence\, and use this class to estimate the adjustment factor. The underly
ing assumption of our approach is that variants affecting o
nly incidence explain more variation in incidence than any group of varian
ts with unique effects\, e.g. via same exposure\, on both incidence and pr
ognosis.\n\nResults: Simulation studies showed that our ap
proach eliminates the bias and outperforms alternatives in the presence of
genetic correlation\, and performs as well as alternatives under no genet
ic correlation when its assumption is satisfied. We applied the ‘Slope-H
unter’ method to a study of fasting blood insulin levels (FI) conditiona
l on body mass index (BMI)\, estimated the index event bias\, and adjusted
conditional associations of the lead variants with FI. Our estimates sugg
ested that there were common causes of BMI and FI of concordant directions
of effect\, that are in-line with previously observed association between
obesity and insulin resistance.\n\nConclusions: Our approach is unbiased
even in the presence of genetic correlation between incidence and progres
sion when the underlying assumptions hold. Bias-adjusting methods should b
e used to carry out causal analyses when conditioning on incidence.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yanchun Bao (University of Essex)
DTSTART;VALUE=DATE-TIME:20201217T140000Z
DTEND;VALUE=DATE-TIME:20201217T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/10
DESCRIPTION:Title: Estimating mode effects from a sequential mixed-modes experimen
t\nby Yanchun Bao (University of Essex) as part of (ED-3S) Essex Data
Science Seminar Series\n\n\nAbstract\nThe large-scale household panel stud
y Understanding Society (The U.K. Household Longitudinal Study UKHLS) has\
, until recently\, used interviewers to administer its questionnaires\, bu
t is now in the process of allowing individuals to participate using the w
eb. Survey data are known to be affected by survey mode so a sequential mo
de-effects experiment was carried out to evaluate the impact of this ch
ange on the panel. In this talk we present a novel estimator and analysis
strategy to quantify the impact of mode across a wide range of variables\,
with large mode effects on the covariance of a pair of variables used to
indicate an increased risk that statistical analyses involving this pair w
ill be affected.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rafal Kulakowski (University of Essex)
DTSTART;VALUE=DATE-TIME:20210204T140000Z
DTEND;VALUE=DATE-TIME:20210204T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/11
DESCRIPTION:by Rafal Kulakowski (University of Essex) as part of (ED-3S) E
ssex Data Science Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yassir Rabhi (University of Essex)
DTSTART;VALUE=DATE-TIME:20201210T140000Z
DTEND;VALUE=DATE-TIME:20201210T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/12
DESCRIPTION:Title: Copulas and measures of dependence under length-biased sampling
and informative censoring\nby Yassir Rabhi (University of Essex) as p
art of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstract\nLength-bia
sed data are often encountered in cross-sectional surveys and prevalent-co
hort studies on disease durations. Under length-biased sampling subjects w
ith longer disease durations have a greater chance to be observed. As a resu
lt\, covariate values linked to the longer survivors are favoured by the s
ampling mechanism. When the sampled durations are also subject to right ce
nsoring\, the censoring is informative. Modelling dependence structure wit
hout adjusting for these issues leads to biased results. In this talk\, I
will present a study on copulas for modelling dependence when the collecte
d data are length-biased and account for both informative censoring and co
variate bias. I will address the nonparametric estimation of the bivariate
distribution\, copula function and its density\, and Kendall and Spearman
measures for right-censored length-biased data. The proposed estimator of
the bivariate CDF is a Hadamard-differentiable functional of two MLEs\, K
aplan-Meier and empirical CDF\, and inherits their efficiencies. Based on
this estimator\, we devise estimators for copula function and a local-poly
nomial estimator for copula density that accounts for boundary bias. In ad
dition\, I will introduce estimators for Kendall and Spearman measures. Th
e weak convergence of the estimators will also be discussed. The proposed
method is then applied to analyse a set of right-censored length-biased da
ta on survival with dementia\, collected as part of a nationwide study in
Canada.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Carolin Strobl (Universität Zürich)
DTSTART;VALUE=DATE-TIME:20201119T140000Z
DTEND;VALUE=DATE-TIME:20201119T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/13
DESCRIPTION:Title: A Statistician’s Botanical Garden - The Ideas behind Trees\,
Model-Based Trees and Random Forests\nby Carolin Strobl (Universität
Zürich) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstrac
t\nClassification and regression trees\, model-based trees and random fore
sts are powerful statistical methods from the field of machine learning. T
hey have been shown to achieve a high prediction accuracy\, especially in
big data applications with many predictor variables and complex associatio
n patterns (such as nonlinear and higher-order interaction effects). While
individual trees are easy to interpret\, random forests are "black box" p
rediction methods. They do\, however\, provide variable importance measure
s\, that are being used to judge the relevance of the individual predictor
variables. The aim of this presentation is to introduce the rationale beh
ind trees\, model-based trees and random forests\, to illustrate their pot
ential for high-dimensional data exploration\, e.g.\, in psychological res
earch\, but also to point out limitations and potential pitfalls in their
practical application.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Shenggang Hu (University of Essex)
DTSTART;VALUE=DATE-TIME:20221013T130000Z
DTEND;VALUE=DATE-TIME:20221013T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/14
DESCRIPTION:Title: Statistical disaggregation - a Monte Carlo approach for imputat
ion under constraints\nby Shenggang Hu (University of Essex) as part o
f (ED-3S) Essex Data Science Seminar Series\n\nLecture held in NTC.1.04.\n
\nAbstract\nStatistical disaggregation has become more and more important
for smart energy systems. A typical example of such disaggregation problem
s is to learn energy consumption for a higher resolution level (data recor
ded at higher frequency) based on data at a lower resolution (data recorde
d at lower frequency). Constrained models are often used in such problems
and they are often very useful compared to their unconstrained counterpart
s in terms of reducing uncertainty and leading to an improvement of the ov
erall performance. However\, these constrained models usually are not expr
essible as ordinary distributions due to their intractable density functio
ns which makes it hard to conduct further analysis. This paper introduces
a novel constrained Monte Carlo sampling algorithm based on Langevin diffu
sions and rejection sampling to solve the problem of sampling from constra
ined models. This new method is then applied to a statistical disaggregati
on problem for an electricity consumption dataset. Our approach provides
excellent accuracy of data imputation\, based on our simulation studies an
d data analysis. The new method is also justified theoretically.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Christian Martin Hennig (University of Bologna\, UCL)
DTSTART;VALUE=DATE-TIME:20221103T140000Z
DTEND;VALUE=DATE-TIME:20221103T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/15
DESCRIPTION:Title: Advances in using cluster analysis for species delimitation
\nby Prof Christian Martin Hennig (University of Bologna\, UCL) as part of
(ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\
nAbstract\nBiological species are often delimited based on genetic multilo
cus data using methods for inferring phylogenetic trees or model- or dista
nce-based cluster analysis. A major problem here is that genetic dissimila
rity does not only arise from separated species\, but also if subpopulatio
ns of a species live in geographically distant areas without genetic excha
nge. In any case\, be it using partitioning cluster analysis or hierarchic
al trees\, it is a hard problem to decide the number of species\, and whet
her groups that are candidates for being species actually belong together.
I will discuss the use of some new approaches for clustering and est
imating the number of clusters for this problem\, focusing particularly on
testing whether observed genetic heterogeneity within a species candidate
group can be explained by geographical distance rather than consisting of
separate species. This requires hypothesis testing in a distance-distance
regression model. I will also discuss the integration of such a testing r
outine in a fully automated method for species delimitation.\n\nReference\
n\nHausdorf\, B\, Hennig\, C. Species delimitation and geography. Mol Ecol
Resour. 2020\; 20: 950–960. https://doi.org/10.1111/1755-0998.13184\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Johan van der Molen (University of Cambridge)
DTSTART;VALUE=DATE-TIME:20221124T140000Z
DTEND;VALUE=DATE-TIME:20221124T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/16
DESCRIPTION:Title: Dirichlet process mixture inconsistency for the number of compo
nents: how worried should we be in practice?\nby Dr Johan van der Mole
n (University of Cambridge) as part of (ED-3S) Essex Data Science Seminar
Series\n\nLecture held in STEM 3.1.\n\nAbstract\nBayesian nonparametric mi
xture models are widely used for model-based clustering due to their flexi
bility and conceptual simplicity\, as well as the availability of efficie
nt sampling methods for performing inference. However\, recent work has es
tablished that such models have undesirable asymptotic properties regardin
g the estimation of the number of clusters. For instance\, Dirichlet Proce
ss Mixtures (DPMs) have been shown to be inconsistent for the number of cl
usters\, and overestimation of the number of clusters has been observed in
practice for finite samples. Finite mixtures with a prior on the number o
f components - also known as Mixtures of Finite Mixtures (MFMs) - have bee
n suggested as an asymptotically consistent alternative\, but the effects
of model misspecification can still result in asymptotic inconsistency a
nd poor estimation of the number of clusters in practice. \n\nHere we spec
ifically focus on estimation of the number of clusters in Bayesian nonpara
metric mixtures in practice\, including the impact of Markov chain Monte C
arlo (MCMC) post-processing algorithms for summarisation and identificatio
n of a final representative summary clustering. We consider practical scen
arios of low to moderate dimension\, through both simulation studies and a
pplications to real biomolecular data. In the situations we consider\, we
confirm that even when the parametric form of the mixture component distri
butions is correctly specified\, DPMs lead to mild overestimation of the n
umber of clusters for finite samples. However\, we also demonstrate that t
his can be corrected by common summarisation methods\, suggesting that app
lications of DPMs in practice may be more robust than the theory might sug
gest. We show that\, for both DPMs and MFMs\, mixture component density mi
sspecification typically leads to more dramatic overestimation\, with DPMs
providing slightly worse estimates than MFMs\, but with the common patter
n of “true” clusters in the data being split into smaller subclusters
due to additional mixture components being required to flexibly capture fe
atures of the data inadequately described by the misspecified models. We c
onsider implications for high-dimensional data analysis\, in which simplif
ying assumptions that are commonly made in practice for computational trac
tability (e.g. assuming a diagonal covariance matrix for Gaussian mixture
components) are also expected to result in model misspecification. As part
of our work\, we compare popular MCMC post-processing algorithms for iden
tifying a final summary clustering\, and show that although some of them h
ave a positive impact on results\, others can introduce severe overestimat
ion of the number of clusters\, even when the underlying posterior distrib
ution from which samples are being drawn is centred on the true number of
clusters. This is joint work with Yannis Chaumeny\, Paul Kirk\, Anthony Da
vidson.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Alexei Vernitski (University of Essex)
DTSTART;VALUE=DATE-TIME:20221027T130000Z
DTEND;VALUE=DATE-TIME:20221027T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/17
DESCRIPTION:Title: Using machine learning to solve mathematical problems and to se
arch for examples and counterexamples in pure maths research\nby Dr Al
exei Vernitski (University of Essex) as part of (ED-3S) Essex Data Science
Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nOur recent resea
rch can be generally described as applying state-of-the-art technologies o
f machine learning to suitable mathematical problems. We use both reinforc
ement learning and supervised learning (underpinned by deep learning). As
to mathematical problems we consider\, they include learning to untangle a
braid (this problem is not unlike the problem of solving the Rubik cube)\
, learning to find the parity of a permutation (as compared to the classic
al problem of deep learning of learning to find the parity bit of a binary
array)\, comparing mathematical mistakes made by artificial intelligence
with those made by human mathematicians\, etc.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Qiuyi Hong (University of Essex)
DTSTART;VALUE=DATE-TIME:20221117T140000Z
DTEND;VALUE=DATE-TIME:20221117T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/18
DESCRIPTION:Title: A Bilevel Game-Theoretic Decision-Making Framework for Strategic
Retailers in Both Local and Wholesale Electricity Markets\nby Qiuyi H
ong (University of Essex) as part of (ED-3S) Essex Data Science Seminar Se
ries\n\nLecture held in STEM 3.1.\n\nAbstract\nIn this talk we propose a b
ilevel game-theoretic model for multiple strategic retailers participating
in both wholesale and local electricity markets while considering custome
rs’ switching behaviours. At the upper level\, each retailer maximizes i
ts own profit by making optimal offering decisions in the retail market an
d bidding decisions in the day-ahead wholesale (DAW) and local power excha
nge (LPE) markets. The interaction among multiple strategic retailers is f
ormulated using the Bertrand competition model. For the lower level\, ther
e are three optimisation problems. First\, the customers’ welfare maximi
sation problem with their switching behaviors is formulated to capture the
demand responses from customers. Second\, a market-clearing problem is fo
rmulated for the independent system operator (ISO) in the DAW market. Thir
d\, a novel LPE market is developed for retailers to facilitate their powe
r balancing. In addition\, the bilevel multi-leader multi-follower Stackel
berg game forms an equilibrium problem with equilibrium constraints (EPEC)
problem\, which is solved by the diagonalization algorithm. Numerical res
ults demonstrate the feasibility and effectiveness of the EPEC model and t
he importance of modeling customers’ switching behaviors. We corroborate
that incentivising customers’ switching behaviors and increasing the nu
mber of retailers facilitates retail competition\, which results in reduci
ng strategic retailers’ retail prices and profits. Moreover\, the relati
onship between customers’ switching behaviors and welfare is reflected b
y a balance between the electricity purchasing cost (i.e.\, electricity pr
ice) and the electricity consumption level.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mateo Salles (University of Essex)
DTSTART;VALUE=DATE-TIME:20230209T140000Z
DTEND;VALUE=DATE-TIME:20230209T150000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/19
DESCRIPTION:Title: Supervised Learning for Untangling Braids\nby Mateo Salles
(University of Essex) as part of (ED-3S) Essex Data Science Seminar Series
\n\nLecture held in STEM 3.1.\n\nAbstract\nUntangling a braid is a typical
multi-step process\, and reinforcement learning can be used to train an a
gent to untangle braids. Here we present another approach. Starting from t
he untangled braid\, we produce a dataset of braids using breadth-first se
arch and then apply behavioral cloning to train an agent on the output of
this search. As a result\, the (inverses of) steps predicted by the agent
turn out to be an unexpectedly good method of untangling braids\, includin
g those braids which did not feature in the dataset.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Peng Liu (University of Kent)
DTSTART;VALUE=DATE-TIME:20230504T130000Z
DTEND;VALUE=DATE-TIME:20230504T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/20
DESCRIPTION:Title: Optimal Smooth Approximation for Quantile Matrix Factorisation
\nby Dr. Peng Liu (University of Kent) as part of (ED-3S) Essex Data Sc
ience Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nMatrix Fact
orisation (MF) is essential to many estimation tasks. Most existing matrix
factorisation methods focus on least squares matrix factorisation (LSMF)\
, which aims to minimise a smooth L2 loss between observations and their d
ependent matrix measurement variables. In reality\, however\, L1 loss and
check loss are widely used in regression to deal with outliers or observat
ions contaminated by skewed or heavy-tailed noise. Although under certain
conditions\, linear convergence to the global optimality can be establishe
d for matrix factorisation under the L2 loss\, there is a lack of provably
efficient algorithms for solving matrix factorisation under non-smooth lo
sses. In this paper\, we investigate Quantile Matrix Factorization (QMF)\,
the counterpart of Quantile Regression in matrix estimation\, that adopts
a tunable check loss and introduces robustness to matrix estimation for s
kewed and heavy tailed observations\, which are prevalent in reality. To d
eal with the non-smooth loss\, we propose Nesterov smoothed QMF (NsQMF)\,
extending Nesterov’s optimal smooth approximation technique to the matri
x factorisation setting. We then present an alternating minimization algor
ithm to solve the smooth NsQMF efficiently. We mathematically prove that s
olving the smoothed NsQMF is equivalent to solving the original non-smooth
QMF problem and that our proposed algorithm achieves linear convergence t
o the global optimality of QMF. Numerical evaluations verify our theoretic
al findings and demonstrate that NsQMF significantly outperforms the commo
nly used LSMF and prior approximate smoothing heuristics for QMF under var
ious noise distributions.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Xiaochuan Yang (University of Brunel)
DTSTART;VALUE=DATE-TIME:20230525T130000Z
DTEND;VALUE=DATE-TIME:20230525T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/21
DESCRIPTION:Title: Some recent progress in random geometric graphs: beyond the sta
ndard regimes\nby Dr. Xiaochuan Yang (University of Brunel) as part of
(ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\
nAbstract\nI will survey some recent joint works with Mathew Penrose (Bath
) on the cluster structure of random geometric graphs in a regime that is
less discussed in the literature. The statistics of interest include the
number of k-components\, the number of components\, the number of vertice
s in the giant component\, and the connectivity threshold. We show LLN and
normal/Poisson approximation by Stein's method.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Yufei Zhang (London School of Economics & Political Science)
DTSTART;VALUE=DATE-TIME:20230511T130000Z
DTEND;VALUE=DATE-TIME:20230511T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/22
DESCRIPTION:Title: Exploration-exploitation trade-off for continuous-time reinforc
ement learning\nby Dr. Yufei Zhang (London School of Economics & Polit
ical Science) as part of (ED-3S) Essex Data Science Seminar Series\n\nLect
ure held in STEM 3.1.\n\nAbstract\nRecently\, reinforcement learning (RL)
has attracted substantial research interest. Much of the attention and su
ccess\, however\, has been for the discrete-time setting. Continuous-time
RL\, despite its natural analytical connection to stochastic controls\, ha
s been largely unexplored and with limited progress. In particular\, chara
cterising sample efficiency for continuous-time RL algorithms remains a ch
allenging and open problem.\n\nIn this talk\, we develop a framework to an
alyse model-based reinforcement learning in the episodic setting. We then
apply it to optimise exploration-exploitation trade-off for linear-convex
RL problems\, and report sublinear (or even logarithmic) regret bounds for
a class of learning algorithms inspired by filtering theory. The approach
is probabilistic\, involving analysing learning efficiency using concentr
ation inequalities for correlated continuous-time observations\, and apply
ing stochastic control theory to quantify the performance gap between appl
ying greedy policies derived from estimated and true models.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof. Chenggui Yuan (Swansea University)
DTSTART;VALUE=DATE-TIME:20230601T130000Z
DTEND;VALUE=DATE-TIME:20230601T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/24
DESCRIPTION:Title: Numerical solutions of SDEs with irregular coefficients\nby
Prof. Chenggui Yuan (Swansea University) as part of (ED-3S) Essex Data Sc
ience Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nStochastic
differential equations (SDEs) with irregular coefficients have been widely
studied. In this talk\, I will discuss the strong convergence and the we
ak convergence of SDEs with irregular coefficients. The convergence rate
will be investigated under different irregular conditions on coefficients.
\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/24/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Robert Gaunt (The University of Manchester)
DTSTART;VALUE=DATE-TIME:20230615T130000Z
DTEND;VALUE=DATE-TIME:20230615T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/25
DESCRIPTION:Title: Normal approximation for the posterior in exponential families
\nby Dr. Robert Gaunt (The University of Manchester) as part of (ED-3S)
Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\nAbstract:
TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/25/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Arthur Maheo
DTSTART;VALUE=DATE-TIME:20230622T130000Z
DTEND;VALUE=DATE-TIME:20230622T140000Z
DTSTAMP;VALUE=DATE-TIME:20230605T064536Z
UID:Essex-DataScience/26
DESCRIPTION:by Dr. Arthur Maheo as part of (ED-3S) Essex Data Science Semi
nar Series\n\nLecture held in STEM 3.1.\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/26/
END:VEVENT
END:VCALENDAR