BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Matthew Lee (University of Bristol)
DTSTART:20201015T130000Z
DTEND:20201015T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/1
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/1/">EpiViz: an implementation of Circos plots for epidemiologists</a
 >\nby Matthew Lee (University of Bristol) as part of (ED-3S) Essex Data Sc
 ience Seminar Series\n\n\nAbstract\nEpidemiology studies predominantly foc
 us on single exposure and single outcome associations. However\, biologica
 l pathways involve numerous processes and identifying meaningful intermedi
 ate associations that can be taken forward for further analysis is complex
 . This is particularly the case for studies involving metabolomics data\, 
 as effects rarely occur in isolation. Gaining a global overview of hundre
 ds of exposure/outcome associations may therefore aid downstream analyses.
  Vi
 sual inspection is one of the main modes of understanding global exposure/
 outcome associations. EpiViz is a wrapper that makes producing Circos plot
 s simple and efficient for those new to programming and data visualisation
 .\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Godwin Osuntoki (University of Essex)
DTSTART:20201022T130000Z
DTEND:20201022T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/2
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/2/">Bayesian Analysis of chromosomal interactions in Hi-C data using
  the hidden Markov random field model</a>\nby Godwin Osuntoki (University 
 of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstra
 ct\nThere are different biological methods that have been developed over t
 he years for analysis of the 3D structure of the DNA. Few computational an
 d statistical methods have\, however\, been developed to analyze data gen
 erated using the Hi-C method. We follow statistical methodology to explore
  the Hi-C data. The Hi-C data is well suited to be analyzed using a finite
  mixture model. The Potts model\, a hidden Markov random field model\, was
  employed to analyze the hidden (latent) components. The hidden components
  through the Potts model can be categorized into k components (k = 2\,3…
 \,K). Using the Metropolis-within-Gibbs approach to analyze the data\, the
  proposed method was able to detect interactions (short and long range) an
 d loops. A large part of the significant interactions that we detect are f
 ound within Topological Associated Domains\, which is one of the 3D struct
 ures known to occur in Hi-C data.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Nosheen Faiz (University of Essex)
DTSTART:20201105T140000Z
DTEND:20201105T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/4
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/4/">Assessing how feature selection and hyper-parameters influence o
 ptimal trees ensemble and random projection</a>\nby Nosheen Faiz (Universi
 ty of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbs
 tract\nOur work investigates the effect of feature selection on three meth
 ods: Random Forest (Breiman 2001)\, Optimal Trees Ensemble (Khan et al 201
 6) and Random Projection (Canning and Samworth 2017) in high dimensional s
 ettings. To this end\, LASSO has been considered for selecting the most im
 portant features based on training data for dimension reduction. Additiona
 lly\, the influence of various hyper-parameters regulating the three metho
 ds has also been assessed. Analysis on several benchmark datasets is given
  to illustrate the phenomena. The results reveal that feature selection im
 proves the predictive performance of the Random Forest and Random Projecti
 on methods in addition to reducing the computational burden. The performan
 ce of Optimal Trees Ensemble is less influenced by feature selection.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Peng Liu (University of Essex)
DTSTART:20201112T140000Z
DTEND:20201112T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/5
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/5/">Ordering and Inequalities for Mixtures on Risk Aggregation</a>\n
 by Peng Liu (University of Essex) as part of (ED-3S) Essex Data Science Se
 minar Series\n\n\nAbstract\nAggregation sets\, which represent model uncer
 tainty due to unknown dependence\, are an important object in the study of
  robust risk aggregation. In this talk\, we investigate ordering relations
  between two aggregation sets for which the sets of marginals are related 
 by two simple operations: distribution mixtures and quantile mixtures. Int
 uitively\, these operations "homogenize" marginal distributions by maki
 ng them similar. As a general conclusion from our results\, more "homogen
 eous" marginals lead to a larger aggregation set\, and thus more severe mo
 del uncertainty\, although the situation for quantile mixtures is much mor
 e complicated than that for distribution mixtures.\nWe proceed to study
  inequalities on the worst-case values of risk measures in risk aggregatio
 n\, which represent conservative calculation of regulatory capital.  Among
  other results\, we obtain an order relation on VaR under quantile mixture
  for marginal distributions with monotone densities. Numerical results are
  presented to visualize the theoretical results and further inspire some c
 onjectures.\nFinally\, we discuss the connection of our results to joint m
 ixability and to merging p-values in multiple hypothesis testing.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Tolulope Fadina (University of Essex)
DTSTART:20210225T140000Z
DTEND:20210225T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/6
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/6/">Symmetric measures of variability induced by risk measures</a>\n
 by Tolulope Fadina (University of Essex) as part of (ED-3S) Essex Data Sci
 ence Seminar Series\n\n\nAbstract\nGeneral measures of variability induced
  by risk measures are investigated for their potential applications to ris
 k management. We emphasize the three classes of variability measures ge
 nerated by the Value-at-Risk\, Expected Shortfall\, and the Expectiles. Th
 eir properties are explored\, and we obtain a characterization result on g
 eneral model spaces. Convergence properties and asymptotic normality of th
 e empirical variability measures estimators are established. An applicatio
 n of the variability measures to financial data is also investigated.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ioana Olan (University of Cambridge)
DTSTART:20201126T140000Z
DTEND:20201126T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/7
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/7/">Detecting the hierarchical structure of the cell nucleus</a>\nby
  Ioana Olan (University of Cambridge) as part of (ED-3S) Essex Data Scienc
 e Seminar Series\n\n\nAbstract\nChromatin consists of DNA wrapped around h
 istones and forms complex three-dimensional structures within the cell nuc
 leus with various degrees of compaction. Genes have been shown to be repre
 ssed by their proximity to the nuclear periphery or activated by being in 
 contact with special regulatory regions called enhancers. Thus the relativ
 e positioning of genes and their interactions with other regions are very 
 important in determining whether they are expressed or not. Interactions b
 etween pairs of genomic regions have been studied using assays such as Hi-
 C\, which generate large matrices estimating interaction frequencies. We u
 se such interaction estimates as weights in a network whose nodes are equa
 lly sized genomic regions and perform nested community detection in order 
 to resolve the relative positioning of genomic regions of interest and mod
 el the interior of the cell nucleus. Our biological model is cellular sene
 scence\, a phenotype associated with dramatic changes in its chromatin int
 eractions network relative to normal cells. Senescence corresponds to perm
 anent cell cycle arrest and has been shown to act as a protective barrier 
 against tumourigenesis.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Josh Bull (University of Oxford)
DTSTART:20201203T140000Z
DTEND:20201203T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/8
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/8/">Can maths tell us how to win at Fantasy Football?</a>\nby Josh B
 ull (University of Oxford) as part of (ED-3S) Essex Data Science Seminar S
 eries\n\n\nAbstract\nFantasy Football is an online game played by millions
  of people every year\, in which players attempt to predict the outcome of
  football matches over the course of a season. To the surprise of everyone
  (including myself)\, I was lucky enough to be crowned the winner of the 2
 019-20 Fantasy Premier League\, one of the largest competitions in the UK.
  As a researcher in Mathematical Oncology at the University of Oxford\, pe
 ople have asked me whether I used maths to win – while I followed some s
 trategies at the time\, I didn’t have any proof that they were in some s
 ense mathematically optimal. However\, mathematical modelling is a tool wh
 ich is capable of exploring exactly these kinds of questions: how can we i
 dentify the best strategies to tackle complex problems? What types of data
  are important to consider\, and how should we use them to inform our deci
 sions? In this talk\, I’ll analyse how different quantitative approaches
  can be used to tackle key questions in Fantasy Football\, and identify th
 e strengths and weaknesses of these frameworks. Finally\, I’ll address t
 he question: Can maths tell us how to win at Fantasy Football?\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Osama Mahmoud (University of Essex)
DTSTART:20210211T140000Z
DTEND:20210211T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/9
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/9/">Slope-Hunter: A robust method for index-event bias correction in
  genome-wide association studies of conditional analyses</a>\nby Osama Mah
 moud (University of Essex) as part of (ED-3S) Essex Data Science Seminar S
 eries\n\n\nAbstract\nBackground: Studying genetic associations with progno
 sis (e.g. survival\, subsequent events) is problematic due to selection bi
 as - also termed index event bias or collider bias - whereby selection on 
 disease status can induce associations between causes of incidence and pr
 ognosis. A current method for adjusting genetic associations for this bias
  assumes there is no genetic correlation between incidence and prognosis\,
  which may not be a plausible assumption.\n\nMethods: We propose an altern
 ative\, the ‘Slope-Hunter’ approach\, which is unbiased even when ther
 e is genetic correlation between incidence and prognosis. Our approach has
  two stages. First\, we use cluster-based techniques to identify: variants
  affecting neither incidence nor prognosis (these should not suffer bias a
 nd only a random sub-sample of them are retained in the analysis)\; varian
 ts affecting prognosis only (excluded from the analysis). Second\, we fit 
 a cluster-based model to identify the class of variants only affecting inc
 idence\, and use this class to estimate the adjustment factor.
  The underlying assumption of our approach is that variants affecting o
 nly incidence explain more variation in incidence than any group of varian
 ts with unique effects\, e.g. via same exposure\, on both incidence and pr
 ognosis.\n\nResults: Simulation studies showed that our ap
 proach eliminates the bias and outperforms alternatives in the presence of
  genetic correlation\, and performs as well as alternatives under no genet
 ic correlation when its assumption is satisfied. We applied the ‘Slope-H
 unter’ method to a study of fasting blood insulin levels (FI) conditiona
 l on body mass index (BMI)\, estimated the index event bias\, and adjusted
  conditional associations of the lead variants with FI. Our estimates sugg
 ested that there were common causes of BMI and FI of concordant directions
  of effect\, that are in-line with previously observed association between
  obesity and insulin resistance.\n\nConclusions: Our approach is unbiased
  even in the presence of genetic correlation between incidence and progres
 sion when the underlying assumptions hold. Bias-adjusting methods should b
 e used to carry out causal analyses when conditioning on incidence.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yanchun Bao (University of Essex)
DTSTART:20201217T140000Z
DTEND:20201217T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/10
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/10/">Estimating mode effects from a sequential mixed-modes experimen
 t</a>\nby Yanchun Bao (University of Essex) as part of (ED-3S) Essex Data 
 Science Seminar Series\n\n\nAbstract\nThe large-scale household panel stud
 y Understanding Society (The U.K. Household Longitudinal Study UKHLS) has\
 , until recently\, used interviewers to administer its questionnaires\, bu
 t is now in the process of allowing individuals to participate using the w
 eb. Survey data are known to be affected by survey mode so a sequential mo
 de-effects experiment was carried out to evaluate the impact of this ch
 ange on the panel. In this talk we present a novel estimator and analysis 
 strategy to quantify the impact of mode across a wide range of variables\,
  with large mode effects on the covariance of a pair of variables used to 
 indicate an increased risk that statistical analyses involving this pair w
 ill be affected.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rafal Kulakowski (University of Essex)
DTSTART:20210204T140000Z
DTEND:20210204T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/11
DESCRIPTION:by Rafal Kulakowski (University of Essex) as part of (ED-3S) E
 ssex Data Science Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yassir Rabhi (University of Essex)
DTSTART:20201210T140000Z
DTEND:20201210T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/12
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/12/">Copulas and measures of dependence under length-biased sampling
  and informative censoring</a>\nby Yassir Rabhi (University of Essex) as p
 art of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstract\nLength-bia
 sed data are often encountered in cross-sectional surveys and prevalent-co
 hort studies on disease durations. Under length-biased sampling subjects w
 ith longer disease durations have greater chance to be observed. As a resu
 lt\, covariate values linked to the longer survivors are favoured by the s
 ampling mechanism. When the sampled durations are also subject to right ce
 nsoring\, the censoring is informative. Modelling dependence structure wit
 hout adjusting for these issues leads to biased results. In this talk\, I 
 will present a study on copulas for modelling dependence when the collecte
 d data are length-biased and account for both informative censoring and co
 variate bias. I will address the nonparametric estimation of the bivariate
  distribution\, copula function and its density\, and Kendall and Spearman
  measures for right-censored length-biased data. The proposed estimator of
  the bivariate CDF is a Hadamard-differentiable functional of two MLEs\, K
 aplan-Meier and empirical CDF\, and inherits their efficiencies. Based on 
 this estimator\, we devise estimators for copula function and a local-poly
 nomial estimator for copula density that accounts for boundary bias. In ad
 dition\, I will introduce estimators for Kendall and Spearman measures. Th
 e weak convergence of the estimators will also be discussed. The proposed 
 method is then applied to analyse a set of right-censored length-biased da
 ta on survival with dementia\, collected as part of a nationwide study in 
 Canada.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Carolin Strobl (Universität Zürich)
DTSTART:20201119T140000Z
DTEND:20201119T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/13
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/13/">A Statistician’s Botanical Garden - The Ideas behind Trees\, 
 Model-Based Trees and Random Forests</a>\nby Carolin Strobl (Universität 
 Zürich) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstrac
 t\nClassification and regression trees\, model-based trees and random fore
 sts are powerful statistical methods from the field of machine learning. T
 hey have been shown to achieve a high prediction accuracy\, especially in 
 big data applications with many predictor variables and complex associatio
 n patterns (such as nonlinear and higher-order interaction effects). While
  individual trees are easy to interpret\, random forests are "black box" p
 rediction methods. They do\, however\, provide variable importance measure
 s\, that are being used to judge the relevance of the individual predictor
  variables. The aim of this presentation is to introduce the rationale beh
 ind trees\, model-based trees and random forests\, to illustrate their pot
 ential for high-dimensional data exploration\, e.g.\, in psychological res
 earch\, but also to point out limitations and potential pitfalls in their 
 practical application.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Shenggang Hu (University of Essex)
DTSTART:20221013T130000Z
DTEND:20221013T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/14
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/14/">Statistical disaggregation - a Monte Carlo approach for imputat
 ion under constraints</a>\nby Shenggang Hu (University of Essex) as part o
 f (ED-3S) Essex Data Science Seminar Series\n\nLecture held in NTC.1.04.\n
 \nAbstract\nStatistical disaggregation has become more and more important 
 for smart energy systems. A typical example of such disaggregation problem
 s is to learn energy consumption for a higher resolution level (data recor
 ded at higher frequency) based on data at a lower resolution (data recorde
 d at lower frequency). Constrained models are often used in such problems 
 and they are often very useful compared to their unconstrained counterpart
 s in terms of reducing uncertainty and leading to an improvement of the ov
 erall performance. However\, these constrained models usually are not expr
 essible as ordinary distributions due to their intractable density functio
 ns which makes it hard to conduct further analysis. This paper introduces 
 a novel constrained Monte Carlo sampling algorithm based on Langevin diffu
 sions and rejection sampling to solve the problem of sampling from constra
 ined models. This new method is then applied to a statistical disaggregati
 on problem for an electricity consumption dataset.  Our approach provides 
 excellent accuracy of data imputation\, based on our simulation studies an
 d data analysis. The new method is also justified theoretically.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Christian Martin Hennig (University of Bologna\, UCL)
DTSTART:20221103T140000Z
DTEND:20221103T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/15
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/15/">Advances in using cluster analysis for species delimitation</a>
 \nby Prof Christian Martin Hennig (University of Bologna\, UCL) as part of
  (ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\
 nAbstract\nBiological species are often delimited based on genetic multilo
 cus data using methods for inferring phylogenetic trees or model- or dista
 nce-based cluster analysis. A major problem here is that genetic dissimila
 rity does not only arise from separated species\, but also if subpopulatio
 ns of a species live in geographically distant areas without genetic excha
 nge. In any case\, be it using partitioning cluster analysis or hierarchic
 al trees\, it is a hard problem to decide the number of species\, and whet
 her groups that are candidates for being species actually belong together.
  I will discuss the use of some new approaches for clustering and est
 imating the number of clusters for this problem\, focusing particularly on
  testing whether observed genetic heterogeneity within a species candidate
  group can be explained by geographical distance rather than consisting of
  separate species. This requires hypothesis testing in a distance-distance
  regression model. I will also discuss the integration of such a testing r
 outine in a fully automated method for species delimitation.\n\nReference\
 n\nHausdorf\, B\, Hennig\, C. Species delimitation and geography. Mol Ecol
  Resour. 2020\; 20: 950–960. https://doi.org/10.1111/1755-0998.13184\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Johan van der Molen (University of Cambridge)
DTSTART:20221124T140000Z
DTEND:20221124T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/16
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/16/">Dirichlet process mixture inconsistency for the number of compo
 nents: how worried should we be in practice?</a>\nby Dr Johan van der Mole
 n (University of Cambridge) as part of (ED-3S) Essex Data Science Seminar 
 Series\n\nLecture held in STEM 3.1.\n\nAbstract\nBayesian nonparametric mi
 xture models are widely used for model-based clustering due to their flexi
 bility and  conceptual simplicity\, as well as the availability of efficie
 nt sampling methods for performing inference. However\, recent work has es
 tablished that such models have undesirable asymptotic properties regardin
 g the estimation of the number of clusters. For instance\, Dirichlet Proce
 ss Mixtures (DPMs) have been shown to be inconsistent for the number of cl
 usters\, and overestimation of the number of clusters has been observed in
  practice for finite samples. Finite mixtures with a prior on the number o
 f components - also known as Mixtures of Finite Mixtures (MFMs) - have bee
 n suggested as an asymptotically consistent alternative\, but the effects 
 of model misspecification can still result in asymptotic inconsistency a
 nd poor estimation of the number of clusters in practice. \n\nHere we spec
 ifically focus on estimation of the number of clusters in Bayesian nonpara
 metric mixtures in practice\, including the impact of Markov chain Monte C
 arlo (MCMC) post-processing algorithms for summarisation and identificatio
 n of a final representative summary clustering. We consider practical scen
 arios of low to moderate dimension\, through both simulation studies and a
 pplications to real biomolecular data. In the situations we consider\, we 
 confirm that even when the parametric form of the mixture component distri
 butions is correctly specified\, DPMs lead to mild overestimation of the n
 umber of clusters for finite samples. However\, we also demonstrate that t
 his can be corrected by common summarisation methods\, suggesting that app
 lications of DPMs in practice may be more robust than the theory might sug
 gest. We show that\, for both DPMs and MFMs\, mixture component density mi
 sspecification typically leads to more dramatic overestimation\, with DPMs
  providing slightly worse estimates than MFMs\, but with the common patter
 n of “true” clusters in the data being split into smaller subclusters 
 due to additional mixture components being required to flexibly capture fe
 atures of the data inadequately described by the misspecified models. We c
 onsider implications for high-dimensional data analysis\, in which simplif
 ying assumptions that are commonly made in practice for computational trac
 tability (e.g. assuming a diagonal covariance matrix for Gaussian mixture 
 components) are also expected to result in model misspecification. As part
  of our work\, we compare popular MCMC post-processing algorithms for iden
 tifying a final summary clustering\, and show that although some of them h
 ave a positive impact on results\, others can introduce severe overestimat
 ion of the number of clusters\, even when the underlying posterior distrib
 ution from which samples are being drawn is centred on the true number of 
 clusters. This is joint work with Yannis Chaumeny\, Paul Kirk\, Anthony Da
 vidson.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Alexei Vernitski (University of Essex)
DTSTART:20221027T130000Z
DTEND:20221027T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/17
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/17/">Using machine learning to solve mathematical problems and to se
 arch for examples and counterexamples in pure maths research</a>\nby Dr Al
 exei Vernitski (University of Essex) as part of (ED-3S) Essex Data Science
  Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nOur recent resea
 rch can be generally described as applying state-of-the-art technologies o
 f machine learning to suitable mathematical problems. We use both reinforc
 ement learning and supervised learning (underpinned by deep learning). As 
 to mathematical problems we consider\, they include learning to untangle a
  braid (this problem is not unlike the problem of solving the Rubik cube)\
 , learning to find the parity of a permutation (as compared to the classic
 al problem of deep learning of learning to find the parity bit of a binary
  array)\, comparing mathematical mistakes made by artificial intelligence 
 with those made by human mathematicians\, etc.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Qiuyi Hong (University of Essex)
DTSTART:20221117T140000Z
DTEND:20221117T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/18
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/18/">A Bilevel Game-Theoretic Decision-Making Framework for Strategi
 c Retailers in Both Local and Wholesale Electricity Markets</a>\nby Qiuyi
  H
 ong (University of Essex) as part of (ED-3S) Essex Data Science Seminar Se
 ries\n\nLecture held in STEM 3.1.\n\nAbstract\nIn this talk we propose a b
 ilevel game-theoretic model for multiple strategic retailers participating
  in both wholesale and local electricity markets while considering custome
 rs’ switching behaviours. At the upper level\, each retailer maximizes i
 ts own profit by making optimal offering decisions in the retail market an
 d bidding decisions in the day-ahead wholesale (DAW) and local power excha
 nge (LPE) markets. The interaction among multiple strategic retailers is f
 ormulated using the Bertrand competition model. For the lower level\, ther
 e are three optimisation problems. First\, the customers’ welfare maximi
 sation problem with their switching behaviors is formulated to capture the
  demand responses from customers. Second\, a market-clearing problem is fo
 rmulated for the independent system operator (ISO) in the DAW market. Thir
 d\, a novel LPE market is developed for retailers to facilitate their powe
 r balancing. In addition\, the bilevel multi-leader multi-follower Stackel
 berg game forms an equilibrium problem with equilibrium constraints (EPEC)
  problem\, which is solved by the diagonalization algorithm. Numerical res
 ults demonstrate the feasibility and effectiveness of the EPEC model and t
 he importance of modeling customers’ switching behaviors. We corroborate
  that incentivising customers’ switching behaviors and increasing the nu
 mber of retailers facilitates retail competition\, which results in reduci
 ng strategic retailers’ retail prices and profits. Moreover\, the relati
 onship between customers’ switching behaviors and welfare is reflected b
 y a balance between the electricity purchasing cost (i.e.\, electricity pr
 ice) and the electricity consumption level.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mateo Salles (University of Essex)
DTSTART:20230209T140000Z
DTEND:20230209T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/19
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/19/">Supervised Learning for Untangling Braids</a>\nby Mateo Salles 
 (University of Essex) as part of (ED-3S) Essex Data Science Seminar Series
 \n\nLecture held in STEM 3.1.\n\nAbstract\nUntangling a braid is a typical
  multi-step process\, and reinforcement learning can be used to train an a
 gent to untangle braids. Here we present another approach. Starting from t
 he untangled braid\, we produce a dataset of braids using breadth-first se
 arch and then apply behavioral cloning to train an agent on the output of 
 this search. As a result\, the (inverses of) steps predicted by the agent 
 turn out to be an unexpectedly good method of untangling braids\, includin
 g those braids which did not feature in the dataset.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Peng Liu (University of Kent)
DTSTART:20230504T130000Z
DTEND:20230504T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/20
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/20/">Optimal Smooth Approximation for Quantile Matrix Factorisation<
 /a>\nby Dr. Peng Liu (University of Kent) as part of (ED-3S) Essex Data Sc
 ience Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nMatrix Fact
 orisation (MF) is essential to many estimation tasks. Most existing matrix
  factorisation methods focus on least squares matrix factorisation (LSMF)\
 , which aims to minimise a smooth L2 loss between observations and their d
 ependent matrix measurement variables. In reality\, however\, L1 loss and 
 check loss are widely used in regression to deal with outliers or observat
 ions contaminated by skewed or heavy-tailed noise. Although under certain 
 conditions\, linear convergence to the global optimality can be establishe
 d for matrix factorisation under the L2 loss\, there is a lack of provably
  efficient algorithms for solving matrix factorisation under non-smooth lo
 sses. In this paper\, we investigate Quantile Matrix Factorization (QMF)\,
  the counterpart of Quantile Regression in matrix estimation\, that adopts
  a tunable check loss and introduces robustness to matrix estimation for s
 kewed and heavy tailed observations\, which are prevalent in reality. To d
 eal with the non-smooth loss\, we propose Nesterov smoothed QMF (NsQMF)\, 
 extending Nesterov’s optimal smooth approximation technique to the matri
 x factorisation setting. We then present an alternating minimization algor
 ithm to solve the smooth NsQMF efficiently. We mathematically prove that s
 olving the smoothed NsQMF is equivalent to solving the original non-smooth
  QMF problem and that our proposed algorithm achieves linear convergence t
 o the global optimality of QMF. Numerical evaluations verify our theoretic
 al findings and demonstrate that NsQMF significantly outperforms the commo
 nly used LSMF and prior approximate smoothing heuristics for QMF under var
 ious noise distributions.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Xiaochuan Yang (University of Brunel)
DTSTART:20230525T130000Z
DTEND:20230525T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/21
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/21/">Some recent progress in random geometric graphs: beyond the sta
 ndard regimes</a>\nby Dr. Xiaochuan Yang (University of Brunel) as part of
  (ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\
 nAbstract\nI will survey some recent joint works with Mathew Penrose (Bath
 ) on the cluster structure of random geometric graphs in a regime that is
  less discussed in the literature. The statistics of interest include the
  number of k-components\, the number of components\, the number of vertice
 s in the giant component\, and the connectivity threshold. We show LLN and
  normal/Poisson approximation by Stein's method.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Yufei Zhang (London School of Economics & Political Science)
DTSTART:20230511T130000Z
DTEND:20230511T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/22
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/22/">Exploration-exploitation trade-off for continuous-time reinforc
 ement learning</a>\nby Dr. Yufei Zhang (London School of Economics & Polit
 ical Science) as part of (ED-3S) Essex Data Science Seminar Series\n\nLect
 ure held in STEM 3.1.\n\nAbstract\nRecently\, reinforcement learning (RL) 
 has attracted substantial research interest. Much of the attention and su
 ccess\, however\, has been for the discrete-time setting. Continuous-time 
 RL\, despite its natural analytical connection to stochastic controls\, ha
 s been largely unexplored\, with limited progress. In particular\, chara
 cterising sample efficiency for continuous-time RL algorithms remains a ch
 allenging and open problem.\n\nIn this talk\, we develop a framework to an
 alyse model-based reinforcement learning in the episodic setting. We then 
 apply it to optimise exploration-exploitation trade-off for linear-convex 
 RL problems\, and report sublinear (or even logarithmic) regret bounds for
  a class of learning algorithms inspired by filtering theory. The approach
  is probabilistic\, involving analysing learning efficiency using concentr
 ation inequalities for correlated continuous-time observations\, and apply
 ing stochastic control theory to quantify the performance gap between appl
 ying greedy policies derived from estimated and true models.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof. Chenggui Yuan (Swansea University)
DTSTART:20230601T130000Z
DTEND:20230601T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/24
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/24/">Numerical solutions of SDEs with irregular coefficients</a>\nby
  Prof. Chenggui Yuan (Swansea University) as part of (ED-3S) Essex Data Sc
 ience Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nStochastic 
 differential equations (SDEs) with irregular coefficients have been widely
  studied. In this talk\, I will discuss the strong convergence and the we
 ak convergence of SDEs with irregular coefficients. The convergence rate 
 will be investigated under different irregular conditions on coefficients.
 \n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/24/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Robert Gaunt (The University of Manchester)
DTSTART:20230615T130000Z
DTEND:20230615T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/25
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/25/">Normal approximation for the posterior in exponential families<
 /a>\nby Dr. Robert Gaunt (The University of Manchester) as part of (ED-3S)
  Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstrac
 t\nIn this talk I'll introduce quantitative Bernstein-von Mises type bound
 s on the normal approximation of the posterior distribution in exponential
  family models when centering either around the posterior mode or around t
 he maximum likelihood estimator. Our bounds\, obtained through a version o
 f Stein’s method\, are non-asymptotic\, and data dependent\; they are of
  the correct order both in the total variation and Wasserstein distances\,
  as well as for approximations for expectations of smooth functions of the
  posterior. All our results are valid for univariate and multivariate post
 eriors alike\, and do not require a conjugate prior setting. We illustrate
  our findings on a variety of exponential family distributions\, including
  Poisson\, multinomial and normal distribution with unknown mean and varia
 nce. The resulting bounds have an explicit dependence on the prior distrib
 ution and on sufficient statistics of the data from the sample\, and thus 
 provide insight into how these factors may affect the quality of the norma
 l approximation.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/25/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Arthur Maheo (Amazon)
DTSTART:20230622T130000Z
DTEND:20230622T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/26
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/26/">Benders decomposition for public transportation</a>\nby Dr. Art
 hur Maheo (Amazon) as part of (ED-3S) Essex Data Science Seminar Series\n\
 nLecture held in STEM 3.1.\n\nAbstract\nCanberra (Australia) wants to desi
 gn a transportation network combining high-frequency buses with on-demand 
 taxis. The resulting hub-and-shuttle network design problem is a large\, d
 ifficult mixed-integer program. We identified how to decompose the problem
  – design first\, route second – and used a modern Benders decompositi
 on on the resulting formulation.\nThis new approach is orders of magnitude
  faster\, allowing us to solve full instances where a standard approach ca
 n only do small ones.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/26/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Boris Mirkin (National Research University Higher School of E
 conomics)
DTSTART:20231006T120000Z
DTEND:20231006T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/27
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/27/">Anomalous clustering at various data formats</a>\nby Prof Boris
  Mirkin (National Research University Higher School of Economics) as part 
 of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in 1N1.4.1.\n
 \nAbstract\nAnomalous clustering is a method for extracting clusters one-b
 y-one. It is an extension of the Principal Component Analysis method to z
 ero-one matrix factorization settings. After a brief overview of various v
 ersions of the method\, including its  extensions to similarity data\, sp
 atial data\, and fuzzy clustering\, I am going to concentrate on a most r
 ecent development\, a triple-stage application of the approach to the anal
 ysis of spatial-temporal patterns in a coastal oceanic phenomenon of upwel
 ling (see Nascimento et al. 2023).\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/27/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jin Zhu (LSE)
DTSTART:20231019T130000Z
DTEND:20231019T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/28
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/28/">A Tuning-Free Algorithm for Sparsity-Constraint Optimization</a
 >\nby Dr Jin Zhu (LSE) as part of (ED-3S) Essex Data Science Seminar Serie
 s\n\nLecture held in STEM 3.1.\n\nAbstract\nSparsity-constraint optimizati
 on has wide applicability in signal processing\, statistics\, and machine 
 learning. Existing fast algorithms must burdensomely tune parameters\, suc
 h as the step size or the implementation of precise stop criteria\, which 
 may be challenging to determine in practice. To address this issue\, we de
 velop an algorithm named sparsity-constraint optimization via splicing ite
 ration (SCOPE) to optimize nonlinear differentiable objective functions with
  strong convexity and smoothness in low dimensional subspaces. Algorithmic
 ally\, the SCOPE algorithm converges effectively without tuning parameters
 . Theoretically\, SCOPE has a linear convergence rate and converges to a s
 olution that recovers the true support set when it correctly specifies the
  sparsity. We also develop parallel theoretical results without restricted
 -isometry-property-type conditions. We apply SCOPE’s versatility and pow
 er to solve sparse quadratic optimization\, learn sparse classifiers\, and
  recover sparse Markov networks for binary variables. The numerical result
 s on these specific tasks reveal that SCOPE perfectly identifies the true 
 support set with a 10–1000 speedup over the standard exact solver\, conf
 irming SCOPE’s algorithmic and theoretical merits. Our open-source Pytho
 n package scope based on C++ implementation is publicly available on GitHu
 b\, reaching a ten-fold speedup on the competing convex relaxation methods
  implemented by the cvxpy library.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/28/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Shenggang Hu (University of Warwick)
DTSTART:20231026T130000Z
DTEND:20231026T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/29
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/29/">Differential Privacy of Bayesian Posterior under Contamination<
 /a>\nby Dr Shenggang Hu (University of Warwick) as part of (ED-3S) Essex D
 ata Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nIn re
 cent years\, differential privacy has been adopted by tech-companies and g
 overnmental agencies as the standard for measuring privacy in algorithms. 
 We study the level of differential privacy in Bayesian posterior sampling 
 setups. As opposed to the common privatization approach of injecting Lapla
 ce/Gaussian noise into the output\, Huber's contamination model is conside
 red\, where we replace at random the data points with samples from a heavy
 -tailed distribution. The derived bound for the differential privacy level
  in our approach matches the existing literature while lifting the restric
 tion on bounded observation space. We further consider the effect of sampl
 e size on privacy level and conclude that asymptotically the contamination
  approach is fully private at no cost of information loss.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/29/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Wolfgang Hardle (Humboldt-Universität zu Berlin\, Germany)
DTSTART:20240118T140000Z
DTEND:20240118T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/30
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/30/">Data Science in a Math-Less Digital Society</a>\nby Prof Wolfga
 ng Hardle (Humboldt-Universität zu Berlin\, Germany) as part of (ED-3S) E
 ssex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\
 nIn an increasingly digital and data-driven world\, the importance of data
  science cannot be overstated. Data science\, by itself\, carries a "push
  to analyse" button\, though\, that lets the analyst forget about the "
 math behind the machine learning tools".\n\nWe cover a few examples\, whe
 re data science needs math in order to be understood and applied.\n\nBy th
 e end of this talk\, attendees will gain a fresh perspective on data scien
 ce's role in a math-less digital society. They will leave with practical i
 nsights\, tools\, and strategies to leverage data effectively\, fostering 
 a culture of data-driven decision-making that transcends mathematical barr
 iers.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/30/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Dimitra Kosta (University of Edinburgh)
DTSTART:20231123T134500Z
DTEND:20231123T144500Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/31
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/31/">Maximum likelihood estimation of toric Fano varieties</a>\nby D
 r Dimitra Kosta (University of Edinburgh) as part of (ED-3S) Essex Data Sc
 ience Seminar Series\n\nLecture held in Zoom.\n\nAbstract\nI will talk abo
 ut the maximum likelihood estimation problem for several classes of toric 
 Fano models. I will start by exploring the maximum likelihood degree for a
 ll 2-dimensional Gorenstein toric Fano varieties. I will show that the ML 
 degree is equal to the degree of the surface in every case except for the 
 quintic del Pezzo surface with two ordinary double points and provide expl
 icit expressions that allow one to compute the maximum likelihood estimate
  in closed form whenever the ML degree is less than 5. I will explore the 
 reasons for the ML degree drop using A-discriminants and intersection theo
 ry. If there is time\, I will discuss toric Fano varieties associate
 d to 3-valent phylogenetic trees and their ML degree.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/31/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Richard Mann (University of Leeds)
DTSTART:20240201T140000Z
DTEND:20240201T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/33
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/33/">Collective decision-making by rational agents</a>\nby Dr Richar
 d Mann (University of Leeds) as part of (ED-3S) Essex Data Science Seminar
  Series\n\nLecture held in STEM 3.1.\n\nAbstract\nThe decisions made by ot
 hers are a valuable source of social information about the world\, because
  they may have knowledge that we lack. This means that when one agent make
 s a given choice\, it can induce others to do so as well. In this talk I w
 ill describe a theory of rational agents who optimally utilise the social 
 information provided by others\, and explore the dynamics this produces at
  the individual and group level. In particular\, I will show how the impli
 cit beliefs such agents hold about the physical and social environment sha
 pe their response to each other\, and how changes to the environment that 
 conflict with these beliefs can dramatically alter collective behaviour an
 d impact the success of groups.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/33/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jinyu Tian (Macau University of Science and Technology)
DTSTART:20231214T140000Z
DTEND:20231214T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/34
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/34/">Discreteness Problem in Adversarial Machine Learning</a>\nby Dr
  Jinyu Tian (Macau University of Science and Technology) as part of (ED-3S
 ) Essex Data Science Seminar Series\n\n\nAbstract\nAdversarial examples (A
 Es) of deep neural networks (DNNs) are receiving ever-increasing attention
  because they help in understanding the mechanism of DNNs and provide a no
 vel perspective of the ethics of deep learning applications. In many real 
 scenarios\, AEs have to be discrete (e.g. digital images). Most existing w
 orks achieve the discreteness relying on the discretization of continuous 
 AEs. Unfortunately\, they cannot sufficiently control the spatial differen
 ce before and after discretizing continuous AEs\, which will lead to two 
 side-effects: degrading the attack capability of the obtained discrete AEs 
 or introducing extra distortion.\n\nIn this work\, we propose an adve
 rsarial attack called Discrete Attack (DATK) to produce continuous AEs tig
 htly close to their discrete counterparts. Owing to the negligible spatial d
 istance between them\, the expected discrete AEs perform with the same pow
 erful attack capability as the continuous AEs without an extra distortion 
 overhead. More precisely\, the proposed DATK generates AEs from a novel per
 spective by directly modeling adversarial perturbations (APs) as discrete 
 random variables. The AE generation problem thus reduces to the estimation
  of the distribution of discrete APs. Since this problem is typically nond
 ifferentiable\, we relax it with the proposed reparameterization tricks and ob
 tain an approximated continuous distribution of discrete APs. Our theoreti
 cal proof shows that\, by virtue of the continuous APs sampled from the appro
 ximated distribution\, the spatial distance between the resultant continuo
 us AEs and their discrete counterparts is tightly bounded\, which signifi
 cantly overcomes the side-effects caused by the discretization. Extensive 
 results over Imagenet\, Cifar10 and TU Berlin Sketch demonstrate the super
 iority of our method when attacking representative DNNs including Vgg19\, 
 Resnet50\, DenseNet121 and MobilenetV2. It is also verified that our DATK 
 is more robust against the state-of-the-art adversarial detection methods.\
 n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/34/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Hong Duong (University of Birmingham)
DTSTART:20231130T140000Z
DTEND:20231130T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/35
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/35/">Model Reduction of Complex Systems</a>\nby Dr Hong Duong (Unive
 rsity of Birmingham) as part of (ED-3S) Essex Data Science Seminar Series\
 n\nLecture held in STEM 3.1.\n\nAbstract\nComplex systems in nature and in
  applications (such as molecular systems\, crowd dynamics\, swarming\, opi
 nion formation\, just to name a few) are often described by systems of sto
 chastic differential equations (SDEs) and partial differential equations (
 PDEs). It is often analytically impossible or computationally prohibitivel
 y expensive to deal with the full models due to their high dimensionality 
 (degrees of freedom\, number of involved parameters\, etc.). It is thus of
  great importance to approximate such large and complex systems by simpler
  and lower dimensional ones\, while still preserving the essential informa
 tion from the original model. This procedure is referred to as model reduc
 tion or coarse-graining in the literature. In this talk\, I will present m
 ethods for qualitative and quantitative coarse-graining of several SDEs an
 d PDEs\, in the presence or absence of a scale-separation.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/35/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yuyu Chen (University of Melbourne)
DTSTART:20231116T130000Z
DTEND:20231116T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/36
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/36/">Diversification of infinite-mean Pareto distributions</a>\nby D
 r Yuyu Chen (University of Melbourne) as part of (ED-3S) Essex Data Scienc
 e Seminar Series\n\nLecture held in Zoom.\n\nAbstract\nWe show the perhaps
  surprising inequality that the weighted average of negatively dependent s
 uper-Pareto random variables\, possibly caused by triggering events\, is l
 arger than one such random variable in the sense of first-order stochastic
  dominance. The class of super-Pareto distributions is extremely heavy-tai
 led and it includes the class of infinite-mean Pareto distributions. We di
 scuss several implications of this result via an equilibrium analysis in a
  risk exchange market. First\, diversification of super-Pareto losses incr
 eases portfolio risk\, and thus a diversification penalty exists. Second\,
  agents with super-Pareto losses will not share risks in a market equilibr
 ium. Third\, transferring losses from agents bearing super-Pareto losses t
 o external parties without any losses may arrive at an equilibrium which b
 enefits every party involved. The empirical studies show that our new ineq
 uality can be observed empirically for real datasets that fit well with ex
 tremely heavy tails.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/36/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Xiaochun Meng (University of Bath)
DTSTART:20240509T130000Z
DTEND:20240509T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/37
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/37/">Angular Combining of Forecasts of Probability Distributions</a>
 \nby Dr Xiaochun Meng (University of Bath) as part of (ED-3S) Essex Data S
 cience Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nWhen multi
 ple forecasts are available for a probability distribution\, forecast comb
 ining enables a pragmatic synthesis of the information to extract the wisd
 om of the crowd. A linear opinion pool has been widely used\, whereby the 
 combining is applied to the probability predictions of the distributional 
 forecasts. However\, it has been argued that this will tend to deliver ove
 rdispersed distributional forecasts\, prompting the combination to be appl
 ied\, instead\, to the quantile predictions of the distributional forecast
 s. Results from different applications are mixed\, leaving it as an empiri
 cal question whether to combine probabilities or quantiles. In this paper\
 , we present an alternative approach. Looking at the distributional foreca
 sts\, combining the probability forecasts can be viewed as vertical combin
 ing\, with quantile forecast combining seen as horizontal combining. Our p
 roposal is to allow combining to take place on an angle between the extrem
 e cases of vertical and horizontal combining. We term this angular combini
 ng. The angle is a parameter that can be optimized using a proper scoring 
 rule. For implementation\, we provide a pragmatic numerical approach and a
  simulation algorithm. Among our theoretical results\, we show that\, as w
 ith vertical and horizontal averaging\, angular averaging results in a dis
 tribution with mean equal to the average of the means of the distributions
  that are being combined. We also show that angular averaging produces a d
 istribution with lower variance than vertical averaging\, and\, under cert
 ain assumptions\, greater variance than horizontal averaging. We provide e
 mpirical support for angular combining using weekly distributional forecas
 ts of Covid mortality.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/37/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mahendra Singh Rajpoot (University of Essex)
DTSTART:20240125T140000Z
DTEND:20240125T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/38
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/38/">Large Language Models: A Stepping Stone for AGI!</a>\nby Mahend
 ra Singh Rajpoot (University of Essex) as part of (ED-3S) Essex Data Scien
 ce Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nIn the rapidly
  evolving landscape of Artificial Intelligence (AI)\, Large Language Model
 s (LLMs) have emerged as a transformative force\, showcasing remarkable ca
 pabilities in natural language understanding and generation. This presenta
 tion delves into the pivotal role that LLMs play as a stepping stone towar
 ds achieving Artificial General Intelligence (AGI). We explore the fundame
 ntal principles\, applications\, and underlying mechanisms that propel LLM
 s while contemplating their implications for the broader goal of AGI. The 
 talk will navigate through recent advancements\, challenges\, and ethical 
 considerations in harnessing the potential of LLMs\, ultimately envisionin
 g their contribution to the evolution of comprehensive artificial intellig
 ence.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/38/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yi Zhang (University of Birmingham)
DTSTART:20240425T130000Z
DTEND:20240425T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/40
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/40/">On discounted Markov decision processes and their extensions</a
 >\nby Dr Yi Zhang (University of Birmingham) as part of (ED-3S) Essex Data
  Science Seminar Series\n\nLecture held in 4SW.6.28.\n\nAbstract\nThe theo
 ry for discounted Markov decision processes (MDPs) has been well developed
 . In this talk we review some basic results concerning their occupation me
 asures\, which are convenient for the studies of optimal control problems 
 with constraints. After that\, we discuss the possibility of their extensi
 ons to more general models (uniformly absorbing MDPs\, absorbing MDPs\, or
  more general MDPs with total criteria). The studies of absorbing MDPs hav
 e been active recently.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/40/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Kareemah Chopra (University of Essex)
DTSTART:20240208T140000Z
DTEND:20240208T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/41
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/41/">[Cancelled] The Bunching Behaviour of Cows</a>\nby Dr Kareemah 
 Chopra (University of Essex) as part of (ED-3S) Essex Data Science Seminar
  Series\n\nLecture held in STEM 3.1.\n\nAbstract\nBunching behavior in cat
 tle may occur for several reasons including enabling social interactions\,
  a response to stress or danger\, or due to shared interest in resources s
 uch as feeding or watering areas. There is evidence in pasture grazed catt
 le that bunching may occur more frequently at higher ambient temperatures\
 , possibly due to sharing of fly-load or to seek shade from the direct sun
  under heat stress conditions. Here we demonstrate how bunching behavior i
 s associated with higher ambient temperatures in a barn-housed UK dairy he
 rd. A real-time local positioning system (RTLS) was used\, as part of a pr
 ecision livestock farming (PLF) approach\, to track the spatial position a
 nd activity of a commercial dairy herd (c. 100 cows) in a freestall barn con
 tinuously at high temporal resolution for 4 months between August and November
  2014. Bunching was determined using 4 different spatial measures determin
 ed on an hourly basis: herd full and core range size\, mean herd inter-cow
  distance (ICD)\, and mean herd nearest neighbor distance (NND). For hourl
 y mean ambient temperatures above 20°C\, the herd showed higher bunching 
 behavior with increasing ambient temperature (i.e.\, reduced full and core
  range size\, ICD\, and NND). Aggregated space-use intensity was found to 
 positively correlate with localized variations in temperature across the b
 arn (as measured by animal mounted sensors)\, but the level of correlation
  decreased at higher ambient barn temperatures. Bunching behavior may incr
 ease localized temperatures experienced by individuals and hence may be a 
 maladaptive behavioral response in housed dairy cattle\, which are known t
 o suffer heat stress at higher temperatures. Our study is the first to use
  high-resolution positional data to provide evidence of associations betwe
 en bunching behavior and higher ambient temperatures for a barn-housed dai
 ry herd in a temperate region (UK). Further studies are needed to explore 
 the exact mechanisms for this response to inform both welfare and producti
 on management.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/41/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Richard J. Samworth (University of Cambridge)
DTSTART:20240229T140000Z
DTEND:20240229T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/42
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/42/">Isotonic subgroup selection</a>\nby Professor Richard J. Samwor
 th (University of Cambridge) as part of (ED-3S) Essex Data Science Seminar
  Series\n\nLecture held in STEM 3.1.\n\nAbstract\nGiven a sample of covari
 ate-response pairs\, we consider the subgroup selection problem of identif
 ying a subset of the covariate domain where the regression function exceed
 s a pre-determined threshold. We introduce a computationally-feasible appr
 oach for subgroup selection in the context of multivariate isotonic regres
 sion based on martingale tests and multiple testing procedures for logical
 ly-structured hypotheses. Our proposed procedure satisfies a non-asymptoti
 c\, uniform Type I error rate guarantee with power that attains the minima
 x optimal rate up to poly-logarithmic factors. Extensions cover classifica
 tion\, isotonic quantile regression and heterogeneous treatment effect se
 ttings.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/42/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Edward Rochead (Defence Science and Technology Laborator
 y)
DTSTART:20240307T140000Z
DTEND:20240307T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/43
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/43/">The Alliance for Data Science Professionals</a>\nby Professor E
 dward Rochead (Defence Science and Technology Laboratory) as part of (ED-3
 S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstr
 act\nThe talk will begin by introducing the Alliance\, its members and how
  it was formed. It will then explain how individuals can become accredited
  as Advanced Data Science Professionals and also describe the plans being 
 formed to accredit degrees. It is expected that the discussion would focus
  on how the AfDSP can work with academic colleagues and ensure accreditati
 on is attractive and meaningful to them\, and also consider how it may fee
 d into the employability of graduates in relevant disciplines.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/43/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Laurel Ariane Regibeau-Rockett (Stanford University)
DTSTART:20240321T140000Z
DTEND:20240321T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/44
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/44/">Hurricanes as heat engines</a>\nby Laurel Ariane Regibeau-Rocke
 tt (Stanford University) as part of (ED-3S) Essex Data Science Seminar Ser
 ies\n\nLecture held in STEM 3.1.\n\nAbstract\nHurricanes are dangerous and
  destructive atmospheric phenomena\, frequently causing loss of lives worl
 dwide. Improving our understanding of hurricanes can help improve hurrican
 e forecasts and projections of their response to climate change. One conce
 ptual model of the hurricane\, which has supported major advancements in h
 urricane science\, is the conceptualization of the hurricane as a heat eng
 ine. This theoretical framework supports research at the intersection of 
 physics\, mathematics\, and atmospheric science. In this seminar\, we will
  review this important theoretical model and some of its applications\, to
 gether with possible directions of future research in this interdisciplina
 ry domain.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/44/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Mariachiara Di Cesare (University of Essex)
DTSTART:20240314T140000Z
DTEND:20240314T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/45
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/45/">Institute of Public Health and Wellbeing opportunities to enhan
 ce research for all</a>\nby Professor Mariachiara Di Cesare (University of
  Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture he
 ld in STEM 3.1.\n\nAbstract\nThe IPHW\, established in 2022\, represents a
  major strategic innovation for the University of Essex\, bringing togethe
 r our community of experts to provide pioneering leadership in the product
 ion of world-class research\, knowledge exchange and impact. Working with 
 regional\, national\, and international partners\, the IPHW is driven by a
  collective goal of creating a healthier and fairer society. During this s
 eminar we will discuss the IPHW mission\, vision\, and strategy and look a
 t opportunities to enhance interdisciplinary research in the field of heal
 th and wellbeing with a special focus on data science.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/45/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Maria Brigida Ferraro (Sapienza University of Rome)
DTSTART:20240530T130000Z
DTEND:20240530T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/46
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/46/">Two-mode clustering in a fuzzy setting: methods and cluster val
 idity indices</a>\nby Dr Maria Brigida Ferraro (Sapienza University of Rom
 e) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in
  STEM 3.1.\n\nAbstract\nThe aim of clustering is to find a partition of th
 e rows (e.g. objects) of a data matrix based on the values assumed on a se
 t of variables (columns). Two objects belong to the same cluster if the co
 rresponding rows are close to each other according to a certain metric bas
 ed on all the variables. However\, it can be reasonable to seek clusters s
 uch that objects assigned to the same cluster are close to each other with
  respect to a subset of variables. The research interest can also be reve
 rsed\, i.e.\, the goal is to find clusters of variables close to each othe
 r in terms of a subset of objects. Standard clustering algorithms are not 
 adequate to accomplish these tasks. For this purpose\, two-mode clustering
  methods have been introduced. Two-mode clustering consists in simultaneou
 sly partitioning modes (e.g.\, objects and variables) of an observed two-m
 ode data matrix.\n\nIn the literature\, two-mode clustering methods have b
 een extensively studied and extended along various directions. Most of th
 em are based on the classical approach to clustering\, i.e.\, the objects 
 (or the variables) are either assigned or not to the clusters. A more powe
 rful and flexible exploratory approach is represented by introducing fuzzi
 ness in the clustering process. In this case\, the objects (or the variabl
 es) are no longer either assigned or not to the clusters\, but belong to t
 he clusters with the so-called (fuzzy) membership degrees taking values in
  the interval [0\,1]. A high membership degree\, close to 1\, recognizes a
 n object (or variable) strongly assigned to a cluster\, i.e.\, an object (
 or variable) very close to the corresponding cluster prototype.\n\nStartin
 g from the Double k-Means\, we propose a class of two-mode clustering algo
 rithms in a fuzzy framework\, including some robust proposals\, taking in
 to account that\, in this case\, different kinds of outliers exist and sh
 ould be considered. In addition\, in order to evaluate the two fuzzy part
 itions and to choose the optimal numbers of clusters\, new cluster validit
 y indices are introduced. The proposed measures are defined in terms of t
 he compactness within each cluster and separation between clusters. Starti
 ng from some well-known indices in standard fuzzy clustering\, some gener
 alizations to the two-mode case are addressed. The adequacy of the propos
 als is checked by means of simulation and real-case studies.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/46/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Mohamed Bader (University of Portsmouth)
DTSTART:20240627T130000Z
DTEND:20240627T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/47
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/47/">Beyond Theory: Machine Learning and AI in Action for Early Beha
 vior and Outcome Prediction</a>\nby Dr Mohamed Bader (University of Portsm
 outh) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held
  in 1N1.4.1.\n\nAbstract\nThis talk explores the practical application of 
 machine learning and AI in predicting early behaviors and outcomes across 
 healthcare\, digital marketing\, and industrial sectors. In healthcare\, p
 articularly ICUs\, we discuss how AI models forecast patient outcomes and 
 improve resource management. In digital marketing\, the focus is on how AI
  anticipates consumer behaviors to optimize marketing strategies. Lastly\,
  in industrial applications\, we examine AI's role in predicting maintenan
 ce needs and enhancing operational reliability.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/47/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Guy Nason (Imperial College London)
DTSTART:20240516T130000Z
DTEND:20240516T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/48
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/48/">Network Time Series</a>\nby Professor Guy Nason (Imperial Colle
 ge London) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture
  held in STEM 3.1.\n\nAbstract\nA network time series is a multivariate ti
 me series where the individual series are known to be linked by some under
 lying network structure. Sometimes this network is known a priori\, and so
 metimes the network has to be created\, often inferred from the multivaria
 te series itself. Network time series are becoming increasingly common\, l
 ong\, and collected over a large number of variables. We are particularly 
 interested in network time series whose network structure changes over tim
 e.\n\nWe describe some recent developments in the modeling of network time
  series via generalized network autoregressive (GNAR) process models. Thes
 e models use regular autoregressive links between a variable and its past 
 and between a variable and the past of its neighbours. GNAR models are hig
 hly parsimonious and\, hence\, work well for short series or those afflict
 ed by worrying amounts of missing data. For the same reason\, they tend no
 t to overfit and often exhibit excellent forecasting performance\, especia
 lly when compared to alternatives such as vector autoregressive models.\n\
 nThis talk explains the GNAR model and some interesting variants. We intro
 duce some new tools for model selection and exhibit their use on epidemic 
 and economic data.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/48/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Kareemah Chopra (University of Essex)
DTSTART:20240502T130000Z
DTEND:20240502T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/50
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/50/">The Bunching Behaviour of Cows</a>\nby Dr Kareemah Chopra (Univ
 ersity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\n\
 nAbstract\nBunching behavior in cattle may occur for several reasons inclu
 ding enabling social interactions\, a response to stress or danger\, or du
 e to shared interest in resources such as feeding or watering areas. There
  is evidence in pasture grazed cattle that bunching may occur more frequen
 tly at higher ambient temperatures\, possibly due to sharing of fly-load o
 r to seek shade from the direct sun under heat stress conditions. Here we 
 demonstrate how bunching behavior is associated with higher ambient temper
 atures in a barn-housed UK dairy herd. A real-time local positioning syste
 m (RTLS) was used\, as part of a precision livestock farming (PLF) approac
 h\, to track the spatial position and activity of a commercial dairy herd 
 (c100 cows) in a freestall barn continuously at high temporal resolution f
 or 4 mo between August and November 2014. Bunching was determined using 4 
 different spatial measures determined on an hourly basis: herd full and co
 re range size\, mean herd inter-cow distance (ICD)\, and mean herd nearest
  neighbor distance (NND). For hourly mean ambient temperatures above 20°C
 \, the herd showed higher bunching behavior with increasing ambient temper
 ature (i.e.\, reduced full and core range size\, ICD\, and NND). Aggregate
 d space-use intensity was found to positively correlate with localized var
 iations in temperature across the barn (as measured by animal mounted sens
 ors)\, but the level of correlation decreased at higher ambient barn tempe
 ratures. Bunching behavior may increase localized temperatures experienced
  by individuals and hence may be a maladaptive behavioral response in hous
 ed dairy cattle\, which are known to suffer heat stress at higher temperat
 ures. Our study is the first to use high-resolution positional data to pro
 vide evidence of associations between bunching behavior and higher ambient
  temperatures for a barn-housed dairy herd in a temperate region (UK). Fur
 ther studies are needed to explore the exact mechanisms for this response 
 to inform both welfare and production management.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/50/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Anusa Suwanwong (University of Essex)
DTSTART:20240620T130000Z
DTEND:20240620T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/51
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/51/">A Gene Selection Method for Classification with Three Classes U
 sing Proportional Overlapping Scores</a>\nby Anusa Suwanwong (University o
 f Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture h
 eld in STEM 3.1.\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/51/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yu Wu (Southwest Jiaotong University\, China)
DTSTART:20241010T110000Z
DTEND:20241010T120000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/52
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/52/">Efforts to overcome the curse of dimensionality in sequential d
 ecision-making problems</a>\nby Dr Yu Wu (Southwest Jiaotong University\, 
 China) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstract\
 nSequential decision-making problems are widespread\, and solving them is 
 essential for enhancing efficiency\, reducing costs\, and optimizing resou
 rce allocation. However\, these problems are notoriously difficult due to 
 the “curse of dimensionality.” Given the diversity of sequential decis
 ion-making problems and the broad applicability of solution methods\, thi
 s talk will primarily focus on the complex Dynamic Vehicle Routing Problem
  (DVRP). It will start by elucidating the specific challenges posed by the
  curse of dimensionality\, including the exponential growth of state space
 \, action space\, and transition probabilities. Then\, the talk will exami
 ne and discuss existing techniques to address these challenges\, such as s
 tate aggregation\, initial policy generation\, offline-online policy impro
 vement\, state (or state-action) value function representation\, and metho
 ds for updating and leveraging probabilistic laws. Finally\, by modularly 
 deconstructing\, updating\, and recombining these techniques\, this talk w
 ill propose new approaches to potentially overcome the curse of dimensiona
 lity in the future.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/52/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Tengyao Wang (LSE)
DTSTART:20241128T120000Z
DTEND:20241128T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/53
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/53/">High-dimensional changepoint estimation with heterogeneous miss
 ingness</a>\nby Prof Tengyao Wang (LSE) as part of (ED-3S) Essex Data Scie
 nce Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nWe propose a 
 new method for changepoint estimation in partially-observed\, high-dimensi
 onal time series that undergo a simultaneous change in mean in a sparse su
 bset of coordinates.  Our first methodological contribution is to introduc
 e a 'MissCUSUM' transformation\, that captures the interaction between the
  signal strength and the level of missingness in each coordinate.  In orde
 r to borrow strength across the coordinates\, we project these MissCUSUM s
 tatistics along a direction found as the solution to an optimisation probl
 em.  The changepoint can then be estimated as the location of the peak of 
 the projected series.  In a model that allows different missingness probab
 ilities in different component series\, we identify that the key interacti
 on between the missingness and the signal is an observation-probability-we
 ighted sum of squares of the signal change in each coordinate.  More spec
 ifically\, we prove that the angle between the estimated and oracle projec
 tion directions\, as well as the changepoint location error\, are controll
 ed with high probability by the sum of two terms\, both involving this wei
 ghted sum of squares\, and representing the error incurred due to noise an
 d due to missingness respectively.  The striking effectiveness of our meth
 odology is further demonstrated both on simulated data\, and on an oceanog
 raphic data set.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/53/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Eleni-Rosalina Andrinopoulou (Erasmus Medical Center Rotterdam)
DTSTART:20241205T120000Z
DTEND:20241205T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/54
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/54/">Assessing Risk Indicators in Clinical Practice with Joint Model
 s</a>\nby Dr Eleni-Rosalina Andrinopoulou (Erasmus Medical Center Rotterda
 m) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in
  https://essex-university.zoom.us/j/98548135065.\n\nAbstract\nThe increasi
 ng availability of clinical measures (e.g.\, electronic medical records) l
 eads to collecting many different types of information. This information m
 ainly includes multiple longitudinal measurements collected during follow-
 up visits of the patient to the clinic. Big data is the key element to new
  developments and precision medicine. Nowadays\, individualized\, dynamic 
 predictions are popular in different medical fields because they improve p
 atient care. In particular\, it is of high clinical interest to predict fu
 ture measurements to assess recovery on patients who experienced a stroke 
 or predict life expectancy for patients with Cystic Fibrosis.\n\nPhysicia
 ns collect a variety of measurements over time to assess the severity and 
 progression of a disease. Even though all outcomes will be considered toge
 ther intuitively\, they will usually be analyzed separately. It is biologi
 cally relevant to study them together\; therefore\, it is more appropriate
  to analyze them assuming a single statistical model. This\, however\, pos
 es many challenges. In particular\, different characteristics of the patie
 nts' longitudinal profiles could influence the outcome(s) of interest. For
  example\, the rate of change could be a better predictor than the actual 
 value. Therefore\, it is essential to assume the correct association struc
 ture when obtaining dynamic predictions.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/54/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Victoria Volodina (University of Exeter)
DTSTART:20241114T120000Z
DTEND:20241114T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/55
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/55/">Polynomial structural equation models (SEMs) for decision suppo
 rt systems</a>\nby Dr Victoria Volodina (University of Exeter) as part of 
 (ED-3S) Essex Data Science Seminar Series\n\nLecture held in CTC 3.02.\n\n
 Abstract\nProbabilistic graphical models are widely used to model complex 
 systems with uncertainty. In a decision-support context\, Bayesian analysi
 s may focus on obtaining beliefs from a decision maker and multiple expert
  panels about individual variables or propagating the effects of new evide
 nce through the graph G\, coherently updating beliefs in vertices that are
  not yet established. The associated computations can become cumbersome. T
 o facilitate the decision-making process\, we propose adopting the polynom
 ial structural equation model (SEM) to depict complex relationships betwee
 n individual variables in graph G and with a utility function in polynomia
 l form. Since the marginal posterior distributions of individual variables
  can become analytically intractable\, we develop a nonparametric message-
 passing algorithm that propagates information throughout the graph using o
 nly moments\, enabling exact calculation of expected utility scores. We il
 lustrate the proposed methodology with examples and an application to deci
 sion problems in energy planning and healthcare.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/55/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jonas Latz (Manchester University)
DTSTART:20250227T120000Z
DTEND:20250227T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/56
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/56/">Cancelled</a>\nby Dr Jonas Latz (Manchester University) as part
  of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in CTC 3.02.
 \nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/56/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jianya Lu (University of Essex)
DTSTART:20241017T110000Z
DTEND:20241017T120000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/57
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/57/">Distribution estimation for time series via DNN-based GANs with
  an application to change-point estimation</a>\nby Dr Jianya Lu (Universit
 y of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\nLectur
 e held in CTC 3.02.\n\nAbstract\nThe generative adversarial networks (GANs
 ) have recently been applied to estimating the distribution of independent
  and identically distributed data\, and have attracted a lot of research a
 ttention. In this talk\, I'll demonstrate the effectiveness of GANs in es
 timating the distribution of stationary time series. Theoretically\, we de
 rive a non-asymptotic error bound for the Deep Neural Network (DNN)-based 
 GANs estimator for the stationary distribution of the time series. Our app
 roach is based on the blocking technique and the $m$-dependence approximat
 ion technique that divides the time series into interlacing blocks of equa
 l size and then constructs independent blocks. Based on the theoretical an
 alysis\, we propose an algorithm for estimating the change-point in time s
 eries distribution. Numerical results of Monte Carlo experiments and real 
 data application are given to validate our theory and algorithm.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/57/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof. Michael J. Daniels (University of Florida\, USA)
DTSTART:20250213T120000Z
DTEND:20250213T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/58
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/58/">A Bayesian nonparametric approach for evaluating the causal eff
 ect of treatment in observational cohort studies with semi-competing risks
 </a>\nby Prof. Michael J. Daniels (University of Florida\, USA) as part of
  (ED-3S) Essex Data Science Seminar Series\n\nLecture held in CTC 3.02.\n\
 nAbstract\nWe develop a Bayesian nonparametric (BNP) approach to evaluate 
 the causal effect of treatment where a nonterminal event may be censored b
 y one (or more) terminal event(s)\, but not vice versa (i.e.\, semi-compet
 ing risks). Based on the idea of principal stratification\, we define a no
 vel estimand for the causal effect of treatment on the nonterminal event. 
 We introduce identification assumptions (using factorizations based on vin
 e copulas) indexed by sensitivity parameters and show how to draw inferen
 ce using our BNP approach. We illustrate our methodology using data from a
  cardiovascular cohort study.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/58/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Soudeep Deb (Indian Institute of Management Bangalore)
DTSTART:20241212T120000Z
DTEND:20241212T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/59
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/59/">Structural breaks in the spatial network of real estate dynamic
 s: A study of UK property transactions</a>\nby Dr Soudeep Deb (Indian Inst
 itute of Management Bangalore) as part of (ED-3S) Essex Data Science Semin
 ar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nThe real estate market
  is a dynamic system shaped by various economic\, social\, and environment
 al factors\, making the detection of structural changes crucial for unders
 tanding market trends\, managing risks\, and guiding investment and policy
  decisions. This study examines temporal changes in Greater London’s rea
 l estate market using weekly data at the MSOA (Middle Layer Super Output A
 reas) level through a two-stage methodology. First\, Local Indicators of S
 patial Association (LISA) are applied to identify significant clusters of 
 high and low property prices and spatial outliers\, which are then integra
 ted into a network framework incorporating geographical distance. Second\,
  structural breaks in market dynamics are detected using network Laplacian
 s\, capturing both gradual and abrupt shifts over time. These findings are
  further leveraged to develop a localized house price index and a data-dri
 ven zoning approach\, offering enhanced tools for evaluating property pric
 es with greater accuracy\, benefiting both investors and policymakers.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/59/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Wasiur R. KhudaBukhsh (University of Nottingham)
DTSTART:20250515T110000Z
DTEND:20250515T120000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/60
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/60/">Enzyme kinetic reactions as interacting particle systems: Stoch
 astic averaging and parameter inference</a>\nby Dr Wasiur R. KhudaBukhsh (
 University of Nottingham) as part of (ED-3S) Essex Data Science Seminar Se
 ries\n\nLecture held in CTC 3.02.\n\nAbstract\nWe consider a stochastic mo
 del of multistage Michaelis--Menten (MM) type enzyme kinetic reactions des
 cribing the conversion of substrate molecules to a product through several
  intermediate species. The high-dimensional\, multiscale nature of these r
 eaction networks presents significant computational challenges\, especiall
 y in statistical estimation of reaction rates. This difficulty is amplifie
 d when direct data on system states are unavailable\, and one only has acc
 ess to a random sample of product formation times. To address this\, we pr
 oceed in two stages. First\, under certain technical assumptions akin to t
 hose made in the Quasi-steady-state approximation (QSSA) literature\, we p
 rove two asymptotic results: a stochastic averaging principle that yields 
 a lower-dimensional model\, and a functional central limit theorem that qu
 antifies the associated fluctuations. Next\, for statistical inference of 
 the parameters of the original MM reaction network\, we develop a mathemat
 ical framework involving an interacting particle system (IPS) and prove a 
 propagation of chaos result that allows us to write a product-form likelih
 ood function. The novelty of the IPS-based inference method is that it doe
 s not require information about the state of the system and works with onl
 y a random sample of product formation times. We provide numerical example
 s to illustrate the efficacy of the theoretical results. Preprint: https:/
 /arxiv.org/abs/2409.06565\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/60/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Kirstii Badcock (University of Essex)
DTSTART:20241121T120000Z
DTEND:20241121T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/61
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/61/">SMSAS and Impact: Ensuring your research makes a difference</a>
 \nby Kirstii Badcock (University of Essex) as part of (ED-3S) Essex Data S
 cience Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nI will giv
 e a short overview of national research assessments\, why impact is import
 ant and how it is scored in the REF drawing on examples of high scoring im
 pact case studies taken from REF21 for Unit of Assessment 10.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/61/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yi Xia (University of Essex)
DTSTART:20250116T120000Z
DTEND:20250116T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/62
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/62/">A Numerical Investigation of Non-Convex Reinsurance Problems Us
 ing a Method of Homotopy Optimization with Perturbations and Ensembles</a>
 \nby Yi Xia (University of Essex) as part of (ED-3S) Essex Data Science Se
 minar Series\n\nLecture held in CTC 3.02.\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/62/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Stuart McDonald (Longevity and Demographic Insights)
DTSTART:20250130T120000Z
DTEND:20250130T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/63
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/63/">Mortality in the wake of the pandemic</a>\nby Stuart McDonald (
 Longevity and Demographic Insights) as part of (ED-3S) Essex Data Science 
 Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nStuart McDonald w
 ill discuss the role of actuaries in providing modelling and insights an
 d addressing misinformation during the Covid-19 pandemic. He will show th
 e impact of the pandemic and resulting NHS pressures on UK mortality rates
 \, and the challenge this presents for projecting future mortality.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/63/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Rishideep Roy (University of Essex)
DTSTART:20250313T120000Z
DTEND:20250313T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/64
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/64/">Real-time within game forecasting in football</a>\nby Dr Rishid
 eep Roy (University of Essex) as part of (ED-3S) Essex Data Science Semina
 r Series\n\nLecture held in CTC 3.02.\n\nAbstract\nWe employ a Bayesian me
 thodology to predict the results of soccer matches in real-time. Using seq
 uential data of various events throughout the match\, we utilise a multino
 mial probit regression in a novel framework to estimate the time-varying i
 mpact of covariates and to forecast the outcome. English Premier League da
 ta from eight seasons are used to evaluate the efficacy of our method. Dif
 ferent evaluation metrics establish that the proposed model outperforms po
 tential competitors inspired by existing statistical or machine learning a
 lgorithms. Additionally\, we apply robustness checks to demonstrate the mo
 del’s accuracy across various scenarios.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/64/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Mengchu Li (University of Birmingham)
DTSTART:20250522T110000Z
DTEND:20250522T120000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/65
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/65/">Differential privacy analysis of Langevin algorithms</a>\nby Dr
  Mengchu Li (University of Birmingham) as part of (ED-3S) Essex Data Scien
 ce Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nIn this talk\,
  I will introduce the concept of differential privacy\, the prevailing fra
 mework for developing statistical procedures while quantifying the amount 
 of privacy offered to each individual in the data set. Differential privac
 y guarantees are often achieved by injecting noise into deterministic algo
 rithms\, and this fact makes a large class of sampling algorithms naturall
 y private without any modifications. I will focus on the simplest unadjust
 ed Langevin algorithm and discuss several attempts to characterise its pri
 vacy guarantees under the differential privacy framework.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/65/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Romina Hashami (University of Essex)
DTSTART:20251016T120000Z
DTEND:20251016T130000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/66
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/66/">Can News Predict the Direction of Oil Price Volatility? A Langu
 age Model Approach with SHAP Explanations</a>\nby Romina Hashami (Universi
 ty of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\nLectu
 re held in 4.311.\n\nAbstract\nFinancial markets can be highly sensitive t
 o news\, investor sentiment and economic indicators\, leading to important
  asset price fluctuations.  In this study we focus on crude oil\, due to i
 ts crucial role in commodity markets and the global economy. Specifically
 \, we are interested in understanding the directional changes of oil pric
 e volati
 lity\, and for this purpose\, we investigate whether news alone -- without
  incorporating traditional market data -- can effectively predict the dire
 ction of oil price movements. Using a decade-long dataset from Eikon (2014
 –2024)\, we develop an ensemble learning framework to extract predictive
  signals from financial news. Our approach leverages diverse sentiment ana
 lysis techniques and cutting-edge language models\, including FastText\, F
 inBERT\, Gemini\, and LLaMA\, to capture market sentiment and textual patt
 erns. We benchmark our model against the Heterogeneous Autoregressive (HAR
 ) model and assess statistical significance using the McNemar test. Notabl
 y\, while most sentiment-based indicators do not consistently outperform H
 AR\, the raw news count emerges as a robust predictor. Among embedding tec
 hniques\, FastText proves most effective for forecasting directional movem
 ents. Furthermore\, SHAP-based interpretation at the word level reveals ev
 olving predictive drivers across market regimes: pre-pandemic emphasis on 
 supply-demand and economic terms\; early pandemic focus on uncertainty and
  macroeconomic instability\; post-shock attention to long-term recovery in
 dicators\; and war-period sensitivity to geopolitical and regional oil mar
 ket disruptions. These findings highlight the predictive power of news-dri
 ven features and the value of explainable NLP in financial forecasting.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/66/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Peter Young (King's College London)
DTSTART:20251106T130000Z
DTEND:20251106T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/67
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/Essex-DataSc
 ience/67/">Argumentation Theory and its Applications in Data Science</a>\n
 by Dr Peter Young (King's College London) as part of (ED-3S) Essex Data Sc
 ience Seminar Series\n\nLecture held in 4.311.\n\nAbstract\nArgumentation 
 theory is a branch of artificial intelligence (AI) that models how claims 
 and arguments interact and how conflicting viewpoints can be resolved. Whi
 le rooted in formal logic\, argumentation theory has found new relevance i
 n data science\, offering tools for modelling and analysing debates\, expl
 ainable AI\, and decision-making with uncertain and contradictory informat
 ion.\n\nIn this talk\, I will introduce the foundational concepts of argum
 entation theory\, tracing its development from formal logic to computation
 al models. I will then explore its integration with data-driven approaches
 \, highlighting my recent work in the analysis of social media discourse a
 nd in information synthesis for portfolio optimisation. These examples ill
 ustrate how ideas from argumentation can complement approaches in decision
 -making and statistical learning.\n\nThis talk is aimed at researchers int
 erested in the intersection of AI\, data science\, and computational socia
 l science\, and does not assume any prior background in logic or argumenta
 tion.\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/67/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yi Xia (University of Essex)
DTSTART:20251120T130000Z
DTEND:20251120T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/68
DESCRIPTION:by Yi Xia (University of Essex) as part of (ED-3S) Essex Data 
 Science Seminar Series\n\nLecture held in 4.311.\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/68/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Dane Grundy (Aviva)
DTSTART:20251127T130000Z
DTEND:20251127T140000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/69
DESCRIPTION:by Dr Dane Grundy (Aviva) as part of (ED-3S) Essex Data Scienc
 e Seminar Series\n\nLecture held in 4.311.\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/69/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yang Lu (University of Concordia)
DTSTART:20251204T140000Z
DTEND:20251204T150000Z
DTSTAMP:20260422T215602Z
UID:Essex-DataScience/70
DESCRIPTION:by Dr Yang Lu (University of Concordia) as part of (ED-3S) Ess
 ex Data Science Seminar Series\n\nLecture held in 4.311.\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/Essex-DataScience/70/
END:VEVENT
END:VCALENDAR
