Colloquia

Fall 2022

Information about past colloquia is available here.

 

| Date | Speaker | Affiliation | Title | Note |
| --- | --- | --- | --- | --- |
| Wednesday, August 31 | Ming-Hui Chen, et al. | UConn STAT | New graduate info session | |
| Wednesday, September 7 | Jae-Kwang Kim | Iowa State University | Multiple Bias Calibration for Valid Statistical Inference With Selection Bias | In Person |
| Wednesday, September 14 | Scott Bruce | Texas A&M University | Interpretable Classification of Categorical Time Series Using the Spectral Envelope and Optimal Scalings | Virtual |
| Wednesday, September 21 | Reuben Retnam | Takeda Pharmaceuticals | A “Divide-and-Conquer” AECM Algorithm for Large non-Gaussian Longitudinal Data with Irregular Follow-Ups | Virtual |
| Wednesday, September 28 | Judy Huixia Wang | George Washington University | Copula-Based Approaches for Analyzing Non-Gaussian Spatial Data | In Person |
| Wednesday, October 5 | Xianyang Zhang | Texas A&M University | Powerful Large-scale Inference in Omics Association Studies | In Person |
| Wednesday, October 12 | Sakshi Arya | Pennsylvania State University | | Virtual |
| Friday, October 14 | Clarice R. Weinberg | National Institutes of Health (NIH) | | Makuch Lecture; In Person |
| Saturday, October 15 | Bani K. Mallick | Texas A&M University | | Distinguished Alumni Lecture; In Person |
| Wednesday, October 19 | Malay Ghosh | University of Florida | | Pfizer Colloquium; In Person |
| Wednesday, October 26 | John Stufken | University of North Carolina Greensboro | | In Person |
| Wednesday, November 2 | | | | |
| Wednesday, November 9 | Haben Michael | University of Massachusetts | | UConn-UMass joint colloquium |
| Friday, November 18 | | | | |
| Wednesday, November 30 | | | | |
| Wednesday, December 7 | | | | |

Jae-Kwang Kim, LAS Dean’s Professor, Department of Statistics, Iowa State University

Multiple Bias Calibration for Valid Statistical Inference With Selection Bias

Valid statistical inference is notoriously challenging when the sample is subject to selection bias. We approach this difficult problem by employing multiple candidate models for the propensity score function combined with empirical likelihood. By incorporating the multiple propensity score (PS) models into the internal bias calibration constraint in the empirical likelihood setup, the selection bias can be safely eliminated so long as the multiple candidate models contain the true PS model. The bias calibration constraint for the multiple PS models in the empirical likelihood is called the multiple bias calibration. The multiple PS models can include both ignorable and nonignorable models. In a data integration setup, the conditions for multiple bias calibration can be achieved. Asymptotic properties are discussed, and some limited simulation studies are presented to compare the approach with existing methods.

Key references:

  • Qin, J., Leung, D. and Shao, J. (2002), ‘Estimation with survey data under non-ignorable nonresponse or informative sampling’, Journal of the American Statistical Association 97, 193–200.
  • Morikawa, K. and Kim, J.K. (2021), ‘Semiparametric optimal estimation with nonignorable nonresponse data’, The Annals of Statistics 49(5), 2991–3014.
  • Han, P. and Wang, L. (2013), ‘Estimation with missing data: Beyond double robustness’, Biometrika 100, 417–430.

Bio: Dr. Jae-Kwang Kim is an LAS Dean’s Professor in the Department of Statistics at Iowa State University (ISU). He received his Ph.D. from ISU in 2000 under the supervision of Prof. Wayne Fuller and then worked in various places (such as Westat and Yonsei University in Korea) before he joined ISU in 2008. He is a fellow of ASA and IMS and the president-elect of KISS (Korean International Statistical Society). He has extensive research and consulting experience in survey sampling and missing data analysis. He is also a co-author, with Jun Shao, of the book “Statistical Methods for Handling Incomplete Data (2nd edn)”.

Event location: Philip E. Austin Building, Rm. 434
Date and time: Wednesday, September 7, 2022, 4:00 pm ET, 1-hour duration

Scott Bruce, Assistant Professor, Department of Statistics, Texas A&M University

Interpretable Classification of Categorical Time Series Using the Spectral Envelope and Optimal Scalings

This article introduces a novel approach to the classification of categorical time series under the supervised learning paradigm. To construct meaningful features for categorical time series classification, we consider two relevant quantities: the spectral envelope and its corresponding set of optimal scalings. These quantities characterize oscillatory patterns in a categorical time series as the largest possible power at each frequency, or spectral envelope, obtained by assigning numerical values, or scalings, to categories that optimally emphasize oscillations at each frequency. Our procedure combines these two quantities to produce an interpretable and parsimonious feature-based classifier that can be used to accurately determine group membership for categorical time series. Classification consistency of the proposed method is investigated, and simulation studies are used to demonstrate accuracy in classifying categorical time series with various underlying group structures. Finally, we use the proposed method to explore key differences in oscillatory patterns of sleep stage time series for patients with different sleep disorders and accurately classify patients accordingly.

Bio: Dr. Bruce is an assistant professor in the Department of Statistics at Texas A&M University. He completed his doctoral training at Temple University, where he developed statistical methodology for frequency-domain analysis of time series. He is passionate about creating novel, computationally efficient statistical methods for the analysis of time series and longitudinal data in areas such as sleep research, neuroscience, and psychiatry. He is involved in numerous transdisciplinary projects in these areas and aims to produce high-quality publications in top statistics and scientific journals. His research interests include nonstationary time series, spectral analysis, Bayesian statistical learning, computational data science, longitudinal data analysis, transdisciplinary research, and applications in sleep medicine, biomechanics, neuroscience, and psychiatry.

Webex Info: Meeting link: https://uconn-cmr.webex.com/uconn-cmr/j.php?MTID=me2abee59b2c4bc7f18722b814e7fd740
Meeting number (access code): 2620 164 2847
Meeting password: wQSBpahY348
Date and time: Wednesday, September 14, 2022, 4:00 pm ET, 1-hour duration

Reuben Retnam, Takeda Pharmaceuticals

A “Divide-and-Conquer” AECM Algorithm for Large non-Gaussian Longitudinal Data with Irregular Follow-Ups

Features of non-Gaussianity, manifested via skewness and heavy tails, are ubiquitous in databases generated from large-scale observational studies. Yet such data continue to be routinely analyzed via linear/non-linear mixed effects models under standard Gaussian assumptions for the random terms. In periodontal disease data, these issues arise in the modeling of clinical attachment level and pocket depth. These problems persist, if not worsen, in the longitudinal data framework.

In this research, we define and elucidate an extension of the skew-t linear mixed model suitable for a big data setting. This extensibility is achieved via the implementation of divide-and-conquer techniques that utilize the distributed expectation-maximization algorithm. Specifically, the E-steps of the AECM algorithm are run in parallel on multiple worker processes, while manager processes perform the M-steps with an updated fraction of the results from the local expectation steps. We prove convergence properties of this algorithm and show examples of its performance compared to traditional modelling methods on real and simulated data.

Bio: Dr. Reuben Retnam is a recent graduate of Virginia Commonwealth University’s Department of Biostatistics. His research interests include longitudinal data, extensions of the EM algorithm, and extrapolation-based model acceleration. He joined Takeda Pharmaceuticals in 2022, focusing on modeling complex pre-clinical outcomes.

Webex Info: Meeting link: https://uconn-cmr.webex.com/uconn-cmr/j.php?MTID=mfcdeb56fd6cd67392fb3455f80570950
Meeting number (access code): 2624 992 7557
Meeting password: Ybjz4Xa3Bv3
Date and time: Wednesday, September 21, 2022, 4:00 pm ET, 1-hour duration

Judy Huixia Wang, Chair and Professor, Department of Statistics, George Washington University

Copula-Based Approaches for Analyzing Non-Gaussian Spatial Data

Many existing methods for analyzing spatial data rely on the Gaussian assumption, which is violated in many applications such as wind speed, precipitation and COVID mortality data. In this talk, I will discuss several recent developments in copula-based approaches for analyzing non-Gaussian spatial data. First, I will introduce a copula-based spatio-temporal model for analyzing spatio-temporal data and a semiparametric estimator. Second, I will present a copula-based multiple indicator kriging model for the analysis of non-Gaussian spatial data by thresholding the spatial observations at a given set of quantile values. The proposed algorithms are computationally simple, since they model the marginal distribution and the spatio-temporal dependence separately. Instead of assuming a parametric distribution, the approaches model the marginal distributions nonparametrically and thus offer more flexibility. The methods also provide convenient ways to construct both point and interval predictions based on the estimated conditional quantiles. I will present some numerical results, including analyses of wind speed and precipitation data. If time allows, I will also discuss recent work on a copula-based approach for analyzing spatial count data.

Bio: Judy Huixia Wang received her Ph.D. in Statistics from the University of Illinois in 2006. She was a faculty member in the Department of Statistics at North Carolina State University from 2006 to 2014. Currently, she is a Professor and Chair of the Department of Statistics at George Washington University. She received a CAREER award from the National Science Foundation and the Tweedie New Researcher Award from the Institute of Mathematical Statistics in 2012. In 2018, she was elected a Fellow of both the American Statistical Association and the Institute of Mathematical Statistics. She was one of the 2022 IMS Medallion Lecturers. She served as a Program Director in the Division of Mathematical Sciences (DMS) of the National Science Foundation from 2018 to 2022, managing the statistics program as well as several interdisciplinary programs that are cross-directorate and cross-agency. Her research interests include quantile regression, semiparametric and nonparametric regression, high dimensional inference, extreme value analysis, spatial analysis, etc.

Event location: Philip E. Austin Building, Rm. 434
Date and time: Wednesday, September 28, 2022, 4:00 pm ET, 1-hour duration

Xianyang Zhang, Associate Professor, Department of Statistics, Texas A&M University

Powerful Large-scale Inference in Omics Association Studies

Increasing statistical power in analyzing omics data through methodological innovation is of tremendous benefit to the field due to resource constraints for individual studies. One direction of innovation is to utilize auxiliary data that could inform us of the statistical and biological properties of the omics features. The first part of the talk will introduce a new multiple testing procedure that can incorporate these auxiliary data to boost the statistical power. We develop a fast algorithm to implement the proposed procedure and prove its asymptotic validity. Through numerical studies, we demonstrate that the new approach improves over the state-of-the-art methods by being flexible, robust, powerful, and computationally efficient. The second part of the talk tackles the issue of statistical power loss when simultaneously adjusting for confounders and multiple testing in omics association studies. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction. Here we show that the conventional procedure is not optimal and present a new approach, 2dFDR, a two-dimensional false discovery rate control procedure, for powerful confounder adjustment in multiple testing. Through extensive evaluation, we show that 2dFDR is more powerful than the traditional procedure, and in the presence of strong confounding and weak signals, the power improvement could be more than 100%.

Bio: Xianyang Zhang is an Associate Professor in the Department of Statistics at Texas A&M University. He obtained his Ph.D. in statistics from the University of Illinois at Urbana-Champaign in 2013. His research interests include high dimensional/large-scale statistical inference, kernel methods, genomics data analysis, functional data analysis, time series, and econometrics. He currently serves as an associate editor for Biometrics and the Journal of Multivariate Analysis.

Event location: Philip E. Austin Building, Rm. 434
Date and time: Wednesday, October 5, 2022, 4:00 pm ET, 1-hour duration