Colloquia

Fall 2019

All colloquia will be held at 4PM in AUST 344, unless otherwise noted. Coffee will be served at 3:30PM in AUST 326.

Information about past colloquia is available here.

Date | Speaker | Title | Location
Wednesday, August 28 | Yichuan Zhao, Georgia State University | Rank-based estimating equation with non-ignorable missing responses | 11AM in AUST 313; coffee at 10:30AM in AUST 326
Wednesday, September 4 | Tianying Wang, Columbia University | Integrated Quantile Rank Test (iQRAT) for gene-level associations in Sequencing Studies | 4PM in AUST 344; coffee at 3:30PM in AUST 326
Monday, September 9 | Michael Lavine, US Army Research Office | Suboptimal is the best | 3:35PM in AUST 163; coffee at 3:00PM in AUST 326
Wednesday, September 11 | Ivair Ramos Silva, Federal University, Brazil | On the Correspondence between Frequentist and Bayesian Tests | 4PM in AUST 344; coffee at 3:30PM in AUST 326
Wednesday, September 18 | Patrick J. Cantwell, U.S. Census Bureau (recipient of the 2019 UConn Statistics Department Distinguished Alumni Award) | Statistical Methods at the U.S. Census Bureau: From Simple Statistical Theory to Complex Practical Application | 4PM in Gentry Bldg, Rm. 131; coffee at 3:15PM-3:45PM in AUST 326
Wednesday, September 25 | Suman Majumdar, Department of Statistics, University of Connecticut | On Asymptotic Standard Normality of the Two Sample Pivot | 4PM in AUST 344; coffee at 3:30PM in AUST 326
Wednesday, October 2 | TBD | |
Wednesday, October 9 | Forrest Crawford, Yale University | TBD |
Wednesday, October 16 | Hui Zou, University of Minnesota | A nearly condition-free fast algorithm for Gaussian graphical model recovery | 4PM in AUST 344; coffee at 3:30PM in AUST 326
Wednesday, October 23 | Min Shu, University of Connecticut | TBD | 4PM in AUST 344; coffee at 3:30PM in AUST 326
Wednesday, October 30 | Liqun Wang, University of Manitoba | TBD |
Wednesday, November 6 | Julio Castrillon, Boston University | TBD |
Wednesday, November 13 | William Evan Johnson, Boston University | TBD |
Wednesday, November 20 | TBD | |
Wednesday, December 4 | Subhashis Ghoshal, North Carolina State University | TBD |

Yichuan Zhao; Georgia State University

Rank-based estimating equation with non-ignorable missing responses

August 28, 2019 at 11AM in AUST 313

In this talk, a general regression model with responses missing not at random is considered. A rank-based estimator of the regression parameter is derived from a rank-based estimating equation. Based on the asymptotic normality of this estimator, a consistent sandwich estimator of its asymptotic covariance matrix is obtained. To overcome the over-coverage of the normal approximation procedure, an empirical likelihood based on the rank-based gradient function is defined, and its asymptotic distribution is established. Extensive simulation experiments under different error distributions and response probabilities show that the proposed empirical likelihood approach has better performance, in terms of coverage probability and average length of confidence intervals for the regression parameters, than the normal approximation approach and its least-squares counterpart. A data example illustrates the proposed methods.
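The complete-data version of the rank-based idea can be illustrated with Jaeckel's dispersion criterion under Wilcoxon scores, which the rank-based estimating equation is the gradient of. This is only a sketch of the classical complete-data estimator on simulated data (all names, sample sizes, and distributions here are illustrative), not the talk's non-ignorable-missingness method or its empirical likelihood machinery.

```python
import numpy as np

# Sketch: rank-based (Jaeckel/Wilcoxon) regression for complete data.
# The talk builds estimating equations of this flavor when responses
# are missing not at random; that extension is omitted here.

rng = np.random.default_rng(2)
n, beta_true = 500, 1.5
x = rng.normal(size=n)
y = beta_true * x + rng.standard_t(df=3, size=n)   # heavy-tailed errors

def dispersion(beta):
    # Jaeckel's dispersion: sum of residuals weighted by centered
    # Wilcoxon scores of their ranks; convex and piecewise linear in beta.
    resid = y - beta * x
    ranks = resid.argsort().argsort() + 1           # ranks 1..n (no ties here)
    scores = np.sqrt(12) * (ranks / (n + 1) - 0.5)  # Wilcoxon scores
    return np.sum(scores * resid)

# Crude grid search; a real implementation would exploit convexity.
grid = np.linspace(0.5, 2.5, 2001)
beta_hat = grid[np.argmin([dispersion(b) for b in grid])]
print(beta_hat)
```

Because the criterion depends on residuals only through their ranks, the fit is robust to the heavy-tailed errors used above.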


Tianying Wang; Columbia University

Integrated Quantile Rank Test (iQRAT) for gene-level associations in Sequencing Studies

September 4, 2019 at 4PM in AUST 344

Sequence-based association studies often evaluate the group-wise effects of rare and common genetic variants within a gene on a phenotype of interest. Many such approaches have been proposed, including the widely used burden and sequence kernel association tests, which focus on identifying genetic effects on the phenotypic mean. Because genetic associations can be complex, we propose an efficient rank test to investigate genetic effects across the entire distribution of a phenotype. The proposed test generalizes the classical quantile-specific rank-score test by integrating the rank-score test statistics over quantile levels, incorporating the Cauchy combination test scheme and Fisher's method to maximize power. We show that the resulting test complements mean-based analysis and improves efficiency and robustness. Using simulation studies and real Metabochip data on lipid traits, we investigate the performance of the new test in comparison with the burden and sequence kernel association tests in multiple scenarios.
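The Cauchy combination scheme mentioned in the abstract has a simple closed form: transform each p-value to a Cauchy variate, average, and map back. The sketch below shows that generic device on made-up p-values; it is not the iQRAT implementation, and the example inputs are hypothetical.

```python
import numpy as np

# Generic Cauchy combination of p-values (the scheme the abstract cites).
# Key property: the weighted sum of Cauchy-transformed p-values is again
# approximately Cauchy, even under dependence between the tests.

def cauchy_combine(pvals, weights=None):
    p = np.asarray(pvals, dtype=float)
    w = np.full(p.size, 1.0 / p.size) if weights is None else np.asarray(weights)
    t = np.sum(w * np.tan((0.5 - p) * np.pi))  # Cauchy-transformed statistic
    return 0.5 - np.arctan(t) / np.pi          # Cauchy survival function

# One strong quantile-level signal among mostly null levels still drives
# the combined p-value toward significance.
print(cauchy_combine([0.001, 0.4, 0.6, 0.8]))
```

The heavy Cauchy tail is what makes the combination insensitive to correlation among the per-quantile rank-score statistics.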


Michael Lavine; US Army Research Office

Suboptimal is the best

September 9, 2019

Many statistics problems are framed as optimization: we write down a target function f(θ) and find the input value θ∗ that maximizes it. Our thesis is that we are often better served by finding the entire set of θ's that come close to maximizing f, namely Θ∗ ≡ {θ : f(θ) ≥ f(θ∗) − ε}. This talk will
(1) explain why, briefly;
(2) show a few examples of what can be gained by finding Θ∗; and
(3) show one possible approach to finding Θ∗.
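On a toy one-dimensional problem, the near-optimal set Θ∗ can be found by simple enumeration. This is a minimal sketch of the definition above (the target function, grid, and ε are invented for illustration), not the approach the talk presents.

```python
import numpy as np

# Sketch: compute Theta* = {theta : f(theta) >= f(theta*) - eps} on a grid
# for a target with a broad, nearly flat maximum. The lone maximizer is a
# single point, but the eps-optimal set is a whole interval, which conveys
# how weakly f discriminates among nearby values of theta.

def f(theta):
    return -theta**4          # toy target, flat near its maximum at 0

grid = np.linspace(-2.0, 2.0, 4001)
vals = f(grid)
eps = 0.01

theta_star = grid[np.argmax(vals)]
near_optimal = grid[vals >= vals.max() - eps]

print(theta_star)
print(near_optimal.min(), near_optimal.max())
```

Reporting the interval rather than the point makes the flatness of f, and hence the real uncertainty about θ, visible.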


Ivair Ramos Silva; Federal University, Brazil

On the Correspondence between Frequentist and Bayesian Tests

September 11, 2019

The debate between adherents of the Bayesian and frequentist schools has endured for decades. However, a reconciling theory for hypothesis testing is emerging. This presentation is inspired by the work of Silva (2018), who shows that one can always calibrate Bayesian and frequentist tests so that they yield the same decision rule for any hypothesis testing problem.

Patrick J. Cantwell; U.S. Census Bureau
Recipient of the 2019 UConn Statistics Department Distinguished Alumni Award

Statistical Methods at the U.S. Census Bureau: From Simple Statistical Theory to Complex Practical Application

September 18, 2019

The U.S. Census Bureau has employed dozens of statistics graduates from the University of Connecticut over the years. At the Bureau, staff develop complex theoretical models and implement them in the course of our work on surveys and censuses. However, we often begin with simple ideas and find interesting, sometimes complex, applications to solve problems. In this presentation, we address three questions. For each, we briefly describe a statistical application based on simple statistical concepts.

The questions: (1) In a time when data intruders have access to sophisticated software and huge databases of personal information, how can we ensure the confidentiality of individuals’ responses to the census and our surveys? The randomized response method suggests ways to guarantee confidentiality in the presence of any external threats. (2) Can we measure how well the U.S. decennial census “covers” the population of the United States? A statistical procedure developed in the 19th century provides the starting point and became the topic of a controversial Supreme Court case. (3) How can we design a survey that produces high quality estimates of the current unemployment rate as well as the change in the rate from the previous month? Practical considerations and basic concepts of statistical covariance provide guidance on effective survey designs and estimation procedures to precisely measure the unemployment rate, a major indicator and driver of the stock market.
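The randomized response idea in question (1) can be sketched in its classical form due to Warner: each respondent answers either the sensitive question or its complement depending on a private coin flip, so no individual answer is revealing, yet the population prevalence remains estimable. The simulation below uses made-up numbers and is not the Census Bureau's actual procedure.

```python
import numpy as np

# Warner-style randomized response (a classical version of the method the
# talk alludes to). With probability p the respondent answers the sensitive
# question directly, otherwise its complement; a single "yes" therefore
# never reveals the respondent's true status.

rng = np.random.default_rng(0)
n, p, true_prev = 100_000, 0.7, 0.2      # hypothetical design and prevalence

member = rng.random(n) < true_prev        # sensitive attribute (unobserved)
direct = rng.random(n) < p                # which question was answered
answer_yes = np.where(direct, member, ~member)

# P(yes) = p*pi + (1 - p)*(1 - pi), so pi is recoverable by inversion:
lam = answer_yes.mean()
pi_hat = (lam - (1 - p)) / (2 * p - 1)
print(pi_hat)
```

The price of the privacy guarantee is extra variance in the estimator, which shrinks as p moves away from 1/2 toward 1.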


Suman Majumdar; Department of Statistics, University of Connecticut

On Asymptotic Standard Normality of the Two Sample Pivot

September 25, 2019

The large sample solution to the problem of comparing the means of two (possibly heteroscedastic) populations, based on two random samples from the populations, hinges on the pivot underpinning the construction of the confidence interval and the test statistic being asymptotically standard Normal. We regularly use this well-known solution if the two samples are independent and the sample sizes are large. However, to establish the asymptotic standard Normality of the two sample pivot, existing results in the literature seem to assume, above and beyond the cross sample independence of the two samples, that the ratio of the sample sizes converges to a finite positive number. This restriction on the asymptotic behavior of the ratio of the sample sizes is impossible to verify in practical applications and carries the risk of rendering the theoretical justification of the large sample approximation invalid even in moderately unbalanced designs.

Our results show that neither the restriction on the asymptotic behavior of the ratio of the sample sizes nor the assumption of cross sample independence is necessary for the asymptotic standard Normality of the two sample pivot. Convergence of the joint distribution of the standardized sample means to a spherically symmetric distribution on the plane, which has to be the bivariate standard Normal distribution, implies the asymptotic standard Normality of the two sample pivot, with the passage of the sample sizes to infinity being completely unrestricted. Finally, the two random samples we work with can be considered to be a truncation of an underlying infinite sequence of random vectors, with truncation in each coordinate occurring at a different stage. As long as this infinite sequence consists of independent (not necessarily identically distributed) elements, Cesàro convergence of the sequence of cross sample correlation coefficients to zero is equivalent to both the asymptotic standard Normality of the two sample pivot and the asymptotic bivariate standard Normality of the standardized sample means.
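A small simulation makes the talk's point concrete for a deliberately unbalanced design. The sample sizes, distributions, and replication count below are invented for illustration; the code only checks empirically that the standard two-sample pivot behaves like N(0, 1) without any balance between n1 and n2.

```python
import numpy as np

# Monte Carlo check of the two-sample pivot
#   T = (Xbar - Ybar - (mu1 - mu2)) / sqrt(S1^2/n1 + S2^2/n2)
# under heteroscedastic, non-normal populations and a heavily
# unbalanced design (n1/n2 = 1/40).

rng = np.random.default_rng(1)
n1, n2, reps = 50, 2000, 5000
mu1, mu2 = 1.0, 2.0

x = rng.uniform(mu1 - 1.0, mu1 + 1.0, (reps, n1))  # non-normal errors
y = rng.normal(mu2, 2.0, (reps, n2))               # different variance

t = ((x.mean(1) - y.mean(1)) - (mu1 - mu2)) / np.sqrt(
    x.var(1, ddof=1) / n1 + y.var(1, ddof=1) / n2)

# Empirical mean and standard deviation should be close to 0 and 1.
print(t.mean(), t.std())
```

No assumption about the limit of n1/n2 is used anywhere in the construction, which is exactly the point of the talk.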

Hui Zou; School of Statistics, University of Minnesota

A nearly condition-free fast algorithm for Gaussian graphical model recovery

October 16, 2019

Many methods have been proposed for estimating Gaussian graphical models. The most popular are the graphical lasso and neighborhood selection, because both are computationally very efficient and have some theoretical guarantees. However, their theory for graph recovery requires very stringent structural assumptions (such as the irrepresentable condition). We argue that replacing the lasso penalty in these two methods with a non-convex penalty does not fundamentally remove the theoretical limitation, because another structural condition is required. As an alternative, we propose a new algorithm for graph recovery that is fast, easy to implement, and enjoys strong theoretical properties under basic sparsity assumptions.
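For context, one of the two baselines the abstract discusses, neighborhood selection, can be sketched in a few lines: lasso-regress each variable on all the others and declare an edge wherever a coefficient is nonzero. The speaker's new algorithm is not described in the abstract, so the code below (with an invented chain-graph example) only illustrates the baseline being improved upon.

```python
import numpy as np

# Neighborhood selection (Meinshausen-Buhlmann) for Gaussian graphical
# model recovery: node-wise lasso regressions, edge if either regression
# keeps the coefficient (the "or" rule).

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).mean(axis=0)
    r = y.copy()                       # running residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]        # remove coordinate j's contribution
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            r -= X[:, j] * b[j]
    return b

rng = np.random.default_rng(3)
n, p = 800, 5
# True graph: a chain 0-1-2-3-4, i.e. a tridiagonal precision matrix.
prec = np.eye(p) + np.diag([0.4] * (p - 1), 1) + np.diag([0.4] * (p - 1), -1)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(prec), size=n)
X -= X.mean(axis=0)

edges = set()
for j in range(p):
    others = [k for k in range(p) if k != j]
    b = lasso_cd(X[:, others], X[:, j], lam=0.15)
    for k, coef in zip(others, b):
        if abs(coef) > 1e-8:
            edges.add(tuple(sorted((j, k))))   # "or" rule for symmetry

print(sorted(edges))
```

The tuning parameter `lam` and the chain structure are arbitrary choices for the demonstration; the stringent conditions the abstract criticizes concern when procedures of this kind recover the true edge set.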