**Spring 2020**

All colloquia will be held at 4PM in AUST 344, unless otherwise noted. Coffee will be served at 3:30PM in AUST 326.

**All department events are currently suspended due to the coronavirus outbreak.**

**Information about past colloquia is available here.**

Date | Speaker | Title | Location |

Friday, January 17 | László Márkus, University of Connecticut | Rough Stochastic Correlation for Modeling Tail Dependence of Asset Price Pairs | 11AM in AUST 344. Coffee at 10:30AM in AUST 326. |

Wednesday, January 22 | Emmy Karim, University of Connecticut | Construction of Simultaneous Confidence Intervals for Ratios of Means of Lognormal Distributions | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, January 29 | Derek Aguiar, University of Connecticut | Bayesian Nonparametric Modelling and Scalable Inference In Large-Scale Genomics Data | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, February 5 | Sangwook Kang, University of Connecticut | Smoothed Quantile Regression for Censored Residual Lifetime Data | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, February 19 | Nitis Mukhopadhyay, University of Connecticut | Non-Routine Exploration of Jensen’s Inequality, Cramer-Rao Inequality, MVUE, and Symmetry | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, February 26 | Marcos Prates, University of Connecticut | Spatial Confounding Beyond Generalized Linear Mixed Models: Extension to Shared Components and Spatial Frailty Models | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Friday, March 6 | Yuchen Fama & Nathan Lally, HSB/Hartford Steam Boiler | Data Science & Data Engineering @ HSB | 11AM in AUST 434. Coffee at 10:30AM in AUST 326. |

Wednesday, March 11 | Haim Bar, University of Connecticut | Large-P Variable Selection in Two-Stage Models | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, March 25 | Ted Westling, UMass (joint) | TBD | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, April 1 | Joseph G. Ibrahim, UNC Gillings School of Global Public Health | Robert W. Makuch Distinguished Lecture in Biostatistics: The Scale Transformed Power Prior with Applications to Studies with Different Endpoints | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, April 8 | Sudipto Banerjee, UCLA | TBD | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, April 15 | Tahir Ekin, Texas State University | Augmented Probability Simulation Methods for Decisions and Games | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, April 22 | Ying Wei, Columbia University | TBD | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

Wednesday, April 29 | Dong-Yun Kim, NIH | TBD | 4PM in AUST 344. Coffee at 3:30PM in AUST 326. |

## László Márkus; Institute of Mathematics, Eötvös Loránd University, Budapest, Hungary and Department of Statistics, University of Connecticut

### Rough Stochastic Correlation for Modeling Tail Dependence of Asset Price Pairs

#### January 17, at 11AM in AUST 344

In 2009 the magazine Wired published “Recipe for Disaster: The Formula That Killed Wall Street” as its cover story, written by journalist Felix Salmon. It blames the subprime crisis on the Gaussian copula, which was then used in finance as the industry standard for estimating the probability distribution of losses on a pool of loans, bonds, or assets.

The Gaussian copula indeed cannot create tail dependence, which is crucial in modeling simultaneous defaults, but that was known before the crisis, as were other models capable of doing so. More than 10 years have passed since then, but the various copula and other models in use, which go beyond correlation in describing dependence, do not harmonize well with the stochastic differential equation (SDE) description used for individual assets. Those models are often evaluated on the basis of their performance in option pricing, which puts them to the test with relatively few data over a short time period. In the lecture I build up an approach where interdependence is inherent from the covariations of Brownian motions driving the asset equations. These covariations in turn are integrals of suitable SDE-driven stochastic processes called stochastic correlations. We test the goodness of the suggested model on historic asset price data, by using Kendall functions of copulas. The paradigm of rough paths leads to a newly emerging methodology in modeling stochastic volatility of assets. We suggest a similar approach to the mentioned stochastic correlations, and show that in frequent, minute-wise trade the fractal dimensions support the assumption of rough paths. The developed model helps show that similar herding behavior of brokers, as expressed by the HIX index, may lead to very different tail dependence and hence, e.g., variable probabilities of coincident defaults. The model may also be useful, e.g., in CDO pricing and in Credit Value Adjustment (CVA). A positive correlation/association between exposure and counterparty default risk gives rise to the so-called Wrong-Way Risk (WWR) in CVA. Even though roughly two-thirds of the losses in the credit crisis were due to CVA losses, a decade after the crisis addressing WWR in a way that is both sound and tractable remains challenging. Our suggested model is capable of creating tail dependence, and produces more realistic CVA premiums than constant correlations.

## Emmy Karim; Assistant Professor-in-Residence, Department of Statistics, University of Connecticut

### Construction of Simultaneous Confidence Intervals for Ratios of Means of Lognormal Distributions

#### January 22, at 4PM in AUST 344

For constructing simultaneous confidence intervals for ratios of means of lognormal distributions, two approaches using a two-step method of variance estimates recovery are proposed. The first approach uses fiducial generalized confidence intervals (FGCIs) in the first step, followed by the method of variance estimates recovery (MOVER) in the second step (FGCIs–MOVER). The second approach uses MOVER in both the first and second steps (MOVER–MOVER). The performance of the proposed approaches is compared with that of simultaneous fiducial generalized confidence intervals (SFGCIs). Monte Carlo simulation is used to evaluate the performance of these approaches in terms of coverage probability, average interval width, and computation time.

## Derek Aguiar, Assistant Professor, Computer Science and Engineering, University of Connecticut

### Bayesian nonparametric modelling and scalable inference in large-scale genomics data

#### January 29, at 4PM in AUST 344

Bayesian nonparametric models provide a formal mechanism for encoding probabilistic assumptions about the data generation process where the dimension of the latent space is unknown a priori or may grow with additional samples. A common limitation of these models is that posterior inference is computationally intensive, particularly for nonconjugate models or when integrating over combinatorial structures. In this talk, I will introduce two hierarchical Bayesian nonparametric models and inference algorithms that scale to large genomics data. First, I will describe our mixed-membership model for alternatively spliced transcript discovery with explicit sparsity and inference algorithms based on stochastic variational inference. Second, I will present our genetic sequence clustering model based on fragmentation coagulation processes and how we scale nonconjugate model inference using maximization-expectation. Lastly, I will demonstrate the advantages of our Bayesian nonparametric approach when compared to state-of-the-art methods on simulated and experimental data.

## Sangwook Kang, Visiting Associate Professor, Department of Statistics, University of Connecticut

### Smoothed quantile regression for censored residual lifetime data

#### February 5, at 4PM in AUST 344

In this talk, we consider semiparametric regression modeling of quantiles for residual lifetimes at a certain time point. Quantile residual lifetimes are essential summary measures in survival analysis. Recent statistical inference procedures for fitting semiparametric quantile residual lifetime models have mostly been based on estimating functions that are nonsmooth in model parameters. Thus, obtaining point estimates and their standard error estimates could be computationally very demanding. We propose to employ a computationally efficient induced-smoothing procedure that smooths nonsmooth estimating functions. Variance estimation can be done via efficient resampling procedures that use the sandwich form of asymptotic variances. We establish the consistency and asymptotic normality of the proposed estimators. Finite sample properties are investigated via extensive simulation studies. We illustrate our proposed methods with a dental restoration study dataset.

## Nitis Mukhopadhyay, Professor, Department of Statistics, University of Connecticut

### Non-Routine Exploration of Jensen’s Inequality, Cramer-Rao Inequality, MVUE, and Symmetry

#### February 19, at 4PM in AUST 344

Deep grounding of Jensen’s Inequality, Cramer-Rao Inequality, MVUEs, and Symmetry, to name a few among other topics, in the core material of statistical inference is undeniable. After dealing with such stuff intently during the past 48+ years of my life, one may expect that by now I should know nearly whatever there is to know. In all humility, however, I must admit that I absolutely do not, and I feel lucky that I do not. Every new encounter still tends to open a new horizon for me that I was not aware of earlier. It is like going to a new candy store! I hope to share my passion for selected discoveries (and rediscoveries) which by some chance I suddenly come across sometimes, and then I pause: I just want to soak in their intrinsic power, beauty and elegance, but do so ever so softly and quietly.

## Marcos Prates, Visiting Professor, Department of Statistics, University of Connecticut

### Spatial Confounding Beyond Generalized Linear Mixed Models: Extension to Shared Components and Spatial Frailty Models

#### February 26, at 4PM in AUST 344

Spatial confounding is defined as the confounding between the fixed and spatial random effects in generalized linear mixed models (GLMMs). It has gained attention in recent years, as it may generate unexpected results in modeling. We introduce solutions to alleviate spatial confounding beyond GLMMs for two families of statistical models. In shared component models, multiple count responses are recorded at each spatial location, and these may exhibit similar spatial patterns. Therefore, the spatial effect terms may be shared between the outcomes in addition to specific spatial patterns. Our proposal relies on the use of modified spatial structures for each shared component and specific effect. Spatial frailty models can incorporate spatially structured effects, and it is common to observe more than one sample unit per area, which means that the support of the fixed and spatial effects differs. Thus, we introduce a projection-based approach for reducing the dimension of the data. An R package named “RASCO: An R package to Alleviate Spatial Confounding” is provided. Cases of lung and bronchus cancer in the state of California are investigated under both methodologies, and the results demonstrate the efficiency of the proposed methodology.

## Yuchen Fama & Nathan Lally, Hartford Steam Boiler

### Data Science & Data Engineering @ HSB

#### March 6, at 11AM in AUST 434

HSB (Munich Re Group) data science staff will present an overview of their data science and data engineering disciplines including an introduction to their team and organizational principles, a survey of selected projects, and a discussion of how HSB’s team contributes to the Munich Re global community. HSB is currently recruiting and will reserve half an hour after the presentation to meet with students.

## Haim Bar, Associate Professor, Department of Statistics, University of Connecticut

### Large-P Variable Selection in Two-Stage Models

#### March 11, at 4PM in AUST 344

Model selection in the large-P small-N scenario is discussed in the framework of two-stage models. Two specific models are considered, namely, two-stage least squares (TSLS) involving instrumental variables (IVs), and mediation models. In both cases, the number of putative variables (either instruments or mediators) is large, but only a small subset should be included in the two-stage model. We use two variable selection methods which are designed for high-dimensional settings, and compare their performance in terms of their ability to find the true IVs or mediators. Our approach is demonstrated via simulations and case studies.

## Joseph G. Ibrahim, Alumni Distinguished Professor, Department of Biostatistics, UNC Gillings School of Global Public Health

### The Scale Transformed Power Prior with Applications to Studies with Different Endpoints

#### April 1, at 4PM in AUST 344

We develop a scale transformed version of the power prior to accommodate settings in which the historical data and the current data involve different data types, such as binary and continuous data. This situation arises often in clinical trials, for example, when the historical data involve binary response data and the current data may be time-to-event or some other continuous outcome. The traditional power prior proposed by Ibrahim and Chen (2000) does not account for different data types in the context discussed here. Thus, a new type of power prior needs to be formulated in these settings, which we call the scale transformed power prior. The scale transformed power prior is constructed so that the information matrix based on the current data likelihood is appropriately scaled by the information matrix from the historical data, thereby shifting the scale of the parameter vector from the historical data to the new data. Several real data examples are presented to motivate the scale transformation, and several simulation studies are presented to show the advantages in performance of the scale transformed power prior over the power prior and other priors.

## Tahir Ekin, Associate Professor, CIS & Quantitative Methods, Texas State University

### Augmented Probability Simulation Methods for Decisions and Games

#### April 15, at 4PM in AUST 344

Decision and game theoretic models based on expectation functions require both computation of the expected utility and its optimization. This can be computationally challenging, especially in cases with continuous and multi-modal sources of uncertainty or complex objective function surfaces. We propose augmented simulation approaches that treat the decision variable(s) as random and construct an augmented distribution in the space of both decisions and random variables. Simulation from this distribution simultaneously solves for the expectation of the objective function and the optimization problem. In doing so, we sample more frequently from the regions of the marginal decision space where the objective function has higher values in a maximization problem. This talk introduces augmented probability simulation and its extensions for solving stochastic programming problems and game theoretic models. There will be a discussion and illustration of a variety of applications such as news-vendor type models, service systems, and cybersecurity.

Bio: Dr. Tahir Ekin is an Associate Professor of Quantitative Methods in the McCoy College of Business, Texas State University. His research interests include analytical applications in health care fraud assessment and decision modeling under uncertainty. His book on health care fraud analytics, titled “Statistics and Health Care Fraud: How to Save Billions”, was recently published. His work has appeared in a variety of journals, including Journal of the Royal Statistical Society Series C, International Statistical Review, Naval Research Logistics, and Decision Analysis, among others. Dr. Ekin holds a Ph.D. in Decision Sciences from The George Washington University and a B.S. in Industrial Engineering from Bilkent University, Turkey. He is an elected member of the International Statistical Institute and currently serves as Vice President of the International Society of Business and Industrial Statisticians.