Debashis Ghoshhttps://works.bepress.com/debashis_ghosh/Recent works by Debashis Ghoshen-usCopyright (c) 2019 All rights reserved.Fri, 01 Jan 2016 00:00:00 +00003600CKAT softwarehttps://works.bepress.com/debashis_ghosh/75/<div class="line" id="line-33">This is the software to accompany "A novel copy number variants kernel association test with application to autism spectrum disorders studies" by Zhan et al.</div>Debashis Ghosh et al.Fri, 01 Jan 2016 00:00:00 +0000https://works.bepress.com/debashis_ghosh/75/OtherCovariate adjustment using propensity scores for dependent censoringhttps://works.bepress.com/debashis_ghosh/76/<div class="line" id="line-33">In many medical studies, estimation of treatment effects is often of primary </div><div class="line" id="line-155">scientific interest. Standard methods for evaluating the treatment</div><div class="line" id="line-73">effect in survival analysis typically require the assumption of independent</div><div class="line" id="line-75">censoring. Such an assumption might be invalid in many medical studies,</div><div class="line" id="line-77">where the presence of dependent censoring leads to difficulties in analyzing</div><div class="line" id="line-79">covariate effects on disease outcomes. This data structure is called 'semicompeting</div><div class="line" id="line-81">risks data'. In marginal modeling under semicompeting risks</div><div class="line" id="line-83">data, an artificial censoring technique is a promising approach to handle</div><div class="line" id="line-85">dependent censoring. However, continuous covariates with large variabilities</div><div class="line" id="line-87">may lead to excessive artificial censoring, which subsequently results</div><div class="line" id="line-89">in numerically unstable estimation. In this paper, we propose a strategy</div><div class="line" id="line-91">for weighted estimation of treatment effects in the accelerated failure time</div><div class="line" id="line-93">model. 
Weights are based on propensity score modeling of the treatment</div><div class="line" id="line-95">conditional on confounder variables. This novel application of propensity</div><div class="line" id="line-97">scores avoids excess artificial censoring caused by continuous covariates and</div><div class="line" id="line-99">simplifies computation. Monte Carlo simulation studies and application to</div><div class="line" id="line-113">AIDS and cancer research are used to illustrate the methodology.</div><div class="line" id="line-115"><br></div>Youngjoo Cho et al.Fri, 01 Jan 2016 00:00:00 +0000https://works.bepress.com/debashis_ghosh/76/ArticlesA kernel-based metric for balance assessmenthttps://works.bepress.com/debashis_ghosh/77/<div class="line" id="line-61">One of the commonly used approaches to the causal analysis of observational data is the use</div><div class="line" id="line-63">of matching techniques. Matching involves taking treated subjects and finding comparable</div><div class="line" id="line-65">control subjects who have either similar covariate values and/or propensity score values.</div><div class="line" id="line-67">We focus on the latter approach in this article. In matching, an important purpose is to</div><div class="line" id="line-69">achieve balance in the covariates among the treatment groups. In this article, we </div><div class="line" id="line-87"> propose a new balance measure called kernel distance, which is the empirical estimate of the probability metric</div><div class="line" id="line-77">defined in the reproducing kernel Hilbert spaces. Compared to the traditional balance metrics,</div><div class="line" id="line-79">the kernel distance measures the difference in the two multivariate distributions instead of the</div><div class="line" id="line-81">difference in the finite moments of the distribution. 
Simulation studies and a real data example are used to </div><div class="line" id="line-209">illustrate the methodology.</div>Yeying Zhu et al.Fri, 01 Jan 2016 00:00:00 +0000https://works.bepress.com/debashis_ghosh/77/ArticlesPrognostic and predictive directions for clinical trialshttps://works.bepress.com/debashis_ghosh/73/<p>In many clinical trials, treatment effects can be quite heterogeneous across subgroups, so that individuals in different subgroups can receive different benefits from the treatment. This can be quite important for clinical decision-making. In this article, we introduce a general concept of risk score that is motivated by potential outcomes considerations. The concepts of prognostic and predictive directions for outcome data are defined. Their basis is in the dimension reduction (DR) literature and can also be motivated by conditional independence assumptions. Under some conditions, one can use existing methods from the DR literature to estimate the directions assuming a complete data structure. We show how to adapt the procedure with data that come from a randomized clinical trial. The methodology is illustrated with application to a set of colorectal cancer clinical trials.</p>
Debashis GhoshThu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/73/ArticlesWeighted estimation of the accelerated failure time model in the presence of dependent censoringhttps://works.bepress.com/debashis_ghosh/64/<p>Independent censoring is a crucial assumption in survival analysis. However, this assumption is impractical in many medical studies, where the presence of dependent censoring leads to difficulty in analyzing covariate effects on disease outcomes. The semicompeting risks framework offers one approach to handling dependent censoring. There are two representative estimators based on an artificial censoring technique in this data structure. However, neither of these estimators is better than the other with respect to efficiency (standard error). In this paper, we propose a new weighted estimator for the accelerated failure time (AFT) model under dependent censoring. One of the advantages of our approach is that these weights are optimal among all the linear combinations of the previously mentioned two estimators. To calculate these weights, a novel resampling-based scheme is employed. Attendant asymptotic statistical results for the estimator are established. In addition, simulation studies, as well as an application to real data, show the gains in efficiency for our estimator.</p>
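<p>The idea of an optimal linear combination of two estimators can be illustrated in the scalar case. The sketch below is not the paper's resampling scheme: it assumes two scalar estimators with variances v1 and v2 and covariance c, for which the variance-minimizing weight is w = (v2 - c) / (v1 + v2 - 2c), and it estimates these moments from hypothetical resampled replicates using ordinary sample variances and covariance. All function names are illustrative.</p>

```python
from statistics import mean, variance


def optimal_weight(var1, var2, cov12):
    """Weight w that minimises Var(w * e1 + (1 - w) * e2) for scalar estimators."""
    denom = var1 + var2 - 2.0 * cov12  # equals Var(e1 - e2)
    if denom <= 0.0:
        raise ValueError("the two estimators differ by a constant; weight is undefined")
    return (var2 - cov12) / denom


def combined_variance(w, var1, var2, cov12):
    """Variance of the combination w * e1 + (1 - w) * e2."""
    return w * w * var1 + (1.0 - w) ** 2 * var2 + 2.0 * w * (1.0 - w) * cov12


def sample_covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def weight_from_replicates(reps1, reps2):
    """Estimate the optimal weight from paired resampled replicates (assumed input)."""
    return optimal_weight(variance(reps1), variance(reps2),
                          sample_covariance(reps1, reps2))
```

<p>By construction, the combined variance at this weight is no larger than the variance of either estimator alone, which is the sense in which a combination can gain efficiency over its two components.</p>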
Youngjoo Cho et al.Thu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/64/ArticlesEstimating Controlled Direct Effects of Restrictive Feeding Practices in the 'Early Dieting in Girls' Studyhttps://works.bepress.com/debashis_ghosh/61/<p>In this article, we examine the causal effect of parental restrictive feeding practices on children’s weight status. An important mediator we are interested in is children’s self-regulation status. Traditional mediation analysis (Baron and Kenny, 1986) applies a structural equation modelling (SEM) approach and decomposes the intent-to-treat (ITT) effect into direct and indirect effects. More recent approaches interpret the mediation effects based on the potential outcomes framework. In practice, there often exist confounders that jointly influence the mediator and the outcome. Inverse probability weighting based on propensity scores is used to adjust for confounding and reduce the dimensionality of confounders simultaneously. We show that combining machine learning algorithms (such as a generalized boosted model) and logistic regression to estimate the propensity scores can be more accurate and efficient in estimating the controlled direct effects than using logistic regression alone. The proposed methods are general in the sense that we can combine multiple candidate models and use the cross-validation criterion to select the optimal subset of the candidate models for combining. The criterion achieves a balance between the number of models we combine and the variability of the resulting estimator. A data application to the Early Dieting in Girls Study shows that the causal effect of mother’s restrictive feeding differs according to whether the daughter eats in the absence of hunger.</p>
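<p>The weighting step mentioned above can be sketched in its simplest form. The example below assumes a single binary exposure and propensity scores that have already been estimated by some model; it is not the controlled-direct-effect estimator of the article, which also involves a mediator and a combination of candidate models. It computes a normalized (Hajek-type) inverse-probability-weighted difference in mean outcomes. The function name and arguments are hypothetical.</p>

```python
def ipw_mean_difference(treated, outcome, propensity):
    """Normalised inverse-probability-weighted difference in mean outcomes.

    treated    : sequence of 0/1 exposure indicators
    outcome    : sequence of numeric outcomes
    propensity : sequence of estimated P(treated = 1 | confounders), each in (0, 1)
    """
    if not (len(treated) == len(outcome) == len(propensity)):
        raise ValueError("inputs must have the same length")

    num1 = den1 = num0 = den0 = 0.0
    for a, y, e in zip(treated, outcome, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if a == 1:
            w = 1.0 / e            # exposed units weighted by 1 / e
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - e)    # unexposed units weighted by 1 / (1 - e)
            num0 += w * y
            den0 += w

    if den1 == 0.0 or den0 == 0.0:
        raise ValueError("both exposure groups must be non-empty")
    return num1 / den1 - num0 / den0
```

<p>Dividing by the sum of the weights, rather than by the sample size, keeps each weighted mean within the range of the observed outcomes, which tends to help when some estimated propensity scores are close to 0 or 1.</p>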
Yeying Zhu et al.Thu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/61/ArticlesIncorporating auxiliary information for improved prediction using combination of kernel machineshttps://works.bepress.com/debashis_ghosh/62/<p>With evolving genomic technologies, it is possible to get different measures of the same underlying biological phenomenon using different technologies. The goal of this paper is to build a prediction model for an outcome variable Y from covariates X. Besides X, we have surrogate covariates W which are related to X. We want to utilize the information in W to boost the prediction for Y using X. In this paper, we propose a kernel machine-based method to improve prediction of Y by X by incorporating auxiliary information W. By combining single kernel machines, we also propose a hybrid kernel machine predictor, which can yield a smaller prediction error than its constituents. The prediction error of our kernel machine predictors is evaluated using simulations. We also apply our method to a lung cancer dataset and an Alzheimer's disease dataset.</p>
Xiang Zhan et al.Thu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/62/ArticlesEquivalence of Kernel Machine Regression and Kernel Distance Covariance for Multidimensional Trait Association Studieshttps://works.bepress.com/debashis_ghosh/67/<p>Associating genetic markers with a multidimensional phenotype is an important yet challenging problem. In this work, we establish the equivalence between two popular methods: kernel-machine regression (KMR), and kernel distance covariance (KDC). KMR is a semiparametric regression framework that models covariate effects parametrically and genetic markers non-parametrically, while KDC represents a class of methods that include distance covariance (DC) and Hilbert-Schmidt independence criterion (HSIC), which are nonparametric tests of independence. We show that the equivalence between the score test of KMR and the KDC statistic under certain conditions can lead to a novel generalization of the KDC test that incorporates covariates. Our contributions are 3-fold: (1) establishing the equivalence between KMR and KDC; (2) showing that the principles of KMR can be applied to the interpretation of KDC; (3) developing a broader class of KDC statistics, where the class members are statistics corresponding to different kernel combinations. Finally, we perform simulation studies and an analysis of real data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. The ADNI study suggests that SNPs of FLJ16124 exhibit pairwise interaction effects that are strongly correlated with the changes of brain region volumes.</p>
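<p>For readers unfamiliar with HSIC, one member of the KDC class named above, the following sketch computes a common biased empirical form, the sum of elementwise products of the two centered Gram matrices divided by n squared (normalization conventions vary). It assumes scalar observations and a Gaussian kernel with a user-chosen bandwidth, and it does not include the covariate adjustment or the KMR score test discussed in the article. Function names are illustrative.</p>

```python
import math


def gaussian_gram(xs, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel on scalar observations."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
            for a in xs]


def center(gram):
    """Double-centre a square Gram matrix, i.e. compute H K H with H = I - 11'/n."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC between two paired scalar samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    kc = center(gaussian_gram(xs, bandwidth))
    lc = center(gaussian_gram(ys, bandwidth))
    # Sum of elementwise products equals trace(K H L H) for symmetric kernels.
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / (n * n)
```

<p>A multidimensional trait would replace the squared scalar difference with a squared Euclidean distance; the centering and trace steps stay the same. In practice the statistic is compared against a permutation or asymptotic reference distribution rather than interpreted on its own.</p>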
Wen-Yu Hua et al.Thu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/67/ArticlesKernel approaches for differential expression analysis of mass spectrometry-based metabolomics datahttps://works.bepress.com/debashis_ghosh/72/<p>BACKGROUND: Data generated from metabolomics experiments are different from other types of "-omics" data. For example, a common phenomenon in mass spectrometry (MS)-based metabolomics data is that the data matrix frequently contains missing values, which complicates some quantitative analyses. One way to tackle this problem is to treat a missing value as indicating that the metabolite is absent. Hence there are two types of information that are available in metabolomics data: presence/absence of a metabolite and a quantitative value of the abundance level of a metabolite if it is present. Combining these two layers of information poses challenges to the application of traditional statistical approaches in differential expression analysis.</p>
<p>RESULTS: In this article, we propose a novel kernel-based score test for metabolomics differential expression analysis. To capture both the continuous and the discrete patterns in metabolomics data simultaneously, two new kinds of kernels are designed: a distance-based kernel and a stratified kernel. While we initially describe the procedures in the case of single-metabolite analysis, we extend the methods to handle metabolite sets as well.</p>
<p>CONCLUSIONS: Evaluation based on both simulated data and real data from a liver cancer metabolomics study indicates that our kernel method performs better than some existing alternatives. An implementation of the proposed kernel method in the R statistical computing environment is available at http://works.bepress.com/debashis_ghosh/60/</p>
Debashis Ghosh et al.Thu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/72/ArticlesMatching methods for biomarker evaluation: a mapping with causal inferencehttps://works.bepress.com/debashis_ghosh/74/<p>In many medical settings, there is interest in evaluating the predictive ability of a candidate biomarker while adjusting appropriately for confounding factors. Recently, Janes and Pepe (2008, Biometrics 64: 1-9) evaluated the effects of matching on classification accuracy for biomarkers. In this article, we note an analogy between the use of matching in causal inference and its role in the biomarker evaluation problem. This allows us to import much of the literature on matching from causal inferential settings to the biomarker evaluation problem. It also yields a theoretical characterization of the bias properties of matching using a modification of the concept of equal percent bias reduction that has been previously developed in the literature. In addition, we can develop an approach to matching with multiple confounders using a 'reverse propensity score.' Assumptions relevant to proper causal inference are adapted to the biomarker problem, and various tree-based regression modelling diagnostics are developed. The methodology is illustrated with application to evaluating the role of mitotic rate for the discrimination of sentinel lymph node (SLN) positive cases in melanoma.</p>
Debashis Ghosh et al.Thu, 01 Jan 2015 00:00:00 +0000https://works.bepress.com/debashis_ghosh/74/Unpublished Papers