
Nilotpal Sanyal

Assistant Professor
Assistant Director, Data Analytics Lab

Department of Mathematical Sciences
University of Texas at El Paso
500 W University Ave
El Paso, TX 79968-0514
Office: Bell Hall 328
Phone: (915)747-6763
E-mail: nsanyal@utep.edu
Personal Webpage


I am an Assistant Professor in the Department of Mathematical Sciences at the University of Texas at El Paso.

I obtained a PhD in Statistics from the University of Missouri-Columbia with a dissertation on Bayesian functional magnetic resonance imaging (fMRI) data analysis and Bayesian optimal design. I then held postdoctoral research positions at Stanford University, the University of California-San Diego, and Texas A&M University, working on biological data applications.

My research interests are Bayesian statistics, high-dimensional variable selection, nonparametric regression, statistical genetics, computational neuroscience, and survival analysis.

I am truly passionate about teaching and have immense respect for the value of good teaching and good mentoring.




Research

Overview

My current research interests are Bayesian statistics, high-dimensional variable selection, nonparametric regression, statistical genetics, and survival analysis. I work in both the Bayesian and the frequentist frameworks. I enjoy developing computationally efficient statistical/machine learning methods and software. My research has found applications in omics, epidemiology, public health, and neuroscience.

Specific research areas

  • High-dimensional variable selection and inference methods. Such methods are extremely useful for sparse data that contain a large number of variables/features (e.g., public health data), often much larger than the number of observations (e.g., GWAS data, or gene expression data), where only a few features have significant effects.
  • Multiscalar methods. Such methods are useful for data that may contain information at multiple scales or resolution levels (say, image data or areal data) by virtue of the implicit multiscalar nature of the process (say, fMRI brain activation) and/or availability of information at multiple scales (say, time series data).
  • Survival data methods in the presence of competing risks. Such methods help to correctly predict the risk of an event (say, death) due to the primary cause of interest (say, cardiovascular disease) by accounting for other causes (say, an accident) that may lead to the same event. [Note that if we simply exclude from the sample the persons who die from an accident, we lose the information that those persons did not die from cardiovascular disease up to the time of their accident. A competing risks model incorporates this information.]
  • Survival data methods in the presence of a cure fraction. Such methods explicitly account for possibly cured persons (say, long-time meditators) in the population who may never experience the event of interest (say, depression).
  • Gene by environment (GxE) interaction methods. Such methods help to understand how environmental factors (say, pollution) and lifestyle factors (say, smoking) may modify the effect of genetic factors on a trait or disease (say, lung cancer).
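
As a rough illustration of the competing-risks point in the list above, here is a minimal Python sketch. It uses a standard nonparametric (Aalen-Johansen-type) cumulative incidence estimator and a small made-up dataset; the function name, the cause coding (0 = censored, 1 = cardiovascular death, 2 = accident), and the data are all assumptions for illustration, not taken from any of my packages or papers.

```python
def cumulative_incidence(times, causes, cause=1):
    """Nonparametric cumulative incidence for one cause.

    times  : observed follow-up times
    causes : 0 = censored, positive integers = cause of the event
    Returns a dict mapping each event time to the estimated probability
    of having experienced the event from `cause` by that time.
    """
    records = list(zip(times, causes))
    event_times = sorted({t for t, c in records if c != 0})
    surv = 1.0   # probability of being free of ANY event just before t
    cif = 0.0    # running cumulative incidence for the cause of interest
    curve = {}
    for t in event_times:
        at_risk = sum(1 for s, _ in records if s >= t)
        d_cause = sum(1 for s, c in records if s == t and c == cause)
        d_any = sum(1 for s, c in records if s == t and c != 0)
        cif += surv * d_cause / at_risk
        surv *= 1 - d_any / at_risk
        curve[t] = cif
    return curve


# Hypothetical toy data: six persons, times in years.
times = [1, 2, 3, 4, 5, 6]
causes = [2, 1, 2, 1, 0, 1]

# Competing-risks estimate using everyone in the sample.
full = cumulative_incidence(times, causes, cause=1)

# Naive estimate after excluding the persons who died from an accident.
kept = [(t, c) for t, c in zip(times, causes) if c != 2]
dropped = cumulative_incidence([t for t, _ in kept], [c for _, c in kept], cause=1)

print(f"Risk by year 4, competing-risks estimate: {full[4]:.3f}")    # 0.333
print(f"Risk by year 4, accident deaths excluded: {dropped[4]:.3f}")  # 0.500
```

In this toy example, excluding the accident deaths inflates the estimated four-year risk of cardiovascular death from about 0.33 to 0.50, because the excluded persons no longer count among those who were followed without dying from cardiovascular disease.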

I have developed/co-developed the R software packages GWASinlps, CGEN, and BHMSMAfMRI based on my research. See the Software tab for more details about them.

Publications

- Google Scholar
- ResearchGate
- ORCiD

Software

Here are some software packages that I have developed/co-developed based on my research.

  • BHMSMAfMRI: This is an R software package that performs Bayesian hierarchical multi-subject multiscale analysis of functional MRI (fMRI) data, or other multiscale data, as described in Sanyal & Ferreira (2012). It uses a wavelet-based prior that borrows strength across subjects, and it provides posterior smooth estimates of the effect sizes and samples from their posterior distribution. Description and download instructions are available at the package webpage at https://nilotpalsanyal.github.io/BHMSMAfMRI/.
  • GWASinlps: This is an R software package that performs Bayesian non-local prior based iterative variable selection for data from genome-wide association studies (GWAS), or other high-dimensional data, as described in Sanyal et al. (2019). Description and download instructions are available at the package webpage at https://nilotpalsanyal.github.io/GWASinlps/.
  • CGEN: This is an R software package that analyzes case-control data in genetic epidemiology. It provides a set of statistical methods for evaluating gene x environment (or gene x gene) interactions under multiplicative and additive risk models (Sanyal et al., 2021; Rochemonteix et al., 2021), with or without assuming gene-environment (or gene-gene) independence in the underlying population. Description and download instructions are available at the package webpage at https://www.bioconductor.org/packages/release/bioc/html/CGEN.html. A tutorial for the additive gene x environment interaction tests under the trend effect of genotypes, proposed in the above references, is available at https://github.com/thehanlab/AdditiveGxEtrendtest.

SPLC-RAT: My past colleagues at Stanford University have developed this shiny app based on our joint work on the development and validation of the first risk prediction tool for second primary lung cancer that incorporates comprehensive risk factors including smoking information, medical history, treatment, and tumor characteristics using large population-based data. It is available at https://splc-risk-prediction.shinyapps.io/SPLC-RiskAssessmentTool/.

Univariate probability distribution viewer: A shiny app to visualize various univariate probability distributions. Feel free to use for non-commercial classroom teaching.

 

References:

Sanyal, N. and Ferreira, M.A.R. (2012). Bayesian hierarchical multi-subject multiscale analysis of functional MRI data. NeuroImage, 63(3), pp.1519-1531. doi:10.1016/j.neuroimage.2012.08.041.

Sanyal, N., Lo, M.T., Kauppi, K., Djurovic, S., Andreassen, O.A., Johnson, V.E. and Chen, C.H. (2019). GWASinlps: non-local prior based iterative SNP selection tool for genome-wide association studies. Bioinformatics, 35(1), pp.1-11. doi:10.1093/bioinformatics/bty472

Sanyal, N., Napolioni, V., de Rochemonteix, M., Belloy, M.E., Caporaso, N.E., Landi, M.T., Greicius, M.D., Chatterjee, N. and Han, S.S. (2021). A Robust Test for Additive Gene-Environment Interaction Under the Trend Effect of Genotype Using an Empirical Bayes-Type Shrinkage Estimator. American journal of epidemiology, 190(9), pp.1948-1960. doi:10.1093/aje/kwab124.

De Rochemonteix, M., Napolioni, V., Sanyal, N., Belloy, M.E., Caporaso, N.E., Landi, M.T., Greicius, M.D., Chatterjee, N. and Han, S.S. (2021). A likelihood ratio test for gene-environment interaction based on the trend effect of genotype under an additive risk model using the gene-environment independence assumption. American journal of epidemiology, 190(1), pp.129-141. doi:10.1093/aje/kwaa132.

Teaching

I ardently love to teach and have immense respect for the value of good teaching and good mentoring.

Current courses (Spring 2024)

  • STAT 6329 - Statistical Programming, UTEP
  • DS 6390 - DS Research Collaborative, UTEP

Past courses

  • STAT 6370 - Special Topics (Advanced Regression Analysis), UTEP
  • Statistical Data Analysis (with project supervision for 12 students), International Statistical Education Center, ISI, Kolkata, 2022-23.
  • Statistical Methods, International Statistical Education Center, ISI, Kolkata, 2022-23.
  • Descriptive Statistics, International Statistical Education Center, ISI, Kolkata, 2022-23.

Workshop teaching

  • Special Lecture on Survival Analysis, Maulana Azad College, Kolkata, April 2023.
  • R Sessions for CoxBoost modeling, Virtual workshop, Stanford University Quantitative Science Unit, January 2021.
  • Random Forest for Competing Risk Data, Virtual workshop, Stanford University Quantitative Science Unit, December 2020.
  • Predictive Modeling of Competing Risk Data Using Penalized Regression, Virtual workshop, Stanford University Quantitative Science Unit, November 2020.
  • Time Series Analysis, Winter School on Statistical Data Analysis Methods, Indian Statistical Institute, Kolkata, February 2015.
  • Introduction to R, Winter School on Statistical Data Analysis Methods, Indian Statistical Institute, Kolkata, February 2015.
  • Descriptive Statistics, Winter School on Statistical Data Analysis Methods, Indian Statistical Institute, Kolkata, February 2015.
  • Time Series Analysis, Short-term Course on Statistical Methods, Arya Vidyapeeth College, Guwahati, Assam, India, November 2014.
  • Introduction to R, Short-term Course on Statistical Methods, Arya Vidyapeeth College, Guwahati, Assam, India, November 2014.
  • Design of Experiments, Workshop on Techniques of Data Analysis, Dimapur Govt. College, Nagaland, India, September 2014.
  • Time Series Analysis, Workshop on Techniques of Data Analysis, Dimapur Govt. College, Nagaland, India, September 2014.
  • R for Time Series, Workshop on Techniques of Data Analysis, Dimapur Govt. College, Nagaland, India, September 2014.


Service


Learn

Here are some self-made precise guides for quick learning.

Others

Alongside academic research, I have a wide range of other interests. Feel free to explore some of them here, to comment, and to connect.