Our lab is interested in developing computational and statistical approaches to analyze large-scale genomics data and in identifying the genetic basis of neurobehavioral disorders such as bipolar disorder, Tourette syndrome, and schizophrenia. To achieve these goals, we take a multidisciplinary approach that combines statistics, genetics, computer science, bioinformatics, and psychiatry. We collaborate extensively with researchers at UCLA and at other institutions such as Harvard Medical School, the Broad Institute, and Asan Medical Center in Korea. More information on our past research projects can be found below and on the Publication page.
Correcting for population structure in GWAS
- We developed an efficient linear mixed model (LMM) approach called EMMAX that removes the effect of population structure more accurately than other approaches. (Kang et al., 2010)
- We extended EMMAX to account for genomic regions under selection by using multiple variance component models. (Sul and Eskin, 2013)
- We developed an LMM approach to correct for population structure in gene-by-environment GWAS. (Sul et al., In Press)
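LMM approaches of this kind typically model the phenotype as y = Xb + u + e, where the random effect u has covariance proportional to a relatedness (kinship) matrix K estimated from genotypes. As a minimal sketch of that one ingredient, the Python below builds a standardized genetic relationship matrix from 0/1/2 genotype counts. The function name and the standardization are illustrative assumptions, not necessarily the kinship estimator used by EMMAX.

```python
import math


def relationship_matrix(genotypes):
    """Build a genetic relationship matrix K = Z Z^T / m.

    genotypes: list of rows (one per individual) of minor-allele counts 0/1/2.
    Each SNP is centered by 2p and scaled by sqrt(2p(1-p)), where p is the
    sample allele frequency. Monomorphic SNPs carry no information and are
    skipped. This is a generic construction used only for illustration.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    standardized = [[] for _ in range(n)]
    for j in range(n_snps):
        column = [row[j] for row in genotypes]
        freq = sum(column) / (2.0 * n)
        if freq <= 0.0 or freq >= 1.0:
            continue  # monomorphic SNP: variance is zero, cannot be scaled
        scale = math.sqrt(2.0 * freq * (1.0 - freq))
        for i in range(n):
            standardized[i].append((column[i] - 2.0 * freq) / scale)
    used = len(standardized[0])
    if used == 0:
        raise ValueError("no polymorphic SNPs available")
    return [
        [
            sum(a * b for a, b in zip(standardized[i], standardized[k])) / used
            for k in range(n)
        ]
        for i in range(n)
    ]


# Toy data (hypothetical): individuals 0 and 1 share genotypes, individual 2 differs.
toy_genotypes = [
    [0, 2, 1, 0],
    [0, 2, 1, 0],
    [2, 0, 1, 0],
]
kinship = relationship_matrix(toy_genotypes)
```

In this toy example the two identical individuals receive a positive relatedness value and the dissimilar pair a negative one. In an LMM, a matrix like this would define the covariance of the random effect that absorbs population structure.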
Effective and efficient analysis of eQTL data
- We developed a statistical approach called Meta-Tissue that uses meta-analysis to combine information from multiple tissues and to detect more eQTLs than a standard approach. (Sul et al., 2013)
- We developed a fast multiple testing correction method based on multivariate normal sampling to efficiently detect genes in which genetic variants have effects. (Sul et al., 2015)
- We developed a method to correctly identify regulatory hotspots, which are genetic variants that regulate the expression of many genes. (Joo et al., 2014)
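The general idea behind multiple testing correction by multivariate normal sampling can be sketched as follows: under the null hypothesis, test statistics at correlated variants are drawn from a multivariate normal distribution whose covariance is the correlation between variants, and the adjusted p-value is the fraction of draws whose largest absolute statistic reaches the observed one. The Python below is a minimal sketch under those assumptions. The function names, the toy correlation matrices, and the sample count are hypothetical, and the published method includes refinements that are not shown here.

```python
import math
import random


def cholesky(matrix):
    """Return the lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def mvn_adjusted_pvalue(observed_max_abs_z, correlation, n_samples=10000, seed=0):
    """Estimate P(max |z| >= observed) under the null by sampling z ~ N(0, correlation)."""
    rng = random.Random(seed)
    lower = cholesky(correlation)
    n = len(correlation)
    exceed = 0
    for _ in range(n_samples):
        independent = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Multiplying by the Cholesky factor induces the target correlation.
        correlated = [
            sum(lower[i][k] * independent[k] for k in range(i + 1))
            for i in range(n)
        ]
        if max(abs(z) for z in correlated) >= observed_max_abs_z:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_samples + 1)


# Two variants, first independent and then strongly correlated (toy values).
p_independent = mvn_adjusted_pvalue(1.96, [[1.0, 0.0], [0.0, 1.0]])
p_correlated = mvn_adjusted_pvalue(1.96, [[1.0, 0.9], [0.9, 1.0]])
```

With correlated variants the effective number of tests is smaller, so the adjusted p-value is less conservative than a correction that treats the tests as independent.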
Identifying association of rare variants from sequencing data
- We developed methods called RWAS and OWAS that achieve higher power than other approaches by assigning optimal weights to rare variants. (Sul et al., 2011)
- We developed another method that infers causal variants from genomic data and functional information, which further increases statistical power. (Sul et al., 2011)
- We developed a method that incorporates data from low-coverage sequencing, which is especially important for studies that aim to sequence many individuals on a limited budget. (Navon et al., 2013)
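To illustrate what weighting rare variants in a groupwise test means, the Python below sketches a generic weighted burden test: each individual's rare-allele counts in a region are collapsed into one weighted score, and scores are compared between cases and controls with a one-sided permutation test. The inverse-frequency weights, function names, and toy data are assumptions made for illustration. They are not the optimal weights derived in RWAS or OWAS.

```python
import math
import random


def frequency_weights(genotypes):
    """Give rarer variants larger weights: 1 / sqrt(p(1-p)), with add-one smoothing."""
    n = len(genotypes)
    weights = []
    for j in range(len(genotypes[0])):
        # Smoothing keeps the frequency strictly between 0 and 1.
        freq = (sum(row[j] for row in genotypes) + 1) / (2.0 * n + 2.0)
        weights.append(1.0 / math.sqrt(freq * (1.0 - freq)))
    return weights


def burden_scores(genotypes, weights):
    """Collapse each individual's allele counts into one weighted score."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def permutation_pvalue(scores, is_case, n_permutations=2000, seed=0):
    """One-sided test: do cases have a higher mean burden score than controls?"""

    def mean_difference(labels):
        case = [s for s, c in zip(scores, labels) if c]
        control = [s for s, c in zip(scores, labels) if not c]
        return sum(case) / len(case) - sum(control) / len(control)

    rng = random.Random(seed)
    observed = mean_difference(is_case)
    labels = list(is_case)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any link between phenotype and genotype
        if mean_difference(labels) >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)


# Toy data (hypothetical): six cases carrying rare alleles, six controls without.
toy_genotypes = [[1, 0], [0, 1], [1, 1], [1, 0], [0, 1], [1, 0]] + [[0, 0]] * 6
toy_is_case = [True] * 6 + [False] * 6
toy_weights = frequency_weights(toy_genotypes)
toy_pvalue = permutation_pvalue(burden_scores(toy_genotypes, toy_weights), toy_is_case)
```

Aggregating variants this way trades per-variant resolution for power, which is why the choice of weights matters so much in rare variant studies.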
Identifying genetic basis of human complex traits with GWAS
- We performed large-scale genotype imputation for several GWAS, inferring genotypes at genetic markers that were not directly genotyped, and parallelized the imputation on high-performance computing clusters.
- We performed genetic association analyses to identify genetic variants that affect a phenotype, using methods that we developed or other state-of-the-art approaches.
- We performed quality control on several GWAS datasets to remove individuals and genetic variants with genotyping errors and to ensure that association analyses do not produce false associations due to technical artifacts. (Stein et al., 2010; Miller et al., 2012; Rietveld et al., 2013; Luykx et al., 2015; Luykx et al., 2014; Plongthongkum et al., 2014; Kim et al., 2014)
References
- Kang HM, Sul JH, Service SK, Zaitlen NA, Kong S-Y, Freimer NB, et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 2010 Apr;42(4):348–54.
- Sul JH, Eskin E. Mixed models can correct for population structure for genomic regions under selection. Nat Rev Genet. 2013 Apr;14(4):300.
- Sul JH, Bilow M, Yang W-Y, Kostem E, Furlotte N, He D, et al. Accounting for population structure in gene-by-environment interactions in genome-wide association studies using mixed models. PLoS Genet. In press.
- Sul JH, Han B, Ye C, Choi T, Eskin E. Effectively identifying eQTLs from multiple tissues by combining mixed model and meta-analytic approaches. PLoS Genet. 2013 Jun;9(6):e1003491.
- Sul JH, Raj T, de Jong S, de Bakker PIW, Raychaudhuri S, Ophoff RA, et al. Accurate and fast multiple-testing correction in eQTL studies. Am J Hum Genet. 2015 Jun 4;96(6):857–68.
- Joo JWJ, Sul JH, Han B, Ye C, Eskin E. Effectively identifying regulatory hotspots while capturing expression heterogeneity in gene expression studies. Genome Biol. 2014;15(4):r61.
- Sul JH, Han B, He D, Eskin E. An optimal weighted aggregated association test for identification of rare variants involved in common diseases. Genetics. 2011 May;188(1):181–8.
- Sul JH, Han B, Eskin E. Increasing power of groupwise association test with likelihood ratio test. J Comput Biol. 2011 Nov;18(11):1611–24.
- Navon O, Sul JH, Han B, Conde L, Bracci PM, Riby J, et al. Rare variant association testing under low-coverage sequencing. Genetics. 2013 Jul;194(3):769–79.
- Stein JL, Hua X, Morra JH, Lee S, Hibar DP, Ho AJ, et al. Genome-wide analysis reveals novel genes influencing temporal lobe structure with relevance to neurodegeneration in Alzheimer’s disease. Neuroimage. 2010 Jun;51(2):542–54.
- Miller MB, Basu S, Cunningham J, Eskin E, Malone SM, Oetting WS, et al. The Minnesota Center for Twin and Family Research genome-wide association study. Twin Res Hum Genet. 2012 Dec;15(6):767–74.
- Rietveld CA, Medland SE, Derringer J, Yang J, Esko T, Martin NW, et al. GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science. 2013 Jun 21;340(6139):1467–71.
- Luykx JJ, Bakker SC, Visser WF, Verhoeven-Duif N, Buizer-Voskamp JE, den Heijer JM, et al. Genome-wide association study of NMDA receptor coagonists in human cerebrospinal fluid and plasma. Mol Psychiatry. 2015 Dec;20(12):1557–64.
- Luykx JJ, Bakker SC, Lentjes E, Neeleman M, Strengman E, Mentink L, et al. Genome-wide association study of monoamine metabolite levels in human cerebrospinal fluid. Mol Psychiatry. 2014 Feb;19(2):228–34.
- Plongthongkum N, van Eijk KR, de Jong S, Wang T, Sul JH, Boks MPM, et al. Characterization of genome-methylome interactions in 22 nuclear pedigrees. PLoS ONE. 2014;9(7):e99313.
- Kim J-H, Cheong HS, Sul JH, Seo J-M, Kim D-Y, Oh J-T, et al. A genome-wide association study identifies potential susceptibility loci for Hirschsprung disease. PLoS ONE. 2014;9(10):e110292.