Research Interests

Genetic Regulation of the Antioxidant Enzymes

My initial studies of genetic regulation at the biochemical level were made on the enzymes superoxide dismutase (CuZn and Mn SOD; Schisler and Singh, 1985), catalase (CAT; Schisler and Singh, 1987), and glutathione peroxidase (GSH-Px; Schisler and Singh, 1988) in inbred mouse lines under the auspices of Professor Shiva Singh at the University of Western Ontario in London, Ontario, Canada. This enzymatic pathway is responsible for the detoxification of reactive oxygen species (ROS), which are known to damage many sub-cellular structures including DNA (Sies and Cadenas, 1983). I began this study based on observations by collaborators that inbred mouse lines exhibited different mutagen-induced levels of DNA damage, as assessed by increased levels of sister chromatid exchange and micronucleus formation (Reimer and Singh, 1982). Most importantly, all of the inbred mouse lines had different baseline levels of chromosome damage. It was thought that differential activity of the above enzymes might be responsible for this effect (Gutierrez and Lopez-Saez, 1982). Biochemical assays indicated that the expression of the enzymes was tissue-specific (in erythrocyte lysates, kidney, liver, and lung) and developmentally regulated in inbred mouse lines. In general, enzyme activities were found to increase with age and were highest in metabolically active tissues (e.g., liver). Factor analysis of enzyme variation established that CuZn and Mn SOD were influenced largely by a single determinant, and that the activities of CAT and GSH-Px were regulated by a different factor (Bannister and Bannister, 1987; publication in preparation). Such results were compatible with substrate-dependent regulation of the enzymes, perhaps by superoxide (the CuZn and Mn SOD substrate) or hydrogen peroxide (the CAT and GSH-Px substrate).
Furthermore, positive correlations between CuZn SOD activity in the liver, kidney, and lung and micronucleus formation in young and old mice were identified, indicating a possible protective role of this enzyme in preventing DNA damage during aging.

Administration of xenobiotics such as ethanol (Schisler and Singh, 1989), which may also produce ROS in vivo, was found to modify the expression of these enzymes in a strain-specific manner, possibly indicating the existence of regulation that is responsive to ethanol or to an ethanol metabolite. Quantitative genetic analysis of CAT activity phenotypes (among the most variable of the above enzymes and thus potentially the most informative in a study of genetic regulation) in a number of genetic crosses allowed testing of mathematical models of inheritance (Schisler and Singh, 1991). These studies indicated that multiple genetic factors/loci must govern the tissue-specific expression of CAT in mice. One inbred mouse strain that harbored an acatalasemic mutation (low expression of CAT protein and enzyme activity in most tissues but normal RNA levels) was particularly informative in these studies (Reimer and Singh, 1990); however, subsequent sequencing of this CAT gene has revealed only a single mutation, which probably does not affect the enzyme activity phenotype (Reimer et al., 1994). Ongoing molecular studies are attempting to characterize other loci that may affect the CAT acatalasemic phenotype by analyzing RNA expression after manipulation of the CAT gene's 5' and 3' untranslated DNA sequences (Reimer and Singh, 1996). Collaborative studies are underway with Professor Singh that involve the sequencing and analysis of CAT introns to determine if regulatory elements exist within these sequences.

Molecular Evolution of the Antioxidant Enzymes

Recent phylogenetic reconstruction of amino acid sequence data for each of the above enzymes demonstrated the expected relationships among species (student publications in preparation). Quantitative representations (i.e., sequence profiles) of the same sequences were then used to search the major protein databases for conserved domains such as enzyme active sites (Gribskov et al., 1987). Several hundred distantly related proteins were identified, which indicated the antiquity and potential recombinatorial properties of protein domains from these enzymes (publication in preparation). At the DNA level, putative regulatory elements that may have roles in gene expression (Wingender et al., 2001) were identified in the intron sequences of some of these genes (student publication in preparation). In addition, comparative analysis of synonymous and non-synonymous nucleotide substitutions in the coding regions indicated that CuZn SOD may be under selective evolutionary pressure separate from that acting on Mn SOD, CAT, and GSH-Px, which is interesting given the results of the factor analysis described above.

The human CuZn SOD gene was also analyzed to determine how familial amyotrophic lateral sclerosis (FALS)-associated mutations are related to substitutions in the CuZn SOD genes of other species (Deng et al., 1993). Motifs that harbor mutations in humans or substituted sites in other species were compared and used to probe DNA databases in an attempt to identify related sequences that could undergo similar mutational changes during the course of evolution. Preliminary results show that DNA substitutions at particular sites that manifest as FALS mutations in humans may also play a significant role in the evolution and divergence of the CuZn SOD gene (publication in preparation). This comparative phylogenetic approach to evaluating disease-causing mutations using similar genes in related species has yielded important insights into disease processes, and it will become more important as additional eukaryotic and eubacterial genomes are sequenced (Thornton and DeSalle, 2000).

Bioinformatic Database Construction

There are approximately 20,000-25,000 protein-encoding genes in the human genome (She et al., 2004), quite a difference from the 30,000-120,000 previously proposed (Harrison et al., 2002), yet even the earlier estimates account for only about 1.4 percent of the approximately 3,200 Mb of DNA contained therein (Szymanski and Barciszewski, 2002). The function of the remaining 98.6 percent of the human genome is not well characterized, although it is known to harbor many repetitive elements, pseudogenes, retroviruses, and introns as well as regulatory sequences (Lander et al., 2001). The rapidity with which genomes are being sequenced (Pennisi, 2002) has created a bottleneck in the informational and analytical infrastructure; indeed, we are falling far behind in our ability to organize and analyze sequence information (Benson et al., 2002). My postdoctoral fellowship in Professor Jeffrey Palmer's lab at Indiana University, Bloomington was spent developing specialized databases and algorithms that could be adapted or expanded in an attempt to organize increasing amounts of non-coding genomic data.

I chose to develop prototype genomic databases using intron sequences from multiple species since (i) excellent, curated, non-redundant amino acid sequence databases such as Swiss-PROT already exist (Bairoch and Apweiler, 2000) and could serve as a starting point for this study; (ii) the presence of introns within protein-encoding genes means that the sequences of many introns have already been determined; (iii) many introns are associated with multi-gene families, which also are well documented in prominent databases (Bateman et al., 2002; Falquet et al., 2002) that can furnish a large pool of related sequences for comparative analysis; and (iv) some intron sequences are apparently conserved among different species and thus may have an uncharacterized functional significance. A non-redundant database of nuclear, protein-encoding, genomic DNA sequences highlighting nuclear pre-mRNA introns was therefore constructed using information contained in the Swiss-PROT and GenBank sequence databases (Schisler and Palmer, 2000). This Intron DataBase (IDB) contains information about (i) introns (including nucleotide sequence, location, phase, length, GC content, and consensus-sequence rule violations); (ii) exons (including nucleotide sequence, length, and GC content); (iii) protein-coding regions (including amino acid sequence and length); and (iv) descriptive information about the source gene and organism. The database is also extensively cross-referenced to Swiss-PROT, GenBank, and a variety of other databases. In collaboration with Arlin Stoltzfus, Assistant Professor at the University of Maryland Biotechnology Institute, I hold a National Institutes of Health R01 Grant, Number LM007218-01A1, "Intron Evolution: automated phylogenetic analysis system," valued at $552,000 US over three years (July 1, 2002 - June 30, 2005). This funding will enhance IDB as well as address some fundamental issues concerning the origin and evolution of introns (see below).
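
The per-intron fields listed above can be illustrated with a minimal Python sketch. It is not the IDB's actual schema or code: the record layout, the function names, and the convention of passing the number of coding bases upstream of the intron are all assumptions made for illustration. Phase is taken in the usual sense (0 if the intron falls between codons, 1 or 2 if it falls after the first or second base of a codon), and the consensus check covers only the common GT...AG boundary rule.

```python
from dataclasses import dataclass


@dataclass
class IntronRecord:
    """Hypothetical per-intron record; field names are illustrative only."""
    sequence: str
    location: int          # coding bases upstream of the intron (assumed convention)
    phase: int             # 0, 1, or 2
    length: int
    gc_content: float      # fraction of G or C bases, 0.0 to 1.0
    violates_gt_ag: bool   # True if the intron does not start with GT and end with AG


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def describe_intron(intron_seq: str, coding_bases_before: int) -> IntronRecord:
    """Derive the simple descriptive fields for one intron."""
    seq = intron_seq.upper()
    return IntronRecord(
        sequence=seq,
        location=coding_bases_before,
        phase=coding_bases_before % 3,
        length=len(seq),
        gc_content=gc_content(seq),
        violates_gt_ag=not (seq.startswith("GT") and seq.endswith("AG")),
    )


if __name__ == "__main__":
    # Made-up 12-base intron interrupting a codon after its first base.
    print(describe_intron("gtaagtccttag", coding_bases_before=4))
```

Storing derived values such as phase and GC content alongside the sequence, rather than recomputing them on demand, is one plausible design for a database intended to be searched on those values; whether the IDB does so is not stated here.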

The 2005 edition of IDB contains information on over 250,000 genes and 500,000 introns. It is my plan to extend the structure and content of this database to include 5' and 3' UTRs and upstream and downstream regions of protein-encoding genes, other defined sequences such as repeated sequences and RNA genes, and currently undefined intergenic sequences. Due to its non-redundant nature, this database also could be used as a primary source for all sequence information related to a given gene, chromosome, or genome. A collaboration with Professor Richard McCaman of Cal. State Fullerton, initiated last year, was recently expanded to include Professors Tae Ryu, Wan-Ying Chang, and Douglas Eernisse, also affiliated with Cal. State Fullerton, to continue development of this genomic database system as well as to address fundamental issues concerning the characterization and evolution of non-coding, non-intron DNA sequences in eukaryotic genomes. As part of this process, we hope to enhance the teaching of genomics and bioinformatics as well as expose undergraduate and graduate student researchers to problems surrounding biological database design, experimental data integration, and data interpretation.

Origin and Evolution of Spliceosomal Introns

The real power of the IDB and similar databases lies not only in their comprehensive, non-redundant contents but also in the relative ease with which information that is useful from a phylogenetic or medical perspective may be located, extracted, and analyzed. To illustrate this point, an Intron Evolution DataBase (IEDB), which summarizes all intron information from the 3,000+ species present in the IDB, was constructed. The IEDB provides a statistical analysis of all exon and intron sequences as well as data concerning intron penetration (relative number of coding regions with introns), density (number of introns per kb of total coding sequence DNA), distribution, and consensus sequences for each species present in the IDB. Such data can be expected to furnish many insights into intron evolution.
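
As a rough illustration of the penetration and density statistics defined in the paragraph above, the following Python sketch computes both for a single species. It is not the IEDB's own code: the function names and the input layout (one intron count and one coding-sequence length per gene) are assumptions, and the example numbers are invented.

```python
from typing import Sequence


def intron_penetration(intron_counts: Sequence[int]) -> float:
    """Fraction of coding regions that contain at least one intron."""
    if not intron_counts:
        return 0.0
    with_introns = sum(1 for n in intron_counts if n > 0)
    return with_introns / len(intron_counts)


def intron_density(intron_counts: Sequence[int], cds_lengths_bp: Sequence[int]) -> float:
    """Number of introns per kb of total coding sequence."""
    if len(intron_counts) != len(cds_lengths_bp):
        raise ValueError("one intron count and one CDS length are needed per gene")
    total_bp = sum(cds_lengths_bp)
    if total_bp == 0:
        return 0.0
    return sum(intron_counts) / (total_bp / 1000.0)


if __name__ == "__main__":
    # Invented data for four genes of one hypothetical species.
    counts = [0, 2, 5, 0]
    lengths = [900, 1200, 3000, 900]
    print("penetration:", intron_penetration(counts))
    print("density (introns per kb):", intron_density(counts, lengths))
```

Penetration treats every coding region equally regardless of size, whereas density normalizes by sequence length, so the two can diverge for species with a few long, intron-rich genes.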

The Student

Given that the field of bioinformatics is interdisciplinary by definition, students with backgrounds in biology, molecular biology, mathematics, or computer science are welcome to pursue research projects in my laboratory. Students with a programming background have made significant contributions, including the creation of Perl scripts to parse BLAST search results and new algorithms to extract data from numerous online databases. Students in mathematics have contributed their knowledge of statistics to analyze trends in large genome datasets, and those with a background in biology have engaged in testing some of the hypotheses outlined above. Furman's biology department has recently acquired an Apple Workgroup Cluster for Bioinformatics (16 processor "supercomputer") as well as a Beckman CEQ8000 Genetic Analysis System (sequencing and fragment analysis) to invigorate both bioinformatics and biotechnology research; students are welcome to use this equipment in their own research projects under my direction. Regardless of their specialization, all students who pass through my laboratory are expected to become familiar with the tools of bioinformatics and their application in the areas of biological database information retrieval, functional domain recognition, gene finding, inference of evolutionary relationships among genes and proteins, and modeling structure from sequence. Student projects do not necessarily involve data generation or computer programming; rather, students are encouraged to analyze data from various genome and sequencing projects with the objective of producing publication-quality results that address fundamental problems in genome evolution and genetics. Collaboration is encouraged, and students have the opportunity to discuss their results with graduate students and faculty at Cal State Fullerton and the Center for Advanced Research in Biotechnology (part of the National Institute of Standards and Technology) in Washington, D.C.
I have also been involved with the internship program at IUP Génie Physiologique et Informatique, Université de Poitiers, France, as well as at the Greenwood Genetics Center. Students also have the opportunity to present their results at international bioinformatics meetings such as those hosted by the International Society for Computational Biology (ISCB).

Literature Cited

Bairoch, A. and Apweiler, R. 2000. "The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000." Nucleic Acids Res. 28(1): 45-8.

Bannister, W. H. and Bannister, J. V. 1987. "Factor analysis of the activities of superoxide dismutase, catalase and glutathione peroxidase in normal tissues and neoplastic cell lines." Free Radic. Res. Commun. 4(1): 1-13.

Bateman, A., Birney, E. et al. 2002. "The Pfam protein families database." Nucleic Acids Res. 30(1): 276-80.

Benson, D. A., Karsch-Mizrachi, I. et al. 2002. "GenBank." Nucleic Acids Res. 30(1): 17-20.

Castillo-Davis, C. I., Mekhedov, S. L. et al. 2002. "Selection for short introns in highly expressed genes." Nat. Genet. 31(4): 415-8.

Cho, G. and Doolittle, R. F. 1997. "Intron distribution in ancient paralogs supports random insertion and not random loss." J. Mol. Evol. 44(6): 573-84.

Deng, H. X., Hentati, A. et al. 1993. "Amyotrophic lateral sclerosis and structural defects in Cu,Zn superoxide dismutase." Science 261(5124): 1047-1051.

Falquet, L., Pagni, M. et al. 2002. "The PROSITE database, its status in 2002." Nucleic Acids Res. 30(1): 235-8.

Gentles, A. J. and Karlin, S. 1999. "Why are human G-protein-coupled receptors predominantly intronless?" Trends Genet. 15(2): 47-9.

Gribskov, M., McLachlan, A. D. et al. 1987. "Profile analysis: detection of distantly related proteins." Proc. Natl. Acad. Sci. USA. 84: 4355-4358.

Gutierrez, C. and Lopez-Saez, J. F. 1982. "Oxygen dependence of sister chromatid exchanges." Mutat. Res. 103: 295-302.

Hankeln, T., Friedl, H. et al. 1997. "A variable intron distribution in globin genes of Chironomus: evidence for recent intron gain." Gene 205(1-2): 151-60.

Harrison, P. M., Kumar, A. et al. 2002. "A question of size: the eukaryotic proteome and the problems in defining it." Nucleic Acids Res. 30(5): 1083-90.

Jaworski, C., Sperbeck, S. et al. 1997. "Alternative splicing of Pax6 in bovine eye and evolutionary conservation of intron sequences." Biochem. Biophys. Res. Commun. 240(1): 196-202.

Lander, E. S., Linton, L. M. et al. 2001. "Initial sequencing and analysis of the human genome." Nature 409(6822): 860-921.

Logsdon, J. M., Jr. 1998. "The recent origins of spliceosomal introns revisited." Curr. Opin. Genet. Dev. 8(6): 637-48.

Long, M. and Rosenberg, C. 2000. "Testing the "proto-splice sites" model of intron origin: evidence from analysis of intron phase correlations." Mol. Biol. Evol. 17(12): 1789-96.

McLysaght, A., Enright, A. J. et al. 2000. "Estimation of synteny conservation and genome compaction between pufferfish (Fugu) and human." Yeast 17(1): 22-36.

Pennisi, E. 2002. "Sequencing. Chimps and fungi make genome." Science 296(5573): 1589-91.

Qiu, W. G., Schisler, N. and Stoltzfus, A. 2004. "The evolutionary gain of spliceosomal introns: sequence and phase preferences." Mol. Biol. Evol. 21(7): 1252-63. Erratum in: Mol. Biol. Evol. 2004. 21(9): 1812.

Reimer, D. L., Bailley, J. et al. 1994. "Complete cDNA and 5' genomic sequences and multilevel regulation of the mouse catalase gene." Genomics 21(2): 325-336.

Reimer, D. L. and Singh, S. M. 1982. "Cyclophosphamide induced in vivo sister chromatid exchanges (SCE) in Mus musculus. I. Strain differences and empirical association with relative chromosome size." Can. J. Genet. Cytol. 24: 521-527.

Reimer, D. L. and Singh, S. M. 1990. "In situ hybridization studies on murine catalase mRNA expression during embryonic development." Dev. Genet. 11: 318-325.

Reimer, D. L. and Singh, S. M. 1996. "Distinct mRNA-binding proteins interacting with short repeat sequences of the 3' UTR may be involved in the post-transcriptional regulation of the mouse catalase gene, Cas-1." DNA Cell Biol. 15(4): 317-328.

She et al. 2004. "Shotgun sequence assembly and recent segmental duplications within the human genome." Nature 431: 927-930.

Schisler, N. J. and Palmer, J. D. 2000. "The IDB and IEDB: intron sequence and evolution databases." Nucleic Acids Res. 28(1): 181-4.

Schisler, N. J. and Singh, S. M. 1985. "Tissue-specific developmental regulation of superoxide dismutase (SOD-1 and SOD-2) activities in genetic strains of mice." Biochem. Genet. 23(3-4): 291-308.

Schisler, N. J. and Singh, S. M. 1987. "Inheritance and expression of tissue-specific catalase activity during development and aging in mice." Genome 29(5): 748-760.

Schisler, N. J. and Singh, S. M. 1988. "Modulation of selenium-dependent glutathione peroxidase (Se-GSH-Px) activity in mice." Free Radic. Biol. Med. 4(3): 147-153.

Schisler, N. J. and Singh, S. M. 1989. "Effect of ethanol in vivo on enzymes which detoxify oxygen free radicals." Free Radic. Biol. Med. 7: 117-123.

Schisler, N. J. and Singh, S. M. 1991. "A quantitative genetic analysis of tissue-specific catalase activity in Mus musculus." Biochem. Genet. 29(1-2): 65-89.

Sies, H. and Cadenas, E. 1983. "Biological basis of detoxification of oxygen free radicals." In: Biological Basis of Detoxification. J. Caldwell and W. B. Jakoby (eds.). New York, Academic Press: 181-211.

Szymanski, M. and Barciszewski, J. 2002. "Beyond the proteome: non-coding regulatory RNAs." Genome Biol. 3(5): S0005.

Thornton, J. W. and DeSalle, R. 2000. "Gene family evolution and homology: genomics meets phylogenetics." Ann. Rev. Genomics Hum. Genet. 1: 41-73.

Wingender, E., Chen, X. et al. 2001. "The TRANSFAC system on gene expression regulation." Nucleic Acids Res. 29(1): 281-3.

Copyright © 1997-2005 Nicholas Schisler, PhD
Last modified: June 10, 2005