Abstracts

Abstracts for the posters

Distribution of DNA Curvature in Complete Prokaryotic Genomes

Alexander Bolshoy

Institute of Evolution, University of Haifa, ISRAEL

Co-author: Eviatar Nevo

Computer analysis of completely sequenced prokaryotic genomes is the key to understanding conserved patterns of gene regulation and will provide invaluable direction for future experimental research in the subject. We analyzed the distribution of predicted intrinsic curvature along all complete prokaryotic genomes. Sequence-dependent DNA curvature is known to play an important role in transcription initiation of many genes. Our analysis of the predicted curvature distribution showed that the genomes could be divided into two groups. Curvature distribution in all bacteria of the first group indicated a substantial fraction of genes characterized by intrinsic DNA curvature located within intergenic untranslated regions immediately upstream of the start of translation. We did not find this peculiar DNA curvature distribution in prokaryotes of the second group. Remarkably, all bacteria of the first group are mesophilic, whereas almost all prokaryotes of the second group are hyperthermophilic, both Eubacteria and Archaea. It has been shown in vitro that intrinsic DNA curvature disappears as temperature rises. We hypothesize that DNA curvature plays a biological role in gene regulation in mesophilic, as opposed to hyperthermophilic, prokaryotes, i.e. DNA curvature presumably has a functional adaptive significance determined by temperature selection. As more complete prokaryotic genomes are sequenced, further verification of this finding will pave the way for future Ecological Genomic Studies.

References:

1. Bolshoy, A., P. McNamara, R.E. Harrington & E.N. Trifonov. (1991). Curved DNA without A-A: experimental estimation of all 16 DNA wedge angles. Proc Natl Acad Sci U S A 88(6), 2312-2316.

2. Gabrielian, A. and Bolshoy, A. (1999) "Sequence complexity and DNA curvature." Computers & Chemistry 23, 263-274.

3. Gabrielian, A. E., Landsman, D., and Bolshoy, A. (1999) "Curved DNA in promoter sequences." In Silico 1.

4. Perez-Martin, J. & V. de Lorenzo. (1997). Clues and consequences of DNA bending in transcription. Annu Rev Microbiol 51, 593-628.

Protein searching in the midnight zone

Karin Hardell

Co-authors: Jeanette Hargbo, Uwe H Sauer

The aim of this work was to develop a search tool that is able to pick out proteins with similar fold but low sequence similarity from a sequence database. One demand is that the structure of a particular query protein is known.

Our method was developed specifically to identify DNA-binding proteins with an S-type immunoglobulin fold while excluding common immunoglobulins. With the X-ray structure of the AML1 Runt domain as a starting point, we searched the PDB for related 3D structures. Next, the 3D coordinates of the domains related to Runt were aligned simultaneously using the Runt domain as a template. Amino acids that were close in space among all the different protein structures were assumed to be critical for the particular fold and were used to generate a structure-based multiple sequence alignment. From the multiple sequence alignment, a profile Hidden Markov Model was calculated. A sequence database was searched with the constructed profile, returning both the sequences used to generate the profile and new ones that were not. Only DNA-binding immunoglobulins were found. We predict that the proteins without known 3D structure are related to Runt and bind to DNA.

Nonparametric linkage power by simulations and multivariate normal approximation

Mikael Knutsson

Department of Mathematical Statistics, Chalmers University of Technology, Sweden

A simulation method for generating marker data under a specified genetic model is presented for affected sib-pair families. We describe how power in nonparametric linkage analysis can be obtained by combining the marker simulations with the calculation of an appropriate linkage statistic and a multivariate normal approximation. The approach is also used to find the correct thresholds and pointwise levels of significance in partial genome scans.
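
The idea of estimating power by simulation can be illustrated with a much simpler setting than the one in the abstract. The sketch below assumes a single, fully informative marker completely linked to the disease locus, draws the number of alleles shared identical by descent (IBD) for each affected sib pair, and uses the standardised mean-sharing statistic. The alternative sharing probabilities, the function names and the one-sided threshold are illustrative assumptions, and plain Monte Carlo at one locus stands in for the multivariate normal approximation over many loci.

```python
import math
import random

# P(sib pair shares 0, 1, 2 alleles IBD) when there is no linkage.
NULL_IBD = (0.25, 0.5, 0.25)


def mean_sharing_z(ibd_counts):
    """Standardised mean IBD-sharing statistic for a list of sib-pair IBD counts."""
    n = len(ibd_counts)
    mean_prop = sum(ibd_counts) / (2.0 * n)  # average proportion of alleles shared
    # Under the null the proportion shared has mean 1/2 and variance 1/8 per pair.
    return (mean_prop - 0.5) / math.sqrt(1.0 / (8.0 * n))


def simulate_power(ibd_probs, n_pairs, threshold, n_reps=2000, seed=1):
    """Fraction of simulated data sets whose statistic exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        counts = rng.choices((0, 1, 2), weights=ibd_probs, k=n_pairs)
        if mean_sharing_z(counts) > threshold:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    # Hypothetical alternative: excess sharing among affected pairs.
    alt_ibd = (0.15, 0.5, 0.35)
    print("level:", simulate_power(NULL_IBD, 100, 1.645))
    print("power:", simulate_power(alt_ibd, 100, 1.645))
```

Running the same function with the null sharing probabilities gives the empirical pointwise level for a given threshold, which is how a threshold can be checked by simulation.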

The Galton-Watson process with immigration in an epidemiological context

Manuel Mota

Department of Mathematics, University of Extremadura, Spain

Co-authors: Miguel González and Manuel Molina

The Galton-Watson process (GWP) has often been used as a descriptive population model (see, for example, Jagers (1975)). In particular, it has been considered as a statistical model in an epidemiological context (Becker (1975, 1977), Neyman and Scott (1964), ...). However, if the number of infectives converges to a non-degenerate variable and new infected individuals may enter from other populations, the GWP is an inadequate model and more complex processes are needed. The aim of this paper is to show that, in such situations, the limit behaviour of some infectious diseases can be studied within the framework of branching processes with immigration. We present the Galton-Watson process with immigration in an epidemiological context and, taking into account some of the factors governing the spread of the diseases, propose corrective policies. To illustrate the theoretical study, some simulated examples are developed. The simulations were programmed in the Mathematica language (v3.0).
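
A Galton-Watson process with immigration can be sketched in a few lines. The example below is written in Python rather than Mathematica and assumes Poisson offspring and Poisson immigration, which the abstract does not specify; the function names and parameter values are illustrative only. In the subcritical case (offspring mean m < 1) with immigration mean λ, the number of infectives settles around a stationary level with mean λ / (1 - m).

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_gwi(generations, offspring_mean, immigration_mean, z0=1, seed=0):
    """Return the list of generation sizes Z_0, Z_1, ..., Z_generations."""
    rng = random.Random(seed)
    sizes = [z0]
    for _ in range(generations):
        # Each current infective independently produces Poisson(offspring_mean) new cases.
        offspring = sum(poisson(rng, offspring_mean) for _ in range(sizes[-1]))
        # New infectives arriving from other populations.
        immigrants = poisson(rng, immigration_mean)
        sizes.append(offspring + immigrants)
    return sizes


if __name__ == "__main__":
    path = simulate_gwi(5000, offspring_mean=0.8, immigration_mean=2.0, seed=1)
    tail = path[500:]
    print("long-run average:", sum(tail) / len(tail))  # theory: 2.0 / (1 - 0.8) = 10
```

In this toy setting a corrective policy can be mimicked by lowering `offspring_mean` (fewer secondary cases per infective) or `immigration_mean` (fewer imported cases) and comparing the long-run averages.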

References:

Jagers, P. (1975) Branching Processes with Biological Applications, John Wiley, London.

Becker, N. G. (1975) The use of mathematical models in determining vaccination policies. Bulletin of the International Statistical Institute 46, Book 1, 478-490.

Becker, N. G. (1977) Estimation for discrete time branching processes with application to epidemics. Biometrics 33, 515-522.

Neyman, J. and Scott, E. L. (1964) Stochastic Models in Medicine and Biology. J. Gurland editor, 45-55, University of Wisconsin Press, Madison.

Estimating the age of a disease

Ulrica Olofsson

Department of Mathematical Statistics, Göteborg University and Chalmers University of Technology, Sweden

At some time in the past, a certain disease gene appeared in a population by mutation. Its fate can be modelled using a multitype Galton-Watson branching process model.

We consider a situation where the location of the gene for a dominant disease is known. In this case, information about additional genetic markers linked to the disease gene can be used to estimate the age (time since appearance) of the disease based on a sample of affected individuals. The behaviour of this estimate is studied using simulations.

DNA Microsatellite Evolution: Facts, Theory and Simulation

Alexander Renwick

Department of Statistics, Rice University, USA

Co-authors: Marek Kimmel, Leslea Davison, Heidi Spratt, Patrick King

We examine length distributions of approximately 6000 human dinucleotide microsatellite loci, representing chromosomes 1 to 22, from the GDB Database. Under the stepwise mutation model, results from theory and simulation are compared with the empirical data. In both constant and expanding population scenarios, a simple single step model with parameters chosen to account for the observed variance of microsatellite lengths produces results inconsistent with the observed homozygosity and the dispersion of length skewness. Complicating the model by allowing a variable mutation rate accounts for the homozygosity, and introducing a small probability of a large step accounts for the dispersion in skewnesses. We discuss these results in light of the long term evolution of microsatellites.
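
The effect of a rare large step can be seen even in a toy version of the stepwise mutation model. The sketch below follows a single lineage only and ignores genetic drift and population history, so it is not the population-level simulation described in the abstract; the mutation rate, the large-step size and its probability are illustrative assumptions. For a symmetric step distribution the variance of the accumulated length change after t generations is t times the mutation rate times the mean squared step.

```python
import random


def mutate_lineage(generations, mut_rate, big_step_prob=0.0, big_step=5, rng=None):
    """Net change in repeat count along one lineage under a symmetric stepwise model."""
    rng = rng or random.Random()
    change = 0
    for _ in range(generations):
        if rng.random() < mut_rate:
            # Usually a single-repeat step; occasionally a large step.
            size = big_step if rng.random() < big_step_prob else 1
            change += size if rng.random() < 0.5 else -size
    return change


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    rng = random.Random(3)
    single = [mutate_lineage(500, 0.02, rng=rng) for _ in range(2000)]
    mixed = [mutate_lineage(500, 0.02, big_step_prob=0.05, rng=rng) for _ in range(2000)]
    print("single-step variance:", sample_variance(single))  # theory: 500 * 0.02 * 1 = 10
    print("with large steps:    ", sample_variance(mixed))   # theory: 10 * (0.95 + 0.05 * 25) = 22
```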

References:

Kimmel, M., R. Chakraborty, J. P. King, M. Bamshad, W. S. Watkins and L. B. Jorde, 1998. Signatures of population expansion in microsatellite repeat data. Genetics 148: 1921-1930.

Kimmel, M., and R. Chakraborty, 1996. Measures of variation of DNA repeat loci under a general stepwise mutation model. Theoretical Population Biology 50: 345-367.

A Novel Approach to Structure Alignment

Markus Ringnér

Complex Systems Division, Dept. of Theoretical Physics, Lund University, Sweden

A new method for protein structure alignment is presented, which is based on simultaneously minimizing an error function with respect to both assignment and structural degrees of freedom. The assignment part is minimized using a deterministic mean field approach. The latter implicitly enables exploration of the entire space of possible alignments and is hence able to handle permutations. It also allows for a probabilistic interpretation of the results. The method is very generic and can conveniently host a variety of constraints in addition to matching. It performs very well, including in situations where permutations are called for.

References:

M. Ohlsson, C. Peterson, M. Ringnér and R. Blankenbecler. A Novel Approach to Structure Alignment. LU TP 00-07 and SLAC-PUB-8429 (2000).

A pedagogical implementation of biosequence alignment algorithms

Peter Sestoft

Department of Mathematics and Physics, Royal Veterinary and Agricultural University, Denmark

We present a model implementation of the standard dynamic programming algorithms for biosequence alignment: global alignment (Needleman-Wunsch), local alignment (Smith-Waterman), overlap alignment, and repeated matches. We handle linear as well as affine gap costs, and implement the (simple) quadratic-space as well as the (slightly more complicated) linear-space versions of the dynamic programming algorithms.

The implementation is written as a Java applet, so that students can experiment interactively using a standard web browser. The source code is fairly clear, so that students can investigate the finer details of the implementation.
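
For readers without access to the applet, the two best-known algorithms can be sketched as follows. This is an independent Python illustration, not the applet's Java source: it covers only linear gap costs, the scoring values are arbitrary defaults, and the function names are made up for this example. `global_align` is the quadratic-space Needleman-Wunsch recurrence with traceback; `local_score` computes the Smith-Waterman score while keeping only two rows, which gives the score but not the alignment in linear space.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # prefix of a aligned against gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap          # prefix of b aligned against gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman best local alignment score, keeping only two rows."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))   # (1, 'ACGT', 'A-GT')
    print(local_score("XXGATTACAYY", "ZZGATTACAWW"))   # 7
```

The only difference between the two recurrences is the extra 0 in the local version, which lets an alignment start afresh at any cell, and the fact that the best local score is taken over the whole matrix rather than from the corner.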

References: The algorithms can be accessed at http://www.dina.kvl.dk/~sestoft/bsa/dinaws/bsapplet.html

Branching processes with emigration for modelling cell populations

Marossia N. Slavtchova-Bojkova

Department of Probability and Statistics, Bulgarian Academy of Sciences

Co-author: Nickolay M. Yanev

The aim of our study is to present and investigate a new stochastic process for modelling population dynamics, namely Sevastyanov's age-dependent branching process with emigration. Such a model arises naturally in applications to cell populations, where with positive probability a cell may die before its life cycle is completed. The results may help to answer questions such as what can be said about the future development of the population under given reproduction conditions of the individuals and a given rate of emigration.

The use of branching processes to model cell populations was pioneered by Bellman and Harris, who generalized the Bienaymé-Galton-Watson branching process to a process in which each individual first lives a random amount of time, independently of the others, and then gives birth to a random number of offspring. A further generalization of the Bellman-Harris branching process is the so-called Sevastyanov age-dependent branching process, in which the number of progeny depends on the age of the particle at the moment it gives birth to new particles of age zero.

Applying renewal techniques, we establish limit theorems in the case when the local characteristics of the processes are finite. The limit behaviour depends essentially on the criticality of the processes and the rate of emigration.

References:

P. Jagers (1975) Branching Processes with biological applications. John Wiley and Sons.

B. Sevastyanov (1971) Branching Processes. Nauka, Moscow.

M. Slavtchova-Bojkova, Multi-type Age-dependent Branching Processes with State-dependent Immigration, Proceedings of the Athens Conference on Applied Probability and Time Series Analysis, Ed. C.C. Heyde, Yu. V. Prohorov, R. Pyke, S. T. Rachev, Lecture Notes in Statistics, 114, p. 192-205, 1996, Springer Verlag, New York.

Non-linear modeling of genetic regulatory interactions

Mattias Wahde

Div. of Mechatronics, Chalmers University of Technology, Sweden

Co-author: John Hertz, Nordita, Denmark

The availability of time series of gene expression data has made it possible to investigate interactions between genes or groups of genes. We have studied genetic regulatory networks using a (non-linear) recurrent network model. Using evolutionary algorithms, we have developed a method for determining the interactions in such networks.

We present the method and show the results obtained from applying it to both artificial and real gene expression data from developing rat spinal cord and hippocampus. Taken together, the data from the two tissues allow us to identify the main features of the structure of the regulatory networks governing nervous system development.
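
The general approach can be illustrated with a small sketch. The abstract does not give the model equations, so the code below assumes a commonly used continuous-time recurrent network form, tau * dx_i/dt = g(sum_j w_ij x_j + b_i) - x_i with a sigmoid g, integrated by Euler steps. The search is a bare-bones (1+1) evolution strategy, the simplest kind of evolutionary algorithm and not necessarily the one used by the authors. All names, parameter values and the synthetic target data are hypothetical.

```python
import math
import random


def sigmoid(u):
    """Logistic squashing function; the input is clamped to avoid overflow."""
    u = max(-50.0, min(50.0, u))
    return 1.0 / (1.0 + math.exp(-u))


def simulate(weights, biases, x0, steps, dt=0.1, tau=1.0):
    """Euler integration of tau * dx_i/dt = sigmoid(sum_j w_ij x_j + b_i) - x_i."""
    n = len(x0)
    x = list(x0)
    series = [list(x)]
    for _ in range(steps):
        drive = [sigmoid(sum(weights[i][j] * x[j] for j in range(n)) + biases[i])
                 for i in range(n)]
        x = [x[i] + dt / tau * (drive[i] - x[i]) for i in range(n)]
        series.append(list(x))
    return series


def mse(series_a, series_b):
    """Mean squared difference between two equally shaped time series."""
    total = sum((p - q) ** 2
                for row_a, row_b in zip(series_a, series_b)
                for p, q in zip(row_a, row_b))
    return total / (len(series_a) * len(series_a[0]))


def evolve(target, x0, iterations=300, sigma=0.3, seed=0):
    """(1+1) evolution strategy: mutate all parameters, keep the child if no worse."""
    rng = random.Random(seed)
    n = len(x0)
    weights = [[0.0] * n for _ in range(n)]
    biases = [0.0] * n
    steps = len(target) - 1
    best = mse(simulate(weights, biases, x0, steps), target)
    for _ in range(iterations):
        cand_w = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
        cand_b = [b + rng.gauss(0.0, sigma) for b in biases]
        err = mse(simulate(cand_w, cand_b, x0, steps), target)
        if err <= best:
            weights, biases, best = cand_w, cand_b, err
    return weights, biases, best


if __name__ == "__main__":
    # Synthetic two-gene target series generated from made-up parameters.
    x0 = [0.2, 0.8]
    target = simulate([[0.0, -4.0], [4.0, 0.0]], [1.0, -2.0], x0, steps=50)
    _, _, err = evolve(target, x0)
    print("final fitting error:", err)
```

A positive fitted weight w_ij is read as gene j activating gene i and a negative one as repression, although several different weight matrices may fit a short time series equally well.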

References:

1. Wahde, M. and Hertz, J. "Coarse-grained reverse engineering of genetic regulatory networks", Biosystems, in press

2. www.me.chalmers.se/~mwahde/genenets.html

Last modified: Sun May 7 14:10:42 MET DST 2000