AU776605B2

AU776605B2 - Novel cell cycle genes and uses thereof

Info

Publication number: AU776605B2
Application number: AU19823/00A
Authority: AU
Inventors: Veronique Katelijne Cecile Kristien Boudolf; Lieven De Veylder; Dirk Inze; Juan Antonio Torres Acosta
Original assignee: CropDesign NV
Current assignee: CropDesign NV
Priority date: 1998-12-17
Filing date: 1999-12-17
Publication date: 2004-09-16
Anticipated expiration: 2019-12-17
Also published as: WO2000036124A3; EP1141353A2; CA2355901A1; AU1982300A; WO2000036124A2; MXPA01006104A; BR9916831A

Description

WO 00/36124 PCT/EP99/10084

Novel cell cycle genes and uses thereof

The present invention relates to DNA sequences encoding cell cycle interacting proteins as well as to methods for obtaining the same. The present invention also provides vectors comprising said DNA sequences, wherein the DNA sequences are operatively linked to regulatory elements allowing expression in prokaryotic and/or eukaryotic host cells. In addition, the present invention relates to the proteins encoded by said DNA sequences, antibodies to said proteins and methods for their production. Furthermore, the present invention relates to regulatory sequences which naturally regulate the expression of the above described DNA sequences. The present invention also relates to a method for controlling or altering growth characteristics of a plant and/or a plant cell comprising introduction and/or expression of one or more cell cycle regulatory proteins functional in a plant or parts thereof and/or one or more DNA sequences encoding such proteins. Also provided by the present invention is a process for disrupting plant cell division by interfering in the expression of a substrate for cyclin-dependent protein kinase using a DNA sequence according to the invention, wherein said plant cell is part of a transgenic plant. The present invention further relates to diagnostic compositions comprising the aforementioned DNA sequences, vectors, proteins and antibodies. The present invention also relates to methods for the identification of compounds being capable of activating or inhibiting the cell cycle. Furthermore, the present invention relates to transgenic plant cells, plant tissue and plants containing the above-described DNA sequences, regulatory sequences and vectors as well as to the use of the aforementioned DNA sequences, regulatory sequences, vectors, proteins, antibodies and/or compounds identified by the method of the invention in plant cell and tissue culture, plant breeding and/or agriculture.

Several documents are cited throughout the text of this specification. Each of the documents cited herein (including any manufacturer's specifications, instructions etc.) is hereby incorporated herein by reference; however, there is no admission that any document cited is indeed prior art as to the present invention.

Cell division is fundamental for growth in humans, animals and plants. Prior to dividing in two daughter cells, the mother cell needs to replicate its DNA. The cell cycle is traditionally divided into 4 distinct phases: G1: the gap between mitosis and the onset of DNA synthesis; S: the phase of DNA synthesis; G2: the gap between S and mitosis; M: mitosis, the process of nuclear division leading up to the actual cell division.

The distinction of the 4 cell cycle phases provides a convenient way of dividing the interval between successive divisions. Although these phases have served a useful purpose, a recent flurry of experimental results, much of it a consequence of cancer research, has resulted in a more intricate picture of the cell cycle's "four seasons" (Nasmyth, Science 274, 1643-1645, 1996; Nurse, Nature, 344, 503-508, 1990). The mechanism underlying cell cycle control has only recently been studied in greater detail. In all eukaryotic systems, including plants, this control mechanism is based on two key families of proteins which regulate the essential process of cell division, namely protein kinases (cyclin-dependent kinases or CDKs) and their activating associated subunits, called cyclins. The activity of these protein complexes is switched on and off at specific points of the cell cycle. Particular CDK-cyclin complexes activated at the G1/S transition trigger the start of DNA replication. Different CDK-cyclin complexes are activated at the G2/M transition and induce mitosis leading to cell division. Each of the CDK-cyclin complexes executes its regulatory role by modulating different sets of multiple target proteins. Furthermore, the large variety of developmental and environmental signals affecting cell division all converge on the regulation of CDK activity. CDKs can therefore be seen as the central engine driving cell division.

In animal systems and in yeast, knowledge about cell cycle regulation is now quite advanced. The activity of CDK-cyclin complexes is regulated at five levels: (i) transcription of the CDK and cyclin genes; (ii) association of specific CDKs with their specific cyclin partner; (iii) phosphorylation/dephosphorylation of the CDK and cyclins; (iv) interaction with other regulatory proteins such as SUC1/CKS1 homologues and cell cycle kinase inhibitors (CKI); and (v) cell cycle phase-dependent destruction of the cyclins and CKIs.

The study of cell cycle regulation in plants has lagged behind that in animals and yeast.

Some basic mechanisms of cell cycle control appear to be conserved among eukaryotes, including plants. Plants were shown to also possess CDKs, cyclins and CKIs. However, plants have unique developmental features which are reflected in specific characteristics of the cell cycle control. These include, for instance, the absence of cell migration, the formation of organs throughout the entire lifespan from specialized regions called meristems, the formation of a cell wall and the capacity of non-dividing cells to re-enter the cell cycle. Another specific feature is that many plant cells, in particular those involved in storage (e.g. endosperm), are polyploid due to rounds of DNA synthesis without mitosis. This so-called endoreduplication is intimately related to cell cycle control.

Due to these fundamental differences, multiple components of the cell cycle of plants are unique compared to their yeast and animal counterparts. For example, plants contain a unique class of CDKs, such as CDC2b in Arabidopsis, which are both structurally and functionally different from animal and yeast CDKs.

The further elucidation of cell cycle regulation in plants and its differences and similarities with other eukaryotic systems is a major research challenge. Strictly for the sake of comparison, some key elements of yeast and animal systems are described below in more detail.

As already mentioned above, the control of cell cycle progression in eukaryotes is mainly exerted at two transition points: one in late G1 before DNA synthesis, and one at the G2/M boundary. Progression through these control points is mediated by cyclin-dependent protein kinase (CDK) complexes, which contain, in more detail, a catalytic subunit of approximately 34 kDa encoded by the CDK genes. Both Saccharomyces cerevisiae and Schizosaccharomyces pombe only utilize one CDK gene for the regulation of their cell cycle. The kinase activity of their gene products p34CDC2 and p34CDC28 in Sch. pombe and in S. cerevisiae, respectively, is dependent on regulatory proteins, called cyclins. Progression through the different cell cycle phases is achieved by the sequential association of p34CDC2/CDC28 with different cyclins. Although in higher eukaryotes this regulation mechanism is conserved, the situation is more complex since they have evolved to use multiple CDKs to regulate the different stages of the cell cycle.

In mammals, seven CDKs have been described, defined as CDK1 to CDK7, each binding a specific subset of cyclins.

In animal systems, CDK activity is not only regulated by its association with cyclins but also involves both stimulatory and inhibitory phosphorylations. Kinase activity is positively regulated by phosphorylation of a Thr residue located between amino acids 160-170 (depending on the CDK protein). This phosphorylation is mediated by the CDK-activating kinase (CAK), which interestingly is a CDK/cyclin complex itself. Inhibitory phosphorylations occur at the ATP-binding site (the Tyr15 residue together with Thr14 in higher eukaryotes) and are carried out by at least two protein kinases. A specific phosphatase, CDC25, dephosphorylates these residues at the G2/M checkpoint, thus activating CDK activity and resulting in the onset of mitosis. CDK activity is furthermore negatively regulated by a family of mainly low-molecular weight proteins, called cyclin-dependent kinase inhibitors (CKIs). Kinase activity is inhibited by the tight association of these CKIs with the CDK/cyclin complexes.

With respect to cell cycle regulation in plants, a summary of the state of the art is given below. In Arabidopsis, thus far only two CDK genes have been characterized in detail, CDC2aAt and CDC2bAt, of which the gene products share 56% amino acid identity.

Both CDKs are distinguished by several features. First, only CDC2aAt is able to complement yeast p34CDC2/CDC28 mutants. Second, CDC2aAt and CDC2bAt bear different cyclin-binding motifs (PSTAIRE and PPTALRE, respectively), suggesting they may bind distinct types of cyclins. Third, although both CDC2aAt and CDC2bAt show the same spatial expression pattern, they exhibit a different cell cycle phase-specific regulation. The CDC2aAt gene is expressed constitutively throughout the whole cell cycle. In contrast, CDC2bAt mRNA levels oscillate, being most abundant during the S and G2 phases. In addition, multiple cyclins have been isolated from Arabidopsis. The majority display the strongest sequence similarity with the animal A- or B-type class of cyclins, but D-type cyclins have also been identified. Although the classification of Arabidopsis cyclins is mainly based upon sequence similarity, limited data suggest that this organization corresponds with differential functions of each cyclin class (Renaudin, Plant Mol. Biol. 32 (1996) 1003-1018).
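The motif distinction described above can be illustrated with a short sketch. Only the two motifs (PSTAIRE and PPTALRE) come from the text; the function name, the labels and the toy sequences are hypothetical and serve purely as an illustration.

```python
# Classify a CDK protein sequence by its cyclin-binding motif.
# The two motifs are taken from the text; labels and names are assumptions.
MOTIFS = {"PSTAIRE": "CDC2a-like", "PPTALRE": "CDC2b-like"}


def classify_cdk(protein: str) -> str:
    """Return the label of the first cyclin-binding motif found in `protein`."""
    seq = protein.upper()
    for motif, label in MOTIFS.items():
        if motif in seq:
            return label
    return "unclassified"
```

A sequence containing neither motif is reported as "unclassified", which is the conservative choice for a screen of this kind.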

In order to manage problems related to plant growth, plant architecture, stress responses and/or plant diseases, it is believed to be of utmost importance to identify and isolate plant genes and gene products involved in the regulation of plant cell division, and more particularly coding for and interacting with CDKs and/or their interacting proteins, responsible for the control of the cell cycle. If such novel genes and/or proteins have been isolated and analyzed, the growth of the plant as a whole can be influenced. Also, the growth of specific tissues or organs and thus the architecture of the plant can be modified. Cell cycle proteins may also provide targets to facilitate the identification of inhibitors or activators of cell cycle regulatory proteins that may be useful as herbicides or plant growth regulators.

Thus, the technical problem underlying the present invention is to provide means and methods for modulating cell cycle proteins that are particularly useful in agriculture and plant cell and tissue culture.

The solution to the technical problem is achieved by providing the embodiments characterized in the claims.

Accordingly, the present invention relates to a DNA sequence encoding a cell cycle interacting protein or encoding an immunologically active and/or functional fragment of such a protein, selected from the group consisting of:

(a) DNA sequences
    (aa) comprising a nucleotide sequence encoding at least the mature form of a protein (LDV115) comprising the amino acid sequence as given in SEQ ID NO: 2;
    (ab) comprising the nucleotide sequence as given in SEQ ID NO: 1;
    (ac) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (aa) or (ab) under stringent hybridization conditions;
    (ad) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (aa) or (ab);
    (ae) comprising a nucleotide sequence encoding at least the domain binding to CDKs of the protein encoded by the nucleotide sequence of any one of (aa) to (ad);

(b) DNA sequences
    (ba) comprising a nucleotide sequence encoding at least the mature form of a protein (PLP) comprising the amino acid sequence as given in any one of SEQ ID NOs: 4, 34, 36, 38, 40 or 42;
    (bb) comprising the nucleotide sequence as given in any one of SEQ ID NOs: 3, 33, 35, 37, 39 or 41;
    (bc) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (ba) or (bb) under stringent hybridization conditions;
    (bd) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 40% identical to the amino acid sequence encoded by the nucleotide sequence of (ba) or (bb);
    (be) comprising a nucleotide sequence encoding at least the cyclin-like interacting domain of the protein encoded by the nucleotide sequence of any one of (ba) to (bd);

(c) DNA sequences
    (ca) comprising a nucleotide sequence encoding at least the mature form of a protein (VB33) comprising the amino acid sequence as given in SEQ ID NO: 6;
    (cb) comprising the nucleotide sequence as given in SEQ ID NO: 5;
    (cc) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (ca) or (cb) under stringent hybridization conditions;
    (cd) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (ca) or (cb);
    (ce) comprising a nucleotide sequence encoding at least the domain binding to CDKs of the protein encoded by the nucleotide sequence of any one of (ca) to (cd);

(d) DNA sequences
    (da) comprising a nucleotide sequence encoding at least the mature form of a protein (VB89) comprising the amino acid sequence as given in SEQ ID NO: 8;
    (db) comprising the nucleotide sequence as given in SEQ ID NO: 7;
    (dc) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (da) or (db) under stringent hybridization conditions;
    (dd) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (da) or (db);
    (de) comprising a nucleotide sequence encoding at least the domain binding to CDKs of the protein encoded by the nucleotide sequence of any one of (da) to (dd);

(e) DNA sequences
    (ea) comprising a nucleotide sequence encoding at least the mature form of a protein (VBDAHP) comprising the amino acid sequence as given in SEQ ID NO: 10;
    (eb) comprising the nucleotide sequence as given in SEQ ID NO: 9;
    (ec) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (ea) or (eb) under stringent hybridization conditions;
    (ed) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (ea) or (eb);
    (ee) comprising a nucleotide sequence encoding at least the domain binding to CDKs of the protein encoded by the nucleotide sequence of any one of (ea) to (ed);

(f) DNA sequences
    (fa) comprising a nucleotide sequence encoding at least the mature form of a protein (VBDBP) comprising the amino acid sequence as given in SEQ ID NO: 12;
    (fb) comprising the nucleotide sequence as given in SEQ ID NO: 11;
    (fc) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (fa) or (fb) under stringent hybridization conditions;
    (fd) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (fa) or (fb);
    (fe) comprising a nucleotide sequence encoding at least the domain binding to CDKs of the protein encoded by the nucleotide sequence of any one of (fa) to (fd);

(g) DNA sequences
    (ga) comprising a nucleotide sequence encoding at least the mature form of a protein (VBHSF) comprising the amino acid sequence as given in SEQ ID NO: 14;
    (gb) comprising the nucleotide sequence as given in SEQ ID NO: 13;
    (gc) comprising a nucleotide sequence hybridizing with the complementary strand of a nucleotide sequence as defined in (ga) or (gb) under stringent hybridization conditions;
    (gd) comprising a nucleotide sequence encoding a protein having an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (ga) or (gb);
    (ge) comprising a nucleotide sequence encoding at least the domain binding to CDKs of the protein encoded by the nucleotide sequence of any one of (ga) to (gd);

(h) DNA sequences obtainable by screening an appropriate library under stringent conditions with a probe having at least 17 consecutive nucleotides of a nucleotide sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15 to 33, 35, 37, 39, 41, 48, 49 or 53 to 57;

(i) DNA sequences comprising a nucleotide sequence encoding a fragment of a protein encoded by a DNA sequence of any one of (a) to (h), wherein said fragment is capable of interacting with a cell cycle protein; and

(j) DNA sequences, the nucleotide sequence of which is degenerate as a result of the genetic code to a nucleotide sequence of a DNA sequence as defined in any one of (a) to (i).

The term "cell cycle interacting protein" or "cell cycle protein" as denoted herein means a protein which exerts control on or regulates or is required for the cell cycle or part thereof of a cell, tissue, organ or whole organism and/or DNA replication. It may also be capable of binding to, regulating or being regulated by cyclin dependent kinases, in particular CDC2a and/or CDC2b and preferably plant cyclin dependent kinases or their subunits. The term also includes peptides, polypeptides, fragments, variants, homologs, alleles or precursors (e.g. preproteins or proproteins) thereof.
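The probe criterion in the list above (at least 17 consecutive nucleotides of a reference sequence) can be checked computationally before any wet-lab screening. The following is a minimal sketch; the function names are hypothetical, only the same strand is compared (a real check would also consider the reverse complement), and it says nothing about actual hybridization behaviour.

```python
def longest_shared_run(probe: str, reference: str) -> int:
    """Length of the longest stretch of consecutive nucleotides common to both sequences."""
    probe, reference = probe.upper(), reference.upper()
    best = 0
    prev = [0] * (len(reference) + 1)
    for p in probe:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if p == r:
                # Extend the run that ended at the previous position of both sequences.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_candidate_probe(probe: str, reference: str, min_run: int = 17) -> bool:
    """True if the probe shares at least `min_run` consecutive nucleotides with the reference."""
    return longest_shared_run(probe, reference) >= min_run
```

The threshold of 17 is the value given in the text; it is exposed as a parameter only so that the sketch can be reused with other cut-offs.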

The term "cell cycle" means the cyclic biochemical and structural events associated with growth, division and proliferation of cells, and in particular with the regulation of the replication of DNA and mitosis. The cycle is divided into periods called: G0, Gap 1 (G1), DNA synthesis (S), Gap 2 (G2) and mitosis (M). Normally these four phases occur sequentially; however, the cell cycle also includes modified cycles wherein one or more phases are absent, resulting in modified cell cycles such as endomitosis, acytokinesis, polyploidy, polyteny, and endoreduplication.

The terms "gene", "polynucleotide", "nucleic acid sequence", "nucleotide sequence", "DNA sequence" or "nucleic acid molecule" as used herein refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. These terms refer only to the primary structure of the molecule. Thus, they include double- and single-stranded DNA, and RNA. They also include known types of modifications, for example, methylation, "caps", and substitution of one or more of the naturally occurring nucleotides with an analog. Preferably, the DNA sequence of the invention comprises a coding sequence encoding at least the mature form of the above defined cell cycle interacting protein, i.e. the protein which is posttranslationally processed into its biologically active form, for example due to cleavage of leader or secretory sequences or a proprotein sequence or other natural proteolytic cleavage points.

By "functional fragment" and "biologically active form" are meant polypeptides that exhibit activity similar, but not necessarily identical, to an activity of the wild-type cell cycle interacting proteins of the invention or an activity that is enhanced over that of the wild-type proteins (either the full-length protein or, preferably, the mature protein), as measured in a particular biological assay. Assays of cell cycle interacting activity are disclosed, for example, in Examples 1 to 7, below. These assays can be used to measure cell cycle interacting activity of partially purified or purified native or recombinant protein. The cell cycle interacting protein of the invention binds to CDC2, i.e. CDC2a and/or CDC2b, from Arabidopsis. Thus, a polypeptide having a functional fragment or the "biological activity" of the cell cycle interacting protein of the invention will bind to CDCs as set forth in Example 1 or 7.

The term "immunologically active fragment" of a cell cycle interacting protein of the invention denotes proteins or peptides which have at least a part of the primary structural conformation for one or more epitopes capable of reacting specifically with antibodies to a protein which is encodable by a nucleic acid molecule as set forth above.

Preferably, the peptides and proteins encoded by a nucleic acid molecule of the invention are recognized by an antibody that specifically recognizes an epitope of the cell cycle interacting protein comprising the amino acid residues that are unique for the protein encoded by any one of the aforementioned DNA sequences. Preferably, said peptides and proteins are capable of eliciting an effective immune response in a mammal, for example mouse or rabbit.

The DNA sequence which encodes for the predicted mature polypeptides of the proteins comprising SEQ ID NOS: 2, 4, 34, 36, 38, 40, 42, 6, 8, 10, 12 or 14 or for the biologically active fragment thereof may include: only the coding sequence for the mature polypeptide or for a biologically active fragment thereof; the coding sequence for the mature polypeptide or for a biologically active fragment thereof and additional coding sequence such as a leader or secretory sequence or a proprotein sequence; the coding sequence for the mature polypeptide (and optionally additional coding sequence) and non-coding sequence, such as intron or non-coding sequence 5' and/or 3' of the coding sequence for the predicted mature polypeptide.

A "coding sequence" is a nucleotide sequence which is transcribed into mRNA and/or translated into a polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus. A coding sequence can include, but is not limited to, mRNA, cDNA, recombinant nucleotide sequences or genomic DNA, while introns may be present as well under certain circumstances. Thus, the nucleotide sequences of the present invention can be engineered in order to alter a cell cycle interacting protein coding sequence for a variety of reasons, including but not limited to alterations which modify the cloning, processing and/or expression of the gene product. For example, mutations may be introduced using techniques which are well known in the art, e.g. site-directed mutagenesis, to insert new restriction sites, to alter glycosylation patterns, to change codon preference, to produce splice variants, etc.

In accordance with the present invention, a two-hybrid system (Fields et al., Nature 340 (1989), 245-246) was exploited in which CDC2aAt or CDC2bAt was used as bait and a cDNA library of a cell suspension as prey. Novel gene products interacting with CDC2aAt or CDC2bAt, indicative of hitherto unknown plant cell cycle regulatory nucleotide sequences, were identified. The library was made from a mixture of mRNA from Arabidopsis thaliana cell suspensions harvested at various growing stages: early exponential, exponential, early stationary and stationary phase.

Twelve cDNA clones have been identified in accordance with the invention comprising the nucleotide sequences as depicted in SEQ ID NOS: 1, 3, 33, 35, 37, 39, 41, 5, 7, 9, 11 and 13, which encode proteins that are capable of specifically interacting with CDC2aAt or CDC2bAt; see Examples 1, 2 and 7, below. The proteins encoded by the cDNA clones comprise the amino acid sequences depicted in SEQ ID NOS: 2, 4, 34, 36, 38, 40, 42, 6, 8, 10, 12 and 14. Computer-assisted homology searches in genome databases revealed that the clones correspond to novel genes and/or to genes for which the (partial) cDNA had been described but the particular function of the gene remained unknown. In particular, the examples of the present invention demonstrate that novel cell cycle interacting proteins and their encoding genes have been identified. The possible applications of these cell cycle interacting proteins and their encoding nucleic acid molecules will be discussed further below and are evident from the description provided in the Examples.

The homology search was performed with the programs BLASTX and BLASTN (version 2.0a19MP-WashU [build decunix3.2 01:53:29 05-feb-1998]; see Altschul, Nucleic Acids Res. 25 (1997), 3389-3402) on the Arabidopsis thaliana nucleic acids database at AtDB at Stanford (http://genome-www2.stanford.edu/cgi-bin/AtDB/nph-blast2atdb). The function GAP (general alignment) (from the GCG 9.1 package, Genetics Computer Group Inc., Madison, USA) has been used with the parameters Gap weight 12 and Length weight 4 to quantify the percentage of homology and similarity. The protein sequences were then used to perform a BLASTP (version 2.0.4 [feb-24-1998]) with BEAUTY post-processing provided by the Human Genome Center, Baylor College of Medicine, against the National Center for Biotechnology Information's non-redundant protein database (http://dot.imgen.bcm.tmc.edu:9331/seq-search/protein-search.html).
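To make the GAP parameters above more concrete, the sketch below shows how a gap of a given length might be charged with Gap weight 12 and Length weight 4, and how a percentage identity could be read off a finished pairwise alignment. Both function names are hypothetical. The penalty formula (gap weight plus length weight times gap length) and the choice to count identity only over columns without gaps are assumptions made for illustration and are not confirmed by the text; the alignment itself is assumed to come from an external program.

```python
def gap_penalty(gap_length: int, gap_weight: int = 12, length_weight: int = 4) -> int:
    """Assumed affine penalty: a fixed opening cost plus a cost per gapped position."""
    if gap_length <= 0:
        return 0
    return gap_weight + length_weight * gap_length


def percent_identity(aligned_a: str, aligned_b: str, gap_char: str = "-") -> float:
    """Percentage of identical residues over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap_char or b == gap_char:
            continue  # gapped columns are skipped (an assumption of this sketch)
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0
```

Thresholds such as "at least 60% identical" could then be applied to the returned value, bearing in mind that the figure depends on how gapped columns are counted.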

The results of the homology search are described in the appended examples.

As described in the Examples, in the course of the database search, homology was found between one or more of the above described nucleotide sequences and some "Expressed Sequence Tags" (ESTs), i.e. (partial) cDNA clones comprising Open Reading Frames (ORFs) for (fragments of) proteins of unknown function and/or the nucleotide sequence of which does not have sufficient coding capacity for a functional protein.

These particular ESTs per se are specifically excluded from the scope of the claims.

However, as far as the use of such ESTs in embodiments which have been first conceived in accordance with the present invention is concerned, they are covered by the present invention and encompassed by the appended claims. The same applies to nucleotide sequences that may be present within, for example, a section of a chromosome that has been described in the context of an organism's genome sequencing project but that hitherto have not been identified as constituting a gene with biological function, nor has the particular biological function of this gene been identified.

Thus it is evident that the genes comprising the nucleotide sequences of SEQ ID NOS: 1, 3, 33, 35, 37, 39, 41, 5, 7, 9, 11 and 13 each encode a member of a novel class of cell cycle interacting proteins. In particular, the nucleotide sequences of SEQ ID NOS: 3, 33, 35, 37, 39 and 41 define a novel class of PHO80-like Proteins (PLPs); see also Example 7.

The present invention also relates to DNA sequences which hybridize with the above-described DNA sequences and differ in one or more positions in comparison with these, as long as they encode a cell cycle interacting protein. By "hybridizing" it is meant that such nucleic acid molecules hybridize under conventional hybridization conditions, preferably under stringent conditions such as described by Sambrook (Molecular Cloning; A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989)). An example of one such stringent hybridization condition is hybridization at 4XSSC at 65 °C, followed by a washing in 0.1XSSC at 65 °C for one hour. Alternatively, an exemplary stringent hybridization condition is in 50% formamide, 4XSSC at 42 °C. Cell cycle interacting proteins derived from other organisms such as mammals, in particular humans, may be encoded by other DNA sequences which hybridize to the sequences for plant cell cycle interacting proteins under relaxed hybridization conditions and which code on expression for peptides having the ability to interact with cell cycle proteins. Examples of such non-stringent hybridization conditions are 4XSSC at 50 °C or hybridization with 30-40% formamide at 42 °C. Further preferred hybridization conditions are described in the examples. Such molecules comprise those which are fragments, analogues or derivatives of the cell cycle interacting protein of the invention and differ, for example, by way of amino acid and/or nucleotide deletion(s), insertion(s), substitution(s), addition(s) and/or recombination(s) or any other modification(s) known in the art either alone or in combination from the above-described amino acid sequences or their underlying nucleotide sequence(s).
Using the PESTFIND program (Rogers, Science 234 (1986), 364-368), PEST sequences (rich in proline, glutamic acid, serine, and threonine) can be identified, which are characteristically present in unstable proteins. Such sequences may be removed from the cell cycle interacting proteins in order to increase the stability and optionally the activity of the proteins.
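As a rough illustration of how candidate PEST regions might be located, the sketch below applies a simplified preliminary screen: stretches of at least 12 residues that are free of the positively charged residues K, R and H and that contain at least one proline, one acidic residue (D or E) and one serine or threonine. These criteria and the function name are assumptions for illustration; this is not the PESTFIND program and it performs none of its scoring.

```python
import re


def candidate_pest_regions(protein: str, min_len: int = 12):
    """Return (start, end, region) for stretches that pass the simplified screen."""
    seq = protein.upper()
    hits = []
    # Each match is a maximal stretch without K, R or H.
    for match in re.finditer(r"[^KRH]+", seq):
        region = match.group()
        if len(region) < min_len:
            continue
        has_proline = "P" in region
        has_acidic = "D" in region or "E" in region
        has_ser_thr = "S" in region or "T" in region
        if has_proline and has_acidic and has_ser_thr:
            hits.append((match.start(), match.end(), region))
    return hits
```

Regions returned by such a screen would only be starting points; whether removing them actually increases the stability of a given protein would have to be determined experimentally.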

Methods for introducing such modifications in the nucleic acid molecules according to the invention are well-known to the person skilled in the art. The invention also relates to nucleic acid molecules the sequence of which differs from the nucleotide sequence of any of the above-described nucleic acid molecules due to the degeneracy of the genetic code. All such fragments, analogues and derivatives of the protein of the invention are included within the scope of the present invention, as long as the essential characteristic immunological and/or biological properties as defined above remain unaffected in kind, that is, the novel nucleic acid molecules of the invention include all nucleotide sequences encoding proteins or peptides which have at least a part of the primary structural conformation for one or more epitopes capable of reacting with antibodies to cell cycle interacting proteins which are encodable by a nucleic acid molecule as set forth above and which have comparable or identical characteristics in terms of biological activity and/or the capability to interact with other proteins. It is preferred that proteins encoded by a nucleic acid molecule of the invention are at least capable of interacting with CDC2, particularly CDC2a and/or CDC2b, preferably from a plant such as Arabidopsis thaliana.
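The notion of sequences that differ only through the degeneracy of the genetic code can be shown with a small sketch: two different nucleotide sequences are translated with the standard genetic code and the resulting proteins are compared. The function names and the example sequences are hypothetical, and the sketch assumes an in-frame coding sequence written in DNA letters.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For instance, ATGGCTTCA and ATGGCCAGT differ at four positions yet both translate to Met-Ala-Ser, so they would count as degenerate variants of one another in this sense.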

Whilst the above described proteins may interact with a CDC2 from Arabidopsis thaliana, the most likely interaction is with a CDC2 from the same species from which the gene was isolated (homologous interaction). This capability allows advantageous uses of the proteins of the invention and their encoding nucleic acid molecules as will be described in more detail below. Part of the invention is therefore also nucleic acid molecules encoding a polypeptide comprising at least a functional part of a cell cycle interacting protein encoded by a nucleic acid sequence comprised in a nucleic acid molecule according to the invention. An example for this is that the polypeptide or a fragment thereof according to the invention is embedded in another amino acid sequence. Preferably, the DNA sequence of the invention encodes a protein having substantially the same amino acid sequence as the proteins defined in SEQ ID NOS: 2, 4, 34, 36, 38, 40, 42, 6, 8, 10, 12 and 14.

Extending the polynucleotide sequence of the invention

The polynucleotide sequences encoding the cell cycle interacting proteins may be extended utilizing partial nucleotide sequence and various methods known in the art to detect upstream sequences such as promoters and regulatory elements. Gobinda (PCR Methods Applic. 2 (1993), 318-322) discloses "restriction-site" polymerase chain reaction (PCR) as a direct method which uses universal primers to retrieve unknown sequence adjacent to a known locus. First, genomic DNA is amplified in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences are subjected to a second round of PCR with the same linker primer and another specific primer internal to the first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase.

Inverse PCR can be used to amplify or extend sequences using divergent primers based on a known region (Triglia, Nucleic Acids Res. 16 (1988), 8186). The primers may be designed using OLIGO® 4.06 Primer Analysis Software (1992; National Biosciences Inc, Plymouth MN), or another appropriate program, to be preferably 22-30 nucleotides in length, to have a GC content of preferably 50% or more, and to anneal to the target sequence at temperatures of preferably about 68°-72°C. The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template.
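The primer preferences stated above (length and GC content) can be checked with a short sketch such as the following. It is only an illustration under stated assumptions: the function names are hypothetical, the annealing temperature preference is not modeled, and the sketch does not reproduce the OLIGO software.

```python
def gc_content(primer):
    """Return the GC content of a primer as a percentage."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def meets_stated_preferences(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check the length (22-30 nt) and GC content (50% or more) preferences.

    The annealing temperature preference from the text is NOT checked here.
    """
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


# Invented candidate primers, for illustration only.
candidates = ["GC" * 12, "AT" * 12, "GC" * 10]
for candidate in candidates:
    print(len(candidate), gc_content(candidate), meets_stated_preferences(candidate))
```

A candidate that passes this filter would still need its annealing behavior evaluated with a dedicated primer analysis program.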

Capture PCR (Lagerstrom, PCR Methods Applic. 1 (1991), 111-119) is a method for PCR amplification of DNA fragments adjacent to a known sequence in, e.g., human or plant yeast artificial chromosome DNA. Capture PCR also requires multiple restriction enzyme digestions and ligations to place an engineered double-stranded sequence into an unknown portion of the DNA molecule before PCR.

Another method which may be used to retrieve unknown sequences is that of Parker (Nucleic Acids Res. 19 (1991), 3055-3060). Additionally, one can use PCR, nested primers and PromoterFinder libraries to walk in genomic DNA (PromoterFinder™, Clontech (Palo Alto CA)). This process avoids the need to screen libraries and is useful in finding intron/exon junctions. Preferred libraries for screening for full length cDNAs are ones that have been size-selected to include larger cDNAs. Also, random primed libraries are preferred in that they will contain more sequences which contain the 5' and upstream regions of genes. A randomly primed library may be particularly useful if an oligo d(T) library does not yield a full-length cDNA. Genomic libraries are useful for extension into the 5' nontranslated regulatory region. Suitable methods for identifying promoters are also described in WO 99/61619, in particular at pages 50 and 51.

Capillary electrophoresis may be used to analyze the size or confirm the nucleotide sequence of sequencing or PCR products; see Sambrook, supra. Systems for rapid sequencing are available from Perkin Elmer, Beckman Instruments (Fullerton CA), and other companies.

Computer-assisted identification of cell cycle interacting proteins and their encoding genes

As is further described in the appended examples, BLAST2, which stands for Basic Local Alignment Search Tool (Altschul, 1997; Altschul, J. Mol. Evol. 36 (1993), 290-300; Altschul, J. Mol. Biol. 215 (1990), 403-410), can be used to search for local sequence alignments. BLAST produces alignments of both nucleotide and amino acid sequences to determine sequence similarity. Because of the local nature of the alignments, BLAST is especially useful in determining exact matches or in identifying homologs. The fundamental unit of BLAST algorithm output is the High-scoring Segment Pair (HSP). An HSP consists of two sequence fragments of arbitrary but equal lengths whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the user. The BLAST approach is to look for HSPs between a query sequence and a database sequence, to evaluate the statistical significance of any matches found, and to report only those matches which satisfy the user-selected threshold of significance. The parameter E establishes the statistically significant threshold for reporting database sequence matches. E is interpreted as the upper bound of the expected frequency of chance occurrence of an HSP (or set of HSPs) within the context of the entire database search. Any database sequence whose match satisfies E is reported in the program output.
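The reporting rule described above (only matches satisfying the user-selected E threshold are reported) can be sketched as a simple filter. This is an illustrative sketch, not the BLAST implementation: the `Hsp` record, the `report_hits` function and the example values are all hypothetical and invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hsp:
    """Hypothetical record for one high-scoring segment pair."""
    subject_id: str
    score: float
    e_value: float


def report_hits(hsps, e_threshold):
    """Keep HSPs whose E value does not exceed the threshold, most significant first."""
    kept = [hsp for hsp in hsps if hsp.e_value <= e_threshold]
    return sorted(kept, key=lambda hsp: hsp.e_value)


# Invented values, for illustration only.
example_hsps = [
    Hsp("subject_A", 250.0, 1e-60),
    Hsp("subject_B", 45.0, 0.5),
    Hsp("subject_C", 90.0, 1e-10),
]

for hit in report_hits(example_hsps, e_threshold=1e-5):
    print(hit.subject_id, hit.e_value)
```

Lowering the threshold makes the report more stringent; raising it admits weaker matches that are more likely to have arisen by chance.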

Analogous computer techniques using BLAST (Altschul, 1997, 1993 and 1990, supra) are used to search for identical or related molecules in nucleotide databases such as GenBank or EMBL. This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or homologous. The basis of the search is the product score, which is defined as:

$$\text{product score} = \frac{\%\,\text{sequence identity} \times \%\,\text{maximum BLAST score}}{100}$$

and it takes into account both the degree of similarity between two sequences and the length of the sequence match. For example, with a product score of 40, the match will be exact within a 1-2% error; and at 70, the match will be exact. Homologous molecules are usually identified by selecting those which show product scores between 15 and [...], although lower scores may identify related molecules.
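The product score defined above can be computed with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and treating the stated values of 40 and 70 as lower cutoffs for the two categories is an assumption made here, not something the text prescribes.

```python
def product_score(pct_identity, pct_max_blast_score):
    """Product score = (% sequence identity x % maximum BLAST score) / 100."""
    for value in (pct_identity, pct_max_blast_score):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie between 0 and 100")
    return pct_identity * pct_max_blast_score / 100.0


def describe_match(score):
    """Map a product score to the categories named in the text.

    Assumption: 40 and 70 are read as lower cutoffs for each category.
    """
    if score >= 70.0:
        return "exact"
    if score >= 40.0:
        return "exact within 1-2% error"
    return "below 40"


# Invented inputs, for illustration only.
for identity, max_score in [(100.0, 100.0), (80.0, 50.0), (50.0, 40.0)]:
    score = product_score(identity, max_score)
    print(identity, max_score, score, describe_match(score))
```

Because both inputs are percentages, the score falls between 0 and 100, and a short perfect match scores lower than a long perfect match.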

Identifying derivatives, variants and homologs of the cell cycle interacting proteins of the invention

As is demonstrated in the appended examples, a two-hybrid screening assay has been developed in accordance with the present invention suitable for identifying cell cycle interacting proteins. Thus, in another aspect the present invention relates to a method for identifying and obtaining cell cycle interacting proteins comprising a two-hybrid screening assay wherein CDC2a or CDC2b as a bait and a cDNA library of cell suspension culture as prey are used. Preferably, said CDC2a and CDC2b are CDC2aAt and CDC2bAt, respectively. However, CDKs or their corresponding subunits from other plants or other organisms such as mammals may be employed as well. The cell culture may be from any organism possessing cell cycle interacting proteins such as animals, preferably mammals. Particularly preferred are plant cell suspension cultures such as from Arabidopsis. The nucleic acid molecules encoding proteins or peptides identified to interact with CDC2a or CDC2b in the above mentioned assay can be easily obtained and sequenced by methods known in the art; see also the appended examples.

Therefore, the present invention also relates to a DNA sequence encoding a cell cycle interacting protein obtainable by the method of the invention.

In a preferred embodiment the nucleic acid molecules according to the invention are RNA or DNA molecules, preferably cDNA, genomic DNA or synthetically produced DNA or RNA molecules. Since cell cycle interacting proteins are supposed to play a key role in the plant cell cycle, corresponding proteins displaying similar properties should be present in other organisms including mammals as well. Nucleic acid molecules of the invention can be obtained, e.g., by hybridization of the above-described nucleic acid molecules with a (sample of) nucleic acid molecule(s) of any source. Nucleic acid molecules hybridizing with the above-described nucleic acid molecules can in general be derived from any organism, preferably plants possessing such molecules, preferably from monocotyledonous or dicotyledonous plants, in particular from plants of interest in agriculture, horticulture or wood culture, such as crop plants, namely those of the family Poaceae, any starch producing plants, such as potato, manioc, leguminous plants, oil producing plants, such as oilseed rape, linseed, etc., plants using polypeptide as storage substances, such as soybean, plants using sucrose as storage substance, such as sugar beet or sugar cane, trees, ornamental plants etc. Preferably, the nucleic acid molecules according to the invention are derived from crop plants (e.g. maize, rice, barley, wheat, rye, oats etc.), potatoes, oil producing plants (e.g. oilseed rape, sunflower, peanut, soybean, etc.), cotton, sugar beet, sugar cane, leguminous plants (e.g. beans, peas etc.), and, of course, from Arabidopsis thaliana. Nucleic acid molecules hybridizing to the above-described nucleic acid molecules can be isolated, e.g., from libraries, such as cDNA or genomic libraries by techniques well known in the art.
For example, hybridizing nucleic acid molecules can be identified and isolated by using the above-described nucleic acid molecules or fragments thereof or complements thereof as probes to screen libraries by hybridizing with said molecules according to standard techniques. It is also possible to isolate such nucleic acid molecules by applying a nucleic acid amplification technique such as the polymerase chain reaction (PCR) using as primers oligonucleotides derived from the above-described nucleic acid molecules. Nucleic acid molecules may also be identified and isolated using microarrays or DNA chips (Southern et al. (1999) Nat. Genet, Jan:21(1 Suppl.):5-9; Ramsay, (1998) Nature Biotechnology, 16). Nucleic acid molecules which hybridize with any of the aforementioned nucleic acid molecules also include fragments, derivatives and allelic variants of the above-described nucleic acid molecules that encode a cell cycle interacting protein or an immunologically active or functional fragment thereof. Fragments are understood to be parts of nucleic acid molecules long enough to encode the described protein or a functional or immunologically active fragment thereof as defined above.

The term "derivative" means in this context that the nucleotide sequence of these nucleic acid molecules differs from the sequences of the above-described nucleic acid molecules in one or more nucleotide positions and is highly homologous to said nucleic acid molecules. Homology is understood to refer to a sequence identity of at least 40%, particularly an identity of at least 60%, preferably more than 80% and still more preferably more than 90%. The term "substantially homologous" refers to a subject, for instance a nucleic acid, which is at least 50% identical in sequence to the reference when the entire ORF (open reading frame) is compared, where the sequence identity is preferably at least 70%, more preferably at least 80%, still more preferably at least [...], especially more than about 90%, most preferably 95% or greater, particularly 98% or greater. The deviations from the sequences of the nucleic acid molecules described above can, for example, be the result of nucleotide substitution(s), deletion(s), addition(s), insertion(s) and/or recombination(s); see supra.
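The identity thresholds given above can be applied to a pair of already aligned sequences as in the following sketch. Several points are assumptions made here for illustration: the thresholds are read as percentages, identity is computed over alignment columns in which at least one sequence has a residue, and the function names and example sequences are invented. Other conventions for the denominator exist and would give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns, skipping columns that are gaps in both."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def homology_tier(pct):
    """Return the most stringent identity tier from the text that the value satisfies."""
    if pct > 90.0:
        return "more than 90%"
    if pct > 80.0:
        return "more than 80%"
    if pct >= 60.0:
        return "at least 60%"
    if pct >= 40.0:
        return "at least 40%"
    return "below 40%"


# Invented aligned sequences, for illustration only.
identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
print(identity, homology_tier(identity))
```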

Homology further means that the respective nucleic acid molecules or encoded proteins may also be functionally and/or structurally equivalent. The nucleic acid molecules that are homologous to the nucleic acid molecules described above and that are derivatives of said nucleic acid molecules are, for example, variations of said nucleic acid molecules which represent modifications having the same biological function, in particular encoding proteins with the same or substantially the same biological function. They may be naturally occurring variations, such as sequences from other plant varieties or species, or mutations. These mutations may occur naturally or may be obtained by mutagenesis techniques. The allelic variations may be naturally occurring allelic variants as well as synthetically produced or genetically engineered variants; see supra.

The proteins encoded by the various derivatives and variants of the above-described nucleic acid molecules may share specific common characteristics, such as biological activity, molecular weight, immunological reactivity, conformation, etc., as well as physical properties, such as electrophoretic mobility, chromatographic behavior, sedimentation coefficients, pH optimum, temperature optimum, stability, solubility, spectroscopic properties, etc.

Examples of the different possible applications of the nucleic acid molecules according to the invention as well as molecules derived from them will be described in detail in the following.

Uses of the nucleic acid molecules of the present invention

In one embodiment, the present invention relates to a nucleic acid molecule which hybridizes with the complementary strand of the nucleic acid molecule of the invention and which encodes a mutated version of the protein as defined above which has lost its immunological and/or biological activity. This embodiment may prove useful for, e.g., generating dominant mutant alleles of the above-described cell cycle interacting proteins. Said mutated version is preferably generated by substitution, deletion and/or addition of 1 to 5 or 5 to 10 amino acid residues in the amino acid sequence of the above-described wild type proteins.

In a further embodiment, the invention relates to nucleic acid molecules of at least [...] nucleotides in length hybridizing specifically with a nucleic acid molecule as described above or with a complementary strand thereof. Specific hybridization occurs preferably under stringent conditions and implies no or very little cross-hybridization with nucleotide sequences encoding no or substantially different proteins. Such nucleic acid molecules may be used as probes and/or for the control of gene expression. Nucleic acid probe technology is well known to those skilled in the art who will readily appreciate that such probes may vary in length. Preferred are nucleic acid probes of 16 to 35 nucleotides in length. Of course, it may also be appropriate to use nucleic acids of up to 100 and more nucleotides in length. The nucleic acid probes of the invention are useful for various applications. On the one hand, they may be used as primers for amplification of nucleic acid sequences according to the invention. The design and use of said primers are known by the person skilled in the art. Preferably such amplification primers comprise a contiguous sequence of at least 6 nucleotides, in particular 13 nucleotides, preferably [...] to 25 nucleotides or more, identical or complementary to the nucleotide sequence depicted in SEQ ID NOS: 1, 3, 33, 35, 37, 39, 41, 5, 7, 9, 11 or 13 or to a nucleotide sequence encoding the amino acid sequence of SEQ ID NOS: 2, 4, 34, 36, 38, 40, 42, 6, 8, 10, 12 or 14. Another application is the use as a hybridization probe to identify nucleic acid molecules hybridizing with a nucleic acid molecule of the invention by homology screening of genomic DNA or cDNA libraries.
Nucleic acid molecules according to this preferred embodiment of the invention which are complementary to a nucleic acid molecule as described above are preferably at least 17 nucleotides in length and may also be used for repression of expression of a cell cycle gene, for example due to an antisense or triple helix effect or for the construction of appropriate ribozymes (see, e.g., EP-A1 0 291 533, EP-A1 0 321 201, EP-A2 0 360 257) which specifically cleave the (pre)-mRNA of a gene comprising a nucleic acid molecule of the invention or part thereof. Selection of appropriate target sites and corresponding ribozymes can be done as described, for example, in Steinecke, Ribozymes, Methods in Cell Biology, Galbraith et al. eds, Academic Press, Inc. (1995), 449-460. The above described nucleic acid molecules may either be DNA or RNA or a hybrid thereof. Furthermore, said nucleic acid molecule may contain, for example, thioester bonds and/or nucleotide analogues, commonly used in oligonucleotide anti-sense approaches. Said modifications may be useful for the stabilization of the nucleic acid molecule against endo- and/or exonucleases in the cell. Said nucleic acid molecules may be transcribed by an appropriate vector containing a chimeric gene which allows for the transcription of said nucleic acid molecule in the cell.
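Deriving a complementary (antisense) oligonucleotide of at least 17 nucleotides from a target region, as described above, can be sketched as follows. The function names and the target sequence are invented for illustration; the sketch only computes the reverse complement and does not address target site selection, chemical modifications or ribozyme design.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(target, start, length=17):
    """Return an oligonucleotide complementary to target[start:start + length].

    The minimum length of 17 follows the preference stated in the text.
    """
    if length < 17:
        raise ValueError("complementary molecules are preferably at least 17 nt long")
    if start < 0 or start + length > len(target):
        raise ValueError("requested window lies outside the target sequence")
    return reverse_complement(target[start:start + length])


# Invented target sequence, for illustration only.
target = "ATGGCTAAACTTGGATCCGTA"
oligo = antisense_oligo(target, start=0)
print(oligo, len(oligo))
```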

Furthermore, the person skilled in the art is well aware that it is also possible to label such a nucleic acid probe with an appropriate marker for specific applications, such as for the detection of the presence of a nucleic acid molecule of the invention in a sample derived from an organism, in particular plants. A number of companies such as Pharmacia Biotech (Piscataway NJ), Promega (Madison WI), and US Biochemical Corp (Cleveland OH) supply commercial kits and protocols for these procedures. Suitable reporter molecules or labels include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles and the like. Patents teaching the use of such labels include US Patents US-A-3,817,837; US-A-3,850,752; US-A-3,939,350; US-A-3,996,345; US-A-4,227,437; US-A-4,275,149 and US-A-4,366,241. Also, recombinant immunoglobulins may be produced as shown in US-A-4,816,567, incorporated herein by reference.

Furthermore, the so-called "peptide nucleic acid" (PNA) technique can be used for the detection or inhibition of the expression of a nucleic acid molecule of the invention. For example, the binding of PNAs to complementary as well as various single stranded RNA and DNA nucleic acid molecules can be systematically investigated using thermal denaturation and BIAcore surface-interaction techniques (Jensen, Biochemistry 36 (1997), 5072-5077). Furthermore, the nucleic acid molecules described above as well as PNAs derived therefrom can be used for detecting point mutations by hybridization with nucleic acids obtained from a sample with an affinity sensor, such as BIAcore; see Gotoh, Rinsho Byori 45 (1997), 224-228. Hybridization based DNA screening on peptide nucleic acid (PNA) oligomer arrays is described in the prior art, for example in Weiler, Nucleic Acids Research 25 (1997), 2792-2799. The synthesis of PNAs can be performed according to methods known in the art, for example, as described in Koch, J. Pept. Res. 49 (1997), 80-88; Finn, Nucleic Acids Research 24 (1996), 3357-3363.

Further possible applications of such PNAs, for example as restriction enzymes or as templates for the synthesis of nucleic acid oligonucleotides, are known to the person skilled in the art and are, for example, described in Veselkov, Nature 379 (1996), 214 and Bohler, Nature 376 (1995), 578-581. A further application of the nucleic acids of the invention is their use in a two-hybrid system to identify interacting proteins (i.e., proteins that specifically interact with the nucleic acid-encoding products). Methods for preparing and performing the two-hybrid screen are known in the art, including descriptions provided in this document; generally see Hannon G. and Bartel P., Identification of interacting proteins using the two-hybrid system, Methods Mol. Cellular Biol. 5 (1995), 289-297.

Detection and mapping of related polynucleotide sequences

The nucleic acid sequence for a cell cycle interacting protein of the invention can also be used to generate hybridization probes for mapping the naturally occurring genomic sequence. The sequence may be mapped to a particular chromosome or to a specific region of the chromosome using well known techniques. These include in situ hybridization to chromosomal spreads, flow-sorted chromosomal preparations, or artificial chromosome constructions such as yeast artificial chromosomes, bacterial artificial chromosomes, bacterial P1 constructions or single chromosome cDNA libraries as reviewed in Price (Blood Rev. 7 (1993), 127-134) and Trask (Trends Genet. 7 (1991), 149-154). The technique of fluorescent in situ hybridization of chromosome spreads has been described, among other places, in Verma (1988) Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. Fluorescent in situ hybridization of chromosomal preparations and other physical chromosome mapping techniques may be correlated with additional genetic map data. Examples of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f) and Meinke, Science 282 (1998), 662-682. Correlation between the location of the gene encoding a cell cycle interacting protein of the invention on a physical chromosomal map and a specific feature, e.g., plant growth, architecture, yield, stress, disease etc., may help delimit the region of DNA associated with this feature. The nucleotide sequences of the subject invention may be used to detect differences in gene sequences between normal, carrier or affected individuals. Furthermore, the means and methods described herein can be used for marker-assisted breeding.

In situ hybridization of chromosomal preparations and physical mapping techniques such as linkage analysis using established chromosomal markers may be used for extending genetic maps. For example, a sequence tagged site based map of the human genome was recently published by the Whitehead-MIT Center for Genomic Research (Hudson, Science 270 (1995), 1945-1954), and a map of the plant genome by way of the Arabidopsis genome is available from http://genome.wwz.Stanford.edu/cgibin/AtDB/nph-blast2atdb. Often the placement of a gene on the chromosome of another species may reveal associated markers even if the number or arm of a particular chromosome is not known. New sequences can be assigned to chromosomal arms, or parts thereof, by physical mapping. This provides valuable information to investigators searching for interacting genes using positional cloning or other gene discovery techniques. Once such a gene has been crudely localized by genetic linkage to a particular genomic region, any sequences mapping to that area may represent associated or regulatory genes for further investigation. The nucleotide sequence of the subject invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc. among normal, carrier or affected individuals.

Vectors and expression systems

The present invention also relates to vectors, particularly plasmids, cosmids, viruses, bacteriophages and other vectors used conventionally in genetic engineering that contain a nucleic acid molecule according to the invention. Methods which are well known to those skilled in the art can be used to construct various plasmids and vectors; see, for example, the techniques described in Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory (1989) N.Y. and Ausubel, Current Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience, N.Y. (1989), (1994). Plasmids and vectors to be preferably employed in accordance with the present invention include those well known in the art. Alternatively, the nucleic acid molecules and vectors of the invention can be reconstituted into liposomes for delivery to target cells.

In a preferred embodiment the nucleic acid molecule present in the vector is linked to (a) control sequence(s) which allow the expression of the nucleic acid molecule in prokaryotic and/or eukaryotic cells.

The term "control sequence" refers to regulatory DNA sequences which are necessary to effect the expression of coding sequences to which they are ligated. The nature of such control sequences differs depending upon the host organism. In prokaryotes, control sequences generally include promoters, ribosomal binding sites, and terminators. In eukaryotes, control sequences generally include promoters, terminators and, in some instances, enhancers, transactivators or transcription factors. The term "control sequence" is intended to include, at a minimum, all components the presence of which is necessary for expression, and may also include additional advantageous components.

The term "operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence "operably linked" to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences. In case the control sequence is a promoter, it is obvious for a skilled person that double-stranded nucleic acid is preferably used.

Thus, the vector of the invention is preferably an expression vector. An "expression vector" is a construct that can be used to transform a selected host cell and provides for expression of a coding sequence in the selected host. Expression vectors can for instance be cloning vectors, binary vectors or integrating vectors. Expression comprises transcription of the nucleic acid molecule preferably into a translatable mRNA.

Regulatory elements ensuring expression in prokaryotic and/or eukaryotic cells are well known to those skilled in the art. In the case of eukaryotic cells they normally comprise promoters ensuring initiation of transcription and optionally poly-A signals ensuring termination of transcription and stabilization of the transcript, for example in plants, those of the 35S RNA from Cauliflower Mosaic Virus (CaMV). Other promoters commonly used are the polyubiquitin promoter and the actin promoter for ubiquitous expression.

The termination signals usually employed are from the Nopaline Synthase promoter or from the CaMV 35S promoter. A plant translational enhancer often used is the TMV omega sequence; the inclusion of an intron (Intron-1 from the Shrunken gene of maize, for example) has been shown to increase expression levels by up to 100-fold (Mait, Transgenic Research 6 (1997), 143-156; Ni, Plant Journal 7 (1995), 661-676). Additional regulatory elements may include transcriptional as well as translational enhancers.

Possible regulatory elements permitting expression in prokaryotic host cells comprise, e.g., the PL, lac, trp or tac promoter in E. coli, and examples of regulatory elements permitting expression in eukaryotic host cells are the AOX1 or GAL1 promoter in yeast or the CMV-, SV40-, RSV-promoter (Rous sarcoma virus), CMV-enhancer, enhancer or a globin intron in mammalian and other animal cells. In this context, suitable expression vectors are known in the art such as Okayama-Berg cDNA expression vector pcDV1 (Pharmacia), pCDM8, pRc/CMV, pcDNA1, pcDNA3 (Invitrogen), pSPORT1 (GIBCO BRL). An alternative expression system which could be used to express a cell cycle interacting protein is an insect system. In one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. The coding sequence of a nucleic acid molecule of the invention may be cloned into a nonessential region of the virus, such as the polyhedrin gene, and placed under control of the polyhedrin promoter.

Successful insertion of said coding sequence will render the polyhedrin gene inactive and produce recombinant virus lacking coat protein. The recombinant viruses are then used to infect S. frugiperda cells or Trichoplusia larvae in which the protein of the invention is expressed (Smith, J. Virol. 46 (1983), 584; Engelhard, Proc. Nat. Acad. Sci. USA 91 (1994), 3224-3227). Further promoters and expression systems that may be used in accordance with the present invention are described in the prior art, for example WO 99/61619.

Advantageously, the above-described vectors of the invention comprise a selectable and/or scorable marker. Selectable marker genes useful for the selection of transformed plant cells, callus, plant tissue and plants are well known to those skilled in the art and comprise, for example, antimetabolite resistance as the basis of selection for dhfr, which confers resistance to methotrexate (Reiss, Plant Physiol. (Life Sci. Adv.) 13 (1994), 143-149); npt, which confers resistance to the aminoglycosides neomycin, kanamycin and paromomycin (Herrera-Estrella, EMBO J. 2 (1983), 987-995) and hygro, which confers resistance to hygromycin (Marsh, Gene 32 (1984), 481-485). Additional selectable genes have been described, namely trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows cells to utilize histidinol in place of histidine (Hartman, Proc. Natl. Acad. Sci. USA 85 (1988), 8047); mannose-6-phosphate isomerase which allows cells to utilize mannose (WO 94/20627) and ODC (ornithine decarboxylase) which confers resistance to the ornithine decarboxylase inhibitor, 2-(difluoromethyl)-DL-ornithine, DFMO (McConlogue, 1987, In: Current Communications in Molecular Biology, Cold Spring Harbor Laboratory ed.) or deaminase from Aspergillus terreus which confers resistance to Blasticidin S (Tamura, Biosci. Biotechnol. Biochem. 59 (1995), 2336-2338).

Useful scorable markers are also known to those skilled in the art and are commercially available. Advantageously, said marker is a gene encoding luciferase (Giacomin, Pl. Sci. 116 (1996), 59-72; Scikantha, J. Bact. 178 (1996), 121), green fluorescent protein (Gerdes, FEBS Lett. 389 (1996), 44-47) or β-glucuronidase (Jefferson, EMBO J. 6 (1987), 3901-3907). This embodiment is particularly useful for simple and rapid screening of cells, tissues and organisms containing a vector of the invention.

The present invention furthermore relates to host cells comprising a vector as described above or a nucleic acid molecule according to the invention wherein the nucleic acid molecule is foreign to the host cell.

By "foreign" it is meant that the nucleic acid molecule is either heterologous with respect to the host cell, this means derived from a cell or organism with a different genomic background, or is homologous with respect to the host cell but located in a different genomic environment than the naturally occurring counterpart of said nucleic acid molecule. This means that, if the nucleic acid molecule is homologous with respect to the host cell, it is not located in its natural location in the genome of said host cell, in particular it is surrounded by different genes. In this case the nucleic acid molecule may be either under the control of its own promoter or under the control of a heterologous promoter. The vector or nucleic acid molecule according to the invention which is present in the host cell may either be integrated into the genome of the host cell or it may be maintained in some form extrachromosomally. In this respect, it is also to be understood that the nucleic acid molecule of the invention can be used to restore or create a mutant gene via homologous recombination (Paszkowski, Homologous Recombination and Gene Silencing in Plants, Kluwer Academic Publishers (1994)).

The host cell can be any prokaryotic or eukaryotic cell, such as bacterial, insect, fungal, plant or animal cells. Preferred fungal cells are, for example, those of the genus Saccharomyces, in particular those of the species S. cerevisiae.

The term "prokaryotic" is meant to include all bacteria which can be transformed or transfected with DNA or RNA molecules for the expression of a protein of the invention. Prokaryotic hosts may include gram negative as well as gram positive bacteria such as, for example, E. coli, S. typhimurium, Serratia marcescens and Bacillus subtilis.

The term "eukaryotic" is meant to include yeast, higher plant, insect and preferably mammalian cells. Depending upon the host employed in a recombinant production procedure, the protein encoded by the polynucleotide of the present invention may be glycosylated or may be non-glycosylated. The cell cycle interacting proteins of the invention may or may not also include an initial methionine amino acid residue. A polynucleotide of the invention can be used to transform or transfect the host using any of the techniques commonly known to those of ordinary skill in the art. Furthermore, methods for preparing fused, operably linked genes and expressing them in, e.g., mammalian cells and bacteria are well-known in the art (Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989).

Another subject of the invention is a method for the preparation of cell cycle interacting proteins which comprises the cultivation of host cells according to the invention which, due to the presence of a vector or a nucleic acid molecule according to the invention, are able to express such a protein, under conditions which allow expression of the protein, and recovery of the so-produced protein from the culture. It is also to be understood that the proteins can be expressed in a cell free system using, for example, in vitro translation assays known in the art.

The term "expression" means the production of a protein or nucleotide sequence in the cell. However, said term also includes expression of the protein in a cell-free system. It includes transcription into an RNA product, post-transcriptional modification and/or translation to a protein product or polypeptide from a DNA encoding that product, as well as possible post-translational modifications. Depending on the specific constructs and conditions used, the protein may be recovered from the cells, from the culture medium or from both. The terms "protein" and "polypeptide" used in this application are interchangeable. "Polypeptide" refers to a polymer of amino acids (amino acid sequence) and does not refer to a specific length of the molecule. Thus peptides and oligopeptides are included within the definition of polypeptide. This term does also refer to or include post-translational modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring. For example, it is well known by the person skilled in the art that it is not only possible to express a native protein but also to express the protein as fusion polypeptides or to add signal sequences directing the protein to specific compartments of the host cell, ensuring secretion of the protein into the culture medium, etc. The protein of the invention may also be expressed as a recombinant protein with one or more additional polypeptide domains added to facilitate protein purification. 
Such purification-facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle WA). The inclusion of a cleavable linker sequence such as Factor Xa or enterokinase (Invitrogen, San Diego CA) between the purification domain and the protein of interest is useful to facilitate purification. One such expression vector provides for expression of a fusion protein comprising a cell cycle interacting protein and contains nucleic acid encoding 6 histidine residues followed by thioredoxin and an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography as described in Porath, Protein Expression and Purification 3 (1992), 263-281) while the enterokinase cleavage site provides a means for purifying the cell cycle interacting protein from the fusion protein.
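The layout of the fusion protein described above can be illustrated with a minimal sketch that treats each part as a string of one-letter amino acid codes. The thioredoxin and target sequences below are hypothetical placeholders, not sequences of the invention, and the recognition motif Asp-Asp-Asp-Asp-Lys (cleavage after the lysine) is the commonly cited enterokinase site rather than something stated in this specification.

```python
from typing import Tuple

HIS_TAG = "H" * 6              # six histidine residues for binding to immobilized metal
ENTEROKINASE_SITE = "DDDDK"    # assumed recognition motif; cleavage occurs after the K

def build_fusion(thioredoxin: str, target: str) -> str:
    """Join His tag, thioredoxin, cleavage site and target protein, N- to C-terminus."""
    return HIS_TAG + thioredoxin + ENTEROKINASE_SITE + target

def cleave(fusion: str) -> Tuple[str, str]:
    """Split the fusion after the first enterokinase site.

    Returns (tag portion, released protein).
    """
    idx = fusion.find(ENTEROKINASE_SITE)
    if idx < 0:
        raise ValueError("no enterokinase site found")
    cut = idx + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]

# Hypothetical placeholder sequences for illustration only.
TRX_PLACEHOLDER = "SGKIIHLT"
TARGET_PLACEHOLDER = "MAEQLVRS"

fusion = build_fusion(TRX_PLACEHOLDER, TARGET_PLACEHOLDER)
tag_part, released = cleave(fusion)
print(tag_part)   # tag portion that would stay bound to the metal resin
print(released)   # protein of interest released by cleavage
```

In this toy model the released portion is exactly the target sequence, which mirrors the purpose of placing the cleavage site directly upstream of the protein of interest.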

In addition to recombinant production, fragments of the protein of the invention may be produced by direct peptide synthesis using solid-phase techniques (cf. Stewart et al. (1969) Solid Phase Peptide Synthesis, WH Freeman Co, San Francisco; Merrifield, J. Am. Chem. Soc. 85 (1963), 2149-2154). In vitro protein synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer, Foster City CA) in accordance with the instructions provided by the manufacturer. Various fragments of the cell cycle interacting protein of the invention may be chemically synthesized and/or modified separately and combined using chemical methods to produce the full length molecule. Once expressed or synthesized, the protein of the present invention can be purified according to standard procedures of the art, including ammonium sulfate precipitation, affinity columns, column chromatography, gel electrophoresis and the like; see, Scopes, "Protein Purification", Springer-Verlag, N.Y. (1982). Substantially pure proteins of at least about 90 to 95% homogeneity are preferred, and 98 to 99% or more homogeneity are most preferred, for pharmaceutical uses. Once purified, partially or to homogeneity as desired, the proteins may then be used therapeutically (including extracorporeally) or in developing and performing assay procedures.

Cell cycle interacting proteins of the invention

The present invention furthermore relates to cell cycle interacting proteins encoded by the nucleic acid molecules according to the invention or produced or obtained by the above-described methods, and to functional and/or immunologically active fragments of such cell cycle interacting proteins. The proteins and polypeptides of the present invention are not necessarily translated from a designated nucleic acid sequence; the polypeptides may be generated in any manner, including for example, chemical synthesis, or expression in a recombinant expression system, or isolation from a suitable viral system. The polypeptides may include one or more analogs of amino acids, phosphorylated amino acids or unnatural amino acids. Methods of inserting analogs of amino acids into a sequence are known in the art. The polypeptides may also include one or more labels, which are known to those skilled in the art. In this context, it is also understood that the proteins according to the invention may be further modified by conventional methods known in the art. By providing the proteins according to the present invention it is also possible to determine fragments which retain biological activity. This allows the construction of chimeric proteins and peptides comprising an amino acid sequence derived from the protein of the invention, which is crucial for its, e.g., binding activity, and other functional amino acid sequences, e.g. GUS marker gene (Jefferson, EMBO J. 6 (1987), 3901-3907). The other functional amino acid sequences may be either physically linked by chemical means to the proteins of the invention or may be fused by recombinant DNA techniques well known in the art.

The term "fragment of a sequence" or "part of a sequence" means a truncated sequence of the original sequence referred to. The truncated sequence (nucleic acid or protein sequence) can vary widely in length; the minimum size being a sequence of sufficient size to provide a sequence with at least a comparable function and/or activity of the original sequence referred to, while the maximum size is not critical. In some applications, the maximum size usually is not substantially greater than that required to provide the desired activity and/or function(s) of the original sequence. Typically, the truncated amino acid sequence will range from about 5 to about 60 amino acids in length. More typically, however, the sequence will be a maximum of about 50 amino acids in length, preferably a maximum of about 30 amino acids. It is usually desirable to select sequences of at least about 10, 12 or 15 amino acids, up to a maximum of about or 25 amino acids.
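The length ranges given above can be made concrete with a short sketch that enumerates every contiguous fragment of a protein sequence within a chosen length window. The function name, the default bounds of 10 and 25 residues (taken from the figures mentioned above) and the example sequence are illustrative assumptions; the sketch says nothing about which fragments retain function, which has to be determined experimentally.

```python
from typing import Iterator, Tuple

def fragments(sequence: str, min_len: int = 10, max_len: int = 25) -> Iterator[Tuple[int, str]]:
    """Yield (start index, fragment) for each contiguous fragment
    whose length lies between min_len and max_len inclusive."""
    n = len(sequence)
    for length in range(min_len, min(max_len, n) + 1):
        for start in range(n - length + 1):
            yield start, sequence[start:start + length]

# Hypothetical 27-residue sequence, used only to exercise the function.
EXAMPLE = "MSTNPKPQRKTKRNTNRRPQDVKFPGG"

frags = list(fragments(EXAMPLE, min_len=10, max_len=12))
print(len(frags))   # number of candidate fragments of 10 to 12 residues
print(frags[0])     # first fragment with its start position
```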

Furthermore, folding simulations and computer redesign of structural motifs of the protein of the invention can be performed using appropriate computer programs (Olszewski, Proteins 25 (1996), 286-299; Hoffman, Comput. Appl. Biosci. 11 (1995), 675-679).

Computer modeling of protein folding can be used for the conformational and energetic analysis of detailed peptide and protein models (Monge, J. Mol. Biol. 247 (1995), 995-1012; Renouf, Adv. Exp. Med. Biol. 376 (1995), 37-45). In particular, the appropriate programs can be used for the identification of interactive sites of the cell cycle interacting protein and its receptor, its ligand or other interacting proteins by computer-assisted searches for complementary peptide sequences (Fassina, Immunomethods 5 (1994), 114-120). Further appropriate computer systems for the design of proteins and peptides are described in the prior art, for example in Berry, Biochem. Soc. Trans. 22 (1994), 1033-1036; Wodak, Ann. N. Y. Acad. Sci. 501 (1987), 1-13; Pabo, Biochemistry 25 (1986), 5987-5991. The results obtained from the above-described computer analysis can be used for the preparation of peptide mimetics of the protein of the invention or fragments thereof. Such pseudopeptide analogues of the natural amino acid sequence of the protein may very efficiently mimic the parent protein (Benkirane, J. Biol. Chem. 271 (1996), 33218-33224). For example, incorporation of easily available achiral ω-amino acid residues into a protein of the invention or a fragment thereof results in the substitution of amide bonds by polymethylene units of an aliphatic chain, thereby providing a convenient strategy for constructing a peptide mimetic (Banerjee, Biopolymers 39 (1996), 769-777). Superactive peptidomimetic analogues of small peptide hormones in other systems are described in the prior art (Zhang, Biochem. Biophys. Res. Commun. 224 (1996), 327-331). Appropriate peptide mimetics of the protein of the present invention can also be identified by the synthesis of peptide mimetic combinatorial libraries through successive amide alkylation and testing the resulting compounds for their binding and immunological properties.

Methods for the generation and use of peptidomimetic combinatorial libraries are described in the prior art, for example in Ostresh, Methods in Enzymology 267 (1996), 220-234 and Dorner, Bioorg. Med. Chem. 4 (1996), 709-715.

Furthermore, a three-dimensional and/or crystallographic structure of the protein of the invention can be used for the design of peptide mimetic inhibitors of the biological activity of the protein of the invention (Rose, Biochemistry 35 (1996), 12933-12944; Rutenber, Bioorg. Med. Chem. 4 (1996), 1545-1558).

Antibodies

Furthermore, the present invention relates to antibodies specifically recognizing a cell cycle interacting protein according to the invention or parts, i.e. specific fragments or epitopes, of such a protein. The antibodies of the invention can be used to identify and isolate other cell cycle interacting proteins and genes in any organism, preferably plants. These antibodies can be monoclonal antibodies, polyclonal antibodies or synthetic antibodies as well as fragments of antibodies, such as Fab, Fv or scFv fragments etc. Monoclonal antibodies can be prepared, for example, by the techniques as originally described in Köhler and Milstein, Nature 256 (1975), 495, and Galfre, Meth. Enzymol. 73 (1981), 3, which comprise the fusion of mouse myeloma cells to spleen cells derived from immunized mammals. Furthermore, antibodies or fragments thereof to the aforementioned peptides can be obtained by using methods which are described in Harlow and Lane "Antibodies, A Laboratory Manual", CSH Press, Cold Spring Harbor, 1988; Coligan, "Current Protocols in Immunology", Wiley/Greene, NY (1991). These antibodies can be used, for example, for the immunoprecipitation and immunolocalization of proteins according to the invention as well as for the monitoring of the synthesis of such proteins, for example, in recombinant organisms, and for the identification of compounds interacting with the protein according to the invention. For example, surface plasmon resonance as employed in the BIAcore system can be used to increase the efficiency of phage antibody selections, yielding a high increment of affinity from a single library of phage antibodies which bind to an epitope of the protein of the invention (Schier, Human Antibodies Hybridomas 7 (1996), 97-105; Malmborg, J. Immunol. Methods 183 (1995), 7-13). In many cases, the binding phenomena of antibodies to antigens are equivalent to other ligand/anti-ligand binding.

Transgenic plants

Plant cell division can conceptually be influenced in four ways: (i) inhibiting or arresting cell division, (ii) maintaining, facilitating or stimulating cell division, (iii) uncoupling DNA synthesis from mitosis and cytokinesis or (iv) uncoupling cell division from intrinsic developmental or external environmental conditions. Modulation of the expression of a cell cycle interacting protein encoded by a nucleotide sequence according to the invention surprisingly has an advantageous influence on plant cell division characteristics, in particular on the disruption of the G1/S and/or G2/M transition and as a result thereof on the total make-up of the plant concerned or parts thereof. An example is that DNA synthesis or mitosis may be negatively influenced by interfering with the formation of a cyclin-dependent protein kinase complex. Alternatively, overexpression of the CDK complex interacting protein accelerates reentry into the cell cycle.

The term "cyclin-dependent protein kinase complex" means the complex formed when a, preferably functional, cyclin associates with a, preferably functional, cyclin-dependent kinase. Such complexes may be active in phosphorylating proteins and may or may not contain additional protein species.

The term "protein kinase" means an enzyme catalyzing the phosphorylation of proteins.

To analyse the industrial applicabilities of the invention, transformed plants can be made in which the expression of the nucleotide sequence according to the invention is modulated. Such a modulation of the new gene(s), proteins or inactivated variants thereof will have either a positive or a negative effect on cell division. Methods to modify the expression levels and/or ratios and/or the activity are known to persons skilled in the art and include for instance overexpression, co-suppression, the use of ribozymes, sense and anti-sense strategies, and gene silencing approaches. "Sense strand" refers to the strand of a double-stranded DNA molecule that is homologous to a mRNA transcript thereof. The "antisense strand" contains an inverted sequence which is complementary to that of the "sense strand".
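The relationship between the two strands can be shown with a minimal sketch: the antisense strand is the reverse complement of the sense strand when both are written 5' to 3'. The function name and the nine-base example sequence are hypothetical.

```python
# Watson-Crick base pairing for DNA, upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antisense(sense: str) -> str:
    """Return the antisense strand (5'->3') for a sense-strand DNA sequence (5'->3')."""
    # Complement each base, then reverse so the result also reads 5'->3'.
    return sense.translate(COMPLEMENT)[::-1]

sense = "ATGGCTTCA"        # hypothetical sense strand
anti = antisense(sense)
print(anti)                # TGAAGCCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check on the base-pairing table.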

Hence, the nucleic acid molecules according to the invention are in particular useful for the genetic manipulation of plant cells in order to modify the characteristics of plants and to obtain plants with modified, preferably with improved or useful phenotypes. Similarly, the invention can also be used to modulate the cell division and the growth of cells, preferentially plant cells, in in vitro cultures. Specifically, the plant cell division rate and/or the inhibition of plant cell division can be influenced by overexpressing or reducing the expression of a gene encoding a protein according to the invention. Overexpression of a cell cycle interacting protein encoding gene according to the invention promotes cell proliferation, while reducing gene expression arrests cell division or prevents reentry into the cell cycle. Part of the invention is thus the usage of the nucleic acid molecules as mentioned hereinbefore as a negative or positive regulator of cell proliferation. A transformed plant can thus be obtained by transforming a plant cell with a gene encoding a polypeptide concerned or fragment thereof alone or in combination. For this purpose tissue specific promoters, in one construct or being present as a separate construct in addition to the sequence concerned, can be used. Surprisingly, by using a polypeptide or fragment thereof according to the invention, or by using antisense RNA for the gene according to the invention, cell division in the meristems of the plant can be manipulated, positively or negatively respectively. Furthermore, overproduction of the cell cycle interacting protein of the invention enhances growth and renders cell division less sensitive to an arrest caused by environmental stress such as salt, nutrient deprivation, drought, chilling and the like.

Thus, the present invention relates to a method for the production of transgenic plants, plant cells or plant tissue comprising the introduction of a nucleic acid molecule or vector of the invention into the genome of said plant, plant cell or plant tissue.

For the expression of the nucleic acid molecules according to the invention in sense or antisense orientation in plant cells, the molecules are placed under the control of regulatory elements which ensure the expression in plant cells. These regulatory elements may be heterologous or homologous with respect to the nucleic acid molecule to be expressed as well as with respect to the plant species to be transformed. In general, such regulatory elements comprise a promoter active in plant cells. These promoters can be used to modulate (e.g. increase, decrease, alter) cell cycle interacting protein content and/or composition in a desired tissue or under certain conditions. To obtain expression in all tissues of a transgenic plant, preferably constitutive promoters are used, such as the 35S promoter of CaMV (Odell, Nature 313 (1985), 810-812) or promoters from such genes as rice actin (McElroy et al. (1990) Plant Cell 2:163-171), maize H3 histone (Lepetit et al. (1992) Mol. Gen. Genet. 231:276-285) or promoters of the polyubiquitin genes of maize (Christensen, Plant Mol. Biol. 18 (1992), 675-689). In order to achieve expression in specific tissues of a transgenic plant it is possible to use tissue specific promoters (see, Stockhaus, EMBO J. 8 (1989), 2245-2251 or Table A).

Table A: Exemplary tissue specific or tissue-preferred promoters for use in the performance of the present invention.

GENE SOURCE | EXPRESSION PATTERN | REFERENCE
α-amylase (Amy32b) | aleurone | Lanahan, et al., Plant Cell 4:203-211, 1992; Skriver, et al., Proc. Natl. Acad. Sci. (USA) 88:7266-7270, 1991.
cathepsin β-like gene | aleurone | Cejudo, et al., Plant Molecular Biology 20:849-856, 1992.
Agrobacterium rhizogenes rolB | cambium | Nilsson et al., Physiol. Plant. 100:456-462, 1997.
PRP genes | cell wall | http://salus.medium.edu/mmg/tierney/html
barley Itr1 promoter | endosperm |
synthetic promoter | endosperm | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
AtPRP4 | flowers | http://salus.medium.edu/mmg/tierney/html
chalcone synthase (chsA) | flowers | Van der Meer, et al., Plant Mol. Biol. 1990.
LAT52 | anther | Twell et al., Mol. Gen. Genet. 217:240-245 (1989)
apetala-3 | flowers |
chitinase | fruit (berries, grapes, etc) | Thomas et al., CSIRO Plant Industry, Urrbrae, South Australia, Australia; .html
rbcs-3A | green tissue (eg leaf) | Lam, E. et al., The Plant Cell 2: 857-866, 1990; Tucker et al., Plant Physiol. 1992.
leaf-specific genes | leaf | Baszczynski, et al., Nucl. Acid Res. 16: 4732, 1988.
AtPRP4 | leaf | http://salus.medium.edu/mmg/tierney/html
Pinus cab-6 | leaf | Yamamoto et al., Plant Cell Physiol. 35:773-778, 1994.
SAM22 | senescent leaf | Crowell, et al., Plant Mol. Biol. 18:459-466, 1992.
R. japonicum nif gene | nodule | United States Patent No. 4,803,165
B. japonicum nifH gene | nodule | United States Patent No. 5,008,194
 | nodule | Yang, et al., The Plant J. 3: 573-585.
PEP carboxylase (PEPC) | nodule | Pathirana, et al., Plant Mol. Biol. 437-450, 1992.
leghaemoglobin (Lb) | nodule | Gordon, et al., J. Exp. Bot. 44:1453-1465, 1993.
Tungro bacilliform virus gene | phloem | Bhattacharyya-Pakrasi, et al., The Plant J. 4:71-79, 1992.
sucrose-binding protein gene | plasma membrane | Grimes, et al., The Plant Cell 4:1561-1574, 1992.
pollen-specific genes | pollen; microspore | Albani, et al., Plant Mol. Biol. 15: 605, 1990; Albani, et al., Plant Mol. Biol. 16: 1991.
Zm13 | pollen | Guerrero et al., Mol. Gen. Genet. (1993)
apg gene | microspore | Twell et al., Sex. Plant Reprod. 6:217-224 (1993)
maize pollen-specific gene | pollen | Hamilton, et al., Plant Mol. Biol. 18: 211-218, 1992.
sunflower pollen-expressed gene | pollen | Baltz, et al., The Plant J. 2: 713-721, 1992.
B. napus pollen-specific gene | pollen; anther; tapetum | Arnoldo, et al., J. Cell. Biochem., Abstract No. Y1O01, 204, 1992.
root-expressible genes | roots | Tingey, et al., EMBO J. 6:1, 1987.
tobacco auxin-inducible gene | root tip | Van der Zaal, et al., Plant Mol. Biol. 16, 983, 1991.
β-tubulin | root | Oppenheimer, et al., Gene 63: 87, 1988.
tobacco root-specific genes | root | Conkling, et al., Plant Physiol. 93: 1203, 1990.
B. napus G1-3b gene | root | United States Patent No. 5,401,836
SbPRP1 | roots | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993.
AtPRP1; AtPRP3 | roots; root hairs | http://salus.medium.edu/mmg/tierney/html
RD2 gene | root cortex | http://www2.cnsu.edu/ncsu/research
TobRB7 gene | root vasculature | http://www2.cnsu.edu/ncsu/research
AtPRP4 | leaves; flowers; lateral root primordia | http://salus.medium.edu/mmg/tierney/html
seed-specific genes | seed | Simon, et al., Plant Mol. Biol. 191, 1985; Scofield, et al., J. Biol. Chem. 262:12202, 1987; Baszczynski, et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | seed | Pearson, et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | seed | Ellis, et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | seed | Takaiwa, et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa, et al., FEBS Letts. 221: 43-47, 1987.
zein | seed | Matzke et al., Plant Mol. Biol. 14(3):323-32, 1990
napA | seed | Stalberg, et al., Planta 199: 515-519, 1996.
sunflower oleosin | seed (embryo and dry seed) | Cummins, et al., Plant Mol. Biol. 19: 873-876, 1992
LEAFY | shoot meristem | Weigel et al., Cell 69:843-859, 1992.
Arabidopsis thaliana knat1 | shoot meristem | Accession number AJ131822
Malus domestica kn1 | shoot meristem | Accession number Z71981
CLAVATA1 | shoot meristem | Accession number AF049870
stigma-specific genes | stigma | Nasrallah, et al., Proc. Natl. Acad. Sci. USA 85: 5551, 1988; Trick, et al., Plant Mol. Biol. 15: 203, 1990.
class I patatin gene | tuber | Liu et al., Plant Mol. Biol. 153:386-395, 1991.
blz2 | endosperm | EP99106056.7
PCNA rice | meristem | Kosugi et al., Nucleic Acids Research 19:1571-1576, 1991; Kosugi S. and Ohashi Y., Plant Cell 9:1607-1619, 1997.

The promoters listed in the table are provided for the purposes of exemplification only and the present invention is not to be limited by the list provided therein. Those skilled in the art will readily be in a position to provide additional promoters that are useful in performing the present invention. The promoters listed may also be modified to provide specificity of expression as required.
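As an illustration of how a promoter might be chosen for a desired tissue, the sketch below stores a handful of Table A rows in a dictionary and looks them up by expression pattern. Only six rows are copied in, and the function name and the substring-matching rule are assumptions made for this example, not part of the invention.

```python
from typing import Dict, List

# A few rows taken from Table A: gene source -> expression pattern.
TABLE_A: Dict[str, str] = {
    "LAT52": "anther",
    "LEAFY": "shoot meristem",
    "class I patatin gene": "tuber",
    "Brazil Nut albumin": "seed",
    "SAM22": "senescent leaf",
    "Pinus cab-6": "leaf",
}

def promoters_for(pattern: str) -> List[str]:
    """Return gene sources whose expression pattern contains the given tissue name."""
    wanted = pattern.lower()
    return sorted(name for name, tissue in TABLE_A.items() if wanted in tissue.lower())

print(promoters_for("leaf"))    # ['Pinus cab-6', 'SAM22']
print(promoters_for("tuber"))   # ['class I patatin gene']
```

Substring matching means a query for "leaf" also returns the senescent-leaf entry; an exact-match lookup would be the stricter alternative.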

Also known are promoters which are specifically active in tubers of potatoes or in seeds of different plant species, such as maize, Vicia, wheat, barley etc. Inducible promoters may be used in order to be able to exactly control expression under certain environmental or developmental conditions such as pathogens, anaerobia, light, etc. Examples of inducible promoters are the promoters of genes encoding heat shock proteins. Also microspore-specific regulatory elements and their uses have been described (WO96/16182). Furthermore, the chemically inducible Tet-system may be employed (Gatz, Mol. Gen. Genet. 227 (1991), 229-237). Further suitable promoters are known to the person skilled in the art and are described in Ward (Plant Mol. Biol. 22 (1993), 361-366). The regulatory elements may further comprise transcriptional and/or translational enhancers functional in plant cells. Furthermore, the regulatory elements may include transcription termination signals, such as a poly-A signal, which lead to the addition of a poly-A tail to the transcript which may improve its stability.

In the case that a nucleic acid molecule according to the invention is expressed in sense orientation it is in principle possible to modify the coding sequence in such a way that the protein is located in any desired compartment of the plant cell. These include the nucleus, the endoplasmic reticulum, the vacuole, the mitochondria, the plastids, the apoplast, the cytoplasm etc. Since the interacting component of the protein of the invention exerts its effects in the cytoplasm and/or nucleus, corresponding signal sequences are preferred to direct the protein of the invention to the same compartment. Methods for carrying out these modifications and signal sequences ensuring localization in a desired compartment are well known to the person skilled in the art.

Methods for the introduction of foreign DNA into plants are also well known in the art. These include, for example, the transformation of plant cells or tissues with T-DNA using Agrobacterium tumefaciens or Agrobacterium rhizogenes, the fusion of protoplasts, direct gene transfer (see, EP-A 164 575), injection, electroporation, biolistic methods like particle bombardment, pollen-mediated transformation, plant RNA virus-mediated transformation, liposome-mediated transformation, transformation using wounded or enzyme-degraded immature embryos, or wounded or enzyme-degraded embryogenic callus and other methods known in the art. The vectors used in the method of the invention may contain further functional elements, for example "left border"- and "right border"-sequences of the T-DNA of Agrobacterium which allow for stable integration into the plant genome. Furthermore, methods and vectors are known to the person skilled in the art which permit the generation of marker free transgenic plants, i.e. the selectable or scorable marker gene is lost at a certain stage of plant development or plant breeding. This can be achieved by, for example, cotransformation (Lyznik, Plant Mol. Biol. 13 (1989), 151-161; Peng, Plant Mol. Biol. 27 (1995), 91-104) and/or by using systems which utilize enzymes capable of promoting homologous recombination in plants (see, WO97/08331; Bayley, Plant Mol. Biol. 18 (1992), 353-361; Lloyd, Mol. Gen. Genet. 242 (1994), 653-657; Maeser, Mol. Gen. Genet. 230 (1991), 170-176; Onouchi, Nucl. Acids Res. 19 (1991), 6373-6378). Methods for the preparation of appropriate vectors are described by Sambrook (Molecular Cloning; A Laboratory Manual, 2nd Edition (1989), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).

Suitable strains of Agrobacterium tumefaciens and vectors as well as transformation of Agrobacteria and appropriate growth and selection media are well known to those skilled in the art and are described in the prior art (GV3101 (pMK90RK), Koncz, Mol. Gen. Genet. 204 (1986), 383-396; C58C1 (pGV 3850kan), Deblaere, Nucl. Acid Res. 13 (1985), 4777; Bevan, Nucleic Acid Res. 12 (1984), 8711; Koncz, Proc. Natl. Acad. Sci. USA 86 (1989), 8467-8471; Koncz, Plant Mol. Biol. 20 (1992), 963-976; Koncz, Specialized vectors for gene tagging and expression studies. In: Plant Molecular Biology Manual Vol 2, Gelvin and Schilperoort (Eds.), Dordrecht, The Netherlands: Kluwer Academic Publ. (1994), 1-22; EP-A-120 516; Hoekema: The Binary Plant Vector System, Offsetdrukkerij Kanters, Alblasserdam (1985), Chapter V; Fraley, Crit. Rev. Plant. Sci., 4, 1-46; An, EMBO J. 4 (1985), 277-287). Although the use of Agrobacterium tumefaciens is preferred in the method of the invention, other Agrobacterium strains, such as Agrobacterium rhizogenes, may be used, for example if a phenotype conferred by said strain is desired.

Methods for the transformation using biolistic methods are well known to the person skilled in the art; see, Wan, Plant Physiol. 104 (1994), 37-48; Vasil, Bio/Technology 11 (1993), 1553-1558 and Christou (1996) Trends in Plant Science 1, 423-431.

Microinjection can be performed as described in Potrykus and Spangenberg Gene Transfer To Plants. Springer Verlag, Berlin, NY (1995).

The transformation of most dicotyledonous plants is possible with the methods described above. Several successful transformation techniques have also been developed for monocotyledonous plants. These include transformation using biolistic methods as described above, as well as protoplast transformation, electroporation of partially permeabilized cells, introduction of DNA using glass fibers, etc.

The term "transformation" as used herein, refers to the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for the transfer. The polynucleotide may be transiently or stably introduced into the host cell and may be maintained non-integrated, for example, as a plasmid or as chimeric links, or alternatively, may be integrated into the host genome. The resulting transformed plant cell can then be used to regenerate a transformed plant in a manner known by a skilled person.

In general, the plants which can be modified according to the invention and which either show overexpression of a protein according to the invention or a reduction of the synthesis of such a protein can be derived from any desired plant species. They can be monocotyledonous plants or dicotyledonous plants; preferably they belong to plant species of interest in agriculture, wood culture or horticulture, such as crop plants (e.g. maize, rice, barley, wheat, rye, oats etc.), potatoes, oil producing plants (e.g. oilseed rape, sunflower, peanut, soybean, etc.), cotton, sugar beet, sugar cane, leguminous plants (e.g. beans, peas etc.), wood producing plants, preferably trees, etc.

Thus, the present invention relates also to a transgenic plant cell which contains (preferably stably integrated into its genome) a nucleic acid molecule according to the invention linked to regulatory elements which allow expression of the nucleic acid molecule in plant cells and wherein the nucleic acid molecule is foreign to the transgenic plant cell.

For the meaning of "foreign", see supra. The presence and expression of the nucleic acid molecule in the transgenic plant cells leads to the synthesis of a cell cycle interacting protein and to physiological and phenotypic changes in plants containing such cells.

Thus, the present invention also relates to transgenic plants and plant tissue comprising transgenic plant cells according to the invention. Due to the (over)expression of a cell cycle interacting protein of the invention at developmental stages and/or in plant tissue in which it does not naturally occur, these transgenic plants may show various physiological, developmental and/or morphological modifications in comparison to wild-type plants.

Therefore, part of this invention is the use of cell cycle genes and/or cell cycle interacting proteins to modulate the level of cell cycle interacting proteins and/or plant cell division and/or growth in plant cells, plant tissues, plant organs and/or whole plants. Also within the scope of the invention is a method to influence the activity of cyclin-dependent protein kinase in a plant cell by transforming the plant cell with a nucleic acid molecule according to the invention and/or by manipulating the expression of said molecule. More particularly, using a nucleic acid molecule according to the invention, the disruption of the plant cell cycle can be accomplished by interfering with the expression of a substrate for cyclin-dependent protein kinase. The latter goal may be achieved, for example, with methods for reducing the amount of active cell cycle interacting proteins.

For example, to obtain transgenic plants overexpressing an A. thaliana cell cycle interacting gene of the invention, its coding region can be cloned into the pTA7002 vector (Aoyama and Chua, Plant J. 11 (1997), 605-612). This vector allows inducible expression of the cloned inserts by the addition of the glucocorticoid dexamethasone.

For example, using polymerase chain reaction (PCR) technology, the coding region of the cell cycle interacting gene can be amplified using appropriate primers, whereby a first primer contains an XhoI and a second primer contains an SpeI restriction site. The obtained PCR fragment can be purified and cut with XhoI and SpeI. Subsequently the fragment can be cloned into the XhoI and SpeI sites of pTA7002. The resulting binary vector can be transferred into Agrobacterium tumefaciens. This strain can be used to transform Nicotiana tabacum cv. Petit Havana using the leaf disk protocol (Horsch, Science 227 (1985), 1229-1231) and Arabidopsis thaliana using the root transformation protocol (Valvekens, PNAS 85 (1988), 5536-5540). Transgenic plants can then be selected on hygromycin (20 mg/l). Plants can be tested for inducible expression of the gene of interest as follows. Two to three leaves of each transformant can be cut in two. Each half can be submersed in 50 mM Na-citrate buffer (pH 5.8), either with or without dexamethasone (0.03 mM concentration). After 24 hours of induction, RNA can be extracted from these leaves using the Trizol reagent (Gibco-BRL) according to the manufacturer's instructions and a northern gel can be run using 5 µg of RNA. The gel can be blotted on a nitrocellulose filter (Hybond-N+, Amersham) and hybridised with a gene-specific probe. Furthermore, seeds of transformants can be put on ½ MS medium with 1% sucrose, both with and without dexamethasone. As a control, SR1 seeds should be included. In the presence of dexamethasone the growth behaviour of the transgenic plants as compared to the control plants is expected to be modified. For example, these transgenic plants may grow faster and/or have additional cells. Furthermore, said plants may be less sensitive to environmental stress compared to the corresponding wild type plant.
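The primer design step in the cloning procedure above can be sketched as follows. The recognition sequences used are the standard ones for XhoI (CTCGAG) and SpeI (ACTAGT); the coding sequence, the annealing length and the two-base 5' clamp are hypothetical choices made for illustration, and a real design would also check melting temperature and reading frame.

```python
XHOI = "CTCGAG"   # XhoI recognition site
SPEI = "ACTAGT"   # SpeI recognition site
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def design_primers(cds: str, anneal_len: int = 18, clamp: str = "GC"):
    """Return (forward, reverse) primers, each written 5'->3'.

    The forward primer carries an XhoI site and the reverse primer an SpeI
    site, each preceded by a short 5' clamp (an assumption of this sketch).
    """
    cds = cds.upper()
    if XHOI in cds or SPEI in cds:
        # An internal site would be cut during the XhoI/SpeI digest.
        raise ValueError("coding region contains an internal XhoI or SpeI site")
    forward = clamp + XHOI + cds[:anneal_len]
    reverse = clamp + SPEI + reverse_complement(cds[-anneal_len:])
    return forward, reverse

# Hypothetical 39-base coding sequence, not a sequence of the invention.
CDS = "ATGGCTTCAGGTACCAAGCTTGGAGATCCTGAAGCTTAA"

fwd, rev = design_primers(CDS, anneal_len=12)
print(fwd)   # GCCTCGAGATGGCTTCAGGT
print(rev)   # GCACTAGTTTAAGCTTCAGG
```

Rejecting coding regions with internal XhoI or SpeI sites reflects the fact that the PCR fragment is subsequently cut with both enzymes.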

Furthermore, the invention also relates to a transgenic plant cell which contains (preferably stably integrated into its genome) a nucleic acid molecule according to the invention or part thereof, wherein the transcription and/or expression of the nucleic acid molecule or part thereof leads to reduction of the synthesis of a cell cycle interacting protein. In a preferred embodiment, the reduction is achieved by an anti-sense, sense, ribozyme, co-suppression and/or dominant mutant effect.

"Antisense" and "antisense nucleotides" means DNA or RNA constructs which block the expression of the naturally occurring gene product.

The provision of the nucleic acid molecules according to the invention opens up the possibility to produce transgenic plant cells with a reduced level of the protein as described above and, thus, with a defect in the cell cycle. Techniques for achieving this are well known to the person skilled in the art. These include, for example, the expression of antisense-RNA, of ribozymes, of molecules which combine antisense and ribozyme functions and/or of molecules which provide for a co-suppression effect; see also supra. When using the antisense approach for reduction of the amount of cell cycle interacting proteins in plant cells, the nucleic acid molecule encoding the antisense-RNA is preferably of homologous origin with respect to the plant species used for transformation. However, it is also possible to use nucleic acid molecules which display a high degree of homology to endogenously occurring nucleic acid molecules encoding a cell cycle interacting protein. In this case the homology is preferably higher than particularly higher than 90% and still more preferably higher than 95%. The reduction of the synthesis of a protein according to the invention in the transgenic plant cells can result in an alteration in cell division. In transgenic plants comprising such cells this can lead to various physiological, developmental and/or morphological changes, preferably to improved regeneration and transformation capacity of cultured cells or wounded tissue.
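The homology thresholds mentioned above can be made concrete with a short sketch. It assumes, purely for illustration, that homology is measured as percent identity over two sequences that have already been aligned (gaps written as "-"); this description does not prescribe a particular measure or alignment tool, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def homology_class(identity: float) -> str:
    """Map an identity value onto the 90% and 95% thresholds named in the text."""
    if identity > 95.0:
        return "higher than 95%"
    if identity > 90.0:
        return "higher than 90%"
    return "90% or lower"


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    endogenous = "ATGGCTTCTGGTAAGGAACC"
    candidate = "ATGGCTTCTGGTAAGGAACG"
    identity = percent_identity(endogenous, candidate)
    print(identity, homology_class(identity))
```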

Thus, the present invention also relates to transgenic plants comprising the above-described transgenic plant cells. These may show, for example, a deficiency in cell division and/or reduced growth characteristics compared to wild type plants due to the stable or transient presence of a foreign DNA resulting in at least one of the following features: disruption of (an) endogenous gene(s) encoding a protein of the invention; expression of at least one antisense RNA and/or ribozyme against a transcript comprising a nucleic acid molecule of the invention; expression of a sense and/or non-translatable mRNA of the nucleic acid molecule of the invention; expression of an antibody of the invention; incorporation of a functional or non-functional copy of the regulatory sequence of the invention; or incorporation of a recombinant DNA molecule or vector of the invention.

The present invention also relates to cultured plant tissues comprising transgenic plant cells as described above which either show overexpression of a protein according to the invention or a reduction in synthesis of such a protein.

Any transformed plant obtained according to the invention can be used in a conventional breeding scheme or in in vitro plant propagation to produce more transformed plants with the same characteristics and/or can be used to introduce the same characteristic in other varieties of the same or related species. Such plants are also part of the invention.

Seeds obtained from the transformed plants also genetically contain the same characteristic and are part of the invention. As mentioned before, the present invention is in principle applicable to any plant and crop that can be transformed with any of the transformation methods known to those skilled in the art and includes, for instance, corn, wheat, barley, rice, oilseed crops, cotton, tree species, sugar beet, cassava, tomato, potato, and numerous other vegetables and fruits.

In yet another aspect, the invention also relates to harvestable parts and to propagation material of the transgenic plants according to the invention which either contain transgenic plant cells expressing a nucleic acid molecule according to the invention or which contain cells which show a reduced level of the described protein. Harvestable parts can be in principle any useful parts of a plant, for example, flowers, pollen, seedlings, tubers, leaves, stems, fruit, seeds, roots etc. Propagation material includes, for example, seeds, fruits, cuttings, seedlings, tubers, rootstocks etc.

Regulatory sequences of cell cycle interacting genes

As mentioned above, the regulatory sequences that naturally drive the expression of the above described cell cycle interacting proteins may prove useful for the expression of heterologous DNA sequences in certain plant tissues and/or at different developmental stages in plant development.

Accordingly, in a further aspect the present invention relates to a regulatory sequence of a promoter naturally regulating the expression of a nucleic acid molecule of the invention described above or of a nucleic acid molecule homologous to a nucleic acid molecule of the invention. With methods well known in the art it is possible to isolate the regulatory sequences of the promoters that naturally regulate the expression of the above-described DNA sequences; see Example 8. For example, using the above-described nucleic acid molecules as probes, a genomic library consisting of plant genomic DNA cloned into phage or bacterial vectors can be screened by a person skilled in the art. Such a library consists, e.g., of genomic DNA prepared from seedlings, fractionated into fragments ranging from 5 kb to 50 kb and cloned into lambda GEM11 (Promega) phages. Phages hybridizing with the probes can be purified. From the purified phages DNA can be extracted and sequenced. Having isolated the genomic sequences corresponding to the genes encoding the above-described cell cycle interacting proteins, it is possible to fuse heterologous DNA sequences to these promoters or their regulatory sequences via transcriptional or translational fusions well known to the person skilled in the art. In order to identify the regulatory sequences and specific elements of these cell cycle genes, 5'-upstream genomic fragments can be cloned in front of marker genes such as luc, gfp or the GUS coding region and the resulting chimeric genes can be introduced by means of Agrobacterium tumefaciens mediated gene transfer into plants or transfected into plant cells or plant tissue for transient expression. The expression pattern observed in the transgenic plants or transfected plant cells containing the marker gene under the control of the regulatory sequences of the invention reveals the boundaries of the promoter and its regulatory sequences.
Preferably, said regulatory sequence is capable of conferring expression of a heterologous DNA sequence in main and lateral root meristems, shoot apical meristems, embryos at the globular, heart and torpedo stages, floral meristems and/or cambial cells in the stem.

In context with the present invention, the term "regulatory sequence" refers to sequences which influence the specificity and/or level of expression, for example in the sense that they confer cell and/or tissue specificity; see supra. Such regions can be located upstream of the transcription initiation site, but can also be located downstream of it, in transcribed but not translated leader sequences.

The term "promoter", within the meaning of the present invention, refers to nucleotide sequences necessary for transcription initiation, i.e. RNA polymerase binding, and may also include, for example, the TATA box.

The term "nucleic acid molecule homologous to a nucleic acid molecule of the invention", as used herein, includes promoter regions and regulatory sequences of other cell cycle interacting protein encoding genes, such as genes from other species, for example, maize, alfalfa, potato, sorghum, millet, coix, barley, wheat and rice, the coding regions of which share substantial homology to the cell cycle interacting proteins of the invention and which display substantially the same expression pattern. Such promoters are characterized by their capability of conferring expression of a heterologous DNA sequence in meristematic tissue and cells and other tissues mentioned above.

Thus, according to the present invention, regulatory sequences from any species can be used that are functionally homologous to the regulatory sequences of the promoter of the above defined nucleic acid molecules, or promoters of genes that display an identical or similar pattern of expression, in the sense of being expressed in the above-mentioned tissues and cells. However, the expression conferred by the regulatory sequences of the invention may not be limited to, for example, root meristem cells but can include or be restricted to, for example, subdomains of meristems. The particular expression pattern may also depend on the plant/vector system employed. However, expression of heterologous DNA sequences driven by the regulatory sequences of the invention predominantly occurs in the meristem unless certain elements of the regulatory sequences of the invention were taken and designed by the person skilled in the art to control the expression of a heterologous DNA sequence in other cell types.

It is also immediately evident to the person skilled in the art that further regulatory elements may be added to the regulatory sequences of the invention. For example, transcriptional enhancers and/or sequences which allow for induced expression of the regulatory sequences of the invention may be employed. A suitable inducible system is, for example, tetracycline-regulated gene expression as described by Gatz, supra.

The regulatory sequence of the invention may preferably be derived from the above described cell cycle interacting genes. Plants that may be suitable sources for such genes have been described above.

Usually, said regulatory sequence is part of a recombinant DNA molecule. In a preferred embodiment of the present invention, the regulatory sequence in the recombinant DNA molecule is operatively linked to a heterologous DNA sequence.

The term heterologous with respect to the DNA sequence being operatively linked to the regulatory sequence of the invention means that said DNA sequence is not naturally linked to the regulatory sequence of the invention. Expression of said heterologous DNA sequence comprises transcription of the DNA sequence, preferably into a translatable mRNA. Regulatory elements ensuring expression in eukaryotic cells, preferably plant cells, are well known to those skilled in the art. They usually comprise poly-A signals ensuring termination of transcription and stabilization of the transcript, see also supra.

Additional regulatory elements may include transcriptional as well as translational enhancers; see supra.

In a preferred embodiment, the heterologous DNA sequence of the above-described recombinant DNA molecules encodes a peptide, protein, antisense RNA, sense RNA and/or ribozyme. The recombinant DNA molecule of the invention can be used alone or as part of a vector to express heterologous DNA sequences which encode, for example, proteins for the control of disease resistance, modulation of nutrition value or diagnostics of cell cycle related gene expression. The recombinant DNA molecule or vector containing the DNA sequence encoding a protein of interest is introduced into the cells which in turn produce the RNA and optionally protein of interest. For example, the regulatory sequences of the invention can be operatively linked to a lethal gene for use in the production of male and female sterility in plants. Suitable lethal genes include the Bacillus amyloliquefaciens ribonuclease (Hartlet, J. Mol. Biol. 89 (1985)) and the Bacillus amyloliquefaciens ribonuclease expressed with or without its inhibitor, barstar. Another example for a lethal gene is the catalytic A fragment of diphtheria toxin (Tweeten, J. Bacteriol. 156 (1983), 680-685). Expression of diphtheria toxin within yeast cells causes ADP-ribosylation of elongation factor 2, which leads to inhibition of protein synthesis and eventual cell death (Mattheakis, Mol. Cell. Biol. 12 (1992), 4026-4037).

On the other hand, said protein can be a scorable marker, for example luciferase, green fluorescent protein or β-galactosidase. This embodiment is particularly useful for simple and rapid screening methods for compounds and substances described herein below capable of modulating cell cycle interacting protein gene expression. For example, a cell suspension can be cultured in the presence and absence of a candidate compound in order to determine whether the compound affects the expression of genes which are under the control of regulatory sequences of the invention, which can be measured, e.g., by monitoring the expression of the above-mentioned marker. It is also immediately evident to those skilled in the art that other marker genes may be employed as well, encoding, for example, a selectable marker which provides for the direct selection of compounds which induce or inhibit the expression of said marker.
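The comparison of marker expression in the presence and absence of a candidate compound can be sketched as a simple fold-change calculation. The twofold threshold, the function names and the example readings below are assumptions made for illustration; this description does not specify how the marker signal is to be evaluated.

```python
from statistics import mean


def fold_change(treated, untreated):
    """Ratio of mean reporter signal with compound to mean signal without compound."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated signal must be positive")
    return mean(treated) / baseline


def classify_compound(treated, untreated, threshold=2.0):
    """Label a compound by its effect on the reporter.

    threshold is a hypothetical cut-off, not a value from the text.
    """
    fc = fold_change(treated, untreated)
    if fc >= threshold:
        return "activator candidate"
    if fc <= 1.0 / threshold:
        return "inhibitor candidate"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical luciferase readings (arbitrary units) from replicate cultures.
    without_compound = [100, 110, 90]
    with_compound = [300, 320, 310]
    print(fold_change(with_compound, without_compound))
    print(classify_compound(with_compound, without_compound))
```

In practice replicate variability would also have to be taken into account before a compound is carried forward; the sketch only shows the basic comparison.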

The regulatory sequences of the invention may also be used in antisense approaches. The antisense RNA may be a short (generally at least 10, preferably at least 14 nucleotides, and optionally up to 100 or more nucleotides) nucleotide sequence formulated to be complementary to a portion of a specific mRNA sequence and/or DNA sequence of the gene of interest. Standard methods relating to antisense technology have been described; see Klann, Plant Physiol. 112 (1996), 1321-1330. Following transcription of the DNA sequence into antisense RNA, the antisense RNA binds to its target sequence within a cell, thereby inhibiting translation of the mRNA and downregulating expression of the protein encoded by the mRNA. Thus, in a further embodiment, the invention relates to nucleic acid molecules of at least 15 nucleotides in length hybridizing specifically with a regulatory sequence as described above or with a complementary strand thereof. For the possible applications of such nucleic acid molecules, see supra.
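The complementarity between an antisense RNA and its target can be illustrated as follows. The sketch derives the antisense sequence for a chosen window of an mRNA as the reverse complement of that window and enforces the minimum length of 10 nucleotides mentioned above. The function name and the example mRNA are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str, start: int, length: int, min_length: int = 10) -> str:
    """Return the antisense RNA (written 5' to 3') for mrna[start:start + length]."""
    if length < min_length:
        raise ValueError(f"antisense sequence should be at least {min_length} nucleotides")
    rna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    target = rna[start:start + length]
    if len(target) < length:
        raise ValueError("target window extends beyond the end of the mRNA")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical mRNA fragment, for illustration only.
    mrna = "AUGGCUUCUGGUAAGGAACC"
    print(antisense_rna(mrna, start=0, length=14))
```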

The present invention also relates to vectors, particularly plasmids, cosmids, viruses and bacteriophages used conventionally in genetic engineering, that comprise a recombinant DNA molecule of the invention. Preferably, said vector is an expression vector and/or a vector further comprising a selection marker for plants. For examples of suitable selection markers, see supra. Methods which are well known to those skilled in the art can be used to construct recombinant vectors; see, for example, the techniques described in Sambrook, Molecular Cloning A Laboratory Manual, Cold Spring Harbor Laboratory (1989) N.Y. and Ausubel, Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, N.Y. (1989). Alternatively, the recombinant DNA molecules and vectors of the invention can be reconstituted into liposomes for delivery to target cells; see also supra.

The present invention furthermore relates to host cells transformed with a regulatory sequence, a DNA molecule or vector of the invention. Said host cell may be a prokaryotic or eukaryotic cell; see supra.

In a further preferred embodiment, the present invention provides for a method for the production of transgenic plants, plant cells or plant tissue comprising the introduction of a nucleic acid molecule, recombinant DNA molecule or vector of the invention into the genome of said plant, plant cell or plant tissue. For the expression of the heterologous DNA sequence under the control of the regulatory sequence according to the invention in plant cells, further regulatory sequences such as a poly-A tail may be fused, preferably 3' to the heterologous DNA sequence; see also supra. A further possibility is to add Matrix Attachment Sites at the borders of the transgene to act as "delimiters" and insulate against methylation spread from nearby heterochromatic sequences. Methods for the introduction of foreign DNA into plants, plant cells and plant tissue are described above.

Thus, the present invention relates also to transgenic plant cells which contain stably integrated into the genome a recombinant DNA molecule or vector according to the invention.

Furthermore, the present invention also relates to transgenic plants and plant tissue comprising the above-described transgenic plant cells. These plants may show, for example, modified architecture, increased yield or an increased tolerance to diseases, nematodes, geminiviruses or to stresses such as salt, heat, nutrient deprivation, etc.

In yet another aspect the invention also relates to harvestable parts and to propagation material of the transgenic plants according to the invention which contain transgenic plant cells described above. Harvestable parts and propagation material can be in principle any useful part of a plant; see supra.

With the regulatory sequences of the invention, it will be possible to study in vivo gene expression related to cell cycle interacting proteins. Furthermore, since cell cycle interacting protein expression has different patterns in different stages of physiological and pathological conditions, it is now possible to determine further regulatory sequences which may be important for the up- or down-regulation of the expression or activity of cell cycle interacting proteins, for example in response to ions or elicitors. In addition, it is now possible to study in vivo mutations which affect different functional or regulatory aspects of specific gene expression in the cell cycle. Thus, the present invention also relates to the use of the above described regulatory sequences and recombinant DNA molecules of the invention for the expression of heterologous DNA sequences.

The in vivo studies referred to above will be suitable to further broaden the knowledge on the mechanisms and genes involved in the control of the cell cycle. Expression of heterologous genes or antisense RNA under the control of the regulatory sequence of the present invention in plants and plant cells may allow the understanding of the function of each of these genes in the plant.

As mentioned hereinbefore, the nucleic acid molecules and proteins of the present invention provide a basis for the development of mimetic compounds that may be inhibitors or activators of cell cycle interacting proteins or their encoding genes. It will be appreciated that the present invention also provides cell based screening methods that allow a high-throughput-screening (HTS) of compounds that may be candidates for such inhibitors and activators.

Thus, the present invention further relates to a method for the identification of an activator or inhibitor of genes encoding cell cycle interacting proteins comprising the steps of:
(a) culturing a plant cell or tissue or maintaining a plant comprising a recombinant DNA molecule comprising a readout system operatively linked to a regulatory sequence of the invention in the presence of a compound or a sample comprising a plurality of compounds under conditions which permit expression of said readout system; and
(b) identifying or verifying a sample and compound, respectively, which leads to suppression or activation and/or enhancement of expression of said readout system in said plant, plant cell, or plant tissue.

The present invention further relates to a method for identifying and obtaining an activator or inhibitor of cell cycle interacting proteins comprising the steps of:
(a) combining a compound to be screened with a reaction mixture containing the cell cycle interacting protein of the invention and a readout system capable of interacting with the cell cycle interacting protein under suitable conditions which permit interaction of the cell cycle interacting protein with said readout system; and
(b) identifying or verifying a sample and compound, respectively, which leads to suppression or activation of the readout system.

The term "read out system" in context with the present invention means any substrate that can be monitored, for example due to enzymatically induced changes. It also includes DNA sequences which upon transcription and/or expression in a cell, tissue or organism provide for a scorable and/or selectable phenotype. Such read out systems are well known to those skilled in the art and comprise, for example, substrates for protein kinases, recombinant DNA molecules and marker genes as described above.

The term "plurality of compounds" in a method of the invention is to be understood as a plurality of substances which may or may not be identical.

Said compound or plurality of compounds may be chemically synthesized or microbiologically produced and/or comprised in, for example, samples such as cell extracts from plants, animals or microorganisms. Furthermore, said compound(s) may be known in the art but hitherto not known to be capable of suppressing or activating cell cycle interacting proteins. The reaction mixture may be a cell free extract or may comprise a cell or tissue culture. Suitable set ups for the method of the invention are known to the person skilled in the art and are, for example, generally described in Alberts et al., Molecular Biology of the Cell, third edition (1994), in particular Chapter 17.

The plurality of compounds may be added to the reaction mixture or culture medium, injected into the cell or sprayed onto the plant. The cell or tissue that may be employed in the method of the invention preferably is a host cell, plant cell or plant tissue of the invention described in the embodiments hereinbefore.

If a sample containing a compound or a plurality of compounds is identified in the method of the invention, then it is either possible to isolate the compound from the original sample identified as containing the compound capable of suppressing or activating cell cycle interacting proteins, or one can further subdivide the original sample, for example, if it consists of a plurality of different compounds, so as to reduce the number of different substances per sample and repeat the method with the subdivisions of the original sample. Depending on the complexity of the samples, the steps described above can be performed several times, preferably until the sample identified according to the method of the invention only comprises a limited number of or only one substance(s). Preferably said sample comprises substances of similar chemical and/or physical properties, and most preferably said substances are identical.

Several methods are known to the person skilled in the art for producing and screening large libraries to identify compounds having specific affinity for a target. These methods include the phage-display method in which randomized peptides are displayed from phage and screened by affinity chromatography to an immobilized receptor; see WO 91/17271, WO 92/01047, US-A-5,223,409. In another approach, combinatorial libraries of polymers immobilized on a chip are synthesized using photolithography; see, e.g., US-A-5,143,854, WO 90/15070 and WO 92/10092. The immobilized polymers are contacted with a labeled receptor and scanned for label to identify polymers binding to the receptor. The synthesis and screening of peptide libraries on continuous cellulose membrane supports that can be used for identifying binding ligands of the polypeptide of the invention and thus possible inhibitors and activators is described, for example, in Kramer, Methods Mol. Biol. 87 (1998), 25-39. This method can also be used, for example, for determining the binding sites and the recognition motifs in the polypeptide of the invention. In like manner, the substrate specificity of the DnaK chaperone and the contact sites between human interleukin-6 and its receptor were determined; see Rudiger, EMBO J. 16 (1997), 1501-1507 and Weiergraber, FEBS Lett. 379 (1996), 122-126, respectively. Furthermore, the above-mentioned methods can be used for the construction of binding supertopes derived from the polypeptide of the invention. A similar approach was successfully described for peptide antigens of the anti-p24 (HIV-1) monoclonal antibody; see Kramer, Cell 91 (1997), 799-809. A general route to fingerprint analyses of peptide-antibody interactions using the clustered amino acid peptide library was described in Kramer, Mol. Immunol. 32 (1995), 459-465. In addition, antagonists of the polypeptide of the invention can be derived and identified from monoclonal antibodies that specifically react with the polypeptide of the invention in accordance with the methods as described in Doring, Mol. Immunol. 31 (1994), 1059-1067.
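The repeated subdivision of an active sample described above, in which a sample is split and retested until only one substance remains, can be sketched as a recursive procedure. The sketch assumes that a pooled sample scores as active whenever at least one of its members is active, which ignores possible synergistic or masking effects between compounds. The assay function and the compound names are hypothetical.

```python
def deconvolute(sample, is_active):
    """Split an active sample in halves and retest until single active substances remain.

    sample    -- list of substance identifiers pooled in one sample
    is_active -- assay callable returning True if the pooled sample shows activity
    """
    if not sample or not is_active(sample):
        return []
    if len(sample) == 1:
        return list(sample)
    mid = len(sample) // 2
    return deconvolute(sample[:mid], is_active) + deconvolute(sample[mid:], is_active)


if __name__ == "__main__":
    # Hypothetical library of 16 compounds, two of which are active.
    library = [f"cmpd{i}" for i in range(16)]
    truly_active = {"cmpd3", "cmpd11"}
    assays_run = 0

    def assay(pool):
        """Stand-in for the screening method: a pool is active if any member is."""
        global assays_run
        assays_run += 1
        return any(c in truly_active for c in pool)

    print(deconvolute(library, assay))
    print("assays performed:", assays_run)
```

When active substances are rare, halving in this way typically requires far fewer assays than testing every substance individually.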

More recently, WO 98/25146 described further methods for screening libraries of complexes for compounds having a desired property, especially the capacity to agonize, bind to, or antagonize a polypeptide or its cellular receptor. The complexes in such libraries comprise a compound under test, a tag recording at least one step in synthesis of the compound, and a tether susceptible to modification by a reporter molecule. Modification of the tether is used to signify that a complex contains a compound having a desired property. The tag can be decoded to reveal at least one step in the synthesis of such a compound. Other methods for identifying compounds which interact with the proteins according to the invention or nucleic acid molecules encoding such molecules are, for example, the in vitro screening with the phage display system as well as filter binding assays or "real time" measuring of interaction using, for example, the BIAcore apparatus (Pharmacia).

All these methods can be used in accordance with the present invention to identify activators and antagonists of the polypeptide of the invention.

Various sources for the basic structure of such an activator or inhibitor can be employed and comprise, for example, mimetic analogs of the polypeptide of the invention. Mimetic analogs of the polypeptide of the invention or biologically active fragments thereof can be generated by, for example, substituting the amino acids that are expected to be essential for the biological activity with stereoisomers, i.e. D-amino acids; see, e.g., Tsukida, J. Med. Chem. 40 (1997), 3534-3541. Furthermore, in case fragments are used for the design of biologically active analogs, Pro-mimetic components can be incorporated into a peptide to reestablish at least some of the conformational properties that may have been lost upon removal of part of the original polypeptide; see, e.g., Nachman, Regul. Pept. 57 (1995), 359-370. Furthermore, the polypeptide of the invention can be used to identify synthetic chemical peptide mimetics that bind to or can function as a ligand, substrate, binding partner or the receptor of the polypeptide of the invention as effectively as does the natural polypeptide; see Engleman, J. Clin. Invest. 99 (1997), 2284-2292.

The structure-based design and synthesis of low-molecular-weight synthetic molecules that mimic the activity of the native biological polypeptide is further described in, e.g., Dowd, Nature Biotechnol. 16 (1998), 190-195; Kieber-Emmons, Current Opinion Biotechnol. 8 (1997), 435-441; Moore, Proc. West Pharmacol. Soc. 40 (1997), 115-119; Mathews, Proc. West Pharmacol. Soc. 40 (1997), 121-125; Mukhija, European J. Biochem. 254 (1998), 433-438.

It is also well known to the person skilled in the art that it is possible to design, synthesize and evaluate mimetics of small organic compounds that, for example, can act as a substrate or ligand to the polypeptide of the invention. For example, it has been described that D-glucose mimetics of hapalosin exhibited similar efficiency as hapalosin in antagonizing multidrug resistance-associated protein in cytotoxicity; see Dinh, J. Med. Chem. 41 (1998), 981-987.

The nucleic acid molecule of the invention can also serve as a target for activators and inhibitors. Activators may comprise, for example, proteins that bind to the mRNA of a gene encoding a polypeptide of the invention, thereby stabilizing the native conformation of the mRNA and facilitating transcription and/or translation, in like manner as Tat protein acts on HIV-RNA. Furthermore, methods are described in the literature for identifying nucleic acid molecules such as an RNA fragment that mimics the structure of a defined or undefined target RNA molecule to which a compound binds inside of a cell resulting in retardation of cell growth or cell death; see, WO 98/18947 and references cited therein. These nucleic acid molecules can be used for identifying unknown compounds of pharmaceutical and/or agricultural interest, and for identifying unknown RNA targets for use in treating a disease. These methods and compositions can be used in screening for novel antibiotics, bacteriostatics, or modifications thereof or for identifying compounds useful to alter expression levels of proteins encoded by a nucleic acid molecule. Alternatively, for example, the conformational structure of the RNA fragment which mimics the binding site can be employed in rational drug design to modify known antibiotics to make them bind more avidly to the target. One such methodology is nuclear magnetic resonance (NMR), which is useful to identify drug and RNA conformational structures. Still other methods are, for example, the drug design methods as described in WO 95/35367, US-A-5,322,933, where the crystal structure of the RNA fragment can be deduced and computer programs are utilized to design novel binding compounds which can act as antibiotics.

The compounds which can be tested and identified according to a method of the invention may be expression libraries, cDNA expression libraries, peptides, proteins, nucleic acids, antibodies, small organic compounds, hormones, peptidomimetics, PNAs or the like (Milner, Nature Medicine 1 (1995), 879-880; Hupp, Cell 83 (1995), 237-245; Gibbs, Cell 79 (1994), 193-198 and references cited supra).

Furthermore, genes encoding a putative regulator of a cell cycle interacting protein and/or which exert their effects upstream or downstream of the cell cycle interacting protein of the invention may be identified using, for example, insertion mutagenesis using, for example, gene targeting vectors known in the art (see Hayashi, Science 258 (1992), 1350-1353; Fritze and Walden, Gene activation by T-DNA tagging. In Methods in Molecular Biology 44 (Gartland, K.M.A. and Davey, eds). Totowa: Humana Press (1995), 281-294) or transposon tagging (Chandlee, Physiologia Plantarum 78 (1990), 105-115). Said compounds can also be functional derivatives or analogues of known inhibitors or activators. Such useful compounds can be, for example, transacting factors which bind to the cell cycle interacting protein or regulatory sequences of the invention.

Identification of transacting factors can be carried out using standard methods in the art (see, Sambrook, supra, and Ausubel, supra). To determine whether a protein binds to the protein or regulatory sequence of the invention, standard native gel-shift analyses can be carried out. In order to identify a transacting factor which binds to the protein or regulatory sequence of the invention, the protein or regulatory sequence of the invention can be used as an affinity reagent in standard protein purification methods, or as a probe for screening an expression library. The identification of nucleic acid molecules which encode proteins which interact with the cell cycle interacting proteins described above can also be achieved, for example, as described in Scofield (Science 274 (1996), 2063-2065) by use of the so-called yeast "two-hybrid system"; see also the appended examples. In this system the protein encoded by the nucleic acid molecules according to the invention or a smaller part thereof is linked to the DNA-binding domain of the GAL4 transcription factor. A yeast strain expressing this fusion protein and comprising a lacZ reporter gene driven by an appropriate promoter, which is recognized by the GAL4 transcription factor, is transformed with a library of cDNAs which will express plant proteins or peptides thereof fused to an activation domain. Thus, if a peptide encoded by one of the cDNAs is able to interact with the fusion peptide comprising a peptide of a protein of the invention, the complex is able to direct expression of the reporter gene. In this way the nucleic acid molecules according to the invention and the encoded peptide can be used to identify peptides and proteins interacting with cell cycle interacting proteins. It is apparent to the person skilled in the art that this and similar systems may then further be exploited for the identification of inhibitors of the binding of the interacting proteins.
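The logic of the two-hybrid screen described above can be modelled in a few lines: the reporter is switched on only when a bait fused to the DNA-binding domain and a library protein fused to the activation domain interact. The sketch below is a toy model of that logic only; the class, the function names, the protein names and the interaction table are hypothetical and stand in for what would be observed experimentally in yeast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    domain: str   # "BD" for the GAL4 DNA-binding domain, "AD" for an activation domain
    partner: str  # name of the fused protein or peptide


def reporter_on(bait: Fusion, prey: Fusion, interactions) -> bool:
    """Reporter is expressed only if a BD fusion and an AD fusion interact."""
    if bait.domain != "BD" or prey.domain != "AD":
        return False
    return frozenset((bait.partner, prey.partner)) in interactions


def screen_library(bait_protein: str, library, interactions):
    """Return the library members that switch the reporter on with the given bait."""
    bait = Fusion("BD", bait_protein)
    return [name for name in library if reporter_on(bait, Fusion("AD", name), interactions)]


if __name__ == "__main__":
    # Hypothetical interaction table; in a real screen this is what the assay reveals.
    known = {frozenset(("baitX", "preyA")), frozenset(("baitX", "preyC"))}
    print(screen_library("baitX", ["preyA", "preyB", "preyC"], known))
```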

Once the transacting factor is identified, modulation of its binding to or regulation of expression of the cell cycle interacting protein of the invention can be pursued, beginning with, for example, screening for inhibitors against the binding of the transacting factor to the protein of the present invention. Activation or repression of cell cycle interacting proteins could then be achieved in plants by applying the transacting factor (or its inhibitor) or the gene encoding it, e.g. in a vector for transgenic plants. In addition, if the active form of the transacting factor is a dimer, dominant-negative mutants of the transacting factor could be made in order to inhibit its activity.

Furthermore, upon identification of the transacting factor, further components in the pathway leading to activation (e.g. signal transduction) or repression of a gene involved in the control of the cell cycle can then be identified. Modulation of the activities of these components can then be pursued, in order to develop additional drugs and methods for modulating the cell cycle in animals and plants.

Thus, the present invention also relates to the use of the two-hybrid system as defined above for the identification of cell cycle interacting proteins or activators or inhibitors of such proteins. Determining whether a compound is capable of suppressing or activating cell cycle interacting proteins can be done, for example, by monitoring DNA duplication and cell division. It can further be done by monitoring the phenotypic characteristics of the cell of the invention contacted with the compounds and comparing them to those of wild-type plants. In an additional embodiment, said characteristics may be compared to those of a cell contacted with a compound which is either known to be capable or incapable of suppressing or activating cell cycle interacting proteins.

The compounds isolated by the above methods also serve as lead compounds for the development of analog compounds. The analogs should have a stabilized electronic configuration and molecular conformation that allows key functional groups to be presented to the receptor in substantially the same way as the lead compound. In particular, the analog compounds have spatial electronic properties which are comparable to the binding region, but can be smaller molecules than the lead compound, frequently having a molecular weight below about 2 kD and preferably below about 1 kD. Identification of analog compounds can be performed through use of techniques such as self-consistent field (SCF) analysis, configuration interaction (CI) analysis, and normal mode dynamics analysis. Computer programs for implementing these techniques are available; Rein, Computer-Assisted Modeling of Receptor-Ligand Interactions (Alan Liss, New York, 1989). Methods for the preparation of chemical derivatives and analogues are well known to those skilled in the art and are described in, for example, Beilstein, Handbook of Organic Chemistry, Springer edition New York Inc., 175 Fifth Avenue, New York, N.Y. 10010 U.S.A. and Organic Synthesis, Wiley, New York, USA. Furthermore, said derivatives and analogues can be tested for their effects according to methods known in the art. Furthermore, peptidomimetics and/or computer aided design of appropriate derivatives and analogues can be used, for example, according to the methods described above.

The inhibitor or activator identified by the above-described method may prove useful as a herbicide, pesticide, insecticide, antibiotic, tumor suppressing agent and/or as a cell growth regulator. Thus, in a further embodiment the invention relates to a compound obtained or identified according to the method of the invention said compound being an activator of cell cycle interacting proteins or an inhibitor of cell cycle interacting proteins.

The above-described compounds include, for example, cell cycle kinase inhibitors. A "cell cycle kinase inhibitor" (CKI) is a protein which inhibits CDK/cyclin activity and is produced and/or activated when further cell division has to be temporarily or continuously prevented. The antibodies, nucleic acid molecules, inhibitors and activators of the present invention preferably have a specificity at least substantially identical to the binding specificity of the natural ligand or binding partner of the cell cycle protein of the invention, in particular if cell cycle stimulation is desired. An antibody or inhibitor can have a binding affinity to the cell cycle interacting protein of the invention of at least 10⁵ M⁻¹, preferably higher than 10⁷ M⁻¹ and advantageously up to 10¹⁰ M⁻¹ in case cell cycle suppression should be mediated.

In a preferred embodiment, a suppressive antibody or inhibitor of the invention has an affinity of at least about 10⁻⁷ M, preferably at least about 10⁻⁹ M and most preferably at least about 10⁻¹¹ M; and a cell cycle stimulating activator has an affinity of less than about 10⁻⁷ M, preferably less than about 10⁻⁶ M and most preferably in the order of 10⁻⁵ M.

In case of nucleic acid molecules it is preferred that they have a binding affinity to those encoding the amino acid sequences depicted in SEQ ID NO: 2, 4, 34, 36, 38, 40, 42, 6, 8, 10, 12 or 14 of at most 5- or 10-fold less than an exact complement of consecutive nucleotides of the above described nucleic acid molecules.

Preferably, the compound identified according to the above described method or its analog or derivative is further formulated in a therapeutically active form or in a form suitable for the application in plant breeding or plant cell and tissue culture. For example, it can be combined with an agriculturally acceptable carrier known in the art. Thus, the present invention also relates to a method of producing a therapeutic or plant effective composition comprising the steps of one of the above described methods of the invention and combining the compound obtained or identified in the method of the invention or an analog or derivative thereof with a pharmaceutically acceptable carrier or with a plant cell and tissue culture acceptable carrier. As is evident from the above, the present invention generally relates to compositions comprising at least one of the aforementioned nucleic acid molecules, vectors, proteins, regulatory sequences, recombinant DNA molecules, antibodies or compounds. Advantageously, said composition is for use as a medicament, a diagnostic means, a kit or as a plant effective composition.

Compositions useful in agriculture and in plant cell and tissue culture

Plant protection compositions can be prepared by employing the above-described methods of the invention and synthesizing the compound identified as inhibitor or activator in an amount sufficient for use in agriculture. Thus, the present invention also relates to a method for the preparation of an agricultural plant protection composition comprising the above-described steps of the method of the invention and synthesizing the compound so identified or an analog or derivative thereof.

In the plant protection composition of the invention, the compound identified by the above-described method may be preferentially formulated by conventional means commonly used for the application of, for example, herbicides and pesticides or agents capable of inducing systemic acquired resistance (SAR). For example, certain additives known to those skilled in the art, such as stabilizers or substances which facilitate the uptake by the plant cell, plant tissue or plant, may be used.

Pharmaceutical compositions

The cell cycle interacting proteins of the invention appear to function in the cell division cycle, which is similar in plants and animals. Accordingly, the nucleic acid molecules and proteins of the invention or derivatives thereof as well as the above described activators and inhibitors may be used to modulate the cell division cycle in animal, preferably mammalian, cells, which is integral to the development and spread of cancerous cells. A cell cycle interacting protein that acts as a basal transcription factor may promote cancer cell growth. In conditions where cell cycle interacting protein activity is not desirable, cells could be transfected with antisense sequences to cell cycle interacting protein encoding polynucleotides or provided with antagonists to the protein or its encoding gene. Thus, the above described antagonists or antisense molecules may be used to slow, stop, or reverse cancer cell growth. Thus, the present invention also relates to a method of producing a therapeutic agent comprising the steps of the methods described hereinbefore and synthesizing the activator or inhibitor obtained or identified in step (c) or an analog or derivative thereof in an amount sufficient to provide said agent in a therapeutically effective amount to a patient.

Compounds identified by the above methods or analogs are formulated for therapeutic use as pharmaceutical compositions. The compositions can also include, depending on the formulation desired, pharmaceutically acceptable, usually sterile, non-toxic carriers or diluents, which are defined as vehicles commonly used to formulate pharmaceutical compositions for animal or human administration. The diluent is selected so as not to affect the biological activity of the combination. Examples of such diluents are distilled water, physiological saline, Ringer's solutions, dextrose solution, and Hank's solution. In addition, the pharmaceutical composition or formulation may also include other carriers, adjuvants, or nontoxic, nontherapeutic, nonimmunogenic stabilizers and the like. A therapeutically effective dose refers to that amount of protein or its antibodies, antagonists, or inhibitors which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population).

The dose ratio between therapeutic and toxic effects is the therapeutic index, and it can be expressed as the ratio, LD50/ED50.
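The ratio LD50/ED50 can be illustrated with a minimal Python sketch. The function name and the dose values are hypothetical and chosen only for illustration; they are not data from the present description.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be expressed in the same units (e.g. mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, used only to show the arithmetic.
print(therapeutic_index(ld50=200.0, ed50=10.0))  # 20.0
```

A larger index means a wider margin between the dose that is effective and the dose that is lethal in 50% of the population.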

Diagnostic means and kits

The invention also relates to a diagnostic composition comprising at least one of the aforementioned nucleic acid molecules, vectors, proteins, antibodies or compounds and optionally suitable means for detection. Said diagnostic compositions may be used for methods for determining expression of cell cycle interacting proteins by detecting the presence of the corresponding mRNA, which comprise isolation of mRNA from a cell and contacting the mRNA so obtained with a nucleic acid probe as described above under hybridizing conditions, detecting the presence of mRNA hybridized to the probe, and thereby detecting the expression of the protein in the cell.

Further methods of detecting the presence of a protein according to the present invention comprise immunotechniques well known in the art, for example enzyme linked immunosorbent assay. Furthermore, it is possible to use the nucleic acid molecules according to the invention as molecular markers in plant breeding. Moreover, the present invention relates to a kit comprising at least one of the aforementioned nucleic acid molecules, regulatory sequences, recombinant DNA molecules, vectors, proteins, compounds or antibodies of the invention. The kit of the invention may contain further ingredients such as selection markers and components for selective media suitable for the generation of transformed host cells and transgenic plant cells, plant tissue or plants. Furthermore, the kit may include buffers and substrates for reporter genes that may be present in the recombinant gene or vector of the invention. The kit of the invention may advantageously be used for carrying out the method of the invention and could be, inter alia, employed in a variety of applications referred to herein, in the diagnostic field or as research tool. The parts of the kit of the invention can be packaged individually in vials or in combination in containers or multicontainer units.

Manufacture of the kit preferably follows standard procedures which are known to the person skilled in the art. The kit or its ingredients according to the invention can be used in plant cell and plant tissue cultures, for example, for any of the above described methods for detecting inhibitors and activators of cell cycle genes. The kit of the invention and its ingredients are expected to be very useful in breeding new varieties of, for example, plants which display improved properties such as nutritional value or disease resistance.

Further applications of the invention

The person skilled in the art can use proteins according to the invention from other organisms such as yeast and animals to influence cell division progression in those other organisms such as mammals or insects. In a preferred embodiment one or more DNA sequences, vectors or proteins of the invention or the above-described antibody or compound are, for instance, used to specifically interfere in the modulation of the protein levels or activity of any protein involved in disruption of the expression levels of genes involved in G1/S and/or G2/M transition in the cell cycle process in transformed plants, particularly in the complete plant, in selected plant organs, tissues or cell types, under specific environmental conditions, including abiotic stress such as cold, nutrient deprivation, heat, drought or salt stress or biotic stress such as pathogen attack, during specific developmental stages.

Specifically the plant cell division rate and/or the inhibition of a plant cell division can be influenced by (partial) elimination of a gene or reducing the expression of a gene encoding a protein according to the invention. Said plant cell division rate and/or the inhibition of a plant cell division can also be influenced by eliminating or inhibiting the activity of the protein according to the invention by using for instance antibodies directed against said protein. As a result of said elimination or reduction greater organisms or specific organs or tissues can be obtained; greater in volume and in mass too.

Furthermore inhibition of cell division by various adverse environmental conditions such as drought, nutrient deprivation, high salt content, chilling and the like can be delayed or prevented by reducing or enhancing (e.g. with a dominant negative version) said expression of a gene according to the invention. The division rate of a plant cell can also be influenced in a transformed plant by overexpression of a nucleic acid molecule according to the invention. Therefore an important aspect of the current invention is a method to modify plant architecture by overproduction or reduction of expression of a sequence according to the invention under the control of a tissue, cell or organ specific promoter. Another aspect of the present invention is a method to modify the growth inhibition of plants caused by the environmental stress conditions mentioned above, or more particularly salt stress or nutrient deprivation, by appropriate use of sequences according to the invention. Surprisingly, using a polypeptide or fragment thereof according to the invention or using antisense RNA or any method to reduce the expression of the gene according to the invention, cell division in the meristem of both main and lateral roots, in the shoot apical meristem or in the vascular tissue of a plant can be manipulated. Furthermore any of the DNA sequences of the invention can be used to manipulate (reduce or enhance) the level of endopolyploidy and thereby increase the storage capacity of, for example, endosperm cells. Thus, another aspect of the current invention is that one or more DNA sequences, vectors or proteins, regulatory sequences or recombinant DNA molecules of the invention or the above-described antibody or compound can be used to modulate, for instance, endoreduplication in storage cells, storage tissues and/or storage organs of plants or parts thereof. The term "endoreduplication" means recurrent DNA replication without consequent mitosis and cytokinesis.

Preferred target storage organs and parts thereof for the modulation of endoreduplication are, for instance, seeds (such as from cereals, oilseed crops), roots (such as in sugar beet), tubers (such as in potato) and fruits (such as in vegetables and fruit species). Furthermore it is expected that increased endoreduplication in storage organs and parts thereof correlates with enhanced storage capacity and as such with improved yield. In yet another embodiment of the invention, a plant with modulated endoreduplication in the whole plant or parts thereof can be obtained from a single plant cell by transforming the cell, in a manner known to the skilled person, with the above-described means.

In view of the foregoing, the present invention also relates to the use of a DNA sequence, vector, protein, antibody, regulatory sequences, recombinant DNA molecule, nucleic acid molecules or compound of the invention for modulating plant cell cycle, plant cell division and/or growth, for influencing the activity of cell cycle interacting protein, for disrupting plant cell division by influencing the presence or absence or by interfering in the expression of a cyclin-dependent protein, for modifying growth inhibition of plants caused by environmental stress conditions, for inducing male or female sterility, for influencing cell division progression in a host as defined above or for use in a screening method for the identification of inhibitors or activators of cell cycle proteins.

Furthermore, it is possible to use the nucleic acid molecules according to the invention as molecular markers in plant breeding. Thus, the present invention also relates to the use of a DNA sequence or regulatory sequence of the invention as a marker gene in plant or animal cell and tissue culture or as a marker in marker-assisted plant breeding.

Moreover, the overexpression of nucleic acid molecules according to the invention may be useful for the alteration or modification of plant/pathogen interaction. The term "pathogen" includes, for example, bacteria, viruses and fungi as well as protozoa.

Regulation of phosphate assimilation

In a preferred embodiment of the invention, DNA sequences of section encoding PLP proteins described hereinbefore and corresponding vectors, proteins etc. of the invention, in particular PHO80-like proteins (PLPs), may be used to improve the tolerance of plants towards suboptimal nutrient conditions, in particular levels of phosphate. Therefore such sequences may be used to uncouple optimal phosphate conditions from the plant growth rate, resulting in enhanced growth rates in normal conditions or stress conditions such as low phosphate. Plants with modified expression of the PLP genes can display enhanced growth rates in normal growth conditions and in different stress conditions, in particular in the case of nutritional deprivation. Modified expression of the PLP genes thus provides a method for conferring plant tolerance towards low levels of phosphate, meaning the PLP genes may also be useful as transgenic selective markers.

The cDNA clone (LDV24) was isolated according to the invention as a novel protein interacting with the CDC2aAt protein in a two-hybrid screen. This clone encodes a protein showing strongest homology to the cyclins PHO80, PCL1 and PCL2 from Saccharomyces cerevisiae and to the PREG protein from Neurospora crassa, and was renamed PLP5 (PHO80-like protein). Also four cDNAs, named PLP1 to PLP4, were isolated by RT-PCR technology from Arabidopsis thaliana according to the invention.

Tissue specific expression analysis was performed. Two-hybrid analysis demonstrated all plant PLPs interact with the A. thaliana CDKs. Overexpression and antisense constructs were designed and introduced into plants.

Physiological response to phosphate stress

Phosphorus availability is considered one of the major growth-limiting factors for plants in many natural ecosystems. The primary source of phosphorus in soils is inorganic phosphate (Pi). Phosphorus is one of the most important nutrients for all organisms as it is part of many key biomolecules, like DNA, RNA, and lipids. In addition Pi plays an essential role in the energy transfer chain and multiple metabolic pathways (Robinson (1996) Annals of Botany 77, 179-185). For these reasons, plants have developed several adaptive mechanisms to overcome Pi stress, which involve both morphological and metabolic changes. The most common adaptations under limiting Pi are: (a) morphological adaptations such as root growth and architecture changes and (b) metabolic adaptations, represented by: (i) changes in the respiration rate and phospholipid content of chloroplasts. Phosphate availability affects the thylakoid lipid composition, with changes in the relative amount of sulfolipids and a concomitant decrease in phosphatidylglycerol. Also several enzymes of the glycolytic pathway are altered (Theodorou and Plaxton (1993) Plant Physiol. 101, 339-344); (ii) secretion of protons and organic acids. Secretion of low-Mr organic acids helps to mobilise stores of Pi that are present in the soil as insoluble salts (Nagano and Ashihara (1993) Plant Cell Physiol. 34, 1219-1228); (iii) synthesis of proteins that include high-affinity Pi transporters, RNases (Bariola et al. (1994) Plant J. 6, 673-85) and phosphatases (Del Pozo et al. (1999) Plant J. 19, 579-589). Phosphate uptake by roots and distribution within the plant are presumed to occur via a phosphate/proton (H+/orthophosphate) cotransport. The uptake rate is enhanced severalfold by Pi deficiency, and there is evidence that Pi deficiency induces a high-affinity Pi transporter in root and leaf cells (Muchhal and Raghothama (1999) Proc. Natl. Acad. Sci. USA 96, 5868-72; Liu et al. (1998) Plant Physiol. 116, 91-9). When phosphate is still available in the cell, but not outside, the synthesis of extra- and intracellular (cytoplasmatic and vacuolar) RNases is induced by Pi starvation (Kock et al. (1998) Plant Mol Biol 27, 477-85). Also the stimulation of phosphatase activities in response to Pi starvation is well documented. Both RNases and phosphatases are thought to be involved in both Pi acquisition and recycling, depending on their cellular and subcellular location. In addition, a growing number of phosphorus stress response genes are being identified, but for most of them a clear function has not been found yet.

In plants it appears that the regulation of Pi uptake and transport is likely to operate at a number of different sites, including: uptake of Pi from the external medium by root hairs and epidermal cells, movement of Pi through the cortical cells, loading of Pi into xylem vessels in the root, unloading from the xylem into the shoot cells. At the cellular level, the cytoplasmatic Pi concentration is kept constant while the vacuole stores are more labile (Burleigh and Harrison (1999) Plant Phys. 119, 241-248).

Following the Pi stress signal, the rest of the plant exhibits significant metabolic alterations such as: activation of Pi recycling, alteration of plant respiration rate (alternative pathway of glycolysis and mitochondrial electron transport), modification of photosynthesis and photosynthate partitioning in leaves, and changes in Pi flow in the vascular system.

Phosphate signal transduction

The mechanisms that control the acclimation of Escherichia coli and Saccharomyces cerevisiae to Pi limitation have been extensively studied. In E. coli a two-component regulatory system governs the transcription of many genes that are responsive to the Pi levels of the environment. In S. cerevisiae many mutants (pho series mutants) have been isolated. In this system, transcription of the PHO5 gene, encoding a repressible acid phosphatase (rAPase), is under the control of the phosphate availability in the medium via a complex network of intracellular regulatory factors that comprises at least five genes: PHO2, PHO4, PHO80, PHO81 and PHO85. PHO2 and PHO4 encode the activators necessary for transcription of PHO5. When the levels of Pi are high, the PHO4 protein is hyperphosphorylated, impeding its nuclear import (and then the interaction with the PHO2 transcription factor). This phosphorylation is mediated by the PHO80/PHO85 cyclin/CDK complex, whose components are thus negative regulatory factors for PHO5 expression. PHO85 encodes a non-essential protein kinase with 50% identity to CDC28, and PHO80 encodes a protein with homology to other yeast cyclins.

Unlike the well-understood PHO regulation system in S. cerevisiae, the basis on which plants are able to respond to the external phosphate concentration is not yet understood.

Link between cell division control and phosphate nutrition

PHO85 and PHO80 (and PHO80-related sequences such as PCL1 and PCL2) might have substrates that mediate responses other than phosphate starvation, such as regulation of growth and cell division. This is supported by the observation that the S. cerevisiae PHO85 protein can interact with the G1 specific cyclins PCL1 and PCL2 (close homologues of PHO80). In a yeast strain deficient for the G1 cyclins CLN1 and CLN2, PHO85 is required for G1 progression. This result suggests that PHO85 is involved in a regulatory pathway that links the nutrient status of the cell with cell division activity (Gilliquet and Berben (1993) FEMS Microbiol Lett. 108, 333-9).

In plants the relationship between growth, cell division and Pi availability is demonstrated by the observed increase of lateral roots when the Pi concentration in the soil decreases, suggesting that low Pi levels act as a mitogenic factor. In tobacco BY-2 cells, cell division is inhibited when Pi is absent from the medium. Cells deprived of phosphate for 3 days could be induced to semi-synchronously re-enter the cell cycle from a static state (Sano et al. (1999) Phosphate as a limiting factor for the cell division of tobacco BY-2 cells. Plant Cell Physiol. 40). Both events suggest a clear link between cell cycle regulation and the available nutrient levels. This has more recently been demonstrated using a transgenic approach. Plants overexpressing a membrane associated phosphate apyrase show that an increase in phosphate transport correlates with an enhanced growth phenotype (Thomas et al. (1999) Plant Phys. 119, 543-551).

In accordance with the present invention, the plant homolog of the cyclin PHO80 has been isolated for the first time. A family of such PHO80-like proteins (PLPs) has also been identified. The invention therefore encompasses such nucleotide sequences, proteins and their derivatives, variants and homologs. It also provides transgenic plants comprising PLPs. Thus, the above described embodiments of the present invention may be preferably performed with PLP nucleic acids and proteins, for example as illustrated below.

An embodiment of the invention includes a method for modulating (e.g. increasing or decreasing), in a transgenic plant, the expression of PLP genes. The method comprises transforming a plant cell (as described previously) with a vector comprising a nucleotide sequence of a PLP of the invention. In some embodiments the PLP protein may be modulated by use of a promoter to up- or down-regulate gene expression or to regulate expression in certain tissues or under certain environmental conditions. In a preferred embodiment a constitutive or root-specific promoter is used.

One embodiment of the invention includes a method for improving the tolerance of plants towards suboptimal nutrient conditions, in particular levels of phosphate, by modulating PLP expression and/or activity. Another embodiment includes a method for improving the growth of plants in normal conditions or suboptimal nutrient conditions, in particular levels of phosphate, by modulating PLP expression and/or activity.

An embodiment of the invention includes a method for providing enhanced rate or frequency of seed germination comprising modulating PLP expression and/or activity.

Also in some embodiments coding regions of the PLP genes can be altered by insertion, deletion, substitution or addition to decrease the activity of the encoded protein.

An embodiment of the invention includes using a PLP gene in combination with one or more other PLP genes. Similarly they may be used in combination with other transgenes that confer another phenotype to the plant. Likewise, it is possible to first confer improved phosphate sensitivity to a plant in accordance with the method of the invention and then, in an additional step, to transform such plant with a further nucleic acid molecule, the presence of which results in another new phenotypic characteristic of said plant. Irrespective of the actual performance of transformation, the resulting plant of the present invention displays at least two new properties compared to a naturally occurring wild-type plant, that is, improved phosphate sensitivity and a phenotype that is due to the presence of a further nucleic acid molecule in said plants, e.g. herbicide or insecticide tolerance, resistance to pathogens, improvement of starch composition and/or production etc.; see also supra.

An embodiment of the invention includes a method for using PLPs as a positive or negative selectable marker during transformation procedures (Wickert et al. (1998) J. Bacteriology 180 (7):1887-1894). Overexpression of the PLP would mean that it could be used as a positive selectable marker during transformation procedures, while antisense/cosuppression means it could be used as a negative selectable marker. The selective agent is an antibiotic, preferably hygromycin.

These and other embodiments are disclosed and encompassed by the description and examples of the present invention. Further literature concerning any one of the methods, uses and compounds to be employed in accordance with the present invention may be retrieved from public libraries, using for example electronic devices. For example the public database "Medline" may be utilized which is available on the Internet, for example under http://www.ncbi.nlm.nih.gov/PubMed/medline.html. Further databases and addresses, such as http://www.ncbi.nlm.nih.gov/, http://www.infobiogen.fr/, http://www.fmi.ch/biology/research_tools.html, http://www.tigr.org/, are known to the person skilled in the art and can also be obtained using, e.g., http://www.lycos.com. An overview of patent information in biotechnology and a survey of relevant sources of patent information useful for retrospective searching and for current awareness is given in Berks, TIBTECH 12 (1994), 352-364.

The Figures show:

Figure 1: Expression of the PLP genes in Arabidopsis tissues. A gel blot of RT-PCR products from the indicated Arabidopsis tissues and from suspension-cultured cells is shown. Total RNA was prepared from these tissues, which were harvested whole from 4-week-old plants.

The present invention is further illustrated by reference to the following non-limiting examples.

Unless stated otherwise in the examples, all recombinant DNA techniques are performed according to protocols as described in Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, NY or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R.D.D. Croy, jointly published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).

Example 1: Identification of cell cycle interacting proteins using the two-hybrid system with CDC2b as a bait

A two-hybrid screening was performed using as bait a fusion between the GAL4 DNA-binding domain and CDC2bAt. Vectors and strains used were provided with the Matchmaker Two-Hybrid System (Clontech, Palo Alto, CA). The bait was constructed by inserting the CDC2bAt PCR fragment into the pGBT9 vector. The PCR fragment was created from the cDNA using primers designed to incorporate EcoRI restriction enzyme sites: 5'-CGGATCCGAATTCATGGAGAACGAG-3' (SEQ ID NO: 15) and 5'-CGGATCCGAATTCTCAGAACTGAGA-3' (SEQ ID NO: 16). The PCR fragment was cut with EcoRI and cloned into the EcoRI site of pGBT9, resulting in the plasmid pGBTCDC2B. The GAL4 activation domain cDNA fusion library was obtained from Clontech and derived from mRNA of Arabidopsis thaliana cell suspensions harvested at various growing stages: early exponential, exponential, early stationary, and stationary phase.

For the screening, a 1-liter culture of the Saccharomyces cerevisiae strain HF7c (MATa ura3-52 his3-200 ade2-101 lys2-801 trp1-901 leu2-3,112 gal4-542 gal80-538 LYS2::GAL1UAS-GAL1TATA-HIS3 URA3::GAL4 17mers(3x)-CyC1TATA-LacZ) was cotransformed with 590 µg pGBTCDC2B, 1100 µg DNA of the library, and 40 mg salmon sperm carrier DNA using the lithium acetate method (Gietz et al., 1992). To estimate the number of independent cotransformants, 1/1000 of the transformation mix was plated on Leu- and Trp- medium. The rest of the transformation mix was plated on medium to select for histidine prototrophy (Trp-, Leu-, His-). After 5 days of growth at 30°C, the colonies larger than 2 mm were streaked on histidine-lacking medium. A total of 10^7 independent cotransformants were screened for their ability to grow on histidine-free medium. A 5-day incubation at 30°C yielded 352 colonies. From the His+ colonies, the activation domain plasmids were isolated as described (Hoffman and Winston, 1987, Gene 57, 267-272). The HybriZAP™ inserts were PCR amplified using the primers 5'-AGGGATGTTTAATACCACTAC-3' (SEQ ID NO: 17) and 5'-GCACAGTTGAAGTGAACTTGC-3' (SEQ ID NO: 18). PCR fragments were digested with AluI and fractionated on a 2% agarose gel. Plasmid DNA whose inserts gave rise to different restriction patterns was electroporated into Escherichia coli XL1-Blue, and the DNA sequence of the inserts was determined. Extracted DNA was also used to retransform HF7c to test the specificity of the interaction.
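The estimate of independent cotransformants from the 1/1000 plating is a simple scale-up of the colony count on the Leu-/Trp- plate. The following Python sketch illustrates the arithmetic; the colony count used in the example is hypothetical, since the raw plate count is not reported here, and the function name is chosen for illustration only.

```python
def estimate_cotransformants(colonies_on_plate: int, dilution_factor: int = 1000) -> int:
    """Scale the colony count of the plated aliquot up to the whole transformation mix.

    dilution_factor=1000 corresponds to plating 1/1000 of the mix.
    """
    if colonies_on_plate < 0 or dilution_factor < 1:
        raise ValueError("counts must be non-negative and the dilution factor at least 1")
    return colonies_on_plate * dilution_factor


# Hypothetical plate count, chosen for illustration only.
example_count = 10_000
print(f"{estimate_cotransformants(example_count):.1e} independent cotransformants")
```

A count of this order on the control plate would correspond to a screen of about 10^7 cotransformants.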

Example 2: Identification of cell cycle interacting proteins using the two-hybrid system with CDC2a as a bait

A two-hybrid system based on GAL4 recognition sites regulating the expression of both the his3 and lacZ reporter genes was also used to identify CDC2aAt-interacting proteins. The bait used for the two-hybrid screening was constructed by inserting the CDC2aAt coding region into the pGBT9 vector (Clontech). The insert was created by PCR using the CDC2aAt cDNA as template.

Primers were designed to incorporate EcoRI restriction enzyme sites. The primers used were 5'-CGAGATCTGAATTCATGGATCAGTA-3' (SEQ ID NO: 19) and 5'-CGAGATCTGAATTCCTAAGGCATGCC-3' (SEQ ID NO: 20). The PCR fragment was cut with EcoRI and cloned into the EcoRI site of pGBT9, resulting in the pGBTCDC2A plasmid. For the screening, a GAL4 activation domain cDNA fusion library constructed from Arabidopsis thaliana cell suspension cultures was used. This library was constructed using RNA isolated from cells harvested at 20 hours, 3, 7 and 10 days after dilution of the culture in new medium. These time points correspond to cells from the early exponential growth phase to the late stationary phase. mRNA was prepared using Dynabeads oligo(dT)25 according to the manufacturer's instructions (Dynal). The GAL4 activation domain cDNA fusion library was generated using the HybriZAP™ vector purchased with the HybriZAP™ Two-Hybrid cDNA Gigapack Cloning Kit (Stratagene) following the manufacturer's instructions. The resulting library contained approximately 3 × 10^6 independent plaque-forming units, with an average insert size of 1 kb.
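The design of the two CDC2aAt primers can be checked with a short script. The sketch below, in Python, looks for the EcoRI recognition sequence (GAATTC) in each primer and inspects the three bases that follow it; the helper names are illustrative and are not part of the described method.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Primer sequences as given for the CDC2aAt bait construction.
PRIMERS = {
    "SEQ ID NO: 19": "CGAGATCTGAATTCATGGATCAGTA",
    "SEQ ID NO: 20": "CGAGATCTGAATTCCTAAGGCATGCC",
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def find_sites(sequence: str, site: str) -> list[int]:
    """Return the 0-based start position of every occurrence of site in sequence."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


for name, primer in PRIMERS.items():
    for pos in find_sites(primer, ECORI_SITE):
        following = primer[pos + len(ECORI_SITE): pos + len(ECORI_SITE) + 3]
        print(name, "EcoRI site at", pos, "followed by", following,
              "(reverse complement:", reverse_complement(following) + ")")
```

In the forward primer the site is followed directly by the ATG start codon; in the reverse primer the following triplet is the reverse complement of a TAG stop codon.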

For the screening, a 1-liter culture of the Saccharomyces cerevisiae strain HF7c (MATa ura3-52 his3-200 ade2-101 lys2-801 trp1-901 leu2-3,112 gal4-542 gal80-538 LYS2::GAL1UAS-GAL1TATA-HIS3 URA3::GAL4 17mers(3x)-CyC1TATA-LacZ) was cotransformed with 400 µg pGBTCDC2A, 500 µg DNA of the library, and 40 mg salmon sperm carrier DNA using the lithium acetate method (Gietz et al., 1992, Nucleic Acids Res. 20, 1425).

To estimate the number of independent cotransformants, 1/1000 of the transformation mix was plated on Leu- and Trp- medium. The rest of the transformation mix was plated on medium to select for histidine prototrophy (Trp-, Leu-, His-). Of a total of approximately 1.2 × 10^7 independent transformants, 1200 colonies grew after 3 days of incubation at 30°C.

The colonies larger than 2 mm were streaked on histidine-lacking medium supplemented with 10 mM 3-amino-1,2,4-triazole (Sigma). Two hundred and fifty colonies capable of growing under these conditions were tested for β-galactosidase activity as described (Breedon and Nasmyth 1995, Cold Spring Harbor Symp. Quant. Biol. 50, p643-650), and 153 turned out to be His+ and LacZ+. Plasmid DNA was prepared from the positive clones and sequenced.
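The successive selection steps of this screen can be summarised as a funnel. The Python sketch below uses only the counts stated above (approximately 1.2 × 10^7 transformants, 1200 colonies, 250 colonies and 153 clones) and computes the fraction retained at each step; the function and variable names are illustrative.

```python
# Counts reported for the CDC2aAt screen, in order of selection.
STAGES = [
    ("independent transformants", 12_000_000),   # approximately 1.2 x 10^7
    ("colonies on histidine-free medium", 1200),
    ("colonies growing on 10 mM 3-amino-1,2,4-triazole", 250),
    ("His+ and LacZ+ clones", 153),
]


def retained_fractions(stages: list[tuple[str, int]]) -> list[float]:
    """Fraction of each stage that is retained in the next stage."""
    return [nxt / cur for (_, cur), (_, nxt) in zip(stages, stages[1:])]


for (name, count), fraction in zip(STAGES[1:], retained_fractions(STAGES)):
    print(f"{name}: {count} ({fraction:.4%} of previous stage)")
```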

Example 3: Cell cycle interacting proteins associating with Cdc2aAt or Cdc2bAt

Nine cDNA clones were obtained by the methods described in Examples 1 and 2; they are further described below. The specificity of the interaction of those clones was verified by retransformation of yeast with pGBTCDC2A or pGBTCDC2B and the corresponding cDNA clones. As controls, pGBTCDC2A or pGBTCDC2B was cotransformed with a vector containing only the GAL4 activation domain (pGAD424), and the nine cDNA vectors were each cotransformed with a plasmid containing only the GAL4 DNA-binding domain (pGBT9). Transformants were plated on medium with or without histidine. Only transformants containing both pGBTCDC2A or pGBTCDC2B and one of the nine cDNA clones were able to grow in the absence of histidine.

Example 4: Vb89 (SEQ ID NO: 7) HAL3

BLAST analysis

A BLAST database search revealed that the Vb89 clone encodes the Arabidopsis thaliana HAL3 homologue, which was isolated recently and whose function was unknown.

Unexpectedly, the Vb89 clone interacts with CDC2bAt, but not with CDC2aAt, in the two-hybrid system. The interaction of Vb89 with CDC2bAt highlights an important role of Vb89 in cell cycle control. The publicly available databases were screened with the cDNA Vb89. An overall perfect homology with HAL3, already known in the databases, was found. With BLASTX, U80192 (score 1.9e-106) was found as the best homologue. This sequence is a partial cDNA from A. thaliana (entered in the databank: 28-APR-1997) (with publ.: Culianez-Macia, F.A., Espinosa-Ruiz, A. and Serrano, R., Arabidopsis thaliana HAL3 homolog gene, unpublished). Except that Vb89 is longer, there are no major differences with this cDNA.

HAL3 is a halotolerance gene isolated in Saccharomyces cerevisiae (Ferrando, 1995 Molecular and Cellular Biology, 15:5470-5481). Hal3p can inhibit the Ppz1 protein phosphatase, resulting in increased resistance to sodium and lithium. This effect is largely a result of the increased expression of the ENA/PMR2A gene. This gene codes for a P-type ATPase responsible for sodium efflux (De Nadal et al., 1998 Proc. Natl. Acad. Sci. USA, 95: 7357-7362). The HAL3 gene has also been isolated independently (as SIS2) and characterized on the basis of its ability to increase, when present in high copy number, the growth rate of sit4 mutants (Di Como et al., 1995 Genetics, 139: 107). The SIT4/PPH1 gene encodes a type 2A-related Ser/Thr protein phosphatase that is required in late G1 for normal G1 cyclin expression and for bud formation. Interestingly, overexpression of HAL3/SIS2 stimulates the rate of cyclin accumulation in sit4 mutants.

Altering expression of gene

The Vb89 or HAL3 gene isolated according to the invention (and its homologs, derivatives and variants) may be used to confer salt tolerance on plants and/or improved growth under such conditions. The gene is expressed in plants using various types of promoters, such as a constitutive promoter, a tissue-specific promoter, preferably a root-specific promoter, or an inducible promoter, preferably a salt-inducible promoter.

Example 5: VbDAHP (SEQ ID NO: 9)

A BLAST database search revealed that the VbDAHP clone encodes a 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase with a high similarity to the DHS2 gene. The VbDAHP clone interacts with CDC2bAt, but not with CDC2aAt, in the two-hybrid system. The publicly available databases were screened with the cDNA VbDAHP (SEQ ID NO: 9). An overall perfect homology was found with DAHP (AROG_ARATH 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase [Arabidopsis thaliana]), already known in the databases. With BLASTX, 000218 (score 1.9e-49, C-term; 5.6e-86, N-term) was found as the best homologue. This sequence is a complete mRNA from A. thaliana (entered in the databank: 01-NOV-1997) (with publ.: Keith, Proc. Natl. Acad. Sci. U.S.A. 88 8821-8825 (1991)). With BLASTN/nr the same DAHP was found.

In Arabidopsis thaliana, two genes have been isolated encoding 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase, an enzyme catalyzing the first committed step in aromatic amino acid biosynthesis (Keith et al., 1991). The two genes, DHS1 and DHS2, may have distinct physiological roles, as they are differentially expressed in plants subjected either to physical wounding or to infiltration by virulent and avirulent strains of Pseudomonas syringae. Other enzymes in the Arabidopsis aromatic pathway are also encoded by duplicated genes, an arrangement that may allow independent regulation of aromatic amino acid biosynthesis by distinct physiological requirements such as protein synthesis and secondary metabolism.

Keith B., Dong X., Ausubel F.M., Fink G.R. (1991) Differential induction of 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase genes in Arabidopsis thaliana by wounding and pathogenic attack. Proc. Natl. Acad. Sci. USA, 88: 8821-8825.

Example 6: VbHSF (SEQ ID NO: 13) Heat Shock Factor 3

BLAST analysis

A BLAST database search revealed that the VbHSF clone is very similar to the Arabidopsis thaliana Heat-Shock Transcription Factor HSF3. The VbHSF clone interacts with CDC2bAt, but not with CDC2aAt, in the two-hybrid system. Organisms synthesize heat shock proteins (HSPs) in response to sublethal heat stress and concomitantly acquire increased tolerance against a subsequent, otherwise lethal, heat shock. Heat shock factor (HSF) is essential for the transcription of many HSP genes. Recently two HSF genes, HSF3 and HSF4, were isolated from an Arabidopsis cDNA library (Prandl et al., Mol Gen Genet (1998) May; 258(3):269-78). Transgenic Arabidopsis plants were generated containing constructs that allow expression of HSF3 and HSF4 or the respective translational beta-glucuronidase (GUS) fusions. Overexpression of HSF3 or HSF3-GUS, but not of HSF4 or HSF4-GUS, causes HSP synthesis at the non-heat-shock temperature of 25 °C in transgenic Arabidopsis. In transgenic plants bearing HSF3/HSF3-GUS, transcription of several heat shock genes is derepressed.

Electrophoretic mobility shift assays suggest that derepression of the heat shock response is mediated by HSF3/HSF3-GUS functioning as transcription factor.

HSF3/HSF3-GUS-overexpressing Arabidopsis plants show an increase in basal thermotolerance, indicating the importance of HSFs and HSF-regulated genes as determinants of thermoprotective processes. Plants transgenic for HSF3/HSF3-GUS exhibit no other obvious phenotypic alterations.

Derepression of HSF activity upon overexpression suggests the titration of a negative regulator of HSF3 or an intrinsic constitutive activity of HSF3. Stable overexpression of HSFs may be applied to other organisms as a means of derepressing the heat shock response.

A possible regulatory interaction between heat shock response and cell cycle control in plants has already been suggested. Reindl et al. (Plant Physiol (1997) Sep;115(1):93-100) reported the phosphorylation of the Arabidopsis heat-shock transcription factor HSF1 by a cyclin-dependent kinase. The HSF1 kinase forms a stable complex with AtHSF1. The HSF1 kinase interacts with the cell-cycle control protein Suc1p and is immunoprecipitated by an antibody specific for the Arabidopsis cyclin-dependent CDC2a kinase. Phosphorylation by CDC2a in vitro inhibits DNA binding of AtHSF1 to the cognate heat-shock elements.

Different studies have shown that heat shock factors can serve as auxiliary proteins in the formation of CDK/cyclin complexes. For example, during meiosis I of mouse spermatocytes it is proposed that HSP70-2 assists in CDC2/cyclin B1 complex formation through interaction with CDC2, and that this interaction establishes and/or maintains the CDC2 protein in a conformation that is competent for cyclin B1 binding (Zhu et al., 1997 Development 124: 3007-3014).

To obtain further independent evidence for the interaction of VbHSF with CDC2bAt, the VbHSF protein was overproduced in E. coli, purified to homogeneity, and coupled to Sepharose beads. The VbHSF-Sepharose beads were used in binding and kinase assays.

Expression and purification

For VbHSF expression and purification, a fusion protein with a His-tag sequence was generated. The VbHSF-coding region was PCR amplified using the primers 5'-CCATATGGAATTCGCACGAGGC-3' (SEQ ID NO: 21) and 5'-GCAGTAATAGGATCCACTATAGGG-3' (SEQ ID NO: 22). The PCR fragment was cut with NdeI and BamHI and cloned into the NdeI and BamHI sites of pET19b (Novagen), and the resulting vector pETHSF was transformed into E. coli BL21 cells (Novagen). E. coli cells were grown at 37 °C until OD 0.6, and production of the fusion protein was induced by adding 1 mM isopropyl-β-D-thiogalactopyranoside for 3 h at 30 °C. Cells were spun down and frozen. After thawing on ice, the cells were resuspended in lysis buffer containing 50 mM NaH2PO4 (pH [...]), 300 mM NaCl and [...] mM imidazole, and lysozyme was added to 1 mg/ml. After incubation on ice for [...] minutes, cells were sonicated and spun; the supernatant was loaded on Ni-NTA resin (Qiagen) and incubated for 1 h at 4 °C (200 rpm on a rotary shaker). Subsequently, the lysate-Ni-NTA mixture was loaded on a column and the column was washed with [...] volumes of wash buffer containing 50 mM NaH2PO4 (pH [...]), 300 mM NaCl and 20 mM imidazole. Next, the fusion protein was eluted with 3 volumes of elution buffer (50 mM NaH2PO4 (pH [...]), 300 mM NaCl and 250 mM imidazole). The purified VbHSF protein was coupled to CNBr-activated Sepharose 4B (Pharmacia, Uppsala, Sweden) at a concentration of 5 mg/ml of gel according to the manufacturer's instructions.

Binding assay

Protein extracts were prepared from 2-day-old cell suspensions of A. thaliana Col-0 in homogenization buffer (HB) containing 50 mM Tris-HCl (pH [...]), 60 mM β-glycerophosphate, 15 mM nitrophenyl phosphate, 15 mM EGTA, 15 mM MgCl2, 2 mM dithiothreitol, 0.1 mM vanadate, 50 mM NaF, 20 µg/ml leupeptin, 20 µg/ml aprotinin, [...] µg/ml soybean trypsin inhibitor (SBTI), 100 µM benzamidine, 1 mM phenylmethylsulfonylfluoride, and 0.1% Triton X-100. In a total volume of 300 µl HB, 150 mg of protein was loaded on 30 µl 50% VbHSF-Sepharose or control Sepharose beads and incubated on a rotating wheel for 2 h at 4 °C. Beads were washed 3 times with Beads Buffer containing 50 mM Tris-HCl (pH [...]), 50 mM NaF, 250 mM NaCl, 5 mM EDTA, 0.1% NP-40, 10 µg/ml leupeptin, 10 µg/ml aprotinin, 10 µg/ml SBTI, 100 µM benzamidine, and 1 mM phenylmethylsulfonylfluoride. Beads were resuspended in 25 µl SDS-loading buffer and boiled. The supernatant was separated on a 12.5% SDS-PAGE gel and electroblotted onto nitrocellulose membrane (Hybond-C+; Amersham). Filters were blocked overnight with 2% milk in phosphate buffered saline (PBS), washed 3 times with PBS, probed for 2 h with specific antibodies for CDC2aAt (1/5000 dilution) or CDC2bAt (1/2500 dilution) in PBS containing 0.5% Tween-20 and 1% albumin, washed for 1 h with PBS, incubated with peroxidase-conjugated secondary antibody (Amersham) and washed for 1 h with PBS containing 0.5% Tween-20. Protein detection was done by the chemiluminescent procedure (Pierce, Rockford, IL). Using a CDC2bAt-specific antibody, no signal was observed at the expected size of CDC2bAt in extracts eluted from the control Sepharose beads. However, a clear positive signal was observed for extracts loaded on the VbHSF-Sepharose beads, giving independent evidence for the interaction between VbHSF and CDC2bAt.

Kinase assays

The kinase assays were performed with Cdk complexes purified from total plant protein extracts by p13suc1-Sepharose affinity binding, according to Azzi et al. (1992). To test whether the VbHSF protein has a regulatory effect on the phosphorylation of histone H1, VbHSF-Sepharose and control beads were used during a histone H1 kinase assay as described by Hemerly et al. (1995). After 20 min incubation at 30°C, samples were analysed by SDS-PAGE and autoradiographed. No difference in [32P] phosphate incorporation in histone H1 could be detected between the control and the VbHSF samples.

In another kinase assay, histone H1 was omitted in order to test whether VbHSF itself functions as a substrate. Analysis of the autoradiograph revealed phosphorylation of the VbHSF protein by the CDK complexes. In combination with the binding assay, this suggests that the VbHSF protein acts as a substrate for CDC2bAt.

These results indicate that VbHSF (or HSF3) is phosphorylated by CDK. This suggests a regulatory role of phosphorylation of VbHSF by CDK/cyclin complexes namely that HSF3 activity is affected by phosphorylation, and hence its ability to confer thermotolerance on a plant may be manipulated.

Altering HSF3 expression to confer thermotolerance in plants

The invention provides a method for conferring thermotolerance on a plant by modifying the activity of HSF3, preferably via its phosphorylation state. To this end, a nucleic acid encoding an HSF3 with a modified phosphorylation state is introduced into a plant cell, plant tissue or plant.

It is possible to identify phosphorylation sites of HSF3 by random mutagenesis, by anti-phospho-amino acid antibodies (e.g., anti-phospho-tyrosine and anti-phospho-threonine antibodies; Zhang, Planta 200 (1996), 2-12) and by computer-assisted identification, all methods known in the art. A state of enhanced phosphorylation can be mimicked by replacing the phosphorylated amino acids by a glutamic acid or aspartic acid. A method to prevent phosphorylation and to mimic a non-phosphorylated state comprises replacing the phosphorylated amino acids by an amino acid that cannot be phosphorylated (other than glutamic acid or aspartic acid), namely an amino acid that is not tyrosine, serine or threonine. The invention also relates to transgenic plants, tissues and cells obtainable by the methods above and comprising an HSF3 with modified activity. As mentioned previously, such transgenic plants may display improved tolerance to stress, in particular heat stress.
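The substitution strategy described above can be sketched in Python. The example scans a protein sequence for serine or threonine followed by proline (the minimal S/T-P motif generally associated with CDK substrates, used here only as one possible computer-assisted criterion) and then builds a phosphomimetic or a non-phosphorylatable variant. The toy sequence, the helper names and the specific replacement residues (alanine, phenylalanine) are assumptions made for illustration; they are not taken from the HSF3 sequence or from the method as described.

```python
import re

# Illustrative substitution tables (assumptions, see lead-in).
PHOSPHOMIMETIC = {"S": "D", "T": "E", "Y": "E"}        # acidic residues mimic phosphate
NON_PHOSPHORYLATABLE = {"S": "A", "T": "A", "Y": "F"}  # residues that cannot be phosphorylated


def candidate_cdk_sites(protein: str) -> list[int]:
    """Return 0-based positions of Ser/Thr residues directly followed by Pro."""
    return [match.start() for match in re.finditer(r"[ST](?=P)", protein)]


def substitute(protein: str, position: int, table: dict[str, str]) -> str:
    """Replace the residue at position using table; fail if it is not S, T or Y."""
    residue = protein[position]
    if residue not in table:
        raise ValueError(f"residue {residue!r} at position {position} is not S, T or Y")
    return protein[:position] + table[residue] + protein[position + 1:]


toy_protein = "MASPKRTPLE"  # hypothetical sequence, not HSF3
for site in candidate_cdk_sites(toy_protein):
    print(site, substitute(toy_protein, site, PHOSPHOMIMETIC),
          substitute(toy_protein, site, NON_PHOSPHORYLATABLE))
```

Candidate positions found in this way would still need experimental confirmation, for example with the antibody-based or mutagenesis approaches mentioned above.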

Example 7: LDV24 (SEQ ID NO: 3) PHO80-like protein (PLP)

BLAST analysis

The LDV24 gene, renamed the PLP5 gene, encodes a protein that interacts with CDC2a and is highly similar to the PREG1 and PHO80 proteins of Neurospora crassa and Saccharomyces cerevisiae, respectively. The publicly available databases were screened with the cDNA LDV24. With BLASTX, the PREG (AF051226) protein from Picea mariana (score: 1.5e-35) and the PREG (AC003672) protein from Arabidopsis (score: 3.1e-35) were found as best homologues. There is also homology with (P20052|PH80_YEAST) PHOSPHATE SYSTEM CYCLIN PHO80 (score: 2.1e-10). With BLASTN/nr, AF051226 Picea mariana PREG-like protein (score: 3.9e-12) was found.

Functional domains are predicted at amino acid positions 61-168 and 73-171 as comprising putative cyclin like interacting domains.

PHO80 itself shows similarity to the Saccharomyces cerevisiae G1-specific cyclins HCS26 and OrfD (Kaffman, Science 263 (1994) 1153-1155). The catalytic CDK subunit binding to PHO80 is PHO85, a CDK with roles in both the cell cycle and metabolic controls (Lenburg and O'Shea 1996, TIBS 21, p383-387). PHO80 in complex with PHO85 regulates phosphatase gene expression. When inorganic phosphate in the medium is abundant, the PHO80-PHO85 complex phosphorylates the PHO4 transcription factor. Phosphorylated PHO4 remains mainly cytoplasmic, resulting in the repression of expression of the PHO5 phosphatase gene (O'Neill et al. 1996, Science 271, p209-212). When cells are starved for phosphate, the PHO80-PHO85 complex is inhibited by the CDK inhibitor PHO81, and transcription of PHO5 is activated.

The levels of PHO5 expression are sensitive to the levels of PHO80. Overexpression of PHO80 results in a partial defect of PHO5 activation when phosphate is limiting (Yoshida et al. 1989, MGG 217, p40-46; Madden et al. 1988, Nucleic Acids Res. 16, p2625-2637). On the other hand, deletion of PHO80 results in the presence of high levels of inorganic phosphate (Madden et al. 1988, Nucleic Acids Res. 16, p2625-2637).

Similar effects can be expected for plants when the LDV24 gene is deleted or overexpressed. This might result in an adapted growth in conditions where organic phosphate is present at limiting or exceeding levels. More phosphate accumulation might positively affect the rate of plant growth and biomass production.

Isolation of other PLPs

A systematic database screening using the PLP5 gene sequence as template revealed the existence of four related genes in Arabidopsis thaliana (see Table 1). These novel genes were isolated by RT-PCR using the combinations of primers listed below. Briefly, total RNA was isolated from exponential-phase cell suspension cultures of Arabidopsis thaliana ecotype Columbia by the guanidinium thiocyanate-phenol-chloroform method using an RNA extraction solution (TRIzol Reagent, GibcoBRL, Grand Island, NY). For the cDNA synthesis, the Superscript preamplification system was used, taking 3 micrograms of total RNA for the first-strand synthesis with the oligo(dT) primer and otherwise following the manufacturer's manual. One microliter of cDNA was used for isolating the five PLP genes by PCR using specific primers (see Table 2 and SEQ ID NOS: 23 to 32) and the following PCR conditions: initial denaturation at 94 °C for 2 min; 35 cycles of 94 °C for 45 sec, 55 °C for 45 sec and 72 °C for 1 min; and a final extension at 72 °C for 5 min. After gel purification, the cDNA fragments were cloned directly into the vector pGEM-T and sequenced. The resulting nucleotide and amino acid sequences are given in Table 3a and SEQ ID NOS: 33 to 42. All proteins contain a highly conserved cyclin-like domain (see Table 3b and SEQ ID NOS: 43 to 47). Table 4 gives the percentage of sequence identity and similarity between the different PLP proteins.
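The PCR conditions given above can be written down as a small data structure, which also makes it easy to compute the total programmed hold time. The following Python sketch does so; the class and function names are illustrative, and ramping time between temperatures (which depends on the thermocycler) is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time in seconds


# Program as stated in the text.
INITIAL_DENATURATION = Step(94, 120)
CYCLE = (Step(94, 45), Step(55, 45), Step(72, 60))
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 300)


def total_hold_seconds(initial: Step, cycle: tuple[Step, ...], n_cycles: int, final: Step) -> int:
    """Sum of all programmed hold times, excluding temperature ramping."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle) + final.seconds


total = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION)
print(f"{total} s ({total / 60:.1f} min) of programmed hold time")
```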

Table 1: Database acknowledgements of PLPs.

Columns: EST; Annotation; GenBank accession number; Chromosome

PLP1 T4E19 F161322.1 080513 11
PLP2 143B15T7, v
PLP3 1O3D21XP, 316G7T7, 227G23T7, III 103D21T7
PLP4 T14P1.11 AAD32828 11
176E21T7, 213N\15T7, 230B316T7, 138P12T7 III

Arabidopsis thaliana WU-BLAST2 Search, Comparison Matrix: BLOSUM62

Table 2: List of the forward and reverse primers used for isolating the PLP1-4 genes.

Columns: forward primer; reverse primer

PLP1 5'GGGAATTCATGGCGGAACTGAGMA ___TCC3' _ICG3'
PLP2 5'GGGAATTCATGGCTGATCAGATTGA GATCC3' GCG3'
PLP3 5'GGGAATTCATGTTAACCGCAGCCGG ___AGACG3' GATG3'
PLP4 5'CCGAATTCATGGATTCCCTAGCGATT TCTCC3' GATGG3'
5'GGGAATTCATGGACTCTCTCGCAAC 5'GGGGATCCTTGCCGATCAGCGTGC3' CT_0

Bold sequence: sequence included to facilitate cloning

Table 3a: cDNA and deduced amino-acid sequence of the A. thaliana PLP genes.

PLP1 (SEQ ID NOs: 33 and 34)

ATGGACTAATCATTAGCAGTAACTCTTTCTGT

GAGCGAGTTGCTGAGTCAAACGATCTGACCCGACGAGTCGCGACTCAGTCACAGAGAGTTTCG

GTGTTTCATGGACTGAGTCGACCACGATAACGATTCAGAGCTATCTAGAGAGGATCTTCA

TACGCAAATTGTAGTCCTTCTTGCTTCGTCGTTGCTTACGTTTATCTCGATCGTTTCACTCAC

AGACAACCTTCACTTCCCATCAATTCCTTTAACGTCCATCGTCTTCTCATCACTAGTGTCATG

GTCGCTGCTAAATTCCTCGATGATCTGTACTACAACATGCGTATTACGCGAAGTGGGAGGA

ATACCAGAAGATTTGGTGTTTATGGTGATGAT

AACGTGACGCCAAACACATTCAACGCCTACTTCTCTTATCTTCAAGATGACTCTTCTT

CACCTCTCTCTCTCGTTGTTGTCCCATCATCAAGATCTCTCATTACCTTCACGACGATGA

GCTTCTCATCAGAACAACAACAACAACACTCGCTGTTTGA

MALNSMKIFSLEVENLRVTSRSFGSPIISLRF

YANCSPSFVY DFHQSPNFVRLISMAKLDYNAYKG I STKEMNFLELDFLFGLGFELNVTPNTFNAYFSYLQKEMTLLQ PLSLVVVPS SRSLITFNDDE

ASHQKQQQQQLAV

PLP2 (SEQ ID NOs: 35 and 36)

ATGGCTGATCAGATTGA AT CAGATGACCGATCTTCAACATGCGGT

ATGCCAAGTGTTTTAACGGCATGTCGTATCTCTTGCAGAGTCGACACAAC

CTGAGCCAGAAACAGAAGCCCTCAGCTTCACTGGAGTACCWACTCATCACG

AGCTATCTCGACGGATCTTTGAATACGCGATTGTAGCTACTCTTAACTGAA

ATATATTTGGATCGGTTCGTGAGAGCAGCCATTTTTGCCTATCATCTTAGCA

AGGCTTATAATCACAAGTGTCTTGGTCTCTGCTAJATTCATGGAGCTGTCAAT

TTGTTCGGAATTGGGTTTGAGTTAAJACGTCACCGTTTCTACTTTATATTGTTT

CTACAAGAGAGATGGCGATGTTGATGAGATGAAGTCTCTGTTCGACTTCTC

AAAATCTCTTTTAAGACGAAACTTGTGATGTATCCACACGAGAGCCTACATA

CACAACAAGAAGCAACTCGCTGCTGCTTGA

MADQIEIQRNQDLQEPLAEIPSLTSYLLQRVSENNLQQKSSTVPIS

IR

SYLERIFEYANCSYSCYIVAYIYLDRFVKKQPFLPINSFNVRLITSVLVSAKFDDLSYNN

EYYAKVGGI SREEMNMLELDFLFG IGFELNVTVSTFNNYCCFLQREAMTJKRTSLFLE PS SF KI SFKTKLVMYPHEEDSLSTHNKKQLAA

PLP3 (SEQ ID NOs: 37 and 38)

ATGTTAACCGCAGCCGGAGACGATGACTGGACCCGGTCGTGGCAATGCAGA

GCAGCCACTCCAAGAGTGCTGACTATATCTCCCATGTGATGGAAGTGGCCAA

GAGTGGTTAGCTAAGCWCTAGGATTTGGGAGAGCTGA

CTTAGCTAA

GCGCCGAGCATAAGTATACTATACCTTGAGAGGATATATAATCCAAGACC

GCATGTTTCGTTGTTGGTATGTGTACATAGACCGGTTGGCTA

AGTCGTCTG

GTTGTCTCCTTGAATGTTCATAGACTCCTCGTCACTTGTGTCATATCGCAAAT

GATGACGTGCACTACACACGAGTTCTATGCTCGGGTTGGAGGAGATCGC

AACAAAATGGAGTTGGAGCTTCTCTTTCTTCTTGACTTTAGTCGGGTTGGT

TTCGAGAGCTATTGCTTTCACCTCGAGAGATGCAATACAGCGTCTCT

TAGATTCACAGAGATTCTCTCCAGCATCTACTTTATCATCTTTATATGTT

MLAGDLDVGEAEATR

ISHVMKLVARNEWLAKQTKGFGKSLHV

APSIS

IAKYLERIY{YTKCSPAFVYYDLAKPSVSNHRLTVIAI

DDVHNNEFYAVGGVSNDLKELELLFLLDFRVTVFVEYCHEEQJDVS

KDIQPMQESLSPASTLSSLYV

PLP4 (SEQ ID NOs: 39 and 40)

ATGGATTCCCTAGCGATTTCTCCAGGAGCTCCGATCAGCTTAT

TATTACAA

GATGATTCCAACACAGTACCTCTAGTCATCTCTGTTCTCTGCTACACACTA


GCTAGGAACGAGAGATCAGCCGGAGCTACGGTGGTTTTGGTAGACACGTGTCTTTGATTGC

CGGGAGATTCCTGATATGACTATTCAATCATACCTAGAGAGATTTTCCGGTATACCAAAGCC

GTCCATCGGTTTACGTCGTGGCTTATGTATACATTGACCGGTTCTGTCAGATCCAAGGT

TTCAGAATCAGTCTTACCAATGTACATCGTCTCCTTATCACACTATCATGATCGCTTCCA

TACGTCGAAGATATGAACTACAAACTCGTACTTTGCGAAGTAGGAGGATTAGAGACAGAA

GATGAATTGATGGTTGTTGTGATAGTCTTATT

AGTGTGTTCGAGAGTTACTGTTGTCATCTAGAGAGAGTAGTATTGGAGGAGGTTATCAG

ATGAAGATCTGGTAGATAACAAAATTCAACTA

CATCATCATCACCATCAATTTTCTCGAATCATGTTGTAG

MDSLAI SPRKLRSDLYSYSYQDDSNfTPLVI SVLSSLIERTLARNERI

SRSYGGFGKTRVFDC

REIPDMTIQSYLERIFRYTKAGPSVYVVAYVYIDRFCQNNQGFRI

SLTNVHRLLITTINIASK

YEDMNSYFAKVGGLETEDLNNLELEFLFLMGFKLHVJSVFESYCCHLERVS

IGGGYQ

IEKALRCAEEIKSRQIVQDPKHHHjHHQFSRML

PLP5 (SEQ ID NOs: 41 and 42)

ATGGACTCTCTCGCA-ACCGATCCAGCTTTCATTGATTCGGATGTATACCTCAGGTTAGGACTT

ATTATTGAGGGCAAACGATTGAAAAAGCCACCGACTGTACTCTCACGCCTCTCTTCTTCTCTG

GAGAGATCTCTGTTACTCAATCATGATGACAGATTCTGCTTGGATCGCCAGACTCTGTTACC

GTGTTTGACGGGAGATCTCCCCCTGAGATCAGTATTGCACACTACTTGGATCGCATTTTCAAG

TACTCTTGCTGCAGTCCCTCCTGCTTCGTCATTGCGCATATCTACATTGATCACTTTCTCCAT

AAGACCCGAGCCCTTCTCAAACCCCTTATGTCCACCGCCTTATCATTACACTGTCATGTTA

GCTGCTAAAGTCTTCGATGATAGGTATTTCAACATGCATACTACGCAGAGTGGGAGGTGTG

ACTACGAGAGAGTTAAACAGATTGGAGATGGAGTTGTTGTTTACCCTTGACTTCAAGCTTCAG

GTAGATCCTCAGACGTTTCACACACACTGTTGTCAGTTAGAAGCAGAJCAGAGACGGCTTC

CAGATCGAGTGGCCCATAAAGAAGCATGCCGAGCCACAAGAGACTTGGCAGAGAGGACA

CCCGACTCACTCTGCTCTCAACCACAGCACGCTGATCGGC

MDSLATDPAFIDSDVYLRLGLI IEGKRLKKPPTVLSRLS

SSLERSLLLNHDDKILLGSPDSVT

VFDGRSPPEI SIAHYLDRIFKYSCCS PSCFVIAHIYIDHFLHKTRALLKPLNHLI

ITTVML

AAVDRFNYAVGTRLREELTDKQDQFTCQEQRG

Q IEWPIKEACRANKETWQKRTPDSLC

SQTTAR

Table 3b: Domain analysis

Columns: protein (SEQ ID NO); position; amino acid sequence; CYCLIN domain E-value

PLP1 (SEQ ID NO: 43); 56-141; YLERIFKYANCSP SCFVVAYVYLDFTHRQPSLPINSF NVHRLLITSVVAAFLDDLYYNAYYJKGGI STKEM NFLELDFLF; 1.24955
PLP2 (SEQ ID NO: 44); 64-149; YLERIFEYANCSYSCYIVAYIYLDRFVKKQPFLPINSF NVHRLIITSVLVSAKFMDLSYEYYAVGGISREEM NMLELDFLF; 1.19454
PLP3 (SEQ ID NO: 45); 71-156; YLERI YKYTKC SPACFVVGYVYIDRL J{KI.PGSLVVS L NVHRLLVTCVMIAAKILDDVFYNNEFYARVGGVSNDL NKMELELLF; 0.63602
PLP4 (SEQ ID NO: 46); 73-158; YLERIFRYTKAGPSVYVVAYYIDRFCQNQGFRISLT NVHRLLITTIMIASKYEDMYKSYFAKVJGGLETEDL NNLELEFLF; 1.87380
PLP5 (SEQ ID NO: 47); 77-161; YLDRIFKYSCCSPSCFVIAIYIDHFLHKTALLKPLN VHRLIITTVMLAAKVFDDRYFNAYYRVGGJTTREN RLEMELLF; 0.19729

Notes: The CYCLIN domain is a domain present in cyclins, TFIIB and Retinoblastoma. E-values are calculated using Hidden Markov Models.

Table 4: Amino acid sequence identity and similarity (bold) between the different A. thaliana PLPs.

PLP1 PLP2 PLP3 PLP4 -61 45 42 48 PLP1 PLP2 80 41 136 44 PLP3 68 67 145 68 62 68 -4 PLP4 IL566 63 64 6

Expression analysis of PLP genes in plants

The spatial expression pattern of the different PLP genes was studied by quantitative RT-PCR using the Superscript preamplification system (Gibco/BRL, Gaithersburg, MD, USA). Total RNA was isolated from roots, rosette leaves, stems, flowers, seedlings, and actively dividing cell suspensions using the Trizol reagent according to the manufacturer's protocol. First-strand cDNA was synthesised from 1 microgram of RNA as described by the manufacturer. The single-stranded cDNA products were subjected to PCR using 0.2 mM concentrations of 5' and 3' specific primers (see Table 2). Care was taken to quantify changes in individual mRNA levels by employing appropriate RT-PCR conditions under which a linear relationship existed between amounts of RNA added and intensities of the RT-PCR products. Aliquots of 10 microliters were taken after 15, 20 and 25 cycles, each cycle being 94°C for 30 s, 55°C for 30 s, and 72°C for 1 minute. The products were electrophoretically separated on a 1.0% agarose gel, stained with ethidium bromide and blotted onto nitrocellulose membranes. Fluorescein-labelled probes specific for the different PLP genes were prepared using the Gene Images random prime labelling module (Amersham). Signals were visualised using the Gene Images CDP-Star detection module (Amersham). A hybridising signal for PLP3 could only be observed for root tissue. In contrast, PLP1, PLP2, PLP4, and PLP5 gene expression could be detected in all tissues examined (see Figure 1).
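The requirement that RT-PCR signal be linear with respect to the amount of input RNA can be checked numerically. The Python sketch below computes the squared Pearson correlation between RNA input and band intensity; the data points and the acceptance threshold are hypothetical values chosen for illustration and are not measurements from this work.

```python
def r_squared(x: list[float], y: list[float]) -> float:
    """Squared Pearson correlation coefficient between x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy * sxy / (sxx * syy)


# Hypothetical calibration series: RNA input (micrograms) versus band intensity (arbitrary units).
rna_input = [0.25, 0.5, 1.0, 2.0]
band_intensity = [100.0, 210.0, 395.0, 810.0]

LINEARITY_THRESHOLD = 0.98  # assumed cut-off, to be set by the experimenter
value = r_squared(rna_input, band_intensity)
print(f"r^2 = {value:.4f}; linear range acceptable: {value >= LINEARITY_THRESHOLD}")
```

A cycle number at which the signal has started to plateau would show up as a lower correlation and would be excluded from quantification.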

Interaction between the PLPs and CDKs

Protein-protein interactions between the different PLPs and CDKs were studied using a two-hybrid system based upon GAL4 recognition sites to regulate the expression of the his3 reporter gene. Vectors and strains used were provided with the Matchmaker Two-Hybrid System (Clontech, Palo Alto, CA). The baits used for the two-hybrid analysis were constructed by inserting the PLP coding regions into the pGBT9 (as a fusion protein with the DNA-binding domain of the GAL4 transcription factor) and pGAD424 (as a fusion protein with the transcriptional activation domain of the GAL4 transcription factor) vectors. The inserts were created by PCR using the PLP cDNAs as template and primers designed to incorporate EcoRI and BamHI restriction enzyme sites (see Table 2), resulting in the plasmids pGBTPLP1 to pGBTPLP5 and pGADPLP1 to pGADPLP5. Vectors were tested for self-activation, and pGBTPLP2, pGBTPLP3 and [...] were found positive, excluding their use for studying protein-protein interactions. All other constructions were tested for their interaction with the CDC2aAt and CDC2bAt proteins, cloned in pGBT9 and pGAD424 (De Veylder et al. (1997) FEBS Lett 412, 446-52). For this, an appropriate reporter strain (HF7c (MATa ura3-52 his3-200 ade2-101 lys2-801 trp1-901 leu2-3,112 gal4-542 gal80-538 LYS2::GAL1UAS-GAL1TATA-HIS3 URA3::GAL4 17mers(3x)-CyC1TATA-LacZ)) was transformed with different combinations of the two-hybrid vectors and tested for its ability to grow in the absence of histidine. The results obtained are summarised in Table 5. All PLPs were shown to interact with CDC2aAt. PLP2 and PLP3 interact only with CDC2aAt, not with CDC2bAt. In contrast, PLP1, PLP4, and PLP5 interact with both CDC2aAt and CDC2bAt, but more strongly with CDC2aAt.

Table 5:Two hybrid interaction between the PLPs and CDC2a and CDC2b genes.

pGBT9 pGADCDC2a pGADCDC2b pGAD424 (control) PLP1 PLP2 ND ND PLP3 ND ND PLP4 ND ND+++ PGAD424 pGBTCDC2a pGBTCDC2b PGBT9 (control) PLP1 PLP2 PLP3 PLP4

Note: interaction, no interaction, ND not determined

Isolation of PLP5 (cyclin PHO80) Arabidopsis mutant

In plants, a direct way of obtaining information on the function of a gene of interest is to study a mutant plant in which the gene is disrupted (reverse genetics).

To identify a mutant plant, DNA extracted from pools of a collection of mutagenized plants, generated for example by the insertion of a T-DNA element, is used as template for PCR screening with oligonucleotide primers from the insertional element and from the gene of interest. The PCR reaction is sensitive enough to detect the insertion of a T-DNA in the target gene within a pool. Once a pool has been confirmed to contain the gene of interest linked to the insertional element, the different mutant plants used to prepare the pool are analysed by PCR in order to identify the individual mutant line.

A. Identification of pools containing T-DNA insertion mutant in [...]

1. Arabidopsis T-DNA insertion mutant collection

At INRA-Versailles, a large population of mutagenized Arabidopsis plants, ecotype Wassilewskija, has been generated by vacuum and detergent infiltration methods (Bechtold et al., 1993; 1995) with a suspension of Agrobacterium strain MP5-1 carrying the binary vector pGKB5 (Bouchez et al., 1993; Bechtold et al., 1995). At present, the collection contains more than 35,000 independent T-DNA lines with more than 55,000 inserts (an average of one insertion every 2.5 kb).

For reverse genetics screens, the seeds of the generated T-DNA lines are grouped in primary pools of 48 families. Approximately 100 seeds from each family are mixed and grown in vitro on a large Petri plate. The seedlings (10-15 days old, two-leaf rosette stage) are used for DNA extraction as described by Doyle and Doyle (Focus 12:13-15).

Aliquots of 20 ul of the resuspended DNA (100-300 ng/ul) from each of 16 primary pools are combined to prepare 2 ml of one hyper-pool. Each hyper-pool therefore represents 768 independent T-DNA lines (16 primary pools of 48 families). Aliquots of 5 ul (15-30 ng/ul) of each hyper-pool (46 hyper-pools at present) are loaded into a 96-well microplate, in which the PCR amplification reaction is performed.
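By way of illustration only, the pooling figures quoted above can be checked with a short Python sketch. The function names are hypothetical, and the plate count assumes one reaction per well with four gene primer/T-DNA primer combinations and no control wells, which the text does not state.

```python
import math

# Figures quoted in the text above.
FAMILIES_PER_PRIMARY_POOL = 48
PRIMARY_POOLS_PER_HYPER_POOL = 16
HYPER_POOLS = 46
WELLS_PER_PLATE = 96


def lines_per_hyper_pool():
    """Independent T-DNA lines represented by one hyper-pool."""
    return FAMILIES_PER_PRIMARY_POOL * PRIMARY_POOLS_PER_HYPER_POOL


def lines_covered(hyper_pools=HYPER_POOLS):
    """Independent T-DNA lines represented by all hyper-pools together."""
    return hyper_pools * lines_per_hyper_pool()


def plates_needed(primer_pairs, hyper_pools=HYPER_POOLS):
    """Microplates needed if every hyper-pool is screened with every primer pair.

    Assumption: one reaction per well and no control wells.
    """
    return math.ceil(primer_pairs * hyper_pools / WELLS_PER_PLATE)


print(lines_per_hyper_pool())  # 768
print(lines_covered())         # 35328
print(plates_needed(4))        # 2
```

The 35,328 lines covered by 46 hyper-pools agree with the "more than 35,000 independent T-DNA lines" stated for the collection.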

References

Bechtold N., Ellis J. and Pelletier G. (1993). In planta Agrobacterium mediated gene transfer by infiltration of adult Arabidopsis thaliana plants. C.R. Acad. Sci. Paris, Sciences de la vie/Life Sciences 316: 1194-1199.

Bouchez D., Camilleri C. and Caboche M. (1993). A binary vector based on Basta resistance for in planta transformation of Arabidopsis thaliana. C.R. Acad. Sci. Paris, Sciences de la vie/Life Sciences 316: 1188-1193.

Bechtold N. and Bouchez D. (1995). In planta Agrobacterium-mediated transformation of adult Arabidopsis thaliana plants by vacuum infiltration. In Gene transfer to plants. I. Potrykus and G. Spangenberg, Eds., Springer-Verlag, Heidelberg, pp 19-23.

WEB pages:
T-DNA lines: http://nasc.nott.ac.uk:8300/Vol2ii/pelletier.html
sequence: http://nasc.nott.ac.uk:8300/Vol2ii/bouchez.html

2. PCR screening

The oligonucleotide primers for the Arabidopsis cyc PHO were designed from the cDNA sequence of the clone identified as interacting with the Arabidopsis cdc2a kinase in a two-hybrid screen. The forward and reverse primers were tested for specificity and yielded good PCR amplification using wild-type genomic DNA of Arabidopsis plants, ecotype WS, as template. The designed primers did not show unspecific amplification in combination with the T-DNA primers Tag3 nor [...].

Primers
Forward primer F2: 5'-ATTGCACACTACTTGGATCGCATT-3' (SEQ ID NO: 48)
Reverse primer R1: 5'-GATAGAATGGGAACGGCTAG-3' (SEQ ID NO: 49)
Tag3 primer: 5'-CTGATACCAGACGTTGCCCGCATAA-3' (SEQ ID NO: [...])
[...] primer: 5'-CTACAAATTGCCTTTTCTTATCGAC-3' (SEQ ID NO: 51)

Each gene primer was used in combination with a T-DNA primer in the PCR screen.
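As a rough illustration of how the primers listed above might be characterised, the following Python sketch computes their length, GC content and an approximate melting temperature. The Wallace rule (2°C per A/T, 4°C per G/C) is an assumption introduced here for illustration; it is a coarse estimate for short oligonucleotides and is not stated in the text to be the method used to design these primers.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "F2": "ATTGCACACTACTTGGATCGCATT",
    "R1": "GATAGAATGGGAACGGCTAG",
    "Tag3": "CTGATACCAGACGTTGCCCGCATAA",
}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, Tm ~{wallace_tm(seq)} C")
```

Under this rule the estimates for F2 and R1 are 68°C and 60°C respectively, which lie above the 55°C annealing step used after the touch-down cycles.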

Standard PCR mix for each microplate well:
DNA: 5 ul (10-30 ng/ul)
PCR buffer: 2.5 ul
MgCl2: 2.5 ul (25 mM)
dNTPs: 0.5 ul (10 mM)
Gene primer: 2.5 ul (10 uM)
T-DNA primer: 2.5 ul (10 uM)
Taq polymerase: 1.0 ul (1 U/ul)
H2O: 8.5 ul

The PCR conditions were:
2'-94 C
[...] cycles (touch down): 15"-94 C, 30"-65 C (-1 C/cycle), 2'-72 C
[...] cycles: 15"-94 C, 15"-55 C, 1'-72 C
2'-72 C
5'-4 C

3. Hybridization Analysis

Because the PCR reaction generates numerous artifacts, it is necessary to identify which of the PCR products contain the gene of interest linked to the insertion element. For this purpose, a hybridization analysis was carried out.
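Returning to the standard PCR mix listed above, the following Python sketch shows how the total reaction volume and the final concentration of each component could be derived from the stock concentrations and volumes given. The dictionary and function names are hypothetical; only the volumes and stock concentrations are taken from the text.

```python
# Volumes per well in microliters, as listed in the standard PCR mix.
MIX_UL = {
    "DNA": 5.0,
    "PCR buffer": 2.5,
    "MgCl2": 2.5,
    "dNTPs": 0.5,
    "gene primer": 2.5,
    "T-DNA primer": 2.5,
    "Taq polymerase": 1.0,
    "H2O": 8.5,
}

# Stock concentrations (value, unit) where the text states them.
STOCKS = {
    "MgCl2": (25.0, "mM"),
    "dNTPs": (10.0, "mM"),
    "gene primer": (10.0, "uM"),
    "T-DNA primer": (10.0, "uM"),
    "Taq polymerase": (1.0, "U/ul"),
}


def total_volume(mix):
    """Total reaction volume in microliters."""
    return sum(mix.values())


def final_concentration(component, mix=MIX_UL, stocks=STOCKS):
    """Final concentration after dilution into the full reaction (C1*V1 = C2*V2)."""
    stock, unit = stocks[component]
    return stock * mix[component] / total_volume(mix), unit


print(total_volume(MIX_UL))  # 25.0
for name in STOCKS:
    value, unit = final_concentration(name)
    print(f"{name}: {value:g} {unit}")
```

With these figures the reaction volume is 25 ul, giving final concentrations of 2.5 mM MgCl2, 0.2 mM dNTPs and 1 uM of each primer, with 1 U of Taq polymerase per well.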

[...] ul of each hyper-pool PCR reaction were electrophoresed on a 2% agarose gel (TAE).

After migration, the gel was equilibrated in 0.4 N NaOH for 30 min and transferred simultaneously to two charged nylon membranes (Pall, plus) overnight. After transfer, the membranes were rinsed with 2X SSC; one was then hybridized with the gene probe and the other with the T-DNA probes.

The gene probe was prepared from the digested plasmid containing the cDNA encoding the cyclin PHO80 identified in the two-hybrid screen. The T-DNA probes correspond to a mix of left border (1 kb fragment after KpnI digestion of plasmid pBS-LB) and right border (0.8 kb fragment after SstI-EcoRV digestion of plasmid pBS-RB). The digested, gel-purified fragments were labelled using the ALKPHOS (Amersham, RPN 3680) non-radioactive labelling kit. Hybridization and washing were done according to the manufacturer's instructions.

Development of the autoradiograms obtained after hybridization with the gene and the T-DNA probes revealed a clear signal that superimposed in both blots, indicating a potential T-DNA insertion mutant. The PCR fragment giving the positive hybridization signal was then sequenced, confirming that it contained the cyclin PLP5 gene linked to the T-DNA insertion element.

The mutant line was sequenced with the forward primer 2 of cyclin [...]. Sequence of the mutant line with forward F2 primer:

NTGTACTAAA AGGTGCANTC CCTCCTGCTT CGTCATGGAT ATCTACATTG ATCACTTTCT
CCATAAGACC CGAGCCCTTC TCAAACCCCT TAATGTCCAC CGCCTTATCA TTACAACTGT
CATGTTAGCT GCTAAAGTCT TCGATGATAG GTATGTTACT CACTAAACCT GGTATCAAAT
TCAACACGCA AATAAGTCTT CAATCATAGA TTCATTGATC TCTGGTGTTG NGCAGGTATT
TCAACAATGC ATACTACGCA AGAGTGGGAG GTGTGACTAC GAGAGAGTTA AACAGATTGG
AGATGGAGTT GTTGTTTACC CTTGACTTCA AGCTTCAGGT AGATCCTCAG ACGTTTCACA
CACACTGTTG TACTGAATCG GATTTTCAAG GGTCTGGCCA AAACTATTCC GNGGGCACCT
GGCACACGCC CTGGAGTCCG GCCCGTTTCC AGTTGAGGGT TGTCTACGCT TANATGAGAA
GGAAAGTTGT CCAANACGAA TCCCAGTGTC CTATTACCAA TAGCCGACGG TATCGATAAG
CTNGATGTAC ATGGTCNkTA NNAIAAAGGCN AT (SEQ ID NO: 52)

Sequence homology to the right border of the T-DNA is indicated in bold.

The gene sequence is homologous to the Arabidopsis EST N37922; however, no homologous genomic sequence is present in the database. At the protein level, it is homologous (score of 5e-30) to the PREG-like protein of Arabidopsis (AC003672), to the yeast cyclin PCL7, partner of the cdc PHO85 (score 1e-9), and to the yeast cyclin PHO80 (score 1e-6).

The Arabidopsis PREG-like protein (AC003672) homologous to the cyclin PHO80 is 202 amino acids long. If PLP5 belongs to the family of PREG-like proteins, the T-DNA insertion should be located at approximately amino acid position 157.

4. Partial genomic sequence

Given that no genomic sequence homologous to the cDNA of PLP5 is present in the database, two oligonucleotide primers designed from the PLP5 cDNA previously identified in the two-hybrid screen were used to amplify by PCR a partial genomic fragment containing the corresponding cDNA sequence.

Forward primer F1: 5'-cgatccagctttcattgaftcg-3' (SEQ ID NO: 53)
Reverse primer R1: 5'-GATAGAATGGGAACGGCTAG-3' (SEQ ID NO: 54)

The PCR fragment obtained was sequenced by the dye terminator method using the forward primer F1.

The sequence is:

ATTTCNTTNGNTGTATACCTCAGGTTAGGACTTATTATTGAGGGCAACGATTGAAAGCCACC

GACTGTTCTCTCACGCCTCTCTTCTTCTCTGGAGAGATCTCTGTTACTCAATCATGATGACAAGA

TTCTGCTTGGATCGCCAGACTCTGTTACCGTGTTTGACGGGAGATCTCCCCCTGAGATCAGTATT

GCACACTACTTGGATCGCATTTTCAAGTACTCTTGCTGCAGTCCCTCCTGCTTCGTCATTGCGCA

TATCTACATTGATCACTTTCTCCATAAGACCCGAGCCCTTCTCAAACCCCTTAATGTCCACCGCC

TTATCATTACAACTGTCATGTTAGCTGCTAAAGTCTTCGATGATAGGTATGTTACTCACTAAACC

TGGTATCAATTCAACACGCAAATAGTCTTCATCATAGATTCATTGATCTCTGGTGTTGTC

GGTATTTCAACAATGCATACTACGCAAGAGTGGGAGGTGTGACTACGAGAGAGTTACAGATTG

GAGATGGAGTTGTTGTTTACCCTTGACTTCAAGCTTCAGGTAGATCCTCAGACGTTTCACACACA

CTGTTGTCAAGTTAGAAAAGCAGAACAGCGACGGCTTCCAGATCGAGTGGCCCATAAAAGA-AGCA

TGCCGAGCCAACAAGAGACTTGGCAGAAGAGGACACCCGACTCACTCTGCTCTCAAC~CAAGC

ACGCTGATCGGCAAGGGNAAAANGA (SEQ ID NO: [...])

The alignment of the cDNA with the partial genomic sequence revealed the presence of one intron, indicated in bold. The underlined sequence represents the insertion site of the T-DNA in the mutant line.
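As an illustrative sketch only, the position of a primer within a template such as the partial genomic sequence above could be located as follows. The excerpt joins two consecutive lines of the sequence as printed, on the assumption that they are contiguous; the function names are hypothetical. The reverse primer is searched as its reverse complement, since it anneals to the opposite strand.

```python
# Two consecutive lines of the partial genomic sequence, assumed contiguous.
GENOMIC_EXCERPT = (
    "TTCTGCTTGGATCGCCAGACTCTGTTACCGTGTTTGACGGGAGATCTCCCCCTGAGATCAGTATT"
    "GCACACTACTTGGATCGCATTTTCAAGTACTCTTGCTGCAGTCCCTCCTGCTTCGTCATTGCGCA"
)
# Last line of the partial genomic sequence as printed (contains N calls).
GENOMIC_END = "ACGCTGATCGGCAAGGGNAAAANGA"

F2 = "ATTGCACACTACTTGGATCGCATT"        # forward primer F2
R1_SEQ57 = "CTATCTTACCCTTGCCGATCAGC"   # reverse primer R1 as listed under SEQ ID NO: 57

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (N is kept as N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_primer(template, primer):
    """0-based index of the first exact match of the primer, or -1 if absent."""
    return template.upper().find(primer.upper())


idx = find_primer(GENOMIC_EXCERPT, F2)
print("F2 found:", idx >= 0)

# The first 15 bases of the reverse complement of R1 occur in the last
# printed line; the remainder overlaps ambiguous (N) base calls.
rc = reverse_complement(R1_SEQ57)
print(rc, rc[:15] in GENOMIC_END)
```

An exact string match is used here for simplicity; a sequence read containing ambiguous base calls would in practice call for a search that tolerates mismatches.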

B. Identification of lines containing T-DNA insertion mutant in cyclin [...]

Experimental design:
- identify a PLP5 insertion mutant
- characterize the PLP5 mutant
- identify homozygous plants
- select growth conditions to detect phenotype differences between wild type control and mutants

1. Identification of the positive line

The 48 lines from the positive pool were grown in a growth chamber for two weeks. Plants were harvested and frozen in liquid nitrogen. Plants (1 g) were ground with a pestle and mortar and homogenized in 6 ml of buffer containing 78 mM Tris-HCl pH 8, 40 mM EDTA, 390 mM NaCl, 1% SDS and 15 mM sodium bisulfite at 65°C for 30 min. Then 2 ml of 5 M potassium acetate were added and the mixture was incubated on ice for 20 min. Supernatants were recovered after centrifugation (20 min, 4500 rpm, 4°C), and 4 ml of isopropanol (-20°C) were added and incubated for at least 30 min at 4°C. After centrifugation (7 min, 4500 rpm, 4°C), the pellet was dried and taken up in 420 microliters of 7.4 M ammonium acetate. Supernatants were recovered after centrifugation (20 min, 4500 rpm, 4°C), and 700 microliters of isopropanol were added and incubated for 30 min at 4°C. After centrifugation (7 min, 4500 rpm, 4°C), the pellet was dried and taken up in 400 microliters of Tris-EDTA (100 mM-10 mM) buffer pH 8.0. After centrifugation (15 min, 13700 rpm, 4°C), supernatants were mixed with 800 microliters of ethanol at -20°C for 10 min. The final pellet was recovered after centrifugation (5 min, 13700 rpm, 4°C) and washed with ethanol. The final pellet was taken up in 20 microliters of Tris-EDTA (100 mM-10 mM) buffer and used for PCR (1/100 dilution).

2. Growth of positive lines

The positive line identified from the INRA collection was grown in a growth chamber under four types of conditions:
1- 100 mg/L kanamycin on At medium (see 6. General methods below)
2- At medium minus sugar and vitamins
3- At medium in light conditions (20°C, 12 h photoperiod, normal intensity)
4- At medium in dark conditions

Plants were then examined for obvious phenotypes and for kanamycin segregation, which gives an indication of the number of T-DNA insertions, the linkage of insertions and the sex effect.

For other phenotypes of interest, plants are grown on specific media:
1- At medium minus sucrose plus different amounts of Pi (0 to 50 mM K2PO4)
2- At medium plus different amounts of Pi (0 to [...])
3- At medium minus sucrose plus different amounts of Pi (0 to 50 mM) at 28°C
4- At medium plus various amounts of hygromycin (0 to 200 mM)
5- At medium plus various amounts of auxin or cytokinins

Observations were made concerning germination, emergence of the radicle, emergence of the cotyledons, emergence of the first pair of leaves, colour and shape. Flowers were observed in the greenhouse on the homozygous lines.

3. Detection of homozygous plants

Homozygous plants were detected by PCR, first using the following combinations of primers (F2-Tag5 and F2-R1).

F2: 5' ATTGCACACTACTTGGATCGCATT 3' (SEQ ID NO: 56)
R1: 5' CTATCTTACCCTTGCCGATCAGC 3' (SEQ ID NO: 57)
[...]: 5' CTACAAATTGCCTTTTCTTATCGAC 3' (SEQ ID NO: 58)

PCR conditions for one reaction were:
[...] microliter DNA or control (water, wild type, pool)
[...] microliter buffer (Tris-HCl 100 mM pH 9.5, KCl 500 mM, 1% Triton X100)
[...] microliter MgCl2
[...] microliter dNTPs
1 microliter of Taq polymerase (1 Unit/microliter)
[...] microliter water
[...] microliter of each primer

The PCR program was as follows:
2 min 94°C
[...] cycles touch-down: 15 sec 94°C, 30 sec 65°C, 2 min 72°C
[...] cycles: 15 sec 94°C, 15 sec 55°C, 1 min 72°C
2 min 72°C
[...] min 4°C

Young leaves (1 cm square) were ground in homogenization medium (200 mM Tris-HCl, 250 mM NaCl, 25 mM EDTA, 0.5% sodium dodecyl sulfate). After 30 min on ice, the supernatant was recovered by centrifugation (5 min, 13000 rpm, room temperature) and 1 volume of isopropanol was added. After inversion of the tube and about 10 min, pellets were carefully recovered by centrifugation (5 min, 13000 rpm, room temperature), dried and taken up in 20 microliters of Tris-EDTA (100 mM-10 mM) buffer. PCR was done using a 1:100 dilution of the DNA of each individual plant.

These plants were then transferred to the greenhouse for multiplication and crosses.

The seeds were then harvested and placed in a growth chamber on agarose plates containing 100 mg/l kanamycin to analyse the segregation of the insertion in the PLP5 gene.

4. Growth of plants

In addition, plants were grown directly in the greenhouse on soil, watered, under a 12 h photoperiod and normal light intensity. Such plants were also used for crosses with wild-type plants in order to clean the genotype of unwanted short T-DNA insertions in other genes that are not detected through the kanamycin resistance gene.

Determination of GUS activity

GUS activity is expressed when a T-DNA is inserted into a gene in the proper orientation; this makes it possible to detect where the mutated gene is expressed. Preliminary data (Nusseaume, CEA Cadarache) indicated that GUS activity could be detected in the positive line containing the PLP5 mutant.

GUS activity was detected as in Jefferson et al., 1987 (EMBO J, 6: 3901-3907) with slight modifications. Tissues (whole plantlets, two weeks old) were fixed in [...] for 1 h at -20°C. Tissues were then incubated with 1 mg/ml 5-bromo-4-chloro-3-indolyl-beta-D-glucuronide in 0.1 M potassium phosphate buffer pH 7 containing 0.1% Triton X100, [...] EDTA, 2 mM potassium ferrocyanide and 2 mM potassium ferricyanide. Tissues were vacuum infiltrated for 10 min and incubated for at least 1 hour at 37°C. 70% ethanol was used to destain the tissues prior to microscopic analysis.

6. General methods/protocols/materials

Protocol for Arabidopsis culture:

Seed sterilization

Dissolve 1 bleach pellet (Bayrochlore, Bayrol) in 40 ml H2O, add a few drops of Teepol or Tween (prepare fresh).

Dilute the previous solution 1/10 in 95% Ethanol to make the sterilization solution (SS).

Add 10 ml of SS to each tube of seeds, and incubate 7 min at room temperature with constant, gentle agitation.

Rinse twice with 10 ml 95% ethanol.

Let the seeds sediment, and carefully remove as much ethanol as possible. (It is possible to invert the tubes, but be careful.) Leave the tubes open under a sterile hood overnight.

In vitro culture medium: We use non-diluted medium with 10 g/l sucrose (modified from Estelle and Somerville, 1987, MGG 206: 200).

At medium

Component            Stock    Amount/L   Final concentration
KNO3                 1 M      5 mL
KH2PO4               1 M      2.5 mL
MgSO4                1 M      2 mL       2 mM
Ca(NO3)2             1 M      2 mL       2 mM
Microelements        1000x    1 mL       1x
Vitamins             500x     2 mL       1x
Bromcresol Purple    0.16%    5 mL       0.0008%
MES pH 6             14%      5 mL       0.035%
Agar                          7 g        0.7%

Autoclave (120°C, 20 min), then add:

Ferric Ammonium Citrate   1% (autoclaved separately)   5 mL   0.005%

Sprinkle the surface-sterilized seeds on a 14 cm agar plate containing Arabidopsis culture medium (AtM/2), covered with a round filter paper (Whatman 3MM). Seal the plates with gas-permeable surgical tape.
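For the At medium recipe above, the final concentration of each component follows from the stock concentration and the volume added per litre. The Python sketch below illustrates the calculation for a few of the listed components; the function name is hypothetical, and the values for KNO3 and KH2PO4 are computed from the stock and volume rather than taken from the final-concentration column.

```python
def final_concentration(stock, volume_ml, total_ml=1000.0):
    """Concentration after diluting volume_ml of stock into total_ml.

    The result is expressed in the same unit as the stock (mM, %, or x-fold).
    """
    return stock * volume_ml / total_ml


# (stock concentration, unit, mL added per litre), taken from the recipe.
RECIPE = {
    "KNO3": (1000.0, "mM", 5.0),
    "KH2PO4": (1000.0, "mM", 2.5),
    "MgSO4": (1000.0, "mM", 2.0),
    "Ca(NO3)2": (1000.0, "mM", 2.0),
    "Microelements": (1000.0, "x", 1.0),
    "Vitamins": (500.0, "x", 2.0),
    "Bromcresol Purple": (0.16, "%", 5.0),
    "Ferric Ammonium Citrate": (1.0, "%", 5.0),
}

for name, (stock, unit, volume_ml) in RECIPE.items():
    print(f"{name}: {final_concentration(stock, volume_ml):g} {unit}")
```

The computed values reproduce the final concentrations listed for MgSO4, Ca(NO3)2, the microelements, the vitamins, Bromcresol Purple and ferric ammonium citrate, and give 5 mM KNO3 and 2.5 mM KH2PO4.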

Synchronize germination by a cold treatment at 4°C for 48 hours.

Place in the growth chamber under the following conditions: photoperiod 16 h day (100-150 µE/m2), 8 h night; temperature 20°C day, 15°C night; humidity [...]. After 10-15 days of culture, plantlets (2-leaf rosettes) are ready for DNA isolation. Each plate should yield 3-6 g fresh weight.

Gently scrape the plantlets from the filter paper using a razor blade. The plantlets are weighed and frozen in liquid nitrogen for future use.

Microelements 1000x

Compound         Amount/L (1000x)   Final concentration (1x)
H3BO3            4328 mg            70 µM
MnCl2, 4H2O      2770 mg            14 µM
CuSO4, 5H2O      125 mg             0.5 µM
Na2MoO4, 2H2O    50 mg              0.2 µM
NaCl             584 mg             10 µM
ZnSO4, 7H2O      288 mg             1 µM
CoCl2, 6H2O      2.5 mg             0.01 µM

Autoclave (120°C, 20 min)

Vitamins 500x

Compound          Amount/L (500x)
Myo-inositol      50 g/L
Ca pantothenate   0.5 g/L
Niacin            0.5 g/L
Pyridoxine        0.5 g/L
Thiamine HCl      0.5 g/L
Biotin            5 mg/L

Keep at -20°C.
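The relationship between the amounts weighed for the 1000x microelement stock and the final 1x concentrations can be illustrated with the Python sketch below. The molecular weights are standard values supplied here as an assumption, since the text does not state them; the function name is hypothetical.

```python
# (mg per litre of 1000x stock, molecular weight in g/mol).
# Amounts are from the text; molecular weights are standard values (assumption).
MICROELEMENTS = {
    "H3BO3": (4328.0, 61.83),
    "MnCl2.4H2O": (2770.0, 197.91),
    "CuSO4.5H2O": (125.0, 249.69),
    "Na2MoO4.2H2O": (50.0, 241.95),
    "NaCl": (584.0, 58.44),
    "ZnSO4.7H2O": (288.0, 287.56),
    "CoCl2.6H2O": (2.5, 237.93),
}


def final_micromolar(mg_per_litre_stock, mol_weight, fold=1000.0):
    """Final concentration in µM after diluting the stock 'fold' times.

    mg/L divided by g/mol gives mmol/L in the stock; dividing by the
    dilution factor and converting to µmol/L gives the working value.
    """
    stock_millimolar = mg_per_litre_stock / mol_weight
    return stock_millimolar * 1000.0 / fold


for name, (mg, mw) in MICROELEMENTS.items():
    print(f"{name}: {final_micromolar(mg, mw):.2f} µM")
```

Within rounding, the computed values match the final concentrations listed for the microelements, which supports reading the unit in that column as µM.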

Using the above methods, a mutant containing an insertion in the PLP5 gene was identified and called Mutant 11 (mut11).

Segregation analysis for kanamycin resistance indicated a 1/4 : 3/4 population of sensitive/resistant plants. This indicates that a single T-DNA insertion is the most probable.

This result was confirmed by Southern analysis of DNA digests with T-DNA probes.
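A segregation ratio of this kind is commonly assessed with a chi-square goodness-of-fit test. The Python sketch below is illustrative only: the seedling counts are hypothetical, since the text gives no counts, and the choice of test is an assumption rather than the method reported. A 3:1 ratio of resistant to sensitive plants is expected for a single insertion, and 15:1 for two unlinked insertions.

```python
# Critical chi-square value for 1 degree of freedom at the 5% level.
CRITICAL_5PCT_DF1 = 3.841


def chi_square(resistant, sensitive, ratio=(3, 1)):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = resistant + sensitive
    expected_r = total * ratio[0] / sum(ratio)
    expected_s = total * ratio[1] / sum(ratio)
    return ((resistant - expected_r) ** 2 / expected_r
            + (sensitive - expected_s) ** 2 / expected_s)


def is_consistent(resistant, sensitive, ratio=(3, 1)):
    """True if the counts do not differ significantly from the ratio (5% level)."""
    return chi_square(resistant, sensitive, ratio) < CRITICAL_5PCT_DF1


# Hypothetical counts, for illustration only.
resistant, sensitive = 152, 48
print(round(chi_square(resistant, sensitive, (3, 1)), 3))
print(is_consistent(resistant, sensitive, (3, 1)))
print(is_consistent(resistant, sensitive, (15, 1)))
```

With these hypothetical counts the data fit a 3:1 ratio and are rejected for a 15:1 ratio, which is the pattern expected for a single T-DNA insertion.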

Effects of hygromycin: Aminoglycosides are antibiotics that affect rRNA interactions and lead to mistranslation. In yeast, pho80-pho85 and pho4 were shown to be required for increased sensitivity to aminoglycoside antibiotics (Wickert et al. (1998) J. Bacteriology 180 (7):1887-1894). Mut11 was therefore tested to determine whether it is hypersensitive to hygromycin. Plants were grown on At medium plus various amounts of hygromycin. Observations were made of germination, cotyledon emergence, and general root aspect.

Conclusions:
- At low concentrations, final germination capacity is similar for WS and 11K11, but mean germination time is longer (by about 18 h) for homozygous mut11 at 10 mM. A similar phenomenon is observed at 25 mM hygromycin, but final germination is similar. As a consequence, cotyledon emergence is delayed.
- At higher concentrations, mean germination time is longer for mut11 and final germination capacity is reduced. Cotyledon emergence is delayed for 11K11, and radicle growth is severely affected for both wild type and 11K11.
- In conclusion, mut11 is more sensitive to hygromycin, suggesting a role of [...] and/or other components of the signalling cascade in sensitivity to hygromycin.

Transgenics overexpressing mut11 could be more resistant to hygromycin.

Overexpression of PLP5 would mean that it could be used as a positive selectable marker during transformation procedures, while antisense/cosuppression could be used as a negative selectable marker.

Example 8: Extension of cell cycle interacting protein encoding polynucleotides to full length or to recover regulatory elements

The cell cycle interacting protein encoding nucleic acid sequences (SEQ ID NOS: 1, 3, 33, 35, 37, 39, 41, 5, 7, 9, 11 and 13) are used to design oligonucleotide primers for extending a partial nucleotide sequence to full length or for obtaining 5' sequences from genomic libraries. One primer is synthesized to initiate extension in the antisense direction (XLR) and the other is synthesized to extend sequence in the sense direction (XLF). The primers allow the extension of the known cell cycle interacting protein encoding sequence "outward", generating amplicons containing new, unknown nucleotide sequence for the region of interest. The initial primers are designed from the cDNA using OLIGO® 4.06 Primer Analysis Software (National Biosciences), or another appropriate program, to be preferably 22-30 nucleotides in length, to have a GC content of preferably 50% or more, and to anneal to the target sequence at temperatures of preferably about 68°-72°C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations is avoided. The original, selected cDNA libraries, prepared from mRNA isolated from actively dividing cells, or a plant genomic library are used to extend the sequence; the latter is most useful for obtaining 5' upstream regions. If more extension is necessary or desired, additional sets of primers are designed to further extend the known region. By following the instructions for the XL-PCR kit (Perkin Elmer) and thoroughly mixing the enzyme and reaction mix, high fidelity amplification is obtained.
Beginning with 40 pmol of each primer and the recommended concentrations of all other components of the kit, PCR is performed using the Peltier Thermal Cycler (PTC200; MJ Research, Watertown MA) and the following parameters:

Step 1 94°C for 1 min (initial denaturation)
Step 2 65°C for 1 min
Step 3 68°C for 6 min
Step 4 94°C for 15 sec
Step 5 65°C for 1 min
Step 6 68°C for 7 min
Step 7 Repeat steps 4-6 for 15 additional cycles
Step 8 94°C for 15 sec
Step 9 65°C for 1 min
Step 10 68°C for 7:15 min
Step 11 Repeat steps 8-10 for 12 cycles
Step 12 72°C for 8 min
Step 13 4°C (and holding)

A 5-10 µl aliquot of the reaction mixture is analyzed by electrophoresis on a low concentration (about [...]) agarose mini-gel to determine which reactions were successful in extending the sequence. Bands thought to contain the largest products are selected and cut out of the gel. Further purification involves using a commercial gel extraction method such as QIAquick™ (QIAGEN Inc). After recovery of the DNA, Klenow enzyme is used to trim single-stranded nucleotide overhangs, creating blunt ends which facilitate religation and cloning. After ethanol precipitation, the products are redissolved in 13 µl of ligation buffer, 1 µl T4-DNA ligase (15 units) and 1 µl T4 polynucleotide kinase are added, and the mixture is incubated at room temperature for 2-3 hours or overnight at 16°C. Competent E. coli cells (in 40 µl of appropriate media) are transformed with 3 µl of ligation mixture and cultured in 80 µl of SOC medium (Sambrook, supra). After incubation for one hour at 37°C, the whole transformation mixture is plated on Luria Bertani (LB)-agar (Sambrook, supra) containing 2xCarb. The following day, several colonies are randomly picked from each plate and cultured in 150 µl of liquid LB/2xCarb medium placed in an individual well of an appropriate, commercially available, sterile 96-well microtiter plate.
The following day, 5 µl of each overnight culture is transferred into a non-sterile 96-well plate and, after dilution 1:10 with water, 5 µl of each sample is transferred into a PCR array. For PCR amplification, 18 µl of concentrated PCR reaction mix (3.3x) containing 4 units of rTth DNA polymerase, a vector primer and both of the gene specific primers used for the extension reaction are added to each well. Amplification is performed using the following conditions:

Step 1 94°C for 60 sec
Step 2 94°C for 20 sec
Step 3 55°C for 30 sec
Step 4 72°C for 90 sec
Step 5 Repeat steps 2-4 for an additional 29 cycles
Step 6 72°C for 180 sec
Step 7 4°C (and holding)

Aliquots of the PCR reactions are run on agarose gels together with molecular weight markers. The sizes of the PCR products are compared to the original partial cDNAs, and appropriate clones are selected, ligated into plasmid and sequenced.

Example 9: VbDBP (SEQ ID NO: 11)

A BLAST database search showed that the VbDBP clone is very similar to the putative DNA binding protein (Arabidopsis thaliana) and also shares extensive homology with PCF2 (Oryza sativa). VbDBP interacts with CDC2b but not with CDC2a. The publicly available databases were screened with the cDNA VBDPBP (N-term). With BLASTX, gene21 from AC003680 (score 1.0e-27) was found as the best homologue. This is a genomic sequence from A. thaliana (entered in the databank 20-MAR-1998), chromosome II. The prediction made there gives one big exon, but the new predictions made in accordance with the present invention gave two exons (the big one, followed by a small one). The cDNA VBDPBP shows only moderate homology with the big exon (gene21 might only be from the same family as VBDPBP), so completion of the cDNA will confirm one or the other annotation and might give a new sequence. Other homologues are (D87261) PCF2 [Oryza sativa] (score 9.2e-27) and (D87260) PCF1 [Oryza sativa] (score 8.5e-24), both with publication: Kosugi, S. and Ohashi, Y. PCF1 and PCF2 specifically bind to cis elements in the rice proliferating cell nuclear antigen gene. Plant Cell 9, 1607-1619 (1997). With BLASTN/nr, another genomic sequence from chromosome V, AB010072 (2e-12) (08-JAN-1998), sequenced by the KAOS-people (P1 clone: MEE6), was found. The region with homology, located at (18754..18848), has no annotations at all. The publicly available databases were also screened with the cDNA VBDPBP (C-term (SEQ ID NO: [...])), but nothing was found with BLASTX.

PCF1 and PCF2 are proteins isolated from rice that specifically bind to sites IIa and IIb in the promoter region of the rice PCNA gene (Kosugi et al., 1997). The rice proliferating cell nuclear antigen (PCNA) protein is an auxiliary protein of DNA polymerase δ that participates in a variety of processes, such as DNA replication, DNA repair synthesis, and cell cycle control through interactions with the CDK-cyclin-CKI complex. The PCNA gene is induced at the G1-to-S phase boundary and is well conserved in eukaryotes.

The expression of the rice PCNA gene is restricted exclusively to meristematic regions and is controlled at the transcriptional level. PCNA protein is also present in proliferating cells but absent from nondividing cells and terminally differentiated plant tissues.

Loss-of-function analysis of the rice PCNA promoter using transgenic plants has demonstrated that two elements (sites IIa and IIb) in the proximal region are essential for the proliferating cell-specific transcriptional activity. In addition, two repeated site IIa sequences located upstream of the cauliflower mosaic virus 35S minimal promoter confer transcriptional activation in tobacco protoplasts. This suggests that sites IIa and IIb most probably function as positive cis-acting elements in proliferating cells.

The proteins PCF1 and PCF2 specifically bind to sites IIa and IIb in the promoter region of the rice PCNA gene and may act as transcription factors to control DNA synthesis-related genes in plants. In particular, PCF2, with a high level of DNA binding activity in meristematic tissues, may act as a transcriptional activator for these genes. These proteins have a deduced basic helix-loop-helix (bHLH) motif that is responsible for DNA binding and dimerization. PCF1 and PCF2 are novel types of bHLH proteins that are distinct from other known bHLH transcription factors.

Kosugi, S. and Ohashi, Y. (1997) PCF1 and PCF2 specifically bind to cis elements in the rice proliferating cell nuclear antigen gene. The Plant Cell, 9, 1607-1619.

Example 10: Vb33 (SEQ ID NO: [...])

The Vb33 clone encodes a protein interacting with CDC2b but not with CDC2a. The publicly available databases were screened with the cDNA VB33. With BLASTX, the best homologue found was a predicted gene on the Z49937 sequence having similarity with an ankyrin motif (score 0.62). This sequence comes from a C. elegans cosmid, and the gene F14F3.2 was predicted based on a C. elegans EST (yk192g4.5).

Example 11: LDV115 (SEQ ID NO: 1)

The LDV115 gene encodes a protein interacting with CDC2a but not with CDC2b and showing limited similarity to the Saccharomyces cerevisiae WEB1 protein. The publicly available databases were screened with the cDNA LDV115. With BLASTX, the best homologue found was the WEB1 protein from S. pombe (AB004537) (score 6.7e-17).

This hit, as well as the others, was mainly due to the proline-richness of the LDV115 translation. The homology is low but spread over about 50% of the S. pombe protein, which might indicate that LDV115 is at least a member of the family. The WEB1 gene was isolated as a yeast homologue of the adenoviral E1A gene (Zieler et al., 1995, MCB p3227-3237). The protein products of the E1A gene are implicated in a variety of transcriptional and cell cycle events, involving interactions with several proteins present in human cells, including parts of the transcriptional machinery and negative regulators of cell division such as the Rb gene product and p107. WEB1 is identical to SEC31, a protein involved in budding of transport vesicles from the endoplasmic reticulum (Pryer et al. (1993), J. Cell. Biol. 120, p865-875). The protein similarity between WEB1 and LDV115 is almost completely due to the presence of a proline-rich region found in both proteins. Proline-rich regions are not restricted to the WEB1 protein, but can also be found in many structural proteins such as hydroxyproline-rich glycoproteins and extensins. Therefore, LDV115 might not be a true homologue of WEB1.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

The reference to any prior art in this specification is not, and should not be taken as, an acknowledgment or any form of suggestion that that prior art forms part of the common general knowledge in Australia.

EDITORIAL NOTE APPLICATION NUMBER 19823/00 The following Sequence Listing pages 1 to 34 are part of the description. The claims pages follow on pages "103" to "109".

1 SEQUENCE LISTING <110> CropDesign N.V.

<120> Novel cell cycle genes and uses thereof <130> C2681PCT <140> <141> <150> EP 98 12 4062.5 <151> 1998-12-17 <160> 58 <170> PatentIn ver. 2.1 <210> 1 <211> 1989 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> (2)..(1672) <400> 1 a acg caa gaa atg caa gaa gaa gag gaa gaa agt tct gac cca gtt ttt 49 Thr Gin Glu Met Gin Glu G Glu Gl Glu Glu Ser Ser Asp Pro Val Phe 1 5 10 gat aat gcc ate cag cga gcg ttg att gtt gga gat tac aag gag gcg 97 Asp Asn Ala Ile Gin Arg Ala Leu Ile Val Gly Asp Tyr Lys Glu Ala 25 gt gat cag tgt ata act gca aat aag atg gcc gat gct tta gtt att 145 val Asp Gln Cys Ile Thr Ala Asn Lys Met Ala Asp Ala Leu Val Ile 40 gct cat gtt ggt ggt aca gcg ttg tgg gag agt act cgt gag aaa tat 193 Ala His Val Gly Gly Thr Ala Leu Trp Glu Ser Thr Arg Glu Lys Tyr 55 ttg aag acg aac agt gcg cca tac atg aag gtt gtt tct gcg atg gtg 241 Leu Lys Thr Asn Ser Ala Pro Tyr Met Lys val val Ser Ala Met Val 70 75 aac aat gat ctc agg agc ctt ate tat aca agg tea cat aag ttc tgg 289 Asn Asn Asp Leu Arg Ser Leu Ile Tyr Thr Arg Ser His Lys Phe Trp 90 aaa gag act ctt get ctc ctc tgt act ttt gca caa gga gaa caa tgg 337 Lys Glu Thr Leu Ala Leu Leu Cys Thr Phe Ala Gin Gly Glu Gin Trp 100 105 110 aca acc ctg tgt gat gcc ctt gcc tcg aag ttg atg get get ggt aac 385 Thr Thr Leu Cys Asp Ala Leu Ala Ser Lys Leu Met Ala Ala Gly Asn 115 120 125 act ttg gct gca gtt ctc tgc tac att tgc gca ggc aat gtt gac aga 433 Thr Leu Ala Ala Val Leu Cys Tyr Ile Cys Ala Gly Asn val Asp Arg 130 135 140 aca gta gaa att tgg tcg agg age ctt gca aat gag cgg gat gga aga 481 Thr val Glu Ile Trp Ser Arg Ser Leu Ala Asn Glu Arg Asp Gly Arg 145 150 155 160 tct tat get gag ctt ctt caa gat ctt atg gag aag act ctt gtc ctt 529 Ser Tyr Ala Giu gct Al a ttt Phe gca Al a ctt Leu 225 aac Asfl aat Asn gat

ASP

gtt Val gct Al a 305 gct Al a ctg Leu tt C Phe caa Gin aag

LYS

385 ccc Pro cct Pro ttg Leu gag GiU atg Met 210 tca se r act Th r cag Gin aat Asfl tca Se r 290 cag Gin cag Gin aag Lys acg Th r tat Ty r 370 atg met atg met cca Pro gca Al a agt se r 195 aag

LYS

ata le aca Th r gag Gi U cag Gin 275 cat His cca Pro cca Pro aat Asn 355 gca Al a ccc Pro gca Al a aca Th r act Th r 180 tat Ty r tac Ty r tta Le u g ct Al a cca Pro 260 tat Ty r cca Pro g ct Al a tcc se r gca Al a 340 cca Pro cct Pro caa Gin act Th r cag Gin 420 Ley 165 gct Al a ttg Leu cg t Arg tca se r 245 act Th r cag Gin ccc Pro c cg Pro atg met 325 gat

ASP

tct Se r t ct Se r gtt Val cca Pro 405 cag Gin Leu Gin ASP Leu aac Asn gag GlU aaa Lys gat

ASP

230 cag Gin caa Gin atg Met cag Gin 310 ag a Arg caa Gin aac Asn gtt Val 390 gca Al a aaa Lys ata le gtt Val 215 cgt Arg aac Asn g cg Al a ccg Pro cag Gin 295 cca Pro act Th r tat Ty r aat Asn cct Pro 375 g ct Al a gtt Val gct Al a aag

LYS

ctg Leu 200 ct g Leu att Ile act Th r caa Gin tac Ty r 280 caa Gin tct Se r aca Th r cag Gin gca Al a 360 tca Se r cca Pro gct Al a gca Al a ttc Ph e 185 gcc Al a gat

ASP

tct se r cag Gin cca Pro 265 act Th r cca Pro ttt Phe ttt Phe cag Gin 345 tac Ty r caa Gin gca Al a cca Pro cag Gin 425 met 170 ag c Se r ag c Se r tct se r cta Leu cct Pro 250 aac Asn gat

ASP

acc Th r act Th r gtt Val 330 cca Pro cct Pro ctt Leu gct Al a aga Arg 410 gca Al a Giu LYS Thr Leu gca Al a caa Gin tct Se r 235 caa Gin gtt Val tct se r atg met cca Pro 315 c ct Pro acc Th r gtt Val 395 t ct se r gcc Al a tct se r Al a gc Ge r ctt Leu tat Ty r ttt Ph e 300 gct Al a tca se r atg Met ccc Pro caa Gin 380 ccc Pro gt g Va1 cct Pro ctg Leu ctt Leu 205 ttg Leu gaa Gi U acc Th r g ct Al a tat Ty r 285 atg met cct Pro act Th r agt se r ccg Pro 365 tat Ty r ata Ile caa Gin gcg Al a tgt cys 190 ctt Leu tca se r cct Pro atg Met aac Asn 270 gtc Val cca Pro aca Th r ccc Pro t ct Se r 350 cct cct Pro 430 Val 175 aaa Lys aca Th r cct Pro gag Gi U cca Pro 255 cca Pro cct Pro cac His agc Se r cct Pro 335 cat His cct Pro aac Asn ttt Phe gca Al a 415 gca Al a Leu ct c Leu acg Th r gaa GlU act Th r 240 tat Ty r tat Ty r caa Gin caa Gin aat Asn 320 gca Al a tca Se r cct Pro acg Th r 400 agt Se r act Th r 577 625 673 721 769 817 865 913 961 1009 1057 1105 1153 1201 1249 1297 1345 ccg cca cca act gtt cag act gca gat Pro Pro Pro Thr Val Gin Thr Ala ASP act tcc aac Thr Ser Asn gtt cca gcc cac Val Pro Ala His 440 ttg Leu 445 aat Asfl cag aaa Gi n LYS 450 gaa gca GlU Ala gt g ata gca Va Ile Ala acg Th r 455 cgt Arg aca agg ctt Thr Arg Leu ttc Phe 460 aag

LYS

gag aca tcg GlU Thr Ser ctg g ga g gc Leu G 1y GBY 465 gaa GiU gca Al a 470 aaa Lys gcg aat act Ala Asn Thr aag cgt gag Lys Arg Glu gac aac tcg ASP Asn Ser aga Arg 485 aag

LYS

tta g gt gct Leu G ly Ala ctg Leu 490 aaa

LYS

gt g aaa ctc Val1 LY S Leu aac agc Asn Ser 495 gaga c atc GyASP Ile gct ctg gac Ala Leu ASP 515 ctg act acc LeU Thr Thr tcc se r 500 aac Asfl aat gct gcg Asn Ala Ala ctc gca cag LeU Ala Gin aat gac ttc Asn ASP Phe agc se r 520 gaa Glu gcc ctt caa Ala Leu Gin ata le 525 ctg Leu cta tgc caa Leu CyS Gin 510 cag gta ctt Gin Val Leu gca aca cta Ala Thr Leu 1393 1441 1489 1537 1585 1633 1682 1742 1802 1862 1922 agc gaa tgg Ser Glu Trp 530 aag cgg Lys Arg 545 gac

ASP

535 gcc Al a tgc aac ttc Cys Asn Phe tgg Trp 540 cgg A rg atg atg gtc met met Val agg caa aat Arg Gin Asn tga ttatttattt tctggttcat tttgctttct ttttttttgt atagagatca gtttgttcgt aaaaaaa ggattttttt ttttataaat taattttggt ttttttaaaa catttatgaa cctcatctgc tcagattgct gaaacatttt tttaaggagg cgccgttgct tacttccagt catgtttttg gacgtgtgta gctctaattt tacttttttg gatgttactt tcaaactcct tttttttttt tttagatagt taccggacaa ctttctttct aaaaaaaaaa aaaaaaaaaa aaaaaaaaca aaaaaaaaaa 1982 1989
<210> 2
<211> 556
<212> PRT
<213> Arabidopsis thaliana
<400> 2
Thr Gln Glu Met Gln Glu Glu Glu Glu Glu Ser Ser Asp Pro Val Phe
1 5 10 15
Asp Asn Ala Ile Gln Arg Ala Leu Ile Val Gly Asp Tyr Lys Glu Ala
20 25 30
Val Asp Gln Cys Ile Thr Ala Asn Lys Met Ala Asp Ala Leu Val Ile
35 40 45
Ala His Val Gly Gly Thr Ala Leu Trp Glu Ser Thr Arg Glu Lys Tyr
50 55 60
Leu Lys Thr Asn Ser Ala Pro Tyr Met Lys Val Val Ser Ala Met Val
65 70 75 80
Asn Asn Asp Leu Arg Ser Leu Ile Tyr Thr Arg Ser His Lys Phe Trp
85 90 95
Lys Glu Thr Leu Ala Leu Leu Cys Thr Phe Ala Gln Gly Glu Gln Trp
100 105 110
Thr Thr Leu Cys Asp Ala Leu Ala Ser Lys Leu Met Ala Ala Gly Asn
115 120 125
Thr Leu Ala Ala Val Leu Cys Tyr Ile Cys Ala Gly Asn Val Asp Arg
130 135 140
Thr Val Glu Ile Trp Ser Arg Ser Leu Ala Asn Glu Arg Asp Gly Arg
145 150 155 160
Ser Tyr Ala Glu Leu Leu Gln Asp Leu Met Glu Lys Thr Leu Val Leu
165 170 175
Ala Leu Ala Thr Gly Asn Lys Lys Phe Ser Ala Ser Leu Cys Lys Leu
180 185 190
Phe Glu Ser Tyr Ala Glu Ile Leu Ala Ser Gln Gly Leu Leu Thr Thr
195 200 205
Ala Met Lys Tyr Leu Lys Val Leu Asp Ser Gly Gly Leu Ser Pro Glu
210 215 220
Leu Ser Ile Leu Arg Asp Arg Ile Ser Leu Ser Ala Glu Pro Glu Thr
225 230 235 240
Asn Thr Thr Ala Ser Gly Asn Thr Gln Pro Gln Ser Thr Met Pro Tyr
245 250 255
Asn Gln Glu Pro Thr Gln Ala Gln Pro Asn Val Leu Ala Asn Pro Tyr
260 265 270
Asp Asn Gln Tyr Gln Gln Pro Tyr Thr Asp Ser Tyr Tyr Val Pro Gln
275 280 285
Val Ser His Pro Pro Met Gln Gln Pro Thr Met Phe Met Pro His Gln
290 295 300
Ala Gln Pro Ala Pro Gln Pro Ser Phe Thr Pro Ala Pro Thr Ser Asn
305 310 315 320
Ala Gln Pro Ser Met Arg Thr Thr Phe Val Pro Ser Thr Pro Pro Ala
325 330 335
Leu Lys Asn Ala Asp Gln Tyr Gln Gln Pro Thr Met Ser Ser His Ser
340 345 350
Phe Thr Gly Pro Ser Asn Asn Ala Tyr Pro Val Pro Pro Gly Pro Gly
355 360 365
Gln Tyr Ala Pro Ser Gly Pro Ser Gln Leu Gly Gln Tyr Pro Asn Pro
370 375 380
Lys Met Pro Gln Val Val Ala Pro Ala Ala Gly Pro Ile Gly Phe Thr
385 390 395 400
Pro Met Ala Thr Pro Gly Val Ala Pro Arg Ser Val Gln Pro Ala Ser
405 410 415
Pro Pro Thr Gln Gln Ala Ala Ala Gln Ala Ala Pro Ala Pro Ala Thr
420 425 430
Pro Pro Pro Thr Val Gln Thr Ala Asp Thr Ser Asn Val Pro Ala His
435 440 445
Gln Lys Pro Val Ile Ala Thr Leu Thr Arg Leu Phe Asn Glu Thr Ser
450 455 460
Glu Ala Leu Gly Gly Ala Arg Ala Asn Thr Thr Lys Lys Arg Glu Ile
465 470 475 480
Glu Asp Asn Ser Arg Lys Leu Gly Ala Leu Phe Val Lys Leu Asn Ser
485 490 495
Gly Asp Ile Ser Lys Asn Ala Ala Asp Lys Leu Ala Gln Leu Cys Gln
500 505 510
Ala Leu Asp Asn Asn Asp Phe Ser Thr Ala Leu Gln Ile Gln Val Leu
515 520 525
Leu Thr Thr Ser Glu Trp Asp Glu Cys Asn Phe Trp Leu Ala Thr Leu
530 535 540
Lys Arg Met Met Val Lys Ala Arg Gln Asn Val Arg
545 550 555
<210> 3 <211> 814 <212> DNA <213> Arabidopsis
thaliana <220> <221> CDS <222> <400> 3 gg gac tct ctc gca acc gat cca get ttc att gat tcg gat gta tac 47 Asp Ser Leu Ala Thr Asp Pro Ala Phe Ile Asp Ser Asp val Tyr 1 5 10 ctc agg tta gga ctt att att gag ggc aaa cga ttg aaa aag cca ccg Leu Arg Leu Gly Leu Ile Ile Glu Gly Lys Arg Leu Lys Lys Pro Pro act Th r aat Asfl ttt Phe cg c Arg cat Hi S ccc Pro aaa Lys ttt Phe 160 cac His tgg Trp agg Arg gtt Val cat His gac

ASP

att le atc Ile ctt Leu gtc Val 145 acc Th r tgt cys ccc Pro aca Th r ct c Leu g at

ASP

ttc Ph e tac Ty r aat Asnf ttc Ph e 130 gt ctt Leu tgt cy 5 ata Ile ccc Pro 210 tca se r gac

ASP

aga Arg aag Lys att Ile gtc Val 115 gat

ASP

act Th r gac

ASP

cag Gin aaa Lys 195 gac

ASP

cgc A rg aag Lys tct Se r tac Ty r gat

ASP

100 cac His g at

ASP

acg Th r ttc Phe tta Leu 180 gaa Gi U tca se r ctc tct Leu Ser att ctg Ile Leu ccc cct Pro Pro 70 tct tgc Ser cys 85 cac ttt His Phe cgc ctt Arg Leu agg tat Arg Tyr aga gag Arg GlU 150 aag ctt LYS Leu 165 gaa aag Giu Lys gca tgc Aia Cys ttc tgc Phe cys tct se r ctt Leu 55 gag Gi u tgc cys ctc Leu atc Ile ttc Ph e 135 tta Leu cag Gin cag Gin cga Arg tct Se r 215 tct Se r atc Ile agt se r cat HiS att Ile 120 aac Asfl aac Asfl gta Val aac Asn gcc Al a 200 caa Gin ctg Leu t cg Se r ag t Se r ccc Pro aag Lys 105 aca Th r aat Asn aga A rg gat

ASP

ag a Arg 185 aac Asfl acc Th r gag Glu cca Pro att Ile tcc Se r 90 acc rh r act rh r gca Al a ttg Leu cct Pro 170 gac

ASP

aaa Lys aca Th r aga A rg gac

ASP

gca Aila tg c Cys cga Arg gtc tac Ty r gag Gi U 155 cag Gin gag Gi U gca Aila t ct se r tct se r cac His ttc Phe gcc Aila atg Met tac Ty r 140 atg Met acg Th r ttc Phe act Th r cgc Arg 220 tta Leu acc Th r ttg Leu att Ile ctc Leu 110 g ct Aila aga Ang ttg Leu cac His atc le 190 cag Gin ctc Leu gtg Val1 gat

ASP

gcg Al a aaa Lys g ct Aila gtg Val ttg Leu aca Th r 175 gag Giu aag

LYS

143 191 239 287 335 383 431 479 527 575 623 665 725 785 814 tcggcaaggg taagatagga ttattttgtg ttttagtagt gatgattctt ttgcatgatt gattgtttgt gacaattgtg tgtagtagaa aatctgaaaa tttctaccaa ctcattcttt aagaagttgc taaaaaaaaa aaaaaaaaa <210> 4 <211> 220 <212> PRT <213> Arabidopsis thaliana <400> 4 ASP ser Leu Aia Thr ASP Pro Ala Phe Ilie ASP ser ASP Val Tyr Leu 1 5 10 Arg LeU Giy Leu Ilie lie Giu Gly Lys Arg LeU LyS Lys Pro Pro Thr 25 Val Leu Ser Arg Leu Ser Ser Ser Leu Giu Arg Ser Leu Leu Leu Asn 40 6 HiS Asp Asp Lys Ile Leu Leu Gly Ser Pro Asp Ser Val Thr Val Phe 55 Asp Gly Arg Ser Pro Pro Glu Ile Ser Ile Ala His Tyr Leu Asp Arg 70 75 Ile Phe Lys Tyr Ser Cys Cys Ser Pro Ser Cys Phe Val Ile Ala His 90 le Tyr le Asp His Phe Leu His Lys Thr Arg Ala Leu Leu Lys Pro 100 105 110 Leu Asn val His Arg Leu Ile Ile Thr Thr Val Met Leu Ala Ala LYS 115 120 125 Val Phe Asp ASP Arg Tyr Phe Asn Asn Ala Tyr Tyr Ala Arg Val Gly 130 135 140 Gly Val Thr Thr Arg Glu Leu Asn Arg Leu Glu Met Giu Leu Leu Phe 145 150 155 160 Thr Leu Asp Phe Lys Leu Gin Val Asp Pro Gin Thr Phe His Thr His 165 170 175 Cys Cys Gin Leu Giu Lys Gin Asn Arg Asp Giy Phe Gin Ile Giu Trp 180 185 190 Pro le Lys GIu Ala Cys Arg Ala Asn Lys Gu Thr Trp Gin Lys Arg 195 200 205 Thr Pro Asp Ser Phe Cys Ser Gin Thr Thr Ala Arg 210 215 220 <210> <211> 1268 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> (1)..(1266) <400> aat tcg gca cga ggc ctt ctc cag ctt cat cct tgc aac aag gt gta 48 Asn Ser Ala Arg Gly Leu Leu Gin Leu His Pro Cys Asn Lys va] Val 1 5 10 ctc tgg ggt ctt tct cat cag ata ttt gtc ggc tgc tgc agc tct gtg 96 Leu Trp Gly Leu Ser His Gin le Phe val Gly Cys Cys Ser Ser Va] 25 atg gaa gat gat gct acc agc aaa tta gct gcc ccc aag ccc gag cct 144 Met GIu Asp ASP Ala Thr Ser Lys Leu Ala Ala Pro Lys Pro Giu Pro 40 gct gat cag aat ctc gaa gct ggc aaa gct gct gtc ttc caa agg gga 192 Ala ASP Gin Asn Leu GiU Ala Gly Lys Ala Ala val Phe Gin Arg G y 55 tac aat ttg gtt cag gg aag tca gaa cat gga tta cca ttg gtt gat 240 Tyr 
Asn Leu Val Gin Gly Lys Ser Giu His Gly Leu Pro Leu Val Asp 70 75 aat tgc aaa gat ttg tcc tta gca gct ggt aac aat ttc gat gga acg 288 Asn Cys Lys Asp Leu Ser Leu Ala Ala Gly Asn Asn Phe Asp Gly Thr 90 gct cct ttg gag tat cat cag cag tat gat ctg caa caa gag ttt gaa 336 Ala Pro Leu GIu Tyr His Gin Gin Tyr Asp Leu Gin Gin Giu Phe Glu 100 105 110 cca aac ttc aat ggt ggt ttc aac aat tgt ccc agt tat ggt gta gta 384 Pro Asn Phe Asn G y G y Phe Asn Asn Cys Pro Ser Tyr Giy Val Val 115 120 125 gag ggt cct ata cat atc tct aat ttt atc ccg act att tgt cct cac 432 Giu Gly Pro Ile His Ile Ser Asn Phe Ile Pro Thr Ile Cys Pro His cct Pro 145 cag Gin ctg Leu cgt Ang agt Se r g ct Al a 225 acg Th r cca Pro gac

ASP

gag GlU cac His 305 gct Al a act Th r cag Gin aac Asfl agc ser 385 aat Asfl ctg Leu ctc Leu cac HiS cct Pro gca Al a 210 g ca Al a gtt Val ag g Arg tac Ty r ttt Ph e 290 cat HiS tgt cys tca se r atg met acc Th r 370 aca Th r gac

ASP

cat His ag g Arg tgg Trp 195 aag Lys act Th r ctg Leu agg A rg aat Asn 275 ttc Ph e gcc Al a aaa Lys 355 acc Th r aaa Lys tat Ty r tct se r gat

ASP

ctt Leu 180 agt se r gct Al a g ct Al a gag GlU gcc Ala 260 gaa Gi y ttg Leu 340 aga Arg aac Asfl gtc Val tgg Trp ttg Le u 165 tca se r atc Ile aaa Lys agt se r 245 ttt Phe cgt Arg ctg Leu tgg Trp tac Ty r 325 aaa

LYS

ctc Leu aac Asfl gcc Al a gta Val 405 gtc Val 150 att le aat Asfl tct se r 230 gag Giu gag GiU aag Lys cat HiS 310 agg Arg gtc Val aca Th r aac Asfl acc Th r 390 caa Gin gaa Gi U cta Leu aaa

LYS

215 cca Pro aca rh r agc Se r tgg Trp aga Arg 295 ctt Leu ctc Leu tca se r g ct Al a aaa Lys 375 gaa Glu aaa Lys tcc Se r agg Arg aaa

LYS

200 gat

ASP

tgg Trp cta Leu cac His 280 tct se r tac Ty r gag GiU aac Asfl gag Gi U 360 cgc Arg aat Asfl gag Gi U tgt cys agg Arg gcc Al a 185 gac

ASP

gtt Val aat Asfl agg Arg aac Asn 265 gag Gi U tac Ty r gaa GiU ctc Leu gac

ASP

345 ttc Ph e tgc cys gtt Val ttt Phe g ct Al a att le 170 acc Th r gct Al a gag Gi U 250 aga Arg tca se r tac Ty r tat Ty r aag Lys 330 tca Se r cct Pro atc le cag Gin aac Asn 410 ctt Ley 155 act Th r agg Arg ctg Leu att Ile cca Pro 235 tgg Trp aag

LYS

cgt Arg atg Met gag GiU 315 ctt Leu gt cca Pro aaa Lys aac Asn 395 tat Ty r tgg Trp gca Al a tat Ty r ctt Leu ccc Pro 220 gag GiU cta Leu caa Gin aaa Lys gat

ASP

300 at c le gtt Val gct Al a gaa Gi U 380 aca Th r ctg Leu gat

ASP

gca Al a gaa Giu ttt Phe 205 gaa GiU ctc Leu ttc Phe aga Arg cag Gin 285 cca Pro aac Asn gac

ASP

gat

ASP

aac Asn 365 aga Arg gta Val gtc Val tg c cys gct Al a tcc Se r 190 gct Al a tgt cy 5 ttt Ph e ttt Phe tct se r 270 atc le cag Gin aag Lys ctg Leu 350 aat Asn cca Pro gag GiU cct Pro tcc se r 175 g ct Al a gaa GiU gat

ASP

gac

ASP

255 tta Leu atg met cct Pro tgt cys aag Lys 335 cag Gin acc Th r aaa

LYS

cag Gin aat Asn 415 ag c Se r 160 acg Th r gt g Va1 ctt Leu ct c Leu 240 aag

LYS

cca Pro gtc Val ctg Leu gat

ASP

320 aag Lys aag Lys act Th r gt g Val1 gca Ala 400 cta Leu 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 1152 1200 1248 agc gac tat tat atc ccc tg 1268 Ser ASP Tyr Tyr Ilie Pro 420 <210> 6 <211> 422 <212> PRT <213> Arabidopsis thaliana <400> 6 Asn Ser Ala 1 Le u met Al a Ty r Asfl Al a Pro GiU Pro 145 Gin Leu Arg se r Al a 225 Th r Pro

ASP

Trp Gi U

ASP

Asfl Cys Pro Asfl Gi y 130 Leu Leu HiS Pro Al a 210 Al a Val Arg Ty r Gi y

ASP

Gin Leu Lys Leu Ph e 115 Pro His Arg Trp Gi y 195 Lys Th r Leu Arg Asfl 275 Arg Leu

ASP

Asn Val

ASP

GiU 100 Asfl Ilie Se r

ASP

Leu 180 Se r Al a Al a GiU Al a 260 Gi y Gi y 5 Se r Al a Leu Gin Leu Ty r Gi y His Trp Leu 165 Se r lie Gi y Lys Se r 245 Ph e Arg Leu Leu Gin His Gin Ilie Thr Ser Lys 40 Giu Ala Gly 55 Gly Lys Ser 70 Ser Leu Ala His Gin Gin Gly Phe Asn 120 Ilie Ser Asn 135 Val Gin Lys 150 Ilie Gly Ser Asn Giu Arg Gly Leu Lys 200 Gly Lys ASP 215 Ser Pro Trp 230 Giu Thr Leu Giu Ser Gly Gly Trp His 280 Leu Phe 25 Le u Lys Gi U Al a Ty r 105 As n Phe cys Arg Ala 185

ASP

Val Asn Arg Asn 265 GlU His 10 Val Al a Al a His Gi y 90

ASP

Cy s lie Al a Ilie 170 Th r Gi y Gi y Al a Gi U 250 Arg Se r Pro Gi y Al a Al a Gi y 75 Asn Leu Pro Pro Leu 155 Th r A rg Leu lie Pro 235 Trp Lys Arg Cy s Cys Pro Val Leu Asfl Gin Se r Th r 140 Trp Ala Ty r Leu Pro 220 Gi U Leu Gin

LYS

ASP

300 Asn Cys Lys Ph e Pro Phe Gin Ty r 125 lie

ASP

Ala GiU Ph e 205 Gi U Leu Phe Arg Gin 285 Lys Se r Pro Gin Leu

ASP

GiU 110 Gi y Cy 5 Cys Al a Se r 190 Al a Cys Ph e Phe Se r 270 Ilie Val Se r Gi u Arg Val Gi y Phe Val Pro Pro Se r 175 Gi y Al a Gi U

ASP

255 Le u met 'ial Val Pro Gi y

ASP

Th r Giu Val His Se r 160 Th r Val Leu Gi y Leu 240 Lys Pro Val Giu Phe Gly Gly Leu LYS 290 Arg Ser Tyr Tyr met 295 Pro Gin Pro Leu 9 His His Phe Glu Trp His Leu Tyr Giu Tyr GiU Ile Asn Lys Cys ASP 305 310 315 320 Ala Cys Ala Leu Tyr Arg Leu Giu Leu LYS Leu Val ASP Gly Lys Lys 325 330 335 Thr ser Lys Gly LYS Val Ser Asn ASP Ser val Ala ASP Leu Gin Lys 340 345 350 Gin Met Giy Arg Leu Thr Ala Giu Phe Pro Pro Giu Asn Asn Thr Thr 355 360 365 Asn Thr Thr Asn Asn Asn Lys Arg Cys Ilie LYS Gly Arg Pro Lys Val 370 375 380 Ser Thr Lys Vfal Ala Thr Gly Asn Val Gin Asn Thr Val Giu Gin Ala 385 390 395 400 Asn ASP Tyr Gly Val Gly GiU Giu Phe Asn Tyr Leu Val Gly Asn Leu 405 410 415 Ser ASP Tyr Tyr Ile Pro 420 <210> 7 <211> 653 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> (651) <400> 7 gaa ttc ggc acg agc tcc ttc ctc ggc tgt aac aag ata gag aag aag 48 Giu Phe Gly Th r Ser Ser Phe LeU G] y Cys Asn LYS Ilie GlU Lys Lys 1 5 10 atg aat atg gaa gtg gat aca gta aca agg aag cct cgt atc tta cta 96 met Asn met Giu va] ASP Thr Val Thr Arg Lys Pro Arg Ilie Leu Leu 25 gct gca agt gga agt gtg gct tca att aag ttc agt aat ctc tgc cat 144 Ala Ala Ser Gly Ser va] Ala ser lie LYS Phe Ser Asn Leu Cys His 40 tgt ttc tca gaa tgg gct gaa gtc aaa gcc gtc gct tca aaa tca tct 192 Cys Phe Ser Giu Trp Ala GiU Val Lys Ala Val Ala ser LySsSer Ser 55 ctc aat ttc gtt gat aaa cct tct cta cct cag aat gtg act ctc tat 240 Leu Asn Phe Val ASP Lys Pro Ser Leu Pro Gin Asn va] Thr Leu Tyr 70 75 aca gat gaa gat gaa tgg tct agc tgg aac aag att ggt gat ccc gtt 288 Thr ASP Giu ASP Giu Trp Ser Ser Trp Asn Lys Ilie Gly ASP Pro Val 90 ctt cat atc gag ctc aga cgc tgg gct gat gtt atg atc att gct cct 336 Leu His le Giu Leu Arg Arg Trp Ala ASP Val met Ile Ile Ala Pro 100 105 110 ttg tct gct aac aca tta gcc aag att gct ggt ggg tta tgt gat aat 384 Leu Ser Ala Asn Thr Leu Ala Lys lie Ala Gly G ly Leu Cys ASP Asn 115 120 125 cta ttg aca tgt ata gta aga gca tgg gat tat agc aaa ccg ttg ttt 43 432 Leu Leu Thr Cys Ilie Val 
Arg Ala Trp Asp Tyr Ser Lys Pro Leu Phe 130 135 140
gtt gca ccg gcg atg aac act ttg atg tgg aac aat cct ttc aca gaa 480
Val Ala Pro Ala Met Asn Thr Leu Met Trp Asn Asn Pro Phe Thr Glu
145 150 155 160
cgg cac ctt gtc ttg ctt gat gaa ctt gga atc acc cta att cct ccc 528
Arg His Leu Val Leu Leu Asp Glu Leu Gly Ile Thr Leu Ile Pro Pro
165 170 175
atc aag aag aaa ctg gcc tgt gy9a gac tac ggt aat ggc gca atg gct 576
Ile Lys Lys Lys Leu Ala Cys Gly Asp Tyr Gly Asn Gly Ala Met Ala
180 185 190
gag cct tct ctg att tat tcc act gtt aga ctg ttc tgg gag tca caa 624
Glu Pro Ser Leu Ile Tyr Ser Thr Val Arg Leu Phe Trp Glu Ser Gln
195 200 205
gct cgt aaa caa aga gat gga acc agt tg 653
Ala Arg Lys Gln Arg Asp Gly Thr Ser
210 215
<210> 8
<211> 217
<212> PRT
<213> Arabidopsis thaliana
<400> 8
Glu Phe Gly Thr Ser Ser Phe Leu Gly Cys Asn Lys Ile Glu Lys Lys
1 5 10 15
Met Asn Met Glu Val Asp Thr Val Thr Arg Lys Pro Arg Ile Leu Leu
20 25 30
Ala Ala Ser Gly Ser Val Ala Ser Ile Lys Phe Ser Asn Leu Cys His
35 40 45
Cys Phe Ser Glu Trp Ala Glu Val Lys Ala Val Ala Ser Lys Ser Ser
50 55 60
Leu Asn Phe Val Asp Lys Pro Ser Leu Pro Gln Asn Val Thr Leu Tyr
65 70 75 80
Thr Asp Glu Asp Glu Trp Ser Ser Trp Asn Lys Ile Gly Asp Pro Val
85 90 95
Leu His Ile Glu Leu Arg Arg Trp Ala Asp Val Met Ile Ile Ala Pro
100 105 110
Leu Ser Ala Asn Thr Leu Ala Lys Ile Ala Gly Gly Leu Cys Asp Asn
115 120 125
Leu Leu Thr Cys Ile Val Arg Ala Trp Asp Tyr Ser Lys Pro Leu Phe
130 135 140
Val Ala Pro Ala Met Asn Thr Leu Met Trp Asn Asn Pro Phe Thr Glu
145 150 155 160
Arg His Leu Val Leu Leu Asp Glu Leu Gly Ile Thr Leu Ile Pro Pro
165 170 175
Ile Lys Lys Lys Leu Ala Cys Gly Asp Tyr Gly Asn Gly Ala Met Ala
180 185 190
Glu Pro Ser Leu Ile Tyr Ser Thr Val Arg Leu Phe Trp Glu Ser Gln
195 200 205
Ala Arg Lys Gln Arg Asp Gly Thr Ser
210 215
<210> 9
<211> 1856
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (63)..(1583)
<400> 9
gaattcctcg agctacgtca gggccctgac gtagccgtca atcgaaatcc caaagatcag
cg atg gtg act cta aac gct tct tct cct ctc acg 
acc aag tcg ttc 107 met Va 1 Thr Leu Asn Ala Ser Ser Pro LeU Thr Thr LYSsSer Phe 1 5 10 ctc Leu ttc Ph e gct Al a caa Gin acg Th r cta Leu caa Gin att Ile tt C Phe ttt Ph e 160 ctg Leu aaa

LYS

ccc tac cgt Pro gcc Al a tcg se r ttg Leu ctt Leu gag Giu aga Arg 145 gcg Al a ccg Pro t cg se r Ty r gtt Val gtt Val ccg Pro tct se r gat

ASP

gac

ASP

130 aag LyS agt se r ag g Arg Arg cat His aaa Lys gat

ASP

tct se r aag Lys gat

ASP

115 acc Th r cag Gin ccg Pro tac Ty r att Ilie 195 cac His tcg se r tgg Trp tat Ty r ttt Ph e ctt Leu 100 tgt Cys ttt Phe tta Leu aga Arg aga Ar g 180 c ct Pro gct cct cgc Aia Pro Arg act gac ccc Thr ASP Pro agt cta gag Ser Leu GlU 55 cct gat cag Pro ASP Gin 70 cct cct ata Pro Pro Ilie 85 gg t caa gcg G y Gin Ala gct gag agt Ala Giu Ser agg gtt ctt Arg Val Leu 135 cca gtt atc Pro Val Ile 150 tta gac cca Leu ASP Pro 165 gga gat aac Gi y ASP Asfl gat cct cat ASP Pro His cgt Arg aag Lys 40 agt se r aag Lys gtt Val g ct Al a ttc Ph e 120 ctt Leu aag Lys ttt Phe ata Ilie agg Ar g 200 cca atc Pro Ilie 25 aaa tct Lys Ser tgg aag Trp Lys gat gtt ASP Val ttc gct Phe Aia 90 atg ggt met G ly 105 aag gaa Lys GIU cag atg Gin met gtj gjga Va] G y gag gag Giu Glu 170 aat ggt Asfl GMY 185 atg gtt met Val tct ttc Ser Phe acc caa Thr Gin tcg aag Ser Lys gat tcg ASP ser gj9t gag Gly Giu caa gcc Gln Ala ttt aac Phe Asn gg t gtt Gly Val 140 aga atg Arg met 155 aaa gat LYS ASP gat gct ASP Ala aga gcg Arg Ala tcc cct gtc Ser Pro Val tca gcc tcc Ser Ala Ser aaa gct ttg Lys Ala Leu gtt cta cag Val Leu Gin gct agg aaa Ala Arg Lys ttt atg ctt Phe met Leu 110 gct aat aac Ala Asn Asn 125 gtt ctc atg Val Leu met gct g gt cag Ala Gly Gin ggt gtg aag G] y Va] LYS 175 ttt gat gag Phe ASP Giu 190 tac aca cag Tyr Thr Gin 205 tct gtg gct acg ttg aat ctc ttg aga gca ttt gct act gga ggt tat Sen val Ala Thn Leu Asn Leu Leu Arg Ala Phe Ala Thr G lY Gly Tyr 210 215 220 gca Al a agt Se r 240 gct Al a atg Met tat Ty r gat

ASP

gat

ASP

320 atc Ilie gag GiU aga Ang gca Al a cac His 400 gat

ASP

gaa GlU gt agc Se r gct Al a 225 gaa GlU ttg Leu act Th r gag Giu tgc cys 305 aag Lys ata Ile atg met gtc Val 385 gca Al a act Th n tca Se r 465 atg met cag Gin act Th n caa Gin 290 tct Sen g ct Al a gt g Va] cta Leu 370 cg t A rg aac Asfl atc Ilie agt Sen gaa Giu 450 cg c Ang cag Gin ttc Ph e act Th r 275 gca Ala g cg Al a cat His agt Sen aac Asfl 355 gct Al a aca Th n agg Ang ttc Ph e 435 tgt Cys tac Tyrn aga Ang gac

ASP

atg Met 260 gag GiU ct c Leu cac His gtt Val gat

ASP

340 c ct Prno gag GlU gcc Al a atc lie g cg Al a 420 cct Prno gtc Vai cac His gtt Val ag g Ang 245 ttt Ph e aca Th n atg Met gag Gi U 325 aaa Lys cag Gin aat Asfl atg Met 405 gaa GiU act Th n agc Sen 230 tac Tyrn gca Al a tgg Tnp aga Ang ctt Leu 310 ttt Phe atg met aac Asfl atg met cag Gin 390 g ct Aia ttg Leu cac His cag Gin cgt Ang g ct Aila aca Th n gag Giu 295 tgg Tnp ct g Leu gtc Val aag

LYS

cgg Ang 375 att Ilie cct Pno ag a Ang gtt Val tca Sen 455 tgt Cys tgg Tnp gaa Giu tcc Sen 280 gat

ASP

gtt Val agg Ang cct Prno cct Prno 360 gtc Val gt g Va1 gcg Aila cat His 440 cg C Ang gac

ASP

aac Asn ttg Leu ctt Leu 265 cat His tca Sen ga Sen gag GLy act Pher 345 tta

LYS

act Th n tca P no ctt Leu g ct Aila 250 act Th r gag GiU aca Th n gaa Gi U atc Ilie 330 gaa Giu agg Ang ctt Le u tgg Tnp cta Leu 410 ttc Phe gaa Gi U atc Ilie aga Ang gat

ASP

235 aat Asn agt Sen tgt Cy s t ct Sen cga Arg 315 gct Al a ctg Leu att Ilie cct Prno gtt Val 395 aaa Lys gac

ASP

atg met act Th n ct c Leu 475 ttc Ph e aga Ang gct Al a ttg Leu 300 act Th n aac Asn gtt Val acg Th n aat Asn 380 agt Sen act Th n gtc Val act Th n tac Tyrn 460 aac Asn acg Th n gtt Val cac His tta Leu 285 ctt Leu cg c Ang ccc Prno aag Lys gtt Val 365 ttg Leu gat

ASP

cg t Arg cat His 445 aac Asn gca Al a caa Gin gat

ASP

ccg Pro 270 ttg Leu tac Tyrn caa Gin ctc Leu ctg Leu 350 ata Ile atc lie cca Prno tct Sen gat

ASP

430 caa Gin gat

ASP

tct Sen cat His gag Giu 255 atc Ilie c ct Pro tat Tyrn ctt Leu 335 ata Ilie gtg Va 1 ag a Ang atg met ttc Phe 415 caa Gin aac Asn cta Leu cag Gin 779 827 875 923 971 1019 1067 1115 1163 1211 1259 1307 1355 1403 1451 1499 1547 470 tct ctg gag ctt gca ttc atc att gca gag cgt ctg cga aag aga agg 13 Ser LeU GlU Leu Ala Phe Ile Ile Ala Giu Ar 8 Leu Arg LYS Arg Arg 480 485 498 495 ctt ggt tcc ggg aat ctt ccg tca tct att gga gtc tagagaacaa 1593 Leu G] y Ser Gi y Asn Leu Pro Ser Ser Ilie Gi y Val 500 505 gaaaatactt atccgagcta ggatgtgtgt gtatagaggc tgatctctac ttattaagtt 1653 gccaagttaa atgagcttgt gtactgttaa aagtaagata ttgttgtttt tgtgtgttgg 1713 gttatgattt tgtctgaaat aagtggctga ctttataacc cgtaaatctc tacgtcacgc 1773 ttgcaacaaa aattcgatat ttgattcaat cacagaaaag tcctcccatt aaggtgtaaa 1833 ccctgacgta gctcgaggaa ttc 1856 <210> <211> 507 <212> PRT <213> Arabidopsis thaliana <400> met Val Thr LeU Asn Ala Ser Ser Pro Leu Thr Thr LySsSer Phe Leu 1 5 10 Pro Tyr Arg His Ala Pro Arg Arg Pro Ile Ser Phe Ser Pro Val Phe 25 Ala Val HiSsSer Thr ASP Pro Lys LySsSer Thr Gin Sen Ala Ser Ala 40 Ser Val Lys Trp Ser LeU Glu Ser Trp LySsSer LYS Lys Ala Leu Gin 55 Leu Pro ASP Tyr Pro ASP Gin LYS ASP \'al Asp Ser Val Leu Gin Thr 70 75 Leu Sen Ser Phe Pro Pro Ilie Val Phe Ala Gly Giu Ala Arg LYS Leu 90 GiU ASP LYS Leu Gly Gin Ala Ala met Gly Gin Ala Phe met Leu Gin 100 105 110 Gly Gly ASP CyS Ala Giu ser Phe Lys Giu Phe Asn Ala Asn Asn Ile 115 120 125 Arg ASP Thr Phe Arg Val LeU Leu Gin met Gly Val Val Leu met Phe 130 135 140 Gly Gly Gin Leu Pro Val Ile Lys val Gly Arg met Ala Gly Gin Phe 145 150 155 160 Ala Lys Pro Arg Leu ASP Pro Phe Giu Giu Lys ASP Gly Val Lys Leu 165 170 175 Pro Ser Tyr Arg Gly ASP Asn lie Asn Gly ASP Ala Phe ASP Giu Lys 180 185 190 Ser Arg Ilie Pro ASP Pro His Arg met Val Arg Ala Tyr Thr Gin Ser 195 200 205 Val Ala Thr Leu Asn Leu Leu Arg Ala Phe Ala Thr Gly Gly Tyr Ala 210 215 220 Ala met Gin Arg Val Ser Gin Trp Asn Leu ASP Phe Thr Gin HiSsSer 225 230 235 240 Glu Gin Gly ASP Arg 
Tyr Arg Glu Leu Ala Asn Arg Val ASP 245 250 Giu Ala 255 Leu Th r GlU cy s 305 Gi y Lys le Met Val 385 Gi y Al a Gi y Th r Se r 465 Leu Gi y Gi y Th r Gin 290 Se r Al a Val Leu Gi y 370 A rg Asn le Se r GiU 450 Ang Gi U Sen Ph e Th r 275 Al a Al a His Sen Asn 355 Al a Gi y Th r Arg Ph e 435 Cys Ty r Leu Gi y met 260 GlU Leu His Val

ASP

340 Pro GlU Al a Ile Ala 420 Pro Val His Al a Asfl 500 Gi y Phe Th r Met Gi U 325

LYS

Gin Asn Gi y Met 405 Glu Gi y Gi y Th r Phe 485 Leu Al a Tnp A rg Leu 310 Ph e Met Asfl Met Gin 390 Al a Leu Gi y Gi y His 470 le Pro Al a Th r GlU 295 Trp Leu Val Lys A rg 375 Ile Pro Arg 'ial Se r 455 cys Ile Ser Gi y Sen 280

ASP

Val Arg Pro Pro 360 Val Val Gi y Al a His 440 Arg

ASP

Al a Sen Ley 265 His Sen Gi y Gi y Sen 345 Gi y

LYS

Th r Gi y Phe 425 Leu Th r Prno Gi u Ile 505 Th r Gi U Th r Gi U Ile 330 Gi u Ang Leu Trp Leu 410 Ph e GlU le Arg Arg 490 Gi y Sen cys Sen Arg 315 Al a Le u le Prno Val 395

LYS

ASP

Met Th n Leu 475 Leu Val Al a Leu Gi y 300 Th n Asn val Th n Asn 380 Sen Th n Val Th n Tyrn 460 Asn Ang Prno 270 Leu Tyrn Gin Leu Leu 350 le Ile Prno Sen

ASP

430 Gin

ASP

Sen Ang le Prno Tyrn Leu Gi y 335 le Val Arg met Ph e 415 Gin Asn Leu Gin Ang 495 met Ty r

ASP

320 le Glu Ang Al a His 400

ASP

GlU Val Sen Sen 480 Leu <210> 11 <211> 1081 <212> DNA <213> Arabidopsis thaliana <220> <221> COS <222> .(954) <400> 11 gaa ttc ggc acg agg gat ccc aag aac cta aat cgt cac caa gta cca Glu Phe Gly Tr Ang ASP Pro Lys Asn Leu Asn Ang His Gin Val Pro 1 5 10 aat ttc ttg aac cca cca cca cca ccg cga aat cag ggt ttg gta gat Asn Phe Leu Asn Pro Pro Pro Pro Pro Ang Asn Gin Gly Leu Val Asp gat

ASP

act Th r cca Pro t ct se r Arg ttg Leu gct Al a t Ct Se r 145 tct se r ag t Se r gtt Val ttt Phe tat Ty r 225 agt Se r gag GiU tct se r gat

ASP

gag Giu aac As n aac Asfl atg Met gag Gi U 130 g ct Al a ct t Leu ag t Se r t ct se r 210 aga Arg ttt Phe tta Leu ttt Phe g ct Al a att le aag Lys aaa

LYS

cct Pro cat HiS 115 cca Pro tta Leu act Th r ag t Se r agg Arg 195 tct Se r att Ile gca Ala act Th r 275 gct Al a aaa Lys aag Lys gac

ASP

g ct Al a 100 aaa

LYS

t cg Se r g ct Al a g ct Al a 180 tca se r t ct Se r ttg Leu 260 cag Gin t ct Se r gat

ASP

agt se r aga Arg ctt Leu tct Se r att Ile tct Se r 165 aga Arg agt se r gt? ttt Phe att le 245 t ct se r att Ile g ct Al a ttc Phe cag Gin 70 cac His tgt cy s gat

ASP

att le tca Se r 150 tta Leu cca Pro tta Leu cca Pro cct Pro 230 ttg Leu caa Gin tat Ty r gtt Val1 cag Gin 55 aat Asfl act Th r gct Al a g ca Al a 135 gct Al a atg Met tta Leu cca Pro acc Th r 215 gaa Gi y caa Gin cat His 295 gtt Val 40 atc Ile cag Gin aaa Lys gct Al a gaa Giu 120 gct Al a gca Al a atc le aat Asn act Th r 200 act Th r ttt Phe cag Gin 280 tcc se r gt g Va1 aac Asn gtc Val agg Arg 105 act Th r act Th r acc rh r agt Se r tgg Trp 185 gat gat

ASP

aat Asn 265 atg Met gac

ASP

gtc Val cag Gin gaa Gi U 90 att le atc Ile tct Se r cat His 170 tta Leu tta Leu ttt Ph e cat His 250 gtt Val gag Gi u tct se r ctt Leu 75 ttt Ph e cag Gin tca Se r aac Asn 155 gac

ASP

att Ile tgg Trp atg met cct Pro 235 aat Asn cag Gin aat Asn gct Al a aga Arg caa Gin tgg Trp 140 cat His tta Leu cca Pro agt se r 220 cag Gin gtt Val gct Al a cat HiS 300 cgc Arg tcc Se r cct Pro ttg Leu ctg Leu 125 act Th r cat HiS gat

ASP

aat Asn 205 gaa gtt Val at g met ttg Leu cag Gin 285 aaa Lys gac

ASP

aag

LYS

cga Arg act Th r 110 ctt Leu ata le caa Gin 190 gta Val cct Pro aat Asn 270 gct Al a cca Pro aaa

LYS

aga Arg cga Arg aga Arg caa Gin ccg Pro 175 gaa GiU g ct Al a gct Al a cat His 255 cct Pro caa Gin aca Th r gaa GiU ag c Se r att Ile gaa Giu caa Gin gcc Al a 160 tct Se r ga Get 240 ct Gey ag Gin gct Al a 144 192 240 288 336 384 432 480 528 576 624 672 720 768 816 864 912 290 agg gtt ctt cac Arg Val Leu His atg cat cat aac Met His His Asn gaa gaa cat cag Giu GiU His Gin 16 caa gag agt ggt gag aaa gat gat tct caa ggc tca ggt cgt 954 Gin Giu Ser G] y GiU Lys Asp Asp Ser Gin G ly Se r G~y Arg 305 310 315 taaaaggatt gggttttttt tgtatcttct ggatttgaaa aagcttttgg cttttgtttt 1014 gtgataatat tgttgtaatt tgtaccacca tggagaagaa aaagaaaagg ttatataaaa 1074 aaaaaaa 1081 <210> 12 <211> 318 <212> PRT <213> Arabidopsis thaliana <400> 12 Giu Phe Gly Thr 1 Asfl

ASP

Th r Pro Se r Arg Le u Ala Se r 145 Se r Se r Val Ph e Ty r 225 Se r GiU Phe

ASP

Gi u Asfl Asfl Met Gi y Gi U 130 Al a Leu Se r Se r Gi y 210 Arg Phe Leu Leu Al a Ilie Lys

LYS

Pro His 115 Pro Leu Th r Se r A rg 195 Se r Ile Al a Gi y Asfl Al a Lys Lys

ASP

Al a 100 Lys Se r Al a Al a Gi y 180 Se r Gi y Gi y Se r Leu 260 Arg 5 Pro Se r

ASP

Se r Arg Leu Se r Ilie Se r Gi y 165 A rg Se r Val Phe lie 245 Se r ASP Pro Lys Pro Pro Pro Ala Val Val 40 Phe Gin Ilie 55 Gin Asn Gin 70 His Thr Lys Cys Ala Ala ASP Gly Glu 120 Ilie Ala Ala 135 Ser Ala Ala 150 Leu Met Ilie Pro Leu Asn Leu Pro Thr 200 Pro Thr Thr 215 Pro Gly Phe 230 Leu Gly Gly Gin GiU Gly Asn Pro 25 Se r 'ial Asn Val Arg 105 Th r Th r Th r Se r Trp 185 Gi y Gi y

ASP

Asn Asn 265 Ley 10 Arg

ASP

Val Gin GiU 90 lie lie Gi y Se r His 170 Gi y Leu Leu Phe His 250 Val Asn Asn Gi U Se r Leu 75 Gi y Ph e Gin Se r Asn 155

ASP

lie Trp met Pro 235 Asn Gi y Arg Gin Asn Al a Gi y Arg Gin Trp Gi y 140 His Leu Gi y Pro Se r 220 Gi y Gin Val His Gi y Arg Se r Pro Gi y Leu Leu 125 Th r His

ASP

Gi y Asn 205 Gi U Val Met Leu Gin Le u Lys

ASP

Lys Arg Th r 110 Leu Ilie Gin Gi y Gi y 190 'ial Gi y Gi y Pro Asn 270 Val Val Pro Lys Arg Arg Arg Gin Pro Gi y Gi y 175 GiU Al a Al a His Gi y 255 Pro Pro

ASP

Th r Gi u Se r Ilie GiU Gin Al a Gi y 160 Se r Gi y Gi y Gi y Met 240 Leu Gin Ser Phe Thn Gin Ile 275 Gin Gly Arg Val Leu 290 Gin Glu Ser Gly Giu 305 17 Tyn Gin Gin Met Gly Gin Ala Gin Aia Gin Aia 280 285 Hi S Hi S Met His His Asn His Giu Giu His Gin 295 300 LyS ASP ASP sen Gin Giy Ser Gly Arg 310 315 <210> 13 <211> 777 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> (1) <400> 13 (774) aat Asn 1 gag Gi u aga Ang ttc Ph e cct Pro aat Asn tca Se r ttg Leu agt Se r gat

ASP

145 atg met gag Giu t cg Se r aat Asn tat Tyrn tta Leu gac

ASP

ccc Pro gct Aila ccg Pro 130 tta Leu g ct Ala gca Al a cg t Arg cag Gin aat Asn agt sen aac Asfl aac As n cat His 115 gct Al a gtt Val gtt Val aag LyS cg a Arg ccg Pro act Th r ttc Ph e cct Pro aca Th r 100 cat His caa Gin tta Leu atg met 180 5 gac

ASP

t cg Sen agt Sen cta Leu tca Se r gtt Vai c ct Pro tgc Cys gat

ASP

165 aat Asn aac Asn aat Asfl ata le acc rh r ttg Leu 70 agt Sen cag Gin caa Gin gca Aila gag Giu 150 gag Giu gag

GIU

aaa LyS gtg Val1 aac Asn tca Sen 55 aga Arg tca Sen gct Al a gca Al a 135 aca Th n tca Sen tta Leu aag LyS g ct Al a gaa Gi U 40 cct Prno gat

ASP

gtt Val gca Al a 120 cct Prno g at

ASP

gaa

GIU

ctg Leu agg Ang aat Asn 25 gca Al a cgg Ang gtt Val t ct Sen acg Th n 105 ctg Leu gca Al a ag t Sen gag

GIU

185 aga A ng 10 g ca Al a tat Tyrn ccc Pro 90 aat Asn gtt Val gac

ASP

gat

ASP

170 ctt Leu ctt Leu caa Gin gaa

GIU

agt Se r 75 gta Val caa Gin cag Gin tct Se r gag Gi U 155 g ca Al a gtc Val cct Prno aac Asn aat Asfl tca Sen tct Sen aca Th r gta Val cca Prno tgg Tnp 140 tgt cy 5 att le cct Prno gta Val cg c Ang atg Met gtt Val acc Th n ttg Leu ccc Prno aat Asfl 125 ag c Sen ttt Phe tct Sen aag LyS gat

ASP

cag Gin ctt Leu tca Sen tct Sen gcc Al a gaa Gi U 110 ata le cct Prno gat

ASP

cct Prno ctg Leu 190 gag Giu att le cga Ang aac Asfl gta Val gag Gi U gca Al a gaa GiU cca Pro gaa GiU 175 ccc Prno cag Gin gtt Val cag Gin aat Asn gac

ASP

ttt Ph e agt Sen caa Gin ttt Ph e ata Ile 160 48 96 144 192 240 288 336 384 432 480 528 576 atc caa gat cca ttc tgg gaa cag ttc le Gin ASP Pro Phe Trp GlU Gin Phe 195 200 att gca gat aca gac gat att cta tca lie Ala ASP Thr ASP ASP Ilie Leu Ser 210 215 ttg gta ttg gaa caa gaa cca aac gag Leu Val Leu Glu Gin Glu Pro Asn Giu 225 230 atg aag tat ctt act gaa caa atg g ga met Lys Tyr Leu Thr Giu Gin met Gly 245 agg aaa taa Arg LYS <210> 14 <211> 258 <212> PRT <213> Arabidopsis thaliana 18 ttt tct gtt gaa ctc Phe Ser Val GlU Leu 205 gja tca gt g gag aat GI y Ser Va] Giu Asn 220 tgg acc cgt aat gaa Trp Thr Arg Asn Giu 235 ctg ctt tcc tca gaa Leu Leu Ser Ser Giu 250 cca Pro aat Asn caa Gin gca Al a 255 g cg Al a gac

ASP

caa Gin 240 cag Gin 624 672 720 768 777 <400> 14 Asn Ser 1 Giu Arg Ph e Pro Asn se r Leu se r

ASP

145 met GiU Ilie lie Leu Asn Ty r Leu

ASP

Gi y Pro Al a Pro 130 Leu Al a Gi y Gin Al a 210 Val Ala Arg Gly 5 Arg Gly ASP Gin Pro Ser Asn Thr ser Ser Phe Leu Asn Pro Ser Asn Thr Val 100 His His Pro 115 Ala Gin Gly Val Gly Cys val Leu ASP 165 LYS met Asn 180 ASP Pro Phe 195 ASP Thr ASP LeU Giu Gin Asn Lys LYS Asn Val Ala lie Asn Giu 40 Thr Ser Pro 55 Leu Gly ASP 70 Ser Arg Val Gin Ser Ala Gin Ala Gly 120 Ala Ala Pro 135 Giu Thr ASP 150 Glu Ser Giu Glu Leu Leu Trp Glu Gin 200 ASP Ilie Leu 215 GlU Pro Asn A rg Asfl 25 Al a Arg Val se r Th r 105 Leu Al a se r Gi y Gi U 185 Phe se r GiU Arg 10 Gi y Al a Ty r Pro Gl y 90 Asn val

ASP

Gi y

ASP

170 Gi y Ph e Gi y Trp Leu Leu Gln GlU Se r 75 Val Gin Gin Se r GlU 155 Al a 'ial Se r Se r Th r Pro Asn Asn Se r Se r Th r Val Pro Trp 140 cys lie Pro val 'ial 220 Arg Val Arg met Val Th r Leu Pro Asn 125 se r Phe se r Lys GlU 205 Gi U Asn

ASP

Gin Leu Se r Se r Al a GiU 110 lie Pro

ASP

Pro Leu 190 Leu Asn GlU GlU lie Arg Asn Val Gi u Ala Gl y GlU Pro GiU 175 Pro Pro Asn Gin Gin Val Gin Asn

ASP

Phe Se r Gin Phe lie 160 Gly Gl y Al a

ASP

Gin 19 225 230 235 240 met Lys Tyr Leu Thr Glu Gin Met Gly Leu Leu ser Ser GlU Ala Gin 245 250 255 Arg LyS <210> <211> <212> DNA <213> Arabidopsis thaliana <400> cggatccgaa ttcatggaga acgag <210> 16 <211> <212> DNA <213> Arabidopsis thaliana <400> 16 cggatccgaa ttctcagaac tgaga <210> 17 <211> 21 <212> DNA <213> Arabidopsis thaliana <400> 17 agggatgttt aataccacta c 21 <210> 18 <211> 21 <212> DNA <213> Arabidopsis thaliana <400> 18 gcacagttga agtgaacttg c 21 <210> 19 <211> <212> DNA <213> Arabidopsis thaliana <400> 19 cgagatctga attcatggat cagta <210> <211> 26 <212> DNA <213> Arabidopsis thaliana <400> cgagatctga attcctaagg catgcc 26 <210> 21 <211> 22 <212> DNA <213> Artificial sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 21 ccatatggaa ttcgcacgag gc 22 <210> 22 <211> 24 <212> DNA <213> Artificial sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 22 gcagtaatag gatccactat aggg 24 <210> 23 <211> 28 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 23 gggaattcat ggcggaactt gagaatcc 28 <210> 24 <211> 32 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 24 ggggatccaa gacaagataa gagtccctgc cg 32 <210> <211> <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> gggaattcat ggctgatcag attgagatcc <210> 26 <211> 33 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 26 ggggatccgc ataaatataa tcaagcagca gcg 33 <210> 27 <211> <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 27 gggaattcat gttaaccgca gccggagacg <210> 28 <211> 34 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 28 ggggatccgg ggatccatca aacatataaa gatg 34 <210> 29 <211> 31 <212> DNA <213> Artificial 
Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 29 ccgaattcat ggattcccta gcgatttctc c 31 <210> <211> 36 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> ggggatccct acaacatgat tcgagaaaat tgatgg 36 <210> 31 <211> 26 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 31 gggaattcat ggactctctc gcaacc 26 <210> 32 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description Of Artificial sequence: artificial sequence <400> 32 ggggatcctt gccgatcagc gtgc <210> 33 <211> 609 <212> DNA <213> Arabidopsis thaliana <220> <222> <400> 33 (606) atg met 1 tta Le u cga A rg cga Arg gca Al a cgt Arg cat His gat

ASP

acg Th r ttt Phe 145 ctt Leu gcg Al a tct Se r gtc Val cca Pro aat Asn ttc Phe cg t Arg ctg Leu aag Lys 130 gaa GiU caa Gin gaa Gi u tca Se r gcg Al a acg Th r tgt Cys act Th r ctt Leu tac Ty r 115 gag Glu tta Leu aag Lys ct t Leu ttg Leu act Th r ata Ile agt Se r cac His ctc Le u 100 tac Ty r atg Met aac Asn gaa GlU gag Gi U 5 cta Leu cag Gin acg Th r cct Pro aga Arg atc Ile aac Asn aat Asn gt atg met 165 aat Asn gag Giu tca Se r att Ile tct Se r 70 caa Gin act Th r aat Asn ttt Ph e acg Th r 150 act Th r cca Pro cga Arg cag Gin cag Gin 55 tg c cys cct Pro agt Se r gcg Al a cta Leu 135 cca Pro ctt Leu agt Se r gtt Val aga Arg agc Se r ttc Ph e tca Se r gtc 'ial tat Ty r 120 gag Gi U aac Asn ctt Leu gta Val gct Al a 25 gtt 'ial tat Ty r gtc Val ctt Leu atg met 105 tac Ty r ctg Leu aca Th r caa Gin atg met 10 gag Glu t cg Se r cta Leu gtt 'ial ccc Pro 90 gtc Val g cg Al a gat

ASP

ttc Phe cct Pro 170 tcg Se r tca Se r gtg Va 1 gag Gi U gct Al a 75 atc le gct Al a aaa Lys ttc Phe aac Asn 155 ct c Leu aag

LYS

aac Asfl ttt Ph e ag g Arg tac Ty r aat Asfl g ct Al a gt g Va1 tta Leu 140 gcc Al a tct Se r ata le ctg Leu ttc Ph e tat Ty r ttt Phe ttc Phe 110 ttc Phe gtt Val gca Al a acc Th r ctg Leu aaa Lys ct c Leu aac Asn ctc Leu ata le tta Leu tct Se r gtt Val 175 ttc Phe cga Arg agt Se r tac Ty r gat

ASP

gtc Val gat

ASP

ag c Se r tat Ty r 160 gtc Val 48 96 144 192 240 288 336 384 432 480 528 576 cca tca tca aga tct ctc att acc ttc aac gac gat gaa gct tct cat Pro Ser Ser Arg Ser Leu lie Thr Phe Asn ASP ASP Giu Ala Ser His 23 180 185 cag aaa caa caa caa caa caa ctc gct gtt tga Gin Lys Gin Gin Gin Gin Gin Leu Ala Vai 195 200 190 <210> 34 <211> 202 <212> PRT <213> Arabidopsi <400> 34 s thaiiana met Leu Arg Arg Al a Arg His

ASP

Th r Phe 145 Leu Pro Gin Ala Giu Leu GiU Asn Pro se r Val Pro Asn Ph e A rg Leu Lys 130 Giu Gin se r

LYS

se r Al a Th r Cys Th r Leu Ty r 115 Gi U Leu Lys Se r Gin 195 Leu Th r Ile Se r His Leu 100 Ty r Met Asn GiU Arg 180 Gin Le u Gin Th r Pro Arg le Asn Asn Val met 165 Se r Gin Gi u Se r Ile Se r 70 Gin Th r Asn Ph e rh r 150 Th r Leu Gin A rg Gin Gin 55 cy 5 Pro Se r Al a Leu 135 Pro Leu le Gin se r Val Arg 40 Se r Phe Se r Val Ty r 120 Gi u Asn Leu Th r Leu 200 Val Al a 25 Val Ty r Val Leu Met 105 Ty r Leu Th r Gin Ph e 185 Al a Met 10 Giu Se r Leu Val Pro 90 Val Al a

ASP

Ph e Pro 170 Asn 'ial se r se r \'al Gi u Al a 75 le Al a Lys Ph e Asn 155 Leu

ASP

LYS

Asn Ph e Arg Ty r Asn Al a Val Leu 140 Al a Se r

ASP

Leu

ASP

His Ile \'al Se r Lys Gi y 125 Phe Ty r Leu Gi U le Leu Gi y Ph e Ty r Phe Ph e 110 Gi y Gi y Phe Val Al a 190 Al a Th r Leu Lys Leu Asn Leu le Leu Se r Val 175 Se r Ph e A rg Se r Ty r

ASP

Val

ASP

Se r Gi y Ty r 160 Val HiS <210> <211> 660 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> .(657) <400> atg gct gat cag att gag atc cag aga atg aac caa gat ctt caa gaa met Ala ASp Gin le Giu Ile Gin Arg met Asn Gin ASP LeU Gin GIU 1 5 10 cca ttg gct gag atc atg cca agt gtt tta acg gca atg tcg tat ctc 24 Pro Leu Ala Glu Ile Met Pro Ser Val Leu Thr Ala Met Ser Tyr Leu 25 ttg caa aga gta tcg gag acc aac gac aac ctg agc cag aaa cag aag 144 Leu Gin Arg Val Ser Glu Thr Asn Asp Asn Leu Ser Gin LyS Gin Lys 40 ccc tca agc ttc act gga gta acc aaa cct tcc att tcc atc aga agc 192 Pro Ser Ser Phe Thr G y Val Thr Lys Pro Ser Ile Ser Ile Arg Ser 55 tat ctc gaa cgg atc ttt gaa tac gcg aat tgt agc tac tcg tgt tac 240 Tyr Leu Giu Arg Ile Phe Giu Tyr Ala Asn Cys Ser Tyr Ser Cys Tyr 70 75 atc gtc gca tat ata tat ttg gat cgg ttc gtg aag aag cag cca ttt 288 Ile Val Ala Tyr le Tyr Leu Asp Arg Phe val LyS Lys Gin Pro Phe 90 ttg cct atc aat tct ttt aat gtc cat agg ctt ata atc aca agt gtc 336 Leu Pro Ile Asn Ser Phe Asn Val His Arg Leu Ile Ile Thr Ser val 100 105 110 ttg gtc tct gct aaa ttc atg gat gac ttg agt tac aac aat gaa tat 384 Leu Val Ser Aia Lys Phe Met Asp ASp Leu Ser Tyr Asn Asn Giu Tyr 115 120 125 tat gca aaa gtt g ga g ga ata agc aga gaa gaa atg aac atg ctt gag 432 Tyr Ala Lys val Gly G]y le Ser Arg Giu Glu Met Asn Met Leu Giu 130 135 140 ctt gac ttc ttg ttc gga att gg ttt gag tta aac gtc acc gtt tct 480 Leu Asp Phe Leu Phe G y Ile G y Phe Giu Leu Asn vai Thr Val Ser 145 150 155 160 act ttc aat aac tat tgt tgt ttt cta caa aga gag atg gcg atg ttg 528 Thr Phe Asn Asn Tyr Cys Cys Phe Leu Gin Arg Giu Met Ala Met Leu 165 170 175 atg aag atg aag tct ctg ttt ctt gaa cct tct tca ttc aaa atc tct 576 Met Ls Met LYS Ser Leu Phe Leu Gu Pro Ser Ser Phe Lys lie Ser 180 185 190 ttt aag acg aaa ctt gtg atg tat cca cac gag gaa gac tct tta tct 624 Phe Lys Thr Lys Leu Val Met Tyr Pro His Giu Giu Asp Ser Leu Ser 195 200 205 act cac cac aac aag aag caa ctc gct gct gct tga 660 Thr His His 
Asn Lys Lys Gin Leu Ala Ala Ala 210 215 <210> 36 <211> 219 <212> PRT <213> Arabidopsis thaliana <400> 36 Met Ala Asp Gin Ile Gu Ile Gin Arg Met Asn Gin Asp Leu Gin Glu 1 5 10 Pro Leu Ala Giu Ile Met Pro Ser Val Leu Thr Ala Met Ser Tyr Leu 25 Leu Gin Arg val Ser Giu Thr Asn Asp Asn Leu Ser Gin Lys Gin LyS 40 Pro Ser Ser Phe Thr Gly Val Thr Lys Pro Ser Ile Ser le Arg Ser 55 Tyr Leu Glu Arg Ilie Phe Glu Tyr Ala Asn cys Ser Tyr Ser Cys Tyr 70 75 lie Val Ala Tyr lie Tyr Leu ASP Arg Phe Val LYS Lys Gin Pro Phe 90 Leu Pro Ile Asn Ser Phe Asn Val His Arg Leu le Ile Thr Ser Val 100 105 110 Leu Val Ser Ala Lys Phe met ASP ASP Leu Sen Tyr Asn Asn Giu Tyr 115 120 125 Tyr Ala Lys Val Gly Gly Ile Ser Arg Glu Giu Met Asfl met Leu GlU 130 135 140 Leu ASP Phe Leu Phe Gly Ilie Gly Phe Giu Leu Asn Val Thr Val Ser 145 150 155 160 Thr Phe Asn Asn Tyr Cys Cys Phe Leu Gin Arg Giu met Ala Met Leu 165 170 175 Met Lys Met Lys Sen Leu Phe LeU Giu Pro Sen Ser Phe Lys Ilie Sen 180 185 190 Phe Lys Thr Lys Leu Val Met Tyr Pro His Giu Giu ASP Sen Leu Sen 195 200 205 Thr His His Asn Lys LYS Gin Leu Ala Ala Ala 210 215 <210> 37 <211> 633 <212> DNA <213> Anabidopsis thaliana <220> <221> COS <222> .(630) <400> 37 atg tta acc gca gcc gga gac gat gaa ctg gac ccg gtc gtg gga cca 48 Met Leu Thr Ala Ala Gly ASP ASP Giu Leu ASP Pro Val Va] G y Pro 1 5 10 gaa tcg gca acg gaa gca gcc act cca aga gtg ctg act ata atc tcc 96 Giu Sen Ala Thr GiU Ala Ala Thr Pro Ang va] Leu Thr Ilie Ilie Sen 25 cat gt gatg gag aag ctc gt g gca cga aac gag tgg tta gct aag caa 144 His va] met Giu Lys Leu Va] Ala Ang Asn Giu Trp Leu Ala Lys Gin 40 act aag gga ttt ggg aag agc ttg gag gcg ttt cac ggc gtg aga gcg 192 Thr Lys Gly Phe G ly Lys Sen Leu Giu Ala Phe His G y Va IAng Al a 55 ccg agc ata agt ata gct aaa tac ctt gag agg ata tat aag tac aca 240 Pro Sen lie Sen Ilie Ala Lys Tyr Leu Giu Ag Ilie Tyr Lys Tyr Thr 70 75 aaa tgt agc ccg gca tgt ttc gtt gtt ggg tat gtg tac ata gac cgg 288 LYS cys Sen Pro Ala Cys Phe Val Val Gi y Tyr va] 
Ty r Ilie ASP Ang 90 ttg gct cat aag cat cct g gt tct ttg gtt gtc tcc ttg aat gtt cat 336 Leu Ala His LYS His Pro Gly Sen Leu Val Val Sen LeU Asn Val His 26 100 105 110 aga ctc ctc gtc act tgt gtc atg att gct gcc aag ata cta gat gac 384 Arg Leu Leu Val Thr Cys Val met lie Ala Ala Lys lie Leu ASP ASP 115 120 125 gt gcac tac aac aac gag ttc tat gct cgg gtt gy9a gygc gta agc aat 432 Val His Tyr Asn Asn Giu Phe Tyr Ala Arg V'al G 1y G Iy Val Ser Asn 130 135 140 gca gac ttg aac aaa atg gag ttg gag ctt ctc ttt ctt ctt gac ttt 480 Ala ASP Leu Asn Lys met Giu Leu GiU Leu Leu Phe Leu Leu ASP Phe 145 150 155 160 aga gtt act gt g agt ttt aga gtt ttc gag agc tat tgc ttt cac ctc 528 Arg Val Thr Val Ser Phe Arg Val Phe GlU Ser Tyr Cys Phe His Leu 165 170 175 gaa aaa gag atg caa cta aac gac gtc gtt tct tcc ctc aaa gat att 576 Glu Lys GlU met Gin Leu Asn ASP Val Val Ser Ser Leu LYS ASP Ilie 180 185 190 caa cca atg caa gaa agt ctc tct cca gca tct act tta tca tct tta 624 Gin Pro met Gin Giu Ser Leu Ser Pro Ala Ser Thr Leu Ser Ser Leu 195 200 205 tat gtt tga 633 Tyr Val 210 <210> 38 <211> 210 <212> PRT <213> Arabidopsis thaliana <400> 38 met Leu Thr Ala Ala Gly ASP ASP Giu Leu ASP Pro Val Val Gly Pro 1 5 10 Giu Ser Ala Thr Giu Ala Ala Thr Pro Arg Val Leu Thr Ilie Ilie Ser 25 His Val met GiU Lys Leu Val Ala Arg Asn Giu Trp LeU Ala Lys Gin 40 Thr Lys Gly Phe Gly LySsSer LeU GiU Ala Phe His Gly Val Arg Ala 55 Pro Ser Ilie Ser Ilie Ala Lys Tyr Leu Giu Arg Ilie Tyr LYS Tyr Thr 70 75 Lys Cys Ser Pro Ala Cys Phe Val Val Gly Tyr Val Tyr Ilie ASP Arg 90 Leu Ala His Lys His Pro Gly Ser Leu Val val Ser Leu Asn Val His 100 105 110 Arg Leu Leu Val Thr Cys Val met le Ala Ala Lys Ile Leu Asp ASP 115 120 125 Val His Tyr Asn Asn Giu Phe Tyr Ala Arg Val Gly Gly Val ser Asn 130 135 140 Ala ASP Leu Asn Lys met Giu Leu Giu Leu Leu Phe Leu Leu ASP Phe 145 150 155 160 Arg val Thr Val Ser Phe Arg Val Phe Giu Ser Tyr Cys Phe His Leu 165 170 175 27 Giu Lys Giu Met Gin Leu Asn ASP Val Val Ser Ser Leu Lys ASP 
Ilie 180 185 190 Gin Pro Met Gin Giu Ser Leu Ser Pro Ala Ser Thr Leu Ser Ser Leu 195 200 205 Tyr Val 210 <210> 39 <211> 669 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> .(666) <400> 39 atg gat tcc cta gcg att tct cca agg aag ctc cga tca gac ctc tac 48 met ASP Ser Leu Ala Ilie Ser Pro Arg Lys Leu Arg Ser ASP Leu Tyr 1 5 10 tct tac tct tac caa gat gat tcc aac aca gta cct cta gtc atc tct 96 Ser Tyr Ser Tyr Gin ASP ASP Ser Asn Thr Val Pro Leu Val Ilie Ser 25 gtt ctc tcg tct ctg atc gaa cga act tta gct agg aac gag aga atc 144 Val Leu Ser Ser Leu Ile Giu Arg Thr Leu Aia Arg Asn Giu Arg Ile 40 agc cgg agc tac ggt ggt ttt ggt aag aca cgt gtc ttt gat tgc cgg 192 Ser Arg Ser Tyr G ly G] y Phe G ly Lys Thr Arg Vai Phe ASP Cys Arg 55 gag att cct gat atg act att caa tca tac cta gag aga att ttc cgg 240 Gu Ile Pro ASP met Thr le Gin Ser Tyr Leu Giu Arg Ile Phe Arg 70 75 tat acc aaa gcc ggt cca tcg gtt tac gtc gtg gct tat gta tac att 288 Tyr Thr Lys Ala G ly Pro Ser \'ai Tyr Val Va] Ala Tyr val Tyr Ilie 90 gac cgg ttc tgt cag aat aac caa g gt ttc aga atc agt ctt acc aat 336 ASP Arg Phe Cys Gin Asn Asn Gin G ly Phe Arg lie Ser Leu Thr Asn 100 105 110 gta cat cgt ctc ctt atc aca act atc atg atc gct tcc aaa tac gtc 384 Val His Arg Leu Leu Ilie Thr Thr Ilie met Ilie Aia Ser Lys Tyr Val 115 120 125 gaa gat atg aac tac aaa aac tcg tac ttt gcg aaa gta gga aggatta 432 Giu ASP met Asn Tyr Lys Asn Ser Tyr Phe Aia Lys Vai Gi y GlY Leu 130 135 140 gag aca gaa gat ttg aac aat ttg gaa ctg gag ttc ttg ttc ttg atg 480 Giu Thr Giu ASP Leu Asn Asn Leu Giu Leu Giu Phe Leu Phe Leu met 145 150 155 160 gja tt t aag ttg cat gt g aat gt 9 agt gt g ttc gag agt tac tgt tgt 528 GyPhe Lys Leu His Va] Asn Val Ser Val Phe Giu Ser Tyr Cys Cys 165 170 175 cat cta gaa aga gaa gtg agt att gga gga ggt tat cag atc gaa aaa 576 His Leu Giu Arg Giu Va] Ser Ilie Gly G ly Gly Tyr Gin lie Giu LYS 180 185 190 28 gca ttg cgt tgc gct gag gaa atc aaa tct aga caa att gtt caa gac Ala Leu Arg Cys Ala Giu Giu le 
LySsSer Arg Gin Ile Val Gin ASP 195 200 205 cct aaa cat cat cat cac cat caa ttt tct cga atc atg ttg tag Pro Lys His His His His His Gin Phe Ser Arg Ile met Leu 210 215 220 <210> <211> 222 <212> PRT <213> Arabidopsis thaliana <400> met ASP 1 Ser Tyr Val Leu Ser Arg Giu leI Tyr ThrI ASP ArgI Val His GiU ASP 130 Giu Thr 145 Gly Phe His Leu Ala Leu Pro Lys 210 Ser Leu Ala Ile Ser se r se r se r Pro Lys Phe A rg 115 miet GiU Lys GiU A~rg 195 His Ty r Se r Ty r

ASP

Al a cys 100 Leu Asn

ASP

Leu A rg 180 cys 5 Gin Leu Gi y Met Gi y Gin Leu Ty r Leu His 165 GiU Al a

ASP

Ilie Gi y rh r 70 Pro Asn le

LYS

Asn 150 Val Val GlU

ASP

GiU Phe Ile Se r Asn Th r Asn 135 Asfl Asn Se r GiU His 215 Pro se r Arg 40 Gi y Gin Val Gin Th r 120 Se r Leu Val lie le 200 Arg Asn 25 Th r Lys Se r Ty r Gi y 105 Ile Ty r GiU Se r Gi y 185 Lys Lys 10 Th r Leu Th r Ty r Val 90 Phe Met Ph e Leu Val 170 Gi y Se r Leu Val Al a Arg Leu 75 Val Arg Ile Al a GiU 155 Phe Gi y A rg Se r Leu Asn Phe Arg Ty r Se r Se r 125 Val Leu se r Gin le 205

ASP

'ial Gi U

ASP

le Val Leu 110

LYS

Gi y Ph e Ty r le 190 Val Leu Ilie A rg cys Phe Ty r Th r Ty r Gi y Leu cy s 175 Gi U Gin Ty r Se r lie Arg Arg Ilie Asn Val Leu Met 160 Cy s Lys

ASP

His His HiS Gin Phe Ser Arg Ile met Leu 220 <210> 41 <211> 671 <212> DNA <213> Arabidopsis thaliana <220> <221> CDS <222> .(669) <400> 41 atg gac tct ctc met ASP Ser Leu 1 Ct C Leu act Th r aat Asfl ttt Ph e cg c A rg cat His ccc Pro aaa Lys 145 ttt Phe cac His tgg Trp agg Arg agg Arg gta \'al cat His 50 gac

ASP

att Ile atc le ctt Leu gtc Val 130 acc Th r tgt Cys ccc Pro aca Th r 210 tta Leu ctc Leu 35 gat

ASP

ttc Phe tac Ty r aat Asfl 115 ttc Ph e gtg Va 1 ctt Leu tgt cys ata Ile 195 ccc Pro gca Al a 5 ctt Leu cgc Arg aag Lys tct Se r tac Ty r gat

ASP

cac His gat

ASP

acg Th r ttc Ph e 165 tta Leu gaa Giu tca Se r acc gat Thr ASP cca gct Pro Ala att lie ctc Leu att Ile ccc Pro 70 tct Se r cac His cgc A rg agg Arg aga Ang 150 aag Lys gaa Glu gca Al a ctc Leu att Ile tct se r ctg Leu 55 cct Pro tgc Cys ttt Phe ctt Leu tat Ty r 135 gag GiU ctt Leu aag Lys tgc cy s tg c cys 215 gag Glu tct Se r 40 ctt Leu gag GiU tgc cy 5 ct c Leu atc lie 120 ttc Ph e tta Leu cag Gin cag Gin cga Arg 200 tct Se r 25 tct Se r atc lie agt Se r cat His 105 att lie aac Asfl aac Asfl gta Val aac Asfl 185 gcc Aila caa Gin ttc Ph e 10 aaa Lys ctg Leu tcg Se r agt Se r ccc Pro 90 aag Lys aca Th r aat Asfl aga A rg gat

ASP

170 aga A rg aac Asn acc Th r att gat tcg gat Ilie ASP Ser ASP cga A rg gag Gi U cca Pro att lie 75 tcc Se r acc Th r act Th r gca Aila ttg Leu 155 c ct Pro gac

ASP

aaa

LYS

aca Th r gta tac Val Tyr ttg Leu aga Arg gac

ASP

gca Aila tg c cys cga Arg gtc Val tac Ty r 140 gag GlU cag Gin gag Gi U gca Al a 220 aaa

LYS

tct Se r tct Se r cac His ttc Phe gcc Al a atg met 125 tac Ty r atg met acg Th r ttc Ph e act Th r 205 cgc A rg aag Lys ctg Leu gtt Val tac Ty r gtc Val ctt Leu 110 tta Leu gca Al a gag GlU ttt Phe cag Gin 190 tgg Trp tga cca Pro tta Leu acc Th r ttg Leu att Ilie ctc Leu gct Al a aga Arg ttg Leu cac His 175 atc lie cag Gin tcg Se r ccg Pro ctc Leu gt g Va1 gat

ASP

0 gcg Al a aaa Lys gct Al a gt ttg Leu 160 aca Th r gag GlU aag Lys gc 48 96 144 192 240 288 336 384 432 480 528 576 624 671 <210> 42 <211> 221 <212> PRT <213> Arabidopsis thaliana <400> 42 met ASP Ser Leu Ala Thr ASP Pro Ala Phe lie ASP ser ASP Vial Tyr 1 5 10 Leu Arg Leu Gly Leu Ilie Ilie Glu Gly LYS Arg Leu Lys Lys Pro Pro 25 Thr Val Leu Ser Arg Leu Ser Ser Ser LeU Glu Arg Ser LeU Leu Leu Asfl Ph e Arg His Pro Lys Gi y 145 Ph e His Trp Arg His

ASP

Ile le Leu Val 130 Gi y Th r Cy 5 Pro Th r 210

ASP

Gi y Phe Ty r Asfl 115 Phe Val Leu cys Ile 195 Pro Lys se r Ty r

ASP

His

ASP

Th r Phe 165 Le u GiU Se r le Pro 70 Se r His Arg Arg Arg 150 Lys Glu Ala Leu Ley 55 Pro Cys Ph e Leu Ty r 135 GiU Leu Lys Cys cy s 215 Leu Gly Ser Pro GiU Cys Leu le 120 Phe Leu Gin Gin Ar q 200 Se r le se r His 105 le Asn Asfl Val Asn 185 Ala Gin le 75 Se r Th r Th r Al a Leu 155 Pro

ASP

LYS

Th r

ASP

Al a cys Arg Val Ty r 140 GiU Gin Gi y Gi u Al a 220 ser Val Thr His Ph e Al a Met 125 Ty r Met Th r Ph e Th r 205 A rg Ty r Val Leu 110 Leu Al a GlU Phe Gin 190 Trp Val

ASP

Al a Lys Al a Val Leu 160 Th r Glu

LYS

<210> 43 <211> <212> PRT <213> Artificial sequence <220> <223> Description of Artificial sequence <400> 43 Tyr Leu GlU Arg Ile Phe Lys Tyr 1 5 Val Val Ala Tyr Val Tyr Leu ASP sequence: artificial Al a Arg Cys Ser Pro Ser Cys Phe Thr His Arg 25 Leu Pro Ilie Asn Ser Phe met Val Ala Ala Lys Phe Tyr Ala Lys Val Gly Gly 70 Leu ASP Phe Leu Phe Asn Leu 55 Ile Val Hi S Arg Leu Leu Gin Pro Ser Thr Ser Val Asn Ala Tyr ASP ASP Leu Tyr Ty r met Ser Thr Lys Gi u 75 Asn Phe Leu Glu <210> 44 <211> <212> PRT <213> Artificial sequence <220> <223> Description Of Artificial sequence: artificial sequence <400> 44 Tyr Leu GlU Arg Ilie Phe Glu Tyr Ala Asn Cys Ser Tyr Ser Cys Tyr 1 5 10 lie Val Ala Tyr Ilie Tyr Leu ASP Arg Phe Val Lys Lys Gin Pro Phe 31 25 Leu Pro lie Asn Ser Phe Asn Vial His Arg Leu Ilie Ilie Thr Ser Val 40 Leu Val Ser Ala Lys Phe met ASP ASP Leu Ser Tyr Asn Asn Giu Tyr 55 Tyr Ala LYS Val Gly Gly lie Ser Arg Giu Glu Met Asn met Leu Glu 70 75 LeU ASP Phe Leu Phe <210> <211> <212> PRT <213> Artificial sequence <220> <223> Description of Artificial sequence <400> Tyr Leu GlU Arg Ilie Tyr Lys Tyr 1 5 Val V'al Gly Tyr val Tyr Ile ASP Ser Leu 'ial Val Ser LeU Asn Val 40 met lie Ala Ala LYS Ilie Leu ASP 55 Tyr Ala Arg Val Gly Gly Val Ser 70 Leu Glu Leu Leu Phe <210> 46 <211> <212> PRT <213> Artificial sequence <220> <223> Description of Artificial sequence <400> 46 Tyr Leu Glu Arg Ile Phe Arg Tyr 1 5 Val Val Ala Tyr Val Tyr Ilie ASP Phe Arg Ile Ser Leu Thr Asn Val met lie Ala Ser LYS Tyr Val Glu 55 Phe Ala Lys Val Gly Gly Leu Glu 70 sequence: artificial Th r Arg 25 His

ASP

Asn

LYS

10 Leu Arg Val Al a cys Al a Leu His

ASP

75 se r His Leu Ty r Leu Pro Lys Val Asn Asn Ala cys Phe His Pro Gly Thr Cys Val Asn GlU Phe Lys Met Glu sequence: artificial Th r Arg 25 His

ASP

Th r

LYS

10 Phe Arg Met Gi U Al a Cy s Leu Asfl

ASP

75 Gi y Gl n Leu Ty r Leu Ser Val Tyr Asn Gln Gly Thr Thr Ile Asn Ser Tyr Asn LeU Glu Leu Glu Phe Leu Phe <210> 47 <211> 84 <212> PRT <213> Artificial Sequence <220> <223> Description of Artificial sequence <400> 47 Tyr Leu Asp Arg Ile Phe Lys Tyr 1 5 val Ile Ala His Ile Tyr Ile Asp Leu Leu Lys Pro Leu Asn val His 40 Leu Ala Ala Lys val Phe Asp Asp Ala Arg val Gly Gly Val Thr Thr 70 Glu Leu Leu Phe sequence: artificial Ser His 25 Arg Arg Arg cys 10 Phe Leu Tyr Glu cys Leu Ile Phe Leu 75 Pro Lys Thr ASn Arg Ser Cys Phe Thr Arg Ala Thr val Met Ala Tyr Tyr Leu Glu Met <210> 48 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 48 attgcacact acttggatcg catt <210> 49 <211> <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 49 gatagaatgg gaacggctag <210> <211> <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> ctgataccag acgttgcccg cataa <210> 51 <211> <212> DNA <213> Artificial sequence <220> <223> Description of Artificial sequence: artificial sequence <400> 51 ctacaaattg ccttttctta tcgac <210> 52 <211> 572 <212> DNA <213> Artificial sequence <220> <223> Description of Artificial sequence sequence: artificial <400> 52 ntgtactaaa ccataagacc catgttagct tcaacacgca tcaacaatgc agatggagtt cacactgttg ggcacacgcc ggaaagttgt ctngatgtac aggtgcantc cgagcccttc gctaaagtct aataagtctt atactacgca gttgtttacc tactgaatcg ctggagtccg ccaanacgaa atggtcnata cctcctgctt tcaaacccct tcgatgatag caatcataga agagtgggag cttgacttca gattttcaag gcccgtttcc tcccagtgtc nnaaaaggcn cgtcatggat taatgtccac gtatgttact ttcattgatc gtgtgactac agcttcaggt ggtctggcca agttgagggt ctattaccaa at atctacattg cgccttatca cactaaacct tctggtgttg gagagagtta agatcctcag aaactattcc tgtctacgct tagccgacgg atcactttct ttacaactgt ggtatcaaat ngcaggtatt aacagattgg acgtttcaca gngggcacct tanatgagaa tatcgataag 120 180 240 300 360 420 480 540 572 <210> 53 <211> 
22 <212> DNA <213> Artificial sequence <220> <223> Description of Artificial Sequence: artificial sequence <400> 53 cgatccagct ttcattgatt cg <210> 54 <211> <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial sequence: artificial sequence <400> 54 gatagaatgg gaacggctag <210> <211> 740 <212> DNA <213> Arabidopsis thaliana <400> atttcnttng ccaccgactg gatgacaaga cctgagatca tcctgcttcg ctcaaacccc ttcgatgata tcaatcatag aagagtggga ccttgacttc aaaagcagaa acaaagagac gatcggcaag ntgtatacct ttctctcacg ttctgcttgg gtattgcaca tcattgcgca ttaatgtcca ggtatgttac attcattgat ggtgtgacta aagcttcagg cagcgacggc ttggcagaag ggnaaaanga caggttagga cctctcttct atcgccagac ctacttggat tatctacatt ccgccttatc tcactaaacc ctctggtgtt cgagagagtt tagatcctca ttccagatcg aggacacccg cttattattg tctctggaga tctgttaccg cgcattttca gatcactttc attacaactg tggtatcaaa gtgcaggtat aaacagattg gacgtttcac agtggcccat actcactctg agggcaaacg gatctctgtt tgtttgacgg agtactcttg tccataagac tcatgttagc ttcaacacgc ttcaacaatg gagatggagt acacactgtt aaaagaagca ctctcaaacc attgaaaaag actcaatcat gagatctccc ctgcagtccc ccgagccctt tgctaaagtc aaataagtct catactacgc tgttgtttac gtcaagttag tgccgagcca acagcacgct 120 180 240 300 360 420 480 540 600 660 720 740 <210> 56 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial sequence: artificial sequence <400> 56 attgcacact acttggatcg catt <210> 57 <211> 23 <212> DNA <213> Artificial sequence <220> <223> Description of Artificial sequence: artificial sequence <400> 57 ctatcttacc cttgccgatc agc <210> 58 <211> <212> DNA <213> Artificial sequence <220> <223> Description of Artificial sequence: artificial sequence <400> 58 ctacaaattg ccttttctta tcgac 1
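As a sanity check on the listing above, entry lengths and translations can be recomputed programmatically. The following is a minimal Python sketch and is not part of the original filing: it assumes the standard genetic code, and the helper names `clean`, `translate` and `reverse_complement` are hypothetical. It uses the primer given as SEQ ID NO: 17 (declared length 21) and the first 16 codons of the coding sequence given in SEQ ID NO: 35.

```python
# Illustrative helpers only; the names are hypothetical and not taken from the patent.

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the whitespace used for grouping in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate a DNA coding sequence in frame 1, stopping at the first stop codon."""
    s = clean(seq)
    protein = []
    for pos in range(0, len(s) - len(s) % 3, 3):
        aa = CODON_TABLE.get(s[pos:pos + 3], "X")  # "X" for codons with ambiguous bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    complement = str.maketrans("ACGT", "TGCA")
    return clean(seq).translate(complement)[::-1]


# Primer as printed for SEQ ID NO: 17, and the first 16 codons of SEQ ID NO: 35.
seq17 = "agggatgttt aataccacta c"
seq35_start = "atg gct gat cag att gag atc cag aga atg aac caa gat ctt caa gaa"

print(len(clean(seq17)))          # compare with the declared <211> value
print(translate(seq35_start))     # compare with the start of SEQ ID NO: 36
print(reverse_complement(seq17))  # strand to which the primer would anneal
```

The translated fragment can be compared residue by residue with the three-letter amino acid codes printed under the codons, which is one way to spot extraction errors in a listing like this one.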

Claims

1. An isolated DNA sequence encoding a CDC2b binding protein or encoding an immunologically active and/or functional fragment of such a protein, said DNA sequence selected from the group consisting of: (a) a DNA sequence comprising a nucleotide sequence encoding at least the mature form of a protein (VB89) comprising the amino acid sequence as given in SEQ ID NO: 8; (b) a DNA sequence comprising the nucleotide sequence as given in SEQ ID NO: 7; (c) a derivative, fragment, allelic variant or homolog of the DNA sequence set forth in SEQ ID NO: 7, wherein said derivative, fragment, allelic variant or homolog hybridizes with the complementary strand of a nucleotide sequence as defined in (a) or (b) under stringent hybridization conditions; (d) a derivative, fragment, allelic variant or homolog of the DNA sequence set forth in SEQ ID NO: 7, wherein said derivative, fragment, allelic variant or homolog encodes a CDC2b binding protein or encodes an immunologically active and/or functional fragment of such a protein comprising an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (a) or (b); or (e) a DNA sequence encoding at least the CDC2b binding domain of the protein or an immunologically active and/or functional fragment thereof encoded by the DNA sequence of any one of (a) to (d); wherein said isolated DNA sequence is not the DNA sequence set forth in EMBL Database Accession No. U80192.

2. An isolated nucleic acid molecule of at least 15 nucleotides in length hybridizing specifically with a DNA sequence of claim 1 or with a complementary strand thereof.

3. A vector comprising a DNA sequence of claim 1.

4. The vector of claim 3 which is an expression vector wherein the DNA sequence is operatively linked to one or more control sequences allowing the expression in prokaryotic and/or eukaryotic host cells.

5. A host cell containing a vector of claim 3 or 4 or a DNA sequence of claim 1.

6. The host cell of claim 5 which is a bacterial, insect, fungal, plant or animal cell.

7. A method for the production of a CDC2b binding protein or an immunologically active or functional fragment thereof comprising culturing a host cell of claim 5 or 6 under conditions allowing the expression of the protein and recovering the produced protein from the culture.

8. An isolated CDC2b binding protein or an immunologically active or functional fragment thereof encoded by a DNA sequence of claim 1 or obtained by the method of claim 7.

9. An isolated antibody specifically recognizing the protein of claim 8 or a fragment or epitope thereof, wherein said epitope or fragment comprises at least 5 amino acid residues.

10. A method for the production of transgenic plants, plant cells or plant tissue comprising the introduction of a DNA sequence of claim 1 or 2 or a vector of claim 3 or 4 into the genome of said plant, plant cell or plant tissue.

11. The method of claim 10 further comprising regenerating a plant from said plant tissue or plant cell.

12. A transgenic plant cell comprising a DNA sequence of claim 1 which is operably linked to regulatory elements allowing transcription and/or expression of the DNA sequence in plant cells or obtainable according to the method of claim 10 or 11.

13. The transgenic plant cell of claim 12 wherein said DNA sequence or said vector is stably integrated into the genome of the plant cell.

14. A transgenic plant or a plant tissue comprising plant cells of claim 12 or 13.

15. The transgenic plant of claim 14 in which plant cell division and/or growth is enhanced and/or wherein the plant is less sensitive to environmental stress compared to the corresponding wild type plant.

16. A transgenic plant cell which contains stably integrated into the genome a DNA sequence of claim 1 or 2 or part thereof or obtained according to the method of claim 10 or 11, wherein the transcription and/or expression of the DNA sequence or part thereof leads to reduction of the synthesis of a CDC2b binding protein in the cells.

17. The plant cell of claim 16, wherein the reduction is achieved by an antisense, sense, ribozyme, co-suppression, dominant mutant effect and/or a knock out mutant in the gene.

18. A transgenic plant or plant tissue comprising plant cells of claim 16 or 17.

19. The transgenic plant of claim 18 which displays a deficiency in plant cell division and/or growth.

20. Harvestable parts or propagation material of plants of any one of claims 14, 18 or 19 comprising plant cells of claim 12, 13, 16 or 17.

21. An isolated regulatory sequence of a promoter regulating the expression of a nucleic acid molecule comprising the DNA sequence of claim 1, said regulatory sequence being capable of conferring expression of a heterologous DNA sequence during various stages of the cell cycle, wherein said regulatory sequence naturally regulates the expression of the DNA of claim 1 in the genome of an organism.

22. A recombinant DNA molecule comprising the regulatory sequence of claim 21.

23. The recombinant DNA molecule of claim 22, wherein said regulatory sequence is operatively linked to a heterologous DNA sequence.

24. A host cell transformed with a regulatory sequence of claim 21 or a recombinant DNA molecule of claim 22 or 23.

25. A transgenic plant, plant tissue, or plant cell comprising the regulatory sequence of claim 21 or the recombinant DNA molecule of claim 22 or 23.

26. A method for the identification of an activator or inhibitor of CDC2 binding proteins or their encoding genes comprising the steps of: (a) culturing a plant cell or tissue or maintaining a plant comprising a recombinant DNA molecule comprising a readout system operatively linked to a regulatory sequence of claim 21 in the presence of a compound or a sample comprising a plurality of compounds under conditions which permit expression of said readout system; (b) identifying or verifying a sample and compound, respectively, which leads to suppression or activation and/or enhancement of expression of said readout system in said plant, plant cell, or plant tissue.

27. A method for identifying and obtaining an activator or inhibitor of cell division comprising the steps of: combining a compound to be screened with a reaction mixture containing the CDC2b binding protein of claim 8 and a readout system capable of interacting with the protein under suitable conditions which permit interaction of the protein with said readout system; PAOPERU61I9823-OO sped.doc 23/7/04 107 identifying or verifying a sample and compound, respectively, which leads to suppression or activation of the readout system.

28. A method of producing a therapeutic agent comprising the steps of the method of claim 27 and synthesizing the activator or inhibitor obtained or identified in step or an analog or derivative thereof in an amount sufficient to provide said agent in a therapeutically effective amount to a patient.

29. A method of producing a plant effective agent comprising the steps of the method of claim 27 and synthesizing the activator or inhibitor obtained or identified in step (b) or an analog or derivative thereof in an effective amount sufficient to provide said agent in an effective amount suitable for the application in agriculture or plant cell and tissue culture.

30. A method of producing a therapeutic or plant effective composition comprising the steps of the method of claim 27 and combining the compound obtained or identified in step (b) or an analog or derivative thereof with a pharmaceutically acceptable carrier or with a plant cell and tissue culture acceptable carrier.

31. An activator or inhibitor of cell division obtained by the method of any one of claims 27 to 29.

32. A composition comprising a DNA sequence of claim 1 or 2, a vector of claim 3 or 4, a protein of claim 8, an antibody of claim 9, or the activator or inhibitor of claim 31.

33. The composition of claim 32 for use as a medicament, a diagnostic means, a kit or plant effective agent.

34. Use of a DNA sequence of claim 1 or 2, the vector of claim 3 or 4, the protein of claim 8, the antibody of claim 9 or the activator or inhibitor of claim 31 for modulating the cell cycle in an animal or plant, plant cell division and/or growth, for influencing the activity of cell cycle proteins in a plant or animal cell, as positive or negative regulator of cell proliferation, for modifying the growth inhibition caused by environmental stress conditions, for use in a screening method for the identification of inhibitors or activators of cell cycle proteins, as growth regulator, herbicide or for inducing nematode resistance in plants.

35. Use of a DNA sequence of claim 1 or 2 or the regulatory sequence of claim 21 as a marker gene in plant or animal cell and tissue culture or as a marker in marker-assisted plant breeding.

36. Use of a regulatory sequence of claim 21 or a recombinant DNA molecule of claim 22 or 23, for the expression of a heterologous DNA sequence during a stage of the cell cycle.

37. Use of an isolated DNA sequence encoding a CDC2b binding protein or encoding an immunologically active and/or functional fragment of such a protein, said DNA sequence selected from the group consisting of: (a) a DNA sequence comprising a nucleotide sequence encoding at least the mature form of a protein (VB89) comprising the amino acid sequence as given in SEQ ID NO:8; (b) a DNA sequence comprising the nucleotide sequence as given in SEQ ID NO:7; (c) a derivative, fragment, allelic variant or homolog of the DNA sequence set forth in SEQ ID NO:7, wherein said derivative, fragment, allelic variant or homolog hybridizes with the complementary strand of the nucleotide sequence as defined in (a) or (b) under stringent hybridization conditions; (d) a derivative, fragment, allelic variant or homolog of the DNA sequence set forth in SEQ ID NO:7, wherein said derivative, fragment, allelic variant or homolog encodes a CDC2b binding protein or encodes an immunologically active and/or functional fragment of such a protein, comprising an amino acid sequence at least 60% identical to the amino acid sequence encoded by the nucleotide sequence of (a) or (b); or (e) a DNA sequence encoding at least the CDC2b binding domain of the protein or immunologically active and/or functional fragment thereof encoded by the DNA sequence of any one of (a) to (d); or a fragment of said DNA sequence which is at least 15 nucleotides in length; or a vector comprising said DNA sequence; or a protein or immunologically active or functional fragment thereof encoded by said DNA sequence; or an isolated antibody specifically recognizing said protein or a fragment or epitope thereof wherein said fragment or epitope comprises at least five amino acid residues; for modulating the cell cycle in an animal or plant; for modulating plant cell division and/or growth; for influencing the activity of cell cycle proteins in a plant or animal cell; as a positive or negative regulator of cell proliferation; for modifying growth inhibition caused by environmental stress conditions; for use in a screening method for the identification of inhibitors or activators of cell cycle proteins; as a growth regulator; as a herbicide; and/or for inducing nematode resistance in plants.

Dated this 23rd day of July 2004.
CropDesign NV
By its Patent Attorneys
Davies Collison Cave