AU696364B2

AU696364B2 - Morphogenic protein screening method II

Info

Publication number: AU696364B2
Application number: AU36040/97A
Authority: AU
Inventors: Charles N. Cohen; Thangavel Kuberasampath; Hermann Oppermann; Engin Ozkaynak; Roy H. L. Pang; David C. Rueger; John E. Smart
Original assignee: Creative Biomolecules Inc
Current assignee: Creative Biomolecules Inc
Priority date: 1991-08-30
Filing date: 1997-08-27
Publication date: 1998-09-10
Anticipated expiration: 2012-08-28
Also published as: AU3604097A

Description

AUSTRALIA

PATENTS ACT 1990

ORIGINAL

COMPLETE SPECIFICATION


Name of Applicant: Creative Biomolecules, Inc.

Address of Applicant: 45 South Street, Hopkinton, Massachusetts 01748, United States of America

Actual Inventor(s): John E. SMART, Hermann OPPERMANN, Engin OZKAYNAK, Thangavel KUBERASAMPATH, David C. RUEGER, Roy H.L. PANG, Charles N. COHEN

Address for Service: DAVIES COLLISON CAVE, Patent Attorneys, 1 Little Collins Street, Melbourne, 3000.

Complete Specification for the invention entitled: Morphogenic protein screening method II

The following statement is a full description of this invention, including the best method of performing it known to us:

MORPHOGENIC PROTEIN SCREENING METHOD II

The invention relates to a method of screening drugs for the ability to modulate the level in mammals of proteins which can induce tissue morphogenesis and to methods of determining which animal tissue(s) and/or cell types within a tissue express a particular morphogenic protein. More particularly, the present invention provides a method of diagnosing tissue damage and/or disease wherein altered expression of a particular morphogenic protein is indicative of said damage or disease.

Background of the Invention

Cell differentiation is the central characteristic of morphogenesis which initiates in the embryo, and continues to various degrees throughout the life of an organism in adult tissue repair and regeneration mechanisms. Members of the TGF-β superfamily include subfamilies of highly-related genes that now are suspected to play important roles in cell differentiation and morphogenesis during development and/or during adult life. For example, the Drosophila decapentaplegic gene product (DPP) has been implicated in formation of the dorsal-ventral axis in fruit flies; activins induce mesoderm and anterior structure formation in mammals; Müllerian inhibiting substance (MIS) may be required for male sex development in mammals; growth/differentiation factor-1 (GDF-1) has been implicated in nerve development and maintenance; other morphogenic proteins (BMP-2, -4 and OP-1) induce bone formation.

The development and study of a bone induction model system has identified the developmental cascade of bone differentiation as consisting of chemotaxis of mesenchymal cells, proliferation of these progenitor cells, differentiation of cartilage, ossification and hypertrophy of this cartilaginous tissue, vascular invasion, bone formation, remodeling, and finally, marrow differentiation (Reddi (1981) Collagen Rel. Res. 1:209-206). This bone model system, which is studied in adult mammals, recapitulates the cascade of bone differentiation events that occur in formation of bone in the developing fetus. In other studies, the epithelium of the urinary bladder has been shown to induce new bone formation. Huggins (1931, Arch. Surg. 22:377-408) showed that new bone formation could be induced by surgical transplantation of urinary bladder epithelium onto the parietal fascia. Urist (1965, Science 150:893-899) demonstrated that implantation of demineralized bone segments resulted in endochondral bone formation. The latter study and observation suggested the existence of an osteogenic protein and that bovine diaphyseal bone was a source of enriched preparations of osteogenic protein (Sampath et al., J. Biol. Chem. 265:13198-13205, 1990; Urist, ibid; Reddi et al., Proc. Natl. Acad. Sci. 69:1601-1605, 1972; Sampath et al., Proc. Natl. Acad. Sci. 80:6591-6595, 1983).

Proteins capable of inducing endochondral bone formation in mammals when implanted in association with a matrix now have been identified in a number of different mammalian species, as have the genes encoding these proteins (see, for example, U.S. Patent No. 4,968,590; U.S.S.N. 315,342 filed February 23, 1989; and U.S.S.N. 599,543, filed October 18, 1990). Human OP-1 DNA has been cloned from various cDNA and genomic libraries using a consensus probe (Ozkaynak et al., EMBO J. 9:2085-2093, 1990). Purified human recombinant OP-1, expressed in mammalian cells, has been shown to induce new bone formation in vivo. Like other members of the TGF-β superfamily, OP-1 is produced as a precursor, glycosylated, processed and secreted as a mature dimer. Mature OP-1 is cleaved at a maturation site following a sequence with the pattern of RXXR (Panganiban et al., Mol. Cell. Biol. 10:2669-2677, 1990).
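The RXXR pattern mentioned above (Arg, any two residues, Arg) lends itself to a simple sequence scan. The following is a minimal sketch, assuming one-letter amino acid codes; the function name `find_rxxr_sites` and the toy sequence are hypothetical illustrations and are not taken from the specification or from any real OP-1 sequence.

```python
import re


def find_rxxr_sites(protein):
    """Return 0-based start indices of every Arg-X-X-Arg (RXXR) motif.

    `protein` is a string of one-letter amino acid codes.
    A lookahead is used so that overlapping motifs are reported too.
    """
    return [m.start() for m in re.finditer(r"(?=R..R)", protein.upper())]


# Hypothetical toy sequence for illustration only (not a real morphogen).
toy_sequence = "AAARSKRSTTRAARLL"

sites = find_rxxr_sites(toy_sequence)
# If cleavage occurs immediately after a motif, the downstream chain
# would begin four residues past the motif start.
candidate_mature_starts = [start + 4 for start in sites]

print("RXXR motif starts:", sites)
print("candidate first residues after cleavage:", candidate_mature_starts)
```

A scan of this kind only lists candidate sites; which site is actually used for maturation has to be established experimentally.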

The degree of morphogenesis in adult tissue varies among different tissues and depends on, among other factors, the degree of cell turnover in a given tissue. On this basis, tissues can be divided into three broad categories: 1) tissues with static cell populations such as nerve and skeletal muscle where there is little or no cell division and most of the cells formed during development persist throughout adult life and, therefore, possess little or no ability for normal regeneration after injury; 2) tissues containing conditionally renewing populations such as liver where there is generally little cell division but, in response to an appropriate stimulus or injury, cells can divide to produce daughters of the same differentiated cell type; and 3) tissues with permanently renewing populations including blood, bone, testes, and stratified squamous epithelia which are characterized by rapid and continuous cell turnover in the adult. Here, the terminally differentiated cells have a short life span and are replaced through proliferation of a distinct subpopulation of cells, known as stem or progenitor cells.

It is an object of this invention to provide a method of screening compounds which, when administered to a given tissue from a given organism, cause an alteration in the level of morphogenic protein ("morphogen") produced by the tissue. Such compounds, when administered systemically, will result in altered systemic or local levels of morphogenic activity. This morphogenic activity includes the ability to induce proliferation and sequential differentiation of progenitor cells, and the ability to support and maintain the differentiated phenotype or sequence of phenotypes through the progression of events that results in the formation of normal adult tissue (including organ regeneration). Thus, broadly, the invention provides a key to development of additional modalities of therapies involving modulation of morphogenic protein production in animals or adult mammals, e.g., humans, and consequent correction of conditions involving pathologic alteration of the balance of tissue cell turnover. Another object of the invention is to provide methodologies for identifying or selecting a combination of compound(s) which may increase a progenitor cell population in a mammal, stimulate progenitor cells to differentiate in vivo or in vitro, maintain the differentiated phenotype or sequence of phenotypes of a tissue, induce tissue-specific growth in vivo, or replace diseased or damaged tissues or organs in vivo. Another object of the invention is to determine the tissue(s) or organ(s) of origin of a given morphogen. Another object of the invention is to determine the specific cell type(s) within the tissue(s) or organ(s) of origin, or cell line(s) derived from the tissue(s) or organ(s) of origin, that is responsible for the synthesis and production of a given morphogen. These and other objects and features of the invention will be apparent from the description, drawing, and claims which follow.

Summary of the Invention

The invention features a method of screening candidate compounds for the ability to modulate the effective local or systemic concentration or level of morphogenic protein in an organism. The method is practiced by incubating one or more candidate compound(s) with cells from a test tissue type of an organism known to produce a given morphogen for a time sufficient to allow the compound(s) to affect the production, expression and/or secretion of morphogen by the cells; and then assaying cells and the medium conditioned by the cells for a change in a parameter indicative of the level of production of the morphogenic protein. The procedure may be used to identify compounds showing promise as drugs for human use capable of increasing or decreasing morphogen production in vivo, thereby to correct or alleviate a diseased condition.

In a related aspect, the invention features a method of screening tissue(s) of an organism to assess whether or at what level cells of the tissue(s) produce a particular morphogen, thereby to determine a tissue(s) of origin of the morphogen. This permits selection of the tissue cell type to be used in the screening. As used herein, "tissue" refers to a group of cells which are naturally found associated, including an organ.

In a further related aspect, the invention features a method of assessing or diagnosing tissue damage or disease comprising assaying the tissue to determine the level of a particular morphogen therein, wherein a depression or absence of morphogen expression relative to undamaged or undiseased tissue is indicative of damage or disease.

As an example of tissue(s) or organ(s) which produce high levels of morphogen relative to the level produced by other types of tissues, it has been discovered that OP-1, first found in bone tissue, is produced at relatively high levels in cells derived from renal, e.g., kidney or bladder, or adrenal tissue; that GDF-1 is produced at relatively high levels in cells derived from nerve, e.g., brain tissue; that DPP is produced at relatively high levels in cells derived from one of the following Drosophila tissues: dorsal ectoderm, epithelial imaginal disc, visceral mesoderm, or gut endoderm; that Vgr-1 is produced at relatively high levels in cells derived from mouse lung tissue; and that Vg1 is produced at relatively high levels in cells derived from Xenopus fetal endoderm tissue. In addition, BMP3 and CBMP2B transcripts have been identified in abundance in lung tissue. As used herein, "derived" means the cells are the cultured tissue itself, or are a cell line whose parent cells are the tissue itself.

Preferred methods for determining the level of or a change in the level of a morphogen in a cultured cell include using an antibody specific for the morphogen, in an immunoassay such as an ELISA or radioimmunoassay; and determining the level of nucleic acid, most particularly mRNA, encoding the morphogen using a nucleic acid probe that hybridizes under stringent conditions with the morphogen RNA, such as in an RNA dot blot analysis. Where a change in the presence and/or concentration of morphogen is being determined, it will be necessary to measure and compare the levels of morphogen in the presence and absence of the candidate compound. The nucleic acid probe may be a nucleotide sequence encoding the morphogen or a fragment large enough to hybridize specifically only to RNA encoding a specific morphogen under stringent conditions. As used herein, "stringent conditions" are defined as conditions in which non-specific hybrids will be eluted but at which specific hybrids will be maintained, i.e., incubation at 0.1X SSC (15 mM NaCl, Na citrate) at 50°C for 15 minutes.
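The comparison of morphogen levels in the presence and absence of a candidate compound, as called for above, can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function name `classify_compound`, the two-fold threshold, and the example readings are hypothetical and do not come from the specification, which does not fix any numerical cutoff.

```python
def classify_compound(level_with, level_without, fold_threshold=2.0):
    """Compare morphogen levels measured with and without a compound.

    level_with / level_without: assay readings (e.g. ELISA signal or
    dot blot intensity) in the same arbitrary units.
    fold_threshold: hypothetical cutoff, an assumption for illustration.
    """
    if level_without <= 0 or level_with < 0:
        raise ValueError("levels must be non-negative, control must be > 0")
    ratio = level_with / level_without
    if ratio >= fold_threshold:
        return "increases"
    if ratio <= 1.0 / fold_threshold:
        return "decreases"
    return "no clear change"


# Hypothetical readings, for illustration only.
print(classify_compound(30.0, 10.0))  # treated level is 3x the control
print(classify_compound(4.0, 10.0))   # treated level is 0.4x the control
print(classify_compound(11.0, 10.0))  # within the chosen threshold band
```

In practice the cutoff would be set from assay variability and replicate measurements rather than chosen in advance.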

Examples of morphogens whose levels may be determined according to the invention include OP-1, OP-2, GDF-1, Vgr-1, DPP, 60A, CBMP2A, CBMP2B, BMP 2, 3, 4, 6, or Vg1. Thus, if an immunoassay is used to indicate the presence and/or concentration of a morphogen, an antibody specific for one of these morphogens only, and which will not detect the presence of other morphogens, will be used. Similarly, if nucleic acid hybridization is used to indicate the level of RNA encoding the morphogen, a nucleotide probe specific for one of these morphogens only will be used under hybridization conditions such that the probe should not be capable of hybridizing with RNA encoding a different morphogen. A morphogen includes an active C-terminal core region, which includes at least six cysteine residues, and a region N-terminal to the C-terminal region that is relatively non-homologous to the equivalent N-terminal regions of other morphogens. In addition, the 3' noncoding region of the mRNA is unique to each morphogen. Thus, a nucleic acid probe encoding all or a portion of the sequences N-terminal to the C-terminal core region of a morphogen, or encoding all or a portion of the sequences C-terminal to or 3' to the core region of a morphogen may be used as a probe which detects mRNA encoding that morphogen only.

"Morphogenic proteins" or "morphogens", as used herein, include naturally-occurring osteogenic proteins capable of inducing the full developmental cascade of bone formation, as well as polypeptide chains not normally associated with bone or bone formation, but sharing substantial sequence homology with osteogenic proteins. Such proteins, as well as DNA sequences encoding them, have been isolated and characterized for a number of different species. See, for example, U.S. Patent No. 4,968,590 and U.S. Patent Number 5,011,691, U.S. application Serial Number 1989; 422,699, filed October 17, 1989, and 600,024 and 599,543, both filed October 18, 1990; Sampath et al., (1990) J. Biol. Chem. 265:13198-13205; Ozkaynak et al. (1990) EMBO J. 9:2085-2093; and Lee, Proc. Natl. Acad. Sci. 88:4250-4254 (1991), all of which are hereby incorporated by reference. Many of these proteins subsequently were discovered to have utility beyond bone morphogenesis. See Australian Patent Specification No. 660019, filed 11 March 1992. The mature forms of morphogens share substantial amino acid sequence homology, especially in the C-terminal core regions of the proteins. In particular, most of the proteins share a seven-cysteine skeleton in this region, in addition to other apparently required amino acids. Table II, infra, shows the amino acid sequence homologies for nine morphogens over the carboxy terminal 102 amino acids.

Among the morphogens useful in this invention are proteins originally identified as osteogenic proteins, such as the OP-1, OP-2 and CBMP2 proteins, as well as amino acid sequence-related proteins such as DPP (from Drosophila), Vg1 (from Xenopus), Vgr-1 (from mouse, see U.S. 5,011,691 to Oppermann et al.), GDF-1 (from mouse, see Lee (1991) PNAS 88:4250-4254), all of which are presented in Table II and Seq. ID Nos. 5-14, and the recently identified 60A protein (from Drosophila, Seq. ID No. 24, see Wharton et al. (1991) PNAS 88:9214-9218). The members of this family, which include members of the TGF-β super-family of proteins, share substantial amino acid sequence homology in their C-terminal regions. The proteins are translated as a precursor, having an N-terminal signal peptide sequence, typically less than about 30 residues, followed by a "pro" domain that is cleaved to yield the mature sequence. The signal peptide is cleaved rapidly upon translation, at a cleavage site that can be predicted in a given sequence using the method of Von Heijne ((1986) Nucleic Acids Research 14:4683-4691). Table I, below, describes the various morphogens identified to date, including their nomenclature as used herein, their Seq. ID references, and publication sources for the amino acid sequences for the full length proteins not included in the Seq. Listing. The disclosure of these publications is incorporated herein by reference.

TABLE I

"OP-1" refers generically to the group of morphogenically active proteins expressed from part or all of a DNA sequence encoding OP-1 protein, including allelic and species variants thereof, e.g., human OP-1 (Seq. ID No. 5, mature protein amino acid sequence), or mouse OP-1 (Seq. ID No. 6, mature protein amino acid sequence). The conserved seven cysteine skeleton is defined by residues 38 to 139 of Seq. ID Nos. 5 and 6. The cDNA sequences and the amino acids encoding the full length proteins are provided in Seq. ID Nos. 16 and 17 (hOP1) and Seq. ID Nos. 18 and 19 (mOP1). The mature proteins are defined by residues 293-431 (hOP1) and 292-430 (mOP1). The "pro" regions of the proteins, cleaved to yield the mature, morphogenically active proteins, are defined essentially by residues 30-292 (hOP1) and residues 30-291 (mOP1).

"OP-2" refers generically to the group of active proteins expressed from part or all of a DNA sequence encoding OP-2 protein, including allelic and species variants thereof, e.g., human OP-2 (Seq. ID No. 7, mature protein amino acid sequence) or mouse OP-2 (Seq. ID No. 8, mature protein amino acid sequence). The conserved seven cysteine skeleton is defined by residues 38 to 139 of Seq. ID Nos. 7 and 8. The cDNA sequences and the amino acids encoding the full length proteins are provided in Seq. ID Nos. 20 and 21 (hOP2) and Seq. ID Nos. 22 and 23 (mOP2). The mature proteins are defined essentially by residues 264-402 (hOP2) and 261-399 (mOP2). The "pro" regions of the proteins, cleaved to yield the mature, morphogenically active proteins, likely are defined essentially by residues 18-263 (hOP2) and residues 18-260 (mOP2). (Another cleavage site also occurs 21 residues upstream for both OP-2 proteins.)

"CBMP2" refers generically to the morphogenically active proteins expressed from a part or all of a DNA sequence encoding the CBMP2 proteins, including allelic and species variants thereof, e.g., human CBMP2A ("CBMP2A(fx)", Seq. ID No. 9) or human CBMP2B DNA ("CBMP2B(fx)", Seq. ID No. 10). The amino acid sequences for the full length proteins, referred to in the literature as BMP2A and BMP2B, or BMP2 and BMP4, appear in Wozney, et al. (1988) Science 242:1528-1534. The pro domain for BMP2 (BMP2A) likely includes residues 25-248 or 25-282; the mature protein, residues 249-396 or 283-396. The pro domain for BMP4 (BMP2B) likely includes residues 25-256 or 25-292; the mature protein, residues 257-408 or 293-408.

"DPP(fx)" refers to protein sequences encoded by the Drosophila DPP gene and defining the conserved seven cysteine skeleton (Seq. ID No. 11). The amino acid sequence for the full length protein appears in Padgett, et al. (1987) Nature 325:81-84. The pro domain likely extends from the signal peptide cleavage site to residue 456; the mature protein likely is defined by residues 457-588.

"Vg1(fx)" refers to protein sequences encoded by the Xenopus Vg1 gene and defining the conserved seven cysteine skeleton (Seq. ID No. 12). The amino acid sequence for the full length protein appears in Weeks (1987) Cell 51:861-867. The pro domain likely extends from the signal peptide cleavage site to residue 246; the mature protein likely is defined by residues 247-360.

"Vgr-1(fx)" refers to protein sequences encoded by the murine Vgr-1 gene and defining the conserved seven cysteine skeleton (Seq. ID No. 13). The amino acid sequence for the full length protein appears in Lyons, et al. (1989) PNAS 86:4554-4558. The pro domain likely extends from the signal peptide cleavage site to residue 299; the mature protein likely is defined by residues 300-438.

"GDF-1(fx)" refers to protein sequences encoded by the human GDF-1 gene and defining the conserved seven cysteine skeleton (Seq. ID No. 14). The cDNA and encoded amino acid sequence for the full length protein is provided in Seq. ID No. 32. The pro domain likely extends from the signal peptide cleavage site to residue 214; the mature protein likely is defined by residues 215-372.

"60A" refers generically to the morphogenically active proteins expressed from part or all of a DNA sequence (from the Drosophila 60A gene) encoding the 60A proteins (see Seq. ID No. 24 wherein the cDNA and encoded amino acid sequence for the full length protein is provided). "60A(fx)" refers to the protein sequences defining the conserved seven cysteine skeleton (residues 354 to 455 of Seq. ID No. 24). The pro domain likely extends from the signal peptide cleavage site to residue 324; the mature protein likely is defined by residues 325-455.

"BMP3(fx)" refers to protein sequences encoded by the human BMP3 gene and defining the conserved seven cysteine skeleton (Seq. ID No. 26). The amino acid sequence for the full length protein appears in Wozney et al. (1988) Science 242:1528-1534. The pro domain likely extends from the signal peptide cleavage site to residue 290; the mature protein likely is defined by residues 291-472.

"BMP5(fx)" refers to protein sequences encoded by the human BMP5 gene and defining the conserved seven cysteine skeleton (Seq. ID No. 27). The amino acid sequence for the full length protein appears in Celeste, et al. (1990) PNAS 87:9843-9847. The pro domain likely extends from the signal peptide cleavage site to residue 316; the mature protein likely is defined by residues 317-454.

"BMP6(fx)" refers to protein sequences encoded by the human BMP6 gene and defining the conserved seven cysteine skeleton (Seq. ID No. 28). The amino acid sequence for the full length protein appears in Celeste, et al. (1990) PNAS 87:9843-9847. The pro domain likely extends from the signal peptide cleavage site to residue 374; the mature sequence likely includes residues 375-513.

The OP-2 proteins have an additional cysteine residue in this region (e.g., see residue 41 of Seq. ID Nos. 7 and 8), in addition to the conserved cysteine skeleton in common with the other proteins in this family. The GDF-1 protein has a four amino acid insert within the conserved skeleton (residues 44-47 of Seq. ID No. 14) but this insert likely does not interfere with the relationship of the cysteines in the folded structure. In addition, the CBMP2 proteins are missing one amino acid residue within the cysteine skeleton.
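The residue ranges given in Table I can be cross-checked arithmetically. The sketch below, offered only as an illustration, records the stated "pro" and mature ranges for the OP-1 and OP-2 proteins as 1-based inclusive intervals and computes their lengths; the names `Domain`, `TABLE_I_RANGES` and `SEVEN_CYS_SKELETON` are hypothetical helpers introduced here, not part of the specification.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    """A residue range using 1-based, inclusive numbering."""
    start: int
    end: int

    def length(self):
        return self.end - self.start + 1

    def extract(self, sequence):
        """Slice the range out of a one-letter-code sequence string."""
        return sequence[self.start - 1:self.end]


# Ranges as stated in Table I (full length precursor numbering).
TABLE_I_RANGES = {
    "hOP1": {"pro": Domain(30, 292), "mature": Domain(293, 431)},
    "mOP1": {"pro": Domain(30, 291), "mature": Domain(292, 430)},
    "hOP2": {"pro": Domain(18, 263), "mature": Domain(264, 402)},
    "mOP2": {"pro": Domain(18, 260), "mature": Domain(261, 399)},
}

# Seven cysteine skeleton, numbered within the mature sequence
# (residues 38 to 139 of Seq. ID Nos. 5-8 per Table I).
SEVEN_CYS_SKELETON = Domain(38, 139)

for name, domains in TABLE_I_RANGES.items():
    pro, mature = domains["pro"], domains["mature"]
    # The mature chain should begin right after the pro region ends.
    contiguous = pro.end + 1 == mature.start
    print(name, "pro:", pro.length(), "mature:", mature.length(),
          "contiguous:", contiguous)

print("seven cysteine skeleton length:", SEVEN_CYS_SKELETON.length())
```

Under these ranges each mature OP-1 and OP-2 chain spans 139 residues and the seven cysteine skeleton spans 102 residues, in line with the "carboxy terminal 102 amino acids" referred to for Table II.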

The morphogens are inactive when reduced, but are active as oxidized homodimers and when oxidized in combination with other morphogens of this invention.

Thus, as defined herein, a morphogen is a dimeric protein comprising a pair of polypeptide chains, wherein each polypeptide chain comprises at least the C-terminal six cysteine skeleton defined by residues 43-139 of Seq. ID No. 5, including functionally equivalent arrangements of these cysteines (e.g., amino acid insertions or deletions which alter the linear arrangement of the cysteines in the sequence but not their relationship in the folded structure), such that, when the polypeptide chains are folded, the dimeric protein species comprising the pair of polypeptide chains has the appropriate three-dimensional structure, including the appropriate intra- and inter-chain disulfide bonds such that the protein is capable of acting as a morphogen as defined herein. Specifically, the morphogens generally are capable of the following biological functions in a morphogenically permissive environment: stimulating proliferation of progenitor cells; stimulating the differentiation of progenitor cells; stimulating the proliferation of differentiated cells; and supporting the growth and maintenance of differentiated cells, including the "redifferentiation" of transformed cells. In addition, it is also anticipated that these morphogens are capable of inducing redifferentiation of committed cells under appropriate environmental conditions.

Morphogens useful in this invention comprise one of two species of generic amino acid sequences: Generic Sequence 1 (Seq. ID No. 1) or Generic Sequence 2 (Seq. ID No. ), where each Xaa indicates one of the naturally-occurring L-isomer, α-amino acids or a derivative thereof. Generic Sequence 1 comprises the conserved six cysteine skeleton and Generic Sequence 2 comprises the conserved six cysteine skeleton plus the additional cysteine identified in OP-2 (see residue 36, Seq. ID No. ). In another preferred aspect, these sequences further comprise the following additional sequence at their N-terminus: Cys Xaa Xaa Xaa Xaa (Seq. ID No. 1).

Preferred amino acid sequences within the foregoing generic sequences include: Generic Sequence 3 (Seq. ID No. ), Generic Sequence 4 (Seq. ID No. ), Generic Sequence 5 (Seq. ID No. 30) and Generic Sequence 6 (Seq. ID No. 31), listed below. These Generic Sequences accommodate the homologies shared among the various preferred members of this morphogen family identified in Table II, as well as the amino acid sequence variation among them. Specifically, Generic Sequences 3 and 4 are composite amino acid sequences of the following proteins presented in Table II and identified in Seq. ID Nos. 5-14: human OP-1 (hOP-1, Seq. ID Nos. 5 and 16-17), mouse OP-1 (mOP-1, Seq. ID Nos. 6 and 18-19), human and mouse OP-2 (Seq. ID Nos. 7, 8, and 20-22), CBMP2A (Seq. ID No. 9), CBMP2B (Seq. ID No. 10), DPP (from Drosophila, Seq. ID No. 11), Vg1 (from Xenopus, Seq. ID No. 12), Vgr-1 (from mouse, Seq. ID No. 13), and GDF-1 (from mouse, Seq. ID No. 14). The generic sequences include both the amino acid identity shared by the sequences in Table II, as well as alternative residues for the variable positions within the sequence. Note that these generic sequences allow for an additional cysteine at position 41 or 46 in Generic Sequences 3 or 4, respectively, providing an appropriate cysteine skeleton where inter- or intramolecular disulfide bonds can form, and contain certain critical amino acids which influence the tertiary structure of the proteins.

Generic Sequence 3

Leu Tyr Val Xaa Phe Xaa Xaa Xaa Gly Trp   10
Xaa Xaa Trp Xaa Xaa Ala Pro Xaa Gly Xaa   20
Xaa Ala Xaa Tyr Cys Xaa Gly Xaa Cys Xaa   30
Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa   40
Asn His Ala Xaa Xaa Xaa Xaa Leu Xaa Xaa   50
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa   60
Cys Cys Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa   70
Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa   80
Val Xaa Leu Xaa Xaa Xaa Xaa Xaa Met Xaa   90
Val Xaa Xaa Cys Gly Cys Xaa               97

wherein each Xaa is independently selected from a group of one or more specified amino acids defined as follows: "Res." means "residue" and Xaa at res.4 = (Ser, Asp or Glu); Xaa at res.6 = (Arg, Gln, Ser or Lys); Xaa at res.7 = (Asp or Glu); Xaa at res.8 = (Leu or Val); Xaa at res.11 = (Gln, Leu, Asp, His or Asn); Xaa at res.12 = (Asp, Arg or Asn); Xaa at res.14 = (Ile or Val); Xaa at res.15 = (Ile or Val); Xaa at res.18 = (Glu, Gln, Leu, Lys, Pro or Arg); Xaa at res.20 = (Tyr or Phe); Xaa at res.21 = (Ala, Ser, Asp, Met, His, Leu or Gln); Xaa at res.23 = (Tyr, Asn or Phe); Xaa at res.26 = (Glu, His, Tyr, Asp or Gln); Xaa at res.28 = (Glu, Lys, Asp or Gln); Xaa at res.30 = (Ala, Ser, Pro or Gln); Xaa at res.31 = (Phe, Leu or Tyr); Xaa at res.33 = (Leu or Val); Xaa at res.34 = (Asn, Asp, Ala or Thr); Xaa at res.35 = (Ser, Asp, Glu, Leu or Ala); Xaa at res.36 = (Tyr, Cys, His, Ser or Ile); Xaa at res.37 = (Met, Phe, Gly or Leu); Xaa at res.38 = (Asn or Ser); Xaa at res.39 = (Ala, Ser or Gly); Xaa at res.40 = (Thr, Leu or Ser); Xaa at res.44 = (Ile or Val); Xaa at res.45 = (Val or Leu); Xaa at res.46 = (Gln or Arg); Xaa at res.47 = (Thr, Ala or Ser); Xaa at res.49 = (Val or Met); Xaa at res.50 = (His or Asn); Xaa at res.51 = (Phe, Leu, Asn, Ser, Ala or Val); Xaa at res.52 = (Ile, Met, Asn, Ala or Val); Xaa at res.53 = (Asn, Lys, Ala or Glu); Xaa at res.54 = (Pro or Ser); Xaa at res.55 = (Glu, Asp, Asn, or Gly); Xaa at res.56 = (Thr, Ala, Val, Lys, Asp, Tyr, Ser or Ala); Xaa at res.57 = (Val, Ala or Ile); Xaa at res.58 = (Pro or Asp); Xaa at res.59 = (Lys or Leu); Xaa at res.60 = (Pro or Ala); Xaa at res.63 = (Ala or Val); Xaa at res.65 = (Thr or Ala); Xaa at res.66 = (Gln, Lys, Arg or Glu); Xaa at res.67 = (Leu, Met or Val); Xaa at res.68 = (Asn, Ser or Asp); Xaa at res.69 = (Ala, Pro or Ser); Xaa at res.70 = (Ile, Thr or Val); Xaa at res.71 = (Ser or Ala); Xaa at res.72 = (Val or Met); Xaa at res.74 = (Tyr or Phe); Xaa at res.75 = (Phe, Tyr or Leu); Xaa at res.76 = (Asp or Asn); Xaa at res.77 = (Asp, Glu, Asn or Ser); Xaa at res.78 = (Ser, Gln, Asn or Tyr); Xaa at res.79 = (Ser, Asn, Asp or Glu); Xaa at res.80 = (Asn, Thr or Lys); Xaa at res.82 = (Ile or Val); Xaa at res.84 = (Lys or Arg); Xaa at res.85 = (Lys, Asn, Gln or His); Xaa at res.86 = (Tyr or His); Xaa at res.87 = (Arg, Gln or Glu); Xaa at res.88 = (Asn, Glu or Asp); Xaa at res.90 = (Val, Thr or Ala); Xaa at res.92 = (Arg, Lys, Val, Asp or Glu); Xaa at res.93 = (Ala, Gly or Glu); and Xaa at res.97 = (His or Arg);

Generic Sequence 4

Cys Xaa Xaa Xaa Xaa Leu Tyr Val Xaa Phe   10
Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa   20
Ala Pro Xaa Gly Xaa Xaa Ala Xaa Tyr Cys   30
Xaa Gly Xaa Cys Xaa Xaa Pro Xaa Xaa Xaa   40
Xaa Xaa Xaa Xaa Xaa Asn His Ala Xaa Xaa   50
Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa   60
Xaa Xaa Xaa Xaa Xaa Cys Cys Xaa Pro Xaa   70
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu Xaa Xaa   80
Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa   90
Xaa Xaa Xaa Met Xaa Val Xaa Xaa Cys Gly  100
Cys Xaa                                  102

wherein each Xaa is independently selected from a group of one or more specified amino acids as defined by the following: "Res." means "residue" and Xaa at res.2 = (Lys or Arg); Xaa at res.3 = (Lys or Arg); Xaa at res.4 = (His or Arg); Xaa at res.5 = (Glu, Ser, His, Gly, Arg or Pro); Xaa at res.9 = (Ser, Asp or Glu); Xaa at res.11 = (Arg, Gln, Ser or Lys); Xaa at res.12 = (Asp or Glu); Xaa at res.13 = (Leu or Val); Xaa at res.16 = (Gln, Leu, Asp, His or Asn); Xaa at res.17 = (Asp, Arg, or Asn); Xaa at res.19 = (Ile or Val); Xaa at res.20 = (Ile or Val); Xaa at res.23 = (Glu, Gln, Leu, Lys, Pro or Arg); Xaa at res.25 = (Tyr or Phe); Xaa at res.26 = (Ala, Ser, Asp, Met, His, Leu, or Gln); Xaa at res.28 = (Tyr, Asn or Phe); Xaa at res.31 = (Glu, His, Tyr, Asp or Gln); Xaa at res.33 = (Glu, Lys, Asp or Gln); Xaa at res.35 = (Ala, Ser or Pro); Xaa at res.36 = (Phe, Leu or Tyr); Xaa at res.38 = (Leu or Val); Xaa at res.39 = (Asn, Asp, Ala or Thr); Xaa at res.40 = (Ser, Asp, Glu, Leu or Ala); Xaa at res.41 = (Tyr, Cys, His, Ser or Ile); Xaa at res.42 = (Met, Phe, Gly or Leu); Xaa at res.44 = (Ala, Ser or Gly); Xaa at res.45 = (Thr, Leu or Ser); Xaa at res.49 = (Ile or Val); Xaa at res.50 = (Val or Leu); Xaa at res.51 = (Gln or Arg); Xaa at res.52 = (Thr, Ala or Ser); Xaa at res.54 = (Val or Met); Xaa at res.55 = (His or Asn); Xaa at res.56 = (Phe, Leu, Asn, Ser, Ala or Val); Xaa at res.57 = (Ile, Met, Asn, Ala or Val); Xaa at res.58 = (Asn, Lys, Ala or Glu); Xaa at res.59 = (Pro or Ser); Xaa at res.60 = (Glu, Asp, or Gly); Xaa at res.61 = (Thr, Ala, Val, Lys, Asp, Tyr, Ser or Ala); Xaa at res.62 = (Val, Ala or Ile); Xaa at res.63 = (Pro or Asp); Xaa at res.64 = (Lys or Leu); Xaa at res.65 = (Pro or Ala); Xaa at res.68 = (Ala or Val); Xaa at res.70 = (Thr or Ala); Xaa at res.71 = (Gln, Lys, Arg or Glu); Xaa at res.72 = (Leu, Met or Val); Xaa at res.73 = (Asn, Ser or Asp); Xaa at res.74 = (Ala, Pro or Ser); Xaa at res.75 = (Ile, Thr or Val); Xaa at res.76 = (Ser or Ala); Xaa at res.77 = (Val or Met); Xaa at res.79 = (Tyr or Phe); Xaa at res.80 = (Phe, Tyr or Leu); Xaa at res.81 = (Asp or Asn); Xaa at res.82 = (Asp, Glu, Asn or Ser); Xaa at res.83 = (Ser, Gln, Asn or Tyr); Xaa at res.84 = (Ser, Asn, Asp or Glu); Xaa at res.85 = (Asn, Thr or Lys); Xaa at res.87 = (Ile or Val); Xaa at res.89 = (Lys or Arg); Xaa at res.90 = (Lys, Asn, Gln or His); Xaa at res.91 = (Tyr or His); Xaa at res.92 = (Arg, Gln or Glu); Xaa at res.93 = (Asn, Glu or Asp); Xaa at res.95 = (Val, Thr or Ala); Xaa at res.97 = (Arg, Lys, Val, Asp or Glu); Xaa at res.98 = (Ala, Gly or Glu); and Xaa at res.102 = (His or Arg).
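A generic sequence of this kind can be read as a position-by-position pattern: fixed residues must match exactly, and each Xaa position accepts only its listed alternatives. The sketch below illustrates that reading for the first eight residues of Generic Sequence 3 (Leu Tyr Val Xaa Phe Xaa Xaa Xaa, with the alternatives stated above for res.4, 6, 7 and 8), converted to one-letter codes. The names `GS3_FIRST_EIGHT` and `matches_pattern` and the test peptides are hypothetical and are introduced only for illustration.

```python
# Allowed one-letter residues at positions 1-8 of Generic Sequence 3.
# Fixed positions hold one letter; Xaa positions hold every alternative.
GS3_FIRST_EIGHT = [
    "L",     # res.1 = Leu
    "Y",     # res.2 = Tyr
    "V",     # res.3 = Val
    "SDE",   # res.4 = (Ser, Asp or Glu)
    "F",     # res.5 = Phe
    "RQSK",  # res.6 = (Arg, Gln, Ser or Lys)
    "DE",    # res.7 = (Asp or Glu)
    "LV",    # res.8 = (Leu or Val)
]


def matches_pattern(peptide, pattern):
    """Return True if every residue of `peptide` is allowed at its position."""
    if len(peptide) != len(pattern):
        return False
    return all(res in allowed for res, allowed in zip(peptide, pattern))


# Illustrative peptides in one-letter code.
print(matches_pattern("LYVSFRDL", GS3_FIRST_EIGHT))  # all positions allowed
print(matches_pattern("LYVAFRDL", GS3_FIRST_EIGHT))  # Ala not listed at res.4
```

Extending the list to all 97 (or 102) positions would give a full membership test for a candidate chain, although sequence membership alone says nothing about folding or morphogenic activity.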

Similarly, Generic Sequence 5 (Seq. ID No. 30) and Generic Sequence 6 (Seq. ID No. 31) accommodate the homologies shared among all the morphogen protein family members identified in Table II. Specifically, Generic Sequences 5 and 6 are composite amino acid sequences of human OP-1 (hOP-1, Seq. ID Nos. 5 and 16-17), mouse OP-1 (mOP-1, Seq. ID Nos. 6 and 18-19), human and mouse OP-2 (Seq. ID Nos. 7, 8, and 20-22), CBMP2A (Seq. ID No. 9), CBMP2B (Seq. ID No. 10), DPP (from Drosophila, Seq. ID No. 11), Vg1 (from Xenopus, Seq. ID No. 12), Vgr-1 (from mouse, Seq. ID No. 13), and GDF-1 (from mouse, Seq. ID No. 14), human BMP3 (Seq. ID No. 26), human BMP5 (Seq. ID No. 27), human BMP6 (Seq. ID No. 28) and 60A (from Drosophila, Seq. ID Nos. 24-25). The generic sequences include both the amino acid identity shared by these sequences in the C-terminal domain, defined by the six and seven cysteine skeletons (Generic Sequences 5 and 6, respectively), as well as alternative residues for the variable positions within the sequence. As for Generic Sequences 3 and 4, Generic Sequences 5 and 6 allow for an additional cysteine at position 41 (Generic Sequence 5) or position 46 (Generic Sequence 6), providing an appropriate cysteine skeleton where inter- or intramolecular disulfide bonds can form, and containing certain critical amino acids which influence the tertiary structure of the proteins.

Generic Sequence 5

Leu Xaa Xaa Xaa Phe Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa Xaa Pro Xaa Xaa Xaa Xaa Ala Xaa Tyr Cys Xaa Gly Xaa Cys Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asn His Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Cys Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa Xaa Xaa Xaa Met Xaa Val Xaa Xaa Cys Xaa Cys Xaa

wherein each Xaa is independently selected from a group of one or more specified amino acids defined as follows: "Res." means "residue" and Xaa at res.2 = (Tyr or Lys); Xaa at res.3 = (Val or Ile); Xaa at res.4 = (Ser, Asp or Glu); Xaa at res.6 = (Arg, Gln, Ser, Lys or Ala); Xaa at res.7 = (Asp, Glu or Lys); Xaa at res.8 = (Leu, Val or Ile); Xaa at res.11 = (Gln, Leu, Asp, His, Asn or Ser); Xaa at res.12 = (Asp, Arg, Asn or Glu); Xaa at res.14 = (Ile or Val); Xaa at res.15 = (Ile or Val); Xaa at res.16 = (Ala or Ser); Xaa at res.18 = (Glu, Gln, Leu, Lys, Pro or Arg); Xaa at res.19 = (Gly or Ser); Xaa at res.20 = (Tyr or Phe); Xaa at res.21 = (Ala, Ser, Asp, Met, His, Gln, Leu or Gly); Xaa at res.23 = (Tyr, Asn or Phe); Xaa at res.26 = (Glu, His, Tyr, Asp, Gln or Ser); Xaa at res.28 = (Glu, Lys, Asp, Gln or Ala); Xaa at res.30 = (Ala, Ser, Pro, Gln or Asn); Xaa at res.31 = (Phe, Leu or Tyr); Xaa at res.33 = (Leu, Val or Met); Xaa at res.34 = (Asn, Asp, Ala, Thr or Pro); Xaa at res.35 = (Ser, Asp, Glu, Leu, Ala or Lys); Xaa at res.36 = (Tyr, Cys, His, Ser or Ile); Xaa at res.37 = (Met, Phe, Gly or Leu); Xaa at res.38 = (Asn, Ser or Lys); Xaa at res.39 = (Ala, Ser, Gly or Pro); Xaa at res.40 = (Thr, Leu or Ser); Xaa at res.44 = (Ile, Val or Thr); Xaa at res.45 = (Val, Leu or Ile); Xaa at res.46 = (Gln or Arg); Xaa at res.47 = (Thr, Ala or Ser); Xaa at res.48 = (Leu or Ile); Xaa at res.49 = (Val or Met); Xaa at res.50 = (His, Asn or Arg); Xaa at res.51 = (Phe, Leu, Asn, Ser, Ala or Val); Xaa at res.52 = (Ile, Met, Asn, Ala, Val or Leu); Xaa at res.53 = (Asn, Lys, Ala, Glu, Gly or Phe); Xaa at res.54 = (Pro, Ser or Val); Xaa at res.55 = (Glu, Asp, Asn, Gly, Val or Lys); Xaa at res.56 = (Thr, Ala, Val, Lys, Asp, Tyr, Ser, Ala, Pro or His); Xaa at res.57 = (Val, Ala or Ile); Xaa at res.58 = (Pro or Asp); Xaa at res.59 = (Lys, Leu or Glu); Xaa at res.60 = (Pro or Ala); Xaa at res.63 = (Ala or Val); Xaa at res.65 = (Thr, Ala or Glu); Xaa at res.66 = (Gln, Lys, Arg or Glu); Xaa at res.67 = (Leu, Met or Val); Xaa at res.68 = (Asn, Ser, Asp or Gly); Xaa at res.69 = (Ala, Pro or Ser); Xaa at res.70 = (Ile, Thr, Val or Leu); Xaa at res.71 = (Ser, Ala or Pro); Xaa at res.72 = (Val, Met or Ile); Xaa at res.74 = (Tyr or Phe); Xaa at res.75 = (Phe, Tyr, Leu or His); Xaa at res.76 = (Asp, Asn or Leu); Xaa at res.77 = (Asp, Glu, Asn or Ser); Xaa at res.78 = (Ser, Gln, Asn, Tyr or Asp); Xaa at res.79 = (Ser, Asn, Asp, Glu or Lys); Xaa at res.80 = (Asn, Thr or Lys); Xaa at res.82 = (Ile, Val or Asn); Xaa at res.84 = (Lys or Arg); Xaa at res.85 = (Lys, Asn, Gln, His or Val); Xaa at res.86 = (Tyr or His); Xaa at res.87 = (Arg, Gln, Glu or Pro); Xaa at res.88 = (Asn, Glu or Asp); Xaa at res.90 = (Val, Thr, Ala or Ile); Xaa at res.92 = (Arg, Lys, Val, Asp or Glu); Xaa at res.93 = (Ala, Gly, Glu or Ser); Xaa at res.95 = (Gly or Ala); and Xaa at res.97 = (His or Arg).

Generic Sequence 6

Cys Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Phe Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa Xaa Pro Xaa Xaa Xaa Xaa Ala Xaa Tyr Cys Xaa Gly Xaa Cys Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asn His Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Cys Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa Xaa Xaa Xaa Met Xaa Val Xaa Xaa Cys Xaa Cys Xaa

wherein each Xaa is independently selected from a group of one or more specified amino acids as defined by the following: "Res." means "residue" and Xaa at res.2 = (Lys, Arg, Ala or Gln); Xaa at res.3 = (Lys, Arg or Met); Xaa at res.4 = (His, Arg or Gln); Xaa at res.5 = (Glu, Ser, His, Gly, Arg, Pro, Thr, or Tyr); Xaa at res.7 = (Tyr or Lys); Xaa at res.8 = (Val or Ile); Xaa at res.9 = (Ser, Asp or Glu); Xaa at res.11 = (Arg, Gln, Ser, Lys or Ala); Xaa at res.12 = (Asp, Glu, or Lys); Xaa at res.13 = (Leu, Val or Ile); Xaa at res.16 = (Gln, Leu, Asp, His, Asn or Ser); Xaa at res.17 = (Asp, Arg, Asn or Glu); Xaa at res.19 = (Ile or Val); Xaa at res.20 = (Ile or Val); Xaa at res.21 = (Ala or Ser); Xaa at res.23 = (Glu, Gln, Leu, Lys, Pro or Arg); Xaa at res.24 = (Gly or Ser); Xaa at res.25 = (Tyr or Phe); Xaa at res.26 = (Ala, Ser, Asp, Met, His, Gln, Leu, or Gly); Xaa at res.28 = (Tyr, Asn or Phe); Xaa at res.31 = (Glu, His, Tyr, Asp, Gln or Ser); Xaa at res.33 = (Glu, Lys, Asp, Gln or Ala); Xaa at res.35 = (Ala, Ser, Pro, Gln or Asn); Xaa at res.36 = (Phe, Leu or Tyr); Xaa at res.38 = (Leu, Val or Met); Xaa at res.39 = (Asn, Asp, Ala, Thr or Pro); Xaa at res.40 = (Ser, Asp, Glu, Leu, Ala or Lys); Xaa at res.41 = (Tyr, Cys, His, Ser or Ile); Xaa at res.42 = (Met, Phe, Gly or Leu); Xaa at res.43 = (Asn, Ser or Lys); Xaa at res.44 = (Ala, Ser, Gly or Pro); Xaa at res.45 = (Thr, Leu or Ser); Xaa at res.49 = (Ile, Val or Thr); Xaa at res.50 = (Val, Leu or Ile); Xaa at res.51 = (Gln or Arg); Xaa at res.52 = (Thr, Ala or Ser); Xaa at res.53 = (Leu or Ile); Xaa at res.54 = (Val or Met); Xaa at res.55 = (His, Asn or Arg); Xaa at res.56 = (Phe, Leu, Asn, Ser, Ala or Val); Xaa at res.57 = (Ile, Met, Asn, Ala, Val or Leu); Xaa at res.58 = (Asn, Lys, Ala, Glu, Gly or Phe); Xaa at res.59 = (Pro, Ser or Val); Xaa at res.60 = (Glu, Asp, Gly, Val or Lys); Xaa at res.61 = (Thr, Ala, Val, Lys, Asp, Tyr, Ser, Ala, Pro or His); Xaa at res.62 = (Val, Ala or Ile); Xaa at res.63 = (Pro or Asp); Xaa at res.64 = (Lys, Leu or Glu); Xaa at res.65 = (Pro or Ala); Xaa at res.68 = (Ala or Val); Xaa at res.70 = (Thr, Ala or Glu); Xaa at res.71 = (Gln, Lys, Arg or Glu); Xaa at res.72 = (Leu, Met or Val); Xaa at res.73 = (Asn, Ser, Asp or Gly); Xaa at res.74 = (Ala, Pro or Ser); Xaa at res.75 = (Ile, Thr, Val or Leu); Xaa at res.76 = (Ser, Ala or Pro); Xaa at res.77 = (Val, Met or Ile); Xaa at res.79 = (Tyr or Phe); Xaa at res.80 = (Phe, Tyr, Leu or His); Xaa at res.81 = (Asp, Asn or Leu); Xaa at res.82 = (Asp, Glu, Asn or Ser); Xaa at res.83 = (Ser, Gln, Asn, Tyr or Asp); Xaa at res.84 = (Ser, Asn, Asp, Glu or Lys); Xaa at res.85 = (Asn, Thr or Lys); Xaa at res.87 = (Ile, Val or Asn); Xaa at res.89 = (Lys or Arg); Xaa at res.90 = (Lys, Asn, Gln, His or Val); Xaa at res.91 = (Tyr or His); Xaa at res.92 = (Arg, Gln, Glu or Pro); Xaa at res.93 = (Asn, Glu or Asp); Xaa at res.95 = (Val, Thr, Ala or Ile); Xaa at res.97 = (Arg, Lys, Val, Asp or Glu); Xaa at res.98 = (Ala, Gly, Glu or Ser); Xaa at res.100 = (Gly or Ala); and Xaa at res.102 = (His or Arg).
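A position-by-position definition of this kind lends itself to a simple membership check. The following is a minimal sketch only, written in Python and assuming just the first six positions of Generic Sequence 6 as listed above (position 5 is taken to be the entry that follows res.4 in the list). The names GENERIC_6_PREFIX, matches_generic and mismatches, and the six-residue test fragments, are illustrative and form no part of the specification.

```python
# Allowed residues at positions 1-6 of Generic Sequence 6, taken from the
# definition above. Positions 1 and 6 are the fixed residues Cys and Leu.
GENERIC_6_PREFIX = [
    {"Cys"},
    {"Lys", "Arg", "Ala", "Gln"},
    {"Lys", "Arg", "Met"},
    {"His", "Arg", "Gln"},
    {"Glu", "Ser", "His", "Gly", "Arg", "Pro", "Thr", "Tyr"},
    {"Leu"},
]


def mismatches(residues, pattern):
    """Return (1-based position, residue) pairs not allowed by the pattern."""
    return [
        (pos, res)
        for pos, (res, allowed) in enumerate(zip(residues, pattern), start=1)
        if res not in allowed
    ]


def matches_generic(residues, pattern):
    """True if the fragment has the pattern's length and every residue is allowed."""
    return len(residues) == len(pattern) and not mismatches(residues, pattern)


# Hypothetical fragments, written with three-letter codes.
print(matches_generic("Cys Lys Lys His Glu Leu".split(), GENERIC_6_PREFIX))
print(mismatches("Cys Lys Lys Trp Glu Leu".split(), GENERIC_6_PREFIX))
```

Extending the pattern to all 102 positions would only require listing the remaining residue sets in the same order.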

Particularly useful sequences for use as morphogens in this invention include the C-terminal domains, e.g., the C-terminal 96-102 amino acid residues of Vgl, Vgr-1, DPP, OP-1, OP-2, CBMP-2A, CBMP-2B, GDF-1 (see Table II, below, and Seq. ID Nos. 5-14), as well as proteins comprising the C-terminal domains of 60A, BMP3, BMP5 and BMP6 (see Seq. ID Nos. 24-28), all of which include at least the conserved six or seven cysteine skeleton. In addition, biosynthetic constructs designed from the generic sequences, such as COP-1, 3-5, 7, 16, disclosed in U.S. Pat. No. 5,011,691, also are useful. Other sequences include the inhibins/activin proteins (see, for example, U.S. Pat. Nos. 4,968,590 and 5,011,691).

Accordingly, other useful sequences are those sharing at least 70% amino acid sequence homology or "similarity", and preferably 80% homology or similarity, with any of the sequences above. These are anticipated to include allelic and species variants and mutants, and biosynthetic muteins, as well as novel members of this morphogenic family of proteins. Particularly envisioned in the family of related proteins are those proteins exhibiting morphogenic activity and wherein the amino acid changes from the preferred sequences include conservative changes, e.g., those as defined by Dayhoff et al. (Atlas of Protein Sequence and Structure, vol. 5, Suppl. 3, pp. 345-362, Dayhoff, ed., Nat'l BioMed. Research Fdn., Washington, D.C. 1979). As used herein, potentially useful sequences are aligned with a known morphogen sequence using the method of Needleman et al. ((1970) J. Mol. Biol. 48:443-453) and identities calculated by the Align program (DNAstar, Inc.).

"Homology" or "similarity" as used herein includes allowed conservative changes as defined by Dayhoff et al.

Morphogen sequences which are detectable according to the methods of the invention include but are not limited to those having greater than 60% identity, preferably greater than 65% identity, with the amino acid sequence defining the conserved six cysteine skeleton of hOP1 (e.g., residues 43-139 of Seq. ID No. 5). These most preferred sequences include both allelic and species variants of the OP-1 and OP-2 proteins, including the Drosophila 60A protein.
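The distinction drawn above between "identity" and "homology" (similarity) can be illustrated with a short calculation over two sequences that have already been aligned. The following is a sketch only, in Python. The conservative-substitution groups in CONSERVATIVE_GROUPS are an assumption made for illustration and are not the groups defined in the cited reference; the choice to count only ungapped aligned positions in the denominator is likewise an assumption, and the example sequences are hypothetical.

```python
# Hypothetical conservative-substitution groups (one-letter codes), chosen for
# illustration only; the specification defers to the cited reference for the
# actual definition of an allowed conservative change.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def is_similar(a, b):
    """True if two residues are identical or fall in the same assumed group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) over ungapped positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if gap not in (a, b)]
    if not pairs:
        return 0.0, 0.0
    identical = sum(a == b for a, b in pairs)
    similar = sum(is_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


# Two short hypothetical aligned fragments.
identity, similarity = percent_identity_and_similarity("CKKHELYV", "CRRHSLYV")
print(identity, similarity)
```

Because conservative substitutions count toward similarity but not identity, the similarity figure is never lower than the identity figure, which is consistent with the 70% homology and 60% identity thresholds being stated separately.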

Accordingly, morphogens which are detectable according to the invention include active proteins comprising species of polypeptide chains having the generic amino acid sequence herein referred to as "OPX", which accommodates the homologies between the various identified species of OP1 and OP2 (Seq. ID No. 29).

The morphogens detectable in the methods of this invention include proteins comprising any of the polypeptide chains described above, whether isolated from naturally-occurring sources, or produced by recombinant DNA or other synthetic techniques, and includes allelic and species variants of these proteins, naturally-occurring or biosynthetic mutants thereof, chimeric variants containing a domain(s) or region(s) of one family member functionally arranged with another domain(s) or region(s) of a second family member, as well as various truncated and fusion constructs. Deletion or insertion or addition mutants also are envisioned to be active, including those which may alter the conserved C-terminal cysteine skeleton, provided that the alteration does not functionally disrupt the relationship of these cysteines in the folded structure. Accordingly, such active forms are considered the equivalent of the specifically described constructs disclosed herein. The proteins may include forms having varying glycosylation patterns, varying N-termini, a family of related proteins having regions of amino acid sequence homology, and active truncated or mutated forms of native or biosynthetic proteins, produced by expression of recombinant DNA in host cells.

The morphogenic proteins can be expressed from intact or truncated cDNA or from synthetic DNAs in procaryotic or eucaryotic host cells, and purified, cleaved, refolded, and dimerized to form morphogenically active compositions.

Currently preferred host cells include E. coli or mammalian cells, such as CHO, COS or BSC cells. A detailed description of the morphogens detectable according to the methods of this invention is disclosed in Australian Patent Specification No. 660019, filed 11 March, 1992, the disclosure of which is incorporated herein by reference.

The screening method of the invention provides a simple method of determining a change in the level of morphogenic protein as a result of exposure of cultured cells to one or more compound(s). The level of a morphogenic protein in a given cell culture, or a change in that level resulting from exposure to one or more compound(s), indicates that direct application of the compound modulates the level of the morphogen expressed by the cultured cells. If, for example, a compound upregulated the production of OP-1 by a kidney cell line, it would then be desirable to test systemic administration of this compound in an animal model to determine if it upregulated the production of OP-1 in vivo. If this compound did upregulate the endogenous circulating levels of OP-1, it would be consistent with administration of the compound systemically for the purpose of correcting bone metabolism diseases such as osteoporosis. The level of morphogen in the body may be a result of a wide range of physical conditions, e.g., tissue degeneration such as occurs in diseases including arthritis, emphysema, osteoporosis, kidney diseases, lung diseases, cardiomyopathy, and cirrhosis of the liver. The level of morphogens in the body may also vary as a result of the normal process of aging.

A compound selected by the screening method of the invention as, for example, one which increases the level of morphogen in a tissue, may be consistent with the administration of the compound systemically or locally to a tissue for the purpose of preventing some form of tissue degeneration or for restoring the degenerated tissue to its normal healthy level.

A further advantage of the invention includes determining the tissue or tissues of origin of a given morphogen in order to administer a compound aimed at modulating the systemic level of morphogen for treatment of a disease or condition in which the level of morphogen production has become altered.

A further advantage of the invention, in particular of the screening method provided herein, is that it may be used to assay for the presence or absence of disease and/or damage to the tissue. More particularly, the pathologic depression or absence of a particular morphogen which is observable in a particular disease state of patients provides a basis for diagnosing that disease state by measuring the morphogen level to determine whether the morphogen level is depressed or, alternatively, whether the morphogen is absent altogether. For example, using the assay of the invention, it has been determined that OP-1, first found in bone and demonstrated to be osteoinductive, is synthesized primarily in kidney, bladder, and adrenal tissue. This surprising discovery, coupled with the observation that patients with kidney disease often exhibit loss of bone mass, suggests that the bone loss in these patients may be due to pathologic depression of OP-1 synthesis in kidney.

The inventors have also found that depression or even an absence of OP-1, in kidney at least, is diagnostic of renal disease and/or damage and, preferably, that said depression or absence of OP-1 is diagnostic of a renal disease and/or renal damage which is associated with bone loss.

Accordingly, a particularly preferred embodiment of the present invention provides a method of diagnosing renal tissue damage or disease comprising the step of measuring endogenous expression of OP-1 by renal tissue of a mammal, wherein a depression or absence of said endogenous expression relative to undamaged or undiseased mammalian renal tissue indicates a diagnosis that said mammal is afflicted with said damage or disease.

In an alternative embodiment, the invention provides a method of diagnosing renal tissue damage or disease comprising the steps of: contacting a sample of mammalian renal tissue with an agent for specifically detecting endogenous expression of OP-1 in said tissue; detecting whether OP-1 is expressed endogenously in said tissue or detecting a level of endogenous expression of OP-1 in said tissue; and diagnosing said damage or disease if said expression is depressed or absent, or comparing said level of endogenously expressed OP-1 in said tissue with a reference level of OP-1 endogenously expressed in undamaged or undiseased mammalian renal tissue to determine whether the endogenous expression of OP-1 in said tissue is depressed or absent relative to the undamaged or undiseased tissue.
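The comparison step recited above can be sketched as a small decision function. This is an illustration only, in Python: the function name assess_op1_expression, the detection limit and the cut-off ratio depressed_below are hypothetical parameters chosen for the sketch, since the specification states no numerical thresholds, and the input values are invented.

```python
def assess_op1_expression(sample_level, reference_level,
                          detection_limit=0.0, depressed_below=0.5):
    """Classify OP-1 expression in a sample relative to a reference tissue.

    sample_level:    measured OP-1 signal in the tissue under test
    reference_level: OP-1 signal in undamaged or undiseased renal tissue
    detection_limit: signal at or below which expression is treated as absent
                     (hypothetical value)
    depressed_below: sample/reference ratio below which expression is treated
                     as depressed (hypothetical value)
    """
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    if sample_level <= detection_limit:
        return "absent"
    if sample_level / reference_level < depressed_below:
        return "depressed"
    return "not depressed"


# Invented signal values, in arbitrary units.
for level in (0, 30, 90):
    print(level, assess_op1_expression(level, reference_level=100))
```

In practice the cut-offs would have to be established empirically for the detection agent and tissue in question.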

In a preferred embodiment, the agent for specifically detecting endogenous expression of OP-1 is an isolated nucleic acid molecule capable of hybridising to the coding region and/or non-coding region of a cDNA or RNA molecule specific for OP-1, or an anti-OP-1 antibody molecule, amongst others.

Optionally, the diagnostic method of the invention may further comprise the step(s) of measuring the expression of a gene other than the morphogen OP-1 gene in said tissue, to provide a reference against which the endogenous expression of [...] isolated nucleic acid molecule or antibody which is capable of binding to RNA or protein, respectively, encoded by a gene such as a transcriptional elongation factor gene, amongst others.

Brief Description of the Drawings

Figure 1 shows the fragments of OP-1 used as probes in Northern hybridizations useful in the processes of the invention.

Figure 2 shows results of Northern blot analysis of RNA using different OP-1-specific probes.

Figure 3 shows results of Northern blot analysis of RNA from different cell types probed with an OP-1 probe.

Detailed Description

The invention is based on the discovery of a family of structurally related morphogenic proteins (BMPs), also called osteogenic proteins (OPs), and more particularly that various of these proteins play an important role, not only in embryogenesis, but also in tissue and organ maintenance and repair in juvenile and adult mammals.

Morphogenic proteins which have been identified include BMP 2, 3, 4, 5, 6, OP-1 and OP-2 (murine and human), Vgr-1, Vgl, DPP, GDF-1, CBMP-2A, CBMP-2B, 60A, and the inhibin/activin class of proteins. Other recombinant proteins include COP1, COP3, COP4, COP5, COP7, and COP16.

While, as explained herein, the morphogens have significant homologies and similarities in structure, it is hypothesized that variants within the morphogenic protein genes may have specific roles in specific tissues involving, for example, stimulation of progenitor cell multiplication, tissue-specific or tissue-preferred phenotype maintenance, and/or stimulation or modulation of the rate of differentiation, growth or replication of tissue cells characterized by high turnover. The effect on the long-term physiology, maintenance and repair of particular tissues by particular species of the morphogens is currently unknown in any significant detail. However, methods useful in determining which particular tissues express which particular morphogen(s), and for finding changes which stimulate or depress morphogen expression in vivo, would enable discovery and development of strategies for therapeutic treatment of a large number of diseased states, and provide drugs designed to implement the strategy.

This invention provides such methods and, more specifically, two generic processes for obtaining data which ultimately will permit determination of structure/activity relationships of specific naturally occurring mammalian morphogens and drugs capable of modulating their production. Such structure/activity relationships of naturally occurring morphogens include a correlation between morphogen levels and disease or damage to a tissue. For example, using the assay of the invention, it has been determined that OP-1, first found in bone and demonstrated to be osteoinductive, is synthesized primarily in kidney, bladder, and adrenal tissue. This surprising discovery, coupled with the observation that patients with kidney disease often exhibit loss of bone mass, suggests that the bone loss in these patients may be due to pathologic depression of OP-1 synthesis in kidney, and suggests that administration of OP-1 systemically or stimulation of OP-1 expression and secretion by the kidney may arrest bone loss, or effect remineralization through increased bone formation (i.e., osteogenesis).

There are two fundamental aspects of the invention. One aspect involves an assay to determine tissues and cell types capable of synthesis and secretion of the morphogens; the other involves the use of the identified cell types configured in a screening system to find substances useful therapeutically to modulate, i.e., stimulate or depress, morphogen expression and/or secretion. The assay to determine the tissue of origin of a given morphogen involves screening a plurality (i.e., two or more) of different tissues by determining a parameter indicative of production of a morphogen in the tissue, and comparing the parameters. As stated supra, such a comparison of parameters may, in the case of a parameter comprising a disease state and/or tissue damage, be useful in the diagnosis of the disease state and/or tissue damage. The tissue(s) of origin will, of course, be the tissue that produces that morphogen.

The other assay of the invention involves screening candidate compounds for their ability to modulate the effective systemic or local concentration of a morphogen by incubating the compound with a cell culture that produces the morphogen, and assaying the culture for a parameter indicative of a change in the production level of the morphogen. Useful candidate compounds then may be tested for in vivo efficacy in a suitable animal model. These compounds then may be used in vivo to modulate effective morphogen concentrations in disease treatment.
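The comparison underlying such a screen can be sketched as a fold-change calculation between treated and untreated cultures. The following Python sketch is illustrative only: the function name classify_compound and the fold-change cut-offs up and down are hypothetical and are not stated in the specification, and the replicate readings are invented.

```python
from statistics import mean


def classify_compound(treated, control, up=1.5, down=0.67):
    """Compare morphogen readings from treated and control cultures.

    treated, control: replicate measurements of a parameter indicative of
                      morphogen production (e.g. band intensity)
    up, down:         hypothetical fold-change cut-offs for calling an effect
    Returns (fold change, call).
    """
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control mean must be positive")
    fold = mean(treated) / baseline
    if fold >= up:
        return fold, "upregulates"
    if fold <= down:
        return fold, "downregulates"
    return fold, "no clear effect"


# Invented replicate readings, arbitrary units.
print(classify_compound([30, 34, 32], [10, 10, 10]))
print(classify_compound([5, 5], [10, 10]))
```

A compound called "upregulates" or "downregulates" in such a screen would be a candidate for the in vivo testing described above; a real screen would also need replicate statistics, which this sketch omits.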

1. Morphogen Tissue Distribution

Morphogens are broadly distributed in developing and adult tissue. For example, DPP and 60A are expressed in both embryonic and developing Drosophila tissue. Vgl has been identified in Xenopus embryonic tissue. Vgr-1 transcripts have been identified in a variety of murine tissues, including embryonic and developing brain, lung, liver, kidney and calvaria (dermal bone) tissue. In addition, both CBMP2B and CBMP3 have been identified in lung tissue. Recently, Vgr-1 transcripts also have been identified in adult murine lung, kidney, heart, and brain tissue, with particularly high levels in the lung (see infra). GDF-1 has been identified in human adult cerebellum and in fetal brain tissue. In addition, recent Northern blot analyses indicate that OP-1 is encoded by multiple transcripts in different tissues. This potential alternative splicing is consistent with the hypothesis that the longer transcripts may encode additional proteins (e.g., bicistronic mRNA) and each form may be tissue or developmentally related.

OP-1 and the CBMP2 proteins, both first identified as bone morphogens, have been identified in mouse and human placenta, hippocampus, calvaria and osteosarcoma tissue as determined by identification of OP-1 and CBMP2-specific sequences in cDNA libraries constructed from these tissues (see USSN 422,699, incorporated herein by reference).

Additionally, the OP-1 protein is present in a variety of embryonic and developing tissues including kidney, liver, heart and brain as determined by Western blot analysis and immunolocalization (see infra). OP-1-specific transcripts also have been identified in both embryonic and developing tissues, most abundantly in developing kidney, bladder, adrenal and (see infra). OP-1 also has been identified as a mesoderm inducing factor present during embryogenesis.

Moreover, OP-1 has been shown to be associated with satellite cells in the muscle and associated with potential pluripotent stem cells in bone marrow following damage to adult murine endochondral bone, indicating its morphogenic role in tissue repair and regeneration. In addition, a novel protein, GDF-1, comprising a seven cysteine skeleton, has been identified in neural tissue (Lee, 1991, Proc. Natl. Acad. Sci. 88: 4250-4254).

Knowledge of the tissue distribution of a given morphogen may be useful in choosing a cell type for screening according to the invention, or for targeting that cell type or tissue type for treatment. The proteins (or their mRNA transcripts) are readily identified in different tissues using standard methodologies and minor modifications thereof in tissues where expression may be low. For example, protein distribution may be determined using standard Western blot analysis or immunocytochemical techniques, and antibodies specific to the morphogen or morphogens of interest. Similarly, the distribution of morphogen transcripts may be determined using standard Northern hybridization protocols and a transcript-specific probe and hybridization conditions.

2. Useful Morphogens

As defined herein, a protein is morphogenic if it is capable of inducing the developmental cascade of cellular and molecular events that culminate in the formation of new, organ-specific tissue and comprises at least the conserved C-terminal six cysteine skeleton or its functional equivalent (see supra).

Specifically, the morphogens generally are capable of all of the following biological functions in a morphogenically permissive environment: stimulating proliferation of progenitor cells; stimulating the differentiation of progenitor cells; stimulating the proliferation of differentiated cells; and supporting the growth and maintenance of differentiated cells, including the "redifferentiation" of transformed cells. Details of how the morphogens detectable according to the methods of this invention first were identified, as well as a description of how to make, use and test them for morphogenic activity, are disclosed in Australian Patent Specification No. 660019 filed 11 March, 1992, the disclosures of which are hereby incorporated by reference.

As disclosed therein, the morphogens may be purified from naturally-sourced material or recombinantly produced from procaryotic or eucaryotic host cells, using the genetic sequences disclosed therein. Alternatively, novel morphogenic sequences may be identified following the procedures disclosed therein.

Particularly useful proteins include those which comprise the naturally derived sequences disclosed in Table II. Other useful sequences include biosynthetic constructs such as those disclosed in U.S. Pat. 5,011,691, the disclosure of which is incorporated herein by reference (e.g., COP-1, COP-3, COP-4, COP-5, COP-7, and COP-16).

Accordingly, the morphogens detectable according to the methods and compositions of this invention also may be described by morphogenically active proteins having amino acid sequences sharing 70% or, preferably, 80% homology (similarity) with any of the sequences described above, where "homology" is as defined herein above.

The morphogens detectable according to the method of this invention also can be described by any of the 6 generic sequences described herein (Generic Sequences 1, 2, 3, 4, 5 and 6). Generic Sequences 1 and 2 also may include, at their N-terminus, the sequence Cys Xaa Xaa Xaa Xaa (Seq. ID No. 15).

Table II, set forth below, compares the amino acid sequences of the active regions of native proteins that have been identified as morphogens, including human OP-1 (hOP-1, Seq. ID Nos. 5 and 16-17), mouse OP-1 (mOP-1, Seq. ID Nos. 6 and 18-19), human and mouse OP-2 (Seq. ID Nos. 7, 8, and 20-23), CBMP2A (Seq. ID No. 9), CBMP2B (Seq. ID No. 10), BMP3 (Seq. ID No. 26), DPP (from Drosophila, Seq. ID No. 11), Vgl (from Xenopus, Seq. ID No. 12), Vgr-1 (from mouse, Seq. ID No. 13), GDF-1 (from mouse, Seq. ID Nos. 14, 32 and 33), 60A protein (from Drosophila, Seq. ID Nos. 24 and 25), BMP5 (Seq. ID No. 27) and BMP6 (Seq. ID No. 28). The sequences are aligned essentially following the method of Needleman et al. (1970) J. Mol. Biol., 48:443-453, calculated using the Align Program (DNAstar, Inc.). In the table, three dots indicate that the amino acid in that position is the same as the amino acid in hOP-1. Three dashes indicate that no amino acid is present in that position, and are included for purposes of illustrating homologies. For example, amino acid residue 60 of CBMP-2A and CBMP-2B is "missing". Of course, both these amino acid sequences in this region comprise Asn-Ser (residues 58, 59), with CBMP-2A then comprising Lys and Ile, whereas CBMP-2B comprises Ser and Ile.
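The dot-and-dash convention described above can be expanded mechanically against the hOP-1 reference row. The following Python sketch is illustrative only; the function name expand_row and the five-position example rows are hypothetical and are not taken from Table II.

```python
def expand_row(reference, row, same="...", gap="---"):
    """Expand a Table II-style row into the residues actually present.

    reference: reference-row entries (three-letter residues, or `gap`)
    row:       entries for the compared protein; each is a residue, `same`
               (identical to the reference at that position) or `gap`
               (no residue at that position)
    """
    if len(reference) != len(row):
        raise ValueError("rows must cover the same alignment positions")
    expanded = []
    for ref_res, entry in zip(reference, row):
        if entry == same:
            entry = ref_res      # copy the reference residue
        if entry == gap:
            continue             # nothing present at this position
        expanded.append(entry)
    return expanded


# Hypothetical five-position example.
reference = ["Cys", "Lys", "Lys", "His", "Glu"]
row = ["...", "Arg", "...", "---", "Asp"]
print(expand_row(reference, row))
```

Applied to a full row, the result is the ungapped sequence of the compared protein, which can then be counted or compared position by position.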

TABLE II

(Aligned sequences of hOP-1, mOP-1, hOP-2, mOP-2, DPP, Vgl, Vgr-1, CBMP-2A, CBMP-2B, BMP3, GDF-1, 60A, BMP5 and BMP6; the residue-by-residue alignment is not legible in this copy.)

**Between residues 56 and 57 of BMP3 is a Val residue; between residues 43 and 44 of GDF-1 lies the amino acid sequence Gly-Gly-Pro-Pro.

As is apparent from the foregoing amino acid sequence comparisons, significant amino acid changes can be made within the generic sequences while retaining the morphogenic activity. For example, while the GDF-1 protein sequence depicted in Table II shares only about 50% amino acid identity with the hOP1 sequence described therein, the GDF-1 sequence shares greater than 70% amino acid sequence homology (or "similarity") with the hOP1 sequence, where "homology" or "similarity" includes allowed conservative amino acid changes within the sequence as defined by Dayhoff et al., Atlas of Protein Sequence and Structure, supp. 3, pp. 345-362 (Dayhoff, ed., Nat'l BioMed. Res. Fd'n, Washington, D.C., 1979). The currently most preferred protein sequences detectable as morphogens in this invention include those having greater than 60% identity, preferably greater than 65% identity, with the amino acid sequence defining the conserved six cysteine skeleton of hOP1 (residues 43-139 of Seq. ID No. ). These most preferred sequences include both allelic and species variants of the OP-1 and OP-2 proteins, including the Drosophila 60A protein. Accordingly, in still another preferred aspect, the invention includes detection of morphogens comprising species of polypeptide chains having the generic amino acid sequence referred to herein as "OPX", which defines the seven cysteine skeleton and accommodates the identities between the various identified mouse and human OP1 and OP2 proteins. OPX is presented in Seq. ID No. 29. As described therein, each Xaa at a given position independently is selected from the residues occurring at the corresponding position in the C-terminal sequence of mouse or human OP1 or OP2 (see Seq. ID Nos. 5-8 and/or Seq. ID Nos. 16-23).
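
The distinction drawn above between percent identity and percent homology ("similarity") can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length, written in one-letter code with "-" for gaps. The conservative-substitution groups below are a simplified, hypothetical stand-in chosen for illustration only; they are not the Dayhoff definition on which the specification relies.

```python
# Hypothetical conservative-substitution groups (illustration only; the
# specification relies on the Dayhoff et al. definition instead).
CONSERVATIVE_GROUPS = [
    set("AGST"), set("ILVM"), set("FWY"), set("DE"), set("NQ"), set("KRH"),
]


def percent_identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # an identical residue also counts as similar
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy example: two residues are identical and two are conservative changes.
identity, similarity = percent_identity_and_similarity("KRDE", "KKDD")
print(identity, similarity)
```

Because every identical position is also counted as similar, the similarity figure can never fall below the identity figure, which is consistent with the GDF-1/hOP1 comparison given above.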

3. Tissue-Specific Expression of OP-1

Once a morphogen is identified in a tissue, its level may be determined either at the protein or nucleic acid level. By comparing the levels of production of a given morphogen among different tissues, it is possible to determine the tissue(s) of origin of that morphogen. OP-1 is one example of a morphogen having a tissue of origin, the kidney, which contains a cell type that can also be used to screen, according to the invention, different compounds for their potential effects on morphogen (OP-1) production.

The level of OP-1 varies among different tissue types. In order to screen compounds for their effect on the production of OP-1 by a given cell type, it may be desirable to determine which tissues produce levels of OP-1 which are sufficiently high to show a potential decrease and sufficiently low to show a potential increase in production. Different tissues may be screened at the RNA level as follows.

Any probe capable of hybridizing specifically to a transcript, and distinguishing the transcript of interest from other, related transcripts may be used. Because the morphogens to be detected in the methods of this invention share such high sequence homology in their C-terminal domain, the tissue distribution of a specific morphogen transcript may best be determined using a probe specific for the "pro" region of the immature protein and/or the N-terminal heterogeneous region of the mature protein.

Another useful probe sequence is the 3' non-coding region immediately following the stop codon. These portions of the sequence vary substantially among the morphogens of this invention, and accordingly, are specific for each protein. For example, a particularly useful Vgr-1-specific probe sequence is the PvuII-SacI fragment, a 265 bp fragment encoding both a portion of the pro region and the N-terminus of the mature sequence. Similarly, particularly useful mOP-1-specific probe sequences are the BstXI-BglI fragment, a 0.68 kb sequence that covers approximately two-thirds of the mOP-1 pro region; a StuI-StuI fragment, a 0.2 kb sequence immediately upstream of the 7-cysteine domain; and an EarI-PstI fragment, a 0.3 kb fragment containing the 3' untranslated sequence. Similar approaches may be used, for example, with hOP-1 (SEQ. ID NO.16) or human or mouse OP-2 (SEQ. ID NOS.20 and 22).

Using morphogen-specific oligonucleotide probes, morphogen transcripts can be identified in mammalian tissues, using standard methodologies well known to those having ordinary skill in the art. Briefly, total RNA from mouse embryos and organs from post-natal animals is prepared using the acid guanidine thiocyanate-phenol-chloroform method (Chomczynski et al., Anal. Biochem. 162:156-159, 1987). The RNA may be dissolved in TES buffer ( mM Tris-HCl, 1 mM EDTA, 0.1% SDS, pH 7.5) and treated with Proteinase K (approx. 1.5 mg per g tissue sample) at 45°C for 1 hr. Poly(A)+ RNA selection on oligo(dT)-cellulose (Type 7, Pharmacia LKB Biotechnology Inc., Piscataway, NJ) may be done in a batch procedure by mixing 0.1 g oligo(dT)-cellulose with 11 ml RNA solution (from 1 g tissue) in TES buffer and 0.5 M NaCl. Thereafter the oligo(dT)-cellulose is washed in binding buffer (0.5 M NaCl, 10 mM Tris-HCl, 1 mM EDTA, pH 7.5) and poly(A)+ RNA is eluted with water. Poly(A)+ RNA (5 or 15 µg/lane) is fractionated on 1 or 1.2% agarose-formaldehyde gels (Selden, in Current Protocols in Molecular Biology, Ausubel et al. eds., pp. 1-4, 8, 9, Greene Publishing and Wiley-Interscience, New York, 1991). 1 µl of 400 µg/ml ethidium bromide is added to each sample prior to heat denaturation (Rosen et al., Focus 12:23-24, 1990).
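
The per-gram quantities quoted above (approximately 1.5 mg Proteinase K per g tissue, and 0.1 g oligo(dT)-cellulose with 11 ml RNA solution per 1 g tissue) can be scaled to other sample sizes. The following is a minimal sketch that assumes simple linear scaling with tissue mass; that assumption, and the function and key names, are illustrative and are not stated in the specification.

```python
def scale_reagents(tissue_g):
    """Scale the per-gram quantities quoted in the text to a given tissue mass.

    Linear scaling is an assumption made for illustration.
    """
    if tissue_g <= 0:
        raise ValueError("tissue mass must be positive")
    return {
        "proteinase_k_mg": 1.5 * tissue_g,        # approx. 1.5 mg per g tissue
        "oligo_dt_cellulose_g": 0.1 * tissue_g,   # 0.1 g per 1 g tissue
        "rna_solution_ml": 11.0 * tissue_g,       # 11 ml per 1 g tissue
    }


for name, amount in scale_reagents(2.0).items():
    print(name, amount)
```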

Following electrophoresis, the gels are photographed and the RNA is blotted overnight onto Nytran nitrocellulose membranes (Schleicher & Schuell Inc., Keene, NH) with 10 x SSC. The membranes are baked at 80°C for 30-60 min. and irradiated with UV light (1 mW/cm² for 25 sec.). The Northern hybridization conditions may be as previously described (Ozkaynak et al., EMBO J. 9:2085-2093, 1990). For re-use, the filters may be deprobed in 1 mM Tris-HCl, 1 mM EDTA, 0.1% SDS, pH 7.5, at 90-95°C and exposed to film to assure complete removal of previous hybridization signals.

One probe which may be used to screen for transcripts encoding a morphogen includes a portion of or the complete OP-1 cDNA, which may be used to detect the presence of OP-1 mRNA or mRNAs of related morphogens. The sequence of the murine cDNA gene is set forth in SEQ ID NO:14.

OP-1 mRNA expression was analyzed in 17 day mouse embryos and 3 day post-natal mice by sequentially hybridizing filters with various probes. Probes from regions other than the highly conserved 7-cysteine domain were selected because these regions are highly variable among members of the TGF-β superfamily. Fig. 1 shows the fragments of OP-1 used as probes in the Northern hybridizations. The solid box indicates the putative signal peptide and the hatched box corresponds to the TGF-β-like domain that contains the seven cysteine residues. Asterisks indicate the potential N-glycosylation sites. The arrow marks the location of the cleavage site for OP-1 maturation. Three solid bars below the diagram indicate the OP-1 specific fragments used in making 32P-labeled probes (0.68 kb BstXI-BglI fragment, 0.20 kb StuI-StuI fragment and 0.34 kb EarI-PstI non-coding fragment).

Hybridization with a probe that covers approximately two thirds of the pro region (the 0.68 kb BstXI-BglI fragment) reveals a 4 kb message and 3 messages at 1.8 kb, 2.2 kb and 2.4 kb (Fig. 2B and D, and Fig. ). In the Northern hybridization of Fig. 2, equal amounts (15 µg) of poly(A)+ RNA were loaded into each lane, electrophoresed on a 1% agarose-formaldehyde gel, blotted and hybridized. A 0.24-9.49 kb RNA ladder (Bethesda Research Labs, Inc.) was used as size standard. The same filter was used for sequential hybridizations with labeled probes specific for OP-1 (Panels B and D), Vgr-1 (Panel C), and EF-Tu (Panel A). Panel A: the EF-Tu specific probe (a control) was the 0.4 kb HindIII-SacI fragment (part of the coding region); the SacI site used belonged to the vector. Panel B: the OP-1 specific probe was the 0.68 kb BstXI-BglI fragment (two thirds of the pro region and upstream sequences of the mature domain, not including any sequences from the 7-cysteine domain). Panel C: the Vgr-1 specific probe was the 0.26 kb PvuII-SacI fragment (part of the pro region and the amino-terminal sequences of the mature domain, including the first cysteine) (Lyons et al., 1989, Proc. Natl. Acad. Sci. 86: 4554, hereby incorporated by reference). Panel D: the OP-1 (3' flanking) specific probe was the 0.34 kb EarI-PstI fragment (3' untranslated sequences immediately following the sequences encoding OP-1).
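
Message sizes such as the 1.8 kb, 2.2 kb, 2.4 kb and 4 kb species are read against the RNA ladder used as size standard. A common way to do this, sketched below, is to fit log10(size) against migration distance for the ladder bands and interpolate the unknown band. The specification does not state how sizes were estimated, so the semi-log fit, the function names, and the migration distances in the example are all assumptions made for illustration.

```python
import math


def fit_ladder(ladder):
    """Least-squares fit of log10(size_kb) = a + b * migration_mm.

    ladder: list of (size_kb, migration_mm) pairs for the size-standard bands.
    Returns the intercept a and slope b.
    """
    n = len(ladder)
    if n < 2:
        raise ValueError("need at least two ladder bands")
    xs = [migration for _, migration in ladder]
    ys = [math.log10(size) for size, _ in ladder]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("ladder bands must have distinct migration distances")
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate_size_kb(migration_mm, a, b):
    """Estimate transcript size (kb) from its migration distance."""
    return 10 ** (a + b * migration_mm)


# Hypothetical ladder readings (size in kb, migration in mm).
a, b = fit_ladder([(10.0, 0.0), (1.0, 10.0), (0.1, 20.0)])
print(round(estimate_size_kb(5.0, a, b), 2))
```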

In Fig. 3, the tissues to be used for RNA preparation were obtained from two week old mice (Panel A) or 5 week old mice (Panel B), with the exception of poly A+ RNA which was obtained from kidney adrenal gland of two week old mice (Panel B). Equal amounts of poly A+ RNA ( µg for Panel A and 5 µg for Panel B) were loaded into each well. After electrophoresis (agarose-formaldehyde gels) and blotting, RNA was hybridized to the OP-1 specific 3' flanking probe described in the legend of Fig. 2 (Panel D). The 0.24-9.5 kb RNA ladder was used as size standard. The arrowheads indicate the OP-1 specific messages. The lower sections of Panels A and B show the hybridization pattern obtained with the EF-Tu specific probe (a control).

Although the size of the Vgr-1 specific message is close to the 4 kb OP-1 species (Fig. 2 Panel C), the OP-1 4 kb mRNA is somewhat larger. To further rule out cross-hybridization with a non-OP-1 message, the 0.2 kb StuI-StuI fragment, which represents the gene specific sequences immediately upstream of those encoding the 7-cysteine domain, was used. This probe gave a hybridization pattern similar to the one shown in Fig. 2 Panel B (data not shown). A third probe, the 0.34 kb EarI-PstI fragment containing 3' untranslated sequences, also confirmed the pattern (Fig. 2 Panel D). Thus, the same four OP-1 specific messages were observed with three distinct probes.

The appearance of a new 4 kb OP-1 mRNA species was initially interpreted as cross hybridization of the OP-1 probe with Vgr-1 mRNA, which is approximately this size (Fig. 2 Panel C). However, the 4 kb message was detected with three different OP-1 specific probes, including one specific to the 3' untranslated region, and moreover it was separated from Vgr-1 message on the basis of size. Most likely, therefore, the 4 kb mRNA (and the three species of 1.8 kb, 2.2 kb and 2.4 kb) results from alternative splicing of OP-1 transcripts. The 4 kb OP-1 mRNA could also represent a bicistronic mRNA. The 4 kb message is a minor species in kidney, while it is more prominent in adrenal tissue.

The level of OP-1 expression was compared in different tissues using poly(A)+ RNA prepared from brain, spleen, lung, kidney and adrenal gland, heart, and liver of 13 day post-natal mice. The RNA was analyzed on Northern blots by hybridization to various probes (Fig. 3). Equal amounts of mRNA, as judged by optical density, were fractionated on agarose formaldehyde gels. Ethidium bromide staining of the gels revealed some residual ribosomal RNA in addition to the mRNA and provided another assurance that the mRNA was not degraded and that there was not significant quantitative or qualitative variation in the preparation. As control for mRNA recovery, EF-Tu (translational elongation factor) mRNA was probed (assuming uniform expression of EF-Tu in most tissues). A great variation in the level of OP-1 expression was observed in spleen, lung, kidney and adrenal tissues whereas EF-Tu mRNA levels appeared relatively constant in these tissues (Fig. 3 Panel A). The highest level of OP-1 mRNA was found in the kidneys. Uniformly lower levels of EF-Tu mRNA were found in brain, heart and liver (Fig. 3 Panel A).
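
The use of EF-Tu as a recovery control can be expressed numerically when band intensities are available. The sketch below divides each tissue's OP-1 signal by its EF-Tu signal and ranks tissues by the resulting ratio. The specification compares the blots by inspection and reports no densitometry, so the function names and the intensity values in the example are hypothetical.

```python
def normalize_to_control(target_signal, control_signal):
    """Return the target/control signal ratio for each tissue.

    Tissues with a missing or non-positive control signal are skipped,
    because no meaningful ratio can be formed for them.
    """
    ratios = {}
    for tissue, target in target_signal.items():
        control = control_signal.get(tissue, 0.0)
        if control <= 0:
            continue
        ratios[tissue] = target / control
    return ratios


def rank_tissues(ratios):
    """Tissue names ordered from highest to lowest normalized signal."""
    return sorted(ratios, key=ratios.get, reverse=True)


# Hypothetical band intensities in arbitrary units.
op1 = {"kidney": 90.0, "lung": 20.0, "spleen": 10.0}
ef_tu = {"kidney": 100.0, "lung": 100.0, "spleen": 100.0}
print(rank_tissues(normalize_to_control(op1, ef_tu)))
```

Normalizing in this way only helps where the control itself is uniform; the lower EF-Tu levels noted for brain, heart and liver would inflate the ratios for those tissues.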

Additional analysis of OP-1 mRNA showed the presence of significant amounts of OP-1 mRNA in the bladder (data not shown). In summary, next to kidney, bladder and adrenal tissue, brain tissue contained the highest levels of OP-1 RNA, whereas heart and liver did not give detectable signals.

OP-1 mRNA patterns display qualitative changes in the various tissues. Of the four messages found in brain, the 2.2 kb message is most abundant whereas in lung and spleen the 1.8 kb message predominates. Levels of the 1.8-2.4 kb OP-1 mRNA in the kidney are approximately two times higher in 3 day post-natal mice than in 17 day embryos, perhaps reflecting phases in bone and/or kidney development. mRNA was also prepared from carefully separated renal and adrenal tissues of 5 week old mice. Northern blot analysis (Figure 3 Panel B) revealed that the high levels of 2.2 kb mRNA were derived from renal tissue whereas the 4 kb mRNA was more prominent in adrenal tissue.

The detection of OP-1 message primarily in the kidney but also in bladder links OP-1 expression specifically with the urinary tract. Interestingly, the related Vgr-1 is also expressed at significant levels in kidney although its main site of expression is in lung.

Once the tissue-specific expression of a given morphogen is known, cell types known to exist in that tissue or cell lines derived from that tissue can be screened, in a similar manner, to identify the cell type within that tissue that is actually responsible for the tissue specific synthesis and secretion of the morphogen.

Once a cell type which produces the morphogen in an amount sufficient to detect increases or decreases in the production level of the morphogen upon exposure to a compound is identified, it may be used in a tissue culture assay to rapidly screen for the ability of a compound to upregulate or downregulate the synthesis and secretion of the morphogen. The level of morphogen production by the chosen cell type is determined with and without incubating the cell in culture with the compound, in order to assess the effects of the compound on the cell's ability to synthesize or secrete the morphogen. This can be accomplished by detection of the level of production of the morphogen either at the protein or mRNA level.

4. Growth of Cells in Culture

Cell cultures derived from kidney, adrenals, urinary bladder, brain, or other organs, may be prepared as described widely in the literature. For example, kidneys may be explanted from neonatal, new born, young or adult rodents (mouse or rat) and used in organ culture as whole or sliced (1-4 mm) tissues. Primary tissue cultures and established cell lines, also derived from kidney, adrenals, urinary bladder, brain, or other tissues may be established in multiwell plates (6 well, 24 well, or 96 well) according to conventional cell culture techniques, and are cultured in the absence or presence of serum for a period of time (1-7 days). Cells may be cultured, for example, in Dulbecco's Modified Eagle medium (Gibco, Long Island, NY) containing serum (e.g., fetal calf serum at 1%-10%, Gibco) or in serum-deprived medium, as desired, or in defined medium (e.g., containing insulin, transferrin, glucose, albumin, or other growth factors).


Samples for testing the level of morphogen production include culture supernatants or cell lysates, collected periodically and evaluated for OP-1 production by immunoblot analysis, or a portion of the cell culture itself, collected periodically and used to prepare polyA+ RNA for RNA analysis (Sambrook et al., eds., Molecular Cloning, 1989, Cold Spring Harbor Press, Cold Spring Harbor, NY). To monitor de novo OP-1 synthesis, some cultures are labeled with a 35S-methionine/35S-cysteine mixture for 6-24 hours and then evaluated for morphogen production by conventional immunoprecipitation methods (Sambrook et al., eds., Molecular Cloning, 1989, Cold Spring Harbor Press, Cold Spring Harbor, NY).

Alternatively, the production of morphogen or determination of the level of morphogen production may be ascertained using a simple assay for a parameter of cell growth, e.g., cellular proliferation or death. For example, where a morphogen is produced by a cultured cell line, the addition of antibody specific for the morphogen may result in relief from morphogen inhibition of cell growth. Thus, measurement of cellular proliferation can be used as an indication of morphogen production by a tissue.

5. Determination of Level of Morphogenic Protein

In order to quantitate the production of a morphogenic protein by a cell type, an immunoassay may be performed to detect the morphogen using a polyclonal or monoclonal antibody specific for that morphogen. For example, OP-1 may be detected using a polyclonal antibody specific for OP-1 in an ELISA, as follows.

1 ug/100 ul of affinity-purified polyclonal rabbit IgG specific for OP-1 is added to each well of a 96-well plate and incubated at 37°C for an hour. The wells are washed four times with 0.16M sodium borate buffer with 0.15 M NaCl (BSB), pH 8.2, containing 0.1% Tween 20. To minimize non-specific binding, the wells are blocked by filling completely with 1% bovine serum albumin (BSA) in BSB for 1 hour at 37°C. The wells are then washed four times with BSB containing 0.1% Tween 20. A 100 ul aliquot of an appropriate dilution of each of the test samples of cell culture supernatant is added to each well in triplicate and incubated at 37°C for 30 min. After incubation, 100 ul biotinylated rabbit anti-OP-1 serum (stock solution is about 1 mg/ml and diluted 1:400 in BSB containing 1% BSA before use) is added to each well and incubated at 37°C for 30 min. The wells are then washed four times with BSB containing 0.1% Tween 20. 100 ul streptavidin-alkaline phosphatase (Southern Biotechnology Associates, Inc., Birmingham, Alabama, diluted 1:2000 in BSB containing 0.1% Tween 20 before use) is added to each well and incubated at 37°C for 30 min. The plates are washed four times with 0.5M Tris buffered saline (TBS), pH 7.2. Substrate (ELISA Amplification System Kit, Life Technologies, Inc., Bethesda, MD) is added to each well and incubated at room temperature for 15 min. Then, 50 ul amplifier (from the same amplification system kit) is added and incubated for another 15 min at room temperature. The reaction is stopped by the addition of 50 ul 0.3 M sulphuric acid. The OD at 490 nm of the solution in each well is recorded. To quantitate OP-1 in culture media, an OP-1 standard curve is performed in parallel with the test samples.
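
The final quantitation step, reading test samples against an OP-1 standard curve run in parallel, can be sketched as follows. The sketch averages the triplicate OD readings, interpolates linearly between neighbouring standards, and multiplies by the sample dilution factor. The specification does not say how the curve is fitted, so piecewise-linear interpolation, the function names, and the example values are assumptions made for illustration.

```python
from statistics import mean


def interpolate_concentration(od, standards):
    """Piecewise-linear interpolation of concentration from an OD reading.

    standards: list of (concentration, od) pairs from the standard curve.
    Readings outside the range of the standards are rejected, not extrapolated.
    """
    if len(standards) < 2:
        raise ValueError("need at least two standards")
    points = sorted(standards, key=lambda p: p[1])
    if od < points[0][1] or od > points[-1][1]:
        raise ValueError("OD outside the range of the standard curve")
    for (c_lo, od_lo), (c_hi, od_hi) in zip(points, points[1:]):
        if od_lo <= od <= od_hi:
            if od_hi == od_lo:
                return (c_lo + c_hi) / 2
            fraction = (od - od_lo) / (od_hi - od_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("OD could not be placed on the standard curve")


def quantify_sample(triplicate_ods, standards, dilution_factor):
    """Mean of the triplicate wells, read off the curve, corrected for dilution."""
    return interpolate_concentration(mean(triplicate_ods), standards) * dilution_factor


# Hypothetical standards as (concentration, OD at 490 nm); units are arbitrary.
standards = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
print(quantify_sample([0.70, 0.75, 0.80], standards, dilution_factor=2))
```

Rejecting out-of-range readings, rather than extrapolating, reflects the instruction above to test "an appropriate dilution" of each sample: a sample that falls off the curve would be re-run at a different dilution.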


6. Preparation of Polyclonal Antibody

Polyclonal antibody is prepared as follows. Each rabbit is given a primary immunization of 100 ug/500 ul E. coli-produced OP-1 monomer (amino acids 328-431 of SEQ. ID NO: 11) in 0.1% SDS mixed with 500 ul Complete Freund's Adjuvant. The antigen is injected subcutaneously at multiple sites on the back and flanks of the animal. The rabbit is boosted after a month in the same manner using incomplete Freund's Adjuvant. Test bleeds are taken from the ear vein seven days later. Two additional boosts and test bleeds are performed at monthly intervals until antibody against OP-1 is detected in the serum using an ELISA assay. Then, the rabbit is boosted monthly with 100 ug of antigen and bled (15 ml per bleed) at days seven and ten after boosting.

7. Preparation of Monoclonal Antibody and Neutralizing Monoclonal Antibody

Monoclonal antibody specific for a given morphogen may be prepared as follows. A mouse is given two injections of E. coli produced OP-1 monomer (amino acids 328-431 in SEQ ID NO:11). The first injection contains 100 ug of OP-1 in complete Freund's adjuvant and is given subcutaneously. The second injection contains 50 ug of OP-1 in incomplete adjuvant and is given intraperitoneally. The mouse then receives a total of 230 ug of OP-1 (amino acids 307-431 of SEQ ID NO:11) in four intraperitoneal injections at various times over an eight month period. One week prior to fusion, the mouse is boosted intraperitoneally with 100 ug of OP-1 (15-139) and 30 ug of the N-terminal peptide (Ser293-Asn309-Cys) conjugated through the added Cys residue to bovine serum albumin with SMCC crosslinking agent. This boost is repeated five days (IP), four days (IP), three days (IP) and one day (IV) prior to fusion. The mouse spleen cells are then fused to myeloma (e.g., 653) cells at a ratio of 1:1 using PEG 1500 (Boehringer Mannheim), and the cell fusion is plated and screened for OP-1-specific antibodies using OP-1 (307-431) as antigen. The cell fusion and monoclonal screening are according to procedures widely available in the art. The neutralizing monoclonal is identified by its ability to block the biological activity of OP-1 when added to a cellular assay which responds biologically to added OP-1.

8. Identification of OP-1 Producing Cell Line Which Displays OP-1 Surface Receptors

During the process of routinely testing the effects of increasing concentrations of OP-1 and TGF-β on the proliferation of various cell lines, a cell line was identified which, surprisingly, appears not only to synthesize and secrete OP-1, but also to display cell surface receptors to which the secreted OP-1 binds and acts to inhibit proliferation of the cells. This cell line was identified after the following observations. Addition of increasing concentrations of OP-1 or TGF-β failed to increase or decrease the relatively low basal rate of proliferation of the cells. However, addition of a monoclonal antibody, which neutralizes the activity of OP-1, resulted in a large increase in the proliferation of the cells. In addition, simultaneous addition of the same quantity of OP-1 neutralizing monoclonal to a fixed amount of OP-1 resulted in an increase in proliferation which was intermediate between the low basal level observed with OP-1 alone and the high level observed with the monoclonal alone. This cell line, which is an epithelial cell line that was derived from a bladder cell carcinoma, may be used in an assay of the invention. The parameter to be tested according to the invention is cellular proliferation. Thus, a compound(s) that increases or decreases the level of OP-1 production may be tested on this cell line as follows.

9. Assay for Identifying Drugs Which Affect OP-1 Synthesis

A simple medium flux screening assay can be configured in a standard 24 or 96 well microtiter dish, in which each well contains a constant number of a cell line having the characteristics described above. Increasing concentrations of an OP-1 neutralizing monoclonal antibody are added from left to right across the dish. A constant amount of different test substances is added from top to bottom on the dish. An increase in the synthesis and secretion of OP-1 (over its constitutive (non-induced) level) will be indicated by an increase in the amount of OP-1 neutralizing antibody required to release the cells from the antimitogenic activity of OP-1. A decrease in the synthesis and secretion of OP-1 (below its constitutive (repressed) level) will be indicated by the observation that decreased concentrations of the OP-1 neutralizing monoclonal antibody will be required to release the cells from the antimitogenic activity of OP-1. One of the major advantages of this assay is that the end point, the dilution of antibody which has an effect on cell proliferation, is a measure of mitosis, or an increase in the number of cells per well. Because several convenient and rapid assays exist for quantitating cell numbers, this assay is faster and requires significantly fewer steps to perform.

The assay may be performed as follows. After addition of appropriate concentrations of the OP-1 neutralizing monoclonal antibody and test substances to the wells containing the cells, the dishes are placed in an incubator at 37°C for a period of 1-3 days. After completion of the incubation/growth period, the dishes are removed and the cells in the individual wells are washed and stained with a vital stain, such as crystal violet. Washing and staining procedures are well-known in the art. The cells are then lysed and the stain dissolved in a constant amount of a solvent, such as ethanol. Quantitation of the dissolved stain, which is readily performed on an automated plate reader, allows for direct quantitation of the number of cells in each well.
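
The read-out of this assay, the amount of neutralizing antibody needed to release the cells from OP-1 inhibition, can be sketched as a small analysis of one row of plate-reader values. For each test substance, the sketch finds the lowest antibody concentration at which the stain reading reaches a chosen threshold, and compares it with an untreated control row. The threshold, the function names, and the readings in the example are hypothetical; the specification does not prescribe a particular numerical criterion.

```python
def release_concentration(antibody_concs, ods, threshold):
    """Lowest antibody concentration whose well reaches the threshold reading.

    Returns None if no well in the row reaches the threshold.
    """
    for conc, od in sorted(zip(antibody_concs, ods)):
        if od >= threshold:
            return conc
    return None


def compare_to_control(control_release, test_release):
    """Interpret a test row against the control row, following the text above."""
    if control_release is None:
        raise ValueError("control row never reached the threshold")
    if test_release is None:
        return "not released"
    if test_release > control_release:
        return "OP-1 production increased"  # more antibody needed
    if test_release < control_release:
        return "OP-1 production decreased"  # less antibody needed
    return "no change detected"


# Hypothetical readings across increasing antibody concentrations.
concs = [0, 1, 2, 4, 8]
control = release_concentration(concs, [0.2, 0.2, 0.6, 0.9, 1.0], threshold=0.5)
treated = release_concentration(concs, [0.2, 0.2, 0.2, 0.3, 0.7], threshold=0.5)
print(compare_to_control(control, treated))
```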

The above-described assay has the advantages of being rapid and easy to perform because it requires few steps. Another advantage is intrinsic to the assay: drugs which are screened according to this procedure that result in cell death (e.g., cytotoxic substances) are immediately identifiable without the need of operator observation. In addition, although drugs that stop the growth of the cells (e.g., cytostatic substances) are scored as positive due to failure to see increases in cell numbers, they are automatically scored as suspect due to the failure of the highest concentrations of OP-1 neutralizing monoclonal antibody to release the cells from the antimitogenic activity of OP-1.
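
The automatic flagging of cytotoxic and cytostatic substances described above can be sketched as a simple rule on one row of stain readings. The sketch compares the readings with a reading that represents the number of cells originally seeded; that baseline, the 10% tolerance, and the function name are assumptions introduced for illustration and do not appear in the specification.

```python
def flag_compound(ods, seeded_od, tolerance=0.1):
    """Flag a test substance from one row of stain readings.

    ods: readings ordered by increasing antibody concentration.
    seeded_od: hypothetical reading corresponding to the seeded cell number.
    """
    if not ods:
        raise ValueError("no readings supplied")
    if max(ods) < seeded_od * (1 - tolerance):
        # Fewer cells than were seeded in every well: cells were lost.
        return "cytotoxic suspect"
    if ods[-1] <= seeded_od * (1 + tolerance):
        # Even the highest antibody concentration failed to release growth.
        return "cytostatic suspect"
    return "responsive"


print(flag_compound([0.30, 0.50, 0.90], seeded_od=0.30))
```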

10. Candidate Drugs to Screen

The screening methods of the invention are used to test compounds for their effect on the production of morphogenic protein by a given cell type. Examples of compounds which may be screened include but are not limited to chemicals, biological response modifiers (e.g., lymphokines, cytokines, hormones, or vitamins), plant extracts, microbial broths and extracts, medium conditioned by eukaryotic cells, body fluids, or tissue extracts.

The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

SEQUENCE LISTING

GENERAL INFORMATION:

APPLICANT: John Smart; Herman Oppermann; Engin Ozkaynak; Thangavel Kuberasampath; David C. Rueger; Roy H.L. Pang; Charles M. Cohen
(ii) TITLE OF INVENTION: MORPHOGENIC PROTEIN SCREENING METHOD
(iii) NUMBER OF SEQUENCES: 33
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Creative BioMolecules
STREET: 35 South Street
CITY: Hopkinton
STATE: Massachusetts
COUNTRY: U.S.A.
ZIP: 01748
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette, 5.25, 360kb storage
COMPUTER: IBM XT
OPERATING SYSTEM: DOS 3.30
(D) SOFTWARE: ASC II TEXT
(vi) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 667,274
FILING DATE: March 11, 1991
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 752,861
FILING DATE: August 30, 1991
(viii) ATTORNEY/AGENT INFORMATION:
NAME: PITCHER, EDMUND R.
REG. NO.: 27,829
DOCKET NO.: CRP-058PC
(ix) TELEPHONE: 617/248-7000
TELEFAX: 617/248-7100

69 INFORMATION FOR SEQ ID NO:1: SEQUENCE CHARACTERISTICS: LENGTH: 97 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Generic Sequence 1 OTHER INFORMATION: Each Xaa indicates one of the 20 naturallyoccurring L-isomer, a-amino acids or a derivative thereof.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: Xaa, Xaa Xaa, Xaa Xaa Xaa 1 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa, Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 45 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys *Xaa Cys INFORMATION FOR SEQ ID NO: 2: SEQUENCE CHARACTERISTICS: LENGTH: 97 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Generic Sequence 2 OTHER INFORMATION: Each Xaa indicates one of the 20 naturallyoccurring L-isomer, a-amino acids or a derivative thereof.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: Xaa Xaa Xaa Xaa Xaa Xaa 1 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa, Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 40 45 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys 55 Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 65 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys 85 Xaa Cys Xaa a a.


71 INFORMATION FOR SEQ ID NO:3: SEQUENCE CHARACTERISTICS: LENGTH: 97 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Generic Sequence 3 OTHER INFORMATION: wherein each Xaa is independently selected from a group of one or more specified amino acids as defined in the specification.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: Leu Tyr Val Xaa Phe 1 Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa Ala Pro Gly Xaa Xaa Xaa Ala Xaa Tyr Cys Xaa Gly Xaa Cys Xaa Xaa' Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asn His Ala Xaa Xaa 40 Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys 55 Cys Xaa Pro Xaa Xaa Xaa Xaa Xaa a 72 Xaa Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa Xaa Xaa Xaa Met Xaa Val Xaa Xaa Cys Gly Cys Xaa INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Generic Sequence 4 OTHER INFORMATION: wherein each Xaa is independently selected from a group of one or more specified amino acids as defined in the specification.

(xi) SEQUENCE DESCRIPTION: SEQ ID VO:4: Cys Xaa Xaa Xaa Xaa Leu Tyr Val Xaa Pile o1 5 Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa Ala Pro Xaa Gly Xaa Xaa Ala n:.Xaa Tyr Cys Xaa Gly Xaa Cys Xaa *Xaa Pro Xaa Xaa Xaa Xaa Xaa *5*S.40 Asn Xaa Xaa Asn His Ala Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Cys Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa Xaa Xaa Xaa Met Xaa Val Xaa Xaa Cys Gly Cys Xaa 100 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 139 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: hOP-i (mature form) (xi) SEQUENCE DESCRIPTION: SEQ ID Ser Thr Gly Ser Lys Gin Arg Ser Gin S. S1 ::Asn Arg Ser Lys Thr Pro Lys Asn Gin Glu Ala Leu Arg Met Ala Asn Val. Ala :20 Glu Asn Ser Ser Ser Asp Gin Arg Gin Al a Ser Trp Ala Phe Thr Val Pro 100 Leu Asp Lys Cys Cys Phe Ile Tyr Pro Asn His Lys Asn 110 Asp Tyr Gly Lys Arg Ile Tyr Leu His Phe Pro Ala Ser 120 Arg Cys Lys Asp Ala Cys Asn Ala Ile Cys Ile Ser Asn 130 His His Leu Pro Giu Ser Ile Asn Cys Ser Asn Met Glu Gly Glu Gly Tyr Val Pro Ala 105 Val Val1 Val Leu Trp Gly Giu Met Gin Giu Pro Leu 115 Ile Val.

Tyr Gln Tyr Cys Asn Thr Thr Thr Tyr Leu 125 Arg Val Asp Ala Ala Ala Leu Val Gin Phe Lys Ala 135 Ties 000 :s9 *900.


99 INFORMATION FORl SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 139 amino acids TYPE: a.mino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: mOP-1 (mature form) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: Ser Thr Gly Gly Lys Gin Arg Ser Gin 1 Asn Arg Ser Lys Thr Pro Lys Asn Gin Glu Ala Leu Arg Met Ala Ser Val Ala Giu Asn Ser Ser Ser Asp Gin Arg Gin Ala Cys Lys Lys His Giu Leu Tyr Val Ser Phe Arg Asp Leu Gly Trp, Gin Asp Trp Ile Ile Ala Pro Glu Gly Tyr Ala Ala Tyr Tyr Cys Glu Gly Giu Cys Ala *Phe Pro Leu Asn Ser Tyr Met Asn Ala *Thr Asn His Ala Ile Val. Gin Thr Leu *Val His Phe Ile Asn Pro Asp Thr Val V Pro Lys Pro Cys Cys Ala Pro Thr Gin 100 105 Leu Asn Ala Ile Ser Val Leu Tyr Phe **.110 115 Asp Asp Ser Ser Asn Val Ile Leu Lys 120 125 Lys Tyr Arg Asn Met Val Val Arg Ala 130 135 Cys Gly Cys His INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 139 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: hOP-2 (mature form) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: Ala Val Arg Pro Leu Arg Arg Arg Gln 1 Pro Lys Lys Ser Asn Glu Leu Pro Gin Ala Asn Arg Leu Pro Gly Ile Phe Asp Asp Val His Gly Ser His Gly Arg Gin Val Cys Arg Arg His Glu Leu Tyr Val Ser Phe Gln Asp Leu Gly Trp Leu Asp Trp Val Ile Ala Pro Gin Gly Tyr Ser Ala Tyr Tyr Cys Glu Gly Glu Cys Ser Phe Pro Leu Asp Ser Cys Met Asn Ala 75 Thr Asn His Ala Ile Leu Gin Ser Leu S* 85 77 Val His Leu Met Lys Pro Asn Ala Val Pro Lys Ala Cys Cys Ala Pro Thr Lys 100 105 Leu Ser Ala Thr Ser Val Leu Tyr Tyr 110 115 Asp Ser Ser Asn Asn Val Ile Leu Arg 120 125 Lys His Arg Asn Met Val Val Lys Ala 130 135 Cys Gly Cys His INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 139 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: mOP-2 (mature form) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: Ala Ala Arg Pro Leu Lys Arg Arg Gin 1 Pro Lys Lys Thr Asn Glu Leu Pro His Pro Asn Lys Leu Pro Gly Ile Phe Asp Asp Gly His Gly 
Ser Arg Gly Arg Glu 30 Val Cys Arg Arg His Glu Leu Tyr Val 40 Ser Phe Arg Asp Leu Gly Trp Leu Asp Trp Val Ile Ala Pro Gin Gly Tyr Ser 55 78 Ala Tyr Tyr Cys Glii Gly Glu Cys Ala Phe Pro Leu Asp Ser Cys Met Asn Ala Thr Asn His Ala Ile Leu Gin Ser Leu Val His Leu Met Lys Pro Asp Val Val Pro Lys Ala Cys Cys Ala Pro Thr Lys 100 105 Leu Ser Ala Thr Ser Val Leu Tyr Tyr 110 115 Asp Ser Ser Asn Asn Val Ile Leu Arg 120 125 Lys His Arg Asn Met Val Val Lys Ala 130 135 Cys Gly Cys His INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 96 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: CBMP-2A(fx) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: Cys Lys Arg His Pro Leu Tyr Val Asp Phe Ser 1 5 Asp Val Gly Trp Asn Asp Trp Ile Val Ala Pro Pro Gly Tyr His Ala ]?he Tyr Cys His Gly Glu Cys Pro Phe Pro Leu Ala Asp His Leu Asn jer Thr Asn His Ala Ile Val Gin Thr Leu Val Asn 50 Ser Val Asn Ser Lys Ile Pro Lys Ala Cys Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Net Leu Tyr Leu Asp Glu Asn Glu Lys Val Val Leu Lys Asn Tyr Gin Asp Met Val Val Glu Gly Cys Gly 90 Cys Arg 100 e i *s* INFORMATION FOR SEQ ID 40: SEQUENCE CHARACTERISTICS: LENGTH: 101 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: CBMP-2B(fx) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1O: Cys Arg Arg His Ser 1 Leu Tyr Val Asp Phe Ser Asp Val Gly Trp Asn Asp Trp Ile Val Ala Pro Pro Gly Tyr Gin Ala Phe Tyr Cys His Gly Asp Cys Pro Phe Pro Len Ale. 
Asp His Leu Asn Ser Thr Asn His Ala Ile Val Gin Thr Len Val Asn Ser Val Asn Ser Ser 55 Ile Pro Lys Ala Cys Cys Val Pro Thr Gin Leu Ser Ala Ile Ser Met Len Tyr Leu Asp Gin Tyr Asp Lys Val Val Leu Lys Asn Tyr Gin Gin Met Val Val Glu Gly Cys Gly Cys Arg 95 100 S t INFORMATION FOR SEQ ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: DPP(fx) (xi) SEQUENCE DESCRIPTION: Cys Arg Arg His Ser Leu Tyr 1 5 Asp Val Gly Trp Asp Asp Trp Leu Gly Tyr Asp Ala Tyr Tyr Cys Pro Phe Pro Leu Ala Asp Thr Asn His Ala Val Val Gin 50 Asn Asn Asn Pro Gly Lys Val Cys Val Pro Thr Gin Leu Asp Leu Tyr Leu Asn Asp Gin Ser Lys Asn Tyr G.n Glu Met Thr SEQ ID Val Asp Ile Val Cys His His Phe Thr Leu Pro Lys Ser Val Thr Val Val Val NO:11: Phe Ser Ala Pro Gly Lys Asn Ser Val Asn Ala Cys Ala Met Val Leu Gly Cys 1 ly 100 Cys Arg 1~ 3 e I INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Vgl(fx) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: Cys Lys Lys Arg His Leu Tyr Val Glu Phe Lys 1 5 Asp Val Gly Trp Gin Asn Trp Val Ile Ala Pro Gin Gly Tyr Met Ala Asn Tyr Cys Tyr Gly Glu Cys Pro Tyr Pro Leu Thr Glu Ile Leu Asn Gly Ser Asn His Ala Ile Leu Gin Thr Leu Val His 50 Ser Ile Glu Pro Glu Asp Ile Pro Leu Pro Cys Cys Val Pro Thr Lys Met Ser Pro Ile Ser Met Leu Phe Tyr Asp Asn Asn Asp Asn Val Val Leu Arg His Tyr Glu Asn Met Ala Val Asp Glu Cys 90 Gly Cys Arg 100 83 INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Vgr-l(fx) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: Cys Lys Lys His Glu Leu Tyr Val Ser Phe Gin 1 5 Asp Val Gly Trp Gin Asp Trp Ile Ile Ala Pro Xaa Gly Tyr Ala Ala Asn Tyr Cys Asp Gly Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His Ala Ile Val Gin Thr Leu Val His 50 Val Met Asn Pro 
Glu Tyr Val Pro Lys Pro Cys Cys Ala Pro Thr Lys Val Asn Ala Ile Ser Val Leu Tyr Phe Asp Asp Asn Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val Val Arg Ala Cys 90 Gly Cys His 100


INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 106 amino acids TYPE: protein STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vi) ORIGINAL SOURCE: ORGANISM: human TISSUE TYPE: BRAIN (ix) FEATURE: OTHER INFORMATION: /product= "GDF-1 (fx)" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: Cys Arg Ala Arg Arg Leu Tyr Val Ser Phe Arg Glu Val 1 5 Trp His Arg Trp Val Ile Ala Pro Arg Gly Phe Leu Ala Asn 20 Cys Gln Gly Gln Cys Ala Leu Pro Val Ala Leu Ser Gly Ser 35 Gly Pro Pro Ala Leu Asn His Ala Val Leu Arg Ala Leu Met 50 Ala Ala Ala Pro Gly Ala Ala Asp Leu Pro Cys Cys Val Pro 65 Arg Leu Ser Pro Ile Ser Val Leu Phe Phe Asp Asn Ser Asp 80 Val Val Leu Arg Gln Tyr Glu Asp Met Val Val Asp Glu Cys 95 100 Cys Arg 105

INFORMATION FOR SEQ ID NO:15: SEQUENCE CHARACTERISTICS: LENGTH: 5 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: Cys Xaa Xaa Xaa Xaa 1

INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 1822 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vi) ORIGINAL SOURCE: ORGANISM: HOMO SAPIENS TISSUE TYPE: HIPPOCAMPUS (ix) FEATURE: NAME/KEY: CDS LOCATION: 49..1341 OTHER INFORMATION: /standard_name= "hOP1" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: GGTGCGGGCC CGGAGCCCGG AGCCCGGGTA GCGCGTAGAG CCGGCGCG ATG CAC GTG Met His Val

CGC TCA Arg Ser 5 CTG CGA GCT GCG Leu Arg Ala Ala

GCG

Ala 10 CCG CAC AGC TTC Pro His Ser Phe

GTG

Val GCG CTC TGG GCA Ala Leu Trp Ala

CCC

Pro CTG TTC CTG CTG Leu Phe Leu Leu TCC GCC CTG CC Ser Ala Leu Ala TTC AGC CTG GAC Phe Ser Leu Asp GAG GTG CAC Glu Val His CGG GAG ATG Arg Glu Met 7CG AGC Ser Ser TTC ATC CAC CGG Phe Ile His Arg CTC CGC AGC CAG Leu Arg Ser Gin GAG CGG Glu Arg CAG CGC GAG ATC CTC TCC Gin Arg Glu Ile Leu Ser 60 ATT TTG CCC TTG Ile Leu Cly Leu CCC CAC CGC Pro His Arg CCG CGC CCG Pro Arg Pro CAC CTC CAG GGC His Leu Gin Gly

MAG

Lys 75 CAC MAC TCG His Asn Ser GCA CCC ATG TIC ATG Ala Pro Met Phe Met

CTG

Leu

GGC

Gly 100 GAC CTG, TAC MAC GCC Asp Leu Tyr Asn'Ala CAG CCC TTC TCC TAC Gin Gly Phe Ser Tyr GCG GIG GAG GAG Ala Val Glu Giu GGC COG CCC GCC Cly Gly Pro Gly CCC TAC MAG CCC GTC Pro Tyr Lys Ala Val 110 TTC ACT ACC Phe Ser Thr CTC ACC GAC Leu Thr Asp CAG GC Gin Gly 115 GCC GAC Ala Asp 130 CCC CCI CTG CC Pro Pro Leu Ala

AGC

Ser 120 CTG CMA CAT AGC Leu Gin Asp Ser CAT TTC His Phe 125 ATG GTC ATG Net Val Met CAC CCA CGC His Pro Arg 150

AGC

Ser 135 TIC GTC MAC CTC Phe Val Asn Leu

GIG

Val 140 GMA CAT GAG MAG Glu His Asp Lys GMA TTC TIC Glu Phe Phe 145 TCC MAG ATC Ser Lys Ile TAC CAC CAT CGA Tyr His His Arg TIC CGG ITT GAT Phe Arg Phe Asp CCA GMA Pro Glu 165 GGG GMA CC GIC Gly Glu Ala Val GCA GCC GAM TTC Ala Ala Glu Phe CGG ATC TAC MG GAC Arg Ile Tyr Lys Asp 175 CGG ATC AGC GTT TAT Arg le Ser Val Tyr

TAC

Tr IS0 ATC CGG GMA CGC le Arg Glu Arg GAC MAT GAG ACO Asp Asn Glu Thr

TIC

Phe 190 CAG GIG CIC CAC GIn Val Leu Gin

GAG

Glu 200 CAC TIG GGC AGG His Leu Gly Arg GMA ICG Clu Ser 205 CAT CIC TIC Asp Leu Phe CTG CTC Leu Leu 210 GAC AGC CCI ACC CIC Asp Set Arg Thr-Leu 215 TGG CCC TCG Trp, Ala Ser MAC CAC ICC Asn His Trp 235

GAG

Git 220 GAG GGC ICGCG Glu Gly Irp Leu a.

a. a ATC ACA 0CC le Thr Ala 230 ACC AGC Thr Ser GIG GIC MIT CCC Val Val Asa Pro GIG ITT GAC Val Phe Asp 225 CAC MAC CG His Asa Leu ATC MAC CCC Ile Asa Pro 585 633 681 729 777 825 873 GGC CG Gly Leu 245 GAG CTC ICC GIG Gin Leu Ser Val ACC GIG CAT CCC Thr Leu Asp Ciy GAG AC Gin Ser 255

AAG

Lys 260 TIC CC CCC CGG Leu Ala Gly Lem CCC CCC CAC CCC Gly Arg His Gly CAG MAC MG GAG Gin Asa Lys Gin .a a.

TIC ATG GTG GCT TTC TTC MAG GCC ACG GAG GTC CAC TTC CCC AGC ATC Phe Met Val Ala Phe Phe 280 285 290 CAG CGC AGC CAG MAC CGC TCC MAG ACG CCC CCC ICC ACG Arg Ser Thr MAG MAC CAG Lys Asn Gin 310 AGC GAC CAG Ser Asp Gin GGG AGC AMA Gly Ser Lys 295 GAA GCC CIG Glu Ala Leu AGG CAG CC Arg Gin Ala Gin Arg Ser Gin Asn Arg Ser

CGG

Arg 300

CC

Ala

MCG

Lys MAC GIG GCA GAG Asn Val Ala Giu 320 CAC GAG CTG TAT His Glu Leu Tyr 335 Lys Thr Pro 305 MAC AGC AGC Asn Ser Ser GTC AGC TIC Val Ser Phe 969 1017 1065 325 CGA GAC Arg Asp CTG GGC TGG Leu Gly Trp,

CAG

Gin 345

GGG

Gly TAC TAC TGT Tyr Tyr Cys

GAG

Glu 360

CAC

His

GAG

Giu MAC CCC ACC Asn Ala Ihr CCC GMA ACG Pro Glu Thr 390 ATC TCC GTC Ile Ser Val

AAC

Asn 375

GTG

Val

CTC

CCC ATC Ala Ile CCC AAC CCC Pro Lys Pro TAC TIC CAT Tyr Phe Asp 410 405 TAC AGA MAC ATG Tyr Arg Asn Met 420 ICC ATC ATC C CCI CMA CCC TAC CC Trp le le Ala Pro Giu Cly Tyr Ala 350 355 TGT CCC TTC CCI CTC MAC TCC TAC ATG Cys Ala Phe Pro Leu Asn Ser Tyr Met 365 370 CTG CAC ACG CIG CIC CAC TIC AIC MAC Val Gin Thr Leu Val His Phe Ile Asn 380 385 rGC TCI CC CCC ACG CAG CTC MAT CC Cys Cys Ala Pro Thr Gin Leu Asn Ala 395 400 GAC AGC ICC MAC GTC AIC CIG AAG A Asp Ser Ser Asn Val Ile Leu Lys Lys 415 GCC TOT CCC TGC CAC TAGCTCCTCC Ala Cys Gly Cys His 430 ~TTTIGATCCT CCATTCCTCG CCTTGGCCA ;AGA CCTTCCCCTC CCTATCCCCA ACITTAAAG XCAT ATGGCITTTG AICAGTTI CAGIGGCAG ;TGC AGGCAAAACC TAGCAGGAAA AAAAAACAA LVCAT TGGCTGGGAA CTCTCACCCA TGCACGGAC VACC AGCCAGGCCA CCCACCCGTG GGAGGAAGG VGTC TGTGCGAAAG GAAAATTGAC CCGGMAGTI 1113 1161 1209 1257 1305 1351 GTG GTC CCC Val Val Arg 425 0 0

GAGAATTCAG

GAACCAGCAG

TGTGAGAGTA

ATCCAATGAA

GCATMAAGAA

CGTTTCCAGA

GGCGTCGCAA

ACCCTTTGG-G

ACCAACTGCC

TTAGGAAACA

CAAGATCCTA

MAATGGCCGG

GGTAATTATG

GGGGTGGGCA

GCCAAG

TTTTGIC

TGAGCAC

CMAGCT(

GCCAGGJ

AGCGCC~

CATTGGJ

G

C

1411 1471 1531 1591 1651 1711 1771 1822 CTGTAATAAA TGTCACAA.TA AAACGMATGA ATGAAAAAAA AAAAAAAAA A INFORMiATION FOR SEQ ID NO: 17: SEQUENCE CHARACTERISTICS: LENGTH: 431 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: OTHER INFORMATION: /Product="OP1-PP" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: Met His Val Arg Ser Leu Arg Ala Ala Ala Pro His Ser Phe Val Ala 1 5 10 Leu Trp Ala Pro Leu Phe Leu Leu Arg Set Ala Leu Ala Asp Phe Ser 25 Leu Asp Asn Giu Val His Set Ser Phe Ile His Arg Arg Leu Arg Ser 40 Gin Giu Arg Arg Giu Met Gin Arg Giu Ile Leu Ser Ile Leu Gly Leu 55 Pro His Arg Pro Arg Pro His Leu Gln Gly Lys His Asn Ser Ala Pro 70 75 Met Phe Met Leu Asp Leu Tyr Asn Ala Met Ala Val Giu Giu Gly Gly 90 Giy Pro Gly Gly Gin Gly r'he Ser Tyr Pro Tyr Lys Ala Val Phe Ser 100 105 110 Thr Gin Gly Pro Pro Leu Ala Ser Leu Gin Asp Ser His Phe Leu Thr 115 120 125 Asp Ala Asp Met Val Met Ser Phe Val Asn Leu Val Giu His Asp Lys 130 135 140 *Giu Phe Phe His Pro Arg Tyr His His Arg Glu Phe Arg Phe Asp Leu .145 150 155 160 Set Lys Ile Pro Giu Gly Giu Ala Val Thr Ala Ala Glu Phe Arg Ile 165 170 175 Tyr Lys Asp Tyr Ile Arg Glu Arg Phe Asp Asn Glu Thr Phe Arg Ile 180 185 190 Ser Val Tyr Gin Vai Leu Gin Giu His Leu Gl1y Arg Glu Ser Asp Leu 195 200 205 Phe Leu Leu Asp Ser Arg 210 Thr Leu Trp Ala Ser Glu Giu Gly Trp Leu 215 220 Val 225 His lie Lys Arg Lys 305 Asn Val Gly Ser Phe 385 Leu Phe Asn Asn Gin Ser 290 Thr Ser Ser Tyr Tyr 370 Ile Asn Asp Ile Leu Gly Pro Lys 260 Pro Phe 275 lie Arg Pro Lys Ser Ser Phe Arg 340 Ala Ala 355 Met Asn Asn Pro Ala Ile Thr Leu 245 Leu Met Ser Asn Asp 325 Asp Tyr Ala Glu Ser 405 Ala 230 Gin Ala Val Thr Gln 310 Gin Leu Tyr Thr rhr 390 Val Thr Leu Gly Ala Gly 295 Glu Arg Gly Cys Asn 375 Vai Leu Ser Asn His Ser Leu Phe 280 Ser Ala Gin Trp Glu 360 His Pro Tyr Val lie 265 Phe Lys Leu Ala Gin 345 Gly Ala Lys Phe Glu 250 Gly Lys Gin Arg Cys 330 Asp Glu lie Pro Asp 4 410 Trp 235 Thr Arg Ala Arg Met 315 Lys Trp Cys Val Cys 395 Asp Leu 
His Thr Ser 300 Ala Lys Ile Ala Gln 380 Cys Ser Asp Gly Gly Pro 270 Glu Val 285 Gln Asn Asn Val His Glu Ile Ala 350 Phe Pro 365 Thr Leu Ala Pro Ser Asn Val Val Asn Pro Gln 255 Gln His Arg Ala Leu 335 Pro Leu Val Thr Val 415 Arg 240 Ser Asn Phe Ser Glu 320 Tyr Glu Asn His Gln 400 Ile Leu Lys Lys Tyr 420 Arg Asn Met Val Val 425 Arg Ala Cys Gly Cys His 430

INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 1873 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vi) ORIGINAL SOURCE: ORGANISM: MURIDAE TISSUE TYPE: EMBRYO (ix) FEATURE: NAME/KEY: CDS LOCATION: 104..1393 OTHER INFORMATION: /note= "mOP1 (cDNA)" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: CTGCAGCAAG TGACCTCGGG TCGTGGACCG CTGCCCTGCC CCCTCCGCTG CCACCTGGGG CGGCGCGGGC CCGGTGCCCC GGATCGCGCG TAGAGCCGGC GCG ATG CAC GTG CC Met His Val Arg

TCG

Ser CTG CGC GCT C Leu Arg Ala Ala CCA CAC AGC TTC Pro His Ser Phe GCG CTC TGG GOG Ala Leu Trp Ala CTG TTC TTG CTO Leu Phe Leu Leu TCC GCC CTG GCC Ser Ala Leu Ala TTC AGC CTG GAC Phe Ser Leu Asp AAC GAG Asn Glu 44 49 GTG CAC TCC Val His Ser GAG ATG CAG Giu Met Gin TTC ATO CAC CGG Phe Ile His Arg CTC CCC AGC CAG Leu Arg Ser Gin GAG CGG CGG Glu Arg Arg CGG GAG ATC CTG Arg Cli Ile Leu ATC TTA GGG TTG CCC CAT CCC CCC le Leu Gly Leu Pro His Arg Pro CCC CCG Arg Pro CAC CTC CAG OGA His Leu Gin Gly CAT M.T TCG GCG His Asn Ser Ala ATG TTC ATG TTG Met Phe Met Leu

GAC

Asp CIG TAC AAC GCC Leu Tyr Asn Ala GCG GTG GAG GAG AGC GGG CCG GAC GCA Ala Val Glu Giu Ser Gly Pro Asp Gly 4.4.

GGC TTC TCC TAC Gly Phe Ser Tyr

CCC

Pro 105 TAC MAG GCC GTC Tyr Lys Ala Val TTC ACT ACC CAG GGC CCC CCT Phe Ser Thr Gin Gly Pro Pro 110 115 TTA GCC AC Leu Ala Ser ATG AGC TTC Met Ser Phe 135 CGA TAC CAC Arg Tyr His 150 CAG GAC AGC CAT Gin Asp Ser His

TTC

Phe 125 CTC ACT GAC GCC Leu Thr Asp Ala GAC ATG GTC Asp Met Val 130 GTC MAC CTA GTG Val Asn Leu Val CMA CAT Giu His 140 GAC AAA GMA TTC TIC CAC CCT Asp Lys Glu Phe Phe Pro CAT CGG GAG His Arg Glu

TIC

Phe 155

CGG

Arg TTT GAT CZTT 7CC Phe Asp Leu Ser 160 ATC CCC GAG le Pro Giu CMA CGG GTG ACC Glu Arg Val Thr GCC GMA TIC AGG Ala Giu Phe Arg

ATC

Ile 175 TAT MAG GAC TAC Tyr Lys Asp Tyr CCC GAG CGA TrT Arg Glu Arg Phe MAC GAG ACC TIC Asn Giu Thr Phe CAG ATC Gin Ile 190 ACA GTC TAT Thr Val T'yr CAG GTG Gin Val 195 CTC CAG GAG Leu Gin Giu CCC ACC ATC Arg Thr le 215 TCA GGC AGG GAG Ser Gly Arg Glu

TCG

Ser 205 GAC CTC TIC TTG Asp Leu Phe Leu CTG GAC AC Leu Asp Ser 210 GAT ATC ACA Asp Ile Thr TGG GCT TCT GAG Trp Ala Ser Giu GGC TCC TTC GTG Cly Trp Leu Val

TIT

Phe 225 GCC ACC Ala Thr 230 AGC MAC CAC TGG GTG GTC MAC CCT CCC Ser Asn His Trp Val Val Asn Pro Arg 235 MAC CTG, GGC TTA Asn Leu Gly Leu CTC TCT GTG GAG Leu Ser Val Giu CTG CAT CG Leu Asp Gly CAG AGC Gin Ser 255 CAG MAC Gin Asn 270 ATC MAC CCC MAG Ile Asn Pro Lys GCA CCC CIG ATT Ala Gly Leu Ile CCC CAT GGA CCC Arg His Gly Pro MAG CMA CCC Lys Gin Pro TIC ATC Phe Met 275 GTG CCC TIC Val Ala Phe ACG CCC GC Thr Gly Gly 295 MAG CCC ACG GMA GTC CAT CTC CGT AGT Lys Ala Thr (flu Val His Leu Arg Ser 285 ATC CGG TCC Ile Arg Ser 290 CCA MAG MAC Pro Lys Asn MCG CAC CCC AGC Lys Gin Arg Ser MAT CGC TCC MAG Asn Arg Ser Lys

ACG

Thr 305 1027 1075 CMA GAG Gin Glu 310 CCC CTG AGO ATG Ala Leu Arg Met

CC

Ala 315 ACT GTG CCA GMA Ser Val Ala Clu

MAC

Asn 320 AGC AGC ACT GAC Ser Ser Ser Asp CAG AGG CAG GCC TGC AAG AAA CAT GAG Arg Gin Ala Cys Lys Lys 330 TGG ATC Trp Ile His Glu CTG TAC GTC AGC TTC CGA GAC Leu Tvr Val Ser Phe Arg Asp 335

GMA

Giu 340 GGC TGG CAG Gly Trp Gin

GAC

Asp 345

GAG

Giu ATT GCA CCT Ile Ala Pro 350 TTC CCT CTG Phe Pro L~u GGC TAT GCT Gly Tyr Ala GCC TAC Ala Tyr 355 TAC TGT GAG GGA Tyr Cys Giu Gly 360O ACC MAC CAC GCC Thr Asn His Aia 375 TGC GCC Cys Ala MAC TCC TAC Asn Ser Tyr ATC GTC CAG ACA Ile Val Gin Thr 380 CCC TOC TGT GCG Pro Cys Cys Ala 395 GTT CAC TTC Val His Phe

ATC

Ile 385

MAC

Asn ATG MAC GCC Met Asn Ala 370 MAC CCU GA C Asn Pro Asp GCC ATC TCT Ala Ile Ser ACA GTA Thr Val 390 GTC CTC Val Leu 405 MAC ATG Asn Met

CCC

Pro

MAG

Lys CCC ACC CAG Pro Thr Gin TAC TTC Tyr Phe GAC GAC Asp Asp -410 GTG GTC CGG GCC Val Val Arg Ala 425 AGC TCT MAT GTC ATC CTG MAG MAG TAC AGA Ser Ser Asn Val Ile Leu Lys Lys Tyr Arg 415 420 TOT GGC TGC CAC TAGCTCTTCC TGAGACCCTG Cys Gly Cys His 430 1123 1171 1219 1267 1315 1363 1413 1473 1533 1593 1653 1713 1773 1833 1873

ACCTTTGCGG

CCCACCTTGG

AAGCATGTAA

GGCACGTGAC

GTCTGCCAGG

AATCGCAAGC

TCTGTGTTGA

GAATGAAAAA

GGCCACACCT TTCCAMATCT COAGGAGMOC AGACCMACCT GGGTTCCAGA MCCTGAGCG GGACMAGATC CTACCAGCTA MAAGTGTCCA GTGTCCACAT CTCGTTCAGC TGCAGCAGMA AGGGAAACCA AGCAGMAGCC AAAAAAAAMA AAAAMMAK

TCGATGTCTC

CTCCTGAGCC

TGCAGCAGCT

CCACAGCA.M

GGCCCCTGGC

GGMACCGCTT

ACTGTMATGA

AAMAGAATTC

ACCATCTMAG

TTCCCTCACC

GATGAGCGCC

CCCOTMAG

GCTCTGAGTC

AGCCAGGGTG

TATGTCACMA

TCTCTCAC,'

TCCCMACCGG

CTTTCCTTCT

CAGGAAAAAT

TTTGAGGAGT

GGCGCTGGCG

TAAMACCCAT

INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 430 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: OTHER INFORMATION: /product= "mOP1-PP" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: Met His Val Arg Ser Leu Arg Ala Ala Ala Pro His Ser Phe Val Ala 1 5 10 Leu Trp Ala Pro Leu Phe Leu Leu Arg Ser Ala Leu Ala Asp Phe Ser 25 Leu Asp Asn Glu Val His Ser Ser Phe Ile His Arg Arg Leu Arg Ser 40 Gln Glu Arg Arg Glu Met Gln Arg Glu Ile Leu Ser Ile Leu Gly Leu 55 Pro His Arg Pro Arg Pro His Leu Gln Gly Lys His Asn Ser Ala Pro 70 75 Met Phe Met Leu Asp Leu Tyr Asn Ala Met Ala Val Glu Glu Ser Gly 90 Pro Asp Gly Gln Gly Phe Ser Tyr Pro Tyr Lys Ala Val Phe Ser Thr 100 105 110 Gln Gly Pro Pro Leu Ala Ser Leu Gln Asp Ser His Phe Leu Thr Asp 115 120 125 Ala Asp Met Val Met Ser Phe Val Asn Leu Val Glu His Asp Lys Glu 130 135 140 Phe Phe His Pro Arg Tyr His His Arg Glu Phe Arg Phe Asp Leu Ser 145 150 155 160 Lys Ile Pro Glu Gly Glu Arg Val Thr Ala Ala Glu Phe Arg Ile Tyr 165 170 175 Lys Asp Tyr Ile Arg Glu Arg Phe Asp Asn Glu Thr Phe Gln Ile Thr 180 185 190 Val Tyr Gln Val Leu Gln Glu His Ser Gly Arg Glu Ser Asp Leu Phe 195 200 205 Leu Leu Asp 210 Ser Arg Thr Phe 225 Asn Asn Gln Ser Thr 305 Ser Ser Tyr

Tyr.

Ile 385 Asn Asp Leu Pro Pro Ile 290 Pro S rr Phe Ala Met 370 Asn Ala le Giy Lys Phe 275 Arg Lys Ser Arg Ala 355 Asn.

Pro Ile Thr Leu Leu 260 Met Ser Asn Asp Asp 340 Ala Asp Ser Ala Gin 245 Ala Val Thr Gin Gin 325 Leu Tyr Thr Thr Val 405 Thr 230 Leu Gly Ala Giy Glu 310 Arg Gly Cys Asn Val 390 Leu Ile Trp Ala 215 Ser Asn His Ser Val Glu Leu Ile Gly 265 Phe Phe Lys 280 Gly Lys Gin 295 Ala Leu Arg Gin Ala Cys Trp Gin Asp 345 Giu Gly Giu 360 His Ala le 375 Pro Lys Pro Tyr Phe Asp Ser Trp Thr 250 Arg Ala Arg Met Lys 330 Trp Cys Va-l Cys Asp 410 Giu Val 235 Leu His Thr Ser Ala 315 Lys Ile Ala Gin Cys 395 Ser Giu 220 Val Asp Gly Giu Gin 300 Ser His le Phe Thr 380 Al a Ser Gly Asn Gly Pro Val 285 Asn Val1 Glu Ala Pro 365 Leu Pro Asn *Trp Leu Pro Arg Gin Ser 255 Gin Asn, 270 His Leu Arg Ser Ala Giu Leu Tyr 335 Pro Giu 350 Leu Asn Val His Thr Gin 400 Val Ile 415 Val His 240 le Lys Arg Lys Asn 320 Val Gly Ser Phe Leu Leu C


Lys Lys Tyr Arg 420 Asn Met Val Val Arg 425 Ala Cys Gly Cys INFORMATION FOR SEQ ID 140:20: (i )SEQUENCE CHARACTERISTICS: LENGTH: 1723 base pairs TYPE: nucleic. acid STRANDEDNESS: single TOPOLOGY: linear (ii)HOLECULE TYPE: cDNA (vi)ORIGINAL SOURCE: ORGANISM: Homo sapiens TISSUE TYPE: HIPPOCAMPUS (ix)FEATJRE: NAME/KEY: CDS LOCATION: 490. .1696 OTHER INFORMATION: /note= 1 hOP2 (cDNA)" (xi)SEQUENCE DESCRIPTION: SEQ ID GGCGCCGGCA GACCAGGACT GGGCTGGAGG GCTCCCTATG CCACACCGCA CCAAGCGGTG, GCGGCCACAG CCGGACTGC CCGCAGAGTA GCCCCGGCCT GACAGGTGf C GCGCGGCGGG CCCCCCGCCC CCCCGCCCGC AGCCCCTGGG TCGGCCGCGG CGGCCTGCC ATO ACC CC Met Thr Ala GGCTGGAGGA GCTGTGGTTG GAGCACGACG TGCCACGGCA AGTGGCGGAG ACGGCCCAGG AGCCGCTGGA GCA.ACAGCTC GCTGCAGGAG CTCGCCCATC GCCCCTGCGC TGCTCGGACC GGGTACGCCG GCGACAGAGG CATTGGCCGA GACTCCCAGT CGAGGCGGTG GCCTCCCGGT CCTCTCCGTC CAGGAGCCAG GCTCCAGCGA CCGCGCCTGA GGCCGCCTGC CCGCCCGTCC CCCCCCCGA GCCCAGCCTC CWIGCCGTCG GGGCGTCCCC AGCCGATGCG CGCCCGCTGA CCGCCCCAGC TGACCGCCCC CTC CCC GGC CCG CTC TGG C CTG GGC CTG Leu Pro'Gly Pro Led Trp Leu Leu Gly Leu C CTA TGC GCG CIG CCC GGC CCC GGC CCC GGC Ala Leu Cys Ala Leu Gly Cly Gly Gly Pro Cly GCC TGT CCC CAG CGA CGT CTC CGC GCG CCC* GAG Gly Cys Pro Gin Arg Arg Leu Gly Ala Arg Giu 35 CCC GAG ATC CTG GCG GTG CTC GGC CTG CCT GG Arg Glu Ile Leu Ala Val Leu Cly Leu Pro Gly CGA CCC CCG CCC Arg Pro Pro Pro CGG GAC GTG Arg Asp Val CGC CCC CG Arg Pro Arg GCG CCA CCC GCC Ala Pro Pro Ala CTG GAC CTG TAC Leu Asp Leu Tyr GCC ICC CGG CTG Ala Ser Arg Leu OCG TCC GCG CCG Ala Ser Ala Pro CTC TTC ATG Leu Phe Met GAC GOC GCG Asp Giy Ala CAC GCC ATG His Ala ilet GGC GAC GAC GAC Gly Asp Asp Asp CCC GCG Pro Ala GAG CGG CGC CTG Glu Arg Arg Leu CGC GCC GAC CTG Arg Ala Asp Leu

GTC

Val 105 ATG AGC TIC GTT Met Ser Phe Val GAG CCC CAT TG Glu Pro His Trp

AAC

.Asn 110 ATG GTG GAG CGA Met Val Glu Arg

GAC

Asp 115 CGT GCC CTG C Arg Ala Leu Gly CAC CAG His Gin 120 AAG GAG TIC CGC Lys Glu Phe Arg

TT

Phe 130 GAC CTG ACC CAG Asp Leu Thr Gin CCG GCT GG Pro Ala Gly ACA GCT GCG Thr Ala Ala AAC AGO ACC Asn Arg Thr 160 TTC COG ATI TAC Phe Arg Ile Tyr GTG CCC AGC ATC Val Pro Ser Ile GAG GCG GTC Giu Ala Val 140 CAC CTG CIC His Leu Leu 155 GAG CAG TCC Giu Gin Ser CTC CAC GTC AGC Leu His Val Ser TIC CAG GTG GTC Phe Gin Val Val MAC AGO Asn Arg 175 GAG TCT GAC TTC Glu Ser Asp Leu

TTC

Phe 180 TTI ITO GAT CTT Phe Leu Asp Leu ACG CTC CGA GOT Thr Leu Arg Ala

GGA

Gly 190

TGG

Trp GAC GAG 000 TGG Asp Giu Gly Trp GIG CIG GAT GTC Val Leu Asp Val

ACA

Thr 200 OCA GCC ACT GAC Ala Ala Ser Asp

TGC

Cys 205 TTG CTG MAG Leu Leu Lys CAC MAG GAC CTG His Lys Asp Leu CTC CGC CTC TAT Leu Le4 Tyr GIG GAG Val Giu 220


ACT GAG GAC Thr Giu Asp CMA CGG GCC Gin Arg Ala 240 CAC AGO OTG GAT His Ser Val Asp C CTGG 0CC GC Gly Leu Ala Gly GIG CTG GOT Leu Leu Gly 235 TTC TIC AGO Phe Phe Arg CCA CGC TCC CAA Pro Arg Ser Gin

CAG

Gin 245 CCI TIC GTG GIC Pro Phe Val Val GCC ACT Ala Ser 255 CCG AGT CCC AIC Pro Ser Pro Ile

CGC

Arg 260 ACC CCI COG GCA Thr Pro Arg Ala

GIG

Val 265 AGO CCA CIC AGO Arg Pro Leu Arg

AGG,

Arg 270 AGG CAG COG MAG Arg Gin Pro Lys

A

Lys 275 AGC MAC GAG CIG Ser Asn Glu Leu CCG CAG GCC MAC CGA Pro Gin Ala Asn Arg

CTC

Leu 285 CCA GGG ATC TTT Pro Giy Ile Phe

GAT

Asp 290 GAC GTC CAC GGC Asp Val His GJly

TC

Ser 295 CAC GGC CGG CAG His Gly Arg Gin GTC TGC Val Cys 300 CGT CGG CAC Arg Arg His TGG GTC ATC Trp Val Ile 320

GAG

Giu 3059 CTC TAC GTC AGO Leu Tyr Val Ser CAG GAO CTC GGC Gin Asp Leu Gly TGG CTG GAC Trp Leu Asp 315 GAG GGG GAG Glu Gly Glu GCT CCC CMA GCC Ala Pro Gin Giy

TAC

Tyr 325 TCG GCC TAT TAO Ser Ala Tyr Tyr TGC TCC Cys Ser 335 TTC OCA CTG, GAC Phe Pro Leu Asp

TCC

Ser 340 TGC ATG MAT GCC ACC MOC CAC GOC ATC Cys Met Asn Ala Thr Asn His Ala Ile 345 1344 1392 1440 1488 1536 1584 1632 1680 1723

CTG

Leu 350 CAG TOO OTG GTG Gin Ser Leu Val

CAC

His 355 CTG ATG MAG Leu Met Lys OCA MAC GCA GTO CCC MAG Pro Asn Ala Val Pro Lys 360 TGC TGT GOA COO Cys Cys Ala Pro

ACC

Thr 370 MAG CTG AGO GCO Lys Leu Ser Ala TOT GTG OTO TAO Ser Vai Leu Tyr TAT GAO Tyr Asp 380 AGO AGO MAC Ser Ser Asn

MOC

Asn 385 GTO ATO OTG CC Val Ile Leu Arg CAC CGC MAC AIG His Arg Asn Met GTG GTC MAG Val Val Lys 395 GOC TGC GGC TGO CAC Aia Cys Gly Cys His 400 T GAGCAGCO GCCCAGCOCT ACTOCAG 98 INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 402 amino acids TYPE: amino acid TOPOLOGY: linear (ii)HOLECITLE TYPE: protein (ix)FEATURE: (A)OTHER INFORMATION: /product. "h0P2-PP" (xi)SEQUENCE DESCRIPTION: SEQ ID NO:21: hiet Thr Ala Leu Pro Gly Pro Leu Trp Leu Leu Gly Leu Ala Leu Cys 1 5 10 Ala Leu Gly Gly Gly Gly Pro Gly Leu Arg Pro Pro Pro Gly Cys Pro 25 Gin Arg Arg Leu Gly Ala Arg Glu Arg Arg Asp Val Gin Arg Glu Ile 40 Leu Ala Val Leu Gly Leu Pro Gly Arg Pro Arg Pro Arg Ala Pro Pro 55 Ala Ala Ser Arg Leu Pro Ala Ser Ala Pro Leu Phe Met Leu Asp Leu 70 75 'Tyr His Ala Met Ala Gly Asp Asp Asp Giu Asp Gly Ala Pro Ala Glu 90 Arg Arg Leu Gly Arg Ala Asp Leu Val Met Set Phe Val Asn Net Val 100 105 110 Glu Arg Asp Arg Ala Leu Gly His Gin Giu Pro His Trp Lys Glu Phe 115 120 125 *Arg Phe Asp Leu Thr Gin Ile Pro Ala Gly Giu Ala Val Thr Ala Ala 130 135 140 Glu Phe Arg Ile Tyr Lys Val Pro Se Ile His Leu Leu Asn Arg Thr 145 150 155 160 Leu His Val Ser Met Phe Gin Val Val Gin Giu Gin Ser Asn Arg Giu *165 170 175 Set Asp Leu Phe Phe Leu Asp Leu Gin Thr Leu Arg Ala Gly Asp Giu *180 185 190 Gly Trp Leu Vai Leu Asp Val Thr Ala Ala Set Asp Cys Trp Leu Leu 195 200 205 Lys Arg 210 His Lys Asp Leu Gly Leu Arg 215 Leu Tyr Val Glu Thr Giu Asp 220 Gly His 225 Pro Arg Ser Pro Pro Lys Phe Asp 290 Glu Leu 305 Ala Pro Pro Leu Leu Val Pro Thr 370 Ser Ser le Lys 275 Asp Tyr Gin Asp His 355 Lys Val Gin Arg 260 Ser Val Val Gly Ser 340 Leu Leu Asp Gin 245 Thr Asn His Ser Tyr 325 Cys Met Ser Pro 230 Pro Pro Glu Gly The 310 Ser Met Lys Ala Gly LeLI Ala Gly Leu Phe Arg Leu S er 295 Gin Ala Asn Pro Thr 375 Val Ala Pro 280 His Asp Ala Asn 360 Ser Val Val 265 Gin Gly Leu Tyr Thr 3 45 Ala 235 Thr Phe 250 Arg Pro Ala Asn Arg Gin Gly Trp 315 Cys Glu 330 Asn His Val Pro Leu Phe Leu Arg Val 300 Leu Gly Ala Lys Tyr 380 
Gly Arg Arg Leu 285 Cys Asp Glu le Ala 365 Gin Ala Arg 270 Pro Arg Trp Cys Leu 350 rNY's Arg Ser 255 Arg Gly Arg Val Ser 335 Gin Ala 240 Pro Gin le His le 320 Phe Ser Ala Val Leu Tyr Asp Ser Ser Asn Asn 385 Cys Val His Ile Leu Arg Lys His Arg 390 Asn Met Val 395 Val Lys Ala Cys S S


INFORMATION FOR SEQ ID NO:22: SEQUENCE CHARACTERISTICS: LENGTH: 1926 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vi) ORIGINAL SOURCE: ORGANISM: MURIDAE TISSUE TYPE: EMBRYO (ix) FEATURE: NAME/KEY: CDS LOCATION: 93..1289 OTHER INFORMATION: /note= "mOP2 cDNA" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: GCCAGGCACA GGTGCGCCGT CTGGTCCTCC CCGTCTGCCG TCAGCCGAC CCGACCACCT ACCAGTGGAT CCGCGCCGCC TGAAAGTCCG AG ATG GCT ATG CGT Met Ala Met Arg 1 104

CCC

PTO

GGG CCA CTC TGG Gly Pro Leu Trp TTC GGC CTT GCT Leu Gly Leu Ala TGC CC CTG CGA Cys Ala Leu Gly (;GC CAC GGT CCG Gly His Gly Pro CCC CCG CAC ACC Pro Pro His Thr CCC CAG CGT CGC Pro G,"n Atrg Ar'g)Leu Cly G;CG CGC GAG Ala Arg Glu C7TA CCC GGA ]Leu Pro Gly 55 CCC GAC ATG CAG Arg Asp Met Gin

CGT

Arg GAA ATC CTG GCG Glu Ile Leu Ala CGG CCC CGA CCC Arg Pro Arg Pro GCA CAA CCC Ala Gln Pro GCG GCT Ala Ala TAC CAC Tyr His GTG CTC GGG Val Leu Cly GCC CCC CAIU Ala Arg Gln GCC ATG ACC Ala Met Thr 248 296 344 392 CCA CCG Pro Ala 70 TCC CC CCC CTC Ser Ala Pro Leu CAC GAC GGC CCC Asp Asp Gly Cly ATG TTG GAC CTA Met Leu Asp Leu

GAT

ASP

GAC

Asp CCA CCA CAG GCT Pro Pro GIn Ala TTA CCC CGT CC Leu Gly Arg Ala CTG GTC ATG AGO Leu Val Met Ser GTC MAC ATG GTG Val. Asn Met Val CGC GAO CGT ACC CTG GGC Arg Asp Arg Thr Leu Gly 115 TAO CAG GAG Tyr Gin Glu COT GOT GGG Pro Ala Gly 135

CCA

Pro 120 CAC TGG MAG GMA His Trp Lys Glu TIC CAC TTT GAO CTA Phe His Phe Asp Leu ACC CAG ATO Ihr Gin Ile 130 GAG GOT GTC ACA Glu Ala Val Thr GOT GAG TIC CGG ATO TAO AAA GAA Ala Glu Phe *Arg Ile Tyr Lys Glu 145 536 COO AGC Pro Ser 150 ACC CAC COG 070 AAC ACA ACC CTC CAC Thr His Pro Leu Asn Thr Thr Leu His 155 AGO ATG 770 GMA Ser Met Phe Glu GTC CMA GAG CAC Val Gin Ciu His

TO'Z

17C~ MAC AGG tAG TOT izn Arg Ser TTG TT0 TTT TTG Leu Phe Phe Leu 077 CAG ACG 010 Lea Gln Thr Leu TOT G(;G GAO GAG Ser Gly Asp Glu IGG CTG GIG CIG Trp Lea Val Leu GAO ATO Asp Ile 195 ACA GCA GC Thr Ala Ala CTC CGC OCTC Leu Arg Leu 215

AGT

Ser 200 GAO OGA TGG CTG Asp Arg Trp Leu MOC CAT CAC MAG Asn His His Lys GAO OTG GGA Asp Lea Gly 210 GAT COT GGO Asp Pro Gly 728 776 TAT GIG GMA ACC Tyr Val Glu Thr GAT GGG CAC AGO Asp Gly His Ser CTG GOT Leu Ala 230 GGT OIG 077 GGA Gly Leu Leu Gly CMA GCA OCA CGO Gin Ala Pro Arg AGA CAG COT 770 Arg Gin Pro Phe

ATG

Met 245 GTA ACC 770 770 Val Thr Phe-Phe

AGG

Arg 250 GCC AGO CAG ACT Ala Ser Gin Ser GIG CGG CO Val Arg Ala COT CC Pro Arg 260 GAG 077 Glu Lea 275 GOA CG AGA OOA Ala Ala Arg Pro

CTG

Leu 265 MAG AGG AGG tAG Lys Arg Arg Gin MAG AMA ACG MOC Lys Lys Thr Asn 872 920 968 1016 0* COG CAC COO Pro His Pro CCC CCC AGA Arg Gly Arg 295 AAO AA.A Asn Lys 280 OTC OCA GGG Leu Pro Gly 777 GAT CAT GGO Phe Asp Asp Gly CAC CCI TOO His Cly Ser 290 AGO ITO CT Ser Phe Arg GAG GTT TGO Glu Val Cys CGC AGG Arg Arg 300 OAT GAG 010 TAO His Giu Leu Tyr 102 GAC CTT GGC TGG CTG GAC TGG GTC ATC GCC CCC CAG GGC TAC TCT GCC Asp Leu Gly Trp Leu Asp Trp Val 315 Ile Ala Pro Gly Tyr Ser Ala

TAT

Tyr 325

GCC

Ala TCGT GAG GGG Cys Glu Gly TGT GCT TTC CCA Cys Ala Phe Pro GAC TCC TGT ATO Asp Ser Cys Met ACC MAC CAT Thr Asn His

GCC

Ala 345

AAG

Lys TTG CAG TCT Leu Gin Ser CAC CTG ATG His Leu Met AAG CCA Lys Pro 355 GAT GIT GIC Asp Val Val TCT GTG CIG Ser Vai Leu 375

CCC

Pro 360

TAC

Tyr GCA TGC TGT Ala Cys Cys

GCA

Ala 365

ACC

Thr AMA CIG AGT GCC ACC Lys Leu Ser Ala Thr 370 ATC CTG CGT MAA CAC Ile Leu Arg Lys His 385 TAT GAC AGC AGC Tyr Asp Ser Ser 380 MAC MAT GTC Asn Asn Val CGT MAC ATG GTG GTC MAG GCC TGT GOC TOC CAC Arg Asn Met Val Val Lys Ala Cys Gly Cys His TGAGGCCCCG CCCAGCATCC 1064 1112 1160 1208 1256 1309 1369 1429 1489 1549 1609 1669 1729 1789 1849 1909 1926

TGCTTCTACT

TATCATAGCT

AAATTCTGGT

CTCTCCATCC

ACTGAGAGGT

CTCAGCCCAC

GATCTGGGCT

CATACACTTA

AGAATCAGAG

AGGAGMATCT

AAAAAAAAAC

ACCTTACCAT

CAGACAGGG

CTTTCCCAGT

TCCTACCCCA

CTGGGGTCAG

AATGGCM.AT

CTCTGCACCA

GATCAMTGCA

CCAGGXATAG

CTGTGAGTTC

GGAATTC

CTGGCCGGGC

CMATGGGAGG

TCCTCTGTCC

AGCATAGACT

CACTGAAGGC

TCTGGATGGT

TTCATTGTGG

TCGCTGTACT

CGGTGCATGT

CCCTCTCCAG

CCCTTCACTT

TTCATGGGGT

GAATGCACAC

CCACATGAGG

CTMAGAAGGC

CAGTTGGGAC

CCTTGAAATC

CATTAATCCC

AGGCAGAAMC

CCCCTGGCCA

TTCGGGGCTA

AGCATCCCAG

MGACTGATC

CGTGGAATTC

ATTTTTAGGT

AGAGCTAGCT

AGCGCTAAAG

CCTTCTATGT

CTTCCTGCTA

TCACCCCGCC

AGCTATGCTA

CTTGGCCATC

TAAACTAGAT

ATAACAGACA

TGTTAGAA

AGACAGAGAC

AAGGCCACAT AGAAAGAGCC TGTCTCGGGA GCAGGAAAAA

INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 399 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: OTHER INFORMATION: /product= "mOP2-PP" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: Met Ala Met Arg Pro 1 5 Ala Leu Gly Gly Gly Gly Pro Leu Trp Leu Leu Gly Leu Ala Leu Cys Arg Arg Val Leu Ala Arg Met Thr Ala Asp 100 Leu Gly 115 Gln Ile Lys Glu Phe Glu Leu Asp 180 Gly Ala Leu Pro Pro Ala Asp Asp Val Met Gln Glu Ala Gly 135 Ser Thr 150 Val Gln Gln Thr His Arg Gly 55 Ser Asp Ser Pro 120 Glu

His Glu Leu Gly Pro Giu Arg Arg Pro Ala Pro Gly Gly Phe Val 105 His Trp Ala* Val Pro Leu His Ser 170 Arg Ser 185 Arg 25 Arg Arg Leu Pro Asn Lys Thr Asn 155 Asn Ciy 10 Pro Asp Pro Phe 75 Pro Met Glu Ala 140 Thr Arg Asp Pro Met Arg 60 Met Gin Val Phe 125 Ala Thr Glu Glu His Gin Ala Leu Ala Gin 110 His Gin Len Ser Gly 190 Thr Arg Gin Asp His Arg Phe Phe His Asp 175 Trp Cys Gin Pro Len Len Asp Asp Arg Ile 160 Leu Leu Pro Ile Ala Tyr Gly Arg Leu Ile 145 Ser Phe Val Gin Leu Ala Ala His Ala Arg Thr Thr 130 Tyr Met Phe Leu t.

Asp 195 Ile Thr Ala Ala Ser 200 Asp Arg Trp Leu Leu 205 Asn His His Lys Leu Gly Leu Arg Leu Tyr Val Glu Thr Ala Asp Gly His Ser Met Asp 215 220 225 Pro Gly Leu Ala Gly Leu Leu Gly 230 Pro Phe Met Val Thr Phe Phe Pro Arg 260 Glu Leu 275 Gly Ser Phe Arg Ser Ala Met Asn 340 Lys Pro 355 Ala Thr Lys His 245 Ala Ala Pro His Arg Gly Asp Leu 310 Tyr Tyr 325 Ala Thr Asp Val Ser Val Arg Asn 390 Arg Pro Arg 295 Gly Cys Asn Val Leu 375 Met Pro Asn 280 Glu Trp Glu His Pro 360 Tyr

Val Leu 265 ,Y~s Val Leu Gly Ala 345 Lys Tyr Val Arg 250 Lys Leu Cys Asp Glu 330 Ile Ala Asp Lys Arg Gin 235 Ala Ser Arg Arg Pro Gly Arg Arg 300 Trp Val 315 Cys Ala Leu Gin Cys Cys Ser Ser 380 Ala Cys 395 Ala Gin Gin Ile 285 His Ile Phe Ser Ala 365 Asn Gly Pro Ser Pro 270 Phe Glu Ala Pro Leu 350 Pro Asn Cys Arg Ser Arg 240 Pro Val Arg Lys Lys Thr Asp Asp Gly Leu Tyr Val 305 Pro Gin Gly 320 Leu Asp Ser 335 Val His Leu Thr Lys Leu Val Ile Leu 385 His Gin Ala Asn His 290 Ser Tyr Cys Met Ser 370 Arg 4*
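As a consistency check on the cDNA and protein records above (SEQ ID NOS:16 to 23), the following Python sketch compares each CDS LOCATION range with the stated LENGTH of the corresponding protein record. It is an illustration only and not part of the original specification: the dictionary labels and the helper names `cds_codons` and `check` are hypothetical, and the coordinates are simply those printed in the listing headers.

```python
# CDS coordinates (inclusive, 1-based) and protein lengths as printed in the
# listing headers. The labels are illustrative, not taken from the specification.
RECORDS = {
    "hOP1 (SEQ ID NO:16/17)": ((49, 1341), 431),
    "mOP1 (SEQ ID NO:18/19)": ((104, 1393), 430),
    "hOP2 (SEQ ID NO:20/21)": ((490, 1696), 402),
    "mOP2 (SEQ ID NO:22/23)": ((93, 1289), 399),
}


def cds_codons(start, end):
    """Return (whole codons, leftover nucleotides) for an inclusive range."""
    return divmod(end - start + 1, 3)


def check(records):
    """Map each record to (codons, leftover, agrees_with_protein_length)."""
    report = {}
    for name, ((start, end), protein_len) in records.items():
        codons, leftover = cds_codons(start, end)
        report[name] = (codons, leftover, leftover == 0 and codons == protein_len)
    return report


if __name__ == "__main__":
    for name, result in check(RECORDS).items():
        print(name, result)
```

With the values as printed, the hOP1, mOP1 and mOP2 ranges divide evenly into 431, 430 and 399 codons, matching their protein records, while the hOP2 range 490..1696 spans 1207 nucleotides, that is, 402 codons plus one extra nucleotide. Whether that extra nucleotide reflects the original record or a transcription artifact cannot be determined from the listing itself.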


(2) INFORMATION FOR SEQ ID NO:24: SEQUENCE CHARACTERISTICS: LENGTH: 1368 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear MOLECULE TYPE: cDNA NAME/KEY: CDS LOCATION: 1..1368 OTHER INFORMATION: /STANDARD PUBLICATION INFORMATION: AUTHORS: WHARTON, KRISTI A.; THOMSEN, GERALD H.; GELBART, WILLIAM M. TITLE: DROSOPHILA 60A GENE... JOURNAL: PROC. NAT'L ACAD. SCI. USA VOLUME: 88 PAGES: 9214-9218 DATE: OCT 1991 RELEVANT RESIDUES IN SEQ ID NO:3: FROM 1 TO 1366 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:


S S 5, S ATG TCG GGA OTG Mlet Ser Gly Leu CTG GGA 'CTC GGA Ci,(.C GTT GAG 6CC iz Val GTh Ala CAG ACG A'TC ATG Gln 'rhr Ile Met 50 TCG TAO GAG ATC Scar Tyr 6Th Il~e MAC ACC TCG GAG Asn Thr Ser Giu AVG Gi' CVG CTC Ifet Val Leu Leu TTO GTG CGCG ACC ?he Val Ala Thr InACG, CCG CCCG Thr F P r GGC AG GAO Gly Lys Asp Gfl GCA GTG CTC GCC TCC Val Ala Val Leu Ala Ser ACC CAG TCC Thr Gin Ser CAC AGA GXG His Arg Val ATI TAC ATA GAC le TNyr Ile Asp CTG AGC GAG GAC Leu Ser Giu Asp MAG CTG GAO GTC Lys Leu Asp Val

CTC

Leu 'ITC CIG GGC ATO Phe Leu Gly Ile GMA CGG CCG ACG Glu Arg Pro Thr CTG AGO Leu Skir AGO CAC CAG Ser His Gin 85 TTG TCG CTG AGG Leu Ser Leu Arg TCG GCT CCC AAG Ser Al-1 Pro Lys TTC CTG Phe Leu 106 CTG GAC GTC Leu Asp Val CAT GAG GAC Asp Giu Asp 115 CAC CGC ATC ACG His Arg Ile Thr

GCG

Ala 105 GAG GAG GGT CTC Glu Glu Cly Leu GAC GAC TAC GMA Asp Asp Tyr Giu GOC CAT CCC Cly His Arg TCC AGO Ser-Arg 125 MAC TTC Asn Phe 140 ACC CAT GAG Ser Asp Gin 110 AGO AGC CC Arg Ser Ala ATC ACC GAC le Thr Asp GAC CTC Asp Leu 130 GAG GAG CAT GAG Glu Giu Asp Glu

GCC

Gly 135 GAG CAG CAG MAG Giu Gin Gin Lys

CTG

Leu- 145

MAC

Asn GAC LAG COG CC Asp Lys Arg Ala AAG CCC CAC CAC Lys Arg His His 165

ATC

le 150 GAG GAG AGC GAC Asp Giu Ser Asp

ATC

le 155 ATC ATG ACC TTC le Met Thr Phe

CTC

Leu 160 M.T GTG GAC GAA Asn Val Asp Giu CCT CAC GAG CAC Arg His Giu His GOC COT Gly Arg 175 CGC CTG TG Arg Leu Trp, ATG CCC GAG Met Ala Giu 195 GAC GTC TCC MAC Asp Val Ser Asn CCC MAC CAC MAC Pro Asn Asp Asn TAC CTG GTC Tyr Leu Val 190 MCG TOG CGG Lys Trp Leti CTG CGC ATC TAT Leu Arg Ile Ty r

CAG

Gin 200 MAC CCC MAC GAG Asn Ala Asn Giu

GCC

Gly 205 ACC CC Thr Ala 210 MAC AGO GAG TTC Asn Arg Giu Phe

ACC

Thr 215 ATC ACG GTA TAC Ile Thr Val Tyr

CC

Ala 220 ATT CCC ACC GC Ile Cly Thr Cly

ACC

Thr 225 CTG GGC CAG CAC Leu Gly Gin His

ACC

Thr 230 ATG GAG CCC CIC Met Giu Pr~o Leu

TCC

Ser 235 ICG GTG MC ACC Ser Val Asn Thr

ACC

Thr 240 720

CCC4f GAC TAC GIG Gly AspTyr Val TCG TIC GAG CTC Trp, Leu Ciu Leu GTC ACC GAG GCC Val Thr Giu Oly CTG CAC Leu His 255 GAG ICC CTG Giu Trp Leu CAC C CTC His Ala Val 275 MAG TCC MAG GAC Lys Ser Lys Asp

MAT

Asn 265 CAT CCC tTC TAC His Gly Ile Tyr ATT CGA GCA le Cly Ala 270 GAO ATT CGA Asp Ile Gly MAC CGA CCC GAC Asn Arg Pro Asp GAG GTG MAG OTO Giu Val Lys Leu

GAC

Asp 2b65 CIG ATC Lei Ile 290 CAC CCC MAG His Arg Lys GTG GAC Val Asp 295 GAC GAG TIC GAG Asp Giu Phe Gin TTC ATG ATC GCC Phe Met Ile Gly 107 ~TTC TTC Phe Phe 305 CCC GGA CCC GAG CTG ATC A.AG C ACG GCC Arg Gly Pro Glu Leu Ile Lys Ala Thr Ala CAC AGC AGC CAC His Ser Ser His 320 CAC AGG AGC MAG His Arg Ser Lys AGC GCC AGC CAT Ser Ala Ser His CCA CC Pro Arg 330 MAG CGC MAG Lys Arg Lys MAG TCG Lys Set 335 [GTG TCG CCC Val Ser Pro AGC TCC CAG Ser Cys Gin 355 MAC GTG CCC CTG Asn Val Pro Leu

CTG

Leu 345 GAA CCG ATG GAG Giu Pro Het Glu AGC ACG CGC Ser Thr Arg 350 CTG GGC TGG Leu Cly Trp ATG CAC ACC CTG Met Gin Thr Leu ATA GAC TTC MCG le Asp Phe Lys CAT GAC His Asp 370 TGG ATC ATC GCA Trp le Ile Ala GAG CCC TAT GCC Glu Gly Tyr Cly TTC TAC TCC AGC Phe Tyr Cys Ser [;c t"-1y 385 GAG TCC MAT T IC Glu Cyrs Asn Phe

CCC

Pro 390 CTC MAT GCG CAC Leu Asn Ala His MAC CCC ACG MAC Asn Ala Thr Asn 1008 1056 1104 1152 1200 1248 1296 1344 1368 GCC ATC GTC CAG Ala Ile Val Gin

ACC

Thr 405 CTG CTC CAC CTG Leu Val His Leu GAG CCC MG MAG Glu Pro Lys Lys GTG CCC Val Pro 415 AAC CCC TGC Lys Pro Cys CAC CIC MAC Eis Leu Asn 1 435

TGC

Cys 420 CC CCC ACC ACG CTG GGA GCA CIA Ala Pro Thr Arg Leu Gly Ala Leu 425 CCC GTT CTG TAC Pro Val Leu Tyr 430 AGA MAC ATG ATT Arg Asn Met Ile 445 GAC GAG MAT GIG MAC CTG AMA MC TAT Asp Glu Asn Val Asn Leu Lys Lys Tyr 440 GIG AMA Val Lys 450 TCC TCC CCC C Ser Cys Gly Cys CAT IGA His 455

(2) INFORMATION FOR SEQ ID NO:25:

SEQUENCE CHARACTERISTICS: LENGTH: 455 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Met Ser Gly Leu Arg Asn Thr Ser Glu Ala Val Ala Val Leu Ala Ser 1 5 10 Leu Gly Leu Gly Met Val Leu Leu Met Phe Val Ala Thr Thr Pro Pro 25 Ala Val Glu Ala Thr Gin Ser Gly Ile Tyr Ile Asp Asn Gly Lys Asp 40 Gin Thr Ile Met His Arg Val Leu Ser Glu Asp Asp Lys Leu Asp Val 55 Ser Tyr Glu Ile Leu Glu Phe Leu Gly Ile Ala Glu Arg Pro Thr His 70 75 Leu Ser Ser His Gin Leu Ser Leu Arg Lys Ser Ala Pro Lys Phe Leu 90 Leu Asp Val Tyr His Arg Ile Thr Ala Glu Glu Gly Leu Ser Asp Gin 100 105 110 Asp Glu Asp Asp Asp Tyr Glu Arg Gly His Arg Ser Arg Arg Ser Ala 115 120 125 Leu Glu Glu Asp Glu Gly Glu Gln Gin Lys Asn Phe Ile Thr Asp 130 135 140 Leu Asp Lys Arg Ala Ile Asp Gin Ser Asp Ile Ile Met Thr Phe Leu 145 150 155 160 Asn Lys Arg His His Asn Val Asp Gin Leu Arg His Glu His Gly Arg 165 170 175 Arg Leu Trp Phe Asp Val Ser Asn Val Pro Asn Asp Asn Tyr Leu Val 180 185 190 Met Ala Glu Leu Arg Ile Tyr Gln Asn Ala Asn Gin Gly Lys Trp Leu 195 200 205 Thr Ala Asn Arg Gin Phe Thr Ile Thr Val Tyr Ala Ile Gly Thr Gly 210 215 220 Thr 225 Gly Glu His Leu Phe 305 His Val Ser His

Gly 385 Ala Lys His Val Leu Asp Trp Ala Ile 290 Phe Arg Ser Cys Asp 370 Glu Ile Pro Leu.

Lys 450 Gly Tyr Leu Val 275 His Arg Ser Pro Gln 355 Trp Cys Val Cys Asn 435 Ser Gin Val Val 260 Asn Arg Gly Lys Asn 340 Met Ile Asn Gin Cys 420 Asp Cys His Gly 245 Lys Arg Lys Pro Arg 325 Asn Gin Ile Phe Thr 405 Ala Glu Gly Thr 230 Trp Ser Pro Val Glu 310 Ser Val Thr Ala Pro 390 Leu Pro Asn Cys Met Glu Leu Glu Lys Asp Asp Arg 280 Asp Asp 295 Leu Ile Ala Ser Pro Leu Leu Tyr 360 Pro Glu 375 Leu Asn Val His Thr Arg Val Asn 440 His 455 Pro Leu Asn 265 Glu Val Lys Glu Phe Gln Lys Ala Thr 315 His Pro Arg 330 Leu Glu Pro 345 Ile Asp Phe Gly Tyr Gly Ala His Met 395 Leu Leu Glu 410 Leu Gly Ala 425 Ser Thr Ile Leu Pro 300 Ala Lys Met Lys Ala 380 Asn Pro Val Glu Tyr Asp 285 Phe His Arg Glu Asp 365 Phe Ala Lys Asn Gly Ile 270 Asp Met Ser Lys Ser 350 Leu Tyr Thr Lys Thr Leu 255 Gly Ile Ile Ser Lys 335 Thr Gly Cys Asn Val 415 Thr 240 His Ala Gly Gly His 320 Ser Arg Trp Ser His 400 Pro Leu Pro Val Leu Tyr 430 Asn Met Ile Leu Lys Lys Tyr Arg 445

INFORMATION FOR SEQ ID NO:26:

SEQUENCE CHARACTERISTICS: LENGTH: amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (iii) ORIGINAL SOURCE: ORGANISM: Homo sapiens (ix) FEATURE: NAME/KEY: Protein LOCATION: 1..102 OTHER INFORMATION: /note= "BMP3" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS: LENGTH: 104 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME/KEY: Protein LOCATION: 1..104 OTHER INFORMATION: /note= "BMP3" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

Cys Ala Arg Arg Tyr Leu Lys Val Asp Phe Ala Asp Ile Gly Trp Ser 1 5 10 Glu Trp Ile Ile Ser Pro Lys Ser Phe Asp Ala Tyr Tyr Cys Ser Gly 25 Ala Cys Gln Phe Pro Met Pro Lys Ser Leu Lys Pro Ser Asn His Ala 35 40 Thr Ile Gln Ser Ile Val Ala Arg Ala Val Gly Val Val Pro Gly Ile 55 Pro Glu Pro Cys Cys Val Pro Glu Lys Met Ser Ser Leu Ser Ile Leu 65 70 75 Phe Phe Asp Glu Asn Lys Asn Val Val Leu Lys Val Tyr Pro Asn Met 85 90 Thr Val Glu Ser Cys Ala Cys Arg 100

INFORMATION FOR SEQ ID NO:27:

SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vi) ORIGINAL SOURCE: ORGANISM: HOMO SAPIENS (ix) FEATURE: NAME/KEY: Protein LOCATION: 1..102 OTHER INFORMATION: /note= (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Cys Lys Lys His Glu Leu Tyr Val Ser Phe Arg Asp Leu Gly Trp Gln 1 5 10 Asp Trp Ile Ile Ala Pro Glu Gly Tyr Ala Ala Phe Tyr Cys Asp Gly 25 Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His Ala 40 Ile Val Gln Thr Leu Val His Leu Met Phe Pro Asp His Val Pro Lys 55 Pro Cys Cys Ala Pro Thr Lys Leu Asn Ala Ile Ser Val Leu Tyr Phe 70 75 Asp Asp Ser Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val Val 90 Arg Ser Cys Gly Cys His 100

INFORMATION FOR SEQ ID NO:28:

SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vi) ORIGINAL SOURCE: ORGANISM: HOMO SAPIENS (ix) FEATURE: NAME/KEY: Protein LOCATION: 1..102 OTHER INFORMATION: /note= "BMP6" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

Cys Arg Lys His Glu Leu Tyr Val Ser Phe Gln Asp Leu Gly Trp Gln 1 5 10 Asp Trp Ile Ile Ala Pro Lys Gly Tyr Ala Ala Asn Tyr Cys Asp Gly 25 Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His Ala 40 Ile Val Gln Thr Leu Val His Leu Met Asn Pro Glu Tyr Val Pro Lys 55 Pro Cys Cys Ala Pro Thr Lys Leu Asn Ala Ile Ser Val Leu Tyr Phe 70 75 Asp Asp Asn Ser Asn Val Ile Leu Lys Lys Tyr Arg Trp Met Val Val 90 Arg Ala Cys Gly Cys His 100

INFORMATION FOR SEQ ID NO:29:

SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME/KEY: Protein LOCATION: 1..102 OTHER INFORMATION: /label= OPX /note= "WHEREIN XAA AT EACH POSITION IS INDEPENDENTLY SELECTED FROM THE RESIDUES OCCURRING AT THE CORRESPONDING POSITION IN THE C-TERMINAL SEQUENCE OF MOUSE OR HUMAN OP1 OR OP2 (SEE SEQ. ID NOS. 5, 6, 7 and 8 or 16, 18, 20 and 22.)" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

Xaa Xaa His Leu Tyr Val Xaa Phe Xaa Asp Leu Gly Trp Xaa Asp Trp Xaa Ile Glu Cys Xaa Phe Ala Pro Xaa Gly Xaa Ala Tyr Tyr Pro Leu Xaa Xaa Met Asn Ala Thr Xaa Xaa Cys Glu Gly Asn His Ala Val Pro Lys Ile Xaa Gln Xaa Leu Val His 55 Thr Xaa 70 Xaa Xaa Xaa Pro Xaa Cys Cys Ala Pro Leu Xaa Ala Xaa 75 Ser Val Leu Tyr Xaa Asp Xaa Ser Xaa Asn Val Xaa Leu Xaa Lys Xaa Arg Asn Met Val Val 90 Xaa Ala Cys Gly Cys His 100

INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS: LENGTH: 97 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Generic Sequence OTHER INFORMATION: wherein each Xaa is independently selected from a group of one or more specified amino acids as defined in the specification.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

Leu Xaa Xaa Xaa Phe Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Pro Tyr Cys Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Gly Xaa Xaa Xaa His Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Xaa Met Xaa Xaa Xaa Ala Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Leu Xaa Val Xaa

INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS: LENGTH: 102 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (ix) FEATURE: NAME: Generic Sequence 6 OTHER INFORMATION: wherein each Xaa is independently selected from a group of one or more specified amino acids as defined in the specification.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

Cys Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Phe 1 5 Xaa Xaa Xaa Gly Trp Xaa Xaa Trp Xaa Xaa Xaa Pro Xaa Xaa Xaa Xaa Ala Xaa Tyr Cys Xaa Gly Xaa Cys Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asn His Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys 60 Cys Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Xaa Leu Xaa Xaa Xaa Xaa Xaa Met Xaa Val Xaa Xaa Cys Xaa Cys Xaa

INFORMATION FOR SEQ ID NO:32:

SEQUENCE CHARACTERISTICS: LENGTH: 1238 base pairs, 372 amino acids TYPE: nucleic acid, amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) ORIGINAL SOURCE: ORGANISM: human TISSUE TYPE: BRAIN (iv) FEATURE: NAME/KEY: CDS LOCATION: OTHER INFORMATION: /product= "GDF-1" /note= "GDF-1 CDNA"

PUBLICATION INFORMATION: AUTHORS: Lee, Se-Jin TITLE: Expression of Growth/Differentiation Factor 1 JOURNAL: Proc. Nat'l Acad. Sci. VOLUME: 88 RELEVANT RESIDUES: 1-1238 PAGES: 4250-4254 DATE: May-1991

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

GGGGACACCG GCCCCGCCCT CAGCCCACTG CTCCCGGGCC GCCGCGGACC CTGCGCACTC

*5 5 4 TCTGGTCATC GCCTGGGAGG AAG ATG CCA CCG CCC CAG CAA GCT Met Pro Pro Pro Gin Gin Gly 1 5 CCC TGC GGC Pro Cys Gly CAC CAC CTC CTC CTC CTC CTG GCC CTG CTG His His Leu Leu Leu Leu Leu Ala Leu Leu 15 CTG ACC CCC CCC CCC GTG CCC CCA CCC CCA Leu Thr Arg Ala Pro Val Pro Pro Gly Pro CAG GOT CTA GGA CTG CCC CAT GAG CCC CAG Gin Ala Leu Gly Leu Arg Asp Clu Pro Gin 45 CTG CCC TOG CTG Leu Pro Ser Leu CCC GCC GCC CTG Ala Ala Ala Leu GGT GCC CCC AGG Gly Ala Pro Arg 117 CGG CCG GTT CCC Arg Pro Val Pro CCC CAG GAG ACC Pro Gin Giu Thr ACC CTG CMA CCG Thr Leu Gin Pro ATC GIG CGC CAC Ile Val Arg His GAG CCT GTC TCG Glu Pro Val Ser TIC GAC CTG TCG Phe Asp Leu Ser CCC CTG GAG CTG Arg Leu Glu Leu CCC GCC TOG GAG Oily Giy Trp Giu GCG GAC CCC CG Ala Asp Pro Gly

CCG

Pro

AGO

Arg

TGC

Cyc

ATC

Ile 105 0CC Ala 120

OCT

Ala 135 GTC ATG TOG COC Val Met Trp Arg TCT 0CC TCO CG Ser Gly Ser Arg CAC GIG GAG GAO His Val Giu Glu CCC GAC CGC GOT Pro Asp Arg Gly GCG COG CAT TOC Ala Gly His Cys GIG GAA CCC OCT Val Glu Pro Ala COT TIC Arg Phe 150 GCG GCG OCO Ala Ala Ala 010 Leu 165

CCC

Pro GGG CCG CCA Gly Pro Pro AAC 0CC TCA Asn Ala Fer 180 GIG COC Val Arg 195 TOG CCO Trp Pro 210 AGC GIG OCG CAA Ser Val Ala Gin GIG CIG CTC COC Val Leu Leu Arg 0CC GAG CTG CIG Ala Glu Leu Leu CCC AOC CTC COC Arg Ser Leu Arg CCC TOC GCG COC Ala Cys Ala Arg GAC CCC CCC CTG Asp Pro Arg Leu 4

CTG

Leu 65

CG

Arg 80

CTG

Leu 95

OCG

Ala 110

CCT

Pro 125

GAG

Glu 140 Ala 155

GCG

,la 170

CAG

Uin 185

;GC

100

:TG

.eu 15

:TG

.,eu ~30 rGC ,ys ~45 TT CGA COO CCC Phe Arg Arg Arg ACO ICC CCA GG Thr Ser Pro Gly CCC GTC 0CC GGA Gly Val Ala Gly CCC ACC CCC 0CC Pro Ihr Arg Ala GAG TOO ACA GTC Gin Irp Thr Val COC CCG AGO CG Arg Pro Ser Arg GCG OCA 0CC CCO Ala Ala Ala Pro

GAO

Asp

GTC

Val

AAC

Asn 100

TCG

Ser 115

GTC

Val 130 0CC Ala 145

GAG

Giu 160 CCC CAG GC GCC Oly Gin Cly Ala TIG OTG CCC 0CC Leu Val Pro Ala GCC GCT 100 OCT Ala Ala Trp Ala GCG CTG OCO CTA Ala Leu Ala Leu CCC GAG CCC ICC Ala Olu Ala Ser CAC CCC CTG 0CC His Pro Leu Ala

GCC

Cly 175

CIG

Leu 190

CC

Arg 205

CC

Arg 220 Leu 235

CGG

Arg 250 293 338 383 428 473 518 563 608 653 698 743 788 833 CCC CCC CCC CCI Pro Arg Ala Pro CTC CTG GIG ACC Leu Leu Val Thr I It CCC CCC CGC GAC Pro Arg Arg Asp GCT TGT CCC GCG Ala Cys Arg Ala TCC CAC CGC TGG Trp, His Arg 7Trp TGC CAG GGT CAG Cys Gin Gly Gin

GCC

Ala 255

CGG

Arg 270

GTC

7fa1 285

C

Cys 300 GMA CCC GTG TTG Glu Pro Val Leu CGG CIG TAC GTG Arg Leu Tyr Val ATC GCG CCC CC Ile Arg Pro Arg GCG CTG CCC GTC Ala Leu Pro Val

GGC

Cly 260

AGC

Ser 275

CCC

Gly 290

C

Ala 305 CCC GGC CCC CG Gly Gly Pro Gly TTC CCC CAG GIG Phe Arg Clu Val TTC CTG CCC MAC Phe Leu Ala Asn CTC TOG CCC ICC Leu Ser Gly Ser

GCC

Gly 265

GCC

Gly 280

TAC

Tyr 295

GCC

Cly 310 GGC CCG CCC C CTC MAC CAC CCT GTG Gly Pro Pro Ala Leu Asn His Ala Val 315 CCC CCG CTC ATG Arg Ala Leu Met

CAC

His 325 878 923 968 1013 1058 1103 1148 1193 1238 GCC CCC CCC CCC CCA Ala Ala Ala Pro CGC CTC 7CC CCC Arg Leu Ser Pro GIG GTC CTC CCC Val Val Leu Arg Gly 330

ATO

le 345

CAG

Gin 360 CCC CCC GAC CTG Ala Ala Asp Leu TCC GTG CTC TTC Ser Val Leu Phe TAT GAG GAO ATC Tyr Glu Asp Met

CCC

Pro 335

TTT

Phe 350

GIG

Val 365 TCC TCC CTG CCC Cys Cys Val Pro CAC MAC AGO GAO Asp Asn Ser Asp GIG CAC GAG C Val Asp Glu Cys

C

Ala 340

MAC

Asn 355

GC

Gly 370 TGO CCC TAACCCCCCC CCGGCACCCA CCCGGCCCCA ACAATAAATG CCCCGTGG Cys Arg 372

(2) INFORMATION FOR SEQ ID NO:33:

SEQUENCE CHARACTERISTICS: LENGTH: 372 amino acids TYPE: amino acid STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (vi) ORIGINAL SOURCE: ORGANISM: human TISSUE TYPE: BRAIN (ix) FEATURE: NAME/KEY: CDS LOCATION: OTHER INFORMATION: /function= /product= "GDF-1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

Met Pro Pro Pro Gln Gln Gly Pro Cys Gly 1 5 His His Leu Leu Leu Leu Leu Ala Leu Leu Leu Pro Ser Leu Pro 20 Leu Thr Arg Ala Pro Val Pro Pro Gly Pro Ala Ala Ala Leu Leu 35 Gln Ala Leu Gly Leu Arg Asp Glu Pro Gln Gly Ala Pro Arg Leu 50 Arg Pro Val Pro Pro Val Met Trp Arg Leu Phe Arg Arg Arg Asp 60 65 Pro Gln Glu Thr Arg Ser Gly Ser Arg Arg Thr Ser Pro Gly Val 75 80 8 Thr Leu Gln Pro Cys His Val Glu Glu Leu Gly Val Ala Gly Asn 95 100 Ile Val Arg His Ile Pro Asp Arg Gly Ala Pro Thr Arg Ala Ser 105 110 115 Glu Pro Val Ser Ala Ala Gly His Cys 120 Pro Glu Trp 125 Thr Val Val Phe Arg Gly Ala Gly Asn Pro Leu Pro Ala Trp Cys Gly Ala Arg Leu Glu Trp Pro Pro Ser Ala Val Arg Arg Arg Gly Pro Ala Ser Val Phe Ser Val Ala Arg Ala Asp Glu Arg Ile Ala Asn Ala Ser Glu Ala Val Leu Glu Ser Cys Pro Pro Arg Leu His Ala Val Ala Ala Gln Arg Leu Arg Arg Leu Leu Val Arg Val Val Leu Phe Arg Ala Gly Leu Ala Ala Ala His Gly Phe Phe Leu Arg Cys Asp Pro Ala Gln Val Ala Glu Pro Gly Arg Leu Ser Ala Cys Asn Ser Ala Gly Pro Trp Ala Ala Leu Pro Glu Ala Gly Leu Val Ser Arg Pro Ala Ala Ala Leu Ser Ala Gly Val Asn Ser Met Pro Asp Ala 145 Glu 160 Gly 175 Leu 190 Arg 205 Arg 220 Leu 235 Arg 250 Gly 265 Gly 280 Tyr 295 Gly 310 His 325 Ala 340 Asn 355 Val Val Leu Arg Gln Tyr 360 Glu Asp Met Val Asp Glu Cys Cys Arg 372

Claims

1. A method of diagnosing renal tissue damage and/or disease in a mammalian subject comprising the step of measuring endogenous expression of OP-1 in renal tissue of said subject, wherein a depression or absence of said endogenous expression relative to undamaged or undiseased mammalian renal tissue indicates a diagnosis that said mammal is afflicted with said damage and/or disease.

2. The method according to claim 1 wherein the endogenous expression of OP-1 is measured by detecting OP-1 mRNA in renal tissue of said mammal.

3. The method according to claim 1 wherein the endogenous expression of OP-1 is measured by detecting OP-1 protein produced by renal tissue of said mammal.

4. The method according to claim 3 wherein the endogenous expression of OP-1 is measured by detecting OP-1 protein circulating in said mammal.

5. The method according to claims 3 or 4 wherein the endogenous expression is measured using an antibody specific for said OP-1 protein.

6. The method according to any one of claims 1 to 5 wherein the renal tissue damage or disease comprises renal tissue degeneration.

7. The method according to any one of claims 1 to 5 wherein the renal tissue damage or disease comprises kidney disease.

8. The method according to any one of claims 1 to 7 wherein the mammal is a human.

9. A method of diagnosing renal tissue damage and/or disease in a mammalian subject comprising the steps of: contacting a sample of mammalian renal tissue with an agent which is capable of specifically detecting endogenous expression of OP-1 in said tissue; detecting whether OP-1 is expressed endogenously in said tissue sample; and determining whether said endogenous expression of OP-1 is absent or depressed compared to the level of OP-1 in undamaged or undiseased renal tissue derived from the same mammalian species, wherein absence or depression of OP-1 expression is indicative of disease and/or damage.

10. A method of diagnosing renal tissue damage and/or disease in a mammalian subject comprising the steps of: contacting a sample of mammalian renal tissue with an agent which is capable of specifically detecting endogenous expression of OP-1 in said tissue; detecting a level of endogenous expression of OP-1 in said tissue sample; and comparing said level of endogenously expressed OP-1 in said tissue sample with a reference level of OP-1 endogenously expressed in undamaged or undiseased mammalian renal tissue, wherein a detectable difference in the endogenous OP-1 expression between the tissue sample and reference is indicative of disease and/or damage.

11. The method according to claims 9 or 10 wherein the agent is a nucleic acid probe which hybridizes specifically with RNA transcribed from an OP-1 gene present in cells of the tissue sample.

12. The method according to claim 11 wherein the probe hybridizes specifically with RNA encoding an OP-1 polypeptide.

13. The method according to claims 11 or 12 wherein the probe hybridizes specifically with RNA encoding a non-conserved fragment of an OP-1 polypeptide.

14. The method according to claim 13 wherein the non-conserved fragment of an OP-1 polypeptide is located N-terminal to a conserved C-terminal seven-cysteine domain of a mature OP-1 polypeptide.

15. The method according to claim 13 wherein the non-conserved fragment of an OP-1 polypeptide is located in the prodomain of said OP-1 polypeptide.

16. The method according to claims 11 or 13 wherein the probe hybridizes specifically with RNA transcribed from a 3' noncoding region immediately following the stop codon of an OP-1 gene.

17. The method according to any one of claims 9 to 16 wherein the sample of mammalian renal tissue comprises poly(A)+ RNA derived from renal tissue.

18. The method according to claim 17 comprising the additional first step of extracting poly(A)+ RNA from renal tissue.

19. The method according to claims 17 or 18 comprising the further step of transferring the poly(A)+ RNA onto a membrane.

20. The method according to any one of claims 17 to 19 comprising the further step of electrophoretically fractionating the poly(A)+ RNA prior to transferring the poly(A)+ RNA onto a membrane and/or prior to the step of contacting the poly(A)+ RNA with the agent capable of specifically detecting endogenous expression of OP-1 therein.

21. The method according to any one of claims 11 to 20 wherein the nucleic acid probe hybridizes specifically with a 4.0, 2.4, 2.2 or 1.8 kb RNA transcript of the OP-1 gene.

22. The method according to any one of claims 11 to 21 comprising the additional steps of: contacting a replica sample of the mammalian renal tissue with a control nucleic acid probe that hybridizes specifically with RNA which is transcribed from a non-OP-1 gene expressed uniformly in mammalian tissues; detecting a level of expression of the non-OP-1 gene in said tissue; and comparing the endogenous expression level of OP-1 with the expression level of the non-OP-1 gene.

23. The method according to claim 22 comprising the additional last step of comparing the relative expression levels of OP-1 and the non-OP-1 gene in the renal tissue sample with the relative expression levels of OP-1 and the non-OP-1 gene in undamaged or undiseased mammalian renal tissue.

24. The method according to claims 22 or 23 wherein the non-OP-1 gene is transcriptional elongation factor.

25. The method according to any one of claims 9 to 24 wherein the mammalian renal tissue is human renal tissue.

26. The method according to any one of claims 9 to 25 wherein the subject is suspected of being afflicted with renal tissue damage.

27. The method according to any one of claims 9 to 25 wherein the subject is suspected of being afflicted with renal tissue degeneration or is so afflicted.

28. The method according to any one of claims 9 to 27 wherein the subject is suspected of being afflicted with kidney disease.

29. The method according to any one of claims 1 to 28 wherein the subject suffers from bone loss.

30. The method of claim 2 wherein said OP-1 mRNA is detected with a nucleic acid probe which hybridizes with RNA encoding (i) a sequence having at least 70% amino acid sequence homology with the C-terminal seven cysteine domain of human OP-1, residues 38 to 139 of SEQ ID No: 5, or (ii) a sequence defined by Generic Sequence 6, SEQ ID No: 31.

31. The method according to claims 3 or 4 wherein the endogenous expression is measured using an antibody that binds to a protein having (i) a sequence having at least 70% amino acid sequence homology with the C-terminal seven cysteine domain of human OP-1, residues 38 to 139 of SEQ ID No: 5, or (ii) a sequence defined by Generic Sequence 6, SEQ ID No: 31.

32. The method of claim 11 wherein said agent is a nucleic acid probe which hybridizes with an RNA encoding (i) a sequence having at least 70% amino acid sequence homology with the C-terminal seven cysteine domain of human OP-1, residues 38 to 139 of SEQ ID No: 5, or (ii) a sequence defined by Generic Sequence 6, SEQ ID No: 31.

DATED this 27TH day of AUGUST, 1997

Creative Biomolecules, Inc. by DAVIES COLLISON CAVE Patent Attorneys for the Applicants

ABSTRACT

Disclosed is a method of screening cells and/or tissues to determine the level of proteins which can induce tissue morphogenesis (morphogenic proteins). More particularly, the present invention teaches methods of diagnosing tissue damage and/or disease in mammalian subjects by measuring levels of morphogenic protein expressed therein.
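By way of illustration only, the comparison recited in claims 22 and 23 (OP-1 expression normalized to a uniformly expressed non-OP-1 control gene, then compared between a test sample and undamaged tissue) might be sketched as follows. The names `ProbeSignals`, `relative_expression` and `compare_to_reference`, the use of arbitrary-unit signal intensities, and the `depressed_below` cutoff are all assumptions introduced for this sketch; the specification does not prescribe a numerical threshold or any particular software.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProbeSignals:
    """Hybridization signal intensities (arbitrary units) for one tissue sample."""

    op1: float      # signal from the OP-1 probe
    control: float  # signal from the uniformly expressed non-OP-1 control probe


def relative_expression(signals: ProbeSignals) -> float:
    """Return the OP-1 signal normalized to the control-gene signal."""
    if signals.control <= 0:
        raise ValueError("control signal must be positive to normalize against it")
    if signals.op1 < 0:
        raise ValueError("OP-1 signal cannot be negative")
    return signals.op1 / signals.control


def compare_to_reference(
    sample: ProbeSignals,
    reference: ProbeSignals,
    depressed_below: float = 0.5,  # ASSUMPTION: illustrative cutoff, not taken from the claims
) -> Tuple[float, str]:
    """Compare normalized OP-1 expression in a sample with undamaged reference tissue.

    Returns the sample/reference ratio and one of "absent", "depressed"
    or "not depressed".
    """
    reference_level = relative_expression(reference)
    if reference_level == 0:
        raise ValueError("reference tissue shows no OP-1 signal; ratio is undefined")
    ratio = relative_expression(sample) / reference_level
    if ratio == 0:
        return ratio, "absent"
    if ratio < depressed_below:
        return ratio, "depressed"
    return ratio, "not depressed"


if __name__ == "__main__":
    # Hypothetical intensities chosen only to exercise the function.
    test_sample = ProbeSignals(op1=20.0, control=100.0)
    undamaged = ProbeSignals(op1=80.0, control=100.0)
    print(compare_to_reference(test_sample, undamaged))
```

Normalizing each sample to its own control signal before comparing samples is intended to cancel differences in RNA loading between lanes, which is the apparent purpose of the control probe in claim 22. Any cutoff used in practice would need to be established empirically.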