AU715138B2 - Peripheral nervous system specific sodium channel peptide (PNS SCP) modulating agent and bioassay therefor, treatment of diseases mediated by PNS SCP, and computer molecular modeling of PNS SCP
- Publication number
- AU715138B2 AU85155/98A
- Authority
- AU
- Australia
- Prior art keywords
- leu
- ser
- ile
- phe
- lys
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/705—Receptors; Cell surface antigens; Cell surface determinants
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/33—Heterocyclic compounds
- A61K31/395—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins
- A61K31/495—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins having six-membered rings with two or more nitrogen atoms as the only ring heteroatoms, e.g. piperazine or tetrazines
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
Landscapes
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Cell Biology (AREA)
- Biochemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Molecular Biology (AREA)
- Genetics & Genomics (AREA)
- Immunology (AREA)
- Toxicology (AREA)
- Zoology (AREA)
- Gastroenterology & Hepatology (AREA)
- Biophysics (AREA)
- Animal Behavior & Ethology (AREA)
- Epidemiology (AREA)
- Veterinary Medicine (AREA)
- Public Health (AREA)
- Pharmacology & Pharmacy (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Peptides Or Proteins (AREA)
Description
P00011 Regulation 3.2
AUSTRALIA
Patents Act, 1990
ORIGINAL
COMPLETE SPECIFICATION
STANDARD PATENT
TO BE COMPLETED BY THE APPLICANT
NAME OF APPLICANT: TROPHIX PHARMACEUTICALS, INC and THE RESEARCH FOUNDATION OF STATE UNIVERSITY OF NEW YORK
ACTUAL INVENTORS: LAURENCE A BORDEN, GAIL MANDEL and SIMON HALEGOUA
ADDRESS FOR SERVICE: Peter Maxwell Associates, Level 6 Pitt Street, SYDNEY NSW 2000
INVENTION TITLE: PERIPHERAL NERVOUS SYSTEM SPECIFIC SODIUM CHANNEL PEPTIDE (PNS SCP) MODULATING AGENT AND BIOASSAY THEREFOR, TREATMENT OF DISEASES MEDIATED BY PNS SCP, AND COMPUTER MOLECULAR MODELING OF PNS SCP.
DETAILS OF ASSOCIATED PROVISIONAL APPLICATION NO(S): NIL
The following statement is a full description of this invention including the best method of performing it known to me:-
The present invention is in the fields of biotechnology, protein purification and crystallization, x-ray diffraction analysis, three-dimensional computer molecular modeling, and rational drug design (RDD). The invention is directed to isolated peripheral nervous system (PNS) specific sodium channel proteins (SCPs) and encoding nucleic acid, as well as to compounds, compositions and methods for selecting, making and using therapeutic or diagnostic agents having sodium channel modulating activity.
The present invention further provides for three-dimensional computer modeling of the PNS SCP, and for RDD, based on the use of x-ray data and/or amino acid sequence data on computer readable media.
Voltage-sensitive ion channels are a class of transmembrane proteins that provide a basis for cellular excitability, i.e., the ability to transmit information via ion-generated membrane potentials. In response to changes in membrane potentials, these molecules mediate rapid ion flux through highly selective pores in a nerve cell membrane. If the channel density is high enough, a suitable regenerative depolarization results, termed the action potential.
The voltage-sensitive sodium channel is the ion channel most often responsible for generating the action potential in excitable cells. Although sodium-based action potentials in different excitable tissues look similar (Hille, In: Ionic Channels of Excitable Membranes, B. Hille, ed., Sinauer, Sunderland, MA, (1984), pp. 70-71) recent electrophysiological studies indicate that sodium channels in different cells differ in both their structural and functional properties, and many sodium channels with distinct primary structures have now been identified. See, Mandel, J. Membrane Biol.
125:193-205 (1992).
Functionally distinct sodium channels have been described in a variety of neuronal cell types (Llinas et al., J. Physiol. 305:197-213 (1980); Kostyuk et al., Neuroscience 6:2423-2430 (1981); Bossu et al., Neurosci. Lett. 51:241-246 (1984) 1981; Gilly et al., Nature 309:448-450 (1984); French et al., Neurosci. Lett. 56:289-294 (1985); Ikeda et al., J. Neurophysiol. 55:527-539 (1986); Jones et al., J. Physiol. 389:605-627 (1987); Alonso and Llinas, 1989; Gilly et al., J. Neurosci. 9:1362-1374 (1989)) and in skeletal muscle (Gonoi et al., J. Neurosci. 5:2559-2564 (1985); Weiss et al., Science 233:361-364 (1986)). The kinetics of sodium currents in glia and neurons can also be distinguished (Barres et al., Neuron 2:1375-1388 (1989)).
The type II and type III genes, expressed widely in the central nervous system (CNS), are expressed at very low levels in some cells in the PNS (Beckh, FEBS Lett. 262:317-322 (1990)). The type II and III mRNAs were barely detectable, by Northern blot analysis, in dorsal root ganglion (DRG), cranial nerves and sciatic nerves. On the other hand, type I mRNA was present in moderately high amounts in DRG and cranial nerve, but in low levels in sciatic nerve. A comparison of the amount of all three brain mRNAs, relative to total sodium channel mRNA detected with a conserved cDNA probe, suggested the presence of additional, as yet unidentified, sodium channel types in DRG neurons. Consistent with the mRNA studies, immunochemical studies showed that neither type I nor type II sodium channel alpha subunits made up a significant component of the total sodium channels in the superior cervical ganglion or sciatic nerve (Gordon et al., Proc. Natl. Acad. Sci. USA 84:8682-8686 (1987)).
A population of neurons in vertebrate DRG has been identified electrophysiologically that contains, in addition to the more conventional channels, a distinct sodium channel type; this DRG channel has a k for TTX approximately tenfold higher than the ko of sodium channels in either skeletal muscle or heart (Jones et al., J. Physiol. 389:605-627 (1987)).
The localization of different sodium channels to specific regions in the nervous system supports the possibility that cell-specific regulation of this gene family is at the transcriptional level. By analogy with other eukaryotic genes, distinct DNA elements can be present which mediate cell-specific and temporal regulation of individual sodium channel genes.
Studies of sodium channel gene regulation have been facilitated by the use of well-characterized cell lines, such as pheochromocytoma (PC12) cells, a popular cell model for neuronal differentiation (Green et al., Proc. Natl. Acad. Sci. USA 73:2424-2428 (1976); Halegoua et al., Curr. Top. Microbiol. Immunol. 165:119-170 (1991)). In addition to extending neurites and initiating synthesis of certain neurotransmitters, NGF-treated PC12 cells acquire the ability to generate sodium-based action potentials (Dichter et al., Nature 268:501-504 (1977)). This ability is conferred by an increase in the density of functional sodium channels in the membranes of the NGF-treated cells (Rudy et al., J. Neurosci. 7:1613-1625 (1987); Mandel et al., Proc. Natl. Acad. Sci. USA 85:924-928 (1988); O'Lague et al., Proc. Natl. Acad. Sci. USA 77:1701-1705 (1980)). Northern blot analysis revealed that undifferentiated PC12 cells contained a basal level of sodium channel mRNA which increased coincident with the increase in channel activity observed after treatment with NGF (Mandel et al., Proc. Natl. Acad. Sci. USA 85:924-928 (1988)).
There is a long-standing need to diagnose and/or treat pathologies relating to impaired peripheral nervous system (PNS) nerve conduction associated with PNS injury or with genetic or other disease states, such as those involving lack of, or defects in, PNS sodium channels (SCs). In view of the possibility of cell or tissue specific sodium channels, the discovery and use of isolated PNS SCs and encoding nucleic acid would provide an opportunity to diagnose or treat such pathologies by either screening suitable PNS SC modulating drugs or molecules (e.g., analgesics), or by using recombinant PNS SCs for in situ or in vivo gene therapy to replace or supplement PNS SCs in at least one portion of the peripheral nervous system of a mammalian patient suffering from a PNS SC related pathology.
Summary of the Invention
According to the invention, there is provided a bioassay for assessing a candidate modulating agent of a PNS SCP, comprising: contacting a candidate agent with a cell line expressing in the cell membrane of said cell a PNS SCP; and evaluating the modulation of the SC biological activity of said cell mediated by said contacting of said candidate agent.
According to another aspect of the invention, there is provided a method to treat diseases or conditions mediated by the presence of a PNS SCP, comprising administering to a patient in need of such treatment an effective amount of a PNS SCP modulating agent according to claim 3, or a pharmaceutical composition thereof.
Molecular modeling methods and computer systems are also provided by the present invention for rational drug design (RDD). These drug design methods use computer modeling programs to find potential ligands or agents that are calculated to bind with sites or domains on the PNS SCP. Potential ligands or agents are then screened for modulating or binding activity. Such screening methods can be selected from assays for at least one biological activity of the protein, as associated with a PNS SCP-related pathology or trauma, according to known sodium channel assays. The resulting ligands provided by methods of the present invention are synthesized and are useful for treating, inhibiting or preventing at least one PNS SCP-related pathology or trauma in a mammal.
Further objects, features, utilities, embodiments and/or advantages of the present invention will be apparent from the additional description provided herein.
Brief Description of the Drawings
Figure 1 depicts a 323 amino acid and corresponding 969 nucleotide sequence of a PNS SCP as amino acids 233-555 of SEQ ID NO: 2 and nucleotides 699-1665 of SEQ ID NO: 1, as the primary structure of Domain III of the Peripheral Nerve type 1 (PN1) sodium channel alpha subunit for both amino acid and DNA sequences. The single amino acid code is used to denote deduced amino acids. YJ1 and YOIC refer to the oligonucleotide primers used to obtain the initial PCR fragment of PN1 cDNA.
Figure 2A-B shows a Northern blot analysis of sodium channel α subunit mRNA in rat pheochromocytoma (PC12) cells treated with Nerve Growth Factor. In Figure 2(A), the probe used is pRB211, which encodes the highly conserved fourth repeated domain of the rat type II sodium channel. Both type II and PN1 mRNAs are detected with this probe. In Figure 2(B), the probe used contains sequences specific for PN1. The levels of sodium channel mRNA are quantitated with reference to the amount of cyclophilin mRNA, as indicated. Control cells are PC12 cells grown in the absence of NGF.
Figure 3A-B shows an example of tissue-specific distribution of PN1 mRNA. Figure 3(A) presents a Northern blot analysis using equal amounts of RNA from tissues. PN1 mRNA is indicated by the dash. 28S refers to the 28S rRNA. The probe contains sequences specific for the PN1 gene.
Note the absence of PN1 mRNA in skeletal muscle and cardiac muscle, and the low levels of PN1 mRNA in spinal cord. Figure 3(B) shows RNase protection analysis of PN1 mRNA. PN1 refers to the PN1 probe protected by mRNA from the different tissue samples. Actin refers to actin probe sequences protected by the same mRNA.
Figure 4A-F shows localization of PN1 mRNA in Superior Cervical Ganglion (SCG) and Dorsal Root Ganglion (DRG) tissues by in situ hybridization analysis. Figures 4A-4B represent neurons hybridized with a PN1-specific antisense RNA probe. Figures 4C-4D represent neurons hybridized with the radiolabeled PN1 probe in the presence of non-labeled PN1 competitor DNA. Figures 4E-4F represent tissue sections hybridized with an antisense type II probe.
Figure 5 shows a blot analysis comparing levels of PN1 and brain type I α subunit mRNA in SCG. The pRB11 conserved sodium channel probe detects both type II/IIA and PN1 transcripts.
Figure 6A-B shows a Northern blot analysis which reveals differential expression of PN1 and type I sodium channel mRNAs during postnatal rat development. Figure 6(A) shows a representative autoradiogram of a Northern blot using radiolabeled antisense pRB211 RNA as probe.
Postnatal days 7 (P7) to 42 (P42) are shown. Figure 6(B) shows a plot of quantitation of the Northern blots showing a decrease in type I mRNA with time after birth.
Figure 7A-D show the deduced primary structure of the cloned portion of PN1 α subunit cDNA as a partial 3033 nucleotide (SEQ ID NO:1) sequence and a partial 1011 amino acid (SEQ ID NO:2) sequence.
Figure 8A-D show a comparison of deduced primary amino acid sequences of PN1 (1-988 of SEQ ID NO:2) and brain type II/IIA α subunit (SEQ ID NO:7). A consensus sequence is also shown (SEQ ID NO:16).
Figure 9A-9D show the entire DNA sequence for a rat PN1 PNS SCP (SEQ ID NO:9).
Figure 10 shows the entire amino acid sequence for a rat PN1 PNS SCP (SEQ ID ...).
Figure 11A-11E shows amino acid sequences for rat PN1 ("RATPN1") (SEQ ID NO:10) and two expected human PN1 sequences "HUMPN1A" (SEQ ID NO:11) "HUMPN1B" (SEQ ID NO:16) HUMPN1C (SEQ ID NO:15) and HUMPN1D (SEQ ID NO:12). Alternative sequences include those where is 0, 1, 2, or 3 of the same or different amino acids, which can be optionally selected from Table 1 or Table 2.
Figure 12 shows a computer system suitable for three dimensional structure determination and/or rational drug design.
Figure 13A-B shows a representative DNA sequence encoding a human PN1 (HUM PN1A) (SEQ ID NO:13).
Figure 14A-B shows a representative DNA sequence encoding a human PN1 (HUM PN1B) (SEQ ID NO:14).
Detailed Description of the Invention
A need exists for modulating the activity of at least one peripheral nervous system specific (PNS) sodium channel (SC). Such modulation could potentially provide analgesic or diagnostic agents for pain or pathologies associated with nerve conduction in the PNS.
Certain sodium channels, corresponding to PNS SCPs of the invention, are now discovered to be preferentially or selectively expressed in the peripheral nervous system (PNS). These sodium channels modulate peripheral nerve impulse conduction preferentially in the PNS. The present invention provides peripheral nervous system specific (PNS) sodium channel peptides (SCPs), encoding nucleic acid, vectors, host cells and antibodies, as well as methods of making and using thereof, including recombinant expression, purification, cell-based drug screening, gene therapy, crystallization, X-ray diffraction analysis, as well as computer structure determination and rational drug design utilizing at least one PNS SCP amino acid sequence and/or x-ray diffraction data provided on computer readable media.
A PNS sodium channel peptide (PNS SCP) can refer to any subset of a PNS sodium channel (SC) having SC activity, as a fragment, consensus sequence or repeating unit. A PNS SCP of the invention can be prepared by: recombinant DNA methods; proteolytic digestion of the intact molecule or a fragment thereof; chemical peptide synthesis methods well-known in the art; and/or by any other method capable of producing a PNS SCP and having a conformation similar to an active portion of a PNS SCP and having SC activity. The SC activity can be screened according to known screening assays for sodium channel activity, in vitro, in situ or in vivo. The minimum peptide sequence to have activity is based on the smallest unit containing or comprising a particular region, domain, consensus sequence, or repeating unit thereof, of at least one PNS SCP.
According to the invention, a PNS SCP includes an association of two or more polypeptide domains, such as transmembrane, pore lining domains, or fragments thereof, corresponding to a PNS SCP, such as 1-40 domains or any range or value therein. Transmembrane, cytoplasmic, pore lining or other domains of a PNS SCP of the invention may have at least 74% homology, such as 74-100% overall homology or identity, or any range or value therein, to one or more corresponding SC domains as described herein (e.g., as presented in Figures 1, 7, 10 or 11). As would be understood by one of ordinary skill in the art, the above configuration of domains is provided as part of a PNS SCP of the invention, such that a functional PNS SCP, when expressed in a suitable cell, is capable of transporting sodium ions across a lipid bilayer, a cell membrane or a membrane model. In intact cells having sufficient sodium channels, the cell can be capable of generating some form of an action potential, such as in a cell expressing at least one PNS SCP of the present invention. Such transport, as measured by suitable SC activity assays, establishes SC activity of one or more PNS SCPs of the invention.
Accordingly, a PNS SCP of the invention alternatively includes peptides having a portion of a SC amino acid sequence which substantially corresponds to at least one 20 to 2005 amino acid fragment and/or consensus sequence of a PNS SCP or group of PNS SCPs, wherein the PNS SCP has homology or identity of at least 74-99%, such as 88-99% (or any range or value therein, e.g., 87-99, 88-99, 89-99, 90-99, 91-99, 92-99, 93-99, 94-99, 95-99, 96-99, 97-99, or 98-99%) homology to at least one sequence or consensus sequence of Figures 1, 7, 8, 10 or 11. In one aspect, such a PNS SCP can maintain SC biological activity. It is preferred that a PNS SCP of the invention is not naturally occurring or is naturally occurring but is in a purified or isolated form which does not occur in nature. Preferably, a PNS SCP of the invention substantially corresponds to a set of domains of PN1, having at least 10 contiguous amino acids of Figures 1, 7, 8, 10 and 11, or at least 74% homology thereto.
Alternatively or additionally, a PNS SCP of the invention may comprise at least one domain corresponding to known sodium channel domains, such as rat brain or spinal cord SC domains, such as transmembrane domains, pore lining domains, cytoplasmic domains or extracellular domains, such as 1-3 to 14-17 (IIS6), 18-23 to 210-214 (cytoplasmic), 229-236 to 254-258 (IIIS1), 268-272 to 293-297 (IIIS2), 300-304 to 321-325 (IIIS3), 326-330 to 347-351 (IIIS4), 368-374 to 389-393 (IIIS5), 474-478 to 500-504 (IIIS6), 553-559 to 577-583 (IVS1), 589-593 to 611-615 (IVS2), 619-623 to 642-646 (IVS3), 654-658 to 678-682 (IVS4), 690-694 to 711-715 (IVS5), 779-783 to 801-805 (IVS6), 348-352 to 368-372, 501-505 to 550-554, 233-555, 676-678 to 689-693, 554-557 to 941-945, or any range or value therein, corresponding to SEQ ID NO:2 as presented in Figure 7A-7D, or variants thereof as presented substitutions in Table 1 or Table 2, having 74-100% overall homology or any range or value therein. At least one of such domains is present in the PNS SCPs presented in Figure 11A-E, or fragments thereof, as non-limiting examples. Alternative domains are also encoded by DNA which hybridizes under stringent conditions to at least 30 contiguous nucleotides of Figures 1, 7, 9, 13 or 14, or having codons substituted therefor which encode the same amino acid as a particular codon.
Additionally, phosphorylation (e.g., PKA and PKC) domains, as would be recognized by those skilled in the art, are also considered when providing a PNS SCP or encoding nucleic acid according to the invention.
Percent homology or identity can be determined, for example, by comparing sequence information using the GAP computer program, version 6.0, available from the University of Wisconsin Genetics Computer Group (UWGCG).
The GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) which are similar, divided by the total number of symbols in the shorter of the two sequences. The preferred default parameters for the GAP program include: a unitary comparison matrix (containing a value of 1 for identities and 0 for non-identities) and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National Biomedical Research Foundation, pp. 353-358 (1979); a penalty of for each gap and an additional 0.10 penalty for each symbol in each gap; and no penalty for end gaps. In a preferred embodiment, the peptide of the invention corresponds to a SC biologically active portion of SEQ ID NO:2, or variant thereof, as presented in Figure 11A-D.
Thus, one of ordinary skill in the art, given the teachings and guidance presented in the present specification, will know how to add, delete or substitute other amino acid residues in other positions of a SC to obtain a PNS SCP, including substituted, deletional or additional variants, with a substitution as presented in Tables 1 or 2 below.
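By way of non-limiting illustration, the similarity measure described above (aligned symbols that match, divided by the length of the shorter sequence) can be sketched as follows. This is a minimal sketch and not the GAP program itself: the function names are hypothetical, the scoring uses only the unitary matrix (1 for identities, 0 for non-identities), the gap penalty of -1 is an illustrative assumption, and, unlike the preferred parameters above, end gaps are penalized for simplicity.

```python
def count_aligned_matches(a, b, match=1, mismatch=0, gap=-1):
    """Globally align a and b (Needleman-Wunsch style) and count identical aligned symbols.

    The gap value is an illustrative assumption, not a GAP default.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back through one optimal alignment, counting identities.
    i, j, matches = n, m, 0
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return matches


def percent_similarity(a, b):
    """Matches in the alignment divided by the length of the shorter sequence, as a percentage."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        raise ValueError("both sequences must be non-empty")
    return 100.0 * count_aligned_matches(a, b) / shorter
```

For example, percent_similarity("ACGT", "ACGA") evaluates three identities over four symbols, while percent_similarity("ACGT", "ACG") divides by the three symbols of the shorter sequence.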
A PNS SCP of the invention also includes a variant wherein at least one amino acid residue in the peptide has been conservatively replaced, added or deleted by at least one different amino acid. For a detailed description of protein chemistry and structure, see, e.g., Schulz et al., Principles of Protein Structure, Springer-Verlag, New York, 1978, and Creighton, T.E., Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, 1983, which are hereby incorporated by reference. For a presentation of nucleotide sequence substitutions, such as codon preferences, see Ausubel et al., eds., Current Protocols in Molecular Biology, Greene Publishing Assoc., New York, NY (1987, 1992, 1993, 1994, 1995) at A.1.1-A.1.24, and Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Cold Spring Harbor, NY (1989), at Appendices C and D.
Conservative substitutions of a PNS SCP of the invention include a variant wherein at least one amino acid in the peptide has been conservatively replaced, added or deleted by at least one different amino acid. Such substitutions preferably are made in accordance with the following list as presented in Table 1, which substitutions can be determined by routine experimentation to provide modified structural and functional properties of a synthesized peptide molecule, while maintaining SC biological activity, as determined by known SC activity assays. In the context of the invention, the term PNS SCP or "substantially corresponding to" includes such substitutions.
Table 1
Original Residue    Exemplary Substitution
Ala                 Gly; Ser
Asn                 Gln; His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Ala; Pro
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg; Gln; Glu
Met                 Leu; Tyr; Ile
Phe                 Met; Leu; Tyr
Ser                 Thr
Thr
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu
Alternatively, another group of substitutions of PNS SCPs of the invention are those in which at least one amino acid residue in the protein molecule has been removed and a different residue added in its place according to the following Table 2. The type of substitutions which can be made in the protein or peptide molecule of the invention can be based on analysis of the frequencies of amino acid changes between a homologous protein of different species, such as those presented in Table 1-2 of Schulz et al., infra. Based on such an analysis, alternative conservative substitutions are defined herein as exchanges within one of the following five groups:
TABLE 2
1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr (Pro, Gly);
2. Polar, negatively charged residues and their amides: Asp, Asn, Glu, Gln;
3. Polar, positively charged residues: His, Arg, Lys;
4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val (Cys); and
5. Large aromatic residues: Phe, Tyr, Trp.
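As a non-limiting sketch, the five exchange groups of Table 2 can be held in a simple data structure and used to test whether a proposed replacement stays within one group. The names EXCHANGE_GROUPS and is_conservative_exchange are hypothetical, the group membership is as read from Table 2 (with the parenthesized residues included in their groups), and passing this check says nothing about retained SC activity, which would still be confirmed by assay.

```python
# Exchange groups as read from Table 2; parenthesized residues (Pro, Gly, Cys)
# are included in their groups here, which is an assumption of this sketch.
EXCHANGE_GROUPS = [
    {"Ala", "Ser", "Thr", "Pro", "Gly"},   # small aliphatic, nonpolar or slightly polar
    {"Asp", "Asn", "Glu", "Gln"},          # polar, negatively charged and their amides
    {"His", "Arg", "Lys"},                 # polar, positively charged
    {"Met", "Leu", "Ile", "Val", "Cys"},   # large aliphatic, nonpolar
    {"Phe", "Tyr", "Trp"},                 # large aromatic
]


def is_conservative_exchange(original, replacement):
    """Return True if two different residues fall within the same exchange group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in EXCHANGE_GROUPS)
```

For instance, is_conservative_exchange("Leu", "Ile") holds because both residues sit in the large aliphatic group, whereas an exchange of Leu for Asp crosses groups.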
Most deletions and additions, and substitutions according to the invention are those which do not produce radical changes in the characteristics of the protein or peptide molecule. "Characteristics" is defined in a non-inclusive manner to define both changes in secondary structure, e.g. α-helix or β-sheet, as well as changes in physiological activity, e.g. in receptor binding assays.
Accordingly, based on the above examples of specific substitutions, alternative substitutions can be made by routine experimentation, to provide alternative PNS SCPs of the invention, e.g, by making one or more conservative substitutions of SC fragments which provide SC activity. However, when the exact effect of the substitution, deletion, or addition is to be confirmed, one skilled in the art will appreciate that the effect of at least one substitution, addition or deletion will be evaluated by at least one sodium channel activity screening assay, such as, but not limited to, immunoassays or bioassays, to confirm biological activity, such as, but not limited to, sodium channel activity.
Amino acid sequence variants of a PNS SCP of the invention can also be prepared by mutations in the DNA.
Such variants include, for example, deletions from, or additions or substitutions of, residues within the amino acid sequence. Any combination of deletion, addition, and substitution can also be made to arrive at the final construct, provided that the final construct possesses some SC activity. Preferably improved SC activity is found over that of the non-variant peptide. Obviously, mutations that will be made in the DNA encoding the variant must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure (see, EP Patent Application Publication No. 75,444; Ausubel, infra; Sambrook, infra). At the genetic level, these variants ordinarily are prepared by site-directed mutagenesis of nucleotides in the DNA encoding a PNS SCP, thereby producing DNA encoding the variant, and thereafter expressing the DNA in recombinant cell culture. The variants typically exhibit the same qualitative biological activity as the naturally occurring SC (see, e.g., Ausubel, infra; Sambrook, infra).
Once a PNS sodium channel structure or characteristics have been determined, PNS SCPs can be recombinantly or synthetically produced, or optionally purified, to provide commercially useful amounts of PNS SCPs for use in diagnostic or research applications, according to known method steps (see, Ausubel, infra, and Sambrook, infra, which references are herein entirely incorporated by reference).
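The reading-frame requirement stated above can be illustrated with a short, non-limiting sketch. The function names are hypothetical, and the check is deliberately simplified: it treats a variant as in frame when its net length change is a multiple of three, and separately scans for an in-frame stop codon before the final codon. An insertion and a deletion at different positions could still shift the frame locally, which this sketch does not detect.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def keeps_reading_frame(original, variant):
    """True if the net length change between the coding sequences is a multiple of 3."""
    return (len(variant) - len(original)) % 3 == 0


def has_premature_stop(coding_dna):
    """True if any codon other than the last complete codon is a stop codon."""
    seq = coding_dna.upper()
    # Split into complete codons, reading from the first base.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])
```

Deleting one whole codon, as in going from "ATGGCCAAA" to "ATGAAA", keeps the frame, while deleting a single base does not.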
A variety of methodologies known in the art can be utilized to obtain an isolated PNS SCP of the invention.
In one embodiment, the peptide is purified from tissues or cells which naturally produce the peptide. Alternatively, the above-described isolated nucleic acid fragments could be used to express the PNS SCP protein in any organism. The samples of the invention include cells, protein extracts or membrane extracts of cells, or biological fluids. The sample will vary based on the assay format, the detection method and the nature of the tissues, cells or extracts used as the sample.
The cells and/or tissue can include normal or pathologic animal cells or tissues, such as the peripheral nervous system, and extracts or cell cultures thereof, provided in vivo, in situ or in vitro, as cultured, passaged, nonpassaged, transformed, recombinant, or isolated cells and/or tissues.
Any higher eukaryotic organism can be used as a source of at least one PNS SC or PNS SCP of the invention, as long as the source organism naturally contains such a peptide. As used herein, "source organism" refers to the original organism from which the amino acid sequence of the peptide is derived, regardless of the organism the peptide is expressed in and/or ultimately isolated from. Preferred organisms as sources of at least one PNS SC or encoding nucleic acid can be any vertebrate animal, such as mammals, birds, bony fish, electric eels, frogs and toads. Among mammals, the preferred recipients are mammals of the Orders Primata (including humans, apes and monkeys), Arteriodactyla (including horses, goats, cows, sheep, pigs), Rodentia (including mice, rats, rabbits, and hamsters), and Carnivora (including cats and dogs). The most preferred source organisms are humans.
One skilled in the art can readily follow known methods for isolating proteins in order to obtain the peptide free of natural contaminants. These include, but are not limited to: immunochromatography, size-exclusion chromatography, HPLC, ion-exchange chromatography, and immunoaffinity chromatography. See, Ausubel, infra; Sambrook, infra; Colligan, infra.
Isolated Nucleic Acid Molecule Coding for PNS SCP Peptide
In one embodiment, the present invention relates to an isolated nucleic acid molecule coding for a peptide having an amino acid sequence corresponding to novel PNS SCPs. In one preferred embodiment, the isolated nucleic acid molecule comprises a PNS SCP nucleotide sequence with greater than 70% overall identity or homology to at least a 60 nucleotide sequence present in SEQ ID NO:1 (preferably greater than 80%; more preferably greater than 90%, such as 70-99% or any range or value therein). In another preferred embodiment, the isolated nucleic acid molecule comprises a PNS SCP nucleotide sequence corresponding to Figures 7 or 9, or encoding at least one domain of Figures 1, 7, 8, 10 and 11.
Also included within the scope of this invention are the functional equivalents of the herein-described isolated nucleic acid molecules and derivatives thereof. For example, as presented above for PNS SCP amino acid sequences, the nucleic acid sequences depicted in SEQ ID NO:1 can be altered by substitutions, additions or deletions that provide for functionally equivalent molecules. Due to the degeneracy of nucleotide coding sequences, other DNA sequences which encode substantially the same amino acid sequence of a PNS SCP can be used in the practice of the invention.
These include but are not limited to nucleotide sequences encoding all or portions of the PNS SCP amino acid sequence of Figures 1, 8, 10 and 11, which are altered by the substitution of different codons that encode a functionally equivalent amino acid residue within the sequence, thus producing a silent change.
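A silent change of the kind described here can be checked mechanically. The sketch below builds the standard genetic code table and tests whether two coding sequences differ at the nucleotide level while translating to the same peptide; the example sequences and the names `translate` and `is_silent_variant` are illustrative assumptions, not sequences or methods from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent_variant(original: str, variant: str) -> bool:
    """True if the variant differs in nucleotides but encodes the same peptide."""
    return (
        len(original) == len(variant)
        and original.upper() != variant.upper()
        and translate(original) == translate(variant)
    )
```

For example, the hypothetical sequences `ATGCTGAAA` and `ATGTTAAAG` both translate to Met-Leu-Lys, so the second is a silent variant of the first, whereas `ATGCCGAAA` replaces Leu with Pro and is not.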
Such functional alterations of a given nucleic acid sequence afford an opportunity to promote secretion and/or processing of heterologous proteins encoded by foreign nucleic acid sequences fused thereto. All variations of the nucleotide sequence of the PNS SCP gene and fragments thereof permitted by the genetic code are, therefore, included in this invention. See, e.g., Ausubel, infra; Sambrook, infra.
In addition, the nucleic acid sequence can comprise a nucleotide sequence which results from the addition, deletion or substitution of at least one nucleotide to the 5'-end and/or the 3'-end of a nucleic acid sequence corresponding to Figures 1, 7 or 9, or encoding at least a portion of Figures 1, 8, 10 or 11, or a variant thereof. Any nucleotide or polynucleotide can be used in this regard, provided that its addition, deletion or substitution does not remove the sodium channel activity which is encoded by the nucleotide sequence. Moreover, the nucleic acid molecule of the invention can, as necessary, have restriction endonuclease recognition sites which do not remove the activity of the encoded PNS SCP.
Further, it is possible to delete codons or to substitute one or more codons by codons other than degenerate codons to produce a structurally modified peptide, but one which has substantially the same utility or activity of the peptide produced by the unmodified nucleic acid molecule. As recognized in the art, the two peptides are functionally equivalent, as are the two nucleic acid molecules which give rise to their production, even though the differences between the nucleic acid molecules are not related to degeneracy of the genetic code. See, e.g., Ausubel, infra; Sambrook, infra.
Isolation of Nucleic Acid. In another aspect of the present invention, isolated nucleic acid molecules coding for peptides having amino acid sequences corresponding to PNS SCP are provided. In particular, the nucleic acid molecule can be isolated from a biological sample containing mammalian nucleic acid, as corresponding to a probe specific for a PNS SC obtained from a higher eukaryotic organism.
The nucleic acid molecule can be isolated from a biological sample containing nucleic acid using known techniques, such as, but not limited to, primer amplification or cDNA cloning.
The nucleic acid molecule can be isolated from a biological sample containing genomic DNA or from a genomic library. Suitable biological samples include, but are not limited to, normal or pathologic animal cells or tissues, such as cerebrospinal fluid (CNS), peripheral nervous system (neurons, ganglion) and portions, cells of heart, smooth, skeletal or cardiac muscle, autonomic nervous system, and extracts or cell cultures thereof, provided in vivo, in situ or in vitro, as cultured, passaged, non-passaged, transformed, recombinant, or isolated cells and/or tissues. The method of obtaining the biological sample will vary depending upon the nature of the sample.
One skilled in the art will realize that a mammalian genome can be subject to slight allelic variations between individuals. Therefore, the isolated nucleic acid molecule is also intended to include allelic variations, so long as the sequence encodes a PNS SCP. When a PNS SCP allele does not encode the identical amino acid sequence to that found in Figures 1, 8, 10 or 11, or at least one domain thereof, it can be isolated and identified as PNS SCP using the same techniques used herein, and especially nucleic acid amplification techniques to amplify the appropriate gene with primers based on the sequences disclosed herein. Such variations are presented, e.g., in Figure 11 and in Tables 1 and 2.
The cloning of large cDNAs is the same (e.g., PN1 as a PNS SCP of the invention includes overlapping clones of about 13kDa) but takes more routine experimentation than smaller cDNAs. One useful method relies on cDNA bacteriophage library screening (see, e.g., Sambrook, infra, or Ausubel, infra). Probes for the screening are labeled, e.g., with random hexamers and Klenow enzyme (Pharmacia kit). If 5' cDNAs are not obtained with these approaches, a subcDNA library can be prepared in which specific PN1 primers are used to prime the reverse transcript reaction, in place of oligo dT or random primers. The cDNA sublibrary is then cloned into standard vectors such as lambda zap and screened using conventional techniques. This strategy was used previously (Noda et al., Nature 320:188-192 (1986); Noda et al., Nature 322:826-828 (1986)) to clone the brain type I and II sodium channel cDNAs.
The construction of a full-length cDNA is performed by subcloning overlapping fragments into an expression vector (either prokaryotic or eukaryotic). This task is more difficult with large cDNAs because of the paucity of unique restriction sites, but routine restriction, cloning or PCR is used to join the fragments.
Synthesis of Nucleic Acid. Isolated nucleic acid molecules of the present invention are also meant to include those chemically synthesized. For example, a nucleic acid molecule with the nucleotide sequence which codes for the expression product of a PNS SCP gene can be designed and, if necessary, divided into appropriate smaller fragments.
Then an oligomer which corresponds to the nucleic acid molecule, or to each of the divided fragments, can be synthesized (e.g., of 10-6015 nucleotides or any range or value therein, such as 10-100 nucleotides). Such synthetic oligonucleotides can be prepared, for example, by known techniques (See, Ausubel, infra, or Sambrook, infra) or by using an automated DNA synthesizer.
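The division of a designed sequence into smaller fragments for synthesis can be sketched as follows. This is a minimal illustration that assumes fixed-length fragments with a fixed overlap between neighbors, so that adjacent oligomers can later be annealed and joined; the function name, the default lengths and the overlap value are assumptions for the example and are not taken from the specification.

```python
def split_into_oligos(sequence: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Split `sequence` into consecutive fragments of at most `oligo_len` nucleotides,
    each sharing `overlap` nucleotides with the fragment before it."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    if not 0 <= overlap < oligo_len:
        raise ValueError("overlap must be non-negative and smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_len])
        if start + oligo_len >= len(sequence):
            break  # this fragment reaches the 3' end of the design
        start += step
    return oligos


def reassemble(oligos: list[str], overlap: int = 20) -> str:
    """Rebuild the designed sequence by dropping the shared overlap of each later fragment."""
    return oligos[0] + "".join(o[overlap:] for o in oligos[1:])
```

With these defaults a 150 nucleotide design yields four fragments starting at positions 0, 40, 80 and 120, the last one being shorter than the others.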
A labeled oligonucleotide probe can be derived synthetically or by cloning. If necessary, the 5'-ends of the oligomers can be phosphorylated using T4 polynucleotide kinase. Kinasing of single strands prior to annealing or for labeling can be achieved using an excess of the enzyme. If kinasing is for the labeling of probe, the ATP can contain high specific activity radioisotopes. Then, the DNA oligomer can be subjected to annealing and ligation with T4 ligase or the like.
A Nucleic Acid Probe for the Specific Detection of PNS SCP. In another embodiment, the present invention relates to a nucleic acid probe of 15-6000 nucleotides for the specific detection of the presence of PNS SCP in a sample comprising the above-described nucleic acid molecules or at least a fragment thereof which binds under stringent conditions to a nucleic acid encoding at least one PNS SCP.
The nucleic acid probe can be used to screen an appropriate chromosomal or cDNA library by known hybridization method steps to obtain a PNS SCP encoding nucleic acid molecule of the invention. A chromosomal DNA or cDNA library can be prepared from appropriate cells according to recognized methods in the art (See, Ausubel, infra; Sambrook, infra).
In the alternative, organic chemical synthesis is carried out in order to obtain nucleic acid probes having nucleotide sequences which correspond to suitable portions of the amino acid sequence of the PNS SCP. Thus, the synthesized nucleic acid probes can be used as primers in nucleic acid amplification method steps. The invention can thus provide methods for amplification of DNA and/or RNA using heat stable, cross-linked nucleotide primers, which cross-linked primers of the invention provide nucleic acid encoding PNS SCPs according to the invention.
Methods of amplification of RNA or DNA are well known in the art and can be used according to the invention without undue experimentation, based on the teaching and guidance presented herein. According to the invention, the use of nucleic acids encoding portions of PNS SCPs according to the invention, as amplification primers, allows for advantages over known amplification primers, due to the increase in sensitivity, selectivity and/or rate of amplification.
Known methods of DNA or RNA amplification include, but are not limited to, polymerase chain reaction (PCR) and related amplification processes (see, U.S. patent Nos. 4,683,195, 4,683,202, 4,800,159, 4,965,188, to Mullis et al.; 4,795,699 and 4,921,794 to Tabor et al.; 5,142,033 to Innis; 5,122,464 to Wilson et al.; 5,091,310 to Innis; 5,066,584 to Gyllensten et al.; 4,889,818 to Gelfand et al.; 4,994,370 to Silver et al.; 4,766,067 to Biswas; 4,656,134 to Ringold; 5,340,728 to Grosz et al.; 5,322,770 to Gelfand et al.; 5,338,671 to Scalice et al.; PCT WO 92/06200 to Cetus Corp.; PCT WO 94/14978 to Strack et al., which patent disclosures are entirely incorporated herein by reference) and RNA mediated amplification which uses antisense RNA to the target sequence as a template for double stranded DNA synthesis (U.S. patent No. 5,130,238 to Malek et al., with the tradename NASBA), the entire contents of which patents and references are herein entirely incorporated by reference. Reviews of the PCR are provided by Mullis (Cold Spring Harbor Symp. Quant. Biol. 51:263-273 (1986)); Saiki et al. (Bio/Technology 3:1008-1012 (1985)); and Mullis et al. (Meth. Enzymol. 155:335-350 (1987)). One skilled in the art can readily design such probes based on the sequence disclosed herein using methods such as computer alignment and sequence analysis known in the art. See, Ausubel, infra; Sambrook, infra.
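As a rough illustration of computer-aided primer design from a known sequence, the sketch below takes the two ends of a template as a forward and a reverse primer and estimates a melting temperature for each. It assumes the Wallace rule (2 °C per A or T, 4 °C per G or C), a coarse approximation suited only to short oligonucleotides; the function names, the primer length and the example template are hypothetical and are not part of the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C using the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primer_pair(template: str, length: int = 20) -> tuple[str, str]:
    """Forward primer = first `length` nt of the template; reverse primer = reverse
    complement of the last `length` nt, so both read 5' to 3'."""
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```

In practice a primer pair would also be screened for specificity, self-complementarity and matched melting temperatures, which this sketch does not attempt.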
The hybridization probes of the invention can be labeled by standard labeling techniques such as with a radiolabel, enzyme label, fluorescent label, biotin-avidin label, chemiluminescence, and any other known and suitable labels. After hybridization, the probes can be visualized using known methods. The nucleic acid probes of the invention include RNA, as well as DNA probes, such probes being generated using techniques known in the art (See, e.g., Ausubel, infra; Sambrook, infra). In one embodiment of the above described method, a nucleic acid probe is immobilized on a solid support. Examples of such solid supports include, but are not limited to, plastics such as polycarbonate, complex carbohydrates such as agarose and SEPHAROSE, and acrylic resins, such as polyacrylamide and latex beads. Techniques for coupling nucleic acid probes to such solid supports are well known in the art (See, e.g., Ausubel, infra; Sambrook, infra).
The test samples suitable for nucleic acid probing methods of the invention include, for example, cells or nucleic acid extracts of cells, or biological fluids. The sample used in the above-described methods will vary based on the assay format, the detection method and the nature of the tissues, cells or extracts to be assayed. Methods for preparing nucleic acid extracts of cells are well known in the art and can be readily adapted in order to obtain a sample which is compatible with the method utilized.
Methods for Detecting the Presence of PNS SCP Encoding Nucleic Acid in a Biological Sample. In another embodiment, the present invention relates to methods for detecting the presence of PNS SCP encoding nucleic acid in a sample. Such methods can comprise contacting the sample with the above-described nucleic acid probe, under conditions such that hybridization occurs, and detecting the presence of a labeled probe bound to the nucleic acid probe. One skilled in the art can select a suitable, labeled nucleic acid probe according to techniques known in the art as described above. Samples to be tested include, but are not limited to, RNA samples of mammalian tissue.
PNS SCP has been found to be expressed in peripheral nerve and dorsal root ganglion cells. Accordingly, PNS SCP probes can be used to detect the presence of RNA from PN cells in such a biological sample. Further, altered expression levels of PNS SCP RNA in an individual, as compared to normal levels, can indicate the presence of disease.
The PNS SCP probes can further be used to assay cellular activity in general and specifically in peripheral nervous system tissue.
A Kit for Detecting the Presence of PNS SCP in a Sample. In another embodiment, the present invention relates to a kit for detecting the presence of PNS SCP in a sample comprising at least one container having disposed therein the above-described nucleic acid probe. In a preferred embodiment, the kit further comprises other containers comprising one or more of the following: wash reagents and reagents capable of detecting the presence of bound nucleic acid probe. Examples of detection reagents include, but are not limited to, radiolabeled probes, enzymatic labeled probes (horseradish peroxidase, alkaline phosphatase), and affinity labeled probes (biotin, avidin, or streptavidin) (See, Ausubel, infra; Sambrook, infra).
A compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow the efficient transfer of reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the probe or primers used in the assay, containers which contain wash reagents (such as phosphate buffered saline, TRIS-buffers, and the like), and containers which contain the reagents used to detect the hybridized probe, bound antibody, amplified product, or the like.
One skilled in the art will readily recognize that the nucleic acid probes described in the invention can readily be incorporated into one of the established kit formats which are well known in the art.
DNA Constructs Comprising a PNS SCP Nucleic Acid Molecule and Hosts Containing These Constructs.
A nucleic acid sequence encoding a PNS SCP of the invention can be recombined with vector DNA in accordance with conventional techniques, including blunt-ended or staggered-ended termini for ligation, restriction enzyme digestion to provide appropriate termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and ligation with appropriate ligases. Techniques for such manipulations are disclosed, e.g., by Ausubel et al., infra, and are well known in the art.
A nucleic acid molecule, such as DNA, is said to be "capable of expressing" a polypeptide if it contains nucleotide sequences which contain transcriptional and translational regulatory information and such sequences are "operably linked" to nucleotide sequences which encode the polypeptide. An operable linkage is a linkage in which the regulatory DNA sequences and the DNA sequence sought to be expressed are connected in such a way as to permit gene expression as PNS SCPs or Ab fragments in recoverable amounts. The precise nature of the regulatory regions needed for gene expression can vary from organism to organism, as is well known in the analogous art. See, e.g., Sambrook, infra, and Ausubel, infra.
The invention accordingly encompasses the expression of a PNS SCP, in either prokaryotic or eukaryotic cells, although eukaryotic expression is preferred.
Preferred hosts are bacterial or eukaryotic hosts including bacteria, yeast, insects, fungi, bird and mammalian cells either in vivo or in situ, or host cells of mammalian, insect, bird or yeast origin. It is preferred that the mammalian cell or tissue is of human, primate, hamster, rabbit, rodent, cow, pig, sheep, horse, goat, dog or cat origin, but any other mammalian cell can be used.
Eukaryotic hosts can include yeast, insects, fungi, and mammalian cells either in vivo, or in tissue culture.
Preferred eukaryotic hosts can also include, but are not limited to, insect cells, mammalian cells either in vivo, or in tissue culture. Preferred mammalian cells include Xenopus oocytes, HeLa cells, cells of fibroblast origin such as VERO or CHO-K1, or cells of lymphoid origin and their derivatives.
Mammalian cells provide post-translational modifications to protein molecules including correct folding or glycosylation at correct sites. Mammalian cells which can be useful as hosts include cells of fibroblast origin such as, but not limited to, NIH 3T3, VERO or CHO, or cells of lymphoid origin, such as, but not limited to, the hybridoma SP2/O-Ag14 or the murine myeloma P3-X63Ag8, hamster cell lines (e.g., CHO-K1 and progenitors, e.g., CHO-DUXB11) and their derivatives. One preferred type of mammalian cells are cells which are intended to replace the function of the genetically deficient cells in vivo. Neuronally derived cells are preferred for gene therapy of disorders of the nervous system. For a mammalian cell host, many possible vector systems are available for the expression of at least one PNS SCP. A wide variety of transcriptional and translational regulatory sequences can be employed, depending upon the nature of the host. The transcriptional and translational regulatory signals can be derived from viral sources, such as, but not limited to, adenovirus, bovine papilloma virus, Simian virus, or the like, where the regulatory signals are associated with a particular gene which has a high level of expression. Alternatively, promoters from mammalian expression products, such as, but not limited to, actin, collagen, myosin, protein production. See, Ausubel, infra; Sambrook, infra.
When live insects are to be used, silk moth caterpillars and baculoviral vectors are presently preferred hosts for large scale PNS SCP production according to the invention. Production of PNS SCPs in insects can be achieved, for example, by infecting the insect host with a baculovirus engineered to express at least one PNS SCP by methods known to those skilled in the related arts. See Ausubel et al, eds. Current Protocols in Molecular Biology. Wiley Interscience, §16.-16.11 (1987, 1992, 1993, 1994).
In a preferred embodiment, the introduced nucleotide sequence will be incorporated into a plasmid or viral vector capable of autonomous replication in the recipient host. Any of a wide variety of vectors can be employed for this purpose. See, Ausubel et al., infra, §§ 1.5, 1.10, 7.1, 7.3, 8.1, 9.6, 9.7, 13.4, 16.2, 16.6, and 16.8-16.11.
Factors of importance in selecting a particular plasmid or viral vector include: the ease with which recipient cells that contain the vector can be recognized and selected from those recipient cells which do not contain the vector, the number of copies of the vector which are desired in a particular host; and whether it is desirable to be able to "shuttle" the vector between host cells of different species.
Different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, cleavage) of proteins. Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. For example, expression in a bacterial system can be used to produce an unglycosylated core protein product. Expression in yeast will produce a glycosylated product. Expression in mammalian cells can be used to ensure "native" glycosylation of the heterologous PNS SCP protein. Furthermore, different vector/host expression systems can effect processing reactions such as proteolytic cleavages to different extents.
As discussed above, expression of PNS SCP in eukaryotic hosts requires the use of eukaryotic regulatory regions. Such regions will, in general, include a promoter region sufficient to direct the initiation of RNA synthesis. See, e.g., Ausubel, infra; Sambrook, infra.
Once the vector or nucleic acid molecule containing the construct(s) has been prepared for expression, the DNA construct(s) can be introduced into an appropriate host cell by any of a variety of suitable means, i.e., transformation, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate-precipitation, direct microinjection, and the like. After the introduction of the vector, recipient cells are grown in a selective medium, which selects for the growth of vector-containing cells. Expression of the cloned gene molecule(s) results in the production of at least one PNS SCP. This can take place in the transformed cells as such, or following the induction of these cells to differentiate (for example, by administration of bromodeoxyuracil to neuroblastoma cells or the like).
Isolation of PNS SCP. The PNS SCP proteins or fragments of this invention can be obtained by expression from recombinant DNA as described above. Alternatively, a PNS SCP can be purified from biological material. If so desired, the expressed at least one PNS SCP can be isolated and purified in accordance with conventional method steps, such as extraction, precipitation, chromatography, affinity chromatography, electrophoresis, or the like. For example, cells expressing at least one PNS SCP in suitable levels can be collected by centrifugation, or with suitable buffers, lysed, and the protein isolated by column chromatography, for example, on DEAE-cellulose, phosphocellulose, polyribocytidylic acid-agarose, hydroxyapatite or by electrophoresis or immunoprecipitation. Alternatively, PNS SCPs can be isolated by the use of specific antibodies, such as, but not limited to, a PNS SCP or SC antibody. Such antibodies can be obtained by known method steps (see, e.g., Colligan, infra; Ausubel, infra).
For purposes of the invention, one method of purification which is illustrative, without being limiting, consists of the following steps. A first step in the purification of a PNS SCP includes extraction of the PNS SCP fraction from a biological sample, such as peripheral nerve tissue or dorsal root ganglia (DRG), in buffers, with or without solubilizing agents such as urea, formic acid, detergent, or thiocyanate. A second step includes subjecting the solubilized material to ion-exchange chromatography on Mono-Q or Mono-S columns (Pharmacia LKB Biotechnology, Inc., Piscataway, NJ). Similarly, the solubilized material can be separated by any other process wherein molecules can be separated according to charge density, charge distribution and molecular size, for example. Elution of the PNS SCP from the ion-exchange resin is monitored by an immunoassay, such as M-IRMA, on each fraction. Immunoreactive peaks are then dialyzed, lyophilized, and subjected to molecular sieve, or gel, chromatography. In a third step, molecular sieve or gel chromatography is a type of partition chromatography in which separation is based on molecular size. Dextran, polyacrylamide, and agarose gels are commonly used for this type of separation. One useful gel for the invention is SEPHAROSE 12 (Pharmacia LKB Biotechnology, Inc.). However, other methods, known to those of skill in the art, can be used to effectively separate molecules based on size. A fourth step in a purification protocol for a PNS SCP can include analyzing the immunoreactive peaks by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), a further gel chromatographic purification step, and staining, such as, for example, silver staining. A fifth step in a purification method can include subjecting the PNS SCP obtained after SDS-PAGE to affinity chromatography, or any other procedure based upon affinity between a substance to be isolated and a molecule to which it can specifically bind.
For further purification of a PNS SCP, affinity chromatography on SEPHAROSE conjugated to anti-PNS SCP mAbs (specific mAbs generated against substantially pure PNS SCP) can be used. Alternative methods, such as reverse-phase HPLC, or any other method characterized by rapid separation with good peak resolution, are useful.
It will be appreciated that other purification steps can be substituted for the preferred method described above.
Those of skill in the art will be able to devise alternate purification schemes without undue experimentation.
An Antibody Having Binding Affinity to a PNS SCP Peptide and a Hybridoma Containing the Antibody.
In another embodiment, the invention relates to an antibody having binding affinity specifically to a PNS SCP peptide as described above or fragment thereof. Those which bind selectively to PNS SCP would be chosen for use in methods which could include, but should not be limited to, the analysis of altered PNS SCP expression in tissue containing PNS SCP.
The PNS SCP proteins of the invention can be used in a variety of procedures and methods, such as for the generation of antibodies, for use in identifying pharmaceutical compositions, and for studying DNA/protein interaction.
The PNS SCP peptide of the invention can be used to produce antibodies or hybridomas. One skilled in the art will recognize that if an antibody is desired, such a peptide would be generated as described herein and used as an immunogen.
The antibodies of the invention include monoclonal and polyclonal antibodies, as well as fragments of these antibodies. The invention further includes single chain antibodies. Antibody fragments which contain the idiotype of the molecule can be generated by known techniques.
The term "antibody" is meant to include polyclonal antibodies, monoclonal antibodies (mAbs), chimeric antibodies, anti-idiotypic (anti-Id) antibodies to antibodies that can be labeled in soluble or bound form, as well as fragments thereof provided by any known technique, such as, but not limited to, enzymatic cleavage, peptide synthesis or recombinant techniques. Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the sera of animals immunized with an antigen. A monoclonal antibody contains a substantially homogeneous population of antibodies specific to antigens, which population contains substantially similar epitope binding sites.
mAbs can be obtained by methods known to those skilled in the art. See, Kohler and Milstein, Nature 256:495-497 (1975); U.S. Patent No. 4,376,110; Ausubel et al., eds., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley Interscience, N.Y., (1987, 1992); and Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1988); Colligan et al., eds., Current Protocols in Immunology, Greene Publishing Assoc. and Wiley Interscience, N.Y. (1992, 1993), the contents of which references are incorporated entirely herein by reference. Such antibodies can be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. A hybridoma producing a mAb of the invention can be cultivated in vitro, in situ or in vivo.
Production of high titers of mAbs in vivo or in situ makes this the presently preferred method of production.
Chimeric antibodies are molecules different portions of which are derived from different animal species, such as those having a variable region derived from a murine mAb and a human immunoglobulin constant region, which are primarily used to reduce immunogenicity in application and to increase yields in production, for example, where murine mAbs have higher yields from hybridomas but higher immunogenicity in humans, such that human/murine chimeric mAbs are used. Chimeric antibodies and methods for their production are known in the art (Cabilly et al., Proc. Natl. Acad. Sci. USA 81:3273-3277 (1984); Morrison et al., Proc. Natl. Acad. Sci. USA 81:6851-6855 (1984); Boulianne et al., Nature 312:643-646 (1984); Cabilly et al., European Patent Application 125023; Neuberger et al., Nature 314:268-270 (1985); Taniguchi et al., European Patent Application 171 496; Morrison et al., European Patent Application 173 494; Neuberger et al., PCT Application WO 86/01533; Kudo et al., European Patent Application 134 187; Sahagan et al., J. Immunol. 137:1066-1074 (1986); Robinson et al., International Patent Publication No. PCT/US86/02269; Liu et al., Proc. Natl. Acad. Sci. USA 84:3439-3443 (1987); Sun et al., Proc. Natl. Acad. Sci. USA 84:214-219 (1987); Better et al., Science 240:1041-1043 (1988); and Harlow, infra). These references are entirely incorporated herein by reference.
An anti-idiotypic (anti-Id) antibody is an antibody which recognizes unique determinants generally associated with the antigen-binding site of an antibody. An anti-Id antibody can be prepared by immunizing an animal of the same species and genetic type (e.g., mouse strain) as the source of the mAb with the mAb to which an anti-Id is being prepared. The immunized animal will recognize and respond to the idiotypic determinants of the immunizing antibody by producing an antibody to these idiotypic determinants (the anti-Id antibody). See, for example, U.S. patent No. 4,699,880, which is herein entirely incorporated by reference.
The anti-Id antibody can also be used as an "immunogen" to induce an immune response in yet another animal, producing a so-called anti-anti-Id antibody. The anti-anti-Id can be epitopically identical to the original mAb which induced the anti-Id. Thus, by using antibodies to the idiotypic determinants of a mAb, it is possible to identify other clones expressing antibodies of identical specificity.
Accordingly, mAbs generated against a PNS SCP of the invention can be used to induce anti-Id antibodies in suitable animals, such as BALB/c mice. Spleen cells from such immunized mice are used to produce anti-Id hybridomas secreting anti-Id mAbs. Further, the anti-Id mAbs can be coupled to a carrier such as keyhole limpet hemocyanin (KLH) and used to immunize additional BALB/c mice. Sera from these mice will contain anti-anti-Id antibodies that have the binding properties of the original mAb specific for a PNS SCP specific epitope. The anti-Id mAbs thus have their own idiotypic epitopes, or "idiotopes", structurally similar to the epitope being evaluated.
The term "antibody" is also meant to include both intact molecules as well as fragments thereof, such as, for example, Fab and F(ab')2, which are capable of binding antigen. Fab and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from the circulation, and can have less non-specific tissue binding than an intact antibody (Wahl et al., J. Nucl. Med. 24:316-325 (1983)). It will be appreciated that Fab, F(ab')2 and other fragments of the antibodies useful in the invention can be used for the detection and/or quantitation of a PNS SCP according to the methods disclosed herein for intact antibody molecules. Such fragments are typically produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments).
An antibody is said to be "capable of binding" a molecule if it is capable of specifically reacting with the molecule to thereby bind the molecule to the antibody. The term "epitope" is meant to refer to that portion of any molecule capable of being bound by an antibody which can also be recognized by that antibody. Epitopes or "antigenic determinants" usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and have specific three dimensional structural characteristics as well as specific charge characteristics.
An "antigen" is a molecule or a portion of a molecule capable of being bound by an antibody which is additionally capable of inducing an animal to produce antibody capable of binding to an epitope of that antigen. An antigen can have one, or more than one, epitope. The specific reaction referred to above is meant to indicate that the antigen will react, in a highly selective manner, with its corresponding antibody and not with the multitude of other antibodies which can be evoked by other antigens.
Immunoassays. Antibodies of the invention, directed against a PNS SCP, can be used to detect or diagnose a PNS SCP or PNS SC related pathologies. Screening methods provided by the invention can include, e.g., immunoassays employing radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA) methodologies, based on the production of specific antibodies (monoclonal or polyclonal) to a PNS SCP. For these assays, biological samples are obtained by, e.g., nerve biopsy or other peripheral nervous system tissue sampling. For example, in one form of RIA, the substance under test is mixed with diluted antiserum in the presence of radiolabeled antigen. In this method, the concentration of the test substance will be inversely proportional to the amount of labeled antigen bound to the specific antibody and directly related to the amount of free labeled antigen. Other suitable screening methods will be readily apparent to those of skill in the art.
Furthermore, one skilled in the art can readily adapt currently available procedures, as well as the techniques, methods and kits disclosed above with regard to antibodies, to generate peptides capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides; for example, see Hruby et al., "Application of Synthetic Peptides: Antisense Peptides", In: Synthetic Peptides, A User's Guide, W.H. Freeman, N.Y., pp. 289-307 (1992), and Kaspczak et al., Biochemistry 28:9230-8 (1989).
One embodiment for carrying out the diagnostic assay of the invention on a biological sample containing a PNS SCP comprises: contacting a detectably labeled PNS SCP-specific antibody with a solid support to effect immobilization of said PNS SCP-specific antibody or a fragment thereof; contacting a sample suspected of containing a PNS SCP with said solid support; incubating said detectably labeled PNS SCP-specific antibody with said support for a time sufficient to allow the immobilized PNS SCP-specific antibody to bind to the PNS SCP; separating the solid phase support from the incubation mixture obtained in the preceding step; and detecting the bound label and thereby detecting and quantifying PNS SCP.
The specific concentrations of detectably labeled antibody and PNS SCP, the temperature and time of incubation, as well as other assay conditions can be varied, depending on various factors including the concentration of a PNS SCP in the sample, the nature of the sample, and the like. The binding activity of a given lot of anti-PNS SCP antibody can be determined according to well known methods. Those skilled in the art will be able to determine operative and optimal assay conditions for each determination by employing routine experimentation. Other such steps as washing, stirring, shaking, filtering and the like can be added to the assays as is customary or necessary for the particular situation.
Detection can be accomplished using any of a variety of assays. For example, by radioactively labeling the PNS SCP-specific antibodies or antibody fragments, it is possible to detect PNS SCP through the use of radioimmune assays. A good description of a radioimmune assay can be found in Colligan, infra, and Ausubel, infra, entirely incorporated by reference herein. Preferably, the detection of cells which express a PNS SCP can be accomplished by in vivo imaging techniques, in which the labeled antibodies (or fragments thereof) are provided to a subject, and the presence of the PNS SCP is detected without the prior removal of any tissue sample. Such in vivo detection procedures have the advantage of being less invasive than other detection methods, and are, moreover, capable of detecting the presence of PNS SCP in tissue which cannot be easily removed from the patient, such as brain tissue.
There are many different in vivo labels and methods of labeling known to those of ordinary skill in the art. Examples of the types of labels which can be used in the invention include radioactive isotopes and paramagnetic isotopes. Those of ordinary skill in the art will know of other suitable labels for binding to the antibodies used in the invention, or will be able to ascertain such, using routine experimentation. Furthermore, the binding of these labels to the antibodies can be done using standard techniques common to those of ordinary skill in the art.
For diagnostic in vivo imaging, the type of detection instrument available is a major factor in selecting a given radionuclide. The radionuclide chosen must have a type of decay which is detectable for a given type of instrument. In general, any conventional method for visualizing diagnostic imaging can be utilized in accordance with this invention. For example, positron emission tomography (PET), gamma, beta, and magnetic resonance imaging (MRI) detectors can be used to visualize diagnostic imaging.
The antibodies useful in the invention can also be labeled with paramagnetic isotopes for purposes of in vivo diagnosis. Elements which are particularly useful, as in Magnetic Resonance Imaging (MRI), include Mn, Dy, and Fe.
The antibodies (or fragments thereof) useful in the invention are also particularly suited for use in in vitro immunoassays to detect the presence of a PNS SCP in body tissue, fluids (such as CSF), or cellular extracts. In such immunoassays, the antibodies (or antibody fragments) can be utilized in liquid phase or, preferably, bound to a solid phase carrier, as described above.
In situ detection can be accomplished by removing a histological specimen from a patient, and providing the combination of labeled antibodies of the invention to such a specimen. The antibody (or fragment) is preferably provided by applying or by overlaying the labeled antibody (or fragment) to a biological sample. Through the use of such a procedure, it is possible to determine not only the presence of a PNS SCP, but also the distribution of a PNS SCP on the examined tissue. Using the invention, those of ordinary skill will readily perceive that any of a wide variety of histological methods (such as staining procedures) can be modified in order to achieve such in situ detection.
As used herein, an effective amount of a diagnostic reagent (such as an antibody or antibody fragment) is one capable of achieving the desired diagnostic discrimination and will vary depending on such factors as age, condition, sex, the extent of disease of the subject, counter-indications, if any, and other variables to be adjusted by the physician. The amount of such materials typically used in a diagnostic test is generally between 0.1 and 5 mg, and preferably between 0.1 and 0.5 mg.
The assay of the invention is also ideally suited for the preparation of a kit. Such a kit can comprise a carrier means being compartmentalized to receive in close confinement therewith one or more container means such as vials, tubes and the like, each of said container means comprising the separate elements of the immunoassay.
For example, there can be a container means containing a first antibody immobilized on a solid phase support, and a further container means containing a second detectably labeled antibody in solution. Further container means can contain standard solutions comprising serial dilutions of the PNS SCP to be detected. The standard solutions of a PNS SCP can be used to prepare a standard curve with the concentration of PNS SCP plotted on the abscissa and the detection signal on the ordinate. The results obtained from a sample containing a PNS SCP can be interpolated from such a plot to give the concentration of the PNS SCP.
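By way of non-limiting illustration, the interpolation from a standard curve can be sketched as follows. The function name, the data layout, and the use of straight-line interpolation between neighbouring standards are assumptions made for this sketch; the specification does not prescribe a particular curve-fitting method, and the sketch further assumes that the signal changes monotonically with concentration.

```python
from bisect import bisect_left


def interpolate_concentration(standards, signal):
    """Estimate a sample concentration from a standard curve.

    standards: (concentration, signal) pairs measured on serial dilutions.
    signal:    detection signal measured for the unknown sample.
    Linear interpolation between the two bracketing standards is an
    assumption of this sketch, not a requirement of the assay.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range of the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]  # exact hit on a standard
    (c0, s0), (c1, s1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: concentration (arbitrary units) versus signal.
curve = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.4), (4.0, 0.8)]
estimate = interpolate_concentration(curve, 0.6)
```

A signal outside the range spanned by the standards raises an error rather than extrapolating, since extrapolation beyond the serial dilutions is generally unreliable.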
Diagnostic Screening and Treatment. It is to be understood that although the following discussion is specifically directed to human patients, the teachings are also applicable to any animal that expresses at least one PNS SC. The diagnostic and screening methods of the invention are especially useful for a patient suspected of being at risk for developing a disease associated with an altered expression level of PNS SCP based on family history, or a patient in which it is desired to diagnose a PNS SCP-related disease.
According to the invention, presymptomatic screening of an individual in need of such screening is now possible using DNA encoding the PNS SCP protein of the invention. The screening method of the invention allows a presymptomatic diagnosis, including prenatal diagnosis, of the presence of a missing or aberrant PNS SC gene in individuals, and thus an opinion concerning the likelihood that such individual would develop or has developed a PNS SC-associated disease. This is especially valuable for the identification of carriers of altered or missing PNS SC genes, for example, from individuals with a family history of a PNS SC-related pathology. Early diagnosis is also desired to maximize appropriate timely intervention.
In one preferred embodiment of the method of screening, a tissue sample would be taken from such individual, and screened for the presence of the "normal" PNS SCP gene, the presence of PNS SCP mRNA and/or the presence of PNS SCP protein. The normal human gene can be characterized based upon, for example, detection of restriction digestion patterns in "normal" versus the patient's DNA, including RFLP analysis, using DNA probes prepared against the PNS SCP sequence (or a functional fragment thereof) taught in the invention. Similarly, PNS SCP mRNA can be characterized and compared to normal PNS SCP mRNA levels and/or size as found in a human population not at risk of developing PNS SCP-associated disease using similar probes. Lastly, PNS SCP protein can be detected and/or quantitated using a biological assay for PNS SCP activity or using an immunological assay and PNS SCP antibodies. When assaying PNS SCP protein, the immunological assay is preferred for its speed. An aberrant PNS SCP DNA size pattern, and/or aberrant PNS SCP mRNA sizes or levels and/or aberrant PNS SCP protein levels would indicate that the patient is at risk for developing a PNS SCP-associated disease.
The screening and diagnostic methods of the invention do not require that the entire PNS SCP DNA coding sequence be used for the probe. Rather, it is only necessary to use a fragment or length of nucleic acid that is sufficient to detect the presence of the PNS SCP gene in a DNA preparation from a normal or affected individual, the absence of such gene, or an altered physical property of such gene (such as a change in electrophoretic migration pattern).
Prenatal diagnosis can be performed when desired, using any known method to obtain fetal cells, including amniocentesis, chorionic villus sampling (CVS), and fetoscopy. Prenatal chromosome analysis can be used to determine if the portion of the chromosome possessing the normal PNS SCP gene is present in a heterozygous state.
Overview of PNS SCP Purification and Crystallization Methods. In general, a PNS SCP, as a membrane protein, is purified in soluble form using detergents (e.g., octylglucosides) or other suitable amphiphilic molecules. The resulting PNS SCP is in sufficient purity and concentration for crystallization. The purified PNS SCP is then isolated and assayed for biological activity and for lack of aggregation (which interferes with crystallization). The purified and cleaved PNS SCP preferably runs as a single band under reducing or nonreducing polyacrylamide gel electrophoresis (PAGE) (nonreducing is used to evaluate the presence of cysteine bridges). The purified PNS SCP is preferably crystallized under varying conditions of at least one of the following: pH, buffer type, buffer concentration, salt type, polymer type, polymer concentration, other precipitating ligands and concentration of purified and cleaved PNS SCP, by known methods. See, e.g., Michel, Trends in Biochem. Sci. 8:56-59 (1983); Deisenhofer et al., J. Mol. Biol. 180:385-398 (1984); Weiss et al., FEBS Lett. 267:268-272 (1990); Blundell et al., Protein Crystallography, Academic Press, London (1976); Oxender et al., eds., Protein Engineering, Liss, New York (1986); McPherson, The Preparation and Analysis of Protein Crystals, Wiley, N.Y. (1982); or the methods provided in a commercial kit, such as CRYSTAL SCREEN (Hampton Research, Riverside, CA). The crystallized protein is also tested for at least one SC activity, and differently sized and shaped crystals are further tested for suitability in X-ray diffraction. Generally, larger crystals provide better crystallography than smaller crystals, and thicker crystals provide better crystallography than thinner crystals. See Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff et al., eds., Diffraction Methods for Biological Macromolecules, Vols. 114-115, Methods in Enzymology, Orlando, FL: Academic Press (1985).
Protein Crystallization Methods. The hanging drop method is preferably used to crystallize a purified soluble PNS SCP protein. See Taylor et al., J. Mol. Biol. 226:1287-1290 (1992); Takimoto et al. (1992), infra; CRYSTAL SCREEN, Hampton Research. A mixture of the protein and precipitant can include the following: pH (e.g., 4-10); buffer type (e.g., tromethamine (TRIZMA), sodium azide, phosphate, sodium, or cacodylate acetates, imidazole, Tris HCl, sodium HEPES); buffer concentration (e.g., 0.1-100 mM); salt type (e.g., sodium azide, calcium chloride, sodium citrate, magnesium chloride, ammonium acetate, ammonium sulfate, potassium phosphate, magnesium acetate, zinc acetate, calcium acetate); polymer type and concentration (e.g., polyethylene glycol (PEG) 1-50%, type 6000-10,000); other precipitating ligands (salts: potassium, sodium, tartrate, ammonium sulfate, sodium acetate, lithium sulfate, sodium formate, sodium citrate, magnesium formate, sodium phosphate, potassium phosphate; organics: 2-propanol; non-volatile: 2-methyl-2,4-pentanediol); and concentration of purified PNS SCP (e.g., 0.1-100 mg/ml), with added amphiphilic molecules (detergents such as octylglucosides). See, e.g., CRYSTAL SCREEN, Hampton Research.
The above mixtures are used and screened by varying at least one of pH, buffer type, buffer concentration, precipitating salt type or concentration, PEG type, PEG concentration, and cleaved protein concentration. Crystals ranging in size from 0.1-0.5 mm are formed in 1-14 days. These crystals diffract X-rays to at least 10 Å resolution, such as 1.5-10.0 Å, or any range of value therein, such as 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4 or 3.5, with 3.5 Å or less being preferred for the highest resolution. In addition to diffraction patterns having this highest resolution, lower resolution, such as 25-3.5 Å, can further be used.
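As a minimal sketch of how such a screen can be enumerated, each varied parameter can be treated as one axis of a grid and every combination listed as one candidate hanging drop. The parameter names and the specific values below are hypothetical choices drawn from within the ranges recited above; a real screen would typically be sparser than a full grid.

```python
from itertools import product


def screen_conditions(ph_values, buffers, peg_percents, protein_mg_ml):
    """Yield one dict per crystallization trial in a full factorial grid.

    Every argument is a list of candidate values for one screening axis.
    """
    for ph, buffer_type, peg, protein in product(
        ph_values, buffers, peg_percents, protein_mg_ml
    ):
        yield {
            "pH": ph,
            "buffer": buffer_type,
            "peg_percent": peg,
            "protein_mg_ml": protein,
        }


# Hypothetical values inside the recited ranges (pH 4-10, PEG 1-50%,
# protein 0.1-100 mg/ml); they are illustrative only.
conditions = list(
    screen_conditions(
        ph_values=[4, 6, 8, 10],
        buffers=["Tris HCl", "imidazole"],
        peg_percents=[5, 20],
        protein_mg_ml=[1, 10],
    )
)
```

Here the grid contains 4 x 2 x 2 x 2 = 32 trial conditions, each of which can be recorded alongside the observed crystal size and diffraction quality.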
Protein Crystals. Crystals appear after 1-14 days and continue to grow on subsequent days. Some of the crystals are removed, washed, and assayed for biological activity, which activity is preferred for use in further characterizations. Other washed crystals are preferably run on a stained gel, and those that migrate in the same position as the purified cleaved PNS SCP are preferably used. From two to one hundred crystals are observed in one drop, and crystal forms can occur, such as, but not limited to, bipyramidal, rhomboid, and cubic. Initial X-ray analyses are expected to indicate that such crystals diffract at moderately high to high resolution. When fewer crystals are produced in a drop, they can be of much larger size, e.g., 0.2-0.5 mm.
PNS SCP X-ray Crystallography Methods. The crystals so produced for a PNS SCP are X-ray analyzed using a suitable X-ray source. A suitable number of diffraction patterns are obtained. Crystals are preferably stable for at least 10 hrs in the X-ray beam. Frozen crystals (e.g., -220 to -50 °C) are optionally used for longer X-ray exposures (e.g., 4-72 hrs), the crystals being relatively more stable to the X-rays in the frozen state. To collect the maximum number of useful reflections, multiple frames are optionally collected as the crystal is rotated in the X-ray beam, e.g., for 12-96 hrs. Larger crystals (>0.2 mm) are preferred, to increase the resolution of the X-ray diffraction. Crystals are preferably analyzed using a synchrotron high energy X-ray source. Using frozen crystals, X-ray diffraction data is collected on crystals that diffract to a resolution of 10-1.5 Å, with lower resolutions also useful, such as 25-10 Å, sufficient to solve the three-dimensional structure of a PNS SCP in considerable detail, as presented herein.
Computer Related Embodiments. An amino acid sequence of a PNS SCP and/or x-ray diffraction data, useful for computer molecular modeling of a PNS SCP or a portion thereof, can be "provided" in a variety of mediums to facilitate use thereof. As used herein, provided refers to a manufacture which contains a PNS SCP amino acid sequence and/or x-ray diffraction data of the present invention, e.g., the amino acid sequence provided in Figures 1, 8, 10 or 11, a representative fragment thereof, or an amino acid sequence having at least 80-100% overall identity to a 5-2005 amino acid fragment of an amino acid sequence of Figures 11A-D or a variant thereof. Such a method provides the amino acid sequence and/or x-ray diffraction data in a form which allows a skilled artisan to analyze and molecular model the three-dimensional structure of a PNS SCP or subdomain thereof.
In one application of this embodiment, PNS SCP, or at least one subdomain thereof, amino acid sequence and/or x-ray diffraction data of the present invention is recorded on computer readable medium. As used herein, "computer readable medium" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon an amino acid sequence and/or x-ray diffraction data of the present invention.
As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising an amino acid sequence and/or x-ray diffraction data information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon an amino acid sequence and/or x-ray diffraction data of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the sequence and x-ray data information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the information of the present invention.
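As one hedged illustration of recording sequence information in an ASCII file, the sketch below writes a one-letter amino acid sequence to a plain text file and reads it back. The FASTA-style layout (a ">" header line followed by wrapped sequence lines), the function names, and the placeholder sequence are assumptions of this sketch; the specification requires only that the information be stored in some computer readable form.

```python
import os
import tempfile
import textwrap
from pathlib import Path


def save_sequence(path, name, sequence, width=60):
    """Write a sequence as ASCII text: a '>' header, then wrapped lines."""
    body = "\n".join(textwrap.wrap(sequence, width))
    Path(path).write_text(f">{name}\n{body}\n", encoding="ascii")


def load_sequence(path):
    """Read back the (name, sequence) pair written by save_sequence."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    return lines[0][1:], "".join(lines[1:])


def demo_roundtrip():
    """Store a placeholder sequence in a temporary file and reload it."""
    placeholder = "LSIFK" * 30  # illustrative only, not a PNS SCP sequence
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "example.txt")
        save_sequence(path, "example", placeholder)
        return load_sequence(path)
```

A database table keyed on the sequence name would serve equally well; the flat file is shown only because it is the simplest of the recited options.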
By providing the PNS SCP sequence and/or x-ray diffraction data on computer readable medium, a skilled artisan can routinely access the sequence and x-ray diffraction data to model a PNS SCP, a subdomain thereof, or a ligand thereof. Computer algorithms are publicly and commercially available which allow a skilled artisan to access this data provided in a computer readable medium and analyze it for molecular modeling and/or rational drug design (RDD).
The present invention further provides systems, particularly computer-based systems, which contain the sequence and/or diffraction data described herein. Such systems are designed to do molecular modeling and RDD for a PNS SCP or at least one subdomain thereof.
As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the sequence and/or x-ray diffraction data of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate which of the currently available computer-based systems are suitable for use in the present invention.
As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a PNS SCP or fragment sequence and/or x-ray diffraction data of the present invention and the necessary hardware means and software means for supporting and implementing an analysis means. As used herein, "data storage means" refers to memory which can store sequence or x-ray diffraction data of the present invention, or a memory access means which can access manufactures having recorded thereon the sequence or x-ray data of the present invention.
As used herein, "search means" or "analysis means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence or x-ray data stored within the data storage means. Search means are used to identify fragments or regions of a PNS SCP which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting computer analyses can be adapted for use in the present computer-based systems.
As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration or electron density map which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites, structural subdomains, epitopes, functional domains and signal sequences. A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.
A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify structural motifs or electron density maps. A skilled artisan can readily recognize that any one of the publicly available computer modeling programs can be used as the search means for the computer-based systems of the present invention.
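A minimal sketch of one possible search means is given below: it scans a stored one-letter amino acid sequence for every position at which a target sequence motif matches. The function name, the use of "X" as a wildcard for any residue, and the example sequence are assumptions introduced for illustration; the publicly and commercially available programs referred to above use considerably more sophisticated scoring.

```python
def find_motif(sequence, motif):
    """Return 1-based start positions where motif matches sequence.

    An 'X' in the motif matches any residue (a convention assumed here).
    Overlapping matches are reported.
    """
    if not motif:
        raise ValueError("motif must contain at least one residue")
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        if all(m == "X" or m == r for m, r in zip(motif, window)):
            hits.append(start + 1)
    return hits


# Placeholder sequence for illustration, not an actual PNS SCP sequence.
stored = "LSIFKLSIF"
exact_hits = find_motif(stored, "LSIF")
wildcard_hits = find_motif(stored, "SXF")
```

Positions are reported 1-based to follow the usual residue numbering of protein sequences.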
One application of this embodiment is provided in Figure 12. Figure 12 provides a block diagram of a computer system 102 that can be used to implement the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage memory 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once inserted in the removable medium storage device 114. A monitor 120 connected to the bus 104 can be used to visualize the structure determination data.
Amino acid, encoding nucleotide or other sequence and/or x-ray diffraction data of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. Software for accessing and processing the amino acid sequence and/or x-ray diffraction data (such as search tools, comparing tools, etc.) resides in main memory 108 during execution.
Three Dimensional Structure Determination. One or more computer modeling steps and/or computer algorithms are used to provide a molecular 3-D model of a cleaved PNS SCP, using amino acid sequence data from Figures 1, 8, 10 or 11 (or variants thereof) and/or x-ray diffraction data. If only the amino acid sequence is used for three-dimensional structure determination, then a suitable modeling program can be used, e.g., LINUS (Rose et al., Proteins: Structure, Function and Genetics (June, 1995)) and references cited therein. It is preferred that the PNS SCP model has no or Ala-substituted (for surface) residues in disallowed regions of the Ramachandran plot and gives a positive 3D-1D profile (Lüthy et al., Nature 356:83-85 (1992)), suggesting that all the residues are in acceptable environments (Kraulis (1991), infra). Alternatively, the disallowed regions can be corrected by the use of suitable algorithms, such as the RAVE program described herein. Phase determination is optionally used for solving the three-dimensional structure of a cleaved PNS SCP. This structure can then be used for RDD of modulators of PNS SCP neuraminidase, endothelin, cathepsin A or other biological activity which is relevant to a PNS SCP-related pathology.
Density Modification and Map Interpretation. Electron density maps can be calculated using such programs as those from the CCP4 computing package (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory, UK, 1979). Cycles of two-fold averaging can further be used, such as with the program RAVE (Kleywegt and Jones, in Bailey et al., eds., From First Map to Final Model, SERC Daresbury Laboratory, UK, pp. 59-66 (1994)) and gradual model expansion. For map visualization and model building, a program such as "O" (Jones (1991), infra) can be used.
Refinement and Model Validation. Rigid body and positional refinement can be carried out using a program such as X-PLOR (Brünger (1992), supra), e.g., with the stereochemical parameters of Engh and Huber (Acta Cryst. A47:392-400 (1991)). If the model at this stage in the averaged maps still misses residues (e.g., at least 5-10 per subunit), then some or all of the missing residues can be incorporated in the model during additional cycles of positional refinement and model building. The refinement procedure can start using data from lower resolution (e.g., 25-10 Å to 10-3.0 Å) and then be gradually extended to include data from 12-6 Å to 3.0-1.5 Å. B-values for individual atoms can be refined once data between 2.9 and 1.5 Å has been added. Subsequently, waters can be gradually added. A program such as ARP (Lamzin and Wilson, Acta Cryst. D49:129-147 (1993)) can be used to add crystallographic waters and as a tool to check for bad areas in the model. Programs such as PROCHECK (Laskowski et al., J. Appl. Cryst. 26:283-291 (1993)), WHATIF (Vriend, J. Mol. Graph. 8:52-56 (1990)) and PROFILE 3D (Lüthy et al., Nature 356:83-85 (1992)), as well as the geometrical analysis generated by X-PLOR, can be used to check the structure for errors. For the final refinement cycle, 20-5% of the weakest data can be rejected using a |F_obs|/σ cutoff, and anisotropic scaling between F_obs and F_calc applied after careful assessment of the quality and completeness of the data.
Structure Analysis. A program such as DSSP can be used to assign the secondary structure elements (Kabsch and Sander (1983), infra). A program such as SUPPOS (from the BIOMOL crystallographic computing package) can be used for some or all of the least-squares superpositions of various models and parts of models. Solvent accessible surfaces and electrostatic potentials can be calculated using such programs as GRASP (Nicholls et al. (1991), infra).
Structure Determination. The structure of a PNS SCP can thus be solved with the molecular replacement procedure, such as by using X-PLOR (Brünger (1992), supra). A partial search model for the monomer can be constructed using a related protein, such as wheat serine carboxypeptidase structure (Liao et al., infra). The rotation and translation function can be used to yield two or more orientations and positions for two subunits to form a physiological dimer as determined based on their interactions. Cyclical two-fold density averaging can also be done using the RAVE program, and model expansion can also be used to add missing residues for each monomer, resulting in a model with 95-99.9% of the total number of residues. The model can be refined in a program such as X-PLOR (Brünger (1992), supra) to a suitable crystallographic R-factor. The model data is then saved on computer readable medium for use in further analysis, such as rational drug design.
Rational Design of Drugs that Interact with the PNS SCP. The determination of the three-dimensional structure of a cleaved PNS SCP, as described herein, provides a basis for the design of new and specific ligands for the diagnosis and/or treatment of at least one PNS SCP-related pathology. Several approaches can be taken for the use of the crystal structure of a PNS SCP in the rational design of ligands of this protein. A computer-assisted, manual examination of the active site structure is optionally done. Software such as GRID (Goodford, J. Med. Chem. 28:849-857 (1985)), a program that determines probable interaction sites between probes with various functional group characteristics and the enzyme surface, is used to analyze the active site to determine structures of inhibiting compounds. The program calculations, with suitable inhibiting groups on molecules (e.g., protonated primary amines) as the probe, are used to identify potential hotspots around accessible positions at suitable energy contour levels. Suitable ligands, as inhibiting or stimulating modulating compounds or compositions, are then tested for modulating activities of at least one PNS SCP.
A diagnostic or therapeutic PNS SCP modulating ligand of the present invention can be, but is not limited to, at least one selected from a nucleic acid, a compound, a protein, an element, a lipid, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, an antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies. Such labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds. Alternatively, any other known diagnostic or therapeutic agent can be used in a method of the invention.
After preliminary experiments are done to determine the $K_m$ of the substrate with each enzyme activity of a PNS SCP, the time-dependent nature of modulation and the ligand $K_i$ values are determined by the method of Henderson (Biochem. J. 127:321-333 (1972)). For example, the substrate (or blank where appropriate) and enzyme are pre-incubated in buffer. Reactions are initiated by the addition of substrate. Aliquots are removed over a suitable time course and each quenched by addition into the aliquots of suitable quenching solution (e.g., sodium hydroxide in aqueous ethanol). The concentration of product is determined fluorometrically, using a spectrometer. Plots of fluorescence against time can be close to linear over the assay period, and are used to obtain values for the initial velocity in the presence or absence of ligand. Error is present in both axes in a Henderson plot, making it inappropriate for standard regression analysis (Leatherbarrow, Trends Biochem. Sci. 15:455-458 (1990)). Therefore, $K_i$ values are obtained from the data by fitting to a modified version of the Henderson equation for competitive inhibition:
$$Q r^2 + (E_t - Q - I_t)\, r - E_t = 0$$
where (using the notation of Henderson (Biochem. J. 127:321-333 (1972))) $Q = K_i (A_t + K_a)/K_a$ and $r = v_0/v_i$. This equation is solved for the positive root, with the constraint that $Q = K_i (A_t + K_a)/K_a$, using PROC NLIN from SAS (SAS Institute Inc., Cary, North Carolina, USA), which performs nonlinear regression using least-squares techniques. The iterative method used is optionally the multivariate secant method, similar to the Gauss-Newton method, except that the derivatives in the Taylor series are estimated from the history of iterations rather than supplied analytically. A suitable convergence criterion is optionally used, where there is a change in loss function of less than $10^{-4}$.
Once modulating ligands are found and isolated or synthesized, crystallographic studies of the compounds complexed to a PNS SCP are performed.
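The fitting step can be sketched as follows, purely as an illustration and not as the PROC NLIN procedure itself. The sketch assumes the quadratic form $Q r^2 + (E_t - Q - I_t) r - E_t = 0$ of the Henderson equation, with $r = v_0/v_i$ the ratio of uninhibited to inhibited velocity, $E_t$ and $I_t$ the total enzyme and ligand concentrations, and $Q$ the apparent inhibition constant. It also swaps in a golden-section search on $\log_{10} Q$ in place of the multivariate secant method named above; the function names and the synthetic data are hypothetical.

```python
import math


def velocity_ratio(q, e_total, i_total):
    """Positive root r = v0/vi of q*r**2 + (e_total - q - i_total)*r - e_total = 0."""
    if q <= 0 or e_total <= 0:
        raise ValueError("q and e_total must be positive")
    b = e_total - q - i_total
    return (-b + math.sqrt(b * b + 4.0 * q * e_total)) / (2.0 * q)


def fit_q(i_totals, ratios, e_total, lo=1e-6, hi=1e3, iterations=200):
    """Least-squares estimate of Q by golden-section search on log10(Q).

    This is a stand-in for the secant-type nonlinear regression described
    in the text; it assumes the loss has a single minimum in [lo, hi].
    """
    def loss(log_q):
        q = 10.0 ** log_q
        return sum(
            (velocity_ratio(q, e_total, i) - r) ** 2
            for i, r in zip(i_totals, ratios)
        )

    a, b = math.log10(lo), math.log10(hi)
    shrink = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - shrink * (b - a)
        d = a + shrink * (b - a)
        if loss(c) < loss(d):
            b = d
        else:
            a = c
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free data generated from a known Q (illustrative only).
true_q, enzyme = 2.5, 1.0
ligand_levels = [0.5, 1.0, 2.0, 4.0]
observed = [velocity_ratio(true_q, enzyme, i) for i in ligand_levels]
fitted_q = fit_q(ligand_levels, observed, enzyme)
```

With no ligand present the positive root reduces to r = 1, i.e. the inhibited and uninhibited velocities coincide, which is a quick sanity check on the reconstructed equation.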
As a non-limiting example, PNS SCP crystals are soaked for 2 days in 0.01-100 mM ligand, and X-ray diffraction data are collected on an area detector and/or an image plate detector (e.g., a Mar image plate detector) using a rotating anode X-ray source. Data are collected to as high a resolution as possible, e.g., 1.5-3.5 Å, and merged with an R-factor on suitable intensities. An atomic model of the inhibitor is built into the difference Fourier map (F_o - F_c). The model can be refined to a solution in a cycle of simulated annealing (Brünger (1987), infra) involving 10-500 cycles of energy refinement, 100-10,000 1-fs steps of room temperature dynamics and/or 10-500 more cycles of energy refinement. Harmonic restraints are also used for the atom refinement, except for atoms within a 10-15 Å radius of the inhibitor. An R-factor is selected for the model, as are r.m.s. deviations from the ideal bond lengths and angles, respectively. Direct measurements of enzyme inhibition provide further confirmation that the modeled ligands are modulators of at least one biological activity of a PNS SCP.
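By way of illustration only, the R-factor referred to above is conventionally computed as the summed absolute difference between observed and calculated structure factor amplitudes, divided by the summed observed amplitudes. The following Python sketch shows that conventional definition; the function name and the amplitude values are hypothetical and are not taken from the present disclosure.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R = sum(||Fo| - |Fc||) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator

# Hypothetical amplitudes for three reflections.
EXAMPLE_R = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 25.0])
```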
Ligands of a PNS SCP, based on the crystal structure of this enzyme, are thus also provided by the present invention. Demonstration of clinically useful levels of, e.g., in vivo activity is also important. In evaluating PNS SCP inhibitors for biological activity in animal models (e.g., rat, mouse, rabbit), various oral and parenteral routes of administration are evaluated. Using this approach, it is expected that modulation of a PNS SCP occurs in suitable animal models, using the ligands discovered by molecular modeling and X-ray crystallography.
Diagnostic and/or Therapeutic Agents. A diagnostic or therapeutic PNS SCP modulating agent or ligand of the present invention can be, but is not limited to, at least one selected from a nucleic acid, a compound, a protein, an element, a lipid, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, and an antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies, as described herein. Such labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds. Alternatively, any other known diagnostic or therapeutic agent can be used in a method of the invention.
A therapeutic agent used in the invention can have a therapeutic effect on the target cell, such as a cell or neuron of the peripheral nervous system, the effect selected from, but not limited to: correcting a defective gene or protein, a drug action, a toxic effect, a growth stimulating effect, a growth inhibiting effect, a metabolic effect, a catabolic effect, an anabolic effect, a neurohumoral effect, a cell differentiation stimulatory effect, a cell differentiation inhibitory effect, a neuromodulatory effect, and a pluripotent stem cell stimulating effect. Any other known therapeutic effect that modulates at least one SC in a cell of the peripheral nervous system can be provided by a therapeutic agent delivered to a target cell via pharmaceutical administration or via a delivery vector according to the invention.
A therapeutic nucleic acid as a therapeutic agent can have, but is not limited to, at least one of the following therapeutic effects on a target cell: inhibiting transcription of a DNA sequence; inhibiting translation of an RNA sequence; inhibiting reverse transcription of an RNA or DNA sequence; inhibiting a post-translational modification of a protein; inducing transcription of a DNA sequence; inducing translation of an RNA sequence; inducing reverse transcription of an RNA or DNA sequence; inducing a post-translational modification of a protein; transcription of the nucleic acid as an RNA; translation of the nucleic acid as a protein or enzyme; and incorporating the nucleic acid into a chromosome of a target cell for constitutive or transient expression of the therapeutic nucleic acid.
Therapeutic effects of therapeutic nucleic acids can include, but are not limited to: turning off a defective gene or processing the expression thereof, such as by antisense RNA or DNA; inhibiting viral replication or synthesis; gene therapy, such as expressing a heterologous nucleic acid encoding a therapeutic protein or correcting a defective protein; modifying a defective expression or underexpression of an RNA such as an hnRNA, an mRNA, a tRNA, or an rRNA; encoding a drug or prodrug, or an enzyme that generates a compound as a drug or prodrug in pathological or normal cells expressing the chimeric receptor; and any other known therapeutic effects.
A therapeutic nucleic acid of the invention which encodes, or provides the therapeutic effect of, any known toxin, prodrug or gene drug for delivery to pathogenic nervous cells can also include genes under the control of a tissue specific transcriptional regulatory sequence (TRS) specific for pathogenic SC containing cells. Such TRSs would further limit the expression of the therapeutic agent in the target cell, according to known methods.
Non-limiting examples of such PNS SCP modulating agents or ligands of the present invention and methods thereof include methyl/halophenyl-substituted piperazine compounds, such as lidoflazine (see, e.g., Merck Index Monograph 5311 and U.S. Patent No. 3,267,104, both entirely incorporated herein by reference). Such compounds were tested and found to inhibit sodium channel activity of at least one PNS SCP of the present invention in cell lines expressing at least one PNS SCP, such as PC12, PKI-4 and other isolated or recombinant cells expressing at least one PNS SCP of the present invention. Accordingly, the present invention provides PNS SCP modulating agents or ligands as methyl/halophenyl-substituted piperazines. The substitutions can include alkyl- and/or halophenyl-substituted piperazines.
Pharmaceutical/Diagnostic Administration. Using PNS SCP modulating compounds or compositions (including antagonists and agonists as described above), the present invention further provides a method for modulating the activity of the PNS SCP protein in a cell. In general, agents (antagonists or agonists) which have been identified to inhibit or enhance the activity of PNS SCP can be formulated so that the agent can be contacted with a cell expressing a PNS SCP protein in vivo. The contacting of such a cell with such an agent results in the in vivo modulation of the activity of the PNS SCP proteins. So long as a formulation barrier or toxicity barrier does not exist, agents identified in the assays described above will be effective for in vivo use.
In another embodiment, the invention relates to a method of administering PNS SCP or a PNS SCP modulating compound or composition (including PNS SCP antagonists and agonists) to an animal (preferably, a mammal (specifically, a human)) in an amount sufficient to effect an altered level of PNS SCP in the animal. The administered PNS SC or PNS SCP modulating compound or composition could specifically effect PNS SCP associated functions.
Further, since PNS SCP is expressed in peripheral nervous system tissue, administration of PNS SC or PNS SCP modulating compound or composition could be used to alter PNS SCP levels in the peripheral nervous system.
PNS SCP antagonists can be used to treat pain due to trauma or pathology involving the central or peripheral nervous system, or pathologies related to the abnormally high levels of expression of at least one naturally occurring nervous system specific (NS) sodium channel where a PNS SCP antagonist also inhibits at least one NS SC, or where the pain is mediated to some extent by PNS SC. Such pathologies include, but are not limited to: inflammatory diseases; neuropathies (e.g., diabetic neuropathy); dystrophies (e.g., reflex sympathetic dystrophy, post-herpetic neuralgia); trauma (tissue damage by any cause); and focal pain by any cause.
Inflammatory diseases can include, but are not limited to, chronic inflammatory pathologies and vascular inflammatory pathologies. Chronic inflammatory pathologies include, but are not limited to, sarcoidosis, chronic inflammatory bowel disease, ulcerative colitis, and Crohn's pathology. Vascular inflammatory pathologies include, but are not limited to, disseminated intravascular coagulation, atherosclerosis, and Kawasaki's pathology.
PNS SCP agonists can be used to treat pathologies involving the central or peripheral nervous system, or pathologies related to the abnormally low levels of expression of at least one naturally occurring nervous system specific (NS) sodium channel where a PNS SCP agonist also enhances or stimulates at least one NS SC. Such pathologies include, but are not limited to: neurodegenerative diseases; diseases of the gastrointestinal tract due to dysfunction of the enteric nervous system (e.g., colitis, ileitis, inflammatory bowel syndrome); diseases of the cardiovascular system (e.g., hypertension and congestive heart failure); diseases of the genitourinary tract involving sympathetic and parasympathetic innervation (e.g., benign prostate hyperplasia, impotence); and diseases of the neuromuscular system (e.g., muscular dystrophy, multiple sclerosis, epilepsy).
Neurodegenerative diseases can include, but are not limited to, demyelinating diseases, such as multiple sclerosis and acute transverse myelitis; hyperkinetic movement disorders, such as Huntington's Chorea and senile chorea; hypokinetic movement disorders, such as Parkinson's disease; progressive supranuclear palsy; spinocerebellar degenerations, such as spinal ataxia and Friedreich's ataxia; multiple systems degenerations (Menzel, Dejerine-Thomas, Shy-Drager, and Machado-Joseph); systemic disorders (Refsum's disease, abetalipoproteinemia, ataxia telangiectasia, and mitochondrial multi-system disorder); demyelinating core disorders, such as multiple sclerosis and acute transverse myelitis; disorders of the motor unit, such as neurogenic muscular atrophies (anterior horn cell degeneration, such as amyotrophic lateral sclerosis, infantile spinal muscular atrophy and juvenile spinal muscular atrophy); or any subset thereof.
A diagnostic/pharmaceutical compound or composition of the invention for a PNS SC related pathology can be administered by any means that achieve its intended purpose, for example, to treat or prevent a cancer or precancerous condition.
The term "protection", as in "protection from infection or disease", as used herein, encompasses "prevention," "suppression" or "treatment." "Prevention" involves administration of a pharmaceutical composition prior to the induction of the disease. "Suppression" involves administration of the composition prior to the clinical appearance of the disease. "Treatment" involves administration of the protective composition after the appearance of the disease.
It will be understood that in human and veterinary medicine, it is not always possible to distinguish between "preventing" and "suppressing" since the ultimate inductive event or events can be unknown, latent, or the patient is not ascertained until well after the occurrence of the event or events. Therefore, it is common to use the term "prophylaxis" as distinct from "treatment" to encompass both "preventing" and "suppressing" as defined herein. The term "protection," as used herein, is meant to include "prophylaxis." See, e.g., Berkow, infra, Goodman, infra, Avery, infra and Katzung, infra, which are entirely incorporated herein by reference, including all references cited therein. The "protection" provided need not be absolute, i.e., the disease need not be totally prevented or eradicated, provided that there is a statistically significant improvement relative to a control population. Protection can be limited to mitigating the severity or rapidity of onset of symptoms of the disease.
At least one PNS SC modulating compound or composition of the invention can be administered by any means that achieve the intended purpose, using a pharmaceutical composition as previously described.
For example, administration can be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, intracranial, transdermal, or buccal routes. Alternatively, or concurrently, administration can be by the oral route. Parenteral administration can be by bolus injection or by gradual perfusion over time.
An additional mode of using a diagnostic/pharmaceutical compound or composition of the invention is by topical application. A diagnostic/pharmaceutical compound or composition of the invention can be incorporated into topically applied vehicles such as salves or ointments.
For topical applications, it is preferred to administer an effective amount of a diagnostic/pharmaceutical compound or composition according to the invention to target areas, e.g., skin surfaces, mucous membranes, and the like, which are adjacent to the peripheral neurons which are to be treated. This amount will generally range from about 0.0001 mg to about 1 g of a PNS SC modulating compound per application, depending upon the area to be treated, whether the use is diagnostic, prophylactic or therapeutic, the severity of the symptoms, and the nature of the topical vehicle employed. A preferred topical preparation is an ointment, wherein about 0.001 to about 50 mg of active ingredient is used per cc of ointment base.
A typical regimen for treatment or prophylaxis comprises administration of an effective amount over a period of one or several days, up to and including between one week and about six months.
It is understood that the dosage of a diagnostic/pharmaceutical compound or composition of the invention administered in vivo or in vitro will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the diagnostic/pharmaceutical effect desired.
The ranges of effective doses provided herein are not intended to be limiting and represent preferred dose ranges.
However, the most preferred dosage will be tailored to the individual subject, as is understood and determinable by one skilled in the relevant arts. See, e.g., Berkow et al., eds., The Merck Manual, 16th edition, Merck and Co., Rahway, N.J., 1992; Goodman et al., eds., Goodman and Gilman's The Pharmacological Basis of Therapeutics, 8th edition, Pergamon Press, Inc., Elmsford, N.Y. (1990); Avery's Drug Treatment: Principles and Practice of Clinical Pharmacology and Therapeutics, 3rd edition, ADIS Press, LTD., Williams and Wilkins, Baltimore, MD (1987); Ebadi, Pharmacology, Little, Brown and Co., Boston (1985); Osol et al., eds., Remington's Pharmaceutical Sciences, 18th edition, Mack Publishing Co., Easton, PA (1990); Katzung, Basic and Clinical Pharmacology, Appleton and Lange, Norwalk, CT (1992), which references are entirely incorporated herein by reference.
The total dose required for each treatment can be administered by multiple doses or in a single dose. The diagnostic/pharmaceutical compound or composition can be administered alone or in conjunction with other diagnostics and/or pharmaceuticals directed to the pathology, or directed to other symptoms of the pathology.
Effective amounts of a diagnostic/pharmaceutical compound or composition of the invention are from about 0.1 µg to about 100 mg/kg body weight, administered at intervals of 4-72 hours for a period of 2 hours to 1 year, and/or any range or value therein, such as 0.0001-1.0, 1-10, 10-50 and 50-100, 0.0001-0.001, 0.001-0.01, 0.01-0.1, 0.1-1.0, 1.0-10, 5-10, 10-20, 20-50 and 50-100 mg/kg, at intervals of 1-4, 4-10, 10-16, 16-24, 24-36, 36-48, 48-72 hours, for a period of 1-14, 14-28, or 30-44 days, or 1-24 weeks, or any range or value therein.
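As a purely arithmetic illustration of how a regimen within the above ranges could be tabulated, the following Python sketch converts a weight-based dose, a dosing interval and a treatment period into a per-dose amount, a number of doses and a total amount. It is a sketch under stated assumptions and not a dosing recommendation: the lower bound of 0.1 µg is read here as 0.0001 mg/kg, one dose is assumed at time zero and at each interval thereafter within the treatment period, and the function name and example values are hypothetical.

```python
import math

MIN_MG_PER_KG = 0.0001   # assumption: 0.1 microgram per kg body weight
MAX_MG_PER_KG = 100.0    # upper bound stated in the text

def regimen(dose_mg_per_kg, weight_kg, interval_h, duration_days):
    """Return (mg per dose, number of doses, total mg) for a simple regimen."""
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("dose is outside the 0.0001-100 mg/kg range")
    if weight_kg <= 0 or interval_h <= 0 or duration_days <= 0:
        raise ValueError("weight, interval and duration must be positive")
    per_dose_mg = dose_mg_per_kg * weight_kg
    # Doses at t = 0, interval, 2*interval, ... before the period ends.
    n_doses = math.ceil(duration_days * 24 / interval_h)
    return per_dose_mg, n_doses, per_dose_mg * n_doses

# Hypothetical example: 10 mg/kg, 70 kg recipient, every 24 hours for 14 days.
EXAMPLE_REGIMEN = regimen(10, 70, 24, 14)
```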
The recipients of administration of compounds and/or compositions of the invention can be any vertebrate animal, such as mammals, birds, bony fish, frogs and toads. Among mammals, the preferred recipients are mammals of the Orders Primata (including humans, apes and monkeys), Artiodactyla (including horses, goats, cows, sheep, pigs), Rodentia (including mice, rats, rabbits, and hamsters), and Carnivora (including cats and dogs). Among birds, the preferred recipients are turkeys, chickens and other members of the same order. The most preferred recipients are humans.
Gene Therapy. A delivery vector of the present invention can be, but is not limited to, a viral vector, a liposome, an anti-PNS SCP or anti-SC antibody, or a SC ligand, one or more of which delivery vectors is associated with a diagnostic or therapeutic agent.
The delivery vector can comprise any diagnostic or therapeutic agent which has a therapeutic or diagnostic effect on the target cell. The target cell specificity of the delivery vector is thus provided by use of a target cell specific delivery vector.
The delivery vector can also be a recombinant viral vector comprising at least one binding domain selected from the group consisting of an antibody or fragment, a chimeric binding site antibody or fragment, a target cell specific ligand, a receptor which binds a target cell ligand, an anti-idiotypic antibody, a liposome or other component which is specific for the target cell. A PNS SCP can be already associated with the target cell, or the delivery vector can bind the target cell via a ligand to a target cell receptor or vice versa.
Thus, the therapeutic or diagnostic agent, such as a therapeutic or diagnostic nucleic acid, protein, drug, compound, composition and the like, is delivered preferentially to the target cell, where the nucleic acid is preferably incorporated into the chromosome of the target cell, to the partial or complete exclusion of non-target cells.
The invention is thus intended to provide delivery vectors, containing one or more therapeutic and/or diagnostic agents, including vectors suitable for gene therapy.
In a method of treating a PNS SCP-associated disease in a patient in need of such treatment, functional PNS SCP DNA can be provided to the PNS cells of such patient, such as by a suitable delivery vector, in a manner and amount that permits the expression of the PNS SCP protein provided by such gene, for a time and in a quantity sufficient to treat such patient. Many vector systems are known in the art to provide such delivery to human patients in need of a gene or protein missing from the cell. For example, retrovirus systems can be used, especially modified retrovirus systems and especially herpes simplex virus systems. Such methods are provided for in, for example, the teachings of Breakefield et al., The New Biologist 3:203-218 (1991); Huang, Q. et al., Experimental Neurology 115:303-316 (1992); WO 93/03743 and WO 90/09441. Delivery of a DNA sequence encoding a functional PNS SCP protein will effectively replace the missing or mutated PNS SCP gene of the invention.
In another embodiment of this invention, the PNS SCP modulating compound or composition is expressed as a recombinant gene in a cell, so that the cells can be transplanted into a mammal, preferably a human, in need of gene therapy. To provide gene therapy to an individual, a genetic sequence which encodes all or part of the PNS SCP modulating compound or composition is added into a vector and introduced into a host cell. Examples of diseases that can be suitable for gene therapy include, but are not limited to, neurodegenerative diseases or disorders, Alzheimer's, schizophrenia, epilepsy, neoplasms and cancer. Examples of vectors that can be used in gene therapy include, but are not limited to, defective retroviral, adenoviral, or other viral vectors (Mulligan, R.C., Science 260:926-932 (1993)). See Anderson, Gene Therapy, 246 J. Amer. Med. Assn. 2737 (1980); Friedmann, Progress toward human gene therapy, 244 Science 1275 (1989); Anderson, 256 Science 808 (1992); human gene therapy protocols published in Human Gene Therapy, Mary Ann Liebert Publishers, N.Y. (1990-1994); Bank et al., 565 Ann. N.Y. Acad. Sci. 37 (1989); LTR-Vectors (U.S. Patent No. 4,405,712); Ausubel, infra, 9.10-9.17; Jon A. Wolff, ed., Gene Therapeutics: methods and applications of direct gene transfer, Birkhauser, Boston (1994).
The means by which the vector carrying the gene can be introduced into the cell include, but are not limited to, microinjection, electroporation, transduction, or transfection using DEAE-Dextran, lipofection, calcium phosphate or other procedures known to one skilled in the art (Sambrook, infra; Ausubel, infra).
Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose and sodium chloride, lactated Ringer's, or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers, such as those based on Ringer's dextrose, and the like. Preservatives and other additives can also be present, such as, for example, antimicrobials, antioxidants, chelating agents, inert gases and the like. See, generally, Osol et al., eds., Remington's Pharmaceutical Sciences, 16th Ed. (1980).
In another embodiment, the invention relates to a pharmaceutical composition comprising PNS SC or PNS SCP modulating compound or composition in an amount sufficient to alter PNS SCP associated activity, and a pharmaceutically acceptable diluent, carrier, or excipient. Appropriate concentrations and dosage unit sizes can be readily determined by one skilled in the art (see Osol et al., eds., Remington's Pharmaceutical Sciences, 16th Ed., Mack, Easton, PA (1980) and WO 91/19008).
Included as well in the invention are pharmaceutical compositions comprising an effective amount of at least one PNS SCP antisense oligonucleotide, in combination with a pharmaceutically acceptable carrier. Such antisense oligos include, but are not limited to, at least one nucleotide sequence of 12-500 bases in length which is complementary to a DNA sequence of SEQ ID NO:1, or a DNA sequence encoding at least 4 amino acids of SEQ ID NO:2 or Figure IIA-IIE.
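To illustrate what is meant by an antisense oligonucleotide of 12-500 bases complementary to a target DNA sequence, the following Python sketch returns the reverse complement of a chosen window of a sense strand. It is a minimal sketch only: the function name is hypothetical, and the short example sequence is invented for demonstration and is not SEQ ID NO:1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_oligo(sense_dna, start, length):
    """Reverse complement of sense_dna[start:start+length], written 5' to 3'."""
    if not 12 <= length <= 500:
        raise ValueError("length must be 12-500 bases")
    target = sense_dna[start:start + length].upper()
    if len(target) != length:
        raise ValueError("window extends past the end of the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return target.translate(COMPLEMENT)[::-1]

# Hypothetical 21-base sense sequence used only for demonstration.
DEMO_SENSE = "ATGGCCAAGCTTGACCTGAAG"
DEMO_ANTISENSE = antisense_oligo(DEMO_SENSE, 0, 12)
```

Applying the function twice to the same window returns the original sense sequence, which is a convenient consistency check.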
Alternatively, the PNS SCP nucleic acid can be combined with a lipophilic carrier such as any one of a number of sterols including cholesterol, cholate and deoxycholic acid. A preferred sterol is cholesterol.
The PNS SCP gene therapy nucleic acids and the pharmaceutical compositions of the invention can be administered by any means that achieve their intended purpose. For example, administration can be by parenteral, subcutaneous, intravenous, intramuscular, intraperitoneal, or transdermal routes. The dosage administered will be dependent upon the age, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired.
Compositions within the scope of this invention include all compositions wherein the PNS SCP antisense oligonucleotide is contained in an amount effective to achieve enhanced expression of at least one PNS SCP in a peripheral nervous system neuron or ganglion. While individual needs vary, determination of optimal ranges of effective amounts of each component is within the skill of the art. Typically, the PNS SCP nucleic acid can be administered to mammals, e.g., humans, at a dose of 0.005 to 1 mg/kg/day, or an equivalent amount of the pharmaceutically acceptable salt thereof, per day of the body weight of the mammal being treated.
Suitable formulations for parenteral administration include aqueous solutions of the PNS SCP nucleic acid in water-soluble form, for example, water-soluble salts. In addition, suspensions of the active compounds as appropriate oily injection suspensions can be administered. Suitable lipophilic solvents or vehicles include fatty oils, for example, sesame oil, or synthetic fatty acid esters, for example, ethyl oleate or triglycerides. Aqueous injection suspensions can contain substances which increase the viscosity of the suspension, including, for example, sodium carboxymethyl cellulose, sorbitol, and/or dextran. Optionally, the suspension can also contain stabilizers.
Alternatively, at least one PNS SCP can be coded by DNA constructs which are administered in the form of virions, which are preferably incapable of replicating in vivo (see, for example, Taylor, WO 92/06693). For example, such DNA constructs can be administered using herpes-based viruses (Gage et al., U.S. Patent No. 5,082,670).
Alternatively, PNS SCP antisense RNA sequences, PNS SCP ribozymes, and PNS SCP EGS can be coded by RNA constructs which are administered in the form of virions, such as recombinant, replication deficient retroviruses or adenoviruses. The preparation of retroviral vectors is well known in the art (see, for example, Brown et al., "Retroviral Vectors," in DNA Cloning: A Practical Approach, Volume 3, IRL Press, Washington, D.C. (1987)).
Specificity for gene expression in the peripheral nervous system can be conferred by using appropriate cell-specific regulatory sequences, such as cell-specific enhancers and promoters. Since protein phosphorylation is critical for neuronal regulation (Kennedy, "Second Messengers and Neuronal Function," in An Introduction to Molecular Neurobiology, Hall, Ed., Sinauer Associates, Inc. (1992)), protein kinase promoter sequences can be used to achieve sufficient levels of PNS SCP gene expression.
Thus, gene therapy can be used to alleviate sodium channel related pathology by inhibiting the inappropriate expression of a particular form of PNS SC. Moreover, gene therapy can be used to alleviate such pathologies by providing the appropriate expression level of a particular form of PNS SCP. In this case, particular PNS SCP nucleic acid sequences can be coded by DNA or RNA constructs which are administered in the form of viruses, as described above.
Having now generally described the invention, the same will be more readily understood through reference to the following Examples which are provided by way of illustration, and are not intended to be limiting of the invention, unless specified.
Example 1: Cloning and Sequencing of a PNS SC Encoding Nucleic Acid
Materials and Methods
Cell Culture. PC12 cells and PKI-4 PC12 subclones were grown as previously described (Mandel et al., 1988). NGF (2.5 S subunit, kindly supplied by Dr. S. Halegoua, SUNY at Stony Brook) was added to the culture medium at a final concentration of 110 ng/ml. The PKI-4 PC12 subclone, which expresses the cAMP-dependent kinase inhibitor protein (PKI), was also provided by Dr. S. Halegoua (see D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993)).
PCR Amplification. Total cellular RNA was isolated, according to the method of Cathala et al., DNA 2:329-335 (1983), from a PC12 subclone (PKI-4) which expresses high levels of the cAMP-dependent protein kinase inhibitor protein. Two µg of total RNA prepared from NGF-treated PKI-4 cells was used to synthesize first strand cDNA using random hexamer primers for the reverse transcriptase reaction. The cDNA then served as template for the PCR amplification, using a pair of degenerate oligonucleotide primers that specified a 400 base pair region within repeat domain III of the sodium channel α subunit gene. The 5' primer (designated YJ1: GCGAAGCTT(TC)TIATITT(TC)I(GATC)IAT(ATC)ATGGG (SEQ ID NO:3); the AAGCTT near the 5' end is a HindIII restriction site) corresponded to amino acids FWLIFSIM (SEQ ID NO:4) at positions 1347-1354 in the type II sodium channel gene. The 3' primer (designated YO1C: GCAGGATCC(AG)TT(AG)AAA(AG)TT(AG)TC(AGT)AT(AGT)AT(AGCT)AC(AGCT)CC (SEQ ID NO:5); the GGATCC near the 5' end is a BamHI restriction site) corresponded to amino acids GVIIDNFN (SEQ ID NO:6) at positions 1470-1477 in the type II gene. The amplification reaction mixture consisted of the cDNA, 1 mM MgCl2, 0.2 mM dNTPs, 0.5 µM each primer, and Taq polymerase (Perkin-Elmer) in a buffer consisting of 0.1 M KCl, 0.1 M TRIS HCl (pH 8.3) and gelatin (1 mg/ml). The reaction was performed in a Perkin-Elmer thermocycler as follows: 5 cycles of denaturation (94°C, 1 min.), annealing (37°C, 1 min.), and extension (72°C, 1 min.), followed by 25 cycles of denaturation (94°C, 1 min.), annealing (50°C, 1 min.) and extension (72°C, 1 min.). The PCR products were excised from a low melt agarose gel (SEAPLAQUE GTG, FMC BIOPRODUCTS) and subcloned into a Bluescript II SK plasmid vector previously restricted with HindIII and BamHI. The clones were screened for cDNA inserts by miniprep (Sambrook et al., infra) and sequenced in both directions by dideoxy chain termination (Sequenase 2.0 kit, UNITED STATES BIOCHEMICAL).
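The degenerate primer notation used above, in which parentheses enclose the alternative bases at one position, can be examined programmatically. The following Python sketch counts the number of distinct sequences represented by each primer and locates the HindIII (AAGCTT) and BamHI (GGATCC) recognition sites in the non-degenerate 5' tails. It is a minimal sketch under stated assumptions: the function names are hypothetical, and the letter "I" in primer YJ1 is assumed to denote inosine and is therefore counted as a single, non-degenerate position.

```python
import re

# Primer strings as printed in the text; "I" is assumed to be inosine.
YJ1 = "GCGAAGCTT(TC)TIATITT(TC)I(GATC)IAT(ATC)ATGGG"
YO1C = "GCAGGATCC(AG)TT(AG)AAA(AG)TT(AG)TC(AGT)AT(AGT)AT(AGCT)AC(AGCT)CC"

def parse_positions(primer):
    """Split a primer into positions, e.g. 'G(AG)T' -> ['G', 'AG', 'T']."""
    return [m.group(1) or m.group(2)
            for m in re.finditer(r"\(([A-Z]+)\)|([A-Z])", primer)]

def degeneracy(primer):
    """Number of distinct sequences encoded by the degenerate primer."""
    count = 1
    for alternatives in parse_positions(primer):
        count *= len(alternatives)
    return count

def site_offset(primer, site):
    """0-based offset of a restriction site among non-degenerate positions."""
    consensus = "".join(p if len(p) == 1 else "N"
                        for p in parse_positions(primer))
    return consensus.find(site)
```

For example, `degeneracy(YO1C)` multiplies the number of alternatives at each of the eight degenerate positions of the 3' primer, and `site_offset(YO1C, "GGATCC")` returns the position of the BamHI site within the primer.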
Sequence data was compiled and analyzed using GENWORKS software (INTELLIGENETICS, INC., Mountain View, CA).
cDNA Library Construction and Screening. Poly(A) mRNA from the PKI-4 PC12 subclone was purified (mRNA purification kit, PHARMACIA) and used to construct a random- and oligo (dT)-primed Lambda ZAP II cDNA library (STRATAGENE CORP., La Jolla, CA). The library consisted of 5.6 × 10^6 independent clones prior to amplification. Screening of approximately 4 × 10^6 recombinants using the cloned PCR product pPC12-1 labelled by random primers (PHARMACIA kit) resulted in isolation of cDNAs ranging in size from 1-3 kb. Sequence analysis and comparison to published sequences established that two of the cDNAs together encoded 3033 bp of the novel sodium channel α subunit, PN1.
Northern blot analysis and ribonuclease protection assay. Total cellular RNA was isolated from adult Sprague-Dawley rat brain, spinal cord, superior cervical ganglion, dorsal root ganglion, skeletal muscle, cardiac muscle, and adrenal gland using the standard method of Chirgwin, Biochemistry 18:5294-5299 (1979). RNA was electrophoresed and transferred to nylon membrane (DURALON-UV; STRATAGENE CORP.) as previously described (Cooperman et al., Proc. Nat'l Acad. Sci. USA 84:8721 (1987)). RNA blots were cross-linked to the nylon using a Stratalinker UV crosslinker (STRATAGENE CORP.) and hybridized to 32P-UTP-labeled antisense RNA probes generated from the following linearized templates: pPC12-1, pRB211 (Cooperman, infra, 1987), p1B15 (cyclophilin; Danielson et al., DNA 7:261-267 (1988)) and pNach1 (rat brain type I), which contains 51 bp of intron, 5' untranslated sequence and 267 bp of coding sequence of the type I sodium channel. RNA probes were transcribed with either T3 (pPC12-1), T7 (pNach1), or SP6 (pRB211, p1B15) RNA polymerase according to the manufacturer's instructions (PROMEGA CORP., Madison, WI). The blots were washed once in 2 x SSC, 0.1% NaDodSO4 for 15 min. at 68°C, followed by two washes in 0.2 x SSC, 0.1% NaDodSO4 for 15 min. at 68°C. Autoradiography with preflashed XAR-5 film (EASTMAN KODAK CO., Rochester, NY) was used for quantitation of mRNA by densitometry.
Ribonuclease protection assays were performed by use of a kit (RPA II, AMBION, INC., Austin, TX). Total RNA was hybridized with IO~ cpm of antisense RNA probe generated from pPC12-1. To control for differences in the amount of total RNA between samples, we included an antisense RNA probe for β-actin, transcribed from pTRI-β-actin (AMBION, INC.).
In situ hybridization. Tissue preparation and hybridization were performed using a modification of the procedure described by Yokouchi et al., Develop. 113:431-444 (1991). SCG and DRG were dissected from adult Sprague-Dawley rats and fixed in 4% paraformaldehyde (in 0.1 M PBS) for 2-6 hrs. at 4°C. The tissue was then rinsed 5 min. in 0.1 M PBS (pH 7.3), cryoprotected in 30% sucrose (in 0.1 M PBS) for 2 hrs. at 4°C and embedded in O.C.T. (TISSUE-TEK). Cryostat sections (14 µm) were collected on SUPERFROST/Plus slides (FISHER SCIENTIFIC), dried 2 hrs. at room temp., and then stored. Immediately before prehybridization, sections were brought to room temp. and rehydrated in 0.1 M PBS (pH 7.3) containing 0.3% Triton X-100 for 5 min. Sections were then treated with 0.2 N HCl for 20 min., washed in 0.1 M PBS for 5 min., and digested with proteinase K (5 µg/ml in 0.1 M PBS) for 40 min. at 37°C. Sections were then postfixed with 4% paraformaldehyde (in 0.1 M PBS), rinsed with 0.1 M PBS containing 0.1 M glycine for 15 min., and equilibrated in 50% formamide, 2 x SSC for 1 hr. (room temp.).
Sections were hybridized with antisense digoxigenin4abeled RNA probes traniscribed fromi pPC 12-1 or pNach2 (Cooperman etal. Proc. Natl AcaL &el USA 84.5721 (19M7) according to the manufactures instn-ictions for RNA labeling with digoxigenin-UTP (BOEHRINCIER MANNHEIM). Unlabeled probes were synthesized by replacing digoxigenin-UTP with rUTrP. Each section was covered with 100 pl of hybridization solution contitintig 20 mM TRIS HCI (pH18.0). 2.5 mM EDTA, 5001 formamide, 0.3 M NaCI. lx Denhardr's, 104/ dextran sulfate, I mg/ni tRNA, and probe at a concentmaton of 0.7 pg/mI. Sections were then covered with PARAFILM coverslips and incubated in a humid chamber overnight at 450C. After hybridization. sections were washed in 50%1 formnamide. 2 x SSC at 45C for I hr., followed by RNase digestion in 0.5M NaCl, 10 mM TRIS HCI (pH and 20 pghmI RNase A (BOEHRINGER MANNHEIM). Sections were subsequently washed at 45*C in 5011 formainide, 2 x SSC for I hr.. and 50%. formamidc, I x SSC fori I hr.
Immunological detection was performed using a kit (GENIUS 3 KIT, BOEHRINGER MANNHEIM), according to the manufacturer's instructions. In most experiments, the sections were incubated in the color solution for 3-5 hrs. at room temp. Sections were then coverslipped with AQUA-MOUNT (Lerner Laboratories) and stored in the dark.
Densitometry. Levels of sodium channel mRNA were determined by densitometric analysis of the autoradiograms using Bio Image software (Millipore Corp., Ann Arbor, Michigan). Levels of RNA were normalized to the quantitated levels of cyclophilin mRNA.
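The normalization step can be illustrated with a minimal Python sketch, assuming one band density per lane for the sodium channel transcript and one for cyclophilin. The function names and the readings are hypothetical; they are not taken from the Bio Image software or from the experiments reported here. The second helper expresses the normalized values as percent of maximum, the form plotted in Figure 6B.

```python
def normalize_to_control(target, control):
    """Divide each target band density by the loading-control density of the same lane."""
    if len(target) != len(control):
        raise ValueError("target and control must have one value per lane")
    return [t / c for t, c in zip(target, control)]


def percent_of_maximum(values):
    """Express each value as a percentage of the largest value in the series."""
    peak = max(values)
    return [100.0 * v / peak for v in values]


# Hypothetical densitometry readings (arbitrary units), one per lane.
sodium_channel = [1200.0, 900.0, 300.0]
cyclophilin = [400.0, 450.0, 300.0]

normalized = normalize_to_control(sodium_channel, cyclophilin)
percent_max = percent_of_maximum(normalized)
print(normalized)
print(percent_max)
```

Dividing by the control lane by lane removes differences in RNA loading before any comparison between tissues or ages is made.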
Results

Isolation of cDNA expressed preferentially in peripheral nerve. D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993) showed previously that NGF treatment of PC12 cells increases the level of an 11 kb sodium channel gene transcript which did not hybridize to probes specific for any of the known sodium channel genes. A transcript identical in size was also detected in mRNA from adult rat sympathetic and sensory ganglia, but not in mRNA from brain. These results suggested that the transcript encoded a new member of the sodium channel gene family (termed Peripheral Nerve type 1 (PN1)).
To confirm the identity of the PN1 gene, cDNAs from an NGF-treated PC12 subclone which preferentially expresses PN1 mRNA (PKI-4 cells; D'Arcangelo et al.) were amplified by the polymerase chain reaction (PCR), using a pair of degenerate oligonucleotide primers that specify a 400 base pair (bp) region of the sodium channel α subunit gene (see Methods, Figure [...]). Both primers specified putative membrane-spanning regions within repeat domain III, which are highly conserved among voltage-gated sodium channels. The amplified regions between the primers include the strictly conserved pore-lining residues, as well as residues which are divergent among the different mammalian α subunits. Sequence analysis of the PCR products revealed a cDNA, pPC12-1, which encoded a portion of a novel putative sodium channel α subunit (Figure [...]).
To determine whether pPC12-1 encodes part of the PN1 gene, the cDNA was used to generate antisense RNA probes for Northern blot analysis of mRNA from control and NGF-treated PC12 cells (Figure 2B). For comparison, a duplicate blot (Figure 2A) was hybridized with an antisense probe, pRB211, which encodes a highly conserved region of the sodium channel α subunit (Cooperman et al., Proc. Nat'l Acad. Sci. USA 84:8721 (1987)) and which cross-hybridizes with the PN1 transcript. We expected that a PN1-specific probe would detect the transcript recognized by pRB211 and that, as shown by D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993), levels of the detected transcript would increase rapidly and transiently following NGF treatment (maximal at 5 hrs).
Comparison of Figures 2A and 2B shows that pPC12-1 fulfilled both of these criteria. Also, consistent with D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993), we found that NGF induction of the transcript detected by pPC12-1 is independent of cAMP-dependent protein kinase activity.
To isolate additional cDNAs encoding PN1, a random- and oligo(dT)-primed Lambda ZAP II cDNA library (STRATAGENE, 5.6 x 10^[...] independent clones) was prepared from poly(A)+ mRNA isolated from the same PC12 subclone from which pPC12-1 was isolated. Screening 4 x 10^[...] recombinants with a probe generated from pPC12-1 resulted in isolation of 2 additional, overlapping cDNAs which are joined to give a 3033 bp cDNA (Figure 7).
Additional cDNAs which encompassed the entire PN1 coding region were subsequently isolated.
Analysis of the deduced primary structure of PN1. As shown in Figure 8, the deduced primary structure of PN1 encodes repeat domain III of the sodium channel α subunit gene. Comparison with the type II sodium channel shows that the PN1 sequence contains all of the structural motifs characteristic of voltage-gated sodium channels, including six putative transmembrane domains (IIIS1-IIIS6). The S4 domain, thought to serve as the voltage sensor, exhibits the highly conserved pattern of a positively charged residue (lysine or arginine) at every third position.
Furthermore, the putative pore-lining segments (IIISS1-IIISS2) contain residues shown to be involved in sodium-selective permeation (Heinemann et al., Nature 356:441-443 (1992)) as well as TTX affinity (Terlau et al., FEBS Lett. 293:93-96 (1991)).
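The S4 pattern described above (lysine or arginine at every third position) can be checked with a short Python sketch. The function name and the example segments are hypothetical illustrations; the segments are not taken from the PN1 sequence.

```python
POSITIVE = {"K", "R"}  # one-letter codes for lysine and arginine


def longest_every_third_run(segment):
    """Return the largest number of Lys/Arg residues found at positions i, i+3, i+6, ... without a break."""
    best = 0
    for start in range(len(segment)):
        count = 0
        pos = start
        # Step three residues at a time for as long as a positive charge is found.
        while pos < len(segment) and segment[pos] in POSITIVE:
            count += 1
            pos += 3
        best = max(best, count)
    return best


s4_like = "RVIRLARIGRILRLIK"   # hypothetical S4-like segment
control = "AVILLAGIGAILALIA"   # hypothetical segment with no Lys/Arg

print(longest_every_third_run(s4_like))
print(longest_every_third_run(control))
```

A long run in a candidate segment is only a screening hint; assigning an S4 voltage sensor still depends on alignment with known channels, as done in the comparison with the type II sodium channel.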
In addition to such highly conserved structural features, the sodium channel α subunit undergoes several characteristic post-translational modifications. All sodium channels sequenced to date exhibit a distinctive pattern of asparagine-linked (N-linked) glycosylation sites, which are found almost exclusively in the extracellular loops joining the S5 and S6 transmembrane helices. The N-linked glycosylation sites of PN1 are in good agreement with this pattern; three potential extracellular glycosylation sites are located between IIIS5 and IIIS6. Two of the sites are also found in the type I, II and III sodium channels.
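Potential N-linked glycosylation sites are conventionally located by scanning for the Asn-X-Ser/Thr sequon, where X is any residue except proline. The sketch below is a minimal illustration of that scan; the function name is hypothetical. The example fragment is the run "Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Glu Met Ser Lys Asp His" from the translation given for SEQ ID NO:1, used here only as sample input, with no claim that the site found lies in an extracellular loop.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping sequons from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


fragment = "ENYISNRTLAEMSKDH"  # one-letter form of the fragment named above
sites = find_sequons(fragment)
print(sites)
```

A sequon only marks a potential site; whether it is glycosylated depends on its location relative to the membrane, which is why the text restricts attention to the loop between S5 and S6.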
The α subunit is phosphorylated by protein kinase C (PKC), and the deduced PN1 sequence contains the highly conserved consensus PKC phosphorylation site at serine[...] (Figure [...]). This residue is located in the cytoplasmic loop joining domains III and IV that has been implicated in channel inactivation, and mutational analysis has shown that this serine is required for PKC modulation of channel inactivation (West et al., 1991).
The entire DNA (Figure 9A-D) and amino acid (Figure 10) sequences were determined. The rat PN1 amino acid sequence was compared with new human sequences (Figure 11A-E) presented in Example 2.
In sum, the deduced primary structure of PN1 contains all of the hallmark structural and functional domains characteristic of an α subunit of the voltage-gated sodium channel.
The PN1 gene is expressed preferentially in the PNS. To determine whether the PN1 gene was expressed preferentially in the PNS, total RNA was isolated from adult rat brain, spinal cord, SCG, DRG, skeletal muscle, and cardiac muscle and subjected to Northern blot analysis. Blots were hybridized with the PN1-specific antisense probe generated from pPC12-1. As shown in Figure 3A, we found high levels of hybridization to an 11 kb transcript in both SCG and DRG. Much lower, but detectable, levels of hybridization were seen to transcripts in both spinal cord and brain.
No detectable hybridization was observed to mRNA from skeletal muscle, cardiac muscle, or liver.
Ribonuclease (RNase) protection analyses were also performed. Total RNA was isolated from the same tissues used in Northern blot analysis, as well as adrenal gland, and hybridized to a PN1-specific antisense probe (pPC12-1).
mRNA from SCG, DRG, brain, spinal cord, and adrenal gland protected a 343 bp fragment of the PN1 probe (Figure 4B). The non-protected bases represent oligonucleotide primer and plasmid sequences. The PN1 probe was not protected by mRNA from either skeletal muscle or cardiac muscle.
To determine the relative amounts of PN1 mRNA in the various tissues, autoradiographs from three separate RNase protection experiments were analyzed by densitometry. To control for small differences in the amount of total RNA between samples, we included a probe for β-actin. PN1 mRNA levels in both SCG and DRG are approximately [...] greater than in spinal cord, adrenal gland and brain.
The PN1 gene is expressed in sympathetic and sensory neurons. To determine whether the PN1 gene is expressed in neurons of peripheral ganglia, in situ hybridization was used to examine the cellular distribution of PN1 mRNA in adult rat SCG and DRG. Cryostat sections were hybridized with a PN1-specific digoxigenin-labeled RNA probe (pPC12-1), which was visualized using an anti-digoxigenin antibody conjugated to alkaline phosphatase. As shown in Figure 4A, B, the PN1 antisense probe labeled most neuronal cell bodies in both SCG and DRG. To confirm that the hybridization signal was due to binding of the probe specifically to PN1 mRNA, we performed two different negative controls. First, sections were hybridized with the digoxigenin-labeled probe in the presence of a 100-fold excess of unlabeled PN1 antisense probe. Second, because previous experiments have shown that SCG and DRG contain extremely low levels of type II sodium channel mRNA (Beckh, S., FEBS Lett. 262:317-322 (1990)), we also hybridized sections with a type II-specific antisense probe. As shown in Figure 4C-F, both of these control experiments greatly reduced the hybridization signal. Also, consistent with the results of Northern blot and RNase protection analyses, we found that hybridization of the labeled PN1 probe to sections of adult rat cerebral cortex yielded no detectable staining.
Although the PN1 probe stained most neuronal cell bodies in both SCG and DRG, we found that cell-to-cell variability in PN1 mRNA levels differed between the two ganglia. SCG neurons were fairly homogeneous, in that the intensity of reaction product was relatively constant between different cells. DRG neurons, however, were quite heterogeneous, in that the staining intensity varied considerably from cell to cell. For example, in Figure 4B, arrows indicate two DRG neurons of approximately the same diameter which differ markedly in staining intensity.
Finally, we found that the PN1 probe did not stain non-neuronal cells such as satellite cells and Schwann cells. However, it is possible that these cells contain very low levels of PN1 mRNA which are not detectable by this method.
SCG neurons also express the type I sodium channel gene. Earlier Northern blot analysis has shown that mRNA from SCG contains two distinct sodium channel gene transcripts. As we have demonstrated, the larger, 11 kb transcript encodes the PN1 sodium channel. The smaller transcript, however, has not yet been identified. We hypothesized that this smaller transcript encoded the type I sodium channel, because moderate levels of type I mRNA have been found in other PNS tissues (Beckh, FEBS Lett. 262:317-322 (1990)). To test this hypothesis, Northern blots of SCG mRNA isolated from adult rats were hybridized with an antisense probe specific for the type I sodium channel gene (pNach1, see Methods above). As shown in Figure 5, the type I-specific probe hybridized specifically to the smaller transcript. Furthermore, we have found that SCG mRNA protects the type I probe in an RNase protection assay.
The putative PN1 α subunit and type I α subunit genes are differentially regulated during development. Several studies have shown that the types I, II and III sodium channel genes are differentially regulated during development in both the central and peripheral nervous systems. To determine whether the PN1 and type I genes are also independently regulated during development, we measured their relative mRNA levels in SCG isolated from rats of different postnatal ages. To visualize both transcripts simultaneously, Northern blots were hybridized with the conserved sodium channel gene probe pRB211. As shown in Figure 6A, in SCG removed on postnatal day 7 (P7) the levels of PN1 and type I mRNA are approximately equal. However, by P14, their relative abundance has shifted such that the level of PN1 mRNA exceeds that of type I by [...]-fold. This increase in the ratio of PN1 to type I mRNA levels continues for at least the next four postnatal weeks. By P42, PN1 is the predominant sodium channel gene transcript, with levels of PN1 mRNA several-fold greater than that of type I.
To quantitate the developmental changes in mRNA levels, autoradiographs from three separate experiments were analyzed by densitometry. To control for differences in the amount of total RNA between lanes, blots were subsequently hybridized with a probe for the internal control cyclophilin. As shown in Figure 6B, in which percent maximum mRNA is plotted versus postnatal age, the shift in relative abundance of the two transcripts is largely due to a developmental decrease in the level of type I sodium channel mRNA. From P7 to P42, the level of type I mRNA decreases by approximately [...].

Example 2: Drug Screening for PN-1 Antagonists

The ability of PNS SCP ligands (e.g., antagonists and agonists) to inhibit or enhance the activity of a PNS SCP is evaluated with cells expressing at least one PNS SCP. An assay for PNS SCP activity in such cells is used to determine the functionality of the PNS SCP protein in the presence of at least one agent which can act as antagonist or agonist, and thus agents that interfere with or enhance the activity of PNS SCP are identified. Two or more cell lines (each expressing a different PNS SCP) are used, optionally together with one or more cell lines expressing a CNS-specific sodium channel as a control.
These agents are selected and screened at random, by rational selection, and/or by design using, for example, computer modeling techniques.
There are numerous variations of assays which can be used by a skilled artisan, without the need for undue experimentation, in order to isolate modulating agents or ligands of a PNS SCP. Agent determination methods include Computer Assisted Molecular Design (CAMD), PNS SCP-agent binding, sophisticated chemical synthesis and testing, targeted screening, peptide combinatorial library technology, antisense technology and/or biological assays, according to known methods. See, e.g., Rapaka et al., eds., Medications Development: Drug Discovery, Databases, and Computer-Aided Drug Design, NIDA Research Monograph 134, NIH Publication No. 93-3638, U.S. Dept. of Health and Human Services, Rockville, MD (1993); Langone, Methods in Enzymology, Volume 203, Molecular Design and Modeling: Concepts and Applications, Part B, Antibodies and Antigens, Nucleic Acids, Polysaccharides and Drugs, Section III, pp. 587-702, Academic Press, New York (1991).
Alternatively, cell expression libraries, or other cells that have been selected or genetically engineered to express and display a PNS SCP via the use of the PNS SCP nucleic acids of the invention, are used. Such cells are preferred in such methods, as host cell lines may be chosen which are devoid of related receptors. Rapaka, infra, (1993), at pages 58-65.
A PNS SCP agent in the context of the present invention refers to any chemical or biological molecule that associates with a PNS SCP in vitro, in situ or in vivo, and can be, but is not limited to, synthetic, recombinant or naturally derived chemical compounds and compositions, organic compounds, nucleic acids, peptides, carbohydrates, vitamin derivatives, hormones, neurotransmitters, viruses or receptor binding domains thereof, opsins, rhodopsins, nucleosides, nucleotides, coagulation cascade factors, odorants or pheromones, toxins, growth factors, platelet activating factors, neuroactive peptides, neurohumors, or any biologically active compound, such as drugs or naturally occurring compounds.
The agents are selected and screened at random, or rationally selected or designed using computer modeling techniques. For random screening, potential agents are selected and assayed for their ability to bind to the PNS SCP or a fragment thereof. Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of at least one specific PNS SCP (e.g., as presented in Figure 11). For example, one skilled in the art can readily adapt currently available procedures to generate agents capable of binding to a specific peptide sequence in order to generate rationally designed compounds, such as chemical compounds, nucleic acids or peptides. See, e.g., Rapaka, infra, (1993); Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides: A User's Guide, W.H. Freeman, New York (1992), pp. 289-307; and Kaspczak et al., Biochemistry 28:9230-9238 (1989).
A method of screening for an agent that modulates the activity of at least one PNS SCP comprises: incubating at least one cell line expressing at least one PNS SCP with an agent to be tested; and assaying the at least one cell line for the activity of the at least one PNS SCP protein by measuring the agent's effect on PNS SCP binding or PNS SCP activity. Preferably, the assay distinguishes the agent's effect on alternative PNS SCPs and determines that the agent has little or no effect on CNS sodium channels, or has relatively less effect on CNS sodium channels.
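The comparison between a PNS SCP-expressing cell line and a CNS control line can be sketched as follows. This is an illustrative Python outline only: the function names, the assay signals, and the 20-percentage-point cutoff are all assumptions made for the example and are not values from this specification.

```python
def percent_inhibition(control_signal, treated_signal):
    """Percent reduction of the assay signal caused by the test agent."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - treated_signal / control_signal)


def is_pns_preferring(pns_inhibition, cns_inhibition, min_difference=20.0):
    """Flag an agent whose inhibition in the PNS line exceeds the CNS line by a chosen margin.

    min_difference is a hypothetical cutoff in percentage points.
    """
    return (pns_inhibition - cns_inhibition) >= min_difference


# Hypothetical assay signals (arbitrary units) without and with the test agent.
pns = percent_inhibition(control_signal=1000.0, treated_signal=250.0)
cns = percent_inhibition(control_signal=1000.0, treated_signal=750.0)

print(pns, cns, is_pns_preferring(pns, cns))
```

In practice the margin would be set from replicate variability, and a full concentration-response curve for each cell line gives a firmer comparison than a single concentration.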
Any cell can be used in the above assay so long as it expresses a functional form of PNS SCP protein and the PNS SCP activity can be measured. The preferred expression cells are eukaryotic cells or organisms. Such cells can be modified to contain DNA sequences encoding the PNS SCP protein using routine procedures known in the art.
Alternatively, one skilled in the art can introduce mRNA encoding the PNS SCP protein directly into the cell.
In an alternative embodiment, stem cell populations for either neuronal or glial cells can be genetically engineered to express a functional PNS SCP ion channel. Such cells, expressing the PNS SCP ion channel, can be transplanted to the diseased or injured region of the mammal's neurological system (Neural Transplantation: A Practical Approach, Dunnett and Björklund, eds., Oxford University Press, New York, NY (1992)). In another embodiment, embryonic tissue or fetal neurons can be genetically engineered to express a functional PNS SCP ion channel and transplanted to the diseased or injured region of the mammal's limbic system. The feasibility of transplanting fetal dopamine neurons into Parkinsonian patients has been demonstrated (Lindvall et al., Archives of Neurology 46:615-631 (1989)).
At least two types of approaches are currently used to express voltage-dependent sodium channel clones in order to generate functional channel proteins. In one approach, mRNA encoding the cloned cDNA is expressed in Xenopus oocytes. The sodium channel cDNA is cloned into a bacterial expression vector such as the pGEM recombinant plasmid (Melton et al., 1984). Transcription of the cloned cDNA is carried out using an RNA polymerase such as SP6 polymerase or T7 polymerase with a capping analog such as m7G(5')ppp(5')G. The resulting RNA (about 50 nl, corresponding to 2-5 ng) is injected into stage V and stage VI oocytes isolated from Xenopus, and incubated for 3-5 days at 19°C. Oocytes are tested for sodium channel expression with a two-microelectrode voltage clamp (Trimmer et al., Neuron 3:33-49 (1989)).
In an alternative approach, a cDNA encoding a voltage-dependent sodium channel is cloned into any one of a number of mammalian expression vectors, and transfected into mammalian cells which do not express endogenous voltage-dependent sodium channels (such as fibroblast cell lines). Transfected clones expressing the cloned, transfected cDNA are selected. Sodium channel expression is measured with a whole-cell voltage clamp technique using a patch electrode (D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993)).
Sources of PNS SCP and Cell Lines Useful for Drug Screening. Any cell line expressing a PNS SCP (naturally, by induction, or due to recombinant expression) can be used for drug screening. As a non-limiting example, PC12 cells express both PN1 and type II sodium channels. A126-1B2 cells are mutants deficient in protein kinase A (PKA) activity which express PN1, but are now discovered not to express type II sodium channels. PKI-4 is a PC12 cell line transfected with a cDNA encoding a peptide inhibitor of PKA. Each of these cell lines can be used as one source of a PNS SCP of the present invention, or as a cell line itself to use in drug screening. Treatment of PC12 cells with NGF induces both a PNS SCP (PN1) and type II sodium channels, while NGF induces only PN1 in A126-1B2 cells. PKI-4 cells express a PNS SCP (PN1) without NGF treatment (D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993)).
Additionally or alternatively, heterologous expression systems can also be used in which cell lines (such as Chinese Hamster Ovary (CHO) cells) are stably transfected with a cDNA encoding PN-1. Method steps for transfecting and stably expressing cDNA to form heterologous cell lines are well known in the art. An advantage of using transfected cells is that clones are obtained that express very high levels of a PNS SCP, such as PN-1.
To screen for PNS SCP modulators, such as antagonists or agonists, drugs are examined for their ability to: inhibit or enhance the binding of radioligands to a PNS SCP (labeled ligand binding reaction), and/or inhibit or enhance ion flux through the channel of the PNS SCP in a cell line that expresses a PNS SCP.
Labeled ligand binding. Neurotoxins can be used to characterize PNS sodium channels. For example, previous studies have identified at least six distinct neurotoxin binding sites on previously characterized non-PNS sodium channels (reviewed in Lombet et al., FEBS Lett. 219(2):355-359 (1987)). Many of these sites are thought to be allosterically coupled to one another (for review, see Strichartz et al., Annu. Rev. Neurosci. 10:237-267 (1987), and references cited therein). In other words, binding of a drug or toxin to a particular neurotoxin site can be sensitive to drug binding at not only that site, but other sites on the channel as well. This is advantageous for a drug screening program in that, for a given labeled ligand, the likelihood of identifying agents that preferentially bind to a PNS SCP is increased.
The techniques described herein for measuring labeled ligand binding to a PNS SCP of the invention in intact cells (e.g., PC12, PKI-4 or PNS SCP-expressing heterologous cell lines) in suspension are similar to those described previously for radioligand binding to other sodium channels in brain synaptosomal preparations (see, e.g., Catterall et al., J. Biol. Chem. 256(17):8922-8927 (1981)). However, it is well recognized by those skilled in the art that these techniques are routinely modified for the use of substrate-attached cells or broken cell preparations, based on the teaching and guidance presented herein.
A126-1B2, PC12, PKI-4 or other cells expressing a PNS SCP are grown using standard techniques, and optionally treated with NGF for 1-2 days to induce PN-1 expression. Cells are harvested and tested for ion flux activity with alternative potential agents.
For both radioligands, binding reactions are conducted at 37°C, then stopped. Samples are quickly filtered under vacuum and washed with ice-cold buffer, and bound radioactivity is determined by scintillation counting.
Ion flux. Ion flux assays directly test the ability of a potential PNS SCP agent to inhibit or enhance PNS SCP function, as measured by its ability to inhibit or enhance the influx of ion tracers through a PNS SCP.
Most previous sodium channel studies have employed 22Na as a tracer (for example, see Catterall et al., J. Biol. Chem. 256(17):8922-8927 (1981)). However, the high toxicity of 22Na can be a disadvantage for its use in high-throughput drug screening. A less toxic alternative is guanidinium ion, influx of which has been shown to be a reliable indicator of sodium channel opening (Reith, Europ. J. Pharmacol. 188:33-41 (1990)). Accordingly, routine methods can be used to screen compounds for modulating PNS SCP ion channel activity, e.g., by guanidinium ion flux using intact cells expressing at least one PNS SCP. Additionally, these methods are well known to be easily modified for use with 22Na. Similarly, these known method steps could be modified for use with substrate-attached cells or vesicles prepared from broken cells, according to known method steps.
For a guanidinium flux assay, the methods for 22Na are modified from those of Reith (Europ. J. Pharmacol. 188:33-41 (1990), for brain synaptosomes), e.g., as described in Example 2 below. Aliquots of a cell suspension containing heterologous cells expressing at least one PNS SCP are incubated for 10 minutes at 37°C in the presence of channel openers (typically, 100 μM veratridine) and test drugs in a total volume of 100 μl (0.20-0.25 mg protein).
Ion flux is initiated by the addition of HEPES/TRIS solution also containing 4 mM guanidine HCl (final) and 1000 dpm/nmol guanidine. The reaction is continued for 30 seconds and is stopped by the addition of ice-cold incubation buffer, followed by rapid filtration under vacuum over Whatman GF/C filters. The filters are washed rapidly with ice-cold incubation buffer and radioactivity is determined by scintillation counting. Nonspecific uptake is determined in parallel by the inclusion of 1 mM tetrodotoxin during both preincubation and uptake.
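The arithmetic behind the flux assay can be sketched in Python as follows. The specific activity of 1000 dpm/nmol is taken from the protocol above; the function names and the scintillation counts are hypothetical values chosen for illustration.

```python
SPECIFIC_ACTIVITY_DPM_PER_NMOL = 1000.0  # tracer specific activity stated in the protocol


def specific_uptake_nmol(total_dpm, nonspecific_dpm,
                         specific_activity=SPECIFIC_ACTIVITY_DPM_PER_NMOL):
    """Subtract tetrodotoxin-insensitive (nonspecific) counts and convert dpm to nmol guanidine."""
    return (total_dpm - nonspecific_dpm) / specific_activity


def percent_inhibition(control_uptake, drug_uptake):
    """Percent reduction in specific uptake caused by the test drug."""
    return 100.0 * (1.0 - drug_uptake / control_uptake)


# Hypothetical scintillation counts (dpm per filter).
nonspecific = 1200.0  # measured in parallel in the presence of tetrodotoxin
control = specific_uptake_nmol(total_dpm=5200.0, nonspecific_dpm=nonspecific)
treated = specific_uptake_nmol(total_dpm=2200.0, nonspecific_dpm=nonspecific)

print(control, treated, percent_inhibition(control, treated))
```

Subtracting the tetrodotoxin-insensitive counts first means the inhibition is calculated only on the uptake that passes through the sodium channel.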
Using the guanidinium flux assay, several methyl/halophenyl-substituted compounds, such as lidoflazine (see Merck Index Monograph 5311 and U.S. Patent No. 3,267,104, both entirely incorporated herein by reference), were tested and found to inhibit sodium channel activity of at least one PNS SCP of the present invention in cell lines expressing at least one PNS SCP, with a pIC50 of 6.51 for lidoflazine on PKI-4 cells. Accordingly, the present invention provides PNS SCP modulating agents such as methyl/halophenyl-substituted piperazines.
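Because pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/l, the reported value of 6.51 can be converted to a concentration directly. The short Python sketch below shows the conversion in both directions; the function names are illustrative.

```python
import math


def pic50_to_ic50_molar(pic50):
    """Convert pIC50 to IC50 in mol/l, using pIC50 = -log10(IC50)."""
    return 10.0 ** (-pic50)


def ic50_molar_to_pic50(ic50_molar):
    """Convert an IC50 in mol/l back to pIC50."""
    return -math.log10(ic50_molar)


ic50 = pic50_to_ic50_molar(6.51)   # lidoflazine on PKI-4 cells, value from the text
ic50_micromolar = ic50 * 1e6
print(round(ic50_micromolar, 3))   # about 0.309 micromolar
```

A pIC50 of 6.51 therefore corresponds to an IC50 of roughly 0.3 μM.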
Example 3: Identification of Human PNS SCP Sequence from a Human Peripheral Nervous System cDNA Library

Similar to the procedures provided in Example 1, a human peripheral nervous system cDNA library (such as a human DRG library) was used for polymerase chain reaction (PCR) amplification. The PCR used a 5' primer corresponding to DNA encoding amino acids 604-611 of SEQ ID NO:2, and a corresponding 3' primer encoding amino acids 723-731 of SEQ ID NO:2.
The PCR reaction mixture consisted of 5% of the cDNA, 1 mM MgCl2, 0.2 mM dNTPs, 0.5 mM each primer, and Taq polymerase (Perkin-Elmer) in a buffer consisting of 0.1 M KCl, 0.1 M TRIS HCl (pH 8.3) and gelatin (1 mg/ml).
The reaction was performed in a Perkin-Elmer thermocycler as follows: five cycles of denaturation (94°C, 1 min.), annealing (37°C, 1 min.), and extension (72°C, 1 min.), followed by 25 cycles of denaturation (94°C, 1 min.), annealing ([...]°C, 1 min.), and extension (72°C, 1 min.).
The resulting PCR products provided a human amplified cDNA which encoded amino acids 646-658 of SEQ ID NO:2, as presented in Figure 11A-E.

Example 4: Cloning and Sequencing of Human PN-1 Sequence from Human Dorsal Root Ganglion cDNA Library

As in Examples 1 and 3 above, additional PCR primers corresponding to SEQ ID NO:1 are used to isolate clones from the human DRG cDNA library which encompass the entire coding region of one or more human PNS SCPs of the present invention. A 5' primer includes the sequence 5'TTTGTGCCCCACAGACCCCAG3' (SEQ ID NO:17) and a 3' primer includes the sequence 5'ACACAAATTCTTGATCTGGAATTGCT3' (SEQ ID NO:18) or 5'CAACCTCAGACAGAGAGCAATGA3' (SEQ ID NO:19), which are used for nested PCR. According to Examples 1 and 3 above, PCR is performed to obtain cDNAs encoding a human PNS SCP.
Additional PCR is performed by "walking" 5' or 3' of the sequence corresponding to the above PCR product. In this way, cDNAs encompassing the entire coding region of one or more human PNS SCPs are provided.
The resulting additional cDNA clones or PCR products, encoding the entire human PNS SCP, are subcloned into a plasmid vector previously restricted with suitable restriction sites. The clones are screened for cDNA inserts by miniprep (Sambrook et al., infra) and sequenced in both directions by dideoxy chain termination (Sequenase 2.0 kit, United States Biochemical). Sequence data is compiled and analyzed using GeneWorks software (IntelliGenetics, Inc., Mountain View, CA). The expected alternative amino acid sequences for a human PN1 sequence are presented in Figure 11A-D and as SEQ ID NOS:7, 11 and 12, where Xaa represents 0, 1, 2 or 3 amino acids.
Transcripts of the size of the resulting human PNS SCP are then confirmed to be present in human PNS mRNA or cDNA (encoding a 1970-1990 amino acid sequence of Figure 11A-E). However, as in Example 1, such transcripts are not expected to be detected in mRNA from brain. This expected result confirms new human members of the sodium channel gene family (termed Human Peripheral Nerve type 1; HUMPN1A and HUMPN1B of Figure 11A-E, where X is 0, 1, 2 or 3 of the same or different amino acid).
Complete DNA and amino acid sequences of novel human PN1s are then confirmed and are expected to contain all of the structural and functional domain characteristics of an α subunit of a mammalian voltage-gated sodium channel.
All references cited herein, including journal articles or abstracts, published or corresponding U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art (including the contents of the references cited herein), readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance presented herein, in combination with the knowledge of one of ordinary skill in the art.
SEQUENCE LISTING

GENERAL INFORMATION:

APPLICANT:
    TROPHIX PHARMACEUTICALS, INC.
    CRAGWOOD ROAD
    SOUTH PLAINFIELD, NJ 07080
    UNITED STATES OF AMERICA

    THE RESEARCH FOUNDATION OF STATE UNIVERSITY OF NEW YORK
    SUNY OF STONY BROOK
    W5510 MELVILLE MEMORIAL LIBRARY
    STONY BROOK, NY 11794-3366
    UNITED STATES OF AMERICA

APPLICANT/INVENTORS:
    Mandel, Gail
    Halegoua, Simon
    Borden, Laurence A.

(ii) TITLE OF INVENTION: Peripheral Nervous System Specific Sodium Channels, DNA Encoding Therefor, Crystallization, X-ray Diffraction, Computer Molecular Modeling, Rational Drug Design, Drug Screening, and Methods of Making and Using Thereof

(iii) NUMBER OF SEQUENCES: 19

(iv) CORRESPONDENCE ADDRESS:
    ADDRESSEE: STERNE, KESSLER, GOLDSTEIN FOX, P.L.L.C
    STREET: 1100 New York Ave., N. Suite 600
    CITY: Washington
    STATE: DC
    COUNTRY: USA
    ZIP: 20005-3934

COMPUTER READABLE FORM:
    MEDIUM TYPE: Floppy disk
    COMPUTER: IBM PC compatible
    OPERATING SYSTEM: PC-DOS/MS-DOS
    SOFTWARE: PatentIn Release Version #1.30

(vi) CURRENT APPLICATION DATA:
    APPLICATION NUMBER: 41,434/96
    FILING DATE: 02-NOV-1995
    CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: PCT/US95/14251
    (B) FILING DATE: 02-NOV-1995

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: 08/482,401
    (B) FILING DATE: 07-JUN-1995

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: 08/334,029
    (B) FILING DATE: 02-NOV-1994

(viii) ATTORNEY/AGENT INFORMATION:
    NAME: Ludwig, Steven R.
REGISTRATION NUMBER: 36,203
REFERENCE/DOCKET NUMBER: 0917.024AU02
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 202-371-2600
TELEFAX: 202-371-2540
INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 3033 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..3033
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AGG AAC CTT GTG GTC CTG AAC CTG TTT CTG GCT CTT TTG CTG AGT TCC  48
Arg Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser
1 5 10
TTT AGT TCT GAC AAT CTT ACA GCA ATT GAG GAA GAC ACC GAT GCA AAC  96
Phe Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Thr Asp Ala Asn
25
AAC CTC CAG ATC GCA GTG GCC AGA ATT AAG AGG GGA ATC AAT TAC GTG  144
Asn Leu Gln Ile Ala Val Ala Arg Ile Lys Arg Gly Ile Asn Tyr Val
35 40
AAA CAG ACC CTG CGT GAA TTC ATT CTA AAA TCA TTT TCC AAA AAG CCA  192
Lys Gln Thr Leu Arg Glu Phe Ile Leu Lys Ser Phe Ser Lys Lys Pro
50 55
AAG GGC TCC AAG GAC ACA AAA CGA ACA GCA GAT CCC AAC AAC AAG AAA  240
Lys Gly Ser Lys Asp Thr Lys Arg Thr Ala Asp Pro Asn Asn Lys Lys
70 75
GAA AAC TAT ATT TCA AAC CGT ACC CTT GCG GAG ATG AGC AAG GAT CAC  288
Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Glu Met Ser Lys Asp His
90
AAT TTC CTC AAA GAA AAG GAT AGG ATC AGT GGT TAT GGC AGC AGT CTA  336
Asn Phe Leu Lys Glu Lys Asp Arg Ile Ser Gly Tyr Gly Ser Ser Leu
100 105 110
GAC AAA AGC TTT ATG GAT GAA AAT GAT TAC CAG TCC TTT ATC CAT AAC  384
Asp Lys Ser Phe Met Asp Glu Asn Asp Tyr Gln Ser Phe Ile His Asn
115 120 125
CCC AGC CTC ACA GTG ACA GTG CCA ATT GCA CCT GGG GAG TCT GAT TTG  432
Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly Glu Ser Asp Leu
130 135 140
GAG ATT ATG AAC ACA GAA GAG CTT AGC AGT GAC TCA GAC AGT GAC TAC  480
Glu Ile Met Asn Thr Glu Glu Leu Ser Ser Asp Ser Asp Ser Asp Tyr
145 150 155
AGG AAA GAG AAA GGG AAC GGA TCA AGC TG' T
Ser Lys Glu Lys
GAC
Asp
GCA
Ala
GCA
Pro
ACC
Thr 225
AGC
Ser
GAA
Glu
TAT
Tyr
AAA
Lys
TGG
Trp 305 AAC I Asn I CTG I Leu I AGG G Arg V GTG C Val L 3
AA(
As
GAI
Asr
TGC
Cys 210
ATC
Ile
TTG
Phe
GAT
Asp
GCT
Ala
TGG
rrp 290
CTG
Leu kCT hr
GG
~rg
;TA
ral
:TT
eeu 70
CCT
i Pro
GAG
Glu 195
TGC
Gys
AGG
Arg
ATC
Ile
ATG
Ile
GAG
Asp 275
GTC
Val
GAG
Asp
CTT
Leu
GCC
Ala I GTG Val 355 GTC G Leu
CTG
Leu 180
CCT
Pro
CAA
Gin
AAG
Lys
GTT
Val
TAT
Tyr 260
AAG
Lys
GCA
Ala
TTG
Phe
GGC
Gly
'TA
,eu I 340 TC I lal I ;TG I tal C
T
T
I
I
'I
T
T
L
3 x; Arg Asn Arg 165 CCA GGA GAA Pro Gly Glu GAA GCC TGG Glu Ala Gys GTT AAT GTA Val Asn Val 215 kCG TGC TAG hr Gys Tyr 230 .TC ATG ATG 'eu Met Ile !45 ~TT GAA AAG le Giu Lys ~TA TTG ACC le Phe Thr AT GGG TAT yr Giy Tyr 295 TA ATT GTT eu Ile Val 2 310 AC TGA GAG C yr Ser Asp I 25 GA CCC CTA I rg Pro Leu T C. GGA CTC A sn Ala Leu I 3 GC CTT ATA T ys Leu Ile P 375 Sei
GA
G1
TTI
PhE 20C
GAC
Asp
AGG
Arg
GTG
Leu
AAA
Lys
TAG
Tyr 280
AA
Lys 3AT sp
TT
~eu
~GA
~rg
~TA
le 60
TC
he r Se 3 GA i G1 18
AC
Thi TC1 Sei
ATP
Ile
GTC
Leu
AAG
Lys 265
ATG
Ile
ACA
Thr
GTG
Val
GGG
Gly
GCC
Ala 345
GGA
Gly
TGG
Trp r Ser 170 G GCT u Ala 5 1 GAT Asp 1 GGG Gly
GTT
Val
AGG
Ser 250
ACC
Thr
TTG
Phe
TAT
Tyr
TCT
Ser
CCC
Pro 330 TTG Leu GCA I Ala I GTA P Leu I
TC
Se
GA
Gli
GG]
G1 Lys
GAA
Glu 235
AGT
Ser
ATT
Ile
ATT
Ile
TTG
Phe
ZTA
Leu 315
%TT
Ile
[CT
jer
ITC
le
~TA
le T GAG r Glu
GCA
Ala P TGT Gys C GGG Gly 220
CAC
His
GGA
Gly
AAG
Lys
GTG
Leu
ACT
Thr 300
GTT
Val
AAA
Lys AGA I Arg E CCT I Pro S 3 TTT A Phe S 380
TG(
Cy
GAC
Git
GTC
Val 205
AAA
Lys
AGG
Ser
GCT
Ala
ATT
Ile
SAA
Glu 285
AT
Asn
CT
rhr
CT
er 7
TT
'he 'cc ;er
AGG
s Ser
CCC
Pro 190
AGG
Arg
GTT
Val
TGG
Trp
GTG
Leu
ATG
Ile 270
ATG
Met
GCC
Ala
TTA
Leu
GTA
Leu GAA Glu C 350 ATC I Ile 1
AC
Thi 171
GTI
Val
AGI
Arc
TGC
Trp
TTT
Phe
GCT
Ala 255
GTG
Leu
GTT
Leu
TGG
Trp
GTA
Vral
-GG
Irg 335
GA
;ly
CTG
let 160 T GTT r Val k AAG Asn
TTT
Phe
TGG
Trp
GAA
Glu 240
TTT
Phe
GAG
Glu
GTA
Leu
TGT
Gys
GCC
Ala 320
ACA
Thr
ATG
Met
AAG
Asn 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 1152 1200 IGC ATG ATG GGA er Ile Met Gly GTG AAT GTG TTT GGT GGG AAG TTG TAT GAG TGT GTC AAC ACC ACC GAT val 385 Asn Leu Phe Ala Gly 390 Lys Phe Tyr Glu Gys 395 Val Asn Thr Thr Asp 400 -46- GGG TGA GGA TTT GGT ACA TCT CAA GTI Gly Ser Arg Phe Pro Thr Ser Gin Val 405
GCG
Ala
AAG
Asn
ACA
Thr
AAT
Asn 465
TAG
Tyr
TTC
Phe
GGA
Gly
GCA
Ala
CA
Pro 545
GCT
Ala
ATG
Met
CAC
His
CTG
LeuI
ATT
Ile 625
CG
Le:
TTC
Phe
TTC
Phe 450
GTA
Val1
TTT
Phe
ATT
Ile
GGT
Gly
ATG
Met 530
GGG
Gi y
TTT
Phe
A~TG
TGG
Trp
%.AG
610
['TG
,eu
ATG
Met
GAG
Asp 435
AAG
Lys
AAT
As n
GTC
Vai
GGT
Gly
CAA
Gin 515
AAG
Lys
AAC
Asn
GAT
Asp
GTA
Val
ATG
Ile 595
CTA
Leu I TAT 9 Tyr
AAC
Asn 420
AAC
Asn
GGC
Gly
GAA
Giu
ATC
Ile
GTC
Vali 500
GAT
Asp
AAG
Lys
~AA
Lys
A.TG
Ile
GAA
580 1AC ksn kTC lie
TT
?he
GTT
Val
GTT
Val1
TGG
Trp
GAG
Gin
TTC
Phe 485
ATC
Ile
ATC
Ile
CTT
Leu
TTC
Phe
ACC
Thr 565 AAAi Lys
ATG
Met
TGG
Ser
GTG
Val
AG'I
Ser
GGG
Giy
ATG
Met
CCG
Pro 470
ATC
Ile
ATA
Ile
TTT
Phe
GGG
Giy
CAA
Gin 550
ATC
Ile
GAG
G1u
GTC
Val1
CTC
Le u 3TA Ia 1 630
GGA
Gly
CTT
Leu
GAT
Asp 455
AAA
Lys
ATC
Ile
GAT
Asp
ATG
Met
TGG
Ser 535
GGA
Gly
ATG
Met
GGG
Giy
TTC
Phe
AGA
Arg 615
GTG
ValI
AAT
As n
GGT
Gi y 440
ATT
Ile
TAG
T yr
TTG
Phe
AAT
As n
ACA
Thr 520
AAA
Lys
TGT
Cys
GTT
Val1
CAA
Gln
%STT
Ilie 600
'AT
kTG lie
GTG
Val1 425
TAC
Tyr
ATG
Met
GAA
Giu
GGC
Gly
TTG
Phe 505
GAA
Giu
AAA
Lys
ATA
Ile
CTT
Leu
ACT
Thr 585
ATC
Ile
TAG
Tyr
'TG
.eu
GGA
*Ala 410
GGA
*Arq
GTG
*Leu
TAT
Tyr
TAG
Tyr
TGA
Ser 490
AAG
Asn
GAA
Giu
GGA
Pro
TTT
Phe
ATA
Ile 570
GAG
Giu
GTG
Leu
TAG
Tyr
TGG
Ser
AAC
As r
TGG
T rr
TCG
Ser
GCA
Al a
AGT
Ser 475
TTG
Phe
GAA
Gin
GAG
Gin
GAA
Gin
GAG
Asp 555
TGG
Cys
TAG
ryr 1'TC Phe
TTC
Phe
WT
Ile 635
GGT
Arq
AAA
Lys
GTG
Leu
GGA
Ala 460
CTC
Leu
TTC
Phe
GAG
Gin
AAG
Lys
AAA
Lys 540
TTA
Leu
GTC
Leu
ATG
Met
ACT
Thr
AGT
Thr 620
GTAC
ValC Ser
AAC
As n
CTT
Leu 445
GTT
Val1
TAG
T yr
AG
Thr
AAA
Lys
AAA
Lys 525
GGA
Pro
GTG
Vali 1AGC
GAT
,sp
IGG
31Y 605
TG
Ia 1
GA
}ly
GAG
Glu
GTG
Leu 430
GAA
Gin
GAG
Asp
ATG
Met
TTG
Leu
AAA
Lys 510
TAG
Tyr
ATT
Ile
AGA
Thr
ATG
Met
TAT
Tyr 590
GAG
Giu
GGT
Gly
ATG
Met
TGI
Cys 415
AAA
Lys GT I Val1
TGT
Ser
TAG
Tyr
AAC
As n 495
AAG
Lys
TAT
Tyr
GGA
Pro
AAG
As n
GTA
Val1 575
GTT
Val1
TGT
Cys
TGG
Trp
~TTT
Phe
TTT
Phe
GTA
Val
GGA
Ala
GTT
Val1
ATT
Ile 480
GTG
Le u
GTT
Leu
AAT
As n
AGG
Arg
GAA
Gin 560
AGG
Thr
TTA
Leu
GTG
Val1
AAG
As n
GTG
Le u 640 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1728 1776 1824 1872 1920 1968 GGT GAG ATG ATA Ala Giu Met Ile GAG AAG TAT TTG GTG TGC GGT AGG CTG TTG CGA GTG Giu Lys Tyr Phe Val Ser Pro Thr Leu Phe Arq Val to
ATC
Ile
AAG
Lys
GTG
Leu
ATC
Ile 705
AAT
Asn
TTC
Phe
CTG
Leu
AGT
Ser
TTT
Phe 785
ATC
Ile
GAG
Glu
AAG
Lys
GAG
Asp
AAA
Lys 865
CGC
Arg
GGG
Gly
TTG
Phe 690
TTT
Phe
GAC
Asp
CAA
Gin
AAG
Asn
TGA
Ser 770
GTG
Vali
GGT
Aia
CT
Pro
TTG
Phe
TTT
Phe 850
GTG
Val
CTG
Leu
ATC
Ile 675
AAC
Asn
GGG
Giy
ATG
Met
ATC
Ile
AGC
Ser 755
GTG
Vai
AGG
Ser
GTG
Val1
CTG
Leu
GAC
Asp 835
GCA
Aia
GAG
Gin
GGG
Aila 660
G
Arg
ATC
Ile
ATG
Met
TTG
Phe
ACC
Thr 740
GCA
Aia
GAA
Giu
TAG
Tyr
ATG
Ile
AGT
Ser 820
GGT
Pro
GGT
Aia
GTG
Le u 645
AGG
Arg
AGT
Thr
GGG
Giy
TGG
Se r
AAG
Asn 725
AGG
Thr
GGT
Pro
GGG
Giy
ATG
Ile
GTG
Leu 805
GAG
Giu
GAG
Asp
GGG
Aia
ATT
Ile
ATT
Ile
GTG
Le u
GTG
Leu
AAG
As n 710
TTT
Phe
TGT
Ser
GGG
Pro
GAG
Asp
ATG
Ile 790
GAG
Glu
GAG
Asp
GGG
Ala
CTG
Le u
GGG
Ala 870
GGA
Gly
GTG
Le u
GTG
Leu 695
TTT
Phe
GAG
Glu
GGG
Aila
GAG
Asp
TGT
Gys 775
ATA
Ile
AAG
As n
GAG
Asp
AGT
Thr
GAT
Asp 855
ATG
Met
GGA
Arg
TTT
Phe 680
GTT
Leu
GGG
Al a
AGT
Thr
GGG
Gly
TGT
Gys 760
GGG
Gly
TGG
Ser
TTG
Phe
TTT
Phe
GAG
Gin 840
GGT
Pro
GAG
Asp
ATG
Ile 665
GGT
Al a
TTG
Phe
TAG
Tyr
TTT
Phe
TGG
Trp 745
GAG
Asp
AAG
As n
TTG
Phe
AGG
Ser
GAG
Giu 825
TTG
Phe
GGG
Pro
GTG
Leu
TTG
Leu
GTG
Leu
GTT
Val1
GGG
Gly 730
GAG
Asp
GGT
Pro
GGA
Pro
GTG
Le u
GTG
Val1 810
ATG
Met
ATA
Ile
GTG
Le u
GGG
Pro
ATG
Met
GTG
Val1
AAA
Lys 715
AAG
As n
GGA
Gly
AAA
Lys
TGG
Ser
GTG
Val1 795
GGG
Al a
TTG
Phe
GAG
Glu
GTG
Leu
ATG
Met 875
ATG
Met
ATG
Met 700
AAG
Lys
AGC
Ser
GTG
Leu
AAA
Lys
GTG
Val 780
GTG
Val1
AGG
Thr
TAG
Tyr
TTG
Phe
ATG
Ile 860
GTG
Val1
TGG
Ser 685
TTG
Phe
GAG
Giu
ATG
Met
GTG
Leu
GTT
Val1 765
GGG
Giy
GTG
Val1
GAA
Glu
GAG
Giu
TGG
Gys 845
GGA
Al a
AGT
Se r GTA GG GTG ATG Leu Arq Leu Ile AAA GGG Lys Gly 670 GTT GGT Leu Pro ATG TAG Ile Tyr GGT GGA Aia Gly ATG TGG Ile Gys 735 GGG GGG Aia Pro 750 GAG GGA His Pro ATT TTT Ile Phe AAG ATG Asn Met GAG AGG Giu Ser 815 GTG TGG Val Trp 830 AAG GTG Lys Leu AAG GGA Lys Pro GGA GAG Giy Asp
GGG
Ala
GG
Aila
GGG
Al a
ATT
Ile 720
TTG
Leu
ATG
Ile
GGA
Gly
TAG
Tyr
TAG
Tyr 800
AGT
Thr
GAG
Giu
TGT
Ser
AAG
Asn
GG
Arg 880 2016 2064 2112 2160 2208 2256 2304 2352 2400 2448 2496 2544 2592 2640 2688 ATG GAG TGG GTG GAG ATG TTG TTT GGT TTT AGA AAG GGG GTG GTG GGT Ile His Gys Leu Asp 885 Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly 890 895 -48- GAG GGT GGA GAG ATG GAT TCT CTT CGT TCA CAG ATG GAA GAA AGG TTC 2736 Glu Gly Gly Glu Met Asp Ser Leu Arg Ser Gin Met Glu Glu Arg Phe 900 905 910 ATG TCA GCC AAT CCT TCT AAA GTG TCC TAT GAA CCC ATC ACG ACC ACA 2784 Met Ser Ala Asn Pro Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr 915 920 925 CTG AAG AGA AAA CAA GAG GAG GTG TCC GCG ACT ATC ATT CAG CGT GCT 2832 Leu Lys Arg Lys Gin Glu Glu Val Ser Ala Thr Ile Ile Gin Arg Ala 930 935 940 TAC AGA CGG TAT CGC CTC AGA CAA CAC GTC AAG AAT ATA TCG AGT ATA 2880 Tyr Arg Arg Tyr Arg Leu Arg Gin His Val Lys Asn Ile Ser Ser Ile 945 950 955 960 TAC ATA AAA GAT GGA GAC AGG GAT GAT GAT TTG CCC AAT AAA GAA GAT 2928 Tyr Ile Lys Asp Gly Asp Arg Asp Asp Asp Leu Pro Asn Lys Glu Asp 965 970 975 ACA GTT TTT GAT AAC GTG AAC GAG AAC TCA AGT CCG GAA AAG ACA GAT 2976 Thr Val Phe Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp 980 985 990 0* GTA ACT GCC TCA ACC ATC TCG CCA CCT TCC TAT GAC AGT GTC ACA AAG 3024 Val Thr Ala Ser Thr Ile Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys *0 995 1000 1005 SCCA GAT CAA 3033 *o Pro Asp Gin 1010 INFORMATION FOR SEQ ID NO:2: SEQUENCE CHARACTERISTICS: LENGTH: 1011 amino acids TYPE: amino acid TOPOLOGY: linear MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: Arg Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser 1 5 10 Phe Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Thr Asp Ala Asn 25 Asn Leu Gin Ile Ala Val Ala Arg Ile Lys Arg Gly Ile Asn Tyr Val 40 Lys Gin Thr Leu Arg Glu Phe Ile Leu Lys Ser Phe Ser Lys Lys Pro 55 Lys Gly Ser Lys Asp Thr Lys Arg Thr Ala Asp Pro Asn Asn Lys Lys 70 75 Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Glu Met Ser Lys Asp His 90 -49- Asn Phe Leu a a Asp Pro Glu 145 Ser Asp Ala Pro Thr 225 Ser Giu T yr Lys T rp 305 Asn Leu Arg Val1 Val1 385 Gly Al 
a Lys Ser 130 Ile Lys Asn Asp.
Cys 210 Ile Phe Asp Ala Trp 290 Leu Thr Arq Val1 Leu 370 Asn Ser Leu Se r 115 Leu Met Gin Pro Gin 195 Cys Arg Ile Ile Asp 275 Val1 Asp Len Ala Val1 355 Leu Le u Arg Met Lys 100 Phe Thr As n Lys Len 180 Pro Gin Lys Val1 T yr 260 Lys Al a Phe Gly Len 340 Val1 Val1 Phe Phe As n 420 Met Val Thr Arg 165 Pro Gin Val1 Thr Len 245 Ile Ile Tyr Leu Tyr 325 Arq Asn Cys Ala Pro 405 Val1 Asp Gin Thr Val 135 Gin Gin 150 Asn Arg Gly Glu Ala Cys Asn Val 215 Cys Tyr 230 Met Ile Gin Lys Phe Thr Gly Tyr 295 Ile Val 310 Ser Asp Pro Len Ala Leu Len Ile 375 Gly Lys 390 Thr Ser Ser Giy Asn 120 Pro Leu Ser Giu Phe 200 Asp Arg Len Lys Tyr 280 Lys Asp Leu Arg Ile 360 Phe Phe Gin Asn Gin Lys Asp Arg Ile Ser 105 Asp Tyr Ile Ala Ser Ser Ser Ser 170 Glu Aia 185 Thr Asp Ser Gly Ile Val Leu Ser 250 Lys Thr 265 Ile Phe Thr Tyr Val Ser Gly Pro 330 Ala Leu 345 Gly Ala Trp Leu Tyr Giu Val Ala 410 Val Arg 425 Gly Tyr Gin Ser Pro Gly 140 Asp Ser 155 Ser Glu Gin Ala Gly Cys Lys Gly 220 Gin His 235 Ser Gly Ile Lys Ile Leu Phe Thr 300 Leu Val 315 Ile Lys Ser Arg Ile Pro Ile Phe 380 Cys Val 395 Asn Arg Trp Lys Gly Ser 110 Phe Ile 125 Gin Ser Asp Ser Cys Ser Gin Pro 190 Vai Arg 205 Lys Val Ser Trp Ala Len Ile Ile 270 Gin Met 285 Asn Ala Thr Len Ser Len Phe Gin 350 Ser Ile 365 Ser Ile Asn Thr Ser Giu Asn Len 430 Ser His Asp Asp Thr 175 Val Arg Trp Phe Ala 255 Leu Leu Trp Val1 Arg 335 Gly Met Met Thr Cys 415 Lys Leu Asn Leu Tyr 160 Val1 As n Phe Trp Gin 240 Phe Gin Leu Cys Ala 320 Thr Met As n Gly Asp 400 Phe Val1 Asn Phe Asp Asn Val Gly Leu Gi 435 44 Tyr Leu Ser Leu Leu Gin Val Ala 445 Thr Asn 465 Tyr Phe GI y Ala Pro 545 Ala Met His Leu Ile 625 Ala Ile Lys Leu Ile 705 Asn Phe Phe 450 Val1 Phe Ile Giy Met.
530 Gly Phe Met Trp Lys 610 Leu Giu Arg Giy Phe 690 Phe Asp Gin Lys Asn Val1 Giy Gin 515 Lys As n Asp Vali Ile 595 Leu Tyr Met Le u Ile 675 As n Gly Met Ile Giy Glu Ile Val1 500 Asp Lys Lys Ile Giu 580 As n Ile Phe Ile Ala 660 Arg Ile Met Phe Thr 740 Trp Gin Phe 485 Ile Ile Le u Phe Thr 565 Lys Met Ser Vai Giu 645 Arg Thr Giy Ser Asn 725 Thr Met Pro 470 Ile Ile Phe Giy Gin 550 Ile Giu Val1 Leu Val1 630 Lys Ile Le u Leu As n 710 Phe Ser Asp 455 Lys Ile Asp Met Ser 535 Gly Met Gly Phe Arg 615 Val Tyr Gly Leu Leu 695 Phe Giu Ala Ile Tyr Phe Asn Thr 520 Lys Cys Val1 Gin Ile 600 His Ile Phe Arq Phe 680 Leu Aia Thr Gly Met Giu Giy Phe 505 Glu Lys Ile Leu Thr 585 Ile T yr Leu Val1 Ile 665 Al a Phe T yr Phe T rp 745 Tyr Tyr Ser 490 Asn Giu Pro Phe Ile 570 Giu Leu Tyr Ser Ser 650 Leu Le u Le u Val1 Gly 730 Asp Al a Se r 475 Phe Gin Gin Gin Asp 555 Cys Tyr Phe Phe Ile 635 Pro Arg Met Val1 Lys 715 As n Gi y Ala 460 be u Phe Gin Lys Lys 540 be u Leu Met Thr Thr 620 Val1 Thr be u Met Met 700 Lys Se r be u Val Asp Tyr Met Thr beu bys Lys 510 Lys Tyr 525 Pro Ile Vai Thr Asn Met Asp Tyr 590 Gly Giu 605 Val Giy Giy Met Leu Phe Ile bys 670 Ser beu 685 Phe Ile Glu Ala Met Ile beu Ala 750 Ser Tyr Asn 495 bys Tyr Pro Asn Val 575 Val1 Cys Trp Phe Arg 655 Gly Pro Tyr Gly Cys 735 Pro Val1 Ile 480 Leu beu Asn Arg Gin 560 Thr Leu Val1 Asn beu 640 Val1 Ala Ala Ala Ile 720 beu Ile beu Asn Ser Ala Pro Pro Asp Cys 760 Asp Pro Lys Lys His Pro Gly -51- Ser Phe 785 Ile Glu Lys Asp Lys 865 Ile Glu Met Leu Tyr 945 Tyr Thr Val Ser 770 Val Ala Pro Phe Phe 850 Val His Gly Ser Lys 930 Arg Ile Val Thr Val Ser Val Leu Asp 835 Ala Gin Cys Gly Ala 915 Arg Arg Lys Phe Ala 995 Glu Tyr Ile Ser 820 Pro Ala Leu Leu Glu 900 Asn Lys Tyr Asp Asp 980 Ser Gly Ile Leu 805 Glu Asp Ala Ile Asp 885 Met Pro Gin Arg Gly 965 Asn Thr Asp Ile 790 Glu Asp Ala Leu Ala 870 Ile Asp Ser Glu Leu 950 Asp Val Ile Cys 775 Ile Asn Asp Thr Asp 855 Met Leu Ser Lys Glu 935 Arg Arg Asn Ser Ser Phe Phe Gin 840 Pro Asp Phe Leu Val 920 Val 
Gin Asp Glu Pro 1000 Phe Ser Glu 825 Phe Pro Leu Ala Arg 905 Ser Ser His Asp Asn 985 Leu Val 810 Met Ile Leu Pro Phe 890 Ser Tyr Ala Val Asp 970 Ser Val 795 Ala Phe Glu Leu Met 875 Thr Gin Glu Thr Lys 955 Leu Ser 780 Val Val Thr Glu Tyr Glu Phe Cys 845 Ile Ala 860 Val Ser Lys Arg Met Glu Pro Ile 925 lie Ile 940 Asn Ile Pro Asn Pro Glu Asn Glu Val 830 Lys Lys Gly Val Glu 910 Thr Gin Ser Lys Lys 990 Met Ser 815 Trp Leu Pro Asp Leu 895 Arg Thr Arg Ser Glu 975 Thr Tyr 800 Thr Glu Ser Asn Arg 880 Gly Phe Thr Ala Ile 960 Asp Asp Gly Asn Pro Ser Val Gly Ile Phe Tyr Pro Ser Tyr Asp Ser Val Thr Lys 1005 Pro Asp Gin 1010
INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
LENGTH: 29 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: 12
OTHER INFORMATION: /note= "Base is Inosine"
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION:
OTHER INFORMATION: /note= "Base is Inosine"
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: 19
OTHER INFORMATION: /note= "Base is Inosine"
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: 21
OTHER INFORMATION: /note= "Base is Inosine"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GCGAAGCTTY TNATNTTYNN NATHATGGG
INFORMATION FOR SEQ ID NO:4:
SEQUENCE CHARACTERISTICS:
LENGTH: 8 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Phe Trp Leu Ile Phe Ser Ile Met
1
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 34 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID
GCAGGATCCR TTRAAARTTR TCDATDATNA CNCC
INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS:
LENGTH: 8 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Gly Val Ile Ile Asp Asn Phe Asn
1
INFORMATION FOR SEQ ID NO:7:
SEQUENCE CHARACTERISTICS:
LENGTH: 2005 amino acids
TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: Met Ala Arg Ser Val Leu Val Pro Pro Gly Pro Asp Ser Phe Arg Phe 1 5 10 Phe Thr Arg Glu Ser Leu Ala Ala Ile Glu Gin Arg Ile Ala Glu Glu 20 25 Lys Ala Lys Arg Pro Lys Gin Glu Arg Lys Asp Glu Asp Asp Glu Asn 40 Gly Pro Lys Pro Asn Ser Asp Leu Glu Ala Gly Lys Ser Leu Pro Phe 55 Ile Tyr Gly Asp Ile Pro Pro Glu Met Val Ser Glu Pro Leu Glu Asp 70 75 Leu Asp Pro Tyr Tyr Ile Asn Lys Lys Thr Phe Ile Val Leu Asn Lys 90 Gly Lys Ala Ile Ser Arg Phe Ser Ala Thr Ser Ala Leu Tyr Ile Leu 100 105 110 Thr Pro Phe Asn Pro Ile Arg Lys Leu Ala Ile Lys Ile Leu Val His 115 120 125 Ser Leu Phe Asn Val Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Val 130 135 140 Phe Met Thr Met Ser Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr 145 150 155 160 Thr Phe Thr Gly Ile Tyr Thr Phe Glu Ser Leu Ile Lys Ile Leu Ala 165 170 175 Arg Gly Phe Cys 180 Leu Glu Asp Phe Thr Phe Leu Arg Asn Pro Trp Asn 185 190 Trp Asn Leu 225 Leu Phe Asn Glu Thr 305 Gin Leu Cys Thr Phe 385 Tyr ile Gin Met Ala 465 Gly Leu Leu 210 Lys Ile Cys Leu Ile 290 Al a Asp Cys Val1 Phe 370 Trp Met Asn Al a Leu 450 Ala Val1 Asp 195 Gly Thr Gin Len Arg 275 As n Phe Lys Gly Lys 355 Se r Giu Ile Le u Thr 435 Gin Ala Phe Phe Thr Asn Val Ile Ser Ser Val 245 Ser Val 260 Asn Lys Ile Thr Asn Arg Ser His 325 Asn Ser 340 Ala Gly Trp Ala Asn Leu Phe Phe 405 Ile Leu 420 Leu Giu Gin Leu Ser Ala Ser Gin 485 Val1 Ser Val1 230 Lys Phe Cys Ser Thr 310 Phe Ser Arg Phe Tyr 390 Val1 Al a Gin Lys Gin 470 Ser Ile Ala 215 Ile Lys Ala Leu Phe 295 Val1 Tyr Asp As n Len 375 Gin Leu Val1 Ala Lys 455 Ser Ser Thr 200 Leu Pro Leu Len Gin 280 Phe As n Phe Al a Pro 360 Ser Leu Val1 Val1 Glu 440 Gin Arg Ser Phe Arg Gly Ser Ile 265 T rp As n Met Leu Gly 345 Asn Leu Thr Ile Al a 425 Gin Gin Asp Val Al a Thr Len Asp 250 Gly Pro As n Phe Glu 330 Gin Tyr Phe Len Phe 410 Met Lys Gin Phe Ala 490 Tyr Phe Lys 235 Vali Len Pro Ser As n 315 Gly Cys Gly Arg Arg 395 Leu 
Ala Giu Gin Ser 475 Ser Val1 Arg 220 Thr Met Gin Asp Len 300 T rp Gin Pro Tyr Len 380 Ala Gly Tyr Ala Ala 460 Gly Lys Thr 205 Val1 Ile Ile Len Asn 285 Asp Asp Asn Giu Thr 365 Met Ala Ser Giu Giu 445 Gin Ala Leu Giu Leu Val Leu Phe 270 Ser Trp Giu Asp Gly 350 Se r Thr Gi y Phe Giu 430 Phe Al a Gi y Ser Phe Arq Gly Thr 255 Met Thr Asn Tyr Ala 335 Tyr Phe Gin Lys Tyr 415 Gin Gin Al a Gly Ser 495 Val1 Al a Ala 240 Val1 Gly -Phe Gly Ile 320 Leu Ile Asp Asp Thr 400 Len Asn Gin Al a Ile 480 Lys Ser Gin Lys Giu Leu Lys Asn Arg Arg 505 Lys Lys Lys Lys Gin Lys Giu 510 Gin Ala Gly 515 Glu Glu Glu Lys Glu Asp Ala Val Arg 9 99 9 9 999*** *9 9 9 9 .9 9 9 *9 .9 a *999 9 9* *999 9 a.
Giu Arg 545 Ser Leu Phe Asp ValI 625 Asn Val1 Glu Tyr Met 705 Ser Leu As n Ile Giu 785 Ile Tyr Leu Asp 530 Leu Ile Phe Al a Ser 610 Ser Gly Gly Gly His 690 Ser Arg Ile Leu Val 770 Gin Phe Tyr Ser Se r Thr Arg Asn Asp 595 be u Gin Lys Gly Thr 675 Val1 Met Gin Trp Val1 755 be u Phe Thr Phe be u 835 Ile Tyr Gi y Phe 580 Asp Phe Aila Met Pro 660 Thr Ser Al a Lys Asp 740 Val1 As n Ser Al a Gin 820 Met Arg Giu Ser 565 Lys Glu Val1 Ser His 645 Ser Thr Met Ser Cys 725 Cys Met Thr Ser Giu 805 Glu Glu Lys Lys 550 be u Gly His Pro Arg 630 Se r Al a Giu Asp Ile 710 Pro Cys Asp Le u Val1 790 Met Gly Leu Lys 535 Arg Phe Arg Ser His 615 Al a Al a be u Thr be u 695 beu Pro Lys Pro Phe 775 Le u Phe Trp Gly 520 Gly Phe Ser Val Thr 600 Arg Ser Val1 Thr Giu 680 beu Thr Cys Pro Phe 760 Met Ser Len As n Leu 840 Phe Ser Pro bys 585 Phe His Arg Asp Ser 665 Ile Gin Asn Trp Trp 745 Val1 Al a Val1 bys Ile 825 Al a Gin Ser Arg 570 Asp Glu Gly Gly Cys 650 Pro Arg Asp Thr T yr 730 Leu Asp Met Giy Ile 810 Phe Asn Phe Pro 555 Arg Ile Asp Gin Ile 635 Asn Val bys Pro Met 715 bys Lys beu Giu Asn 795 Ile Asp Val Ser 540 His Asn Gly As n Arg 620 Pro Gly Gly Arg Ser 700 Gin Phe Val1 Ala His 780 beu Ala Gly Gin bys 525 ben Gin Ser Ser Asp 605 Arg Thr Val1 Gin Arg 685 Arg Gin Ala bys Ile 765 T yr Val Met Phe Gly 845 Ser Ala Ser Gin Ser Arg Giu 590 Ser Pro beu ValI beu 670 Ser Gin ben As n His 750 Thr Pro Phe Asp Ile 830 beu Giy beu Aia 575 Asn Arg Ser Pro Ser 655 beu Ser Arg Gin Met 735 Val1 Ile Met Thr Pro 815 Val1 Ser Ser beu 560 Ser Asp Arg -As n Met 640 ben Pro Ser Al a Gin 720 Cys Val1 Cys Thr Gly 800 Tyr Ser Val1 -56- Leu Arg Ser Phe Arg Leu Leu Arg Val Phe Lys Leu Ala Lys Ser Trp 850 855 860 Pro Thr Leu Asn Met Leu Ile Lys Ile Ile Gly Asn Ser Val Gly Ala 865 870 875 880 Leu Gly Asn Leu Thr Leu Val Leu Ala Ile Ile Val Phe Ile Phe Ala 885 890 895 Val Val Gly Met Gin Leu Phe Gly Lys Ser Tyr Lys Glu Cys Val Cys 900 905 910 Lys Ile Ser Asn Asp Cys Glu Leu Pro Arg Trp His Met His His Phe 915 920 925 Phe 
His Ser Phe Leu Ile Val Phe Arg Val Leu Cys Gly Glu Trp Ile 930 935 940 Glu Thr Met Trp Asp Cys Met Glu Val Ala Gly Gin Thr Met Cys -Leu 945 950 955 960 Thr Val Phe Met Met Val Met Val Ile Gly Asn Leu Val Val Leu Asn I 965 970 975 Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe Ser Ser Asp Asn Leu Ala 980 985 990 Ala Thr Asp Asp Asp Asn Glu Met Asn Asn Leu Gin Ile Ala Val Gly 995 1000 1005 Arg Met Gin Lys Gly Ile Asp Phe Val Lys Arg Lys Ile Arg Glu Phe S1010 1015 1020 Ile Gin Lys Ala Phe Val Arg Lys Gin Lys Ala Leu Asp Glu Ile Lys 1025 1030 1035 1040 Pro Leu Glu Asp Leu Asn Asn Lys Lys Asp Ser Cys Ile Ser Asn His 1045 1050 1055 Thr Thr Ile Glu Ile Gly Lys Asp Leu Asn Tyr Leu Lys Asp Gly Asn 1060 1065 1070 Gly Thr Thr Ser Gly Ile Gly Ser Ser Val Glu Lys Tyr Val Val Asp 1075 1080 1085 Glu Ser Asp Tyr Met Ser Phe Ile Asn Asn Pro Ser Leu Thr Val Thr 1090 1095 1100 Val Pro Ile Ala Leu Gly Glu Ser Asp Phe Glu Asn Leu Asn Thr Glu 1105 1110 1115 1120 Glu Phe Ser Ser Glu Ser Asp Met Glu Glu Ser Lys Glu Lys Leu Asn 1125 1130 1135 Ala Thr Ser Ser Ser Glu Gly Ser Thr Val Asp Ile Gly Ala Pro Ala 1140 1145 1150 Glu Gly Glu Gin Pro Glu Ala Glu Pro Glu Glu Ser Leu Glu Pro Glu 1155 1160 1165 Ala Cys Phe Thr Glu Asp Cys Val Arg Lys Phe Lys Cys Cys Gin Ile 1170 1175 1180 57- Ser Ile Giu Giu Gly Lys Gly Lys Leu Trp Trp Asn Leu Arq Lys Thr 1185 1190 1195 1200 Gys Tyr Lys Ile Val Giu His Asn Trp Phe Giu Ile Phe Ile Val Phe 1205 1210 1215 Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Giu Asp Ile Tyr Ile 1220 1225 1230 Giu Gin Arg Lys Thr Ile Lys Thr Met Leu Giu Tyr Ala Asp Lys Vai 1235 1240 1245 Phe Thr Tyr Ile Phe Ile Leu Giu Met Leu Leu Lys Trp Val Ala Tyr 1250 1255 1260 Gly Phe Gin Met Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu 1265 1270 1275 1280 Ile Vai Asp Vai Ser Leu Vai Ser Leu Thr Ala Asn Ala Leu Gly -Tyr :::1285 1290 1295 .Ser Giu Leu Giy Ala Ile Lys Ser Leu Arq Thr Leu Arq Ala Leu Arg 1300 1305 1310 :0 Pro Leu Arg Ala Leu Ser Arg Phe Giu Giy Met Arg Val Vai Val Asn *1315 1320 1325 
Ala Leu Leu Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys 1330 1335 1340 Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala 1345 1350 1355 1360 Gly Lys Phe Tyr His Cys Ile Asn Tyr Thr Ile Gly Giu Met Phe Asp *1365 1370 1375 Ser Val Val Asn Asn Tyr Ser Giu Cys Gin Ala Leu Ile Giu Ser 1380 1385 1390 *C0Asn Gin Thr Ala Arg Trp Lys Asn Val Lys Val Asn Phe Asp Asn Val 1395 1400 1405 Gly Leu Gly Tyr Leu Ser Leu Leu Gin Vai Ala Thr Phe Lys Gly Trp 1410 1415 1420 Met Asp Ile Met Tyr Ala Ala Val Asp Ser Arg Asn Val Giu Leu Gin 1425 1430 1435 1440 Pro Lys Tyr Giu Asp Asn Leu Tyr Met Tyr Leu Tyr Phe Val Ile Phe 1445 1450 1455 Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile 1460 1465 1470 Ile Asp Asn Phe Asn Gin Gin Lys Lys Lys Phe Gly Gly Gin Asp Ile 1475 1480 1485 Phe Met Thr Glu Glu Gin Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu 1490 1495 1500 Gly Ser Lys Lys Pro Gin Lys Pro Ile Pro Arq Pro Ala Asn Lys Phe 1505 1510 1515 1520 -58- Gln Gly Met Val Phe Asp Phe Val Thr Lys Gin Val Phe Asp Ile Ser 1525 1530 1535 Ile Met Ile Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Thr 1540 1545 1550 Asp Asp Gin Ser Gin Glu Met Thr Asn Ile Leu Tyr Trp Ile Asn Leu 1555 1560 1565 Val Phe Ile Val Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser 1570 1575 1580 Leu Arg His Tyr Tyr Phe Thr Ile Gly Trp Asn Ile Phe Asp Phe Val 1585 1590 1595 1600 Val Val Ile Leu Ser Ile Val Gly Met Phe Leu Ala Glu Leu Ile Glu 1605 1610 1615 Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Aa -Arg 1620 1625 1630 Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly Ala Lys Gly Ile Arg Thr 1635 1640 1645 Leu Leu Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly 1650 1655 1660 Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1665 1670 1675 1680 Asn Phe Ala Tyr Val Lys Arg Glu Val Gly Ile Asp Asp Met Phe Asn 1685 1690 1695 Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gin Ile Thr Thr 1700 1705 1710 Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Gly Pro 1715 1720 
1725 Pro Asp Cys Asp Pro Glu Lys Asp His Pro Gly Ser Ser Val Lys Gly 1730 1735 1740 Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Phe Phe Val Ser Tyr Ile 1745 1750 1755 1760 Ile Ile Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu 1765 1770 1775 Glu Asn Phe Ser Val Ala Thr Glu Glu Ser Ala Glu Pro Leu Ser Glu 1780 1785 1790 Asp Asp Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp 1795 1800 1805 Ala Thr Gin Phe Ile Glu Phe Cys Lys Leu Ser Asp Phe Ala Ala Ala 1810 1815 1820 Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gin Leu Ile 1825 1830 1835 1840 Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp 1845 1850 1855 -59- Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met 1860 1865 1870 Asp Ala Leu Arg Ile Gin Met Glu Glu Arg Phe Met Ala Ser Asn Pro 1875 1880 1885 Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gin 1890 1895 1900 Glu Glu Val Ser Ala Ile Val Ile Gin Arg Ala Tyr Arg Arg Tyr Leu 1905 1910 1915 1920 Leu Lys Gin Lys Val Lys Lys Val Ser Ser Ile Tyr Lys Lys Asp Lys 1925 1930 1935 Gly Lys Glu Asp Glu Gly Thr Pro Ile Lys Glu Asp Ile Ile Thr Asp 1940 1945 1950 Lys Leu Asn Glu Asn Ser Thr Pro Glu Lys Thr Asp Val Thr Pro -er 1955 1960 1965 Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro Glu Lys Glu 1970 1975 1980 S: Lys Phe Glu Lys Asp Lys Ser Glu Lys Glu Asp Lys Gly Lys Asp Ile 1985 1990 1995 2000 Arg Glu Ser Lys Lys 2005 INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 813 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe 1 5 10 Ser Ser Asp Asn Leu Ala Asp Asn Asn Leu Gin Ile Ala Val Arg Gly 25 Ile Val Lys Arg Glu Phe Ile Lys Phe Lys Lys Lys Asp Asn Asn Lys 40 Lys Ile Ser Asn Thr Glu Lys Asp Asn Leu Lys Ser Gly Gly Ser Ser 55 Lys Asp Glu Asp Tyr Ser Phe Ile Asn Pro Ser Leu Thr Val Thr Val 70 75 Pro Ile Ala Gly Glu Ser Asp Glu Asn Thr Glu Glu Ser 
Ser Ser Asp 90 Ser Lys Gin 9 .9 9* 9* 9. 9*99 9 9e 9 9 9 Gin Cys His 145 Ala Asp Al-a Val1 Lys 225 Arg Ser Ser Gly Asn 305 Thr Val1 Ile Asp Thr 385 Lys Asp Ala Gin 130 Trp Phe Lys T yr Asp 210 Ser Phe Ile Ile Phe 290 Phe Phe Gin Phe As n 370 Giu Lys Val Gin 115 Gly Phe Gin Phe Giy 195 Val1 Leu Gin Met Met 275 Ser Asp Lys Pro Gly 355 Phe Gin Pro Thr Lys 100 Pro Lys Gin Asp Thr 180 Tyr Se r Arg Gly Asn 260 Gly Val Asn Gly Lys 340 Ser Asn Gin Gin Gin 420 Gin Gly Phe Ile 165 Tyr Phe Leu Thr Met 245 Val1 Val1 Asn Val1 T rp 325 Tyr Phe Glm Lys Lys 405 Phe Pro Lys Ile 150 Tyr Ile Thr Val1 Leu 230 Arg Len Asn Ser Giy 310 Met Gin Ph e Gin Lys 390 Pro Asp Gin T rp 135 Val1 Ile Phe As n Len 215 Arg Val1 Leu Len Giu 295 Leu Asp Leu Thr Lys 375 Tyr Ile Ile Ala 120 Trp Met Gin Ile Ala 200 Ala Ala Val Val Phe 280 Cys Gly Ile T yr Leu 360 Lys Tyr Pro Ile *Cys Arg Ile Lys Len 185 Trp As n Leu Val1 Cys 265 Al a Ala Tyr Met Met 345 Asn Lys As n Arg Met 425 Asn Ser Ser Ser Glu Ser Thr Val Asp Pro Gin Glu Phe Lys Len Thr 170 Glu Cys Leu Arg As n 250 Leu Gly Len Leu Tyr 330 Tyr Leu Giy Al a Pro 410 Leu Thr Thr Leu 155 Ile Met Trp Gly Pro 235 Ala Ile Lys Arg Ser 315 Al a Tyr Phe Gly Met 395 Asn Ile Cys Cys 140 Ser Lys Len Len Tyr 220 Len Len Phe Phe Trp 300 Len Al a Phe Ile Gin 380 Lys Lys Cys Val1 125 Tyr Ser Leu Len Asp 205 Ser Arg Gly Trp Tyr 285 Lys Len Val Val1 Gly 365 Asp Lys Phe Len 110 Arg Ile Gly Glu Lys 190 Phe Len Ala Al a Len 270 Cys As n Gin Asp Ile 350 Val1 Ile Leu Gin Asn 430 Phe Val Ala Tyr 175 Trp Len Gly Len Ile 255 Ile Asn Lys Val1 Ser 335 Phe Ile Phe
G
1 y Giy 415 MIet Cys Giu Len 160 Ala Val1 -Ile Ile Ser 240 Pro Phe Thr Val1 Ala 320 As n Ile Ile Met Ser 400 Phe Val1 Thr Met Met Val Giu Gin Met Leu Trp Ile Asn Val Phe Ile Leu Phe 435 440 445 Thr Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arq His Tyr Tyr Phe 450 455 460 Thr Gly Trp Asn Ile Phe Val Val Val Ile Leu Ser Ile Val Gly Met 465 470 475 480 Phe Leu Ala Giu Ile Glu Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg 485 490 495 Val Ile Arg Leu Ala Arg Ile Gly Arg Ile Leu Arg Leu Ile Lys Giy 500 505 510 Ala Lys Gly Ile Arq Thr Leu Leu Phe Ala Leu Met Met Ser Leu Pro 515 520 525 Ala Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Val Met Phe Ile -Tyr Ala Ile Phe Gly Met Ser Asn Phe Ala Tyr VaT Lys Glu Gly Ile Asp **545 550 555 560 Met Phe Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gin 9 *565 570 575 Ile Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn *580 585 590 *Ser Pro Pro Asp Cys Asp Pro Lys His Pro Gly Ser Ser Val Gly Asp 595 600 605 Cys Gly Asn Pro Ser Val Giy Ile Phe Phe Val Ser Tyr Ile Ile Ile .9610 615 620 Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Vai Ile Leu Giu Asn 625 630 635 640 Phe Ser Vai Ala Thr Glu Glu Ser Giu Pro Leu Ser Giu Asp Asp Phe **645 650 655 Glu Met Phe Tyr Giu Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gin 660 665 670 Phe Ile Giu Phe Cys Lys Leu Ser Asp Phe Ala Ala Al.a -Leu Asp Pro 675 680 685 Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gin Leu Ile Ala Met Asp 690 695 700 Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe 705 710 715 720 Ala Phe Thr Lys Arg Val Leu Gly Giu Gly Giu Met Asp Leu Arg Gin 725 730 735 Met Glu Glu Arg Phe Met Asn Pro Ser Lys Val Ser Tyr Glu Pro Ile 740 745 750 Thr Thr Thr Leu Lys Arg Lys Gin Glu Glu Val Ser Ala Ile Gin Arg 755 760 765 -62- 0* *00 0 0 00
*0*0 Ala Tyr Arg Arg Tyr Leu Gin Val Lys Ser Ser Ile Tyr Lys Asp Asp 770 775 780 Pro Lys Glu Asp Asp Asn Glu Asn Ser Pro Glu Lys Thr Asp Val Thr 785 790 795 800 Ser Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 805 810 INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 6452 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: both (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: NAME/KEY: CDS LOCATION: 326..6277 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: GTCGCCTCAT CCTGAGCAGA CTGGAAACAG ACTCCGTGCA GGCCTCGCCC GCGCTCCAGT TGCGACTGTA GGGTTTTCAT TCCTGCCCAC TGCGCAGACT GGGCTGAGCT AGCCTGGGTA TCCACGATTC GCGACTCGTA GTAACAGGCA CTCTGAGCAA CAGGATTTCA GAGAAAGAAG CAGAGGCAAG AAAGAAGCCT GGGGAGAGAG GAAGACTTTC CTTGGATCAG ACTCCGCAGG TGCACACACC GGGTGGGCAT GATCCGTGGG GCCAGGCCTC TTAGGTAAGG AGTCAAAGGG GAAATAAAAC ATACAGGATG AAAAG ATG GCG ATG CTG CCT CCT CCA GGA CCT Met Ala Met Leu Pro Pro Pro Gly Pro 1015 1020 CAG AGT TTC GTT CAC TTC Gin Ser Phe Val His Phe 1025 CGT ATT TCT GAA GAA AAA Arg Ile Ser Glu Glu Lys 1040 GAT GAG GAA GAA GGC CCC Asp Glu Glu Glu Gly Pro 1055 CAG CTC CCC TTC ATC TAT Gin Leu Pro Phe Ile Tyr 1070 CCC CTG GAG GAC CTG GAC ACA AAA CAG TCC CTT GCC CTC ATT GAA CAG Thr Lys Gin Ser Leu Ala Leu Ile Glu Gin 1030 1035 GCC AAG GAA CAC AAA GAC GAA AAG AAA GAT Ala Lys Glu His Lys Asp Glu Lys Lys Asp 1045 1050 AAG CCC AGC AGT GAC TTG GAA GCT GGG AAA Lys Pro Ser Ser Asp Leu Glu Ala Gly Lys 1060 1065 GGA GAC ATT CCC CCT GGA ATG GTG TCA GAG Gly Asp Ile Pro Pro Gly Met Val Ser Glu 1075 1080 CCA TAC TAT GCT GAC AAA AAA ACT TTT ATA 496 544 592 640 Pro Leu Glu Asp Leu Asp Pro Tyr Tyr Ala Asp Lys Lys Thr Phe Ile 1085 1090 1095 1100 GTA TTG AAC AAA GGG AAA GCA ATC TTC CGT TTC AAC GCC ACC CCT GCT -63- Val Leu Asn Lys Gly Lys Ala Ile Phe Arg Phe Asn Ala Thr Pro Ala 1105 1110 1115 TTG TAC ATG CTG TCT CCC TTC AGT CCT CTA AGA AGA ATA TCT ATT AAG 688 Leu Tyr Met Leu Ser Pro Phe Ser Pro Leu Arg Arg Ile Ser Ile Lys 1120 1125 1130 ATC TTA GTG CAC TCC TTA TTC AGC ATG CTA ATC ATG TGC 
ACA ATT CTG 736 Ile Leu Val His Ser Leu Phe Ser Met Leu Ile Met Cys Thr Ile Leu 1135 1140 1145 ACG AAC TGC ATA TTC ATG ACC TTG AGC AAC CCT CCA GAA TGG ACC AAA 784 Thr Asn Cys Ile Phe Met Thr Leu Ser Asn Pro Pro Glu Trp Thr Lys 1150 1155 1160 AAT GTA GAG TAC ACT TTT ACT GGG ATA TAT ACT TTT GAA TCA CTC ATA 832 Asn Val Glu Tyr Thr Phe Thr Gly Ile Tyr Thr Phe Glu Ser Leu Ile 1165 1170 1175 1180 AAA ATC CTT GCA AGA GGC TTT TGC GTG GGA GAA TTC ACC TTC CTC CGT 880 Lys Ile Leu Ala Arg Gly Phe Cys Val Gly Glu Phe Thr Phe Leu Arg 1185 1190 1195 GAC CCT TGG AAC TGG CTG GAC TTT GTT GTC ATT GTT TTT GCG TAT TTA 928 Asp Pro Trp Asn Trp Leu Asp Phe Val Val Ile Val Phe Ala Tyr Leu 1200 1205 1210 ACA GAA TTT GTA AAC CTA GGC AAT GTT TCA GCT CTT CGA ACT TTC AGA 976 Thr Glu Phe Val Asn Leu Gly Asn Val Ser Ala Leu Arg Thr Phe Arg 1215 1220 1225 GTC TTG AGA GCT TTG AAA ACT ATT TCT GTA ATC CCA GGA CTA AAG ACC 1024 Val Leu Arg Ala Leu Lys Thr Ile Ser Val Ile Pro Gly Leu Lys Thr 1230 1235 1240 ATC GTG GGG GCC CTG ATC CAG TCA GTG AAG AAG CTC TCT GAC GTC ATG 1072 Ile Val Gly Ala Leu Ile Gln Ser Val Lys Lys Leu Ser Asp Val Met 1245 1250 1255 1260 ATC CTC ACT GTG TTC TGT CTC AGT GTG TTT GCA CTA ATT GGA CTA CAG 1120 Ile Leu Thr Val Phe Cys Leu Ser Val Phe Ala Leu Ile Gly Leu Gln 1265 1270 1275 CTG TTT ATG GGC AAC TTG AAG CAT AAA TGT TTC AGG AAG GAA CTC GAA 1168 Leu Phe Met Gly Asn Leu Lys His Lys Cys Phe Arg Lys Glu Leu Glu 1280 1285 1290 GAG AAT GAA ACA TTA GAA AGT ATC ATG AAT ACT GCT GAG AGT GAA GAA 1216 Glu Asn Glu Thr Leu Glu Ser Ile Met Asn Thr Ala Glu Ser Glu Glu 1295 1300 1305 GAA TTG AAA AAA TAT TTT TAT TAC TTG GAG GGA TCC AAA GAT GCT CTA 1264 Glu Leu Lys Lys Tyr Phe Tyr Tyr Leu Glu Gly Ser Lys Asp Ala Leu 1310 1315 1320 CTC TGC GGC TTC AGC ACA GAT TCA GGG CAG TGT CCA GAA GGC TAC ATC 1312 Leu Cys Gly Phe Ser Thr Asp Ser Gly Gln Cys Pro Glu Gly Tyr Ile 1325 1330 1335 1340 TGT GTG AAG GCT GGC AGA AAC CCG GAT TAT GGC TAC ACG AGC TTT GAC 1360 Cys Val Lys Ala Gly Arg Asn Pro Asp Tyr Gly
Tyr Thr Ser Phe Asp 1345 1350 1355 -64- ACA TTC AGG TGG GCC TTC TTG GGC TTG TTT CGG GTA ATG ACT GAG GAC 1408 Thr Phe Ser Trp Ala Phe Leu Ala Leu Phe Arg Leu Met Thr Gin Asp 1360 1365 1370 TAC TGG GAG AAC CTT TAC CAA GAG ACT GTG GGT GGT GCT GGG AAA ACC 1456 Tyr Trp Giu Asn Leu Tyr Gin Gin Thr Leu Arg Ala Ala Gly Lys Thr 1375 1380 1385 TAC ATG ATT TTG TTT GTC GTG GTT ATT TTT GTG GGC TCC TTT TAG CTG 1504 Tyr Met Ile Phe Phe Val Val Val Ile Phe Leu Gly Ser Phe Tyr Leu 1390 1395 1400 ATA AAG TTG ATC GTG GGT GTG GTA GCC ATG GGG TAT GAG GAA GAG AAC 1552 Ile Asn Leu Ile Leu Ala Val Val Ala Met Ala Tyr Glu Giu Gin Asn 1405 1410 1415 1420 GAG GCC AAC ATG GAA GAA GGT AAA GAG AAA GAG TTA GAA TTT GAG GAG 1600 Gin Ala Asn Ile Giu Giu Ala Lys Gin Lys Giu Leu Giu Phe Gin Gin 1425 1430 1435 ATG TTA GAC CGA CTC AAA AAG GAG GAG GAA GAA GGT GAG GCG ATC GGT 1648 Met Leu Asp Arg Leu Lys Lys Giu Gin Giu Giu Aia Glu Ala Ile Ala *99999 1440 1445 1450 GCA GT GT GCT GAG TTC ACG AGT ATA GGG GG AGG AGG ATC ATG GGA 1696 Ala Ala Ala Ala Giu Phe Thr Ser Ile Gly Arg Ser Arg Ile Met Gly 1455 1460 1465 CTC TGT GAG AGG TGT TGA GAA ACC TCC AGG GTG AGC TGA AAG AGT GGC 1744 Leu Ser Giu Ser Ser Ser Giu Thr Ser Arg Leu Ser Ser Lys Ser Ala 9999: 1470 1475 1480 AAG GAG AGA AGA AAC GGA AGA AAG AAA AAG AAA GAG AAG ATG TGC AGT 1792 Lys Gu Arg Arq Asn Arq Arg Lys Lys Lys Lys Gin Lys Met Ser Ser 1485 1490 1495 1500 GGC GAG GAA AAG GGT GAG GAT GAG AAG GTG TCC AAG TGA GGA TCA GAG 1840 Gly Giu Glu Lys Gly Asp Asp Giu Lys Leu Ser Lys Ser Gly Ser Glu 1505 1510 1515 GAA AGC ATC CGA AAG AAA AGC TTG CAT CTC GGT GTG GAA GGG CAC CAC 1888 Glu Ser Ile Arg Lys Lys Ser Phe His Leu Giy Val Giu Gly His His 1520 1525 1530 GGG ACC CGG GAA AAG AGG GTG TCC ACC CCC AAC GAG TGG CCA CTC AGC 1936 Arq Thr Arg Glu Lys Arg Leu Ser Thr Pro Asn Gin Ser Pro Leu Ser 1535 1540 1545 ATT CGC GGG TCC GTG TTT TCT GCC AGG CGC AGC AGC AGG ACG AGT CTC 1984 Ile Arg Giy Ser Leu Phe Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu 1550 1555 1560 TTC AGT TTT AAG 
GGG CGA GGA AGA GAT GTG GGA TGT GAG ACA GAA TTC 2032 Phe Ser Phe Lys Gly Arg Gly Arg Asp Leu Gly Ser Giu Thr Giu Phe 1565 1570 1575 1580 GCT GAT GAT GAG CAT AGC ATT TTT GGA GAG AAG GAG AGC AGA AGG GGT 2080 Ala Asp Asp Giu His Ser Ile Phe Gly Asp Asn Giu Ser Arg Arg Gly 1585 1590 1595 TCA GTA TTG GTA CCC CAT AGA CCC CGG GAG GGG CGC AGG AGT AAC ATG 2128 Ser Leu Phe Val Pro His Arg Pro Arg Glu Arq Arg Ser Ser Asn Ile 1600 1605 1610 AGT CAG GCC AGT AGG TCC CCG CCA GTG CTA CCG GTG AAC GGG AAG ATG 2176 Ser Gin Ala Ser Arg Ser Pro Pro Val Leu Pro Val Asn Gly Lys Met 1615 1620 1625 CAC AGT GCA GTG GAC TGC AAT GGA GTC GTG TCG CTT GTT GAT GGA CCC 2224 His Ser Ala Val Asp Cys Asn Gly Val Val Ser Leu Val Asp Gly Pro 1630 1635 1640 TCA GCC CTC ATG CTC CCC AAT GGA CAG CTT CTT CCA GAG GTG ATA ATA 2272 Ser Ala Leu Met Leu Pro Asn Gly Gin Leu Leu Pro Glu Val Ile Ile 1645 1650 1655 1660 GAT AAG GCA ACT TCC GAC GAC AGC GGC ACG ACT AAT CAG ATG CGC AAA 2320 Asp Lys Ala Thr Ser Asp Asp Ser Gly Thr Thr Asn Gin Met Arg Lys 1665 1670 1675 AAA AGG-CTC TCT AGT TCT TAC TTC TTG TCT GAG GAC ATG CTG AAT GAE 2368 Lys Arg Leu Ser Ser Ser Tyr Phe Leu Ser Glu Asp Met Leu Asn Asp 1680 1685 1690 ***CCG CAT CTC AGG CAA AGG GCC ATG AGC AGG GCG AGC ATA CTG ACC AAC 2416 Pro His Leu Arg Gin Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn 1695 1700 1705 ACT GTG GAA GAA CTT GAA GAA TCT AGA CAA AAA TGT CCA CCA TGG TGG 2464 Thr Val Glu Glu Leu Glu Glu Ser Arg Gin Lys Cys Pro Pro Trp Trp 1710 1715 1720 *,TAC AGA TTT GCT CAC ACA TTT TTA ATC TGG AAT TGC TCT CCA TAT TGG 2512 Tyr Arg Phe Ala His Thr Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp 1725 1730 1735 1740 ATA AAA TTC AAA AAG CTC ATC TAT TTT ATT GTG ATG GAT CCT TTT GTA 2560 Ile Lys Phe Lys Lys Leu Ile Tyr Phe Ile Val Met Asp Pro Phe Val 1745 1750 1755 GAT CTT GCA ATT ACC ATT TGC ATA GTT TTA AAC ACC TTA TTT ATG GCT 2608 Asp Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala 1760 1765 1770 ATG GAG CAC CAC CCA ATG ACT GAA GAA TTC AAA AAT GTC CTT GCA GTG 2656 
Met Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Val 1775 1780 1785 GGG AAC TTG ATC TTT ACA GGG ATC TTC GCA GCT GAA ATG GTA CTG AAG 2704 Gly Asn Leu Ile Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys 1790 1795 1800 TTA ATA GCC ATG GAC CCC TAT GAG TAT TTC CAA GTA GGG TGG AAT ATT 2752 Leu Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile 1805 1810 1815 1820 TTT GAC AGC CTA ATT GTG ACG CTG AGT TTG ATA GAG CTT TTC CTA GCA 2800 Phe Asp Ser Leu Ile Val Thr Leu Ser Leu Ile Glu Leu Phe Leu Ala 1825 1830 1835 GAT GTG GAA GGA TTA TCA GTT CTG CGG TCA TTC AGA TTG CTC CGA GTC 2848 Asp Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val 1840 1845 1850 TTC AAG TTG GCA AAG TCC TGG CCC ACA CTG AAC ATG CTC ATT AAG ATC 2896 Phe Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile 1855 1860 1865 ATC GGC AAC TCG GTG GGC GCA CTG GGC AAC CTG ACC CTG GTG CTG GCC 2944 Ile Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala 1870 1875 1880 ATC ATC GTC TTC ATT TTT GCC GTG GTC GGC ATG CAG CTG TTT GGA AAG 2992 Ile Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys 1885 1890 1895 1900 AGC TAC AAG GAG TGT GTC TGC AAG ATC AAT GTG GAC TGC AAG CTG CCG 3040 Ser Tyr Lys Glu Cys Val Cys Lys Ile Asn Val Asp Cys Lys Leu Pro 1905 1910 1915 CGC TGG CAC ATG AAC GAC TTC TTC CAC TCC TTC CTC ATC GTG TTC CGA 3088 Arg Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg 1920 1925 1930
GTG CTG TGT GGG GAG TGG ATA GAG ACC ATG TGG GAC TGC ATG GAG GTC 3136 Val Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val 1935 1940 1945 GCG GGC CAG ACC ATG TGC CTT ATT GTT TAC ATG ATG GTC ATG GTG ATT 3184 Ala Gly Gln Thr Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile 1950 1955 1960 GGG AAC CTT GTG GTC CTG AAC CTG TTT CTG GCT CTT TTG CTG AGT TCC 3232 Gly Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser 1965 1970 1975 1980 TTT AGT TCT GAC AAT CTT ACA GCA ATT GAG GAA GAC ACC GAT GCA AAC 3280 Phe Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Thr Asp Ala Asn 1985 1990 1995 AAC CTC CAG ATC GCA GTG GCC AGA ATT AAG AGG GGA ATC AAT TAC GTG 3328 Asn Leu Gln Ile Ala Val Ala Arg Ile Lys Arg Gly Ile Asn Tyr Val 2000 2005 2010 AAA CAG ACC CTG CGT GAA TTC ATT CTA AAA TCA TTT TCC AAA AAG CCA 3376 Lys Gln Thr Leu Arg Glu Phe Ile Leu Lys Ser Phe Ser Lys Lys Pro 2015 2020 2025 AAG GGC TCC AAG GAC ACA AAA CGA ACA GCA GAT CCC AAC AAC AAG AAA 3424 Lys Gly Ser Lys Asp Thr Lys Arg Thr Ala Asp Pro Asn Asn Lys Lys 2030 2035 2040 GAA AAC TAT ATT TCA AAC CGT ACC CTT GCG GAG ATG AGC AAG GAT CAC 3472 Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Glu Met Ser Lys Asp His 2045 2050 2055 2060 AAT TTC CTC AAA GAA AAG GAT AGG ATC AGT GGT TAT GGC AGC AGT CTA 3520 Asn Phe Leu Lys Glu Lys Asp Arg Ile Ser Gly Tyr Gly Ser Ser Leu 2065 2070 2075 GAC AAA AGC TTT ATG GAT GAA AAT GAT TAC CAG TCC TTT ATC CAT AAC 3568 Asp Lys Ser Phe Met Asp Glu Asn Asp Tyr Gln Ser Phe Ile His Asn 2080 2085 2090 CCC AGC CTC ACA GTG ACA GTG CCA ATT GCA CCT GGG GAG TCT GAT TTG 3616 Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly Glu Ser Asp Leu 2095 2100 2105 GAG ATT ATG AAC ACA GAA GAG CTT AGC AGT GAC TCA GAC AGT GAC TAC 3664 Glu Ile Met Asn Thr Glu Glu Leu Ser Ser Asp Ser Asp Ser Asp Tyr 2110 2115 2120 AGC AAA GAG AAA CGG AAC CGA TCA AGC TCT TCT GAG TGC AGC ACT GTT 3712 Ser Lys Glu Lys Arg Asn Arg Ser Ser Ser Ser Glu Cys Ser Thr Val 2125 2130 2135 2140 GAC AAC CCT CTG CCA GGA GAA GAG GAG GCT GAA
GCA GAG CCC GTA AAC 3760 Asp Asn Pro Leu Pro Gly Glu Glu Glu Ala Glu Ala Glu Pro Val Asn 2145 2150 2155 GCA GAT GAG CCT GAA GCC TGC TTT ACA GAT GGT TGT GTG AGG AGA TTT 3808 Ala Asp Glu Pro Glu Ala Cys Phe Thr Asp Gly Cys Val Arg Arg Phe 2160 2165 2170 CCA TGC TGC CAA GTT AAT GTA GAC TCT GGG AAA GGG AAA GTT TGG TGG 3856 Pro Cys Cys Gln Val Asn Val Asp Ser Gly Lys Gly Lys Val Trp Trp 2175 2180 2185 ACC ATC AGG AAG ACG TGC TAC AGG ATA GTT GAA CAC AGC TGG TTT GAA 3904 Thr Ile Arg Lys Thr Cys Tyr Arg Ile Val Glu His Ser Trp Phe Glu 2190 2195 2200 AGC TTC ATC GTT CTC ATG ATC CTG CTC AGC AGT GGA GCT CTG GCT TTT 3952 Ser Phe Ile Val Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe 2205 2210 2215 2220 GAA GAT ATC TAT ATT GAA AAG AAA AAG ACC ATT AAG ATT ATC CTG GAG 4000 Glu Asp Ile Tyr Ile Glu Lys Lys Lys Thr Ile Lys Ile Ile Leu Glu 2225 2230 2235 TAT GCT GAC AAG ATA TTC ACC TAC ATC TTC ATT CTG GAA ATG CTT CTA 4048 Tyr Ala Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu 2240 2245 2250 AAA TGG GTC GCA TAT GGG TAT AAA ACA TAT TTC ACT AAT GCC TGG TGT 4096 Lys Trp Val Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 2255 2260 2265 TGG CTG GAC TTC TTA ATT GTT GAT GTG TCT CTA GTT ACT TTA GTA GCC 4144 Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val Ala 2270 2275 2280 AAC ACT CTT GGC TAC TCA GAC CTT GGC CCC ATT AAA TCT CTA CGG ACA 4192 Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu Arg Thr 2285 2290 2295 2300 CTG AGG GCC CTA AGA CCC CTA AGA GCC TTG TCT AGA TTT GAA GGA ATG 4240 Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe Glu Gly Met 2305 2310 2315 AGG GTA GTG GTC AAC GCA CTC ATA GGA GCA ATC CCT TCC ATC ATG AAC 4288 Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro Ser Ile Met Asn 2320 2325 2330 GTG CTT CTC GTG TGC CTT ATA TTC TGG CTA ATA TTT AGC ATC ATG GGA 4336 Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly 2335 2340 2345 GTC AAT CTG TTT GCT GGC AAG TTC TAT GAG TGT GTC AAC ACC ACC GAT 4384 Val Asn Leu Phe Ala Gly Lys
Phe Tyr Glu Cys Val Asn Thr Thr Asp 2350 2355 2360 GGG TCA CGA TTT CCT ACA TCT CAA GTT GCA AAC CGT TCT GAG TGT TTT 4432 Gly Ser Arg Phe Pro Thr Ser Gln Val Ala Asn Arg Ser Glu Cys Phe 2365 2370 2375 2380 GCC CTG ATG AAC GTT AGT GGA AAT GTG CGA TGG AAA AAC CTG AAA GTA 4480 Ala Leu Met Asn Val Ser Gly Asn Val Arg Trp Lys Asn Leu Lys Val 2385 2390 2395 AAC TTC GAC AAC GTT GGG CTT GGT TAC CTG TCG CTG CTT CAA GTT GCA 4528 Asn Phe Asp Asn Val Gly Leu Gly Tyr Leu Ser Leu Leu Gln Val Ala 2400 2405 2410 ACA TTC AAG GGC TGG ATG GAT ATT ATG TAT GCA GCA GTT GAC TCT GTT 4576 Thr Phe Lys Gly Trp Met Asp Ile Met Tyr Ala Ala Val Asp Ser Val 2415 2420 2425 AAT GTA AAT GAA CAG CCG AAA TAC GAA TAC AGT CTC TAC ATG TAC ATT 4624 Asn Val Asn Glu Gln Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile 2430 2435 2440 TAC TTT GTC ATC TTC ATC ATC TTC GGC TCA TTC TTC ACG TTG AAC CTG 4672 Tyr Phe Val Ile Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu 2445 2450 2455 2460 TTC ATT GGT GTC ATC ATA GAT AAT TTC AAC CAA CAG AAA AAA AAG CTT 4720 Phe Ile Gly Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu 2465 2470 2475 GGA GGT CAA GAT ATC TTT ATG ACA GAA GAA CAG AAG AAA TAC TAT AAT 4768 Gly Gly Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn 2480 2485 2490 GCA ATG AAG AAG CTT GGG TCC AAA AAA CCA CAA AAA CCA ATT CCA AGG 4816 Ala Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 2495 2500 2505 CCA GGG AAC AAA TTC CAA GGA TGT ATA TTT GAC TTA GTG ACA AAC CAA 4864 Pro Gly Asn Lys Phe Gln Gly Cys Ile Phe Asp Leu Val Thr Asn Gln 2510 2515 2520 GCT TTT GAT ATC ACC ATC ATG GTT CTT ATA TGC CTC AAC ATG GTA ACC 4912 Ala Phe Asp Ile Thr Ile Met Val Leu Ile Cys Leu Asn Met Val Thr 2525 2530 2535 2540 ATG ATG GTA GAA AAA GAG GGG CAA ACT GAG TAC ATG GAT TAT GTT TTA 4960 Met Met Val Glu Lys Glu Gly Gln Thr Glu Tyr Met Asp Tyr Val Leu 2545 2550 2555 CAC TGG ATC AAC ATG GTC TTC ATT ATC CTG TTC ACT GGG GAG TGT GTG 5008 His Trp Ile Asn Met Val Phe Ile Ile Leu Phe Thr Gly Glu Cys Val 2560
2565 2570 CTG AAG CTA ATC TCC CTC AGA CAT TAC TAC TTC ACT GTG GGT TGG AAC 5056 Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe Thr Val Gly Trp Asn 2575 2580 2585 ATT TTT GAT TTT GTG GTA GTG ATC CTC TCC ATT GTA GGA ATG TTT CTC 5104 Ile Phe Asp Phe Val Val Val Ile Leu Ser Ile Val Gly Met Phe Leu 2590 2595 2600 GCT GAG ATG ATA GAG AAG TAT TTC GTG TCC CCT ACC CTG TTC CGA GTC 5152 Ala Glu Met Ile Glu Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg Val 2605 2610 2615 2620 ATC CGC CTG GCC AGG ATT GGA CGA ATC CTA CGC CTG ATC AAA GGC GCC 5200 Ile Arg Leu Ala Arg Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly Ala 2625 2630 2635 AAG GGG ATC CGC ACT CTG CTC TTT GCT TTG ATG ATG TCC CTT CCT GCG 5248 Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Met Met Ser Leu Pro Ala 2640 2645 2650 CTG TTC AAC ATC GGC CTC CTG CTT TTC CTG GTC ATG TTC ATC TAC GCC 5296 Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Ala 2655 2660 2665 ATC TTT GGG ATG TCC AAC TTT GCC TAC GTT AAA AAG GAG GCT GGA ATT 5344 Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Lys Lys Glu Ala Gly Ile 2670 2675 2680 AAT GAC ATG TTC AAC TTT GAG ACT TTT GGC AAC AGC ATG ATC TGC TTG 5392 Asn Asp Met Phe Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu 2685 2690 2695 2700 TTC CAA ATC ACC ACC TCT GCC GGC TGG GAC GGA CTG CTG GCC CCC ATC 5440 Phe Gln Ile Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile 2705 2710 2715 CTC AAC AGC GCA CCT CCC GAC TGT GAC CCT AAA AAA GTT CAC CCA GGA 5488 Leu Asn Ser Ala Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly 2720 2725 2730 AGT TCA GTG GAA GGG GAC TGT GGG AAC CCA TCC GTG GGG ATT TTT TAC 5536 Ser Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 2735 2740 2745 TTT GTC AGC TAC ATC ATC ATA TCC TTC CTG GTG GTG GTG AAC ATG TAC 5584 Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met Tyr 2750 2755 2760 ATC GCT GTC ATC CTG GAG AAC TTC AGC GTC GCC ACC GAA GAG AGC ACT 5632 Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu Ser Thr 2765 2770 2775 2780 GAG CCT CTG AGT GAG GAC GAC TTT GAG ATG
TTC TAC GAG GTC TGG GAG 5680 Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu Val Trp Glu 2785 2790 2795 AAG TTC GAC CCT GAC GCC ACT CAG TTC ATA GAG TTC TGC AAG CTC TCT 5728 Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe Cys Lys Leu Ser 2800 2805 2810 GAC TTT GCA GCT GCC CTG GAT CCT CCC CTC CTC ATC GCA AAG CCA AAC 5776 Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro Asn 2815 2820 2825 AAA GTC CAG CTC ATT GCC ATG GAC CTG CCC ATG GTG AGT GGA GAC CGC 5824 Lys Val Gln Leu Ile Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg 2830 2835 2840 ATC CAC TGC CTG GAC ATC TTG TTT GCT TTT ACA AAG CGG GTC CTG GGT 5872 Ile His Cys Leu Asp Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly 2845 2850 2855 2860 GAG GGT GGA GAG ATG GAT TCT CTT CGT TCA CAG ATG GAA GAA AGG TTC 5920 Glu Gly Gly Glu Met Asp Ser Leu Arg Ser Gln Met Glu Glu Arg Phe 2865 2870 2875 ATG TCA GCC AAT CCT TCT AAA GTG TCC TAT GAA CCC ATC ACG ACC ACA 5968 Met Ser Ala Asn Pro Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr 2880 2885 2890 CTG AAG AGA AAA CAA GAG GAG GTG TCC GCG ACT ATC ATT CAG CGT GCT 6016 Leu Lys Arg Lys Gln Glu Glu Val Ser Ala Thr Ile Ile Gln Arg Ala 2895 2900 2905 TAC AGA CGG TAT CGC CTC AGA CAA CAC GTC AAG AAT ATA TCG AGT ATA 6064 Tyr Arg Arg Tyr Arg Leu Arg Gln His Val Lys Asn Ile Ser Ser Ile 2910 2915 2920 TAC ATA AAA GAT GGA GAC AGG GAT GAT GAT TTG CCC AAT AAA GAA GAT 6112 Tyr Ile Lys Asp Gly Asp Arg Asp Asp Asp Leu Pro Asn Lys Glu Asp 2925 2930 2935 2940 ACA GTT TTT GAT AAC GTG AAC GAG AAC TCA AGT CCG GAA AAG ACA GAT 6160 Thr Val Phe Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp 2945 2950 2955 GTA ACT GCC TCA ACC ATC TCG CCA CCT TCC TAT GAC AGT GTC ACA AAG 6208 Val Thr Ala Ser Thr Ile Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys 2960 2965 2970 CCA GAT CAA GAG AAA TAT GAA ACA GAC AAA ACA GAG AAG GAA GAC AAA 6256 Pro Asp Gln Glu Lys Tyr Glu Thr Asp Lys Thr Glu Lys Glu Asp Lys 2975 2980 2985 GAG AAA GAT GAA AGC AGG AAA TAGAGCTTTG GTTTTGATAC ACTGTTGACA 6307 Glu Lys Asp Glu Ser Arg Lys
2990 2995 GCCTGTGAAG GTTGACTCAC TCGTGTTAGT AAGACTCTTT TACGGAGGTC TATCCAAACT 6367 CTTTTATCAA AAATTCTCAA GGCAGCACAG CCATTAGCTC TGATCCAACG AGGCAGAGGG 6427 CAGCATTTAC ACATGGCTAT GTTTT 6452 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1984 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr 1 5 10 Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ser Glu Glu Lys Ala Lys Glu His
p 9* 9 9999 9 99*9 9. .9 9 99.9 999*99 9 Pro Asp Tyr Ile Ser Ser Leu 145 Gly Cys Phe As n Ile 225 Ser Se r His Ile Tyr 305 Ser Ser Ile Tyr Phe Pro Met 130 Ser Ile ValI Val1 Val1 210 Ser Val Val Lys Met 290 Leu Gly Ser Pro Ala Arq Leu 115 Leu Asn Tyr Gly Val1 195 Ser Val1 Lys Phe Cys 275 As n Giu Gln Lys Asp Pro Asp Phe 100 Arg Ile Pro Thr Glu 180 Ile Ala Ile Lys Ala 260 Phe Thr Gly Cys Le u Gly Lys As n Arg Met Pro Phe 165 Phe Val1 be u Pro Leu 245 Le u Arg Ala Se r Pro 325 Asp Giu Lys Lys Giu Met 70 Lys Al a Ile Cys Glu 150 Giu Thr Phe Arg Gly 230 Ser Ile Lys Glu Lys 310 Glu Ala 55 Val1 Thr Thr Ser Thr 135 Trp Ser Phe Ala Thr 215 Le u Asp Gly Glu Ser 295 Asp Gly 40 Gly Ser Phe Pro Ile 120 Ile Thr Leu Leu T yr 200 Phe Lys Val1 Leu be u 280 Glu Al a T yr 25 Asp Lys Giu Ile Ala 105 Lys Leu Lys Ile Arg 185 Leu Arg Thr Met Gin 265 Glu Glu Leu Ile Asp Gin Pro Val1 90 Leu Ile Thr Asn Lys 170 Asp Thr Val1 Ile Ile 250 Leu Glu Glu be u Cys 330 *Glu Leu be u 75 Leu Tyr beu As n Val1 155 Ile Pro Glu be u Val1 235 beu Phe As n beu Cys 315 Val1 Glu Pro Glu Asn Met Val1 Cys 140 Glu Leu T rp Phe Arg 220 Gly Thr Met Glu Lys 300 Gi y Lys Giu Phe Asp bys beu His 125 Ile Tyr Ala Asn Val1 205 Al a Ala Val Gly Thr 285 Lys Phe Ala Gly Ile beu Gly Ser 110 Ser Phe Thr Arq Trp 190 As n beu beu Phe Asn 270 beu Tyr Ser Gly Pro Tyr Asp bys Pro be u Met Phe Gly 175 Leu beu Lys Ile Cys 255 beu Glu Phe Thr Arg 335 Lys Gly Pro Al a Phe Phe Thr Thr 160 Phe Asp Gly Thr Gin 240 beu Lys Ser Tyr Asp 320 As n Pro Asp Tyr Gly 340 Tyr Thr Ser Phe Asp 345 Thr Phe Ser Trp Ala Phe Leu Ala Leu Phe Arg Leu Met Thr Gin Asp Tyr Trp Glu Asn Leu Tyr Gin 360 **too 099* 6 to ob.
Gin Val 385 Val1 Lys Giu Ser Thr 465 Lys Giu Phe Ser Ala 545 Arg Phe Pro Pro Gly 625 Gly Ser Thr 370 Ile Ala Gin Gin Ile 450 Se r Lys Lys His Thr 530 Arg Asp Gly Arg Val1 610 Val1 Gin Gi y *Leu Phe Met Lys Giu 435 -Gly Arg Lys Leu Leu 515 Pro Arg Leu Asp Giu 595 Leu Val1 Leu Thr Arg Leu Ala Giu 420 Giu Arg Le u Lys Se r 500 Gi y Asn Ser Gly As n 580 Arg Pro Ser Leu Thr 660 Ala Gly Tyr 405 Leu Ala Ser Ser Gin 485 Lys Val1 Gin Ser Ser 565 Glu Arg Val1 Leu Pro 645 As n Al a Ser 390 Giu Giu Giu Arg Ser 470 Lys Ser Giu Ser Arg 550 Giu Ser Ser Asn Val 630 Giu Gin Gly 375 Phe Giu Phe Ala Ile 455 Lys Met Gly Gly Pro 535 Thr Thr Arq Ser Gly 615 Asp Val1 Me t Lys Tyr Gin Gin Ile 440 Met Ser Ser Ser His 520 Leu Ser Giu Arg As n 600 Lys Gly Ile Arq Thr Leu As n Gin 425 Ala Gly Ala Ser Giu 505 His Ser Leu Phe Gly 585 Ile Met Pro Ile Lys 665 *Tyr Ile Gin 410 Met Ala Leu Lys Gly 490 Giu Arg Ile Phe Ala 570 Ser Se r His Se r Asp 650 Lys Met As n 395 Ala Leu Ala Ser Giu 475 Giu Ser Thr Arg Ser 555 Asp Leu Gin Ser Ala 635 Lys Arg Ile 38C Leu Asn Asp Ala Gi~u 460 Arg Giu Ile Arg Gly 540 Phe Asp Phe Ala Ala 620 Leu Ala Leu 365 Phe Ile Ile Arg Ala 445 Ser Arg Lys Arg Giu 525 Ser Lys Giu Val1 Ser 605 ValI Met Thr Ser Arg 685 Phe Leu Giu Leu 430 Giu Ser Asn Gi y Lys 510 Lys Leu Gly His Pro 590 Arg Asp Leu Ser Ser 670 Val1 Ala Giu 415 Lys Phe Ser Arg Asp 495 Lys Arg Phe Arg Ser 575 His Se r Cys Pro Asp 655 Ser Val1 Val1 400 Ala Lys Thr GlC Arg 480 Asp Ser Leu Ser Gly 560 Ile Arg Pro Asn Asn 640 Asp Tyr Phe Leu Ser Giu Asp Met Leu Asn Asp Pro His Leu Gin Arq Ala Met Ser Arg Ala Ser Ile Leu Thr Asn 690 695 Sex 705 Len Tyr Ile Gin Ile 785 Gin Leu Leu Pro Len 865 Val1 Lys Phe Gin Ile 945 Leu Ala Arg Arg Ile Phe Val1 Gin 770 Phe Tyr Ser Arg Thr 850 Gly Val Ile His Thr 930 Val Phe Ile Ile Gin Trp Ile Leu 755 Phe Aia Phe Len Ser 835 Len As n Gly As n Se r 915 Tyr Leu 3lu 995 *Lys Asn Vai 740 Asn Lys Aila Gin Ile 820 Phe Asn Leu Met Val1 900 Phe Trp Met Ala Gin 980 Arg Cys Cys 725 Met Thr As n 
Gin Val1 805 Gin Arg Met Thr Gin 885 Asp Leu Asp Met Len 965 A~sp Gly Pro 710 Ser Asp Leu Val1 Met 790 Gly Leu Leu Len Leu 870 Len Cys Ile Cys Val1 950 Len Thr Ile Pro Pro Pro Phe Len 775 Val1 Trp Phe Leu Ile 855 Val1 Phe Lys Val Met 935 Met Len Asp Asn Lys 1015 Trp Trp Tyr Trp Phe Val 745 Met Aia 760 Ala Vai Len Lys Asn Ile Len Ala 825 Arg Val 840 Lys Ile Leu Ala Giy Lys Len Pro 905 Phe Arg 920 Gin Vai Val Ile Ser Ser Ala Asn 985 Tyr Val 1000 -73- Thr Val Gin 700 Tyr Arg Phe 715 Ile Lys Phe 730 Asp Leu Ala Met Giu His Gly Asn Len 780 Leu Ile Ala 795 Phe Asp Ser 810 Asp Val Giu Phe Lys Leu Ile Giy Asn 860 Ile Ile Vai 875 Ser Tyr Lys 890 Arg Trp His Val Leu Cys Ala Gly Gin 940 Gly Asn Len 955 Phe Ser Ser 970 A'sn Leu Gin Lys Gin Thr Gin Leu Ala His Lys Lys Ile Thr 750 His Pro 765 Ile Phe Met Asp Len Ile Gly Leu 830 Ala Lys 845 Ser Val Phe Ile Gin Cys Met Asn 910 Gly Giu 925 Thr Met Val Val Asp Asn Ile Ala 990 Leu Arq 1005 Giu Thr Leu 735 Ile Met Thr Pro ValI 815 Se r Ser Gi y Phe Val1 895 Asp Trp Cys Len Len 975 Val1 Gln *Gin Phe 720 Ile Cys Thr Gly Tyr- 800 Thr Val1 Trp Al a Ala 880 Cys Phe Ile Len As n 960 Thr Ala Phe Ile Len Lys Ser Phe Ser 1010 Lys Pro Lys Gly Ser Lys Asp Thr Lys 1020 -74- Arq Thr Ala Asp Pro Asn Asn Lys Lys Glu Asn Tyr Ile Ser Asn Arg 1025 1030 1035 1040 Thr Leu Ala Giu Met Ser Lys Asp His Asn Phe Leu Lys Giu Lys Asp 1045 1050 1055 Arg Ile Ser Gly Tyr Gly Ser Ser Leu Asp Lys Ser Phe Met Asp Giu 1060 1065 1070 Asn Asp Tyr Gin Ser Phe Ile His Asn Pro Ser Leu Thr Val Thr Val 1075 1080 1085 Pro Ile Ala Pro Giy Giu Ser Asp Leu Giu Ile Met Asn Thr Giu Giu 1090 1095 1100 Leu Ser Ser Asp Ser Asp Ser Asp Tyr Ser Lys Giu Lys Arg Asn Arg 1105 1110 1115 1120 Ser Ser. 
Ser Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu 1125 1130 1135 Glu Glu Ala Glu Ala Glu Pro Val Asn Ala Asp Glu Pro Glu Ala Cys 1140 1145 1150 Phe Thr Asp Gly Cys Val Arg Arg Phe Pro Cys Cys Gln Val Asn Val 1155 1160 1165 Asp Ser Gly Lys Gly Lys Val Trp Trp Thr Ile Arg Lys Thr Cys Tyr 1170 1175 1180 Arg Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val Leu Met Ile 1185 1190 1195 1200 Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile Tyr Ile Glu Lys 1205 1210 1215 Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala Asp Lys Ile Phe Thr 1220 1225 1230 Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Val Ala Tyr Gly Tyr 1235 1240 1245 Lys Thr Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu Ile Val 1250 1255 1260 Asp Val Ser Leu Val Thr Leu Val Ala Asn Thr Leu Gly Tyr Ser Asp 1265 1270 1275 1280 Leu Gly Pro Ile Lys Ser Leu Arg Thr Leu Arg Ala Leu Arg Pro Leu 1285 1290 1295 Arg Ala Leu Ser Arg Phe Glu Gly Met Arg Val Val Val Asn Ala Leu 1300 1305 1310 Ile Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile 1315 1320 1325 Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys 1330 1335 1340 Phe Tyr Glu Cys Val Asn Thr Thr Asp Gly Ser Arg Phe Pro Thr Ser 1345 1350 1355 1360 Gln Val Ala Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gly 1365 1370 1375 Asn Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1380 1385 1390 Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Met Asp 1395 1400 1405 Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asn Glu Gln Pro Lys 1410 1415 1420 Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Ile Phe Ile Ile 1425 1430 1435 1440 Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile Ile Asp 1445 1450 1455 Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly Gln Asp Ile Phe Met 1460 1465 1470 Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu Gly Ser 1475 1480 1485 Lys Lys Pro Gln Lys Pro Ile Pro Arg Pro Gly Asn Lys Phe Gln Gly 1490 1495 1500 Cys Ile Phe Asp Leu Val Thr Asn Gln Ala Phe Asp Ile Thr Ile Met 1505
1510 1515 1520 Val Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Lys Glu Gly 1525 1530 1535 Gln Thr Glu Tyr Met Asp Tyr Val Leu His Trp Ile Asn Met Val Phe 1540 1545 1550 Ile Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg 1555 1560 1565 His Tyr Tyr Phe Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val 1570 1575 1580 Ile Leu Ser Ile Val Gly Met Phe Leu Ala Glu Met Ile Glu Lys Tyr 1585 1590 1595 1600 Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly 1605 1610 1615 Arg Ile Leu Arg Leu Ile Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1620 1625 1630 Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu Leu 1635 1640 1645 Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser Asn Phe 1650 1655 1660 Ala Tyr Val Lys Lys Glu Ala Gly Ile Asn Asp Met Phe Asn Phe Glu 1665 1670 1675 1680 Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile Thr Thr Ser Ala 1685 1690 1695 Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Ala Pro Pro Asp 1700 1705 1710 Cys Asp Pro Lys Lys Val His Pro Gly Ser Ser Val Glu Gly Asp Cys 1715 1720 1725 Gly Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Ser Tyr Ile Ile Ile 1730 1735 1740 Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu Glu Asn 1745 1750 1755 1760 Phe Ser Val Ala Thr Glu Glu Ser Thr Glu Pro Leu Ser Glu Asp Asp 1765 1770 1775 Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp Ala Thr 1780 1785 1790 Gln Phe
Ile Glu Phe Cys Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp 1795 1800 1805 Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met 1810 1815 1820 Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu 1825 1830 1835 1840 Phe Ala Phe Thr Lys Arg Val Leu Gly Glu Gly Gly Glu Met Asp Ser 1845 1850 1855 Leu Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1860 1865 1870 Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu Glu 1875 1880 1885 Val Ser Ala Thr Ile Ile Gln Arg Ala Tyr Arg Arg Tyr Arg Leu Arg 1890 1895 1900 Gln His Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp Gly Asp Arg 1905 1910 1915 1920 Asp Asp Asp Leu Pro Asn Lys Glu Asp Thr Val Phe Asp Asn Val Asn 1925 1930 1935 Glu Asn Ser Ser Pro Glu Lys Thr Asp Val Thr Ala Ser Thr Ile Ser 1940 1945 1950 Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro Asp Gln Glu Lys Tyr Glu 1955 1960 1965 Thr Asp Lys Thr Glu Lys Glu Asp Lys Glu Lys Asp Glu Ser Arg Lys 1970 1975 1980 INFORMATION FOR SEQ ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 1989 amino acids TYPE: amino acid STRANDEDNESS: not relevant TOPOLOGY: not relevant (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr
Lys Lys Pro Asp Tyr Ile Ser Ser Xaa 145 Gly Cys Phe Asn Ile 225 Ser Ser His Ile Gir Glu Ser Ile T yr Phe Pro Met 130 Xaa Ile Val1 Val1 Vali 210 Ser Val1 Val1 Lys Me t Ser IXaa Ser Pro Ala Arg Len 115 Len As n Tyr Gly Val1 195 Ser Val1 Lys Phe Cys 275 Asn Leu Lys Asp Pro Asp Phe 100 Arq Ile Pro Th r Gin 180 Ile Al a Ile Lys Al1a 260 Phe Tr Ala Xaa Len Giy Lys 85 Asn Arg Met Pro Phe 165 Phe Val1 Len Pro Len 245 Leu Arg Xaa Len Glu Gin Met 70 Lys Ala Ile Cys Xaa 150 Gin Thr Phe Arg Gi y 230 Ser Ile Xaa Glu Ile Lys Al a 55 Val1 Thr Thr Ser Thr 135 T rp Ser Phe Al a Thr 215 Len A~sp Xaa Ser Gin Lys 40 Gly Ser Phe Pro Ile 120 Ile Thr Len Len Tyr 200 Phe Lys Vail Len Leu 280 Gin Gin 25 Asp Lys Gin Ile Ala 105 Lys Len Lys Xaa Arg 185 Len A.rg I'hr Me t Gln 265 1lu 1lu Arg Asp Gin Pro Val1 90 Leu Ile Thr Asn Lys 170 Asp Thr Val Ile Ile 250 Leu Xaa Xaa Ile Xaa Len Len 75 Len Tyr Leu Asn Val1 155 Ile Pro Giu Len Val1 235 Len Phe %sn <aa Xaa Gin Pro Gin Asn Met Val1 Cys 140 Xaa Len T rp Phe Arg 220 Gly Thr M4et Glu Xaa Gin Gin Phe Asp Lys Leu His 125 Ile Tyr Ala Asn Val 205 Ala Ala Val Gly Thr 285 Lys Xaa Xaa Ile Leu Gly Ser 110 Ser Phe Thr Arg Trp .190 \s n Leu Len The Bs n 270 Leu ['yr Lys Pro Tyr Asp Lys Pro Leu Met Phe Gly 175 Len Len Lys Ile Cys 255 Len Gin Phe Xaa Lys Gly Pro Xaa Phe Phe Thr Thr 160 Phe Asp Gly Thr Gin 240 Len Lys Se r T'yr 295 295 300 Leu Giu Gly Ser
a Ser Pro Ala Gin Val1 385 Val1 Lys Giu Ser Thr 465 Lys Xaa Ser Le u Ser 545 Giy Ile Arg Giy Asp Leu Thr 370 Ile Aia Gin Gin Ile 450 Ser Lys Giu Phe Ser 530 Ala Arg Phe Pro Gin Tyr Phe 355 Leu Phe Met Lys Giu 435 Xaa Xaa Lys Lys His 515 Thr Arg Asp Gly Xaa 595 cys Gi 34C Arg Arg Leu Ala Giu 420 Giu Arg Leu Xaa Leu 500 Leu Pro Arg Xaa Asp 580 Glu Pro 325 Tyr Leu Ala Gly Tyr 405 Leu Aia Ser Ser Gin 485 Ser Gly Asn Ser Giy 565 Asn Arg Lys 310 Giu Thr Met Ala Ser 390 Giu Giu Giu Arg Ser 470 Lys Lys Vai Gin Ser 550 Ser Giu Pirg Asp Aia Leu Gly Tyr Xaa Ser Phe Asp 345 Thr Gin Asp 360 Gly Lys Thr 375 Phe Tyr Leu Giu Gin Asn Phe Gin Gin 425 Aia Ile Ala 440 Ile Met Giy 455 Lys Ser Ala Lys Xaa Ser Ser Xaa Ser 505 Giu Gly His 520 Ser Pro Leu 535 Arg Thr Ser Giu Thr Giu Ser Arg Arg 585 Ser Ser Asn 600 Leu Cys 330 Thr Tyr Tyr Ile Gin 410 Met Al a Leu Lys Ser 490 Giu Xaa Ser Leu The 570 Gly Ile ICys 315 Vali Phe Trp Met Asn 395 Ala Le u Aia Ser Giu 475 Giy Xaa Arg Ile Phe 555 Ala Ser Ser Lys Ser Giu Ile 380 Le u As n Asp Aia Giu 4.60 Arg Giu Ser Xaa Arg 540 Ser Asp Leu iln Xaa Trp As n 365 Phe Ile Ile Arg Aila 445 Ser Arq Giu Ile Xaa 525 Gly Phe Asp Phe Ala 605 Giy Ala 350 Leu Phe Leu Giu Leu 430 Giu Ser As n Lys Arq 510 Giu Ser Lys Glu Vail 590 Ser Gly Phe Ser Thr Arg 335 Phe T yr Vali Al a Glu 415 Lys Xaa Ser Arg Giy 495 Xaa Lys [.eu Giy -His 575 Pro krg *Asp 320 Asn Leu Gin Val Val1 -4 00 Ala Lys Thr Giu Arg 480 Asp Lys Arg Phe Arg 560 Ser His Ser Pro Pro Xaa Leu Pro Val Asn Gly Lys Met His Ser Aia Val Asp Cys 610McCo Asn Giy Val Val Ser Leu Val Asp Gly Xaa Ser Ala Leu Met Leu Pro 625 630 635 640 Asn Gly Gin Leu Leu Pro Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 645 650 655 Xaa Xaa Gly Thr Thr Asn Gin Xaa Xaa Lys Lys Arq Xaa Xaa Ser Ser 660 665 670 Tyr Xaa Leu Ser Giu Asp Met Leu Asn Asp Pro Xaa Leu Arq Gin Arg 675 680 685 Ala Met Ser Arq Ala Ser Ile Leu Thr Asn Thr Val Glu Giu Leu Glu 690 695 700 Giu Ser Arg Gin Lys Cys Xaa Xaa Xaa Xaa Tyr Arq Phe Ala His Xaa 705 710 715 720 
Phge Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile Lys Phe Lys Lys -Xaa ::725 730 735 I.:le Tyr Phe Ile Vai Met Asp Pro Phe Val Asp Leu Ala Ile Thr Ile 740 745 750 *Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met Giu His His Pro Met *755 760 765 Thr Giu Glu Phe Lys Asn Vai Leu Ala Xaa Gly Asn Leu Xaa Phe Thr 770 775 780 *Gly Ile Phe Ala Ala Giu Met Val Leu Lys Leu Ile Ala Met Asp Pro 785 790 795 800 Tyr Giu Tyr Phe Gin Vai Gly Trp Asn Ile Phe Asp Ser Leu Ile Val .4~*805 810 815 Thr Leu Ser Leu Xaa Giu Leu Phe Leu Ala Asp Val Giu Gly Leu Ser 820 825 830 *.*Val Leu Arq Ser Phe Arg Leu Leu Arq Vai Phe Lys Leu Ala Lys Ser 835 840 845 Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile Gly Asn Ser Val Gly 850 855 860 Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile Ile Val Phe Ile Phe 865 870 875 880 Ala Val Val Gly Met Gin Leu Phe Gly Lys Ser Tyr Lys Giu Cys Val 885 890 895 Cys Lys Ile Asn Xaa Asp Cys Xaa Leu Pro Arq Trp His Met Asn Asp 900 905 910 Phe Phe His Ser Phe Leu Ile Val Phe Arg Val Leu Cys Gly Giu Trp 915 920 925 Ile Glu Thr Met Trp Asp Cys Met Giu Val Ala Gly Gin Xaa Met Cys 930 935 940 Leu Ile Val Tyr Met Met Val Met Val Ile Gly Asn Leu Val Val Leu 945 950 955 960 Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe Ser Ser Asp Asn Leu 965 970 975 Thr Ala Ile Glu Glu Asp Xaa Asp Ala Asn Asn Leu Gin Ile Ala Val 980 985 990 Xaa Arg Ile Lys Xaa Gly Ile Asn Tyr Val Lys Gin Thr Leu Arg Glu 995 1000 1005 Phe Ile Leu Lys Xaa Phe Ser Lys Lys Pro Lys Xaa Ser Xaa Xaa Xaa 1010 1015 1020 Xaa Xaa Xaa Xaa Asp Xaa Asn Xaa Lys Lys Glu Asn Tyr Ile Ser Asn 1025 1030 1035 1040 Xaa Thr Leu Ala Giu Met Ser Lys Xaa His Asn Phe Leu Lys Giu Lys 1045 1050 1055 Asp Xaa Ile Ser Gly Xaa Gly Ser Ser Xaa Asp Lys Xaa Xaa Met -Xaa 1060 1065 1070 Xaa Xaa Asp Xaa Gln Ser Phe Ile His Asn Pro Ser Leu Thr Val Thr 1075 1080 1085 *Val Pro Ile Ala Pro Giy Giu Ser Asp Leu Glu Xaa Met Asn Xaa Giu 1090 1095 1100 Leu Ser Ser Asp Ser Asp Ser Xaa Tyr Ser Lys Xaa Xaa Xaa Asn 1105 1110 1115 1120 Ser Ser Ser Ser Glu Cys Ser Thr Val Asp 
Asn Pro Leu Pro Gly *1125 1130 1135 Glu Giy Giu Giu Ala Glu Ala Giu Pro Xaa Asn Xaa Asp Glu Pro Glu *1140 1145 1150 *Ala Cys Phe Thr Asp Gly Cys Val Arg Arg Phe Xaa Cys Cys Gin Val 1155 1160 1165 *Asn Xaa Xaa Ser Gly Lys Gly Lys Xaa Trp Trp Xaa Ile Arq Lys Thr 1170 1175 1180 *Cys Tyr Xaa Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val Leu 1185 1190 1195 1200 Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Giu Asp Ile Tyr Ile 1205 1210 1215 Giu Xaa Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala Asp Lys Ile 1220 1225 1230 Phe Thr Tyr Ile Phe Ile Leu Giu Met Leu Leu Lys Trp Xaa Ala Tyr 1235 1240 1245 Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu 1250 1255 1260 Ile Val Asp Val Ser Leu Val Thr Leu Val Ala Asn Thr Leu Gly Tyr 1265 1270 1275 1280 Ser Asp Leu Gly Pro Ile Lys Ser Leu Arg Thr Leu Arq Ala Leu Arg 1285 1290 1295 -81- Pro Leu Arg Ala Leu Ser Arg Phe Giu Gly Met Arq Val Vai Val Asn 1300 1305 1310 Ala Leu Ile Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys 1315 1320 1325 Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala 1330 1335 1340 Gly Lys Phe Tyr Glu Cys Xaa Asri Thr Thr Asp Gly Ser Arg Phe Pro 1345 1350 1355 1360 Xaa Ser Gln Val Xaa Asn Arg Ser Giu Cys Phe Ala Leu Met Asn Val 1365 1370 1375 Ser Xaa Asn Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val 1380 1385 1390 Giy Leu Gly Tyr Leu Ser Leu Leu Gin Val Ala Thr Phe Lys Giy ffrp 1395 1400 1405 Xaa Xaa Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Xaa Xaa Gin *1410 1415 1420 *.Pro Lys Tyr Giu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Vai Xaa Phe 1425 1430 1435 1440 Sle Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Vai Ile 1445 1450 1455 IAsp Asn P..e AnGin Gl ys Lys Lys Leu Gly Gly GnAsp Ile *1460 1465 1470 Phe Met Thr Giu Giu Gin Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu 1475 1480 1485 Gly Ser Lys Lys Pro Gin Lys Pro Ile Pro Arq Pro Gly Asn Lys Xaa 1490 1495 1500 Gin Gly Cys Ile Phe Asp Leu Val Thr Asn Gin Ala Phe Asp Ile Xaa 1505 1510 1515 1520 *lie Met Val Leu Ile Cys Leu Asn 
Met Val Thr Met Met Val Giu Lys 1525 1530 1535 Giu Gly Gin Xaa Xaa Xaa Met Xaa Xaa Val Leu Xaa Trp Ile Asn Xaa 1540 1545 1550 Vai Phe Ile Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser 1555 1560 1565 Leu Arg His Tyr Tyr Phe Thr Val Gly Trp Asn Ile Xaa Xaa Phe Val 1570 1575 1580 Vai Val Ile Xaa Ser Ilie Vai Gly Met Phe Leu Ala Xaa Xaa Ile Giu 1585 1590 1595 1600 Xaa Tyr Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg 1605 1610 1615 Ile Gly Arg Ile Leu Arq Leu Xaa Lys Gly Ala Lys Giy Ile Arg Thr 1620 1625 1630 -82- Leu Leu Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly 1635 1640 1645 Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1650 1655 1660 Asn Phe Ala Tyr Val Lys Lys Giu Xaa Gly Ile Asn Asp Met Phe Asn 1665 1670 1675 1680 Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile Thr Thr 1685 1690 1695 Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Xaa Pro 1700 1705 1710 Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser Ser Val Giu Gly 1715 1720 1725 Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Ser Tyr -Ile 1730 1735 1740 Ile Ile Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu .::1745 1750 1755 1760 *9Giu Asn Phe Ser Val Ala Thr Giu Giu Ser Thr Glu Pro Leu Ser Giu :..1765 1770 1775 *Asp Asp Phe Glu Met Phe Tyr Glu Vai Trp Glu Lys Phe Asp Pro Asp 1780 1785 1790 *Ala Thr Gin Phe Ile Giu Phe Xaa Lys Leu Ser Asp Phe Ala Ala Ala 9: *1795 1800 1805 Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gin Leu Ile *1810 1815 1820 *Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp 1825 1830 1835 1840 I ~le Leu Phe Ala Phe Thr Lys Arg Val Leu Gly Giu Xaa Gly Giu Met 1845 1850 1855 *9Asp Ser Leu Arg Ser Gin Met Giu Giu Arq Phe Met Ser Ala Asn Pro 1860 1865 1870 Ser Lys Val Ser Tyr Giu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gin 1875 1880 1885 Glu Xaa Val Ser Ala Thr Xaa Ile Gin Arg Ala Tyr Arg Arg Tyr Arg 1890 1895 1900 Leu Arq Gin Xaa Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp Gly 1905 1910 1915 1920 Asp 
Arg Asp Asp Asp Leu Xaa Asn Lys Xaa Asp Xaa Xaa Phe Asp Asn 1925 1930 1935 Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Xaa Thr Xaa Ser Thr 1940 1945 1950 Xaa Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro Asp Xaa Glu Lys 1955 1960 1965 Tyr Glu Xaa Asp Xaa Thr Glu Lys Glu Asp Lys Xaa Lys Asp Ser Lys 1970 1975 1980 Glu Ser Xaa Lys Xaa 1985
INFORMATION FOR SEQ ID NO:12:
SEQUENCE CHARACTERISTICS: LENGTH: 1989 amino acids; TYPE: amino acid; STRANDEDNESS: not relevant; TOPOLOGY: not relevant
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr
0 0 0 0 *0 *5 *5 0 0 0* 4 0 *e 0 4L Lys Lys Pro Asp Tyr Ile Ser Ser Leu 145 Gly Cys Phe Gin Glu Ser 50 lie Tyr Phe Pro Met 130 Ser Ile Val Val Ser Leu 20 His Lys 35 Ser Asp Pro Pro Ala Asp Arg Phe 100 Leu Arg 115 Leu Ile Asn Pro Tyr Thr Gly Glu 180 Val Ile 195 Ala Asp Leu Gly Lys 85 Asn Arg Met Pro Phe 165 Phe Val Leu Ile Glu Lys Glu Ala 55 Met Val 70 Lys Thr Ala Thr Ile Ser Cys Thr 135 Glu Trp 150 Glu Ser Thr Phe Phe Ala Glu Lys 40 Gly Ser Phe Pro Ile 120 Ile Thr Leu Leu Tyr 200 Gin 25 Asp Lys Glu Ile Ala 105 Lys Leu Lys Ile Arg 185 Leu 10 Arg Ile Asp Glu Gin Leu Pro Leu 75 Val Leu 90 Leu Tyr Ile Leu Thr Asn Asn Val 155 Lys Ile 170 Asp Pro Thr Glu Ser Glu Pro Glu Asn Met Val Cys 140 Gly Leu Trp Phe Arg 220 Glu Glu Phe Asp Lys Leu His 125 Ile Tyr Ala Asn Val 205 Glu Gly Ile Leu Gly Ser 110 Ser Phe Thr Arg Trp 190 Asn Lys Ala Pro Lys Tyr Gly Asp Pro Lys Ala Pro Phe Leu Phe Met Thr Phe Thr 160 Gly Phe 175 Leu Asp Leu Gly Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu 210 215 Ala Leu Lys Thr Ile Ser Val Ile Pro 225 Ser Val Ser Val His Lys Ile Met 290 Tyr Leu 305 Sex Giy Pro Asp Ala Leu Gin Thr 370 Val Ile 385 Val Ala Lys Gin Giu Gin Ser Ile 450 Thr Ser 465 Lys Lys Asp Giu Ser Phe Leu Ser 530 Lys Phe Cys 275 Asn Giu Gin Tyr Phe 355 Leu Phe Met Lys Giu 435 Arg Arg Lys Lys His 515 Thr Lys Leu 245 Ala Leu 260 Phe Arg Thr Ala Gly Ser Cys Pro 325 Giy Tyr 340 Arg Leu Arg Ala Leu Gly Ala Tyr 405 Glu Leu 420 Giu Ala Arg Ser Leu Ser Lys Gin 485 Leu Sex 500 Leu Gly Pro Asn Gly 230 Ser Ile Lys Glu Lys 310 Giu Thr Met Ala Sex 390 Giu Giu Giu Arg Ser 470 Lys Lys Val Gin Leu Lys Thx Ile Asp Val Gly Leu Giu Leu 280 Ser Giu 295 Asp Ala Giy Tyr Ser Phe Thx Gin 360 Gly Lys 375 Phe Tyr Giu Gin Phe Gin Ala Ile 440 Ile Met 455 Lys Ser Xaa Met Ser Giy Giu Gly 520 Ser Pro 535 Met Ile 250 Gin Leu 265 Giu Giu Giu Giu .Leu Leu Ile Cys 330 Asp Thr 345 Asp Tyr Thr Tyr Leu Ile Asn Gin 410 Gin Met 425 Ala Ala Giy Leu Ala Lys Ser Ser 490 Sex Giu 505 His His Leu Ser Val1 235 Leu Phe Asn Leu Cys 315 Val1 
Phe Trp Met Asn 395 Ala Leu Ala Sex Giu 475 Gly Giu Arg Ile Gly Ala Thr Val Met Giy Giu Thr 285 Lys Lys 300 Gly Phe Lys Ala Ser Trp Giu Asn 365 Ile Phe 380 Leu Ile Asn Ile Asp Axg Ala Ala 445 Giu Ser 460 Arg Arq Giu Giu Sex Ile Thx Arg 525 Arg Gly 540 Leu Phe As n 2-70 Le u T yr Sex Gly Ala 350 Leu Phe Leu Giu Leu 430 Giu Ser Asn Lys Axq 510 Glu Ser Ile Gin 240 Cys Leu 255 Leu Lys Glu Ser Phe Tyr Thx Asp 320 Axg -Asn 335 Phe Leu Tyr Gin Val Val Ala Val 400 Giu Ala 415 Lys Lys Phe Thr Sex Giu Axg Arq 480 Gly Asp 495 Lys Lys Lys Arg Leu Phe Ser Ala Axg Arg Ser Sex Arg Thr Sex Leu Phe Sex Phe Lys Giy 545 550 555 Gly Arg Asp Leu Gly 65 Ser Giu Thr Giu Phe Ala Asp Asp Giu His 570 Ile Arq Pro Asn 625 Asn Asp Tyr Ala Giu 705 Phe Ile Cys Thr Gly 785 Tyr Thr Val1 T rp Ala 865 Al a Phe Gly Pro Arg 595 Pro Val 610 Gly Val Gly Gin Ser Gly Phe Leu 675 Met Ser 690 Ser Arg Leu Ile Tyr Phe Ile Val 755 Glu Glu 770 Ile Phe Glu Tyr Leu Ser Leu Arg 835 Pro Thr 850 Leu Gly Val Val Asp 580 Giu Leu Val1 Leu Thr 660 Ser Arg Gin Trp Ile 740 Leu Phe Ala Phe Leu 820 Se r Leu As n Gly Asn Glu Arq Arq Pro Val Ser Leu 630 Leu Pro 645 Thr Asn Giu Asp Ala Ser Lys Cys 710 Asn Cys 725 Val Met Asn Thr Lys Asn Ala Glu 790 Gin Val 805 Ile Glu Phe Arg Asn Met Leu Thr 870 Met Gin 885 Ser Ser Asn 615 Val1 Giu Gin Met Ile 695 His Ser Asp Leu Val 775 Met Gly Leu Leu Leu 855 Leu Leu Arg Arg 585 Ser Asn 600 Gly Lys Asp Gly Val Ile Met Arg 665 Leu Asn 680 Leu Thr Gin Leu Pro Tyr Pro Phe 745 Phe Met 760 Leu Ala Val Leu Trp Asn Phe Leu 825 Leu Arq 840 Ile Lys Val Leu Phe Gly Gly Ile Met Pro Ile 650 Lys Asp Asn be u Trp 730 Val1 Al a Val1 Lys Ile 810 Ala Val1 Ile Ala Lys 890 Ser Ser His Ser 635 Asp Lys Pro Th r Tyr 715 Ile Asp Met Gi y Leu 795 Phe Asp Phe Ile Ile 875 Se r Leu Gin Ser 620 Al a Lys Arq His Val1 700 Arg Lys Leu Glu Asn 780 Ile Asp Val1 Lys Gly 860 Ile Tyr Phe Ala 605 Ala Leu Ala Leu Leu 685 Glu Phe Phe Ala His 765 Leu Ala Se r Glu Leu 845 As n Val1 Lys Val1 590 Se r Val1 Met Th r Ser 670 Arq Glu 
Ala Lys Ile 750 His Ile Met be u Gly 830 Al a Se r Phe Glu 575 Pro Arq Asp Le u Ser 655 Ser Gin be u His Lys 735 Thr Pro Phe Asp Ile 815 be u L ys Val1 Ile Cys 895 Ser His Ser Cys Pro 640 Asp Se r Arg Giu Thr 720 Leu Ile Met Thr Pro 800 Val1 Ser Ser Gly Phe 880 Val1 Cys Lys Ile Asn Val Asp Cys Lys Leu Pro Arg Trp His Met Asn Asp 900 905 910 Phe Phe His Ser Phe Leu Ile Val Phe Arg Val Leu Cys Gly Glu Trp 915 920 925 Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala Gly Gin Thr Met Cys 930 935 940 Leu Ile Val Tyr Met Met Val Met Val Ile Gly Asn Leu Val Val Leu 945 950 955 960 Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe Ser Ser Asp Asn Leu 965 970 975 Thr Ala Ile Glu Glu Asp Thr Asp Ala Asn Asn Leu Gin Ile Ala Val 980 985 990 Ala Arg Ile Lys Arg Gly Ile Asn Tyr Val Lys Gin Thr Leu Arg Glu 995 1000 1005 SP.he Ile Leu Lys Ser Phe Ser Lys Lys Pro Lys Gly Ser Lys Asp Thr 1010 1015 1020 Lys Arg Thr Ala Asp Pro Asn Asn Lys Lys Glu Asn Tyr Ile Ser Asn 1025 1030 1035 1040 Arg Thr Leu Ala Glu Met Ser Lys Asp His Asn Phe Leu Lys Glu Lys 1045 1050 1055 Asp Arg Ile Ser Gly Tyr Gly Ser Ser Leu Asp Lys Ser Phe Met Asp 1060 1065 1070 Glu Asn Asp Tyr Gin Ser Phe Ile His Asn Pro Ser Leu Thr Val Thr 1075 1080 1085 *V..al Pro Ile Ala Pro Gly Glu Ser Asp Leu Glu Ile Met Asn Thr Glu 1090 1095 1100 Glu Leu Ser Ser Asp Ser Asp Ser Asp Tyr Ser Lys Glu Lys Arg Asn 1105 1110 1115 1120 Arg Ser Ser Ser Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly 1125 1130 1135 Glu Xaa Glu Glu Ala Glu Ala Glu Pro Val Asn Ala Asp Glu Pro Glu 1140 1145 1150 Ala Cys Phe Thr Asp Gly Cys Val Arg Arg Phe Pro Cys Cys Gin Val 1155 1160 1165 Asn Val Asp Ser Gly Lys Gly Lys Val Trp Trp Thr Ile Arg Lys Thr 1170 1175 1180 Cys Tyr Arg Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val Leu 1185 1190 1195 1200 Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile Tyr Ile 1205 1210 1215 Glu Lys Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala Asp Lys Ile 1220 1225 11nA I JUJ -87- Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Val 
Ala Tyr 1235 1240 1245 Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu 1250 1255 1260 Ile Val Asp Val Ser Leu Val Thr Leu Val Ala Asn Thr Leu Gly Tyr 1265 1270 1275 1280 Ser Asp Leu Gly Pro Ile Lys Ser Leu Arg Thr Leu Arg Ala Leu Arg 1285 1290 1295 Pro Leu Arg Ala Leu Ser Arg Phe Glu Gly Met Arg Val Val Val Asn 1300 1305 1310 Ala Leu Ile Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys 1315 1320 1325 Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala 1330 1335 1340 Gly Lys Phe Tyr Glu Cys Val Asn Thr Thr Asp Gly Ser Arg Phe Pro 1345 1350 1355 1360 Thr Ser Gin Val Ala Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val S* 1365 1370 1375 Ser Gly Asn Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val 1380 1385 1390 Gly Leu Gly Tyr Leu Ser Leu Leu Gin Val Ala Thr Phe Lys Gly Trp 1395 1400 1405 Met Asp Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asn Glu Gin 1410 1415 1420 0* Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Ile Phe 1425 1430 1435 1440 Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile 1445 1450 1455 Ile Asp Asn Phe Asn Gin Gin Lys Lys Lys Leu Gly Gly Gin Asp Ile 1460 1465 1470 Phe Met Thr Glu Glu Gin Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu 1475 1480 1485 Gly Ser Lys Lys Pro Gin Lys Pro Ile Pro Arg Pro Gly Asn Lys Phe 1490 1495 1500 Gin Gly Cys Ile Phe Asp Leu Val Thr Asn Gin Ala Phe Asp Ile Thr 1505 1510 1515 1520 Ile Met Val Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Lys 1525 1530 1535 Glu Gly Gin Thr Glu Tyr Met Asp Tyr Val Leu His Trp Ile Asn Met 1540 1545 1550 Val Phe Ile Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser 1555 1560 1565 -88- Leu Arg His Tyr Tyr Phe Thr Val Gly Trp Asn Ile Leu Tyr Phe Val 1570 1575 1580 Val Val Ile Leu Ser Ile Val Gly Met Phe Leu Ala Glu Met Ile Glu 1585 1590 1595 1600 Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg 1605 1610 1615 Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly Ala Lys Gly Ile Arg Thr 1620 1625 1630 Leu Leu Phe Ala Leu Met Met Ser Leu Pro Ala 
Leu Phe Asn Ile Gly 1635 1640 1645 Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1650 1655 1660 AsD Phe Ala Tyr Val Lys Lys Glu Ala Gly Ile Asn Asp Met Phe Asn *o 1665 1670 1675 1680 Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gin Ile Thr Thr 1685 1690 1695 Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Ala Pro 1700 1705 1710 o Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser Ser Val Glu Gly 1715 1720 1725 Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Ser Tyr Ile 1730 1735 1740 Ile Ile Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu 1745 1750 1755 1760 Glu Asn Phe Ser Val Ala Thr Glu Glu Ser Thr Glu Pro Leu Ser Glu 1765 1770 1775 Asp Asp Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp 1780 1785 1790 Ala Thr Gin Phe Ile Glu Phe Cys Lys Leu Ser Asp Phe Ala Ala Ala 1795 1800 1805 Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gin Leu Ile 1810 1815 1820 Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp 1825 1830 1835 1840 Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly Glu Gly Gly Glu Met 1845 1850 1855 Asp Ser Leu Arg Ser Gin Met Glu Glu Arg Phe Met Ser Ala Asn Pro 1860 1865 1870 Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gin 1875 1880 1885 Glu Glu Val Ser Ala Thr Ile Ile Gin Arg Ala Tyr Arg Arg Tyr Arg 1890 1895 1900 -89- Leu Arg Gin His Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp Gly 1905 1910 1915 1920 Asp Arg Asp Asp Asp Len Pro Asn Lys Gin Asp 1925 1930 Val Asn Gin Asn Ser Ser Pro Gin Lys Thr Asp 1940 1945 Ile Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys 1955 1960 Tyr Gin Thr Asp Lys Thr Gin Lys Gin Asp Lys 1970 1975 Gin Ser Arg Lys Xaa 1985 INFORMATION FOR SEQ ID NO:i3: SEQUENCE CHARACTERISTICS: LENGTH: 6371 base pairs TYPE: nncleic acid STRANDEDNESS: both TOPOLOGY: both (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: Thr Vai Phe Asp Asn 1935 Val Thr Ala Ser Thr 1950 Pro Asp Gin Gin Lys 1965 Gin Lys Asp Xaa Xaa 1980 CTCTTATGTG AGGAGCTGAA GAGGAATTAA AATATACAGG
CCTCCCCCAG
CAACGCATTG
GAAGCCCCAA
GACATTCCTC
AAAAAGACTT
GCTTTATATA
CACTCCTTAT
AT GAATAAC C
TTTGAATCAC
CGTGACCCGT
GTAAACCTAG
ATTTCTGTAA
CTTTCTGATG
CAGCTGTTCA
GACCTCAGAG
CTGAAAGAAA
AGCCAAGCAG
CCGGCATGGT
TCATAGTATT
TGCTTTCTCC
TCAGCATGCT
CGCCGGACTG
TTGTAAAAAT
GGAACTGGCT
GCAATGTTTC
TCCCAGGCCT
TCATGATCCT
TGGGAAACCT
CTTTGTCCAT
ATCAAAGGAA
TGACTTGGAA
GTCAGAGCCC
GAACAAAGGG
TTTCAGTCCT
CATCATGTGC
GACCAAAAAT
CCTTGCAAGA
GGATTTTGTC
AGCTCTTCGA
GAAGACAATT
GACTGTGTTC
GAAGCATAAA
TTCACAAAAC
CCCAAAGAAG
GCTGGCAAAC
CTGGAGGACT
AAAACAATCT
CTAAGAAGAA
ACTATTCTGA
GTCGAGTACA
GGCTTCTGTG
GTCATTGTTT
ACTTTCAGAG
GTAGGGGCTT
TGTCTGAGTG
TGTTTTCGAA
ATGAAAAGAT
AGTCTCTTGC
AAAAGAAAGA
AACTGCCCTT
TGGACCCCTA
TCCGTTTCA.A
TATCTATTAA
CAAACTGCAT
CTTTTACTGG
TAGGAGAATT
TTGCGTATTT
TATTGAGAGC
TGATCCAGTC
TGTTTGCACT
ATTCACTTGA
GGCAATGTTG
CCTCATTGAA
TGATGATGAA
CATCTATGGG
CTATGCAGAC
TGCCACACCT
GATTTTAGTA
ATTTATGACC
AATATATACT
CACTTTTCTT
AACAGAATTT
TTTGAAAACT
AGTGAAGAAG
AATTGGACTA
AAATAATGAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 ACATTAGAAA GCATAATGAA TACCCTAGAG AGTGAAGAAG ACTTTAGAAA ATATTTTTAT
TACTTGGAAG
CCAGAGGGGT
GACACTTTCA
AACCTTTACC
GTGATTTTCC
TAT GAAGAAC
CAGATGTTAG
GCTGAATATA
ACATCCAAAC
CAAAAGAAGC
TCAGAGGACA
CATGAAAAGA
TCTGCAAGGC
GGATCTGAGA
AGGGGCTCAC
GCCAGTAGGT
AACGGTGTGG
CTGCCAGAGG
TCAGAGGATA
TTAACAAACA
AGATTTGCAC
TGTATCTATT
TTAAACACAT
CTTGCTATAG
ATTGCCATGG
GTGACTTTAA
TCATTCAGAC
ATTAAGATCA
ATCGTCTTCA
GTCTGCAAGA
GATCCAAAGA
ACACCTGTGT
GCTGGGCCTT
AACAGACGCT
TGGGCTCCTT
AGAACCAGGC
ACCGTCTTAA
CAAGTATTAG
TGAGCTCTAA
TCTCCAGTGG
GCATCAGAAG
GGTTGTCTAC
GAAGCAGCAG
CTGAATTTGC
TGTTTGTGCC
CCCCACCAAT
TCTGCCTGGT
GCACGACCAA
TGCTGAATGA
CTGTGGAAGA
ACAAATTCTT
TTATTGTAAT
TATTTATGGC
GAAATTTGGT
ATCCATATGA
GTTTAGTGGA
TGCTCCGAGT
TTGGTAACTC
TTTTTGCTGT
TCAATGATGA
TGCTCTCCTT
GAA-AATTGGC
CTTAGCCTTG
GCGTGCTGCT
TTATCTAATA
AAACATTGAA
AAAAGAGCAA
GAGAAGCAGA
AAGTGCTAAA
AGAGGAAAAG
AAAAAGTTTC
CCCCAATCAG
AACAAGTCTT
CGATGATGAG
CCACAGACCC
GCTGCCGGTG
TGATGGACGG
TCAAATACAC
TCCCAACCTC
ACTTGAAGAG
GATCTGGAAT
GGATCCTTTT
TATGGAACAC
CTTTACTGGA
GTATTTCCAA
GCTCTTTCTA
CTTCAAGTTG
AGTAGGGGCT
GGTCGGCATG
CTGTACGCTC
AGAAACCCTG
TTTAGGCTAA
GGCAAAACCT
AACTTGATCC
GAAGCTAAAC
GAAGAAGCTG
ATTATGGGCC
GAAAGAAGAA
GGAGATGCTG
CACCTTGGTG
TCACCACTCA
TTTAGTTTCA
CACAGCATTT
CAGGAGCGAC
AACGGGAAAA
TCAGCCCTCA
AAGAAAAGGC
AGACAGAGAG
TCCAGACAAA
TGCTCTCCAT
GTAGATCTTG
CACCCAATGA
ATCTTTGCAG
GTAGGCTGGA
GCAGATGTGG
GCAAAATCCT
CTAGGTAACC
CAGCTCTTTG
CCACGGTGGC
ATTATGGCTA
TGACCCAAGA
ACATGATCTT
TGGCTGTGGT
AGAAAGAATT
AGGCAATTGC
TCTCAGAGAG
ACAGAAGAAA
AGAAATTGTC
TCGAAGGGCA
GCATTCGTGG
AAGGCAGAGG
TTGGAGACAA
GCAGCAGTAA
TGCACAGTGC
TGCTCCGCAA
GTTGTAGTTC
CAATGAGTAG
AATGTCCACC
ATTGGATAAA
CAATTACCAT.
CTGAGGAATT
CTGAAATGGT
ATATTTTTGA
AAGGATTGTC
GGCCAACATT
TCACCTTAGT
GTAAGAGCTA
ACATGAACGA
TGTGGTTTCA GCACAGATTC AGGTCAGTGT
CACGAGCTTT
TTACTGGGAA
CTTTGTCGTA
TGCCATGGCA
AGAATTTCAA
AGCGGCAGCG
TTCTTCTGAA
GAAAAAGAATR
GAAATCAGAA
TAGGCGAGCA
CTCCTTGTTT
AAGAGATATA
TGAGAGCAGA
CATCAGCCAA
TGTGGACTGC
TGGACAGCTT
CTATCTCCTT
AGCAAGCATA
TTGGTGGTAC
ATTCAAAAAG
TTGCATAGTT
CAAAAATGTA
ATTAAAACTG'
CAGCCTTATT
AGTTCTGCGA
GAACATGCTG
GTTGGCCATC
CAAAGAATGT
CTTCTTCCAC
1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760
TCGTTCCTGA
ATGGAGGTCG
AACCTGGTGG
CTTACAGCAA
AAAAAGGGAA
AAAAAGCCAA
AACTATATTT
AAAGATAAAA
GGTCAATCAT
TCCGATTTGG
AAAGTGAGAT
GGAGAAGGAG
ACAGATGGTT
AAAATCTGGT
AGCTTGATTG
ATTGAAAGGA
ATCTTCATTC
AATGCCTGGT
AACACTCTTG
AGAGCTCTAA
GGAGCAATTC
AGCATCATGG
GGGTCACGGT
GTTAGTCAAA
TACCTATCTC
GTGGATTCTG
TATTTTGTCG
ATCATAGATA
GAAGAACAGA
CCAATTCCTC
GCCTTTGATA
TTGTGTTCCG
CTGGTCAAGC
TCCTAAACCT
TTGAAGAAGA
TAAATTATGT
AGATTTCCAG
CTAACCATAC
TCAGTGGTTT
TTATTCACAA
AAAATATGAA
TAAACCGGTC
AAGAAGCAGA
GTGTACGGAG
GGAACATCAG
TCCTCATGAT
AAAAGACCAT
TGGAAATGCT
GTTGGCTGGA
GCTACTCAGA
GAGCCTTATC
CTTCCATCAT
GAGTAAATTT
TTCCTGCAAG
ATGTGCGATG
TGCTTCAAGT
TTAATGTAGA
TCTTTATCAT
ATTTCAACCA
AGAAATACTA
GACCAGGGAA
TTAGTATCAT
CGTGCTGTGT
TATGTGCCTT
ATTTCTGGCC
CCCTGATGCA
GAALACAAACC
GGAGATAAGA
ACTTGCTGAA
TGGAAGCAGC
TCCCAGCCTC
TGCTGAGGAA
AAGCTCCTCA
GGCTGAACCT
GTTCTCATGC
GAAAACCTGC
CCTGCTCAGC
TAAGATTATC
TCTAAAATGG
TTTCCTAATT
TCTTGGCCCC
TAGATTTGAA
GAATGTGCTA
GTTTGCTGGC
TCAAGTTCCA
GAAAAACCTG
TGCAACTTTT
CAAGCAGCCC
CTTTGGGTCA
ACAGAAAAAG
TAATGCAATG
CAAAATCCAA
GGTTCTTATC
GGAGAGTGGA
ATTGTTTACA
TTATTATTGA
AACAACCTCC
TTACGTGAAT
CAAGCAGAAG
ATGAGCAAAG
GTGGACAAAC
ACAGTGACAG
CTTAGCAGTG
GAGTGCAGCA
ATGAATTCCG
TGCCAAGTTA
TACAAGATTG
AGTGGTGCCC
CTGGAGTATG
ATAGCATATG
GTTGATGTTT
ATTAAATCCC
GGAATGAGGG
CTTGTGTGTC
AAGTTCTATG
AATCGTTCCG
AAAGTGAACT
AAGGGATGGA
AAATATGAAT
TTCTTCAGTT
AAGCTTGGAG
AAAAAGCTGG
GGATGTATAT
TGTCTCAACA
TAGAGACCAT
TGATGGTCAT
GCTCATTTAG
AGATTGCAGT
TTATTCTAAA
ATCTGAATAC
GTCAGAATTT
ACTTGATGGA
TGCCAATTGC
ATTCGGATAG
CAGTTGATAA
ATGAGCCAGA
ACATAGAGTC
TTGAACACAG
TGGCTTTTGA
CAGACAAGAT
GTTATAAAAC
CTTTGGTTAC
TTCGGACACT
TCGTTGTGAA
TTATATTCTG
AGTGTATTAA
AATGTTTTGC
TTGATAATGT
CGATTATTAT
ATAGCCTCTA
TGAACTTGTT
GTCAAGACAT
GGTCCAAGAA
TTGACCTAGT
TGGTAACCAT
GTGGGACTGT
GGTCATTGGA
TTCAGAGAAT
GACTAGAATT
AGCATTTTCC
TAAGAAGGAA
CCTCAAGGAA
AGACAGTGAT
ACCTGGGGAA
TGAATACAGG-
CCCTTTGCCT
GGCCTGTTTC
AGGGAAAGGA
TTGGTTTGAA
AGATATTTAT
CTTCACTTAC
ATATTTCACC
TTTAGTGGCA
GAGAGCTTTA
TGCACTCATA
GCTGATATTC
CACCACAGAT
CCTTATGAAT
CGGACTTGGT
GTATGCAGCA
CATGTATATT
CAT TGGTGTC
CTTTATGACA
GCCACAAAAG
GACAAATGAA
GATGGTAGAA
2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 AAGGAGGGTC AAAGTCAACA TATGACTGAA GTTTTATATT GGATAAATGT
GGTTTTTATA
4680
ATCCTTTTGI
GTAGGATGGI
GCTGATTTGI
AGGATTGGcC
GCTTTGATGP
TTCATCTACG
AATGACATGT
ACCTCTGCTG
GACCCAAAAA
GGAATATTCT
ATTGCAGTCA
GAGGATGACT
TTTATAGAGT
GCAAAACCCA
ATCCATTGTC
ATGGATTCTC
TCCTATGAAC
ATTCAGCGTG
TACATAAAA~G
AATGTTAATG
CCTTCATATG
AAGGAAGACA
TTGTTTACAG
TGCCAAAATC
CTAAGAAAGG
CGTAAGAGAA
TTGGTTTTTA
TCACAGGATT
AAAAAAAA
CTGGAGAATG
ATATTTTTGA
TTGAAACGTA
GAATCCTACG
TGTCCCTTCC
CCATCTTTGG
TCAATTTTGA
GCTGGGATGG
AAGTTCATCC
ACTTTGTTAG
TACTGGAGAp.
TTGAGATGTT
TCTCTAAACT
ACAAAGTCCA
TTGACATCTT
TTCGTTCACA
CCATCACAI\C
CTTATAGACG
ATGGAGACAG
AGAACTCAAG
ATAGTGTAAC
AAGGGAAAGA
CCTGTGAAAG
CTTTTTATCA
TGGGCAGCAT9 CTCTGTAGGA .7 ATAAATCAGA
I
GTAATTAGTC T1
TGTGCTAAAA
TTTTGTGGTT
TTTTGTGTCC
TCTAGTCAAA
TGGGTTGTTT
AATGTCCAAC
GACCTTTGGC
ATTGCTAGCA
TGGAAGTTCA
TTATATCATC
TTTTAGTGTT
CTATGAGGTT
CTCTGATTTT
GCTCATTGCC
ATTTGCTTTT
GATGGAAGAA
CACACTAAAAI~
TTACCGCTTA
AGATGATGAT
rCCAGAAAJA
!AAGCCAGAG
-AGCAAGGAA
P'GATTTATTT
kAATATTCTCC 7AGCAGATGG 9 ~TTATTGATT
I
~GACCATGTA
G
'TGTTTCCCA T
CTGATCTCCC
GTGATTATCT
CCTACCCTGT
GGAGCAAAGG
AACATCGGC
TTTGCCTATG
AACAGTATGA
CCTATTCTTA
GTTGAAGGAG
ATATCCTTCC
GCCACTGAAG
TGGGAGAAGT
GCAGCTGCCC
ATGGATCTGC
ACAAAGCGTG
AGGTTCATGT
CGGAAACAAG
AGGCAAAATG
rTACTCAATA
,CAGATGCCA
%.AAGAGAAAT
kGCAAAAAAT
~TGTTAATAA
~AAGGCAGTG
TATTTTTGC i
TAGCATACA
~AAAACTTTT
GTAAATAAA C
TCAGACACTA
CCATTGTAGG
TCCGAGTGAT
GGATCCGCAC
TCCTGCTCTT
TTAAAAAGGA
TTTGCCTGTT
ACAGTAAGCC
ACTGTGGTAA
TGGTTGTGGT
AAAGTACTGA
TTGATCCCGA
TGGATCCTCC
CCATGGTTAG
TTTTGGGTGA
CTGCAAATCG
AGGATGTGTC
TCAAAAATAT
AAAAAGATAT
CTTCATCCAC
%iTGAACAAGA
%GAGCTTCAT
kACTCTTTTG
IAGTCACTAA
\CTGATGATT
C
.AAGTGATTG I ~CATCTGCCT
'I
AACACACGC
P
CTACTTCACT
TATGTTTCTA
CCGTCTTGCC
GCTGCTCTTT
CCTGGTCATG
AGATGGAATT
CCAAATTACA
ACCCGACTGT
CCCATCTGTT-
GAACATGTAC
ACCTCTGAGT
TGCGACCCAG
TCTTCTCATA
TGGTGACCGG
GAGTGGGGAG
TTCCAAAGTG
TGCTACTGTC
TTCAAGTATA
GGCTTTTGAT
CACCTCTCCA
CAGAACAGAA
rTTTGATATA kGGAAGTCTA
'TCTGATTTC
TTTAAGAAT
.TTCAGTTTT
GTCATCTTT
ITACAGAAAA
4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6371
INFORMATION FOR SEQ ID NO:14:
SEQUENCE CHARACTERISTICS: LENGTH: 6404 base pairs; TYPE: nucleic acid; STRANDEDNESS: both; TOPOLOGY: both
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CTCTTATGTG AGGAGCTGAA GAGGAATTAA
AATATACAG
CCTCCCCCAC
CAACGCATTC
GAAGCCCCA.
GACATTCCTC
AAAAAGACTT
GCTTTATATA
CACTCCTTAT
ATGAATAACC
TTTGAATCAC
CGTGACCCGT
GTAAACCTAG
ATTTCTGTAA
CTTTCTGATG
CAGCTGTTCA
ACATTAGAAA
TACTTGGAAG
CCAGAGGGGT
GACACTTTCA
AACCTTTACC
GTGATTTTCC
TATGAAGAAC
CAGATGTTAG
GCTGAATATA
ACATCCAAAC
GACCTCAGAG
CTGAAAGAAA
LAGCCAAGCAG
CCGGCATGGT
TCATAGTATT
TGCTTTCTCC
TCAGCATGCT
CGCCGGACTG
TTGTAAAAAT
GGAACTGGCT
GCAATGTTTC
TCCCAGGCCT
TCATGATCCT
TGGGAAACCT
GCATAATGAA
GATCCAAAGA
ACACCTGTGT
GCTGGGCCTT
AACAGACGCT
TGGGCTCCTT
AGAACCAGGC
ACCGTCTTAA
CAAGTATTAG
TGAGCTCTAA
CTTTGTCCAf
ATCAAAGGMP
TGACTTGGAA
GTCAGAGCCC
GAACAAAGGG
TTTCAGTCCT
CATCATGTGC
GACCAAAAAT
CCTTGCAAGA
GGATTTTGTC
AGCTCTTCGA
GAAGACAATT
GACTGTGTTC
GAAGCATAAA
TACCCTAGAG
TGCTCTCCTT
GAAAATTGGC
CTTAGCCTTG
GCGTGCTGCT
TTATCTAATA
AAACATTGAA
AAAAGAGCAA
GAGAAGCAGA
AAGTGCTAAA
TTCACAAAA(
CCCAAAGAA(
GCTGGCAAA(
CTGGAGGAC]
AAAACAATCI
CTAAGAAGA7
ACTATTCTGT
GTCGAGTACP
GGCTTCTGTC
GTCATTGTTT
ACTTTCAGAG
GTAGGGGCTT
TGTCTGAGTG
TGTTTTCGAA
AGTGAAGAAG
TGTGGTTTCA
AGAAACCCTG
TTTAGGCTAA
GGCAAAACCT
AACTTGATCC
GAAGCTAAAC
GAAGAAGCTG
ATTATGGGCC
GAAAGAAGAA
G ATGAAAAGA'
AGTCTCTTG(
3AAAAGAAAGI
AACTGCCCTI
TGGACCCCTI
TCCGTTTCAf
TATCTATTAP
iCAAACTGCAIr
LCTTTTACTGG
TAGGAGAATT
TTGCGTATTT
TATTGAGAGC
TGATCCAGTC
TGTTTGCACT
ATTCACTTGA
ACTTTAGAAA
GCACAGATTC
ATTATGGCTA
TGACCCAAGA
ACATGATCTT
TGGCTGTGGT
AGAAAGAATT
AGGCAATTGC
TCTCAGAGAG
ACAGAAGAAA
r' GGCAATGTTG
'CCTCATTGAA
k TGATGATGAA
CATCTATGGG
CTATGCAGAC
TGCCACACCT
GATTTTAGTA
ATTTATGACC
AATATATACT
CACTTTTCTT
AACAGAATTT
TTTGAAAACT
AGTGAAGAAG
AATTGGACTA
AAATAATGAA
ATATTTTTAT
AGGTCAGTGT
CACGAGCTTT
TTACTGGGAA
CTTTGTCGTA
TGCCATGGCA
AGAATTTCAA
AGCGGCAGCG
TTCTTCTGAA
GAAAAAGAAT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 -94- CAAAAGAAGC TCTCCAGTc3G AGAGGAAAAG GGAGATGCTG
AGAAATTGTC
TCAGAGGACJ
CAT GAAAAGI2
TCTGCAAGG(
GGATCTGAGI
AGGGGCTCAC
GCCAGTAGG'I
AACGGTGTGC
CTGCCAGAGG
CACAAGAAAA
CTCAGACAGA
GAGTCCAGAG
AATTGCTCTC
TTTGTAGATC
CACCACCCAA
GGAATCTTTG
CAAGTAGGCT
CTAGCAGATG
TTGGCAAAAT
GGTCTAGGTA
ATGCAGCTCT
CTCCCACGGT
TGTGGAGAGT
CTTATTGTTT
GCCTTATTAT
GCAAACAACC
ACCTTACGTG
AGACAAGCAG
GAAATGAGCA
AGCGTGGACA
CTCACAGTGA
kGCATCAGAAG k GGTTGTCTAC
GAAGCAGCAG
CTGAATTTGC
TGTTTGTGCC
CCCCACCAAT
TCTCCCTGGT
TGATAATAGA
GGCGTTGTAG
GAGCAATGAG
AAAAATGTCC
CATATTGGAT
TTGCAATTAC
TGACTGAGGA
CAGCTGAAAT
GGAATATTTT
TGGAAGGATT
CCTGGCCAAC
ACCTCACCTT
TTGGTAAGAG
GGCACATGAA
GGATAGAGAC
ACATGATGGT
TGAGCTCATT
TCCAGATTGC
AATTTATTCT
AAGATCTGAA
AAGGTCACAA
AACACTTGAT C CAGTGCCAAT
I]
AAAAAGTTTC
CCCCAATCAG
AACAAGTCTT
CGATGATGAG
CCACAGACCC
GCTGCCGGTG
TGATGGACGC
TAAGACAACT
TTCCTATCTC
TAGAGCAAGC
ACCTTGGTGG
AAAATTCAAA
CATTTGCATA
ATTCAAAAAT
GGTATTAAAA
TGACAGCCTT
GTCAGTTCTG
ATTGAACATG
AGTGTTGGCC
CTACAAAGAA
CGACTTCTTC
CATGTGGGAC
CATGGTCATT
rAGTTCAGAC iGTGACTAGA
IAAAGCATTTI
PACTAAGAAG
LTTCCTCAAGC
;GAAGACAGT
'GCACCTGGG G
CACCTTGGTC
TCACCACTCT
TTTAGTTTCI
CACAGCATTT
CAGGAGCGAC
AACGGGAAAA
TCAGCCCTCA
TCTGATGACA
CTTTCAGAGG
ATATTAACAA
TACAGATTTG
AAGTGTATCT
GTTTTAAACA
GTACTTGCTA
CTGATTGCCA
ATTGTGACTT
CGATCATTCA
CTGATTAAGA
PTCATCGTCT
TGTGTCTGCA
CACTCCTTCC
TGTATGGAGG
3GAAACCTGG
IATCTTACAG
kTTAAAAAGG rCCAAAAAGC
~AAAACTATA
~AAAAAGATA
~ATGGTCAAT
~AATCCGATT
TCGAAGGGCI
GCATTCGTGC
AAGGCAGAGC
TTGGAGACAP
GCAGCAGTAA.
TGCACAGTGC
TGCTCCCCAA
GCGGCACGAC
ATATGCTGMA
ACACTGTGGA
CACACAAATT
ATTT TAT TGT
CATTATTTAT
TAGGAAATTT
TGGATCCATA
TAAGTTTAGT
GACTGCTCCG
TCATTGGTAA
TCATTTTTGC
AGATCAATGA
TGATTGTGTT
TCGCTGGTCA
TGGTCCTAjAj
CAATTGAAGA
GAATAAATTA
CAAAGATTTC
TTTCTAACCA
AAATCAGTGG
CATTTATTCA
TGGAAAATAT
GAAATCAGAA
STAGGCGAGCA
;CTCCTTGTTT
AAGAGATATA
TGAGAGCAGA
CATCAGCCAA
TGTGGACTGC
TGGACAGCTT
CAATCAAATA
TGATCCCAAc AGAACTTGAAk
CTTGATCTGG
AATGGATCCT
GGCTATGGAA
GGTCTTTACT
TGAGTATTTC
GGAGCTCTTT
AGTCTTCAAG
CTCAGTAGGG
TGTGGTCGGC
TGACTGTACG
CCGCGTGCTG
AGCTATGTGC
CCTATTTCTG
AGACCCTGAT
TGTGAAACAA
CAGGGAGATA
TACACTTGCT
TTTTGGAAGC
CAATCCCAGC
GAATGCTGAG
1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360
GAACTTAGCA
TCAGAGTGCA
CCTATGAATT
TGCTGCCAAG
TGCTACAAGA
AGCAGTGGTG
ATCCTGGAST
TGGATAGCAT
ATTGTTGATG
CCCATTAA.AT
GAAGGAATGA
CTACTTGTGT
GGCAAGTTCT
CCAAATGGTT
CTGAAAGTGA
TTTAAGGGAT
CCCAAATATG
TCATTCTTCA
AAGAAGCTTG
ATGAAAAAGC
CAAGGATGTA
ATCTGTCTCA
GAAGTTTTAT
AAACTGATCT
GTTGTGATTA
TCCCCTACCC
AAAGGAGCAA
TTTAACATCG
AACTTTGCCT
GGCAACAGTA
GCACCTATTG
GTGATTCGGA
GCACAGTTGA
CCGATGAGCC
TTAACATAGA
TTGTTGAACA
CCCTGGCTTT
ATGCAGACAA
ATGGTTATAA
TTTCTTTGGT
CCCTTCGGAC
GGGTCGTTGT
GTCTTATATT
ATGAGTGTAT
CCGAATGTTT
ACTTTGATAA
GGACGATTAT
AATATAGCCT
CTTTGAACTT
GAGGTGAAGA
TGGGGTCCAA
TATTTGACCT
ACATGGTAAC
ATTGGATAAA
CCCTCAGACA
TCTCCATTGT
TGTTCCGAGT
AGGGGATCCG
GCCTCCTGCT
ATGTTAAAAA
TGATTTGCCT
TTAACAGTAA
TAGTGAATAC
TAACCCTTTG
AGAGGCCTGT
GTCAGGGAAA
CAGTTGGTTT
TGAAGATATT
GATCTTCACT
AACATATTTC
TACTTTAGTG
ACTGAGAGCT
GAATGCACTC
CTGGCTGATA
TAACACCACA
TGCCCTTATG
TGTCGGACTT
TATGTATGCA
CTACATGTAT
GTTCATTGGT
CATCTTTATG
GAAGCCACAA
AGTGACAAAT
CATGATGGTA
TGTGGTTTTT
CTACTACTTC
AGGTATGTTT
GATCCGTCTT
CACGCTGCTC
CTTCCTGGTC
GGAAGATGGA
GTTCCAAATT
GCCACCCGAC
AGGAAAGTGA
CCTGGAGAAG
TTCACAGATG
GGAAAAATCT
GAAAGCTTCA
TATATTGAAA
TACATCTTCA
ACCAATGCGT
GCAAACACTC
TTAAGACCTC
ATAGGAGCAA
TTCAGCATCA
GATGGGTCAC
AATGTTAGTC
GGTTACCTAT
GCAGTGGATT
ATTTATTTTG
GTCATGATAG
ACAGAAGAAC
AAGCCAATTC
CAAGCCTTTG
GAAAAGGAGG
ATAATCCTTT
ACTGTAGGAT
CTAGCTGATT
GCCAGGATTG
TTTGCTTTGA
ATGTTCATCT
ATTAATGACA
ACAACCTCTG
TGTGACCCAA
GATTAAACCG
GAGAAGAAGC
GTTGTGTACG
GGTGGAACAT
TTGTCCTCAT
GGAAAAAGAC
TTCTGGAAAT
GGTGTTGGCT
TTGGCTACTC
TAAGAGCCTT
TTCCTTCCAT
TGGGAGTAAA
GGTTTCCTGC
AAAATGTGCG
GTCTGCTTCA
CTGTTAATGT
TCGTCTTTAT
ATAATTTCAA
AGAAGAAATA
CTCGACCAGG
ATATTAGTAT
GTCAAAGTCA
TCACTGGAGA
GGAATATTTT
TGATTGAAAC
GCCGAATCCT
TGATGTCCCT
ACGCCATCTT
TGTTCAATTT
CTGGCTGGGA
AAAAAGTTCA
GTCAAGCTCC
AGAGGGTGAA
GAGGTTCTCA
CAGGAAAACC
GATCCTGCTC
CATTAAGATT
GCTTCTAAAA
GGATTTCCTA
AGATCTTGGC
ATCTAGATTTL
CATGAATGTG
TTTGTTTGCT
AAGTCAAGTT
ATGGAAAAAC
AGTTGCAACT
AGACAAGCAG
CATCTTTGGG
CCAACAGAAA
CTATAATGCA
GAACAAAATC
CATGGTTCTT
ACATATGACT
ATGTGTGCTA
TGATTTTGTG
GTATTTTGTG
ACGTCTAGTC
TCCTGCGTTG
TGGAATGTCC
TGAGACCTTT
TGGATTGCTA
TCCTGGAAGT
3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220
TCAGTTGAAG
ATCATATCCT
GTTGCCACTG
GTTTGGGAGA
TTTGCAGCTG
GCCATGGATC
TTTACAAAGC
GAAAGGTTCA
AAACGGAAAC
TTAAGGCAAA
GATTTACTCA
AAAACAGATG
GACAAAGAGA
GAAAGCAAAA
TTTGTGTTAA
CTCGAAGGCA
TGGTTATTTT
ATTATAGCAT
GTAGAAAACT
CCATGTAAAT
GAGACTGTGG
TCCTGGTTGT
AAGAAAGTAC
AGTTTGATCC
CCCTGGATCC
TGCCCATGGT
GTGTTTTGGG
TGTCTGCAAA
AAGAGGATGT
ATGTCAAAAA
ATAMAAAAGA
CCACTTCATC
AATATGAACA
AATAGAGCTT
TAAAACTCTT
GTGCAGTCAC
TGCACTGATG
ACAAA.AGTGA
TTTACATCTG
AAACAACACA
TAACCCATCT GTTGGAATAT TCTACTTTGT TAGTTATATC
GGTGAACATG
TGAACCTCTG
CGATGCGACC
TCCTCTTCTC
TAGTGGTGAC
TGAGAGTGGG
TCCTTCCAAA
GTCTGCTACT
TATATCAAGT
TATGGCTTTT
CACCACCTCT
AGACAGAACA
CATTTTTGAT
TTGAGGAAGT
TAACTCTGAT
ATTCTTTAAG
TTGATTCAGT
CCTTGTCATC
CGCATACAGA
TACATTGCAG
AGTGAGGATG
CAGTTTATAG
ATAGCAAAAC
CGGATCCATT
GAGATGGATT
GTGTCCTATG
GTCATTCAGC
ATATACATAA
GATAATGTTA
CCACCTTCAT
GAAAAGGAAG
ATATTGTTTA
CTATGCCAAA
TCATACTGGA
ACTTTGAGAT
AGTTCTCTA.A
CCAACAAAGT
GTCTTGACAT
CTCTTCGTTC
AACCCATCAC
GTGCTTATAG
AAGATGGAGA
ATGAGAACTC
ATGATAGTGT
ACAAAGGGAA
CAGCCTGTGA
ATCCTTTTTA
GAATTTTAGT
GTTCTATGAG
ACTCTCTGAT
CCAGCTCATT
CTTATTTGCT
ACAGATGGAA
AACCACACTA
ACGTTACCGC
CAGAGATGAT-
AAGTCCAGAA
AACAAAGCCA
AGACAGCAAG
AAGTGATTTA
TCAAAATATT
CATTAGCAGA
GGAATTATTG
AGAAGACCAT
GTCTTGTTTC
5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6404 TTCCTAAGAA AGGTGGGCAG AATCGTAAGA GAACTCTGTA TTTTTGGTTT TTAATAAATC TTTTCACAGG ATTGTAATTA AAAAAJAAAAA AAAA INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1835 amino acids TYPE: amino acid STRANDEDNESS: not relevant TOPOLOGY: not relevant (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr 1 5 10 Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Glu Lys Lys Glu Lys 25 Glu Lys Lys Asp Asp Glu Glu Pro Lys Pro Ser Ser Asp Leu Glu Ala teC
Gly Ser Phe Al a Lys Leu Tyr 145 Arg Trp As n Leu Leu 225 Phe As n Ile Ser Pro 305 Ser Thr Gly Lys Glu Ile Leu Ile Thr 130 Thr Gly Leu Leu Lys 210 Ile Cys Le u Met Lys 290 Giu Phe Gin Lys Gin Pro Val1 T yr Leu 115 As n Phe Phe Asp Giy 195 Thr Gin Le u Lys As n 275 Asp Gly Asp Asp Thr 355 Phe Asp 70 Lys Ser Ser Phe Ile 150 Gly Val Ser Vai Lys 230 Phe Cys Ser Le u Val1 310 Ser Giu Ile Ile 55 Leu Gly Pro Leu Met 135 Tyr Giu Ile Ala Ile 215 Lys Ala Phe Glu Cys 295 Lys Trp As n Phe Tyr Asp Lys Phe Phe 120 Thr Thr Phe Val Leu 200 Pro Leu Leu Arg Giu 280 Gly Giy Al a Leu Phe 360 Gly Pro Ile Ser 105 Ser Asn Phe Thr Phe 185 Arg Gly Ser Ile Leu 265 Lys Phe Arq Phe Tyr 345 Val Ile Tyr 75 Arg Leu Leu Pro Ser 155 Leu Tyr Phe Lys Val 235 Leu Asn Phe Thr Pro 315 Ala Gin Val1 Pro Ala Phe Arg Ile Trp 140 Leu Arg Leu Arq Thr 220 Met Gin Giu Tyr Asp 300 Asp Leu Thr Ile Met Lys Thr Se r Thr As n Leu Trp 175 Phe Arg Gly Thr Met 255 Giu Glu Gin Tyr Leu 335 Ala Gly Val Thr Pro Ile Ile Val1 Ala 160 As n Val1 Ala Ala Val1 240 Gly Ser Gly Cys Thr 320 Met Al a Se r -97.1- 0 0.* a. a
a Phe Giu 385 Phe Ala Gly Lys Giu 465 Lys Thr Arg Asp Asp 545 Arg Val1 Len Gin Met 625 Len Arg Phe Ile Len As n Gin Ala Ser 435 Arg Lys Phe As n Ser 515 Ser Gin Ser Gly Asp 595 Thr As n As n Al a Lys 675 Ile Ile Gin Met Al a 420 Gin Arg Gly His Gin 500 Ser Gin Ser Ser Lys 580 Gly Thr Asp Thr His 660 Ile Cys Len As n 390 Asp Ala Ser Arg Gin 470 Gly Pro Thr Gin Arq 550 Ile His Al a Gin Len 630 Gin Len Phe Val Ile 375 Ile Arg Ala Ser Arg 455 Lys Val Len Ser Phe 535 Gly Ser Ser Leu Lys 615 Arg Gin Ile Ile Len 695 Ala Gin Lys Thr 425 Thr Lys Ser Gly Ile 505 Phe Asp Leu Ala Val1 585 Len Arg Arg Gin As n 665 Met Thr Val Ala Lys 410 Ser Ser Lys Lys His 490 Arg Ser Asp Phe Ser 570 Asp Pro Ser Ala Gin 650 Cys Asp Len Val1 Lys 395 Gin Ile Len Gin Ser 475 Arg Gi y Phe Gin Val1 555 Arg Cys As n Ser Met 635 Ser Se r Pro Phe Ala 380 Gin Gin Arq Ser Lys 460 Ser Gin Ser Lys His 540 Pro Ser Asn Gly Tyr 620 Ser Arg Pro Phe Met 700 Met Lys Gin Ser Ser 445 Lys Gin Lys Len Gly 525 Ser His Pro Gly Gin 605 Len Arg Gin Tyr Val1 685 Ala Gin Gin 400 Gin Met Ala Gly -Arg 480 Ser Al a Arg Gly Gin 560 Pro Ser Pro Asp Ile 640 Tyr Lys Ala His -97.2a a a a a a.
His 705 Thr Pro Val1 Val1 Trp 785 Ala Ala Cys His Thr 865 Tyr Leu Glu Ile Lys 945 Leu Gly Leu As n Pro Met Gly Ile Tyr Giu Thr Leu 755 Leu Arg 770 Pro Thr Leu Gly Val Val Lys Ile 835 Ser Phe 850 Met Trp Met Met Ala Leu Giu Asp 915 Asn Tyr 930 Lys Pro Ala Giu Giy Ser Thr Val 995 Giu Giu 1010 Thr Phe T yr 740 Ser Se r Leu As n Gly 820 As n Leu Asp Val1 Leu 900 Asp Val1 Lys Met Ser 980 Thr Leu Giu Al a 725 Phe Leu Phe Asn Leu 805 Met Asp Ile Cys Met 885 Leu Ala Lys Ser Ser 965 Asp ValI Ser Giu Phe Lys Asr 710 Ala Giu Met Val Gin Val Gly Trp 745 Giu Leu Phe Leu 760 Arg Leu Leu Arg 775 Met Leu Ile Lys 790 Thr Leu Val Leu Gin Leu Phe Gly 825 Cys Leu Pro Arq 840 Val Phe Arq Val 855 Met Glu Val Ala 870 Val Ile Gly Asn Ser Ser Phe Ser 905 Asn Asn Leu Gin 920 Gin Thr Leu Arg 935 Asp Asn Lys Lys 950 Lys His Asn Phe Lys Met Asp Gin 985 Pro Ile Ala Pro 1000 Ser Asp Ser Asp 1015 Val1 Le u 730 Asn Aila Vali Ile Aia 810 Lys Trp Leu Giy Leu 890 Ser Ile Giu Giu Leu 970 Ser Gly Ser Leu 715 Lys Ile Asp Phe Ile 795 Ile Ser His Cys Gin 875 Val1 Asp Aila Phe Asn 955 Lys Phe Glu Tyr Ala Le u Phe Val1 Lys 780 Gly Ile Tyr Met Gly 860 Met Val1 As n Val Ile 940 Tyr Glu Ile Ser Ser Gly Asn Ile Ala Asp Ser 750 Giu Gly 765 Leu Ala Asn Ser Val Phe Lys Giu 830 Asn Asp 845 Giu Trp Cys Leu Leu Asn Leu Thr 910 Ar Ile 925 Leu Lys Ile Ser Lys Asp His Asn 990 Asp Leu 1005 Lys Asn Leu Met 735 Leu Leu Lys Val1 lie 815 Cys Phe Ile Ile Leu 895 Ala Lys Phe Asn Ile 975 Pro Giu Arq Phe 720 Asp Ile Ser Ser Gly 800 -Phe Val1 Phe Giu Val1 880 Phe Ile Gly Ser Thr 960 Ser Ser Met Ser 1020 Ser Ser Ser Giu Cys Ser Thr Val Asp Asn Pro Leu Pro Giy Giu Giy 1025 1030 1035 1040 -97.3- Glu Glu Ala Glu Ala Glu Pro Asn Asp Glu Pro Glu Ala Cys Phe Thr 1045 1050 1055 Asp Gly Cys Val Arg Arg Phe Cys Cys Gin Val Asn Ser Gly Lys Gly 1060 1065 1070 Lys Trp Trp Ile Arg Lys Thr Cys Tyr Ile Val Glu His Ser Trp Phe 1075 1080 1085 Glu Ser Phe Ile Val Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala 1090 1095 1100 Phe Glu Asp Ile Tyr Ile Glu 
Lys Lys Thr Ile Lys Ile Ile Leu Glu 1105 1110 1115 1120 Tyr Ala Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu 1125 1130 1135 Lys Trp Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys-Trp 1140 1145 1150 Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val Ala Asn 1155 1160 1165 "oo Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu Arg Thr Leu 1170 1175 1180 Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe Glu Gly Met Arg 1185 1190 1195 1200 Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro Ser Ile Met Asn Val 1205 1210 1215 Leu Leu Val Cys Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val 1220 1225 1230 0*e Asn Leu Phe Ala Gly Lys Phe Tyr Glu Cys Asn Thr Thr Asp Gly Ser 1235 1240 1245 Arg Phe Pro Ser Gin Val Asn Arg Ser Glu Cys Phe Ala Leu Met Asn 1250 1255 1260 Val Ser Asn Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val 1265 1270 1275 1280 Gly Leu Gly Tyr Leu Ser Leu Leu Gin Val Ala Thr Phe Lys Gly Trp 1285 1290 1295 Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Gin Pro Lys Tyr Glu 1300 1305 1310 Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Phe Ile Ile Phe Gly Ser 1315 1320 1325 Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile Ile Asp Asn Phe Asn 1330 1335 1340 Gin Gin Lys Lys Lys Leu Gly Gly Gin Asp Ile Phe Met Thr Glu Glu 1345 1350 1355 1360 Gin Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu Gly Ser Lys Lys Pro 1365 1370 1375 I~ J -97.4- Gin Lys Pro Ile Pro Arg Pro Gly Asn Lys Gin Gly Cys Ile Phe Asp 1380 1385 1390 Leu Thr Asn Gin Ala Phe Asp Ile Ile Met Val Leu Ile Cys Leu Asn 1395 1400 1405 Met Val Thr Met Met Val Glu Lys Glu Gly Gin Met Val Leu Trp Ile 1410 1415 1420 Asn Val Phe Ile Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile 1425 1430 1435 1440 Ser Leu Arg His Tyr Tyr Phe Thr Val Gly Trp Asn Ile Phe Val Val 1445 1450 1455 Val Ile Ser Ile Val Gly Met Phe Leu Ala Ile Glu Tyr Phe Val Ser 1460 1465 1470 Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg Ile -Leu 1475 1480 1485 Arg Leu Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Met 1490 1495 1500 Met 
Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Val 1505 1510 1515 1520 Met Phe Ile Tyr Ala Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Lys 1525 1530 1535 Lys Glu Gly Ile Asn Asp Met Phe Asn Phe Glu Thr Phe Gly Asn Ser 1540 1545 1550
Met Ile Cys Leu Phe Gin Ile Thr Thr Ser Ala Gly Trp Asp Gly Leu 1555 1560 1565 Leu Ala Pro Ile Leu Asn Ser Pro Pro Asp Cys Asp Pro Lys Lys Val o 1570 1575 1580 His Pro Gly Ser Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly 1585 1590 1595 1600 Ile Phe Tyr Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val 1605 1610 1615 Asn Met Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu 1620 1625 1630 Glu Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1635 1640 1645 Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gin Phe Ile Glu Phe Lys 1650 1655 1660 Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu Ile Ala Lys 1665 1670 1675 1680 Pro Asn Lys Val Gin Leu Ile Ala Met Asp Leu Pro Met Val Ser Gly 1685 1690 1695 Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala Phe Thr Lys Arg Val 1700 1705 1710 -97.5- Leu Gly Glu Gly Glu Met Asp Ser Leu Arg Ser Gin Met Glu Glu Arg 1715 1720 1725 Phe Met Ser Ala Asn Pro Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr 1730 1735 1740 Thr Leu Lys Arg Lys Gin Glu Val Ser Ala Thr Ile Gin Arg Ala Tyr 1745 1750 1755 1760 Arg Arg Tyr Arg Leu Arg Gin Val Lys Asn Ile Ser Ser Ile Tyr Ile 1765 1770 1775 Lys Asp Gly Asp Arg Asp Asp Asp Leu Asn Lys Asp Phe Asp Asn Val 1780 1785 1790 Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Thr Ser Thr Ser Pro Pro 1795 1800 1805 Ser Tyr Asp Ser Val Thr Lys Pro Asp Glu Lys Tyr Glu Asp Thr-Glu 1810 1815 1820 Lys Glu Asp Lys Lys Asp Ser Lys Glu Ser Lys 1825 1830 1835 INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 1969 amino acids TYPE: amino acid STRANDEDNESS: not relevant TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: Met Ala Met Leu Pro Pro Pro Gly Pro Gin Ser Phe Val His Phe Thr 1 5 10 Lys Gin Ser Leu Ala Leu Ile Glu Gin Arg Ile Ala Glu Arg Lys Ser 25 Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 40 Pro Ser Ser Asp Leu Glu Ala Gly Lys Gin Leu Pro Phe Ile Tyr Gly 55 Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro 70 75 Tyr Tyr Ala Asp Lys 
Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Ala 90 Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110 Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125 -97.6- Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140 Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Gly Tyr Thr Phe Thr 145 150 155 160 Gly Ile Tyr Thr Phe Giu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175 Cys Val Gly Giu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190 Phe Val Val Ile Val Phe Ala Tyr Leu Thr Giu Phe Val Asn Leu Gly 195 200 205 Asn Val Ser Ala Leu Arq Thr Phe Arg Val Leu Arq Ala Leu Lys Thr 210 215 220 Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu le .Gln 225 230 235 240 ***Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu .245 250 255 Ser Val Phe Ala Leu Ile Gly Leu Gin Leu Phe Met Gly Asn Leu Lys 260 265 270 .555 His Lys Cys Phe Arg Asn Ser Leu Giu Asn Asn Glu Thr Leu Giu Ser 275 280 285 Ile Met Asn Thr Leu Glu Ser Giu Giu Asp Phe Arq Lys Tyr Phe Tyr *290 295 300 Tyr Leu Giu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp 305 310 315 320 *Ser Gly Gin Cys Pro Giu Gly Tyr Thr Cys Val Lys Ile Gly Arq Asn 325 330 335 Pro Asp Tyr Giy Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu *.340 345 350 Ala Leu Phe Arq Leu Met Thr Gin Asp Tyr Trp Giu Asn Leu Tyr Gin 355 360 365 Gin Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380 Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val 385 390 395 400 Val Ala Met Ala Tyr Glu Giu Gin Asn Gin Ala Asn Ile Glu Giu Ala 405 410 415 Lys Gin Lys Giu Leu Giu Phe Gin Gin Met Leu Asp Arg Leu Lys Lys 420 425 430 Giu Gin Giu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Giu Tyr Thr 435 440 445 Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Giu 450 455 460 -97.7- Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg 465 470 475 480 Lys Lys Lys Asn Gin Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495 Ala 
Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510 Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525 Leu Ser Thr Pro Asn Gin Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540 Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg 545 550 555 560 Gly Arg Asp Xaa Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His-Ser 565 570 575 Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590 Arg Pro Xaa Glu Arg Arg Ser Ser Asn Ile Ser Gin Ala Ser Arg Ser 595 600 605 Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 615 620 Asn Gly Val Val Ser Leu Val Asp Gly Xaa Ser Ala Leu Met Leu Pro 625 630 635 640 Asn Gly Gin Leu Leu Pro Glu Gly Thr Thr Asn Gin Ile His Lys Lys 645 650 655 Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670 Asn Leu Arg Gin Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685 Val Glu Glu Leu Glu Glu Ser Arg Gin Lys Cys Pro Pro Trp Trp Tyr 690 695 700 Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile 705 710 715 720 Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735 Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750 Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765 Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780 Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gin Val Gly Trp Asn Ile Phe 785 790 795 800 -97.8- Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815 Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830 Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845 Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860 Ile Val Phe Ile Phe Ala Val Val Gly Met Gin Leu Phe Gly Lys Ser 865 870 875 880 Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895 Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910 Leu 
Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925 Gly Gin Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly S930 935 940 Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe 945 950 955 960 oo Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975 Leu Gin Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990 Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005 Ile Ser Arg Glu Ile Arg Gin Ala Glu Asp Leu Asn Thr Lys Lys Glu S1010 1015 1020 Asn Tyr Ile Ser Asn Met Thr Leu Ala Glu Met Ser Lys Gly His Asn 1025 1030 1035 1040 Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser Ser Xaa Asp 1045 1050 1055 Lys His Leu Met Glu Asp Ser Asp Gly Gin Ser Phe Ile His Asn Pro 1060 1065 1070 Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly Glu Ser Asp Leu Glu 1075 1080 1085 Met Asn Glu Glu Leu Ser Ser Asp Ser Asp Ser Tyr Ser Lys Asn Arg 1090 1095 1100 Ser Ser Ser Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu 1105 1110 1115 1120 Gly Glu Glu Ala Glu Ala Glu Pro Asn Asp Glu Pro Glu Ala Cys Phe 1125 1130 1135 -97.9- Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gin Val Asn Ile Glu 1140 1145 1150 Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr Cys Tyr' Lys 1155 1160 1165 Ile Val Giu His Ser Trp Phe Glu Ser Phe Ile Val Leu Met Ile Leu 1170 1175 1180 Leu Ser Ser Gly Ala Leu Ala Phe Giu Asp Ile Tyr Ile Giu Arg Lys 1185 1190 1195 1200 Lys Thr Ile Lys Ile Ile Leu Giu Tyr Ala Asp Lys Ile Phe Thr Tyr 1205 1210 1215 Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Ile Ala Tyr Gly Tyr Lys 1220 1225 1230 Tl-r Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu Ile Val-Asp 1235 1240 1245 *..Val Ser Leu Val Thr Leu Val Ala Asn Thr Leu Gly Tyr Ser Asp Leu 1250 1255 1260 Gly Pro Ile Lys Ser Leu Arg Thr Leu Arg Ala Leu Arq Pro Leu Arg 1265 1270 1275 1280 *Ala Leu Ser Arg Phe Glu Gly Met Arg Val Val Val Asn Ala Leu Ile 1285 1290 1295 .Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe *1300 1305 1310 
Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe 1315 1320 1325 *Tyr Glu Cys Ile Asn Thr Thr Asp Giy Ser Arq Phe Pro Ala Ser Gin 1330 1335 1340 Val Pro Asn Arq Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gin Asn **1345 1350 1355 1360 Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu Gly 1365 1370 1375 Tyr Leu Ser Leu Leu Gin Val Ala Thr Phe Lys Gly Trp Thr Ile Ile 1380 1385 1390 Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gin Pro Lys Tyr 1395 1400 1405 Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val Phe Ile Ile Phe 1410 1415 1420 Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile Ile Asp Asn 1425 1430 1435 1440 Phe Asn Gin Gin Lys Lys Lys Leu Gly Gly Gin Asp Ile Phe Met Thr 1445 1450 1455 Giu Glu Gin Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu Gly Ser Lys 1460 1465 1470 -97.10- Lys Pro Gin Lys Pro Ile Pro Arg Pro Gly Asn Lys Ile Gin Gly Cys 1475 1480 1485 Ile Phe Asp Leu Val Thr Asn Gin Ala Phe Asp Ile Ser Ile Met Val 1490 1495 1500 Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Lys Glu Gly Gin 1505 1510 1515 1520 Ser Gin His Met Thr Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile 1525 1530 1535 Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His 1540 1545 1550 Tyr Tyr Phe Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile 1555 1560 1565 Il-e Ser Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr-Phe 1570 1575 1580 Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1585 1590 1595 1600 •Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu Phe 1605 1610 1615 Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu Leu Leu 1620 1625 1630 Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser Asn Phe Ala 1635 1640 1645 Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe Asn Phe Glu Thr 1650 1655 1660 Phe Gly Asn Ser Met Ile Cys Leu Phe Gin Ile Thr Thr Ser Ala Gly 1665 1670 1675 1680 Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Lys Pro Pro Asp Cys 1685 1690 1695 Asp Pro Lys Lys Val His Pro Gly Ser Ser Val Glu Gly Asp Cys Gly 
1700 1705 1710 Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Ser Tyr Ile Ile Ile Ser 1715 1720 1725 Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu Glu Asn Phe 1730 1735 1740 Ser Val Ala Thr Glu Glu Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe 1745 1750 1755 1760 Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln 1765 1770 1775 Phe Ile Glu Phe Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro 1780 1785 1790 Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp 1795 1800 1805 Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe 1810 1815 1820 Ala Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1825 1830 1835 1840 Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys Val 1845 1850 1855 Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu Xaa Val 1860 1865 1870 Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg Leu Arg Gln 1875 1880 1885 Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp Gly Asp Arg Asp 1890 1895 1900 Asp Asp Leu Leu Asn Lys Glu Asp Met Ala Phe Asp Asn Val Asn Glu 1905 1910 1915 1920 Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr Ser Ser Thr Thr Ser Pro 1925 1930 1935 Pro Ser Tyr Asp Ser Val Thr Lys Pro Asp Lys Glu Lys Tyr Glu Xaa 1940 1945 1950 Asp Gln Thr Glu Lys Glu Asp Lys Gly Lys Asp Ser Lys Glu Ser Lys 1955 1960 1965 Lys
INFORMATION FOR SEQ ID NO:17:
SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs; TYPE: nucleic acid; STRANDEDNESS: single; TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TTTGTGCCCC ACAGACCCCA G 21

INFORMATION FOR SEQ ID NO:18:
SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs; TYPE: nucleic acid; STRANDEDNESS: single; TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
ACACAAATTC TTGATCTGGA ATTGCT 26

INFORMATION FOR SEQ ID NO:19:
SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs; TYPE: nucleic acid; STRANDEDNESS: single; TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
CAACCTCAGA CAGAGAGCAA TGA 23
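The three cDNA entries above each state a length next to the sequence text. As a minimal sketch, assuming only the sequence strings and lengths printed in SEQ ID NO:17 to NO:19, the following Python checks that each string matches its stated length and derives the reverse complement and GC fraction. The names `PRIMERS`, `reverse_complement`, `gc_fraction` and `check_lengths` are illustrative and do not come from the patent.

```python
# Sequences and stated lengths copied from SEQ ID NO:17-19 (spaces removed).
PRIMERS = {
    "SEQ ID NO:17": ("TTTGTGCCCCACAGACCCCAG", 21),
    "SEQ ID NO:18": ("ACACAAATTCTTGATCTGGAATTGCT", 26),
    "SEQ ID NO:19": ("CAACCTCAGACAGAGAGCAATGA", 23),
}

# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_lengths(primers: dict) -> dict:
    """Map each entry name to True if the sequence has its stated length."""
    return {name: len(seq) == stated for name, (seq, stated) in primers.items()}


if __name__ == "__main__":
    for name, (seq, stated) in PRIMERS.items():
        print(name, len(seq) == stated, reverse_complement(seq),
              round(gc_fraction(seq), 2))
```

A check of this kind is mainly useful for catching transcription damage in a scanned listing, since a dropped or duplicated base changes the length.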
Claims (13)
1. A bioassay for assessing a candidate modulating agent of a PNS SCP, comprising: contacting a candidate agent with a cell line expressing in the cell membrane of said cell a PNS SCP; and evaluating the modulation of the SC biological activity of said cell mediated by said contacting of said candidate agent.
2. A bioassay according to claim 1, wherein said cell line is selected from PC12 cells or a recombinant form thereof having an isolated nucleic acid molecule coding for a peptide comprising an amino acid sequence corresponding to at least one peripheral nervous system specific (PNS) sodium channel peptide (SCP), wherein said SCP has sodium channel (SC) biological activity.
3. A PNS SCP modulating agent, identified by a method according to claim 1.
4. A PNS SCP modulating agent according to claim 3, wherein said agent is a methyl-phenyl/halophenyl-substituted piperazine compound.
5. A PNS SCP modulating agent according to claim 4, wherein said piperazine compound is lidoflazine (Merck Index Monograph 5311) or a derivative thereof.
6. A method to treat diseases or conditions mediated by the presence of a PNS SCP, comprising administering to a patient in need of such treatment an effective amount of a PNS SCP modulating agent according to claim 3, or a pharmaceutical composition thereof.
7. A method for providing a molecular model of a PNS SCP, comprising: (a) providing a computer readable medium having recorded thereon data corresponding to a coding sequence, a homologous amino acid or nucleic acid sequence, a structural domain or a functional domain of a PNS SCP comprising an amino acid or a nucleotide sequence of at least one PNS SCP, or at least one domain thereof; (b) optionally providing a computer readable medium having recorded thereon x-ray diffraction data of said PNS SCP in crystalline form, said data sufficient to model the three-dimensional structure of said PNS SCP; (c) analysing on a computer the amino acid or nucleotide sequence data from (a) and optionally the x-ray diffraction data from (b) to provide data output defining a molecular model of at least one PNS SCP, or at least one domain thereof, said analysing utilizing computing subroutines selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement and positional refinement; and (d) obtaining atomic model output data defining the three-dimensional structure of said PNS SCP, or at least one domain thereof.
8. A computer readable medium having recorded thereon molecular model data of a PNS SCP as the model output data produced by a method according to claim 7.
9. A computer-based system for providing a molecular model of a PNS SCP, comprising the following elements; (a) a computer readable medium having recorded thereon data corresponding to an amino acid or nucleotide sequence of at least one PNS SCP, or at least one domain thereof; (b) optionally, a computer readable medium having recorded thereon x-ray diffraction data of said at least one PNS SCP or at least one domain thereof; (c) at least one computing subroutine for analyzing on a computer the amino acid sequence data from (a) and optionally, the x-ray diffraction data from (b) to provide data output defining a molecular model of PNS SCP, or at least one domain thereof, said analyzing utilizing computing subroutines selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement and positional refinement; and (d) retrieval means for obtaining model output data defining the three dimensional structure of said PNS SCP, or at least one domain thereof.
10. A computer readable medium, comprising molecular model data of at least one PNS SCP produced by a method according to claim 9.
11. A method for providing a computer molecular model of a ligand of a PNS SCP, comprising: (a) providing a computer readable medium according to claim comprising molecular model data of a PNS SCP, or at least one domain thereof; (b) providing a computer readable medium having recorded thereon molecular model data sufficient to generate molecular models of potential ligands of said PNS SCP; (c) analyzing on a computer the molecular model data from (a) and the ligand data from (b) to determine binding sites of said PNS SCP and to provide data output defining a molecular model of a ligand of said PNS SCP, said analyzing utilizing computing subroutines selected from the group consisting of data processing and reduction, auto-indexing, intensity scaling, intensity merging, amplitude conversion, truncation, molecular replacement, molecular alignment, molecular refinement, electron density map calculation, electron density modification, electron map visualization, model building, rigid body refinement and positional refinement; and (d) obtaining model output data defining a molecular model of at least one ligand of a PNS SCP, or a domain thereof.
12. A PNS SCP ligand molecular model, comprising a computer readable medium having recorded thereon the model output data produced by a method according to claim 11.
13. An isolated PNS SCP ligand corresponding to the physical molecule of the molecular model of the ligand model produced by a method according to claim 11.
14. A computer-based system for providing a molecular model of a ligand of a PNS SCP, comprising the following elements; (a) a computer readable medium having recorded thereon molecular model data of a PNS SCP, or at least one domain thereof; (b) a computer readable medium having recorded thereon molecular model data sufficient to generate molecular models of potential ligands of said PNS SCP; (c) at least one computing subroutine for analyzing on a computer the molecular model data of said PNS SCP from (a) and the ligand data from (b) to determine binding sites of PNS SCP and to provide data output defining molecular models of potential ligands of PNS SCP, said analyzing utilizing at least one computing subroutine selected from the group consisting of data processing and reduction, auto-indexing, truncation, molecular replacement, molecular alignment, molecular refinement, molecular translation, R-factor determination, electron density modification, electron density mapping, map density averaging, map visualization, model building, rigid body refinement, position refinement, crystallographic water adding, geometrical analysis and B-factor averaging; and (d) retrieval means for obtaining model output data defining the molecular models of potential ligands of said PNS SCP.
15. A computer readable medium, comprising molecular model output data of a potential ligand of said PNS SCP, said data produced by a method according to claim 14.
16. An isolated PNS SCP ligand, corresponding to the physical molecule of the molecular model of a ligand produced by a computer system according to claim 14.
Dated this 15th day of September, 1998
TROPHIX PHARMACEUTICALS, INC and THE RESEARCH FOUNDATION OF STATE UNIVERSITY OF NEW YORK
Patent Attorneys for the Applicant
PETER MAXWELL ASSOCIATES
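Claims 7 and 9 begin from amino acid sequence data recorded on a computer readable medium. As a rough sketch of what that first step could look like, and not the applicants' own software, the Python below converts a three-letter residue listing of the kind printed in the sequence listing into a one-letter string, skipping the interleaved position numbers. The function name `to_one_letter` is hypothetical, and the example fragment is the opening row of SEQ ID NO:16 with the scanned token "Gin" read as Gln, which is an assumption about the OCR.

```python
# Standard three-letter to one-letter amino acid codes; Xaa marks an
# unspecified residue, as it does in the sequence listing.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(listing: str) -> str:
    """Convert a whitespace-separated three-letter listing to one-letter code.

    Purely numeric tokens (position markers such as 5, 10, 15) are skipped.
    Any other unrecognised token raises ValueError so that scan damage is
    reported instead of being silently dropped.
    """
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue  # position marker, not a residue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognised residue token: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


if __name__ == "__main__":
    fragment = ("Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His "
                "Phe Thr 1 5 10")
    print(to_one_letter(fragment))
```

Raising on unknown tokens is a deliberate choice here: a scanned listing can contain misread residue names, and a strict parser surfaces them before any downstream modeling step uses the data.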
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US33402994A | 1994-11-02 | 1994-11-02 | |
| US08/334029 | 1994-11-02 | ||
| US48240195A | 1995-06-07 | 1995-06-07 | |
| US08/482401 | 1995-06-07 | ||
| AU41434/96A AU697465B2 (en) | 1994-11-02 | 1995-11-02 | Peripheral nervous system specific sodium channels, DNA encoding therefor, crystallization, x-ray diffraction, computer molecular modeling, rational drug design, drug screening, and methods of making and using thereof |
Related Parent Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU41434/96A Division AU697465B2 (en) | 1994-11-02 | 1995-11-02 | Peripheral nervous system specific sodium channels, DNA encoding therefor, crystallization, x-ray diffraction, computer molecular modeling, rational drug design, drug screening, and methods of making and using thereof |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| AU8515598A (en) | 1998-11-26 |
| AU715138B2 (en) | 2000-01-20 |
Family
ID=27154073
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| AU85155/98A Ceased AU715138B2 (en) | 1994-11-02 | 1998-09-16 | Peripheral nervous system specific sodium channel peptide (PNS SCP) modulating agent and bioassay therefor, treatment of diseases mediated by PNS SCP, and computer molecular modeling of PNS SCP |
Country Status (1)
| Country | Link |
|---|---|
| AU (1) | AU715138B2 (en) |
- 1998-09-16: AU application AU85155/98A, patent AU715138B2 (en), not active (Ceased)
Non-Patent Citations (1)
| Title |
|---|
| MERCK INDEX MONOGRAPH 5311, 1983 * |
Also Published As
| Publication number | Publication date |
|---|---|
| AU8515598A (en) | 1998-11-26 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US6703486B2 (en) | Peripheral nervous system specific sodium channels | |
| DE69733572T2 (en) | MATERIAL HEMOCHROMATOSIS GENE | |
| EP0858467B1 (en) | Materials and methods relating to the identification and sequencing of the brca2 cancer susceptibility gene and uses thereof | |
| US20010047090A1 (en) | VANILREP1 polynucleotides and VANILREP1 polypeptides | |
| US6620914B1 (en) | Transcription factor islet-brain 1 (IB1) | |
| JP2003513611A (en) | ABC1 polypeptides and methods and reagents for regulating cholesterol levels | |
| EP0716695A1 (en) | Human calcium channel compositions and methods using them | |
| WO2002042735A2 (en) | Human kcr1 regulation of herg potassium channel block | |
| JPH08511158A (en) | Von Hippel-Lindau (VHL) disease gene and corresponding cDNA and method for detecting VHL disease gene carrier | |
| US7067255B2 (en) | Hereditary hemochromatosis gene | |
| WO1994007906A9 (en) | Epidermal surface antigen and uses thereof | |
| AU715138B2 (en) | Peripheral nervous system specific sodium channel peptide (PNS SCP) modulating agent and bioassay therefor, treatment of diseases mediated by PNS SCP, and computer molecular modeling of PNS SCP | |
| AU770018B2 (en) | Human N-type calcium channel isoform and uses thereof | |
| JP2002506875A (en) | Methods and compositions for diagnosis and treatment of chromosome 18p-related disorders | |
| WO2001044473A2 (en) | Polypeptides and nucleic acids encoding same | |
| US20020127669A1 (en) | Novel compositions for the expression of the human peptide histidine transporter 1 and methods of use thereof | |
| WO1996036711A2 (en) | Islet-specific homeoprotein and transcriptional regulator of insulin gene expression | |
| US20050159355A1 (en) | Novel leukotriene, B4 receptor | |
| US20090018071A1 (en) | Epididymis-specific receptor protein | |
| JP2002509860A (en) | Regulation of sodium channels in dorsal root ganglia | |
| CA2512444C (en) | Screening method of agents for improving the memory and learning | |
| CA2320579A1 (en) | Peripheral nervous system specific sodium channels, dna encoding therefor, crystallization, x-ray diffraction, computer molecular modeling, rational drug design, drug screening, and methods of making and using thereof | |
| HK1111734A (en) | Transcription factor islet-brain 1(ib1) | |
| HK1111733A (en) | Transcriptions factor islet-brain 1(ib1) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FGA | Letters patent sealed or granted (standard patent) | |