AU735391B2 - Helicobacter polypeptides and corresponding polynucleotide molecules - Google Patents
- Publication number
- AU735391B2 (AU 52662/98 A)
- Authority
- AU
- Australia
- Prior art keywords
- ghpo
- seq
- leu
- lys
- ser
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IG], e.g. monoclonal or polyclonal antibodies
- C07K16/12—Immunoglobulins [IG], e.g. monoclonal or polyclonal antibodies against material from bacteria
- C07K16/1203—Gram-negative bacteria
- C07K16/121—Helicobacter (G); Campylobacter (G)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/205—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Campylobacter (G)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Immunology (AREA)
- Gastroenterology & Hepatology (AREA)
- Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
- Saccharide Compounds (AREA)
Description
HELICOBACTER POLYPEPTIDES AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
The invention relates to Helicobacter antigens and corresponding polynucleotide molecules that can be used in methods to prevent or treat Helicobacter infection in mammals, such as humans.
Background of the Invention
All references, including any patents or patent applications, cited in this specification are hereby incorporated by reference. No admission is made that any reference constitutes prior art. The discussion of the references states what their authors assert, and the applicants reserve the right to challenge the accuracy and pertinency of the cited documents. It will be clearly understood that, although a number of prior art publications are referred to herein, this reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art, in Australia or in any other country.
Helicobacter is a genus of spiral, gram-negative bacteria that colonize the gastrointestinal tracts of mammals. Several species colonize the stomach, most notably H. pylori, H. heilmanii, H. felis, and H. mustelae. Although H. pylori is the species most commonly associated with human infection, H. heilmanii and H. felis have also been isolated from humans, but at lower frequencies than H. pylori.
Helicobacter infects over 50% of adult populations in developed countries and nearly 100% in developing countries and some Pacific rim countries, making it one of the most prevalent infections worldwide.
Helicobacter is routinely recovered from gastric biopsies of humans with histological evidence of gastritis and peptic ulceration. Indeed, H. pylori is now recognized as an important pathogen of humans, in that the chronic gastritis it causes is a risk factor for the development of peptic ulcer diseases and gastric carcinoma. It is thus highly desirable to develop safe and effective vaccines for preventing and treating Helicobacter infection.
A number of Helicobacter antigens have been characterized or isolated. These include urease, which is composed of two structural subunits of approximately 30 and 67 kDa (Hu et al., Infect. Immun. 58:992, 1990; Dunn et al., J. Biol. Chem. 265:9464, 1990; Evans et al., Microbial Pathogenesis 10:15, 1991; Labigne et al., J. Bact., 173:1920, 1991); the 87 kDa vacuolar cytotoxin (VacA) (Cover et al., J. Biol. Chem. 267:10570, 1992; Phadnis et al., Infect. Immun. 62:1557, 1994; WO 93/18150); a 128 kDa immunodominant antigen associated with the cytotoxin (CagA, also called TagA; WO 93/18150; U.S. Patent No. 5,403,924); 13 and 58 kDa heat shock proteins HspA and HspB (Suerbaum et al., Mol. Microbiol. 14:959, 1994; WO 93/18150); a 54 kDa catalase (Hazell et al., J. Gen. Microbiol. 137:57, 1991); a 15 kDa histidine-rich protein (Hpn) (Gilbert et al., Infect. Immun. 63:2682, 1995); a 20 kDa membrane-associated lipoprotein (Kostrcynska et al., J. Bact. 176:5938, 1994); a 30 kDa outer membrane protein (Bolin et al., J. Clin. Microbiol. 33:381, 1995); a lactoferrin receptor (FR 2,724,936); and several porins, designated HopA, HopB, HopC, HopD, and HopE, which have molecular weights of 48-67 kDa (Exner et al., Infect. Immun. 63:1567, 1995; Doig et al., J. Bact. 177:5447, 1995). Some of these proteins have been proposed as potential vaccine antigens. In particular, urease is believed to be a vaccine candidate (WO 94/9823; WO 95/22987; WO 95/3824; Michetti et al., Gastroenterology 107:1002, 1994). Nevertheless, it is thought that several antigens may ultimately be necessary in a vaccine.
Summary of the Invention
The invention provides polynucleotide molecules that encode Helicobacter polypeptides, designated GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252, which can be used, e.g., in methods to prevent, treat, or diagnose Helicobacter infection. The polypeptides of the invention include those having the amino acid sequences shown in SEQ ID NOs:2-170 (even numbers), as well as mature forms of proteins having sequences shown in SEQ ID NOs:2-170 in their unprocessed forms, and fragments thereof. Those skilled in the art will understand that the invention also includes polynucleotide molecules that encode mutants and derivatives of these polypeptides, which can result from the addition, deletion, or substitution of non-essential amino acids, as is described further below.
In addition to the polynucleotide molecules described above, the invention includes the corresponding polypeptides (i.e., polypeptides encoded by the polynucleotide molecules of the invention, or fragments thereof), and monospecific antibodies that specifically bind to these polypeptides.
The present invention has many applications and includes expression cassettes, vectors, and cells transformed or transfected with the polynucleotides of the invention. Accordingly, the present invention provides (i) methods for producing polypeptides of the invention in recombinant host systems and related expression cassettes, vectors, and transformed or transfected cells; (ii) live vaccine vectors, such as pox virus, Salmonella typhimurium, and Vibrio cholerae vectors, that contain polynucleotides of the invention (such vaccine vectors being useful in, e.g., methods for preventing or treating Helicobacter infection) in combination with a diluent or carrier, and related pharmaceutical compositions and associated therapeutic and/or prophylactic methods; (iii) therapeutic and/or prophylactic methods involving administration of polynucleotide molecules, either in a naked form or formulated with a delivery vehicle, polypeptides or mixtures of polypeptides, or monospecific antibodies of the invention, and related pharmaceutical compositions; (iv) methods for detecting the presence of Helicobacter in biological samples, which can involve the use of polynucleotide molecules, monospecific antibodies, or polypeptides of the invention; and (v) methods for purifying polypeptides of the invention by antibody-based affinity chromatography.
For the purposes of this specification it will be clearly understood that the word "comprising" means "including but not limited to", and that the word "comprises" has a corresponding meaning.
Brief Description of the Drawings
Fig. 1A is a diagrammatic representation of transposon TnMax9, which is a derivative of the TnMax transposon system (Haas et al., Gene 130:23-21, 1993). The mini-transposon carries the blaM gene, which is the β-lactamase gene lacking a promoter and a signal sequence, next to the inverted repeats (IR) and the M13 forward (M13-FP) and reverse (M13-RP1) primer binding sites. The resolution sites (res) and an origin of replication (ori fd) are located between the blaM gene and the constitutive catGC-resistance gene. The transposase tnpA and resolvase tnpR genes are located outside the mini-transposon and are under the control of the inducible Ptrc promoter. The lacIq gene encodes the Lac repressor.
Fig. 1B is a diagrammatic representation of plasmid pMin2. pMin2 contains a multiple cloning site, the tetracycline resistance gene (tet), an origin of transfer (oriT), an origin of replication (orico.,), a transcriptional terminator, and a weak, constitutive promoter. H. pylori chromosome fragments were introduced into the BglII and ClaI sites of pMin2.
Detailed Description
Open reading frames (ORFs) encoding new, full length polypeptides, designated GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252, have been identified in the H. pylori genome. These polypeptides can be used, for example, in vaccination methods for preventing or treating Helicobacter infection. Some of the new polypeptides are secreted polypeptides that can be produced in their mature forms (i.e., as polypeptides that have been exported through class II or class III secretion pathways) or as precursors that include signal peptides, which can be removed in the course of excretion/secretion by cleavage at the N-terminal end of the mature form. (The cleavage site is located at the C-terminal end of the signal peptide, adjacent to the mature form.)
According to a first aspect of the invention, there are provided isolated polynucleotides that encode the precursor and mature forms of Helicobacter GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252.
An isolated polynucleotide of the invention encodes: (i) a polypeptide having an amino acid sequence that is homologous to a Helicobacter amino acid sequence of a polypeptide, the Helicobacter amino acid sequence being selected from the group consisting of the amino acid sequences shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of the polypeptide.
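The numbering used in the list above follows a regular pattern: each polypeptide takes the next even SEQ ID NO in listing order. A minimal Python sketch of that bookkeeping follows. The function name `assign_seq_ids` is hypothetical, and pairing each even-numbered amino acid sequence with the odd number immediately before it as its coding sequence is an assumption drawn from the "odd numbers"/"even numbers" convention of this specification, not a statement from the sequence listing itself.

```python
def assign_seq_ids(designations):
    """Pair each designation with (nucleotide SEQ ID NO, amino acid SEQ ID NO).

    Assumption: entries are numbered in listing order, odd numbers for
    coding sequences and the following even number for the polypeptide.
    """
    return [(name, 2 * i + 1, 2 * i + 2) for i, name in enumerate(designations)]


# First five designations, in the order given in the text.
first_five = ["GHPO 13", "GHPO 73", "GHPO 90", "GHPO 107", "GHPO 136"]
table = assign_seq_ids(first_five)

for name, nt_id, aa_id in table:
    print(f"{name}: nucleotide SEQ ID NO:{nt_id}, amino acid SEQ ID NO:{aa_id}")
```

A list of tuples is used rather than a dictionary keyed by name because one designation (GHPO 1536) appears twice in the listing.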
In addition to the full-length polypeptides encoded by the polynucleotides of the invention, as set forth above, polynucleotides included in the invention can also encode polypeptides that lack signal sequences, as well as other polypeptide or peptide fragments of the full-length polypeptides.
The term "isolated polynucleotide" is defined as a polynucleotide that is removed from the environment in which it naturally occurs. For example, a naturally-occurring DNA molecule present in the genome of a living bacterium or as part of a gene bank is not isolated, but the same molecule, separated from the remaining part of the bacterial genome, as a result of, e.g., a cloning event (amplification), is "isolated." Typically, an isolated DNA molecule is free from DNA regions (e.g., coding regions) with which it is immediately contiguous, at the 5' or 3' ends, in the naturally occurring genome.
Such isolated polynucleotides can be part of a vector or a composition and still be isolated, as such a vector or composition is not part of its natural environment.
A polynucleotide of the invention can consist of RNA or DNA (e.g., cDNA, genomic DNA, or synthetic DNA), or modifications or combinations of RNA or DNA. The polynucleotide can be double-stranded or single-stranded and, if single-stranded, can be the coding (sense) strand or the non-coding (antisense) strand. The sequences that encode polypeptides of the invention, as shown in any of SEQ ID NOs:2-170 (even numbers), can be (i) the coding sequence as shown in any of SEQ ID NOs:1-169 (odd numbers); (ii) a ribonucleotide sequence derived by transcription of (i); or (iii) a different coding sequence that, as a result of the redundancy or degeneracy of the genetic code, encodes the same polypeptides as the polynucleotide molecules having the sequences illustrated in any of SEQ ID NOs:1-169 (odd numbers). The polypeptide can be one that is naturally secreted or excreted by, e.g., H. felis, H. mustelae, H. heilmanii, or H. pylori.
By "polypeptide" or "protein" is meant any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Both terms are used interchangeably in the present application.
By "homologous amino acid sequence" is meant an amino acid sequence that differs from an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers), or an amino acid sequence encoded by the nucleotide sequence of any of SEQ ID NOs:1-169 (odd numbers), by one or more non-conservative amino acid substitutions, deletions, or additions located at positions at which they do not destroy the specific antigenicity of the polypeptide. Preferably, such a sequence is at least 75%, more preferably at least 80%, and most preferably at least 90% identical to an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers).
Homologous amino acid sequences include sequences that are identical or substantially identical to an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). By "amino acid sequence that is substantially identical" is meant a sequence that is at least 90%, preferably at least 95%, more preferably at least 97%, and most preferably at least 99% identical to an amino acid sequence of reference and that differs from the sequence of reference, if at all, by a majority of conservative amino acid substitutions.
Conservative amino acid substitutions typically include substitutions among amino acids of the same class. These classes include, for example, amino acids having uncharged polar side chains, such as asparagine, glutamine, serine, threonine, and tyrosine; amino acids having basic side chains, such as lysine, arginine, and histidine; amino acids having acidic side chains, such as aspartic acid and glutamic acid; and amino acids having nonpolar side chains, such as glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan, and cysteine.
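The four classes listed above can be turned into a simple lookup. The following Python sketch is illustrative only: the one-letter codes are the standard amino acid abbreviations, while the names `RESIDUE_CLASSES`, `residue_class`, and `is_conservative` are hypothetical and do not appear in the specification.

```python
# Side-chain classes as listed in the text, written in one-letter code.
RESIDUE_CLASSES = {
    "uncharged polar": set("NQSTY"),   # Asn, Gln, Ser, Thr, Tyr
    "basic": set("KRH"),               # Lys, Arg, His
    "acidic": set("DE"),               # Asp, Glu
    "nonpolar": set("GAVLIPFMWC"),     # Gly, Ala, Val, Leu, Ile, Pro, Phe, Met, Trp, Cys
}


def residue_class(residue):
    """Return the class name for a one-letter amino acid code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """True if the substitution stays within one of the classes above."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return residue_class(original) == residue_class(replacement)


print(is_conservative("K", "R"))  # lysine -> arginine, both basic
print(is_conservative("D", "K"))  # aspartic acid -> lysine, acidic vs basic
```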
Homology can be measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Similar amino acid sequences are aligned to obtain the maximum degree of homology (i.e., identity). To this end, it may be necessary to artificially introduce gaps into the sequence. Once the optimal alignment has been set up, the degree of homology (i.e., identity) is established by recording all of the positions in which the amino acids of both sequences are identical, relative to the total number of positions.
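The identity calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular software package: it assumes the two sequences have already been aligned (gaps written as "-") and takes "the total number of positions" to mean the number of alignment columns, which is one possible reading. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumptions: inputs are pre-aligned, equal-length strings; a column
    containing a gap never counts as identical; the denominator is the
    total number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical 7-column alignment with one gap and one mismatch.
print(round(percent_identity("MKT-LLS", "MKTALIS"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the denominator should be stated whenever identity values are compared against the thresholds given in the text.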
Homologous polynucleotide sequences are defined in a similar way.
Preferably, a homologous sequence is one that is at least 45%, more preferably at least 60%, and most preferably at least 85% identical to a coding sequence of any of SEQ ID NOs:1-169 (odd numbers).
Polypeptides having a sequence homologous to any one of the sequences shown in SEQ ID NOs:2-170 (even numbers), include naturally-occurring allelic variants, as well as mutants or any other non-naturally occurring variants that are analogous in terms of antigenicity, to a polypeptide having a sequence as shown in any one of SEQ ID NOs:2-170 (even numbers).
As is known in the art, an allelic variant is an alternate form of a polypeptide that is characterized as having a substitution, deletion, or addition of one or more amino acids that does not alter the biological function of the polypeptide. By "biological function" is meant a function of the polypeptide in the cells in which it naturally occurs, even if the function is not necessary for the growth or survival of the cells. For example, the biological function of a porin is to allow the entry into cells of compounds present in the extracellular medium. The biological function is distinct from the antigenic function. A polypeptide can have more than one biological function.
Allelic variants are very common in nature. For example, a bacterial species, e.g., H. pylori, is usually represented by a variety of strains that differ from each other by minor allelic variations. Indeed, a polypeptide that fulfills the same biological function in different strains can have an amino acid sequence that is not identical in each of the strains. Such an allelic variation can be equally reflected at the polynucleotide level.
Support for the use of allelic variants of polypeptide antigens comes from, e.g., studies of the Helicobacter urease antigen. The amino acid sequence of Helicobacter urease varies widely from species to species, yet cross-species protection occurs, indicating that the urease molecule, when used as an immunogen, is highly tolerant of amino acid variations. Even among different strains of the single species H. pylori, there are amino acid sequence variations.
For example, although the amino acid sequences of the UreA and UreB subunits of H. pylori and H. felis ureases differ from one another by 26.5% and 11.8%, respectively (Ferrero et al., Molecular Microbiology 9(2):323-333, 1993), it has been shown that H. pylori urease protects mice from H. felis infection (Michetti et al., Gastroenterology 107:1002, 1994). In addition, it has been shown that the individual structural subunits of urease, UreA and UreB, which contain distinct amino acid sequences, are both protective antigens against Helicobacter infection (Michetti et al., supra).
Similarly, Cuenca et al. (Gastroenterology 110:1770, 1996) showed that therapeutic immunization of H. mustelae-infected ferrets with H. pylori urease was effective at eradicating H. mustelae infection. Further, several urease variants have been reported to be effective vaccine antigens, including, e.g., recombinant UreA + UreB apoenzyme expressed from pORV142 (UreA and UreB sequences derived from H. pylori strain CPM630; Lee et al., J. Infect. Dis. 172:161, 1995); recombinant UreA + UreB apoenzyme expressed from pORV214 (UreA and UreB sequences differ from H. pylori strain CPM630 by one and two amino acid changes, respectively; Lee et al., supra, 1995); a UreA-glutathione-S-transferase fusion protein (UreA sequence from H. pylori strain ATCC 43504; Thomas et al., Acta Gastro-Enterologica Belgica 56:54, 1993); UreA + UreB holoenzyme purified from H. pylori strain NCTC11637 (Marchetti et al., Science 267:1655, 1995); a UreA-MBP fusion protein (UreA from H. pylori strain 85P; Ferrero et al., Infection and Immunity 62:4981, 1994); a UreB-MBP fusion protein (UreB from H. pylori strain 85P; Ferrero et al., supra); a UreA-MBP fusion protein (UreA from H. felis strain ATCC 49179; Ferrero et al., supra); a UreB-MBP fusion protein (UreB from H. felis strain ATCC 49179; Ferrero et al., supra); and a 37 kDa fragment of UreB containing amino acids 220-569 (Dore-Davin et al., "A 37 kD fragment of UreB is sufficient to confer protection against Helicobacter felis infection in mice"). Finally, Thomas et al. (supra) showed that oral immunization of mice with crude sonicates of H. pylori protected mice from subsequent challenge with H. felis.
Polynucleotides, e.g., DNA molecules, encoding allelic variants can easily be obtained by polymerase chain reaction (PCR) amplification of genomic bacterial DNA extracted by conventional methods. This involves the use of synthetic oligonucleotide primers matching sequences that are upstream and downstream of the 5' and 3' ends of the coding region. Suitable primers can be designed based on the nucleotide sequence information provided in any of SEQ ID NOs:1-169 (odd numbers). Typically, a primer consists of 10 to preferably 15 to 25 nucleotides. It can also be advantageous to select primers containing C and G nucleotides in proportions sufficient to ensure efficient hybridization, e.g., an amount of C and G nucleotides of at least preferably 50%, of the total nucleotide amount. Those skilled in the art can readily design primers that can be used to isolate the polynucleotides of the invention from different Helicobacter strains. Experimental conditions for carrying out PCR can readily be determined by one skilled in the art, and an illustration of carrying out PCR is provided in the Examples below. As is well known in the art, restriction endonuclease recognition sites that contain, typically, 4 to 6 nucleotides (for example, the sequences 5'-GGATCC-3' (BamHI) or 5'-CTCGAG-3' (XhoI)), can be included on the 5' ends of the primers. Restriction sites can be selected by those skilled in the art so that the amplified DNA can be conveniently cloned into an appropriately digested vector, such as a plasmid.
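The primer criteria discussed above (length, G+C proportion, and a restriction site on the 5' end) lend themselves to a quick check in code. The Python sketch below is an illustration under stated assumptions: it uses the preferred values from the text (15 to 25 nucleotides, at least 50% G+C) as defaults, the BamHI and XhoI recognition sequences are those quoted in the text, and the function names and the example primer are hypothetical rather than taken from the Examples.

```python
# Recognition sequences quoted in the text (written 5' to 3').
RESTRICTION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def gc_fraction(primer):
    """Fraction of G and C nucleotides in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_preferred_criteria(primer, min_len=15, max_len=25, min_gc=0.5):
    """Check the preferred length range and G+C proportion from the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


def add_restriction_site(primer, enzyme):
    """Prepend the enzyme's recognition sequence to the 5' end of the primer."""
    return RESTRICTION_SITES[enzyme] + primer.upper()


# Hypothetical 20-nucleotide gene-specific primer (not from the sequence listing).
primer = "ATGGCTAGCCGTACGGATCA"
print(meets_preferred_criteria(primer))
print(add_restriction_site(primer, "BamHI"))
```

The criteria are applied to the gene-specific portion only; the added recognition sequence does not anneal to the template in the first amplification cycles.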
Useful homologs that do not occur naturally can be designed using known methods for identifying regions of an antigen that are likely to be tolerant of amino acid sequence changes and/or deletions. For example, sequences of the antigen from different species can be compared to identify conserved sequences.
Polypeptide derivatives that are encoded by polynucleotides of the invention include, e.g., fragments, polypeptides having large internal deletions derived from full-length polypeptides, and fusion proteins. Polypeptide fragments of the invention can be derived from a polypeptide having a sequence homologous to any of the sequences of SEQ ID NOs:2-170 (even numbers), to the extent that the fragments retain the substantial antigenicity of the parent polypeptide (specific antigenicity). Polypeptide derivatives can also be constructed by large internal deletions that remove a substantial part of the parent polypeptide, while retaining specific antigenicity. Generally, polypeptide derivatives should be at least about 12 amino acids in length to maintain antigenicity. Advantageously, they can be at least 20 amino acids, preferably at least 50 amino acids, more preferably at least 75 amino acids, and most preferably at least 100 amino acids in length.
Useful polypeptide derivatives, e.g., polypeptide fragments, can be designed using computer-assisted analysis of amino acid sequences in order to identify sites in protein antigens having potential as surface-exposed, antigenic regions (Hughes et al., Infect. Immun. 60(9):3497, 1992). For example, the Laser Gene Program from DNA Star can be used to obtain hydrophilicity, antigenic index, and intensity index plots for the polypeptides of the invention.
This program can also be used to obtain information about homologies of the polypeptides with known protein motifs. One skilled in the art can readily use the information provided in such plots to select peptide fragments for use as vaccine antigens. For example, fragments spanning regions of the plots in which the antigenic index is relatively high can be selected. One can also select fragments spanning regions in which both the antigenic index and the intensity plots are relatively high. Fragments containing conserved sequences, particularly hydrophilic conserved sequences, can also be selected.
Polypeptide fragments and polypeptides having large internal deletions can be used for revealing epitopes that are otherwise masked in the parent polypeptide and that may be of importance for inducing a protective T cell-dependent immune response. Deletions can also remove immunodominant regions of high variability among strains.
It is an accepted practice in the field of immunology to use fragments and variants of protein immunogens as vaccines, as all that is required to induce an immune response to a protein is a small (e.g., 8 to 10 amino acids) immunogenic region of the protein. This has been done for a number of vaccines against pathogens other than Helicobacter. For example, short synthetic peptides corresponding to surface-exposed antigens of pathogens such as murine mammary tumor virus (peptide containing 11 amino acids; Dion et al., Virology 179:474-477, 1990), Semliki Forest virus (peptide containing 16 amino acids; Snijders et al., J. Gen. Virol. 72:557-565, 1991), and canine parvovirus (2 overlapping peptides, each containing 15 amino acids; Langeveld et al., Vaccine 12(15):1473-1480, 1994) have been shown to be effective vaccine antigens against their respective pathogens.
Polynucleotides encoding polypeptide fragments and polypeptides having large internal deletions can be constructed using standard methods (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons Inc., 1994), for example, by PCR, including inverse PCR, by restriction enzyme treatment of the cloned DNA molecules, or by the method of Kunkel et al. (Proc. Natl. Acad. Sci. USA 82:448, 1985; biological material available at Stratagene).
A polypeptide derivative can also be produced as a fusion polypeptide that contains a polypeptide or a polypeptide derivative of the invention fused, e.g., at the N- or C-terminal end, to any other polypeptide (hereinafter referred to as a peptide tail). Such a product can be easily obtained by translation of a genetic fusion, i.e., a hybrid gene. Vectors for expressing fusion polypeptides are commercially available, and include the pMal-c2 or pMal-p2 systems of New England Biolabs, in which the peptide tail is a maltose binding protein, the glutathione-S-transferase system of Pharmacia, or the His-Tag system available from Novagen. These and other expression systems provide convenient means for further purification of polypeptides and derivatives of the invention.
Another particular example of fusion polypeptides included in the invention includes a polypeptide or polypeptide derivative of the invention fused to a polypeptide having adjuvant activity, such as, e.g., subunit B of either cholera toxin or E. coli heat-labile toxin. Several possibilities can be used for producing such fusion proteins. First, the polypeptide of the invention can be fused to the N-terminal end or, preferably, to the C-terminal end of the polypeptide having adjuvant activity. Second, a polypeptide fragment of the invention can be fused within the amino acid sequence of the polypeptide having adjuvant activity.
Spacer sequences can also be included, if desired.
As stated above, the polynucleotides of the invention encode Helicobacter polypeptides in precursor or mature form. They can also encode hybrid precursors containing heterologous signal peptides, which can mature into polypeptides of the invention. By "heterologous signal peptide" is meant a signal peptide that is not found in the naturally-occurring precursor of a polypeptide of the invention.
A polynucleotide of the invention hybridizes, preferably under stringent conditions, to a polynucleotide having a sequence as shown in any of SEQ ID NOs: 1-169 (odd numbers). Hybridization procedures are, e.g., described by Ausubel et al. (supra); Silhavy et al. (Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1984); and Davis et al. (A Manual for Genetic Engineering: Advanced Bacterial Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1980). Important parameters that can be considered for optimizing hybridization conditions are reflected in the following formula, which facilitates calculation of the melting temperature (Tm), which is the temperature above which two complementary DNA strands separate from one another (Casey et al., Nucl. Acid Res. 4:1539, 1977): Tm = 81.5 + 0.5 x (% G+C) + 1.6 log (positive ion concentration) - 0.6 x (% formamide). Under appropriate stringency conditions, hybridization temperature (Th) is approximately 20 to 40°C, 20 to 25°C, or, preferably, 30 to 40°C below the calculated Tm. Those skilled in the art will understand that optimal temperature and salt conditions can be readily determined empirically in preliminary experiments using conventional procedures. For example, stringent conditions can be achieved, both for pre-hybridizing and hybridizing incubations, (i) within 4-16 hours at 42°C, in 6 x SSC containing formamide or (ii) within 4-16 hours at 65°C in an aqueous 6 x SSC solution (1 M NaCl, 0.1 M sodium citrate (pH [...])). For polynucleotides containing 30 to 600 nucleotides, the above formula is used and then is corrected by subtracting (600/polynucleotide size in base pairs). Stringency conditions are defined by a Th that is 5 to 10°C below Tm.
Hybridization conditions with oligonucleotides shorter than 20-30 bases do not precisely follow the rules set forth above. In such cases, the formula for calculating the Tm is as follows: Tm = 4 x (G+C) + 2 x (A+T), where (G+C) and (A+T) are the numbers of G or C bases and of A or T bases, respectively. For example, an 18 nucleotide fragment of 50% G+C (9 G or C bases and 9 A or T bases) would have an approximate Tm of 4 x 9 + 2 x 9 = 54°C.
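The two melting temperature rules described above can be illustrated with a short calculation. The following is only a minimal sketch, not part of the disclosed method: the function names are hypothetical, the logarithm is assumed to be base 10, the positive ion concentration is assumed to be expressed in mol/l, and the signs of the terms follow the conventional form of the long-polynucleotide formula.

```python
import math


def tm_long(gc_percent, cation_molar, formamide_percent=0.0, length_bp=None):
    """Estimate Tm (deg C) for a long polynucleotide duplex.

    Assumed form: Tm = 81.5 + 0.5*(%G+C) + 1.6*log10([cation]) - 0.6*(%formamide).
    For polynucleotides of 30 to 600 nucleotides, 600/length is subtracted.
    """
    tm = (81.5
          + 0.5 * gc_percent
          + 1.6 * math.log10(cation_molar)
          - 0.6 * formamide_percent)
    if length_bp is not None and 30 <= length_bp <= 600:
        tm -= 600.0 / length_bp
    return tm


def tm_short(seq):
    """Estimate Tm (deg C) for a short oligonucleotide: 4*(G+C) + 2*(A+T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


# An 18-mer with 9 G/C and 9 A/T bases (50% G+C), as in the example in the text.
print(tm_short("GA" * 9))  # 54
```

A hybridization temperature could then be chosen by subtracting the desired margin from the value returned by `tm_long`; in practice, as noted above, optimal conditions are determined empirically.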
A polynucleotide molecule of the invention, containing RNA, DNA, or modifications or combinations thereof, can have various applications. For example, a polynucleotide molecule can be used (i) in a process for producing the encoded polypeptide in a recombinant host system, (ii) in the construction of vaccine vectors, such as poxviruses, which are further used in methods and compositions for preventing and/or treating Helicobacter infection, (iii) as a vaccine agent, in a naked form or formulated with a delivery vehicle, and (iv) in the construction of attenuated Helicobacter strains that can over-express a polynucleotide of the invention or express it in a non-toxic, mutated form.
According to a second aspect of the invention, there is therefore provided (i) an expression cassette containing a polynucleotide molecule of the invention placed under the control of elements (e.g., a promoter) required for expression; (ii) an expression vector containing an expression cassette of the invention; (iii) a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, as well as (iv) a process for producing a polypeptide or polypeptide derivative encoded by a polynucleotide of the invention, which involves culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, under conditions that allow expression of the polynucleotide molecule of the invention, and recovering the encoded polypeptide or polypeptide derivative from the cell culture.
A recombinant expression system can be selected from procaryotic and eucaryotic hosts. Eucaryotic hosts include, for example, yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris), mammalian cells (e.g., COS1, NIH3T3, or JEG3 cells), arthropod cells (e.g., Spodoptera frugiperda (SF9) cells), and plant cells. Preferably, a procaryotic host such as E. coli is used.
Bacterial and eucaryotic cells are available from a number of different sources that are known to those skilled in the art, e.g., the American Type Culture Collection (ATCC; Rockville, Maryland).
The choice of the expression cassette will depend on the host system selected, as well as the features desired for the expressed polypeptide. For example, it may be useful to produce a polypeptide of the invention in a particular lipidated form or any other form. Typically, an expression cassette includes a constitutive or inducible promoter that is functional in the selected host system; a ribosome binding site; a start codon (ATG); if necessary, a region encoding a signal peptide, e.g., a lipidation signal peptide; a polynucleotide molecule of the invention; a stop codon; and, optionally, a 3' terminal region (translation and/or transcription terminator). The signal peptide-encoding region is adjacent to the polynucleotide of the invention and is placed in the proper reading frame. The signal peptide-encoding region can be homologous or heterologous to the polynucleotide molecule encoding the mature polypeptide and it can be specific to the secretion apparatus of the host used for expression. The open reading frame constituted by the polynucleotide molecule of the invention, alone or together with the signal peptide, is placed under the control of the promoter so that transcription and translation occur in the host system. Promoters and signal peptide-encoding regions are widely known and available to those skilled in the art and include, for example, the promoter of Salmonella typhimurium (and derivatives) that is inducible by arabinose (promoter araB) and is functional in Gram-negative bacteria such as E. coli (U.S. Patent No. 5,028,530; Cagnon et al., Protein Engineering 4(7):843, 1991); the promoter of the bacteriophage T7 RNA polymerase gene, which is functional in a number of E. coli strains expressing T7 polymerase (U.S. Patent No. 4,952,496); the OspA lipidation signal peptide; and the RlpB lipidation signal peptide (Takase et al., J. Bact. 169:5692, 1987).
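The requirement that the signal peptide-encoding region be placed in the proper reading frame with the coding polynucleotide can be illustrated with a simple check. The following is only a sketch under stated assumptions: the function name and the example sequences are hypothetical, the signal peptide region is assumed to begin with the ATG start codon, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(signal_region, coding_region):
    """Return a list of problems found in the open reading frame formed by
    joining a signal peptide-encoding region to a coding region.

    An empty list means the joined sequence starts with ATG, keeps the
    reading frame, ends with a stop codon, and has no premature stop.
    """
    orf = (signal_region + coding_region).upper()
    # Split into complete triplets only; a trailing partial codon is ignored.
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]

    problems = []
    if not orf.startswith("ATG"):
        problems.append("missing ATG start codon")
    if len(signal_region) % 3 != 0:
        problems.append("signal peptide region shifts the reading frame")
    if len(orf) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    return problems


# Hypothetical toy sequences, for illustration only.
print(check_orf("ATGAAA", "GGGTTTTAA"))  # []
```

A check of this kind only covers the reading frame; promoter, ribosome binding site, and terminator elements would still need to be verified separately for the chosen host.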
The expression cassette is typically part of an expression vector, which is selected for its ability to replicate in the chosen expression system.
Expression vectors (e.g., plasmids or viral vectors) can be chosen from, for example, those described in Pouwels et al. (Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987), and can be purchased from various commercial sources. Methods for transforming or transfecting host cells with expression vectors are well known in the art and will depend on the host system selected, as described in Ausubel et al. (supra).
Upon expression, a recombinant polypeptide of the invention (or a polypeptide derivative) is produced and remains in the intracellular compartment, is secreted/excreted in the extracellular medium or in the periplasmic space, or is embedded in the cellular membrane. The polypeptide can then be recovered in a substantially purified form from the cell extract or from the supernatant after centrifugation of the cell culture. Typically, the recombinant polypeptide can be purified by antibody-based affinity purification or by any other method known in the art, such as by genetic fusion to a small affinity-binding domain. Antibody-based affinity purification methods are also available for purifying a polypeptide of the invention extracted from a Helicobacter strain. Antibodies useful for immunoaffinity purification of the polypeptides of the invention can be obtained using methods described below.
Polynucleotides of the invention can also be used in DNA vaccination methods, using either a viral or bacterial host as gene delivery vehicle (live vaccine vector) or administering the gene in a free form, e.g., inserted into a plasmid. Therapeutic or prophylactic efficacy of a polynucleotide of the invention can be evaluated as is described below.
Accordingly, in a third aspect of the invention, there is provided (i) a vaccine vector such as a poxvirus, containing a polynucleotide molecule of the invention placed under the control of elements required for expression; (ii) a composition of matter containing a vaccine vector of the invention, together with a diluent or carrier; (iii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a vaccine vector of the invention; (iv) a method for inducing an immune response against Helicobacter in a mammal (e.g., a human; alternatively, the method can be used in veterinary applications for treating or preventing Helicobacter infection of animals, e.g., cats or birds), which involves administering to the mammal an immunogenically effective amount of a vaccine vector of the invention to elicit an immune response, e.g., a protective or therapeutic immune response to Helicobacter; and (v) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, which involves administering a prophylactic or therapeutic amount of a vaccine vector of the invention to an individual in need. Additionally, the third aspect of the invention encompasses the use of a vaccine vector of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
A vaccine vector of the invention can express one or several polypeptides or derivatives of the invention, as well as at least one additional Helicobacter antigen such as a urease apoenzyme or a subunit, fragment, homolog, mutant, or derivative thereof. In addition, it can express a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), that enhances the immune response. Thus, a vaccine vector can include an additional polynucleotide molecule encoding, e.g., urease subunit A, B, or both, or a cytokine, placed under the control of elements required for expression in a mammalian cell.
Alternatively, a composition of the invention can include several vaccine vectors, each of which is capable of expressing a polypeptide or derivative of the invention. A composition can also contain a vaccine vector capable of expressing an additional Helicobacter antigen, such as urease apoenzyme, a subunit, fragment, homolog, mutant, or derivative thereof, or a cytokine such as IL-2 or IL-12.
In vaccination methods for treating or preventing infection in a mammal, a vaccine vector of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, oral, gastric, pulmonary, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route, or a combination thereof.
Preferred routes depend upon the choice of the vaccine vector. The administration can be achieved in a single dose or repeated at intervals. The appropriate dosage depends on various parameters that are understood by those skilled in the art, such as the nature of the vaccine vector itself, the route of administration, and the condition of the mammal to be vaccinated (e.g., the weight, age, and general health of the mammal).
Live vaccine vectors that can be used in the invention include viral vectors, such as adenoviruses and poxviruses, as well as bacterial vectors, e.g., Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin (BCG), and Streptococcus. An example of an adenovirus vector, as well as a method for constructing an adenovirus vector capable of expressing a polynucleotide molecule of the invention, is described in U.S. Patent No. 4,920,209. Poxvirus vectors that can be used in the invention include, e.g., vaccinia and canary pox viruses, which are described in U.S. Patent No. 4,722,848 and U.S. Patent No. 5,364,773, respectively (also see, e.g., Tartaglia et al., Virology 188:217, 1992, for a description of a vaccinia virus vector, and Taylor et al., Vaccine 13:539, 1995, for a description of a canary poxvirus vector). Poxvirus vectors capable of expressing a polynucleotide of the invention can be obtained by homologous recombination, as described in Kieny et al. (Nature 312:163, 1984), so that the polynucleotide of the invention is inserted into the viral genome under appropriate conditions for expression in mammalian cells. Generally, the dose of viral vector vaccine, for therapeutic or prophylactic use, can be from about 1x10^4 to about 1x10^11, advantageously from about 1x10^7 to about 1x10^10, or, preferably, from about 1x10^7 to about 1x10^9 plaque-forming units per kilogram. Preferably, viral vectors are administered parenterally, for example, in 3 doses that are 4 weeks apart.
Those skilled in the art will recognize that it is preferable to avoid adding a chemical adjuvant to a composition containing a viral vector of the invention, thereby minimizing the immune response to the viral vector itself.
Non-toxigenic Vibrio cholerae mutant strains that can be used in live oral vaccines are described by Mekalanos et al. (Nature 306:551, 1983) and in U.S. Patent No. 4,882,278 (strain in which a substantial amount of the coding sequence of each of the two ctxA alleles has been deleted so that no functional cholera toxin is produced); WO 92/11354 (strain in which the irgA locus is inactivated by mutation; this mutation can be combined in a single strain with ctxA mutations); and WO 94/1533 (deletion mutant lacking functional ctxA and attRSI DNA sequences). These strains can be genetically engineered to express heterologous antigens, as described in WO 94/19482.
An effective vaccine dose of a V. cholerae strain capable of expressing a polypeptide or polypeptide derivative encoded by a polynucleotide molecule of the invention can contain, e.g., about 1x10^[...] to about 1x10^[...], preferably about 1x10^6 to about 1x10^8, viable bacteria in an appropriate volume for the selected route of administration. Preferred routes of administration include all mucosal routes, but, most preferably, these vectors are administered intranasally or orally.
Attenuated Salmonella typhimurium strains, genetically engineered for recombinant expression of heterologous antigens, and their use as oral vaccines, are described by Nakayama et al. (Bio/Technology 6:693, 1988) and in WO 92/11361. Preferred routes of administration for these vectors include all mucosal routes. Most preferably, the vectors are administered intranasally or orally.
Other bacterial strains useful as vaccine vectors are described by High et al. (EMBO 11:1991, 1992) and Sizemore et al. (Science 270:299, 1995; Shigella flexneri); Medaglini et al. (Proc. Natl. Acad. Sci. USA 92:6868, 1995; Streptococcus gordonii); Flynn (Cell. Mol. Biol. 40 (suppl. I):31, 1994), and in WO 88/6626, WO 90/0594, WO 91/13157, WO 92/1796, and WO 92/21376 (Bacille Calmette Guerin). In bacterial vectors, a polynucleotide of the invention can be inserted into the bacterial genome or it can remain in a free state, for example, carried on a plasmid.
An adjuvant can also be added to a composition containing a bacterial vector vaccine. A number of adjuvants that can be used are known to those skilled in the art. For example, preferred adjuvants can be selected from the list provided below.
According to a fourth aspect of the invention, there is also provided (i) a composition of matter containing a polynucleotide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polynucleotide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polynucleotide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polynucleotide of the invention to an individual in need of such treatment. Additionally, the fourth aspect of the invention encompasses the use of a polynucleotide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection. The fourth aspect of the invention preferably includes the use of a polynucleotide molecule placed under conditions for expression in a mammalian cell, in a plasmid that is unable to replicate in mammalian cells and to substantially integrate into a mammalian genome.
Polynucleotides (for example, DNA or RNA molecules) of the invention can also be administered as such to a mammal as a vaccine. When a DNA molecule of the invention is used, it can be in the form of a plasmid that is unable to replicate in a mammalian cell and unable to integrate into the mammalian genome. Typically, a DNA molecule is placed under the control of a promoter suitable for expression in a mammalian cell. The promoter can function ubiquitously or tissue-specifically. Examples of non-tissue specific promoters include the early Cytomegalovirus (CMV) promoter (U.S. Patent No. 4,168,062) and the Rous Sarcoma Virus promoter (Norton et al., Molec. Cell Biol. 5:281, 1985). The desmin promoter (Li et al., Gene 78:243, 1989; Li et al., J. Biol. Chem. 266:6562, 1991; Li et al., J. Biol. Chem. 268:10403, 1993) is tissue-specific and drives expression in muscle cells. More generally, useful promoters and vectors are described, e.g., in WO 94/21797 and by Hartikka et al. (Human Gene Therapy 7:1205, 1996).
For DNA/RNA vaccination, the polynucleotide of the invention can encode a precursor or a mature form of a polypeptide of the invention. When it encodes a precursor form, the precursor sequence can be homologous or heterologous. In the latter case, a eucaryotic leader sequence can be used, such as the leader sequence of the tissue-type plasminogen activator (tPA).
A composition of the invention can contain one or several polynucleotides of the invention. It can also contain at least one additional polynucleotide encoding another Helicobacter antigen, such as urease subunit A, B, or both, or a fragment, derivative, mutant, or analog thereof. A polynucleotide encoding a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), can also be added to the composition so that the immune response is enhanced. These additional polynucleotides are placed under appropriate control for expression. Advantageously, DNA molecules of the invention and/or additional DNA molecules to be included in the same composition are carried in the same plasmid.
Standard methods can be used in the preparation of therapeutic polynucleotides of the invention. For example, a polynucleotide can be used in a naked form, free of any delivery vehicles, such as anionic liposomes, cationic lipids, microparticles, gold microparticles, precipitating agents, e.g., calcium phosphate, or any other transfection-facilitating agent. In this case, the polynucleotide can be simply diluted in a physiologically acceptable solution, such as sterile saline or sterile buffered saline, with or without a carrier. When present, the carrier preferably is isotonic, hypotonic, or weakly hypertonic, and has a relatively low ionic strength, such as provided by a sucrose solution, e.g., a solution containing 20% sucrose.
Alternatively, a polynucleotide can be associated with agents that assist in cellular uptake. It can be, e.g., (i) complemented with a chemical agent that modifies cellular permeability, such as bupivacaine (see, e.g., WO 94/16737), (ii) encapsulated into liposomes, or (iii) associated with cationic lipids or silica, gold, or tungsten microparticles.
Anionic and neutral liposomes are well-known in the art (see, e.g., Liposomes: A Practical Approach, RPC New Ed, IRL Press, 1990, for a detailed description of methods for making liposomes) and are useful for delivering a large range of products, including polynucleotides.
Cationic lipids can also be used for gene delivery. Such lipids include, for example, Lipofectin™, which is also known as DOTMA (N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride), DOTAP (1,2-bis(oleyloxy)-3-(trimethylammonio)propane), DDAB (dimethyldioctadecylammonium bromide), DOGS (dioctadecylamidologlycyl spermine), and cholesterol derivatives. A description of these cationic lipids can be found in EP 187,702, WO 90/11092, U.S. Patent No. 5,283,185, WO 91/15501, WO 95/26356, and U.S. Patent No. 5,527,928. Cationic lipids for gene delivery are preferably used in association with a neutral lipid, such as DOPE (dioleyl phosphatidylethanolamine; WO 90/11092). Other transfection-facilitating compounds can be added to a formulation containing cationic liposomes. A number of them are described in, e.g., WO 93/18759, WO 93/19768, WO 94/25608, and WO 95/2397. They include, e.g., spermine derivatives useful for facilitating the transport of DNA through the nuclear membrane (see, for example, WO 93/18759) and membrane-permeabilizing compounds such as GALA, Gramicidin S, and cationic bile salts (see, for example, WO 93/19768).
Gold or tungsten microparticles can also be used for gene delivery, as described in WO 91/359, WO 93/17706, and by Tang et al. (Nature 356:152, 1992). In this case, the microparticle-coated polynucleotides can be injected via intradermal or intraepidermal routes using a needleless injection device ("gene gun"), such as those described in U.S. Patent No. 4,945,050, U.S. Patent No. 5,015,580, and WO 94/24263.
The amount of DNA to be used in a vaccine depends, e.g., on the strength of the promoter used in the DNA construct, the immunogenicity of the expressed gene product, the condition of the mammal intended for administration (e.g., the weight, age, and general health of the mammal), the mode of administration, and the type of formulation. In general, a therapeutically or prophylactically effective dose of from about 1 µg to about 1 mg, preferably, from about 10 µg to about 800 µg, and, more preferably, from about 25 µg to about 250 µg, can be administered to a human adult. The administration can be achieved in a single dose or repeated at intervals.
The route of administration can be any conventional route used in the vaccine field. As general guidance, a polynucleotide of the invention can be administered via a mucosal surface, e.g., an ocular, intranasal, pulmonary, oral, intestinal, rectal, vaginal, or urinary tract surface, or via a parenteral route, e.g., by an intravenous, subcutaneous, intraperitoneal, intradermal, intraepidermal, or intramuscular route. The choice of administration route will depend on, e.g., the formulation that is selected. A polynucleotide formulated in association with bupivacaine is advantageously administered into muscle. When a neutral or anionic liposome or a cationic lipid, such as DOTMA, is used, the formulation can be advantageously administered via intravenous, intranasal (for example, by aerosolization), intramuscular, intradermal, and subcutaneous routes. A polynucleotide in a naked form can advantageously be administered via the intramuscular, intradermal, or subcutaneous routes. Although not absolutely required, such a composition can also contain an adjuvant. A systemic adjuvant that does not require concomitant administration in order to exhibit an adjuvant effect is preferable.
The sequence information provided in the present application enables the design of specific nucleotide probes and primers that can be used in diagnostic methods. Accordingly, in a fifth aspect of the invention, there is provided a nucleotide probe or primer having a sequence found in, or derived by degeneracy of the genetic code from, a sequence shown in any of SEQ ID NOs: 1-169 (odd numbers).
The term "probe" as used in the present application refers to a DNA (preferably single stranded) or RNA molecule (or modifications or combinations thereof) that hybridizes under the stringent conditions, as defined above, to a polynucleotide molecule having a sequence homologous to any of those shown in SEQ ID NOs: 1-169 (odd numbers), or to a complementary or anti-sense sequence of any of those shown in SEQ ID NOs: 1-169 (odd numbers). Generally, probes are significantly shorter than the full-length sequences shown in SEQ ID NOs: 1-169 (odd numbers). For example, they can contain from about 5 to about 100, preferably from about 10 to about [...] nucleotides. In particular, probes have sequences that are at least [...], preferably at least 85%, more preferably 95% homologous to a portion of a sequence as shown in any of SEQ ID NOs: 1-169 (odd numbers) or a sequence complementary to any of such sequences.
Probes can contain modified bases, such as inosine, deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, or diamino-2,6-purine. Sugar or phosphate residues can also be modified or substituted. For example, a deoxyribose residue can be replaced by a polyamide (Nielsen et al., Science 254:1497, 1991) and phosphate residues can be replaced by ester groups, such as diphosphate, alkyl, arylphosphonate, and phosphorothioate esters. In addition, the 2'-hydroxyl group on ribonucleotides can be modified by addition of, e.g., alkyl groups.
Probes of the invention can be used in diagnostic tests or as capture or detection probes. Such capture probes can be immobilized on solid supports, directly or indirectly, by covalent means or by passive adsorption. A detection probe can be labeled by a detectable label, for example, a label selected from radioactive isotopes; enzymes, such as peroxidase and alkaline phosphatase; enzymes that are able to hydrolyze a chromogenic, fluorogenic, or luminescent substrate; compounds that are chromogenic, fluorogenic, or luminescent; nucleotide base analogs; and biotin.
Probes of the invention can be used in any conventional hybridization method, such as in dot blot methods (Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1982), Southern blot methods (Southern, J. Mol. Biol. 98:503, 1975), northern blot methods (identical to Southern blot with the exception that RNA is used as a target), or a sandwich method (Dunn et al., Cell 12:23, 1977). As is known in the art, the latter technique involves the use of a specific capture probe and a specific detection probe that have nucleotide sequences that are at least partially different from each other.
Primers used in the invention usually contain about 10 to [...] nucleotides and are used to initiate enzymatic polymerization of DNA in an amplification process (e.g., PCR), an elongation process, or a reverse transcription method. In a diagnostic method involving PCR, the primers can be labeled.
Thus, the invention also encompasses (i) a reagent containing a probe of the invention for detecting and/or identifying the presence of Helicobacter in a biological material; (ii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which a sample is recovered or derived from the biological material, DNA or RNA is extracted from the material and denatured, and the sample is exposed to a probe of the invention, for example, a capture probe, a detection probe, or both, under stringent hybridization conditions, so that hybridization is detected; and (iii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which a sample is recovered or derived from the biological material, DNA is extracted therefrom, the extracted DNA is contacted with at least one, or, preferably two, primers of the invention, and amplified by the polymerase chain reaction, and an amplified DNA molecule is produced.
As mentioned above, polypeptides that can be produced by expression of the polynucleotides of the invention can be used as vaccine antigens. Accordingly, a sixth aspect of the invention features a substantially purified polypeptide or polypeptide derivative having an amino acid sequence encoded by a polynucleotide of the invention.
A "substantially purified polypeptide" is defined as a polypeptide that is separated from the environment in which it naturally occurs and/or a polypeptide that is free of most of the other polypeptides that are present in the environment in which it was synthesized. The polypeptides of the invention can be purified from a natural source, such as a Helicobacter strain, or can be produced using recombinant methods.
Homologous polypeptides or polypeptide derivatives encoded by polynucleotides of the invention can be screened for specific antigenicity by testing cross-reactivity with an antiserum raised against a polypeptide having an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). Briefly, a monospecific hyperimmune antiserum can be raised against a purified reference polypeptide as such or as a fusion polypeptide, for example, an expression product of MBP, GST, or His-tag systems, or a synthetic peptide predicted to be antigenic. The homologous polypeptide or derivative that is screened for specific antigenicity can be produced as such or as a fusion polypeptide. In the latter case, and if the antiserum is also raised against a fusion polypeptide, two different fusion systems are employed.
Specific antigenicity can be determined using a number of methods, including Western blot (Towbin et al., Proc. Natl. Acad. Sci. USA 76:4350, 1979), dot blot, and ELISA methods, as described below.
In a Western blot assay, the product to be screened, either as a purified preparation or a total E. coli extract, is fractionated by SDS-PAGE, as described, for example, by Laemmli (Nature 227:680, 1970). After being transferred to a filter, such as a nitrocellulose membrane, the material is incubated with the monospecific hyperimmune antiserum, which is diluted in a range of dilutions from about 1:50 to about 1:5,000, preferably from about 1:100 to about 1:500. Specific antigenicity is shown once a band corresponding to the product exhibits reactivity at any of the dilutions in the range.
In an ELISA assay, the product to be screened can be used as the coating antigen. A purified preparation is preferred, but a whole cell extract can also be used. Briefly, about 100 µl of a preparation of about 10 µg protein/ml is distributed into wells of a 96-well ELISA plate. The plate is incubated for about 2 hours at 37°C, then overnight at 4°C. The plate is washed with phosphate-buffered saline (PBS) containing 0.05% Tween (PBS/Tween buffer) and the wells are saturated with 250 µl PBS containing 1% bovine serum albumin (BSA), to prevent non-specific antibody binding.
After 1 hour of incubation at 37°C, the plate is washed with PBS/Tween buffer.
The antiserum is serially diluted in PBS/Tween buffer containing 0.5% BSA, and 100 µl of each dilution is added to a well. The plate is incubated for [...] minutes at 37°C, washed, and evaluated using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when the specific antibodies used were raised in rabbits. Incubation is carried out for about 90 minutes at 37°C and the plate is washed. The reaction is developed with the appropriate substrate and the reaction is measured by colorimetry (absorbance measured spectrophotometrically). Under these experimental conditions, a positive reaction is shown once an O.D. value of [...] is detected with a dilution of at least about 1:50, preferably of at least about 1:500.
In a dot blot assay, a purified product is preferred, although a whole cell extract can be used. Briefly, a solution of the product at a concentration of about 100 µg/ml is serially diluted two-fold with 50 mM Tris-HCl (pH ). One hundred µl of each dilution is applied to a filter, such as a 0.45 µm nitrocellulose membrane, set in a 96-well dot blot apparatus (Biorad). The buffer is removed by applying vacuum to the system. Wells are washed by addition of 50 mM Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane is saturated in blocking buffer (50 mM Tris-HCl (pH ), 0.15 M NaCl, 10 g/l skim milk) and incubated with an antiserum diluted from about 1:50 to about 1:5000, preferably about 1:500. The reaction is detected using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when rabbit antibodies are used. Incubation is carried out for about 90 minutes at 37°C and the blot is washed. The reaction is developed with the appropriate substrate and stopped. The reaction is then measured visually by the appearance of a colored spot, e.g., by colorimetry. Under these experimental conditions, a positive reaction is associated with detection of a colored spot for reactions carried out with a dilution of at least about 1:50, preferably of at least about 1:500. Therapeutic or prophylactic efficacy of a polypeptide or polypeptide derivative of the invention can be evaluated as described below.
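The two-fold dilution series used in the dot blot assay can be made concrete with a short calculation. The following Python sketch is illustrative only: the function name and the choice of eight dilution steps are assumptions, while the starting concentration (about 100 µg/ml) and the applied volume (100 µl per spot) are taken from the description above.

```python
def twofold_series(start_ug_per_ml, n_dilutions, spot_volume_ul=100.0):
    """Return (concentration in ug/ml, protein per spot in ug) for each dilution step.

    n_dilutions is a hypothetical choice; the text does not state how many
    two-fold steps are applied to the membrane.
    """
    rows = []
    conc = start_ug_per_ml
    for _ in range(n_dilutions):
        # ug/ml multiplied by the applied volume in ml gives ug per spot
        mass_ug = conc * spot_volume_ul / 1000.0
        rows.append((conc, mass_ug))
        conc /= 2.0  # two-fold dilution for the next step
    return rows


series = twofold_series(100.0, 8)
for conc, mass in series:
    print(f"{conc:8.3f} ug/ml -> {mass:6.3f} ug per spot")
```

Under these assumptions the first spot carries 10 µg of protein, and each later spot carries half the amount of the one before it.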
According to a seventh aspect of the invention, there is provided (i) a composition of matter containing a polypeptide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polypeptide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polypeptide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polypeptide of the invention to an individual in need of such treatment.
Additionally, this aspect of the invention includes the use of a polypeptide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
The immunogenic compositions of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, pulmonary, oral, gastric, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route. The choice of the administration route depends upon a number of parameters, such as the adjuvant used. For example, if a mucosal adjuvant is used, the intranasal or oral route will be preferred, and if a lipid formulation or an aluminum compound is used, a parenteral route will be preferred. In the latter case, the subcutaneous or intramuscular route is most preferred. The choice of administration route can also depend upon the nature of the vaccine agent. For example, a polypeptide of the invention fused to CTB or to LTB will be best administered to a mucosal surface.
A composition of the invention can contain one or several polypeptides or derivatives of the invention. It can also contain at least one additional Helicobacter antigen, such as the urease apoenzyme, or a subunit, fragment, homolog, mutant, or derivative thereof.
For use in a composition of the invention, a polypeptide or polypeptide derivative can be formulated into or with liposomes, such as neutral or anionic liposomes, microspheres, ISCOMS, or virus-like particles (VLPs), to facilitate delivery and/or enhance the immune response. These compounds are readily available to those skilled in the art; for example, see Liposomes: A Practical Approach (supra). Adjuvants other than liposomes can also be used in the invention and are well known in the art (see, for example, the list provided below).
Administration can be achieved in a single dose or repeated as necessary at appropriate intervals that can be determined by those skilled in the art. For example, a priming dose can be followed by three booster doses at weekly or monthly intervals. An appropriate dose depends on various parameters, including the nature of the recipient (e.g., whether the recipient is an adult or an infant), the particular vaccine antigen, the route and frequency of administration, the presence/absence or type of adjuvant, and the desired effect (e.g., protection and/or treatment), and can be readily determined by one skilled in the art. In general, a vaccine antigen of the invention can be administered mucosally in an amount ranging from about 10 µg to about 500 mg, preferably from about 1 mg to about 200 mg. For a parenteral route of administration, the dose usually should not exceed about 1 mg, and is, preferably, about 100 µg.
When used as components of a vaccine, the polynucleotides and polypeptides of the invention can be used sequentially as part of a multi-step immunization process. For example, a mammal can be initially primed with a vaccine vector of the invention, such as a pox virus, e.g., via a parenteral route, and then boosted twice with a polypeptide encoded by the vaccine vector, e.g., via the mucosal route. In another example, liposomes associated with a polypeptide or polypeptide derivative of the invention can be used for priming, with boosting being carried out mucosally using a soluble polypeptide or polypeptide derivative of the invention, in combination with a mucosal adjuvant (e.g., LT).
Polypeptides and polypeptide derivatives of the invention can also be used as diagnostic reagents for detecting the presence of anti-Helicobacter antibodies, e.g., in blood samples. Such polypeptides can be about 5 to about preferably, about 10 to about 50, amino acids in length and can be labeled or unlabeled, depending upon the diagnostic method. Diagnostic methods involving such a reagent are described below.
Upon expression of a polynucleotide molecule of the invention, a polypeptide or polypeptide derivative is produced and can be purified using known methods. For example, the polypeptide or polypeptide derivative can be produced as a fusion protein containing a fused tail that facilitates purification.
The fusion product can be used to immunize a small mammal, e.g., a mouse or a rabbit, in order to raise monospecific antibodies against the polypeptide or polypeptide derivative. The eighth aspect of the invention thus provides a monospecific antibody that binds to a polypeptide or polypeptide derivative of the invention.
By "monospecific antibody" is meant an antibody that is capable of reacting with a unique, naturally-occurring Helicobacter polypeptide. An antibody of the invention can be polyclonal or monoclonal. Monospecific antibodies can be recombinant, e.g., chimeric (e.g., consisting of a variable region of murine origin and a human constant region), humanized (e.g., a human immunoglobulin constant region and a variable region of animal, e.g., murine, origin), and/or single chain. Both polyclonal and monospecific antibodies can also be in the form of immunoglobulin fragments, e.g., F(ab')2 or Fab fragments. The antibodies of the invention can be of any isotype, e.g., IgG or IgA, and polyclonal antibodies can be of a single isotype or can contain a mixture of isotypes.
The antibodies of the invention, which can be raised against a polypeptide or polypeptide derivative of the invention, can be produced and identified using standard immunological assays, e.g., Western blot assays, dot blot assays, or ELISA (see, e.g., Coligan et al., Current Protocols in Immunology, John Wiley & Sons, Inc., New York, NY, 1994). The antibodies can be used in diagnostic methods to detect the presence of Helicobacter antigens in a sample, such as a biological sample. The antibodies can also be used in affinity chromatography methods for purifying a polypeptide or polypeptide derivative of the invention. As is discussed further below, the antibodies can also be used in prophylactic and therapeutic passive immunization methods.
Accordingly, a ninth aspect of the invention provides (i) a reagent for detecting the presence of Helicobacter in a biological sample, the reagent containing an antibody, polypeptide, or polypeptide derivative of the invention; and (ii) a diagnostic method for detecting the presence of Helicobacter in a biological sample, by contacting the biological sample with an antibody, a polypeptide, or a polypeptide derivative of the invention, so that an immune complex is formed, and detecting the complex as an indication of the presence of Helicobacter in the sample or the organism from which the sample was derived. The immune complex is formed between a component of the sample and the antibody, polypeptide, or polypeptide derivative, and any unbound material can be removed prior to detecting the complex. A polypeptide reagent can be used for detecting the presence of anti-Helicobacter antibodies in a sample, e.g., a blood sample, while an antibody of the invention can be used for screening a sample, such as a gastric extract or biopsy sample, for the presence of Helicobacter polypeptides.
For use in diagnostic methods, the reagent (i.e., the antibody, polypeptide, or polypeptide derivative of the invention) can be in a free state or can be immobilized on a solid support, such as, for example, on the interior surface of a tube or on the surface, or within pores, of a bead. Immobilization can be achieved using direct or indirect means. Direct means include passive adsorption (non-covalent binding) or covalent binding between the support and the reagent. By "indirect means" is meant that an anti-reagent compound that interacts with the reagent is first attached to the solid support. For example, if a polypeptide reagent is used, an antibody that binds to it can serve as an anti-reagent, provided that it binds to an epitope that is not involved in recognition of antibodies in biological samples. Indirect means can also employ a ligand-receptor system; for example, a molecule, such as a vitamin, can be grafted onto the polypeptide reagent and the corresponding receptor can be immobilized on the solid phase. This concept is illustrated by the well known biotin-streptavidin system. Alternatively, indirect means can be used, e.g., by adding to the reagent a peptide tail, chemically or by genetic engineering, and immobilizing the grafted or fused product by passive adsorption or covalent linkage of the peptide tail.
According to a tenth aspect of the invention, there is provided a process for purifying from a biological sample a polypeptide or polypeptide derivative of the invention, which involves carrying out antibody-based affinity chromatography with the biological sample, wherein the antibody is a monospecific antibody of the invention.
For use in a purification process of the invention, the antibody can be polyclonal or monospecific, and preferably is of the IgG type. Purified IgGs can be prepared from an antiserum using standard methods (see, e.g., Coligan et al., supra). Conventional chromatography supports, as well as standard methods for grafting antibodies, are described, for example, by Harlow et al. (Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1988).
Briefly, a biological sample, such as an H. pylori extract, preferably in a buffer solution, is applied to a chromatography material, which is, preferably, equilibrated with the buffer used to dilute the biological sample, so that the polypeptide or polypeptide derivative of the invention (i.e., the antigen) is allowed to adsorb onto the material. The chromatography material, such as a gel or a resin coupled to an antibody of the invention, can be in batch form or in a column. The unbound components are washed off and the antigen is eluted with an appropriate elution buffer, such as a glycine buffer, a buffer containing a chaotropic agent, e.g., guanidine HCl, or a buffer having high salt concentration (e.g., 3 M MgCl2). Eluted fractions are recovered and the presence of the antigen is detected, e.g., by measuring the absorbance at 280 nm.
An antibody of the invention can be screened for therapeutic efficacy as follows. According to an eleventh aspect of the invention, there is provided (i) a composition of matter containing a monospecific antibody of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a monospecific antibody of the invention; and (iii) a method for treating or preventing Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a therapeutic or prophylactic amount of a monospecific antibody of the invention to an individual in need of such treatment. In addition, the eleventh aspect of the invention includes the use of a monospecific antibody of the invention in the preparation of a medicament for treating or preventing Helicobacter infection.
The monospecific antibody can be polyclonal or monoclonal, and is, preferably, predominantly of the IgA isotype. In passive immunization methods, the antibody is administered to a mucosal surface of a mammal, e.g., the gastric mucosa, orally or intragastrically, optionally, in the presence of a bicarbonate buffer. Alternatively, systemic administration, not requiring a bicarbonate buffer, can be carried out. A monospecific antibody of the invention can be administered as a single active agent or as a mixture with at least one additional monospecific antibody specific for a different Helicobacter polypeptide. The amount of antibody and the particular regimen used can be readily determined by one skilled in the art. For example, daily administration of about 100 to 1,000 mg of antibody over one week, or three doses per day of about 100 to 1,000 mg of antibody over two or three days, can be effective regimens for most purposes.
Therapeutic or prophylactic efficacy can be evaluated using standard methods in the art, e.g., by measuring induction of a mucosal immune response or induction of protective and/or therapeutic immunity using, e.g., the H. felis mouse model and the procedures described by Lee et al. (Eur. J. Gastroenterology & Hepatology 7:303, 1995) or Lee et al. (J. Infect. Dis. 172:161, 1995). Those skilled in the art will recognize that the H. felis strain of the model can be replaced with another Helicobacter strain. For example, the efficacy of polynucleotide molecules and polypeptides from H. pylori is, preferably, evaluated in a mouse model using an H. pylori strain. Protection can be determined by comparing the degree of Helicobacter infection in the gastric tissue (assessed by, for example, urease activity, bacterial counts, or gastritis) to that of a control group. Protection is shown when infection is reduced by comparison to the control group. Such an evaluation can be made for polynucleotides, vaccine vectors, polypeptides, and polypeptide derivatives, as well as for antibodies of the invention.
For example, various doses of an antibody of the invention can be administered to the gastric mucosa of mice previously challenged with an H. pylori strain as described, e.g., by Lee et al. (supra). Then, after an appropriate period of time, the bacterial load of the mucosa can be estimated by assessing urease activity, as compared to a control. Reduced urease activity indicates that the antibody is therapeutically effective.
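The comparison of urease activity against a control group can be expressed as a simple percent-reduction calculation. The Python sketch below is only an illustration: the function name and the absorbance-like readings are hypothetical, and the text does not prescribe any particular statistic or significance test.

```python
from statistics import mean


def percent_reduction(treated, control):
    """Percent reduction of the mean treated reading relative to the control mean."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * (control_mean - mean(treated)) / control_mean


# Hypothetical urease-activity readings (arbitrary units), not data from the source.
control_group = [0.80, 0.90, 1.00]
treated_group = [0.30, 0.45, 0.45]

reduction = percent_reduction(treated_group, control_group)
# Per the text, reduced activity relative to the control indicates an effect.
shows_reduction = reduction > 0
print(f"reduction vs. control: {reduction:.1f}% (reduced: {shows_reduction})")
```

In practice a group comparison would also need an appropriate statistical test, which is outside the scope of this sketch.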
Adjuvants that can be used in any of the vaccine compositions described above are described as follows. Adjuvants for parenteral administration include, for example, aluminum compounds, such as aluminum hydroxide, aluminum phosphate, and aluminum hydroxy phosphate. The antigen can be precipitated with, or adsorbed onto, the aluminum compound using standard methods. Other adjuvants, such as RIBI (ImmunoChem, Hamilton, MT), can also be used in parenteral administration.
Adjuvants that can be used for mucosal administration include, for example, bacterial toxins, e.g., the cholera toxin (CT), the E. coli heat-labile toxin (LT), the Clostridium difficile toxin A, and the pertussis toxin (PT), and combinations, subunits, toxoids, or mutants thereof. For example, a purified preparation of native cholera toxin subunit B (CTB) can be used. Fragments, homologs, derivatives, and fusions to any of these toxins can also be used, provided that they retain adjuvant activity. Preferably, a mutant having reduced toxicity is used. Suitable mutants are described, e.g., in WO 95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT mutant), and WO 95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant). Additional LT mutants that can be used in the methods and compositions of the invention include, e.g., Ser-63-Lys, Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants, such as the bacterial monophosphoryl lipid A (MPLA) of, e.g., E. coli, Salmonella minnesota, Salmonella typhimurium, or Shigella flexneri; saponins; and polylactide glycolide (PLGA) microspheres, can also be used in mucosal administration. Adjuvants useful for both mucosal and parenteral administration, such as polyphosphazene (WO 95/2415), can also be used.
Any pharmaceutical composition of the invention, containing a polynucleotide, polypeptide, polypeptide derivative, or antibody of the invention, can be manufactured using standard methods. It can be formulated with a pharmaceutically acceptable diluent or carrier, e.g., water or a saline solution, such as phosphate buffered saline, optionally including a bicarbonate salt, such as sodium bicarbonate, e.g., 0.1 to 0.5 M. Bicarbonate can advantageously be added to compositions intended for oral or intragastric administration. In general, a diluent or carrier can be selected on the basis of the mode and route of administration, and standard pharmaceutical practice. Suitable pharmaceutical carriers and diluents, as well as pharmaceutical necessities for their use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field, and in the USP/NF.
The invention also includes methods in which gastroduodenal infections, such as Helicobacter infection, are treated by oral administration of a Helicobacter polypeptide of the invention and a mucosal adjuvant, in combination with an antibiotic, an antisecretory agent, a bismuth salt, an antacid, sucralfate, or a combination thereof. Examples of such compounds that can be administered with the vaccine antigen and an adjuvant are antibiotics, including, e.g., macrolides, tetracyclines, β-lactams, aminoglycosides, quinolones, penicillins, and derivatives thereof (specific examples of antibiotics that can be used in the invention include, e.g., amoxicillin, clarithromycin, tetracycline, metronidazole, erythromycin, and cefuroxime); antisecretory agents, including, e.g., H2-receptor antagonists (e.g., cimetidine, ranitidine, famotidine, nizatidine, and roxatidine), proton pump inhibitors (e.g., omeprazole, lansoprazole, and pantoprazole), prostaglandin analogs (e.g., misoprostol and enprostil), and anticholinergic agents (e.g., pirenzepine, telenzepine, carbenoxolone, and proglumide); and bismuth salts, including colloidal bismuth subcitrate, tripotassium dicitrate bismuthate, bismuth subsalicylate, bicitropeptide, and Pepto-Bismol (see, e.g., Goodwin et al., Helicobacter pylori, Biology and Clinical Practice, CRC Press, Boca Raton, FL, pp 366-395, 1993; Physicians' Desk Reference, 49th edn., Medical Economics Data Production Company, Montvale, New Jersey, 1995). In addition, compounds containing more than one of the above-listed components coupled together, e.g., ranitidine coupled to bismuth subcitrate, can be used. The invention also includes compositions for carrying out these methods, i.e., compositions containing a Helicobacter antigen (or antigens) of the invention, an adjuvant, and one or more of the above-listed compounds, in a pharmaceutically acceptable carrier or diluent.
Amounts of the above-listed compounds used in the methods and compositions of the invention can readily be determined by one skilled in the art. In addition, one skilled in the art can readily design treatment/immunization schedules. For example, the non-vaccine components can be administered on days 1-14, and the vaccine antigen plus adjuvant can be administered on days 7, 14, 21, and 28.
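The example schedule can be laid out day by day. The following Python sketch simply tabulates the days named in the text; the variable names and the labels are illustrative assumptions, and no dosing information is implied.

```python
# Days taken from the example schedule in the text.
non_vaccine_days = set(range(1, 15))   # days 1-14
vaccine_days = {7, 14, 21, 28}         # vaccine antigen plus adjuvant

schedule = {}
for day in range(1, 29):
    items = []
    if day in non_vaccine_days:
        items.append("non-vaccine components")
    if day in vaccine_days:
        items.append("vaccine antigen + adjuvant")
    if items:  # skip days on which nothing is administered
        schedule[day] = items

for day, items in schedule.items():
    print(f"day {day:2d}: {', '.join(items)}")
```

Laid out this way, days 7 and 14 are the only days on which both kinds of component are given together.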
Methods and pharmaceutical compositions of the invention can be used to treat or to prevent Helicobacter infections and, accordingly, gastroduodenal diseases associated with these infections, including acute, chronic, and atrophic gastritis, and peptic ulcer diseases, e.g., gastric and duodenal ulcers.
The clones of the invention were originally isolated by a transposon shuttle mutagenesis method. Briefly, in this method, a TnMax9 mini-blaM transposon was used for insertional mutagenesis of an H. pylori gene library established in E. coli. A total of 192 E. coli clones expressing active β-lactamase fusion proteins were obtained, indicating that the corresponding target plasmids carry H. pylori genes encoding extracytoplasmic proteins. Individual mutants were transferred onto the chromosome of H. pylori P1 or P12 by natural transformation, resulting in 135 distinct H. pylori mutants. This method is described in further detail, as follows.
The transposon TnMax9 (Kahrs et al., Gene 167:53, 1995) was used to generate mutations in an H. pylori library in E. coli. As illustrated in Fig. 1A, TnMax9 contains, in addition to a catGC-resistance gene close to the inverted repeat, an unexpressed open reading frame encoding β-lactamase without a promoter or signal sequence (mature β-lactamase, blaM; Kahrs et al., supra). For production of extracytoplasmic BlaM fusion proteins resulting in ampicillin-resistant (ampR) clones, expression of the cloned H. pylori genes in E. coli is obligatory. The minimal vector pMin2 (Kahrs et al., supra; see Fig. 1B), containing a weak constitutive promoter (Piga) upstream of the multiple cloning site, was used for construction of the H. pylori library to ensure expression of H. pylori genes in E. coli.
In construction of the library, H. pylori DNA was partially digested with Sau3A and HpaII, size fractionated by preparative agarose gel electrophoresis, and 3-6 kilobase fragments were ligated into the BglII and ClaI sites of pMin2. The library was introduced into E. coli strain E181(pTnMax9), which is a derivative of HB101 containing the TnMax9 transposon, by electroporation. This generated approximately 2,400 independent transformants. More than 95% of the plasmids contained an insert of between 3 and 6 kilobases, showing that the 1.7 megabase H. pylori chromosome was statistically covered. Since not every plasmid could be expected to contain a target gene carrying an export signal, the library was partitioned into a total of 198 pools (24 pools of 20 clones and 174 pools of 11 clones). Using a cotton swab, either eleven or twenty individual colonies were inoculated in 0.5 ml LB medium in eppendorf tubes, vortexed, and 100 µl of the suspension was spread on LB agar plates supplemented with tetracycline and chloramphenicol to select for maintenance of both plasmids. Insertion of TnMax9 into the target plasmids was induced with 100 mM isopropyl-β-D-thiogalactoside (IPTG) separately for each pool (Haas et al., Gene 130:23-21, 1993). Plasmids were transferred into E145 by triparental mating, in which 25 µl of the donor strain (E181), 25 µl of the mobilisator (HB101(pRK2013)), and 50 µl of the recipient strain (E145) were mixed from corresponding bacterial suspensions (O.D.550 = 10). The matings were performed for 2-3 hours at 37°C on nitrocellulose filters, which were placed on LB plates. Bacteria were suspended in 1 ml LB and aliquots were spread on LB plates containing chloramphenicol, tetracycline, and rifampicin. Each pool gave rise to chloramphenicol-resistant transconjugates in E145, demonstrating that both transposition and conjugation were successful. 
Generally, several thousand chloramphenicol-resistant transconjugates were obtained, but the number of ampR colonies varied in different pools, ranging from one to several hundred colonies. Two ampR colonies from each positive pool were isolated, plasmid DNA was extracted, and the DNA was characterized by further restriction analysis. Only those TnMax9 insertions of a single pool that mapped in obviously different plasmid clones, or in markedly different regions of the same clone, were used further.
From 158 of the 198 pools, ampicillin-resistant E145 transconjugates were obtained showing that in several pools, TnMax9 inserted into expressed genes, resulting in production of extracytoplasmic BlaM fusion proteins. Thus, a total of 192 ampR E145 clones could be isolated by conjugal transfer of plasmids from 198 pools.
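The pool partition and the statement that the chromosome was "statistically covered" can be checked with a little arithmetic. The Python sketch below is an illustration under stated assumptions: the source does not say how coverage was assessed, so the Clarke and Carbon estimate P = 1 - (1 - f)^N (f being the insert-to-genome size ratio, N the number of clones) is used here as a stand-in, and the function name is hypothetical.

```python
# Pool layout taken from the text: clones per pool -> number of pools.
pools = {20: 24, 11: 174}
total_pools = sum(pools.values())
total_clones = sum(size * count for size, count in pools.items())
print(f"{total_pools} pools holding {total_clones} clones")


def coverage_probability(insert_bp, n_clones, genome_bp):
    """Clarke-Carbon estimate of the chance that a given locus is in the library.

    This formula is an assumption of this sketch, not a method named in the source.
    """
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


genome_bp = 1_700_000   # 1.7 megabase chromosome, as stated in the text
n_clones = 2_400        # approximate number of independent transformants

p_small = coverage_probability(3_000, n_clones, genome_bp)  # 3 kb inserts
p_large = coverage_probability(6_000, n_clones, genome_bp)  # 6 kb inserts
print(f"P(locus present), 3 kb inserts: {p_small:.4f}")
print(f"P(locus present), 6 kb inserts: {p_large:.4f}")
```

The pool sizes add up to just under the roughly 2,400 transformants reported, and even at the lower insert size the estimated probability of any given locus being represented is high.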
To analyze the mutant library, it was determined whether defined gene sequences inactivated by TnMax9 were represented once or several times in the whole library. Five transposon-containing plasmids conferring an ampR phenotype to E145 (pMu7, pMu13, pMu75, pMu94, and pMu110) were randomly selected and DNA fragments flanking the TnMax9 insert were isolated and used as probes in Southern hybridization of 120 ampR clones. The hybridization probes isolated from clones pMu7, pMu75, and pMu94 were between 0.9 and 1.1 kilobases in size, and hybridized exclusively with the inserts of the homologous plasmids. In contrast, the TnMax9 flanking regions of clones pMu13 and pMu110 were 4.0 and 5.5 kilobases, respectively. They each hybridized with the homologous plasmids, and with one additional clone of the library. Such a result was expected, since the longer the hybridization probe, the higher the chance that it finds a homologous sequence in the library.
In order to verify the insertion of the transposon into distinct ORFs encoding putative exported proteins, the TnMax9-flanking DNA of five representative ampR mutant clones (pMu7, pMu12, pMu18, pMu20, and pMu26) was sequenced, taking advantage of the M13 forward and reverse primers on TnMax9 (Fig. 1A). This analysis revealed that the mini-transposon was inserted into different sequences in each plasmid, thereby interrupting ORFs encoding putative proteins. For two clones, the sequences located upstream of the blaM gene revealed a putative ribosome-binding site and a potential translational start codon (ATG). Other clones either revealed an ORF spanning the complete sequence (approximately 400 base pairs upstream and downstream of the TnMax9 insertion) or terminating shortly after the site of TnMax9 insertion. The partial protein sequences from different ORFs were used for database searches, but no significant homologies with known proteins were found.
In a further approach, it was determined whether a known gene, like vacA, encoding the extracellular vacuolating cytotoxin of H. pylori, could be identified using this method and how often such a mutation would be represented in the mutant library. Total cell lysates of the 135 mutants were tested in an immunoblot using the H. pylori cytotoxin-specific rabbit antiserum AK197 (Schmitt et al., Mol. Microbiol. 12:307-319, 1994). Two mutants were identified that no longer produced the cytotoxin antigen (mutants P1-26 and P1-47) and partial DNA sequencing of the insertion sites revealed that TnMax9 was inserted at distinct positions in the vacA gene, 56 and 53 codons downstream of the ATG start codon.
Thus, the characterization of the mutant collection confirmed that a representative gene library was constructed in E. coli, in which target genes encoding exported H. pylori proteins were efficiently tagged by TnMax9.
In order to establish a collection of mutants lacking distinct exported proteins, the mutations had to be transferred back into the H. pylori chromosome. By means of natural transformation, 86 plasmids could be transformed into the original strain P1. H. pylori strains P1 or P12, which were naturally competent for DNA transformation, were transformed with circular plasmid DNA (0.2-0.5 µg/transformation). Transformations to streptomycin resistance were performed with chromosomal DNA (1 µg/transformation), isolated from a streptomycin-resistant NCTC 11637 H. pylori mutant according to the procedure described in Haas et al. (Mol. Microbiol. 8:753-760).
Selection was performed on serum plates containing 4 µg/ml chloramphenicol or 500 µg/ml streptomycin. The transformation frequency for a given mutant was calculated as the number of chloramphenicol-, streptomycin-, or erythromycin-resistant colonies per cfu (average of three experiments). In those plasmids that did not transform strain P1 directly, the blaM gene was deleted by NotI digestion and the plasmid was religated. This procedure, which gave a twenty- to thirty-fold higher frequency of transformation than the same plasmid containing blaM, yielded 36 additional mutant P1 strains. The blaM-deletion plasmids that still did not transform strain P1 were used to transform the heterologous H. pylori strain P12, which has an approximately 10-fold higher transformation frequency than P1. This resulted in thirteen further mutants.
Thus, from the 192 ampR plasmids, a total of 135 H. pylori mutants (122 mutants in P1 and 13 mutants in P12) were finally obtained by selection for chloramphenicol resistance. The transformation frequency varied between different plasmids in the range of Ixl 10 x 10- 7. The remaining plasmids did not result in any transformants. The collection was frozen as individual mutants in stock cultures at -70°C. To verify the correct insertion of the mini-transposon into the H. pylori chromosome, ten representative mutants were tested by Southern hybridization of chromosomal DNA using catGC DNA and the vector pMin2 as probes. Consistent with our previous experience concerning TnMax9-based shuttle mutagenesis of H. pylori, the mini-transposon was, in all cases, inserted into the chromosome without integration of the vector DNA, which probably means that insertion occurred by a double cross-over, rather than by a single cross-over, event. As judged from the hybridization pattern obtained with the cat gene as a probe, it appears that TnMax9 is located in different regions of the chromosome, showing that distinct target genes have been interrupted in individual mutants.
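The mutant counts reported across the three transformation rounds can be reconciled with a short tally. The Python sketch below uses only the figures given in the text; the variable names are illustrative, and the count of plasmids without transformants assumes that each mutant derives from a different plasmid, which the text implies but does not state outright.

```python
amp_r_plasmids = 192      # ampR plasmids available for transformation

direct_p1 = 86            # plasmids that transformed strain P1 directly
blam_deleted_p1 = 36      # additional P1 mutants after blaM deletion
p12_mutants = 13          # further mutants obtained in strain P12

p1_mutants = direct_p1 + blam_deleted_p1
total_mutants = p1_mutants + p12_mutants

# Assumption: one mutant per plasmid, so the remainder gave no transformants.
plasmids_without_transformants = amp_r_plasmids - total_mutants

print(f"P1 mutants: {p1_mutants}")
print(f"total mutants: {total_mutants}")
print(f"plasmids without transformants: {plasmids_without_transformants}")
```

The tally reproduces the 122 P1 mutants and the 135 mutants overall that are stated in the text.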
The mutants were analyzed for motility, transformation competence, and adherence to KatoIII cells. Screening of the H. pylori mutant collection allowed identification of mutants impaired in motility, natural transformation competence, and adherence to gastric epithelial cell lines. Motility mutants could be grouped into distinct classes: (i) mutants lacking the major flagellin subunit FlaA and intact flagella; (ii) mutants with apparently normal flagella, but reduced motility; and (iii) mutants with obviously normal flagella, but completely abolished motility. Two independent mutations, which exhibited defects in natural competence for genetic transformation, mapped to different genetic loci. In addition, two independent mutants were isolated by their failure to bind to the human gastric carcinoma cell line KatoIII. Both mutants carried a transposon in the same gene, approximately 0.8 kilobases apart, and showed decreased autoagglutination when compared to the wild type strain.
Sequences of clones obtained using the above-described transposon shuttle mutagenesis method were used to identify intact genes, lacking inserted transposons, in the H. pylori genome, as is described below in the Examples. The invention is further illustrated by the following examples.
Example 1 describes identification of genes, such as genes that encode the polypeptides of the invention, in the Helicobacter genome, as well as identification of signal sequences and primer design for amplification of genes lacking signal sequences. Example 2 describes cloning of DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 into a vector that provides a histidine tag, and production and purification of the resulting his-tagged fusion proteins.
Example 3 describes methods for cloning DNA encoding the polypeptides of the invention so that they can be produced without his-tags, and Example 4 describes methods for purifying recombinantly produced polypeptides of the invention. Example 5 describes methods for obtaining the nucleic acids of the invention from the deposited clones. Example 6 describes purification of recombinant H. pylori antigen GHPO 1190.
EXAMPLE 1: Identification of genes in the H. pylori genome, identification of signal sequences, and primer design for amplification of genes lacking signal sequences
1.A. Creating H. pylori genomic databases
The H. pylori genome was provided as a text file containing a single contiguous string of nucleotides that had been determined to be 1.76 Megabases in length. The complete genome was split into 17 separate files using the program SPLIT (Creativity in Action), giving rise to 16 contigs, each containing 100,000 nucleotides, and a 17th contig containing the remaining 76,000 nucleotides. A header was added to each of the 17 files using the format: >hpg0.txt (representing the first contig), >hpg1.txt (representing the second contig), etc.
The resulting 17 files, named hpg0 through hpg16, were then copied together to form one file that represented the plus strand of the complete H. pylori genome.
The constructed database was given a designation. A negative strand database of the H. pylori genome was created similarly by first creating a reverse complement of the positive strand using the program SeqPup (D.G. Gilbert, Indiana University Biology Department) and then performing the same procedure as described above for the plus strand. This database was also given a designation. The regions predicted to encode open reading frames (ORFs) were defined for the complete H. pylori genome using the program GENEMARK™ (Borodovsky et al., Comp. Chem. 17:123, 1993). A database was created from a text file containing an annotated version of all ORFs predicted to be encoded by the H. pylori genome for both the plus and minus strands, and was likewise given a designation. Each ORF was assigned a number indicating its location on the genome and its position relative to other genes. No manipulation of the text file was required.
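By way of illustration only, the contig-splitting and reverse-complement steps of section 1.A can be sketched in Python as follows. This is not the SPLIT or SeqPup software; the function names are hypothetical, and the test sequence is a synthetic stand-in whose length (1,676,000 nucleotides) is simply the length implied by 16 contigs of 100,000 nucleotides plus one of 76,000 nucleotides.

```python
def split_into_contigs(genome: str, size: int = 100_000) -> list[str]:
    """Cut a sequence into consecutive pieces of at most `size` nucleotides."""
    return [genome[i:i + size] for i in range(0, len(genome), size)]


# Translation table for complementing A/C/G/T (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the minus strand read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_headers(contigs: list[str], prefix: str = "hpg") -> str:
    """Join contigs into one text, each preceded by a '>hpgN.txt' header line."""
    return "".join(f">{prefix}{i}.txt\n{c}\n" for i, c in enumerate(contigs))


# Synthetic stand-in for the genome (assumption: 1,676,000 nt, not real data).
plus_strand = "ACGGT" * 335_200
plus_contigs = split_into_contigs(plus_strand)
minus_contigs = split_into_contigs(reverse_complement(plus_strand))
plus_database = with_headers(plus_contigs)
minus_database = with_headers(minus_contigs)
```

With these assumptions, both strands yield 17 contigs, named hpg0 through hpg16, the last of which holds the remaining 76,000 nucleotides.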
1.B. Searching the H. pylori databases
The databases constructed as is described above were searched using the program FASTA (Pearson et al., Proc. Natl. Acad. Sci. USA 85:2444-2448, 1988). FASTA was used for searching either a DNA sequence against either of the gene databases, or a peptide sequence against the ORF library. TFASTX was used to search a peptide sequence against all possible reading frames of a DNA database.
Potential frameshifts also being resolved, FASTX was used for searching the translated reading frames of a DNA sequence against either a DNA database, or a peptide sequence against the protein database.
1.C. Isolation of DNA sequences from the H. pylori genome
The FASTA searches against the constructed DNA databases identified exact nucleotide coordinates on one or more of the isolated contigs, and therefore the location of the target DNA. Once the exact location of the target sequence was known, the contig identified to carry the gene was exported into the software package MapDraw (DNAStar, Inc.) and the gene was isolated. Gene sequences with flanking DNA were then excised and copied into the EditSeq software package (DNAStar, Inc.) for further analysis.
1.D. Identification of signal sequences
The deduced protein encoded by a target gene sequence was analyzed using the PROTEAN software package (DNAStar, Inc.). This analysis predicts those areas of the protein that are hydrophobic by using the Kyte-Doolittle algorithm, and identifies any potential polar residues preceding the hydrophobic core region, which is typical for many signal sequences. For confirmation, the target protein was then searched against a PROSITE database (DNAStar, Inc.) consisting of motifs and signatures. Characteristic of many signal sequences, and of hydrophobic regions in general, is the identification of predicted prokaryotic lipid attachment sites. Where confirmation between the two approaches is apparent at the N-terminus of any protein, putative cleavage sites were sought. Specifically, this includes the presence of either an Alanine, Serine, or Glycine residue immediately after the core hydrophobic region. In the case of lipoproteins, a Cysteine residue would be identified as the +1 residue, post-cleavage.
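As a non-limiting illustration of the hydrophobicity analysis mentioned in section 1.D, a sliding-window Kyte-Doolittle profile can be sketched as follows. This is not the PROTEAN software; the window size of 9, the 40-residue N-terminal search region, the function names, and the example peptide are all assumptions made for the sketch, and the peptide is hypothetical rather than a sequence of the invention.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean Kyte-Doolittle score of every window along the protein."""
    protein = protein.upper()
    return [
        sum(KD[aa] for aa in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def most_hydrophobic_window(protein: str, window: int = 9,
                            n_terminal: int = 40) -> tuple[int, float]:
    """Start index and score of the most hydrophobic N-terminal window.

    Assumes the searched region is at least `window` residues long.
    """
    profile = hydropathy_profile(protein[:n_terminal], window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


# Hypothetical N-terminus: polar residues, a hydrophobic core, then G-C.
example = "MKKTALLSLLLAVLALSGCSDQNE"
core_start, core_score = most_hydrophobic_window(example)
```

A strongly positive window near the N-terminus, preceded by polar residues, is the kind of pattern the text describes as typical of signal sequences; confirmation against motif databases remains a separate step.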
1.E. Rational design of PCR primers based on the identification of signal sequences
To clone gene sequences as N-terminal translational fusions for the generation of recombinant proteins with N-terminal Histidine tags, the gene sequence that specifies the signal sequence is omitted. The 5'-end of the gene-specific portion of the N-terminal primer is designed to start at the first codon beyond the cleavage site. In the case of lipoproteins, the 5'-end of the N-terminal primer begins at the second codon, immediately after the modifiable residue at position +1 post-cleavage. The omission of the signal sequence from the recombinant allows for one-step purification, and potential problems associated with insertion of signal sequences in the membrane of the host strain carrying the hybrid construct are avoided.
EXAMPLE 2: Preparation of isolated DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13, and production of these polypeptides as histidine-tagged fusion proteins
2.A. Preparation of genomic DNA from Helicobacter pylori
Helicobacter pylori strain ORV2001, stored in LB medium containing 50% glycerol at -70°C, is grown on Columbia agar containing 7% sheep blood for 48 hours under microaerophilic conditions (8-10% CO2, 5-7% O2, 85-87% N2). Cells are harvested, washed with phosphate buffer saline (PBS) (pH ), and DNA is then extracted from the cells using the Rapid Prep Genomic DNA Isolation kit (Pharmacia Biotech).
2.B. PCR amplification
DNA molecules encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 are amplified from genomic DNA, as can be prepared as is described above, by the Polymerase Chain Reaction (PCR) using the following primers:

GHPO 732 (HPO 64):
N-terminal primer: 5'-GCCGGATCCATGACTTATGGGTATGGGGAA-3' (SEQ ID NO: 171); and
C-terminal primer: 5'-GCCCTCGAGACTTTTATTGATTCACCATTTCATT-3' (SEQ ID NO: 172).

GHPO 419 (HPO 54):
N-terminal primer: 5'-GCCGGATCCATCGCTGAAGAAAATGGGGCG-3' (SEQ ID NO: 173); and
C-terminal primer: 5'-GCCCGGCCGCCCTAAAAACTATAAACATAACTC-3' (SEQ ID NO: 174).

GHPO 1398 (HPO 15):
N-terminal primer: 5'-GCCGGATCCGGTATTAGGAAGCTTATACCATC-3' (SEQ ID NO: 175); and
C-terminal primer: 5'-GCCCTCGAGAAGTTCTATTTTTAATTCCTTGAGAG-3' (SEQ ID NO: 176).

GHPO 706 (HPO 50):
N-terminal primer: 5'-GCCGGATCCTCTGATAGCCATAAAGAAAAAAAGGAC-3' (SEQ ID NO: 177); and
C-terminal primer: 5'-GCCCTCGAGATCTTTAGAAATCAACCCCCAAAGC-3' (SEQ ID NO: 178).

GHPO 1190 (HPO 76):
N-terminal primer: 5'-GCCGGATCCGACTTAGAACATTTTAACACGCTC-3' (SEQ ID NO: 179); and
C-terminal primer: 5'-GCCCTCGAGTCATTTTAAACGACTCAAAACANA-3' (SEQ ID NO: 180).

GHPO 986:
N-terminal primer: 5'-GCCGGATCCGGCCAAAGCGTGCGCACTTATTGC-3' (SEQ ID NO: 181); and
C-terminal primer: 5'-GCCCTCGAGTTATTGTTCCAACCCCCACGCATC-3' (SEQ ID NO: 182).

GHPO 1420:
N-terminal primer: 5'-GCCGGATCCAAGAGCAATGCTGATGACAAACC-3' (SEQ ID NO: 183); and
C-terminal primer: 5'-GCCCTCGAGTTATGAGTTAAAGCCCCTTGTCC-3' (SEQ ID NO: 184).

GHPO 1299:
N-terminal primer: 5'-GCCGGATCCGAATCAGTAAAAACAGGAAAAAC-3' (SEQ ID NO: 185); and
C-terminal primer: 5'-GCCCTCGAGCGGCTCTTTGGAGTTTTATTG-3' (SEQ ID NO: 186).

GHPO 13:
N-terminal primer: 5'-GCCGGATCCATCATTCCCTCTCGCTCTATGG-3' (SEQ ID NO: 187); and
C-terminal primer: 5'-GCCCTCGAGACCTTAATGCGTTGCGTTTTCTTT-3' (SEQ ID NO: 188).
The N-terminal and C-terminal primers for each clone both include a clamp and a restriction enzyme recognition sequence for cloning purposes (BamHI (GGATCC) and XhoI (CTCGAG) or NotI (CGGCCG) recognition sequences). The N-terminal primer is designed so that the amplified product does not encode the signal sequence and the potential cleavage site.
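For illustration, the primer layout described above (a 5' clamp, a six-nucleotide recognition sequence, then the gene-specific portion) can be checked with a short Python sketch. The function name is hypothetical. The recognition sequences are those given in the text; note that CGGCCG is the sequence as listed here, whereas the complete NotI recognition site is GCGGCCGC.

```python
CLAMP = "GCC"

# Recognition sequences exactly as listed in the text.
SITES = {
    "GGATCC": "BamHI",
    "CTCGAG": "XhoI",
    "CGGCCG": "NotI (as listed in the text; full NotI site is GCGGCCGC)",
}


def parse_primer(primer: str) -> tuple[str, str, str]:
    """Split a primer into (clamp, enzyme name, gene-specific portion)."""
    primer = primer.upper()
    if not primer.startswith(CLAMP):
        raise ValueError("primer does not start with the GCC clamp")
    site = primer[len(CLAMP):len(CLAMP) + 6]
    if site not in SITES:
        raise ValueError(f"unrecognized site after clamp: {site}")
    return CLAMP, SITES[site], primer[len(CLAMP) + 6:]


# GHPO 732 N-terminal primer (SEQ ID NO: 171) and
# GHPO 419 C-terminal primer (SEQ ID NO: 174), as given above.
n_732 = parse_primer("GCCGGATCCATGACTTATGGGTATGGGGAA")
c_419 = parse_primer("GCCCGGCCGCCCTAAAAACTATAAACATAACTC")
```

Here the gene-specific portion of the GHPO 732 N-terminal primer comes out as ATGACTTATGGGTATGGGGAA, preceded by the clamp and the BamHI sequence.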
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim), which is a proof-reading polymerase, according to general guidance provided by the manufacturer. Because of the exonuclease activity of the polymerase, two reaction mixtures (mixtures 1 and 2) are first prepared separately and combined just prior to amplification. These mixtures are as follows:

Ingredient (final conc.): Mixture 1 (µl) / Mixture 2 (µl)
distilled H2O: 160 / 79
dNTPs (200 µM each): 40
PCR buffer:
primers (100 nM each): 1
DNA template (200 ng): 2

The DNA template is genomic DNA prepared as described above. The PCR buffer contains 100 mM Tris-HCl (pH 8.85), 250 mM KCl, 50 mM (NH4)2SO4, and 20 mM MgSO4.

Amplification is carried out as follows:

Cycling conditions: Temp (°C) / Time (min.) / Number of cycles
Initial denaturing step: 96 / 4 / 1
Denaturing step: 94 / 0.5
Annealing step: 50 / 1
Extension step: 72 / 1
Final extension step: 72 / 5 / 1

2.C. Transformation and selection of transformants
A single PCR product is thus amplified and is then digested at 37°C for 2 hours with BamHI and XhoI or NotI concurrently in a 20 µl reaction volume. The digested product is ligated to similarly cleaved pET28a (Novagen) that is dephosphorylated prior to the ligation by treatment with Calf Intestinal Alkaline Phosphatase (CIP). The gene fusion constructed in this manner allows one-step affinity purification of the resulting fusion protein because of the presence of histidine residues at the N-terminus of the fusion protein, which are encoded by the vector.
The ligation reaction (20 µl) is carried out at 14°C overnight and then is used to transform 100 µl fresh E. coli XL1-blue competent cells (Novagen). The cells are incubated on ice for 2 hours, heat-shocked at 42°C for 30 seconds, and returned to ice for 90 seconds. The samples are then added to 1 ml LB broth in the absence of selection and grown at 37°C for 2 hours.
The cells are plated out on LB agar containing kanamycin (50 µg/ml) at a 10x and neat dilution and incubated overnight at 37°C. The following day, colonies are picked onto secondary plates and incubated at 37°C overnight.
Five colonies are picked into 3 ml LB broth supplemented with kanamycin (100 µg/ml) and are grown overnight at 37°C. Plasmid DNA is extracted using the Qiagen mini-prep method and is quantitated by agarose gel electrophoresis.
PCR is performed with the gene-specific primers under the conditions set forth above and transformant DNA is confirmed to contain the desired insert. If PCR-positive, one of the five plasmid DNA samples (500 ng) extracted from the E. coli XL1-blue cells is used to transform BL21 (λDE3) E. coli competent cells (Novagen; as described previously).
Transformants (10) are picked onto selective kanamycin (50 µg/ml) containing LB agar plates and stored as a research stock in LB containing 50% glycerol.
2.D. Purification of recombinant proteins
One ml of frozen glycerol stock prepared as described in 2.C. is used to inoculate 50 ml of LB medium containing 25 µg/ml of kanamycin in a 250 ml Erlenmeyer flask. The flask is incubated at 37°C for 2 hours or until the absorbance at 600 nm (OD600) reaches 0.4-1.0. The culture is stopped from growing by placing the flask at 4°C overnight. The following day, 10 ml of the overnight culture are used to inoculate 240 ml LB medium containing kanamycin (25 µg/ml), with the initial OD600 about 0.02-0.04. Four flasks are inoculated for each ORF. The cells are grown to an OD600 of 1.0 (about 2 hours at 37°C), a 1 ml sample is harvested by centrifugation, and the sample is analyzed by SDS-PAGE to detect any leaky expression. The remaining culture is induced with 1 mM IPTG and the induced cultures are grown for an additional 2 hours at 37°C.
The final OD600 is taken and the cells are harvested by centrifugation at 5,000 x g for 15 minutes at 4°C. The supernatant is discarded and the pellets are resuspended in 50 mM Tris-HCl (pH ), 2 mM EDTA. Two hundred and fifty ml of buffer are used for 1 liter of culture and the cells are recovered by centrifugation at 12,000 x g for 20 minutes. The supernatant is discarded and the pellets are stored at -45°C.
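The volumes in section 2.D can be checked with simple arithmetic, sketched below in Python. The function names are hypothetical, and the sketch assumes that OD600 scales linearly with dilution and that the starter culture does not grow during the overnight hold at 4°C.

```python
def diluted_od(starter_od: float, inoculum_ml: float, medium_ml: float) -> float:
    """Expected OD600 right after adding the inoculum to fresh medium."""
    return starter_od * inoculum_ml / (inoculum_ml + medium_ml)


def resuspension_buffer_ml(culture_ml: float) -> float:
    """Buffer volume at 250 ml of buffer per liter of culture."""
    return 250.0 * culture_ml / 1000.0


# 10 ml of starter into 240 ml of medium is a 25-fold dilution.
low = diluted_od(0.5, 10, 240)    # starter at OD600 0.5
high = diluted_od(1.0, 10, 240)   # starter at OD600 1.0

# Four flasks of 250 ml (10 ml + 240 ml each) give 1 liter per ORF.
buffer_per_orf = resuspension_buffer_ml(4 * 250)
```

Under these assumptions a starter culture at OD600 0.5 to 1.0 gives an initial OD600 of 0.02 to 0.04, in line with the range stated above, and the four flasks per ORF call for 250 ml of resuspension buffer.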
2.E. Protein purification
Pellets obtained from 2.D. are thawed and resuspended in 95 ml of mM Tris-HCl (pH ). Pefabloc and lysozyme are added to final concentrations of 100 µM and 100 µg/ml, respectively. The mixture is homogenized with magnetic stirring at 5°C for 30 minutes. Benzonase (Merck) is added at a 1 U/ml final concentration, in the presence of 10 mM MgCl2, to ensure total digestion of the DNA. The suspension is sonicated (Branson Sonifier 450) for 3 cycles of 2 minutes each at maximum output. The homogenate is centrifuged at 19,000 x g for 15 minutes and both the supernatant and the pellet are analyzed by SDS-PAGE to detect the cellular location of the target protein in the soluble or insoluble fractions, as is described further below.
2.E.1. Soluble fraction
If the target protein is produced in a soluble form in the supernatant obtained above, NaCl and imidazole are added to the supernatant to final concentrations of 50 mM Tris-HCl (pH ), 0.5 M NaCl, and 10 mM imidazole (buffer A). The mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column (Pharmacia HiTrap chelating Sepharose; 1 ml), which has been charged with nickel ions according to the manufacturer's recommendations. After loading, the column is washed with 50 column volumes of buffer A and the recombinant target protein is eluted with 5 ml of buffer B (50 mM Tris-HCl (pH ), 0.5 M NaCl, 500 mM imidazole).
The elution profile is monitored by measuring the absorbance of the fractions at 280 nm. Fractions corresponding to the protein peak are pooled, dialyzed against PBS containing 0.5 M arginine, filtered through a 0.22 µm membrane, and stored at -45°C.
2.E.2. Insoluble fraction
If the target protein is expressed in the insoluble fraction (pellets obtained above), purification is conducted under denaturing conditions.
NaCl, imidazole, and urea are added to the resuspended pellet to final concentrations of 50 mM Tris-HCl (pH ), 0.5 M NaCl, 10 mM imidazole, and 6 M urea (buffer C). After complete solubilization, the mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column.
The purification procedures on the IMAC column are the same as described for the soluble fraction, except that 6 M urea is included in all buffers used and column volumes of buffer C are used to wash the column after protein loading, instead of 50 column volumes.
The protein fractions eluted from the IMAC column with buffer D (buffer C containing 500 mM imidazole) are pooled. Arginine is added to the solution to a final concentration of 0.5 M and the mixture is dialyzed against PBS containing 0.5 M arginine and various concentrations of urea (4 M, 3 M, 2 M, 1 M, and 0.5 M) to progressively decrease the concentration of urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
Alternatively, when the above purification process is not as efficient as it should be, two other processes may be used as follows. A first alternative involves the use of a mild denaturant, N-octyl glucoside (NOG). Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and is clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 50 mM NaPO4 (pH 7.5) containing 1-2% weight/volume NOG, and homogenized. The NOG-soluble impurities are removed by centrifugation. The pellet is extracted once more by repeating the preceding extraction step. The pellet is dissolved in 8 M urea, 50 mM Tris (pH ). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH ), and is dialyzed against 1 M arginine for 24-48 hours to remove the urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
A second alternative involves the use of a strong denaturant, such as guanidine hydrochloride. Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 6 M guanidine hydrochloride, and passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH ). Beta-mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, then the eluted protein is passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid.
Protein eluted from the column is slowly added to 4 volumes of 50 mM phosphate buffer (pH ). The protein remains in solution.
2.F. Evaluation of the protective activity of the purified protein
Groups of 10 Swiss Webster mice (Taconic Labs) are immunized rectally with 25 µg of the purified recombinant protein, admixed with 1 µg of cholera toxin (Berna) in physiological buffer. Mice are immunized on days 0, 7, 14, and 21. Fourteen days after the last immunization, the mice are challenged with H. pylori strain ORV2001 grown in liquid media (the cells are grown on agar plates, as described above, and, after harvest, the cells are resuspended in Brucella broth; the flasks are then incubated overnight at 37°C).
Fourteen days after challenge, the mice are sacrificed and their stomachs are removed. The amount of H. pylori is determined by measuring the urease activity in the stomach and by culture.
2.G. Production of monospecific polyclonal antibodies
2.G.1. Hyperimmune rabbit antiserum
New Zealand rabbits are injected both subcutaneously and intramuscularly with 100 µg of a purified fusion polypeptide, as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a total volume of approximately 2 ml. Twenty one and 42 days after the initial injection, booster doses, which are identical to priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way.
Fifteen days after the last injection, animal serum is recovered, decomplemented, and filtered through a 0.45 µm membrane.
2.G.2. Mouse hyperimmune ascites fluid
Ten mice are injected subcutaneously with 10-50 µg of a purified fusion polypeptide, as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a volume of approximately 200 µl. Seven and 14 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way.
Twenty one and 28 days after the initial injection, mice receive 50 µg of the antigen alone intraperitoneally. On day 21, mice are also injected intraperitoneally with sarcoma 180/TG cells CM26684 (Lennette et al., Diagnostic Procedures for Viral, Rickettsial, and Chlamydial Infections, Ed. Washington DC, American Public Health Association, 1979). Ascites fluid is collected 10-13 days after the last injection.
EXAMPLE 3: Methods for producing transcriptional fusions lacking His-tags
Methods for amplification and cloning of DNA encoding the polypeptides of the invention as transcriptional fusions lacking His-tags are described as follows. Two PCR primers for each clone are designed based upon the sequences of the polynucleotides that encode them (SEQ ID NOs: 1-169 (odd numbers)). These primers can be used to amplify DNA encoding the polypeptides of the invention from any Helicobacter pylori strain, including, for example, ORV2001 and the strain deposited as ATCC deposit number 43579, as well as from other Helicobacter species.
The N-terminal primers are designed to include the ribosome binding site of the target gene, the ATG start site, and any signal sequence and cleavage site. The N-terminal primers can include a 5' clamp and a restriction endonuclease recognition site, such as that for BamHI (GGATCC), which facilitates subsequent cloning. Similarly, the C-terminal primers can include a restriction endonuclease recognition site, such as that for XhoI (CTCGAG), which can be used in subsequent cloning, and a TAA stop codon.
Amplification of genes encoding the polypeptides of the invention is carried out using Thermalase DNA Polymerase under the conditions described above in Example 2. Alternatively, Vent DNA polymerase (New England Biolabs), Pwo DNA polymerase (Boehringer Mannheim), or Taq DNA polymerase (Appligene) can be used, according to instructions provided by the manufacturers.
A single PCR product for each clone is amplified and cloned into appropriately cleaved pET 24 (e.g., BamHI-XhoI cleaved pET 24), resulting in construction of a transcriptional fusion that permits expression of the proteins without His-tags. The expressed products can be purified as denatured proteins that are refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription of the genes from the T7 promoter, which is supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding sites of the genes, and thereby expression of the complete ORF. The amplification, digestion, and cloning protocols are as described above for constructing translational fusions.
Amplification of clone GHPO 1190 DNA
Design of PCR primers for cloning
Two PCR primers are designed based on the complete gene sequence (see Table 1). The N-terminal primer (FC1) is designed to include the ribosome binding site of the target gene, the ATG start site, and the signal sequence (with cleavage site). It includes a clamp (GCC) at the 5' most end, and a SacI recognition sequence (GAGCTC) for cloning purposes.
The C-terminal primer (RN2) includes an XhoI recognition sequence for cloning purposes, and the natural TAA stop codon.
N-terminal primer (FC1): 5'-GCCGAGCTCCAAGCAAAAAAATGTCAATTAAAAGGG-3' (SEQ ID NO: 189)
C-terminal primer (RN2): 5'-GCCCTCGAGGTCTAAATTAGAATAAGTGTTGTT-3' (SEQ ID NO: 190)
Amplification of each specified gene can be achieved by employing FC1/RN2 primers for any of the genes described (see Table 1).
PCR conditions
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim) under the following conditions. Due to the exonuclease activity of the polymerase, two reaction mixtures are prepared separately and combined just prior to amplification.

Reaction ingredients:
Ingredient (final conc.): Mixture 1 (µl) / Mixture 2 (µl)
distilled H2O: 160 / 79
dNTPs (200 µM each):
10X buffer:
primer 1 (100 nM): 1
primer 2 (100 nM): 1
Template (200 ng): 2 0

Cycling conditions: Temp (°C) / Time (min.) / Number of cycles
Initial denaturing step: 96 / 4 / 1
Denaturing step: 94 / 0.5
Annealing step: 50 / 1
Extension step: 72 / 1
Final extension step: 72 / 1 / 1

A single PCR product of 624 basepairs is amplified and cloned into SacI-XhoI cleaved pET 24, allowing construction of a transcriptional fusion and expression of GHPO 1190 antigen in the absence of a His-tag. In this instance, expressed product can be purified as a denatured protein that is refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription from the T7 promoter, supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding site for GHPO 1190, and thereby expression of the complete ORF. The amplification, restriction, and cloning protocols are as previously described for constructing translational fusions.
EXAMPLE 4: Purification of the polypeptides of the invention by immunoaffinity
4.A. Purification of specific IgGs
An immune serum, as prepared above, is applied to a protein A Sepharose Fast Flow column (Pharmacia) equilibrated in 100 mM Tris-HCl (pH ). The resin is washed by applying 10 column volumes of 100 mM Tris-HCl and 10 volumes of 10 mM Tris-HCl (pH 8.0) to the column. IgG antibodies are eluted with 0.1 M glycine buffer (pH 3.0) and are collected as ml fractions to which is added 0.25 ml 1 M Tris-HCl (pH ). The optical density of the eluate is measured at 280 nm and the fractions containing the IgG antibodies are pooled, dialyzed against 50 mM Tris-HCl (pH ), and, if necessary, stored frozen.
4.B. Preparation of the column An appropriate amount of CNBr-activated Sepharose 4B gel (1 g of dried gel provides for approximately 3.5 ml of hydrated gel; gel capacity is from 5 to 10 mg coupled IgG/ml of gel) manufactured by Pharmacia (17-0430- 01) is suspended in 1 mM HCI buffer and washed with a buchner by adding small quantities of 1 mM HCI buffer. The total volume of buffer is 200 ml per gram of gel.
Purified IgG antibodies are dialyzed for 4 hours at 20±5°C against volumes of 500 mM sodium phosphate buffer (pH ). The antibodies are then diluted in 500 mM phosphate buffer (pH 7.5) to a final concentration of 3 mg/ml.
IgG antibodies are mixed with the gel overnight at 5±3°C. The gel is packed into a chromatography column and is washed with 2 column volumes of 500 mM phosphate buffer (pH ) and 1 column volume of 50 mM sodium phosphate buffer, containing 500 mM NaCl (pH ). The gel is then transferred to a tube, mixed with 100 mM ethanolamine (pH 7.5) for 4 hours at room temperature, and washed twice with 2 column volumes of PBS. The gel is then stored in 1/10,000 PBS/merthiolate. The amount of IgG antibodies coupled to the gel is determined by measuring the optical density (OD) at 280 nm of the IgG solution and the direct eluate, plus washings.
4.C. Adsorption and elution of the antigen
An antigen solution in 50 mM Tris-HCl (pH ), 2 mM EDTA, for example, the supernatant obtained in 3.E. or the solubilized pellet obtained above, after centrifugation and filtration through a 0.45 µm membrane, is applied to a column equilibrated with 50 mM Tris-HCl (pH ), 2 mM EDTA, at a flow rate of about 10 ml/hour. The column is then washed with 20 volumes of 50 mM Tris-HCl (pH ), 2 mM EDTA. Alternatively, adsorption can be achieved by mixing overnight at 5±3°C.
The adsorbed gel is washed with 2 to 6 volumes of 10 mM sodium phosphate buffer (pH 6.8) and the antigen is eluted with 100 mM glycine buffer (pH ). The eluate is recovered in 3 ml fractions, to each of which is added 150 µl of 1 M sodium phosphate buffer (pH ). Absorption is measured at 280 nm for each fraction; those fractions containing the antigen are pooled and stored at 20°C.
EXAMPLE 5: Preparation of isolated DNA encoding the polypeptides of the invention from the deposited clones.
As mentioned above, E. coli strains including plasmids containing nucleic acids encoding GHPO 1190 (formerly HPO76, ATCC# 98197), GHPO 1212 (formerly HPO18, ATCC# 98210), GHPO 1012 (formerly HPO121, ATCC# 98201), GHPO 1501 (formerly HPO45, ATCC# 98208), GHPO 1688 (formerly HPO101, ATCC# 98198), GHPO 346 (formerly HPO116, ATCC# 98200), GHPO 1200 (formerly HPO7, ATCC# 98211), GHPO 1538 (formerly HPO104, ATCC# 98199), GHPO 1398 (formerly HPO15, ATCC# 98214), GHPO 1001 (formerly HPO58, ATCC# 98206), GHPO 470 (formerly HPO132, ATCC# 98202), GHPO 689 (formerly HPO9, ATCC# 98203), GHPO 1550 (formerly HPO38, ATCC# 98204), GHPO 1620 (formerly HPO87, ATCC# 98205), GHPO 574 (formerly HPO71, ATCC# 98217), GHPO 329 (formerly HPO70, ATCC# 98219), GHPO 1374 (ATCC# 98215), GHPO 956 (formerly HPO95, ATCC# 98216), HPO 98 (ATCC# 98218), GHPO 1346 (formerly HPO57, ATCC# 98220), GHPO 706 (formerly HPO50, ATCC# 98207), GHPO 732 (formerly HPO64, ATCC# 98213), GHPO 419 (formerly HPO54, ATCC# 98212), and GHPO 276 (formerly HPO42, ATCC# 98209) were deposited in E. coli strain DH5α under the Budapest Treaty with the American Type Culture Collection (ATCC; Rockville, Maryland) on October 9, 1996 and were designated with accession numbers indicated in parentheses above. These plasmids each contain a genomic DNA BglII-ClaI insert from H. pylori strain P1 or P12 (referred to as 69-A and 888-0 in Haas et al., Mol. Microbiol. (1993) 8:753). Each of the inserts is disrupted by the presence of transposon TnMax9 (Kahrs et al., Gene (1995) 167:53). DNA molecules lacking the transposon can be amplified from the plasmids using standard PCR techniques, such as inverse and recombinant PCR (see, Innis et al., supra), so that a full-length H. pylori insert is reconstituted. For example, the H. pylori sequences flanking the transposon can each be amplified by PCR, and then ligated together to form the full-length H. pylori gene lacking the transposon.
Primers that can be used in these methods for each of the twenty-four deposited clones of the invention are shown in Table 1. The locations of insertion of the transposon in each of the deposited clones are between the nucleotides indicated in parentheses after the name of each clone, as follows: HPOO11 (497-498), GHPO 1538 (428-429), GHPO 346 (433-444), GHPO 1012 (463-464), GHPO 132 (408-409), GHPO 1212 (226-227), GHPO 1550 (347-348), GHPO 276 (372-373), GHPO 1501 (299-300), GHPO 706 (29-293), GHPO 419 (351-352), GHPO 1346 (266-267), GHPO 1001 (434-435), GHPO 732 (224-225), GHPO 329 (114-115), GHPO 574 (274-275), GHPO 1190 (412-413), GHPO 1200 (349-350), GHPO 1374 (105-106), GHPO 1620 (26-27), GHPO 956 (64-65), HPO 98 (43-44), and GHPO 689 (346-347).
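The reconstitution of a transposon-free gene from two flanking PCR products can be illustrated with a short Python sketch. It uses the nucleotide positions listed for clone 76 (GHPO 1190) in the primer table (FC1 304-330, RN1 413-391, FC2 404-436, RN2 927-904); the function names and the toy sequences are hypothetical, and the joining step is a plain string model of overlapping fragments, not a laboratory protocol.

```python
def overlap_length(rn1_5prime: int, fc2_5prime: int) -> int:
    """Nucleotides shared by the two flanking PCR products."""
    return rn1_5prime - fc2_5prime + 1


def product_length(fc1_5prime: int, rn2_5prime: int) -> int:
    """Length of the reconstituted gene segment between the outer primers."""
    return rn2_5prime - fc1_5prime + 1


def join_by_overlap(left: str, right: str, overlap: int) -> str:
    """Join two fragments whose ends share `overlap` identical nucleotides."""
    if left[-overlap:] != right[:overlap]:
        raise ValueError("fragments do not share the expected overlap")
    return left + right[overlap:]


# Positions listed for clone 76 (GHPO 1190).
clone76_overlap = overlap_length(rn1_5prime=413, fc2_5prime=404)
clone76_product = product_length(fc1_5prime=304, rn2_5prime=927)

# Toy fragments (not H. pylori sequence) sharing a 10-nucleotide overlap.
left_fragment = "ATGAAACCGG" + "TTACGTAGCA"
right_fragment = "TTACGTAGCA" + "GGGCCCTTTAAA"
joined = join_by_overlap(left_fragment, right_fragment, 10)
```

With these positions the overlap is 10 nucleotides and the product spans 624 positions, which agrees with the 10-nucleotide overlap stated in the table legend and the 624 basepair product reported in Example 3; the overlap region (404-413) also spans the transposon insertion site given for GHPO 1190 (412-413).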
EXAMPLE 6: Purification of recombinant H. pylori antigen from GHPO 1190.
A pellet of E. coli expressing GHPO 1190 is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi, and clarified by centrifugation at 4,000-5,000 x g.
Method 1
The pellet containing cloned protein is suspended in buffer containing 2% N-octyl glucoside (NOG) and is homogenized. The NOG soluble protein is removed by centrifugation. The pellet is extracted one more time with 2% NOG. After centrifugation, the pellet is dissolved in 8 M urea. The urea solubilized protein is diluted with an equal volume of 2 M arginine and dialyzed against 1 M arginine for 24-48 hours to remove urea. The cloned protein remains in solution. SDS-PAGE and Coomassie staining, followed by densitometric scanning, shows that the protein is 80-85% pure cloned antigen.
Method 2
The pellet containing cloned protein is solubilized in 6 M guanidine hydrochloride and is passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH ). β-Mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, and the protein is then passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the Sephadex G-25 column is slowly added to 4 volumes of 50 mM phosphate (pH ). The protein remains in solution.
Purification of recombinant proteins
Recombinant proteins expressed as Histidine-tagged fusion proteins can be solubilized and purified by using a metal affinity column (nickel column).
The bound protein can be eluted with imidazole buffer, with or without urea, or by using low pH buffers, with or without urea. Urea or guanidine hydrochloride-denatured proteins can then be renatured using appropriate renaturing buffers. With a number of recombinant H. pylori antigens (HpaA and clone GHPO 1190), renaturation conditions using arginine hydrochloride (0.25-1 M) have been determined.
Recombinant proteins without a His-tag can be solubilized and purified using immunoaffinity, ion-exchange, sizing, and/or hydrophobic chromatography. Proteins expressed as insoluble aggregates in inclusion bodies can be solubilized in denaturing agents, such as 8 M urea or 6 M guanidine hydrochloride. Appropriate folding and renaturation can readily be determined by one skilled in the art.
The above pellet containing cloned protein is suspended in 50 mM NaPO4 (pH 7.5) containing 1% weight/volume N-octyl glucoside (NOG) and mixed vigorously. The NOG-soluble impurities are removed by centrifugation.
The remaining pellet is extracted one more time with the 1% NOG solution to further remove impurities. After centrifugation, the pellet is solubilized in 8 M urea, 50 mM Tris (pH ). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH ), and is dialyzed against 1 M arginine, 50 mM Tris, 50 mM NaCl (pH 8.0) for 24-48 hours to remove urea. The cloned protein remains in solution following dialysis. SDS-PAGE and Coomassie staining followed by densitometric scanning shows that the protein is 80-85% pure cloned antigen.
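The dilution step above can be worked through numerically. The sketch below is illustrative only and assumes that volumes are additive on mixing; the function name `mix_equal_volumes` is hypothetical. It shows that combining the 8 M urea extract with an equal volume of 2 M arginine brings the mixture to 4 M urea and 1 M arginine, so the sample already matches the arginine concentration of the dialysis buffer before urea is removed.

```python
def mix_equal_volumes(solution_a, solution_b):
    """Molar concentrations after mixing equal volumes of two solutions.

    Each argument maps a solute name to its molarity (mol/L).
    Assumes additive volumes, so every concentration is simply averaged.
    """
    solutes = set(solution_a) | set(solution_b)
    return {
        s: (solution_a.get(s, 0.0) + solution_b.get(s, 0.0)) / 2
        for s in solutes
    }


# Compositions taken from the paragraph above (pH values are not modelled).
urea_extract = {"urea": 8.0, "Tris": 0.05}
arginine_diluent = {"arginine": 2.0, "Tris": 0.05}

mixed = mix_equal_volumes(urea_extract, arginine_diluent)
for solute in sorted(mixed):
    print(f"{solute}: {mixed[solute]:.2f} M")
```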
Other embodiments are within the following claims.
RECONSTITUTION OF A COMPLETE ORF BY RECOMBINANT PCR
'F' denotes forward primer
'R' denotes reverse primer
'C' denotes coding strand
'N' denotes non-coding strand
All FC1 and RN2 primers have incorporated at their 5' end a clamp and a recognition sequence for cloning purposes
GGC clamp present for amplification and cloning of entire gene sequence from chromosomal DNA
[X] denotes any nucleotide sequence not present in the completed gene sequence
( ) identifies region of overlap between the two original PCR products, and is consistently 10 nucleotides long for each clone
CLONE Primer No. type 76 FC1 RN1 FC2 RN2 18 FC1 RN1 FC2 RN2 121 FC1
RNI
FC2 RN2 FC1
RN
FC2 RN2 nt positions 304 330 413 -391 404 -436 927 904 Primer sequence 3') GCC [XJ CAAGCAAAAAMATGTCAATTAAAAGGG
TMAGCCATACGATAGCCTATG
(TATGGAACTTA) GAACATTTAACACGGTCTATTA GCC [XI GTCTAAATTAGAATAAGTGTTGTT GCC AATATATGGGMGCTTAATGAGAAT
TGCGAGAMTAACCTGTMTCA
(AAATCTCGCA) GAAATCTTrCACAAGCGAGCAA GCC ATGTCATGTCAAACTATGAAGC GCC JXI TCACAATGGATAAAAACAACAACA GCCCTTTTGTTTAGGGG1TAG (ACAAAAGGGC) TTTAGAGCATGTGAGCCATC GCC [XI CTGTCCAAATCAGCCACGC GCC [XI AIGAAAAGATUGA1TTrGYFIMATC MAGCCGTATTG1TG1TTGGC
(AATACGGCTTTAAAGCTATAGAAAATTTAAACGC)
GCC [XI TTAAATATCCCAATCCTGCCAC Length of gene seq.
27 22 33 24 24 22 32 22 24 21 32 19 26 22 34 22 Tin (oC) 141 -164 451 4 '73 455 485 814 -796 1 -26 299 -278 290 323 603 -582 101 FC1 nN1 FC2 flN2 308 -332 497 -474 488 -519 893 -869 236 -259 434 -416 425 -456 812 790 195 -220 349 -327 339 -371 738 -717 Un m
U)
m 104 FC1 RN 1 FC2 RN2 58 FC1 RN 1 F02 RN2 -271 -407 -452 -761 -143 -413 -454 *630 -314 -378 -430 *741 GOC [XI GMAGGATA1TATGATTAAAAGAA
AACCTAAT-TTGAAATTCAAACCAT
(AAATTAGGTT) TTGTAGGCTMGCCAATAAATG GCC [XI AAGGAATAAATTAGAAAGTGAAGAA 0CC [X]CGCATTGATTTGATGAATAAACC
CGCCTATMACCGCTCCATT
(GTTATAGGCG) ATAAAGGTTTAACGCAGCTAAG 0CC [XI CTCACTAAAAAGCAATITTTGAG GCC [XI TAAGGAATGAAGTTGATAAAATTTGT GCATTrCAT-rCATTC1TrGGAC (ATGAAAATGC) ACGCCCAAATAATAAGGAAGTA GCC (XI GGATrTATTGAGCTTMCCCT GCC lxi AAAGGGCGAMAATGAGCAAGA
TAAAATAACCAACAGAGTGATCA
(GGTTATTTA) GTGGATATTTGGGTT-TATAGCGA GCC (XI TFrTAAGAATCACT1TTCTTCGG GCC lxi ATAGGAACAAGCATGTTTVAAAAC TGAAGTCT-rGCGA1TTTGCTT (CAAGACTTCA) MAAAG(AAGGAGCGGTrGCC GCC IXl CTGGCTTATTGCGTATCATC GGC [XI GGAAGAATAATGCTCGCTTCC
ACTGGAGTGTGGATAAAACTAT
(ACACTCCAGT) AGATGCT17CCCGGATAT1C GCC [XI CTA1TCTCCAGGGATATGGCC 0CC [XI GATGGATTFTTATGGGGGTGAG
GGCACTGCCGCAGATTCTA
(CGGCAGTGCC) 11TAGCCTATTAUTAGMGCGA GCC [XI ATGGTAT17GTCTAAGACCCTC 600 620 62 62 62 N) 211 -233 347 -328 338 -370 686 -665 38 FC1 RN 1 FC2 RN2 7 1 FCI RN1 FC2 RN2 FC I RN 1 FC2 RN2 FC1 RN1 FC2 RN2 FC1
RNI
FC2 RN2 98 FC1
RNI
FC2 RN2 42 FCI RN1 FC2 RN2 220 348- 239- 597- 1 -25 274 -254 265 -294 524 -505 1 -23 115 -96 106 -137 495 -471 1 25 106 95 97 -127 435 415 1 -27 64 -46 55 -98 432 -408 1 -22 43 -23 34 -62 336 -313 18-51 380 -35 1 366-396 82 2-80 1 GCC AAAAGGG ITIITIAAATAATGGCTG
ACAAGGATAAAAAACGCGCTAA
(rTATCC1TGT) TGCTGGCT-rGG1TFFFFMrAATT GCC JXJ AAGATTCTAAAAGGGCTTCAAAT GCC IX) ATGTTGAAATTTrAAATATGGTTTGA
AAACCCCACTCTTATCATCGG
(AGTG6G1GTM 1TITAGGGGGTGGGTATGCT GCC GAGCCTACAGGTTGCTTGC 6CC [X ATGGTATTTGACAGAACAATCAG GAAACCACCCCGCT-rAT-r (GTGGCTTTTC) AAAAAGAGTGGGTGCAAcAATTr 6CC TTAGGAATAGCATAACAAACAAACG GCC [XI ATGTTAGAAAAATIrGA17GAAAGAG
TGAACACATAGCCTAAAMCCAC
(TATGTGTTCA) TGAAAGAGTTGTGGCACATGC GCC [XI TTATGCGATAGGGGGCGTATC GOC [XI ATGAAAAAATTTTTTCTCAATCTT TGGCCAGTAGCGCG1TCAT (CTACTGGCCA) TGGATGGCAATGGCGT1TFFAG GCC [XI TTATTGATGAACATTAACCATTAAA 6CC (XI ATGAAAACCTITAAAAACCTGC
TAGCGATCAGGCTAAAACAGA
(CTGATCGCTA) TGAGfl6GCTCCAAGCGGA 6CC [XI TTAAAACTCATAGCGTTT1TrCAAT 6CC GAGAGTAGTGGCAGAGTTTATGCTGATTCC
(AACTTFFC)TCTATCCCAATTCGTTACGCTC
(GGATAGA)GAAAAGMTGGCGTCAAAAGTTGG
GCC [XI GGCTTAAACTGGAACGGATrC 23 22 33 23 21 30 20 23 20 32 25 25 21 31 21 27 19 34 2 22 21 29 24 34 31 22 600 600 60 iN 630 66 G62 62 62 62A 66 FCl RN1 FC2 RN2 64 FC1 RIN1 FG2 RN2 140-170 297-270 287-314 60 7-584 23-5 0 225-149 216-244 1039-1012 21-48 352 -32 7 345 -37 6 1280-1255 14-35 157-132 147-179 377-34 9 13-39 267 -24 4 258-294 957- 93 4 1-22 27-3 18-50 519-4 98 GCC TAAAGTTTGCTAAAAAGATGGTTTTrAATT
(GACTTCTAAAG)CGTCCTITTITTITTICTTTA
(C1TrTA)GAAGTGATTAAACAAAGAGGGGT GCC [XI CCCATCT-TTAGAAATCAACCCCCA GCC NX GAAATCAAGGAGTFTGTATGCAACAGCG (A)AGCTTlTTCATTATCTTCCCCATAAGC
(TGAAAAGCT)TAGCGAAGCGATCAAGCC
GCC [XI CCCAATACTTTTATTGATTCACCAT-TTC GCC [X1 CAATAAAACACCAAAATGAATGAOTTAC (A)GATFFVG1TVGAGCGTTAGAAATG
(CAAAATC)TATAAACTCAATCAAGTGAAAAATG
GCC [XI GCATTTACCCCCTAAAAACTATAAAC GOC KX CTGMAGGGTGTATGGTAU-AGG
(C)ACCATACATGTATCCTGCATTAATG
(CATGTATGGT)GTAGCAAAGAATIAAGGAGGC
GCC [XI CGTTAAAACTAAAGTTCTA1TTTAATTC GOC [X1 GTAAGGAATGAGATGATAMAGAGTTGG
(TGGMTATTCTGATCCACGCCATC
(GAATATTrCC)AAAAGCCGTFITTTATTACAGAAGAC GCC [XI CTAAACTCTGGCTTATTGCGTATC GOC [XI ATGCGT7ATTATTGTGGTGGG
(C)AATACCCACCACAATAATAAACGGAT
(GTGGGTATT)GGTATTATCGCTCT1TTAAATCC GCC [Xl TrAAA1TITAGGGAAAGGGTA FC1 RN1 FC2 RN2 57 FC1 RN1 FC2
RN?
87 FC1 RN1 FC2 RN2
CONDITIONS FOR RECOMBINANT PCR
Two independent PCR reactions are carried out for the FC1/RN1 and FC2/RN2 primer pairs under the same conditions proposed for cloning genes for expression.
After 20 cycles, the product of each reaction is used as template for a further 20 cycles with FC1/RN2 only. The product will encompass the full-length gene minus the transposon.
The presence of restriction sites at the 5' ends of these primers allows for cloning/expression studies.
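The two-stage amplification described above can be sketched in silico. This is an illustrative model under stated assumptions, not the applicants' procedure: the gene, insertion site, and transposon placeholder are hypothetical toy values, the primers are assumed to end exactly at the insertion junctions, and only the coding strand is tracked. The 10-nucleotide overlap is the value given in the table legend.

```python
def join_by_overlap(product_1: str, product_2: str, overlap_len: int = 10) -> str:
    """Fuse two PCR products that share `overlap_len` nucleotides at the junction."""
    if product_1[-overlap_len:] != product_2[:overlap_len]:
        raise ValueError("products do not share the expected overlap")
    return product_1 + product_2[overlap_len:]


# Hypothetical 40-nt "gene" and transposon placeholder, for illustration only.
gene = "ATGAAAGCTCTGACCGGTATTCGTGAAGCTTGGCAGTAAC"
site = 20                      # assumed insertion point (between nt 20 and 21)
transposon = "n" * 30          # stands in for the inserted transposon
disrupted = gene[:site] + transposon + gene[site:]

# Stage 1a: FC1/RN1 amplify the region upstream of the insertion.
product_1 = disrupted[:site]

# Stage 1b: FC2 carries a 5' tail equal to the last 10 nt of product_1,
# then primes immediately downstream of the transposon; RN2 ends the gene.
tail = product_1[-10:]
product_2 = tail + disrupted[site + len(transposon):]

# Stage 2: the overlapping products prime each other; FC1/RN2 amplify the fusion.
restored = join_by_overlap(product_1, product_2)
print(restored == gene)
```

The fused product contains no transposon sequence because neither first-stage product extends into the insertion; the shared 10-nucleotide tail is what lets the two fragments anneal in the second stage.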
SEQUENCE LISTING
GENERAL INFORMATION
APPLICANT: ORAVAX, INC.
(ii) TITLE OF THE INVENTION: HELICOBACTER POLYPEPTIDES AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
(iii) NUMBER OF SEQUENCES: 190
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Clark Elbing LLP
STREET: 176 Federal Street
CITY: Boston
STATE: MA
COUNTRY: USA
ZIP: 02110-2214
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette
COMPUTER: IBM Compatible
OPERATING SYSTEM: DOS
SOFTWARE: FastSEQ for Windows Version
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: UNKNOWN
FILING DATE: 14-NOV-1997
CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/749,051
FILING DATE: 14-NOV-1996
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/831,309
FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/834,705
FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/833,457
FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/881,227
FILING DATE: 24-JUN-1997
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/902,615
FILING DATE: 29-JUL-1997
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Clark, Paul T.
REGISTRATION NUMBER: 30,175
REFERENCE/DOCKET NUMBER: 06132/028WO1
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 617-428-0200
TELEFAX: 617-428-7045
TELEX:
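Each nucleotide entry in this listing gives a Coding Sequence LOCATION, and the paired protein entry gives a length in amino acids. As a hedged consistency check, the sketch below confirms that the coding spans for SEQ ID NOs 1, 3, 7 and 9 divide evenly into codons and match the lengths stated for SEQ ID NOs 2, 4, 8 and 10. The function name `codon_count` is hypothetical, and the check says nothing about the individual residues.

```python
def codon_count(start: int, end: int) -> int:
    """Number of codons in a coding span given 1-based inclusive coordinates."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


# (nucleotide entry, CDS start, CDS end, stated protein length in amino acids)
entries = [
    ("SEQ ID NO:1", 71, 940, 290),
    ("SEQ ID NO:3", 112, 471, 120),
    ("SEQ ID NO:7", 142, 2682, 847),
    ("SEQ ID NO:9", 149, 913, 255),
]

for name, start, end, stated in entries:
    codons = codon_count(start, end)
    status = "matches" if codons == stated else "differs from"
    print(f"{name}: {codons} codons, {status} the stated {stated} amino acids")
```

In all four cases the span equals exactly three times the protein length, which indicates that the stated coding locations exclude the stop codon.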
INFORMATION FOR SEQ ID NO:1: SEQUENCE CHARACTERISTICS: LENGTH: 989 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 71...940 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: CTATGACGAT TGTCTCGCTT TTAGAAAACA CTCTAATCGC TTTTGAAAAA CAACAAAGGA AGGGATTTTA ATG AAA TTT TTA Met Lys Phe Leu 1 TCT GTT TAT GCA Ser Val Tyr Ala TGC TCC AGT Cys Ser Ser TGG GTA Trp Val GGG ACG ATT GTT Gly Thr Ile Val GTG CTG TTG GTT ATC TTT TTT ATC GCG Val Leu Leu Val Ile Phe Phe Ile Ala 157 GCC TTT ATC ATT CCC TCT CGC TCT ATG GTT GGC ACG CTC TAT GAG Ala Phe Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu GGC GAC ATG CTC TTT GTC AAA AAG TTT Gly Asp Met Leu Phe Val Lys Lys Phe
TCT
Ser 55 TAC GGC ATA CCC Tyr Gly Ile Pro ATT CCT Ile Pro AAA ATC CCA Lys Ile Pro
TGG
Trp ATT GAG CTT CCT Ile Glu Leu Pro
GTT
Val 70 ATG CCT GAT TTT Met Pro Asp Phe AAA AAT AAC Lys Asn Asn GGA CAT TTG ATA GAG GGG GAT CGC CCT AAG CGT GGC GAA GTG GTG GTG Gly His Leu Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val WO 98/2 1225 PTU9/15 PCTfUS97/21353 TTT ATC Phe Ile CCT CCC CAT GAA Pro Pro His Glu AAG TCT TAC TAT GTT AAA AGG AAT TTT Lys Ser Tyr Tyr Val Lys Arg Asn Phe 105 ATT GGA GGC GAT Ile Gly Gly Asp
GAG
Giu 115 GTG TTG TTC ACT Val Leu Phe Thr
AAT
Asn 120 GAG GGT TTT TAT Giu Gly Phe Tyr CAC CCT TTT GAG His Pro Phe Giu
AGC
Ser 130 GAC ACG GAC AAA Asp Thr Asp Lys TAC ATC GCT AAA Tyr Ile Ala Lys CAT TAC His Tyr 140 CCT AAC GCC Pro Asn Ala CCT TAT AAA Pro Tyr Lys 160 ACA AAA GAA TTT Thr Lys Giu Phe GGT AAA ATT Gly Lys Ile TTT GTT TTA AAC Phe Val Leu Asn 155 AAA GAC AAT CAA Lys Asp Asn Giu 170 AAT GAG CAT CCG Asn Ciu His Pro ATC CAT TAC CAA Ile His Tyr Gin ACC TTC Thr Phe 175 CAC TTA ATG GAG His Leu Met Giu
CAA
Gin 180 TTA CCC ACT CAA GGC OCA GAA GCT AAT Leu Ala Thr Gin Gly Aia Giu Ala Asn 185 ACC ATG CPA CTC Ser Met Gin Leu CP.A ATG GAG GGC Gin Met Glu Gly AAC GTG TTT TAT Lys Val Phe Tyr AAA ATC PAT GAC Lys Ile Asn Asp
GAT
Asp 210 GPA TTT TTC ATG Giu Phe Phe Met
ATC
Ile 215 GCC GAC PAC AGA Gly Asp Asn Arg GAC PAT Asp Asn 220 TCT ACC GAC Ser Ser Asp GCT TCG CCA Gly Ser Pro 240 CGC TTT TGG GGG Arg Phe Trp Gly
ACT
Ser 230 GTG GCT TAT AA Val Ala Tyr Lys AAC ATC GTG Asn Ile Val 235 AAT AGC CTA Asn Ser Leu TCC TTT OTT TAT Trp Phe Val Tyr ACT TTC ACT TTA Ser Leu Ser Leu
AAA
Lys 250 GPA ATG Giu Met 255 GAT GCA CPA PAT Asp Ala Giu Asn
PAC
Asn 260 CCT AAA AAA CC Pro Lys Lys Arg
TAT
Tyr 265 CTG CTG CGT TGG Leu Val Arg Trp
CA
Giu 270 CCC ATG TTT AA Arg Met Phe Lys
AGC
Ser 275 GTT GGA CCC TTA Val Giy Gly Leu
CPA
Giu 280 AAA ATC ATT AAA Lys Ile Ile Lys CPA PAC CCA ACG Ciu Asn Ala Thr CAT TACGTTTTT TCTGCPATTT TTTGATTTCT CTTTACPAAC T His 290
TTTATTAC
WO 98/21225 PCT/US97/21353 -77- INFORMATION FOR SEQ ID NO:2: SEQUENCE CHARACTERISTICS: LENGTH: 290 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: Met Lys Phe Leu Arg Ser Val Tyr Ala Phe 1 Thr Ile Leu Trp Ile Pro Gly Glu Met 145 Asn Leu Gln Asp Ser 225 Trp Ala Phe Thr Ile Ser Lys Leu Asp Lys 100 Val Thr Glu Pro Gln 180 Gln Phe Trp Tyr Asn 260 Val Val Val Tyr Pro Arg Tyr Asn 120 Tyr Lys Tyr Gln Glu 200 Gly Ala Ser Arg Glu 280 Ser Ile Tyr Ile Asn Val Asn Tyr His 140 Leu Asn Ala Tyr Asp 220 Ile Ser Arg Lys Ser Ala Glu Pro Asn Val Phe Leu 125 Tyr Asn Glu Asn Lys 205 Asn Val Leu Trp Lys 285 Gly Phe Met Pro Leu Pro Gly Phe Ala Lys 160 His Met Asn Asp Pro 240 Asp Met Ala INFORMATION FOR SEQ ID NO:3: SEQUENCE CHARACTERISTICS: WO 98/21225 PCT/US97/21353 -78- LENGTH: 514 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 112...471 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: GGATTTTTTA GAGCTCTTAG TCAATGATAA TGTGGTAGAA ACGATTGAAA AAGGCTTTGT GATAGGTTTT GGAGCGGGGG ATATTACCTA TCAATTAAGA GGCGAAATGT A ATG GGT Met Gly 1 GCA GTG GTT Ala Val Val GTT TTA TTT TTA Val Leu Phe Leu
ACG
Thr CTG GTT TTA TTG TTT TTA GTT TTA Leu Val Leu Leu Phe Leu Val Leu 165 AGG GAT TTT GGT TTA GCA Arg Asp Phe Gly Leu Ala CCC AAA CAA AAG ATT TTA GCT TTT TTA Pro Lys Gin Lys Ile Leu Ala Phe Leu
ATC
Ile GTA GGG ATT ATA Val Gly Ile Ile
GGA
Gly 40 GCG AGC ATC AGC Ala Ser Ile Ser
GTT
Val TAT ACT TAC AAG Tyr Thr Tyr Lys AAC CAA CAA AAC Asn Gin Gin Asn CAA GAG ATC GCT Gin Glu Ile Ala CAA AGA GCG TTT Gin Arg Ala Phe TTA AGG Leu Arg GGG GAA ACC Gly Glu Thr AAT TTA GTG Asn Leu Val
TTG
Leu TTG TGT AAA GGC Leu Cys Lys Gly ATT AAA GTC AAT AAC CAA ACC TTT Ile Lys Val Asn Asn Gin Thr Phe 75 TTT TTA GGC AAA AAA CAA ACC CCT Phe Leu Gly Lys Lys Gin Thr Pro AGC GGA ACT TTA Ser Gly Thr Leu ATG AAA Met Lys 100 GAC GTT CTT GTG Asp Val Leu Val
GAT
Asp 105 TTG GAT TCT TGT Leu Asp Ser Cys CAG ACG Gin Thr 110 CTC CAA AAA Leu Gin Lys GAT CCC TTA ATC CAA Asp Pro Leu Ile Gin
CCC
Pro 120 TAATGATGAA TAATAATAAT ACCCCACCCA AACCCCTA
GAAGA
INFORMATION FOR SEQ ID NO:4: WO 98/21225 PCT/US97/21353 -79- SEQUENCE CHARACTERISTICS: LENGTH: 120 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: Met Gly Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu 1 5 10 Val Leu Arg Asp Phe Gly Leu Ala Ser Pro Lys Gln Lys Ile Leu Ala 25 Phe Leu Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Val Tyr Thr Tyr 40 Lys Gin Asn Gin Gin Asn Gin Gin Glu Ile Ala Leu Gin Arg Ala Phe 55 Leu Arg Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Val Asn Asn Gin 70 75 Thr Phe Asn Leu Val Ser Gly Thr Leu Ser Phe Leu Gly Lys Lys Gin 90 Thr Pro Met Lys Asp Val Leu Val Asp Leu Asp Ser Cys Gin Thr Leu 100 105 110 Gin Lys Asp Pro Leu Ile Gin Pro 115 120 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1233 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 135...1049 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID GTTTTTAATT TAATATTCAT TAAGCTTTTG TGGCTATTCC ATTTTAATTT TGTTTTTCAT TAAAACCCAA TCTAAAATCT TATTTTTATG ATAAAATACC TAATCATAAT ATCAAATCTT 120 AAACCAACGA AACC ATG AAA AAA GCT CTC TTA CTA ACT CTC TCT CTC TCG 170 Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser 1 5 TTC TGG CTC CAC GCT GAA AGG AAT GGA TTT TAT TTA GGT TTA AAT TTT 218 Phe Trp Leu His Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe 20 WO 98/21225 WO 9821225PCT/US97/21353 CTA GAA Leu Giu GGA AGC-TAT ATT Gly Ser Tyr Ile GGA CAA GGT Gly Gin Giy AGC ATC Ser Ile GGC AAA AAA GCT Gly Lys Lys Ala
TCA
Ser GCA GAA AAC 0CC TTA AAT GAA GCG ATC AAT AAC GCA AAA AAT Ala Giu Asn Aia Leu Asn Giu Aia Ile Asn Asn Aia Lys Asn TTA TTC CCT AAC Leu Phe Pro Asn
ACA
Thr AAA 0CC ATA AGA Lys Aia Ile Arg
GAT
Asp 70 GCA CAA AAC GCC Aia Gin Asn Ala TTA AAT Leu Asn GCA GTG AAA Aia Vai Lys GGA TCG GGC Giy Ser Gly TCA AAC AAA ATC Ser Asn Lys Ile
GCT
Aia AGC CGA _TTC GCA Ser Arg Phe Ala GGA AAT GGT Giy Asn Gly AAA TAT TTT Lys Tyr Phe GGT CTT TTT AAT Gly Leu Phe Asn CTC AGC TTT GGG Leu Ser Phe Gly TTG GGT Leu Gly 110 AAA AAA AGG ATT Lys Lys Arg Ile G TTT AGG CAC Gly Phe Arg His CTT TTT TTC GGT Leu Phe Phe Giy
TAC
Tyr 125 CAA CTT GGT GGC Gin Leu Giy Giy
GTT
Vali 130 GGT TCT GTT CCT Giy Ser Val Pro
GT
Giy 135 AGC GGT TTA ATC Ser Gly Leu Ile
GTT
Val1 140 TTT TTA CCC TAT Phe Leu Pro Tyr TTC AAT ACG OAT Phe Asn Thr Asp
TTG
Leu CTC ATT AAT TGG Leu Ile Asn Trp ACT AAC Thr Asn 155 OAT AAG CGA Asp Lys Ary TCT ATA TTT Ser Ile Phe 175 TCC CAA AAA TAT Ser Gin Lys Tyr
GTT
Vai 165 GAA CGA AGO GTA Giu Arg Arg Vai AAA 000 CTC Lys Oiy Leu 170 OCT AAT ACA Ala Asn Thr TAC AAA OAT ATG Tyr Lys Asp Met GOC AGA ACO CTA Oly Arg Thr Leu TTA AAA Leu Lys 190 ATT GOC Ile Oly 205 AAA OCA TCA AGO Lys Ala Ser Arg
CAT
His 195
GT
Gly GTA TTT AGA AAA Vai Phe Arg Lys TCA 000 CTT GTG Ser Gly Leu Val ATO GAA CTA Met Giu Leu 000 Giy 210 AGC ACT TOO Ser Thr Trp
TTT
Phe 215 AGT AAC AAT Ser Asn, Asn ACC CCT TTC AAT Thr Pro Phe Asn GTC AAG AGT CGC Val Lys Ser Arg
ACO
Thr 230 ATT TTT CAG Ile Phe Gin TTG CAA OGA Leu Gin Gly 235 OAT CGC TAT Asp Arg Tyr 250 AAA TTT GOC Lys Phe Gly
OTT
Vai 240 COT TOG AAT AAT Arg Trp Asn Asn
OAT
Asp 245 GAA TAC GAT ATT 0Th Tyr Asp Ile WO 98/21225 PCT/US97/21353 GGC GAT GAA Gly Asp Glu 255 GTG CCA GCG Val Pro Ala 270 ATC TAT CTT GGA GGT TCT AGT GTT GAA TTA GGG GTT AAA Ile Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys 260 265 TTT AAA GTC AAT TAC TAT AGC GAT GAT TAT GGG GAT AAA Phe Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys 275 280 AAA AGA GTG GTG AGC GTT TAT CTT AAC TAT ACA TAT AAC Lys Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn 290 295 300
TTG
Leu 285 GAT TAT Asp Tyr 938 986 1034 1090 1150 1210 1233 TTT AAA AAC AAA CAT TAAAACACGC TTTTTACCGC TCTTTAGTTG GTTTTTTAAA A Phe Lys Asn Lys His 305 AACCTTATTT TTTATTAGCT TGAAACTCTT CAAAGCCTTT TTTTCTCAAT TGGCATGCCG GGCATTTATC GCAACCATAA CCATAAGCAT GCAAAATCTT TCGCTCTCCT TGATAGCAGG TGTGCGTTTC TTTGATGACT AAA INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 305 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: Met Lys Lys Ala Leu Leu Leu Thr Leu Ser 1 Ala Tyr Ala Thr Ser Leu Arg Gly Gly 145 Ser Glu Ile Leu Lys Asn Phe Ile Val 130 Phe Gln Asn Gly Glu Ile Ile Glu 10.0 Gly Ser Thr Tyr Gly Gin Ala Arg Ala Leu Phe Val Asp Val 165 Gly Gly Ala Asn Ala Tyr 105 Leu Gly Asn Val Leu Asn Lys Asn Leu Asn Tyr Phe Ile Thr 155 Gly Ser Phe Ala Ser Asn Gly Phe Gly Val 140 Asn Leu Phe Leu Ser Leu Ala Gly Leu Tyr 125 Phe Asp Ser WO 98/21225 PCT/US97/21353 -82- Lys Asp Met Ser Arg His 195 Leu Gly Gly Gly Arg Thr Leu Ala Asn Thr Gly Leu Val Leu Lys Lys Ala 190 Ile Gly Met Glu Phe Arg Lys Ser 200 Ala 205 Thr Ser Thr Trp Ser Asn Asn 210 Gln Val Leu 220 Gly Pro Phe Asn Lys Ser Arg Phe Gln Leu Lys Phe Gly 225 Arg Trp Asn Asn Tyr Asp Ile Tyr Gly Asp Glu Ile 255 Tyr Leu Gly Lys Val Asn 275 Arg Val Val 290 Gly 260 Tyr Ser Val Glu Leu 265 Tyr Val Lys Val Tyr Ser Asp Asp 280 Asn Gly Asp Lys Pro Ala Phe 270 Asp Tyr Lys Lys Asn Lys Ser Val Tyr Tyr Thr Tyr Asn 300 INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 3012 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 142...2682 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
AATGACGGCT
AAAAATGAAG
TCCAATCAAA
CTAAACCAAA CGATTTGACT TCTCCAAAAG CTCCAAAAAA TGAAGTTCAA AGAAATGAAG CGCCTAAAGA A ATG AAA GTC AAG TCC Met Lys Val Lys Ser 1 5 AAGCCTCTCA AGAATCTCAA CTCAAAAAGA AACCCCCCAA ATT TCT TAT GTC GGG Ile Ser Tyr Val Gly ATT GTA AAG ATT CGT Ile Val Lys Ile Arg 120 171 CTT TCT TAC ATG TCT GAC ATG CTC GCT AAT GAA Leu Ser Tyr Met Ser Asp Met Leu Ala Asn Glu 219 267 GTG GGC GAT ATT GTG GAT TCT AAA AAA ATA GAC ACC GCT GTT TTG GCT Val Gly Asp Ile Val Asp Ser Lys Lys Ile Asp Thr Ala Val Leu Ala TTG TTC AAT Leu Phe Asn CAA GGG TAT TTT AAA GAC GTT TAT GCC ACT TTT GAA GGC Gln Gly Tyr Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly 50 WO 98/21225 PCT/US97/21353 -83- GGC ATA TTA GAG TTT CAT TTT GAT GAA AAA GCC Gly Ile Leu Glu Phe His Phe Asp Glu Lys Ala
AGG
Arg ATT GCC GGG GTA Ile Ala Gly Val
GAA
Glu ATC AAG GGT TAT Ile Lys Gly Tyr
GGG
Gly 80 ACT GAA AAG GAA Thr Glu Lys Glu
AAA
Lys 85 GAC GGC TTA AAA Asp Gly Leu Lys
TCC
Ser CAA ATG GGG ATC AAA AAG GGC GAC ACC Gin Met Gly Ile Lys Lys Gly Asp Thr
TTT
Phe 100 GAT GAG CAA AAA Asp Glu Gin Lys TTA GAG Leu Glu 105 CAT GCT AAA His Ala Lys GGG AGC GTG Gly Ser Val 125 GCT TTA AAA ACC Ala Leu Lys Thr
GCT
Ala 115 TTA GAG GGG CAG Leu Glu Gly Gin GGC TAT TAT Gly Tyr Tyr 120 GGT GCA TTA Gly Ala Leu GTG GAG GTG CGC Val Glu Val Arg
ACA
Thr 130 GAA AAG GTC AGT Glu Lys Val Ser TTG ATC Leu Ile 140 GTG TTT GAT GTG Val Phe Asp Val AGG GGG GAT AGC ATT TAT ATC AAA CAA Arg Gly Asp Ser Ile Tyr Ile Lys Gin 150
TCC
Ser 155 ATT TAT GAG GGA Ile Tyr Glu Gly GCG AAA TTA Ala Lys Leu AAA CGC Lys Arg 165 ATG GGC Met Gly 180 CGC ATG ATT GAA Arg Met Ile Glu TTG AGT GCG AAC Leu Ser Ala Asn CAA CGA GAT TTC Gin Arg Asp Phe TGG ATG TGG GGC TTG Trp Met Trp Gly Leu 185 AAT GAC GGG Asn Asp Gly ATC CAA GAT Ile Gin Asp 205
AAA
Lys 190 TTG CGT TTA GAT Leu Arg Leu Asp CTA GAA TAC GAT Leu Glu Tyr Asp TCT ATG CGT Ser Met Arg 200 CAT ATT TCT His Ile Ser 747 795 GTG TAT ATG CGT Val Tyr Met Arg
AGG
Arg 210 GGT TAC TTA GAC Gly Tyr Leu Asp TCG CCT Ser Pro 220 TTT TTG AAA ACG Phe Leu Lys Thr TTT TCT ACC CAT Phe Ser Thr His
GAC
Asp 230 GCT AAG CTT CAT Ala Lys Leu His
TAT
Tyr 235 AAA GTC AAA GAG Lys Val Lys Glu ATC CAA TAC AGG Ile Gin Tyr Arg TCA GAC ATT TTA Ser Asp Ile Leu
ATA
Ile 250 GAG ATT GAC AAC Glu Ile Asp Asn GTA GTC CCC TTA Val Val Pro Leu
AAA
Lys 260 ACC TTA GAA AAA Thr Leu Glu Lys GCG CTT Ala Leu 265 AAA GTG AAA Lys Val Lys
AGG
Arg 270 AAA GAT GTC TTT Lys Asp Val Phe
AAT
Asn 275 ATT GAG CAT TTA Ile Glu His Leu AGA GCG GAT Arg Ala Asp 280 WO 98/21225 WO 98/21225PCTIUS97/2 1353 -84- GCG CAA ATT Ala Gin Ile 285 TTA AAA ACC GAA Leu Lys Thr Glu
ATC
Ile 290 GCC OAT AAG GOT Ala Asp Lys Gly GCG TTT GCG Ala Phe Ala 103S GTG GTG Val Val 300 AAG CCA GAC TTG Lys Pro Asp Leu AAA GAT GAA AAA Lys Asp Glu Lys
AAC
Asn 310 GGG CTT GTG AAA Gly Leu Val Lys 1083 1131 ATT TAT COT ATT Ile Tyr Arg Ile
GAA
Giu 320 GTG GGC GAT ATG Val Gly Asp Met TAT ATC AAT GAT Tyr Ile Asn Asp ATC ATT TCA GOG AAC CAG COC ACG AGC Ile Ile Ser Gly Asn Gin Arg Thr Ser 335
OAT
Asp 340 AGO ATC ATT AGA Arg Ile Ile Arg AGO GAG Arg Glu 345 1179 TTA TTO TTA Leu Leu Leu TCC GAA AAT Ser Glu Asn 365 CCT AAG OAT AAA Pro Lys Asp Lys
TAC
Tyr 355 AAC TTG ACC AAA Asn Leu Thr Lys CTG AGA AAT Leu Arg Asn 360 GTC AAA ATT Val Lys Ile 1227 TCT TTA AGO COT TTA OGA TTC TTC TCT Ser Leu Arg Arg Leu Gly Phe Phe Ser GAA GAA Giu Giu 380 AAA AGO OTT AAT Lys Arg Val Asn
AGC
Se r 38S TCA CTC ATG OAT Ser Leu Met Asp TTA OTO AGC OTA Leu Val Ser Val 1275 1323 1371 1419 1467
GAA
Giu 395 GAO 000 COT ACT Giu Oly Arg Thr
GG
Giy 400 CAG TTO CAA TTT Gin Leu Gin Phe
GG
Gly 405 TTA 000 TAT GOC Leu Gly Tyr Gly
TCT
Ser 410 TAT OGA GOG CTT Tyr Gly Gly Leu CTT AAT 000 AGC Leu Asn Gly Ser
OTO
Val1 420
OCT
Ala AGC GAA AGA AAC Ser Glu Arg Asn CTT TTT Leu Phe 425 000 000 Gly Gly GOC ACA 000 Gly Thr Gly GOT AGA TCT Oly Arg Ser 445 ATO AGC TTO Met Ser Leu AAC ATC OCT Asn Ile Ala
ACA
Thr 440 TAT CCO GOC ATO Tyr Pro Gly Met AAA OGA OCO 000 Lys Gly Ala Gly ATO TTT 0CC Met Phe Ala 1515 000 AAT Oly Asn 460 TTO AOC TTO ACT Leu Ser Leu Thr
AAT
As n 465 CCA AGO ATT TTT Pro Arg Ile Phe AGC TOO TAT AGC Ser Trp Tyr Ser
TCT
Ser 475 ACO ATC AAC CTT Thr Ile Asn Leu
TAT
Tyr 480 OCO OAT TAC AGO Ala Asp Tyr Arg
ATA
Ile 485 AGC TAC CAA TAC Ser Tyr Gin Tyr 1563 1611 1659 CAA CAA GOC 000 Gin Gin Gly Oly TTT 000 OTO AAT Phe Gly Val Asn
GTC
Val 500 000 COC ATO CTG Gly Arg Met Leu GOT AAT Oly Asn 505 WO 98/21225 WO 9821225PCTIUS97/21353 AGA ACC CAT Arg Thr His GGT TTC AGC Gly Phe Ser 525 AGC TTA GGG TAT Ser Leu Gly Tyr TTG AAT GTT ACC AAA CTC CTT Leu Asn Val Thr Lys Leu Leu 520 1707 AGC CCT TTA TAC Ser Pro Le u Tyr
AAC
Asn 530 CGC TAC TAT TCC Arg Tyr Tyr Ser
TCT
Ser 535 GTT AAT GAA Val Asn Giu 1755 GTG GTT Val Val 540 TCT CCA AGG CAA Ser Pro Arg Gin TCT ACC CCC GCA Ser Thr Pro Ala
TCG
Ser 550 GTG ATT ATC AAT Val Ile Ile Asn
CC
Arg 555 TTA TCA GGC GGT Leu Ser Gly Giy ACC CCC TTA CAA Thr Pro Leu Gin
CCT
Pro 565 GAA AGC TGT TCT Olu Ser Cys Ser 1803 1851 1899 CCT GGA GCG ATC Pro Gly Ala Ile ACT TCA CCA GAA Thr Ser Pro Giu
ATA
Ile 580 AGA GGT ATT TGG Arg Gly Ile Trp OAT AGO Asp Arg 585 GAT TAC CAT Asp Tyr His GAC AAC ACC Asp Asn Thr 605
ACG
Thr 590 CCT ATC ACC AGC Pro Iie-Thr Ser TTC ACC CTT GAT Phe Thr Leu Asp GTG AGC TAT Vai Ser Tyr 600 ATC TTT AGT Ile Phe Ser 1947 1995 OAT OAT TAT TAC Asp Asp Tyr Tyr CCT AGA AAT 000 Pro Arg Asn Gly TCC TAT Ser Tyr 620 GCG ACO ATG TCT GOC TTO CCA AGC TCT Ala Thr Met Ser Oly Leu Pro Ser Ser 625
GGC
Gly 630 ACO CTC AAT TCT Thr Leu Asn Ser 2043
TG
Trp 635 AAC 000 TTA GOC Asn Gly Leu Gly
GG
Gly 640 AAT OTC COT AAC Asn Val Arg Asn
ACC
Thr 645 AAA OTT TAT GOT Lys Val Tyr Oiy
AAA
Lys 650 2091 2139 TTC 0CC OCT TAC Phe Ala Ala Tyr
CAC
His 655 CAT TTO CAA AAA His Leu Gin Lys TTA TTO ATA OAT Leu Leu Ile Asp TTO ATC Leu Ile 665 OCT COC TTT Ala Arg Phe OAT TAC TTO Asp Tyr Leu 685
AAA
Lys 670 ACO CAA OGA OOT Thr Gin Oly Gly ATC TTT AGO TAT Ile Phe Arg Tyr AAC ACC OAT Asn Thr Asp 680 OTA ACC ACO Val Thr Thr 2187 2235 CCC TTA AAC TCC Pro Leu Asn Ser TTC TAC ATG 000 Phe Tyr Met Oly
GOC
Oiy 695 OTO AGA Val Arg 700 GOC TTT AGO AAC Gly Phe Arg Asn
OGA
Oly 705 TCO OTT ACT CCT AAA OAT GAG TTT GOC Ser Vai Thr Pro Lys Asp Oiu Phe Oly 710 2283
TTG
Leu 715 TOG CTT OGA OGC Trp Leu Oly Gly
OAT
Asp 720 000 ATT TTT ACC Oly Ile Phe Thr
OCT
Al a 725 TCT ACT OAA TTO AOC Ser Thr Gliu Leu Ser 2331 WO 98/21225 WO 981225PCTIUS97/2 1353 -86- TAT GGG GTG Tyr Gly Val TTT GGT TTC Phe Gly Phe AAC GCT CCT Asn Ala Pro 765 CTA AAG, GCG GCT AAA ATG CGC TTA GCG TOG TTT TTT Leu Lys 735 Ala Ala Lys Met Arg 740 Leu Ala Trp Phe Phe 745
GAC
Asp 2379
TTA
Leu 750 ACC TTT AAA ACC Thr Phe Lys Thr
CCA
Pro 755 ACT AGA GGG AGT Thr Arg Gly Ser TTT TTC TAT Phe Phe Tyr 760 OTT ATA GOO Val Ile Gly 2427 2475 GTT ACG ACA GCG Val Thr Thr Ala
AAT
Asn 770 TTT AAA OAT TAT Phe Lys Asp Tyr GCT GG Ala Gly 780 TTT GAA AGA GCG Phe Glu Arg Ala TGG AGG GCT TCC Trp Arg Ala Ser
ACA
Thr 790 GGC TTG CAG ATT Gly Leu Gln Ile 2523
GAA
Glu 795 TOG ATT TCG CCC Trp Ile Ser Pro GGG CCT TTG GTG TTG ATT TTC CCT ATA Gly Pro Leu Val Leu Ile Phe Pro Ile 805 2571 TTT TTC AAC CAA TGG GGC GAT GGC AAT Phe Phe Asn Gin Trp Gly Asp Gly Asn 815
GGC
Gly 820 AAG AAA TGT AAA Lys Lys Cys Lys GGG CTA Gly Leu 82S 2.619 TGC TTC AAC Cys Phe Asn
CCT
Pro 830 AAC ATG GAC GAT Asn Met Asp Asp
TAC
Tyr 835 ACG CAA CAC TTT Thr Gln His Phe GAA TTT TCT Giu Phe Ser 840 2667 ATG GGA ACA AGG TTT Met Gly Thr Arg Phe 845 TAAAATGCGC ATCAACAGAG AAGAAATTTT GGATTTAATG A 2723
AAAACGCGCC
CTGAAAACTT
TGGATTGCAA
GCTATGAAGA
TTTTTCAAGG
CTTGAAAGAA TTGGGGCAAA GGGCTTTGAG GACGACTTTT ATTGTGGATA GGAATATCAA GTTTTGCGCG TTCAAACGCA CCTTAAAAGA AATTGATCAA AAGATTGAAG AATTGCTCGC GGGGGTGCAC CCGCAGCTAA AGATTGATTA
GGTGAAGCAA
TTACACCAAT
AAAAGACGCC
TATTGGCGGC
TTATGAGAA
CGCTTGCACC
ATTTGTTTTG
TATGTGTTGA
ACGCAGATCC
2783 2843 2903 2963 3012 INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 847 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: Met Lys Vai Lys Ser Ile Ser Tyr Val Gly 1 5 10 Met Leu Ala Asn Glu Ile Val Lys Ile Arg 25 Leu Ser Tyr Met Ser Asp Vai Gly Asp Ile Val Asp WO 9821225PCTIUS97/21353 WO 98/21225 -87- Ser Phe Phe Thr Gly Lys Arg Asn 145 Ala Arg Leu Ary Asp 225 Ile Val1 Val1 Clu Asp 305 Val1 Arg Asp Arg Ser 385 Gin Asn Ser Met Asn Lys Lys Lys Asp Asp Giu Glu Lys Asp Thr Thr Ala 115 Thr Giu 130 Arg Gly Lys Leu Asp Phe Asp Gin 195 Arg Gly 210 Phe Ser Gin Tyr Pro Leu Phe Asn 275 Ile Ala 290 Lys Asp Gly Asp Thr Ser Lys Tyr 355 Leu Gly 370 Ser Leu Leu Gin Gly Ser Leu Tyr 435 Pro Lys 450 Pro Arg I-i e Val1 Lys Giu Phe 100 Leu Lys Asp Lys Met 180 Leu Tyr Thr Arg Lys 260 Ile Asp Giu Met Asp 340 Asn Phe Met Phe Val1 420 Ala Gly Ile Asp Tyr Ala Lys Asp Giu Val1 Ser Arg 165 Gly Giu Leu His Ile 245 Thr Giu Lys Lys Val1 325 Arg Leu Phe Asp Gly 405 Ser As n Al a Phe Thr Al a Arg 70 Asp Giu Gly Ser Ile 150 Arg Trp Tyr Asp Asp 230 Ser Leu His Gly As n 310 Tyr Ile Thr Ser Leu 390 Leu Giu Ile Gly Asp Al a Thr 55 Ile Gly Gin Gin Giu 135 Tyr Met Met Asp Al a 215 Al a Asp Giu Leu Tyr 295 Gly Ile Ile Lys Lys 375 Leu Gly Arg Ala Arg 455 Ser Val 40 Phe Al a Leu Lys Gly 120 Gly Ile Ile Trp Ser 200 His Lys Ile Lys Arg 280 Al a Leu Asn Arg Leu 360 Val1 Val1 Tyr As n Thr 440 Met Trp Leu Giu Gly Lys Leu 105 Tyr Al a Lys Giu Gly 185 Met Ile Leu Leu Ala 265 Ala Phe Val Asp Arg 345 Arg Lys Se r Gly Leu 425 Gly Phe Tyr Al a Gly Val1 Ser 90 Giu Tyr Leu Gin Ser 170 Leu Arg Ser His Ile 250 Leu Asp Al a Lys Val1 330 Giu Asn Ile Val Ser 410 Phe Gly Ala Ser Leu Gly Giu 75 Gin His Gly Leu Ser 155 Leu Asn Ile Ser Tyr 235 Giu Lys Al a Val1 Val1 315 Ile Leu Ser Giu Giu 395 Tyr Gly Gly Gly Ser Phe Ile Ile Met Al a Ser Ile 140 Ile S er Asp Gin Pro 220 Lys Ile Val Gin Val 300 Ile Ile 
Leu Giu Giu 380 Giu Gly Thr Arg Asn 460 Thr Asn Leu Lys Gly Lys Val 125 Val Tyr Ala Gly Asp 205 Phe Val Asp Lys Ile 285 Lys Tyr Ser Leu Asn 365 Lys Gly Gly Gly ser 445 Leu Ile Gin Giu Gly Ile Thr 110 Val1 Phe Glu Asn Lys 190 Val1 Leu Lys As n Arg 270 Leu Pro Arg Giy Gly 350 Ser Arg Arg Leu Gin 430 Tyr Ser Asn Gly Tyr Phe His Tyr Giy Lys Lys Ala Leu Giu Val Asp Val Gly Ser 160 Lys Gin 175 Leu Arg Tyr Met Lys Thr Giu Giy 240 Pro Val 255 Lys Asp Lys Thr Asp Leu Ile Giu 320 Asn Gin 335 Pro Lys Leu Arg Val Asn Thr Gly 400 Met Leu 415 Ser Met Pro Giy Leu Thr Leu Tyr WO 98/21225 WO 981225PCTIUS97/21353 Ala Asp Tyr Arg Ile Ser Tyr Gin Tyr Gly Gly Tyr Cys 545 Thr Ser Thr Tyr Gly 625 Asn Leu Gly Ser Gly 705 Gly Al a Lys Al a Thr 785 Gly Asp Asp Val1 Tyr Asn 530 Ser Pro Pro Ser Phe 610 Leu Val1 Gin Gly Thr 690 Se r Ile Ly s Thr Asn 770 Trp Pro Gly Asp Asn Asn 515 Arg Thr Leu Giu Ser 595 Pro Pro Arg Lys Tyr 675 Phe Val1 Phe Met Pro 755 Phe Arg Leu Asn Tyr 835 Val1 500 Leu Tyr Pro Gin Ile 580 Phe Arg Ser Asn Tyr 660 Ile Tyr Thr Thr Arg 740 Thr Lys Aia Val1 Gly 820 Thr 485 Gly Asn Tyr Aia Pro 565 Arg Thr Asn Ser Thr 645 Leu Phe Met Pro Al a 725 Leu Arg Asp Ser Leu 805 Lys Gin Met Thr Ser 535 Val1 Ser Ile Asp Val1 615 Thr Val1 Ile Tyr Gly 695 Asp Thr Trp Ser Gly 775 Giy Phe Cys Phe Leu Lys 520 Vali Ile Cys Trp Val1 600 Ile Leu Tyr Asp Asn 680 Val1 Giu Giu Phe Phe 760 Val1 Leu Pro Lys Giu Giy 505 Leu Asn Ile Ser Asp 585 Ser Phe Asn Gly Leu 665 Thr Thr Phe Leu Phe 745 Phe Ile Gin Ile Giy 825 Phe -88- Ile 490 Asn Leu Giu Asn Ser 570 Arg Tyr Ser Ser Ly s 650 Ile Asp Thr Gly Ser 730 Asp Tyr Giy Ile Al a 810 -Leu Ser 475 Gin Arg Gly Val1 Arg 555 Pro Asp Asp Ser Trp 635 Phe Aia Asp Val1 Leu 715 Tyr Phe Asn Al a Glu 795 Phe Cys Met Gin Thr Phe Val 540 Leu Giy Tyr As n Tyr 620 Asn Aila Arg Tyr Arg 700 Trp Gly Gly Al a Gly 780 Trp Phe Phe Gly Gly His Ser 525 Ser Ser Ala His Thr 605 Aila Gly Ala Phe Leu 685 Gly Leu Val Phe Pro 765 Phe Ile Asn Asn Thr 845 Gly Gly 495 Vai Ser 510 Ser Pro Pro 
Arg Gly Gly Ile Thr 575 Thr Pro 590 Asp Asp Thr Met Leu Gly Tyr His 655 Lys Thr 670 Pro Leu Phe Arg Gly Giy Leu Lys 735 Leu Thr 750 Val Thr Giu Arg Ser Pro Gin Trp 815 Pro Asn 830 Arg Phe 480 Phe Leu Leu Gin Lys 560 Thr Ile Tyr Ser Gly 640 His Gin As n Asn Asp 720 Ala Phe Thr Al a Met 800 Gly Met INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 1032 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear WO 98/21225 PCTIUS97/21353 -89- (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 149...913 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: ATGTTTTGTG TTGCAAAAAC AAAACAGACC AATAAAGGCA TCACTTTTAA TAGGGGGGTT TGGTTATTGG TGTTTGATTA GAATAGGGTT GTTTTTAATT AGGAGTTTTT ACTTTTTTAA CGGTTTTT ATG CAT ATT TAT GCG TTA
AAGCGTTGTT
TTCTTTTAAG
TAT ATA 120 172 Met Asp Ile Tyr Ala Leu Tyr Ile GCG ATA Ala Ile CGG CTT TTT ACT GGC ATT CTA TCA GGG Gly Leu Phe Thr Cly Ile Leu Ser Cly
ATT
Ile TTT GGC ATT GGT Phe Gly Ile Gly
GGG
Cly GGG TTG ATC ATT Cly Leu Ile Ile
GTC
Val 30 CCT ATC ATG CTC GCA ACC GGG CAT TCT TTT Pro Ile Met Leu Ala Thr Gly His Ser Phe GAA GAA TCC ATT Glu Giu Ser Ile ATT TCC ATT TTC CAA ATG GCC CTT TCA Ile Ser Ile Leu Gin Met Ala Leu Ser TCG TTC Ser Phe CTG GGC TCT Val Giy Ser GGC TTG TTG Gly Leu Leu
GTT
Vai TTG AAT TTC AAA AAA AAA TCC CTT CAT Leu Asn Phe Lys Lys Lys Ser Leu Asp TTT TCT TTA Phe Ser Leu TTT AGC GGA Phe Ser Cly 364 412 ATA GCG GCA CGG GGG CTC ATA GGC GCG Ile Giy Ala Cly Gly Leu Ile Gly Ala 80
AGT
Ser TTT GTT Phe Val TTA AAA ATC CTT Leu Lys Ile Vai AGT AAA ATT TTA Ser Lys Ile Leu GTT ATT TTC GCG Val Ile Phe Ala TTA CTC GTG TAT Leu Val Vai Tyr ATC ATC CAA TTT Met Ile Cmn Phe TTC AAA CCC AAA Leu Lys Pro Lys 460 508 556 604 AAA CAT TTC ATA Lys Asp Leu Ile CAT ACT AAA CGC Asp Thr Lys Arg
TAT
Tyr 130 CAT CTC CAA GGT His Leu Gin Cly TTC AAA Leu Lys 135 TTA TTT TTA Leu Phe Leu ATT GGT GGG Ile Cly Cly 155
ATT
Ile 140 GGC ACC CTC ACA GGG TTT TTT CCT Cly Thr Leu Thr Cly Phe Phe Ala 145 ATC ACT TTA GGG Ile Thr Leu Cly 150 TAT TTT TTA GGG Tyr Phe Leu Gly 165 GGG ATC CTC ATC Gly Met Leu Met
GTG
Val 160 CCT TTC ATG CAT Pro Leu Met His WO 98/21225 WO 9821225PCTIUS97/21353 TAT GAT Tyr Asp 170 TCT AAA AAA TGC GTG GCT CTA GGG TTA Ser Lys Lys Cys Val Ala Leu Gly Leu
TTT
Phe 180 TTC ATC TTG TTT Phe Ile Leu Phe
TCT
Ser 185 TCT ATT TCA GGA Ser Ile Ser Gly TTT TCT TTA ATG TAT CAC CAC ATC ATC Phe Ser Leu Met Tyr His His Ile Ile 195
AAT
Asn 200 AAA GAA GTG CTC Lys Glu Val Leu
TTA
Leu 205 GCA GGG GCG ATT GTG GGA TTA GGA TCT Ala Gly Ala Ile Val Gly Leu Gly Ser 210 GTT ATG Val Met 215 GGC GTG AGC Gly Val Ser ATG CAT AAA Met His Lys 235
ATT
Ile 220 GGG ATT AAA TGG Gly Ile Lys Trp
ATC
Ile 225 ATG GGG CTT TTG Met Gly Leu Leu AAT GAA AAA Asn Gln Lys 230 CTA TTG ATT Leu Leu Ile GCT TTG ATT TTA Ala Leu Ile Leu
GGG
Gly 240 GTG TAT GGT TTG Val Tyr Gly Leu GTT TTA Val Leu 250 TAC AAA CTC TTT Tyr Lys Leu Phe TTT TAATTGATGG TTTTATACCA CTACTATTTT AAGA Phe 255 947 1007 1032 CCCCTAAGAG TTTCCCTTTA GAGTATTTGC ATTTGTGCGC TAATGAGAGC CATTTATTGA
GATTGGATTT TGATGCGGCC AATTT INFORMATION FOR SEQ ID NO:iO: Wi SEQUENCE CHARACTERISTICS: LENGTH: 255 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:i0: Met 1 Leu Met Asp Ile Tyr Ser Gly Ile Len Ala Thr Len Tyr Ile Ala Gly Len Phe Len Ile Ile Oly Ile Oly Thr Gly Ile Val Pro Ile Ile Ser Ile Asn Phe Lys Gly His Ser Len Gin Met Phe Phe Gin Ser Ile Gly Ala Len Ser Val Gly Ser Lys Lys Val1 Ser Len Asp Len Oly Len Leu Lys Gin Len Len Ile Gly Ala Gly Ile Gly Ala Ile Len Met 100 Phe Val Leu Ser Gly Phe Lys Ile Val Ser Ser Ile Phe Ala Leu 105 Lys Val Val Tyr Ser Met Ile 110 Asp Thr Lys Lys Pro Lys Lys Asp Len Ile Ala i I WO 98/21225 PCT/US97/21353 -91- 115 Arg Tyr His 120 Lys 125 Gly Leu Gin Gly Leu Phe Leu Ile 140 Gly Thr Leu Thr Gly 145 Pro Phe Ala Ile Gly Ile Gly Gly 155 Ser Met Leu Met Leu Met His Leu Gly Tyr Asp 170 Ser Lys Lys Cys Val Ala 175 Leu Gly Leu Leu Met Tyr 195 Ile Val Gly Phe 180 Ile Leu Phe Ile Ser Gly His His Ile Ile Glu Val Leu Ala Phe Ser 190 Ala Gly Ala Ile Lys Trp 210 Ile Met Gly Leu Gly Ser Leu Leu Asn Gly Val Ser Lys Met His Leu Ile Leu Gly 240 Tyr Gly Leu Ser 245 Leu Ile Val Leu 250 Lys Leu Phe INFORMATION FOR SEQ ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 1057 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 66...980 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: AAGAGCATGC GAGAGAGCAT AGAGGAATTT TTTAATCAAG AAATGTTGCA AAGTGAAGTG CCGTT ATG GGT AGA ATT GAA TCA AAA AAG CGT TTG AAA GCG CTT GTT TTT Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe 110 TTA GCC AGC TTG GGG GTT TTG TGG Leu Ala Ser Leu Gly Val Leu Trp TTT TTT AAA ACG AAA AAC CAC ATT Phe Phe Lys Thr Lys Asn His Ile GGA GCC AAT GTG CAC ACG AGC ATG Gly Ala Asn Val His Thr Ser Met 55 CCC ACC TGC CCT GGT AGC GTG TGT Pro Thr Cys Pro Gly Ser Val Cys GGC AAT AGC GCT GAA AAA ACG CCT Gly Asn Ser Ala Glu 
Lys Thr Pro 25 TAT CTA GGT TTT AGG CTA GGC ACA Tyr Leu Gly Phe Arg Leu Gly Thr 40 TGG CAA CAA GCC TAT AAA GAC AAC Trp Gln Gln Ala Tyr Lys Asp Asn TAT GGC GAG AAA TTA GAA GCC CAT Tyr Gly Glu Lys Leu Glu Ala His
TAT
Tyr CAG GGG GGT AAA Gln Gly Gly Lys CTG TCT TAT ACC Leu Ser Tyr Thr CAA ATA GGC Gln Ile Gly GAT GAA Asp Glu GG GAT Gly Asp 110 350 398 ATA GCT TTT GAT Ile Ala Phe Asp
AAA
Lys 100 CAC CAT ATT TTA His His Ile Leu
GC
Gly 105 TTA AGG GTG TGG Leu Arg Val Trp GTA GAA TAC GCT AAA GCG CAA TTA Val Glu Tyr Ala Lys Ala Gln Leu 115 CAA AAA GTG GGG Gln Lys Val Gly GGT AAT ACC Gly Asn Thr 125 ACC TAC GAT Thr Tyr Asp CTT TTA TCC Leu Leu Ser 130 CAA GCC AAT TAT Gln Ala Asn Tyr
GAC
Asp 135 CCA AAC GCG ATT Pro Asn Ala Ile TCT GCT Ser Ala 145 TCA AAC ACT CAA Ser Asn Thr Gln CCT TTA GTT TTG Pro Leu Val Leu
CAA
Gln 155 AAA ACC CCA AGC Lys Thr Pro Ser CAA AAC TTC CTT Gln Asn Phe Leu
TTC
Phe 165 AAT AAC GGG CAT Asn Asn Gly His ATG GCG TTT GGT Met Ala Phe Gly AAC GTG AAT GTG Asn Val Asn Val GTT AAC CTC CCT Val Asn Leu Pro
ATA
Ile 185 GAC ACC CTT TTA Asp Thr Leu Leu AAA CTC Lys Leu 190 GCT TTA AAA Ala Leu Lys GGG GGC GGG Gly Gly Gly 210 GAA AAA ATG CTG Glu Lys Met Leu TTT AAA ATA GGC Phe Lys Ile Gly GTG TTT GGT Val Phe Gly 205 GTG GAA TAC GCA Val Glu Tyr Ala
ATA
Ile 215 TTA TGG AGT CCT AAC TAT CAA AAT Leu Trp Ser Pro Asn Tyr Gln Asn 220 CAA AAC Gln Asn 225 ACG AAA CAA GGC Thr Lys Gln Gly AAA TTT TTT GCA Lys Phe Phe Ala
GCG
Ala 235 GGT GGG GGG TTT Gly Gly Gly Phe
TTT
Phe 240 GTG AAT TTT GGG Val Asn Phe Gly TCT TTG TAT ATA Ser Leu Tyr Ile AAA CGC AAC CGC Lys Arg Asn Arg
TTC
Phe 255 AAT GTG GG TTA Asn Val Gly Leu ATC CCT TAC TAT Ile Pro Tyr Tyr TTG AGC GCG CAA Leu Ser Ala Gln AGT TG Ser Trp 270 AAA AAC TTT Lys Asn Phe AAC TTC AGC Asn Phe Ser 290
GGC
Gly 275 TCT AGC AAT GTG Ser Ser Asn Val CAG CAA CAA ACG Gln Gln Gln Thr ATC CGA CAA Ile Arg Gln 285 TAC GCG TTC Tyr Ala Phe 926 974 GTT TTT AGG AAT Val Phe Arg Asn
AAA
Lys 295 GAA OTT TTT GTC Giu Val Phe Val WO 98/21225 WO 9821225PCTIUS97/21353 -93- TTG TTT TAGTTTGGAT TCGTTCTCAT TAAACACTGA TGATAAAATT CAAAAGATGG TT 1032 Leu Phe 305 TTATCGTTAC AAAATTCAAC ATTTC 1057 INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 305 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: Met Gly Arg Ile Giu Ser Lys Lys Arg Leu 1 Ala Phe Ala Thr Gin Ala Giu Leu Al a 145 Gin Val1 Leu Gly Asn 225 Val1 Val1 Ser Lys Asn Cys Gly Phe Tyr Ser 130 Ser Asn Asn Lys Gly 210 Thr Asn Gly Gly Lys His Gly Lys Lys 100 Lys Ala Thr Leu Phe 180 Glu Giu Gin Gly Lys 260 5 Val1 Asn Thr Ser As n His Ala As n Gin Phe 165 Val1 Lys Tyr Gly Gly 245 Ile Leu His Ser Val1 70 Leu His Gin Tyr Gly 150 Asn Asn Met Al a Asp 230 Ser Pro Gly Tyr Trp Tyr Tyr Leu Gly 120 Pro Leu Gly Pro Phe 200 Leu Phe Tyr Tyr Asn 25 Leu Gin Gly Thr Gly 105 Gin Asn Val1 His Ile 185 Phe Trp Phe Ile Ser 265 Al a Giu Arg Tyr Leu Ile Vai Gly Lys 140 Lys Al a Leu Gly Asn 220 Gly Arg Ala Leu Lys Leu Lys Giu Gly Trp Gly 125 Thr Thr Phe Leu Val 205 Tyr Gly Asn Gin Val1 Thr Gly Asp Ala Asp Gly 110 As n Tyr Pro Gl1y Lys 190 Phe Gin Gly Arg Ser 270 Phe Pro Thr Asn His Glu Asp Thr Asp Ser Leu 175 Leu Gly As n Phe Phe 255 Trp Leu Phe Gly Pro Tyr Ile Val1 Leu Ser Pro 160 Asn Ala Gly Gin Phe 240 Asn Lys Asn Phe Gly Ser Ser Asn Val Trp Gin Gin Gin Thr Ile Arg Gin Asn 275 280 285 WO 98/21225 PCT/US97/21353 -94- Phe Ser Val Phe Arg Asn Lys Glu Val Phe Val Ser Tyr Ala Phe Leu 290 295 300 Phe 305 INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 624 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 77...535 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: TATTAGTTGG TTTAATACGC TATAATCTGT GTGCCAACAT TGTGTGGCTC AAATCATTTT TAAAAGGGGT TTTATA ATG GAA AAC AAC GAA AAT CAT GAG AAA TTG AAT GGC Met Glu Asn Asn Glu Asn His 
Glu Lys Leu Asn Gly 112 GTT TTG CGC Val Leu Arg AAG TTT TTA GGC Lys Phe Leu Gly GCG TTC ACG CTT Ala Phe Thr Leu GGG AAA GAA Gly Lys Glu GGA GGA Gly Gly TTG AAT ATG GAA AAA TTG CGC GAA GCC ATT AAA AAA GAA AAA Leu Asn Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys
CCA
Pro ATC ATG AAT ATT TTG CTC ATG GGA GCT Ile Met Asn Ile Leu Leu Met Gly Ala
ACT
Thr 55 GGG GTG GGT AAA Gly Val Gly Lys
AGC
Ser TCG CTC ATT AAC GCT CTA TTC GGT AAG Ser Leu Ile Asn Ala Leu Phe Gly Lys
GAA
Glu 70 GTA GCT AAA GCA Val Ala Lys Ala GGT GTA Gly Val GGA AAA CCC Gly Lys Pro
ATC
Ile ACT CAG CAT CTT GAA AAA TAT GTT GAT Thr Gin His Leu Glu Lys Tyr Val Asp 85 GAA GAA AAA Glu Glu Lys GAT TAT GAA Asp Tyr Glu GGC TTG ATT TTA TGG GAC ACT AAA GGC ATT GAA GAT Gly Leu Ile Leu Trp Asp Thr Lys Gly Ile Glu Asp 400 448 AAT ACC Asn Thr 110 TTG GAA AGC ATT AAA AAA GAA ATG GAA GAT TCT TTT AAA ACG Leu Glu Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr 115 120 WO 98/21225 PCT/US97/21353 CTT GAT GAA AAA GAG GCT ATT GAT GTG GCG TAT CTG TGC GTT AAA GAG 496 Leu Asp Glu Lys Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu 125 130 135 140 ACT TCT GGT AGG GTT CAA GAG AGA GAG AGA GAG AGT TAT TAAGCTTTAC TA 547 Thr Ser Gly Arg Val Gin Glu Arg Glu Arg Glu Ser Tyr 145 150 AAAAATGGAA TATCCCAACG ATTTTCGTTT TCACCAACAC ACAAGAAAAA GCCGGCGATG 607 CCTTTGTTAA AAAAACT 624 INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 153 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly Val Leu Arg Lys 1 5 10 Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu Gly Gly Leu Asn 25 Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys Pro Ile Met Asn 40 Ile Leu Leu Met Gly Ala Thr Gly Val Gly Lys Ser Ser Leu Ile Asn 55 Ala Leu Phe Gly Lys Glu Val Ala Lys Ala Gly Val Gly Lys Pro Ile 70 75 Thr Gin His Leu Glu Lys Tyr Val Asp Glu Glu Lys Gly Leu Ile Leu 90 Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu Asn Thr Leu Glu 100 105 110 Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr Leu Asp Glu Lys 115 120 125 Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu Thr Ser Gly Arg 130 135 140 Val Gin Glu Arg Glu Arg Glu Ser Tyr 145 150 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1083 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: WO 98/21225 PCT/US97/21353 -96- NAME/KEY: Coding Sequence LOCATION: 155...1033 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID GATOTTGTTA AOTCGTTGTT TATTATGTTA CACTAAAAGC 
TTAAATAAAA GGGCATAAGO GATAAAGGOA GTGTTAGTAO ATAGTTTTAA TAGGGTTATT GACTATATTA GGGTTTCTGT AACCAAACAG TGCAATTTCA GGTGTCAGTA TTGC ATG CCT OCT ACG CCA TTA AAT Met Pro Ala Thr Pro Leu Asn 120 175 TTT TTT OAT Phe Phe Asp PAT GAA GAA TTA Asn Glu Giu Leu
TTG
Leu CCT TTG GAT AAT Pro Leu Asp Asn TTA GAA TTT Leu Glu Phe 223 CTC AAA Leu Lys ATC GCC ATT GAT Ile Ala Ile Asp GGC GTT AAA AAA ATT AGA ATC ACG GGT Gly Val Lys Lys Ile Arg Ile Thr Gly
GGG
Gly GAG CCG CTA TTA Glu Pro Leu Leu AAA GGC TTA GAT Lys Gly Leu Asp TTT ATC GCT AAA Phe Ile Ala Lys CAC GCT TAC AAT His Ala Tyr Asn GAA GTG GAG TTA Glu Val Glu Leu
GTT
Val TTA AGC ACT AAT Leu Ser Thr Asn GGT TTT Gly Phe 319 367 415 463 TTA CTC AAA Leu Leu Lys GTG AAT GTT Val Asn Val AAA ATG GCT AAG GAT TTA AAA AAT GCC GGG Lys Met Ala Lys Asp Leu Lys Asn Ala Gly 80 TCA TTG GAT TCT TTA AAA AGC GAT AGG GTT Ser Leu Asp Ser Leu Lys Ser Asp Arg Val 95 100 TTA GCG CAA Leu Ala Gln TTA AAA ATC Leu Lys Ile TCT CAA Ser Gln 105 AAA GAC GCT CTT Lys Asp Ala Leu AAC ACG CTA GAA Asn Thr Leu Glu ATT GAA GAG TCT Ile Glu Glu Ser 511 559
TTG
Leu 120 AAA GTG GGT TTA Lys Val Gly Leu CTC AAA TTA AAC Leu Lys Leu Asn GTT GTG ATA AAA AGC Val Val Ile Lys Ser 135 GTT AAT GAT GAT GAA ATC TTA GAG CTT TTA GAA TAC GCA AAA Val Asn Asp Asp Glu Ile Leu Glu Leu Leu Glu Tyr Ala Lys 140 145 AAT AGG Asn Arg 150 CAT ATA CAA ATC CGC TAC ATT GAA His Ile Gln Ile Arg Tyr Ile Glu 155 ATG GAA AAC ACG Met Glu Asn Thr CAT GCT AAA His Ala Lys 165 AGT TTG GTT AAA GGC TTG AAA GAG CGA GAA ATT TTA GAT TTG ATC GCT Ser Leu Val 170 Lys Gly Leu Lys Glu 175 Arg Glu Ile Leu Asp Leu Ile Ala 180 CAA AAA Gln Lys 185 TAT CAA ATC ATT Tyr Gln Ile Ile GCA GAA AAA CCC Ala Glu Lys Pro AAA CAA Lys Gln 195 GGG TCT TCT Gly Ser Ser
AAA
Lys 200 ATC TAC ACG CTA Ile Tyr Thr Leu AAT GGC TAT CAA Asn Gly Tyr Gin
TTT
Phe 210 GGC ATT ATC GCT Gly Ile Ile Ala 751 799 847 CAT AGC GAT GAT His Ser Asp Asp TGC CAA TCT TGC Cys Gin Ser Cys
AAT
Asn 225 CGT ATC CGT TTG Arg Ile Arg Leu GCT TCT Ala Ser 230 GAT GGT AAG Asp Gly Lys AAA GAG GCG Lys Glu Ala 250
ATT
Ile 235 TGC CCA TGT TTA Cys Pro Cys Leu TAT CAA GAC GCC Tyr Gln Asp Ala ATA GAC GCT Ile Asp Ala 245 AGG CTT TTA Arg Leu Leu 895 943
AAA
Lys 260 AAG CAA Lys Gin 265 TCT GTC ATC AAT Ser Val Ile Asn CCA GAA AAA AAC Pro Glu Lys Asn
ATG
Met 275 TGG AAT GAT AAA Trp Asn Asp Lys
AAC
Asn 280 AGC GAA ACT CCC ACA AGG GCG TTT TAC Ser Glu Thr Pro Thr Arg Ala Phe Tyr 285
TAC
Tyr 290 ACA GGG GGG Thr Gly Gly TAGGGGAGT 1042 AAAATATTTA TTATTTTAAA CCTTTTTATT AAAAATAAGG C INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 293 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: Met Pro Ala Thr Pro Leu Asn Phe Phe Asp Asn Glu Glu Leu 1083 1 5 10 Leu Asp Asn Val Leu Glu Phe Leu Lys Ile Ala Ile 25 Lys Lys Ile Arg Ile Thr Gly Gly Glu Pro Leu Leu 40 Asp Glu Phe Ile Ala Lys Leu His Ala Tyr Asn Lys 55 Val Leu Ser Thr Asn Gly Phe Leu Leu Lys Lys Met Leu Pro Asp Arg Glu Glu Gly Val Lys Gly Leu Val Glu Leu Ala Lys Asp Leu WO 98/21225 PCT/US97/21353 -98- Lys Ser Leu Asn Leu 145 Met Glu Lys Gin Asn 225 Tyr Lys Lys Tyr Ala Arg Gly 115 Val Tyr Asn Leu Lys 195 Gly Ile Asp Ile Met 275 Thr Gly Val 100 Ile Val Ala Thr Asp 180 Gin Ile Arg Ala Lys 260 Trp Gly Val Ser Leu 120 Val His Ser Gin Lys 200 His Asp Lys Lys Asn 280 Asn Gin 105 Lys Asn Ile Leu Lys 185 Ile Ser Gly Glu Gin 265 Ser Ser Asp Gly Asp Ile 155 Lys Gin Thr Asp Ile 235 Ile Val Thr Leu Ala Leu Glu 140 Arg Gly Ile Leu Phe 220 Cys Ile Ile Pro Ser Lys 110 Leu Leu Ile Lys Glu 190 Asn Gin Cys Lys Lys 270 Arg INFORMATION FOR SEQ ID NO:17: SEQUENCE CHARACTERISTICS: LENGTH: 1181 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 121...1137 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: ACTTCTCAAT CAGCGAGCTA TCATGCAAGG CCTTATGTGG TGGATACCGC TTTTTTACGA TACGATTACA AAGATGTTTT TGGGTTTAAG GCGGGGCGCT ATGAAGCGAA TATTGATTTC ATG AGC GGA TCG AAT CAA GGG TGG--GAA GTG TAT TAT CAG CCC TAT AAG Met Ser Gly Ser Asn Gin Gly Trp Glu Val Tyr Tyr Gin Pro Tyr Lys 1 5 10 ACT GAA ACG CAA AGG TTA AGG TTT TGG TGG TGG AGT TCT TTT GGG AGA 120 168 WO 98/21225 PCT/US97/21353 -99- Thr Glu Thr GGT TTA GCG Gly Leu Ala Gin Arg Leu Arg Phe Trp Trp Ser Ser Phe Gly Arg ACG GTG CCT Thr Val Pro TTC AAC TCT TGG Phe Asn Ser Trp
ATT
Ile 40 TAT GAG TTT TTT Tyr Glu Phe Phe
GCG
Ala TAT TTG AAA AAG GGA GGC Tyr Leu Lys Lys Gly Gly CCT AAT AAC AGC Pro Asn Asn Ser
AAC
Asn GAT TTC ATC AAT Asp Phe Ile Asn
TAT
Tyr GGC TGG CAT GGA Gly Trp His Gly
ATC
Ile ACC ACA ACC TAT Thr Thr Thr Tyr TAT AAA GGT TTA Tyr Lys Gly Leu GCT CAA TTT TTT Ala Gin Phe Phe
TAT
Tyr TAT TTT GCG CCT Tyr Phe Ala Pro ACT TAT AAC GCT Thr Tyr Asn Ala CCT GGC Pro Gly 408 TTT AAG CTG Phe Lys Leu CGC TCT CAA Arg Ser Gin 115 TAT GAC ACG AAT AGG Tyr Asp Thr Asn Arg 105 AAT TTT CAA AAT Asn Phe Gin Asn GTA GGC TTT Val Gly Phe 110 TAT AGA GGG Tyr Arg Gly 456 AGC ATG ATC ATG Ser Met Ile Met ACC TTT CCT TTA Thr Phe Pro Leu
TAC
Tyr 125 504 TGG TAT Trp Tyr 130 AAC CCA GAG ACA AAC ACT TAT AGT TTA Asn Pro Giu Thr Asn Thr Tyr Ser Leu 135 GAC AGC ACG CCT Asp Ser Thr Pro
CAT
His 145 GGC TCG TTG TTG Gly Ser Leu Leu
GGG
Gly 150 AGG AAT GGC GTT Arg Asn Gly Val TTA AAT ATC CGC Leu Asn Ile Arg
CAG
Gin 160 GTT TTT TGG TGG Vai Phe Trp Trp AAT TTC AAC TGG Asn Phe Asn Trp
TCC
Ser 170 ATT GGC TTT TAT Ile Gly Phe Tyr AAC ACC Asn Thr 175 TTT GGC AAT Phe Gly Asn AAT AAC ACT Asn Asn Thr 195 GAC GCT TTT TTA Asp Ala Phe Leu TCT CAC ACG ATG Ser His Thr Met CCA AGG GGT Pro Arg Gly 190 ACT AGG CAT Thr Arg His TCC TAT ATC GGT Ser Tyr Ile Gly
AGT
Ser 200 GAA ATC TCC ATA Glu Ile Ser Ile
ACG
Thr 205 GCC GGA Ala Gly 210 ATO ATT GGC TAT Met Ile Gly Tyr
GAT
Asp 215 TTT TGG OAT AAT Phe Trp Asp Asn
ACG
Thr 220 GCT TAT GAT GGG Ala Tyr Asp Gly
CTA
Leu 225 GCT OAT GCG Ala Asp Ala ATC ACT Ile Thr 230 AAC GCT AAC ACT Asn Ala Asn Thr
TTC
Phe 235 ACT TTT TAC ACT Thr Phe Tyr Thr WO 98/21225 PCT/US97/21353 GTT GGA GGG ATC Val Gly Gly Ile
CAT
His 245
AAA
Lys AAG CGT TTT GCA Lys Arg Phe Ala CAT GTT TTT GGG His Val Phe Gly CGC GTC Arg Val 255 AAT GAA Asn Glu TCT CAT GCG Ser His Ala TAT TCC TTG Tyr Ser Leu 275 AAC GCG TTA Asn Ala Leu GTG GGG AGG Val Gly Arg CAA TTC AAC GCG Gin Phe Asn Ala TAT GCG TTC ACT Tyr Ala Phe Thr TCA ATC CTT Ser Ile Leu CTT AAC Leu Asn 290 TTT AGG ATC ACT Phe Arg Ile Thr
TAT
Tyr 295 TAT GGG GCT AGG Tyr Gly Ala Arg AAT AAA GGG TAT Asn Lys Gly Tyr 1032
CAA
Gin 305 GCG GGG TAT TTT GGA GCG CCC AAA TTC Ala Gly Tyr Phe Gly Ala Pro Lys Phe 310
AAT
Asn 315 AAC CCT GAT GGC Asn Pro Asp Gly
GAT
Asp 320 1080 TTT AGC GCT AAT Phe Ser Ala Asn
TAC
Tyr 325 CAA GAC AGA AGT Gln Asp Arg Ser ATG ATG ACC AAC Met Met Thr Asn CTC ACG Leu Thr 335 1128 CTG AAG TTT Leu Lys Phe TGATTTCCAA TCACAGCGAG TTAAAAACAC TCCAAGGCAT TTTT 1181 INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 339 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: Ser Gly Ser Asn Gin Gly Trp Glu Val Tyr Tyr Gin Pro Tyr Lys Thr Glu Thr Gin Arg Gly Leu Ala Phe Asn Tyr Leu Lys Lys Gly Leu Arg Phe Ser Trp Ile 40 Gly Asn Pro 55 Trp Trp 25 Tyr Glu Asn Asn Trp Ser Ser Phe Phe Ala Phe Gly Arg Thr Val Pro Phe Ile Asn Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr 70 Ala Gin Phe Phe Tyr Tyr Phe Ala Pro Lys Ser Asn Ser Tyr Thr Tyr Asp Lys Gly Leu Asn Ala Pro Phe Lys Leu Val Tyr Asp Thr Asn 90 Arg Asn 105 Phe Gin Asn Val Gly Phe WO 98/21225 PCT/US97/21353 -101 Arg Trp His 145 Val Phe Asn Ala Leu 225 Val Ser Tyr Leu Gin 305 Phe Leu Gin 115 Asn Ser Trp Asn Thr 195 Met Asp Gly Ala Leu 275 Phe Gly Ala Phe Ser Pro Leu Trp Ser 180 Ser Ile Ala Ile Asn 260 Gin Arg Tyr Asn Met Glu Leu Asp 165 Asp Tyr Gly Ile His 245 Lys Phe Ile Phe Tyr 325 Ile Thr Gly 150 Asn Ala Ile Tyr Thr 230 Lys Asn Asn Thr Gly 310 Gin Thr 120 Thr Asn Asn Leu Ser 200 Phe Ala Phe Leu Ser 280 Tyr Pro Arg Thr Tyr Gly Trp Gly 185 Glu Trp Asn Ala Gly 265 Tyr Gly Lys Ser Phe Ser Val Ser 170 Ser lie Asp Thr Trp 250 Gin Ala Ala Phe Tyr 330 Pro Leu Thr 155 Ile His Ser Asn Phe 235 His Val Phe Arg Asn 315 Met Leu Glu 140 Leu Gly Thr Ile Thr 220 Thr Val Gly Thr Ile 300 Asn Met Tyr 125 Asp Asn Phe Met Thr 205 Ala Phe Phe Arg Glu 285 Asn Pro Thr Tyr Ser Ile Tyr Pro 190 Thr Tyr Tyr Gly Ala 270 Ser Lys Asp Asn Arg Thr Arg Asn 175 Arg Arg Asp Thr Arg 255 Asn Ile Gly Gly Leu 335 Gly Pro Gin 160 Thr Gly His Gly Ser 240 Val Glu Leu Tyr Asp 320 Thr INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 959 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: 
Coding Sequence LOCATION: 133...879 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: TAAGGAAATG AGTTTTTATA TCATAAAATA AAGTAACCGA GAAAAATCTT TCTCTAAAAA TAATACTTTT TTAGTTATAA TAACAATTTT GTTTTTTCAA AAACAATAAT TACTATATTT AGGATTTTAA GA ATG AAT GAC AAG CGT TTT AGA AAA TAT TGT AGT TTT TCT Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser 1 5 120 171 WO 98/21225 WO 981225PCTIUS97/21353 -102- ATT TTT TTG TCC TTA TTA OGA ACG TTT GAA TTA GAG OCT AAA GAA GAA Ile Phe Leu Ser Leu Leu OILY Thr Phe Giu Leu Glu Ala Lys Oiu Giu
GAA
Glu GAA AAA GAA GAA Glu Lys Glu Glu
AGA
Arg 35 AAG ACA GAA AGG AAA AAA GAA AAG AAC GCC Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala 267 CAA CAC ACT CTA Gln His Thr Leu
GGC
Gly AAG GTT ACC ACT Lys Val Thr Thr
CAA
Gln 55 GCG GCT AAA ATC Ala Ala Lys Ile TTT AAC Phe Asn TAC AAC AAC Tyr Asn Asn GCC AAC CAA Ala Asn Gln
CAG
Gln ACA ACC ATT TCA Thr Thr Ile Ser AAG GAA TTA GAA Lys Glu Leu Glu AGA AGG CAA Arg Arg Gln ATC AAT GTG Ile Asn Val 363 411 ATC AGC GAC ATG Ile Ser Asp Met
TTT
Phe 85 AGA AGA AAC CCT Arg Arg Asn Pro
AAT
Asn GGC GGT Gly Gly GGT GCG GTG ATA Gly Ala Val Ile CAA AAA ATT TAT GTG CGC GGT ATT GAA Gln Lys Ile Tyr Val Arg Gly Ile Glu 105 459
GAC
Asp 110 AGA TTG GCT CGG Arg Leu Ala Arg
GTT
Val 115 ACG GTG GAT GGG Thr Val Asp Gly GCG CAA ATG GGT Ala Gln Met Gly AGC TAT GGG CAT Ser Tyr Gly His
CAA
Gln 130 GGC AAT ACG ATC ATT GAC CCT GGA ATG Gly Asn Thr Ile Ile Asp Pro Gly Met 135 CTT AAA Leu Lys 140 ACC GTG GTG GTT ACT AAA GGG GCG Ser Val Val Val Thr Lys Gly Ala 145 CAA GCG AGC GCG Gln Ala Ser Ala GGG CCT ATG Gly Pro Met 155 AGC GAT TTT Ser Asp Phe GCT TTG ATT Ala Leu Ile 160 GGG GCG ATT AAA Gly Ala Ile Lys
ATG
Met 165 GAG ACT AAA AGT Glu Thr Lys Ser ATC CCT Ile Pro 175 AAA GGT AAA GAC Lys Gly Lys Asp GCC ATA ACT GGG Ala Ile Ser Gly GCC ACT TTT TTA Ala Thr Phe Leu AAC TTT GGG GAT Asn Phe Gly Asp
CGA
Arg 195 GAA ACC GTG ATG Glu Thr Val Met GCT TAT CGT CAT Ala Tyr Arg His
AAT
Asn 205 CAT TTT GAT GCG His Phe Asp Ala
CTT
Leu 210 TTG TAT TAC ACG Leu Tyr Tyr Thr
CAT
His 215 CAA AAT ATT TTT Gln Asn Ile Phe TAC TAT Tyr Tyr 220 CGT GAT GGG Arg Asp Gly AAT GCT ACA AAA Asn Ala Thr Lys CTC TTT AGA CCT Leu Phe Arg Pro AAA GCG GAG Lys Ala Glu 235 AAT AAA GTT ACA GAA GTC CTA GCG AGC AAA ACA ATG TGATGGCTAA GATCAA 895 Asn Lys Val Thr Glu Val Leu Ala Ser Lys Thr Met 240 245 TGGTTATTTG AGCGAAAGGG ATATTTTAAC GCTCAGTTAT AACATGACCA GAGACAACGC 955 TAAC 959
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 249 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID
Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser Ile Phe Leu
1 5 10
Ser Leu Leu Gly Thr Phe Glu Leu Glu Ala Lys Glu Glu Glu Glu Lys
25
Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala Gln His Thr
40
Leu Gly Lys Val Thr Thr Gln Ala Ala Lys Ile Phe Asn Tyr Asn Asn
55
Gln Thr Thr Ile Ser Ser Lys Glu Leu Glu Arg Arg Gln Ala Asn Gln
70 75
Ile Ser Asp Met Phe Arg Arg Asn Pro Asn Ile Asn Val Gly Gly Gly
90
Ala Val Ile Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu Asp Arg Leu
100 105 110
Ala Arg Val Thr Val Asp Gly Ala Ala Gln Met Gly Ala Ser Tyr Gly
115 120 125
His Gln Gly Asn Thr Ile Ile Asp Pro Gly Met Leu Lys Ser Val Val
130 135 140
Val Thr Lys Gly Ala Ala Gln Ala Ser Ala Gly Pro Met Ala Leu Ile
145 150 155 160
Gly Ala Ile Lys Met Glu Thr Lys Ser Ala Ser Asp Phe Ile Pro Lys
165 170 175
Gly Lys Asp Tyr Ala Ile Ser Gly Ala Ala Thr Phe Leu Thr Asn Phe
180 185 190
Gly Asp Arg Glu Thr Val Met Gly Ala Tyr Arg His Asn His Phe Asp
195 200 205
Ala Leu Leu Tyr Tyr Thr His Gln Asn Ile Phe Tyr Tyr Arg Asp Gly
210 215 220
Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu Asn Lys Val
225 230 235 240
Thr Glu Val Leu Ala Ser Lys Thr Met
245
INFORMATION FOR SEQ ID NO:21:
SEQUENCE CHARACTERISTICS: LENGTH: 1306 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE: NAME/KEY:
Coding Sequence LOCATION: 40...1266 OTHER INFORMATION: NAME/KEY: sig_peptide LOCATION: 40...219 OTHER INFORMATION: NAME/KEY: mat peptide LOCATION: 220...1266-- OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: TTTGACAGCT TATCATTTGG CAATAAAACA CCAAAATGA ATG AGT TAC ACA AAA Met Ser Tyr Thr Lys TAC TCA ACA CCA Tyr Ser Thr Pro
CCC
Pro AAC CGG CGT AAA ATG CAA AAC ATT ATC Asn Arg Arg Lys Met Gin Asn Ile Ile -45
GCT
Ala ATT AAA AGA TCC Ile Lys Arg Ser AGA GTC GAC CTG Arg Val Asp Leu
CAG
Gin GCA TGC AAG CTA Ala Cys Lys Leu GCT TTC Ala Phe GCG AGC TCG Ala Ser Ser TCA CCC ATG CAA TTT CAA AAA ACC TTA TTT CCT TTA Ser Pro Met Gln Phe Gin Lys Thr Leu Phe Pro Leu CCC TTA TTA TTT TTA TCT TGT Pro Leu Leu Phe Leu Ser Cys ATC GCT GAA GAA AAT GGG GCG TAT Ile Ala Glu Glu Asn Gly Ala Tyr
GCG
Ala AGC GTG GGG TTT Ser Val Gly Phe
GAA
Glu 15 TAT TCC ATT AGT Tyr Ser Ile Ser
CAT
His 20 GCC GTT GAG CAT Ala Val Glu His
AAT
Asn AAC CCT TTT TTA Asn Pro Phe Leu CAA GAA CGC ATC CAA ATC ATT TCT AAC GCT CAA Gin Glu Arg Ile Gin Ile Ile Ser Asn Ala Gin AAC AAA ATC TAT AAA CTC AAT CAA GTC AAA AAT GAA ATC ACA AGC ATG WO 98/21225 WO 9821225PCTIUS97/21353 105- Asn Lys Ile CAA AAC ACC Gin Asn Thr Tyr Lys Leu Asn Gin Lys Asn Glu Ile Thr Ser Met TTT AAT TAC ATC Phe Asn Tyr Ilie AAC GCT TTA AAA Asn Aia Leu Lys AAT GCT AAA Asn Ala Lys TTA ACC Leu Thr CCC ACT GAA ATC Pro Thr Glu Ile
CAA
Gln GCT GAG AAA TAC Ala Glu Lys Tyr CTC CAA TCC ACC Leu Gln Ser Thr CAA AAC ATT GAA Gln Asn Ile Glu
AAA
Lys ATA GTC ACA CTT Ile Val Thr Leu GGT GGC GTT GCA Gly Gly Val Ala
TCT
Ser 105 AAC CCC AAA CTA Asn Pro Lys Leu CAA GCG TTG GAA Gln Ala Leu Glu
AAA
Lys 115 ATG CAA GPA CCC Met Gin Glu Pro ATT ACT Ile Thr 120 PAC CCT TTA Asn Pro Leu GCT CPA TCT Ala Gin Ser 140 TTA GCA GAA PAC Leu Ala Giu Asn AGA AAT TTA GPA TTG CAA TTT Arg Asn Leu Glu Leu Gin Phe 135 CPA PAC CGC ATG Gin Asn Arg Met TCT TCT TTA TCT Ser Ser Leu Ser
TCT
Ser 150 CPA ACC GCT Gin Thr Ala 678 CPA ATT Gin Ile 155 TCA PAT TCT TTG Ser Asn Ser Leu
AAC
Asn 160 GCG CTT GAT CCC Ala Leu Asp Pro TCT TAT TCT AAA Ser Tyr Ser Lys
PAC
Asn 170 ATT TCA AGC ATG Ile Ser Ser Met
TCT
Ser 175 GGG GTG AGT TTG Giy Val Ser Leu GTA GOG TAT P-AG Val Gly Tyr Lys 726 774 822 TTC TTT ACT PAG Phe Phe Thr Lys AAA PAT CPA GGG Lys Asn Gin Gly
TTT
Phe 195 CGC TAT TAC TTG Arg Tyr Tyr Leu TTT TAT Phe Tyr 200 GAC TAT GGT Asp Tyr Gly TTA GGC AA Leu Gly Lys 220
TAC
Tyr 205 ACT PAC TTT GGT Thr Asn Phe Giy GTG GGT AAT GGC Val Giy Asn Gly TTT GAT GGT Phe Asp Giy 215 PAC TAC CTT Asn Tyr Leu ATG PAT PAC CAC Met Asn Asn His TAT GGG CTT GGA Tyr Gly Leu Gly TAT PAT Tyr Asn 235 TTC ATT GAT AT Phe Ile Asp Asn
GCA
Ala 240 CAA AAA CAT TCG Gln Lys His Ser GTG GGT TTT TAT Val Gly Phe Tyr 966 1014
GCG
Ala 250 GGT TTT GCT TTG Gly Phe Ala Leu GGG PAT TCG TGG Giy Asn Ser Trp GGG PAT GGT TTA Gly Asn Gly Leu
GGC
Gly 265 WO 98/21225 WO 9821225PCTIUS97/21353 -106- ATG TGG GTG Met Trp Val CAA GCT AAA Gin Ala Lys GTT CGT GTG Vai Arg Val 300 ATC CCT TTA Ile Pro Leu AGC CAA ACG GAT TTT ATC AAC PAT TAC TTG ATG Ser Gin 270 Thr Asp Phe Ile Asn Tyr Leu Met GGC TAT Gly Tyr 280
ATA
Ile 285
PAT
Asn CAC ACG AAC TTT His Thr Asn Phe ATC CCT TTG Ile Pro Leu GTC AAT AGG Val Asn Arg
CAT
His 305
TTT
Phe GGA TTT GA Gly Phe Giu PAT TTT GGG Asn Phe Gly 295 GGC CTA AA Giy Leu Lys AAA GGG TTA Lys Gly Leu 1062 1110 1158 1206 1254 315 PAC ACT Asn Thr
TCC
Ser GCG GTG PAT Ala Val Asn CTC TTT TTC Leu Phe Phe TAT GPA ACG Tyr Giu Thr CGC CTT GTG GTG Arg Leu Vai Val PAT GTG AGT TAT Asn Val Ser Tyr TAT AGT TTT Tyr Ser Phe TAGGGGGTAA ATGCCTTCPA ACGCTCTTTT GATTGP.AGPA 1306 INFORMATION FOPR SEQ ID NO:22: SEQUENCE CHARACTERISTICS: LENGTH: 409 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: Met Gin Cys Thr Glu Ala Ile Giu Ly s Ser Asn Lys Leu Asn Val1 S er Ile As n Tyr Ile Leu Phe Gly Giu Asn Thr Asn Lys Ala Phe Leu Tyr Asn Gin Met Lys Lys Ile Ala Pro Ala Asn Asn Gin Leu Tyr Lys Ser Leu Ser Pro Ly s Asn Thr Ser Arg Ser Leu Val Phe Ile Thr Pro WO 98/21225 WO 981225PCT[US97/2 1353 -107- Tyr Gly Gin Leu Ser Ser 165 Val1 Tyr Asn Gly Ser 245 Gly Tyr Pro Glu His 325 Phe Leu Gly Giu Glu Ser 150 Ser Gly Tyr Gly Ile 230 Vali Asn Leu Leu Met 310 Giy As n Gin Val1 Pro Leu 135 Gin Tyr Tyr Leu Phe 215 Asn Giy Giy Met Asn 295 Gly Lys Val1 Se r Al a Ile 120 Gin Thr Ser Lys Phe 200 Asp Tyr Phe Leu Gly 280 Phe Leu Gly Ser Thr Ser 105 Thr Phe Ala Lys His 185 Tyr Gly Leu Tyr Gly 265 Tyr Gly Lys Leu Tyr 345 Leu 90 Asn Asn Al a Gin As n 170 Phe Asp Leu Tyr Al a 250 Met Gin Val1 Ile As n 330 Val1 Gin Pro Pro Gin Ile 155 Ile Phe Tyr Gly Asn 235 Gly Trp Ala Arg Pro 315 Thr Tyr Asn Lys Leu Ser 140 Ser Ser Thr Gly Lys 220 Phe Phe Val1 Lys Val 300 Leu Ser Ser Ile Leu Gi-u 125 Gin As n Ser Lys Tyr 205 Met Ile Ala Ser Ile 285 As n Ala Leu Phe Giu Val1 110 Leu Asn Ser Met Lys 190 Thr As n Asp Leu Gin 270 His Val1 Val1 Phe Lys 95 Gin Al a Arg Leu Ser 175 Lys Asn Asn As n Al a 255 Thr Thr Asn Asn Phe 335 Ile Ala Giu Met Asn 160 Gly Asn Phe His Ala 240 Gly Asp Asn Arg Ser 320 Lys Val Leu Asn Leu 145 Ala Val1 Gin Gly Leu 225 Gin Asn Phe Phe His 305 Phe Arg Thr Glu Leu 130 Ser Leu Ser Gly Phe 210 Tyr Lys Ser Ile Phe 290 As n Tyr Leu Leu Ser 100 Lys Met 115 Arg Asn Ser Leu Asp Pro Leu Ser 180 Phe Arg 195 Val Gly Gly Leu His Ser Trp Val 260 Asn Asn 275 
Gln Ile Gly Phe Giu Thr Val Val 340 INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 1030 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 342... 824 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: WO 98/21225 WO 981225PCT1US97/21353 -108-
CACTCTAAGC
AGCCTTACAA
C CAAAACGC C
AAATAATGAA
AAAAACCCTA
CCGCTTGTGG
GTCAAACTCT
AATATCTTAA
GCTCAATCTA
ATCTTTTTAA
GCCCCCCTAG
GATAATCAAA
CTTTTTCTTT
ATAACGCTAA
TTATAGAAAT
TGCCTTTAGG
CCATAAAGCA
AAGGGTTTTA
AGAGGAAGAA
AAGCGCGCAT
TCAAAGCCTC
CACAAATAAC
TGGTTTCAGG
AAAAGTTAAT
AGCAAGCGGA TCCATCTTAA TTTAAATTTG TTTTAGAGAG TTGAAACAAC TCTCCTTAAA AACGAGCTAG ACAAAAATCT CTAAGCGATA GGCTTCATAT C ATG ACC ATC AAA GTT Met Thr Ile Lys Val 120 180 240 300 356 TTT TCG CCC AAA TAC CCC ACT GAA TTA Phe Ser Pro Lys Tyr Pro Thr Glu Leu GAA TTT TAT GCT Glu Phe Tyr Ala GAG CGT Glu Arg ATC GCT GAC Ile Ala Asp AGT ATT AGC Ser Ile Ser
AAC
Asn CCT TTA GGG TTT Pro Leu Gly Phe
ATC
Ile 30 CAA CGC TTG GAT Gin Arg Leu Asp CTT TTG CCT Leu Leu Pro GGG GAA TTT Gly Glu Phe GGG TTC GTT CAA Gly Phe Val Gin TTG CGC GAG CAT Leu Arg Glu His TTT GAA Phe Glu ATG AGA GAG GGT Met Arg Glu Gly
AAC
Asn AAG CTC ATT GGC Lys Leu Ile Gly TGT GCC CTT AAT Cys Gly Leu Asn
CCT
Pro ATC AAT CAA ACA Ile Asn Gin Thr
GAA
Glu 75 CCC GAG CTG TGC Ala Glu Leu Cys TTC CAC ATA AAT Phe His Ile Asn
AGT
Ser GCT TAT CAA TCC Ala Tyr Gin Ser GCG CTA GGT CAA Giy Leu Giy Gin
AAA
Lys CTC TAT GAG AGC Leu Tyr Giu Ser GTG GAG Val Glu 100 AAA TAC GCT Lys Tyr Ala AAA AGC CAA Lys Ser Gin 120 CAC ATC AAA His Ile Lys 135 ATT AAA GGC TAT Ile Lys Gly Tyr
ACT
Thr 110 AAA ATC TCT CTG CAT GTG AGC Lys Ile Ser Leu His Val Ser ATC AAG GCA TGC Ile Lys Aia Cys CTC TAT CAA AAG CTG CCT TTT GTG Leu Tyr Gin Lys Leu Gly Phe Val GAA GAG CAT Glu Glu Asp
TGC
Cys 140 GTG GAG TTG GC Vai Ciu Leu Gly 145 GAG ACT TTC Giu Thr Leu
ATT
Ile 150 TTC CCC ACT CTT Phe Pro Thr Leu
TTT
Phe 155 ATC GAA AAG ATT Met Giu Lys Ile TTG TCT Leu Ser 160 TCATTGCTGC ATCCAT TTCACACACG CCCAAGCGAC ATTCAAACTA GCTAAATAAA CCCTAAAACA AACACTCCTT AAGTTTTAGA AGCCCTATTT AGGGGTTAAC
GATTTTATAG
840 900 960 1020 1030
TCAAACTTTC
CTTAAAATTT
GCTAAAATAG
ATTAACACAA CCCAATTAAC TCTTTTTCAA GCGCTTCGCA CCTATCAAAA CTACTTTAAT INFORMATION FOR SEQ ID NO:24: WO 98/21225 PCT/US97/21353 109- SEQUENCE CHARACTERISTICS: LENGTH: 161 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: Thr Tyr Asp Gly Cys His Glu Leu Leu 130 Glu Ile Ala Leu Gly Gly Ile Ser His 115 Gly Glu Val Arg Pro Phe Asn Ser Glu Ser Val Leu Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu Ala Ile Glu Ile Tyr Tyr Ser Ile 135 Phe Asn Gly Arg Gin Ser Phe 105 Ile Glu Thr Pro Phe Glu Thr Gin Ile Lys Glu Leu Arg Glu Gly Lys Leu Ile Gin Leu Leu 160 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1477 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 374...1267 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID CGTGGAGTTT TTTAGGCATT TCTTTATATT CATTCAATAA CGCTTGCGCG GGCAATTCTT CAACTAAAAT CTCTACTAAC AATTCATCTG AATGCAAAAT CTCAATTCTC CCTAAAAAAC AAAATCACTT TTAAGACTAA ATCATGTTAG AATTATACTT GAATTTACAC TCAGTTTAGT 120 180 WO 98/21225 PCT/US97/21353 -110- TTATTTCTTA ATACAAAAGG TAGGCGTTTT GAAACATTTA ACCCCACTCA CTCACACCAT CTTTAAAGCC TTATGGCTAG GCACAGCCTT AAGTGCATCT TTAAGTTTAG CCGCAACAGA AAGCCCCACT AAAACAGAGC CTAAGCCCGC TAAAGGGGTT AAAAACAAGC CCAAATCGCC CGTTACTAAA GTC ATG ATG ACC AAT TGC GAC AAT ATT AAA GAT TTT AAC Met Met Thr Asn Cys Asp Asn Ile Lys Asp Phe Asn 240 300 360 409 GCT AAG CAA Ala Lys Gin AAA GAA GTC TTA Lys Glu Val Leu GCC GCT TAT CAA Ala Ala Tyr Gin
TTC
Phe GGC TCT AAA Gly Ser Lys GAA AAT Glu Asn TTA GGC TAT GAA Leu Gly Tyr Glu
ATG
Met GCA GGC ATT GCA Ala Gly Ile Ala AAA GAG TCA TGC Lys Glu Ser Cys
GCA
Ala GGG GTT TAT AAA Gly Val Tyr Lys AAT TTT TCG GAT CCG AGC GCG GGC GTG Asn Phe Ser Asp Pro Ser Ala Gly Val 55
TAT
Tyr CAT TCT TAT ATC His Ser Tyr Ile AGC GVT CTA AAA Ser Val Leu Lys
AGC
Ser TAT GGG CAT AAT Tyr Gly His Asn GAT AGC Asp Ser CCC TTT TTG Pro Phe Leu TTT GCT TCT Phe Ala Ser
CGT
Arg AAT GTG ATG GGG Asn Val Met Gly TTG CTC ATT AAA Leu Leu Ile Lys GAC GAT GCG Asp Asp Ala AAA ACA CGC Lys Thr Arg GAA GTG GCT TTA Glu Val Ala Leu GAG TTG CTC TAT Glu Leu Leu Tyr
TGG
Trp 105 TAC CAT GAC AAT TTA AAA Tyr His Asp Asn Leu Lys 110
GAC
Asp 115 ATG ATT AAA TCT Met Ile Lys Ser AAC AAG GGC AGT Asn Lys Gly Ser
CGT
Arg 125 TGG GAA AGG AGC Trp Glu Arg Ser
GAA
Glu 130 AAA TCT AAC GCT Lys Ser Asn Ala GAT GCT GAA AAA TAT TAC Asp Ala Giu Lys Tyr Tyr 135 140 AAA GAA TCT AAA ATC TTT Lys Glu Ser Lys Ile Phe 155 GAA GAG ATA CAA Glu Glu Ile Gin AGA ATC AGG CGT Arg Ile Arg Arg
TTG
Leu 150 GAT TCG CAG Asp Ser Gln AAC CTG GAT Asn Leu Asp 175
TCT
Ser 160 AGT AAT GAC CAA Ser Asn Asp Gln TTG CAA AAA AGC Leu Gln Lys Ser GCT AAT AGC Ala Asn Ser 170 GCC TTA ATT Ala Leu Ile TTA GAC CCT ATC GGC AAC GCC ATG CCC Leu Asp Pro Ile Gly Asn Ala Met Pro
CAA
Gin 185 GCC AAA Ala Lys 190 GAA ACT AAA ATA Glu Thr Lys Ile
GAA
Glu 195 GAM ACC CAA GCA Glu Thr Gln Ala AAA TCC CAA GAA Lys Ser Gln Glu AAA GAG ACA ACT Lys Glu Thr Thr GAG CAA ACA AAA Glu Gln Thr Lys AAG CCA GAA AAA Lys Pro Glu Lys
GCA
Ala 220 1033 1081 AAA GAT AAA CCC Lys Asp Lys Pro
ATG
Met 225 TAT TTG GCG CAA Tyr Leu Ala Gln
ATC
Ile 230 AAC AGC ACT GAT Asn Ser Thr Asp TTC ACA Phe Thr 235 CCC GTT AAA Pro Val Lys TCC TTT AAG Ser Phe Lys 255
AAA
Lys 240 AGC CCC AAA AAA Ser Pro Lys Lys GCT AAA GTG AGC Ala Lys Val Ser CAA AAA CAC Gin Lys His 250 GCC AAA ACC Ala Lys Thr 1129 1177 AAT AAC ATT AAA Asn Asn Ile Lys
AAT
Asn 260 AAT GTA AAA AAC Asn Val Lys Asn
AAC
Asn 265 GCT TCC Ala Ser 270 AAA AAA CAA GAA Lys Lys Gin Glu
ATG
Met 275 TGC AAA AAT TGC Cys Lys Asn Cys CCA GGG CAA AGG Pro Gly Gin Arg 1225
AAT
Asn 285 GCG ATT TTA GCT Ala Ile Leu Ala CAC ATC ACT CTC His Ile Thr Leu CAA GAG CTT Gin Glu Leu TAAAAAGTC 1276 CTAAAAATGG CGCAAAAAAC TCTTTTGATT ATCACTGATG AGCGATCATA ACGCTTTCTT CCATGCCAAA AAACCCACTT TTGCCTTATA GCCTGATTGA TACGCATGGC TTGAGCGTGG GGAAATTCTG AAGTGGGGCA T
GCATTGGGTA
ATGATTTGAT
GCTTACCTAA
TCGTAAAGAT
GTTTAAAACC
GGGGCAAATG
1336 1396 1456 1477 INFORMATION FOR SEQ ID NO:26: SEQUENCE CHARACTERISTICS: LENGTH: 298 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: Met Met Thr Asn Cys 1 5 Glu Val Leu Lys Ala Tyr Glu Met Ala Gly Lys Ile Asn Phe Ser Asp Asn Ile Lys Asp 10 Ala Tyr Gin Phe Gly Phe Asn Ala Ser Lys Glu 25 Ile Ala Trp Lys Glu Ser Cys Ala 40 Asp Pro Ser Ala Gly Val Tyr His Lys Gin Lys Asn Leu Gly Gly Val Tyr Ser Tyr Ile 55 Pro Ser Val Leu Lys Ser Tyr Gly His Asn 70 Asn Val Met Gly Glu Leu Leu Ile Lys Asp 90 Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Asp Ser Asp Ala Thr Arg Pro Phe Leu Phe Ala Ser Glu Tyr His Asp Asn WO 98/21225 PCT/US97/21353 -112- Leu Lys Ser Glu 130 Asp Arg 145 Ser Asn Asp Pro Lys Ile Thr Ser 210 Met Tyr 225 Ser Pro Asn Ile Gin Glu Ala Asn 290 Asp 115 Lys Ile Asp Ile Glu 195 Glu Leu Lys Lys Met 275 His Met Ser Arg Gin Gly 180 Glu Gin Ala Lys Asn 260 Cys Ile Ile Asn Arg Glu 165 Asn Thr Thr Gin Pro 245 Asn Lys Thr Lys Ala Leu 150 Leu Ala Gin Lys Ile 230 Ala Val Asn Leu Ser Asp 135 Lys Gin Met Ala Ser 215 Asn Lys Lys Cys Met Tyr 120 Ala Glu Lys Pro Glu 200 Lys Ser Val Asn Ser 280 Gin Asn Glu Ser Ser Gin 185 Lys Pro Thr Ser Asn 265 Pro Glu Gly Tyr Ile 155 Asn Leu Gin Lys Phe 235 Lys Lys Gin Ser Arg 125 Tyr Glu 140 Phe Asp Ser Asn Ile Ala Glu Met 205 Ala Lys 220 Thr Pro His Ser Thr Ala Arg Asn 285 Trp Glu Ser Leu Lys 190 Lys Asp Val Phe Ser 270 Ala Glu Ile Gin Asp 175 Glu Glu Lys Lys Lys 255 Lys Ile Arg Gin Ser 160 Leu Thr Thr Pro Lys 240 Asn Lys Leu INFORMATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 1515 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 141...1340 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: TTAGTGTTGA TTTTTTTATC GTTAGTGTTT GTGCGTCCTT TAGAGGCTTT GAGCGTGTTT ATGGGGTTGT ATTTGATTTA TGGCATCATT CGGTGGCTCT TTTTAATGGT AAAAATTATT TTTAATAAAA ATAAAAGCGC ATG AAA 
GAA TCT TTT TAC ATA GAG GGA ATG Met Lys Glu Ser Phe Tyr Ile Glu Gly Met 1 5 ACT TGC ACG GCG TGT TCT AGC GGG ATT GAA CGC TCT TTG GGG CGT AAG Thr Cys Thr Ala Cys Ser Ser Gly Ile Glu Arg Ser Leu Gly Arg Lys 20 AGT TTT GTG AAA AAA ATA GAA GTG AGC CTT TTA AAT AAG AGC GCT AAC Ser Phe Val Lys Lys Ile Glu Val Ser Leu Leu Asn Lys Ser Ala Asn 120 170 218 266 ATT GAA TTT Ile Glu Phe GAC GAA AAC CAA ACC Asp Glu Asn Gln Thr 50 AAT TTA GAC GAA ATT TTT AAA CTC Asn Leu Asp Glu Ile Phe Lys Leu ATT GAA Ile Glu AAG CTA GGC Lys Leu Gly TAT AGC CCT Tyr Ser Pro 65 AAA AAA GCT CTG Lys Lys Ala Leu ACA AAA GAA AAA Thr Lys Glu Lys
AAA
Lys GAA TTT TTT AGC Glu Phe Phe Ser CCT AAT GTT AAA TTA GCG Pro Asn Val Lys Leu Ala 80 85 TTA GCG GTT ATT Leu Ala Val Ile
TTC
Phe ACC CTT TTT GTG Thr Leu Phe Val
GTG
Val TAT CTT TCT ATG Tyr Leu Ser Met GCG ATG CTT AGC Ala Met Leu Ser CCT AGC Pro Ser 105 CTT TTA CCT Leu Leu Pro AAC GCT TGC Asn Ala Cys 125 AGC TTG CTT GCA Ser Leu Leu Ala GAT AAT CAT AGT Asp Asn His Ser AAT TTT TTA Asn Phe Leu 120 CAT TTG GGG His Leu Gly TTA CAG CTT ATA Leu Gln Leu Ile GCA CTC ATT GTC Ala Leu Ile Val AGG GAT Arg Asp 140 TTT TAC ATT CAA Phe Tyr Ile Gln
GGG
Gly 145 TTT AAA GCC TTA Phe Lys Ala Leu CAC AGA CAA CCC His Arg Gln Pro
AAC
Asn 155 ATG AGC AGC CTT ATC GCC ATA GGC ACA Met Ser Ser Leu Ile Ala Ile Gly Thr 160
AGC
Ser 165 GCT GCC TTA ATT Ala Ala Leu Ile
TCA
Ser 170 AGC CTG TGG CAA Ser Leu Trp Gln TAT TTG GTC TAT Tyr Leu Val Tyr AAT CAT TAT ACC Asn His Tyr Thr GAT CAG Asp Gln 185 TGG TCT TAT Trp Ser Tyr TTT GTG ATG Phe Val Met 205
GGG
Gly 190 CAT TAT TAT TTT His Tyr Tyr Phe AGC GTG TGC GTG Ser Val Cys Val ATT TTA ATG Ile Leu Met 200 GAC AAA GCT Asp Lys Ala GTG GGC AAA CGC Val Gly Lys Arg
ATT
Ile 210 GAA AAT GTT Glu Asn Val TCT AAA Ser Lys 215 TTA GAC Leu Asp 220 GCT ATG CAA GCC Ala Met Gln Ala ATG AAA AAC GCC Met Lys Asn Ala
CCA
Pro 230 AAA ACC GCC CTT Lys Thr Ala Leu
AAA
Lys 235 ATG CAA AAT AAC Met Gln Asn Asn
CAA
Gln 240 CAG ATT GAA GTT TTA GTG GAT AGC ATT Gln Ile Glu Val Leu Val Asp Ser Ile 245
GTG
Val 250 GTG GGG GAT ATT CTA AAA GTC CTC CCT GGA AGC GCG ATT GCG GTG GAT Val Gly Asp Ile Leu Lys Val Leu 255 Pro Gly 260 Ser Ala Ile Ala Val Asp 265 GGT GAA ATC Gly Glu Ile GGC GAA GCG Gly Glu Ala 285
ATA
Ile 270 GAG GGC GAA GGG Glu Gly Glu Gly
GAA
Glu 275 TTA GAT GAG AGC Leu Asp Glu Ser ATG TTG AGC Met Leu Ser 280 GTC TTT TCA Val Phe Ser 986 1034 TTG CCG GTT TAT Leu Pro Val Tyr AAA GTC GGC GAT Lys Val Gly Asp
AAA
Lys 295 GGG ACA Gly Thr 300 TTC AAT AGC CAC Phe Asn Ser His
ACG
Thr 305 AGT TTT TTA ATG Ser Phe Leu Met
AAA
Lys 310 GCC ACG CAA AAC Ala Thr Gin Asn
AAC
Asn 315 AAA AAC AGC ACC Lys Asn Ser Thr TCT CAA ATT ATA Ser Gin Ile Ile ATG ATT TAT AAC Met Ile Tyr Asn 1082 1130 1178 CAA AGT TCA AAG Gin Ser Ser Lys GAG ATT TCT CGC Glu Ile Ser Arg
TTA
Leu 340 GCG GAT AAG GTT TCA AGC Ala Asp Lys Val Ser Ser 345 GTG TTT GTG Val Phe Val TGG CTC ATC Trp Leu Ile 365 GCT TTA GAA Ala Leu Glu 380
CCA
Pro 350 AGC GTG ATC GCT Ser Val Ile Ala TCT ATT TTA GCG Ser Ile Leu Ala ATT GCA CCT AAG Ile Ala Pro Lys GTG TTT GTA TCG Val Phe Val Ser
CCC
Pro 370
GTT
Val GAT TTT TGG TGG Asp Phe Trp Trp TTT GTG GTG Phe Val Val 360 TTT GGA ATC Phe Gly Ile CCT TGC GCT Pro Cys Ala 1226 1274 1322 TTA GTG ATT Leu Val Ile
TTA
Leu 395 GGA TTG CTA CGC Gly Leu Leu Arg
CTA
Leu 400 TGAGCATTTT AGTAGCGAAC CAGAAAGCGA GTTCTTTA 1378
GGGTTATTTT
TTTGATAAAA
ATAGAATTAT
TTAAAGACGC TAAAAGTTTA GAAAAAGCAA CCGGCACGCT CACTAACGGC AAGCCTGTCG
TAGAGTT
GGCTAGTCAA TACGATCGTT TTAAAAGCGT TCATTCTAAG 1438 1498 1515 INFORMATION FOR SEQ ID NO:28: SEQUENCE CHARACTERISTICS: .LENGTH: 400 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: WO 98/21225 WO 9821225PCTIUS97/21353 -115- Met 1 Ser Glu Gin Ser Asn Leu Leu Ile Gly 145 Ala Leu Tyr Arg Leu 225 Gin Val Glu Tyr Thr 305 Ser Ile Ile Lys Ser 385 Lys Gly Val Thr Pro Vai Ser Al a Gly 130 Phe Ile Val Phe Ile 210 Met Ile Leu Gly Ly s 290 Ser Gin S er Ala Pro 370 Val Glu Ile Ser Asn Lys Lys Met Ile 115 Ala Lys Gly Tyr Giu 195 Glu Lys Giu Pro Glu 275 Lys Phe Ile Arg Ile 355 Asp Leu Ser Phe Tyr Ile Glu Giy Met Thr Glu Leu Leu Ly s Leu Giy 100 Asp Leu Al a Thr Thr 180 Ser As n As n Val Gly 260 Leu Val1 Leu Ile Leu 340 Ser Phe Val1 5 Arg Leu Asp Ala Ala Ala Asn Ile Leu Ser 165 Asn Val Val Ala Leu 245 Ser Asp Gly Met Glu 325 Ala Ile Trp Ile Ser Asn Glu Leu 70 Leu Met His Val1 Trp 150 Ala His Cys Ser Pro 230 Val1 Al a Glu Asp Lys 310 Met Asp Leu Trp Se r Leu Lys Ile Thr Al a Leu Ser Met 135 His Al a Tyr Val Lys 215 Lys Asp Ile Ser Lys 295 Al a Ile Lys Ala Asn 375 Cys Gly Ser 40 Phe Lys Val Ser As n 120 His Arg Leu Thr Ile 200 Asp Thr Se r Al a Met 280 Val1 Thr Tyr Val1 Phe 360 Phe Pro Arg 25 Al a Lys Giu Ile Pro 105 Phe Leu Gin Ile Asp 185 Leu Ly s Al a Ile Val1 265 Leu Phe Gin Asn Ser 345 Val Gly Cys 10 Lys Asn Leu Lys Phe 90 Ser Leu Gly Pro Ser 170 Gin Met Ala Leu Val 250 Asp Ser Ser Asn Al a 330 Se r Val Ile Al a Ser Ile Ile Lys Thr Leu Asn Arg Asn 155 Ser Trp Phe Leu Lys 235 Val1 Gly Gly Gly Asn 315 Gin Val Trp Al a Leu 395 Cys Phe Glu Glu Giu Leu Leu Al a Asp 140 Met Leu Ser Val1 Asp 220 Met Gly Glu Glu Thr 300 Lys Ser Phe Leu Leu 380 Gly Thr Val1 Phe 4S Lys Phe Phe Pro Cys 125 Phe Ser Trp Tyr Met 205 Ala Gin Asp Ile Al a 285 Phe Asn Ser Val Ile 365 Glu Leu Al a Lys Asp Leu Phe Val1 Glu 110 Leu Tyr Ser Gin Gly 190 Val Met Asn Ile Ile 270 Leu Asn Ser Lys Pro 350 Ile Val1 
Leu Cys Lys Glu Gly Ser Val1 Ser Gin Ile Leu Leu 175 His Gly Gin Asn Leu 255 Glu Pro Her Thr Ala 335 Ser Al a Phe Arg INFORMATION FOR SEQ ID NO:29: SEQUENCE CHARACTERISTICS: LENGTH: 1443 base pairs WO 98/21225 PCT/US97/21353 -116- TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 76...1389 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: ACTTTAAAAA ACCCCCTTAA AAAGGTTTTT AGGTATAATT AGCGATCTTT TAGTTTCAAA TAGTAGAGAG ATGGG ATG AAA AAA ATA TGG CTT TTA GTG TGG GGC TTG TGT Met Lys Lys Ile Trp Leu Leu Val Trp Gly Leu Cys 111 TCT TGG GTG Ser Trp Val TTT TTG CAT GCG Phe Leu His Ala ATA GAG ATG ATA GAA AAA GCC CCT ACA Ile Glu Met Ile Glu Lys Ala Pro Thr 20 GCC CCC CAT TTG TTG CTT TTA GCA GGG Ala Pro His Leu Leu Leu Leu Ala Gly 159 AAT GTA Asn Val GAG GAT AGA GAC Glu Asp Arg Asp
ATT
Ile CAA GGC GAT GAG Gin Gly Asp Glu GGT GGG TTT AAT Gly Gly Phe Asn
GCA
Ala 55 ACT AAT TTG TTT Thr Asn Leu Phe
TTA
Leu ATG CAT TAT AGC Met His Tyr Ser
GTT
Val TTA AAA GGT TTG Leu Lys Gly Leu GAA GTG GTT CCT Glu Val Val Pro GTA TTG Val Leu AAT AAG CCT Asn Lys Pro AAC CGC AAA Asn Arg Lys ATG TTA AGA AAT Met Leu Arg Asn AGG GGC TTG TAT Arg Gly Leu Tyr GGG GAT ATG Gly Asp Met TAC CCC ACT Tyr Pro Thr TTT GCC GCT TTA Phe Ala Ala Leu AAG AAT GAC CCT Lys Asn Asp Pro ATC CAG Ile Gin 110 GAA ATC AAA TCC TTG ATT GCA AAA CCC Glu Ile Lys Ser Leu Ile Ala Lys Pro 115
AGT
Ser 120 ATA GAC GCT GTC Ile Asp Ala Val 447 CAT TTG CAT GAT His Leu His Asp GGT GGG TAT TAC Gly Gly Tyr Tyr CCT GTT TAT GTT Pro Val Tyr Val
GAT
Asp 140 GCG ATG CTC AAT CCT AAG CGC TGG GGG Ala Met Leu Asn Pro Lys Arg Trp Gly TGC TTT ATT ATT Cys Phe Ile Ile GAT CAA Asp Gln 155 GAT GAG GTT Asp Glu Val AAT ACG ATT Asn Thr Ile 175
AAA
Lys 160 GGG GCG AAA TTC CCT AAT TTG CTT GCT Gly Ala Lys Phe Pro Asn Leu Leu Ala 165 TTT GCA AAC Phe Ala Asn 170 ATT GAA GAG Ile Glu Glu 591 639 GAG AGT ATC AAC Glu Ser Ile Asn
GCC
Ala 180 CAT TTA TTG CAC His Leu Leu His
CCC
Pro 185 TAT CAT Tyr His 190 TTA AAA AAC ACG Leu Lys Asn Thr ACC GCG CAA GGC Thr Ala Gin Gly ACA GAA ATG CAA Thr Glu Met Gin
AAA
Lys 205 GCC CTA ACT TTT TAT GCG ATC AAC CAA Ala Leu Thr Phe Tyr Ala Ile Asn Gin 210
AAA
Lys 215 AAG AGC GCT TTT Lys Ser Ala Phe
GCC
Ala 220 AAT GAA GCT AGC Asn Glu Ala Ser
AAA
Lys 225 GAA CTC CCT TTA Glu Leu Pro Leu
GCA
Ala 230 TCA AGG GTG TTT Ser Arg Val Phe TAC CAC Tyr His 235 783 CTG CAA GCC Leu Gin Ala CGC GAT TTT Arg Asp Phe 255
ATT
Ile 240 GAG GGC TTA CTC Glu Gly Leu Leu CAG CTC AAT ATC Gin Leu Asn Ile CCT TTT AAG Pro Phe Lys 250 ATC AAT GAT Ile Asn Asp GAT CTT AAC CCT Asp Leu Asn Pro AGC GTG CAT GCC Ser Val His Ala AAA AAC Lys Asn 270 TTG TGG GCA AAA Leu Trp Ala Lys
ATC
Ile 275 AGC TCT TTG CCT Ser Ser Leu Pro ATG CCC CTT TTT Met Pro Leu Phe TTG CGC CCT AAA Leu Arg Pro Lys AAT CAT TTC CCC Asn His Phe Pro CCC CAC AAC ACT Pro His Asn Thr 927 975 1023 ATC CCA CAA ATC Ile Pro Gin Ile ATA GAG AGC AAC Ile Glu Ser Asn TAC ATT GTA GGG Tyr Ile Val Gly CTA GTC Leu Val 315 AAA AAT AAA Lys Asn Lys
CAA
Gin 320 GAA GTG TTT TTA Glu Val Phe Leu TAC GGC AAC AAG Tyr Gly Asn Lys CTC ATG ACA Leu Met Thr 330 1071 CGA TTA TCG CCT TTT TAC ATA Arg Leu Ser Pro Phe Tyr Ile 335 TTT GAT CCT TCT TTA GAA GAA GTG Phe Asp Pro Ser Leu Glu Glu Val 345 1119 AAA ATG Lys Met 350 CAA ATT GAC AAT Gin Ile Asp Asn
AAG
Lys 355 GAT CAA ATG GTT Asp Gin Met Val ATA GGG AGC GTG Ile Gly Ser Val 1167 1215 GAA GTG AAA GAG Glu Val Lys Glu TTT TAT ATC CAT Phe Tyr Ile His
GCT
Ala 375 ATG GAC AAT ATC Met Asp Asn Ile
CGT
Arg 380 GCG AAT GTG ATT GGC TTT AGC GTT TCT AAT GAA AAT AAG CCT AAT GAA Ala Asn Val Ile Phe Ser Val Ser GCG GGT TAT Ala Gly Tyr GAC AAG CAA Asp Lys Gln 415
ACG
Thr 400
GAA
Glu AAA TTT AAA Lys Phe Lys Asn Glu Asn Lys Pro Asn Glu 390 395 TTT CAA AAA CGC TTT TCA TTG Phe Gin Lys Arg Phe Ser Leu 410 GAA TTT TAT AAA AAC AAC GCG Glu Phe Tyr Lys Asn Asn Ala 425 1263 1311 1359 1412 AGG ATC TAT Arg Ile Tyr TTT AGC GGG ATG ATC TTA GTG AAA TTT GTG TA Phe Ser Gly Met Ile Leu Val Lys Phe Val 430 435 CTTTTAACAT TCAAGGGTTT TGGTATTTTT T INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 438 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal GGAATGGA TAAATCTCAT TGC 1443 (xi) SEQUENCE DESCRIPTION: SEQ ID Met Lys Lys Ile Trp Leu Leu Val Trp 1 Leu Arg Glu Val Met Ala Lys Asp Pro 145 Gly His Asp Pro Leu Leu Ala Ser Gly 130 Lys Ala Ala Lys Gly Lys Arg Leu Leu 115 Gly Arg Lys Met His Asn Val 70 Arg Asn Lys Tyr Asn 150 Asn Leu Pro Ala Phe Val 75 Asp Pro Ala Val Asp 155 Ala Cys Thr Gly Leu Leu Met Thr Val Asp 140 Gin Asn Ser Asn Ile Met Asn Asn Ile Leu 125 Ala Asp Asn Phe Asp Asp Ser Ser Phe Ile His Asn Lys 160 Glu Ser Ile Asn Ala His Leu Leu His Pro Ile Glu Glu Tyr His Leu Lys WO 98/21225 WO 98/1225PCT/US97/2 1353 119- Asn Phe Lys 225 Glu Leu Al a Lys Pro 305 Giu Phe Asp Giu.
Gly 385 Ile Arg Ile Thr Tyr 210 Glu Gly Asn Lys Leu 290 Ile Val Tyr Asn Ser 370 Phe Lys Ile Leu Arg 195 Ala Leu Leu Pro Ile 275 Asn Glu Phe Ile Lys 355 Phe Ser Phe Tyr Val Thr Ala Gln Gly Ile Pro Leu
Asn 260 Ser His Ser Leu Glu 340 Asp Tyr Val Lys Arg 420 Lys Gln Lys 215 Ala Ser 230 Gln Leu Val His Leu Pro Pro Leu 295 Ala Tyr 310 Tyr Gly Asp Pro Met Val His Ala 375 Asn Glu 390 Phe Gln Glu Phe Val Asp Thr Glu
200 Lys Ser Ala Arg Val Phe Asn Ile Pro 250 Ala Leu Ile 265 Lys Met Pro 280 Pro His Asn Ile Val Gly Asn Lys Leu 330 Ser Leu Glu 345 Lys Ile Gly 360 Met Asp Asn Asn Lys Pro Lys Arg Phe 410 Tyr Lys Asn Met Phe Tyr 235 Phe Asn Leu Thr Leu 315 Met Glu
Ser Ile Asn 395 Ser Gln Ala 220 His Lys Asp Phe Lys 300 Val Thr Val Val Arg 380 Glu Leu
Lys 205 Asn Leu
Arg Lys Asn 285 Ile Lys Arg Lys Val 365 Ala Ala Asp Ala Glu Gln Asp Asn 270 Leu Pro Asn Leu Met 350 Glu Asn Gly Lys Leu Ala Ala Phe 255 Leu Arg Gln Lys Ser 335 Gln Val Val Tyr Gln Thr Ser Ile 240 Asp Trp Pro Ile Gln 320 Pro Ile Lys Ile Thr 400 Glu 425 Asn Ala Phe Ser Gly Met 430 INFORMATION FOR SEQ ID NO:31: SEQUENCE CHARACTERISTICS: LENGTH: 1280 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 66...1223 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: ATCAATACCC CTTAAATAAA AGATATAATG CTGTATTATA AGCTAGTTTT AATTACAATT TTCAA ATG TTA AGG AAA AAC ATT TTA GCT TAC TAT GGG GCG AAT TTT CTC Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu 1 5 10 110 TTA ATC ATC GCT Leu Ile Ile Ala
CAA
Gln AGC TTA CCC CAT Ser Leu Pro His GCG ATT TTA ACC CCC TTG TTG Ala Ile Leu Thr Pro Leu Leu 25 ATC TTG CTC GTG CAA ACC TTT Ile Leu Leu Val Gln Thr Phe 158 CTT TCT AAA Leu Ser Lys TTT AGC TTT Phe Ser Phe CTT AGT TTG AGT Leu Ser Leu Ser
GAA
Glu TGC GTG CTA GTG Cys Val Leu Val
GCT
Ala 55 GAA TAC CCA AGC Glu Tyr Pro Ser
GGC
Gly GTT TTA GCG Val Leu Ala GAT TTG Asp Leu ATG AGC CGA AAA Met Ser Arg Lys TTA TTC CTG GTT TCT AAT GCC TTT TTA Leu Phe Leu Val Ser Asn Ala Phe Leu
ATC
Ile GCT AGT TTT TCG Ala Ser Phe Ser
TTT
Phe GTG CTG TTT TTT Val Leu Phe Phe AGC TTT ATT TTC Ser Phe Ile Phe 350 CTT TTA GCG TGG Leu Leu Ala Trp
GGG
Gly 100 TTG TAT GGT TTG TAT AGC GCA TGC TCT AGC GGC Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly ACG ATT GAA Thr Ile Glu TTA TCC AAG Leu Ser Lys 130 TCA CTC ATC ACA Ser Leu Ile Thr
GAC
Asp 120 ATT AAG GAA AAC Ile Lys Glu Asn AAA AAA GAT Lys Lys Asp 125 TTA GGC ATG Leu Gly Met TTT TTA GCC AAA Phe Leu Ala Lys
AAC
Asn 135 AAT CAA ATT ACT Asn Gln Ile Thr ATT ATA Ile Ile 145 GGG AGT TCT TTG Gly Ser Ser Leu GGA TCG TTT TTG TAT CTC AAA GTC CAT GCG Gly Ser Phe Leu Tyr Leu Lys Val His Ala 150 155 ATT TTT TTA ATC ATG CTC TGT GTG CTA ACG Ile Phe Leu Ile Met Leu Cys Val Leu Thr 170 175
ATG
Met 160 CTG TAT ATT GTG Leu Tyr Ile Val ATC ATT TTT TAT Ile Ile Phe Tyr AAA GAG AAA GAA Lys Glu Lys Glu GAT TTT AAA AGC Asp Phe Lys Ser CAA AAA Gln Lys 190 AGC CTG AAA Ser Leu Lys AAA GAT AAC Lys Asp Asn 210 CTT AAA GAG CAA Leu Lys Glu Gln
GTC
Val 200 AAA GGC AGT CTT AAA GAG CTT Lys Gly Ser Leu Lys Glu Leu CCC AAA CTT AAA ATT CTG TTA GTG GGG Pro Lys Leu Lys Ile Leu Leu Val Gly
CAT
His 220 TTG ATT ACG Leu Ile Thr CCC GTC Pro Val 225 TTT TTT ATG AGC Phe Phe Met Ser TTT CAA ATG TGG Phe Gln Met Trp GCG TAT TTT TTA Ala Tyr Phe Leu
AAA
Lys 240 CAA GGC GTT AAA Gin Gly Val Lys CAA TAC CTT TTT Gin Tyr Leu Phe
GTG
Val 250 TTT TAT ATC GCT Phe Tyr Ile Ala
TTT
Phe 255 830 878 CAA GTG ATT TCT Gin Val Ile Ser
ATT
Ile 260 CTC ATT CAT TTT Leu Ile His Phe
TTA
Leu 265 AAA GCC TCT AGT Lys Ala Ser Ser TAT AGC Tyr Ser 270 CAA AAA ATC Gin Lys Ile TTA TTG CTT Leu Leu Leu 290 ATG GTG GCG Met Val Ala 305 TTG AGT TCG CTT Leu Ser Ser Leu
GTG
Val 280 GTG TTG TTA GGC Val Leu Leu Gly AGC AAT ATC CCT Ser Asn Ile Pro TGT TTC ATA GGG Cys Phe Ile Gly AGC TAT TGC TTA Ser Tyr Cys Leu 315 GTT AGC CCC Val Ser Pro 285 TAT GCG CTC Tyr Ala Leu TAT CAA TTC Tyr Gin Phe 926 974 1022 TTT TTC ACT Phe Phe Thr
TAC
Tyr 310
TCC
Ser 320 AAA TTC GTT TCT Lys Phe Val Ser
AAA
Lys 325 AAC AAC ATT TCC Asn Asn Ile Ser
TCG
Ser 330 CTC TCA TCG CTT Leu Ser Ser Leu
TTA
Leu 335 1070 1118 TCA AGC TGT GTG Ser Ser Cys Val
CGC
Arg 340 GTG GTC TCT GTG Val Val Ser Val ATC TTA TCG CTC Ile Leu Ser Leu AGC AGT Ser Ser 350 CTG GAA CTG Leu Glu Leu GCC TTG ACG Ala Leu Thr 370 TTT GAT GAG Phe Asp Glu 385 TAC TTC TCA CCC Tyr Phe Ser Pro ACT ATC ATA ACC Thr Ile Ile Thr ATG CAT TTT Met His Phe 365 GCT AAG CCG Ala Lys Pro 1166 1214 CTT ATC ATC CTC Leu Ile Ile Leu TTC TTT TTG TAT Phe Phe Leu Tyr
AAG
Lys 380 TGAGCGGCTT TAAGAGTGCA ACCTTTTAGC GATTTCTATA GCAACATCA 1272 1280
TAGCCATG
INFORMATION FOR SEQ ID NO:32: SEQUENCE CHARACTERISTICS: LENGTH: 386 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu Leu WO 98/21225 WO 9821225PCT/US97/21353 -122- 1 Ile Ser Ser Leu Al a Leu Ile Ser Ile 145 Leu Ile Leu Asp Val1 225 Gin Vali Lys Leu Vali 305 Lys Ser Giu Leu Asp 385 Ile Lys Phe Met Ser Al a Giu Lys 130 Gly Tyr Phe Lys Asn 210 Phe Gly Ile Ile Leu 290 Ala Phe Cys Leu Thr 370 Glu Ala Gly Cys Ser Phe Trp Al a 115 Phe Ser Ile Tyr Leu 195 Pro Phe Val Ser Al a 275 Ser Phe Val1 Val1 Arg Leu Gin Ser Leu Ser Val Leu Arg Lys Ser Phe Gly Leu 100 Ser Leu Leu Ala Ser Leu Val Gly 165 Phe Lys 180 Leu Lys Lys Leu Met Ser Lys Glu 245 Ile Leu 260 Leu Ser Asn Ile Phe Thr Ser Lys 325 Arg Val 340 Tyr Phe Ile Ile Leu Leu Val1 Asn 70 Val1 Tyr Ile Lys Gly 150 Ile Giu Giu Ly s His 230 Gin Ile Ser Pro Tyr 310 Asn Val1 Ser Leu Pro Ser Al a 55 Leu Leu Gly Thr Asn 135 Ser Phe Lys Gin Ile 215 Phe Tyr His Leu Tyr 295 Met Asn Ser Pro Phe 375 His Giu 40 Giu Phe Phe Leu Asp 120 Asn Phe Leu Giu Val1 200 Leu Gin Leu Phe Val1 280 Cys Ser Ile Val Leu 360 Phe Ala 25 Ile Tyr Leu Phe Tyr 105 Ile Gin Leu Ile Gly 185 Lys Leu Met Phe Leu 265 Val1 Phe Tyr Ser Leu 345 Thr Phe Ile Leu Pro Val Asp 90 Ser Lys Ile Tyr Met 170 Asp Gly Val1 Trp Val 250 Lys Leu Ile Cys Ser 330 Ile Ile Leu Leu Leu Ser Ser 75 Ser Al a Glu Thr Leu 155 Leu Phe Ser Gly Gin 235 Ph e Al a Leu Gly Leu 315 Leu Leu Ile Tyr Thr Val1 Gly Asn Phe Cys Asn Tyr 140 Lys Cys Lys Leu His 220 Al a Tyr Ser Gly Val1 300 Asn Ser Ser Thr Lys 380 Pro Gin Val1 Al a Ile Ser Lys 125 Leu Val1 Val1 Ser Lys 205 Leu Tyr Ile Ser Val1 285 Tyr Tyr Ser Leu Met 365 Al a Leu Thr Leu Phe Phe Ser 110 Lys Gly His Leu Gin 190 Giu Ile Phe Al a Tyr 270 Ser Ala Gin Leu Ser 350 His Lys Leu Phe Al a Leu Met Gly Asp Met Al a Thr 175 Lys Leu Thr Leu Phe 255 S er Pro Leu Phe Leu 335 Ser Phe Pro Leu 
Phe Asp Ile Leu Thr Leu Ile Met 160 Ile Ser Lys Pro Lys 240 Gln Gln Leu Met Ser 320 Ser Leu Ala Phe INFORMATION FOR SEQ ID NO:33: SEQUENCE CHARACTERISTICS: LENGTH: 1264 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 51...1205 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: ATTAAATATG ACTATATACA CTACAACAAT AAGATTTTGA AAGGTTGGTA ATG GAA Met Glu TCA GTA AAA Ser Val Lys ACA GGA AAA ACA Thr Gly Lys Thr AAT AAG GTT GGC AAG AAT ACA GAG ATG Asn Lys Val Gly Lys Asn Thr Glu Met 10 GAG GCT CAT TTT AAA CAA GCG AGC ACC Glu Ala His Phe Lys Gln Ala Ser Thr GCT AAT Ala Asn ACA AAG GCA AAT Thr Lys Ala Asn
AAA
Lys 25
ATT
Ile ACA AAT ATA ATC Thr Asn Ile Ile TCA ATT CGT GG Ser Ile Arg Gly
ATT
Ile TTT ACA AAA ATT Phe Thr Lys Ile 200 248 AAG AAA GTT AGA Lys Lys Val Arg CTT GTA AAA AAA Leu Val Lys Lys CCC AAG AAA AGC Pro Lys Lys Ser AGT GCG Ser Ala GCA TTA GTA Ala Leu Val
GTA
Val TTG ACC CAT ATT Leu Thr His Ile
GCG
Ala 75 TGC AAG AAA GCG Cys Lys Lys Ala AAA GAA TTA Lys Giu Leu GAA AAT CAA Glu Asn Gin
GAC
Asp GAT AAA Asp Lys GTC CAA GAT AAA Val Gln Asp Lys AAA CAA GCT GAA Lys Gln Ala Glu
AAA
Lys ATC AAT Ile Asn 100 TGG TGG AAA TAT Trp Trp Lys Tyr
TCA
Ser 105 GGA TTA ACA ATA Gly Leu Thr Ile ACA AGT TTA TTA Thr Ser Leu Leu
TTA
Leu 115 GCC GCT TGT AGC ACT GGT GAT GTT AGT Ala Ala Cys Ser Thr Gly Asp Val Ser 120 CAA ATA GAA CTA Gln Ile Glu Leu
GAA
Glu 130 CAA GAA AAA CAA Gln Glu Lys Gln ACG AGC AAT ATA Thr Ser Asn Ile ACT AAC AAT CAA ATA AAA Thr Asn Asn Gln Ile Lys 145 GTA GAA CAA Val Glu Gln
GAA
Glu 150 AAA CAA AAG ACA Lys Gln Lys Thr
AGC
Ser 155 AAT ATA GAG ACT Asn Ile Glu Thr AAT AAT CAA Asn Asn Gln 160 ATA AAA GTA Ile Lys Val 165 GAA CAA GAA CAA CAG Glu Gln Glu Gln AAA ACA GAA Lys Thr Glu CAA GAA MGA CAG AAA Gln 170 Gln Glu 175 Xaa Gln Lys ACA GAA Thr Glu 180 CAA GAA AGA CAG Gln Glu Arg Gln
AAG
Lys 185 ACA GAA CAA GAA Thr Glu Gln Glu CAA AAG ACC ATT Gln Lys Thr Ile
AAA
Lys 195 ACA CAG AAA GAT Thr Gln Lys Asp ATT AAA TAT GTA GAA CAA AAT TGC CAA Ile Lys Tyr Val Glu Gln Asn Cys Gln 205
GAA
Glu 210 680 AAT CAT AAT CAA Asn His Asn Gln
TTC
Phe 215 TTT ATT GAA AAA Phe Ile Glu Lys
GGA
Gly 220 GGA ATT AAG GCT Gly Ile Lys Ala GGT ATT Gly Ile 225 728 GGT ATA GAA Gly Ile Glu AAT CAA ACC Asn Gln Thr 245
GTA
Val 230 GAA GCT GAA TGC Glu Ala Glu Cys ACC CCT AAA CCT Thr Pro Lys Pro GCA AAA ACC Ala Lys Thr 240 CCT ATC CAG CCA Pro Ile Gin Pro CAC CTC CCA AAC TCT AAA CAA CCC His Leu Pro Asn Ser Lys Gin Pro 255 CGC TCT Arg Ser 260 CAA AGA GGA TCA Gin Arg Gly Ser GCG CAA GAG CTT Ala Gin Glu Leu
ATC
Ile 270 GCT TAT TTG CAA Ala Tyr Leu Gin GAG CTA GAA TCT Glu Leu Glu Ser CCC TAT TCA CAA Pro Tyr Ser Gin
AAA
Lys 285 GCT ATC GCT AAA Ala Ile Ala Lys
CAA
Gin 290 GTG GAT TTT TAT Val Asp Phe Tyr
AGA
Arg 295 CCA AGT TCT ATC Pro Ser Ser Ile TAT TTA GAA CTA Tyr Leu Glu Leu GAC CCT Asp Pro 305 AGA GAT TTT Arg Asp Phe CGC TCT AAA Arg Ser Lys 325 GTT ACA GAA GAA Val Thr Glu Glu CAA AAA GAA AAT Gin Lys Glu Asn TTA AAA ATA Leu Lys Ile 320 TTA AAA CCA Leu Lys Pro 1016 1064 GCT CAA GCT AAA Ala Gin Ala Lys CTT GAA ATG AGG Leu Glu Met Arg
AGT
Ser 335 GAC TCA Asp Ser 340 CAA GCC CAC CTT Gin Ala His Leu ACC TCT CAA AGC Thr Ser Gin Ser
CTT
Leu 350 TTG TTC GTT CAA Leu Phe Val Gin 1112
AAA
Lys 355 ATA TTT GCT GAT GTT AAT AAA GAA ATA Ile Phe Ala Asp Val Asn Lys Glu Ile 360
AAA
Lys 365 GTA GTT GCT AAT Val Val Ala Asn 1160 GAA AAG AAA GCA Glu Lys Lys Ala
GAA
Glu 375 AAA GCG GGT TAT Lys Ala Gly Tyr TAT AGT AAA AGG Tyr Ser Lys Arg ATG TAGGC Met 385 1210 WO 98/21225 PCT/US97/21353 -125- ATAAGAAAAC ACCATAAAAT CGTTCTTAGC TTATTTATAG TATTTTAAAA ACTC 1264 INFORMATION FOR SEQ ID NO:34: SEQUENCE CHARACTERISTICS: S(A) LENGTH: 385 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr 1 5 10 Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gin Ala 25 Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys 40 Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser 55 Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys 70 75 Glu Leu Asp Asp Lys Val Gin Asp Lys Ser Lys Gin Ala Glu Lys Glu 90 Asn Gin Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser 100 105 110 Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gin Ile Glu 115 120 125 Leu Glu Gin Glu Lys Gin Lys Thr Ser Asn Ile Glu Thr Asn Asn Gin 130 135 140 Ile Lys Val Glu Gin Glu Lys Gin Lys Thr Ser Asn Ile Glu Thr Asn 145 150 155 160 Asn Gin Ile Lys Val Glu Gin Glu Gin Gin Lys Thr Glu Gin Glu Xaa 165 170 175 Gin Lys Thr Glu Gin Glu Arg Gin Lys Thr Glu Gin Glu Lys Gin Lys 180 185 190 Thr Ile Lys Thr Gin Lys Asp Phe Ile Lys Tyr Val Glu Gin Asn Cys 195 200 205 Gin Glu Asn His Asn Gin Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala 210 215 220 Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala 225 230 235 240 Lys Thr Asn Gin Thr Pro Ile Gin Pro Lys His Leu Pro Asn Ser Lys 245 250 255 Gin Pro Arg Ser Gin Arg Gly. 
Ser Lys Ala Gin Glu Leu Ile Ala Tyr 260 265 270 Leu Gin Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gin Lys Ala Ile Ala 275 280 285 Lys Gin Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu 290 295 300 Asp Pro Arg Asp Phe Asn Val Thr Glu Glu Trp Gin Lys Glu Asn Leu 305 310 315 320 WO 98/21225 PCT/US97/21353 -126- Lys Ile Arg Ser Lys Ala Gin Ala Lys 325 Lys Pro Asp Ser Gin Ala His Leu Ser 340 345 Val Gin Lys Ile Phe Ala Asp Val Asn 355 360 Asn Thr Glu Lys Lys Ala Glu Lys Ala Met Leu 330 Thr Ser Lys Glu Glu Met Arg Ser Leu 335 Gin Ser Leu Leu Phe 350 Ile Lys Val Val Ala Gly Tyr Gly 380 Ser Lys Arg 370 375 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 410 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 62...340 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID ATTCATTTAC TTTTGAGAAA TATAATTCTC TCGCTTTTAA GATCATCACA AGGAGTTTCG T ATG AAA AAG CAA ATC TTG ACA GGT GTT TTA TTA TCA GTT TTG GCA GTG Met Lys Lys Gin Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val 109 AGT TCT GCA Ser Ser Ala TTT AGC ACA Phe Ser Thr
TAC
Tyr GCT CAC AAA GAT AAA AAA GAC GCC AAA Ala His Lys Asp Lys Lys Asp Ala Lys AAA CCT AAA Lys Pro Lys GAC GCT AAA Asp Ala Lys GAA TTA GTC GTG GCT CAA AAC GAC AAA Glu Leu Val Val Ala Gin Asn Asp Lys 40
AAA
Lys AAA CCT Lys Pro AAA TTT AGC ACA GAA TTA GTC GTG GCT CAA AAC GAC AAA AAA Lys Phe Ser Thr Glu Leu Val Val Ala Gin Asn Asp Lys Lys GAC GCT AAA AAA CCT AAA TTT AGC ACA GAA Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu 70 GAC AAA AAA GAC GCT AAA AAA CCT AAA AAC Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn GTC GTG GCT CAA AAC Val Val Ala Gin Asn TCA GTG GTC Ser Val Val TAATGGCTTT GA CTCTAAAAAA GCGTTTTTAA AAACGCTTTT TTGGATATTA TCCTATAATT TCCTACCA WO 98/21225 PCT/US97/21353 -127- INFORMATION FOR SEQ ID NO:36: SEQUENCE CHARACTERISTICS: LENGTH: 93 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: Met 1 Lys Lys Gin Ile Leu Thr Gly Val Leu Ser Val Leu Ala Val Ser Ser Ala Tyr Ala Phe Ser Thr Glu Leu Lys Pro Lys Phe Ser His Lys Asp Lys 25 Val Val Ala Gin Asp Ala Lys Asn Asp Lys Lys Pro Lys Asp Ala Lys Asp Lys Lys Asp Ala Thr Glu Leu Val Val Ala Gin 55 Lys Phe Ser Thr Glu Leu Val 7c Lys Lys Pro Val Ala Gin Lys Lys Asp Ala Lys Lys Pro Lys Ser Val Val INFORMATION FOR SEQ ID NO:37: SEQUENCE CHARACTERISTICS: LENGTH: 2097 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 67...2046 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: TAAAAACCCC TATCATAGGG CGTGGCATGA AGAAAAAAGC AAAAGTCTTT TTAATC ATG ATT TAT TGG TTG TAT TTG GCG GTC TTT TTT TTG Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu 1 5
TGGTATTGTT
TTG AGC Leu Ser 108 GCA TTA GAC GCT AAA GAA ATC GCT ATG CAA CGA TTT GAC AAA CAA AAC Ala Leu Asp Ala Lys Glu Ile Ala Met Gin Arg Phe Asp Lys Gin Asn 20 25 CAT AAG ATT TTT GAA ATC CTT GCG GAT AAA GTG AGC GCT AAA GAC AAT His Lys Ile Phe Glu Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn WO 98/21225 PCT/US97/21353 -128- GTG ATA ACC GCA TCA GGG AAT GCG Val Ile Thr Ala Ser Gly Asn Ala TTA TTG AAT TAT GAT Leu Leu Asn Tyr Asp
TAT
Tyr ATT CTA GCG Ile Leu Ala GAC AAG GTG CGT Asp Lys Val Arg
TAT
Tyr 70 GAC ACT AAA ACC Asp Thr Lys Thr
AAA
Lys GAA GCG TTA Glu Ala Leu TTA GAG Leu Glu GGG AAT ATC AAG GTT TAT AGG GGC GAG Gly Asn Ile Lys Val Tyr Arg Gly Glu
GGT
Gly TTG CTC GTT AAA Leu Leu Val Lys 348 396 GAT TAC GTG AAA TTG AGT TTG AAT GAA Asp Tyr Val Lys Leu Ser Leu Asn Glu 100
AAA
Lys 105 TAT GAA ATC ATT TTC Tyr Glu Ile Ile Phe 110 CCC TTT TAT GTC Pro Phe Tyr Val
CAA
Gin 115 GAC AGC GTG AGC Asp Ser Val Ser ATT TGG GTG AGC Ile Trp Val Ser GCG GAT Ala Asp 125 ATT GCC AGC Ile Ala Ser TCA GGG TGC Ser Gly Cys 145 AAG GAT CAA AAA Lys Asp Gin Lys
TAT
Tyr 135 AAG GTT AAA AAC Lys Val Lys Asn ATG AGC ACT Met Ser Thr 140 GCG ACT TCA Ala Thr Ser AGC ATT GAT AAC Ser Ile Asp Asn ATT TGG CAT GTC Ile Trp His Val GGC TCA Gly Ser 160 TTC AAC ATG CAA Phe Asn Met Gin TCG CAT TTG TCT Ser His Leu Ser TGG AAT CCT AAG Trp Asn Pro Lys
ATC
Ile 175 TAT GTC GGT GAT Tyr Val Gly Asp
ATT
Ile 180 CCT GTA TTG TAT Pro Val Leu Tyr CCC TAT ATT TTC Pro Tyr Ile Phe
ATG
Met 190 TCC ACG AGC AAT Ser Thr Ser Asn AGA ACT ACT GGG Arg Thr Thr Gly TTA TAC CCT GAG Leu Tyr Pro Glu TTT GGC Phe Gly 205 ACT TCC AAC Thr Ser Asn CCC AAA AAC Pro Lys Asn 225
TTA
Leu 210 GAC GGC TTT ATT Asp Gly Phe Ile
TAT
Tyr 215 TTG CAA CCC TTT Leu Gin Pro Phe TAT TTA GCC Tyr Leu Ala 220 CGC TAT AAA Arg Tyr Lys TCA TGG GAT ATG Ser Trp Asp Met
ACC
Thr 230 TTT ACC CCA CAA Phe Thr Pro Gln AGG GGT Arg Gly 240 TTT GGC TTG AAT Phe Gly Leu Asn GAA GCG CGC TAC Glu Ala Arg Tyr AAC TCT AAA AAC Asn Ser Lys Asn GAC AGG TTT TTA TTC AAC GCG CGC TAT TTT AGG AAT TAC ACC CAA TAT Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gln Tyr 876 270 GTC AAA CGC TAC Val Lys Arg Tyr
GAT
Asp 275 TTG AGG AAT CAA Leu Arg Asn Gin
AAT
Asn 280 ATC TAC GGG TTT Ile Tyr Gly Phe GAA TTT Glu Phe 285 TTA AGC TCT Leu Ser Ser AAT ATT GAC Asn Ile Asp 305 AGG GAC ACT TTA CAA AAA TAC TTC CAC Arg Asp Thr Leu Gln Lys Tyr Phe His 295 CTT AAG TCT Leu Lys Ser 300 AAC GAT TTG Asn Asp Leu AAC GGC CAT TAC Asn Gly His Tyr GAC TTT TTA TAC Asp Phe Leu Tyr 1020 GAC TAT Asp Tyr 320 GTG CGT TTT GAA Val Arg Phe Glu GTT AAT AAG CGT Val Asn Lys Arg
ATC
Ile 330 ACA GAC GCC ACG Thr Asp Ala Thr
CAC
His 335 ATG TCT AGG GCG Met Ser Arg Ala
AAT
Asn 340 TAC TAT TTG CAA Tyr Tyr Leu Gln GAA AAC AAT TAT Glu Asn Asn Tyr 1068 1116 1164 CCC TTG AAT ATC Gly Leu Asn Ile TAT TTT TTA AAC Tyr Phe Leu Asn
CTG
Leu 360 AAT AAA ATC AAC Asn Lys Ile Asn AAT AAC Asn Asn 365 CCC ACT TTC Arg Thr Phe TCT TTG TAT Ser Leu Tyr 385 TCT CTC CCT AAT Ser Val Pro Asn CAA TAC CAT AAA Gin Tyr His Lys TAT TTA AAT Tyr Leu Asn 380 CAG TTT ACA Cmn Phe Arg 1212 TTT AGA AAT TTC TTC TAT TCG CTC CAT Phe Arg Asn Leu Leu Tyr Ser Val Asp AAC ACC Asn Thr 400 CCA ACA GAG ATT Ala Arg Giu Ile
CGT
Gly 405 TAT CCC TAT CTC Tyr Gly Tyr Val
CAA
Cmn 410 AAC GCT TTC AAT Asn Ala Leu Asn 1260 1308 1356
GTG
Val1 415 CCC CTG CCC TTG Pro Vai Cly Leu
CAA
Gin 420 TTT TCT TTC TTT Phe Ser Leu Phe
AAA
Lys 425 AAG TAT TTG TCT Lys Tyr Leu Ser GGG CTT TGG AAT Gly Leu Trp Asn CTC CAA CTA TCT AAT CTG GCT TTA ATC CAA TCT Leu Gin Leu Ser Asn Val Ala Leu Met Gin Ser 1404 AAA AAT TCC Lys Asn Ser AAT TTT GTG Asn Phe Vai 465 GTG CCT ACG ATC Val Pro Thr Ilie AAT CAA TCA AGG Asn Ciu Ser Arg GAA TTT GGG Giu Phe Cly 460 CAT TTG OCT Asp Leu Ala 1452 1500 TCT TCA AAT TTT Ser Ser Asn Phe
TCC
Ser 470 ATG TAT CTC AAT Met Tyr Val Asn
ACG
Thr 475 AGA GAA TAC AAC AAG CTT TTC CAC ACG ATC CAA CTA GAA C ATT TTC 14 1548 Arg Glu 480 Tyr Asn Lys Leu His Thr Ile Gln Glu Ala Ile Phe
AAC
Asn 495 ATC CCT TAT TAC Ile Pro Tyr Tyr
ACC
Thr 500 TTT AAA AAC GGC Phe Lys Asn Gly
TTA
Leu 505 TTT TCT CAA AAC Phe Ser Gln Asn 1596 TAT GCT TTA AGC GCG CAA CCC TTA AAC Tyr Ala Leu Ser Ala Gln Ala Leu Asn 515
AGC
Ser 520 TAC ACT TCG CCT Tyr Thr Ser Pro TTA TTG Leu Leu 525 1644 AGA CAT TAT Arg Asp Tyr
CAT
Asp 530 TAT CAA GGG CGT Tyr Gln Gly Arg TAT CAC TCG GTG Tyr Asp Ser Val TGG AAT CCT Trp Asn Pro 540 1692 AGC ACT ATT TTA CCT AGC AAT Ser Ser Ile Leu Pro Ser Asn 545 AGC AAC AAC ACG GTG CAT TTA ACC Ser Asn Lys Thr Val Asp Leu Thr 555 1740 CTA ACG Leu Thr 560 CAA TAC CTT TAT Gln Tyr Leu Tyr TTA GGG GGG CAA Leu Gly Gly Gln
GAG
Glu 570 TTA TTC TAT TTT Leu Leu Tyr Phe 1788 1836 ATA TCC CAA CTC Ile Ser Gln Leu
ATC
Ile 580 AAT CTT GAC CAT Asn Leu Asp Asp
AAA
Lys 585 CTT TCG CCC TTT Val Ser Pro Phe ATG CCA CTA GAG ACC AAG ATC GGG TTT Met Pro Leu Glu Ser Lys Ile Gly Phe 595
TCC
Ser 600 CCC TTA ACG GGA TTC AAC Pro Leu Thr Gly Leu Asn 605 1884 ATC TTT GGG Ile Phe Gly ATC TCT GTG Ile Ser Val 625
AAT
Asn 610 GTC TTT TAT TCG Val Phe Tyr Ser TAT CAA AAC CGC Tyr Gln Asn Arg TTA GAA GAA Leu Glu Glu 620 1932 AAC CCC AAT TAC Asn Ala Asn Tyr CGC AAC TTT TTA ACC TTT AAC CTC Arg Lys Phe Leu Ser Phe Asn Leu 635 1980 TCT TAT Ser Tyr 640 TTT TTA AAA AAC Phe Leu Lys Asn TTT ACC ACT GGG Phe Ser Ser Gly
ATT
Ile 650 AAT AGC ATT CTA Asn Ser Ile Val 2028 AAT CTC CGG ATT ATT Asn Leu Arg Ile Ile 660 TAAAGGCCGG TTTTACCAAC CACTTTGGCT ATTTTTCC 2084 ATCAGCGCGG ATC 2097 INFORMATION FOR SEQ ID NO:38: SEQUENCE CHARACTERISTICS: LENGTH: 660 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear WO 98/21225 WO 8/2225PCTIUS97/21353 -131- (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe 1 Asp Ile Thr Al a Gly Tyr Tyr Her Cys 145 Phe Val1 Ser As n Asn 225 Phe Phe Arg Ser Asp 305 Val1 Her Asn Phe Tyr 385 Ala Phe Al a Asp Asn Val1 Val1 Gly 130 Ser Asn Gly Asn Leu 210 Ser Gly Leu Tyr Ser 290 Asn Arg Arg I le Gin 370 Phe Glu Ile Gly Val1 Lys Leu 100 Asp Asp Asp Gin Ile 180 Arg Gly Asp Asn Asn 260 Leu Asp His Glu Asn 340 Tyr Val1 Asn 5 Ile Leu Asn Arg Val Ser Ser Gin Asn Lys 165 Pro Thr Phe Met Phe 245 Ala Arg Thr Tyr Lys 325 Tyr Phe Pro Leu Al a Ala Al a Tyr 70 Tyr Leu Val1 Lys Pro 150 Her Val Thr Ile Th r 230 Gi u Arg Asn Leu Ile 310 Val1 Tyr Leu Asn Leu 390 Gin Lys Leu Thr Gly Glu Gly 120 Lys Trp Leu Tyr Phe 200 Leu Thr Arg Phe As n 280 Lys Phe- Lys Gln Leu 360 Gin Ser Arg 25 Val1 Leu Ly s Giu Lys 105 Ile Val1 His Ser Leu 185 Leu Gin Pro Tyr Arg 265 Ile Tyr Leu Arg Thr 345 As n Tyr Val1 10 Phe Ser Asn Thr Gly 90 Tyr Trp, Lys Val1 Met 170 Pro Tyr Pro Gin Ile 250 Asn Tyr Phe Tyr Ile 330 Glu Lys His Asp Leu Lys Lys Asp Glu Leu Ile Her Met 140 Al a Asn Ile Giu Tyr 220 Arg Her Thr Phe Leu 300 Asn Asp Asn Asn Tyr 380 Gin Leu Gin Asp Val1 Al a Val1 Ile Al a 125 Ser Thr Pro Phe Ph e 205 Leu Tyr Lys Gin Glu 285 Lys Asp Al a Tyr Asn 365 Leu Phe Her Asn Asn Tyr Leu Lys Phe 110 Asp Thr Her Lys Met 190 Gly Al a Lys As n Tyr 270 Phe Her Leu Th r Tyr 350 Asn Asn Arg Ala His Val I le Leu Thr Pro I le Her Gly Ile 175 Her Thr Pro Arg Asp 255 Val Leu Asn Asp His 335 Gly Arg Her Asn Leu L~ys Ile Leu Glu Asp Phe Ala Gly Her 160 Tyr Thr Her Lys Gly 240 Arg Lys Her Ile Tyr 320 Met Leu Thr Leu Thr 400 WO 98/21225 
PCTIUS97/21353 -132- Ala Val Trp Ser Val 465 Tyr Pro Leu Tyr Ile 545 Gin Ser Leu Gly Val 625 Phe Leu Arg Gly Asn Phe 450 Ser Asn Tyr Ser Asp 530 Leu Tyr Gin Glu Asn 610 Asn Leu Arg Glu Ile Gly 405 Leu Gin Phe 420 Asp Leu Gin 435 Val Pro Thr Ser Asn Phe Lys Leu Phe 485 Tyr Thr Phe 500 Ala Gin Ala 515 Tyr Gin Gly Pro Ser Asn Leu Tyr Gly 565 Leu Ile Asn 580 Ser Lys Ile 595 Val Phe Tyr Ala Asn Tyr Lys Asn Asn 645 Ile Ile 660 Tyr Gly Tyr Val Gin Asn Ala Leu Asn Ser Leu Ile Ser 470 His Lys Leu Arg Ala 550 Leu Leu Gly Ser Gin 630 Phe Phe Asn 440 Asn Tyr Ile Gly Ser 520 Tyr Asn Gly Asp Ser 600 Tyr Lys Ser Lys 425 Val Glu Val Gin Leu 505 Tyr Asp Lys Gin Lys 585 Pro Gin Phe Gly 410 Lys Ala Ser Asn Leu 490 Phe Thr Ser Thr Glu 570 Val Leu Asn Leu Ile 650 Tyr Leu Arg Thr 475 Glu Ser Ser Val Val 555 Leu Ser Thr Arg Ser 635 Asn Leu Ser Met Gin 445 Glu Phe 460 Asp Leu Ala Ile Gin Asn Pro Leu 525 Trp Asn 540 Asp Leu Leu Tyr Pro Phe Gly Leu 605 Leu Glu 620 Phe Asn Ser Ile Leu 430 Ser Gly Ala Phe Met 510 Leu Pro Thr Phe Arg 590 Asn Glu Leu Val Val 415 Gly Lys Asn Arg Asn 495 Tyr Arg Ser Leu Lys 575 Met Ile Ile Ser Glu 655 Pro Leu Asn Phe Glu 480 Ile Ala Asp Ser Thr 560 Ile Pro Phe Ser Tyr 640 Asn INFORMATION FOR SEQ ID NO:39: SEQUENCE CHARACTERISTICS: LENGTH: 961 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 168...764 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: ATGCCGATTA AATGCATGCT GATTAAATGA ATGAAAAGAG TCCAAACCAC CGCCTTTAAC GCACCACGCT TGAAATTAAA ACTAAATTTT AGTGTATTCT TAGCAAATTT TAGATAAGAT wo 98/21225 WO 9821225PCTIUS97/21353 -133- CAAGCGTGAT TTTTTCTAAA TTTTAGGCAT TTAAGOAATC AOTOTTT ATG ACA AOC Met Thr Ser 1 GCT CTG Al a Leu TTA GGC TTA CAA Leu Oly Leu Gin GTT TTA GCG OTA Val Leu Ala Val ATT GTG GTG GTG Ile Val Val Val
GTT
Val TTG TTG CAA AAA Leu Leu Gln Lys
ACT
Ser 25 TCT AGC ATC GGC Ser Ser Ile Gly GGG GCT TAT AGC Gly Ala Tyr Ser
GO
Gly ACT AAT GAG TCT Ser Asn Glu Ser TTT GGC GCT AAA Phe Gly Ala Lys
GGG
Gly CCT GCA AGC TTT Pro Ala Ser Phe ATG GCG Met Ala AAA TTA ACC Lys Leu Thr TTG GGC TAT Leu Gly Tyr
ATG
Met TTT TTA GGG CTG Phe Leu Gly Leu
TTA
Leu TTT GTC ATC AAC ACC ATC GCT Phe Val Ile Asn Thr Ile Ala TTT TAC AAC AAA Phe Tyr Asn Lys
GAA
Glu TAC GGC AAG AGC Tyr Gly Lys Ser TTA GAT GAG Leu Asp Glu ACT AA Thr Lys ACC AAC AAA GAA Thr Asn Lys Glu TCG CCC CTA GTC Ser Pro Leu Val
CCT
Pro GCC ACC GGC ACG Ala Thr Gly Thr AAC CCT GCA CTT Asn Pro Ala Leu CCC ACA TTA AAC Pro Thr Leu Asn ACG CTC AAC CCT Thr Leu Asn Pro GAG CAA GCC CCA Glu Gln Ala Pro
ACT
Thr 120 AAT CCT TTA ATG CCA CAA CAA ACG CCT Asn Pro Leu Met Pro Gln Gln Thr Pro 125 AAC GAA Asn Glu 130 560 CTC CCT AAA Leu Pro Lys AAT GAA AAG Asn Glu Lys 150 CCA GCC AAA ACG CCT TCT GTT GAA AGC Pro Ala Lys Thr Pro Ser Val Glu Ser 140 CCC AAA CAG Pro Lys Gln 145 ATA AAG GGT Ile Lys Gly AAT GAA AAG AAT Asn Glu Lys Asn GCC AAA GAG AAT Ala Lys Glu Asn GTT GAA Val Glu 165 AAA ACC AAA GAG Lys Thr Lys Glu GCC AAA ACG CCC Ala Lys Thr Pro ACC ACC CAC CAA Thr Thr His Gln
AAG
Lys 180 CCT AAA ACG CAT Pro Lys Thr His
GCA
Ala 185 ACG CAA ACC AAC GCC CAT ACC AAC CA Thr Gln Thr Asn Ala His Thr Asn Gln 190 AAG GAT GAA AAA Lys Asp Glu Lys TAATGTTACA GGCCATTTAT AACGAAACCA AAGATCTGAT GCAAA
AAAGCATTCA AGCTTTAAAC AGGGATTTTT CCACTCTAAG GAGCGCGAAA GTTTCAGTCA 869
ATATTTTAGA TCACATCAAA GTGGATTATT ACGGCACGCC CACGGCATTA AATCAAGTCG 929
GATCCGTGAT GAGCTTGGAT GCGACCACCC TT 961
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 199 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID
Met Thr Ser Ala Leu Leu Gly Leu 1 Val Tyr Phe Thr Leu Thr Asn Pro Pro 145 Ile Thr Asn Val Gly Ala Ala Glu Thr Leu 115 Glu Gln Gly Gln Lys Val Ser Lys Leu Thr Leu 100 Glu Leu Asn Val Lys 180 Lys 5 Leu Asn Leu Gly Lys Asn Gln Pro Glu Glu 165 Pro Asp Gln Ser 25 Phe Leu Asn Glu Asn 105 Asn Ala Lys Glu Ala 185 Ile 10 Ser Gly Gly Lys Leu 90 Pro Pro Lys Asn Asn 170 Thr Val Ser Ala Leu Glu Ser Thr Leu Thr Asp 155 Ala Gln Ala Gly Gly Phe Gly Leu Asn Pro 125 Ser Lys Thr Asn Val Leu Pro Val Lys Val Pro 110 Gln Val Glu Pro Ala 190 Leu Gly Ala Ile Ser Pro Thr Gln Glu Asn Pro 175 His
INFORMATION FOR SEQ ID NO:41: SEQUENCE CHARACTERISTICS: LENGTH: 1058 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 325...879 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
CCTAGTCCCT GCCACCGGCA CGCTTAACCC TGCACTTAAT CCCACATTAA ACCCAACGCT 60
CAACCCTTTA GAGCAAGCCC CAACTAATCC TTTAATGCCA CAACAAACGC CTAACGAACT 120
CCCTAAAGAG CCAGCCAAAA CGCCTTCTGT TGAAAGCCCC AAACAGAATG AAAAGAATGA 180
AAAGAATGAC GCCAAAGAGA ATGGTATAAA GGGTGTTGAA AAAACCAAAG AGAACGCCAA 240
AACGCCCCCA ACCACCCACC AAAAGCCTAA AACGCATGCA ACGCAAACCA ACGCCCATAC 300
CAACCAAAAA AAGGATGAAA AATA ATG TTA CAG GCC ATT TAT AAC GAA ACC 351
Met Leu Gln Ala Ile Tyr Asn Glu Thr
AAA
Lys GAT CTG ATG CAA Asp Leu Met Gln
AAA
Lys 15 AGC ATT CAA GCT Ser Ile Gln Ala AAC AGG GAT TTT Asn Arg Asp Phe
TCC
Ser ACT CTA AGG AGC Thr Leu Arg Ser AAA GTT TCA GTC AAT ATT TTA GAT CAC Lys Val Ser Val Asn Ile Leu Asp His ATC AAA Ile Lys GTG GAT TAT Val Asp Tyr ATG AGC TTG Met Ser Leu
TAC
Tyr GGC ACG CCC ACG Gly Thr Pro Thr
GCA
Ala TTA AAT CAA GTC GGA TCC GTG Leu Asn Gln Val Gly Ser Val GAT GCG ACC ACC CTT CAA ATC AGC CCA Asp Ala Thr Thr Leu Gln Ile Ser Pro GAA AAA AAC Glu Lys Asn CTG CTC Leu Leu AAA GAA ATT GAA Lys Glu Ile Glu
AGA
Arg 80 TCC ATT CAA GAA GCC AAT ATT GGT GTC Ser Ile Gln Glu Ala Asn Ile Gly Val 591
AAT
Asn CCT AAT AAC GAC Pro Asn Asn Asp GAA ACG ATC AAG Glu Thr Ile Lys TTT TTC CCG CCC Phe Phe Pro Pro 639 687 ACA AGT GAG CAA Thr Ser Glu Gln AAA CTC ATC GCA Lys Leu Ile Ala
AAA
Lys 115 GAC GCC AAA GCG Asp Ala Lys Ala ATG GGT Met Gly 120 GAA AAG GCT Glu Lys Ala CAG GTG AAA Gln Val Lys 140
AAA
Lys 125 GTG GCT GTG AGG Val Ala Val Arg
AAT
Asn 130 ATC CGC CAA GAT Ile Arg Gln Asp GCT AAC AAC Ala Asn Asn 135 GAT GAA AGC Asp Glu Ser AAA TTA GAA AAA Lys Leu Glu Lys
GAC
Asp 145 AAA GAA ATC AGC Lys Glu Ile Ser AAA AAA Lys Lys 155 GCC CAA GAG CAG Ala Gln Glu Gln CAA AAA ATC ACC Gln Lys Ile Thr
GAT
Asp 165 GAA GCC ATT AAA Glu Ala Ile Lys WO 98/21225 PCT/US97/21353 -136- AAA ATT GAT GAA AGC GTG AAA AAC AAA GAA GAC GCG ATC TTA AAG GTC T 880 Lys Ile Asp Glu Ser Val Lys Asn Lys Glu Asp Ala Ile Leu Lys Val 170 175 180 185 AAACCATGGA TATTAAGGCA TGTTATCAAA ACGCTAAAGC GTTATTAGAG GGGCATTTCT 940 TGCTCAGCAG TGGGTTTCAT TCCAATTATT ATTTGCAATC CGCTAAAGTT TTAGAAGATC 1000 CCAAACTAGC CGAACAATTA GCGCTAGAAT TAGCCAAACA AATCCAAGAA GCTCATTT 1058 INFORMATION FOR SEQ ID NO:42: SEQUENCE CHARACTERISTICS: LENGTH: 185 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: Met Leu Gin Ala Ile Tyr Asn Glu Thr Lys Asp Leu Met Gin Lys Ser 1 5 10 Ile Gin Ala Leu Asn Arg Asp Phe Ser Thr Leu Arg Ser Ala Lys Val 25 Ser Val Asn Ile Leu Asp His Ile Lys Val Asp Tyr Tyr Gly Thr Pro 40 Thr Ala Leu Asn Gin Val Gly Ser Val Met Ser Leu Asp Ala Thr Thr 55 Leu Gin Ile Ser Pro Trp Glu Lys Asn Leu Leu Lys Glu Ile Glu Arg 70 75 Ser Ile Gin Glu Ala Asn Ile Gly Val Asn Pro Asn Asn Asp Gly Glu 90 Thr Ile Lys Leu Phe Phe Pro Pro Met Thr Ser Glu Gin Arg Lys Leu 100 105 110 Ile Ala Lys Asp Ala Lys Ala Met Gly Glu Lys Ala Lys Val Ala Val 115 120 125 Arg Asn Ile Arg Gin Asp Ala Asn Asn Gin Val Lys Lys Leu Glu Lys 130 135 140 Asp Lys Glu Ile Ser Glu Asp Glu Ser Lys Lys Ala Gin Glu Gin Ile 145 150 155 160 Gin Lys Ile Thr Asp Glu Ala Ile Lys Lys Ile Asp Glu Ser Val Lys 165 170 175 Asn Lys Glu Asp Ala Ile Leu Lys Val 180 185 INFORMATION FOR SEQ ID NO:43: SEQUENCE CHARACTERISTICS: LENGTH: 1669 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: WO 98/21225 PCTIUS97/21353 -137- NAME/KEY: Coding Sequence LOCATION: 163.. .1389 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: GAGTGGATGA AAAACACACT TTCAATTTTG CAAAAATTGG AAGAATTAAA AGAAGTAGAA GAAAAGCATG CGTTTAAGAA TGCACAAAAT CGCCCCCACT ATCTTAAAAA AGAGGCTATA
CTATCAACAG
AATCCCTTTT
AA ATG GCT Met Ala
CGCAAGGGCG
CTCAAAGATT
CAA AAT Gln Asn 120 174 ACG AAA CTC AAC CCC CAG TTT GAA AAC Thr Lys Leu Asn Pro Gln Phe Glu Asn
ATC
Ile 15 ATT TTT GAA CAT Ile Phe Glu His
GAC
Asp GAC AAC CAA ATG Asp Asn Gln Met
ATT
Ile TTA AAC TTT GGC Leu Asn Phe Gly
CCC
Pro 30 CAA CAC CCC AGT Gln His Pro Ser ACT CAT Ser His GGG CAA TTG Gly Gln Leu CCT ACC CCT Ala Thr Pro
CGC
Arg TTG ATT TTC GAA Leu Ile Leu Glu GAG GGC CAA AAA Glu Gly Glu Lys ATC ATT AAC Ile Ile Lys AAG TTA GGC Lys Leu Gly GAA ATT CGC TAC Glu Ile Gly Tyr
TTC
Leu CAT AGA GGC TGT His Arg Gly Cys CAA AAC Glu Asn ATC ACC TAT AAC Met Thr Tyr Asn TAC ATC CCC ACT ACT CAT ACA TTG GAT Tyr Met Pro Thr Thr Asp Arg Leu Asp ACT TCT TCT ACC ACC AAT AAT TAC CCT Thr Ser Ser Thr Ser Asn Asn Tyr Ala
TAC
Tyr 95 CCT TAT GCG GTA Ala Tyr Ala Val
GAG
Glu 100 ACC TTA CTC AAT Thr Leu Leu Asn
TTA
Leu 105 CAA ATC CCA CGC Glu Ile Pro Arg
CGA
Arg 110 GCG CAG GTG ATC Ala Gln Val Ile CGC ACC Arg Thr 115 ATT TTA CTA Ile Leu Leu GTG CAT CCT Val His Ala 135 CTT AAC CCC ATC Leu Asn Arg Met TCA CAC ATC TTT Ser His Ile Phe TTT ATC AGC Phe Ile Ser 130 TTA CAT GTC GGG Leu Asp Val Gly
GCG
Ala 140 ATC ACC GTG TTT TTC TAT GCG TTT Met Ser Val Phe Leu Tyr Ala Phe 145 AAA ACC Lys Thr 150 AGG GAA TAC GGC Arg Glu Tyr Gly CAT TTC ATG GAG Asp Leu Met Glu TAT TCC GGG GCT Tyr Cys Gly Ala AGG CTC ACC CAT AAC CCT ATA AGG ATT GGG GGC GTG CCT TTA CAT TTA Arg 165 Leu Thr His Asn Ile Arg Ile Gly Gly 175 Val Pro Leu Asp CCC CCT AAT TGG Pro Pro Asn Trp GAA GGC TTA AAA Glu Gly Leu Lys
AAG
Lys 190 TTT TTA GGC GAA Phe Leu Gly Glu ATG AGG Met Arg 195 750 798 GAA TGC AAA Glu Cys Lys CGG ATG CGC Arg Met Arg 215
AAA
Lys 200 CTC ATT CAA GGC Leu Ile Gln Gly TTG GAT AAG AAT CGC ATT TGG Leu Asp Lys Asn Arg Ile Trp 210 TTG GAA AAT GTG Leu Glu Asn Val
GGC
Gly 220 GTT GTA ACG CAA Val Val Thr Gln
AAA
Lys 225 ATG GCG CAA Met Ala Gln 846 AGC TGG Ser Trp 230 GGC ATG AGC GGT Gly Met Ser Gly ATG TTA AGA GGG Met Leu Arg Gly GGG ATC GCT TAT Gly Ile Ala Tyr
GAC
Asp 245 ATC AGA AAA GAA Ile Arg Lys Glu CCT TAT GAG CTT Pro Tyr Glu Leu
TAT
Tyr 255 AAA GAG CTT GAT Lys Glu Leu Asp
TTT
Phe 260 894 942 990 GAT GTG CCG GTG Asp Val Pro Val
GGC
Gly 265 AAT TAT GGC GAT Asn Tyr Gly Asp TAT GAT AGG TAT Tyr Asp Arg Tyr TGT TTG Cys Leu 275 TAT ATG TTA Tyr Met Leu CCT ATG TAT Pro Met Tyr 295
GAA
Glu 280 ATT GAT GAA AGC Ile Asp Glu Ser CGC ATC ATT GAA Arg Ile Ile Glu CAG CTC ATT Gln Leu Ile 290 AAC CCG CAT Asn Pro His 1038 1086 GCT AAA ACC GAT Ala Lys Thr Asp
ACG
Thr 300 CCT ATC ATG GCT Pro Ile Met Ala TAT ATT Tyr Ile 310 TCC GCC CCT AAA Ser Ala Pro Lys GAT ATA ATG ACG Asp Ile Met Thr AAC TAC GCC TTG Asn Tyr Ala Leu 1134 1182
ATG
Met 325 CAG CAT TTT GTT Gln His Phe Val
TTA
Leu 330 GTG GCT CAG GGC Val Ala Gln Gly
ATG
Met 335 CGT CCG CCC GTT Arg Pro Pro Val
GGG
Gly 340 GAA GTG TAT GCC CCC ACA GAA AGC CCT Glu Val Tyr Ala Pro Thr Glu Ser Pro 345 GGG GAA TTA GGG Gly Glu Leu Gly TTT TTT Phe Phe 355 1230 ATC CAT TCA Ile His Ser CCT AGC TTT Pro Ser Phe 375
GAG
Glu 360 GGC GAG CCT TAC Gly Glu Pro Tyr CAC AGG CTA AAA His Arg Leu Lys ATC AGA GCC Ile Arg Ala 370 GTG GGG CAA Val Gly Gin 1278 1326 TAT CAC ATT GGG Tyr His Ile Gly GCT TTG Ala Leu 380 AGC GAC ATT Ser Asp Ile WO 98/21225 PCT/US97/21353 -139- TAT TTA GCG GAT GCA GTA ACC GTG ATT GGC TCA ACC AAT GCG GTG TTT 1374 Tyr Leu Ala Asp Ala Val Thr Val Ile Gly Ser Thr Asn Ala Val Phe 390 395 400 GGC GAG GTG GAT AGA TGAAACGCTT TGATTTACGC CCCTTAAAAG CGGGTATTTT T 1430 Gly Glu Val Asp Arg 405 GAACGCTTAG AAGAATTGAT TGAAAAAGAA ATGCAACCTA ATGAAGTCGC TATTTTCATG 1490 TTTGAAGTGG GGGATTTTTC TAATATCCCT AAGAGCGCTG AATTTATCCA ATCTAAAGGG 1550 CATGAGCTCC TCAATTCTTT GCGTTTCAAT CAAGCGGATT GGACGATTGT CGTGAGAAAA 1610 AAGGCTTGAT TTTGAGCGGC TTTAACCCCT TAAATTCTCC CTTAGTCGCA AGCTCTTCT 1669 INFORMATION FOR SEQ ID NO:44: SEQUENCE CHARACTERISTICS: LENGTH: 409 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: Met Ala Gin Asn Phe Thr Lys Leu Asn Pro Gin Phe Glu Asn Ile Ile 1 5 10 Phe Glu His Asp Asp Asn Gin Met Ile Leu Asn Phe Gly Pro Gin His 25 Pro Ser Ser His Gly Gin Leu Arg Leu Ile Leu Glu Leu Glu Gly Glu 40 Lys Ile Ile Lys Ala Thr Pro Glu Ile Gly Tyr Leu His Arg Gly Cys 55 Glu Lys Leu Gly Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr 70 75 Asp Arg Leu Asp Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala 90 Tyr Ala Val Glu Thr Leu Leu Asn Leu Glu Ile Pro Arg Arg Ala Gin 100 105 110 Val Ile Arg Thr Ile Leu Leu Glu Leu Asn Arg Met Ile Ser His Ile 115 120 125 Phe Phe Ile Ser Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe 130 135 140 Leu Tyr Ala Phe Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp 145 150 155 160 Tyr Cys Gly Ala Arg Leu Thr His Asn Ala Ile Arg Ile Gly Gly Val 165 170 175 Pro Leu Asp Leu Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys Phe Leu 180 185 190 Gly Glu Met Arg Glu Cys Lys Lys Leu Ile Gin Gly Leu Leu Asp Lys 195 200 205 Asn Arg Ile Trp Arg Met Arg Leu Glu Asn Val Gly Val Val Thr 
Gin 210 215 220 Lys Met Ala Gin Ser Trp Gly Met Ser Gly Ile Met Leu Arg Gly Thr WO 98/21225 PCT/US97/21353 140- 225 Gly Ile Ala Tyr Arg Lys Glu Glu 250 Asn Tyr Glu Leu Tyr Lys 255 Glu Leu Asp Arg Tyr Cys 275 Glu Gin Leu Phe 260 Leu Val Pro Val Gly 265 Ile Tyr Gly Asp Tyr Met Leu Asp Glu Ser Val 285 Pro Ser Tyr Asp 270 Arg Ile Ile Ile Met Ala Ile Pro Met 290 Gin Asn Tyr 295 Ser Lys Thr Asp Pro His Tyr Ala Pro Lys Ile Met Thr 305 Asn Tyr Ala Leu Met 325 Glu His Phe Val Leu 330 Thr Ala Gin Gly Met Arg 335 Pro Pro Val Leu Gly Phe 355 Lys Ile Arg 370 Leu Val Glv Gly 340 Phe Val Tyr Ala Glu Ser Pro Ile His Ser Glu 360 Tyr Glu Pro Tyr Lys Gly Glu 350 His Arg Leu Ser Asp Ile Ala Pro Ser Gin Tyr Leu 390 Phe Gly Glu 405 Phe 375 Ala Asp Ala Val Ile Gly Ser Thr 400 His Ile Gly Ala Val Val Asp Arg INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 869 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 358...732 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID
TAACTTGTGG
TTTAGGCATT
CCAATATTGG
CATGCTAATC
CTTCTACTTA
GTTATTATCC
TTAACTACCG
TTTTCAGTGC
CTATCATTTA
CTTTGAAATT
AAAACCCTAA
GTTCGCAACA
CCAGACTCCT
CGATCTTAAA
CTTTGATTTC
TGATTTTAAA
TTTTTTAAAC
AGAATTTTCT
TTTGAGTTTG
GTTTTCAAGA
GCCCATCGTG
ACCTTAAAAA
ACCATTTCCA
TGTTATCTTA
GCAAACGCGC
CTGCGTTGCG
TCATGTTCAA
AATAGCATAA
CAATTTTTAC
ATGTAAAGGT
CAATGAGTTC
TTTGAGCCCC
TTCTAAATTG
ACTCTTATAC
ACAAAAGAGG
CAAAACG ATG Met AAA AAG TTA GCC GCT TTA TTT TTA GTA AGC GTG TTG GGG GTT ATG GGT Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met Gly TTA AAC GCA Leu Asn Ala TGG GAG CAA ACC CTA Trp Glu Gln Thr Leu 25 AAA GCT AAT GAC TTG GAA GTG AAA Lys Ala Asn Asp Leu Glu Val Lys ATC AAA Ile Lys TCC GTG GGT AAC Ser Val Gly Asn
CCC
Pro 40 ATT AAA GGC GAT AAC ACT TTC ATT CTC Ile Lys Gly Asp Asn Thr Phe Ile Leu
AGC
Ser CCC ACT TTA AAA Pro Thr Leu Lys AAG GCT TTA GAA Lys Ala Leu Glu GCT ATC GTT AGG Ala Ile Val Arg CAG TTT ATG ATG Gln Phe Met Met GAA ATG CCC GGC Glu Met Pro Gly
ATG
Met CCA GCG ATG AAA GAA ATG Pro Ala Met Lys Glu Met GCG CAA GTG AGT GAA AAA AAC GGC Ala Gln Val Ser Glu Lys Asn Gly TAT GAA GCT AAA Tyr Glu Ala Lys ACC AAT CTT Thr Asn Leu TCT AAA GAG Ser Lys Glu TCT ATG AAC Ser Met Asn 100 GGG ACA TGG CAG Gly Thr Trp Gln
GTT
Val 105 AGG GTG GAT ATT Arg Val Asp Ile
AAA
Lys 110 696 GGT CAG Gly Gin 115 GTT TAT CGC GCT Val Tyr Arg Ala ACA AGC CTG GAT Thr Ser Leu Asp TAAGAGCATG CTATCT TTTATAAGCG CGTTTGATAA AAGGGGCGTT TCAATACGCC TTTTAACAGC CTTGTTACTG CTTTTTAGTT TGGGTTTGGC TAAAGATTTA GAGATCCAAT CTTTTGTGGC TAAATACCTT INFORMATION FOR SEQ ID NO:46: SEQUENCE CHARACTERISTICS: LENGTH: 125 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: Met Lys Lys Leu Ala Ala Leu Phe Leu Val 1 5 10 Gly Leu Asn Ala Trp Glu Gin Thr Leu Lys 25 Lys Ile Lys Ser Val Gly Asn Pro Ile Lys Leu Ser Pro Thr Leu Lys Gly Lys Ala Leu 55 Val Gin Phe Met Met Pro Glu Met Pro Gly Ser Val Leu Gly Val Met Ala Asn Asp Gly Asp Asn Leu Glu Val Thr Phe Ile Glu Lys Met Pro Ala Ile Val Arg Ala Met Lys WO 98/21225 PCT/US97/21353 -142- Met Ala Gin Val Ser Giu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn 90 Leu Ser Met Asn Gly Thr Trp Gin Val Arg Val Asp Ile Lys Ser Lys 100 105 110 Glu Gly Gin Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu 115 120 125 INFORMATION FOR SEQ ID NO:47: SEQUENCE CHARACTERISTICS: LENGTH: 1217 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 73...1152 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: TCCATGCGTT TTGATGCGAT TTTAAAAAAT CTTTGGGTAT TTTAGCATGC CAATGGTTAA AAAAAGGTGG TT ATG AAT GGT TTT TGC GCT AGA CTA CGA GCC ATA ACT CAT Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His 111 AAT GAA Asn Glu AGA TTA AAA ATG Arg Leu Lys Met
AAA
Lys 20 ATA GCG GTA Ile Ala Val TTA CTC Leu Leu AGT GGG GGG GTG Ser Gly Gly Val 159
GAT
Asp AGC TCT TAT AGC Ser Ser Tyr Ser TAT AGC TTA AAA GAG CAA GGG CAT GAA TTA Tyr Ser Leu Lys Glu Gln Gly His Glu Leu GTG GGG ATT TAT Val Gly Ile Tyr AAA CTC CAT GCG Lys Leu His Ala GAA AAA AAG CAT Glu Lys Lys His GAT TTA Asp Leu TAC ATC AAA Tyr Ile Lys GAG GTG TTG Glu Val Leu
AAC
Asn GCT CAA AAA GCA Ala Gln Lys Ala GAG TTT TTA GGC Glu Phe Leu Gly ATT CCT TTA Ile Pro Leu GAT TTT CAA AAG Asp Phe Gln Lys
GAT
Asp 85 TTT AAA AGC GCG GTT TAT GAT GAA Phe Lys Ser Ala Val Tyr Asp Glu TTT ATC Phe Ile AAC GCC TAT GAA Asn Ala Tyr Glu
GAA
Glu 100 GGG CAA ACC Gly Gln Thr CCA AAC Pro Asn 105 CCT TGT GCG TTG Pro Cys Ala Leu TGC AAC CCT TTA ATG AAG TTT GGG CTA GCT TTG GAT CAC GCT TTA AAA 447 Asn Pro Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu 115 120 Cys 110 TTA GGG TGT GAA Leu Gly Cys Glu ATC GCT ACC GGG Ile Ala Thr Gly
CAT
His 135 TAT GCG AGA GTC AAA GAA Tyr Ala Arg Val Lys Glu 140 ATT GAC AAA ATA AGT TAT ATT CAA Ile Asp Lys Ile Ser Tyr Ile Gln 145 GCT TTG GAT AAA Ala Leu Asp Lys ACT AAA GAT Thr Lys Asp 155 GCT AAA TTG Ala Lys Leu 543 CAG AGC TAT Gln Ser Tyr 160 TTT TTA TAC GCT Phe Leu Tyr Ala
TTA
Leu 165 GAG CAT GAA GTG Glu His Glu Val GTG TTC Val Phe 175 CCT TTA GGG GAT Pro Leu Gly Asp CTA AAA AAG GAT Leu Lys Lys Asp AAG CCT TTA GCC Lys Pro Leu Ala 639 687 AAT GCG ATG CCT Asn Ala Met Pro TTA GGC ACT TTA Leu Gly Thr Leu ACT TAT AAG GAA Thr Tyr Lys Glu
TCT
Ser 205 CAA GAA ATC TGC Gln Glu Ile Cys
TTT
Phe 210 GTG GAA AAA AGC TAC ATT GAC ACT TTA Val Glu Lys Ser Tyr Ile Asp Thr Leu 215 AAA AAG Lys Lys 220 CAT GTT GAA His Val Glu GTC ATT GGC Val Ile Gly 240 GAA AAA GAG GGC Glu Lys Glu Gly
GTG
Val 230 GTG AAA AAC CTA Val Lys Asn Leu CAA GGC GAA Gln Gly Glu 235 GGC AAA CGC Gly Lys Arg ACG CAT AAA GGC Thr His Lys Gly
TAT
Tyr 245 ATG CAA TAC ACG Met Gln Tyr Thr
ATT
Ile 250 AAA GGC Lys Gly 255 TTT AGT ATT AAA Phe Ser Ile Lys GCG TTA GAG CCG Ala Leu Glu Pro
CAT
His 265 TTT GTG GTG GGG Phe Val Val Gly
ATT
Ile 270 GAC GCT AAA AAG Asp Ala Lys Lys
AAC
Asn 275 GAG CTA GTC GTG Glu Leu Val Val AAA AAA GAA GAT Lys Lys Glu Asp 879 927 975 1023 GCC ACG CAT TCG Ala Thr His Ser AAG GCT AAA AAC Lys Ala Lys Asn
AAA
Lys 295 TCT TTA ATG AAA Ser Leu Met Lys GAT TTT Asp Phe 300 AAA GAT GGC Lys Asp Gly AAA GCG CAT Lys Ala His 320 TAT TTT ATC AAG GCT CGT TAC AGG AGC Tyr Phe Ile Lys Ala Arg Tyr Arg Ser 310 GTG CCT GCT Val Pro Ala 315 GGG TTT AAA Gly Phe Lys GTG AGT TTG AAA Val Ser Leu Lys
GAT
Asp 325 GAG GTG ATT GAA Glu Val Ile Glu 1071 WO 98/21225 PCT/US97/21353 -144- GAG CCT TTT TAT GGC GTG GCT AAA GGG CAA GCT TTG GTC GTT TAT AAA 1119 Glu Pro Phe Tyr Gly Val Ala Lys Gly Gin Ala Leu Val Val Tyr Lys 335 340 345 GAT GAC ATC TTG CTT GGT GGG GGC GTG ATT GTT TAAAAACTAA AGAACTAAGA 1172 Asp Asp Ile Leu Leu Gly Gly Gly Val Ile Val 350 355 360 GATACGCCTT TTGGCAGTCT CTTAATGTTT TATTGAATAG GCGTT 1217 INFORMATION FOR SEQ ID NO:48: SEQUENCE CHARACTERISTICS: LENGTH: 360 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His Asn Glu Arg 1 5 10 Leu Lys Met Lys Ile Ala Val Leu Leu Ser Gly Gly Val Asp Ser Ser 25 Tyr Ser Ala Tyr Ser Leu Lys Glu Gin Gly His Glu Leu Val Gly Ile 40 Tyr Leu Lys Leu His Ala Ser Glu Lys Lys His Asp Leu Tyr Ile Lys 55 Asn Ala Gin Lys Ala Cys Glu Phe Leu Gly Ile Pro Leu Glu Val Leu 70 75 Asp Phe Gin Lys Asp Phe Lys Ser Ala Val Tyr Asp Glu Phe Ile Asn 90 Ala Tyr Glu Glu Gly Gin Thr Pro Asn Pro Cys Ala Leu Cys Asn Pro 100 105 110 Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys Leu Gly Cys 115 120 125 Glu Lys Ile Ala Thr Gly His Tyr Ala Arg Val Lys Glu Ile Asp Lys 130 135 140 Ile Ser Tyr Ile Gin Glu Ala Leu Asp Lys Thr Lys Asp Gin Ser Tyr 145 150 155 160 Phe Leu Tyr Ala Leu Glu His Glu Val Ile Ala Lys Leu Val Phe Pro 165 170 175 Leu Gly Asp Leu Leu Lys Lys Asp Ile Lys Pro Leu Ala Leu Asn Ala 180 185 190 Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser Gin Glu Ile 195 200 205 Cys Phe Val Glu Lys Ser Tyr Ile Asp Thr Leu Lys Lys His Val Glu 210 215 220 Val Glu Lys Glu Gly Val Val Lys Asn Leu Gin Gly Glu Val Ile Gly 225 230 235 240 Thr His Lys Gly Tyr Met Gin Tyr Thr Ile Gly Lys Arg Lys Gly Phe 245 250 255 WO 98/21225 PCT/US97/21353 -145- Ser Ile Lys Lys Lys Asn 275 Ser Leu Lys Gly 260 Glu Ala Leu Glu Pro Leu Val Val Gly 280 Ser His Phe Val 265 Lys Lys Glu Leu Met Lys Val Gly Ile Asp Ala 270 
Asp Leu Ala Thr His 285 Asp Phe Ala Lys Asn 290 Glu Tyr Phe Ile Lys Ala 310 Glu Tyr Arg Ser Val 315 Gly Ala Lys Asp Gly Lys Ala His 320 Glu Pro Phe 335 Asp Asp Ile Ser Leu Lys Asp 325 Lys Val Ile Glu Val 330 Val Phe Lys Tyr Gly Val Leu Leu Gly 355 Ala 340 Gly Gly Gin Ala Leu Val Tyr Lys Gly Val Ile Val INFORMATION FOR SEQ ID NO:49: SEQUENCE CHARACTERISTICS: LENGTH: 975 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 191...793 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
ACATTACACA
TTGTCTATTC
GGGTTTTAGT
GTTTCAAATT
TATCTGTCGC TAAAACGCGC CGCTTCACTA GCATGCGTTT ATTTTACCCT ATTCTTTAAG TTTAGCATGT TAGCATTCAG CCACCACTCT AACCCACTGA TTGTAAAAAT TTTTTATCCA TAACTTATAA TTTTAAGGAA TTTGTTTGAA 120 180 229 ATG AGT TTG TTA GCC ACT CTT TTA TTA GCC TCT TGC TTG Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu CCC CCC Pro Pro AAA GGC CAT CAT Lys Gly His His GGT TTG GTG AAT CTT TAT ATC GCT CAT Gly Leu Val Asn Leu Tyr Ile Ala His
CAA
Gln GGC CAA AGC GTG Gly Gln Ser Val
CGC
Arg 35 ACT TAT TGG CGC Thr Tyr Trp Arg
AAA
Lys GTG GAT AGA GGA GTT Val Asp Arg Gly Val ATC GCT AAA CAC Ile Ala Lys His
AAT
Asn GAA GCG CTT AAA Glu Ala Leu Lys GAT CCT AAA GCA Asp Pro Lys Ala AAG CTC Lys Leu AAA GAC CCC Lys Asp Pro
AGG
Arg GGG CCT TTA TTC Gly Pro Leu Phe
ATG
Met 70 CTA GGG AGT GAG Leu Gly Ser Glu CGC TTC ATG Arg Phe Met CTT TTA TGG Leu Leu Trp AAA AAC CGC TAC Lys Asn Arg Tyr GCT TTA Ala Leu 85 GCC AAG CCC Ala Lys Pro
CAA
Gln TCG TTC AGG Ser Phe Arg CTA GAG Leu Glu CCT GGT TTT TAT Pro Gly Phe Tyr TTG GAT TCT TTT Leu Asp Ser Phe GTG GAA ACT CAA Val Glu Thr Gln GGC GTC TTG CAG AGC GCT CCT GGC TAT TCA TAT ACT AAA AAT Gly Val Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn TAT GAT TTC AAA Tyr Asp Phe Lys AAC CGC CCC TTT TTC CTG GCC TTT GAA Asn Arg Pro Phe Phe Leu Ala Phe Glu 135 GTC AAA Val Lys 140 CCT GAT GGC AAA ACC ATT CTT CCT Pro Asp Gly Lys Thr Ile Leu Pro GTG GAA TTA AGC Val Glu Leu Ser ACC CCT AGA Thr Pro Arg 160 145
GGC
Gly CTG ATT AAA Leu Ile Lys 155 AAT GAA AAG Asn Glu Lys 661 709 TTT TTA GGG Phe Leu Gly TTG TTT GAT Leu Phe Asp GGG ACT Gly Thr 175 AAC GCC AAG TGG Asn Ala Lys Trp
ATT
Ile 180 GAG GGG AGT TTG AAT TTA AAG CTT AAA Glu Gly Ser Leu Asn Leu Lys Leu Lys 185
AAC
Asn 190 GCT TCC TTT AAA Ala Ser Phe Lys GCG TGG GGG TTG Ala Trp Gly Leu GAA CAA Glu Gln 200 TAAAGCATGA AGTGAT
CGCTTGCTTT
AGATTTTATT
ACGATAAACA
TCGTAAGCTC TTTATGATTA GATTGTAAAA ACCCCTATTC AATTGGAACA AAGCCATTAA TAATCCGCGC TCCAAGTAAC ATAGCTTTCA
AAATGCCTTG
ATTTTTAAAA
AAAATG
AGTATTTTTT
ACTTTTAAAA
869 929 975 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 201 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu Pro 1 5 10 Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His 25 Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val 40 His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu Pro Lys Gin Gly Gin Ile Ala Lys Lys Asp Pro WO 98/21225 PCT/US97/21353 -147- Arg Lys Pro Leu Phe Gly Ser Glu Met Leu Leu Asn Arg Tyr Leu Ala Lys Pro Phe Arg Leu Glu Pro Gly Phe Tyr Tyr Leu 100 Leu Gin Ser Ala Pro 115 Lys Asn Asn Arg Pro 130 Lys Thr Ile Leu Pro 145 Gly Phe Leu Gly Val Asp Ser Phe Ser 105 Tyr Glu Thr Gin Gly Tyr Phe Phe 135 Ser Val Thr Lys Asn Lys Gly Val 110 Tyr Asp Phe Pro Asp Gly Ala Phe Glu Val Glu Leu Ser Lys Thr Pro Leu Phe Asp Glu Lys Gly Thr Asn 175 Ala Ser Ala Lys Trp Phe Lys Asp 195 Ile 180 Ala Gly Ser Leu Asn 185 Gin Leu Lys Leu Lys Trp Gly Leu INFORMATION FOR SEQ ID NO:51: SEQUENCE CHARACTERISTICS: LENGTH: 1116 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 90...1076 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: TATAAATACA TCGTTTCATT AGCGAATTTA ATGGCGTTAA GCGATCATAT TGATTTATTT TATGAATTTG TTTATTAAGG GAAAAAATC ATG TCA AAT AGC ATG TTG GAT AAA Met Ser Asn Ser Met Leu Asp Lys 1 113 AAT AAA GCG ATT CTT ACA Asn Lys Ala Ile Leu Thr GTG CTT TTT TAT TTA GCT Val Leu Phe Tyr Leu Ala 30 TTT TTG GAA GCC AGA GAA Phe Leu Glu Ala Arg Glu GGG GGT- GGG GCT TTA TTA TTA GGG CTA ATC Gly Gly Gly Ala Leu Leu Leu Gly Leu Ile 15 TAT CGC CCT AAG GCT GAA GTG TTG CAA GGG Tyr Arg Pro Lys Ala Glu Val Leu Gin Gly 35 TAC AGC GTG AGC TCC AAA GTC CCT GGC CGC Tyr Ser Val Ser Ser Lys Val Pro Gly Arg 50 161 209 257 WO 98/21225 WO 98/ 1225PCTIUS97/21353 -14 8- ATT GAA AAG Ile Glu Lys TTG GTT TTT Leu Val Phe
GTG
Val TTT GTT AAA Phe Val Lys AAA GGC Lys Gly 65 CCT GAA Pro Glu CAT CAC ATT AAA Asp His Ile Lys AAG CCC GAT Lys Gly Asp CTC CCT CAA Leu Ala Gln AGC ATT TCT AGC Ser Ile Ser Ser TTA GAA GCC Leu Glu Ala CCT CAA Ala Glu CCC CCC CAT AA Ala Gly His Lys CCT AAA CC CTT Ala Lys Ala Leu
AGC
Ser 100 CAT CAA CTC AAA Asp Glu Val Lys
ACA
Arg 105 CCC TCA ACA CAC Cly Ser Arg Asp
GAA
Glu 110 ACC ATT AAT TCT Thr Ile Asn Ser
GC
Ala 115 AGA CAC CTT TCC Arg Asp Val Trp 401 449 497 GCA CCC AAA TCC Ala Ala Lys Ser CCC ACT TTA GCC Ala Thr Leu Ala
AAA
Lys 130 CAG ACT TAT AAG Glu Thr Tyr Lys CCC CTT Arg Val 135 CAA CAT TTC TAT CAT AAT CCC CTC CC ACC TTG CAA AAC Gln Asp Leu Tyr Asp Asn Gly Val Ala Ser Leu Gln Lys 140 145 CGC CAT CAA Arg Asp Glu 150 CCG CCT TAC Ala Ala Tyr CCC TAT Ala Tyr CAA AAC Gln Lys 170 CCT TAT CAA AC Ala Tyr Glu Ser
ACT
Thr 160 AAA TAC AAC GAG Lys Tyr Asn Glu TAT AAA ATC CCT Tyr Lys Met Ala
TTA
Leu 175 GGG CCC CCG AGC Gly Gly Ala Ser
TCT
Ser 180 CAA ACT AAC ATT Glu Ser Lys Ile
CC
Ala 185 GCT AAC CCT AAA Ala Lys Ala Lys AGC CC CCT TTA Ser Ala Ala Leu CAA CTC AAT CAA Gln Val Asn Glu GAG TCT TAT TTA Glu Ser Tyr Leu
AAA
Lys 205 CAC CTC AAA CC Asp Val Lys Ala CCC CCA ATT CAT Ala Pro Ile Asp CCC CAA Gly Glu 215 CTC ACT AAC Val Ser Asn CCT GTG CTT Pro Val Val 235
CTC
Val 220 CTT TTA ACC CCT Leu Leu Ser Gly
GC
Gly 225 CAC CTT ACC CCT Glu Leu Ser Pro AAC GT TTT Lys Gly Phe 230 AAA ATC AC Lys Ile Ser TTA ATG ATA CAT Leu Met Ile Asp
TTA
Leu 240 AAC CAT ACT TCC Lys Asp Ser Trp CTG CCT Val Pro 250 CAA AAC TAT TTG Glu Lys Tyr Leu GAG TTT AAA GTC Glu Phe Lys Val
GCT
Gly 260 AAC CAA TTT CAA Lys Glu Phe Glu
GC
Gly 265 TAT ATC CCC CC Tyr Ile Pro Ala
TTC
Leu 270 AAA AAA ACC ACC Lys Lys Ser Thr
AAA
Lys 275 TTC ACC CTC AAA Phe Arg Val Lys WO 98/21225 WO 9821225PCT[US97/21353 -149- TTG AGC CTG ATG G GAT TTT GC ACT TGG AAA GCG Leu ser Val Met Gly Asp Phe Ala Thr Trp Lys Ala 285 290 AAC ACT TAC GAC ATG AAA AGC TAT GAA CTG GAA CC Asn Thr Tyr Asp Met Lys Ser Tyr Glu Val Clu Ala 300 305 GAG TTG GAA AAT TTT AGG GTA COG ATG AGC GTG TTA Giu Leu Glu Asn Phe Arg Val Cly Met Ser Val Leu 315 320 CCT TAAAAAGGAT TGTTTTGTTC AGATTGATAA GCGCATGGGT Pro ACG AAT AAT TCC Thr Asn Asn Ser 295 ATA CCC TTA GAA Ile Pro Leu Gin 310 OTT ACC ATT AAA Val Thr Ile Lys 325 977 1025 1073 1116 INFORMATION FOR SEQ ID NO:52: Wi SEQUENCE CHARACTERISTICS: LENGTH: 329 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: Met Ser Asn Ser Met Leu Asp Lys Asn Lys 1 Gly Pro Val Gly Giu Lys Asn Ala Ala 145 Lys Gly Ala Lys Ser Asp Leu Al a Ser Lys 130 Ser Tyr Al a Leu Leu Ala Gin Ser Lys His Ile Gin Ala Leu Ser 100 Ala Arg 115- Glu Thr Leu Gin Asn Gin Ser Ser Len Val Val1 Lys Lys 85p Asp Tyr Lys Ser 165 Gin Len Gin Gly 55 Gly Al a Val1 Trp Arg 135 Asp Al a Lys Ile Gly 40 Arg Asp Gin Lys Gin 120 Val1 Gin Tyr Ile Val1 25 Phe Ile Leu Ala Arg 105 Al a Gin Al a Gin Ala Ile Tyr Ala Val1 Ser Gly Arg Ser Tyr 140 Ala Ly s Al a Len Len Arg Phe Ile His Asp Gin 125 Asp Tyr Met Lys Thr Al a 30 Gin Val1 Ser Lys Glu 110 Ala Asn Glu Ala Gin Gly Tyr Lys Ser Al a Thr Thr Gly Ser Len 175 Ser 180 185 190 Ala Leu Gly Gin Val Asn Clu Val Glu Ser Tyr Leu Lys Asp Val Lys WO 98/21225 PCTIUS97/21353 -150- 200 Glu Ala Thr 210 Gly Glu Pro Ile Asp Val Ser Asn Val 220 Leu Ser Gly Leu Ser Pro Phe Pro Val Val 235 Glu Leu Met Ile Asp Asp Ser Trp Ile Ser Val Lys Tyr Leu Asn Glu 255 Phe Lys Val Ser Thr Lys 275 Thr Trp Lys Gly 260 Phe Glu Phe Glu Gly 265 Leu Ile Pro Ala Arg Val Lys Tyr 280 Ser Ser Val Met Gly 285 Met Leu Lys Lys 270 Asp Phe Ala Lys Ser Tyr Ala Thr Asn Asn Thr Tyr Asp 290 Glu Val Glu Ala Ile Glu Glu Leu Glu Asn Phe Arg 
Val 305 Met Ser Val Leu Val Ile Lys Pro
INFORMATION FOR SEQ ID NO:53:
SEQUENCE CHARACTERISTICS:
LENGTH: 1514 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ix) FEATURE:
NAME/KEY: Coding Sequence
LOCATION: 94...1467
OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
AAATAAAATA GCGATCATTA TAACATGTTG GTATAGTGGC TTAAAAATTT TAGGATATTG CTTTTTAAGT GAAAGCGTTA AGTTGTTAGG AGA ATG CTT GAA ACT TCT AGC CAT
Met Leu Glu Thr Ser Ser His 114
TTT TTA AAA TCG TTT CGC TTG AAG CGT TAT ATA GGG TTT TTA TTG ATT
Phe Leu Lys Ser Phe Arg Leu Lys Arg Tyr Ile Gly Phe Leu Leu Ile 15
TCT TTA GCG TTA TTA ATC ACG CCC TTT GTT CGC ATT GAT GGG GCG CAT
Ser Leu Ala Leu Leu Ile Thr Pro Phe Val Arg Ile Asp Gly Ala His 30
TTG TTT TTG ATC TCT TTT GAG CAT AAG CAA CTG CAT TTT TTA GGC AAG
Leu Phe Leu Ile Ser Phe Glu His Lys Gln Leu His Phe Leu Gly Lys 45 50
ATC TTT AGC GCT GAA GAA TTG CAA GTC ATG CCT TTT ATG GTT ATT TTG
Ile Phe Ser Ala Glu Glu Leu Gln Val Met Pro Phe Met Val Ile Leu
CTT TTT ATA Leu Phe Ile TGC GGT TGG Cys Gly Trp
GGG
Gly ATT TTT TTC ATC ACC ACT AGC CTT GGG CGT GTG TGG Ile Phe Phe Ile Thr Thr Ser Leu Gly Arg Val Trp GCT TGC CCG CAA Ala Cys Pro Gln
ACC
Thr TTT TTA AGG GTG Phe Leu Arg Val TAT AGA GAT Tyr Arg Asp GTG ATT Val Ile 105 GAA ACC AAG ATT Glu Thr Lys Ile AAA CTC CAT AAA Lys Leu His Lys
AAG
Lys 115 ATC AGC AAC AAG Ile Ser Asn Lys
CAA
Gln 120 GAA AGC CCT AAA AAC ACC CCA AGC TAC Glu Ser Pro Lys Asn Thr Pro Ser Tyr
AAG
Lys 130 ATC CGT AAA GTA TTG Ile Arg Lys Val Leu AGC GTT TTA TTG TTC GCT CCT GTT GTG Ser Val Leu Leu Phe Ala Pro Val Val 140 GGG CTA ATG ATG Gly Leu Met Met TTG TTT Leu Phe 150 TTC TTT TAT Phe Phe Tyr CCT AGC GAT Pro Ser Asp 170
TTC
Phe 155 ATC GCC CCA GAA Ile Ala Pro Glu TTT TTT ATG TAT Phe Phe Met Tyr CTT AAA AAC Leu Lys Asn 165 AGC ACG GCT Ser Thr Ala CAC CCT ATT GCT His Pro Ile Ala GGT TTT TGG CTT Gly Phe Trp Leu
TTT
Phe 180 GTG GTG Val Val 185 CTA TTT GAT ATA Leu Phe Asp Ile GTG GTT GCG GAG Val Val Ala Glu
CGT
Arg 195 TTT TGC ATT TAT Phe Cys Ile Tyr TGC CCT TAC GCT Cys Pro Tyr Ala
AGG
Arg 205 GTG CAA TCG GTG Val Gln Ser Val
TTG
Leu 210 TAT GAC AAT GAC Tyr Asp Asn Asp
ACC
Thr 215 TTA AAC CCT ATT Leu Asn Pro Ile GAT GAA AAG CGC Asp Glu Lys Arg GGA GCG CTT TAT Gly Ala Leu Tyr AAT AAT Asn Asn 230 CAG GGC CAT Gln Gly His GAA TGC GTG Glu Cys Val 250 TTC CCC TTA CCT Phe Pro Leu Pro AAA AAA CGC AGC Lys Lys Arg Ser CCA GAA AAC Pro Glu Asn 245 ACG CAT ATT Thr His Ile AAT TGT TTG CAT Asn Cys Leu His GTG CAG GTT TGC Val Gln Val Cys GAC ATC Asp Ile 265 AGG AAG GGC TTG Arg Lys Gly Leu
CAA
Gln 270 TTA GAA TGC ATC Leu Glu Cys Ile
AAT
Asn 275 TGT TTA GAA TGC Cys Leu Glu Cys GTG GAT GCA TGC ACG ATT ACC ATG GCT AAA TTT AAC CGC CCT TCA CTC Val 280 Asp Ala Cys Thr Ile Thr Met Ala Lys Phe Asn Arg Pro Ser Leu 290 ATC CAA TGG TCT Ile Gln Trp Ser ACT AAC GCT ATT Thr Asn Ala Ile ACG CGC CAA AAA Thr Arg Gln Lys GTG CAC Val His 310 1026 CTG GTG CGT Leu Val Arg ATC GCT CTT Ile Ala Leu 330
TTA
Leu 315 AAA ACG ATC GCT Lys Thr Ile Ala
TAC
Tyr 320 ATG GGG GTT ATC Met Gly Val Ile GCT ATT GTG Ala Ile Val 325 ATG CTC TTA Met Leu Leu 1074 1122 TTA GCC ATC ACT Leu Ala Ile Thr TTT AAA AAA GAA Phe Lys Lys Glu
CGC
Arg 340 GAC ATT Asp Ile 345 AAC CGC AAC AGC Asn Arg Asn Ser CTG TAT GAA TTG Leu Tyr Glu Leu TCT AGC GGG TAT Ser Ser Gly Tyr 1170 1218
GTG
Val 360 GAT AAC GAT TAC Asp Asn Asp Tyr TTT TTA TTC CAC Phe Leu Phe His
AAC
Asn 370 ACG GAC AAT AAA Thr Asp Asn Lys CAT GAG TTT TAT His Glu Phe Tyr AAA GTT TTA GGG CAA AAA GAC ATT CAG Lys Val Leu Gly Gin Lys Asp Ile Gin 385 ATC AAA Ile Lys 390 1266 AAG CCT TTA Lys Pro Leu GTA GTG ATT Val Val Ile 410
AAT
Asn 395 CCT ATC GCC ATT Pro Ile Ala Ile
AAA
Lys 400 GCC GGG CAA AAG Ala Gly Gln Lys ATT AAA GCG Ile Lys Ala 405 GAA TAC AAG Glu Tyr Lys 1314 1362 TTA AGA AAA CCC Leu Arg Lys Pro AAG AGT AAC GCC Lys Ser Asn Ala
ACA
Thr 420 AAC GCT Asn Ala 425 AAA GAC GCT CTA Lys Asp Ala Leu CCC ATT ACC ATA Pro Ile Thr Ile GCT TAT AGC GCG Ala Tyr Ser Ala 1410
GAC
Asp 440 GAT AAG AAT ATT ACG ATA GAA AGG GAA Asp Lys Asn Ile Thr Ile Glu Arg Glu 445
TCG
Ser 450 GTG TTT ATT GCA Val Phe Ile Ala
CCA
Pro 455 1458 AGT GAG GAT Ser Glu Asp TGAAGCCTAA AACTAGCGTT CAATCACTTC ATAAGGCAAG CCTTGTT 1514 INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS: LENGTH: 458 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear WO 98/21225 WO 981225PCTfUS97/21353 -153- (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: Met Leu Giu Thr Ser Ser His Phe Leu Lys 1 Tyr Val1 Gin Met Thr Levi His Tyr Ala 145 Phe Phe Al a Val1 Gly 225 Lys Gin Cys Lys Asn 305 Met Lys Givi His Ile Arg Levi Pro Ser Arg Lys Lys 130 Gly Phe Trp Giu Leu 210 Gly Lys Val1 Ile Phe 290 Th r Gly Lys Levi As n Gly Ile His Phe Levi Val1 Lys 115 Ile Levi Met Leu Arg 195 Tyr Ala Arg Cys Asn 275 Asn Arg Val1 Giu Arg 355 Thr Phe Asp Phe Met Gly Levi 100 Ile Arg Met Tyr Phe 180 Phe Asp Levi Ser Pro 260 Cys Arg Gin Ile Arg 340 Ser Asp 5 Leu Gly Levi Val1 Arg Tyr Ser Lys Met Leu 165 Ser Cys Asn Tyr Pro 245 Thr Leu Pro Lys Ala 325 Met Ser As n Levi Ala Gly Ile 70 Val1 Arg Asn Val1 Leu 150 Lys Thr Ile Asp Asn 230 Givi His Giu Ser Val1 310 Ile Leu Gly Lys Ile 390 Ile His Lys 55 Leu Trp Asp Lys Leu 135 Phe Asn Al a Tyr Th r 215 Asn Asn Ile Cys Leu 295 His Val1 Leu Tyr Asp 375 Ser Levi 40 Ile Levi Cys Val1 Gin 120 Ser Phe Pro Val Levi 200 Levi Gin Giu Asp Val1 280 Ile Levi Ile Asp Val1 360 His Leu 25 Phe Phe Phe Gly Ile 105 Givi Val Phe Ser Val1 185 Cys Asn Gly Cys Ile 265 Asp Gin Val1 Ala Ile 345 Asp Glu 10 Ala Lei Ser Ile Trp 90 Givi Ser Levi Tyr Asp 170 Levi Pro Pro His Val 250 Arg Ala Trp Arg Levi 330 Asn As n Phe Ser Leu Ile Ala Gly 75 Al a Thr Pro Levi Phe 155 His Phe Tyr Ile Levi 235 Asn Lys Cys Ser Levi 315 Levi Arg Asp Tyr Asn 395 Phe Levi Ser Glu Ile Cys Lys Ly s Phe 140 Ile Pro Asp Al a Tyr 220 Phe Cys Gly Thr Ser 300 Lys Al a Asn Tyr Phe 380 Arg Ile Phe Givi Phe Pro Ile As n 125 Al a Al a Ile Ile Arg 205 Asp Pro Leu Leu Ile 285 Thr Thr Ile Ser Val 365 Lys Leu Thr Giu Levi Phe Gin Ph e 110 Thr Pro Pro Ala Val1 190 Val1 Givi Levi His Gin 270 Thr Asn Ilie rhr 
A.sp 350 Phe Val1 Lys Pro Hs Gin Ile Thr Lys Pro Val1 Givi Met 175 Val1 Gin Lys Pro Cys 255 Leu Met Ala Al a Ser 335 Levi Leu Leu Arg Phe Lys Val1 Thr Phe Levi Ser Val Asp 160 Gly Val1 Ser Arg Pro 240 Vai Glvi Ala Ile Tyr 320 Phe Tyr Phe Gly 370 Gin 385 Lys Asp Ile Gin Lys Lys Pro Leu Pro Ile Ala Ile WO 98/21225 PCT/US97/21353 -154- Ala Gly Gin Lys Ile Lys Ala Val Val Ile Leu Arg 405 410 Ser Asn Ala Thr Glu Tyr Lys Asn Ala Lys Asp Ala 420 425 Thr Ile Gin Ala Tyr Ser Ala Asp Asp Lys Asn Ile 435 440 Glu Ser Val Phe Ile Ala Pro Ser Glu Asp 450 455 Lys Pro Leu Lys 415 Leu Ile Pro Ile 430 Thr Ile Glu Arg 445 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 990 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 228...782 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID ACGATTTGAT CAATAACGAA AATAAAATTG ATGAAATCAA TAATGAAGAA AACGCTGATC
CTTCGCAAAA
ATTCCCCACT
AAGTAACCCC
AAGAACGAAC AACGTTTTGC CAACAGGAAG TATTAAAGTG TTTATTTTTA AGCGTTTATT
AACGAGCCAC
TGAAACTTTT
TTTTAAACCC
TAACCACCAA GACAATCTCA TTCAAAGGAT TTATTTAAAA CACCATT ATG CAA GCC Met Gln Ala 120 180 236 AAA AGC Lys Ser CGT TTT TAT GTG Arg Phe Tyr Val TCT CAA TAC CAG GTG GGG AAA ATG ATC Ser Gln Tyr Gln Val Gly Lys Met Ile
ATG
Met AAA AAA TAC AAC GAT CTC AAA CGC ACG Lys Lys Tyr Asn Asp Leu Lys Arg Thr 25
ATT
Ile 30 GAA GGG GCG AGC Glu Gly Ala Ser
TTT
Phe TCT TTA GGC TGG Ser Leu Gly Trp
GAG
Glu ATT AAC CCC ACT Ile Asn Pro Thr
AAC
Asn 45 TAC TGG TTT TAT Tyr Trp Phe Tyr TCG CGC Ser Arg TAT TAC TTT Tyr Tyr Phe GGC GCT CAA Gly Ala Gln ATG GAT TAC GGG Met Asp Tyr Gly
AAT
Asn GTC ATT CTC AAT Val Ile Leu Asn AAA AGA ACG Lys Arg Thr GAT TTG ATT Asp Leu Ile GCG AAC ATG TTC Ala Asn Met Phe TAT GGC TTT GGG Tyr Gly Phe Gly
GGG
Gly GTG GAA TAC AAT AAA AAC CCC TTG TAT GTA TTT TCT CTT TTT TAT GGC Val Glu Tyr Asn Lys
ATG
Met 100 CAA GTT GCT GAA Gln Val Ala Glu Asn Pro Leu AAC ACA TGG Asn Thr Trp 105 TGG CGC AGC Trp Arg Ser Tyr Val Phe Ser Leu Phe Tyr Gly ACG ATT TCC AAA CAC AGC GCG AAT Thr Ile Ser Lys His Ser Ala Asn
ATT
Ile TTC ATC ATT GAC Phe Ile Ile Asp GGG TTT TCG CTC Gly Phe Ser Leu AAA ACT Lys Thr 130 TCC AAT TTT Ser Asn Phe CTA TTC CAC Leu Phe His 150
AGG
Arg 135 ATG TTG GGT TTA Met Leu Gly Leu
GTG
Val 140 GGG TTT AAA TTC Gly Phe Lys Phe CAA ACC GTG Gln Thr Val 145 TGG CCT TTT Trp Pro Phe 668 CAT GAC GCA AGT ATT GAA GTG GGG ATC His Asp Ala Ser Ile Glu Val Gly Ile
AAA
Lys 160 GCT TTT Ala Phe 165 GAA TAC GAC TCA Glu Tyr Asp Ser
GCC
Ala 170 TTT GTA AGG CTT Phe Val Arg Leu TCT GTC TTT ATT Ser Val Phe Ile
TCG
Ser 180 CAC ACT TTC TAC His Thr Phe Tyr
CTT
Leu 185 TAAACTAATT CCAACCCTAC CGGGCAATGA TCGCTCCC
TAAAATATCT
AAAATAATCA
GGTGTAAGCC
TTATAGATTA AAGCGTCTTT TAAGCGCGTT ATGCGCCAAC CAATGTTTTT ATCCCTTGCT TTTTCTTTGT TAGGGTAAAA ATAACGGAAA
TTTAAAGGGT
TGTTGCATGT
GTGTCAATAA
TAGAGCATAA
AACTCCACCA
INFORMATION FOR SEQ ID NO:56:
SEQUENCE CHARACTERISTICS:
LENGTH: 185 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Gln Ala Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr 1 5 10 Lys Met Ile Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr 25 Ala Ser Phe Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn 40 Tyr Ser Arg Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val 55 Lys Arg Thr Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly 70 Gln Val Gly Ile Glu Gly Tyr Trp Phe Ile Leu Asn Phe Gly Asp Leu Ile Val Glu Tyr Asn Lys Asn Pro 90 Leu Tyr Val Phe Ser Phe Tyr Gly Met Gln Val Ala Glu Asn Thr Trp Thr 100 105 Ser Ala Asn Phe Ile Ile Asp Asp Trp Arg Ser Ile 115 120 Leu Lys Thr Ser Asn Phe Arg Met Leu Gly Leu Val 130 135 140 Gln Thr Val Leu Phe His His Asp Ala Ser Ile Glu Ile Gln 125 Gly Ser Lys His 110 Gly Phe Ser Phe Lys Phe Val Gly Ile 145 Trp Pro Phe Ala Tyr Asp Ser 155 Ala Phe 170 Val Arg Leu Val Phe Ile Thr Phe Tyr
INFORMATION FOR SEQ ID NO:57:
SEQUENCE CHARACTERISTICS:
LENGTH: 1161 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ix) FEATURE:
NAME/KEY: Coding Sequence
LOCATION: 109...1113
OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
ATCTTACCTT TATCTTTTAA GATTTTATGA AAAATAGTTT CATTTTTACT ATTGTTATTT TCTTAGTAAT GTTATAATCG CTTTATAAAT CATACAAAAA GGATCGCT ATG TTA GTT
Met Leu Val 1
ACT CGC TTT AAA AAA GCT TTC ATT TCT TAT TCT TTA GGC GTG CTT GTC
Thr Arg Phe Lys Lys Ala Phe Ile Ser Tyr Ser Leu Gly Val Leu Val 117
GCT
Ala TCA TTA TGG TTG Ser Leu Trp Leu
AAC
Asn 25 GTG TGC AAC GCT Val Cys Asn Ala TCA GCG Ser Ala 30 CAA GAA GTC AAA Gln Glu Val Lys GTC AAG GAT TAT TTC GGG GAG CAA ACC ATC AAG CTT CCT GTT Val Lys Asp Tyr Phe Gly Glu Gln Thr Ile Lys Leu Pro Val TCT AAA Ser Lys ATA GCC TAT ATA GGG AGC TAT GTA GAA GTG CCT GCC ATG Ile Ala Tyr Ile Gly Ser Tyr Val Glu Val Pro Ala Met CTT AAT GTT Leu Asn Val GAC GAT ATT Asp Asp Ile 309 TGG AAT AGG Trp Asn Arg GTT GTA GGC GTT TCG GAT TAC GCT TTT Val Val Gly Val Ser Asp Tyr Ala Phe
AAA
Lys GTC AAA Val Lys GCC ACT CTC AAA GGC GAA GAT CTT AAA Ala Thr Leu Lys Gly Glu Asp Leu Lys GTC AAA CAC ATG Val Lys His Met 405
AGC
Ser 100 ACT GAT CAT ACA Thr Asp His Thr
GCC
Ala 105 GCG CTA AAT GTA Ala Leu Asn Val
GAG
Glu 110 CTT TTA AAA AAG Leu Leu Lys Lys
CTT
Leu 115 AGC CCT GAT CTT Ser Pro Asp Leu
GTG
Val 120 GTA ACC TTT GTG Val Thr Phe Val AAC CCT AAA GCG Asn Pro Lys Ala GTA GAG Val Glu 130 CAT GCG AAA His Ala Lys ATT GCA GAG Ile Ala Glu 150 TTT GGT ATA TCA TTT CTT TCT TTT CAA GAG ACA ACG Phe Gly Ile Ser Phe Leu Ser Phe Gln Glu Thr Thr 140 14 GCC ATG CAG GCC ATG CAA GCT CAA GCC Ala Met Gln Ala Met Gln Ala Gln Ala 155 GTT TTA GAG Val Leu Glu ATT GAC Ile Asp 165 GCT TCC AAA AAA Ala Ser Lys Lys
TTC
Phe 170 GCC AAA ATG CAA Ala Lys Met Gln ACT TTG GAT TTT Thr Leu Asp Phe
ATT
Ile 180 GCT GAG CGT TTG Ala Glu Arg Leu
AAA
Lys 185 AAT GTC AAA AAG AAA AAG GGG GTG GAG Asn Val Lys Lys Lys Lys Gly Val Glu 190 693 TTC CAT AAA GCC Phe His Lys Ala AAA ATC AGC GGC Lys Ile Ser Gly CAA GCC ATT AGC Gln Ala Ile Ser TCA GAC Ser Asp 210 ATT TTA GAA Ile Leu Glu TTT GGG CGT Phe Gly Arg 230 GGG GGC ATA GAC Gly Gly Ile Asp TTT GGC TTG AAA Phe Gly Leu Lys TAT GTC AAA Tyr Val Lys 225 GAA AAC CCT Glu Asn Pro GCT GAC ATT AGC Ala Asp Ile Ser
GTG
Val 235 GAA AAA ATC GTT Glu Lys Ile Val
AAA
Lys 240 GAG ATT Glu Ile 245 ATC TTT ATT TGG Ile Phe Ile Trp
TGG
Trp 250 ATA AGC CCA Ile Ser Pro CTC ACG Leu Thr 255 GCC ATT Ala Ile 270 CCT GAA GAT GTG Pro Glu Asp Val AAA AAC AAG CAG Lys Asn Lys Gln
TTA
Leu 260 AAC AAC CCC AAA Asn Asn Pro Lys
TTT
Phe 265 GCT ACC ATC AAA Ala Thr Ile Lys 885 933 981 1029 GTT TAT AAA CTC Val Tyr Lys Leu
CCC
Pro 280
ATC
Ile ACA ATG GAT ATT Thr Met Asp Ile GGG CCT AGA GCC Gly Pro Arg Ala CCA CTC Pro Leu 290 AAG GGC Lys Gly ATA AGT CTT Ile Ser Leu GCT CTA AAA Ala Leu Lys CCT GAA GCC Pro Glu Ala
TTT
Phe 305
GTG GAT ATT AAT GCG ATG GTT AAA GAC TAC TAT AAA GTG GTT TTT GAT 1077
Val Asp Ile Asn Ala Met Val Lys Asp Tyr Tyr Lys Val Val Phe Asp 310 315 320
TTG AAT GAT GCA GAG GTT GAG CCC TTT TTA TGG CAT TAATTTTTAA AAAGGG 1129
Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His 325 330 335
GTTGATGTTT TTAGCCTTTC GTGTATCGCG CT 1161
INFORMATION FOR SEQ ID NO:58:
SEQUENCE CHARACTERISTICS:
LENGTH: 335 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
Met Leu Val Thr Arg Phe Lys Lys Ala 1 Val Glu Val Leu Asp Lys
Lys Ala Glu 145 Val Leu Val Ser Tyr 225 Glu Leu Val Ser Asn Asp His Lys Val 130 Thr Leu Asp Glu Ser 210 Val Asn Val Lys Lys Val Ile Met Leu 115 Glu Thr Glu Phe Leu 195 Asp Lys Pro Ala Val Ile Trp Val Ser 100 Ser His Ile Ile Ile 180 Phe Ile Phe Glu 5 Ser Lys Ala Asn Lys Thr Pro Ala Ala Asp 165 Ala His Leu Gly Ile 245 Asn 25 Gly Ser Gly Lys Ala 105 Val Gly Gin Lys Lys 185 Lys Gly Ile Trp Phe 10 Val Glu Tyr Val Gly 90 Ala Thr Ile Ala Phe 170 Asn Ile Ile Ser Trp 250 Tyr Ala Ile Val Tyr Leu Val Gly 125 Leu Ala Met Lys His 205 Phe Lys Pro Ser Ser Lys Pro Ala Lys Glu 110 Asn Ser Gin Gin Lys 190 Gin Gly Ile Leu Leu Ala Leu Ala Phe Arg Leu Pro Phe Ala Glu 175 Lys Ala Leu Val Thr 255 Gly Gin Pro Met Lys Val Leu Lys Gin Thr 160 Thr Gly Ile Lys Lys 240 Pro WO 98/21225 PCT/US97/21353 159- Glu Asp Val Asn Lys Gin 275 Ala Pro Leu Leu 260 Val Asn Asn Pro Lys Ala Thr Ile Lys Tyr Lys Leu Met Asp Ile Gly 285 His Ala Ile Lys 270 Gly Pro Arg Pro Glu Ala Ile Ser Leu 290 Phe Lys Phe 295 Asn Ile Ala Leu Lys Gly Val Asp Ala Met Val 305 Val Lys 315 Pro Tyr Tyr Lys Phe Asp Leu Ala Glu Val Glu 330 Phe Leu Trp INFORMATION FOR SEQ ID NO:59: SEQUENCE CHARACTERISTICS: LENGTH: 800 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 121...669 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59: TTATTCGCAT GCATTAGCTA TTATTGAAGC TCAAAGCATT CAAGCGCATT TATTCTTAGA TGAAATCAAA CAAAGCCAAA AAGAAAAGAA AAAATTCCCC ACTTTCAAAG GAGGTTTTTA ATG CGT TGG TGG TGT TTT TTG GTG TGT TGT TTT GGT ATT TTA AGC GTG Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly Ile Leu Ser Val 120 168 ATG GAC GCT Met Asp Ala CTT TTA GAG Leu Leu Glu
AAA
Lys AAA TTA GAG AAT Lys Leu Glu Asn AAT TTG AAA AAA GAA AGA GAG Asn Leu Lys Lys Glu Arg Glu ATT ACT GGC AAC CAA TTT GTA GCG AAC Ile Thr Gly Asn Gln Phe Val Ala Asn 40 GAC AAA ACC AAA Asp Lys Thr Lys GGT AAA GAC CGG Gly Lys Asp Arg ACC GCT Thr Ala GTT ATT CAA GGC Val Ile Gln Gly
AAT
Asn 55 GTG CAG ATC AAA Val Gln Ile Lys
AAG
Lys 312 360
TTG
Leu TTT GCG GAC AAG Phe Ala Asp Lys AGC GTG TTT TTA Ser Val Phe Leu
AAC
Asn GAT AAA CGA AAG Asp Lys Arg Lys GAG CGC TAT GAA GCC ACA GGG AAC ACG Glu Arg Tyr Glu Ala Thr Gly Asn Thr TTT AAC ATC TTT ACA GAG Phe Asn Ile Phe Thr Glu GAC AAT CGT Asp Asn Arg CTG AAT GGG Leu Asn Gly 115
GAA
Glu 100 ATC AGC G00 ACT GCT GAC AAC CTC ATT Ile Ser Gly Ser Ala Asp Lys Leu Ile 105 TAT AAC GCG Tyr Asn Ala 110 AGA GAA GTO Arg Glu Val GAA TAC AAA TTA TTG CAA AAT GC CTG Ciu Tyr Lys Leu Leu Gin Asn Ala Val 120
OTT
Val 125 COG AAA Oly Lys 130 TCC AAT CTC ATC ACC CCC CAT CAA ATC Ser Asn Val Ile Thr Cly Asp Ciu Ile TTA AAC AAA ACT Leu Asn Lys Thr
AAC
Ly s 145 COT TAT OCT OAT CTG TTO CCC AC Cly Tyr Ala Asp Val Leu Cly Ser 150 OCG AAA Ala Lys 155 GAA AAT Clu Asn 170 CGG CCC OCT AAA Arg Pro Ala Lys GTC TTT CAT ATO Val Phe Asp Met CAT ATT AAT GAA Asp Ile Asn Glu CCT AAG OCT Arg Lys Ala AAA TTC Lys Leu 175 AAC AAO AAA Lys Lys Lys
OGC
Gly 180 CAA AAA CCA Clu Lys Pro TGATTG-TCAT TAAACACGCT CATTTTCTCA CTTC TTCAACCCAA CTTTTTCAAT CCCCTGCCAG TTTGACTTCT GAAATGCTGG TTTTAGGCC CACCAATCTA GCCAAAACCT CGTTTATTAA TACCTTC INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 183 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID Met Arg Trp Trp Cys Phe Leu Val Cys 1 5 Met Asp Ala Lys Lys Leu Giu Asn Lys 25 Leu Leu Glu Ile Thr Gly Asn Gin Phe 40 Thr Ala Val Ile Oln Cly Asn Vai Gln 55 Leu Phe Ala Asp Lys Val Ser Val Phe Cys 10 As n Phe Cly Ile Leu Ser Vai Leu Lys Lys Val Ala Asn Asp Cly Glu Arg Clu Lys Thr Lys Lys Asp Arg Ile Lys Lys Leu Asn Asp Lys Arg Lys Glu Phe Arg Tyr Glu Ala Thr Cly Asn Thr Asn Ile Phe Thr Clu Asp Asn Arg Giu Ile 100 Leu Asn Gly Glu Tyr Ser Oly Ser Ala Asp Lys Leu Ile 105 Lys Leu Leu Gin Tyr Asn Ala 110 Arg Clu Val Asn Aia Val Val WO 98/21225 PCT/US97/21353 -161- 115 120 Gly Lys Ser Asn Val Ile Thr Gly Asp 130 135 Lys Gly Tyr Ala Asp Val Leu Gly Ser 145 150 Val Phe Asp Met Glu Asp Ile Asn Glu 165 Lys Lys Lys Gly Glu Lys Pro 180 125 Glu Ile Ile Leu Asn Lys Thr Ala Lys 155 Glu Asn 170 Pro Ala Lys Arg Lys Ala Lys 175 INFORMATION FOR SEQ ID NO:61: SEQUENCE CHARACTERISTICS: LENGTH: 724 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 88...618 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: GTGATGATTG AAAACATGTG AAAGAGCGTT TTTTAAGCTT TTAAATGGTG TTTGAATGCG AAAAAAAGGC TAATACTATC ATAAGGA ATG AAG TTG ATA AAA TTT GTG CGT AAT Met Lys Leu Ile Lys Phe Val Arg Asn 114
GTG
Val GTT TTG TTC ATT Val Leu Phe Ile ACG GCG ATC TTT Thr Ala Ile Phe
TTA
Leu GCG TTC ATG CTT Ala Phe Met Leu 162 210 GTG AGT TAT TGC Val Ser Tyr Cys CCC CAT TAT AGC Pro His Tyr Ser GCT GTC ATT AGC Ala Val Ile Ser GGG GTG Gly Val GAA GTC AAA Glu Val Lys GTA AAA ACC Val Lys Thr
AGA
Arg ATG AAT GAA AAT GAA AAC ACG CCC AAT AAT AAG GAA Met Asn Glu Asn Glu Asn Thr Pro Asn Asn Lys Glu 258 306 CTT GCT AGA GAT GTC TAT TTT GTG CAA ACT TAC GAC CCT Leu Ala Arg Asp Val Tyr Phe Val Gln Thr Tyr Asp Pro AAA GAT Lys Asp CAA AAA AGC GTA Gln Lys Ser Val GTT TAT CGT AAC Val Tyr Arg Asn
GAA
Glu GAC ACG CGC TTT Asp Thr Arg Phe AGC TTC CCT TTT TAT TTT AAG TTT AAT TCG GCT GAT ATT TCA GCC CTC Ser Phe Pro Phe Tyr Phe Lys Phe Asn Ser Ala Asp Ile Ser Ala Leu 402 100
GTG
Val GCT CAA AGT TTA Ala Gln Ser Leu CAG CAA GTG Gln Gln Val AAA TAC TAT Lys Tyr Tyr GGT TGG Gly Trp 120 CGG ATC AAT Arg Ile Asn TTA AAA GAG Leu Lys Glu 140 TAC GCT TTG Tyr Ala Leu
TTG
Leu 125
AGC
Ser AAC ATG TTC Asn Met Phe GTG ATT TTT Val Ile Phe TTA AAG CCC Leu Lys Pro 135 TGG ATT TTA Trp Ile Leu ACT GAC ATT Thr Asp Ile
TCA
Ser 145 CCC ATT TTT Pro Ile Phe
AGC
Ser 150 450 498 546 594 648 708 724 155 ACT TTA Thr Leu 170
TTT
Phe CTG TTA ATG Leu Leu Met AAG AGC AAA Lys Ser Lys GGC TTT Gly Phe 160 GCT CAT Ala His TTT ATC AGC GCG CGT TCT GTT TGC Phe Ile Ser Ala Arg Ser Val Cys 165 TAAAACTTTT AGGCTTTGTT GGAAAATCAC AATGGGGTTA TTGGAGCGTG TATTAAAAAG CTCAATATAG GGCAAGCTGA TGCTGTGAAA AGCGGTGTTG TTTCCT INFORMATION FOR SEQ ID NO:62: SEQUENCE CHARACTERISTICS: LENGTH: 177 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal Met 1 Ala Tyr Asn Val Val Phe Gin Phe (xi) SEQUENCE Lys Leu Ile Lys 5 Ile Phe Leu Ala Ser Ala Ala Val Glu Asn Thr Pro Tyr Phe Val Gin Tyr Arg Asn Glu Asn Ser Ala Asp 100 Val Glu Val Lys 115 Pro Asn Val Ile DESCRIPTION: SEQ ID Phe Val Arg Asn Val 10 Phe Met Leu Leu Val Ile Ser Gly Val Glu 40 Asn Asn Lys Glu Val 55 Thr Tyr Asp Pro Lys 70 Asp Thr Arg Phe Ser 90 Ile Ser Ala Leu Ala 105 Tyr Tyr Gly Trp Arg 120 Phe Leu Lys Pro Leu NO: 62: Val Leu Ser Tyr Val Lys Lys Thr Asp Gin Phe Pro Gin Ser Ile Asn Lys Glu Phe Cys Arg Leu Lys Phe Leu Leu 125 Ser Leu Pro Asn Arg Val Phe Asn Asn Asp WO 98/21225 PCT/US97/21353 -163- 130 135 140 Ser Lys Pro Ile Phe Ser Trp Ile Leu Tyr Ala Leu Leu Leu Met Gly 145 150 155 160 Phe Phe Ile Ser Ala Arg Ser Val Cys Thr Leu Phe Lys Ser Lys Ala 165 170 175 His INFORMATION FOR SEQ ID NO:63: SEQUENCE CHARACTERISTICS: LENGTH: 982 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 117...911 OTHER INFORMATION: NAME/KEY: sig_peptide LOCATION: 117...167 OTHER INFORMATION: NAME/KEY: mat_peptide LOCATION: 168...911 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: CCACATTTAA GGTAGAAACC ACTCAATTAG ATGTAAAAAT TCCAAACGGC AACCAAAAAA TGGTTAAAAA GGACACAATA AACCCCAAAA ATGAAATTTA AATATATGGG AACTTA ATG 119 AGA ATT TTT TTT GTT ATC ATG GGA CTT GTG TTT TTT GGT TGC ACC AGT Arg Ile Phe Phe Val Ile Met Gly Leu Val Phe Phe Gly Cys Thr Ser -10 AAG GTG CAT GAG ATG AAA AAA AGC CCT TGC ACC TTG TAT GAA AAC AGG Lys Val 
His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn Arg 1 5 10 167 215 TTA AAT CTC GCA GAA ATC TTT CAC AAG Leu Asn Leu Ala Glu Ile Phe His Lys 25 GAG CTT TTA AGC CAC CAA GAA AAG CAT Glu Leu Leu Ser His Gln Glu Lys His CGA GCA ATT GAT Arg Ala Ile Asp CTA TTT AGA Leu Phe Arg TTA GAA AAC AAG CTT TCT GGT Leu Glu Asn Lys Leu Ser Gly TTT TCG Phe Ser GTC AGT GAT TTG GAC ATG CAA AGC Val Ser Asp Leu Asp Met Gln Ser 55 GTG TTT Val Phe GGC TTG Gly Leu 75 CGG CTG GAA AGA Arg Leu Glu Arg CGC TTG AAA ATC Arg Leu Lys Ile
GCT
Ala TAC AAG CTC TTA Tyr Lys Leu Leu ATG AGT TTT Met Ser Phe
ATC
Ile 407 455 GCT CTT ATT TTA Ala Leu Ile Leu
C
Ala ATC CTG TTA ATC Ile Val Leu Ile ACT CTT Ser Leu 90 CTA CCC TTA Leu Pro Leu CAA AAA Gin Lays ACC GAA CAC Thr Glu His ATT ATC CAA Ile Ile Gin 115 TTC CTG GAT TTT Phe Val Asp Phe AAC CAG GAC AAG Asn Gin Asp Lys CAT TAC CTC His Tyr Val 110 GCC TTC CCT Ala Leu Ala AGA GC CAT AAA Arg Ala Asp Lys
AGC
Ser 120 ATT TCC ACT AAT Ile Ser Ser Asn
CA
Ciu 125 CCT TCC Arg Ser 130 CTC ATT CCC CC Leu Ile Cly Ala
TAT
Tyr 135 CTG TTA AAC CCA GAG AGC ATT AAC CC Val Leu Asn Arg Glu Her Ile Asn Arg
ATT
Ile 14S GAC CAT AAA TCC Asp Asp Lys Her TAT CAA TTC CTC CGC TTG CA.A ACC ACT Tyr Ciu Leu Val Arg Leu Gin Her Ser 155 AAA CTC TCG CAA Lys Val Trp Gin TTT CAA CAT TTC Phe Clu Asp Leu AAA ACC CAA AAC Lys Thr Gin Asn ACC ATT Ser Ile 175 695 743 TAT CTC CA Tyr Val Cln
AC
Her 180 CAT TTC CPA AGA CPA CTC CAT ATC CTC His Leu Ciu Arg Clu Val His Ile Val 185 PAT ATT CC Asn Ile Ala 190 ATT CCC OCT Ile Ala Ala ATC TAT CAC Ile Tyr Gin 195 CAA CAC PAT PAC Cln Asp Asn Asn
CCC
Pro 200 ATT CC AGC GTC Ile Ala Ser Val
TCC
Ser 205 791 AAA CTT Lys Leu 210 TTC PAT CPA AAC Leu Asn Clu Asn CTC CTC TAT CPA AAC CCT TAT AAA ATC Leu Val Tyr Clu Lys Arg Tyr Lys Ile 220
GTA
Val1 225 TTC ACT TAT TTC Leu Her Tyr Leu
TTT
Phe 230 CAC ACC CCC ATC Asp Thr Pro Met
PAT
As n 235 TCA ACC TTC CPA Ser Ser Leu Cmn
GCT
Ala 240 TCC AC CTC TCA CCC TTC ATA CTT Cys Lys Leu Her Cly Phe Ile Val 245 TCACATCACA TATACATGAG CTTTATCCGC TACCATTATC ACACPATCCC TAACCCAGCA CCCACCCACT A WO 98/21225 -165- INFORMATION FOR SEQ ID NO:64: SEQUENCE CHARACTERISTICS: LENGTH: 265 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: PCT/US97/21353 Met Arg Ile Phe Phe Val Ile Met Gly Leu Lys 1 Leu Glu Phe Asn Ala Thr Ile Arg -Ile 145 Lys Tyr Ile Lys Val 225 Cys His Leu Leu Val Leu Ile His Gin 115 Leu Asp Trp Gin Gin 195 Leu Ser Leu Glu Ala Ser Ser Lys Leu His 100 Arg Ile Lys Gin Ser 180 Gin Asn Tyr Ser Ser His Lys Met Lys Leu Phe Ser 120 Val Glu Asp Arg Pro 200 Leu Thr Val Pro Lys His Gin Leu Ile Leu 105 Ile Leu Leu Leu Glu 185 Ile Val Pro Val Cys Arg Leu Ser Leu Ser Asn Ser Asn Val Ile 170 Val Ala Tyr Met Phe Leu Ile Asn Phe Leu Leu Asp Asn Glu 140 Leu Thr Ile Val Lys 220 Ser INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 2059 base pairs WO 98/21225 PCT/US97/21353 -166- TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 183...1961 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID GATATTTGTT TTGTTGGGGG TTAGGTTTTT GTTTAAGAAA GTTTTTTAAA CGCTTAAAAC AGAACCTTTT GTTTTTTAGG TTTTATTTTT TACTTTGGCT AGTCATTTTG ATTTCTAAAA ATAGTCTATA ATGCTCGCAA GAGATATTTT CA ATG AAA GCT ATA AAA ATA CTT TTT ATA ATG ACA CTC AGT Met Lys Ala Ile Lys Ile Leu Phe Ile Met Thr Leu Ser
ACTAAAGAAG
TGTTTTCAAA
TTAAGGTTAT
TTA AAC Leu Asn 120 180 227 GCT ATC AGC GTG Ala Ile Ser Val AGG GCG TTG TTT GAT TTA AAA GAT TCG CAA TTA Arg Ala Leu Phe Asp Leu Lys Asp Ser Gln Leu 275 AAA GGG GAA Lys Gly Glu AGC ACT GAA Ser Thr Glu
TTA
Leu ACG CCA AAA ATA Thr Pro Lys Ile
GTG
Val 40 AAT TTT GGG GGT Asn Phe Gly Gly TAT AAA AGC Tyr Lys Ser AAT GCG GCT Asn Ala Ala 323 371 GAG TGG GGG GCT Glu Trp Gly Ala
ACG
Thr 55 GCT TTA AAC TAT Ala Leu Asn Tyr
ATC
Ile AAT GGC GAT GCG AAA AAA Asn Gly Asp Ala Lys Lys AGC ACT CTA GTG Ser Thr Leu Val AAA ATG CGT TTT Lys Met Arg Phe 419 467
AAC
Asn TCC GGT ATA TTG Ser Gly Ile Leu AAT TTA AGA GTG Asn Leu Arg Val
CAT
His 90 GCA CGT TTG AGG Ala Arg Leu Arg
CAA
Gln GCC CTA AAA TTG Ala Leu Lys Leu AAG AAT TTG AAA TAT TGC CTT AAA ATC Lys Asn Leu Lys Tyr Cys Leu Lys Ile 105 ATC GCT Ile Ala 110 AGG GAT TCT TTT TAT AGC TAC CGC Arg Asp Ser Phe Tyr Ser Tyr Arg 115 GGT ATT TAT ATC Gly Ile Tyr Ile CCC TTA GGC Pro Leu Gly 125 GCT GAT TTG Ala Asp Leu ATT TCT TTA Ile Ser Leu 130 AAA GAT CAA AAA Lys Asp Gln Lys
ACG
Thr 135 GCT CAA AAA ATG Ala Gln Lys Met
CTC
Leu 140 AGC GTG GTA GGG GCG TAT CTT AAA AAA CAA CAA GAG AAT GAA AAG GCT Ser Val Val Gly Ala Tyr Leu Lys Lys Gln Gln Glu Asn Glu Lys Ala
CAA
Gln 160 AGC CCT TAT TAC Ser Pro Tyr Tyr
AGA
Arg 165 AAC AAC AAC TAT Asn Asn Asn Tyr
TAC
Tyr 170 AAC TCT TAC TAT Asn Ser Tyr Tyr CCT TAT TAC GGA Pro Tyr Tyr Gly TAT GGT ATG TAT Tyr Gly Met Tyr ATG GGC ATG TAT Met Gly Met Tyr GGA ATG Gly Met 190 TAT GGC ATG Tyr Gly Met GGA TTC TAC Gly Phe Tyr 210 ATG TAT GAT TTT Met Tyr Asp Phe GAC TTT TAT GAT Asp Phe Tyr Asp GGC ATG TAT Gly Met Tyr 205 GAT TAC TTG Asp Tyr Leu CCT AAC ATG TTT Pro Asn Met Phe
TTC
Phe 215 ATG ATG CAA GTT Met Met Gln Val
CAA
Gln 220 ATG TTA Met Leu 225 GAA AAT TAC ATG Glu Asn Tyr Met
TAT
Tyr 230 GCG CTC GAT CAA Ala Leu Asp Gln GAG ATT TTA GAT Glu Ile Leu Asp
CAT
His 240 GAC GCT TCT ACT Asp Ala Ser Thr CAA CTT GAT ACG Gln Leu Asp Thr
CCT
Pro 250 ACT GAT GAT GAC Thr Asp Asp Asp GAC GAT AAA GAC Asp Asp Lys Asp AAA TCC TTA CAG Lys Ser Leu Gln GCA AAT CTT ATG Ala Asn Leu Met AAC TTT Asn Phe 270 TAT CGT GAT Tyr Arg Asp AGC GCT TTA Ser Ala Leu 290
CCC
Pro 275
GTC
Val AAA TTC AGC AAA Lys Phe Ser Lys ATT CAA ACC AAC Ile Gln Thr Asn CGC TTG AAT Arg Leu Asn 285 GAC AAT TCG Asp Asn Ser 1043 1091 AAT TTA GAC Asn Leu Asp
AAC
Asn 295 CGC ATG CTC Arg Met Leu
AAA
Lys 300 CTT TTC Leu Phe 305 CAC ACT AAA GCC His Thr Lys Ala
ATG
Met 310 CCC ACT AAA AGC Pro Thr Lys Ser GAT GCG ATA ACT Asp Ala Ile Thr 1139 1187
TCT
Ser 320 CAA GCC AAA GAG Gln Ala Lys Glu AAC CAT TTA GTG Asn His Leu Val
GGG
Gly 330 CAA ATC AAA GAA Gln Ile Lys Glu AAG CAA GAC GGG GCG AGT CCT AGT AAG ATT GAT TCA GTT GTC Lys Gln Asp Gly Ala Ser Pro Ser Lys Ile Asp Ser Val Val AAT AAA Asn Lys 350 1235 GCT ATG GAA Ala Met Glu
GTG
Val 355 AGG GAC AAG CTA Arg Asp Lys Leu AAT AAT CTC AAC Asn Asn Leu Asn CAA CTA GAC Gln Leu Asp 365 1283 AAT GAC TTA AAA GAT CAA AAA GGG CTT TCA AGC GAG CAA CAA GCT CAA Asn Asp Leu Lys Asp Gln Lys Gly Leu Ser Ser Glu Gln Gln Ala Gln 1331 GTG GAT Val Asp 385 AAA GCC CTA GAC Lys Ala Leu Asp
AGC
Ser 390 GTG CAA CAA TTA Val Gln Gln Leu CAT AGC AGC GAT His Ser Ser Asp 1379
GTG
Val 400 GTG GGG AAT TAT Val Gly Asn Tyr
TTA
Leu 405 GAC GGG AGT TTG AAA ATT GAT GGC GAT Asp Gly Ser Leu Lys Ile Asp Gly Asp 410
GAT
Asp 415 1427 AGA GAT GAT TTG Arg Asp Asp Leu GAT GCG ATG AAT Asp Ala Met Asn CCT ATG CAA CAA Pro Met Gln Gln CCC GTG Pro Val 430 1475 CAA CAA ACG Gln Gln Thr AAG GAT CAA Lys Asp Gln 450
CCT
Pro 435 ACT AGC AAC ATG Thr Ser Asn Met GAC ACC CAT GCA Asp Thr His Ala AAT GAC AGC Asn Asp Ser 445 GCC ACT AAC Ala Thr Asn 1523 1571 GGG AGT AAC GCG Gly Ser Asn Ala
CTC
Leu 455 ATA AAC CCT AAC Ile Asn Pro Asn GCC GAC Ala Asp 465 GAC ACT CAC ACT Asp Thr His Thr GAT ACT CAC ACT Asp Thr His Thr ACT AAC ACC ACA Thr Asn Thr Thr 1619 1667 GAT GCT AGC ACC Asp Ala Ser Thr GAC ACC CCC ACT Asp Thr Pro Thr GAT AAA GAT GCT Asp Lys Asp Ala
AGC
Ser 495 GGC TTG AAC AAT ACC GGC GAT ATG AAT Gly Leu Asn Asn Thr Gly Asp Met Asn 500 ACG GAT ACC GGC Thr Asp Thr Gly AAC ACG Asn Thr 510 1715 GAC ACC GGC Asp Thr Gly
AAT
Asn 515 ACG GAT ACC GGT Thr Asp Thr Gly
AAC
Asn 520 ACT GAT GAT ATG Thr Asp Asp Met AGC AAC ATG Ser Asn Met 525 ATG AGC AAC Met Ser Asn 1763 AAC AAC GGC AAC GAT GAT ACG Asn Asn Gly Asn Asp Asp Thr 530 GGT AAC Gly Asn 535 GCT AAT GAC Ala Asn Asp 1811 GGC AAC Gly Asn 545 GAC ATG GGC GAT GAT TTG AAC AAC GCG AAC GAT ATG AAC GAC Asp Met Gly Asp Asp Leu Asn Asn Ala Asn Asp Met Asn Asp 1859
GAC
Asp 560 ATG GGT AAT GGC Met Gly Asn Gly
AAC
Asn 565 GAT GAC ATG GGC Asp Asp Met Gly
GAT
Asp 570 ATG GGG GAT ATG Met Gly Asp Met 1907 1955 GAC GAT ATG GGT Asp Asp Met Gly
GGC
Gly 580 GAT ATG GGA GAC Asp Met Gly Asp GGG GAT ATG GGC Gly Asp Met Gly GAT ATG Asp Met 590 GGG AAT TGAGATTAAC CCCAATATCA AAGAGTGATA GCCAAAACTT TAAGGAATAT TT 2013 WO 98/21225 PCT/US97/21353 -169- Gly Asn TTATAGTAAA AACGATTCTT TTAAGGTAAT AGGGGGGATA TTTTGC INFORMATION FOR SEQ ID NO:66: SEQUENCE CHARACTERISTICS: LENGTH: 593 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal 2059 Met 1 Ile Gly Thr Gly Ser Leu Asp Ser Val 145 Ser Tyr Gly Phe Leu 225 Asp Asp Arg Ala (xi) SEQUENCE Lys Ala Ile Lys 5 Ser Val Asn Arg Glu Leu Thr Pro Glu Glu Trp Gly Asp Ala Lys Lys Gly Ile Leu Gly Lys Leu Gln Lys 100 Ser Phe Tyr Ser 115 Leu Lys Asp Gin 130 Val Gly Ala Tyr Pro Tyr Tyr Arg 165 Tyr Gly Met Tyr 180 Met Gly Met Tyr 195 Tyr Pro Asn Met 210 Glu Asn Tyr Met Ala Ser Thr Asp 245 Lys Asp Asp Lys 260 Asp Pro Lys Phe 275 Leu Val Asn Leu DESCRIPTION: SEQ ID NO:66: Ile Leu Phe Ile Met Ala Lys Ala Phe 70 Asn Asn Tyr Lys Leu 150 Asn Gly Asp Phe Tyr 230 Gin Ser Ser Asp Phe Val 40 Ala Thr Arg Lys Thr 120 Ala Lys Asn Tyr Tyr 200 Met Leu Asp Gin Gly 280 Ser Asp Asn Leu Leu Val Tyr 105 Gly Gin Gln Tyr Gly 185 Asp Met Asp Thr Gin 265 Ile Arg Leu Phe Asn Val His Cys Ile Lys Gin Tyr 170 Met Phe Gin Gin Pro 250 Ala Gin Met Thr Lys Gly Tyr Glu 75 Ala Leu Tyr Met Glu 155 Asn Gly Tyr Val Glu 235 Thr Asn Thr Leu Leu Asp Gly Ile Lys Arg Lys Ile Leu 140 Asn Ser Met Asp Gin 220 Glu Asp Leu Asn Lys Ser Ser Tyr Asn Met Leu Ile Pro 125 Ala Glu Tyr Tyr Gly 205 Asp Ile Asp Met Arg 285 Asp Ala Lys Ser Asn Asn Ala Arg Ile Ser Gin 160 Pro Tyr Gly Met His 240 Asp Tyr Ser Leu WO 98/21225 PCTIUS97/21353 -170- Phe 305 Gin Gin Met Asp Asp 385 Val Asp Gin Asp Asp 465 Asp Leu Thr Asn Asn 545 Met Asp Asn His Ala Asp Glu Leu 370 Lys Gly Asp Thr Gin 450 Asp Ala Asn Gly Gly 530 Asp Gly Met Thr Lys Gly Val 355 Lys Ala Asn Leu Pro 435 Gly Thr Ser Asn Asn 515 Asn Met Asn Gly Lys Glu Ala 340 Arg Asp Leu Tyr Asn 420 Thr Ser His Thr Thr 500 Thr Asp Gly Gly Gly 580 
Ala Leu 325 Ser Asp Gin Asp Leu 405 Asp Ser Asn Thr Thr 485 Gly Asp Asp Asp Asn 565 Asp Met 310 Asn Pro Lys Lys Ser 390 Asp Ala Asn Ala Asp 470 Asp Asp Thr Thr Asp 550 Asp Met Pro His Ser Leu Gly 375 Val Gly Met Met Leu 455 Asp Thr Met Gly Gly 535 Leu Asp Gly Thr Leu Lys Asp 360 Leu Gin Ser Asn Ala 440 Ile Thr Pro Asn Asn 520 Asn Asn Met Asp Lys Val Ile 345 Asn Ser Gin Leu Asn 425 Asp Asn His Thr Asn 505 Thr Ala Asn Gly Met 585 Ser Gly 330 Asp Asn Ser Leu Lys 410 Pro Thr Pro Thr Asp 490 Thr Asp Asn Ala Asp 570 Gly 300 Asp Ile Val Asn Gin 380 His Asp Gin Ala Ser 460 Thr Lys Thr Met Asp 540 Asp Gly Met Ala Lys Val Gln 365 Gln Ser Gly Gln Asn 445 Ala Asn Asp Gly Ser 525 Met Met Asp Gly Ile Glu Asn 350 Leu Ala Ser Asp Pro 430 Asp Thr Thr Ala Asn 510 Asn Ser Asn Met Asp 590 Thr Met 335 Lys Asp Gln Asp Asp 415 Val Ser Asn Thr Ser 495 Thr Met Asn Asp Asn 575 Met Ser 320 Lys Ala Asn Val Val 400 Arg Gin Lys Ala Asn 480 Gly Asp Asn Gly Asp 560 Asp Gly INFORMATION FOR SEQ ID NO:67: SEQUENCE CHARACTERISTICS: LENGTH: 1527 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 112...1461 OTHER INFORMATION: WO 98/21225 WO 82225PCTIUS97/21353 -171- (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: AATGAGCCAT TTGAA.AGATT TTGTCAATAA. AACTTCAAGC CCTTTAAATG CGAATTGATT TTCTTATATT ATGATTACGA TTTATCAATT TAAAACATTT GGAGAAAGAC A ATG ACT Met Ser 1 117 ATC CAA TTT Met Giu Phe GAT GCT GTT ATT ATT GGA GGT GGG OTT TCA GGG TOC GCG Asp Ala Val Ile Ile Gly Gly Gly Val Ser Gly Cys Ala ACC TTT Thr Phe TAT ACT TTG AGC Tyr Thr Leu Ser
GAA
Glu 25 TAC AGC TCT TTA AAG CGC GTG GCT ATC Tyr Ser Ser Leu Lys Arg Val Ala Ile
GTG
Val GAA AAA TGC TCT Glu Lys Cys Ser
AAA
Lys 40 TTG GCT CAA ATC Leu Ala Gln Ile
AGC
Ser TCC AGC GCT AAA Ser Ser Ala Lys AAT TCC CAA ACC Asn Ser Gln Thr CAT GAT CCC TCT His Asp Gly Ser GAA ACG PAT TAC Glu Thr Asn Tyr ACT CCC Thr Pro GPA APA GCT Glu Lys Ala GCT TTG PAT Ala Leu Asn AAA GTG CCT TTG Lys Val Arg Leu GCT TAT AAC ACC Ala Tyr Lys Thr AGG CPA TAC Arg Gln Tyr AAA CGC TTG CPA Lys Gly Leu Gln PAT GPA CTG ATT TTT CPA ACC CAG AAA Asn Glu Val Ile Phe Glu Thr Gln Lys 90 CPA CPA TGC GAG TTC ATC AAA AAA CGC Glu Glu Cys Glu Phe Met Lys Lys Arg 110 ATG GCT Met Ala 100 ATA CCC CTG GC Ile Gly Val Gly
CAT
Asp 105
TAC
Tyr 115 CPA TCT Glu Ser TTT AAA CA Phe Lys Glu 120 ATC TTT CTC CCC Ile Phe Val Gly
TTA
Leu 125 CPA CPA TTT CAC PAG, Glu Glu Phe Asp Lys CPA AC ATT AAA Gln Lys Ile Lys TTA GAG CCT PAT Leu Glu Pro Asn ATT TTA CGG OCT Ile Leu Gly Ala PAT GC Asn Gly 145 ATA CAC AGG Ile Asp Arg CP.A PAC ATT ATC Glu Asn Ile Ile CAT CCC TAT AGA His Gly Tyr Arg PAG CAT TCC Lys Asp Trp 160 AGC ACC ATC Ser Thr Met 165 PAT TTT CC PAG Asn Phe Ala Lys ACT CPA AAC TTC CTT CPA CPA CC Ser Glu Asn Phe Val Glu Glu Ala 175 CTA AA Leu Lys 180 TTA PAG CCT PAC Leu Lys Pro Asn
PAC
Asn 185 CAC CTC TTT TTC Gln Val Phe Leu
AAT
Asn 190 TTC APA GTC APA Phe Lys Val Lys
AAG
Lys 195 ATT GAA AAA CGC Ile Glu Lys Arg GAC ACT TAC GCC Asp Thr Tyr Ala
GTA
Val 205 ATT TCA GAA GAC Ile Ser Glu Asp
GCT
Ala 210 GAA GAA GTO TAT Glu Glu Val Tyr AAA TTC GTG CTG Lys Phe Val Leu AAT GCC GGC TCT Asn Ala Gly Ser TAC GCT Tyr Ala 225 741 789 837 TTG CCT TTG Leu Pro Leu CCT GTG GCG Pro Val Ala 245
OCT
Ala 230 CAG AGC ATG GGC TAT GGC CTA GAT TTA Gln Ser Met Gly Tyr Gly Leu Asp Leu 235 GGG TGC TTG Gly Cys Leu 240 AGG GGT AAG Arg Gly Lys GGC AGC TTT TAT Gly Ser Phe Tyr
TTT
Phe GTG CCG GAT TTA Val Pro Asp Leu
TTA
Leu 255 OTT TAT Val Tyr 260 ACC GTT CAA AAC Thr Val Gln Asn AAA CTC CCT TTT Lys Leu Pro Phe 0CC GTG CAT CC Ala Val His Gly
GAC
Asp 275
TTA
Leu CCT GAT GCC GTC Pro Asp Ala Val
ATT
Ile 280
TTA
Leu AAA GGA AAA ACA Lys Oly Lys Thr
CGA
Arg 285
TOT
Cys ATC 000 CCT ACC Ile Gly Pro Thr 933 981 1029
OCT
Ala 290
ATT
Ile ACO ATO CCT Thr Met Pro
AAA
Lys 295 GAA CCC AAC Glu Arg Asn
AAA
Lys 300 TOO CTT AAO Trp Leu Lys AOC TTO OAA Ser Leu Glu OCO TTT OAT Ala Phe Asp 325 TTG AAA ATO OAT Leu Lys Met Asp AAT AAA GAT OTO Asn Lys Asp Val TTT AAA ATT Phe Lys Ile 320 GTG TTT AAA Val Phe Lys 1077 1125 TTG ATO AOC OAT Leu Met Ser Asp OAA ATC CGA AAT Glu Ile Arg Asn
TAT
Tyr 335 AAC ATO Asn Met 340 OTT TTT OAA TTO Val Phe Glu Leu
CCC
Pro 345 ATT ATC OGT AAA Ile Ile Gly Lys
AGO
Arg 350 AAA TTT TTA AAA Lys Phe Leu Lys
GAC
Asp OCT CAA AAA ATC Ala Gln Lys Ile CCC TCT CTT AGC Pro Ser Leu Ser
CTA
Leu 365 GAA CAT CTA CAA Glu Asp Leu Glu
TAC
Tyr 370 1173 1221 1269 OCT CAT GOT TTT Ala His Gly Phe OAA OTO COC CCO Glu Val Arg Pro OTT TTA CAC AGA Val Leu Asp Arg ACC AAO Thr Lys 385 COA AAA CTO Arg Lys Leu ACT TTT AAC Thr Phe Asn 405
OAA
Glu 390 TTA 0CC CAA AAA Leu Gly Glu Lys ATT TCC ACC CAT Ile Cys Thr His AAA 0CC ATC Lys Gly Ile 400 TTO CAA AAC Leu Gln Asn 1317 1365 ATO ACC CCT TCT Met Thr Pro Ser
CCA
Pro 410 COC CO ACO ACT Gly Ala Thr Ser
GCC CTT GTG GAT TCC CAA GAA ATC GCT GCG TAT TTG GGC GAG AGC TTT Ala Leu Val Asp Ser Gln Glu Ile Ala Ala Tyr Leu Gly Glu Ser Phe 420 425 430 GAA TTA GAA CGC TTT TAT AAA GAT TTA TCC CCA GAA GAA TTG GAA AAT T Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu Glu Asn
435 440 445 450 AAAAACGCAT GCAAAAAGAA CAAGAAGCCC AAGAAATCGC TAAAAAAGCC GTTAAAATCG
TGTTT
97/21353- 1413 1462 1522 1527 INFORMATION FOR SEQ ID NO:68: SEQUENCE CHARACTERISTICS: LENGTH: 450 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: Met Ser Met Giu Phe Asp Ala Val 1 Cys Al a Ly s Thr Gin Gin Ly s Asp Asn 145 Asp Glu Val Asp Tyr 225 Ala Thr Ile Val Ala Asn Pro Glu Tyr Ala Lys Met Arg Tyr 115 Lys Gin 130 Gly Ile Trp Ser Ala Leu Lys Lys 195 Ala Glu 210 Ala Leu Phe Glu Ser Lys Leu Ala 100 Giu Lys Asp Thr Lys 180 Ile Glu Pro 5 Tyr Lys Gin Ala Asn Ile Ser Ile Arg Met 165 Leu Glu Val1 Leu Thr Cys Thr Lys 70 Lys Gly Phe Lys His 150 As n Lys Lys5 Tyr Al a 230 Leu Ser Ile 55 Lys Gly Val1 Lys Giu 135 Glu Phe Pro Arg Ala 215 Gin Ser Lys 40 His Val1 Leu Gly Glu 120 Leu Asn Al a Asn Asn 200 Lys Ser Ile Ile Gly 10 Giu Tyr Ser 25 Leu Ala Gin Asp Gly Ser Arg Leu Ser 75 Gin Asn Giu 90 Asp Giu Glu 105 Ile Phe Val Giu Pro Asn Ile Ile Gly 155 Lys Leu Ser 170 Asn Gin Val 185 Asp Thr Tyr Phe Val Leu Met Gly Tyr 235 dly Ser Ile Ile Al a Val Cys Gly Val1 140 His Glu Phe Ala Val1 220 Gly Gly Leu Ser Glu Tyr Ile Glu Leu 125 Ile Gly As n Leu Val 205 As n Leu Val1 Lys Ser Thr Lys Phe Phe 110 Glu Leu Tyr Phe Asn 190 Ile Ala Asp Ser Gly is Arg Val Ser Ala Asn Tyr Thr Arg Giu Thr Met Lys Glu Phe Gly Ala Arg Lys 160 Val Glu 175 Phe Lys Ser Glu Gly Ser Leu Gly 240 Leu Arg 255 -Cys Leu Pro Val Ala Gly Ser Phe Tyr Phe Val Pro Asp Leu WO 98/21225 PCT/US97/21353 -174- Gly His Thr Gly 305 Lys Phe Leu Glu Thr 385 Gly Gin Ser Glu Lys Gly Ala 290 Ile Ile Lys Lys Tyr 370 Lys Ile Asn Phe Asn Val Asp 275 Leu Ser Ala Asn Asp 355 Ala Arg Thr Ala Glu 435 Tyr Thr Val Gin Asn Pro Lys Leu 260 Pro Thr Leu Phe Met 340 Ala His Lys Phe Leu 420 Leu Asp Met Glu Asp 325 Val Gin Gly Leu Asn 405 Val Glu Ile 280 Leu Lys Ser Leu Ile 360 Glu Gly Pro Gin Tyr 440 265 Lys Glu Met Asp Pro 345 Pro Val Glu Ser Glru 425 Lys Gly Arg Asp Lys 330 Ile Ser Arg Lys Pro 410 Ile Asp Lys Asn Leu 315 Glu Ile Leu Pro 
Lys 395 Gly Ala Leu Pro Thr Lys 300 Asn Ile Gly Ser Gin 380 Ile Ala Ala Ser Phe Arg 285 Cys Lys Arg Lys Leu 365 Val Cys Thr Tyr Pro 445 Ala Gly Leu Val Tyr 335 Lys Asp Asp His Cys 415 Gly Glu Val Pro Lys Phe 320 Val Phe Leu Arg Lys 400 Leu Glu Leu INFORMATION FOR SEQ ID NO:69: SEQUENCE CHARACTERISTICS: LENGTH: 653 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 63...590 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: CTAGATTTAA TTTTAAAGTT ATATAATTAA ACCACAAAAT CCTTTTTTAA AAGAAACTAA GC ATG CCA AAA CCC AAG AAA AAC ACC CTC CCC TGT AGC CTT TCT GTC Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val 1 5 10 AAA ATG TCT TAT TTC ATG CGC TTT CTC ATT AAA TGG CGC ACC CGC TCT Lys Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser 25 TTA AGC CAT AAA ATG ATG ACT CTC ATT CAA ATC TTA AGC ATT CTG GCT 107 155 WO 98/21225 PCT/US97/21353 -175- Leu Ser His Lys Met Met Thr Leu Ile Gin Ile Leu Ser Ile Leu Ala AAA AAA ATC Lys Lys Ile TTA GCG AGC Leu Ala Ser AAG GCC AGT GAA Lys Ala Ser Glu TTA GAA GAG CAA Leu Glu Glu Gin
CTC
Leu AAA GAT Lys Asp TAC ATT TAT AGA Tyr Ile Tyr Arg
ACC
Thr CTA AAC GCT AAA Leu Asn Ala Lys GCA TCG GAT GTG Ala Ser Asp Val
TAT
Tyr AAC CGA GTG CTT Asn Arg Val Leu
ATT
Ile 85 TTA GTG AAT Leu Val Asn TTG TTT GAC AAA Leu Phe Asp Lys ATT CAG CTT TAC Ile Gln Leu Tyr AGC GTT AAA ATT Ser Val Lys Ile TTA GTG GAT GAA Leu Val Asp Glu 120 GAA TAT Glu Tyr 90 TCA GAT Ser Asp 105 ATG CTT Met Leu TGC ACT AAT GAA Cys Thr Asn Glu
GAA
Glu 299 347 395 443 TTA CTC ATT Leu Leu Ile CAA GAC Gln Asp 110 AAA TAT Lys Tyr AAA GAA Lys Glu CAA GTC CAG Gln Val Gln 130 CAC ACC ATT TTA His Thr Ile Leu
AAG
Lys 135 GGC ATC ATC AAA Gly Ile Ile Lys
CGC
Arg 140 AAA TAC GAT Lys Tyr Asp GAA GCC TAC TCG CTC AAT Glu Ala Tyr Ser Leu Asn 145 GAA GAC AGG ATT CTT TTA GAA TAC CAA Glu Asp Arg Ile Leu Leu Glu Tyr Gln 155 CGC TTG CTA GAA Arg Leu Leu Glu TCA CAC GCG TCT Ser His Ala Ser
TTT
Phe 170 TCA AAT AAA AAA Ser Asn Lys Lys
AAA
Lys TGATTTGAAA GCGTTACTTG CCCTGCTTTT TGGGCTTTTA TTGAAAAAGG GCTTTA 646
AAATGAG
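The FEATURE line for SEQ ID NO:69 places the coding sequence at positions 63...590, and the protein listed next is 176 amino acids long. As a minimal sketch (not part of the original listing), the Python below checks that these two figures agree and translates the first fifteen codons of the coding region with the standard genetic code. The names `codon_count` and `translate` are illustrative assumptions, and the check assumes the stated range excludes the stop codon, as it does here.

```python
# Standard genetic code; codons are ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_count(start, end):
    """Number of codons in a 1-based, inclusive coding range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3


def translate(dna):
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    seq = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# First fifteen codons of the SEQ ID NO:69 coding region, as printed in the listing.
first_codons = "ATG CCA AAA CCC AAG AAA AAC ACC CTC CCC TGT AGC CTT TCT GTC"

print(codon_count(63, 590))     # codons implied by LOCATION: 63...590
print(translate(first_codons))  # one-letter translation of the codons above
```

The first value printed is 176, matching the stated protein length, and the second is MPKPKKNTLPCSLSV, that is, Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val, as printed beneath the codons in the listing. The same arithmetic can be applied to the other coding ranges in this section.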
INFORMATION FOR SEQ ID NO:70: SEQUENCE CHARACTERISTICS: LENGTH: 176 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val Lys 1 5 10 Ser His Ser Tyr Arg Asp Leu Gln 130 Tyr Leu Phe Met Ala Tyr Leu Glu 100 Ala Thr Leu Glu Ile Gln Glu Ala Glu Ser 105 Met Ile Arg Ser Trp Leu Gln Ile Cys Leu Lys Lys Leu 155 Ser Arg Ser Leu Ala Thr Leu Glu Arg 140 Leu Asn Thr Ile Lys Ser Asn Ile Asp 125 Lys Glu Lys 170 INFORMATION FOR SEQ ID NO:71: SEQUENCE CHARACTERISTICS: LENGTH: 1883 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 91...1833 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: AAGCGTTAAA TTCCAATCAA AAACCATCGT ATCGGTGTTA ATATTGTGTA AAAATTAATG TTATGAATCT CTTGTATTAA AAGGACTTCA ATG AAA AAA TTG GTT TTA GTC ATC Met Lys Lys Leu Val Leu Val Ile 1 TTT TTA ACG CTA GCG CTT TCA ATA TCT GCA AAA GAA GTC AAA ATA GTG Phe Leu Thr Leu Ala Leu Ser Ile Ser Ala Lys Glu Val Lys Ile Val 15 TTT TTA GAA ACT TCA GAC ATT CAT GGG CGG CTT TTT TCG TAT GAT TAT Phe Leu Glu Thr Ser Asp Ile His Gly Arg Leu Phe Ser Tyr Asp Tyr 30 35 GCG ATT GGC GAG CAA AAA CCC AAT AAC GGC TTG ACA AGG ATT GCG ACT Ala Ile Gly Glu Gln Lys Pro Asn Asn Gly Leu Thr Arg Ile Ala Thr 50 114 162 210 258 TTA ATC AAA Leu Ile Lys AGC GGG GAT Ser Gly Asp
AAG
Lys CAA AGG GCT GAG Gln Arg Ala Glu AAT AAA Asn Lys 65 AAT GTG GTT TTG ATT GAC Asn Val Val Leu Ile Asp TTG TTG CAA GGC AAT AGC GCG GAG TTG Leu Leu Gln Gly Asn Ser Ala Glu Leu 80 TTT AAT GAT GAG Phe Asn Asp Glu AAA TTT GAC ATT Lys Phe Asp Ile CCA ATT Pro Ile CAT CCG CTA GTT His Pro Leu Val
AGA
Arg 95 GCT GAA AAC GAT Ala Glu Asn Asp
TTG
Leu 100
CGT
Arg 105 GTG CTT GGC AAT Val Leu Gly Asn CAC GAG TTT AAT TTC AGT AAA GAT TTT TTA GAA His Glu Phe Asn Phe Ser Lys Asp Phe Leu Glu 110 115 120 TTT AAT GGC GAT GTC ATG AAT GCG AAT ATC ATT Phe Asn Gly Asp Val Met Asn Ala Asn Ile Ile 450 498 AAG AAT ATT AAG Lys Asn Ile Lys AAA ATT GCG Lys Ile Ala ATT GAT GGC Ile Asp Gly 155
GAC
Asp 140 AAT AAG CCG TTT GTA AAA CCT TAT ATT Asn Lys Pro Phe Val Lys Pro Tyr Ile 145 ATT AAA AAA Ile Lys Lys 150 GCG CAC ATC Ala His Ile GTG AGG GTG GCG Val Arg Val Ala
GTT
Val 160 GTG GGG TAT GTG Val Gly Tyr Val
GTG
Val 165 CCA ACT Pro Thr 170 TGG GAG GCC TCT ACG CCT GAA CAT TTT Trp Glu Ala Ser Thr Pro Glu His Phe 175 GGA TTG AAG TTT Gly Leu Lys Phe GAC GCT GAA GAA Asp Ala Glu Glu TTA AAA AAG ACC Leu Lys Lys Thr
TTA
Leu 195 AAA GAG TTG AAA Lys Glu Leu Lys
GGG
Gly 200 AAG TAT GAT ATT Lys Tyr Asp Ile ATT GGC GCT TTT Ile Gly Ala Phe TTG GGG CGA GAA Leu Gly Arg Glu GAT GAG Asp Glu 215 AAA GGT GGC Lys Gly Gly GAC ATC ATT Asp Ile Ile 235
GAC
Asp 220 GGG ATA CCG GAT Gly Ile Pro Asp
TTA
Leu 225 GCG AAA AAA TTC Ala Lys Lys Phe CCG CAA TTT Pro Gln Phe 230 ACC AAA GTA Thr Lys Val TTT GCA GGG CAT Phe Ala Gly His CAT GCG GTT TAT His Ala Val Tyr GGG AAA Gly Lys 250 GTG CAT ACC ATT Val His Thr Ile CCT GGA GCG TAT Pro Gly Ala Tyr
GGG
Gly 260 GCT TAT CTG GCA Ala Tyr Leu Ala
AAG
Lys 265 GGC GTG GTG GTA Gly Val Val Val GAC ACT AAA ACG Asp Thr Lys Thr
AAG
Lys 275 AAA AAA ATT ATA Lys Lys Ile Ile ACT GAA AAT TTA Thr Glu Asn Leu
CCC
Pro 285 ACA AAA GAT GTG Thr Lys Asp Val
CCA
Pro 290 GAA GAT GAA GAA Glu Asp Glu Glu TTA GCG Leu Ala 295 AAA AAA TAC Lys Lys Tyr
GAA
Glu 300 TAT GTG GAT AAA Tyr Val Asp Lys TCA AAA GAA TAC Ser Lys Glu Tyr GTG GTT GGC GAA GTT ACA AAA Val Val Gly Glu Val Thr Lys TTT ATT GAC AGG Phe Ile Asp Arg ACA GGA Thr Gly 330 315
GAA
Glu
CCT
Pro 325
GCC
Ala GCT AAT GAA Ala Asn Glu 310 GAT TTT ATC Asp Phe Ile TTG CAA GAA Leu Gln Glu 1026 GAA AAA ATC Glu Lys Ile
ACC
Thr 335 ATG CCC ACC Met Pro Thr
GCC
Ala 340
ACA
Thr 345 CCC GTG ATA GAA Pro Val Ile Glu ATT AAT AAA GTG Ile Asn Lys Val
CAA
Gln 355 AAA TAT TAC GCA Lys Tyr Tyr Ala
AAA
Lys 360 1074 1122 1170 1218 1266 1314 GCC CAT GTT TCA Ala Asp Val Ser GCA CCC TTA TTC Ala Ala Leu Phe TTT GGG GCG AAT Phe Gly Ala Asn TTG AAA Leu Lys 375 AAA OGG CCT Lys Gly Pro AAT ACO CTC Asn Thr Leu 395
TTC
Phe 380 AAA AGA AAA GAT GTC ACT TAT ATT TAC Lys Arg Lys Asp Val Thr Tyr Ile Tyr AAG TTC GCT Lys Phe Ala 390 ATT GGA GTG CGT Ile Gly Val Arg
ATA
Ile 400 ACG GGT GAA AAT CTG TTG AAA TAC Thr Gly Glu Asn Leu Leu Lys Tyr 405 ATG GAA Met Glu 410 TGG TCA TAC CGA Trp Ser Tyr Arg TAC AAT CAC TTG Tyr Asn Gln Leu
CAA
Gln 420 CCA GGA OAT TTG Pro Gly Asp Leu 1362
ACG
Thr 425 ATC ACT TTT AAT Ile Ser Phe Asn AAC ATT CGC GOC TAT AAC TTT OAT ATG Asn Ile Arg Gly Tyr Asn Phe Asp Met
TTT
Phe 440 1410 1458 TCT GOC OTG AAA TAC CAG GTT GAT OTT Ser Gly Val Lys Tyr Gln Val Asp Val 445
ACA
Thr 450 AAA CCC CCC GGA Lys Pro Ala Gly CAA AGO Gln Arg 455 ATT ATC AAT Ile Ile Asn TAT AAA TTA Tyr Lys Leu 475
CCC
Pro 460 ACA ATC AAC AAC Thr Ile Asn Asn
AAA
Lys 465 CCC ATT OAC CCC Pro Ile Asp Pro AAA CCC ATC Lys Ala Ile 470 1506 GCG ATC AAC AAT Ala Ile Asn Asn
TAC
Tyr 480 CGA TTC OGA ACA TTA TCC ACO ACA Arg Phe Gly Thr Leu Ser Thr Thr 485 1554 TTG AAT Leu Asn 490 TTG GTT ACA GAC Leu Val Thr Asp GMT AGO TAT Xaa Arg Tyr TAT PAT Tyr Asn 500 TCT TAC OAT GAA Ser Tyr Asp Glu 1602 AAT GGG CAA ATA CGA GAT TTG ATC ATC AAA TAC ATC ACG CTG CAA GAT Leu Gln Asp 505 GAA GAA AAA Glu Glu Lys ATC ATC AAC Ile Ile Asn AAA TTA AAA Lys Leu Lys 555 ACT TTO AAT Thr Leu Asn 570 Asn Gly Gln Ile Arg Asp Leu Ile Ile Lys Tyr Ile Thr 515
TTG
Leu 520
GGT
Gly
TAC
Tyr 540
GAG
Glu OTA ACC CCT Val Thr Pro GAG GOT AAT Glu Gly Asn TGG GAA Trp Glu 535 TTC AAA AAC Phe Lys Asn
CCG
Pro 545
ATC
Ile TTG GAA AAA Leu Glu Lys TTG AGA GAA Leu Arg Glu 550 OAT GGG AGG Asp Gly Arg 1650 1698 1746 1794 1840 GOG ACC ATC Gly Ser Ile
AAA
Lys 560
AAA
Lys CCC ACC TCA Pro Thr Ser
AAG
Lys 565 AAkA TAAAATT Lys GTC AAA TCC ATT Vai Lys Ser Ile 575 GAO AGT GAA OTT Giu Ser Glu Val TTTTATTTTT ATTATTTTAT CTTTAAGCCT AACTTAAAAA AGG INFORMATION FOR SEQ ID NO:72: SEQUENCE CHARACTERISTICS: LENGTH: 581 amino acids TYPE: amino acid STRANDEDNESS: singie TOPOLOGY: iinear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal 1883 (xi) SEQUENCE Met Lys Lys Leu Val i 5 Ser Ala Lys Glu Val Giy Arg Leu Phe Ser Asn Giy Leu Thr Arg Asn Lys Asn Vai Val Ser Ala Giu Leu Phe Glu Asn Asp Leu Lys 100 ASn Phe Ser Lys Asp 115 Asp Vai Met Asn Ala 130 Val Lys Pro Tyr Ile DESCRIPTION: SEQ ID NO:72: Leu Val Ile Phe Leu Lys Tyr Ile Leu 70 Asn Phe Phe Asn Ile Ile Asp Ala 55 Ile Asp Asp Leu Ile 135 Lys Thr Glu Gly Lys Asp Hius Leu Ile Al a Gly Leu Thr Glu Lys Leu Pro Gly Lys Asp 140 Val1 Al a Ser Gin Gin Leu Leu As n Gly 125 Asn Arg Ser Ile Pro Al a Gly Arg Olu Asn Pro Ala WO 98/21225 WO 981225PCTIUS97/21353 -180- Val Giu Lys Phe Leu 225 His Gly Lys Val Lys 305 Phe Met Ly s Phe Val1 385 Thr As n Arg Val1 Lys 465 Arg Arg Asp Pro Pro 545 Ile Glu Gly His Thr His 210 Ala Ala Ala Thr Pro 290 Ser Ile Pro Val1 Asn 370 Thr Gly Gin Gly Thr 450 Pro Phe Tyr Leu Giu 530 Leu Pro Ser Tyr Phe Leu 195 Leu Lys Val1 Tyr Lys 275 Giu Lys Asp Thr Gin 355 Phe Tyr Glu Leu Tyr 435 Lys Ile Gly Tyr Ile 515 Leu Leu Thr Giu Val1 Al a 180 Lys Gly Lys Tyr Gly 260 Lys Asp Giu Arg Al a 340 Lys Gly Ile Asn Gin 420 Asn Pro Asp Th r Asn 500 Ile Giu Giu Ser Val1 580 Ala Leu Leu Glu Pro 230 Thr Tyr Ile Glu Ala 310 Asp Leu Ty r Asn Lys 390 Leu Gly Asp Gly Lys 470 Ser Tyr Tyr Asn Leu 550 Asp His Lys Lys Asp 215 Gin Lys Leu Ile be u 295 Asn Phe Gin Ala Leu 375 Phe Ly s Asp Met Gin 455 Ala Thr Asp Ile Trp 535 Arg Gly Trp Ala Asp Gly Ile 235 Val Val Asn Tyr Gly 315 Glu Val Val Pro Leu 395 Trp Ser Val1 Asn Leu 475 Leu Asp Lys Asn Lys 555 Asfl Giu Glu Ile Asp 220 Phe His Val Leu Glu 300 Glu Glu Ile Ser Phe 380 Ile Ser Phe Lys Pro 460 Al a Val Asn Gly Tyr 540 Giu Val Thr 175 Leu Gly Pro His Glu 255 Asp Lys Asp Lys Thr 335 Ile Al 
a Lys Arg Phe 415 As n Val1 As n As n Al a 495 Ile Val1 Lys Ile I le 575 160 Pro Lys Ala Asp Glu 240 Pro Thr Asp Lys Thr 320 Thr As n Leu Asp Ile 400 Tyr Ile Asp Asn Tyr 480 Xaa Arg Thr Asn Lys 560 Lys WO 98/21225 PCT/US97/21353 -181- INFORMATION FOR SEQ ID NO:73: SEQUENCE CHARACTERISTICS: LENGTH: 1339 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 68...1252 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: CCAATCGTTT AATAGCGATT AAATATGACT ATATACACTA CAACAATAAG ATTTTGAAAG GTTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA Met Glu Ser Val Lys Thr Gly Lys Thr
AAT
Asn AAG GTT GGC AAG Lys Val Gly Lys
AAT
Asn ACA GAG ATG GCT Thr Glu Met Ala
AAT
Asn 20 ACA AAG GCA AAT Thr Lys Ala Asn AAA GAG GCT CAT TTT AAA Lys Glu Ala His Phe Lys 25 TCA ATT CGT GGG ATT TTT Ser Ile Arg Gly Ile Phe CAA GCG AGC ACC Gln Ala Ser Thr
ATT
Ile ACA AAT ATA ATC Thr Asn Ile Ile ACA AAA ATT Thr Lys Ile AAA AGC AGT Lys Ser Ser
GCA
Ala AAG AAA GTT AGA Lys Lys Val Arg CTT GTA AAA AAA Leu Val Lys Lys CAC CCC AAG His Pro Lys TGC AAG AAA Cys Lys Lys GCG GCA TTA GTA Ala Ala Leu Val TTG ACC CAT ATT Leu Thr His Ile GCG AAA Ala Lys GAA TTA GAC GAT Glu Leu Asp Asp GTC CAA GAT AAA Val Gln Asp Lys
TCC
Ser AAA CAA GCT GAA Lys Gln Ala Glu GAA AAT CAA ATC Glu Asn Gln Ile TGG TGG AAA TAT Trp Trp Lys Tyr
TCA
Ser 105 GGA TTA ACA ATA GCG Gly Leu Thr Ile Ala 110 ACA AGT TTA TTA TTA GCC GCT TGT AGC Thr Ser Leu Leu Leu Ala Ala Cys Ser 115 GGT GAT GTT AGT Gly Asp Val Ser GAA CAA Glu Gln 125 ATA GAA CTA Ile Glu Leu
GAA
Glu 130 CAA GAA AAA CAA AAG ACG AGC AAT ATA Gln Glu Lys Gln Lys Thr Ser Asn Ile GAG ACT AAC Glu Thr Asn 140 AAT CAA ATA AAA GTA GAA CAA Asn Gln Ile Lys Val Glu Gln AAA CAA AAG ACA Lys Gln Lys Thr
AGC
Ser 155
AAG
Lys AAT ATA GAG Asn Ile Glu ACA AGC AAT Thr Ser Asn ACT AAT Thr Asn 160 CAA ATA AAA Gln Ile Lys
GTA
Val 165
CAA
Gln GAA CAA Glu Gln AAA GAT Lys Asp 185 541 589 637 685 CAG AAA GAT TTG Gln Lys Asp Leu GTT AAA GAA CAG Val Lys Glu Gln 180 GAA CAG AAA GAT Glu Gln Lys Asp TTG GTT AAA GAA Leu Val Lys Glu AAA GAT TTG GTT Lys Asp Leu Val GTT AAA GAA CAG Val Lys Glu Gln AAA GAT Lys Asp 205 TTG GTT AAA Leu Val Lys CAA GAA AAT Gln Glu Asn 225
ACA
Thr 210 CAG AAA GAT TTC Gln Lys Asp Phe
ATT
Ile 215 AAA TAT GTA GAA Lys Tyr Val Glu CAA AAT TGC Gln Asn Cys 220 ATT AAG GCT Ile Lys Ala CAT AAT CAA TTC His Asn Gln Phe
TTT
Phe 230 ATT GAA AAA GGA Ile Glu Lys Gly
GGA
Gly 235 GGT ATT Gly Ile 240 GGT ATA GAA GTA GAA GCT GAA TGC AAA Gly Ile Glu Val Glu Ala Glu Cys Lys 245 CCT AAA CCT GCA Pro Lys Pro Ala
AAA
Lys 255 ACC AAT CAA ACC Thr Asn Gln Thr
CCT
Pro 260 ATC CAG CCA AAA Ile Gln Pro Lys
CAC
His 265 CTC CCA AAC TCT Leu Pro Asn Ser
AAA
Lys 270 CAA CCC CGC TCT Gln Pro Arg Ser
CAA
Gln 275 AGA GGA TCA AAA Arg Gly Ser Lys CAA GAG CTT ATC Gln Glu Leu Ile GCT TAT Ala Tyr 285 TTG CAA AAA Leu Gln Lys AAA CAA GTG Lys Gln Val 305 CTA GAA TTT CTG Leu Glu Phe Leu TAT TCG CAA AAA Tyr Ser Gln Lys GCT ATC GCT Ala Ile Ala 300 TTA GAA CTA Leu Glu Leu 973 1021 GAT TTT TAC AGG Asp Phe Tyr Arg AGT TCT ATC GCT Ser Ser Ile Ala
TAT
Tyr 315 GAT CCT Asp Pro 320 AGA GAT TTT AAG Arg Asp Phe Lys
GTT
Val 325 ACA GAA GAA TGG Thr Glu Glu Trp AAA GAA AAT CTA Lys Glu Asn Leu ATA CGC TCT AAA Ile Arg Ser Lys
GCT
Ala 340 CAA GCT AAA ATG Gln Ala Lys Met
CTT
Leu 345 GAA ATG AGA AAC Glu Met Arg Asn 1069 1117 1165 CAA GCC CAC CTT Gln Ala His Leu
TCA
Ser 355 AAC TCT CAA AGC Asn Ser Gin Ser TTG TTC GTT CAA Leu Phe Val Gin AAA ATA Lys Ile 365 WO 98/21225 PCT/US97/21353 -183- TTT GCT GAT GTT AAT AAA GAA ATA GAA GCA GTT GCT AAT ACT GAA AAG 1213 Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys 370 375 380 AAA GCA GAA AAA GCG GGT TAT GGT TAT AGT AAA AGG ATG TAGCGGTTAA AA 1264 Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met 385 390 395 ACATTGCACC AAGTTTTTAA TTATCTGTCG GCTTTTGAAA ACATTTTTTA TGGTAGCGTT 1324 ATTTGGCAAT AAAAG 1339 INFORMATION FOR SEQ ID NO:74: SEQUENCE CHARACTERISTICS: LENGTH: 395 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr 1 5 10 Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gin Ala 25 Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys 40 Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser 55 Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys 70 75 Glu Leu Asp Asp Lys Val Gin Asp Lys Ser Lys Gin Ala Glu Lys Glu 90 Asn Gin Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser 100 105 110 Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gin Ile Glu 115 120 125 Leu Glu Gin Glu Lys Gin Lys Thr Ser Asn Ile Glu Thr Asn Asn Gin 130 135 140 Ile Lys Val Glu Gin Glu Lys Gin Lys Thr Ser Asn Ile Glu Thr Asn 145 150 155 160 Asn Gin Ile Lys Val Glu Gin Glu Gin Gin Lys Thr Ser Asn Thr Gin 165 170 175 Lys Asp Leu Val Lys Glu Gin Lys Asp Leu Val Lys Glu Gin Lys Asp 180 185 190 Leu Val Lys Glu Gin Lys Asp Leu Val Lys Glu Gin Lys Asp Leu Val 195 200 205 Lys Thr Gin Lys Asp Phe Ile Lys Tyr Val Glu Gin Asn Cys Gin Glu 210 215 220 Asn His Asn Gin Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile 225 230 235 240 Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr WO 98/21225 PCT/US97/21353 -184- Asn Gin Thr Arg Ser Gin 275 Lys Glu Leu 290 Val Asp Phe Gin Pro Lys His 265 
Gln Pro Asn Ser Gly Ser Lys 255 Lys Gln Pro 270 Tyr Leu Gln Ala Lys Gln Ala 280 Tyr Glu Leu Ile Glu Phe Leu Tyr Arg Pro 310 Lys Val Thr Ser Gln Lys Ala 300 Leu Ser Ile Ala 305 Arg Tyr 315 Glu Leu Asp Asp Phe Ala Glu Glu Trp Lys Met Leu Arg Ser Lys His Leu Ser 355 Asp Val Asn Ala 340 Asn Gln Lys Glu 330 Glu Met Arg Phe Val Gln Asn Leu Lys Ile 335 345 Leu Ser Gln Ser Leu 360 Ala Asn Lys 365 Glu Pro Gln Ala 350 Ile Phe Ala Lys Lys Ala Lys Glu Ile Val Ala Asn 370 Glu Lys 385 Ala Gly Tyr Ser Lys Arg INFORMATION FOR SEQ ID NO:75: SEQUENCE CHARACTERISTICS: LENGTH: 904 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 70...864 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: TAATAACTCA ATCCCATTTG AATGGCATTT TTAAGCCAAA TTGCTACTAT CTTTGGCTAA AGGTTAAAC ATG ATT AAA CAA ACC CTC ATC ATT CTT GCC CCT TTT TTT ATC Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile 111
GCA
Ala ACG CTG TTG TAT Thr Leu Leu Tyr TTA GGC GCA CCG Leu Gly Ala Pro
GAT
Asp GGG TTA AGA CCT Gly Leu Arg Pro GCT TGG CTT TAT Ala Trp Leu Tyr
TTT
Phe TGT ATT TTC ATG Cys Ile Phe Met ATG ATT ATA GGG Met Ile Ile Gly CTA ATT Leu Ile TTA GAG CCG Leu Glu Pro CCA TCA GGT TTA Pro Ser Gly Leu GCG CTA AGC GCG Ala Leu Ser Ala TTA GTG CTG Leu Val Leu TGT ATA GCG Cys Ile Ala TTA AAA ATT GGA GCG AGC GAT AAA GTA Ser Asp Lys Val AGC GCT AAT Ser Ala Asn Leu Lys Ile Gly Ala AAG GOT Lys Ala ATT TOG TGG GGT Ile Ser Trp Gly AGO GGG TAT Ser Gly Tyr GOG AAT Ala Asn GGG TAT Gly Tyr 105 AAA AOG GTG TGG Lys Thr Val Trp
OTT
Leu GTG TTT GTO GOT Val Phe Val Ala ATT TTG GGT TTA Ile Leu Gly Leu GAA AAA AGO Glu Lys Ser TTA GGG AAA OGG ATO GOT OTT TTA OTG ATT AGG TTT TTA GGG CAA ACC Leu Gly Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr OOT TTA GGT Pro Leu Gly
TTA
Leu 130 GGO TAT CG ATT GGT TTG AGC GAA TTG Gly Tyr Ala Ile Gly Leu Ser Glu Leu 135 TGT OTA CO Cys Leu Ala 140 495 OOT TTT ATO OOT AGO AAO TOO Pro Phe Ile Pro Ser Asn Ser 145
GOT
Ala 150 AGA AGT GGA GGO ATA OTO TAT 000 Arg Ser Gly Gly Ile Leu Tyr Pro ATO CTT Ile Val 160 TOA TOT ATO OOG Ser Ser Ile Pro OOT TTA ATG CGA TOT ACT OOA AAT AAT AAO Pro Leu Met Gly Ser Thr Pro Asn Asn Asn 165 170 TAT TTG ATG TGG GTC GOT TTG GOT TOA ACT Tyr Leu Met Trp Val Ala Leu Ala Ser Thr 185 190
OOT
Pro 175 GAO AAA ATO GGO Asp Lys Ile Gly 639 TGO ATO ACT TOG Cys Ile Thr Ser ATC TTT TTA ACC Met Phe Leu Thr OTO GOT OOT AAO Leu Ala Pro Asn 000 OTA Pro Leu 205 GOA ATG GAA Ala Met Glu TOG TGG TTT Ser Trp Phe 225
ATO
Ile 210 GOT GOO AAA ATG GC GTG AAT GAA ATC Ala Ala Lys Met Gly Val Asn Glu Ile TOA TCG TTT Ser Trp Phe 220 687 735 783 TTA CG TTO TTC Leu Ala Phe Leu
OOT
Pro 230 TGT GGG GTG GTT TTG ATO TTGCOTT Cys Gly Val Val Leu Ile Leu Leu 235
CTGCCOT
Val Pro 240 TTA TTG GC TAT Leu Leu Ala Tyr ACC TGO AAA OC Thr Cys Lys Pro
ACC
Thr 250 TTA AAA GC TOA Leu Lys Gly Ser 831 884
AAA
Lys 255 GAA GTG AGT TTG Glu Val Ser Leu CO AAA AAA AGG Ala Lys Lys Arg TAGAGGGOAT GGCGAGGTTT TOTTTAAAAG AAATTTTAAT INFORMATION FOR SEQ ID NO:76: WO 98/21225 PCT/US97/21353 -186- SEQUENCE CHARACTERISTICS: LENGTH: 265 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: Met Ile Lys Gin Thr Leu Ile Ile Leu Ala 1 Leu Leu Pro Ala Ile Phe Lys Gly Ile 145 Ser Lys Thr Glu Phe 225 Leu Val Leu Tyr Val Leu Ser Val Arg Leu 130 Pro Ser Ile Ser Ile 210 Leu Leu Ser Tyr Phe Pro Lys Trp Ala Ile 115 Gly Ser Ile Gly Ser 195 Ala Ala Ala Leu Gly Phe Leu Ala 70 Ser Leu Leu Ile Ala 150 Leu Leu Leu Met Pro 230 Thr Lys Ala Met Ile 55 Ser Gly Gly Leu Gly 135 Arg Met Met Thr Gly 215 Cys Cys Lys Pro Gly 40 Ala Asp Tyr Leu Ile 120 Leu Ser Gly Trp Ala 200 Val Gly Lys Arg 10 Gly Ile Ser Val Asn Tyr Phe Glu Gly Thr 170 Ala Ala Glu.
Val Thr 250 Pro Leu Ile Ala Ala 75 Lys Glu Leu Leu Ile 155 Pro Leu Pro Ile Leu 235 Leu Phe Arg Gly Leu Ser Thr Lys Gly Cys 140 Leu Asn Ala Asn Ser 220 Ile Lys Phe Pro Leu Val Ala Val Ser Gin 125 Leu Tyr Asn Ser Pro 205 Trp Leu Gly Ala Ala Leu Cys Lys Leu Leu Pro Pro Ile Pro 175 Cys Ala Ser Val Lys 255 Thr Trp Glu Ile Ala Val Gly Leu Phe Val 160 Asp Ile Met Trp Pro 240 Glu INFORMATION FOR SEQ ID NO:77: SEQUENCE CHARACTERISTICS: LENGTH: 1194 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: WO 98/21225 PCT/US97/21353 -187- NAME/KEY: Coding Sequence LOCATION: 152...1069 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
TTTTAAGCGG
TTTGACAATA
TGAGCTTTGG
TTTCCCTAAA ATAGGTTTTT TTATACTATA ATAACCAATT CTTAGTTTTG TAAGGAATGA AATCAATTTA ATCCAAAGTT AGATTGGGGT TTTACTGATT G ATG ATA AAG AGT TGG
GAATTTATTT
TTTCTTTGTG
ACT AAA 120 172 Met Ile Lys Ser Trp Thr Lys AAG TGG TTT Lys Trp Phe TTG ATT TTA TTT Leu Ile Leu Phe ATG GCA AGT TGT Met Ala Ser Cys
TCC
Ser AGT TAT TTG Ser Tyr Leu GTG GCT Val Ala ACA ACC GGT GAG Thr Thr Gly Glu
AAA
Lys 30 TAT TTT AAA ATG GCT ACT CAA GCC TTT Tyr Phe Lys Met Ala Thr Gln Ala Phe
AAG
Lys AGA GGG GAC TAC Arg Gly Asp Tyr AAA GCG GTG GCT Lys Ala Val Ala
TTT
Phe 50 TAT AAG AGG AGC Tyr Lys Arg Ser
TGT
Cys AAT TTA AGG GTG Asn Leu Arg Val GTT GGT TGC ACG Val Gly Cys Thr TTA GGC TCT ATG Leu Gly Ser Met TAT GAA Tyr Glu GAT GGC GAT Asp Gly Asp AGA AGA GGG Arg Arg Gly GTG GAT CAG AAT Val Asp Gln Asn ACA AAA GCC GTT Thr Lys Ala Val TTT TAT TAC Phe Tyr Tyr AGT CTA GGC Ser Leu Gly TGT AAT TTA AGG Cys Asn Leu Arg
AAT
Asn 95 CAT CTC GCT TGC His Leu Ala Cys
GCG
Ala 100 TCT ATG Ser Met 105 TAT GAA GAT GGC Tyr Glu Asp Gly
GAT
Asp 110 GGT GTG CAA AAA Gly Val Gln Lys
AAC
Asn 115 CTT CCA AAG GCT Leu Pro Lys Ala
ATC
Ile 120 TAT TAT TAC AGG Tyr Tyr Tyr Arg GGG TGC CAC TTA Gly Cys His Leu
AAG
Lys 130 GGT GGG GTG AGC TGT Gly Gly Val Ser Cys 135 GGG AGT TTA GGT TTT ATG TAT TTT AAT GGC ACG GGC GTT AAG CAA AAT Gly Ser Leu Gly Phe Met Tyr Phe Asn Gly Thr Gly Val Lys Gln Asn 140 145 150 TAT GCC AAA Tyr Ala Lys CTT TTT CTT TCT Leu Phe Leu Ser TAC GCT TGC AGT Tyr Ala Cys Ser TTG AAT TAC Leu Asn Tyr 165 652 GGC ATT AGT TGT AAC TTT GTA GGG TAT ATG TAT AGG AAC GCC AAA GGC 700 Gly Ile Ser 170 Cys Asn Phe Val Gly 175 Tyr Met Tyr Arg Asn Ala Lys Gly 180 GTA CAG Val Gln 185 AAG GAT TTG AAA Lys Asp Leu Lys GCC CTT GCG AAT Ala Leu Ala Asn
TTT
Phe 195 AAA AGA GGG TGC Lys Arg Gly Cys TTG AAA GAC GGA Leu Lys Asp Gly AGT TGT GTG AGC Ser Cys Val Ser
TTG
Leu 210 GGA TAC ATG TAT Gly Tyr Met Tyr
GAA
Glu 215 GTC GGT ATG GAT Val Gly Met Asp AAA CAA AAT GGA Lys Gln Asn Gly CAA GCC TTG AAT Gln Ala Leu Asn CTT TAT Leu Tyr 230 AAA AAG GGT Lys Lys Gly GTG ATG TAT Val Met Tyr 250 TAT TTA AAA AGG Tyr Leu Lys Arg AGC GGT TGT CAT Ser Gly Cys His AAT GTG GCG Asn Val Ala 245 GAT AAA GCC Asp Lys Ala TAC ACC GGT AAG Tyr Thr Gly Lys GTT CCA AAG GAT Val Pro Lys Asp
TTA
Leu 260 ATT TCG Ile Ser 265 TAT TAT AAG AAA Tyr Tyr Lys Lys TGC ACT CTA GGC Cys Thr Leu Gly AGT GGT AGC TGT Ser Gly Ser Cys
AAA
Lys 280 GTG TTA GAA GAA Val Leu Glu Glu ATT GGC AAG AAG Ile Gly Lys Lys GAT GAT TTG CAA Asp Asp Leu Gin 988 1036 1089 GAC GCG CAA AAC Asp Ala Gin Asn ACG CAA GAT GAT Thr Gin Asp Asp ATG CAA Met Gin 305 TAAGTTAAAG CTTATGGACT AATGATTAAA ACTCATCTTA TAGAAATCTT TCTACTCTCT TGTTATCAAA GCGTCTCTAT TGATGGGTAT TGAGACTAAA AATCTGCAAA TCTAG TAGGGATTAA 1149 1194 INFORMATION FOR SEQ ID NO:78: SEQUENCE CHARACTERISTICS: LENGTH: 306 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: Met Ile Lys Ser Trp Thr Lys Lys Trp Phe Leu Ile Leu Phe 1 5 10 Ala Ser Cys Ser Ser Tyr Leu Val Ala Thr Thr Gly Glu Lys Leu Met Tyr Phe 25 Lys Met Ala Thr Gin Ala Phe Lys Arg Gly Asp Tyr His Lys Ala Val WO 98/21225 PCT/US97/21353 -189- Ala Ser Thr Leu Gin Leu Gly 145 Tyr Met Ala Ser Glu 225 Ser Pro Leu Lys Met Phe Leu Lys Ala Lys Lys 130 Thr Ala Tyr Asn Leu 210 Gin Gly Lys Gly Ser 290 Gin Tyr Gly Ala Cys Asn 115 Gly Gly Cys Arg Phe 195 Gly Ala Cys Asp Phe 275 Asp Lys Arg Ser Cys Asn Leu Ser Val Ala 100 Leu Gly Val Ser Asn 180 Lys Tyr Leu His Leu 260 Ser Asp Met Phe Ser Pro Val Lys Leu 165 Ala Arg Met Asn Asn 245 Asp Gly Leu Tyr Glu 70 Tyr Tyr Leu Gly Lys Ala Ser Cys 135 Gin Asn 150 Asn Tyr Lys Gly Gly Cys Tyr Glu 215 Leu Tyr.
230 Val Ala Lys Ala Ser Cys Gin Asp 295 Asp Arg Ser Ile 120 Gly Tyr Gly Val His 200 Val Lys Val Ile Lys 280 Asp Gly Arg Met 105 Tyr Ser Ala Ile Gin 185 Leu Gly Lys Met Ser 265 Val Ala Arg Asp Gly 90 Tyr Tyr Leu Lys Ser 170 Lys Lys Met Gly Tyr 250 Tyr Leu Gin Val Gly 75 Cys Glu Tyr Gly Ala 155 Cys Asp Asp Asp Cys 235 Tyr Tyr Glu Asn Gly Val Asn Asp Arg Phe 140 Leu Asn Leu Gly Val 220 Tyr Thr Lys Glu Asp 300 Val Asp Leu Gly Arg 125 Met Phe Phe Lys Ala 205 Lys Leu Gly Lys Val 285 Thr Gly Gin Arg Asp 110 Gly Tyr Leu Val Lys 190 Ser Gin Lys Lys Gly 270 Ile Gin Cys Asn Asn Gly Cys Phe Ser Gly 175 Ala Cys Asn Arg Gly 255 Cys Gly Asp Thr Ile His Val His Asn Lys 160 Tyr Leu Val Gly Gly 240 Val Thr Lys Asp INFORMATION FOR SEQ ID NO:79: SEQUENCE CHARACTERISTICS: LENGTH: 1001 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 101...865 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79: TTTGTTTATA AGAAAAATTA TTTCAAATGT AGTAGAATTA AGGCAGTGTT TTTGCGTCAA GCGATTTTAG GTTAATTTTG AGTTTTTAGG AGCAGTTTTT ATG CAA CAA GAA GAG 115 WO 98/21225 PCTIUS97/21353 -190- Met Gin Gin Giu Glu 1 ATT ATA GAG GGT Ile Ile Glu Gly TAT GGT GCT AGC AAA GGG CTT AAA AAG Tyr Gly Ala Ser Lys Gly Leu Lys Lys AGC GGT Ser Gly ATT TAT GCC Ile Tyr Ala GCG CTC TTT Ala Leu Phe
AAG
Lys CTG GAT TTT TTA Leu Asp Phe Leu AGC GCT ACG GGC Ser Ala Thr Gly TTG ATT TTA Leu Ile Leu ATC TTG ATT Ile Leu Ile ATG ATA GCA CAC Met Ile Ala His
ATG
Met 45 TTT TTA GTC TCA Phe Leu Val Ser
AGT
Ser AGC GAT GAA GCC ATG TAT Ser Asp Glu Ala Met Tyr GTG GCG AAA TTT Val Ala Lys Phe
TTT
Phe GAA GGG AGC TTG Glu Gly Ser Leu
TTT
Phe TTA AAA GCG GGC Leu Lys Ala Gly CCG GCT ATT GTG Pro Ala Ile Val
AGC
Ser GTG GTT GCA GCA Val Val Ala Ala ATT ATT CTT ATT Ile Ile Leu Ile GTC GCG CAT GCT Val Ala His Ala TTG GCG TTA AGG Leu Ala Leu Arg AAA TTC Lys Phe 100 CCT ATC AAT Pro Ile Asn ATG AAA CAT Met Lys His 120
TAC
Tyr 105 AGG CAA TAC AAG Arg Gln Tyr Lys
GTT
Val 110 TTT AAA ACC CAT Phe Lys Thr His AAG CAT TTG Lys His Leu 115 CTC ACC GGG Leu Thr Gly 451 499 GGC GAT ACG AGC Gly Asp Thr Ser TGG TTT ATT CAA Trp Phe Ile Gln TTT GCG Phe Ala 135 ATG TTT TTC TTA Met Phe Phe Leu AGT ATC CAC TTA TTT GTC ATG CTC ACA Ser Ile His Leu Phe Val Met Leu Thr 145 CCT GAA AGT ATT GGG CCT CAT GGT TCA Pro Glu Ser Ile Gly Pro His Gly Ser
AGC
Ser 160
TTG
Leu TAT CGT TTT GTA Tyr Arg Phe Val AAC TTT TGG Asn Phe Trp
CTT
Leu 170 TAT ATT TTC Tyr Ile Phe TTT GCC GTA Phe Ala Val
GAA
Glu 180 CAT GGC TCT His Gly Ser AAA AAT GTG Lys Asn Val 200 GGG TTG TAT CGT Gly Leu Tyr Arg
TTA
Leu 190 GCG ATC AAA TGG Ala Ile Lys Trp GGG TOG TTT Gly Trp Phe 195 GCG ATG AGC Ala Met Ser AGC ATT CAA GGT Ser Ile Gln Gly AGA AAA GTC AAA Arg Lys Val Lys
TGG
Trp 210 GTG TTT TTT ATT GTT TTA GGG CTT TOC ACC TAT GOG OCT TAC ATT AAA 787 Val Phe 215 AAA GGT Lys Gly 230 Phe Ile Val Leu Leu Cys Thr AAT GGC ATT Asn Gly Ile TTA GAA AAT Leu Glu Asn
AAG
Lys 235 Tyr
AAA
Lys 240 Gly Ala Tyr Ile Lys 225 ACC ATG CAA GAA GCC Thr Met Gin Glu Ala 245 ;GGTAGA AA.ATGAAAAT AAC TTAAGGGCTA GTATCGCATG CCTGTCAGGC GTT ATA GAA GCT GAT GGG AAA TTC CAC AAA GAA TAA( Ile Glu Ala Asp Gly Lys Phe His Lys Glu 250 255 ATATTGTGAT GCGCTAATTA TTGGAGGCGG ACTAGCTGGG CAAACAAAAG GGTTTAAACA CCATCGTTTT AAGCCTAGTG INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 255 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal 835 888 948 1001 (xi) SEQUENCE Gin Gln Glu Giu Lys Lys Ser Gly Gly Leu Ile Leu Ser Ile Leu Ile Glu Gly Ser Leu Val Ala Ala Gly Leu Arg Lys Phe 100 His Lys His Leu 115 Ala Leu Thr Gly 130 Val Met Leu Thr Arg Phe Val Thr 165 Ala Val Giu Leu 180 Trp Gly Trp Phe 195 Trp Ala Met Ser 210 DESCRIPTION: SEQ ID Ile Ile Glu Gly Tyr Ile Ala Ser Phe Ile Pro Met Phe Glu 150 Gln His Lys Val1 Tyr Leu Asp 55 Leu Ile Ile Lys Ala 135 Pro As n Gly Asn Phe 215 Lys Met Ala Ala Ile Tyr 105 Gly Phe Ser Trp Ile 185 Ser Ile Leu I le Met Gly Leu Arg Asp Phe Ile Leu 170 Gly Ile Val1 Tyr Asp Ala Tyr Glu 75 Val Gln Thr Leu Gly 155 Leu Leu Glfl Leu Gly Phe His Lys Pro Al a Tyr Ser Al a 140 Pro Tyr Tyr Gly Gly 220 Gly Ala Val1 Phe Ser Leu Lys Ile Leu Ser 160 Leu Ile Val1 Tyr WO 98/21225 PCT/US97/21353 -192- Gly 225 Thr Ala Tyr Ile Lys Met Gin Glu Ala 245 Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys 230 235 240 Ile Glu Ala Asp Gly Lys Phe His Lys Glu 250 255 INFORMATION FOR SEQ ID NO:81: SEQUENCE CHARACTERISTICS: LENGTH: 975 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 82...912 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: TTTTAAAATT AAAGAAAATT TTTTTTAAAG ATTATCACTC TTTTTTGATA AAGTAATCAT TTAAAATTTA GGGAGTTTTT T ATG GAA GAA TCA ACA GCG TTT ATT Met Glu Glu Ser Thr Ala Phe Ile TTG GCT Leu Ala ATT GGT Ile Gly 111 159 CTT GTG GGG CTA TTC ACC GGC ATT ACC Leu Val Gly Leu Phe Thr Gly Ile Thr
GCC
Ala 20 GGG TTT TTT GGT Gly Phe Phe Gly GGG GGG GAG Gly Gly Glu GTC GTC CCT AGC Val Val Pro Ser
GCG
Ala ATT TTT GCC CAT Ile Phe Ala His AGC CAT GCG GTG GGT ATT TCG Ser His Ala Val Gly Ile Ser ATG CAA ATG CTT Met Gln Met Leu GTC GGC Val Gly
TCT
Ser
TTT
Phe
GAT
Asp TTT AGC TAT Phe Ser Tyr TCT TCA GTG Ser Ser Val TTG AGA GAA Leu Arg Glu ATC ATC AAT Ile Ile Asn AAG GGC TTA Lys Gly Leu
GGC
Gly TCA TTT GCC GCG Ser Phe Ala Ala
CTT
Leu 80 GGA GGG CTA ATG Gly Gly Leu Met
GGA
Gly GCG ATT TTA GGG Ala Ile Leu Gly 255 303 351 399 447 TTT ATC TTA AAA Phe Ile Leu Lys
ATC
Ile ATT GAC GAT AAA Ile Asp Asp Lys
ATT
Ile 100 TTA ATG GCG GTG Leu Met Ala Val TTT GTG Phe Val 105 GTG GTG GTG Val Val Val
TGC
Cys 110 TAC ACC TTT ATC AAA TAC GCT TTT TCT Tyr Thr Phe Ile Lys Tyr Ala Phe Ser AGC AAC AAG Ser Asn Lys 120 AAA CCC AAG Lys Pro Lys 125 CAT TTT GAA GAA ATG CAT TTT GAT TTG His Phe Glu Glu Met His Phe Asp Leu 130
CAT
His 135 GCG AAT AAC Ala Asn Asn 495 AAA ACG Lys Thr 140 CCC GAA AAA AAG CGC GCA ATC CCT TTT Pro Glu Lys Lys Arg Ala Ile Pro Phe 145 TCT ATG GAT AGA Ser Met Asp Arg
ACG
Thr 155 CAT GGG GTT TTG His Gly Val Leu
ATG
Met 160 CTC GCC GGT TTT Leu Ala Gly Phe
GTT
Val 165 ACC GGC ATC TTT Thr Gly Ile Phe
TCT
Ser 170 ATC CCA CTA GGC Ile Pro Leu Gly GGT GGA GGG ATT Gly Gly Gly Ile
TTA
Leu 180 ATG GTG CCG TTT Met Val Pro Phe TTG GGC Leu Gly 185 TAT TTT TTG Tyr Phe Leu TTT GTG GTG Phe Val Val 205 TAC GAT TCT AAA Tyr Asp Ser Lys ATC GTG CCT TTG Ile Val Pro Leu GGG CTA TTT Gly Leu Phe 200 TAT AAC GGG Tyr Asn Gly TTC GCT TCT TTA Phe Ala Ser Leu
TCT
Ser 210 GGG GTC ATC TCT Gly Val Ile Ser
CTT
Leu 215 AGG GTT Arg Val 220 CTT GAT AAT ATA Leu Asp Asn Ile
AGC
Ser 225 GTT CAA GCG GGG Val Gln Ala Gly ATT ACC GGC ATT Ile Thr Gly Ile
GGA
Gly 235 GCG TTT TTA GGC Ala Phe Leu Gly GGC ATT GGC ATC Gly Ile Gly Ile AAG CTT Lys Leu 245 ATC GCT TTG Ile Ala Leu
GCT
Ala 250 AAT GAA AAG GTG Asn Glu Lys Val AAA ATC CTG TTG Lys Ile Leu Leu CTT ATT TAT GCT Leu Ile Tyr Ala TTA AGC Leu Ser 265 ATT TTA GCG Ile Leu Ala
ACT
Thr 270 TTA CAC AAG CTC Leu His Lys Leu ATT ATG GGG Ile Met Gly 275 TAAATCTAAA AACGCTTCTA GGGCATTTTT AAAATTAATA TCAAAGAGCT TTCACCAGCA AGC INFORMATION FOR SEQ ID NO:82: SEQUENCE CHARACTERISTICS: LENGTH: 277 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEO ID NO:82: WO 98/21225 WO 9821225PCTIUS97/21353 194- Met Gly Pro Ser Tyr Gly Asp Phe Glu Arg 145 Leu Gly Ser Leu Ser 225 Gly Ile Lys Glu Ile Ser Leu Lys Gly Asp Ile Met 130 Ala Al a Gly Lys Ser 210 Val1 Ile Leu Leu Gin Thr Ala Met Lys Len Lys Lys 115 His Ile Glv Ile Lys 195 Gly Gin Gly Len Ile Ser Thr Ala Phe Ile Leu Ala Leu Ala Ile Gin Gly Met Ile 100 Tyr Phe Pro Phe Leu 180 Ile Val1 Ala Ile Len 260 Met Phe His Phe Asp Ile Al a Ser His 135 Ser Gly Pro Len Len 215 Ile Ile Tyr Gly Phe 40 Ser Len Len Val1 Ser 120 Al a Met Ile Phe Gly 200 Tyr Thr Ala Ala Ile Ser Ser Arg Gly Phe 105 As n Asn Asp Phe Len 185 Leu Asn Gly Len Len 265 Gly Tyr Val1 Gin Se r Vai Lys As n Arg Ser 170 Gly Phe Gly Ile Al a 250 Ser Gly Ser Val1 Gly 75 Phe Val1 Ly S Ly s Thr 155 Ilie Tyr Phe Arg Gly 235 As n Ile Gly Gin Al a Ser Phe Len Val1 Lys 125 Pro Gly Len Leu.
Val1 205 Len Phe Lys Ala Phe Val Gly Ile Ala Ile Tyr Phe Lys Len Met 175 Tyr Ala As n Gly His 255 Len Thr Val Ile Asn Len I le Thr Gin Lys Met 160 Gly Asp Ser Ile Val 240 Lys His INFORMATION FOR SEQ ID NO:83: SEQUENCE CHARACTERISTICS: LENGTH: 1667 base pairs TYPE: nncieic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Seqnence LOCATION: 220.. .1482 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID N'O:83: WO 98/21225 WO 98/21225PCTIUS97/2 1353 -195-
AAGCGCGAGC
AAGGCTCTTG
TTATAGTCAA
CCTTTAAGCG
TATATGAGGA ATTTTAGCTT CTATGTGGGC TATTCAGTCG ATGAAAAATA CCAATACAAA AGAGATAAAG AATACAAGAA TACCACGCGC TCAAAAAAGG GCTTTTAAAA ACGCTCTGCT TGGCGTTAGC TGAAGACGAT GGCTTTTAT ATG GGA GTG
GTTTTTAAGG
TGAAAAAAGG
TTTTAGCCTT
GGC TAT Met Gly Val Gly Tyr 120 180 234 282 CAA ATC GGC GGC GCG CAA CAA AAT ATC Gln Ile Gly Gly Ala Gln Gln Asn Ile AAC AAA GGC AGC Asn Lys Gly Ser ACC CTA Thr Leu AGG AAT AAT Arg Asn Asn GGG GGT AAT Gly Gly Asn ATT AAT AAT TTC Ile Asn Asn Phe
CGC
Arg CAA GTG GGC GTG Gln Val Gly Val GGT ATG GCA Gly Met Ala ATG GAC GCT Met Asp Ala GGG CTT TTA GCC TTA GCG ACA AAC ACG Gly Leu Leu Ala Leu Ala Thr Asn Thr
ACC
Thr CTT TTA Leu Leu GGG ATA GGC AAC Gly Ile Gly Asn
CAA
Gln ATT GTC AAT ACT Ile Val Asn Thr ACA ACT GTT AGC Thr Thr Val Ser
AAC
Asn AAC AAC GCA GAA Asn Asn Ala Glu
TTA
Leu 75 ACC CAG TTT AAA Thr Gln Phe Lys
AAA
Lys 80 ATA CTC CCT CAA Ile Leu Pro Gln
ATT
Ile GAG CAA CGC TTT Glu Gln Arg Phe
GAA
Glu ACG AAT AAA AAC Thr Asn Lys Asn
GCT
Ala 95 TAT AGC GTT CAA Tyr Ser Val Gln GCC TTG Ala Leu 100 CAA GTG TAT Gln Val Tyr AAT GGC AGT Asn Gly Ser 120 AGT AAT GTG CTT TAT AAC TTG GTT AAT AAT AGT AAT Ser Asn Val Leu Tyr Asn Leu Val Asn Asn Ser Asn AAT GGA GTC Asn Gly Val
GTT
Val 125 GAA TAT GTA Glu Tyr Val 115 ATT ATA AAA Ile Ile Lys GTT CTC Val Leu 135 TAT GGT TCT CAA Tyr Gly Ser Gln
AAT
Asn 140 GAA TTC AGT CTC Glu Phe Ser Leu GCC ACG GAG AGT Ala Thr Glu Ser
GTG
Val 150 GTG CTT TTA AAC Val Leu Leu Asn
GCG
Ala 155 CTT ACA AGG GTG Leu Thr Arg Val
AAT
Asn 160 CTG GAT AGT AAT Leu Asp Ser Asn GTG TTT TTA AAA Val Phe Leu Lys CTA TTA GCC CAA Leu Leu Ala Gln
ATG
Met 175 CAG CTT TTT AAT Gln Leu Phe Asn GAC ACT Asp Thr 180 TCT TCA GCA Ser Ser Ala CTA GGC CAG ATC Leu Gly Gln Ile GAA AAC TTG AAG AAC GGT GGT Glu Asn Leu Lys Asn Gly Gly 195 GCA GGA TCA ATG CTC CAA AAG GAT GTG AAA ACC ATC TCG GAT CGA ATC Ala Gly Ser 200 Met Leu Gln Lys Asp 205 Val Lys Thr Ile Ser 210 Asp Arg Ile GCT ACT Ala Thr 215 TAC CAA GAG AAT Tyr Gln Glu Asn AAA CAG CTA GGA Lys Gln Leu Gly ATG CTA AAG AAT Met Leu Lys Asn GAT GAA CCC TAC TTG CCC CAA TTT GGG Asp Glu Pro Tyr Leu Pro Gln Phe Gly 235
CCA
Pro 240 GGC ACA AGC TCT Gly Thr Ser Ser
CAG
Gln 245 CAT GGG GTT ATT His Gly Val Ile
AAT
Asn 250 GGC TTT GGC ATT Gly Phe Gly Ile
CAA
Gln 255 GTG GGC TAT AAG Val Gly Tyr Lys CAA TTT Gln Phe 260 1002 TTT GGG AAC Phe Gly Asn TAT GGC TTT Tyr Gly Phe 280 CGG AAT ATA GGC Arg Asn Ile Gly
TTA
Leu 270 CGA TAT TAC GCT Arg Tyr Tyr Ala TTC TTT GAT Phe Phe Asp 275 AAA GCG AAT Lys Ala Asn 1050 1098 ACG CAA TTG GGC Thr Gln Leu Gly
AGT
Ser 285 CTT AGC AGC GCC Leu Ser Ser Ala
GTT
Val 290 ATC TTT Ile Phe 295 ACT TAT GGC GCT Thr Tyr Gly Ala
GGC
Gly 300 ACG GAC TTT TTA Thr Asp Phe Leu AAT ATC TTT AGA Asn Ile Phe Arg
AGG
Arg 310 GTT TTT AGC GAT Val Phe Ser Asp TCC TTG AAT GTG Ser Leu Asn Val
GGG
Gly 320 GTG TTT GGG GGC Val Phe Gly Gly 1146 1194 1242 CAA ATA GCG GGT Gln Ile Ala Gly ACT TGG GAT AGC Thr Trp Asp Ser TTA AGA GGT CAA Leu Arg Gly Gln ATT GAA Ile Glu 340 AAC TCG TTT Asn Ser Phe AAT TTG GGT Asn Leu Gly 360
AAA
Lys 345 GAA TAC CCC ACT Glu Tyr Pro Thr ACG AAT TTC CAA Thr Asn Phe Gln TTT TTG TTT Phe Leu Phe 355 CGC CGG TTT Arg Arg Phe 1290 1338 TTA AGG GCT CAT Leu Arg Ala His TTT GCC Phe Ala 365 AGC ACC ATG Ser Thr Met TTG AGC Leu Ser 375 GCG TCT CAA AGC Ala Ser Gln Ser CAG CAT GGG ATG Gln His Gly Met
GAA
Glu 385 TTT GGC GTG AAA Phe Gly Val Lys 1386
ATC
Ile 390 CCG GCT ATC AAT Pro Ala Ile Asn AGG TAT TTG AGG GCC AAT GGG GCT GAT Arg Tyr Leu Arg Ala Asn Gly Ala Asp 400
GTG
Val 405 1434 GAT TAC AGG CGT Asp Tyr Arg Arg
TTG
Leu 410 TAT GCG TTC TAT ATC PAT TAC ACG ATA Tyr Ala Phe Tyr Ile Asn Tyr Thr Ile GGT TTT T Gly Phe 420 1483 WO 98/21225 PCT/US97/21353 -197- AAGCTCTTTT TAGGGCTTAT AAAGAGGCTT TTTACTTTTT TTTTGGTATT CTAACAAGCT 1543 TTTAAATAAT CCAATCTACT TTGTTTTAAG GATAATATTT TATGGCAGAT GTCGTTGTGG 1603 GGATCCAGTG GGGAGATGAG GGGAAGGGAA AAATTGTTGA TAGGATCGCT AAAGATTATG 1663 ACTT 1667 INFORMATION FOR SEQ ID NO:84: SEQUENCE CHARACTERISTICS: LENGTH: 421 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: Met Gly Val Gly Tyr Gin Ile Gly Gly Ala Gin Gin Asn Ile Asp Asn 1 5 10 Lys Gly Ser Thr Leu Arg Asn Asn Val Ile Asn Asn Phe Arg Gin Val 25 Gly Val Gly Met Ala Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn 40 Thr Thr Met Asp Ala Leu Leu Gly Ile Gly Asn Gin Ile Val Asn Thr 55 Asn Thr-Thr Val Ser Asn Asn Asn Ala Glu Leu Thr Gin Phe Lys Lys 70 75 Ile Leu Pro Gin Ile Glu Gin Arg Phe Glu Thr Asn Lys Asn Ala Tyr 90 Ser Val Gin Ala Leu Gin Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu 100 105 110 Val Asn Asn Ser Asn Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr 115 120 125 Val Gly Ile Ile Lys Val Leu Tyr Gly Ser Gin Asn Glu Phe Ser Leu 130 135 140 Leu Ala Thr Glu Ser Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn 145 150 155 160 Leu Asp Ser Asn Ser Val Phe Leu Lys Gly Leu Leu Ala Gin Met Gin 165 170 175 Leu Phe Asn Asp Thr Ser Ser Ala Lys Leu Gly Gin Ile Ala Glu Asn 180 185 190 Leu Lys Asn Gly Gly Ala Gly Ser Met Leu Gin Lys Asp Val Lys Thr 195 200 205 Ile Ser Asp Arg Ile Ala Thr Tyr Gin Glu Asn Leu Lys Gin Leu Gly 210 215 220 Gly Met Leu Lys Asn Tyr Asp Glu Pro Tyr Leu Pro Gin Phe Gly Pro 225 230 235 240 Gly Thr Ser Ser Gin His Gly Val Ile Asn Gly Phe Gly Ile Gin Val 245 250 255 Gly Tyr Lys Gin Phe Phe Gly Asn Lys Arg Asn Ile Gly Leu Arg Tyr 260 265 270 Tyr Ala Phe Phe Asp Tyr Gly Phe Thr Gin Leu Gly Ser Leu Ser Ser 275 280 285 Ala Val Lys Ala Asn Ile Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu WO 98/21225 
PCT/US97/21353 -198- 290 Asn 300 Ser Ile Phe Arg Arg Phe Ser Asp Gin Leu Asn Val 310 Gin 315 Thr Phe Gly Gly Ile 325 Glu Ile Ala Gly Asn 330 Glu Trp Asp Ser Ser Leu 335 Arg Gly Gin Phe Gin Phe 355 Met His Arg Ile 340 Leu Asn Ser Phe Tyr Pro Thr Pro Thr Asn 350 Ala Ser Thr Phe Asn Leu Arg Ala His Phe 365 Gin Arg Phe Leu 370 Glu Phe Ser 375 Pro Ser Gin Ser His Gly Met Gly Val Lys 385 Asn Ile 390 Asp Ala Ile Asn Gin 395 Tyr Tyr Leu Arg Ala 400 Asn Gly Ala Asp Tyr Arg Arg Leu 410 Ala Phe Tyr Tyr Thr Ile Gly Phe 420 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 926 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 207...746 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID CCCCTTAATT GCAGATGTTT TGCAAGAGGG ATTGCGTGGC GTCTATCATT CTAGAGAGAT AGACTTTGTA GAAAAAGTGG TTGTTTTAGA CAGCTGTCAA ATCCACCAAA AAGCGTTAAT GCATTTGCAA GAAACTTTGA TGATAGAAGT GGATAGGCTT GATTTTTCTT TAGTGGAGCG CTTGAACATT TTAGCGCGCA TGGAGA ATG AAA AGC ATG CGT TTT AGT TAC ATT Met Lys Ser Met Arg Phe Ser Tyr Ile 120 180 233 GAG CCA AGA GCG AAA TAC CTT Glu Pro Arg Ala Lys Tyr Leu 15 ATC AGC AAG Ile Ser Lys
CTT
Leu TCT AAA ATT TGG Ser Lys Ile Trp TTT TAC ATT TTT TTA TCT TTT GTG GTA Phe Tyr Ile Phe Leu Ser Phe Val Val ATG CAC AAC GCC ATT AAA AGC ACT CAA Met His Asn Ala Ile Lys Ser Thr Gin
ATA
Ile 35 GGG GGG TTA GTG Gly Gly Leu Val TGG TTT Trp Phe 329 GAC AAC GCG Asp Asn Ala TCC AGT TTG ACG Ser Ser Leu Thr 377 ATC CAA GAA Ile Gln Glu AGG CTC TAC CGC CAT GAA ATC AGC CGC TTA CAG GTT AAG Arg Leu Tyr Arg His Glu Ile Ser Arg Leu Gln Val Lys 425 ACT GAT Thr Asp GAA ACC TTA AAA Glu Thr Leu Lys ATT AAA GAA GCC Ile Lys Glu Ala
AAA
Lys AAG CGT TTG AAT Lys Arg Leu Asn
TAT
Tyr AAC GAT GAT ATA Asn Asp Asp Ile
CGA
Arg 95 GAT GTT TTG CAA Asp Val Leu Gln
GGG
Gly 100 CTT TTG AAT ATT Leu Leu Asn Ile
GTG
Val 105 CCG GAT TCC ATC Pro Asp Ser Ile
ACT
Thr 110 ATT AAT AGC ATT Ile Asn Ser Ile
GAA
Glu 115 ATA GAC CAG CAA Ile Asp Gln Gln AGC GTG Ser Val 120 GTT GTT AGC Val Val Ser CAA AAC AAA Gln Asn Lys 140 AAA ACC CCT TCT Lys Thr Pro Ser GAA GCC TTT TAT Glu Ala Phe Tyr TTT TTG TTT Phe Leu Phe 135 GAA TTT TTC Glu Phe Phe CTA AAC CCC ATG Leu Asn Pro Met GAT TAT TCT AGG Asp Tyr Ser Arg
GCG
Ala 150 CCC TTA Pro Leu 155 TCC TTA Ser Leu 170 AGC GAT GGG TGG Ser Asp Gly Trp
TTT
Phe 160
CCG
Pro AAT TTT GTC TCC Asn Phe Val Ser
ACC
Thr 165 AAC TTT TCT AAT Asn Phe Ser Asn 713 766 CTG ATA AAA Leu Ile Lys GAG TCT ATT Glu Ser Ile
AAA
Lys 180 TGAAGCCATT GCATTTTTCA
CACCTGGACA
GGGGTTTTTT
GAATTAAAAA
GAGAGCAATC AGGCGATGTG CCTTATTGGG TTGGTTGAAT AAATCCTTTT AGAAGAAAAT
GGGTTTATCA
ACCGAGTATT
CGTAAAAAAA
TTAAAAACCT CGTTTTTTTA TTCTATGGCC TAGCATGCTG INFORMATION FOR SEQ ID NO:86: SEQUENCE CHARACTERISTICS: LENGTH: 180 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: Met 1 Ile Val Lys Ser Met Ser Lys Leu Val Ile Gly Phe Ser Tyr Ile Pro Arg Ala Lys Tyr Leu Lys Ile Trp Tyr Ile Phe Thr Gin Asp Asn Gly Leu Val Ala Ser Ser Trp Leu Leu Ser Phe Ile Lys Ser Leu Tyr Arg Met His Asn Thr Ile Gin Glu WO 98/21225 PCT/US97/21353 -200- His Ile Ile Ser Arg Leu 70 Val Lys Thr Asp Asn Thr Leu Lys Lys Glu Ala Lys Arg Leu Asn Tyr Pro Asp Asp Ile Arg Asp Val Leu Gin Ser Ile Glu 115 Ser Lys Glu Leu Asn Ile Asp Ser Ile Asp Gin Gin Val Val Ser Gly 125 Leu Thr Ile Asn 110 Lys Thr Pro Asn Pro Met Ala Phe Tyr 130 Phe Asp Tyr Ser Arg Ala 150 Asn Phe Leu Phe 135 Glu Phe Phe Phe Ser Asn Gin Asn Pro Leu 155 Ser Leu 170 Lys 140 Ser Asp Gly Trp Phe 160 Pro Phe Val Ser Leu Ile Lys Glu Ser Ile INFORMATION FOR SEQ ID NO:87: SEQUENCE CHARACTERISTICS: LENGTH: 1440 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 151...1299 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: AGCACTTTCG CTTTTCATTG TTTTGATGCG AACGATGAGG TGAGCGATGC GTTTTTAATC CATAAAATCA TTCAAACCCA TTTCAAACGC ACTTCTAGTT TCAGGCTTTT GCAAGTGTTA ATACAAGATT TTAAAGAACA GCGCATCATT ATG TGC GTG GTT TTG AGC GTG AAA Met Cys Val Val Leu Ser Val Lys 1 AGA GAT GGT GAA AAA ACT TTA GAA AAT AAT GAA GAA AAT AAA GAT GAA Arg Asp Gly Glu Lys Thr Leu Glu Asn Asn Glu Glu Asn Lys Asp Glu 15 120 174 222 270 318 366
AAG
Lys CTT ATT TTG ATT GAT GAA TTT GAA GTT TTA GCC AAT AAA TTC ATT Leu Ile Leu Ile Asp Glu Phe Glu Val Leu Ala Asn Lys Phe Ile 30 35 TCT CGT TTG CCC AAT ATC CCT AGC ACC CCT AGA GAG TTT GGG TTA GGC Ser Arg Leu Pro Asn Ile Pro Ser Thr Pro Arg Glu Phe Gly Leu Gly 50 AAG GGC GAG ATC ATG GAG ATT GAT GTG CCT TTT GGG AGT ATT TTT GCT Lys Gly Glu Ile Met Glu Ile Asp Val Pro Phe Gly Ser Ile Phe Ala TAC AGA CAC Tyr Arg His ATT GGC TCT ATC Ile Gly Ser Ile CAA AAA GAA TAC Gln Lys Glu Tyr
AGG
Arg ATT GTA GGG Ile Val Gly CTT TAT Leu Tyr CGC AAC GAT GTT Arg Asn Asp Val
TTG
Leu TTG CTC TCC ACT Leu Leu Ser Thr TCT TTA GTT ATC Ser Leu Val Ile
CAG
Gln 105 CCG CGA GAC ATT Pro Arg Asp Ile
CTC
Leu 110 TTA GTG GCG GGT Leu Val Ala Gly CCG GAA ATT TTG Pro Glu Ile Leu
AAT
Asn 120 GCG GTG TAT CTT Ala Val Tyr Leu
CA
Gln 125 GTC AAA AGC AAT Val Lys Ser Asn
GTG
Val 130 GGG CAG TTC CCA Gly Gln Phe Pro GCC CCC Ala Pro 135 TTT GGT AAG Phe Gly Lys AAA GCG ATG Lys Ala Met 155 ATT TAT TTA TAC Ile Tyr Leu Tyr GAT ATG CGT TTG Asp Met Arg Leu CAG AAC AGA Gln Asn Arg 150 CAC AAA CAT His Lys His ATG CGC GAT GTG Met Arg Asp Val CAA GCC TTG TTT Gln Ala Leu Phe TTA AAG Leu Lys 170 AGC TAC AAG CTC Ser Tyr Lys Leu
TAC
Tyr 175 ATT CAG GTT TTA Ile Gln Val Leu
CAC
His 180 CCC ACT AGC CCT Pro Thr Ser Pro
AAG
Lys 185 TTT TAC CAT AA Phe Tyr His Lys
TTT
Phe 190 TTA GCG CTA A Leu Ala Leu Glu
ACC
Thr 195 GAA AGC ATT PA Glu Ser Ile Glu
GTG
Val 200 AAT TTT GAT TTT Asn Phe Asp Phe AGG AAA AGT TTT Arg Lys Ser Phe CAA AAA CTC CAT Gln Lys Leu His GAA GAC Glu Asp 215 CAC CAG AA His Gln Lys TCT AAA AA Ser Lys Lys 235 ATG GGC CTA ATC Met Gly Leu Ile GTA GGC AGA GAG Val Gly Arg Glu CTT TTT TTA Leu Phe Leu 230 CCA GTT TAT Pro Val Tyr CAC CGA AAG GCC His Arg Lys Ala TAT AAA ACA GCC Tyr Lys Thr Ala AAA ACC Lys Thr 250 AAC ACT TCT GGC Asn Thr Ser Gly
TTG
Leu 255 TCT AAA ACC TCT Ser Lys Thr Ser
CA
Gln 260 AGC GTG GTG GTA Ser Val Val Val
TTG
Leu 265 AAT GAA ACT TTG Asn Glu Ser Leu
GAT
Asp 270 ATT AAT GAG GAC Ile Asn Glu Asp TCT TCA GTG ATT Ser Ser Val Ile GAT GTG TCT ATG CAA ATG GAT TTG GGC TTG TTG CTC TAT GAT TTT GAC 13 1038 Asp Val Ser Met Gln Met Asp Leu Gly 285 Leu 290 Leu Leu Tyr Asp Phe Asp 295 CCT AAC AAG Pro Asn Lys GCC AAC GCG Ala Asn Ala 315
CGC
Arg 300 TAT AAA AAC GAG ATT GTC AAT CAT TAT Tyr Lys Asn Glu Ile Val Asn His Tyr 305 GAA AAT TTA Glu Asn Leu 310 GAT ATT AGA Asp Ile Arg 1086 TTC AAC CGC AAG Phe Asn Arg Lys GAG ATT TTC CAA Glu Ile Phe Gln
ACC
Thr 325 1134 AAT CCT Asn Pro 330 ATC ATG TAT CTC Ile Met Tyr Leu TCT TTA AGA AAT CCC ATT TTG CAT TTC Ser Leu Arg Asn Pro Ile Leu His Phe 340 1182 CCT TTT GAA GAG Pro Phe Glu Glu
TGC
Cys 350 ATC ACG CAC ACG Ile Thr His Thr
CGC
Arg 355 TTT TGG TGG TTT Phe Trp Trp Phe TCC ACT AAA GTG Ser Thr Lys Val 1230 1278 1333 GAA AAA TTA GCG Glu Lys Leu 365 GTA GCG GAG Val Ala Glu Ala TTT TTA Phe Leu 370 AAC GAT GAT AAC Asn Asp Asp Asn CCT CAA Pro Gin 375 ATT TTT ATC Ile Phe Ile
CCT
Pro 380 TGAAAGAATG CAAGAAATTT TAATCCCTTT AAAA GAAAAAAACT ATAAAGTGTT TTTGGGGGAA CTGCCTGAAA TAAAATTGAA ACAAAAAGCC CTCATCATTA GCGATAGCAT CGTAGCCGGG TTGCATTTGC CCTATTT 1393 1440 INFORMATION FOR SEQ ID NO:88: SEQUENCE CHARACTERISTICS: LENGTH: 383 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: Met Cys Val Val Leu 1 5 Asn Asn Glu Glu Asn Glu Val Leu Ala Asn Thr Pro Arg Glu Phe Val Pro Phe Gly Ser Gin Lys Glu Tyr Arg Leu Ser Thr Lys Ser Ser Val Lys Arg Asp Gly Glu 10 Lys Asp Glu Lys Leu Ile Leu 25 Lys Phe Ile Ser Arg Leu Pro Lys Ile Asn Thr Leu Glu Asp Glu Phe Ile Pro Ser Glu Ile Asp Gly Leu Gly Lys Gly 55 Ile Phe Ala Tyr Arg 70 Ile Val Gly Leu Tyr 90 Leu Val Ile Gin Pro Glu Ile His Ile 75 Met Gly Ser Ile Arg Arg Asn Asp Val Arg Asp Ile Leu Leu Leu Leu Val WO 98/21225 WO 9821225PCT/US97/21353 -203- Ala Gly Asn Pro Giu Ile Leu Asn Ala Asn Ilie 145 Gin Gin Leu Phe Vali 225 Tyr Lys Glu Giy Ilie 305 Glu Leu His Phe 115 Gly Met Leu Leu Thr 195 Gin Giy Th r Ser Met 275 Leu As n Phe Asn Arg 355 Asn Phe Leu Leu 165 Pro Ser Leu Giu Thr 245 Ser Ser Tyr Tyr Thr 325 Ile Trp Asp Pro Gin iso His Thr Ile His Leu 230 Pro Vai Val Asp Giu 310 Asp Leu Trp Asn 120 Pro Arg His Pro Val 200 Asp Leu Tyr Vai Phe 280 Asp Leu Arg Phe Leu 360 Gin Phe Lys Leu Lys 185 Asn His Ser Lys5 Leu 265 Asp Pro Al a Asn Met 345 Ser Ile Val1 Giy Al a Lys 170 Phe Phe Gin Lys Thr 250 As n Val1 Asn As n Pro 330 Pro Thr Phe Leu Ser 140 Met Tyr His Phe Lys 220 His Thr Ser Met Arg 300 Phe Met Giu Val1 Pro 380 Gin 125 Ilie Arg Lys Lys Tyr 205 Met Arg Ser Leu Gin 285 Tyr Asn.
Tyr Giu Giu.
365 Val1 Lys Leu Val Tyr 175 Leu Lys Leu Ala Leu 255 Ile Asp Asn Lys Asn 335 Ile Leu Glu Ser Tyr Tyr 160 Ile Ala Ser Ile Leu 240 Ser Asn Leu Giu Ile 320 Ser Thr Al a INFORMATION FOR SEQ ID NO:89: Wi SEQUENCE CHARACTERISTICS: LENGTH: 517 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NA1'ME/KEY:.Coding Sequence LOCATION: .464 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: AGATTTCATT CGAGGTAGAA AATACATTGA AAAAGCGTGT GAATTA-AACG ATG GTA WO 98/21225 PCT/US97/21353 -204- Met Val 1 TAC TAT Tyr Tyr GGG GGT GGA ACG GTA AAA AAA GAC TTG AAG AAA GCC ATT CAA Gly Gly Gly Thr Val Lys Lys Asp 10 Leu Lys Lys Ala Ile Gin GTT AAA Val Lys GCG TGT GAA TTG Ala Cys Glu Leu GAA ATG TTT GGG Glu Met Phe Gly
TGT
Cys CTG TCA TTA GTT Leu Ser Leu Val AAC TCT CAA ATA AAC AAA CAA AAA CTC Asn Ser Gln Ile Asn Lys Gln Lys Leu
TTT
Phe CAA TAT CTC TCT Gln Tyr Leu Ser GCT TGT GAA TTA Ala Cys Glu Leu
AAT
Asn AGT GGT AAT GGA Ser Gly Asn Gly AGG TTT TTA GGG Arg Phe Leu Gly GAT TTT Asp Phe 248 TAT GAG AAT Tyr Glu Asn TAC TAC TCT Tyr Tyr Ser
GGA
Gly AAA TAT GTA AAA Lys Tyr Val Lys GAT TTA AGA AAA Asp Leu Arg Lys GCT GCT CAA Ala Ala Gln TGT TTA ATA Cys Leu Ile 296 344 AAA GCT TGT GGA Lys Ala Cys Gly AAT GAT CAA GAT Asn Asp Gln Asp
GGG
Gly CTA GGA Leu Gly 100 TAT AAG CAA TAT Tyr Lys Gln Tyr GGC AAG GGC GTA Gly Lys Gly Val GTC AAA AAT GAA Val Lys Asn Glu 110 TTA GGA TCT GAA Leu Gly Ser Glu
AAA
Lys
GAC
Asp 130
CAA
Gln 115 GCG GTG AAA ACC Ala Val Lys Thr GAA AAG GCT TGT Glu Lys Ala Cys
AGG
Arg 125 392 440 494 GCA TGT GGT ATT Ala Cys Gly Ile
TTA
Leu 135 AAC AAC TAC Asn Asn Tyr TAGATTTGAA ATAAATGCTG TTTTTTAGCT GGCTTTCATG TTTTTGTAAC CCC INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 138 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID Met Val Gly Gly Gly Thr Val Lys Lys Asp Leu Lys Lys Ala Ile Gin WO 98/21225 SPCT/US97/21353 -205- 1 Tyr Tyr Val Lys Cys Glu Leu o Glu Met Phe Gly Cys Leu Ser Gin Lys Leu Phe Gin Tyr Leu Leu Val Ser Ser Lys Ala Asn Ser Gin Ile Asn Ser Cys Glu Leu ASD Phe Asn 55 Lys Gly Asn Gly Cys Asp Phe Leu Gly Tyr Glu Asn Ala Gly Lys Tyr Val Lys Leu Arg Lys Gin Tyr Tyr Ala Cys Gly Asp Gin Asp Gly Cys Leu Ile Leu Glu Lys Gin 115 Glu Asp Ala 130 Gly 100 Ala Lys Gin Tyr Ala Lys Gly Val Val Lys Asn 110 Leu Gly Ser Val Lys Thr Phe Glu Lys 120 Leu Asn Asn Tyr 135 Ala Cys Arg 125 Cys Gly Ile INFORMATION FOR SEQ ID NO:91: SEQUENCE CHARACTERISTICS: LENGTH: 1663 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 68...1600 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:91: AAATGTTAGA AACCCTTACA AAACAAGCTA ATATATTCTA TTCAATTTGC CTCAAGGACA AACAAAC ATG AAA AAA CTT CTT TAT ACC ATA CTC GCG CTT CTT TTA ATC Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile 109 GGC CTT Gly Leu TTA ACA ATC Leu Thr Ile CTC ATC CTT TTT Leu Ile Leu Phe GAA TGG GGG AAT Glu Trp Gly Asn
AAG
Lys ATC ATC GCT TCG Ile Ile Ala Ser
TAT
Tyr ATA GAG AAA AAA Ile Glu Lys Lys
ATC
Ile 40 AAC CCG Asn Pro AAC GAG CAC TAC Asn Glu His Tyr TTG GAT TTT AAA Leu Asp Phe Lys TTG AGC GTT Leu Ser Val
AAA
Lys ACC TTT AAA TTG AGA TTC AAC TCT Thr Phe Lys Leu Arg Phe Asn Ser 55 GCT CAA GCC AAC GAT GAT TCC ACG CTC ATT CTT AAG GGG GAT TTT TCA Ala Gln Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser CTT TTA Leu Leu AAG CAA AGC GTA Lys Gln Ser Val TTG AAT TAC CAT Leu Asn Tyr His
ATA
Ile GAT ATT AAA CAT Asp Ile Lys Asp
TTA
Leu CGC TCT TTC AAA Arg Ser Phe Lys TGG ATA CCC TAC Trp Ile Pro Tyr
CCT
Pro TTA AGG GGG GCT Leu Arg Giy Ala 349 397 445 ATC ACT TCT GGG Ile Thr Ser Gly
AAT
Asn 115 ATT AAA GGG CAT Ile Lys Gly His AGA AAA Arg Lys 120 GCC CTT ATG Ala Leu Met ATT CAA Ile Gln 125 GGC GTC TCT Gly Val Ser GAT GAT TTC Asp Asp Phe 145 GTG GCT CAA TCC Val Ala Gln Ser ACT GCC TAC AAT Thr Ala Tyr Asn GCC CTT TTA Ala Leu Leu 140 GAC GCC AAT Asp Ala Asn AAG CTT TCT CGC TTA AAT TTG AAC GCA Lys Leu Ser Arg Leu Asn Leu Asn Ala 150 TTA GAA Leu Glu 160 GAT TTG CTT TAT Asp Leu Leu Tyr
TTA
Leu 165 ATC PAT CCC CCC Ile Asn Arg Pro TAT GCG AAC GCA Tyr Ala Asn Ala
AAA
Lys 175 CTG TCC TTA CAC Val Ser Leu Gin
GCC
Ala 180 CAT TTT AAC TCT Asp Phe Asn Ser
CTA
Leu 185 AAG CCT TTA GAG Lys Pro Leu Glu CAT TTC ATC CTA His Leu Ile Leu
ACA
Thr 195 GCT PAT AAC CCT Ala Asn Asn Ala ATC PAT AAC GCC Ile Asn Asn Ala CTA ATC Leu Ile 205 PAT CPA ATT Asn Gin Ile TCG CAT TCA Ser His Ser 225 CAT TTA AAC CTT His Leu Asn Leu CAC ACC CTT CTT Asp Thr Leu Val TTC ACC CTC Phe Ser Leu 220 CAT ACC ACC Asp Thr Thr 733 781 ACC CAC TTT AAA Ser Asp Phe Lys PAC AAA CCC ATC Asn Lys Ala Ile CTC ACT Leu Thr 240 AGC CCT TTA GCC Ser Pro Leu Ala
PAT
Asn 245 TTC AAA GCC CTA Phe Lys Ala Leu AGC GAA TAC CTT Ser Clu Tyr Leu
TTC
Phe 255 TCT ATT TTA AAA Ser Ile Leu Lys AAC CCC CCC TAC Asn Ala Pro Tyr
ACT
Thr 265 TTA GAA ATC CCC Leu Glu Ile Pro
PAT
Asn 270 829 877 925 CTA GCC APA CTC Leu Ala Lys Leu PAC ATT ACC PAC Asn Ile Thr Asn CCC TTA PAA GGG Pro Leu Lys Cly AGC TTC Ser Leu 285 ACT TTA AAA GGC CCT ATA GAA CPA ACC CCC AAA CTT TTA APA CTC AGC WO 98/21225 PCT/US97/21353 -207- Thr Leu Lys GGC CAT TCA Gly His Ser 305 Gly 290 Ala Ile Glu Gin Ser 295 Pro Lys Leu Leu Lys Val Ser 300 CTT TTA AAT Leu Leu Asn AAT TTA CTA GAC Asn Leu Leu Asp
GGC
Gly 310 GCG CTG GAT TTC Ala Leu Asp Phe
ACG
Thr 315 1021 AAA GAT Lys Asp 320 TTG AAA GGG CGT TTT TCC AAT ATT TCC Leu Lys Gly Arg Phe Ser Asn Ile Ser
ACT
Thr 330 TTA AAA GCT TTA Leu Lys Ala Leu 1069 1117 1165 TTA TTC CAT TAC CCT AAG TTT TTC CAA Leu Phe His Tyr Pro Lys Phe Phe Gin GAT TAT GAT Asp Tyr Asp
CTT
Leu 355 340
ATC
Ile
TCC
Ser 345
GTA
Val GTT GCA GAC GCT Val Ala Asp Ala
AAT
Asn 350
CTA
Leu GCT AAG CAA Ala Lys Gln TTG AAA GCC Leu Lys Ala
CGC
Arg 365 AAA AAC GCA Lys Asn Ala TTC CTC AAA AAT Phe Leu Lys Asn
GCA
Ala 375 TTC AGC GAT TTT Phe Ser Asp Phe CTC TAC TCC Leu Tyr Ser 380 GCC AAT CTG Ala Asn Leu 1213 ATT TCT AAA TTT GAT ATT ACA Ile Ser Lys Phe Asp Ile Thr 385
AAA
Lys 390 GAA ATT TAT AAC Glu Ile Tyr Asn 1261 1309 GTA AGC Val Ser 400 CAA ATC AAC CAG Gin Ile Asn Gin CGC CTG CTC TCT Arg Leu Leu Ser CTG AGT TTA AAA Leu Ser Leu Lys CCC AAA ACC CAA Pro Lys Thr Gin
TTG
Leu 420 AAA ATC CAT AAC Lys Ile His Asn
GGT
Gly 425 TTG TTG GAT TTA Leu Leu Asp Leu
AAC
Asn 430 1357 1405 ACC AAA CAA ATG Thr Lys Gln Met
AAC
Asn 435 ATG CTC ATG GAT Met Leu Met Asp GAA ATT TTA AAA Glu Ile Leu Lys TTC ATT Phe Ile 445 TTT AAA ATG Phe Lys Met ATT TTA AAC Ile Leu Asn 465
AAA
Lys 450 CTT CAA GGC AAC Leu Gln Gly Asn
ATG
Met 455 CAC CAG CCA AAA His Gln Pro Lys TTT TCT CTC Phe Ser Leu 460 GGC TTG AAA Gly Leu Lys 1453 1501 GAA AAA GCC ATT Glu Lys Ala Ile CAA AAC TTG CAA Gln Asn Leu Gln GAA ATC Glu Ile 480 TTA AAA AAC GAC Leu Lys Asn Asp CTT AAA AAA GGT Leu Lys Lys Gly GAT CAT TTG CTT Asp His Leu Leu 1549 1597 GAT GAT AAG CTC Asp Asp Lys Leu GAA AAG CTT GAA Glu Lys Leu Glu
AAA
Lys 505 GGG CTT AAG GGG Gly Leu Lys Gly
CTT
Leu 510 TTT TAAAAATTTT AAAGGATAGA AATGGCGCAC ATTTTAGTTA GCGGGGCGAC TTCAGG 1656 WO 98/21225 WO 981225PCT/US97/21353 -208- Phe GTTTGGA 16 1663 INFORMATION FOR SEQ ID NO:92: WI SEQUENCE CHARACTERISTICS: LENGTH: 511 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92: Met Lys Lys Leu Leu Tyr Thr 1 Leu Al a Val1 Al a Lys Ser Ser Ser Phe 145 Asp Ser Ile Ile Ser 225 Ser Ile Lys Lys Thr Ser Lys Asn Gin Phe Gly Asn 130 Ly s Leu Leu Leu Phe 210 Ser Pro Leu Leu Gly 290 Tyr Ile Phe Asp Val1 Giu 100 Ile Al a Ser Tyr Ala 180 Al a Leu Phe Ala Leu 260 Asn Ile Leu Giu Lys Ser Asn Trp Lys Gin Arg Leu 165 Asp Asn Asn Lys Asn 245 Asn Ile Glu Ile Lys Leu Thr 70 Leu Ile Gly Ser Leu I50 Ile Phe Asn Leu Gly 230 Phe Al a Thr Gin Ile Leu Phe Thr Ile Asn Phe Asn Ile Leu Tyr His Tyr Pro 105 Arg Lys 120 Thr Ala Leu Asn Arg Pro Ser Leu 185 Leu Ile 200 Asp Thr Lys Ala Ala Leu Tyr Thr 265 His Pro 280 Pro Lys Leu Trp Asn Leu Gly 75 Asp Arg Leu Asn Gin 155 Tyr Pro Asn Val Ser 235 Ser Gi1U Lys Leu Leu Gly Giu Asp Asp Ile Gly Met Al a 140 Asp Al a Leu Al a Phe 220 Asp Giu Ile Gly Lys 300 Leu Ile Ser Gin Leu Arg Thr Val1 Asp Glu 160 Val1 Leu Gin His Thr 240 Ser Ala Leu His WO 98/21225 PCT/US97/21353 209- Ser 305 Leu Asn Leu Leu Asp Giy-Ala Leu Asp Phe Leu Leu Asn Lys 310 Ser Asp 320 Lys Gly Arg Asn Ile Ser Thr Lys Ala Leu Asp Leu 335 Phe His Tyr Tyr Asp Leu 355 Ala Arg Phe 370 Lys Phe Asp Pro 340 Ile Phe Phe Gin Ala Asp Ala Ala Lys Gin Gly 360 Leu Lys Ala Arg 365 Tyr Asn Leu Asp 350 Leu Lys Asn Ser Ile Ser 385 Gin Leu Lys Asn Ala Phe 375 Ile Thr Lys Glu Ile 390 Gin Gin Arg Leu Leu 405 Leu Lys Ile His Asn Ile Asn Ser Asp Phe Leu 380 Ala Tyr Asn Ser Asp 410 Gly Leu Asn Leu Val Ser Leu Lys Ser 415 Lys Thr Gin Gin Met Asn 435 Met Lys Leu Leu Asp Leu 420 Met Leu Met Asp 425 Glu Ala 440 His Ile Leu Lys Asn Thr Lys 430 Ile Phe Lys Leu Ile Leu Gin Gly Asn Gin Pro Lys 450 Asn Glu Lys Ala Ile Asn Leu-Gin 
465 Leu Gin 475 Leu Lys Glu Ile 480 Asp Lys Asn Asp Thr 485 Glu Lys Lys Gly Leu 490 Gly Asp His Leu Leu Asp Lys Leu Lys 500 Lys Leu Glu Lys 505 Leu Lys Gly INFORMATION FOR SEQ ID NO:93: SEQUENCE CHARACTERISTICS: LENGTH: 947 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 292...645 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: AGTGCATAAA CGCACAGACC CCAAAAATGA AAGCTATTTT AGAATGGCAA AAGCGCGAAA ATGAAGACAG ACTCTCTGAT CCATGCCTCT ATCACGCCTT TAAATTTAGA CTTAACCAGT GGAATCTTGG CATGAGGGAA TGTTAAAGTG AGTAAAAAGC CTAATTGTTG GGGTTCTATT CTTCTTTAGT GCGTGTGAGC
TGGCTAGGGC
TTTGACGCTA
TATGATGATT
ACCGCTTGGC
ACCGCCTGCA
TACACCCTTT
TTGCTTCAAA
TGAAAAGTTT
TTTTTTAGGG
C ATG GGG Met Gly 1 120 180 240 297 TAT TAT TCA GAA GTT ACA GGG GAT TAT TTG TTC AAT TAT AAT TCC ACT Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn Ser Thr 345 ATC GTG Ile Val GTG GCT TAT GAC Val Ala Tyr Asp
AGA
Arg 25 AGC GAT GCG ATG Ser Asp Ala Met
ACT
Thr TCT TAT TAT ATC Ser Tyr Tyr Ile
AAT
Asn
ACG
Thr GTG ATT GTT TAT Val Ile Val Tyr CAA GCG GAA TTC Gln Ala Glu Phe TTG CAA AAA TTA GGC TTT TAC AAT GTC Leu Gln Lys Leu Gly Phe Tyr Asn Val 45
TTC
Phe
GCG
Ala CTA GAT AAA Leu Asp Lys
GCC
Ala AAA AAT GTG ATC Lys Asn Val Ile CGC ATT GTC Arg Ile Val CAA CTG ATT Gln Leu Ile
CGT
Arg AAC ATC TCA GCT Asn Ile Ser Ala
GTG
Val 75 CCG TTC TAC CAA Pro Phe Tyr Gln GAT CAA GTC AAT Asp Gln Val Asn
AAG
Lys 90 CCT TGT TAT TTT Pro Cys Tyr Phe
CTT
Leu TAC AAT TAC Tyr Asn Tyr GGG GGG CAG Gly Gly Gln ATG GCT TTA Met Ala Leu 537 585 TTT TAT Phe Tyr 100 TGC TCT CAA ACC CTA CGG ATT ATT ACG Cys Ser Gln Thr Leu Arg Ile Ile Thr CTA TCA Leu Ser 110
GCG
Ala 115 AGC AAA TTT Ser Lys Phe TAATGAGTGC TAATTCGCAT TTTATTTTAG ATTGGTATGA TGTGG TGTTGCAAAA ACGGGTTTTA TATGTGGATG GGAGCGTGAG AGATGCTGTA TAGGGATTTG ATTAAAAGCA CGATCAAACG AACGCTACTA CTACAATTTA AGACTGCCCC TTTATCAGCC GTTATCAGGC GATTGTATCA ATTTTGCGCT AGCCATGTGG AATGCGCTC AAAATAT
CGGGAGGACT
CATTGATTTT
ATGTTATAGG
TGCGCAATTG
TGCGGCTATC
AACCGCCCTG
CAATGAAATG
CTCTTCTTTA
INFORMATION FOR SEQ ID NO:94: ()SEQUENCE CHARACTERISTICS: LENGTH: 118 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94: Met Gly Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn 1 5 10 Ser Thr Ile Val Val Ala Tyr Asp Arg Ser 25 Tyr Ile Asn Val Ile Val Tyr Glu Leu Gin 40 Val Phe Thr Gin Ala Glu Phe Pro Leu Asp Tyr Asn Asp Aia Met Lys Leu Gly Thr Ser Tyr Phe Tyr Asn Lys Ala Lys Asn Val Ile WO 98/21225 PCT/US97/21353 -211- -55 Tyr Ala Arg Ile Val Arg Asn Ile Ser 70 Asn Tyr Gin Leu Ile Asp Gin Val Asn Gly Gin Phe Tyr Cys Ser Gin Thr Leu 100 105 Ala Leu Ala Ser Lys Phe 115 Pro Ala Val 75 Lys Pro 90 Arg Ile Phe Tyr Gin Tyr Cys Tyr Phe Ile Thr Leu 110 Leu Gly Ser Met INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 875 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 348-...716 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID
TGCGGAGGGA
TTTTGAAAAT
ACGCACTTAA
AAATTGAAAC
ATGAATTAAA
ATGATCCTAA
ATGTCTATGA
TTTAATCCTT
AGCAGAGATT
CATTAACTTA
CGCTTCCAAT
TTTACGCCTA
TAAAATCTCA
TTTTTTATCT
AAAGAAGTTT
GAAATCCCAG
AAGCTTAAAA
CCGGTGCGTT
GAAAAATTTG
GTTTAAACGC
ACCTTAAAGA
AGCGCTTTTC
AAGATGGGGT
ATAGCGTGAT
TAGAAAAAGT
ATTGTTCGCC
ATACAAAGAC
TAACGCTTCC
CGTGTTTTTA
GGATAACGGG
CTAGATTCAA
TTAAAATTAG
ATTTTAAGCT
AGGTTGGAAA
120 180 240 300 356 AGGCAGC ATG CAG GCT Met Gln Ala TTT AAA Phe Lys AGC GTT AGC GCG Ser Val Ser Ala
ATT
Ile 10 AAA AAA GAT GAA Lys Lys Asp Glu
AAC
Asn ATC ACC GCT AAT Ile Thr Ala Asn 404 452
AAC
Asn ACT CAA AAA GAG Thr Gln Lys Glu ATT TTG TTT GGT Ile Leu Phe Gly CTT TCT AAC CCC TTA Leu Ser Asn Pro Leu TTA GAG GGC GCG Leu Glu Gly Ala GAT AAA GTG AGC GCG AAA AAT TTT ATC Asp Lys Val Ser Ala Lys Asn Phe Ile CCC CCT Pro Pro 500 AAC ACG CTT Asn Thr Leu AGC ACG GAT AAA Ser Thr Asp Lys CAA GCT TTA ATT Gln Ala Leu Ile ATC GTG CGT Ile Val Arg 548 AAA AAT GAC ATT ATC ACC GGG GTG TAT GAA GAG GGG CAA ATC AGC ATA Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln Ile Ser Ile 596 75 GAA ATA AGC CTA AAA GCC CTA GAA AAT GGC GCG CTT AAT CAA ATC ATT 644 Glu Ile Ser Leu Lys Ala Leu Glu Asn Gly Ala Leu Asn Gln Ile Ile 90 CAA GCG AAA AAT TTA GAA AGC AAT AAA ATA CTC AAA GCA AAA GTG TTG 692 Gln Ala Lys Asn Leu Glu Ser Asn Lys Ile Leu Lys Ala Lys Val Leu 100 105 110 115 AGC AGC TCT AAA GCG CAA ATC TTA TAAAGGACAT TCATGAAATT GGTTTTAGGC 746 Ser Ser Ser Lys Ala Gln Ile Leu 120 ATCAGTGGAG CGAGCGGGAT ACCCCTAGCC TTGCGGTTTT TAGAAAAATT ACCCAAAGAA 806 ATTGAAGTTT TTGTCGTGGC GTCTAAAAAC GCGCATGTCG TGGCGTTAGA AGAATCTAAT 866 ATTAACCTT 875
INFORMATION FOR SEQ ID NO:96: SEQUENCE CHARACTERISTICS: LENGTH: 123 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:96: Met Gln Ala Phe Lys Ser Val Ser Ala Ile Lys Lys Asp Glu Asn Ile 1 5 10 Thr Ala Asn Asn Thr Gln Lys Glu Arg Ile Leu Phe Gly Ala Leu Ser 25 Asn Pro Leu Leu Glu Gly Ala Ile Asp Lys Val Ser Ala Lys Asn Phe 40 Ile Pro Pro Asn Thr Leu Leu Ser Thr Asp Lys Thr Gln Ala Leu Ile 55 Ile Val Arg Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln 70 75 Ile Ser Ile Glu Ile Ser Leu Lys Ala Leu Glu Asn Gly Ala Leu Asn 90 Gln Ile Ile Gln Ala Lys Asn Leu Glu Ser Asn Lys Ile Leu Lys Ala 100 105 110 Lys Val Leu Ser Ser Ser Lys Ala Gln Ile Leu 115 120
INFORMATION FOR SEQ ID NO:97: SEQUENCE CHARACTERISTICS: LENGTH: 394 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix)
FEATURE: NAME/KEY: Coding Sequence LOCATION: 160...345 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:97: GGCATCACTT TTAACATGAC CCCTTCTCCA GGCGCGACGA GTTGTTTGCA AAACGCCCTT GTGGATTCCC AAGAAATCGC TGCGTATTTG GGCGAGAGCT TTGAATTAGA ACGCTTTTAT AAAGATTTAT CCCCAGAAGA ATTGGAAAAT TAAAAACGC ATG CAA AAA GAA CAA Met Gln Lys Glu Gln 1 120 174 GAA GCC CAA GAA Glu Ala Gln Glu GCT AAA AAA GCC Ala Lys Lys Ala GTT AAA ATC GTG TTT TTT TTA Val Lys Ile Val Phe Phe Leu 222 GGG CTT GTG Gly Leu Val AAT CAA ATC Asn Gln Ile GTG CTT TTG ATG ATG ATA AAC CTT TAC ATG CTC ATC Val Leu Leu Met Met Ile Asn Leu Tyr Met Leu Ile 270 AAC GCG AGC GCT Asn Ala Ser Ala
CAA
Gln 45 ATG AGC CAC CAA ATC AAA AAG ATA Met Ser His Gln Ile Lys Lys Ile 318 372 GAA GAA Glu Glu AGG CTT AAT CAG Arg Leu Asn Gln GAG CAA AAA Glu Gln Lys TAAAAAAGGC TTTTTGGTAT TTTTACG ATCAAATAGT AAAGAGCTTA TC
INFORMATION FOR SEQ ID NO:98: SEQUENCE CHARACTERISTICS: LENGTH: 62 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:98: Met Gln Lys Glu Gln Glu Ala Gln Glu Ile Ala Lys Lys Ala Val Lys 1 5 10 Ile Val Phe Phe Leu Gly Leu Val Val Val Leu Leu 25 Leu Tyr Met Leu Ile Asn Gln Ile Asn Ala Ser Ala 40 Gln Ile Lys Lys Ile Glu Glu Arg Leu Asn Gln Glu 55 Met Met Ile Asn Gln Met Ser His Gln Lys
INFORMATION FOR SEQ ID NO:99: SEQUENCE CHARACTERISTICS: LENGTH: 982 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 320...880 OTHER INFORMATION: NAME/KEY: sig_peptide LOCATION: 320...400 OTHER INFORMATION: NAME/KEY: mat_peptide LOCATION: 401...880 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:
AGATTAGCAG
CGATTAAAAA
ATTGAAGTTG
GGTTTTGTAG
ATCCTTGATA
TCAACAAGAA
CAGCAGGGAT TTTTAAATTC GGCAAACGCT TTGAAAGTAT GTGATTATAC CTATTTGTAT GTGTATCCCA CTTATCCAAT AAAAATTAAA CCTTTTAGAA GGATTTATT ATG ATT AAA CTGGCCAACA GGGGCGGTTG GAAAAAAATA TTTTTCATAG AAATTCCCTT TTGTTAAATG CTTAAAAATT TGATTTTAAA AGTTTGAGAT TTATATCAAT ATTTTCACTC TAAAACCCTC AAATAACCGA TTTTAGGGTG TAACTTTAAT AGA ATT GCT TGT ATT TTA AGC TTG 120 180 240 300 352 400 Met Ile Lys Arg Ile Ala Cys Ile Leu Ser Leu -27 -25 AGT GCG Ser Ala AGT TTA GCG CTG GCT GGC GAA GTG AAT GGG TTT TTC ATG GGT Ser Leu Ala Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly
GCG
Ala 1 GGT TAT CAG CAA GGT CGT TAT GGT Gly Tyr Gln Gln Gly Arg Tyr Gly 5 TAT AAC AGC AAT TAC TCT Tyr Asn Ser Asn Tyr Ser GAT TGG CGC Asp Trp Arg TTT GTA GGC Phe Val Gly CAT GGC His Gly AAT GAT CTT Asn Asp Leu GGT TTG AAT TTC Gly Leu Asn Phe AAA TTA GGT Lys Leu Gly TTT GCC AAT AAA TGG TTT GGG GCT AGG GTG TAT GGC TTT Phe Ala Asn Lys Trp Phe Gly Ala Arg Val Tyr Gly Phe TTA GAT TGG TTT AAC ACT TCA GGG ACA GAA CAC ACC AAA ACC AAT TTG Leu Asp Trp Phe Asn Thr Ser Gly Thr Glu His Thr Lys Thr Asn Leu
CTC
Leu ACC TAT GGT GGC Thr Tyr Gly Gly GGC GAT TTG ATT Gly Asp Leu Ile AAT CTC ATT CCT Asn Leu Ile Pro GAT AAA TTC GCT CTA GGT CTC ATC GGT GGC GTT CAA TTA GCC Asp Lys Phe Ala Leu Gly Leu Ile Gly Gly Val Gln Leu Ala GGA AAC Gly Asn ACT TGG ATG Thr Trp Met TGG AAT TTA Trp Asn Leu 115
TTC
Phe 100 CCT TAT GAT GTC Pro Tyr Asp Val
AAT
Asn 105 CAA ACG AGA TTC CAG TTC TTA Gln Thr Arg Phe Gln Phe Leu GGC GGA AGA ATG Gly Gly Arg Met GTT GGG GAT CGC AGT GCG TTT GAA Val Gly Asp Arg Ser Ala Phe Glu 125 GCA GGC Ala Gly 130 GTG AAA TTC CCT ATG GTT AAT CAA GGC AAC AAA GAT GTT AGG Val Lys Phe Pro Met Val Asn Gln Gly Asn Lys Asp Val Arg 135 140
GCT
Ala 145 TAT CCG CTA CTA TTC TTG GGT ATG TGG Tyr Pro Leu Leu Phe Leu Gly Met Trp ATG TTC TTC ACT Met Phe Phe Thr TTC T Phe 160 881 941 982 AATTTATTCC TTTCATTCGC TCTTCTTCAT CAAATCAACC CTAACCCACT CTTAAAAGGT TGGGGTTCAA AAATCTTTTT CATAAATAAA ATTTGCCTTA A
INFORMATION FOR SEQ ID NO:100: SEQUENCE CHARACTERISTICS: LENGTH: 187 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:100: Met Ile Lys Arg Ile -27 -25 Leu Ala Gly Glu Val Gly Arg Tyr Gly Pro Asn Asp Leu Tyr Gly Asn Lys Trp Phe Gly Thr Ser Gly Thr Glu Ala Cys Ile Leu Ser -20 Asn Gly Phe Phe Met Tyr Asn Ser Asn Tyr Leu Ser Ala Ser Leu Ala Gly Ala Gly Tyr Gln Gln Ser Asp Trp Arg His Gly 15 Leu Asn Phe Lys Leu Gly Phe Val 30 Ala Arg Val Tyr Gly Phe Leu Asp 45 His Thr Lys Thr Asn Leu Leu Thr 60 Gly Phe Ala Trp Phe Asn Tyr Gly Gly Gly Gly Gly Asp Leu Ile Leu Ile Gly Gly Gln Val Asn 75 Val Gln Thr Arg Leu Ile Pro Leu Leu Ala Gly Asn 95 Phe Gln Phe Leu Asp Lys Phe Ala Thr Trp Met Phe Pro 100 Tyr Asp Val Asn 105 Arg Met Arg Val 120 Trp Asn Leu Gly Gly 115 Ala Gly Val Lys Phe Gly Asp Arg Phe Glu 130 Tyr Pro Met Val 135 Phe Leu 150 Asn Gln Gly Asn Lys Asp Val Arg 140 Met Trp Ile Met Phe Phe Thr Phe 155 160 Ala 145 Pro Leu Leu Gly
INFORMATION FOR SEQ ID NO:101: SEQUENCE CHARACTERISTICS: LENGTH: 843 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 262...777 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:101: CCAATGGAGG CGTTTCCAAA AACCCAAACG GGCGCTTTTT TCAGGGAGCA AGCGGTAAAA ATCGTAGAAA AACGCTTGAT GCGATTTTAA TGAAGAAGAA TTAAAAATCA TGTTTGAAGC AGCAAATCCA CGCTAAAGAA TTGAAAGAAA AGCAAGAAAA
AAAGAAAAAT
AAAAGAGAAT
TGAAGAAAAA
AACCACCAAG
CTCAAAAAAT
ATGCAACTGA
AGGTTGTTAG
CATTTTAAAG
120 180 240 AAGTTTGGGA AAAGGGCGAA A ATG AGC AAG AAA Met Ser Lys Lys 1 AGC GTA ATT TCT Ser Val Ile Ser
GGT
Gly TTA ATG AAT TTT TTT AGC GAA AAG AAT GAA CGC TGG CTC TTA Leu Met Asn Phe Phe Ser Glu Lys Asn Glu Arg Trp Leu Leu GCC CAC Ala His 339 AGG CAC ACG Arg His Thr AGC ATT GCG Ser Ile Ala
AGA
Arg GGG TTT GTG ATA Gly Phe Val Ile
GTG
Val 35 GCG TGG CTT TTT Ala Trp Leu Phe CGG TTT AAA Arg Phe Lys TTA GTG GAT Leu Val Asp TTT TCT ATT TTG Phe Ser Ile Leu
ATC
Ile ACT CTG TTG GTT Thr Leu Leu Val ATT TGG GTT TAT AGC GAT GTG CGT CAG TTT TTA TTG GAC ACT TCT AGC Ile Trp Val Tyr Ser Asp Val Arg Gln Phe Leu Leu Asp Thr Ser Ser TTT ATT TGG CTT Phe Ile Trp Leu ATC GCT TTA CTA Ile Ala Leu Leu AAG TGG GGC GTG Lys Trp Gly Val
ATT
Ile GTC ATA AGC GCA Val Ile Ser Ala AAA TGC TAC CAA TTC AGC CAA AAA ATG TTT ACG Lys Cys Tyr Gln Phe Ser Gln Lys Met Phe Thr CTC ATT CAA Leu Ile Gln AAC TAC AAA Asn Tyr Lys 125
AGA
Arg 110 AAA AGG CAA ATC Lys Arg Gln Ile
AGA
Arg 115 GAG AAT TTA AAA Glu Asn Leu Lys AAC CGC TCC Asn Arg Ser 120 ATC GCT GAA Ile Ala Glu GAT ACC AAA AAT Asp Thr Lys Asn
GCG
Ala 130 GAA AAA CTC TCT Glu Lys Leu Ser
AGC
Ser 135 GAA ATC Glu Ile 140 ATT TCA AAA AAA CAA GAA GAG TCC CGC Ile Ser Lys Lys Gln Glu Glu Ser Arg
CCC
Pro 150 AAA GAA GAT TCT Lys Glu Asp Ser
AAT
Asn 155 CAT GAA AAC CAT His Glu Asn His
AAA
Lys 160 GAA AAG CTT TCT AAC ATT ACC GAA GAA Glu Lys Leu Ser Asn Ile Thr Glu Glu 165
AGT
Ser 170 723 771 829 843 GAT TCT Asp Ser TAAAAACAA GAGGAATTGA AAAGCTAAAA AGGATAGGGG GGGATTACCC AA AGCATATTGG AGGG INFORMATION FOR SEQ ID NO:102: SEQUENCE CHARACTERISTICS: LENGTH: 172 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:102: Met
I
Glu Ser Lys Lys Asn Ser Val Lys Asn Glu 5 Arg Trp Leu Trp Leu Phe Ile Ser Gly 10 Leu Ala His 25 Arg Phe Lys 40 Leu Val Asp Leu Met Asn Phe Phe Ser Val Ile Val Leu Ile Thr Ala Arg His Thr Ser Ile Ala Ile Trp Val Arg Gly Phe Phe Ser Ile Tyr Ser Asp Leu Leu Val Val Ile Arg Gln Phe Leu Ala Leu Leu Leu Asp 70 Lys Trp Gln Lys Thr Ser Ser Ser Gly Val Ile Val Met Phe Thr Leu Phe Ile Trp Leu Ile Ser Ala Arg Lys Cys Tyr Gln Gln Ile Arg 115 Asn Ala Glu Phe 100 Glu Ile Gln Arg Lys Arg 110 Tyr Lys Asp Thr Lys Asn Leu Lys Asn 120 Ile Ser Asn 130 Gln Glu Glu Lys Leu Ser Ser 135 Ser Arg Pro Lys 150 Ser Asn Ile Thr Ala Glu Glu Ile 145 Glu Ser Lys Lys 140 Glu Asp Ser Asn His 155 Glu Glu Ser Asp Ser 170 Glu Asn His Lys Leu 165
INFORMATION FOR SEQ ID NO:103: SEQUENCE CHARACTERISTICS: LENGTH: 1047 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 34...1005 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:103: AGAAAGAAAC CATTCAAGGA ACGCATTGAT TTG ATG AAT AAA CCA TTT TTA ATC Met Asn Lys Pro Phe Leu Ile TTA CTC ATA Leu Leu Ile GCC CTA ATT GTC TTT AGC GGC TGT AAC ATG AGA AAA TAT Ala Leu Ile Val Phe Ser Gly Cys Asn Met Arg Lys Tyr TTC AAA Phe Lys CCC GCT AAA CAC Pro Ala Lys His
CAA
Gln ATT AAA GGC GAA Ile Lys Gly Glu TAT TTC CCT AAC Tyr Phe Pro Asn
CAT
His TTG CAA GAA AGT ATC GTT TCG TCT AAT Leu Gln Glu Ser Ile Val Ser Ser Asn 45
CGT
Arg TAT GGA GCC ATT TTG Tyr Gly Ala Ile Leu AAA AAT GGA GCG GTT ATA GGC GAT AAA GGT Lys Asn Gly Ala Val Ile Gly Asp Lys Gly 65 TTA ACG CAG CTA Leu Thr Gln Leu AGA ATC Arg Ile GGT AAG AAC TTC AAT TAC GAA AGC AGT TTT TTA AAT GAG AGT CAA GGG Gly Lys Asn Phe Asn Tyr Glu Ser Ser Phe Leu Asn Glu Ser Gln Gly AAA AAA ACA Lys Lys Thr TTT TTT ATT Phe Phe Ile CTT GCG CAA GAT Leu Ala Gln Asp
TGT
Cys TTG AAC AAG ATT Leu Asn Lys Ile 342 AAC AAA Asn Lys 105 AGC AAG GTG GCT Ser Lys Val Ala ACT GAA GAA ACG GAA TTG AAA TTA AAG Thr Glu Glu Thr Glu Leu Lys Leu Lys 115
GGC
Gly 120 GTT GAA GCG GAA Val Glu Ala Glu CAA GAT AAA GTC Gln Asp Lys Val TGT CAT CAA GTG GAA TTG Cys His Gln Val Glu Leu 130 135 TCT ATC GTT ATT CCT TTG Ser Ile Val Ile Pro Leu 150 ATT AGC AAT AAC Ile Ser Asn Asn AAC GCC AGC CAA Asn Ala Ser Gln 486 GAG ACT TTT Glu Thr Phe GTG TTA GCG Val Leu Ala 170 TTG AGC GCA AGC Leu Ser Ala Ser
GTT
Val 160 AAA 555 AAT CTT Lys Gly Asn Leu TTA GCG GTG Leu Ala Val 165 TCT CAA AAA Ser Gln Lys GAC AAT TCA GCG AAC TTA TAC GAC ATC Asp Asn Ser Ala Asn Leu Tyr Asp Ile TTG CTT Leu Leu 185 TTT AGT GAG AAA Phe Ser Glu Lys TCC CCA AGC ACC Ser Pro Ser Thr ATC AAT TCT TTA Ile Asn Ser Leu
ATG
Met 200 GCG ATG CCT ATT Ala Met Pro Ile ATG GAT ACG GTC Met Asp Thr Val GTG TTC CCC ATG Val Phe Pro Met
CTA
Leu 215 GAT GGG CGC TTG Asp Gly Arg Leu GTC GTG GAT TAT Val Val Asp Tyr
GTG
Val 225 CAC GGA AAC CCT His Gly Asn Pro ACG CCT Thr Pro 230 ATT AGA AAC Ile Arg Asn TAC CTT ATC Tyr Leu Ile 250
ATT
Ile 235 GTT ATC AGC AGC Val Ile Ser Ser
GAT
Asp 240 AAG TTT TTT AAC Lys Phe Phe Asn AAT ATC ACC Asn Ile Thr 245 GGG AAA AGG Gly Lys Arg GTA GAT GGC AAT Val Asp Gly Asn ATG ATC GCT TCT Met Ile Ala Ser ATA CTC Ile Leu 265 TCA GTA GTG AGC Ser Val Val Ser CAA GAG TTC AAC Gln Glu Phe Asn GAT GG GAT ATT Asp Gly Asp Ile
GTG
Val 280 GAT TTG CTT TAT Asp Leu Leu Tyr AAG GGG ACT TTA Lys Gly Thr Leu
TAT
Tyr 290 GTG CTC ACG CTA Val Leu Thr Leu GGG CAG ATT TT J CAA ATG GAT AAG AGT TTG AGG GAA TTA AAC AGC GTG Gly Gln Ile Leu Gln Met Asp Lys Ser Leu Arg Glu Leu Asn Ser Val 300 305 310 AAA CTG CCT NTC NTC GCT CAA CAC GAT TGT ATT AAA CCA TAATAAATTG TA Lys Leu Pro Xaa Xaa Ala Gln His Asp Cys Ile Lys Pro 315 320 TTCTTTAGAA AAACGAGGGT ATGTGATAGA
INFORMATION FOR SEQ ID NO:104: SEQUENCE CHARACTERISTICS: LENGTH: 324 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 966 1017 1047 (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:104: Met Asn Lys Pro Phe Leu Ile Leu Leu 1 Gly Gly Asn Gly Phe Asn Glu Val Gln 145 Lys Tyr Ser Val Val 225 Ile Asn Ala Tyr Thr Asn Ile Glu His Ile Asn Ile Thr 195 Val Gly Phe Ser Phe His Lys Gly Phe Asn Gly 120 Ile Glu Val Leu Met 200 Asp Ile Tyr Ile Lys Leu Asn Lys Phe Lys 105 Val Ser Thr Leu Leu 185 Ala Gly Arg Leu Leu Ile 10 Pro Gln Gly Asn Ile Ser Glu Asn Phe Ala 170 Phe Met Arg Asn Ile 250 Ser Leu Lys Ser Val Asn Ala Val Glu Pro 140 Leu Asn Glu Ile Leu 220 Val Asp Val Val Gln Val Gly Glu Asp Lys 110 Gln Ala Ala Ala Gly 190 Met Val Ser Asn Gly Phe Ile Ser Asp Ser Cys Thr Asp Ser Ser Asn 175 Ser Asp Asp Ser Asn 255 Gln Ser Lys Ser Lys Ser Leu Glu Lys Gln Val 160 Leu Pro Thr Tyr Asp 240 Met Glu 260 265 270 Phe Asn Tyr Asp Gly Asp Ile Val Asp Leu Leu Tyr Asp Lys Gly Thr 275 280 285 Leu Tyr Val Leu Thr Leu Asp Gly Gln Ile Leu Gln Met Asp Lys Ser 290 295 300 Leu Arg Glu Leu Asn Ser Val Lys Leu Pro Xaa Xaa Ala Gln His Asp 305 310 315 320 Cys Ile Lys Pro
INFORMATION FOR SEQ ID NO:105: SEQUENCE CHARACTERISTICS: LENGTH: 1968 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 153...1793 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 153...219 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:105: TCTGGGGGCA TTGCTTACCC TACTACTCGC TTC AAAGATTCTA ATCGCAATTT
TAAAACCATC ACT GCAACTTACT ACATCATTAA GGTTTAATCA CA CTC CGC TTG ATT TTA GCG ATC GCT CTG Leu Arg Leu Ile Leu Ala Ile Ala Leu -10 TAT AGC TAT TTT TTC CAA AAA CCA AAC Tyr Ser Tyr Phe Phe Gln Lys Pro Asn 10 AAG CAA GAA ACA ACC AAC AAC CAT ACA Lys Gln Glu Thr Thr Asn Asn His Thr 25 AAC GCC CAA CAT TTT AGC ACC ACT CAA Asn Ala Gln His Phe Ser Thr Thr Gln 3AAACGCC CAAGCCTGAT CCAATCTCAT 'TTTTGGC TCGTTCCCAC AAAAAGCCAC ATG GAT AAA AAC AAC AAT AAT Met Asp Lys Asn Asn Asn Asn TCT TTC TTG TTT ATC GCT CTT Ser Phe Leu Phe Ile Ala Leu -5 1 AAA ACA ACA ACC CAA ACC ACA Lys Thr Thr Thr Gln Thr Thr GCA ACA AGT CCT AAC GCG CCC Ala Thr Ser Pro Asn Ala Pro ACA ACC CCC CAA GAG AAT TTG Thr Thr Pro Gln Glu Asn Leu 120 173 221 269 317 365 CTA AGC ACG ATT T'1 711-GAG CAT GCC Leu Ser Thr Ile Ser Phe Glu His Ala 55 GGG CGC ATC AAA CAG GTT TAT CTC AAG Gly Arg Ile Lys Gln Val Tyr Leu Lys AGG ATT Arg Ile GAA ATT GAT TCT Glu Ile Asp Ser GAT AAA AAG TAT CTA ACC CCT Asp Lys Lys Tyr Leu Thr Pro GGC CAT CTT TTT AGC TCC AAA Gly His Leu Phe Ser Ser Lys AAA CAA AAG Lys Gln Lys GAA AAC GCG Glu Asn Ala 100
GGC
Gly TTT TTA GAG CAT Phe Leu Glu His CAA CCC CCC CTA Gln Pro Pro Leu
AAA
Lys 105 GAG CTC CCC CTT Glu Leu Pro Leu GCA GCC GAT Ala Ala Asp 557 AAA CTC Lys Leu 115 AAG CCT TTA GAA Lys Pro Leu Glu CGT TTT TTA GAC Arg Phe Leu Asp ACG CTC AAT AAC Thr Leu Asn Asn 605
AAA
Lys 130 GC TTC AAC ACC CCT TAT AGC GCT TCA Ala Phe Asn Thr Pro Tyr Ser Ala Ser 135
AAA
Lys 140 ACC ACT CTT GG Thr Thr Leu Gly
CCT
Pro 145 AAC GAA CAG CTT GTT TTA ACC CAA GAT Asn Glu Gln Leu Val Leu Thr Gln Asp 150 GGC ACT CTT AGC Gly Thr Leu Ser ATC ATT Ile Ile 160 AAA ACC CTG Lys Thr Leu TTC AAA TCG Phe Lys Ser 180
ACT
Thr 165 TTC TAT GAT GAT Phe Tyr Asp Asp CAT TAT GAT TTA His Tyr Asp Leu AAA ATC GCA Lys Ile Ala 175 ACC AAT GGT Thr Asn Gly CCC AAT AAC CTT Pro Asn Asn Leu
ATC
Ile 185 CCT AGC TAT GTG Pro Ser Tyr Val TAC AGG Tyr Arg 195 CCG GTG GCT GAT Pro Val Ala Asp GAC AGC TAC ACC Asp Ser Tyr Thr
TTT
Phe 205 TCA GGC GTG CTT Ser Gly Val Leu
TTA
Leu 210 GAA AAT AGC GAC Glu Asn Ser Asp AAA ATT GAA AAA Lys Ile Glu Lys
ATT
Ile 220 GAA GAT AAA GAC Glu Asp Lys Asp AAA GAA ATC AAA Lys Glu Ile Lys
CGC
Arg 230 TTT TCT AAC ACC Phe Ser Asn Thr TTT TTA TCC AGC GTG GAT Phe Leu Ser Ser Val Asp 240 AGG TAT TTC ACC ACC TTG CTT TTC Arg Tyr Phe Thr Thr Leu Leu Phe 245 AAA GAT CCT CAA Lys Asp Pro Gln GGT TTT GAA Gly Phe Glu 255 CCC TTC ATT Gly Phe Ile GCC TTA ATT Ala Leu Ile 260 GAT TCA GAA ATC Asp Ser Glu Ile
GC
Gly 265 ACT AAA AAC CCC Thr Lys Asn Pro 1037 TCC CTT Ser Leu 275 AAA AAT GAA GCG Lys Asn Glu Ala TTG CAT GGC TAT Leu His Gly Tyr
ATT
Ile 285 GGC CCT AAG GAT Gly Pro Lys Asp CGC TCT TTG AAA Arg Ser Leu Lys
GCG
Ala 295 ATT TCA CCC ATG Ile Ser Pro Met ACC GAT GTG ATA Thr Asp Val Ile
GAG
Glu 305 1085 1133 1181 TAT GGC TTA ATC Tyr Gly Leu Ile TTC TTT GCA AAA Phe Phe Ala Lys
GGC
Gly 315 GTG TTT GTT TTA Val Phe Val Leu CTG GAT Leu Asp 320 TAT TTG TAT Tyr Leu Tyr ACG ATT ATC Thr Ile Ile 340 TTC GTG GGC AAT Phe Val Gly Asn GGT TGG GCT ATC Gly Trp Ala Ile ATT CTT TTA Ile Leu Leu 335 AAG GGC ATG Lys Gly Met 1229 GTG CGC ATC ATC CTT TAT CCT TTA AGC Val Arg Ile Ile Leu Tyr Pro Leu Ser GTG AGC Val Ser 355 ATG CAA AAG CTC Met Gln Lys Leu
AAA
Lys 360 GAA TTA GCC CCT Glu Leu Ala Pro
AAA
Lys 365 ATG AAA GAA CTC Met Lys Glu Leu
CAA
Gln 370 GAA AAA TAC AAG Glu Lys Tyr Lys
GGC
Gly 375 GAA CCC CAA AAA Glu Pro Gln Lys
TTG
Leu 380 CAA GCC CAC ATG Gln Ala His Met 1277 1325 1373 1421 1469 1517 CAG CTT TAC AAA Gln Leu Tyr Lys
AAA
Lys 390 CAT GGG GCT AAC His Gly Ala Asn CTA GGG GGT TGT Leu Gly Gly Cys CTG CCC Leu Pro 400 TTA ATC TTA Leu Ile Leu AAC GCT GTG Asn Ala Val 420 ATC CCG GTG TTT Ile Pro Val Phe TTT GCC ATT TAT AGA GTG CTT TAT Phe Ala Ile Tyr Arg Val Leu Tyr 410 415 GAG TGG ATC TTA TGG ATT CAT GAT Glu Trp Ile Leu Trp Ile His Asp 430 GAA TTG AAA AGC Glu Leu Lys Ser TTA TCC Leu Ser 435 ATC ATG GAT CCG TAT TTT ATT TTA CCG CTT CTT ATG GGA GCG Ile Met Asp Pro Tyr Phe Ile Leu Pro Leu Leu Met Gly Ala 440 445 1565
TCT
Ser 450 ATG TAT TGG CAC Met Tyr Trp His
CAA
Gln 455 AGC GTT ACG CCA Ser Val Thr Pro
AAC
Asn 460 ACC ATG ACC GAT Thr Met Thr Asp 1613 1661 ATG CAA GCA AAG Met Gln Ala Lys
ATT
Ile 470 TTT AAA CTC TTA Phe Lys Leu Leu
CCC
Pro 475 CTA TTA TTC ACA Leu Leu Phe Thr ATC TTT Ile Phe 480 TTA ATC ACT Leu Ile Thr CCG GCA GGG TTA Pro Ala Gly Leu TTG TAT TGG ACC Leu Tyr Trp Thr ACG AAC AAC Thr Asn Asn 495 1709 ATC CTT TCGC GTG TTG CAA CAA CTC ATC ATC AAT Ile Leu Ser Val Leu Gln Gln Leu Ile Ile Asn 500 505 AAA AAA CGC ATG CAT GCG CAA AAC AAA AAG GAA Lys Lys Arg Met His Ala Gln Asn Lys Lys Glu 515 520 TGAAATCAAA GCCAAAACCT TAGAAGAAGC CCTCATTCAA CCCCATTATT AATTTGCAAT ACGAAGTCAT TCAAACGCCC TGGTAAAAAA GAAGCCATTA TCTTAGCGGG CGTTAAAGA
INFORMATION FOR SEQ ID NO:106: SEQUENCE CHARACTERISTICS: LENGTH: 547 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: .22 OTHER INFORMATION: AAA GTC TTA GAG AAT Lys Val Leu Glu Asn 510 CAT TGATGCAAAA TTTTAT His 525 GCTTCTATCG CCTTGAATTG TCTAAAGGGT TTTTAAGCAT 1757 1809 1869 1929 1968 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:106: Met Asp Lys Asn Asn Asn Asn Ser Lys Ala Thr Arg Asp Gly Leu Leu Ser Leu Phe Thr Thr Thr Ile Lys His Pro Asp Lys 140 Gly Phe Thr Pro Gln Ile Tyr Phe Leu 110 Thr Thr Leu Ile Gln Asn Glu Asp Leu Ser Ala Leu Leu Ser Ala Thr Ala Asn Ser Thr 80 Ser
Ala Asn Gly Ile Leu Tyr Lys Asn Leu Gly Lys Glu Lys Lys 130 Asn Lys Arg Ser Gln Ala Ser Arg Gln Asn Leu 115 Ala Glu Thr Leu Tyr Glu Gln Thr Ile Lys Ala 100 Lys Phe Gln Leu Ile Phe 5 Thr His Ile Lys Gly 85 Gln Pro Asn Leu Thr Leu Phe Thr Phe Ser Gln Phe Pro Leu Thr Val 150 Phe Leu Asn Thr Gln Ala Lys Val Glu Phe Ala Asp Leu 160 His Tyr Asp Leu Lys Ile Ala Phe Lys Ser Tyr Lys Leu 235 Lys Lys Gly Met Gly 315 Gly Pro Ala Lys Pro 395 Ala Trp Leu Pro Pro 475 Leu Ile Lys Tyr Thr Ile 220 Phe Asp Asn Tyr Leu 300 Val Trp Leu Pro Leu 380 Leu Ile Ile Pro Asn 460 Leu Tyr Asn Glu Val Ile 190 Phe Ser 205 Glu Asp Leu Ser Pro Gln Pro Leu 270 Ile Gly 285 Thr Asp Phe Val Ala Ile Ser Tyr 350 Lys Met 365 Gln Ala Gly Gly Tyr Arg Leu Trp 430 Leu Leu 445 Thr Met Leu Phe Trp Thr Lys Val 510 His 525 175 Thr Gly Lys Ser Gly 255 Gly Pro Val Leu Ile 335 Lys Lys His Cys Val 415 Ile Met Thr Thr Thr 495 Asn Val Asp Val 240 Phe Phe Lys Ile Leu 320 Leu Gly Glu Met Leu 400 Leu His Gly Asp Ile 480 Asn Gly Leu Ala 225 Asp Glu Ile Asp Glu 305 Asp Leu Met Leu Met 385 Pro Tyr Asp Ala Pro 465 Phe Asn Tyr Leu 210 Lys Arg Ala Ser Tyr 290 Tyr Tyr Thr Val Gln 370 Gln Leu Asn Leu Ser 450 Met Leu Ile Arg 195 Glu Glu Tyr Leu Leu 275 Arg Gly Leu Ile Ser 355 Glu Leu Ile Ala Ser 435 Met Gln Ile Leu Ser 180 Pro Asn Ile Phe Ile 260 Lys Ser Leu Tyr Ile 340 Met Lys Tyr Leu Val 420 Ile Tyr Ala Thr Ser 500 165 Pro Val Ser Lys Thr 245 Asp Asn Leu Ile Gln 325 Val Gln Tyr Lys Gln 405 Glu Met Trp Lys Phe 485 Val Asn Asp Lys 215 Phe Leu Glu Ala Ala 295 Phe Val Ile Leu Gly 375 His Pro Lys Pro Gln 455 Phe Ala Gln Leu Leu 200 Lys Ser Leu Ile Asn 280 Ile Phe Gly Ile Lys 360 Glu Gly Val Ser Tyr 440 Ser Lys Gly Gln Ile 185 Asp Ile Asn Phe Gly 265 Leu Ser Ala Asn Leu 345 Glu Pro Ala Phe Ser 425 Phe Val Leu Leu Leu 505 170 Pro Ser Glu Thr Thr 250 Thr His Pro Lys Trp 330 Tyr Leu Gln Asn Phe 410 Glu Ile Thr Leu Val 490 Ile Leu Glu Asn Lys Lys Arg Met His Ala Gln Asn Lys
INFORMATION FOR SEQ ID
NO:107: SEQUENCE CHARACTERISTICS: LENGTH: 3280 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 151...3207 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 151...241 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:107: TAAAGGTTTT AGGCCTGTGG TGGTTCAAGT TTTAGAAGAG CGCAGCAAGA TTTTTATCGT GAACGCTCAA AATTTACACC CTAATGACAG CGTGGCAGTG GGGTCATTGA TAGGGTTAAA AGGCATGATC AACAATTTAG GGGAGGAATG ATG CTC GCT TCC ATT ATT GAA TTT Met Leu Ala Ser Ile Ile Glu Phe 120 174 TCC TTA CGC Ser Leu Arg CAA AGA GTG ATC Gln Arg Val Ile ATT GTT GGT GCG ATT CTT ATT TTA Ile Val Gly Ala Ile Leu Ile Leu TTT TTT Phe Phe GGG ACT TAT AGT TTT ATC AAC ACT CCA GTG GAC GCT TTC Gly Thr Tyr Ser Phe Ile Asn Thr Pro Val Asp Ala Phe
CCG
Pro GAT ATT TCG CCC Asp Ile Ser Pro CAA GTT AAA ATC Gln Val Lys Ile
ATT
Ile TTA AAA CTC CCC Leu Lys Leu Pro GGC TCT Gly Ser AGC CCT GAA Ser Pro Glu CTT TTA GGC Leu Leu Gly ATG GAA AAC AAC Met Glu Asn Asn
ATC
Ile GTG CGC CCT TTA Val Arg Pro Leu GAA TTG GAG Glu Leu Glu TTG AAA GGG CAA Leu Lys Gly Gln TCT TTA AGG AGT GTT TCA AAA TAT Ser Leu Arg Ser Val Ser Lys Tyr TCT ATT Ser Ile TCA GAT ATT ACG ATA GAT TTT GAT GAC AGC GTG GAT ATT TAT Ser Asp Ile Thr Ile Asp Phe Asp Asp Ser Val Asp Ile Tyr GCG AGG AAT ATT Ala Arg Asn Ile
GTC
Val AAT GAG CGC TTG Asn Glu Arg Leu AGC GTG ATG AAA Ser Val Met Lys TTA CCC GTG GGG Leu Pro Val Gly GAG GGG GGC ATG Glu Gly Gly Met CCC ATT GTT ACG CCG CTA Pro Ile Val Thr Pro Leu TCA GAT ATC Ser Asp Ile AAA CGA CAG Lys Arg Gln 125 TTT ATG TTW* ACT ATT GAT GGC AAT ATC ACT Gly Asn Ile Thr Phe 110 Met Phe Thr Ile GAG ATA GAA Glu Ile Glu 120 AGA ATG ATT Arg Met Ile CTT TTA GAT TTT Leu Leu Asp Phe
GTG
Val 130 ATC CGC CCA CAA Ile Arg Pro Gln
TTA
Leu 135 AGC GGC Ser Gly 140 GTA GCA GAT GTC Val Ala Asp Val TCC ATT GGA GGC Ser Ile Gly Gly AGC AGA GCG TTT Ser Arg Ala Phe
GTG
Val 155 ATC GTG CCG GAT Ile Val Pro Asp AAT GAC ATG GCA Asn Asp Met Ala
AGG
Arg 165 CTT GGG GTG AGT Leu Gly Val Ser
ATT
Ile 170 702 750 798 TCT GAT TTA GAA Ser Asp Leu Glu
TCG
Ser 175 GCT GTG AGA GTG Ala Val Arg Val TTA AGA AAC AGC Leu Arg Asn Ser GGA GCG Gly Ala 185 GGG CGC GTG Gly Arg Val GCT TCT TTG Ala Ser Leu 205
GAT
Asp 190 AGA GAT GGC GAA Arg Asp Gly Glu TTT TTA GTC AAA Phe Leu Val Lys ATC CAA ACC Ile Gln Thr 200 TCC ACT AAT Ser Thr Asn 846 894 AGT TTA GAA GAC Ser Leu Glu Asp
ATT
Ile 210 GGC AAA ATC ACC Gly Lys Ile Thr TTA GGG Leu Gly 220 CGC ACC Arg Thr CAT TTG CAC ATT His Leu His Ile GAT TTT GCG AAA Asp Phe Ala Lys
GTC
Val 230
GTG
Val ATC AGC CAG TCT Ile Ser Gln Ser CGT TTG GGG Arg Leu Gly
TTT
Phe 240 ACT AAA GAT Thr Lys Asp GGC GAG ACC Gly Glu Thr 942 990 1038 1086 GAA GGC TTG GTG Glu Gly Leu Vai
CTT
Leu 255 TCT TTA AAA GAC Ser Leu Lys Asp AAC ACC AAA GAA Asn Thr Lys Glu ATC ATC Ile Ile ACT CAA GTG Thr Gln Val GGC GTG TCC Gly Val Ser 285 CAA AAA CTA GAA Gln Lys Leu Glu GAA TTA AAA CCC TTT TTA CCG AAT Glu Leu Lys Pro Phe Leu Pro Asn 275 280 GAT CGC TCA GAA TTT ACG CAA AAA Asp Arg Ser Glu Phe Thr Gln Lys 295 ATT AAT GTT TTT Ile Asn Val Phe
TAT
Tyr 290 1134 GCC ATT Ala Ile 300 GCC ACC GTT TCT Ala Thr Val Ser ACC CTC ATT GAA Thr Leu Ile Glu
GCC
Ala 310 GTT GTT TTA ATC Val Val Leu Ile 1182 1230 ATC ACG CTC TTT Ile Thr Leu Phe
TTA
Leu 320 TTT TTA GGG AAT Phe Leu Gly Asn AGG GCG AGC GTG Arg Ala Ser Val
GCT
Ala 330 GTG GGG GTG ATi Val Gly Val Ile Ii. A Leu 335 CCT TTA AGC TTG Pro Leu Ser Leu
TCC
Ser 340 GTG GCG TTT ATT TTT ATC 1278 Val Ala Phe Ile Phe Ile 345 AAG TTT AGC Lys Phe Ser ATC GCT ATA Ile Ala Ile 365 CTG ACT TTA AAT Leu Thr Leu Asn ATG AGT TTA GGG Met Ser Leu Gly GGA TTG GTT Gly Leu Val 360 1326 GGC ATG CTC ATT Gly Met Leu Ile
GAC
Asp 370 TCA GCC GTG GTG GTG GTG GAA AAC Ser Ala Val Val Val Val Glu Asn 375 1374 GCT TTT Ala Phe 380 GAA AAA TTA AGC Glu Lys Leu Ser AAC ACT AAA ACC Asn Thr Lys Thr
ACT
Thr 390 AAA CTC CAT GCA Lys Leu His Ala
ATC
Ile 395 TAT CGT TCG TGT Tyr Arg Ser Cys
AAA
Lys 400 GAA ATC GCT GTT Glu Ile Ala Val
TCA
Ser 405 GTG GTG AGC GGG Val Val Ser Gly
GTG
Val 410 GTG ATC ATC ATT Val Ile Ile Ile TTT TTT GTG CCG Phe Phe Val Pro
ATT
Ile 420 TTA ACC TTA CAG Leu Thr Leu Gln GGG TTA Gly Leu 425 1422 1470 1518 1566 1614 GAG GGT AAG Glu Gly Lys TTA GGC ACT Leu Gly Thr 445 TTT AGG CCT TTA Phe Arg Pro Leu GCG CAA AGC ATT GTG TAT GCG CTT Ala Gln Ser Ile Val Tyr Ala Leu 435 440 ACA ATC ATT CCT GTA GTC AGC TCT Thr Ile Ile Pro Val Val Ser Ser TTA GTT CTA TCT Leu Val Leu Ser
ATT
Ile 450 CTT GTC Leu Val 460 TTA AAA GCC ACG Leu Lys Ala Thr CAT AGC GAA ACC His Ser Glu Thr TTA ACG AGG TTT Leu Thr Arg Phe
TTA
Leu 475 AAC AGA ATC TAC Asn Arg Ile Tyr GCC Ala 480 CCT TTA TTG GAA Pro Leu Leu Glu
TTT
Phe 485 TTT GTG CAT AAC Phe Val His Asn
CCT
Pro 490 1662 1710 1758 AAA AAA GTG ATT Lys Lys Val Ile
TTA
Leu 495 GGA GCG TTT GTT Gly Ala Phe Val
TTT
Phe 500 TTA ATC GCA AGC Leu Ile Ala Ser CTT TCT Leu Ser 505 TTA TTC CCT Leu Phe Pro GAT GTG GTT Asp Val Val 525 GTG GGG AAG AAT Val Gly Lys Asn ATG CCC GTT TTA Met Pro Val Leu GAT GAG GGC Asp Glu Gly 520 1806 TTG AGC GTG GAA Leu Ser Val Glu ACC CCT TCT ATT TCT TTA GAT CAA Thr Pro Ser Ile Ser Leu Asp Gln 535 1854 TCT AGG GAT CTC ATG CTA AAC ATT GAG AGC GCG ATT AAA AAG CAT GTC Ser Arg Asp Leu Met Leu Asn Ile Glu Ser Ala Ile Lys Lys His Val 540 545 550 1902
AAG
Lys 555 GAA GTT AAA AGC Glu Val Lys Ser GTC GCG CGC Val Ala Arg ACA GGG Thr Gly 565 AGC GAT GAA TTG GGG Ser Asp Glu Leu Gly CTG GAT TTA GGA Leu Asp Leu Gly
GGT
Gly 575 TTG AAT CAA ACC Leu Asn Gln Thr ACT TTT ATT TCT Thr Phe Ile Ser TTT ATT Phe Ile 585 1950 1998 2046 CCT AAA AAA Pro Lys Lys ATC ATG GAT Ile Met Asp 605
GAA
Glu 590 TGG AGC GTT AAA ACC AAA GAT GAA TTA Trp Ser Val Lys Thr Lys Asp Glu Leu 595 TTA GAA AAA Leu Glu Lys 600 TCT TTC ACC Ser Phe Thr TCT TTA AAA GAC Ser Leu Lys Asp AAG GGG ATT AAC Lys Gly Ile Asn
TTT
Phe 615 2094 CAA CCC Gln Pro 620 ATT GAA ATG AGA Ile Glu Met Arg
ATT
Ile 625 TCT GAA ATG CTG Ser Glu Met Leu
ACA
Thr 630 GGG GTT AGG GGG Gly Val Arg Gly 2142 2190
GAT
Asp 635 TTA GCG GTT AAG Leu Ala Val Lys TTT GGA GAT GGT Phe Gly Asp Gly
ATT
Ile 645 AGC GAA TTG AAT Ser Glu Leu Asn TTG AGT TTT CAA ATC GCG CAA GCT CTA Leu Ser Phe Gln Ile Ala Gln Ala Leu 655 GGG ATT AAA GGA Gly Ile Lys Gly TCT AGT Ser Ser 665 2238 GAA GTT TTA Glu Val Leu CCT AAT AAA Pro Asn Lys 685
ACC
Thr 670 ACG CTT AAT GAG Thr Leu Asn Glu CTC AAT TAT TTG Val Asn Tyr Leu TAT GTA ACC Tyr Val Thr 680 GAT GAA TTT Asp Glu Phe 2286 2334 GAA TCC ATG GC Glu Ser Met Ala
GAT
Asp 690 CTC CCC ATC ACT Val Gly Ile Thr TCC AAG Ser Lys 700 TTT TTA AAA TCC Phe Leu Lys Ser
GCT
Ala 705 TTA GAG CCC TTG GTT GTA GAT GTC ATC Leu Glu Gly Leu Val Val Asp Val Ile 710 2382 ACA CCC ATT TCA Thr Gly Ile Ser
CC
Arg 720 ACC CCA GTG ATG Thr Pro Val Met
ATC
Ile 725 CCC CAA GAG AC Arg Gln Glu Ser
GAT
Asp 730 2430 2478 TTT GCA AGC TCT Phe Ala Ser Ser ACT AAA ATC AAA Thr Lys Ile Lys TTA CCC TTG ACT Leu Ala Leu Thr TCA AAA Ser Lys 745 TAT CCC GTT Tyr Gly Val GAT CCC CCT Asp Gly Pro 765
TTA
Leu 750 GTC CCT ATC ACT Val Pro Ile Thr
TCT
Ser 755 ATC CCC AAA ATT Ile Ala Lys Ile GAA GAA GTC Glu Glu Val 760 ATG AGC GTG Met Ser Val 2526 2574 GTT TCT GTT GTC Val Ser Val Val
CGT
Arg 770 GAA AAT TCA ATG Glu Asn Ser Met GTT CGC Val Arg 780 AGT AAT GTG Ser Asn Val GTPG -GG Val Gly 785 CGC GAT TTG AAA Arg Asp Leu Lys TTT GTA GAA GAG Phe Val Glu Glu
OCT
Ala 795 AAA AAA GTG ATC Lys Lys Val Ile CAA AAC ATC AA Gln Asn Ile Lys
CTC
Leu 805 CCT CCC AGC TAC Pro Pro Ser Tyr
TAT
Tyr 810 2622 2670 2718 ATC ACT TAT GG Ile Thr Tyr Gly CAG TTT GAA AAC Gln Phe Glu Asn CAA CGG GCC AAT Gln Arg Ala Asn AAA AGG Lys Arg 825 CTC TCC ACC Leu Ser Thr TTT TTC ACT Phe Phe Thr 845 ATC CCT TTA AGC Ile Pro Leu Ser TTA GCG ATT TTT Leu Ala Ile Phe TTC ATT CTT Phe Ile Leu 840 CTT TTG AAT Leu Leu Asn 2766 2814 TTT AAA AGC ATT Phe Lys Ser Ile
CCT
Pro 850 TTA GCC TTG CTC Leu Ala Leu Leu ATC CCT Ile Pro 860 GAG TAT Glu Tyr 875 TTT GCG GTT ACC Phe Ala Val Thr
GGA
Gly 865
GCG
Ala GGC CTT ATT GCG Gly Leu Ile Ala TTT GCG GTC GGG Phe Ala Val Gly 2862 2910 ATT TCA GTG Ile Ser Val AGC GTG GGC Ser Val Gly
TTT
Phe 885 ATC GCT CTT TTT Ile Ala Leu Phe ATT GCG GTT TTA AAT GGC GTG GTG ATG ATA GGC TAT TTT AAA Ile Ala Val Leu Asn Gly Val Val Met Ile Gly Tyr Phe Lys GAG CTT Glu Leu 905 2958 CTC TTG CAA Leu Leu Gln AGG CGT TTG Arg Arg Leu 925 GGG Gly 910 AAA AGC GTA GAA Lys Ser Val Glu TGC GTT TTA TTG Cys Val Leu Leu GGC GCT AAA Gly Ala Lys 920 GGT TTG GGT Gly Leu Gly 3006 3054 AGA CCG GTT TTA Arg Pro Val Leu
ATG
Met 930 ACC GCT TGC ATT Thr Ala Cys Ile TTG CTC Leu Leu 940 CCT TTA TTA TTT Pro Leu Leu Phe CAT AGC GTG GGA His Ser Val Gly
TCA
Ser 950 GAA GTC CAA AAA Glu Val Gln Lys 3102 3150 TTA GCG ATC GTG Leu Ala Ile Val CTT GGA GGC TTG Leu Gly Gly Leu
GTT
Val 965 ACC TCA AGC GCT Thr Ser Ser Ala
CTA
Leu 970 ACC TTA CTC CTA Thr Leu Leu Leu AAA ATC GTT TGAC Lys Ile Val
CTG
Leu 975 CCG CCA ATG TTT ATG CTC ATC GCT AAA Pro Pro Met Phe Met Leu Ile Ala Lys AAG ATT Lys Ile 985 3198 3256 TTAAAG GATTTCACAT GCTCGCTTTA GAAATTTATA TTGATATTT GTTTGAAAGA CGCTTTAATA GATT 3280
INFORMATION FOR SEQ ID NO:108: SEQUENCE CHARACTERISTICS: LENGTH: 1019 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...30 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:108: Met Leu Ala Ser Ile Ile Glu Phe Ser Leu Arg Gln Arg Val Ile Val -25 -20 Ile Val Gly Ala Ile Leu Ile Leu Phe Phe Gly Thr Tyr Ser Phe Ile -5 1 Asn Thr Pro Val Asp Ala Phe Pro Asp Ile Ser Pro Thr Gln Val Lys 10 Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn Asn 25 Ile Val Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln Lys 40 45 Ser Leu Arg Ser Val Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile Asp 60 Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn Glu 75 Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly Gly 90 Met Ala Pro Ile Val Thr Pro Leu Ser Asp Ile Phe Met Phe Thr Ile 100 105 110 Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe Val 115 120 125 130 Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn Ser 135 140 145 Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn Asp 150 155 160 Met Ala Arg Leu Gly Val Ser Ile Ser Asp Leu Glu Ser Ala Val Arg 165 170 175 Val Asn Leu Arg Asn Ser Gly Ala Gly Arg Val Asp Arg Asp Gly Glu 180 185 190 Thr Phe Leu Val Lys Ile Gln Thr Ala Ser Leu Ser Leu Glu Asp Ile 195 200 205 210 Gly Lys Ile Thr Val Ser Thr Asn Leu Gly His Leu His Ile Lys Asp 215 220 225 Phe Ala Lys Val Ile Ser Gln Ser Arg Thr Arg Leu Gly Phe Val Thr Lys Asp Glu 275 Asp Leu Gly Leu Leu 355 Ser Thr Ala Pro Ala 435 Thr Ser Leu Val Phe 515 Thr Glu Arg Thr Thr 595 Lys Glu Asp Leu Asp Ala 260 Leu Arg Ile Asn Ser 340 Met Ala Lys Val Ile 420 Gln
Ile Glu Glu Phe
500 Met Pro Ser Thr Asp 580 Lys Gly Met Gly Lys 660 Gly 245 Asn Lys Ser Glu Leu 325 Val Ser Val1 Thr Ser 405 Leu Ser Ile Thr Phe 485 Leu Pro Ser Ala Gi1y 565 Thr Asp Ile Leu Ile 645 Gly Val1 Thr Pro Glu Al a 310 Arg Al a Leu Val Thr 390 Val1 Thr Ile Pro Phe 470 Phe Ile Val1 Ile Ile 550 Ser Phe Giu Asn Thr 630 Ser Ile Giu Glu Leu 280 Thr Val Ser Ile Gly 360 Val Leu Ser Gin Tyr 440 Val1 Thr His Ser Asp 520 Leu Lys Giu Ser Leu 600 Ser Val Leu Gly Thr 250 Ile Asn Lys Ile Ala 330 Ile Val Asn Ala Val1 410 Leu Leu Ser Phe Pro 490 Ser Gly Gin Val Gly 570 Ile Lys Thr Gly Glu 650 Ser Leu Val1 Ser 285 Ala Thr Val Ser Ile 365 Giu Arg Ile Lys Thr 445 Leu Arg Val1 Pro Val1 525 Asp Val1 Leu Lys Asp 605 Ile Al a Phe Leu Val1 Tyr 270 Ile Thr Leu Ile Asp 350 Gly Lys Ser Ile Met 430 Leu Lys Ile Ile Phe 510 Leu Leu Lys Gly Giu 590 Ser Giu Val1 Gin Thr 670 240 Leu Ser 255 Gin Lys Asn Val Val Ser Phe Leu 320 Leu Pro 335 Leu Thr Met Leu Leu Ser Cys Lys 400 Vai Phe 415 Phe Arg Val Leu Ala Thr Tyr Aia 480 Leu Gly 495 Val Gly Ser Val Met Leu Ser Ile 560 Giy Leu 575 Trp Ser Leu Lys Met Arg Lys Ile 640 Ile Ala 655 Thr Leu Leu Leu Phe Lys 305 Phe Leu Leu Ile Al a 385 Glu Phe Pro Ser Pro 465 Pro Al a Lys Giu As n 545 Val1 As n Val1 Asp Ile 625 Phe Gin Asn Lys Giu Tyr 290 Thr Leu Ser Asn Asp 370 Asn Ile Val1 Leu Ile 450 His Leu Phe Asn Thr 530 Ile Al a Gin Lys Phe 610 Ser Gly Ala Glu WO 98/21225 WO 9821225PCTIUS97/2 1353 -233- Gly 675 Val Giu Val1 Lys Ser 755 Giu Asp Ile Asn Ile 835 Leu Leu.
Val1 Met Giu 915 Thr Ser Gly Phe Val1 Gly Gly Met Ser 740 Ile Asn Leu Lys Gin 820 Leu Al a Ile Gly Ile 900 Cys Ala Val1 Leu Met 980 Asn Ile Leu Ile 725 Leu Ala Ser Lys Leu 805 Gin Al a Leu Ala Phe 885 Gly Val1 Cys Gly Val1 965 Leu Tyr Thr Val 710 Arg Ala Lys Met Ser 790 Pro Arg Ile Leu Leu 870 Ile Tyr Leu Ile Ser 950 Thr Ile Leu lyr--Val Thr Pro Asn 680 Ser 695 Vali Gin Leu Ile Arg 775 Phe Pro Ala Phe Ile 855 Phe Ala Phe Leu Al a 935 Glu Ser Ala Asp Asp Glu Thr Giu 760 Met Val1 Ser Asn Phe 840 Leu Al a Leu Lys Gly 920 Gly Val1 Ser Lys Giu Val1 Ser Ser 745 Giu Ser Glu Tyr Lys 825 Ile Leu Val1 Phe Giu 905 Ala Leu Gin Ala Lys 985 Phe Ile Asp 730 Lys Val1 Val Giu.
Tyr 810 Arg Leu Asn Gly Gly 890 Leu
Lys Gly Lys Leu 970 Ile Ser Pro 715 Phe Tyr Asp Val Ala 795 Ile Leu Phe Ile Glu
875 Ile Leu Arg Leu Pro 955 Thr Lys Lys 685 Phe Gly Ser Val Pro 765 Ser Lys Tyr Thr Thr 845 Phe Ile Val Gln Leu 925 Pro Ala Leu Val Glu Ser Leu Lys Ile Ser Ser Ile 735 Leu Val 750 Val Ser Asn Val Val Ile Gly Gly 815 Val Ile 830 Phe Lys Ala Val Ser Val Leu Asn 895 Gly Lys 910 Arg Pro Leu Leu Ile Val Leu Leu 975 Met Ser Arg 720 Thr Pro Val Val Ala 800 Gln Pro Ser Thr Pro 880 Gly Ser Val Phe Val 960 Pro Ala Ala 705 Thr Lys Ile Val Gly 785 Gln Phe Leu Ile Gly 865 Ala Val Val Leu Ser 945 Leu Pro Asp 690 Leu Pro Ile Thr Arg 770 Arg Asn Glu
Ser Pro 850 Gly Ser Val Glu Met 930 His Gly Met INFORMATION FOR SEQ ID NO:109: (i) SEQUENCE CHARACTERISTICS: LENGTH: 898 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: .835 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 86...161 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:109: GCATAAAATA AACAAACATT AAGTAAOGCT TATCAATATT TGATTACAAT TATAAGGGTT ACATTTTTTT AATAGGAGAT ATACC
ATG
Met CTA GGA AAC GTT AAA AAA ACC CTT Leu Gly Asn Val Lys Lys Thr Leu 112 160 TTT GG Phe Gly GTC TTG TGT TTG Val Leu Cys Leu
GGC
Gly ACG TTG TGT TTG AGA GGG TTA ATG GCA Thr Leu Cys Leu Arg Gly Leu Met Ala
GAG
Glu 1 CCA GAC GCT Pro Asp Ala GAG CTT GTT AAT Glu Leu Val Asn
TTA
Leu 10 GGC ATA GAG AGC GCG AAG Gly Ile Glu Ser Ala Lys 208 AAG CAA GAT Lys Gln Asp TTA AAA AAT Leu Lys Asn
TTC
Phe GCT CAA GCT AAA ACG CAT TTT GAA AAA Ala Gln Ala Lys Thr His Phe Glu Lys 25 GCT TGT GAG Ala Cys Glu GGC TTT GGA TGT GTT TTT TTA GGG GCG TTC TAT GAA GAA Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu GGG AAA Gly Lys GGA GTG GGA AAA Gly Val Gly Lys
GAC
Asp TTG AAA AAA GCC ATC CAA TTT TAC ACT Leu Lys Lys Ala Ile Gln Phe Tyr Thr
AAA
Lys GGT TGT GAA TTA Gly Cys Glu Leu GAT GGT TAT GGG TGT AAC CTG CTA GGA Asp Gly Tyr Gly Cys Asn Leu Leu Gly TTA TAC TAT AAC Leu Tyr Tyr Asn CAA GGC GTG TCA Gln Gly Val Ser
AAA
Lys GAC GCT AAA AAA Asp Ala Lys Lys GCC TCA Ala Ser CAA TAC TAC Gln Tyr Tyr GTA TTA GGA Val Leu Gly 115 AAA GCT TGC GAC Lys Ala Cys Asp TTA AAC CAT GCT GAA GGG TGT ATG Leu Asn His Ala Glu Gly Cys Met 105 110 GGC GTA GGC ACG CCT AAG GAT TTA Gly Val Gly Thr Pro Lys Asp Leu 125 448 496 544 ACC TTA CAC CAT Ser Leu His His AGA AAG Arg Lys 130 GCT CTT GAT TTG Ala Leu Asp Leu
TAT
Tyr 135 GAA AAA GCT TGC GAT TTA AAA GAC AGC Glu Lys Ala Cys Asp Leu Lys Asp Ser 140
CCA
Pro 145 GGG TGT ATT AAT Gly Cys Ile Asn
GCA GGA
Ala Gly 150 TAT ATA TAT Tyr Ile Tyr AGT GTA ACA AAG Ser Val Thr Lys 155 TGC GAA TTA AAA Cys Glu Leu Lys AAT TTT Asn Phe 160 GAT GGT Asp Gly 175 640 688 AAG GAG GCT ATC Lys Glu Ala Ile
GTT
Val 165 CGT TAT TCT AAA Arg Tyr Ser Lys AGG GGG TGT Arg Gly Cys GCA AAG GAC Ala Lys Asp 195 TCA AGC GTT Ser Ser Val 210
TAT
Tyr 180 AAT TTA GGG GTT Asn Leu Gly Val CAA TAC AAC GCT Gln Tyr Asn Ala GAA AAG CAA GCG Glu Lys Gln Ala
GTA
Val 200
GAC
Asp GAA AAC TTT AAA Glu Asn Phe Lys CAA GGT ACA Gln Gly Thr 190 GGC TGC AAA Gly Cys Lys AAA ATA GAA Lys Ile Glu AAA GAA GCA Lys Glu Ala
TGC
Cys 215 GCT CTC AAG Ala Leu Lys
GAA
Glu 220
CTT
Leu 225 TAATTTCAAT GAAGTTAGCT AAACGCTGCG TTTAGCTGGC TTTTACGCTT TTTATA
TTTTAAG
INFORMATION FOR SEQ ID NO:110: SEQUENCE CHARACTERISTICS: LENGTH: 250 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...25 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID.NO:110: Met Leu Gly Asn Val Lys Lys Thr Leu Phe -20 Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Gly Val -15 Pro Asp Leu Cys Leu Gly Ala Val Asn Leu Gly Ile Glu Ser Ala Lys Lys Gin Asp Phe 15 Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly 30 Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val 45 Lys Glu Leu Ala Gin Ala Phe Gly Cys Gly Lys Asp WO 98/21225 PCT/US97/21353 Leu Gly Val Asp Tyr 120 Glu Tyr Ser Val Val 200 Asp Lys Tyr Ser Leu 105 Gly Lys Ile Lys Met 185 Glu Ala Lys Ala Gly Cys Lys Asp Asn His Val Gly Ala Cys Tyr Ser 155 Ala Cys 170 Gin Tyr Asn Phe Leu Lys lie uin. Phe Tyr Thr Asn Ala Ala Thr Asp 140 Val Glu Asn Lys Glu 220 Leu Lys Gly 110 Lys Lys Lys Lys Gin 190 Gly Lys Gly Ala 95 Cys Asp Asp Asn Asp 175 Gly Cys Ile Asn 80 Ser Met Leu Ser Phe 160 Gly Thr Lys Glu -236- Lys Leu Gin Val Arg Pro 145 Lys Arg Ala Ser Leu 225 Gly Cys Glu Leu Asn Asp Tyr Tyr Leu Lys 130 Gly Glu Gly Lys Ser 210 Tyr Tyr Gly 115 Ala Cys Ala Cys Asp 195 Val Asn Ser 100 Ser Leu Ile Ile Tyr 180 Glu Lys Gly Lys Leu Asp Asn Val 165 Asn Lys Glu Gin Ala His Leu Ala 150 Arg Leu Gin Ala Gly Cys His Tyr 135 Gly Tyr Gly Ala.
Cys 215 INFORMATION FOR SEQ ID NO:111: SEQUENCE CHARACTERISTICS: LENGTH: 1079 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 169...834 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 169...289 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:111: CAAAAAAAAA AAAAAACAAT TTCAGTTTCT TATTAGCTAG GTTTGATTAA AATGAAAAGC TTTTATGTGT TTAAACTTCA TTGTCTTAAA ACTTTTAAGA GCAATTTTAA AATTCGTTGG CGTATAATAT CCGTTTTGAA TGAACTACTA AAAAAAGGGT TTTAAATA ATG GCT GAA Met Ala Glu AAT TCT TTC AAA AAT GTT TCC ACA CAA CCC AAA GTA TTT TTC TTA TTG 120 177 WO 98/21225 WO 98/ 225PCTfUS97/21353 -237- Asn Ser Phe Lys Asn vai Ser Thr Gin Pro Lys Vai Phe Phe Leu Leu CCA GCT Pro Ala AAA ACC CTG TTT Lys Thr Leu Phe
CTT
Leu -15 TTA GGA GGC GTT Leu Gly Gly Val TTT AGC Phe Ser GCG TTT TTT Ala Phe Phe 273
ATC
Ile CTT ATT GCT GGC TTG GTT TTT TTT GAT TAT GCT CAT TTG ATG GAC Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His Leu Met Asp AAT GCC ATT Asn Ala Ile ATT TTA ACT Ile Leu Thr
TTT
Phe
CTA
Leu AAT TTT GCG CGT Asn Phe Ala Arg
TCA
Ser 20
ATC
Ile ACC CCC TTT AAT Thr Pro Phe Asn TCC AGC CCT Ser Ser Pro TCT TCT CAA Ser Ser Gln ATC CTC CAA Ile Leu Gln
AAT
Asn GCT AAT TTA Ala Asn Leu TTC GTG Phe Val TTG CCT TTC AGT TTG TTG GTG GGG GTG Leu Pro Leu Ser Leu Leu Val Gly Val 50
TTT
Phe TTA AGC CTT TAT Leu Ser Leu Tyr
CGC
Arg AGA AAC TTA CTG Arg Asn Leu Val
CTT
Leu 65 GGG GTG TGG TTT Gly Val Trp Phe TTA ACC GTG ATC Leu Ser Val Ile
TTG
Leu 513 561 TTT GAA GCC CTT Phe Glu Ala Leu
TTA
Leu GAA TCT TTA AAA Glu Ser Leu Lys CTT TTT GCA TAT Leu Phe Ala Tyr TCC ATT Ser Ile CAG TGG CTT Gln Trp Leu TTA GTG CTA Leu Val Leu 110 CGC ACC GCT AAT Arg Ser Ala Asn
TTC
Phe 100 CCT AAC GCT ACT Pro Asn Ala Thr GCG CTT TCT Ala Leu Ser 105 CAT TTA ATC His Leu Ile 609 657 TTT TAT GGG TTC Phe Tyr Gly Leu ATT TTA TTG ATA Ile Leu Leu Ile
CCC
Pro 120 ACG CAT Thr His 125 CAA ACG CTT AAA Gln Thr Leu Lys
AAT
Asn 130 GTT CTT TTT TAT AGC TTA TTT GGT TTG Val Leu Phe Tyr Ser Leu Phe Gly Leu 135 TTT TTA ATA GGC Phe Leu Ile Gly GCA CTG ATT GTT Ala Leu Ile Val GGG GTT TCT TTC Gly Val Ser Phe ACT GTT TTA GGA Ser Val Leu Gly TTT TGT TTA CCC Phe Cys Leu Gly
GCG
Al a 165 TTA GG GCT TGT Leu Cly Ala Cys TTT TCC Phe Ser 170 753 801 854 ATA CCC ATT TAT TTC ACC CTC TTT Ile Cly Ile Tyr Leu Ser Val Phe 175 CAA AAG ATC Gin Lys Ile 180 TAAACCAACC CCTTAAAAGA ATCAAAATTT TATCAACCTT TTAATATTGG ATTTAAAGCT ATTATTCCAA CCCATTCTTC 91 -914 WO 98/21225 PCT/US97/21353 -238- ATTTTTTCAT CAAGCTCAAT AAAAAGCAAA AAATCGCCCT GATTGCAGCT GGGGTTTTGA 974 TCACGGCTTT GCTTGTGTTT TTATTGCTCT ATCCCTTTAA AGAAAAAGAC TACACGCAAG 1034 GGGGTTATGG GGTTTTATTT GAAGGTTTAG ACTCTAGCGA TAACG 1079 INFORMATION FOR SEQ ID NO:112: SEQUENCE CHARACTERISTICS: LENGTH: 222 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...40 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:112: Met Ala Glu Asn Ser Phe Lys Asn Val Ser Thr Gln Pro Lys Val Phe -35 -30 Phe Leu Leu Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser -15 Ala Phe Phe Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His 1 Leu Met Asp Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn 15 Ser Ser Pro Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly 30 35 Ser Ser Gln Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu 50 Ser Leu Tyr Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser 65 Val Ile Leu Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala 80 Tyr Ser Ile Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr 95 100 Ala Leu Ser Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro 105 110 115 120 His Leu Ile Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu 125 130 135 Phe Gly Leu Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val 140 145 150 Ser Phe Ser Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala 155 160 165 Cys Phe Ser Ile Gly Ile Tyr Leu Sef Val Phe Gln Lys Ile 170 175 180 INFORMATION FOR SEQ ID NO:113: WO 98/21225 PCT/US97/21353 -239- SEQUENCE CHARACTERISTICS: LENGTH: 962 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence 
LOCATION: 97...912 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 97...217 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:113: TTTTATTGAA TGTGTTGTAA TGTTTTTAAG GTATAATAAA AAAGTTTGCA ACCTGATGAG AGTAATAATA GAGTTT ATG Met CTCTTTTTAA GTCAAGCAAT CTG ATT TCA TTA AAA Leu Ile Ser Leu Lys 114 ACA TTC CTA AAA Thr Phe Leu Lys
ATA
Ile TTA TTG AAA ATA Leu Leu Lys Ile CTA AAA ACC TTC Leu Lys Thr Phe CAA AAG Gln Lys ATT TGG GTA Ile Trp Val AAC GCT AAC Asn Ala Asn 1 TGC GTT ATT ATT TGG GGG TTA GGC TGT AGT TTT TTA Cys Val Ile Ile Trp Gly Leu Gly Cys Ser Phe Leu AGC ATT CAA TTA GAA GAA ACG CTC Ser Ile Gln Leu Glu Glu Thr Leu CGA AGC CCT AAA Arg Ser Pro Lys
AAT
Asn CTT ATT TGG CAA Leu Ile Trp Gln TTT AAA AAG AAG TTT AAA AAG AGC AAC Phe Lys Lys Lys Phe Lys Lys Ser Asn ATC CCT TAT GCC Ile Pro Tyr Ala
CCA
Pro AAT AGC CGT TGG Asn Ser Arg Trp
AAA
Lys 40 TAT TTA GGC ACG Tyr Leu Gly Thr AGC ATA Ser Ile GGG ATT TTA Gly Ile Leu ATG CCA GAG Met Pro Glu GTG TCT TTG GTG ATA GGG ATT GTG GGG Val Ser Leu Val Ile Gly Ile Val Gly CTG TAT CTC Leu Tyr Leu GGG ATC AAA Gly Ile Lys 354 402 450 AGC GTA ACG AAT TGG GAT AAA GAA AAG Ser Val Thr Asn Trp Asp Lys Glu Lys
TTT
Phe AGT TGG TTT Ser Trp Phe GAA AAT GTC CGC Glu Asn Val Arg ATG GGG CCA AAA Met Gly Pro Lys CTG GAC AAT GAT AGT Leu Asp Asn Asp Ser 498
TTT
Phe ATT TTT AAT GAA Ile Phe Asn Glu
ATT
Ile 100 TTG CAC CCT TAT Leu His Pro Tyr
TTT
Phe 105 GGG GCT ATG TAT Gly Ala Met Tyr
TAT
Tyr 110 546 594 ATG CAA CCG CGC Met Gln Pro Arg
ATG
Met 115 GCT GGA TTT AGC Ala Gly Phe Ser
TGG
Trp 120 ATG GCA TCA GCG Met Ala Ser Ala TTT TTT Phe Phe 125 TCT TTT ATC ACT TCC ACG CTT TTT Ser Phe Ile Thr Ser Thr Leu Phe 130 TGG GAA TAT GGC TTG GAA GCG TTT Trp Glu Tyr Gly Leu Glu Ala Phe 135 140 TTA GTG ATC ACG CCT TTA TTA GGC Leu Val Ile Thr Pro Leu Leu Gly 155 642 GTG GAA GTG Val Glu Val 145 CCT AGC TGG CAG Pro Ser Trp Gln 690 TCC ATT Ser Ile 160 TTA GGG GAG GGG Leu Gly Glu Gly
TTT
Phe 165 TAT CAG CTC ACG Tyr Gln Leu Thr TAT ATC CAA CGC Tyr Ile Gln Arg
AAT
Asn 175 GAA GGC AAG CTT Glu Gly Lys Leu GGC TCT TTA TTT Gly Ser Leu Phe
TTA
Leu 185 GGG CGT TTA GTC Gly Arg Leu Val 738 786 834 GCT CTT ATG GAT Ala Leu Met Asp
CCT
Pro 195 ATC GGT TTT ATC Ile Gly Phe Ile AGG GAT TTA GGA Arg Asp Leu Gly CTT GGG Leu Gly 205 GAA GCT TTA GGG ATT TAT AAT AAA Glu Ala Leu Gly Ile Tyr Asn Lys 210 GAA ATC CGT TCC AGC TTA AGC Glu Ile Arg Ser Ser Leu Ser 220 CCC AAT GGT Pro Asn Gly 225 TTG AAT TTG ACT Leu Asn Leu Thr TAC AAA TTT Tyr Lys Phe 230 TAAGAGCTTA AAATTTAAGA AAA TTATAAAGAG TTTTGATAGA ATACCTT INFORMATION FOR SEQ ID NO:114: SEQUENCE CHARACTERISTICS: LENGTH: 272 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence WO 98/21225 WO982125PCTIUS97/2 .1353 -241- LO.CATIUh: OTHER INFORMATION: Met Leu Leu Leu Phe Tyr Ile Glu Lys Phe 105 Met Tyr Ile Thr Leu 185 Arg Ile (xi) SEQUENCE Leu Ile Ser Leu Lys Thr Phe Gin Gly Cys Ser Phe Arg Arg Ser Pro Lys Lys Ser Asn Leu Gly Thr Ser Val Gly Leu Tyr Lys Phe Gly Ilie Leu Asp Asn Asp Gly Ala Met Tyr Ala Ser Ala Phe 125 Gly Leu Glu Ala 140 Thr Pro Leu Leu 155 Arg Tyr Ile Gin 170 Giy Arg Leu Val Asp Leu Gly Leu 205 Arg Ser Ser Leu 220 DESCRIPTION: SEQ ID NO:i14: Lys -35 Lys Leu Lys Thr 30 Ile Leu Lys Ser Tyr 110 Phe Phe Gly Arg Ile 190 Giy Ser Thr Ile Asn As n i5 Ile Giy Met Ser Phe 95 Met Ser Val1 Ser Asn 175 Ala Glu Pro Phe Trp Al a Leu.
Pro Ile Pro Trp 80 Ile Gin Phe Giu Ilie 160 Giu Leu Aila Asri Leu Val Asn 1 Ile Tyr Leu Glu 65 Phe Phe Pro Ile Val1 145 Leu Gly Met Leu Gly 225 Lys Val1 -15 Ser Trp Al a Giy s0 Ser Giu Asn Arg Thr 130 Pro Gly Lys Asp Gly 210 Leu Leu Val1 Gin His Asn Ser Thr Val Ile 100 Al a Thr Trp Gly Phe 180 Ile Tyr Leu Leu Ile Leu Phe Ser Leu Asn Arg Leu Gly Leu Gin Phe 165 Gly Gly Asn Thr Lys Ile Glu Lys Arg Vali Trp Met His Phe Phe Asp 150 Tyr Se r Phe Lys Tyr 230 Ile Trp Giu.
Lys Trp Ile Asp Gly Pro Ser Trp 135 Leu Gin Leu Ile His 215 Lys Phe Gly Thr Lys Lys Gly Lys Pro Tyr Trp 120 Giu Val1 Leu Phe Ile 200 Glu Phe INFORMATION FOR SEQ ID NO-115: SEQUENCE CHARACTERISTICS: LENGTH: i422 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 216... .1202 WO 98/21225 PCT/US97/21353 -242- OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 216...273 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:115: AAATTAAACG AGTTTGGTTT AGAGCCGTAT TTAGGGTTTT TGCACCCCCA GATTTTGAAA ATAACCCTAA TGAGCAATCA GCGCTCTTTG TCTTGCCCCT AGCGCTCTTA ATGTGCATGC ACTCAAATTT GTGTTGTTGG AAGCGTTACC ATTTTTAAAA TAATCCATTA AAATAAAGGC GAGGA ATG AAA AGA TTT
TTTAACCAAT
TTCAGCGGTT
CTAAAACGCT
GTT TTG 120 180 233 Met Lys Arg Phe Val Leu TTT TTA TTG TTC ATG TGC GTT TGC Phe Leu Leu Phe Met Cys Val Cys CAA GCT TAC GCC Gln Ala Tyr Ala GAG CAA GAT Glu Gln Asp 1 TAC TTT TTT AGG GAT TTT AAA TCT AGA GAT TTG Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp Leu CAA AAA CTC CAT Gln Lys Leu His
CTT
Leu GAT AAA AAG CTC Asp Lys Lys Leu
TCC
Ser 25 CAA ACA ATA CAG Gln Thr Ile Gln TGC ATG CAA CTT AAC Cys Met Gln Leu Asn GCA TCA AAA CAC TAC ACT TCT ACC GGG GTT AGA GAG CCT GAT Ala Ser Lys His Tyr Thr Ser Thr Gly Val Arg Glu Pro Asp AAA TGC Lys Cys ACA AAG AGT Thr Lys Ser GGT TAT TTG Gly Tyr Leu
TTT
Phe AAA AAA TCC GCT CTC ATG TCC TAT GAC Lys Lys Ser Ala Leu Met Ser Tyr Asp 60 TTA GCG CTA Leu Ala Leu GCT ATA GAA Ala Ile Glu GTG AGT AAG AAT Val Ser Lys Asn CAA TAC GGC TTA Gln Tyr Gly Leu ATT TTA Ile Leu AAC GCT TGG GCT Asn Ala Trp Ala
AAA
Lys GAG CTT CAA AGC Glu Leu Gln Ser GAT ACT TAT CAG Asp Thr Tyr Gln GAG GAT AAT ATC Glu Asp Asn Ile
AAT
Asn 105 TTT TAC ATG CCT Phe Tyr Met Pro
TAT
Tyr 110 ATG AAC ATG GCT Met Asn Met Ala
TAT
Tyr 115 TGG TTT GTC AAA Trp Phe Val Lys GCG TTT CCT AGC Ala Phe Pro Ser
CCA
Pro 125 GAA TAT GAA GAT TTC ATT Glu Tyr Glu Asp Phe Ile 130 AAG CGG ATG CGC CAG TAT TCT CAA TCA GCT CTT AAC ACT AAC CAT GGG 713 WO 98/21225 PCTIUS97/21353 -243- Lys Arg Met Arg Gin Tyr-ser Gin Ser Ala Leu Asn Thr Asn His Gly 145 TTA GAC GAT Leu Asp Asp GCG TGG GGC Ala Trp Gly 150 ATT CTT TTT GAT Ile Leu Phe Asp AGT TCT GCG Ser Ser Ala CTA GCG Leu Ala 160 AAT GCC Asn Ala 165 CTT TTG CAC AAT Leu Leu His Asn
AGC
Ser 170 GCT AAT CGG TGG Ala Asn Arg Trp GAG TGG GTG TTT Glu Trp Val Phe
AAA
Lys 180 GCC ATA GAT GAG Ala Ile Asp Glu GGG GTT ATT GNT Gly Val Ile Xaa GCG ATC ACT AGG Ala Ile Thr Arg
AGC
Ser 195 809 857 905 GAT ACG AGC GAT Asp Thr Ser Asp
TAT
Tyr 200 CAT GGC GGC CCT His Gly Gly Pro
ACA
Thr 205 AAG GGC ATT AAG Lys Gly Ile Lys GGG ATA Gly Ile 210 GCT TAT ACC Ala Tyr Thr CTT TTT GAG Leu Phe Glu 230
AAT
Asn 215 TTC GCG CTT CTT Phe Ala Leu Leu CTA ACC ATA TCA Leu Thr Ile Ser GGC GAA TTG Gly Glu Leu 225 GGG AAA AGG Gly Lys Arg 953 1001 AAC GGG TAT GAT Asn Gly Tyr Asp TGG GGT AGT GGA Trp Gly Ser Gly
GCT
Ala 240 CTC TCT Leu Ser 245 GTG GCG TAT AAC Val Ala Tyr Asn
AAA
Lys 250 GTT GCA ACA TGG Val Ala Thr Trp TTA AAC CCT GAA Leu Asn Pro Glu TTC CCT TAT TTC Phe Pro Tyr Phe
CAG
Gin 265 CCT AAC CTT ATC Pro Asn Leu Ile GTG CAT AAC AAC Val His Asn Asn 1049 1097 1145 TAT TTC ATT ATT Tyr Phe Ile Ile
TTA
Leu 280 GCC AAG CAT TAT Ala Lys His Tyr
TCT
Ser 285 AGC CCT AGT GCA Ser Pro Ser Ala AAT GAG Asn Glu 290 CTT TTA AAG Leu Leu Lys CGA TCG CCA Arg Ser Pro 310
CAA
Gln 295 GGC GAT TTA CAC Gly Asp Leu His
GAA
Glu 300 GAT GGT TTC AGG Asp Gly Phe Arg CTG AAA CTC Leu Lys Leu 305 1193 TGAATTTTTC TGTATCCAAG GTTAGCCTTA AGGATGGCCA TGCGCTTTA 1251
ACCTTTTGAT
AAAATAAACG
ATCGTTTTAG
GAATGGTTCA GAAAGTTTGT TTCAGTCAGC CAATTGTATC TCTTGAGTCG TCTTTAGAGT TTGTAAGCGT GCTTATTTAC ACTAAAATAA ATTATTTACA AAAAGAGTTT GCAAATGATT ATCAAAATGA TAAGCGTTAT T 1311 1371 1422 INFORMATION FOR SEQ ID NO:116: SEQUENCE CHARACTERISTICS: LENGTH: 329 amino acids WO 98/21225 PCT/US97/21353 -244- TYPE: amino.acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...19 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:116: Met Lys Arg Phe Val Leu Phe Leu Leu Phe Met Cys Val Cys Val Gln -10 Ala Tyr Ala Glu Gln Asp Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp 1 5 Leu Pro Gln Lys Leu His Leu Asp Lys-Lys Leu Ser Gin Thr Ile Gln 20 Pro Cys Met Gln Leu Asn Ala Ser Lys His Tyr Thr Ser Thr Gly Val 35 40 Arg Glu Pro Asp Lys Cys Thr Lys Ser Phe Lys Lys Ser Ala Leu Met 55 Ser Tyr Asp Leu Ala Leu Gly Tyr Leu Val Ser Lys Asn Lys Gin Tyr 70 Gly Leu Lys Ala Ile Glu Ile Leu Asn Ala Trp Ala Lys Glu Leu Gin 85 Ser Val Asp Thr Tyr Gln Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro 100 105 Tyr Met Asn Met Ala Tyr Trp Phe Val Lys Lys Ala Phe Pro Ser Pro 110 115 120 125 Glu Tyr Glu Asp Phe Ile Lys Arg Met Arg Gln Tyr Ser Gin Ser Ala 130 135 140 Leu Asn Thr Asn His Gly Ala Trp Gly Ile Leu Phe Asp Val Ser Ser 145 150 155 Ala Leu Ala Leu Asp Asp Asn Ala Leu Leu His Asn Ser Ala Asn Arg 160 165 170 Trp Gin Glu Trp Val Phe Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa 175 180 185 Ser Ala Ile Thr Arg Ser Asp Thr Ser Asp Tyr His Gly Gly Pro Thr 190 195 200 205 Lys Gly Ile Lys Gly Ile Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu 210 215 220 Thr Ile Ser Gly Glu Leu Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly 225 230 235 Ser Gly Ala Gly Lys Arg Leu Ser Val Ala Tyr Asn Lys Val Ala Thr 240 245 250 Trp Ile Leu Asn Pro Glu Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile 255 260 265 Gly Val His Asn Asn Ala Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser 270 275 280 285 Ser Pro Ser Ala Asn Glu Leu Leu Lys Gin Gly Asp Leu His Glu Asp WO 98/21225 PCT/US97/21353 
-245- 290 295 Gly Phe Arg Leu Lys Leu Arg Ser Pro 305 310 INFORMATION FOR SEQ ID NO:117: SEQUENCE CHARACTERISTICS: LENGTH: 1080 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 157...987 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 157...226 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:117: AGCGGTAAAA TCGCTGAAGA AAACAACGCT AAAGAATTTT AGAGCGCAAA AATTTTTAGA AACTTTCCAT TTTTTAGGGA AAAAAGATGA TTCTAATTTC AAAAAAAGGT GTTTTT ATG Met TTAACCACCC GAAATCTCAA GCTGTTAAAT AAAGTTTGCT AAA ACA AAC GGG CTT Lys Thr Asn Gly Leu 120 174 222 TTT AAA ATG TGG GGG CTG TTT Phe Lys Met Trp Gly Leu Phe GTT TTA ATC GCT TTA GTC TTT AAT Val Leu Ile Ala Leu Val Phe Asn GCA TGT Ala Cys 1 TCT GAT AGC CAT AAA GAA AAA AAG Ser Asp Ser His Lys Glu Lys Lys 5
GAC
Asp 10 GCT TTA GAA GTC Ala Leu Glu Val
ATT
Ile 270 AAA CAA AGA GGG Lys Gln Arg Gly
GTT
Val TTA AAA GTG GGG Leu Lys Val Gly TTT AGC GAT AAG Phe Ser Asp Lys CCT CCT Pro Pro TTT GGC TCT GTG GAT TCT AAA GGG Phe Gly Ser Val Asp Ser Lys Gly TAT CAA GGC TAT GAT GTA GTT Tyr Gln Gly Tyr Asp Val Val 366 ATT GCT AAA Ile Ala Lys CGC ATG GCT CTT GAT TTA TTG GGC GAT Arg Met Ala Leu Asp Leu Leu Gly Asp
GAA
Glu AAT AAG ATT Asn Lys Ile GAG TTT ATT Glu Phe Ile 65 CCT GTA GAA GCT Pro Val Glu Ala TCA GCT AGG GTG GAA Ser Ala Arg Val Glu TTT TTA AAA GCC Phe Leu Lys Ala
AAT
Asn AAA GTG GAT ATT Lys Val Asp Ile ATG GCT AAT TTC Met Ala Asn Phe
ACG
Thr 90 CGC ACT AAA GAA Arg Thr Lys Glu
AGA
Arg GAA AAA GTC GTG Glu Lys Val Val TTC GCT AAG CCG Phe Ala Lys Pro
TAT
Tyr 105 ATG AAA GTC GCT TTA GGG Met Lys Val Ala Leu Gly 110 GTG GTT TCT Val Val Ser AAA GAG TTG Lys Glu Leu 130
AAA
Lys 115 GAT GGG GTC ATT Asp Gly Val Ile AAT ATA GAA GAG Asn Ile Glu Glu TTG AAA GAT Leu Lys Asp 125 TAT TTC ACT Tyr Phe Thr ATT GTG AAT AAA Ile Val Asn Lys
GGC
Gly 135 ACG ACA GCG GAT Thr Thr Ala Asp
TTT
Phe 140 AAA AAT Lys Asn 145 TAC CCC AAT ATC Tyr Pro Asn Ile CTT TTG AAA TTT Leu Leu Lys Phe CAA AAT ACA GAG Gln Asn Thr Glu
ACT
Thr 160 TTT TTA GCC CTT Phe Leu Ala Leu
TTA
Leu 165 AAC AAT AAG GCT Asn Asn Lys Ala
ACC
Thr 170 GCT CTA GCC CAT Ala Leu Ala His AAC ACT TTA TTG Asn Thr Leu Leu GCT TGG ACG AAA Ala Trp Thr Lys
CA
Gln 185 CAC CCT GAA TTT His Pro Glu Phe AAA TTA Lys Leu 190 GGC ATT ACA Gly Ile Thr AAA GGC AAC Lys Gly Asn 210 CTT GGC GAT AAG Leu Gly Asp Lys GTG ATC GCT CCA Val Ile Ala Pro GCG ATT AAA Ala Ile Lys 205 ATA GAT TCC Ile Asp Ser CCC AAG CTT TTA Pro Lys Leu Leu
GAA
Glu 215 TGG TTG AAT AAC Trp Leu Asn Asn
A
Glu 220 CTC ATT Leu Ile 225 TCT AGC GAC TTC Ser Ser Asp Phe AAA GAA GCT TAT Lys Glu Ala Tyr
CAA
Gln 235 GAG ACT TTA GCA Glu Thr Leu Ala GTT TAT GGC GAT Val Tyr Gly Asp
GA
Glu 245 ATC AAA CCG GAA Ile Lys Pro Glu
GAA
Oiu 250 ATT ATT TTT GPA TGATT Ile Ile Phe Giu CTAP-ATTAGC PATTTTGTGA TCTTTAGGCT TTGPATTCTT GACAGGGTGC GTTTTTATTG TCTTTTTGTT TTTCATTTTG AGATATAT INFORMATION FOR SEQ ID NO:118: SEQUENCE CHARACTERISTICS: LENGTH: 277 amino acids 1052 1080 WO 98/21225 PCT/US97/21353 -247- TYPE: amino -acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...23 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:118: Met Lys Thr Asn Gly Leu Phe Lys Met Trp Gly Leu Phe Leu Val Leu -15 Ile Ala Leu Val Phe Asn Ala Cys Ser Asp Ser His Lys Glu Lys Lys 1 Asp Ala Leu Glu Val Ile Lys Gin Arg Gly Val Leu Lys Val Gly Val 15 20 Phe Ser Asp Lys Pro Pro Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr 35 Gin Gly Tyr Asp Val Val Ile Ala Lys Arg Met Ala Leu Asp Leu Leu 50 Gly Asp Glu Asn Lys Ile Glu Phe Ile Pro Val Glu Ala Ser Ala Arg 65 Val Glu Phe Leu Lys Ala Asn Lys Val Asp Ile Ile Met Ala Asn Phe 80 Thr Arg Thr Lys Glu Arg Glu Lys Val Val Asp Phe Ala Lys Pro Tyr 95 100 105 Met Lys Val Ala Leu Gly Val Val Ser Lys Asp Gly Val Ile Lys Asn 110 115 120 Ile Glu Glu Leu Lys Asp Lys Glu Leu Ile Val Asn Lys Gly Thr Thr 125 130 135 Ala Asp Phe Tyr Phe Thr Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys 140 145 150 Phe Glu Gin Asn Thr Glu Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala 155 160 165 Thr Ala Leu Ala His Asp Asn Thr Leu Leu Leu Ala Trp Thr Lys Gin 170 175 180 185 His Pro Glu Phe Lys Leu Gly Ile Thr Ser Leu Gly Asp Lys Asp Val 190 195 200 Ile Ala Pro Ala Ile Lys Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu 205 210 215 Asn Asn Glu Ile Asp Ser Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala 220 225 230 Tyr Gin Glu Thr Leu Ala Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu 235 240 245 Glu Ile Ile Phe Glu 250 INFORMATION FOR SEQ ID NO:119: WO 98/21225 PCT/US97/21353 -248- SEQUENCE CHARACTERISTICS: LENGTH: 1114 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence 
LOCATION: 37...1050 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:119: CGAGCTATCA CAACAAATCA ATTTGTAGGA ACAAGC ATG TTT TTT AAA Met Phe Phe Lys 1 ACT TAT Thr Tyr CAA AAA TTA Gin Lys Leu GGG AGT GAT Gly Ser Asp GGT GCG AGC TGT Gly Ala Ser Cys
TTG
Leu ACG TTG TAT TTA Thr Leu Tyr Leu GCG GGC TGT Ala Gly Cys AGT AGC GAG CCA Ser Ser Glu Pro
TTG
Leu 30 GTG GGA ATT GAA AAA AAT AGC TTC Val Gly Ile Glu Lys Asn Ser Phe AAT TCT Asn Ser ACC GTG AAA ATC Thr Val Lys Ile TCT AAA ACC GAC AAC ATA GAA ATC CAA Ser Lys Thr Asp Asn Ile Glu Ile Gin
GAC
Asp TTG AAG CTC AAT Leu Lys Leu Asn GGC AAT TGT GAG Gly Asn Cys Glu
CAT
His GAT CAA AAT TTC Asp Gin Asn Phe GTA AAG TTA ATC Val Lys Leu Ile GAA ACA GCC AAT Glu Thr Ala Asn TAC CTG TTT GCA Tyr Leu Phe Ala TCA GAA Ser Glu AAA GAA AAA Lys Glu Lys AAA GAT TTA Lys Asp Leu 105
GCG
Ala ATC AAA AAC CAC Ile Lys Asn His
CAA
Gin 95 GCA AAA ATC GCA Ala Lys Ile Ala AGA CTT CAA Arg Leu Gin 100 AAT AAT CTT Asn Asn Leu GAA GAA CTC ACA CAG CAT GTG CAA CAA Glu Glu Leu Thr Gin His Val Gin Gin 110 GAT AAA Asp Lys 120 TTG TTA GAA AAT Leu Leu Glu Asn GGA CTA TTC GTT AGT GGC CAT GAT TAT Gly Leu Phe Val Ser Gly His Asp Tyr AAA TAT ACA AAA GAT GAT AAC CCA ATA TAT GTT GTT AAG AGG ATG CTT Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr Val Val Lys Arg Met Leu WO 98/21225 PCT/US97/21353 -249- GAT AAC CTT GAT AGC TAT AAA TAT GAA Asp Asn Leu Asp Ser Tyr Lys Tyr Glu 155 GAC GAC GTG CTA Asp Asp Val Leu GAC GTG Asp Val 165 CCA TAT GAG Pro Tyr Glu AAC CCC AAA Asn Pro Lys 185 CTA TTG GAA Leu Leu Glu ATA AGC Ile Ser 175 ATT GCT ATT GAA Ile Ala Ile Glu GAC ACT AAA Asp Thr Lys 180 AAA AAA TTA Lys Lys Leu 582 630 GAC TAC CCT TAT Asp Tyr Pro Tyr AAC CTT AAA GAA Asn Leu Lys Glu
CTC
Leu 195 ATA GAT Ile Asp 200 AGT ATT ATT GAT Ser Ile Ile Asp CAT GGT TAT ATG His Gly Tyr Met GAT GGC TTT TTG Asp Gly Phe Leu
AAT
Asn 215 GAA TAT TCT AAT Glu Tyr Ser Asn GTA TCA AAA AAA Val Ser Lys Lys CTC CAA ATC CTT Leu Gin Ile Leu
GCT
Ala 230 AAA CTA AAA TCC Lys Leu Lys Ser TGG CCT AGC GTA Trp Pro Ser Val AAA TTT TAT TTC Lys Phe Tyr Phe GCC TCT Ala Ser 245 TTG AAA GAG Leu Lys Glu
GCT
Ala 250 ATC CCA AGG CAT Ile Pro Arg His AAA GAA GTT ACT Lys Glu Val Thr GAC AAG ATG Asp Lys Met 260 AAA CTC ACT Lys Leu Thr ATT AGC TCT GAA GAA AAA TCT Ile Ser Ser Glu Glu Lys Ser 265 AAA GCC AAT CAA Lys Ala Asn Gin
GTC
Val 275 GAA GCG Glu Ala 280 AAG CAA GAT ATT Lys Gin Asp Ile AAA ATG GAA AAA Lys Met Glu Lys
ATC
Ile 290 ATT AAA GAT TTA Ile Lys Asp Leu
GAA
Glu 295 AGC AAG AAA AAC Ser Lys Lys Asn
ACC
Thr 300 TTA TCA GTG TAT Leu Ser Val Tyr
TTA
Leu 305 AAA TTT GGA GAA Lys Phe Gly Glu
AGT
Ser 310 TTC ACA GCG CAT TAT AAG TGT CAA AAT Phe Thr Ala His Tyr Lys Cys Gin Asn 315 ATA GAA GTT GGA ,Ile Glu Val Gly GTC AAA Val Lys 325 1014 ACC GAT AAA Thr Asp Lys TCC TGG ACT Ser Trp Thr TTC AAC Phe Asn 335 TTT AAC AGA Phe Asn Arg TAAATCAGGC AAATAT 1066 GGACAATAGC ACAGACAGAG CAAAAATCCT TATAGAAGAG CTTAAAAT INFORMATION FOR SEQ ID NO:120: SEQUENCE CHARACTERISTICS: LENGTH: 338 amino acids 1114 WO 98/21225 WO 981225PCTIUS97121353 -250- TYPE: amino-acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID Met Phe Phe Lys Thr Tyr Gin Lys Leu Leu 1 Leu Ile Asp His Tyr Ly s Gin Val1 Val1 145 Asp Ala Lys Met Gly 225 Lys Giu Asn Lys Leu 305 Ile As n Tyr Giu As n Asp Leu Ile Gin Ser 130 Val1 Asp Ile Giu Al a 210 Leu Phe Val1 Gin Ile 290 Ly s Glu Arg Leu Lys Ile Gin Phe Ala Ser 115 Gly Lys Val1 Giu Leu 195 Asp Gin Tyr Thr Val1 275 Ilie Phe Val1 Al a Asn Giu Asn Ala Arg 100 Asn.
His Arg Leu Asp 180 Lys Gly Ile Phe Asp 260 Lys Lys Gly Gly 5 Gly Ser Ile Phe Ser Leu Asn.
Asp Met Asp 165 Thr Lys Phe Leu Ala 245 Lys Leu Asp Giu Val 325 Gly Asn Asp 55 Val1 Lys Lys Asp Lys 135 Asp Pro Asn Ile Asn 215 Lys Leu Ile Giu Glu 295 Phe Thr Gly Ser Lys Asn Gin 75 Ile Giu Glu Asp Ser 155 Leu Tyr Ile Asn Met 235 Ile Glu Asp Asn Tyr 315 Ser Ala Glu Ile Arg Giu Lys Leu Asn.
Asp 140 Tyr Leu Pro Asp Arg 220 Trp Pro Lys Ile Thr 300 Lys Trp, Ser Pro Ile Gly Thr Asn.
Thr Gly 125 Asn.
Lys Giu Tyr Asp 205 Val1 Pro Arg Ser Asp 285 Leu Cys Thr Leu Val1 Lys Cys As n Gin His Leu Ile Glu Ser 175 Asn Gly Lys Val1 Ala 255 Lys Met Val Asn.
Asn 335 Thr Gly Thr Giu Thr Al a Val1 Phe Tyr Ser 160 Ile Leu Tyr Lys Gly 240 Lys Ala Glu Tyr Leu 320 Phe INFORMATION FOR SEQ ID NO:12i: WO 98/21225 PCT/US97/21353 -251- SEQUENCE CHARACTERISTICS: LENGTH: 1101 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 40...1026 OTHER INFORMATION: NAME/KEY: sig_peptide LOCATION: 40...99 OTHER INFORMATION: NAME/KEY: mat peptide LOCATION: 100...1026 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:121: GGTTATACCG AAAAAACAAT ATGAAATCAA GGAGTTTGT ATG CAA CAG CGT CAT Met Gln Gln Arg His
TTA
Leu GGC CCT TTA AAA Gly Pro Leu Lys GGT GCA TTA GCT CTA GGG TGC ATG GGC Gly Ala Leu Ala Leu Gly Cys Met Gly ACT TAT GGG Thr Tyr Gly ATC CAT AAG Ile His Lys GGG GAA GTC CAT Gly Glu Val His AAA AAG CAG ATG GTT AAA CTT Lys Lys Gln Met Val Lys Leu GCT TTG GAA TTG Ala Leu Glu Leu ATT AAC TTT TTT Ile Asn Phe Phe ACT GCA GAG Thr Ala Glu GCT TAT Ala Tyr GGG GAA GAT AAT Gly Glu Asp Asn
GAA
Glu AAG CTT TTA GCG AAG CGA TCA AGC CTT Lys Leu Leu Ala Lys Arg Ser Ser Leu
ATT
Ile AAA GAC AAG GTT Lys Asp Lys Val GTG GTA GCG AGC AAG TTT GGG ATT TAC Val Val Ala Ser Lys Phe Gly Ile Tyr 55 60 TAC GCA ACC ATG TTT TTA GAC TCC AGT Tyr Ala Thr Met Phe Leu Asp Ser Ser TAC GCA Tyr Ala TCT AAC Ser Asn GAT CCT AAT GAC Asp Pro Asn Asp CGC ATT AAG AGT GCC ATT GAA GGG AGT TTG AAA CGC TTA AAA GTA GAA Arg Ile Lys Ser Ala Ile Glu Gly Ser 90 Leu Lys Arg Leu Lys Val Glu TGC ATT GAT Cys Ile Asp 100 ACG CCC ATA Thr Pro Ile TTA TAC TAC CAA CAC CGC ATG GAT ACT Leu Tyr Tyr Gln His Arg Met Asp Thr
AAC
Asn 110 GAA GAA Glu Glu 115 GTG GCA GAA GTT Val Ala Glu Val
ATG
Met 120 CAA GCT CTT ATT Gln Ala Leu Ile
AAA
Lys 125 GAA GGA AAA ATT Glu Gly Lys Ile
AAA
Lys 130 GCT TGG GGG ATG Ala Trp Gly Met GAG GCA GGG TTA TCT AGC ATC CAA AAA GCC Glu Ala Gly Leu Ser Ser Ile Gln Lys Ala CAT CAA ATT TGC His Gln Ile Cys
CCT
Pro 150 TTA AGC GCG TTG Leu Ser Ala Leu
CAG
Gln 155 AGC GAA TAT TCC Ser Glu Tyr Ser TTG TGG Leu Trp 160 TGG CGC GAA Trp Arg Glu ATT GGC TTT Ile Gly Phe 180 GAA AAA GAG ATT Glu Lys Glu Ile
TTA
Leu 170 GGT TTT TTA GAA Gly Phe Leu Glu AAA GAA AAA Lys Glu Lys 175 TTA GGC GCG Leu Gly Ala GTC GCT TTT TCG Val Ala Phe Ser
CCT
Pro 185 TTG GGT AAG GGG Leu Gly Lys Gly AAA TTT Lys Phe 195 GAA AAA AAT GCT Glu Lys Asn Ala
ACC
Thr 200 TTC GCT AGT GAA Phe Ala Ser Glu TTT AGA AGC GTT Phe Arg Ser Val 726 774 822
TCT
Ser 210 CCT AGG TTT AAT Pro Arg Phe Asn GAA AAT CTA GCC Giu Asn Leu Ala
AAA
Lys 220 AAT TAC GTC TTG Asn Tyr Val Leu
GTG
Val 225 GAA TTA ATC CA Glu Leu Ile Gln
GAT
Asp 230 CAT GCA CAC GCT His Ala His Ala
AAA
Lys 235 GGC GTT ACA CCA Gly Val Thr Pro GCC CA Ala Gln 240 CTG GCT CTC Leu Ala Leu TTT GGC ACC Phe Gly Thr 260 CAG GTT TCT Gln Val Ser 275 TGG ATT TTG CAC Trp Ile Leu His
ACG
Thr 250 CAA AAA ATC ATT Gln Lys Ile Ile ACC AAA GAA TCC Thr Lys Glu Ser
AGG
Arg 265
GAA
Glu CTC ATA GAA AAT Leu Ile Glu Asn GTC CCT CTC Val Pro Leu 255 GGG GCT TTG Gly Ala Leu AAA GAA TTG Lys Glu Leu 870 918 966 TGG AGT CA Trp Ser Gln
AAA
Lys 280 TTG GAG ATT Leu Glu Ile
ACT
Thr 290 GCA ATC AAA ATA Ala Ile Lys Ile GGG GCC CGC TAC Gly Ala Arg Tyr GAA AGA ATC AAT Glu Arg Ile Asn
GA
Giu 305 1014 WO 98/21225 PCT/US97/21353 -253- ATG GTG AAT CAA TAAAAGTATT GGGTATTTAT AATTGCATTG GCTCTTTTAA AAGAG 1071 Met Val Asn Gin ATTGAGCGTT ATTTCCTGTT TGTCAGTGTG 1101 INFORMATION FOR SEQ ID NO:122: SEQUENCE CHARACTERISTICS: LENGTH: 329 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:122: Met Gin Gin Arg His Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu -15 -10 Gly Cys Met Gly Met Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys 1 5 Gin Met Val Lys Leu Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe 20 Phe Asp Thr Ala Glu Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala 35 Lys Arg Ser Ser Leu Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe 50 55 Gly Ile Tyr Tyr Ala Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu 70 Asp Ser Ser Ser Asn Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys 85 Arg Leu Lys Val Glu Cys Ile Asp Leu Tyr Tyr Gin His Arg Met Asp 100 105 Thr Asn Thr Pro Ile Glu Glu Val Ala Glu Val Met Gin Ala Leu Ile 110 115 120 Lys Glu Gly Lys Ile Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Ser 125 130 135 140 Ser Ile Gin Lys Ala His Gin Ile Cys Pro Leu Ser Ala Leu Gin Ser 145 150 155 Glu Tyr Ser Leu Trp Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe 160 165 170 Leu Glu Lys Glu Lys Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys 175 180 185 Gly Phe Leu Gly Ala Lys Phe Glu Lys Asn Ala Thr Phe Ala Ser Glu 190 195 200 Asp Phe Arg Ser Val Ser Pro Arg Phe Asn Gin Glu Asn Leu Ala Lys 205 210 215 220 Asn Tyr Val Leu Val Glu Leu Ile Gin Asp His Ala His Ala Lys Gly 225 230 235 Val Thr Pro Ala Gln Leu Ala Leu Ser Trp Ile Leu His Thr Gin Lys 240 245 250 Ile Ile Val Pro Leu Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile Glu WO 98/21225 PCT/US97/21353 -254- 255 260 265 Asn Ile Gly Ala Leu Gin Val Ser Trp Ser Gin Lys Glu Leu Glu Ile 270 275 280 Phe Gin Lys Glu Leu Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro 285 290 295 300 Glu Arg Ile Asn Glu Met Val Asn Gin 305 INFORMATION FOR SEQ ID NO:123: SEQUENCE 
CHARACTERISTICS: LENGTH: 955 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 126...806 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 126...237 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:123: GTCAGCCTTT AAAGGTTTCA TTATAGCAAA GAATATTATT TTTTTATTCC TTGCGTTTTC TGTGCGTTTG TGGGGCAAAT AAGATATAAT CGCCTTTTTA AAATTCATTT TTTAAAGGGG 120 TTTGA ATG GTA TTT GAC AGA ACA ATC AGC GTA AGA GAA AAA AAA GCG GCT 170 Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys Ala Ala -30 AAA ACG CTT GGG ATT ATT GGG ATC GTC TTT TTT ATT TTG TTT GGC ATC 218 Lys Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile -15 GTG ATA AGC GGG GTG GCT TTT CAA AAA GAG TGG GTG CAA CAA TTG GAT 266 Val Ile Ser Gly Val Ala Phe Gin Lys Glu Trp Val Gin Gin Leu Asp -1 5 TTA TTT TTT ATA GAC TTG ATC CGC AAC CCT GCC CCC ATT CAA AAA AGC 314 Leu Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gin Lys Ser 20 GCG TGG CTT TCT TTC GTG TTT TTT AGC ACT TGG TTT GCA CAA AGC AAG 362 Ala Trp Leu Ser Phe Val Phe Phe Ser Thr Trp Phe Ala Gin Ser Lys 35 WO 98/21225 WO 98/21225PCT/US97/2 1353 -255- CTC ACC ACT CCT- ATA GCC TTA Leu Thr Thr Pro Ile Ala Leu
CTC
Leu 50 ATT GGC TTG Ile Gly Leu TGG TTT GGG TTT CAA Trp Phe Gly Phe Gln AAA CGC Lys Arg ATC GCT TTG GGG Ile Ala Leu Gly
GTG
Val 65 TGG TTT TTC TTT Trp Phe Phe Phe
AGC
Ser ATC TTA TTA GGT Ile Leu Leu Gly TTC ACC TTA AAA Phe Thr Leu Lys
TCC
Ser CTT AAG CTT TTA Leu Lys Leu Leu GCG CGC CCA CGG Ala Arg Pro Arg GTA ACC AAT GGC Val Thr Asn Gly
GAA
Glu TTG GTT TTC GCG Leu Val Phe Ala GGC TTT AGT TTC Gly Phe Ser Phe CCT AGC Pro Ser 105 506 554 602 GGG CAT GCT Gly His Ala TTA TGC TAT Leu Cys Tyr 125 GCT TCA GCG CTT TTT TAC GGC TCT TTG Ala Ser Ala Leu Phe Tyr Gly Ser Leu 115 GCG TTG TTG Ala Leu Leu 120 ATT GCT GTG Ile Ala Val TCT AAC GCC AAC Ser Asn Ala Asn CGC ATT AAA ACG Arg Ile Lys Thr GTT TTG Val Leu 140 CTT TTT TGG ATT Leu Phe Trp Ile
TTT
Phe 145 TTA ATG GCG TAT Leu Met Ala Tyr
GAT
Asp 150 AGG GTT TAT TTA Arg Val Tyr Leu
GGG
Gly 155 GTG CAT TAC CCT Val His Tyr Pro
AGC
Ser 160 GAT GTT TTA GGA Asp Val Leu Gly TTT TTA TTA GGG Phe Leu Leu Gly
ATT
Ile 170 GCT TGG TCG TGC Ala Trp Ser Cys TCT TTA GCG CTT Ser Leu Ala Leu TTA GGG TTT TTG Leu Gly Phe Leu AAA CGC Lys Arg 185 CCT TAT AAT Pro Tyr Asn TAAAGGCTTT ATTTAACCAA ACACTGACAA CTAAAATTTT TAAAA 851 911 955 TTCTATTTTT TGATAAAACT CATTCTCTTA AGGGGATAGG GGGTATTTTG CGATAATACC CCCTTAACCC CCTTAAGAAA CCCCCTAACC CCCAAGACCG CTTT INFORMATION FOR SEQ ID NO:124: Wi SEQUENCE CHARACTERISTICS: LENGTH: 227 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: WO 98/21225 PCT/US97/21353 -256- NAME/KEY: Signal Sequence LOCATION: 1...37 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:124: Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Thr Leu Ile Ser Phe Phe Trp Leu Thr Thr Arg Ile Phe Thr Thr Asn His Ala Cys Tyr 125 Leu Leu 140 Val His Trp Ser Tyr Asn Gly Gly Ile Ser Pro Ala Leu Gly Leu 110 Ser Phe Tyr Cys Gin 190 Ile Val Asp Phe Ile Leu Lys Glu Ala Asn Trp Pro Cys 175 Ile Gin Arg Phe Leu Trp Lys Phe Leu Asn 130 Leu Val Ala Phe Trp 5 Ala Trp Leu Phe Val Gly Gly Lys Tyr Gly 165 Leu Lys Phe Gin Gin Gin Gly Leu Pro Phe Ala 120 Ile Val Leu Leu Ala Gly Leu Lys Ser Phe Leu Arg Pro 105 Leu Ala Tyr Gly Lys 185 Ala Ile Asp Ser Lys Gin Gly Pro Ser Leu Val Leu Ile 170 Arg Lys Val Leu Ala Leu Lys Glu Val Gly Leu Val Gly 155 Ala Pro INFORMATION FOR SEQ ID NO:125: SEQUENCE CHARACTERISTICS: LENGTH: 1183 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 91...1032 OTHER INFORMATION: WO 98/21225 PCT/US97/21353 -257- NAME/KEY: Signal Sequence LOCATION: 91...148 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:125: CTTAAAAGAA ACTTCGCAAA CCTTTTTATA TTATTTTAAA AGCACTAATA TTTATTATAT TAGTTACAAC TATTTATTGT AAAGGCTAAA ATG TTG AAA TTT AAA TAT GGT TTG Met Leu Lys Phe Lys Tyr Gly Leu 114 ATT TAT Ile Tyr ATC GCG CTC ATA Ile Ala Leu Ile GGA CTT CAA GCG ACA GAT TAT GAC AAT Gly Leu Gin Ala Thr Asp Tyr Asp Asn TTA GAA GAA GAA Leu Glu Glu Glu CAA CAA TTA 
GAT Gln Gln Leu Asp
GAA
Glu AAA ATA AAC CAT Lys Ile Asn His TTA AAG Leu Lys 210 CAA CAG CTC Gln Gln Leu AAG TTT GAA Lys Phe Glu
ACC
Thr GAA AAA GGG GTT Glu Lys Gly Val CCC AAA GAG ATG Pro Lys Glu Met GAT AAG GAT Asp Lys Asp ATT TCT TCC Ile Ser Ser 258 306 GAA GAA TAC ATC Glu Glu Tyr Ile
AAT
Asn CGA TCT TAT CCT Arg Ser Tyr Pro AAG AAA Lys Lys AAA GAG AAA TTG Lys Glu Lys Leu AAA TCT TTT TCC Lys Ser Phe Ser GCC GAT GAT AAG Ala Asp Asp Lys
AGT
Ser GGG GTT TTT TTA Gly Val Phe Leu
GGG
Gly 75 GGG TAT GCT Gly Tyr Ala
TAT
Tyr 80 GGG GAA CTT AAC Gly Glu Leu Asn
TTG
Leu TCT TAT CAA GGG GAA ATG TTA GAC AGA Ser Tyr Gln Gly Glu Met Leu Asp Arg GGC GCG AAT GCC Gly Ala Asn Ala CCT AGC Pro Ser 100 450 GCG TTT AAA Ala Phe Lys GCT AAA TTT Ala Lys Phe 120
AAC
Asn 105 AAT ATC AAT ATT Asn Ile Asn Ile AAC GCT CCT GTT TCT ATG ATT AGC Asn Ala Pro Val Ser Met Ile Ser 110 115 TTT GTG TCT TAT TTT GGG ACA CGA Phe Val Ser Tyr Phe Gly Thr Arg 130 GGG TAT CAA AAA Gly Tyr Gln Lys
TAC
Tyr 125 TTT TAT Phe Tyr 135 GGG GAT TTA TTG CTT GGG GGT GGG GCA TTA AAA GAG GAT GCA Gly Asp Leu Leu Leu Gly Gly Gly Ala Leu Lys Glu Asp Ala ATC AAG CAG CCT GTA GGC TCG TTT ATT TAT GTT TTA GGG GCT GTC AAT Ile Lys Gln Pro Val Gly Ser Phe Ile Tyr Val Leu Gly Ala Val Asn 160 ACC GAT TTA TTG Thr Asp Leu Leu GAT ATG CCT TTA GAT TTT AAA ACT AAA Asp Met Pro Leu Asp Phe Lys Thr Lys 175 AAG CAT Lys His 180 TTT TTA GGC Phe Leu Gly GAC AGG CCT Asp Arg Pro 200
GTT
Val 185 TAT GCG GGT TTT Tyr Ala Gly Phe ATA GGG CTT ATG Ile Gly Leu Met CTC TAT CAA Leu Tyr Gln 195 GGC TAT TCA Gly Tyr Ser AAT CAA AAC GGG Asn Gln Asn Gly
AGG
Arg 205 AAT TTA GTA GTG Asn Leu Val Val AGC CCT Ser Pro 215 AAT TTT TTA TGG Asn Phe Leu Trp TCT TTG ATT GAA Ser Leu Ile Glu
GTG
Val 225 GAT TAC ACT TTT Asp Tyr Thr Phe GTG GGC GTG AGT Val Gly Val Ser
TTA
Leu 235 ACG CTT TAT AGG Thr Leu Tyr Arg
AAA
Lys 240 CAC CGT TTA GAG His Arg Leu Glu
ATT
Ile 245 834 882 930 GGC ACA AAA TTG Gly Thr Lys Leu
CCG
Pro 250 ATT AGC TAT TTG Ile Ser Tyr Leu ATG GGA GTG GAA Met Gly Val Glu GAG GGA Glu Gly 260 GCG ATT TAT Ala Ile Tyr AAC AAC CAG Asn Asn Gin 280
CAA
Gln 265 AAT AAA GAA GAT Asn Lys Glu Asp
GAT
Asp 270 GAG CGT TTG TTG Glu Arg Leu Leu GTT TCG GCT Val Ser Ala 275 TTC AAG CGA TCC Phe Lys Arg Ser
AGT
Ser 285 TTT TTA TTA GTG AAT TAT GCG TTT Phe Leu Leu Val Asn Tyr Ala Phe 290 1026 ATT TTT Ile Phe 295 TAAGGCTTGA TCTTGGAGTT AAGGTTTAAA ATTTTAGCGT TAGTCGTTTT AA 1084 1144 1183 TTTTAGGGGG TTATTTGATT TTTAACGCTT TAATCACAAA ACCCAGAGCT TTAAGTTTTA GTTTAAATAG CAAAGAGGGT GCGCTTAATG ACAATGATG INFORMATION FOR SEQ ID NO:126: SEQUENCE CHARACTERISTICS: LENGTH: 314 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...19 WO 98/21225 WO 981225PCTIUS97/21353 -259- OTHER INFORMATION: Met Leu Asp Ser Arg Ser Tyr Arg As n 110 Phe Gly Ile Leu Gly 190 Asn Leu Tyr Leu Asp 270 Phe (xi) SEQUENCE Leu Lys Phe Lys Gin Ala Thr Asp 1 Glu Lys Ile Asn Pro Lys Glu Met Ser Tyr Pro Lys Phe Ser Ile Ala Ala Tyr Gly Giu Tyr Gly Ala Asn Ala Pro Val Ser Val Ser Tyr Phe 130 Gly Ala Leu Lys 145 Tyr Vai Leu Gly 160 Asp Phe Lys Thr 175 Ile Gly Leu Met Leu Val Val Gly 210 Ile Giu Val Asp 225 Arg Lys His Arg 240 Arg Met Gly Val 255 Giu Arg Leu Leu Leu Leu Val Asn 290 DESCRIPTION: SEQ ID NO:i26: Tyr Gly Leu Ile Tyr Tyr His Asp 35 Ile Asp Leu Ala Met 115 Gly Glu Al a Lys Leu 195 Gly Tyr Leu Glu Val 275 Tyr Asp Leu Lys Ser Asp Asn Pro 100 Ile Thr Asp Val Lys 180 Tyr Tyr Thr Giu Giu 260 Ser Al a Leu Gin Lys Lys Ser 70 Ser Ala Al a Phe Ile 150 Thr Phe Asp Ser As n 230 Gly Al a Asn Ile -10 Giu Gin Phe Lys 55 Gly Tyr Phe Lys Tyr 135 Lys Asp Leu Arg Pro 215 Val1 Thr Ile Asn Phe 295 Ile Giu Leu Glu 40 Lys Val Gin Lys Phe 120 Gly Gin Leu Giy Pro 200 As n Gly Lys Tyr Gin 280 Leu Asn Giu Giu Lys Leu Giu Asn Tyr Leu Val1 Phe 170 Tyr Gin Leu Ser Pro 250 As n Lys Ile Gin Lys Tyr Leu Giy Met Ile Gin Leu Gly 155 Asp Al a Asn Trp Leu 235 Ile Lys Arg Leu Gly Gin Leu Gly Vai Ile Asn Leu Lys Gly Giy Leu Asp Asn Ile Lys Tyr 125 Leu Giy 140 Ser Phe Met Pro Gly Phe Giy Arg 205 Lys Ser 220 Thr Leu Ser Tyr Giu Asp Ser Ser 285 INFORMATION FOR SEQ ID NO:127: SEQUENCE CHARACTERISTICS: LENGTH: 1851 base pairs TYPE: nucleic 
acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 238...1665 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 238...313 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
GAGCTAGTTT
ATTCAGATGG
GTTACCCTAA
AAAAGCTTAT
TAAAAAGTTA GTTTTGTTTT CTAAGGCACA CAAGAAATTA AATCCTATTG CATAGGTCTA GTTTTCATTA AAAATGTTAT
AAAAAGTTAA
GGGGACTCTG
AATAAGAGCT
GATACGCTCA
TACTATTTTG
CTGTATTCCT
TAGGGATCAT
AATAGTCAAG
AAGCACTCCT
ACCCTGAAGC
TTTAGCCATA
CAAAAAA ATG Met 120 180 240 TCA ATT AAA AGG Ser Ile Lys Arg AGA TTG AAA ATA Arg Leu Lys Ile
TTC
Phe GTT CTG TTG ATG TCG GTA Val Leu Leu Met Ser Val ATT TTA GGA Ile Leu Gly
ATA
Ile TCA TTA ACA GGT TGC ATA GGC Ser Leu Thr Gly Cys Ile Gly TAT CGT Tyr Arg CCT AAA Pro Lys ATG GAC TTA Met Asp Leu AAA GCT TAT Lys Ala Tyr GAA CAT Glu His TTT AAC ACG CTC Phe Asn Thr Leu TAT GAA GAA AGC Tyr Glu Glu Ser
GAA
Glu TAT TCC AAA CAA Tyr Ser Lys Gln
TTC
Phe ACT AAG AAA AAA Thr Lys Lys Lys AAC GCT CTT TTA Asn Ala Leu Leu GAC TTG CAA AAC GGC TTG AGC GCT TTA Asp Leu Gln Asn Gly Leu Ser Ala Leu
TAC
Tyr 50 GCC AGA GAT TAC CAG ACT Ala Arg Asp Tyr Gln Thr TCT TTA GGG Ser Leu Gly AGC GCT TTT Ser Ala Phe
GTA
Val TTA GAT CAA GCC Leu Asp Gln Ala
GAG
Glu 65 CAA CGC TTT GAT Gln Arg Phe Asp AAA ACG CAA Lys Thr Gln ACA AGA GGG GCT Thr Arg Gly Ala
GGT
Gly 80 TAT GTG GGC GCT ACC ATG ATT AAT Tyr Val Gly Ala Thr Met Ile Asn 576 GAT AAT Asp Asn GTG CGC GCT TAT Val Arg Ala Tyr GGG AAT ATT TAT Gly Asn Ile Tyr GAG GGC Glu Gly 100 GTT TTA ATC Val Leu Ile
AAT
Asn 105 TAT TAC AAA GCG Tyr Tyr Lys Ala
ATA
Ile 110 GAC TAC ATG CTT TTA AAC GAT AGC GCG Asp Tyr Met Leu Leu Asn Asp Ser Ala 115 GCT AGG GTG CAA Ala Arg Val Gln AAC CGT GCG AAC GAA CGC CAG CGC AGG Asn Arg Ala Asn Glu Arg Gln Arg Arg GCT AAA Ala Lys 135 GAA TTT TAT TAT GAG GAA GTG CAA AAA GCC ATT AAA GAG Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu ATC GAT TCT Ile Asp Ser 150 GAA GTG AGC Glu Val Ser AGC AAA AAG CAC AAT ATT AAT Ser Lys Lys His Asn Ile Asn GAA CGC TCT AGG Glu Arg Ser Arg GAG ATT Glu Ile 170 TTA AAC AAC ACC Leu Asn Asn Thr
TAT
Tyr 175 TCT AAT TTA GAC Ser Asn Leu Asp TAC GAA GCT TAT Tyr Glu Ala Tyr
CAG
Gln 185 GGC TTA CTT AAC Gly Leu Leu Asn GCG GTT TCG TAT CTC TCA GGG TTG TTT Ala Val Ser Tyr Leu Ser Gly Leu Phe 195
TAC
Tyr 200 912 GCT TTA AAT GGG Ala Leu Asn Gly GAG AAT AAG GGA Glu Asn Lys Gly
TTA
Leu 210 GGC TAT CTT AAT Gly Tyr Leu Asn GAA GCC Glu Ala 215 TAT GGG ATC Tyr Gly Ile CAA AGC CCT TTT Gin Ser Pro Phe
GTA
Val 225 GCC CAA GAC TTG Ala Gln Asp Leu AAA AAC CCT AAC AGG AGC CAT Lys Asn Pro Asn Arg Ser His ACT TGG ATC ATC Thr Trp Ile Ile GTT TTT TTC Val Phe Phe 230 GAA GAT GGT Glu Asp Gly ATT TTT ATG Ile Phe Met 1008 AAA GAG Lys Glu 250 CAA AAA AGC Gln Lys Ser
GAA
Glu 255 AAA ATT GAT Lys Ile Asp 1056 1104 1152 1200
ATC
Ile 265 GAT TCG GTT TAT Asp Ser Val Tyr
AAC
Asn 270 GTG AGT ATA GCC Val Ser Ile Ala CCC AAG CTA GAA Pro Lys Leu Glu
AAA
Lys 280 GGG GAA GCG TTT Gly Glu Ala Phe CAA AAT TTC ACT Gln Asn Phe Thr
CTC
Leu 290 AAA GAT GGA GAA Lys Asp Gly Glu AAA GTA Lys Val 295 ACG CCC TTT Thr Pro Phe TTC AGG AAG Phe Arg Lys 315 ACT TTA GCC TCA Thr Leu Ala Ser
ATA
Ile 305 GAT GCG GTG GTC Asp Ala Val Val GCT AGC GAA Ala Ser Glu 310 TTA TCG GCC Leu Ser Ala 1248 1296 CAG TTG CCC TAC Gln Leu Pro Tyr ATC ACT AGG GCT Ile Thr Arg Ala ACT TTT Thr Phe 330 AAA GTG GGC ATG CAA GCG GTG GCG AAC Lys Val Gly Met Gln Ala Val Ala Asn 335 TAT TAT Tyr Tyr 340 TTG GGG TTT Leu Gly Phe 1344
GTT
Val 345 GGA GGG TTA GTA Gly Gly Leu Val
ACT
Thr 350 TCC TTG TAT TCA Ser Leu Tyr Ser GTG AGC ACC TTT Val Ser Thr Phe
GCA
Ala 360 1392 1440 GAC ACT AGA AGC Asp Thr Arg Ser
ACG
Thr 365 AGC ATT TTT GCC Ser Ile Phe Ala
CAT
His 370 AAA ATC TAC CTC Lys Ile Tyr Leu ATG CGC Met Arg 375 ATT AAA AAC Ile Lys Asn GAC GCT TTT Asp Ala Phe 395 GCC TTT GAA AGT Ala Phe Glu Ser GAA GTT CGA GCC Glu Val Arg Ala GAT TCC ATT Asp Ser Ile 390 CTT GAA AGC Leu Glu Ser 1488 1536 TCG TTT TCA TTA Ser Phe Ser Leu CCT TGT AAA AGA Pro Cys Lys Arg
TCG
Ser 405 CCT AAA Pro Lys 410 ATC ATT GAC GCT Ile Ile Asp Ala
AGG
Arg 415 GAA TTG CTT TCT Glu Leu Leu Ser TTT GTA GCA GCC Phe Val Ala Ala CAA ATC TTT TGC Gln Ile Phe Cys
TCT
Ser 430 AAC CGC CAT AAT Asn Arg His Asn
ATT
Ile 435 TTA TAC GTG CGC Leu Tyr Val Arg
AGT
Ser 440 1584 1632 1685 1745 1805 1851 TTT AAA AAC GGG Phe Lys Asn Gly
TTT
Phe 445 GTT TTG AGT CGT Val Leu Ser Arg TTA AAA Leu Lys 450 TGATTTCAAA ACCCCCACCA AACACGATAT AATTATAAAA GTTTAGCGGG CATTGATCAA
GTTTCA
AAGGAATTTT
CGATACGAAA
GTTACGAGTT
AGTTTTTAAG TGTCGTTGGC ATTAAACGCA ACCTAAATTA AGGGGAAGTC ATGGCTGATA TGCATAAAAA TAACGAGTTA CAATTGTTGT INFORMATION FOR SEQ ID NO:128: SEQUENCE CHARACTERISTICS: LENGTH: 476 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...25 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:128: WO 98/21225 WO 9821225PCTIUS97/21353 -263- Met Val1 Leu Tyr Trp, Thr Gin As n Ile Lys 120 Lys Ser Ser Tyr Tyr 200 Aia Phe Giy Met Lys 280 Val1 Glu Aia Phe Al a 360 Arg Ile Ile Leu His Tyr Leu Leu Aia Asn Tyr Arg Phe Lys Ile 170 Giy Leu Gly As n Giu 250 Asp Giu Pro Arg Phe 330 Gly Thr Lys Ala Lys Gly Phe Ser Gin Gly Phe Val Tyr Val1 Tyr Lys 155 Leu Leu As n Ile Pro 235 Pro Ser Ala Phe Lys 315 Lys Gly Arg Asn Phe Arg Vai Arg Leu Lys Ile Phe Leu Leu Phe Leu Asp Giy Tyr Ile 110 Asn Giu Ile Thr Pro 190 Giu Se r Ser Ser Asn 270 Gin Leu Pro Met Thr 350 Ser Phe Ser Thr Tyr 15 Thr Ser Gin Al a Gly 95 Asp Arg Val1 Asn Tyr 175 Al a Asn Pro His Glu 255 Vai Asn Ala Tyr Gin 335 Ser Ile Glu Leu Giy Tyr Lys Ala Ala Gly 80 G ly Tyr Al a Gin Met 160 Ser Val1 Lys Phe Phe 240 Phe Ser Phe Ser Ile 320 Ala Leu Phe Ser Lys Cys Giu Lys Leu Glu Tyr Asn Met Asn Ly s 145 Glu Asn Ser Gly Val1 225 Thr Lys Ile Thr Ile 305 Ile Vai Tyr Al a Tyr 385 Pro -15 Ile Giu Lys Tyr 50 Gin Val1 Ile Leu Clu 130 Al a Arg Leu Tyr Leu 210 Al a Trp Ile Al a Leu 290 Asp Thr Al a Ser His 370 Giu Cys Val Giy Ser Lys Ala Arg Gly Tyr Leu 115 Arg Ile Ser Asp Leu 195 Gly Gin Ile Asp Leu 275 Lys Al a Arg Asn Gly 355 Lys Vali Lys Leu Tyr Pro Asn Arg Phe Al a Giu 100 Asn Gin Lys Arg Lys 180 Ser Tyr Asp Ile Vali 260 Pro Asp Vali Al a Tyr 340 Val Ile Arg Arg Leu Arg Lys Ala Asp Asp Thr Giy Asp Arg Giu Val 165 Tyr Gly Leu Leu Ile 245 Pro Lys Gly Val1 Ile 325 Tyr Ser Tyr Ala Ser Met Met Lys Leu Tyr Lys Met Val1 Ser Arg Ile 150 Glu Glu Leu As n Val1 230 Glu Ile Leu Glu Al a 310 Leu Leu Thr Leu Asp 390 Leu Ser Asp Al a Leu Gin Thr 
Ile Leu Ala Ala 135 Asp Val Ala Phe Glu 215 Phe Asp Phe Glu Lys 295 Ser Ser Gly Phe Met 375 Ser Glu 395 400 405 Ser Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala 410 415 420 Ala Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg 425 430 435 Ser Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys 440 445 450
INFORMATION FOR SEQ ID NO:129: SEQUENCE CHARACTERISTICS: LENGTH: 435 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 1...432 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:129: ATG TTA GAA AAA TTG ATT GAA AGA GTG Met Leu Glu Lys Leu Ile Glu Arg Val TTT GCC ACT CGT Phe Ala Thr Arg TGG TTG Trp Leu CTA GCC CCT Leu Ala Pro TAT GTG TTC Tyr Val Phe
TTA
Leu TGT ATT GCC ATG Cys Ile Ala Met
TCG
Ser 25 TTA GTG CTG GTG Leu Val Leu Val GTT TTA GGC Val Leu Gly TTA AAC ACG Leu Asn Thr 96 144 ATG AAA GAG TTG Met Lys Glu Leu
TGG
Trp CAC ATG CTC AGC His Met Leu Ser ATC AGC Ile Ser GAA ACG GAT TTG Glu Thr Asp Leu TTA TCA GCC TTA Leu Ser Ala Leu TTA GTG GAT TTG Leu Val Asp Leu
TTG
Leu TTT ATG GCC GGG Phe Met Ala Gly GTT TTA ATG GTG Val Leu Met Val
TTA
Leu CTC GCC AGT TAT Leu Ala Ser Tyr AGC TTT GTT TCT Ser Phe Val Ser TTA GAC AAG GTG Leu Asp Lys Val GCC AGT GAA ATC Ala Ser Glu Ile ACT TGG Thr Trp CTA AAG CAC Leu Lys His
ACG
Thr 100 GAT TTT AAC GCT Asp Phe Asn Ala
TTA
Leu 105 AAA TTA AAG GTT Lys Leu Lys Val TCA CTC TCC Ser Leu Ser 110 ATT GTA GCG ATT TCA GCG ATT TTC TTG CTC AAA CGC TAC ATG AGT TTA Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu WO 98/21225 PCT/US97/21353 -265- 115 120 125 GAA AGA TGT TTT ATC CCA GCA TTC CCT AAG GAT ACG CCC CCT ATC GCA T 433 Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala 130 135 140 AA 435 INFORMATION FOR SEQ ID NO:130: SEQUENCE CHARACTERISTICS: LENGTH: 144 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:130: Met 1 Leu Tyr Ile Leu Ser Leu Ile Glu Leu Ala Val Ser Phe Phe Lys Val Arg 130 Glu Pro Phe Glu Met Val His Ala 115 Cys Lys Leu Met Thr Ala Ser Thr 100 Ile Phe Leu 5 Cys Lys Asp Gly Lys Asp Ser Ile Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu Met Trp 40 Leu Leu Lys Ala Phe 120 Phe Ser His Ser Met Val Leu 105 Leu Pro Val Leu Leu Leu 75 Ala Leu Lys Asp Val His Leu Ala Glu Val Tyr 125 Pro Leu Gly Asn Thr Asp Leu Tyr Glu Thr Trp Leu Ser Ser Leu Ile Ala INFORMATION FOR SEQ ID NO:131: SEQUENCE CHARACTERISTICS: LENGTH: 2234 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 213...2081 OTHER INFORMATION: WO 98/21225 WO 82225PCTJUS97/21353 -2 66- NAME/KEY: Signal Sequence LOCATION: 213... .273 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:131: ATCATAAAAT GTAAAAATAC TCAAAGCATC GCATCAAGCA GCTCACAATT GAGCTAAAGC CCGCTTTTTA GGGATAAATA TGGGTAACTT TATGGGGCGA AGCGTTTCTA AATTTTGGTA
ATATAGCGAT
AAAAGCGTTT
TAATCGCTAG
CTGAAAAGAG
TCAAATTGCA
AAATTGTGAG
AAAGATTCTA TCTTGTTTGA GTGGGGTTTC GC ATG CGT TTA TTA TTG TGG TGG Met Arg Leu Leu Leu Trp Trp CCT TTG AGA GCG GTT GAA GAG Pro Leu Arg Ala Val Glu Glu 120 180 233 281 329 GTA TTG GTA TTA TCG CTC TTT TTA Val Leu Val Leu Ser Leu Phe Leu
AAT
Asn CAT GAA ACA GAT GCG GTG GAT TTG TTT TTG ATT TTC AAT CAA ATC AAC His Glu Thr Asp Ala Val Asp Leu Phe Leu Ile Phe Asn Gln Ile Asn
CAG
Gln CTC AAT CAA GTC Leu Asn Gln Val
ATT
Ile 25 GAA ACT TAC AAA Glu Thr Tyr Lys
AAA
Lys AAC CCT GAA AGA Asn Pro Glu Arg GCT GAA ATC TCT CTG TAT AAC ACC CAA AAG AAT GAC TTG ATT AAA AGT Ala Glu Ile Ser Leu Tyr Asn Thr Gln Lys Asn Asp Leu Ile Lys Ser TTG ACT TCT AAA GTG TTG AAT GAA Leu Thr Ser Lys Val Leu Asn Glu
AGG
Arg 60 GAT AAG ATC GGG ATT GAT ATC Asp Lys Ile Gly Ile Asp Ile AAT CAA AAT Asn Gln Asn TTA AAA GAG CAG Leu Lys Glu Gln
A
Glu 75 AAA ATC AAA AAG Lys Ile Lys Lys
CGT
Arg TTG TCT AAA Leu Ser Lys AGC ATT Ser Ile AAT GGC GAT GAT Asn Gly Asp Asp TAC ACT TTC ATG AAA GAC AGA TTG TCT Tyr Thr Phe Met Lys Asp Arg Leu Ser
TTA
Leu 100 GAT ATT TTG TTG Asp Ile Leu Leu
ATA
Ile 105 GAT GAA ATT TTG Asp Glu Ile Leu TAT CGT TTT ATA GAT AAA Tyr Arg Phe Ile Asp Lys 110 115 CAA AAA GAT GTA GAA AGC Gln Lys Asp Val Glu Ser 130 ATC AGG AGC AGT Ile Arg Ser Ser
ATT
Ile 120 GAT ATT TTT AGC Asp Ile Phe Ser ATC AGC GAT GCT TTC CTT TTG CGT TTA GGG CAA TTC AAA CTC TAC ACT Ile Ser Asp TTC CCT AAA Phe Pro Lys 150 Ala 135 Phe Leu Leu Arg Gly Gln Phe Lys Leu Tyr Thr 145 GAG CAG ATG Glu Gln Met AAT TTA GGC AAT Asn Leu Gly Asn
GTC
Val 155 AAA ATG CAT GAA Lys Met His Glu TTT AGC Phe Ser 165 GAT TAT GAA TTG Asp Tyr Glu Leu TTG AAC ACT TAC Leu Asn Thr Tyr
ACC
Thr 175 GAA GTC TTG CGT Glu Val Leu Arg
TAC
Tyr 180 ATT AAA AAC CAC Ile Lys Asn His AAA GAA GTG CTT Lys Glu Val Leu
CCT
Pro 190 AAA AAC TTG ATC Lys Asn Leu Ile 809 857 905 GAA GTG AAT ATG Glu Val Asn Met
GAT
Asp 200 TTT GTG TTA AAC Phe Val Leu Asn
AAA
Lys 205 ATC AGC AAG GTT Ile Ser Lys Val TTG CCT Leu Pro 210 TTC ACA ACC Phe Thr Thr ATT TTA GCC Ile Leu Ala 230 CAT AGC TTG CAA GTG AGT His Ser Leu Gln Val Ser 215 220 AAA ATC GTG CTA Lys Ile Val Leu GCT TTG ACG Ala Leu Thr 225 TGG CTT TTA Trp Leu Leu 953 1001 TTA TTG CTG GGT Leu Leu Leu Gly
TTA
Leu 235 AGG AAG TTG ATC Arg Lys Leu Ile GCC TTA Ala Leu 245 TTG TTA GAT CGT Leu Leu Asp Arg TTT GAA ATC ATG Phe Glu Ile Met CAG CGC AAT AAA Gln Arg Asn Lys 255 CCG GTT TCT GTC Pro Val Ser Val
AAA
Lys
TTT
Phe 275
ATG
Met 260 CAT GTC AAT GTG His Val Asn Val AAG AGC ATT GTT Lys Ser Ile Val 1049 1097 1145 TTA GCC CTA TTT Leu Ala Leu Phe
AGT
Ser 280 TGC GAT GTG GCT Cys Asp Val Ala GAT ATT TTC TAC Asp Ile Phe Tyr TAC CCT Tyr Pro 290 AAC GCA TCG Asn Ala Ser ATG CTT TTA Met Leu Leu 310
CCC
Pro 295 CCT AAA GTT TCT Pro Lys Val Ser
ATG
Met 300 TGG GTG GGC GCG Trp Val Gly Ala GTG TAT ATC Val Tyr Ile 305 TAT GGG GAA Tyr Gly Glu 1193 GCA TGG TTA GTG ATA GCG CTT TTT AAA Ala Trp Leu Val Ile Ala Leu Phe Lys GCG TTA Ala Leu 325 GTT ACG AAT ATG Val Thr Asn Met
GCT
Ala 330 ACC AAA AGC ACG Thr Lys Ser Thr AAT TTT AGA AAA Asn Phe Arg Lys 1241 1289 1337 GAA Glu 340 GTG ATC AAC TTG Val Ile Asn Leu TTA AAA GTC GTG Leu Lys Val Val
TAT
Tyr 350 TTT TTG ATC TTT Phe Leu Ile Phe
ATT
Ile 355 GTC GCG CTT TTA Val Ala Leu Leu
GGG
Gly 360 GTT TTG AAA CAA Val Leu Lys Gln
CTA
Leu 365 GGG TTT AAC GTT Gly Phe Asn Val TCA GCC Ser Ala 370 1385 ATC ATC GCT Ile Ile Ala AAA GAT GTG Lys Asp Val 390
TCT
Ser 375 TTA GGG ATT GGG Leu Gly Ile Gly
GGG
Gly 380 TTA GCG GTG GCT Leu Ala Val Ala TTG GCG GTT Leu Ala Val 385 TTA TTA GAC Leu Leu Asp 1433 1481 TTA GCG AAT TTT Leu Ala Asn Phe
TTT
Phe 395 GCT TCG GTC ATT Ala Ser Val Ile
TTA
Leu 400 AAT TCG Asn Ser 405 TTT TCT CAA GGG GAT TGG ATC GTG TGC Phe Ser Gln Gly Asp Trp Ile Val Cys 410
GGT
Gly 415 GAA GTG GAG GGO Glu Val Glu Gly 1529 GTG GTG GAA ATG Val Val Glu Met TTA AGG CGC ACC Leu Arg Arg Thr
ACG
Thr 430 ATC AGA GCC TTT Ile Arg Ala Phe
GAC
Asp 435 1577 1625 1673 AAC GCT CTT TTG Asn Ala Leu Leu
TCC
Ser 440
CGT
Arg GTG CCT AAT TCA Val Pro Asn Ser
GAA
Glu 445
AGG
Arg TTA GOC GGA AAA Leu Ala Gly Lys CCC ATO Pro Ile 450 GAA ATA Glu Ile AGG PAT TGG Arg Asn Trp GGC TTA ACT Gly Leu Thr 470
AGO
Ser 455 CGT AAA GTG Arg Lys Val
GGA
Gly 460 CGT ATT AAA Arg Ile Lys
ATG
Met 465 TAT AGC TCC AGT Tyr Ser Ser Ser
CAA
Gln 475 AGC GCT TTA CAG Ser Ala Leu Gln
CTT
Leu 480 TGC GTG AAA Cys Val Lys 1721 GAC ATT Asp Ile 485 AAA GAA ATG TTA Lys Glu Met Leu
GAA
Glu 490 AAC CAC CCT AAA Asn His Pro Lys GCT AAC GGA GCC Ala Asn Gly Ala 1769 1817 GAT Asp 500 AGC GCT TTG CAA Ser Ala Leu Gln GTG AGC GAT TAC Val Ser Asp Tyr TAC ATG TTT AAA Tyr Met Phe Lys
AAA
Lys 515 GAT ATT GTT TCT Asp Ile Val Ser
ATT
Ile 520 GAT GAT TTT TTA GGG TAT AAA AAC AAT Asp Asp Phe Leu Gly Tyr Lys Asn Asn 525 TTG TTT Leu Phe 530 1865 GTC TTT TTA Val Phe Leu TGC TTT TCT Cys Phe Ser 550
GAT
Asp 535 CAG TTT GCG GAC AGC TCT ATT AAT ATT Gln Phe Ala Asp Ser Ser Ile Asn Ile 540 TTA GTG TAT Leu Val Tyr 545 GTO AAA GAA Val Lys Glu 1913 AAG ACA GTG GTT Lys Thr Val Val
TGG
Trp 555 GAA GAG TGG CTA Glu Glu Trp Leu
GAA
Glu 560 1961 GAT GTG Asp Val 565 ATG OTA PAA ATC Met Leu Lys Ile GGG ATT GTA GAA Gly Ile Val Glu CAC CAT TTG AGT His His Leu Ser 2009 WO 98/21225 PCT/US97/21353 -269- TTT GCT TTC CCA-TCA CAG AGT TTG TAT GTG GAG AGT TTG CCA GAA GTT 2057 Phe Ala Phe Pro Ser Gin Ser Leu Tyr Val Glu Ser Leu Pro Glu Val 580 585 590 595 AGC CTG AAA GAA GGG GCT AAA ATC TGAAATTATT GGTAGATGTA TTCTTTGGTT 2111 Ser Leu Lys Glu Gly Ala Lys Ile 600 AAGGGGAAAG TGTTATCCAC GCTGTTGGTT AAAAGCAATT GGAATAAATC CGCGCTCCCC 2171 ACCCTAAAGG CGGATGCGCA AGTCCTTAAA TACAGATCCC ACATGCGGAT AAAGCGTTCG 2231 TCA 2234 INFORMATION FOR SEQ ID NO:132: SEQUENCE CHARACTERISTICS: LENGTH: 623 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...20 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:132: Met Arg Leu Leu Leu Trp Trp Val Leu Val Leu Ser Leu Phe Leu Asn -15 -10 Pro Leu Arg Ala Val Glu Glu His Glu Thr Asp Ala Val Asp Leu Phe 1 5 Leu Ile Phe Asn Gin Ile Asn Gin Leu Asn Gin Val Ile Glu Thr Tyr 20 Lys Lys Asn Pro Glu Arg Ser Ala Glu Ile Ser Leu Tyr Asn Thr Gin 35 Lys Asn Asp Leu Ile Lys Ser Leu Thr Ser Lys Val Leu Asn Glu Arg 50 55 Asp Lys Ile Gly Ile Asp Ile Asn Gin Asn Leu Lys Glu Gin Glu Lys 70 Ile Lys Lys Arg Leu Ser Lys Ser Ile Asn Gly Asp Asp Phe Tyr Thr 85 Phe Met Lys Asp Arg Leu Ser Leu Asp Ile Leu Leu Ile Asp Glu Ile 100 105 Leu Tyr Arg Phe Ile Asp Lys Ile Arg Ser Ser Ile Asp Ile Phe Ser 110 115 120 Glu Gln Lys Asp Val Glu Ser Ile Ser Asp Ala Phe Leu Leu Arg Leu 125 130 135 140 Gly Gin Phe Lys Leu Tyr Thr Phe Pro Lys Asn Leu Gly Asn Val Lys 145 150 155 Met His Glu Leu Glu Gin Met Phe Ser Asp Tyr Glu Leu Arg Leu Asn WO 98/21225 PCT/US97/21353 -270- 160 165 170 Thr Tyr Thr Glu Val Leu Arg Tyr Ile Lys Asn His Pro Lys Glu Val 175 180 185 Leu Pro Lys Asn Leu Ile Met Glu Val Asn Met Asp Phe Val Leu Asn 190 195 200 S Lys Ile Ser Lys Val Leu Pro Phe Thr Thr His Ser Leu Gin Val Ser 205 210 215 
220 Lys Ile Val Leu Ala Leu Thr Ile Leu Ala Leu Leu Leu Gly Leu Arg 225 230 235 Lys Leu Ile Thr Trp Leu Leu Ala Leu Leu Leu Asp Arg Ile Phe Glu 240 245 250 Ile Met Gin Arg Asn Lys Lys Met His Val Asn Val Gin Lys Ser Ile 255 260 265 Val Ser Pro Val Ser Val Phe Leu Ala Leu Phe Ser Cys Asp Val Ala 270 275 -280 Leu Asp Ile Phe Tyr Tyr Pro Asn Ala Ser Pro Pro Lys Val Ser Met 285 290 295 300 Trp Val Gly Ala Val Tyr Ile Met Leu Leu Ala Trp Leu Val Ile Ala 305 310 315 Leu Phe Lys Gly Tyr Gly Glu Ala Leu Val Thr Asn Met Ala Thr Lys 320 325 330 Ser Thr His Asn Phe Arg Lys Glu Val Ile Asn Leu Ile Leu Lys Val 335 340 345 Val Tyr Phe Leu Ile Phe Ile Val Ala Leu Leu Gly Val Leu Lys Gin 350 355 360 Leu Gly Phe Asn Val Ser Ala Ile Ile Ala Ser Leu Gly Ile Gly Gly 365 370 375 380 Leu Ala Val Ala Leu Ala Val Lys Asp Val Leu Ala Asn Phe Phe Ala 385 390 395 Ser Val Ile Leu Leu Leu Asp Asn Ser Phe Ser Gin Gly Asp Trp Ile 400 405 410 Val Cys Gly Glu Val Glu Gly Thr Val Val Glu Met Gly Leu Arg Arg 415 420 425 Thr Thr Ile Arg Ala Phe Asp Asn Ala Leu Leu Ser Val Pro Asn Ser 430 435 440 Glu.Leu Ala Gly Lys Pro Ile Arg Asn Trp Ser Arg Arg Lys Val Gly 445 450 455 460 Arg Arg Ile Lys Met Glu Ile Gly Leu Thr Tyr Ser Ser Ser Gin Ser 465 470 475 Ala Leu Gin Leu Cys Val Lys Asp Ile Lys Glu Met Leu Glu Asn His 480 485 490 Pro Lys Ile Ala Asn Gly Ala Asp Ser Ala Leu Gin Asn Val Ser Asp 495 500 505 Tyr Arg Tyr Met Phe Lys Lys Asp Ile ValSer Ile Asp Asp Phe Leu 510 515 520 Gly Tyr Lys Asn Asn Leu Phe Val Phe Leu Asp Gin Phe Ala Asp Ser 525 530 535 540 Ser Ile Asn Ile Leu Val Tyr Cys Phe Ser Lys Thr Val Val Trp Glu 545 550 555 Glu Trp Leu Glu Val Lys Glu Asp Val Met Leu Lys Ile Met Gly Ile 560 565 570 Val Glu Lys His His Leu Ser Phe Ala Phe Pro Ser Gin Ser Leu Tyr 575 580 585 Val Glu Ser Leu Pro Glu Val Ser Leu Lys Glu Gly Ala Lys Ile 590 595 600 WO 98/21225 PCT/US97/21353 -271- INFORMATION FOR SEQ ID NO:133: SEQUENCE CHARACTERISTICS: LENGTH: 432 base pairs TYPE: nucleic acid STRANDEDNESS: single 
TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 1...429 OTHER INFORMATION: NAME/KEY: sig_peptide LOCATION: 1...93 OTHER INFORMATION: NAME/KEY: mat_peptide LOCATION: 94...429 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:133: ATG AAA Met Lys -31 AAA TTT TTT TCT Lys Phe Phe Ser TCT TTA TTA GCT Ser Leu Leu Ala ATT GTG TCT ATG Ile Val Ser Met
AAC
Asn GCG CTA CTG GCC Ala Leu Leu Ala GAT GGC AAT GGC Asp Gly Asn Gly TTT TTA GGG GCG Phe Leu Gly Ala
GGT
Gly 1 TAT TTG CAA Tyr Leu Gln CAA GCC ACT Gln Ala Thr CAA GCC CAA ATG Gln Ala Gln Met
CAT
His GCG GAT ATT AAT Ala Asp Ile Asn TCT CAA AAA Ser Gln Lys AAC GCT ACT ATC Asn Ala Thr Ile GGC TTT GAT GCG CTT TTA GGG TAT Gly Phe Asp Ala Leu Leu Gly Tyr CAA TTT TTC TTT GGG AAA Gln Phe Phe Phe Gly Lys TTT GGC TTG CGT Phe Gly Leu Arg TAT GGG TTT TTT Tyr Gly Phe Phe
GAC
Asp TAC GCT CAT GCC AAT TCT ATT AGG CTT Tyr Ala His Ala Asn Ser Ile Arg Leu AAC CCT AAC TAT Asn Pro Asn Tyr AGC GAA GTG GCG CAA TTG GCG GGT CAA ATT CTT GGG AAA CAA GAA ATC Ser Glu Val Ala Gin Leu Ala Gly Gin Ile Leu Gly Lys Gin Glu Ile WO 98/21225 PCT/US97/21353 -272- AAT CGC TTA ACG AGC CTT GCT GAT CCT Asn Arg Leu Thr Ser Leu Ala Asp Pro 90 CTC ACT TAT GGG GGG GCT-ATG GAT TTA Leu Thr Tyr Gly Gly Ala Met Asp Leu 100 105 75 AAA ACC TTT GAG CCA AAC ATG Lys Thr Phe Glu Pro Asn Met ATG GTT AAT GTT CAT CAA TAA Met Val Asn Val His Gin 110 384 432 INFORMATION FOR SEQ ID NO:134: SEQUENCE CHARACTERISTICS: LENGTH: 143 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:134: Met -31 Asn Tyr Gin Gin Asp Ser Asn Leu Lys Leu Gin Thr Phe Ala Val Leu Tyr 100 Ser Gin Ser Leu Leu Ala Leu Ile Val Ser Met Met -10 Ala Thr Lys Asn 55 Leu Leu Ala Asp Gin Ile Tyr Ser Ala Ala Met Gly Ala Phe Leu Leu Ile Lys Met Val -5 Asp Asp Arg Lys 60 Leu Thr Val Leu Asn Leu Tyr Pro Lys Glu Val Ala Gin Gly Phe Tyr Glu Asn Gin Gly 1 Lys Tyr Phe Asn Ile Met INFORMATION FOR SEQ ID NO:135: SEQUENCE CHARACTERISTICS: LENGTH: 336 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence WO 98/21225 PCT/US97/21353 -273- LOCATION: 1...333 OTHER INFORMATION: NAME/KEY: sig_peptide LOCATION: 1...60 OTHER INFORMATION: -NAME/KEY: mat_peptide LOCATION: 61...333 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:135: AAA ACC TTT AAA AAC CTG CTC TGT TTT Lys Thr Phe Lys Asn Leu Leu Cys Phe CTG ATC GCT ATG Leu Ile Ala Met TGG CTC CAA GCG GAC ATG TTG GAT AAT TTC ACT AGG GCC Trp Leu Gin Ala Asp Met Leu Asp Asn Phe Thr Arg Ala ATT AAC AGC Ile Asn Ser AAT AGC GCT Asn Ser Ala TAC ACC ACT AAA AAG CTT AAT Tyr Thr Thr Lys Lys Leu Asn
GAA
Glu 20 ATC AAG GAT CAA Ile Lys Asp Gln
GTC
Val AAC CCT Asn Pro ACT AAA AAT CAC AAT ACC ACT TAT AAC GCT AAT GGC ATG CTC Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu 192 AAC ATT GAT TGT Asn Ile Asp Cys
AAA
Lys GTC TTA AAA AAT Val Leu Lys Asn
AAC
Asn 55 TTC TAT TCG GTG Phe Tyr Ser Val
TGT
Cys TAT TCT AGC GAG Tyr Ser Ser Glu
TTA
Leu AAA AAC CCT ATT Lys Asn Pro Ile
TAT
Tyr 70 GGC GTG AGC GTG Gly Val Ser Val TTG TTT Leu Phe GGG GAT TTA Gly Asp Leu
GTG
Val GAT AAA AAT AAT Asp Lys Asn Asn GAA AAA CGC TAT Glu Lys Arg Tyr GAG TTT TAA Glu Phe INFORMATION FOR SEQ ID NO:136: SEQUENCE CHARACTERISTICS: LENGTH: 111 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:136: Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser -15 -10 Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser 1 5 Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala 20 Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu 35 Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys 50 55 Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe 70 Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe 85 INFORMATION FOR SEQ ID NO:137: SEQUENCE CHARACTERISTICS: LENGTH: 2185 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 81...2069 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 81...144 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:137: GTAAAAAATG GCTTATCTGT TCTAGCCTAC TCCCCTTATT TTTTCTTAAT CCCTTAGCGG CAGAAGATGA TGGGTTTTTT ATG GGG GTG AGT TAT CAA ACT TCT CTA GCT 110 Met Gly Val Ser Tyr Gln Thr Ser Leu Ala ATT CAA AGG GTG GAT AAC TCA GGG CTT AAC GCC AGT CAA GCC GCA TCC 158 Ile Gln Arg Val Asp Asn Ser Gly Leu Asn Ala Ser Gln Ala Ala Ser -5 1 ACC TAC ATC CGC CAG AAC GCT ATC GCT CTA GAA TCT GCG GCG GTG CCT 206 Thr Tyr Ile Arg Gln Asn Ala Ile Ala Leu Glu Ser Ala Ala Val Pro 15 TTA GCC TAT TAT TTA GAA GCG ATG GGC CAA CAA ACC AGG Gln Gln Thr Arg Leu Ala Tyr Tyr Leu Glu Ala Met Gly 30 GTT TTA ATG Val Leu Met TAT GCT GGA Tyr Ala Gly CAA ATG CTC Gln Met Leu TGC CCT GAT CCT Cys Pro Asp Pro AAA CGC TGT TTG Lys Arg Cys Leu GGT TAT Gly Tyr CCC CCA Pro Pro AAA AAC GGA TCA Lys Asn Gly Ser AAT ACT AAC GGC Asn Thr Asn Gly ACA GGC AAC AAC Thr Gly
Asn Asn AGA GGC AAT Arg Gly Asn
GTC
Val GCC ACC TTT Ala Thr Phe
GAT
Asp 80 CAA TCT CTA Gln Ser Leu
GTC
Val AAT AAT TTA AAC Asn Asn Leu Asn CTC ACC CAA CTC Leu Thr Gln Leu GGC GAG ACT TTA Gly Glu Thr Leu ATC CGT Ile Arg 100 AAC CCT GAA Asn Pro Glu AAT CAA AGC Asn Gln Ser 120
AAT
Asn 105 CTT TCT AAC GCC Leu Ser Asn Ala
AAA
Lys 110 GTC TTT AAT GTC Val Phe Asn Val AAA TTT GGC Lys Phe Gly 115 ACT GTT ATT GCA Thr Val Ile Ala
TTG
Leu 125 CCT GAG GGT CTA GCC AAT ACC ATG Pro Glu Gly Leu Ala Asn Thr Met 130 AAC GCT Asn Ala 135 TTA AAC GAT GAT Leu Asn Asp Asp ACC AAC GCT TTA Thr Asn Ala Leu ACG CTC TGG TAT Thr Leu Trp Tyr
AAC
Asn 150 CAA ACC TTA ACG Gln Thr Leu Thr AAA TCT TTT AAT Lys Ser Phe Asn
AGC
Ser 160 GGT AAT TCC GTG Gly Asn Ser Val
AAT
Asn 165 TTT AGC CCC CAA Phe Ser Pro Gln TTG CAA CAC CTT Leu Gln His Leu
TTA
Leu 175 CAA GAC GGC TTA Gln Asp Gly Leu GCC ACA Ala Thr 180 AGT AAT CAA Ser Asn Gln GAA GCT AAA Glu Ala Lys 200 ATT TGC AGC ACT Ile Cys Ser Thr AAC CAA TGC ACC Asn Gln Cys Thr GCC ACC AAT Ala Thr Asn 195 CAG GCT TTA Gln Ala Leu TCT ATC GCT CAA Ser Ile Ala Gln GCC CAA AAC ATC Ala Gln Asn Ile ATG CAA Met Gln 215 GCA GGG ATT TTA Ala Gly Ile Leu 000 Gly 220 GGC TTA GCC AAT Gly Leu Ala Asn AAG CAA TTT GGC Lys Gln Phe Gly ACT TAC AAC AAA Thr Tyr Asn Lys
GCC
Ala 235 CCT AAT GGT AGC Pro Asn Gly Ser TCC CAA CAA GGC Ser Gln Gln Gly
TAC
Tyr 245 CAA AGC TTT AGC Gln Ser Phe Ser CCG GGT TAT TAC Pro Gly Tyr Tyr AAA AAC GGC GCT Lys Asn Gly Ala AAT GGC Asn Gly 260 ACT ACC CAA GCG CCC TTG AAA Thr Thr Gln Ala Pro Leu Lys 265 TCA GGC AAT GGC CAA TAC ACC Ser Gly Asn Gly Gln Tyr Thr 280 GCA TTA Ala Leu 270 CCC GCT GGA GCG Pro Ala Gly Ala ACA ATT GGA Thr Ile Gly 275 GTC TAT TAT Val Tyr Tyr 974 1022 CAC CCC AGC TCG His Pro Ser Ser
GCA
Ala 290 TTA GCC Leu Ala 295 GAT AGC ATC ATT Asp Ser Ile Ile
GCT
Ala 300 AAT GGC ATC ACC Asn Gly Ile Thr TCT ATG ATT TTT Ser Met Ile Phe
TCA
Ser 310 GGC ATG CAA AAT Gly Met Gln Asn
TTC
Phe 315 GCC AAT AAA GCC Ala Asn Lys Ala
GCT
Ala 320 AAA CTG ACA GGC Lys Leu Thr Gly
ACT
Thr 325 1070 1118 1166 TCA AGC TAT AGC Ser Ser Tyr Ser ATG CAA GAT GCG Met Gln Asp Ala AAT TAC GGG GAA Asn Tyr Gly Glu AGC TTG Ser Leu 340 CTC AGT AAC Leu Ser Asn CCC TAT TTG Pro Tyr Leu 360
ACC
Thr 345 GTA GCG TAT GGG Val Ala Tyr Gly
GAT
Asp 350 TTC ATC ACC AAT Phe Ile Thr Asn TGG GTC GCC Trp Val Ala 355 CCT AGC TAT Pro Ser Tyr 1214 1262 GAT TTA AAC AAC Asp Leu Asn Asn
AAA
Lys 365 GGT TTG AAT TTC Gly Leu Asn Phe GGG GGG Gly Gly 375 CAA TTG AAT GGT GCT AAT CAT CAA ACC Gln Leu Asn Gly Ala Asn His Gln Thr 380
CCA
Pro 385 CAA TTA ACC CCG Gln Leu Thr Pro 1310 CAA GCC CAA CAA Gln Ala Gln Gln CAA AAA GTC ATC Gln Lys Val Ile
ATG
Met 400 AAC CAA CTA GAG Asn Gln Leu Glu
CAA
Gln 405 1358 1406 GCC ACA AAC GCC Ala Thr Asn Ala
CCC
Pro 410 ACC CCC GCG CAA Thr Pro Ala Gln
ATA
Ile 415 AAC AGG ATT TTA Asn Arg Ile Leu GCC AAC Ala Asn 420 CCC TAT TCC Pro Tyr Ser TCT AAA GCA Ser Lys Ala 440 ACG GCA AAA ACT Thr Ala Lys Thr
TTA
Leu 430 ATG GCT TAT GGG Met Ala Tyr Gly CTT TAT CGC Leu Tyr Arg 435 ACT AAA GTG Thr Lys Val 1454 1502 GTG ATT GGC GGG Val Ile Gly Gly
GTG
Val 445 ATT GAT GAA ATG Ile Asp Glu Met AAT CAA Asn Gln 455 GTC TAT CAA ATG Val Tyr Gln Met
GGC
Gly 460 TTT GCT AGG AAT Phe Ala Arg Asn TTG GAG CAT AAC Leu Glu His Asn 1550
TCT
Ser 470 AAT TCT AAT AAC Asn Ser Asn Asn AAC GGC TTT GGC Asn Gly Phe Gly
GTG
Val 480 AAA ATG GGC TAT Lys Met Gly Tyr
AAG
Lys 485 CAA TTC TTT GGC Gln Phe Phe Gly
AAA
Lys 490 AAG CGC ATG TTT Lys Arg Met Phe CTT AGG TAT TAT Leu Arg Tyr Tyr GGT TTT Gly Phe 500 1598 1646 1694 TAT GAT TTT Tyr Asp Phe GCC ACC CTC Ala Thr Leu 520 TAC GCT CAA TTT GGC GCA GAA TCT TCT Tyr Ala Gln Phe Gly Ala Glu Ser Ser 510 TTA GTG AAA Leu Val Lys 515 TAT AAT GTT Tyr Asn Val TCT AGC TAT GGG Ser Ser Tyr Gly
GCA
Ala 525 GGC ACA GAC TTT Gly Thr Asp Phe
CTT
Leu 530 1742 TTT ACC Phe Thr 535 CGA AAA AGA GGG Arg Lys Arg Gly
ACT
Thr 540 GAA GCG ATA GAT Glu Ala Ile Asp
ATC
Ile 545 GGT TTT TTT GCC Gly Phe Phe Ala
GGT
Gly 550 ATC CAA CTT GCA Ile Gln Leu Ala CAA ACT TGG AAA Gln Thr Trp Lys
ACG
Thr 560 AAT TTT TTA GAT Asn Phe Leu Asp
CAA
Gln 565 1790 1838 1886 GTG GAT GGC AAC Val Asp Gly Asn CTT AAA CCC AAA Leu Lys Pro Lys ACT TCT TTC CAA Thr Ser Phe Gln TTC CTT Phe Leu 580 TTT GAT TTA Phe Asp Leu AGA TCC CGT Arg Ser Arg 600 ATA AGG ACC AAT Ile Arg Thr Asn TCC AAA ATC GCT Ser Lys Ile Ala CAT CAA AAA His Gln Lys 595 ATA CCG GTG Ile Pro Val 1934 1982 TTT TCT CAA GGG Phe Ser Gln Gly
ATA
Ile 605 GAA TTT GGC CTT Glu Phe Gly Leu
AAA
Lys 610 CTT TAT Leu Tyr 615 CAC ACC TAT TAC His Thr Tyr Tyr
CAA
Gln 620 TCA GAA GGC GTT Ser Glu Gly Val
ACA
Thr 625 GCG AAG TAT AGA Ala Lys Tyr Arg 2030
AGA
Arg 630 GCC TTT AGT TTT Ala Phe Ser Phe
TAT
Tyr 635 GTG GGC TAC AAC Val Gly Tyr Asn ATA GGC TTT Ile Gly Phe 640 TGATTAAACA AA 2081 ATAAGGGAAA AATATGATAA AAAAAGCTAG AAAATTCATA CCATTCTTTT TAATTGGCTC CCTCTTAGCT GAAGACAATG GCTGGTATAT GTCTGTAGGC TATC 2141 2185 INFORMATION FOR SEQ ID NO:138: SEQUENCE CHARACTERISTICS: LENGTH: 663 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear WO 98/21225 PCT/US97/21353 -278- (ii) MOLECULE'TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1..-.21 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:138: Met Gly Val Ser Tyr Gin Thr Ser Leu Ala Ile Gin Arg Val Asp Asn -15 Ser Gly Leu Asn Ala Ser Gin Ala Ala Ser Thr Tyr Ile Arg Gin Asn 1 5 Ala Ile Ala Leu Glu Ser Ala Ala Val Pro Leu Ala Tyr Tyr Leu Glu 20 Ala Met Gly Gin Gin Thr Arg Val Leu Met Gin Met Leu Cys Pro Asp 35 Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly Gly Tyr Lys Asn Gly Ser 50 Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn Pro Pro Arg Gly Asn Val 65 70 Asn Ala Thr Phe Asp Met Gin Ser Leu Val Asn Asn Leu Asn Lys Leu 85 Thr Gin Leu Ile Gly Glu Thr Leu Ile Arg Asn Pro Glu Asn Leu Ser 100 105 Asn Ala Lys Val Phe Asn Val Lys Phe Gly Asn Gin Ser Thr Val Ile 110 115 120 Ala Leu Pro Glu Gly Leu Ala Asn Thr Met Asn Ala Leu Asn Asp Asp 125 130 135 Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr Asn Gin Thr Leu Thr Asn 140 145 150 155 Lys Ser Phe Asn Ser Gly Asn Ser Val Asn Phe Ser Pro Gin Val Leu 160 165 170 Gin His Leu Leu Gin Asp Gly Leu Ala Thr Ser Asn Gin Thr Ile Cys 175 180 185 Ser Thr Gin Asn Gin Cys Thr Ala Thr Asn Glu Ala Lys Ser Ile Ala 190 195 200 Gin Asn Ala Gin Asn Ile Phe Gin Ala Leu Met Gin Ala Gly Ile Leu 205 210 215 Gly Gly Leu Ala Asn Glu Lys Gin Phe Gly Phe Thr Tyr Asn Lys Ala 220 225 230 235 Pro Asn Gly Ser Asp Ser Gin Gin Gly Tyr Gin Ser Phe Ser Gly Pro 240 245 250 Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly Thr Thr Gin Ala Pro Leu 255 260 265 Lys Ala Leu Pro Ala Gly Ala Thr Ile Gly Ser Gly Asn Gly Gin Tyr 270 275 280 Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr Leu Ala Asp Ser 
Ile Ile 285 290 295 Ala Asn Gly Ile Thr Ala Ser Met Ile Phe Ser Gly Met Gin Asn Phe _300 305 310 315 Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr Ser Ser Tyr Ser Gin Met 9 Ir WO 98/21225 PCT/US97/21353 -279- 320 Gin Asp Ala Ile Asn Tyr Gly Glu Tyr Asn Ala 380 Gn Pro Lys Gly Gly 460 Asn Arg Gin Gly Thr 540 Gin Lys Thr Gly Gin 620 Val Gly Lys 365 Asn Lys Ala Thr Val 445 Phe Gly Met Phe Ala 525 Glu Thr Pro Asn Ile 605 Ser Gly Asp 350 Gly His Val Gin Leu 430 Ile Ala Phe Phe Gly 510 Gly Ala Trp Lys Phe 590 Glu Glu Tyr 335 Phe Leu Gin Ile Ile 415 Met Asp Arg Gly Gly 495 Ala Thr Ile Lys Asp 575 Ser Phe Gly Asn Ile Asn Thr Met 400 Asn Ala Glu Asn Val 480 Leu Glu Asp Asp Thr 560 Thr Lys Gly Val Ile Thr Phe Pro 385 Asn Arg Tyr Met Phe 465 Lys Arg Ser Phe Ile 545 Asn Ser Ile Leu Thr 625 Gly Asn Leu 370 Gin Gin Ile Gly Gin 450 Leu Met Tyr Ser Leu 530 Gly Phe Phe Ala Lys 610 Ala Phe Trp 355 Pro Leu Leu Leu Leu 435 Thr Glu Gly Tyr Leu 515 Tyr Phe Leu Gin His 595 Ile Lys 325 Ser Leu 340 Val Ala Ser Tyr Thr Pro Glu Gin 405 Ala Asn 420 Tyr Arg Lys Val His Asn Tyr Lys 485 Gly Phe 500 Val Lys Asn Val Phe Ala Asp Gin 565 Phe Leu 580 Gin Lys Pro Val Tyr Arg Leu Ser Pro Tyr Gly Gly 375 Gin Gin 390 Ala Thr Pro Tyr Ser Lys Asn Gin 455 Ser Asn 470 Gin Phe Tyr Asp Ala Thr Phe Thr 535 Gly Ile 550 Val Asp Phe Asp Arg Ser Leu Tyr 615 Arg Ala 630 Asn Leu 360 Gin Ala Asn Ser Ala 440 Val Ser Phe Phe Leu 520 Arg Gin Gly Leu Arg 600 His Phe Thr 345 Asp Leu Gin Ala Pro 425 Val Tyr Asn Gly Gly 505 Ser Lys Leu Asn Gly 585 Phe Thr Ser 330 Val Leu Asn Gin Pro 410 Thr Ile Gin Asn Lys 490 Tyr Ser Arg Ala His 570 Ile Ser Tyr Phe Ala Asn Gly Glu 395 Thr Ala Gly Met Met 475 Lys Ala Tyr Gly Gly 555 Leu Arg Gin Tyr Tyr 635 INFORMATION FOR SEQ ID NO:139: SEQUENCE CHARACTERISTICS: LENGTH: 1213 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 51...1160 I' 1,.
OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:139: ATTATTTTTA ATCTTGCATG AAATCTTAAA TATAGAATTA GTCCCTTTGG ATG GGA Met Gly 1 TTT TCN CTC Phe Xaa Leu GCO CTA GGC TAT Ala Leu Gly Tyr
TTG
Leu 10 TGT TTO TTT ATA Cys Leu Phe Ile
TTC
Phe GTT TTA AGC Val Leu Ser GCT TCT Ala Ser TTA ATC TCT GAA Leu Ile Ser Glu
AAA
Lys 25 GCC TTA TCC AAG Ala Leu Ser Lys
CAG
Gln TAT TTG CAA ACC Tyr Leu Gln Thr
GCT
Ala AAA GAT AAA ATC Lys Asp Lys Ile TCT TTA AAG AAT TTA AAA GTC ATC GCC Ser Leu Lys Asn Leu Lys Val Ile Ala ACC GGA AGC TTT Thr Gly Ser Phe AAA ACC AGC ACC Lys Thr Ser Thr AAT TTC TTG CTT Asn Phe Leu Leu CAA ATC Gln Ile TTA CAA ACC ACA TTC AAC GCO CAT Leu Gln Thr Thr Phe Asn Ala His
GCA
Ala 75 AGC CCC AAA AGC Ser Pro Lys Ser GTC AAT ACC Val Asn Thr GAT AGO AGT Asp Arg Ser CTT TTA 000 Leu Leu Gly CTT GCO AAT GAT Leu Ala Asn Asp
ATT
Ile AAT CAG AAT TTA Asn Gln Asn Leu 344 GAA ATC Glu Ile 100 TAT ATC GCT CA Tyr Ile Ala Glu GGO GCA AGO AAT Gly Ala Arg Asn
AAG
Lys 110 GGC GAT ATT AA Gly Asp Ile Lys
A
Glu 115 ATC ACC TGT CTC Ile Thr Cys Leu GAA CCO CAC CTT Glu Pro His Leu
GTT
Val 125 GTG GTT GCA A Val Val Ala Glu 392 440 488 GGC GAA CAG CAT Gly Glu Gln His GAA TAC TTT AAA Glu Tyr Phe Lys TTA GAA AAT ATT Leu Glu Asn Ile TGC GAG Cys Glu 145 ACT AAA GCG GAA TTA TTG GAT TCC Thr Lys Ala Glu Leu Leu Asp Ser 150 CGC TTA GAA AA Arg Leu Glu Lys GCC TTT TGT Ala Phe Cys 160 TAC TCO GTO Tyr Ser Val 165 GAA AC ATC AC CCC TAT GCC CCT AAA GAT AGC CCT TTA Glu Lys Ile Lys Pro Tyr Ala Pro Lys Asp Ser Pro Leu ATA GAC TAT TCT ACC CTC GTT AAA AAC ATC CAA TCC ACT TTA AAA GCC Ile Asp 180 Tyr Ser Ser Leu Val 185 Lys Asn Ile Gln Ser 190 Thr Leu Lys Gly
ACT
Thr 195 TCT TTT GAA ATG Ser Phe Glu Met ATA GGT AGC GTT Ile Gly Ser Val GAA AGA TTT GAA Glu Arg Phe Glu
ACA
Thr 210 AAG GTT CTA GGG Lys Val Leu Gly TTT AGC GCT TAT AAT ATC GCT TCA GCC ATT TTA Phe Ser Ala Tyr Asn Ile Ala Ser Ala Ile Leu ATC GCT AAG Ile Ala Lys TTA GAA CTC Leu Glu Leu 245
CAT
His 230 TTA GGC TTA GAG Leu Gly Leu Glu
ACC
Thr 235 GAA AGG ATC AAA Glu Arg Ile Lys CGG CTT GTT Arg Leu Val 240 GAA GTG AAT Glu Val Asn AAC CCT ATT GCT Asn Pro Ile Ala
CAT
His 250 CGT TTG CAA CTT Arg Leu Gln Leu CAA AAA Gln Lys 260 ATC ATC ATA GAC Ile Ile Ile Asp AGC TTT AAT GGG Ser Phe Asn Gly
AAT
Asn 270 TTA AAG GGC ATG Leu Lys Gly Met
TTA
Leu 275 GAG GGC ATT CGT Glu Gly Ile Arg GCG AGT TTG CAC Ala Ser Leu His GGG CGT AAA GTC Gly Arg Lys Val 872 920 968 1016 GTA ACA CCG GGC Val Thr Pro Gly GTG GAA AGC AAT Val Glu Ser Asn
ACA
Thr 300 GAA AGT AAT GAG Glu Ser Asn Glu GCT TTA Ala Leu 305 GCG CAA AAA Ala Gln Lys TTG AAT TCC Leu Asn Ser 325
ATA
Ile 310 GAC GGG GTT TTT GAT GTC GCT ATC ATC Asp Gly Val Phe Asp Val Ala Ile Ile 315 ACA GGG GAG Thr Gly Glu 320 CAA AAA ATC Gln Lys Ile AAA ACG ATT GCT Lys Thr Ile Ala
TCA
Ser 330 CAA TTG AAA ACC Gln Leu Lys Thr 1064 TTA CTC Leu Leu 340 AAG GAT AAG GCG Lys Asp Lys Ala
CAA
Gln 345 TTG GAA AAT ATC Leu Glu Asn Ile
TTA
Leu 350 CAA GCC ACC ACG Gin Ala Thr Thr 1112 1161 ATT Ile 355 CAA GGC GAT TTG Gin Gly Asp Leu TTA TTC GCT AAT Leu Phe Ala Asn GCC CCT AAT TAC Ala Pro Asn Tyr ATT T Ile 370 AGGAAATGAA CATGCAACAT TTATACGCTC CTTGGCGCGA AAGTTATTTG AA INFORMATION FOR SEQ ID NO:140: SEQUENCE CHARACTERISTICS: LENGTH: 370 amino acids TYPE: amino acid STRANDEDNESS: single 1213 WO 98/21225 WO 981225PCT/US97/21353 -282- TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:140: Met Gly Phe Xaa Leu Ala Leu Gly Tyr Leu 1 Leu Gin Al a Gin Asn Arg Ile Giu Cys 145 Phe Pro Lys Giu Ile 225 Leu Vali Giy Val1 Al a 305 Gly Lys Thr Tyr Se r Thr Ile Ile Thr Ser Lys Val1 130 Glu Cys Leu Gly Thr 210 Leu Val Asn Met Ile 290 Leu Glu Ile Thr Ile 370 Al a Al a Thr Leu Leu Giu Giu 115 Gly Thr Tyr Ile Thr 195 Lys Ile Leu Gin Leu 275 Val1 Ala Leu Leu Ile 355 Ser Lys Gly Gin Leu Ile 100 Ile Giu Lys Ser Asp 180 Ser Val1 Ala Giu Lys 260 Giu Thr Gin Asn Leu 340 Gin 5 Leu Asp Ser Thr Gly Tyr Thr Gin Ala Val 165 Tyr Phe Leu Lys Leu 245 Ile Gly Pro Lys Ser 325 Lys Gly Ile Lys Phe Thr 70 Leu Ile Cys His Giu 150 Glu Ser Giu Giy His 230 Asn Ile Ile Gly Ile 310 Lys Asp Asp Ser Ile Gly 55 Phe Ala Al a Leu Leu 135 Leu Lys Ser Met Glu 215 Leu Pro Ile Arg Leu 295 Asp Thr Lys Leu Glu Thr 40 Lys Asn As n Glu Ile 120 Giu Leu Ile Leu Leu 200 Phe Gly Ile Asp Leu 280 Val1 Gly Ile Al a Ile 360 Lys 25 Ser Thr Aia Asp Ala 105 Glu Tyr Asp Lys Val1 185 Ile Ser Leu Ala Asp 265 Ala Giu Val1 Aia Gin 345 Leu 10 Aila Leu Ser His Ile 90 Gly Pro Phe Ser Pro 170 Lys Giy Al a Glu His 250 Ser Ser Ser Phe Ser 330 Leu Phe Cys Leu Lys Thr Ala 75 Asn Al a His Lys Lys 155 Tyr Asn Ser Tyr Thr 235 Arg Phe Leu Asn Asp 315 Gin Giu Leu Ser Asn Lys Ser Gin Arg Leu Thr 140 Arg Ala Ile Val1 As n 220 Giu Leu Asn His Thr 300 Val1 Leu Asn Phe Lys Leu As n Pro As n Asn Val1 125 Leu Leu Pro Gin Trp 205 Ile Arg Gin Gly Lys 285 Giu Al a Ly s Ile Ile Gin Lys Phe Lys Leu Lys 110 Val Glu Glu Lys Ser 190 Glu Ala Ile Leu 
Asn 270 Gly Ser Ile Thr Leu 350 Ala Phe Tyr Val1 Leu Ser Asp Gly Val1 Asn Lys Asp 175 Thr Arg Ser Lys Leu 255 Leu Arg As n Ile Pro 335 Gln Pro Val1 Leu Ile Leu Val1 Asp Asp Al a Ile Al a 160 Ser Leu Phe Al a Arg 240 Giu Lys Lys Glu Thr 320 Gin Al a Asn Ala Asn Asp 365 WO 98/21225 PCT/US97/21353 -283- INFORMATION FOR SEQ ID NO:141: SEQUENCE CHARACTERISTICS: LENGTH: 360 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 82...270 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:141: ACTTAAAGGC ATAAAAACCT TAAGCTTTTT GAGTTTCAAA AGGGTTTCAA GCTTTTTATA AGACTTTTTT TGAATGAGTA A GGA GAA AAT ATT TTG TTC CAT AAA CTG -ATC Gly Glu Asn Ile Leu Phe His Lys Leu Ile 111 TTA ACA TGC TTT Leu Thr Cys Phe TTA GCG CTT GTA GCA ATA ACC ATT CAA GCT TGC GGT Leu Ala Leu Val Ala Ile Thr lie Gln Ala Cys Gly 20 CCA TTC AAT GAA AAA CCC GCT AAA AAA ACT TCA AAC Pro Phe Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn 35 TAT AAA GCC Tyr Lys Ala AGC TCT AAT Ser Ser Asn
CCT
Pro 159 207 255 311 TCT TCT ATG CAA ACG CCC ACC AAC AGC ACC ACG CCA GAA Ser Ser Met Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu 50 TTT TTA Phe Leu AAT CAG CCT Asn Gln Pro TAAAATCACT GCTCTTGTTT AAGGGCTTTG ATTTCTAGGG T TTTTGTGGCT AACTTTTGAN STTCGCTTTC ATCATGCGTT ACCATAATG INFORMATION FOR SEQ ID NO:142: SEQUENCE CHARACTERISTICS: LENGTH: 63 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:142: Gly Glu Asn Ile Leu Phe His Lys Leu Ile Leu Thr Cys Phe Leu Ala 1 5 Leu Val Ala Ile Thr Ile Gln Ala Cys 10 Gly Tyr Lys Ala Pro Pro Phe 25 Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn Ser Ser 40 Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu Phe Leu 55 Asn Ser Ser Asn Gln Pro Met INFORMATION FOR SEQ ID NO:143: SEQUENCE CHARACTERISTICS: LENGTH: 1024 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 115...921 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:143: AGTTGGCAAA AACGCAGAGA CAGTAACGCA AAGGCAAATA AAGAGACTCA TTTTAAACAA GCGAATGCCA TTACAAATAT AATCAGATCA GTTGGTGGGT TTTTTACAAA GATT ATG Met 1 117 AAG AGA GTT Lys Arg Val GCA TTA GTA Ala Leu Val
AGA
Arg GAA CTT GTA AAA Glu Leu Val Lys
AAA
Lys 10 CAT CCC GAG AAA His Pro Glu Lys AGC AGT GTG Ser Ser Val AAA GAA TTG Lys Glu Leu 165 213 GTA TTA ACC CAT Val Leu Thr His
GCT
Ala 25 GCA TGC AAG AAA Ala Cys Lys Lys
GCG
Ala GAC GAT Asp Asp AAA GTC CAG GAT Lys Val Gln Asp
AAA
Lys TCC AAA CAA GCT Ser Lys Gln Ala AAA GAA AAT CAA Lys Glu Asn Gln
ATC
Ile AAT TGG TGG AAA Asn Trp Trp Lys
TAT
Tyr 55 TCA GGA TTA ACA Ser Gly Leu Thr
ATA
Ile 60 GCG ACA AGT TTA Ala Thr Ser Leu
TTA
Leu 261 309 357 TTA GCC GCT TGT Leu Ala Ala Cys
AGT
Ser GTT GGT GAT ATT Val Gly Asp Ile
GAT
Asp 75 AAA CAG ATA GAG Lys Gln Ile Glu TTA GAA Leu Glu CAA GAA AAA AAG GAA GCT GAA AAC GCT AGG GAT AGA GCG Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala AAC AAG AGT Asn Lys Ser GGG ATA GAA Gly Ile Glu 100 CTG GAA CAG GAA Leu Glu Gln Glu
AAA
Lys 105 CAA AAG ACC ATT AAA GAA CAA AAA Gln Lys Thr Ile Lys Glu Gln Lys 110 GAT TTA Asp Leu 115 GTT AAA AAA GCA Val Lys Lys Ala
GAA
Glu 120 CAA AAT TGC CAA Gln Asn Cys Gln AAT CAT GGC CAA Asn His Gly Gln
TTC
Phe 130 TTT ATG AAA AAA TTA GGA ATT AAG GGT Phe Met Lys Lys Leu Gly Ile Lys Gly
GGC
Gly 140 ATT GCT ATA GAA Ile Ala Ile Glu 549 597 GAA GCT GAA TGC Glu Ala Glu Cys
AAA
Lys 150 ACC CCT AAA CCT GCA AAA ACC AAT CAA Thr Pro Lys Pro Ala Lys Thr Asn Gln 155 ACC CCT Thr Pro 160 ATC CAG CCA Ile Gln Pro GGA TCA AAA Gly Ser Lys 180 CAC CTC CCC AAC His Leu Pro Asn AAA CAA CCC CAC Lys Gln Pro His TCT CAA AGA Ser Gln Arg 175 GAG TTA GAA Glu Leu Glu GCG CAA GAG CTT ATC GCT TAT TTG CAA Ala Gln Glu Leu Ile Ala Tyr Leu Gln
AAA
Lys 190 TCT CTG Ser Leu 195 CCC TAT TCA CAA Pro Tyr Ser Gln
AAA
Lys 200 GCT ATC GCT AAA Ala Ile Ala Lys GTG AAT TTT TAC Val Asn Phe Tyr
AGG
Arg 210 CCA AGT TCT GTC Pro Ser Ser Val
GCT
Ala 215 TAT TTA GAA CTA Tyr Leu Glu Leu
GAC
Asp 220 CCT AGA GAT TTT Pro Arg Asp Phe
AAG
Lys 225 GTT ACA GAA GAA Val Thr Glu Glu
TGG
Trp 230 CAA AAA GAA AAT Gln Lys Glu Asn
CTA
Leu 235 AAA ATA CGC TCT Lys Ile Arg Ser AAA GCT Lys Ala 240 CAA GCT AAA Gln Ala Lys CTC TCA AAG Leu Ser Lys 260 CTT GGA AAT GAG Leu Gly Asn Glu
AAA
Lys 250 CCC ACA AGC Pro Thr Ser CCA CCT TTC AAC Pro Pro Phe Asn 255 TGATGTTAAT AAAGAA CCT TTT GTT CGT Pro Phe Val Arg AAA AAT ATT TGC Lys Asn Ile Cys ATAGAAGCAG TTGCTAATAC TGAAAAGAAA GCAGAAAAAG AGGATGTAGG CATAAGAAAA TAAGAAC INFORMATION FOR SEQ ID NO:144: SEQUENCE CHARACTERISTICS: LENGTH: 269 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear MGGGTTATGG TTATAGTAAA 997 1024 WO 98/21225 WO 9821225PCTIUS97/21353 -286- (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:144: Met Lys Arg Val Arg Giu Leu Val Lys Lys 1 Val1 Leu Gin Leu 6S Glu Ser Lys Gin Val1 145 Pro Arg Glu Tyr Lys 225 Ala Asn Al a Asp Ilie Leu Gin Gly Asp Phe 130 Giu Ile Gly Ser Arg 210 Val1 Gin Leu Leu Asp Asn Al a Giu Ile Leu 115 Phe Ala Gin Ser Leu 195 Pro Thr Al a Ser Vai Lys Trp Aila Lys Giu 100 Val Met Giu Pro Lys 180 Pro Ser Giu Lys Lys 260 5 Val1 Val Trp Cys Lys Leu Lys Lys Cys Lys 165 Aila Tyr Ser Giu Met 245 Pro Leu Gin Lys Ser 70 Giu Giu Lys Lys Lys 150 His Gin Ser Val1 Trp 230 Leu Phe Thr Asp Tyr 55 Val1 Al a Gin Al a Leu 135 Thr Leu Giu Gin Ala 215 Gin Gly Val His Lys 40 Ser Gly Giu Giu Giu 120 Gly Pro Pro Leu Lys 200 Tyr Lys Asn Arg Al a 25 Ser Gly Asp Asn Lys 105 Gin Ile Lys Asn Ile 185 Ala Leu Giu Giu Ser 10 Al a Lys Leu Ile Ala 90 Gin Asn Lys Pro Ser 170 Ala Ile Giu Asn Lys 250 Lys Pro Lys Ala Ile Lys Asp Thr Gin Gly 140 Lys Gin Leu Lys Asp 220 Lys Thr Ile Giu Lys Lys Ala Giu Lys Ala Thr Gin Ile Arg Ala Ile Lys 110 Giu Asn 125 Ile Ala Thr Asn Pro His Gin Lys 190 Gin Val 205 Pro Arg Ile Arg Ser Pro Cys Ser Lys Giu Ser Giu Asn Giu His Ile Gin Ser 175 Giu Asn Asp Ser Pro 255 Ser Giu Asn Leu Leu Lys Gin Gly Giu Thr 160 Gin Leu Phe Phe Lys 240 Phe INFORMATION FOR SEQ ID NO:145: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 669 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: coding Sequence LOCATION: .603 OTHER INFORMATION: WO 98/21225 WO 981225PCT/US97/21353 -2 87- (xi) 
SEQUENCE DESCRIPTION: SEQ ID NO:145: AAAATAAGOA GGAATTOTTT GATTTTACGA TTGGCTGGAG CAAGCGTTTT AACGGCTTGT GTCTTTTCGG GGTGTTTTTT TTTAAAA ATG TTT GAT AAA AAA CTT TCT AGT AAC Met Phe Asp Lys Lys Leu Ser Ser Asn 1 114
GAT
Asp TGG CAT ATC CAA Trp His Ile Gln
AAA
Lys 15 GTG GAA ATG AAC Val Glu Met Asn
CAT
His 20 CAA GTC TAT GAC Gln Val Tyr Asp
ATT
Ile 162 210 GAA ACC ATG CTC Glu Thr Met Leu
GCT
Ala GAT AGC GCT TTT Asp Ser Ala Phe
AGA
Arg 35 GAG CAT GAA GAA Glu His Glu Glu GAG CAA Glu Gln GAT TCC TCT Asp Ser Ser GCC AAA GAG Ala Lys Glu TTT AAA AAG Phe Lys Lys
CTA
Leu 45 AAT ACC GCT TTG CCT GAA GAT AAA ACA GCG ATT GAA Asn Thr Ala Leu Pro Glu Asp Lys Thr Ala Ile Glu 258 CAA GAG CAA AAA GAA AAA AGA AAA CGC Gln Glu Gln Lys Glu Lys Arg Lys Arg
TGG
Trp TAT GAG CTT Tyr Glu Leu AAA CCA AAG Lys Pro Lys AAA AGC TCT ATG Lys Ser Ser Met GAG TTT GTG TTT Glu Phe Val Phe
GAT
Asp CAA AAA GAA AAT Gln Lys Glu Asn ATT TAT GGC AAA Ile Tyr Gly Lys
GC
Gly 100 TAT TGC AAC CGO Tyr Cys Asn Arg TTT GCC AGC TAT GTA TGG CAG GGC GAT Phe Ala Ser Tyr Val Trp Gln Gly Asp CAC ATT 000 ATT His Ile Gly Ile GAA GAT Glu Asp 120 AGC 000 ATT TCA AGA AAA GTO TGT AAA GAT GAG CAT TTA ATG GCO TTT Ser Gly Ile Ser Arg Lys Val Cys Lys Asp Glu His Leu Met Ala Phe GAA TTG GAA TTT ATG GAG AAT Glu Leu Glu Phe Met Glu Asn 140 TTT AAO GGT AAT TTT ACO GTA ACT AAO Phe Lys Gly Asn Phe Thr Val Thr Lys 145 150 GAO AAC CAA AAA ATG AAA ATT TAT TTG Asp Asn Gln Lys Met Lys Ile Tyr Leu 165 GGC AAG GAC ACO CTC ATT Gly Lys Asp Thr Leu Ile 155
TTA
Leu 160 AAA ACO CCT Lys Thr Pro 170 TOAOTGGTT TTTOATTTCA AAACAATCTA AGATCACTAA ATTAGGOAT TAAAAAGAAA TTTTTAA 66 669 II n WO 98/21225 -288iNFORMATION FOR SEQ ID NO:146: SEQUENCE CHARACTERISTICS: LENGTH: 172 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:146: PCT/US97/21353 Met Phe Asp Lys Lys Leu Ser Ser Asn Asp 1 Glu Ala Leu Glu Lys Tyr Gly Cys Phe 145 Asp His Glu Asp Lys Met Gly 100 His Glu Asn Lys 5 Gin His Lys Arg Gly Tyr Ile His Phe Met 165 Tyr Glu Ala Tyr Phe Asn Ile Met 135 Val Ile Asp Glu Ile Glu Val Arg Glu 120 Ala Thr Tyr Trp Thr Ser Lys Lys 75 Gin Ala Gly Leu Lys 155 Thr His Met Ser Glu Lys Lys Ser Ile Glu 140 Asp Pro Gin Ala Asn Glu Pro Asn Val 110 Arg Met Leu Lys Asp Thr Gin Lys Arg Trp Lys Glu Ile Val Ser Ala Lys Pro Ile Gin Val Asn Leu 160 INFORMATION FOR SEQ ID NO:147: SEQUENCE CHARACTERISTICS: LENGTH: 1350 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 87...1280 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:147: WO 98/21225 WO 8/2225PCTIUS97/21353 -289- ATCAATCTAA CTTGAGTGGA TTTTTCGTAT TAGTTTCCAT GATATAATTT TGAAAAGTAA GATTGTTTTT TAAAAAAAGG TTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA Met Glu Ser Val Lys Thr Gly Lys Thr 113
AAT
Asn AAG GTT GGC AAG AAT ACA GAG ATG GCT Lys Val Gly Lys Asn Thr Glu Met Ala 15
AAT
Asn 20 ACA AAG GCA AAT Thr Lys Ala Asn
AA
Lys GAG ACT CAT TTT Glu Thr His Phe
AAA
Lys CAA GTG AGC GCC Gln Val Ser Ala
ATT
Ile 35 ACA AAT ATA ATC Thr Asn Ile Ile AGA TCA Arg Ser 209 GTT GGT GGG Val Gly Gly AAA AAA CAC Lys Lys His TTT ACA AAA ATT Phe Thr Lys Ile
GCA
Ala AAG AGA GTT AGA Lys Arg Val Arg GGA CTT GTA Gly Leu Val TTG ACC CAT Leu Thr His CCC AAG AAA AGC Pro Lys Lys Ser GCG GCA TTA GTA Ala Ala Leu Val
GTA
Val ATT GCG Ile Ala TGC AAG AAA GCG Cys Lys Lys Ala
AAA
Lys 80 GAA TTA GAC GAT Glu Leu Asp Asp
AAA
Lys GTC CAA GAT AAA Val Gln Asp Lys
TCC
Ser AAA CAA GCT GAA Lys Gln Ala Glu
AAA
Lys GAA AAT CAA ATC Glu Asn Gln Ile TGG TGG AAA TAT Trp Trp Lys Tyr GGA TTA ACA ATA GCG GCA AGT TTA TTA Gly Leu Thr Ile Ala Ala Ser Leu Leu GCC GCT TGT AGC Ala Ala Cys Ser GCT GGT Ala Gly 120 GAT ACT GAT AAA CAG ATA GAA CTA GAA CAA GAA AAA AAG Asp Thr Asp Lys Gln Ile Glu Leu Glu Gln Glu Lys Lys GAA GCT GAA Glu Ala Glu 135 GAA CAA GAA Glu Gln Glu 449 497 545 593 AAC GCT AGO GAT AGA GCG AAC AAO AGT GO ATA Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile GAA CTA Glu Leu 150 AGA CAG AAA ACA AAC AAO Arg Gln Lys Thr Asn Lys
AGT
Ser 160 GGG ATA GAA CTC GCT AAT AGT CAA ATA Gly Ile Glu Leu Ala Asn Ser Gln Ile 165
AAA
Lys 170 GCA GAA CAA GAA Ala Glu Gln Glu CAA AAG ACA GAA CAA GAA AAA CAA AAA Gln Lys Thr Glu Gln Glu Lys Gln Lys 180 641 AAT AAG AGT GCG Asn Lys Ser Ala GAG TTA GAA CAG CAA AAA CAA AAG Glu Leu Glu Gln Gln Lys Gln Lys 195 ACC ATT AAT Thr Ile Asn 200 ACA CAA AGA GAT TTO ATT AAA GAA CAG AAA GAT TTC ATT AAA GAA ACA Thr Gln Arg Asp Leu Ile Lys Glu Gln Lys Asp Phe Ile Lys Glu Thr 210 GAA CAA AAT Glu Gln Asn 220 TGC CAA GAA AAT Cys Gln Glu Asn
CAT
His 225 AAT CAA TTC TTT Asn Gln Phe Phe
ATT
Ile 230 AAA AAA TTA Lys Lys Leu 785 GGA ATT Gly Ile 235 AAG GGT GGC ATT Lys Gly Gly Ile ATA GAA GTA GAA Ile Glu Val Glu GAA TGC AAA ACC Glu Cys Lys Thr
CCT
Pro 250 AAA CCT GCA AAA Lys Pro Ala Lys
ACC
Thr 255 AAT CAA ACC CCT Asn Gln Thr Pro
ATC
Ile 260 CAG CCA AAA CAC Gln Pro Lys His
CTC
Leu 265 CCA AAC TCT AAA Pro Asn Ser Lys
CAA
Gln 270 CCT CAT TCT CAA Pro His Ser Gln
AGA
Arg 275 GGA TCA AAA GCG Gly Ser Lys Ala CAA GAG Gln Glu 280 TTT ATC GCT Phe Ile Ala AAA GCT ATC Lys Ala Ile 300
TAT
Tyr 285 TTG CAA AAA GAG Leu Gln Lys Glu
CTA
Leu 290 GAA TTT CTG CCC Glu Phe Leu Pro TAT TCG CAA Tyr Ser Gln 295 TCT ATC GCT Ser Ile Ala GCT AAA CAA GTG AAT TTC TAT AAA CCA Ala Lys Gln Val Asn Phe Tyr Lys Pro
AGT
Ser 310 TAT TTA Tyr Leu 315 GAA CTA GAT CCT Glu Leu Asp Pro
AGA
Arg 320 GAT TTT AAG GTT Asp Phe Lys Val GAA GAA TGG CAA Glu Glu Trp Gln
AAA
Lys 330 GAA AAT CTA AAA Glu Asn Leu Lys
ATA
Ile 335 CGC TCT AAA GCT Arg Ser Lys Ala 1025 1073 1121 1169 1217
CAA
Gln 340 GCT AAA ATG CTT Ala Lys Met Leu
GAA
Glu 345 ATG AGG GAT TTA Met Arg Asp Leu
AAA
Lys 350
GTT
Val CCA GAC CCA CAA Pro Asp Pro Gln
GCC
Ala 355
GCT
Ala CAC CTT CCA ACC His Leu Pro Thr TCT CAA Ser Gln 360 GAA ATA Glu Ile AGC CTT TTG Ser Leu Leu GAA GCA GTT Glu Ala Val 380 CAA AAA ATA Gln Lys Ile GAT GTT AAT Asp Val Asn GCT AAT ACT GAA Ala Asn Thr Glu AAA GCA GAA AAA Lys Ala Glu Lys GGT TAT GGT Gly Tyr Gly 1265 TAT AGT Tyr Ser 395 AAA AGG ATG Lys Arg Met TAGGCATAAG AAAATAAGAA CACCATAAAA TCGTTTTTAG C 1321 1350 TTCTAGGAGA CATCAGTCAG TTTCTTGCC INFORMATION FOR SEQ ID NO:148: (i) SEQUENCE CHARACTERISTICS: LENGTH: 398 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:148: Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr 1 5 10 Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Thr His Phe Lys Gln Val 25 Ser Ala Ile Thr Asn Ile Ile Arg Ser Val Gly Gly Phe Phe Thr Lys 40 Ile Ala Lys Arg Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser 55 Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys 70 75 Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu 90 Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Ala Ser 100 105 110 Leu Leu Leu Ala Ala Cys Ser Ala Gly Asp Thr Asp Lys Gln Ile Glu 115 120 125 Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn 130 135 140 Lys Ser Gly Ile Glu Leu Glu Gln Glu Arg Gln Lys Thr Asn Lys Ser 145 150 155 160 Gly Ile Glu Leu Ala Asn Ser Gln Ile Lys Ala Glu Gln Glu Arg Gln 165 170 175 Lys Thr Glu Gln Glu Lys Gln Lys Ala Asn Lys Ser Ala Ile Glu Leu 180 185 190 Glu Gln Gln Lys Gln Lys Thr Ile Asn Thr Gln Arg Asp Leu Ile Lys 195 200 205 Glu
Gln Lys Asp Phe Ile Lys Glu Thr Glu Gln Asn Cys Gln Glu Asn 210 215 220 His Asn Gln Phe Phe Ile Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala 225 230 235 240 Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn 245 250 255 Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His 260 265 270 Ser Gln Arg Gly Ser Lys Ala Gln Glu Phe Ile Ala Tyr Leu Gln Lys 275 280 285 Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val 290 295 300 Asn Phe Tyr Lys Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg 305 310 315 320 Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg 325 330 335 Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asp Leu Lys Pro Asp 340 345 350 Pro Gln Ala His Leu Pro Thr Ser Gln Ser Leu Leu Phe Val Gln Lys 355 360 365 Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu 370 375 380 Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met 385 390 395 INFORMATION FOR SEQ ID NO:149: SEQUENCE CHARACTERISTICS: LENGTH: 709 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 336...443 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:149:
TAAGGGATAT
TGCCTTGTGT
AGGAGCCTAA
TCTAAGATTA
TGAGTTTAAT
GTATTTAAAA
TGCTAACGAT
GATTAGTAAC
CTAAAATAAA
CAAAGGGTAG
TTACTTTTTG
AAATTTTAAA
TAAGCTGTAT
ACAAGGCAAG
ATGAACAATT
CGTTTCTGTT
TTTAATAATA
ACAAAAGGAT
TGGAAGAGTT TATTTTGCAA TGTGATAAAC CCTACTACAA TCAGTTAGGG CTTTATTATA TTTGGATTTA GAGCGTTATT AATCTTAACT ATCATAAATG ATAAA ATG AAA ACC ATT Met Lys Thr Ile
GAATTAATCT
TTTCAATTCA
GCAAAAATTA
TTGATTGTTT
TACAATTAAA
AGA AAT Arg Asn 120 180 240 300 353 AGC GTG TTT ATT GGA GCG TCT TTA CTC GGC GGT Ser Val Phe Ile Gly Ala Ser Leu Leu Gly Gly 15 GCT TAT TTT GAC GCT TTG CAT GTT GCT CGC GTT Ala Tyr Phe Asp Ala Leu His Val Ala Arg Val 30 AAAAAGAAGC ACACCACACG CCCAAAGACT TTGATAGCCC GCACTAGGTT TTAGTTGGGG GTTTTTAGGG GTGTTATTTT AAGAAAATAA ATTTCTACCA TAAAATAAAA TCTTAAATTA TTAAAAAATT AAAAAGCGTT AAGTAAGACT TATCCAAAAA CAACCACTTT TTTTAAG INFORMATION FOR SEQ ID NO:150: SEQUENCE CHARACTERISTICS: LENGTH: 36 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear TGC GCT AGC GTT GAG Cys Ala Ser Val Glu AAA GAC GCT TGTTTATAG Lys Asp Ala TTACCACACT GACTAAACCG AGATACTCTC TGTTCCCTTA AGGCGACTAA AACCCCACTT GCAAAGAAAA TCAATTTTTC 401 452 512 572 632 692 709 WO 98/21225 PCT/US97/21353 -293- (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:150: Met 1 Gly Lys Thr Ile Arg Asn Ser Val Phe Ile 5 10 Cys Ala Ser Val Glu Ala Tyr Phe Asp 25 Gly Ala Ser Leu Leu Gly Ala Leu His Val Ala Arg Val Lys Asp Ala INFORMATION FOR SEQ ID NO:151: SEQUENCE CHARACTERISTICS: LENGTH: 888 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 19...837 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:151: AGATAGGAAT GTAAAGGA ATG GAA TTT ATG AAA AAG TTT GTA GCT Met Glu Phe Met Lys Lys Phe Val Ala 1 5 TTA GGG Leu Gly CTT CTA TCC Leu Leu Ser GTT TAT ATA Val Tyr Ile
GCA
Ala GTT TTA AGC TCT TCG TTG TTA GCC GAA Val Leu Ser Ser Ser Leu Leu Ala Glu 20 GGT GAT GGT Gly Asp Gly TTG AAT AGT Leu Asn Ser 99 147 GGG ACT AAT TAT CAG CTT GGA CAA GCC Gly Thr Asn Tyr Gln Leu Gly Gln Ala 35
CGT
Arg AAT ATT Asn Ile TAT AAT ACA GGG Tyr Asn Thr Gly
GAT
Asp 50 TGC ACA GGG AGT GTT GTA GGT TGC CCC Cys Thr Gly Ser Val Val Gly Cys Pro
CCA
Pro GGT CTT ACC GCT Gly Leu Thr Ala
AAT
Asn 65 AAG CAT AAT CCA Lys His Asn Pro
GGA
Gly 70 GGC ACC AAT ATC Gly Thr Asn Ile
AAT
Asn TGG CAT GCT AAA Trp His Ala Lys
TAC
Tyr GCT AAT GGG GCT Ala Asn Gly Ala TTG AAT Leu Asn 85 GGT CTT GGG Gly Leu Gly TTG AAT Leu Asn GTG GGT TAT Val Gly Tyr AAG TGG TTT Lys Trp Phe 110 AAG TTC TTC CAG TTC AAG TCT TTT GAT Lys Phe Phe Gln Phe Lys Ser Phe Asp 100 ATG ACA AGC Met Thr Ser 105 GGG CAT GCC Gly His Ala 339 GGT TTT AGA GTG Gly Phe Arg Val
TAT
Tyr 115 GGG CTT TTT GAT Gly Leu Phe Asp
TAT
Tyr 120 ACT TTA Thr Leu 125 GGC AAG CAA GTT Gly Lys Gln Val GCA CCT AAT AAA Ala Pro Asn Lys CAG TTG GAT ATG Gln Leu Asp Met
GTC
Val 140 TCT TGG GGT GTG GGG AGC GAT TTG TTA Ser Trp Gly Val Gly Ser Asp Leu Leu 145 GAT ATT ATT GAT Asp Ile Ile Asp GAT AAC GCT TCT Asp Asn Ala Ser
TTT
Phe 160 GGT ATT TTT GGT Gly Ile Phe Gly
GGG
Gly 165 GTC GCT ATC GGC Val Ala Ile Gly GGT AAC Gly Asn 170 ACT TGG AAA Thr Trp Lys GCT AAG GGT Ala Lys Gly 190
AGC
Ser 175 TCA GCG GCA AAC Ser Ala Ala Asn TGG AAA GAG CAA Trp Lys Glu Gln ATC ATT GAA Ile Ile Glu 185 CCT AAC GCT Pro Asn Ala 579 627 CCT GAT GTT TGT Pro Asp Val Cys CCT ACT TAT TGT Pro Thr Tyr Cys
AAC
Asn 200 CCT TAT Pro Tyr 205 AGC ACC AAA ACT Ser Thr Lys Thr ACC GTC GCT TTT Thr Val Ala Phe GTA TGG TTG AAT Val Trp Leu Asn
TTT
Phe 220 GGG GTG AGA GCC Gly Val Arg Ala
AAT
Asn 225 ATT TAC AAG CAT Ile Tyr Lys His
AAT
Asn 230 GGC GTA GAG TTT Gly Val Glu Phe
GGC
Gly 235 GTG AGA GTG CCG CTA CTC ATC AAC AAG Val Arg Val Pro Leu Leu Ile Asn Lys 240
TTT
Phe 245 TTG AGT GCG GGT Leu Ser Ala Gly CCT AAC Pro Asn 250 GCT ACT AAT Ala Thr Asn GGG TAT AAC Gly Tyr Asn 270
CTT
Leu 255 TAT TAC CAT TTG Tyr Tyr His Leu
AAA
Lys 260 CGG GAT TAT TCG Arg Asp Tyr Ser CTT TAT TTA Leu Tyr Leu 265 TAC ACT TTT Tyr Thr Phe TAAACCCTTT AAAAGGGTGT CTTTAAGCCC TTTTTAGT CCTTATAAAA AGG INFORMATION FOR SEQ ID NO:152: SEQUENCE CHARACTERISTICS: LENGTH: 273 amino acids TYPE: amino acid WO 98/21225 WO 98/ 1225PCT/US97/21353 -295- STRANDiEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:152: Met Glu Phe Met Lys Lys Phe Val Ala Leu 1 Leu Asn Gly Asn Ala Phe Arg Val Gly 145 Gly Al a Val1 Thr As n 225 Leu Tyr Phe Ser Gln Cys His Gly Gin Tyr 115 Ala Asp Phe Asn Thr 195 Thr Tyr Asn Leu Ser Leu Thr Asn Ala Phe 100 Gly Pro Leu Gly Tyr 180 Pro Val1 Lys Lys Lys 260 5 Leu Gly Gly Pro Leu Lys Leu Asn Leu Gly 165 Trp Thr Al a His Phe 245 Arg Leu Gin Ser Gly 70 Asn Ser Phe Lys Al a 150 Val1 Lys Tyr Phe Asn 230 Leu Asp Ala Al a Val1 Gly Gly Phe Asp I le 135 Asp Ala Glu Cys Gin 215 Gly Ser Tyr Giu Arg 40 Val Thr Leu Asp Tyr 120 Gin Ile Ile Gin Asn 200 Val1 Val1 Al a Ser Gly 25 Leu Gly Asn Gly Met 105 Gly Leu Ile Gly Ile 185 Pro Trp Glu Gly Leu 265 10 Asp As n Cys Ile Leu 90 Thr His Asp Asp Gly 170 Ile Asn Leu Phe Pro 250 Tyr Gly Gly Ser Pro Asn.
75 Asn Ser Ala Met Asn 155 Asn Glu Ala Asn Gly 235 Asn Leu Leu Val Asn Pro Trp Val Lys Thr Val 140 Asp Thr Ala Pro Phe 220 Val Ala Gly Leu Ser Tyr Ile Ile Tyr Gly Leu His Ala Gly Tyr Trp Phe 110 Leu Gly 125 Ser Trp Asn Ala Trp Lys Lys Gly 190 Tyr Ser 205 Gly Val Arg Val Thr Asn Tyr Asn 270 Ala Gly Asn Thr Lys Lys Gly Lys Gly Ser Ser 175 Pro Thr Arg Pro Leu 255 Tyr Val Thr Thr Ala Tyr Lys Phe Gln Val Phe 160 Ser Asp Lys Ala Leu 240 Tyr Thr
INFORMATION FOR SEQ ID NO:153: SEQUENCE CHARACTERISTICS: LENGTH: 310 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 10...279 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:153: AAAAGGAGA GTG GCG GTG AAA AAA ATC GTT GTG AGT TGG TGT GTG GCG TTG Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu
GCT
Ala TTT TTA AGC GCG Phe Leu Ser Ala TCA GCA CAA GCC AAT AAA GCG ATC AGT AAT Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn GCG GAT TTG ATT Ala Asp Leu Ile
AAA
Lys GAG ATA AGG GAT Glu Ile Arg Asp
TTA
Leu 40 AAA AAA ATC ATC AGC GCG Lys Lys Ile Ile Ser Ala CAA AAC ACT GAG ATT AAC AAC TTA Gln Asn Thr Glu Ile Asn Asn Leu
AGA
Arg 55 AAA GTG CAA Lys Val Gln GGG CAA TTA Gly Gln Leu TGC ATT AGC Cys Ile Ser GGG GAC ATG CGT Gly Asp Met Arg GAA GTG TTG TCT Glu Val Leu Ser ACT AGA GAT TAT Thr Arg Asp Tyr TAGGGGATAA TCCAAA 195
AAG
Lys 70 GAT ATA TTA AGC Asp Ile Leu Ser TTA AGG CCT Leu Arg Pro ATC TAT AAT TGG Ile Tyr Asn Trp AAATGAAAGC ATGCG
INFORMATION FOR SEQ ID NO:154: SEQUENCE CHARACTERISTICS: LENGTH: 90 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:154: Val 1 Leu Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu 5 10 Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Phe Ala Ile Ser Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser 40 Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu 55 Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp 70 Asn Ala Asp Ala Gln Asn Ser Gly Gln Tyr Cys Ile Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg
INFORMATION FOR SEQ ID NO:155: SEQUENCE CHARACTERISTICS: LENGTH: 549 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 16...474 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:155: TGTTAAGATC AGTTT ATG GAA CAA AAT ATT TTC TCC TTA CTC ATT CAA AAA Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys AAG TCT TAT AAA AAG CTT GAA Lys Ser Tyr Lys Lys Leu Glu
CTT
Leu TTG AAA CTC Leu Lys Leu
AAA
Lys AAG CTT AAG Lys Leu Lys GTT TTT Val Phe ATG CCT TTA AGT TTA CAA GAA AAT TTG CTT TTT ATC TTC ATA Met Pro Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile
AAA
Lys GAC TCT AAA TTG Asp Ser Lys Leu
CTT
Leu 50 TTT GCG TTT AAA Phe Ala Phe Lys
GAC
Asp ATT TGG GCT TCT Ile Trp Ala Ser GAA TTT AAC CAA Glu Phe Asn Gln
CGA
Arg TTC GCT AAA GAA Phe Ala Lys Glu
ATC
Ile 70 AGC CAT TTT TTA Ser His Phe Leu AAC ACG Asn Thr CAA GGG CAT Gln Gly His
GCT
Ala TAT GGG TTT GAC Tyr Gly Phe Asp
GGG
Gly 85 TTG AAT GGG TTA Leu Asn Gly Leu GAA ATT TTA Glu Ile Leu TAT GCC CCC Tyr Ala Pro GGT TAT GTG CCT AAA GAC GCG CTA AAA AAA TCC AAT Gly Tyr Val Pro Lys Asp Ala Leu Lys Lys Ser Asn
TTT
Phe 105 ATT AAA Ile Lys 110 AAA CAA GCC CGT TTT TTT CGC CCT AGT Lys Gln Ala Arg Phe Phe Arg Pro Ser 115 TTA GGG TTG TTC Leu Gly Leu Phe CAT AAC CCC ATT AAA GAC GCT CGT TTG CAT GAA TGT TTT GAA AAA GCG His Asn Pro Ile Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala 125 130 135 140 CGC GCT TTG ATC CAC TAC CAA CGA AGT TTT TTT GAG GAA TGAATGGCTG AT Arg Ala Leu Ile His Tyr Gln Arg Ser Phe Phe Glu Glu 145 150 TTATTGTCCA GTTTAAAAAA CCTTCCTAAC AGCAGTGGCG TGTATCAATA TTTTGATAAA
AAC
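The coding-sequence locations given in the FEATURE fields can be cross-checked against the stated polypeptide lengths: a 1-based, inclusive range start...end spans end - start + 1 nucleotides, and one third of that number is the codon count. The following is a minimal sketch in Python and is not part of the listing itself. The names `codon_count`, `RECORDS`, and `mismatches` are hypothetical and introduced only for illustration; the numbers are copied from the LOCATION and LENGTH fields of SEQ ID NOS:149 to 162.

```python
def codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in a 1-based, inclusive range."""
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return span // 3


# (coding start, coding end, stated amino-acid length), copied from the listing.
# The dictionary itself is an illustrative construct, not part of the source.
RECORDS = {
    "SEQ ID NO:149 / 150": (336, 443, 36),
    "SEQ ID NO:151 / 152": (19, 837, 273),
    "SEQ ID NO:153 / 154": (10, 279, 90),
    "SEQ ID NO:155 / 156": (16, 474, 153),
    "SEQ ID NO:157 / 158": (18, 2582, 855),
    "SEQ ID NO:159 / 160": (56, 1945, 630),
    "SEQ ID NO:161 / 162": (8, 1702, 565),
}


def mismatches(records):
    """List the record names whose coding range disagrees with the stated length."""
    return [name for name, (start, end, aa) in records.items()
            if codon_count(start, end) != aa]


if __name__ == "__main__":
    print(mismatches(RECORDS))  # prints [] when every record is consistent
```

For the ranges listed here every coding range is exactly three times the stated polypeptide length, which indicates that the LOCATION fields do not include the stop codon.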
INFORMATION FOR SEQ ID NO:156: SEQUENCE CHARACTERISTICS: LENGTH: 153 amino acids TYPE: amino acid
STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein 486 546 549 (xi) SEQUENCE Met Glu Gin Asn Ile 1 Lys Leu Leu Arg Tyr Lys Ala Lys His 145 Leu Ser Leu Phe Gly Asp Arg Asp 130 Tyr Thr Gin Ala Lys Asp Leu 100 Phe Arg Arg 5 Leu Glu Phe Glu Gly Lys Arg Leu Ser DESCRIPTION: SEQ ID Phe Ser Leu Leu Ile 10 Leu Lys Leu Lys Lys Asn Leu Leu Phe Ile Lys Asp Ile Trp Ala 55 Ile Ser His Phe Leu 70 Leu Asn Gly Leu Glu 90 Lys Ser Asn Phe Tyr 105 Pro Ser Ala Leu Gly 120 His Glu Cys Phe Glu 135 Phe Phe Glu Glu 150 NO: 156: Gin Lys Leu Lys Phe Ile Ser Lys Asn Thr 75 Ile Leu Ala Pro Leu Phe Lys Ala 140 Tyr Met Ser Asn His Val Lys Pro Leu Lys Pro Lys Gin Ala Pro Gin Ile Ile INFORMATION FOR SEQ ID NO:157: SEQUENCE CHARACTERISTICS: LENGTH: 2627 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: L WO 98/21225 WO 981225PCTIUS97/2 1353 -299- NAME/KI~x: Coding Sequence LOCATION: 18.. .2582 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:157: AAAGACATGT GCAACCG ATG AAA TCT AAA AAA CTT TAT TTG OCT TTA ATC Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile ATA GOG OTT Ile Gly Val AOC GOT TTA Ser Gly Leu
TTA
Leu TTA GCO TTT TTA Leu Ala Phe Leu CTA TCT TCA TOO Leu Ser Ser Trp CTG GOT AAT Leu Gly Asn GTG GOG COT TTT GOG OTO TOG TTT GCC OCA CTC AAT AAA Val Oly Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys AAA TAT Lys Tyr OTT TTA Val Leu TTT 000 CAT CTT Phe Oly His Leu
TCA
Ser
AAO
Lys TTC ATT AAT TTA CCC TAT TTA GCA TGG Phe Ile Asn Leu Pro Tyr Leu Ala Trp TTC CTT TTA Phe Leu Leu
TAC
Tyr 65 ACT AAA AAC Thr Lys Asn
CCT
Pro 70 TTT ACA GAA ATC Phe Thr Glu Ile
GTT
Val 146 194 242 290 338 TTA GAA AAA ACT Leu Glu Lys Thr
TTA
Leu 000 CAT CTA TTA Gly His Leu Leu
GGC
Gly 85 ATT TTA TCT TTG Ile Leu Ser Leu CIC TTT Leu Phe TTA CAA TCT Leu Gin Ser TTO TTT TTA Leu Phe Leu 110 CTA TTA AAT CAA Leu Leu Asn Gin GAA ATC OGC AAC AOC OCO COT Oiu Ile Gly Asn Ser Ala Arg 105 COC CCT TTT ATA Arg Pro Phe Ile 000 Gly 115 OAT TTT 000 CTT Asp Phe Gly Leu
TAT
Tyr 120 GCG CTG ATA Ala Leu Ile ACO CTT Thr Leu 125 ATG OTA OTT ATT Met Val Val Ile TAT TTG ATT CTA Tyr Leu Ile Leu AAA CTA CCC CCT Lys Leu Pro Pro
AAA
Lys 140 AGC GTT TTT TAT Ser Val Phe Tyr TAT ATG AAC AAA Tyr Met Asn Lys
ACA
Thr 150 CAA AAC CTT TTA Gln Asn Leu Leu 434 482 530 GAG ATT TAC AAA Glu Ile Tyr Lys TGC TTA CAA GCC Cys Leu Gln Ala AGC CCT AAT TTT Ser Pro Asn Phe AGC CCA Ser Pro 170 AAA AAA GAG Lys Lys Glu
GGT
Gly 175 TTT GAA AAC ACC Phe Glu Asn Thr TCA GAT ATT CAA Ser Asp Ile Gln AAA AAA GAA Lys Lys Glu 185 578 ACC AAA AAC Thr Lys Asn 190 GAC AAA GAA AAA Asp Lys Glu Lys
GAA
Glu 195 AAC CGC AAA GAA Asn Arg Lys Glu
AAC
Asn 200 CCT ATT AAT Pro Ile Asn GAA AAC Glu Asn 205 CAC AAA ACC CCT His Lys Thr Pro GAA GAA CCG TTT Glu Glu Pro Phe GCG ATC CCT ACC Ala Ile Pro Thr
CCC
Pro 220 TAT AAC ACG ACT Tyr Asn Thr Thr
TTA
Leu 225 AAT GAT TCA GAG Asn Asp Ser Glu
CCG
Pro 230 CAA GAA GGC TTA Gln Glu Gly Leu
GTC
Val 235 674 722 770 CAA ATT TCC TCC Gln Ile Ser Ser CCC CCT ACC CAT Pro Pro Thr His ACC ATT TAC CCT Thr Ile Tyr Pro AAA AGA Lys Arg 250 AAC CGA TTT Asn Arg Phe ATT AAA CAA Ile Lys Gln 270 GAT TTG ACT AAC Asp Leu Thr Asn ACT AAC CCC CCT Thr Asn Pro Pro TTA AAA GAA Leu Lys Glu 265 AAA GAA ACT Lys Glu Thr GAA ACT AAA GAA Glu Thr Lys Glu
AGA
Arg 275 GAA CCC ACG CCT Glu Pro Thr Pro
ACA
Thr 280 CTT ACG Leu Thr 285 CCC ACC ACG CCC Pro Thr Thr Pro
AAA
Lys 290 CCT ATC ATG CCC Pro Ile Met Pro CTT GCA CCC ATA Leu Ala Pro Ile
ATA
Ile 300 GAA AAT GAC AAC Glu Asn Asp Asn ACA GAA AAC CAA Thr Glu Asn Gln
AAA
Lys 310 ACC CCC AAC CAC CCT Thr Pro Asn His Pro 315 AAA AAA GAA GAA Lys Lys Glu Glu CCA CAA GAA AAC Pro Gln Glu Asn CAA GAA GAA ATG ATA GAA Gln Glu Glu Met Ile Glu 1010 GGA AGG ATA Gly Arg Ile GAA GTG CAA Glu Val Gln 350
GAA
Glu 335 GAA ATG ATA AAG GAA AAT CTA AAA AAA Glu Met Ile Lys Glu Asn Leu Lys Lys GAA GAA AAA Glu Glu Lys 345 ACA AGC GCT Thr Ser Ala AAC GCT CCA AAC TTT AGC CCA GTA ACC Asn Ala Pro Asn Phe Ser Pro Val Thr
CCC
Pro 360 1058 1106 1154 1202 AAA AAA Lys Lys 365 CCC GTT ATG GTT Pro Val Met Val
AAA
Lys 370 GAA TTG AGC GAA Glu Leu Ser Glu AAA GAG ATA TTA Lys Glu Ile Leu
GAC
Asp 380 GGA TTG GAT TAT Gly Leu Asp Tyr GAA GTG CAA AAA Glu Val Gln Lys AAA GAT TAT GAG Lys Asp Tyr Glu
CTT
Leu 395 CCC ACC ACG CAA TTA TTG AAT GCG GTT TGT TTG AAA GAC ACT TCT TTA Pro Thr Thr Gln Leu Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu 1250 GAC GAA AAC GAG ATT GAC CAA AAA Asp Glu Asn Glu Ile Asp Gln Lys 415 ATC CAG Ile Gln 420 GAT CTA TTG AGC AAA CTG Asp Leu Leu Ser Lys Leu 425 1298 CGC ACC TTT Arg Thr Phe 430 AAA ATT GAT GGC Lys Ile Asp Gly ATT ATC CGC ACT Ile Ile Arg Thr
TAT
Tyr 440 TCA GGC CCT Ser Gly Pro 1346 ATT GTA Ile Val 445 ACC ACT TTT GAA TTC CGC CCA GCC CCT Thr Thr Phe Glu Phe Arg Pro Ala Pro 450
AAC
Asn 455 GTT AAG GTG AGT Val Lys Val Ser 1394
CGT
Arg 460 ATT TTA GGC TTG Ile Leu Gly Leu
AGC
Ser 465 GAT GAT TTA GCG Asp Asp Leu Ala ACT TTA TGC GCT Thr Leu Cys Ala
GAA
Glu 475 1442 1490 TCC ATC CGC ATT Ser Ile Arg Ile
CAA
Gln 480 GCC CCT ATT AAG Ala Pro Ile Lys AAA GAT GTC GTT Lys Asp Val Val GGC ATT Gly Ile 490 GAA ATC CCT Glu Ile Pro GAG AGC GAA Glu Ser Glu 510
AAC
Asn 495 AGC CAA AGC CAA Ser Gln Ser Gln ATT TAT TTA AGA Ile Tyr Leu Arg GAA ATT CTA Glu Ile Leu 505 CTA GCT TTA Leu Ala Leu 1538 1586 TTG TTT CAA AAA Leu Phe Gln Lys
TCC
Ser 515 AGC TCG CCC TTA Ser Ser Pro Leu
ACT
Thr 520 GGC AAA GAC ATT GTG GGT Gly Lys Asp Ile Val Gly 525 CCT TTC ATC ACG Pro Phe Ile Thr TTA AAA AAG CTC Leu Lys Lys Leu 1634
CCC
Pro 540 CAT TTG CTC ATC His Leu Leu Ile GGC ACG ACA GGA AGC GGT AAG AGC GTG Gly Thr Thr Gly Ser Gly Lys Ser Val 550
GGC
Gly 555 1682 GTG AAT GCG ATG Val Asn Ala Met
ATT
Ile 560 TTA TCC TTA CTT Leu Ser Leu Leu
TAT
Tyr 565 AAA AAC CCT CCC Lys Asn Pro Pro GAT CAA Asp Gln 570 1730 CTC AAA TTA Leu Lys Leu GCG GAT ATC Ala Asp Ile 590
GTG
Val 575 ATG ATC GAT CCC Met Ile Asp Pro
AAA
Lys 580 ATG GTA GAA TTT Met Val Glu Phe AGT ATT TAT Ser Ile Tyr 585 CCT AAA AAA Pro Lys Lys 1778 1826 CCT CAT TTG CTC Pro His Leu Leu CCC ATT ATC ACC Pro Ile Ile Thr
GAC
Asp 600 GCT ATT Ala Ile 605 GGG GCT TTG CAA Gly Ala Leu Gln GTG GCT AAA GAA Val Ala Lys Glu GAA CGC CGG TAT Glu Arg Arg Tyr 1874 1922
TCT
Ser 620 TTA ATG AGC GAA Leu Met Ser Glu AAG GTT AAA ACC Lys Val Lys Thr GAT TCT TAT AAT Asp Ser Tyr Asn CAA GCC CCA Gln Ala Pro ATT GAT GAA Ile Asp Glu TTT CCT ATC Phe Pro Ile 670 ACT AAC GCC GTT GAA GCG TTC CCC Ser Asn Gly Val Glu Ala Phe Pro 640 645 TAT TTG ATT Tyr Leu Ile GTG GTC Val Val 650 1970 2018 GCG GAT TTA ATG Ala Asp Leu Met
ATG
Met 660 ACA GGG GGC AAA Thr Gly Gly Lys GAA GCG GAG Glu Ala Giu 665 GGC TTA CAC Gly Leu Hi-s GCT AGA ATC GCT Ala Arg Ile Ala
CAA
Gin 675 ATG GGG CGC GCG Met Cly Arg Ala
AGC
Ser 680 2066 CTC ATT Leu Ile 685 ATT AAA Ile Lys 700 GTA GCG ACC CAA Val Ala Thr Gin CCA AGC GTG GAT Pro Ser Val Asp ACC AAC TTG Thr Asn Leu
CCT
Pro 705 TCA AGG GTG ACT Ser Arg Val Ser GTA ACC CCC TTG Val Thr Cly Leu GTA GGC ACT AAG Val Gly Thr Lys 715
TTT
Phe 710 2114 2162 2210 ATT GAT TCT AAA Ile Asp Ser Lys GTG ATT TTA GAC ACT GAT Vai Ile Leu Asp Thr Asp 720 -725 CCC CCG CAA AGC Cly Ala Gin Ser TTG TTA Leu Leu 730 GGA AGA GGC Cly Arg Gly CGC TTG CAT Arg Leu His 750 ATG CTC TTT ACC Met Leu Phe Thr
CCC
Pro 740 CCA GGA GCG AAC Pro Gly Ala Asn GGG TTA GTG Gly Leu Vai 745 AAA ATC GTG Lys Ile Val 2258 2306 GCC CCC TTT GCC Ala Pro Phe Ala GAA CAT GAA ATC Giu Asp Giu Ile
AAA
Lys 760 GAT TTT Asp Phe 765 ATT AAA GCC CAA Ile Lys Ala Gin GAA GTA CAA TAC Glu Val Gin Tyr AAA GAT TTC TTG Lys Asp Phe Leu
CTA
Leu 780 CAA GAA TCA CGC Giu Giu Ser Arg
ATG
Met 785 CCT TTA GAC ACC Pro Leu Asp Thr
CCT
Pro 790 AAT TAT CAA GGC Asn Tyr Gin Gly 2354 2402 2450 GAC ATT TTA GAA Asp Ile Leu Giu GCT AAA GCG GTG Ala Lys Ala Val TTA GAA AAA AAG Leu Glu Lys Lys ATC ACT Ile Thr 810 TCT ACG AGT Ser Thr Ser GCT ACC ATT Ala Thr Ile 830 AAC GCT AAA Asn Ala Ly 845
TTT
Phe 815 TTA CAA CGC CAA Leu Gin Arg Gin
TTA
Leu 820 AAA ATC GGC Lys Ile Gly TAC AAC CAA GCC Tyr Asn Gin Ala 825 ACT GAC GAA TTA Thr Asp Glu Leu GCT CAA GGC TTT Ala Gin Gly Phe
TTA
Leu 840 TCC CCA AGA Ser Pro Arg 2498 2546 2598 GGC AAC AGA Gly Asn Arg ATT TTG CAA. AAC Ile Leu Gin Asn TAGGCTTTGT TTTCAT WO 98121225 WO 98/ 225PCTfUS97/21353 -303- TGGATATTGG CAAAC2ATTAT TTTTGATTT INFORMATION FOR SEQ ID NO:158: SEQUENCE CHARACTERISTICS: LENGTH: 855 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 2627 (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:158: Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ilie Ile Gly Val Leu Leu Al a Arg Leu Tyr Gly Leu Phe Ile Pro 145 Cys Giu Glu Pro Leu 225 Pro Leu Lys Pro Lys 305 Phe Phe Ser Lys His As n Ile Ser 130 Tyr Leu Asn Lys Asn 210 Asn Pro Thr Glu Lys 290 Thr Leu Gly Phe Thr Leu Gin Gly 115 Tyr Met Gin Thr Glu 195 Glu Asp Thr Asn Arg 275 Pro Glu Thr Val1 Ile Lys Leu Gly 100 Asp Leu As n Al a Pro 180 As n Giu Ser His Pro 260 Glu Ile Asn Ser Phe Leu Pro 70 Ile Ile Gly Leu Thr 150 Ser Asp Lys Phe Pro 230 Thr Asn Thr Pro Lys 310 Ser Ala Pro 55 Phe Leu Gly Leu Phe 135 Gin Pro Ile Giu Leu 215 Gin Ile Pro Pro Thr 295 Thr Trp Ala 40 Tyr Thr Ser Asn Tyr 120 Lys As n As n Gin Asn 200 Ala Glu Tyr Pro Thr 280 Leu Pro Leu Leu Leu Giu Leu Ser 105 Al a Leu Leu Phe Lys 185 Pro Ile Gly Pro Leu 265 Lys Ala Asn Giy Asn Ala Ile Leu Ala Leu Pro Leu Ser 170 Lys Ile Pro Leu Lys 250 Lys Giu Pro His Asn Lys Trp Vali 75 Phe Arg Ile Pro Lys 155 Pro Giu Asn Thr Val1 235 Arg Glu Thr Ile Pro 315 Ser Lys Val Leu Leu Leu Thr Lys 140 Glu Lys Thr Glu Pro 220 Gin Asn Ile Leu Ile 300 Lys Gly Tyr Leu Glu Gin Phe Leu 125 Ser Ile Lys Lys Asn 205 Tyr Ile Arg Lys Thr 285 Giu Lys Val Gly Gly His Leu Leu Thr Leu Ser Leu Arg Pro Vai Val Phe Tyr Lys Gln 160 Giy Phe 175 Asp Lys Lys Thr Thr Thr Ser His 240 Asp Asp 255 Giu Thr Thr Thr Asp Asn Glu Asn 320 Pro Gin Oiu Asn Thr Gin Glu Oiu Met Ile Glu Gly Arg Ile Giu Giu 325 330 335 WO 98/21225 WO 9821225PCT/US97/21353- -304- Met Pro Val1 Gly 385 Len Asp Asp Gin Ser 465 Ala Gin Gin Gly Al a 545 Len Ile Len Gin Tyr 625 Gly Asp Ile Gin Pro 705 Ile Leu Phe Gin Ile Asn Lys 370 Giu Asn 
Gin Gly Phe 450 Asp Pro Ser Lys Asn 530 Gly Se r Asp Leu Ser 610 Lys Vali Leu Ala Arg 690 Ser Leu Phe Ala Lys Lys Phe 355 Gin Val1 Ala Lys Asp 435 Arg Asp Ile Gin Her 515 Pro Thr Leu Pro Thr 595 Val1 Val1 Gin Met Gin 675 Pro Arg Asp Thr Thr 755 Glu Asn ieu Lys Lys Giu Glu Lys Pro Ser Lys Cys 405 Gin Ile Ala Al a Gly 485 Ile Ser Ile Gly Tyr 565 Met Ile Lys Thr Phe 645 Thr Gly Val Ser Asp 725 Pro Asp Gin Val Giu Pro 390 Leu Asp Arg Pro Met 470 Lys Tyr Pro Thr Ser 550 Lys Val Ile Giu Ile 630 Pro Gly Arg Asp Phe 710 Gly Gly Gin Tyr Thr Asn 375 Lys Lys Leu Thr Asn 455 Thr Asp Leu Len Asp 535 Gly Asn Gin Thr Met 615 Asp Tyr Giy Ala Val1 695 Arg Ala Ala Ile Asp Pro 360 Lys Asp Asp Leu Tyr 440 Val1 Leu Val1 Arg Thr 520 Len Lys Pro Phe Asp 600 Gin Ser Len Lys Ser 680 Vai Val1 Gin Asn Lys 760 Lys 345 Thr Gin Tyr Thr Her 425 Ser Lys Cys Val1 Gin 505 Len Lys Ser Pro Ser 585 Pro Arg Tyr Ile Gin 665 Gly Thr Gly Ser Gly 745 Lys Asp Al a Le u Leu 395 Leu Leu Pro) Ser Giu 475 Ile Leu Len Leu Gly 555 Gin Tyr Lys Tyr Gin 635 Val1 Glu His Leu Lys 715 Leu Va I Val1 Leu Val Lys 365 Giy Thr Glu Thr Vai 445 Ile Ile Ile Ser Lys 525 His Asn Lys Asp Ile 605 Len Al a Asp Pro Ile 685 Lys Asp Arg Len Phe 765 Gin As n Val1 Asp Gin Gin 415 Lys Thr Gly Ile Asn 495 Len Ile Len Met Val1 575 Pro Al a Ser Ser Len 655 Ala Ala As n Lys Asp 735 Ala Lys Ser Al a Met Tyr Leu 400 Ile Ile Phe Len Gin 480 Her Phe Val1 Ile Ile 560 Met His Len Gin Asn 640 Al a Arg Thr Len Val1 720 Met Pro Ala Arg WO 98/21225 PCT/US97/21353 -305 770 775 Met Pro Leu Asp Thr Pro Asn Tyr Gin Gly Asp 785 790 795 Ala Lys Ala Val Ile Leu Glu Lys Lys Ile Thr Ile Leu Glu Ser Thr Ser 805 Lys Phe Leu 815 Gin Arg Gin Glu Leu Glu 835 Arg Glu Ile 850 Leu 820 Ala Ile Gly Tyr Asn 825 Ser Gin Ala Ala Thr Pro Arg Asn Ala 845 Ile Thr Asp 830 Lys Gly Asn Gin Gly Phe Leu Gin Asn INFORMATION FOR SEQ ID NO:159: SEQUENCE CHARACTERISTICS: LENGTH: 1986 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA 
(ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 56...1945 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:159: GGTGTCCTTA AACAGCAGGG TGAAAGAGAT TTTAAAAGAA AGCGCTCTGC ATTCT ATG Met 1 CAA GAT AGT Gln Asp Ser AAC ACT TAT Asn Thr Tyr
TTG
Leu CAT TTT AAG GTT His Phe Lys Val
AAT
Asn GAA GTG CAA GGG GTT TTA GAA Glu Val Gln Gly Val Leu Glu ACG AGC ATG GGC Thr Ser Met Gly
ATT
Ile 25 GTT AAA GAA ATG CTC CCT AAA GAC Val Lys Glu Met Leu Pro Lys Asp ACC AAA Thr Lys AGA GAA ATC AAA Arg Glu Ile Lys ATC GGC TTG TTA AAA AAC TTC ATT TTA GCC Ile Gly Leu Leu Lys Asn Phe Ile Leu Ala 40 GTG AGC ATG TTT TTT AAA GGC AGA GAA GAT Val Ser Met Phe Phe Lys Gly Arg Glu Asp
AAT
Asn TCG CAT GTC GCT Ser His Val Ala 250 TTA AGA TTA ACG Leu Arg Leu Thr TTA AGG GAT AAC AAT ACG ATT AAG CTA Leu Arg Asp Asn Asn Thr Ile Lys Leu GTG GAA Val Glu WO 98/21225 PCT/US97/21353 -306- AAT CCG TCA TIA UAu AAfT AGC CCT TTA GCG CAA AAA GCG Asn Pro Ser Leu Glu Asn Ser Pro Leu Ala Gin Lys Ala AAA GAA ATT Lys Glu Ile 100 GCG GAA GTT Ala Glu Val 115 TCT AAA AGT TTG Ser Lys Ser Leu TAT GGG GTG GAT Tyr Gly Val Asp 120 TAT TAT AGG AAA Tyr Tyr Arg Lys
ATG
Met 110
TTG
Leu ATO AAA AAT Met Lys Asn CCT AAT GGG Pro Asn Gly AAT GAG AAC Asn Glu Asn 394 442 CTT TTA CCT Leu Leu Pro
GCT
Ala 130 CAA GAG GTT GTA Gln Glu Val Val
GGG
Gly 135 GCT TTG ATG ATT Ala Leu Met Ile ATT TCC ATT GAC Ile Ser Ile Asp
AGC
Ser 145 TTC AGC AAT GAA Phe Ser Asn Glu ACT AAA AAC AGG Thr Lys Asn Arg GAT TTA TTT TTA Asp Leu Phe Leu ATT GGC Ile Gly 160 ACT AAA GGT Thr Lys Gly CCT ATC GCA Pro Ile Ala 180 GTG CTT TTG AGC Val Leu Leu Ser
GCG
Ala 170 AAT AAG ACT TTG Asn Lys Ser Leu CAA GAC AAA Gin Asp Lys 175 AAC GAA GTG Asn Glu Val GAA ATT TAT AAG Glu Ile Tyr Lys GTG CCT AAA GCC Val Pro Lys Ala
ACC
Thr 190 ATG GCT Met Ala 195 ATT TTA GAA AAC Ile Leu Glu Asn TCT AAA GCG ACT Ser Lys Ala Thr GAA TAC TTA GAT Glu Tyr Leu Asp
CCC
Pro 210 TTT AGC CAT AAG GAA AAT TTT TTA GCC Phe Ser His Lys Glu Asn Phe Leu Ala 215 GAA ACC TTT AAA Glu Thr Phe Lys
ATG
Met 225 730 CTA GGC AAA ACA Leu Gly Lys Thr
GAA
Glu 230 AGT AAA GAC AAT Ser Lys Asp Asn
CTT
Leu 235 AAT TGG ATG ATC Asn Trp Met Ile GCT TTA Ala Leu 240 ATC ATT GAA Ile Ile Glu GTG GTG ATC Vai Val Ile 260
AAA
Lys 245 GAC AAG GTC TAT Asp Lys Val Tyr
GAG
Glu 250 CAA GTA GGC TCG Gln Val Gly Ser GTG CGT TTT Val Arg Phe 255 ATT ATA GCG Ile Ile Ala ATA GCG AGC GCA Ile Ala Ser Ala ATG GTG TTA GCC Met Val Leu Ala
TTG
Leu 270 ATC ACT Ile Thr 275 CTC TTA ATG CGA GCG ATC GTG AGC AGT Leu Leu Met Arg Ala Ile Val Ser Ser 280 TTG GAA GCC GTT Leu Glu Ala Val AGC ACC TTG TCT Ser Thr Leu Ser
CAT
His 295 TTC TTT AAA TTA Phe Phe Lys Leu
TTG
Leu 300 AAC AAT CAA GCC As Asn Gin Ala WO 98/21225 WO 98/1225PCTIUS97/2 1353 TCT AGC GGT ATT AAA 'iiG ATT GAA GCG Ser Ser Gly Ile Lys Leu Ile Giu Ala 310 -307-
AAA
Lys 315 TCC AAT GAC GAG Ser Asn Asp Giu TTA GC Leu Giy 320 1018 CGC ATG CAA Arg Met Gin ATG CAA GAA Met Gin Giu 340
ACA
Thr 325 GCG ATC PAT AA Ala Ile Asn Lys ATC TTG CAA ACC Ile Leu Gin Thr CAA AAA ATC Gin Lys Ile 335 GTG GTT TCA Val Val Ser 1066 1114 GAC AGG CAA GCC Asp Arg Gin Aia
GTC
Val 345 CAA GAC ACC ATT Gln Asp Thr Ile
AAA
Lys 350 GAT GTG AAA GCA GGG AAT Asp Val Lys Ala Gly Asn
TTT
Phe 360 C GTG CGC ATC Ala Val Arg Ilie GCT GAG CCC GCA Ala Giu Pro Ala CCT GAT TTG AAA Pro Asp Leu L~ys
GAA
Glu 375 TTG AGG GAC GCG Leu Arg Asp Ala
CTA
Leu 380 AAT GGG ATC ATG Asn Gly Ile Met
GAT
Asp 385 1162 1210 1258 TAT TTG CAA GAA Tyr Leu Gln Glu GTA GGG ACT CAC Val Gly Thr His CCA AGC ATT TTC Pro Ser Ile Phe AAA ATC Lys Ile 400 TTT GAA AGC Phe Glu Ser TCG GGT AGG Ser Gly Arg 420
TAT
Tyr 405 TCT GGT TTG GAT Ser Gly Leu Asp
TTT
Phe 410 AGA GGC CGG ATC Arg Gly Arg Ile CAA AAC GCT Gln Asn Ala 415 GAA ATC CAA Glu Ile Gln 1306 1354 GTG GAA CTG GTT Val Glu Leu Val
ACT
Thr 425 AAC GCT TTA GGG Asn Ala Leu Gly
CAA
Gln 430 AAA ATG Lys Met 435 CTA GAA ACT TCG Leu Glu Thr Ser AAT TTT GCC AAA Asn Phe Ala Lys TTA GCG AAC GAT Leu Ala Asn Asp
AGC
Ser 450 GCG PAT TTA AAA Ala Asn Leu Lys TGC GTG CAA AAT Cys Vai Gin Asn
TTA
Leu 460 GAA AAA GCT TCA Giu Lys Ala Ser 1402 1450 1498 TCC CPA CAC AAA Ser Gin His Lys TTG ATG GPA ACT Leu Met Giu Thr AAA ACG ATA GAA Lys Thr Ile Giu PAT ATC Asn Ile 480 ACC ACT TCC Thr Thr Ser CPA GGG CA Gin Gly Gin 500
ATT
Ile 485 CPA GGC GTG AGC Gin Gly Val Ser
TCT
Ser 490 CPA AGT GPA GCC Gin Ser Giu Ala ATG ATT GA Met Ile Giu 495 GAT ATT GCT Asp Ile Ala 1546 1594 GAC ATT AAA ACC Asp Ile Lys Ser
ATT
Ile 505 GTA GPA ATC ATT Val Gu Ilie Ile
AGA
Arg 510 GAT CA Asp Gin 515 ACC PAT CTT TTA Thr Asn Leu Leu TTA PAC GCC GCT Leu Asn Ala Ala GAA GCC OCA AGG Giu Ala Ala Arg 1642
GCC
Ala 530 GGC GAG CAT GGC Gly Glu His Gly
AUA
Arg 535 GGC TTT GCG GTG Gly Phe Ala Val
GTG
Val 540 GCT GAT GAG GTA AGA Ala Asp Glu Val Arg 1690 1738 AAG CTC GCT GAA Lys Leu Ala Glu ACG CAA AAA TCG Thr Gin Lys Ser AGC GAG ATT GAA Ser Glu Ile Glu GCC AAT Ala Asn 560 ATC AAT ATT Ile Asn Ile AAC CAG GTT Asn Gin Val 580
TTA
Leu 565 GTG CAA AGC ATT Val Gin Ser Ile GAC ACG AGC GAA Asp Thr Ser Glu AGC ATT AAA Ser Ile Lys 575 GAA GCC TTA Glu Ala Leu 1786 1834 AAA GAA GTG GAA Lys Glu Val Glu
GAA
Glu 585 ATC AAC GCT TCT Ile Asn Ala Ser
ATT
Ile 590 AGA TCG Arg Ser 595 GTT ACT GAG GGC Val Thr Glu Gly CTA AAA ATC GCT Leu Lys Ile Ala GAT TCT TTA GAA Asp Ser Leu Glu 1882 1930
ATC
Ile 610 AGT CAA GAA ATT Ser Gin Glu Ile
GAC
Asp 615 AAA GTT TCT AAC Lys Val Ser Asn
GAT
Asp 620 ATT TTA GAA GAT Ile Leu Glu Asp AAT AAA AAG CAG Asn Lys Lys Gin
TTT
Phe 630 TAATGCTCAT TCATATTTGC TGCTCAGTGG ATAACCTCTA T 1986 1986 INFORMATION FOR SEQ ID NO:160: SEQUENCE CHARACTERISTICS: LENGTH: 630 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:160: Gin Asp Ser Leu His Phe Lys Val Asn Val Glu Val Gin Gly Val Leu Glu Asn Thr Tyr Thr Asp Thr Lys Arg Glu Ala Asn Ser His Val Asp Leu Arg Leu Thr Glu Asn Pro Ser Leu Asn Lys Glu Ile Ser Ser Met Gly Ile 25 Ile Lys Ile Gly 40 Ala Gly Val Ser 55 Lys Glu Met Leu Leu Lys Met Phe Phe Asn Lys Leu Pro Lys Phe Ile Leu Gly Arg Glu Leu Leu Arg Asp Asn Asn 70 Glu Asn Ser Pro Leu Ala Thr Ile Lys Leu Tyr Gin Lys Ala Arg Lys Met Met Lys Pro Asn Lys Ser Leu Gly WO 98/21225 PCT/US97/21353 309- Gly Asn Ser 145 Gly Lys Val Asp Met 225 Leu Phe Ala Val Asn 305 Gly Ile Ser Ala Asp 385 Ile Ala Gin Asp Asn 465 Ile Glu Ala Arg Ala Ala 130 Phe Thr Pro Met Pro 210 Leu Ile Val Ile Ser 290 Ser Arg Met Asp Ser 370 Tyr Phe Ser Lys Ser 450 Ser Thr Gin Asp Ala 530 Val Glu Asn Gly Ala 180 Ile Ser Lys Glu Ile 260 Leu Thr Gly Gin Glu 340 Lys Asp Gin Ser Arg 420 Leu Asn His Ser Gin 500 Thr Glu Tyr Val Glu Lys 165 Glu Leu His Thr Lys 245 Ile Leu Leu Ile Thr 325 Asp Ala Leu Glu Tyr 405 Val Glu Leu Lys Ile 485 Asp Asn His Gly Val Ile 150 Val Ile Glu Lys Glu 230 Asp Ala Met Ser Lys 310 Ala Arg Gly Lys Ser 390 Ser Glu Thr Lys Ser 470 Gin Ile Leu Gly Ile Leu Asn Ser Ser 185 Ser Phe Asp Tyr Ile 265 Ile Phe Glu Lys Val 345 Ala Arg Thr Asp Thr 425 Asn Val Glu Ser Ile 505 Leu Phe Leu Met Arg Ala 170 Val Lys Leu Asn Glu 250 Met Val Lys Ala Asn 330 Gin Val Asp His Phe 410 Asn Phe Gin Thr Ser 490 Val Asn Ala Leu Ile Ser 155 Asn Pro Ala Ala Leu 235 Gin Val Ser Leu Lys 315 Ile Asp Arg Ala Met 395 Arg Ala Ala Asn Ser 475 Gin Glu Ala Val Pro Phe 140 Asp Lys Lys Thr Val 220 Asn Val Leu Ser Leu 300 Ser Leu Thr Ile Leu 380 Pro Gly Leu Lys Leu 460 Lys Ser Ile Ala Val 540 Leu 125 Ile Leu Ser Ala Leu 205 Glu Trp Gly Ala Arg 285 Asn Asn Gin Ile Thr 365 Asn Ser 
Arg Gly Asp 445 Glu Thr Glu Ile Ile 525 Ala 110 Leu Asn Ser Ile Phe Leu Leu Gin 175 Thr Asn 190 Glu Tyr Thr Phe Met Ile Ser Val 255 Leu Ile 270 Leu Glu Asn Gin Asp Glu Thr Gin 335 Lys Val 350 Ala Glu Gly Ile Ile Phe Ile Gin 415 Gin Glu 430 Leu Ala Lys Ala Ile Glu Ala Met 495 Arg Asp 510 Glu Ala Asp Glu Glu Asp Ile 160 Asp Glu Leu Lys Ala 240 Arg Ile Ala Ala Leu 320 Lys Val Pro Met Lys 400 Asn Ile Asn Ser Asn 480 Ile Ile Ala Val WO 98/21225 PCT/US97/21353 -310- Arg 545 Asn Lys Leu Ala Glu Thr Gin Lys Ser Leu 555 Ser Glu Ile Gl.u Ala 560 Ile Asn Ile Gin Ser Ile Lys Asn Gin Leu Arg Ser 595 Glu Ile Ser Val 580 Val Glu Val Glu Glu 585 Leu Ser Asp Thr 570 Ile Asn Ala Lys Ile Ala Ser Glu Ser Ile 575 Thr Glu Gly Asn 600 Lys Ser Ile Glu Ala 590 Ser Asp Ser Leu 605 Ile Leu Glu Asp Gin Glu Ile 610 Val Asn 625 Asp 615 Val Ser Asn Asp 620 Lys Lys Gin Phe INFORMATION FOR SEQ ID NO:161: SEQUENCE CHARACTERISTICS: LENGTH: 1758 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 8...1702 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:161: GAGATAA ATG ATG TTT TCT TCA ATG TTT GCT TCG TTG GGG ACT CGT ATC Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg Ile
ATG
Met CTG GTC GTG TTA GCC GCT CTT TTA GGT TTA GGG GGG CTT TTT Leu Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe
ATT
Ile GGT TTT GTA AAG GTT ATG CAA AAA GAT GTG TTA GCG CAA CTC Gly Phe Val Lys Val Met Gln Lys Asp Val Leu Ala Gln Leu ATG GAG Met Glu CAT TTA GAA ACC His Leu Glu Thr ATG ACA AAA ATT Met Thr Lys Ile GAC AAT GCT ACT Asp Asn Ala Thr GGG CAA TAC AAA AAG CGT GAA AAA ACG Gly Gln Tyr Lys Lys Arg Glu Lys Thr CTC GCT TAC Leu Ala Tyr AAA AAT TTT Lys Asn Phe 193 241 289 ATT GAA CAG GGC ATT CAT GAG TAT Ile Glu Gln Gly Ile His Glu Tyr
TAC
Tyr GCA AGA AAA ATG GCG TTA GAT TAT TTC AAA CGC ATC Ala Arg Lys Met Ala Leu Asp Tyr Phe Lys Arg Ile 85
AAC GAC GAT AAG GGC Asn Asp Asp Lys Gly ATT TAT ATG GTG Ile Tyr Met Val GTG GTG Val Val 105 GAT AAA AAC Asp Lys Asn GTG GTA TTG TTT GAT CCG GTC AAT CCT Val Val Leu Phe Asp Pro Val Asn Pro 115 ACC GTA GNC CAA Thr Val Xaa Gln TCA GGG Ser Gly 125 CTT GAC GCT CAG AGC GTT GAT GGG Leu Asp Ala Gln Ser Val Asp Gly
GTG
Val 135 TAT TAT GTT AGG Tyr Tyr Val Arg GGG TAT TTG Gly Tyr Leu 140 ATG CCT AAA Met Pro Lys GAG GCG GCC AAA AAA GGG GGA Glu Ala Ala Lys Lys Gly Gly 145
GGC
Gly 150 TAC ACT TAT TAT Tyr Thr Tyr Tyr
AAA
Lys 155 TAC GAT Tyr Asp 160 GGA GGC GTA CCG GAG AAA AAA TTC GCC Gly Gly Val Pro Glu Lys Lys Phe Ala 165 TCG CAT TAT GAT Ser His Tyr Asp
GAA
Glu 175
AAC
Asn GTT TCT CAA ATG Val Ser Gln Met ACA GAA AAT AAA Thr Glu Asn Lys 195 ATC GCA ACG ACT Ile Ala Thr Thr
TCC
Ser 185 TAT TAC ACT GAC Tyr Tyr Thr Asp GCG ATC AAA GAA Ala Ile Lys Glu GTG AAT AAG GTT Val Asn Lys Val TTT GAT Phe Asp 205 GAA AAC ACC Glu Asn Thr CTA GTG GTT Leu Val Val 225 AAA TTA TTC CTT Lys Leu Phe Leu
TGG
Trp 215 ATA CTG ACA GCG Ile Leu Thr Ala ACG ATA GCG Thr Ile Ala 220 GTG AAA CGC Val Lys Arg TTG ACG CTC ATA Leu Thr Leu Ile
TAC
Tyr 230 GCT AAA TTA AGG Ala Lys Leu Arg
ATC
Ile 235 ATT GAT Ile Asp 240 GAA CTG GTC CTT Glu Leu Val Leu ATC AAC GCT TTT Ile Asn Ala Phe CGT GGG GAT AAG Arg Gly Asp Lys TTG AGA GCC AAA Leu Arg Ala Lys GAT GTG GGT GAT Asp Val Gly Asp
CGC
Arg 265 AAC GAT GAA ATC Asn Asp Glu Ile
TCG
Ser 270 CAA GTG GGC CGT Gln Val Gly Arg ATC AAT TTG TTT Ile Asn Leu Phe
GTG
Val 280 GAA AAC GCC CGC Glu Asn Ala Arg TTG ATT Leu Ile 285 ATG GAA GAG Met Glu Glu AAA TTA GTC Lys Leu Val 305 AAA GGG ATT TCC Lys Gly Ile Ser CTC AAT AAA ACT Leu Asn Lys Thr TCA ATG GAT Ser Met Asp 300 AAA GAT TCC Lys Asp Ser CAA ATC ACG CAA Gin Ile Thr Gin
GAA
Glu 310 ACC CAA AAG AGC Thr Gln Lys Ser
ATG
Met 315 t.
WO 98/21225 PCTIUS97/21353 -312- TCA ACC Ser Thr 320 ACC CTA AAT TUC Thr Leu Asn Ser
GTG
Val 325 AAA AAT AAA GCC Lys Asn Lys Ala GAT ATA GCG AGC Asp Ile Ala Ser 1009
ATG
Met 335 ATG AAT GCT TCC Met Asn Ala Ser
ATA
Ile 340 GAG CAA TCT CAA GGG TTA AGG AAG CGT Glu Gln Ser Gln Gly Leu Arg Lys Arg 345
TTG
Leu 350 1057 ATT GAA ACG CAA Ile Glu Thr Gin CTG GTC AAA GAG Leu Val Lys Glu AAG GAT GCG ATC Lys Asp Ala Ile GGG GAT Gly Asp 365 1105 TTA TTT TCT Leu Phe Ser AGC AAA GTG Ser Lys Val 385
CAA
Gin 370 ATC ACA GAG AGC Ile Thr Glu Ser CAC ACT GAA GAG His Thr Glu Glu GAA CTC TCT Glu Leu Ser 380 AAA TCC ATT Lys Ser Ile 1153 1201 GAG CAG CTA AGC Glu Gin Leu Ser
CGT
Arg 390 AAC GCT GAT GAT Asn Ala Asp Asp
GTC
Val 395 CTG GAT Leu Asp 400 ATT ATC AAT GAT Ile Ile Asn Asp GCC GAT CAA ACG Ala Asp Gin Thr TTA TTA GCC CTA Leu Leu Ala Leu 1249 1297
AAC
Asn 415 GCT GCT ATT GAA Ala Ala Ile Glu GCA AGG GCT GGC Ala Arg Ala Gly
GAG
Glu 425 CAT GGC AGA GGC His Gly Arg Gly
TTT
Phe 430 GCG GTG GTG GCT Ala Val Val Ala
GAT
Asp 435 GAA GTT AGG AAT TTA GCC GGG CGC ACT Glu Val Arg Asn Leu Ala Gly Arg Thr 440 CAA AAG Gin Lys 445 1345 TCT TTA GCC Ser Leu Ala AAT GCC GTG Asn Ala Val 465
GAA
Glu 450 ATC AAT TCC ACT Ile Asn Ser Thr
ATC
Ile 455 ATG GTG ATT GTC Met Val Ile Val CAA GAA ATC Gin Glu Ile 460 ATG GAG CGT Met Glu Arg 1393 AGT TCG CAA ATG AAT CTC AAT TCG CAA Ser Ser Gin Met Asn Leu Asn Ser Gin 470
AAA
Lys 475 1441 TTG AGC Leu Ser 480 GAT ATG AGT AAA Asp Met Ser Lys
AGC
Ser 485 GTG CAA GAA ACT Val Gin Glu Thr GAA AAA ATG AGT Glu Lys Met Ser
TCT
Ser 495 AAT TTA AGC TCA Asn Leu Ser Ser GTG TCA GAC AGC Val Ser Asp Ser
AAT
Asn 505 CAA AGC ATG GAC Gln Ser Met Asp
GAT
Asp 510 1489 1537 1585 1633 TAC GCC AAA TCC Tyr Ala Lys Ser
GGA
Gly 515 CAC CAA ATT GAA His Gln Ile Glu
GTT
Val 520 ATG GTA AGC GAT Met Val Ser Asp TTT GCA Phe Ala 525 GAG GTG GAA Glu Val Glu GTG GCT TCT AAG Val Ala Ser Lys
ACT
Thr 535 TTA GCG GAT TCT TCA GAT ATT Leu Ala Asp Ser Ser Asp Ile TTA AAC ATC GCT ACC CAT GTG AGT GGA ACG ACC ATG AAT TTA GAC AAA Leu Asn Ile Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys 545 550 555 CAA GTG AAT TTG TTT AAA ACT TAATCAGGGG GAGTTTATTA AAAAAGGGTT GGAT Gln Val Asn Leu Phe Lys Thr 560 565 TGTTAAAAGT TTCTGTGATC AC 1682 1736 1758
INFORMATION FOR SEQ ID NO:162:
SEQUENCE CHARACTERISTICS: LENGTH: 565 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:162:
Met Met Phe Ser Ser Met Phe Ala Ser Leu Val Lys Thr Ile Thr Lys Phe Gln 130 Lys Gly Gln Asn Thr 210 Leu Leu Ala Leu Val Gly Ile Ala Gly Asp 115 Ser Lys Val Met Lys 195 Lys Thr Val Lys Ala Gln Tyr Gln Lys Ile Val Asp Gly Glu 165 Ile Ile Phe Ile Lys 245 Asp Leu Leu Glu Glu Asp Val 105 Thr Tyr Tyr Ala Ser 185 Val Leu Leu Phe Arg 10 Gly Ala Lys Tyr Tyr 90 Val Val Val Tyr Tyr 170 Tyr Asn Thr Arg Ser 250 Asn Gly Gly Gln Thr Tyr Phe Asp Xaa Arg Lys 155 Ser Tyr Lys Ala Ile 235 Arg Asp Arg Phe Met Ala Asn Arg Asn Ser 125 Tyr Pro Tyr Asp Phe 205 Ile Lys Asp Ile Met Gly His Met Asp Asn Val Leu Glu Tyr Glu 175 Asn Glu Leu Ile Asp 255 Gln Leu Phe Leu Thr Asn Asp Val Asp Ala Asp 160 Val Thr Asn Val Asp 240 Leu Val 265 Gly Glu Val 305 Thr Asn Thr Ser Val 385 Ile Ala Val Ala Val 465 Asp Leu Lys Glu Ile 545 Asn Gly 275 Lys Ile Asn Ser Gly 355 Ile Gln Asn Glu Asp 435 Ile Ser Ser Ser Gly 515 Val Thr Phe Asn Ile Gln Val 325 Glu Val Glu Ser Ile 405 Ala Val Ser Met Ser 485 Val Gln Ser Val Thr 565 Val 280 Leu Gln Lys Gln Ser 360 His Ala Gln Gly Leu 440 Met Asn Glu Ser Val 520 Leu Thr Asn Lys Ser Thr 330 Leu Asp Glu Asp Asn 410 His Gly Ile Gln Tyr 490 Gln Val Asp Met Ala Thr Met 315 Asp Arg Ala Glu Val 395 Leu Gly Arg Val Lys 475 Glu Ser Ser Ser Asn 555 Leu 285 Met Asp Ala Arg Gly 365 Leu Ser Ala Gly Gln 445 Glu Glu Met Asp Phe 525 Asp Asp Met Lys Ser Met 335 Ile
Leu Ser Leu Asn 415 Ala Ser Asn Leu Ser 495 Tyr Glu Leu Gln Glu Leu Thr 320 Met Glu Phe Lys Asp 400 Ala Val Leu Ala Ser 480 Asn Ala Val Asn Val 560
INFORMATION FOR SEQ ID NO:163:
SEQUENCE CHARACTERISTICS: LENGTH: 686 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 16...660 OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:163:
TATAAGGTTG CTCTC ATG AAA AAA CCC TAT AGG AAG ATT TCT GAT TAT GCG Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala ATC GTG GGT Ile Val Gly GGT TTG AGC GCG Gly Leu Ser Ala GTG ATG GTG AGC Val Met Val Ser
ATT
Ile GTG GGG TGT Val Gly Cys AAG AGC Lys Ser AAT GCT GAT GAC AAA CCA AAA GAG CAA AGC TCT TTA ACT CAA Asn Ala Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln
AGC
Ser GTT CAA AAA GGC Val Gln Lys Gly
GCG
Ala 50 TTT GTG ATT TTA Phe Val Ile Leu
GAA
Glu 55 GAG CAA AAG GAT Glu Gln Lys Asp
AAA
Lys TCT TAC AAG GTT Ser Tyr Lys Val GAA GAA TAC CCC Glu Glu Tyr Pro
AGC
Ser TCA AGA ACC CAC Ser Arg Thr His ATT ATA Ile Ile GTG CGC GAT Val Arg Asp CAA GGC AAT GAA Gln Gly Asn Glu GTG TTA AGC AAT Val Leu Ser Asn GAA GAG ATT Glu Glu Ile GGC AG ACC Gly Thr Ser CAA AAG CTC ATC AAA GAA GAA Gln Lys Leu Ile Lys Glu Glu GCT AAA ATT GAT Ala Lys Ile Asp
AAC
Asn 105 AAG CTT Lys Leu 110 GTC CAG CCT AAT AAT GGA GGG ACT AAT Val Gln Pro Asn Asn Gly Gly Ser Asn 115 GGC TCA GGC TTT Gly Ser Gly Phe
GGC
Gly 125
ACT
Ser TTG GGG AGC GCG Leu Gly Ser Ala
ATT
Ile 130
AAG
Lys TTA GGG AGC GCG Leu Gly Ser Ala GGG GCG ATT TTA Gly Ala Ile Leu GGG Gly 140
AAC
Asn TAT ATT GGT Tyr Ile Gly
AAT
Asn 145 CTT TTC AAT Leu Phe Asn
AAC
Asn 150 AAT TAC CAG Asn Tyr Gln
CAA
Gln 155 GCC CAA CG Ala Gln Arg TCC TTT TCT Ser Phe Ser 175 TAC AAA TCC CCA Tyr Lys Ser Pro
CAA
Gln 165 GCT TAC CAA CGC Ala Tyr Gln Arg TCT CAA AAT Ser Gln Asn 170 GGA GCG ACT Gly Ala Ser AAA ACT GCG CCC ACT GCT TCA AGC ATG Lys Ser Ala Pro Ser Ala Ser Ser Met
GC
Gly 185 AAG GGA Lys Gly 190 CAG AGC GGG TTT Gln Ser Gly Phe GGC TCT ACT AGG Gly Ser Ser Arg ACT AGT TCA CCG Thr Ser Ser Pro GCG GTA AGC TCT Ucu ACA AGG GGC TTT AAC TCA TAATTTAATT GATTCAAGGC Ala Val Ser Ser Gly Thr Arg Gly Phe Asn Ser 205 210 215
TAAAAA
INFORMATION FOR SEQ ID NO:164:
(i) SEQUENCE CHARACTERISTICS: LENGTH: 215 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:164:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp 1 Leu Asp Gly Val Gln Lys Pro Ala Asn 145 Tyr Ser Gly Gly Ser Asp Ala Glu Gly Glu Asn Ile 130 Lys Lys Ala Phe Thr 210 Ala Lys Phe Glu Asn Glu Asn 115 Leu Leu Ser Pro Phe 195 Arg Met Glu Leu Ser 70 Val Lys Ser Ala Asn 150 Ala Ser Ser Asn Val Gln Glu Ser Leu Ile Asn Ala 135 Pro Tyr Ser Arg Ser 215 Ile Ser Gln Thr Asn Asn 105 Gly Ala Tyr Arg Gly 185 Thr Tyr Gly Ser Asp Ile 75 Glu Thr Gly Leu Gln 155 Gln Ala Ser Ile Lys Ser Ser Val Gln Lys Gly 125 Ser Ala Ser Lys Ala 205 Gly Asn Gln Lys Asp Leu Val Gly Ile Arg Ser 175 Gln Ser Gly Ala Lys Val Leu Ile Gln Ser Gly Thr 160 Lys Ser Ser
INFORMATION FOR SEQ ID NO:165:
SEQUENCE CHARACTERISTICS: LENGTH: 8748 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: .8694 OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:165:
AGAGGGTAGC ATTTA ATG AAA AAG TTT AAA AAG AAA CCA AAA AGT ATC AAA Met Lys Lys Phe Lys Lys Lys Pro Lys Ser Ile Lys CGA TCG CAT Arg Ser His CAA AAT CAA AAA Gln Asn Gln Lys
ACA
Thr 20 ATC TTA AAG CGT Ile Leu Lys Arg
CCT
Pro TTA TGG CTT Leu Trp Leu ATG CCT Met Pro TTA CTC ATC AGC GGG TTT GCT AGT GGG GTG TAT GCG AAT AAT Leu Leu Ile Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn
CTG
Leu TGG GAT TTG TTA Trp Asp Leu Leu CCA AAA GTG GGG Pro Lys Val Gly GGT GAG Gly Glu 55 TAT GTG CAT Tyr Val His
TGG
Trp GTT AAG GGC AGT Val Lys Gly Ser
CAG
Gln TAT TGT GCA TGG TGG GAA TTT GCT GGG Tyr Cys Ala Trp Trp Glu Phe Ala Gly 70 TGT TTA Cys Leu AAG AAT GTA Lys Asn Val
TGG
Trp GGG GCA AAT CAT AAA GGC TAT GAT GCT Gly Ala Asn His Lys Gly Tyr Asp Ala 85 GGA AAC GCC Gly Asn Ala GTG GGT AGT Val Gly Ser GCT AAC TAT Ala Asn Tyr TTG TCT TCT CAA Leu Ser Ser Gln
AAC
Asn 100 TAT CAA GCT ATT Tyr Gln Ala Ile
TCG
Ser 105 339 GGG AAT Gly Asn 110 GAA ACG GGG ACT Glu Thr Gly Thr
TAT
Tyr 115 AGT TTA AGC GGT Ser Leu Ser Gly ACC AAT TAT GTT Thr Asn Tyr Val
GGG
Gly 125 GGC AAT CTC ACG ATC AAT CTA GGC AAT AGC GTT GTT TTA GAT Gly Asn Leu Thr Ile Asn Leu Gly Asn Ser Val Val Leu Asp
TTA
Leu AGC GGT TCT AAT AGT TTC ACT TCG TAT CAA GGT TAT AAT CAA Ser Gly Ser Asn Ser Phe Thr Ser Tyr Gln Gly Tyr Asn Gln 140 GGC AAA Gly Lys 155 GAT GAT GTA Asp Asp Val TTT ACG GTT GGC GCA ATC AAT TTA AAC Phe Thr Val Gly Ala Ile Asn Leu Asn GGC ACT TTA Gly Thr Leu 170 GAA GTG GGT Glu Val Gly 175 AAT CG1I GTG GGA Asn Arg Val Gly
TCG
Ser 180 GGA GCT GGC ACG Gly Ala Gly Thr
CAC
His 185 ACC GGC ACA Thr Gly Thr 579 GCC ACT Ala Thr 190 TTA AAC TTG AAC Leu Asn Leu Asn
GCT
Ala 195 AAT AAG GTC AAT Asn Lys Val Asn AAT TCC AAT ATC Asn Ser Asn Ile
AAC
Asn 205 GCG TAT AAA ACT Ala Tyr Lys Thr CAA GTG AAT ATA Gln Val Asn Ile
GGC
Gly 215 AAC GCT AAC AGC Asn Ala Asn Ser
GTT
Val 220 627 675 723 ATT ACC ATT GGT Ile Thr Ile Gly GTT TCT TTG AGT Val Ser Leu Ser GAT GTT TGC AGT Asp Val Cys Ser TCT TTA Ser Leu 235 GCT AGC GTT Ala Ser Val TCT TTT AAA Ser Phe Lys 255 GGG Gly 240 ATA GGG GCT AAT Ile Gly Ala Asn TCC ACT TCT GGG Ser Thr Ser Gly CCT AGC TAT Pro Ser Tyr 250 GGG ACG ACT AAC Gly Thr Thr Asn
GCT
Ala 260 ACT AAC ACG GCG TTT AGT AAT GCA Thr Asn Thr Ala Phe Ser Asn Ala 265 AGC GGC Ser Gly 270 AGT TTC ACT TTT Ser Phe Thr Phe GAG AAC GCC ACT TTT AGC GGG GCG AAA Glu Asn Ala Thr Phe Ser Gly Ala Lys 280
TGG
Trp 285 AAT GGG GGG ACT Asn Gly Gly Thr ACC TTT AAT AAA Thr Phe Asn Lys
GAG
Glu 295 TTT AGC OCT ACC Phe Ser Ala Thr AAC ACC 0CC TTT Asn Thr Ala Phe AGC GOT AGT TTT AAT TTT AAA GOT OTA Ser Gly Ser Phe Asn Phe Lys Gly Val 310 AOC TCT Ser Ser 315 TTT AAT GOT Phe Asn Oly 0CC ACT TTC Ala Thr Phe 335
ACT
Thr 320 TCO TTT AGT AAC Ser Phe Ser Asn TCT TAT ACT TTT Ser Tyr Thr Phe CAC AAT CAA Asp Asn Gin 330 ACT TTT AAT Thr Phe Asn 1011 1059 CAA AAC AGC TCC Gin Asn Ser Ser
TTT
Phe 340 AAT 000 000 ACT Asn Oly Gly Thr
TTT
Phe 345 AAC CAA Asn Gin 350 ACT AAT CCA ACT AAC AAC OCT CAG CAC Thr Asn Pro Thr Asn Asn Ala Gin His 355 CAA ATT CAA AAC Oin Ile Gin Asn 1107
AGC
Ser 365 TCT TTT AGT GOT Ser Phe Ser Oly OCT ACC ACT CTT Ala Thr Thr Leu
AAG
Ly s 375 GGC TTT OTO AAT Gly Phe Val Asn
TTC
Phe 380 1155 1203 CAG CAA 0CC TTT Gin Gin Ala Phe AAT TCA AAC CAC Asn Ser Asn His CTA ACO ATC CAA Leu Thr Ile Gin AAC OCT Asn Ala 395 WO 98/21225 PCTIUS97/21353 319- TCC TTT AAT Ser Phe Asn AAA CAT GCC Lys Asp Ala 415 CC ACT TTT AAC Ala Thr Phe Asn
AAT
Asn 405 ACC GGT AAA ATC Thr Gly Lys Ile ACT ATA GAA Thr Ile Glu 410 GTT CAT ACA Val Asp Thr 1251 1299 AGT TTT AAT AAC Ser Phe Asn Asn
ACG
Thr 420 ACA TTC AAC ACT Thr Phe Asn Thr
TCT
Ser 425 AAC AAC Asn Asn 430 ATC ACT GTT ACC Met Ser Val Thr
GGT
Gly 435 GGC CTT ACT TTA Cly Val Thr Leu GGT AAA AAT GAC Cly Lys Asn Asp
TTG
Leu 445 AAA AAT GGC TCA Lys Asn Gly Ser CTT CAT TTT GGG Leu Asp Phe Cly
AGT
Ser 455 TCT AAA ATC ACT Ser Lys Ile Thr 1347 1395 1443 OCT CAA GGG ACG Ala Cmn Gly Thr
ACT
Thr 465 TTC AAC CTC ACA Phe Asn Leu Thr TTA GGC ACT GAG Leu Gly Ser Glu AAG AGC Lys Ser 475 CTA ACC ATT Val Thr Ile AAC CAT CCA Asn His Ala 495
TTA
Leu 480 AAT TCT ACC GGT Asn Ser Ser Cly
GGG
Cly 485 ATC ACT TAT AGT Ile Thr Tyr Ser AAC CTT TTA Asn Leu Leu 490 AAC GAA AGC Asn Clu Ser 1491 1539 ATC AAC GGC TTG Ile Asn Cly Leu
ACA
Thr 500 AGT CCC TTA AAA Ser Ala Leu Lys
ACG
Thr 505 CTT TCA Leu Ser 510 AAT CCC CAA AGT Asn Pro Gin Ser OCT CAA GGT TTG Ala Cmn Gly Leu GAT ATA ATC ACT Asp Ile Ile Thr
TAC
Tyr 525 AAT GGG GTT ACC Asn Gly Val Thr
GGG
Cly 530 CAC CTT TTC AAT Cmn Leu Leu Asn
GAA
Glu 535 AAC CCT GCA ACA Asn Ala Ala Thr
TCT
Ser 540 1587 1635 1683 AAA CCC ACT CAC Lys Pro Thr Asp
TCT
Ser 545 TCC CCC TCT AAA Ser Pro Ser Lys TCT ACA AAC TCT Ser Thr Asn Ser ACC CAA Thr Gin 555 CTC TAT CAA Val Tyr Gin CAA ACT TTC Clu Thr Phe 575 GGT TAC AAA ATA Cly Tyr Lys Ile
GGG
Cly 565 CAT ACT ATC TAC Asp Thr Ile Tyr AAA CTC CAA Lys Leu Gin 570 GAG ACC GGG Glu Ser Gly 1731 1779 AGC CAC AAT TCC Ser His Asn Ser ATT ATT CAC GCT Ile Ile Gin Ala
TTA
Leu 585 ACT TAC Thr Tyr 590 ACG CCA CCC CCT Thr Pro Pro Pro ATT AAC GGC TCC Ile Asn Cly Ser TTT GAC TTA TCC Phe Asp Leu Ser 1827
OCT
Ala 605 TCA AAT TAT ATC Ser Asn Tyr Ile
AAT
Asn 610 OCT CAC ATC CCT TGG TAT CAC CAT AAA Ala Asp Met Pro Trp Tyr Asp His Lys 615 1875 WO 98/21225 WO 9821225PCTIUS97/21353 -320- TAC ATC CCT AAA Tyr Ile Pro Lys (:AA AAT TTT ACA G.in Asn Php Thr AGC GGG ACT TAT Ser Gly Thr Tyr TAC TTG Tyr Leu 635 1923 CCG AGC GTC Pro Ser Vai TTT AGC GCA Phe Ser Ala 655
CAA
Gin 640 ATA TGG GGG AGC Ile Trp Gly Ser
TAC
Tyr 645 ACT AAC TCG TTT Thr Asn Ser Phe AAA CAA ACT Lys Gin Thr 650 TCA ACA TGG Ser Thr Trp 1971 2019 AAT GGT AGT AAT Asn Giy Ser Asn
CTG
Leu 660 GTG ATT GGG TAT Val Ile Gly Tyr
AAC
Asn 665 ACT GAT Thr Asp 670 CAT AAT GTC TCT His Asn Vai Ser AGC GGC ACG GTG Ser Giy Thr Val
TCT
Ser 680 TTT GGG GAC ACT Phe Gly Asp Thr TCA GGG AGC GCT CTT Ser Gly Ser Ala Leu GGG CAT TGC GGA Gly His Cys Gly TGG CCG TAT TAC Trp Pro Tyr Tyr
CAA.
Gin 700 2067 2115 2163 TGC ACA GGC ACG Cys Thr Gly Thr
ACT
Thr 705 AAC GGC ACT TAT Asn Gly Thr Tyr GCC TAT CAT GTG Ala Tyr His Vai TAT ATC Tyr Ile 715 ACA GCG AAT Thr Ala Asn AAT CTA ATC Asn Leu Ile 735 CGT TCT GGC AAT Arg Ser Gly Asn
CGT
Arg 725
AGT
Ser ATA GGC ACC GGT Ilie Gly Thr Gly GGG GCA GCT Giy Aia Ala 730 AAC GCT ACC Asn Ala Thr 2211 2259 AAT GGG GTA Asn Gly Val ATC AAT ATC Ile Asn Ile
GCT
Al a 745 ATC ACG Ile Thr 750 CAA CAT AAC GCC Gin His Asn Ala ATC TAT TCA AGC Ile Tyr Ser Ser ATO ACT TTT TCC Met Thr Phe Ser
ACG
Thr 765 CAA AGC ATG GAT Gin Ser Met Asp
AAT
As n 770 TCG CAG AAT TTG Ser Gin Asn Leu GGT CTA AAT TCT Gly Leu Asn Ser 2307 2355 2403 GGC AAA CTT TCG Gly Lys Leu Ser
GTG
Val1 785 TAT GOC ACC ACT Tyr Gly Thr Thr
TTC
Phe 790 ACT AAC GAA GCT Thr Asn Glu Ala AAA GAT Lys Asp 795 GGG AAA TTC Gly Lys Phe TTT AAT GGA Phe Asn Gly 815 TTC AAT GCA GG Phe Asn Ala Gly
CAA
Gin 805 GCG OTT TTT GAA Ala Val Phe Glu AAC ACC AAC Asn Thr Asn 810 AAT TTT TCA Asn Phe Ser 2451 2499 GGG AGT TAC CAA Gly Ser Tyr Gin AGC GGC OAT AGC Ser Gly Asp Ser AAC AAC AAC CAG TTC AAT Asn Asn Asn Gin Phe Asn 830 GGT TCG TTT GAA Gly Ser Phe Olu AGC OCA AAA AAC Ser Ala Lys Asn 2547 WO 98/21225 WO 9821225PCTIUS97/2 1353 -321-
GCT
Ala 845 TCG TTC AAT AAC Ser Phe Asn Asn AAC TTT AAC AAC AGC GCT TCT TTT AAT
Asn Phe Asn Asn Ser Ala Ser Phe Asn 855 2595 AAT AAT TCT AAC Asn Asn Ser Asn ACC ACT TCG TTT Thr Thr Ser Phe GGG GAT TTC ACT Gly Asp Phe Thr AAC CCT Asn Ala 875 2643 AAT TCA AAT Asn Ser Asn AAT GGC TCT Asn Gly Ser 895
TTG
Leu 880 CAA ATC GCC G Gln Ile Ala Gly
AAC
Asn 885 GCT GTT TTT GGG Ala Val Phe Gly AAC TCT ACT Asn Ser Thr 890 TCT GTT AAT Ser Val Asn 2691 2739 CAA AAT ACC GCT Gln Asn Thr Ala
AAT
Asn 900 TTT AAT AAT ACC Phe Asn Asn Thr ATT TCA Ile Ser 910 GCG AAT CCA ACC Cly Asn Ala Thr GAT AAT CTG GTG Asp Asn Val Val AAT GGC CCT ACC Asn Gly Pro Thr
AAC
Asn 925 ACG ACC GTG AAA Thr Ser Val Lys CAC GTT ACT--TTA Gln Val Thr Leu
AAT
As n 935 AAC ATC ACT TTA Asn Ile Thr Leu 2787 2835 2883 AAC CTG AAC GCC Asn Leu Asn Ala TTG TCT TTT GGC Leu Ser Phe Gly
GAT
Asp 950 GGG ACG ATT ACT Gly Thr Ile Thr TTT AAC Phe Asn 955 CCT CAT TCC Ala His Ser ATC ACT CTT Ile Thr Leu 975 ATT AAT ATT GCT Ile Asn Ile Ala GAA TCT Glu Ser 965 ATC ACT AAT Ile Thr Asn GGC AAC CCT Gly Asn Pro 970 AAC GCT TTC Asn Ala Phe 2931 2979 GTA AGC TCT TCT Val Ser Ser Ser GAA ATT GAA TAC Clu Ile Glu Tyr AGT AAA Ser Lys 990 AAT CTA TGG CAG Asn Leu Trp Gln
CTC
Leu 995 ATC AAC TAC CAA GGG Ile Asn Tyr Gin Gly 1000 CAT GGG CCA AGC His Gly Ala Ser ACT CAA AAG CTC GTC Ser Glu Lys Leu Val 1005 TAT TCT TTC AAT AAC Tyr Ser Phe Asn Asn 1025 TCT ACC GCG GGT AAT Ser Ser Ala Gly Asn 1010 CCC GTT Gly Val 1015 TAT CAT CTG Tyr Asp Val
GTC
Val1 1020 3027 3075 3123 CAA ACC TAC Cmn Thr Tyr AAT TTC Asn Phe 1030 CAA CAC CTT TTT TCA CA Gin Glu Val Phe Ser Gin 1035 AAC ACC ATT TCT ATC CCC CCT TTG CCC CTT AAC ATC CTG Asn Ser Ile Ser Ile Arg Arg Leu Cly Val Asn Met Val TTT CAT TAT Phe Asp Tyr 1050 1040 1045 3171 3219 CTC CAT ATC CPA AAA TCC CAT CAT TTA TAT TAT CAA AAC CCT CTC CGT Val Asp Met Clu Lys Ser Asp His Leu Tyr Tyr Cmn Asn Ala Leu Gly 1055 1060 1065 WO 98/21225 PCT/US97/21353 -322- TTT ATG Phe Met 1070 AAC AAC Asn Asn 1085 ACC TAc Alu Thr Tyr Met ACC ATT TAC Thr Ile Tyr CC! AAT Pro Asn 1075 TAT TAC Tyr Tyr 1090 AGC TAT AAC Ser Tyr Asn GAC AAG AGC Asp Lys Ser AAT PAT TTA Asn Asn Leu 1080 ATT GAT TTT Ile Asp Phe 1095 TCT CAA ACA Ser Gin Thr GGG PAT GCA Gly Asn Ala TAT GCG AGC Tyr Ala Ser 1100 TTC ACC GGG Phe Thr Gly 1115 3267 3315 GGG AAA ACT CTA TTC Gly Lys Thr Leu Phe 1105 ACT AAA GCG Thr Lys Ala GAA TTT Glu Phe 1110 3363 CAA PAC AGC Gin Asn Ser AGC GAT GCA Ser Asp Ala 1135 GCG ATC GTT TTT GGG Ala Ile Val Phe Gly 120 CCG CAG TCT AAC ACC Pro Gin Ser Asn Thr 1140
GCT
Ala L125 AAA AGC ATA TGC Lys Ser Ile Trr ACG AGC TTA Thr Ser Leu 1130 GAC PAT AAG Asp Asn Lys 3411 3459 ATC ATT CGC TTT GGG Ile Ile Arg Phe Gly 1145 GGA GCA Gly Ala 1150 ATA GGC Ile Gly 1165 GGG AGT PAT GAT GCG AGC GGG CAT Gly Ser Asn Asp Ala Ser Gly His 1155 TGC TGG AAT Cys Trp Asn 1160 CAA AAG ATT Gin Lys Ile 1175 TTG CAA TGC Leu Gin Cys TAC ATC ACC Tyr Ile Thr 1180 3507 TTT ATT ACA Phe Ile Thr GGG CAT Gly His 1170 TAT GPA GCG Tyr Glu Ala 3555 GGT AGC ATT GPA AGC GGG PAT CGC Gly Ser Ile Giu Ser Gly Asn Arg 1185 ATT TCT Ile Ser 1190 AGC GGT GGG GGC GCG AGC Ser Gly Gly Gly Ala Ser 1195 3603 CTT PAT TTT Leu Asn Phe TAT PAC CGC Tyr Asn Arg 1215 PAC AGC GCG Asn Ser Ala 1230
PAC
Asn 1200 GGG CTT CPA GGC ATT Gly Leu Gin Gly Ile 1205 CTT TTA ACG AAC Leu Leu Thr Asn 0CC GCT GGC ACG CAA Ala Ala Gly Thr Gin 1220 PAC ATT CAG OCT CAA Asn Ile Gin Ala Gin 1235 AGC TCG TCT Ser Ser Ser ATG PAT Met Asn 1225 GCG ACT TTG Ala .Thr Leu 1210 TTT ATC TCT Phe Ile Ser GAC GAT ACC Asp Asp Thr 3651 3699 3747 AAC TCC TAT TTT ATA Asn Ser Tyr Phe Ile 1240 GCA CPA PAT OGC GGT AAC CCT PAT TTC AGT TTC PAC OCT TTG AAT CTG Ala Gin Asn Oly Oly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu 3795 1245 1250 1255 1260 GAT TTT TCT Asp Phe Ser PAC AGC Asn Ser 1265 TCT TTT AGA GGC TAT Ser Phe Arg Oly Tyr 1270 GCC PAG PAT GCG ATC Ala Lys Asn Ala Ile 1285 GTG GGG AAA ACG CPA TCT Val Gly Lys Thr Gin Ser 1275 3843 OTT TTT PAA TTC PAT Val Phe Lys Phe Asn 1280 AGT TTC ACC Ser Phe Thr PAC ACC ACO Asn Ser Thr 290 3891 WO 98/21225 PCTIUS97/21353 -323- AAT TTA AGC Asn Leu Ser 1295 TCT UUi -1-f G TAT CAA Ser Gly Leu Tyr Gin 1300 ATG CAA GCT AAA AGC GTG TTG TTT Met Gin Ala Lys Ser Val Leu Phe 1305 GTG GGG ACA AGC ACT ATT AAA GCC Val Gly Thr Ser Ser Ile Lys Ala 1320 3939 GAC AAT TCC AAT TTA AGC GTT TCA Asp Asn Ser Asn Leu Ser Val Ser 1310 1315 3987 PAT GCG ATC Asn Aia Ile 1325 AAT CTT TCT CAA PAT Asn Leu Ser Gin Asn 1330 GCC TCT ATT AAT GCG Ala Ser Ile Asn Ala 1335 AGC AAC CAT Ser Asn His 1340 4035 TCA ACC TTA GAA CTT CAA GGC GAT TTG AAT GTG AAC GAC ACC AGC TCG Ser Thr Leu Giu Leu Gin Gly Asp Leu Asn Vai Asn Asp Thr Ser Ser 4083 1345 1350 1355 CTC PAC CTC PAC Leu Asn Leu Asn 1360 PAC GAT TAT GCG Asn Asp Tyr Ala 1375 CAA AGC ACG ATT Gin Ser Thr Ile AGC TTG ATT Ser Leu Ile
GCG
Ala 1380
AAT
Asn 1365
AGT
Ser GTT TCC AAT AAC Val Ser Asn Asn GCC ACG ATC Ala Thr Ile 1370 CTT AAT TTT Leu Asn Phe 4131 4179 AAT GGC TCT Asn Gly Ser
CAC
His 1385 AAC GGG Asn Gly 1390 AAT TCC Asn Ser 1405 GCG GTT AAT Ala Val Asn TCT ATC GTG Ser Ile Val TTC PAT Phe Asn 1395 TTT AAG Phe Lys 1410 TCA GCG PAT ATT ACT ACG AGT TTG PAT Ser Ala Asn Ile Thr Thr Ser Leu Asn 1400 4227 GGG GCG GTC Gly Ala Val TCT TTA Ser Leu 1415 GGA GGG CAG Gly Gly Gin
TTT
Phe 1420 4275 4323 PAT TTA AGC PAT PAC Asn Leu Ser Asn Asn 1425 TCT TCT TTA GAT TTC Ser Ser Leu Asp Phe 1430 CAA GGC TCT AGC GCT ATC Gin Gly Ser Ser Ala Ile 1435 ACC TCT PAC ACG Thr Ser Asn Thr 1440 CCC ATC ACT TTC Pro Ile Thr Phe 1455 GGA GGC PAC CTT Gly Giy Asn Leu 1470 PAC AGC CAG CTT Asn Ser Gin Leu 1485 GCG TTT PAT TTC TAT GAT AAC GCT Ala Phe Asn Phe Tyr Asp Asn Ala 1445 CAT CPA GCC CTT GAC ATT APA GCG His Gin Ala Leu Asp Ile Lys Ala 1460 2 TTT TCT CAA AGC Phe Ser Gin Ser 1450 4371 4419
CCC
Pro 465 TTA AGT TTG Leu Ser Leu TTA PAC CCT Leu Asn Pro 1475 OTT TTT GGC Val Phe Gly 1490 PAC PAC AGC Asn Asn Ser GAT CPA GGG Asp Gin Gly AGC GTG Ser Val 1480 AGT TTG Ser Leu 1495 CTG GAT TTA AAA Leu Asp Leu Lys 4467 4515 AAT ATC GCT Asn Ile Ala
PAC
Asn 1500 ATT GAT TTA CTA AGC GAT CTA PAT GAT PAT AAA PAT CGT GTG TAT PAC Ile Asp Leu Leu Ser Asp Leu Asn Asp Asn Lys Asn Arg Val Tyr Asn 4563 1505 1510 1515 WO 98/21225 PCT/US97/21353 -324- ATC ATT CAA GCU Ile Ile Gin Ala 1520 UAu AIG AAT Asp Met Asn AGT AAT TGG Ser Asn Trp 1525 TAT GAG CGT ATC AGC TTC Tyr Glu Arg Ile Ser Phe 1530 4611 TTT GGC ATG Phe Gly Met 1535 CAC ATC AAT GAC GGG His Ile Asn Asp Gly 1540 ATT TAT GAT GCT AAA AAC CAA ACT Ile Tyr Asp Ala Lys Asn Gin Thr 1545 4659 TAT AGT Tyr Ser 1550 TTT AAA Phe Lys 1565 TTC ACT AAC Phe Thr Asn GAC AAC CAA Asp Asn Gin CCC CTT Pro Leu 1555 CTA AGC Leu Ser 1570 AAT AAC GCC Asn Asn Ala GTT ACG CTC Val Thr Leu CTA AAA Leu Lys 1560 TCT CAA Ser Gin 1575 ATC ACC GAG AGC Ile Thr Glu Ser ATC CCG GGT ATT Ile Pro Gly Ile 1580 4707 4755 AAA AAC ACG Lys Asn Thr CTC TAT Leu Tyr 1585 AAC ATT GGC TCT GAA Asn Ile Gly Ser Glu 1590 ATT TTT AAC TAC CAA AAA Ile Phe Asn Tyr Gin Lys 1595 4803 GTT TAT AAC Val Tyr Asn GGC GTG TTT Gly Val Phe 1615
AAC
Asn 1600 GCT AAT GGC Ala Asn Gly GTG TAT Val Tyr 1605 TCT TAT AGC Ser Tyr Ser GAT GAT GCA CAA Asp Asp Ala Gin 1610 TAC AAC CCT AAC Tyr Asn Pro Asn 625 4851 4899 TAT CTC ACA AGC AAC Tyr Leu Thr Ser Asn 1620 GTG AAA GGC TAT Val Lys Gly Tyr 1 CAA TCC Gin Ser 1630 CTA ACC Leu Thr 1645 TAT CAA GCC AGC GGC Tyr Gin Ala Ser Gly 1635 AGT AAC AAC Ser Asn Asn ATC TCG CAA Ile Ser Gin ACC ACG Thr Thr 1640 ACC TAT Thr Tyr 1655 AAA AAT AAT AAT Lys Asn Asn Asn TCT GAA TCT Ser Glu Ser TCT ATC Ser Ile 1650 AAC GCG CAA Asn Ala Gin
GGC
Gly 1660 4947 4995 5043 AAC CCT ATT AGC GCG Asn Pro Ile Ser Ala 1665 TTG CAC ATC TAT AAC Leu His Ile Tyr Asn 1670 AAG GGC TAT AAT TTC AAC Lys Gly Tyr Asn Phe Asn 1675 AAT ATC AAA GCG TTA GGG CAA ATG GCT CTC AAA CTC TAC Asn Ile Lys Ala Leu Gly Gin Met Ala Leu Lys Leu Tyr CCT GAA ATC Pro Glu Ile 1690 1680 1685 5091 5139 AAA AAG GTA TTA Lys Lys Val Leu 1695 AAC TCT AAT GCG Asn Ser Asn Ala 1710 GGG AAT GAT TTT TCG CCC TCA AGT TTG AAC GCT TTA Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu 1700 1705 CTA AAC CAA CTT ACC Leu Asn Gin Leu Thr 1715 AAA CTC ATC Lys Leu Ile 1720 ACG CCT AAC GAC Thr Pro Asn Asp 5187 TGG AAA AAC ATT AAC Trp Lys Asn Ile Asn 1725 GAG TTG ATT GAT AAC Glu Leu Ile Asp Asn 1730 GCA AAC AAT TCG GTG Ala Asn Asn Ser Val 1735
GTG
Val 1740 5235
WO 98/21225 PCT1S97/21353 -325- CAA PAT TTC AAT'-AAU Gin Asn Phe Asn Asn 1745 UUC ACT TTG ATT GTG Gly Thr Leu Ile Val 1750 GGA GCG ACT CPA ATA. GGG Gly Ala Thr Gin Ile Gly 1755 5283 CAA ACA GAC ACC Gin Thr Asp Thr 1760 PAT AGC GCG GTT GTT Asn Ser Ala Val Vai 1765 TTT GGG GGC Phe Giy Gly TTG GGC TAT CAA Leu Giy Tyr Gin 1770 5331 5379 ACA CCT TGT Thr Pro Cys 1775 GAT TAT ACT Asp Tyr Thr GAT ATT Asp Ile 1780 GTG TGC CAA PAA TTT Val Cys Gin Lys Phe 1785 AGA GGC ACT Arg Gly Thr TAT TTA Tyr Leu 1790 GAC ACG Asp Thr 1805 GGA CAG CTT TTA GAG TCC Gly Gin Leu Leu Giu Ser 1795 AGC TCG GCT GAT Ser Ser Ala Asp 1800 ATT TAT CTT ACC Ile Tyr Leu Thr 1815 TTG GGC TAT ATT Leu Gly Tyr Ile 5427 ACT TTT PAC Thr Phe Asn GCT AAA GAA Ala Lys Glu 1810 GGG ACT GGG Gly Thr Gly GGC ACT TTA Gly Thr Leu
GGG
Gly 1820 AGC GGG PAC GCA TGG Ser Giy Asn Ala Trp 1825 GGG AGC Gly Ser 1830 GCG AGC GTA Ala Ser Val AGC CPA ACT Ser Gin Thr ACC GAT GGG Thr Asp Gly 1855
TCG
Ser L840 CTC ATT CTC PAT CAG Leu Ile Leu Asn Gin 1845 GCT PAT ATC GTA Ala Asn Ile Val ACT TTT PAC Thr Phe Asn 1835 AGC TCG CAA Ser Ser Gin 850 PAT PAG GTT Asn Lys Val 5475 5523 5571 5619 ATC TTT AGC ATG CTG Ile Phe Ser Met Leu 1860 GGT CPA GAG GGT ATT Gly Gin Glu Gly Ile 1865 TTC PAT Phe Asn 1870 CAA GCC GGG CTC OCT Gin Ala Gly Leu Ala 1875 PAT ATT TTG Asn Ile Leu TTA GGG PAT Leu Gly Asn GGC GAA Gly Glu 1880 TTG ATA Leu Ile 1895 GTG GCG GTG CAA Val Ala Val Gin
TCC
Ser 1885
GGG
Gly ATC PAC AAA GCC Ile Asn Lys Ala AGT PAT AGC GTG Ser Asn Ser Val 1905 GGG GGA Gly Gly 1890 GTA PAT ACG Val Asn Thr
CTA
Leu 1900 5667 5715 5763 ATT GGG GGG TAT TTA Ile Gly Gly Tyr Leu 1910 ACG CCT GPA CPA PAA PAT Thr Pro Glu Gin Lys Asn 1915 CAA ACC CTA AGC Gin Thr Leu Ser 1920 CAG CTT TTA GGG Gin Leu Leu Gly 1 CAG PAT PAC TTT Gin L925 Asn Asn Phe GAT PAT CTC ATG Asp Asn Leu Met 1930 5811 AAC OAT AGC Asn Asp Ser 1935 GGT TTG PAT ACG GCG ATT PAG GAT TTG ATC AGA CPA AAA Gly Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gin Lys 5859 1940 1945 TTA GGC TTT TGG ACC Leu Gly Phe Trp Thr 1950 GGG CTA GTG GGG GGA TTA Gly Leu Val Giy Gly Leu 1955 GCC GGA CTA GGG GGC Ala Gly Leu Gly Gly L960 5907
ATT GAT TTG CAA AAC Ile Asp Leu Gln Asn 1965 GAT TTA TTG AGT AAA Asp Leu Leu Ser Lys 1985 CCT G;!L AAG CTT ATA GGC AGC ATG TCA ATC Pro Glu Lys Leu Ile Gly Ser Met Ser Ile 1970 1975
AAT
Asn 1980 5955 AAA GGG TTG TTC AAT CAG Lys Gly Leu Phe Asn Gin 1990 ATC ACC GGC TTT ATT Ile Thr Gly Phe Ile 1995 6003 TCC GCT AAC GAT Ser Ala Asn Asp 2000 GTC AAA CCG AGC Val Lys Pro Ser 2015 ATA GGG CAA Ile Gly Gin GTC ATA Val Ile 2005 AGC GTA ATG Ser Vai Met GAT GTA GCG Asp Vai Ala AAC GCT TTA AAA Asn Ala Leu Lys 2020
AAC
As n CAA ATG Gin Met 2030 AGC TTG Ser Leu 2045 ATT GGC GAA Ile Gly Glu TTG CAA AAC Leu Gin Asn TTT TTA Phe Leu 2035 CAG CAG Gin Gin 2050 GGC CAA GAC ACG CTC Gly Gin Asp Thr Leu 2040 ATT AAA AGC GTT TTA Ile Lys Ser Vai Leu 2055 TTG CAA GAT ATT Leu Gin Asp Ile 2010 GCT TTA GGC AAG Aia Leu Gly Lys 2025 AAT TCT TTA GAA Asn Ser Leu Glu GAC AAA GTC CTA Asp Lys Val Leu 2060 TTG GGG GAT TTG Leu Giy Asp Leu 2075 TAT GGC TTG AGT Tyr Giy Leu Ser 2090 6051 6099 6147 6195 GCG GCT AAA Aia Ala Lys ATA CCT AAT Ile Pro Asn CAA GTG TGG Gin Vai Trp 2095 GGT TTA Gly Leu 2065 GGG CCT ATT TAT GAA CAA Giy Pro Ile Tyr Glu Gin 2070 6243 6291
CTT
Leu 2080 GGT AAA AAA Gly Lys Lys GGG CTT Gly Leu 2085 TTC GCT CCT Phe Al a Pro CAA AAA GGG Gin Lys Gly GAT TTT Asp Phe 2100 AGT TTC AAC GCA CAA Ser Phe Asn Ala Gin 2105 GGC AAT GTT Gly Asn Val 6339 TTT GTG Phe Val 2110 TTT AAC Phe Asn 2125 CAA AAT TCC Gin Asn Ser GCA GGA AAT Ala Giy Asn ACT TTC Thr Phe 2115 TCG CTC Ser Leu 2130 TCT AAC GCC AAT GGA Ser Asn Ala Asn Gly 2120 ATT TTT GCC GGA AAC Ile Phe Ala Gly Asn 2135 GGC ACG CTC TCT Gly Thr Leu Ser 6387 6435 AAT CAT ATT Asn His Ile
GCA
Ala 2140 TTC ACT AAC Phe Thr Asn CAC GCT His Ala 2145 GGA ACT CTT CAA TTA TTG TCC GAT CAA GTT TCT Gly Thr Leu Gln Leu Leu Ser Asp Gln Val Ser 6483 2150 2155 AAC ATT AAC ATC ACC Asn Ile Asn Ile Thr 2160 GCC GCT AAT AAC AAT Ala Ala Asn Asn Asn 2175 ACG CTT AAC GCT AGC AAC GGC CTT AAG ATT AAC Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn 2165 2170 GTT TCT GTG TCT CAA GGC AAT CTG TTT GTC AGC Val Ser Val Ser Gln Gly Asn Leu Phe Val Ser 2180 2185
GCT AGC Ala Ser 2190 CCT TGC Pro Cys 2205 TGC GCG CAA Cys Ala Gln GCG CTT AGC Ala Leu Ser CAA AGC Gln Ser 2195 GCC CAA Ala Gln 2210 GAT CCA ACT Asp Pro Thr AGC ACG AAT Ser Thr Asn ACA GCT Thr Ala 2200 GGC GCT Gly Ala 2215 AAT ATT GCA AAC Asn Ile Ala Asn 6627 6675 TCT TCT AAT Ser Ser Asn
AAT
Asn 2220 GCG TCA AAT Ala Ser Asn AAC GCG Asn Ala 2225 CCA ATC GCC Pro Ile Ala TTG AGT Leu Ser 2230 AAT AAC GAT Asn Asn Asp GAA AGC TTG Glu Ser Leu 2235 6723 6771 ATG GTT GCG GCG Met Val Ala Ala 2240 AAT GAT TTC Asn Asp Phe AAT TTT Asn Phe 2245 TCA GGC AAT Ser Gly Asn ATT TAC GCT AAT Ile Tyr Ala Asn 2250 GGG GTG GTT Gly Val Val 2255 GAT TTT TCA AAG ATT AAA GGC TCT GCA AAC ATT AAA AAC
Asp Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn 6819 2260 2265 CTG TAT Leu Tyr 2270 TCC AAT Ser Asn 2285 CTT TAC AAT Leu Tyr Asn CAA GCG GTG Gin Ala Val AAC GCT Asn Ala 2275 TTA GAA Leu Glu 2290 CAA TTC CAA Gin Phe Gin AAA AAC GCC Lys Asn Ala GCC AAC Ala Asn 2280 AGC TTT Ser Phe 2295 AAT CTC ACT ATT Asn Leu Thr Ile 6867 6915 GTA ACG AAT Val Thr Asn
AAT
Asn 2300 TTA AAC ATT Leu Asn Ile CAA GGA Gin Gly 2305 GCG TTT AAC AAC AAC Ala Phe Asn Asn Asn 2310 GCC ACG CAA Ala Thr Gin AAA ATA GAG Lys Ile Glu 2315 6963 GTG CTT CAA AAT TTA GTG ATC Val Leu Gin Asn Leu Val Ile 2320 OCT TCA Ala Ser 2325 AAC GCT TCT Asn Ala Ser TTA AGC ACC GGG Leu Ser Thr Gly 2330 7011 ATT TAT GGG TTA GAA Ile Tyr Gly Leu Glu 2335 GTA GGG GGG Val Gly Gly 2340 GCT TTG AAT AAT TCT Ala Leu Asn Asn Ser 2345 GGA GCG ATC Gly Ala Ile 7059 CAT TTT His Phe 2350 GAG GGG Glu Gly 2365 AAT TTA GAA Asn Leu Glu ATC ATT AAC Ile Ile Asn AAT ACC Asn Thr 2355 CTC AAC Leu Asn 2370 CAA ACG CCA Gin Thr Pro ACC ACC CAA Thr Thr Gin ACG CCG Thr Pro 2360 ACG CCT Thr Pro 2375 CTC ATT CAA GCA Leu Ile Gin Ala 7107 7155 TTT ATG AAT Phe Met Asn
GTC
Val 2380 AAT AAC AGC Asn Asn Ser ATG GCC Met Ala 2385 AAT AAT ACG Asn Asn Thr ACT TAC Thr Tyr 2390 ACT TTA TTA Thr Leu Leu AAA AGC AGC Lys Ser Ser 2395 7203 CGT TAC ATT GAT TAC Arg Tyr Ile Asp Tyr 2400 AAT ATC AAC CCC AAC Asn Ile Asn Pro Asn 2405 AGC TTG CAA TCG TAT TTG Ser Leu Gln Ser Tyr Leu 2410 7251
AAT CTC TAC Asn Leu Tyr 2415 ACT TTA ATC AAT ATC Thr Leu Ile Asn Ile 2420 AAC GGG AAC CAC ATA Asn Gly Asn His Ile 2425 GAG GAA AAA Glu Glu Lys 7299 AAC GGC Asn Gly 2430 GGG TTA Gly Leu 2445 GCA TTG ACT Ala Leu Thr TTG TTA AGC Leu Leu Ser TAT TTG Tyr Leu 2435 GTA GCG Val Ala 2450 GGC CAA CGG Gly Gln Arg GTT TTG Val Leu 2440 TTG CAA GAT AAG Leu Gln Asp Lys 7347 CTG CCC AAC TCA AAC AAC GCT TCT Leu Pro Asn Ser Asn Asn Ala Ser 2455
CAA
Gin 2460 7395 AAC AAC ATT TTA AGC Asn Asn Ile Leu Ser 2465 CTT TCT GTC Leu Ser Val CTT TAT Leu Tyr 2470 AAC CAA GTT AAA ATG TCT Asn Gin Val Lys Met Ser 2475 7443 TGC GGC GAT AAA Cys Gly Asp Lys 2480 GCG ATG GAT TTT ACC Ala Met Asp Phe Thr 2485 CCC CCT ACC TTA Pro Pro Thr Leu CAA GAT TAC 1 Gin Asp Tyr 2490 GAA GCT GTT SGlu Ala Val 7491 7539 ATT GTG GGC Ile Val Gly 2495 ATT CAA GGG Ile Gin Gly CAA AGC Gin Ser 2500 GCG CTC AAT CAA Ala Leu Asn Gin
ATT
Ile 2505 GGG GGG Gly Gly 2510 GAA AAC Glu Asn 2525 AAC GCT ATC Asn Ala Ile CCG TTT TTT Pro Phe Phe AAG TGG Lys Trp 2515 GCG CCG Ala Pro 2530 CTT TCA ACA Leu Ser Thr ATT TAT TTA Ile Tyr Leu TTG ATG Leu Met 2520 AAA AAC Lys Asn 2535 ATG GAG ACT AAA Met Glu Thr Lys CAC TCT TTG His Ser Leu
AAT
Asn 2540 7587 7635 7683 GAA ATC TTA GGC GTA Glu Ile Leu Gly Val 2545 ACA AAA GAT CTT CAA Thr Lys Asp Leu Gin 2550 AAC ACC GCA AGC TTG ATT Asn Thr Ala Ser Leu Ile 2555 TCT AAC CCT AAT Ser Asn Pro Asn 2560 AGT TAC ACC CAA Ser Tyr Thr Gin 2575 TTT AGA GAT AAC GCT Phe Arg Asp Asn Ala 2565 CAA ACC AGC CGT TTA Gin Thr Ser Arg Leu 2580 ACC AAT CTT TTA Thr Asn Leu Let ACA AAA CTC TCT Thr Lys Leu Ser 258E SGAA TTG GCG Glu Leu Ala 2570 GAT TTT AGA Asp Phe Arg 7731 7779 TCT AGA Ser Arg 2590 CGT TTT Arg Phe 2605 GAG GGA GAG Glu Gly Glu AGC GAT CCT Ser Asp Pro TCT GAT Ser Asp 2595 AAT CCA Asn Pro 2610 TTT TCT TTG Phe Ser Leu GAG GTT TTT Glu Val Phe TTA GAG Leu Glu 2600 GTC AAA Val Lys 2615 CTT AAA AAC AAG Leu Lys Asn Lys 7827 7875 TAC TCT CAA Tyr Ser Gin
CTT
Leu 2620 AGC AAA CAC CCA AAT AAC CTT TGG GTT CAA GGG GTG GGA GGA GCG AGC Ser Lys His Pro Asn Asn Leu Trp Val Gin Gly Val Gly Gly Ala Ser 7923 2625 2630 2635
TTT ATT TCT GGG Phe Ile Ser Gly 2640 GAC AGG TTG GTT Asp Arg Leu Val 2655 GGC AAT GGC ACG CTT TAT Gly Asn Gly Thr Leu Tyr AAA AAT GTG Lys Asn Val 2
ATC
Ile 66C 2645
CTT
Leu 0 GGC TTG AAT GCG GGC TAT Gly Leu Asn Ala Gly Tyr 2650 GGT TAT GTG GCT TAT GGC Gly Tyr Val Ala Tyr Gly 2665 7971 8019 TAT AGC Tyr Ser 2670 GAT GTG Asp Val 2685 GAC TTT AAT Asp Phe Asn GGG ATG TAT Gly Met Tyr GGG AAC Gly Asn 2675 GCG AGG Ala Arg 2690 ATC ATG CAT TCT TTG GGT AAT AAT GTG Ile Met His Ser Leu Gly Asn Asn Val 2680 8067 GCT TTT TTA Ala Phe Leu AAA AGG Lys Arg 2695 AAC GAA TTC Asn Glu Phe
ACT
Thr 2700 8115 TTG AGC GCG Leu Ser Ala AAT GAA Asn Glu 2705 ACT TAT GGA Thr Tyr Gly GGC AAT Gly Asn 2710 GCA ACT AGT Ala Thr Ser ATC AAT TCT Ile Asn Ser 2715 8163 TCT AAT TCT TTG Ser Asn Ser Leu 2720 CTC TCT GTG Leu Ser Val TTG AAC Leu Asn 2725 CAA CGC TAC AAC TAC AAC ACC Gin Arg Tyr Asn Tyr Asn Thr 2730 8211 TGG ACA ACG Trp Thr Thr 2735 AGC GTG AAC Ser Val Asn GGG AAT TAC Gly Asn Tyr 2740 GGC TAT GAT TTC Gly Tyr Asp Phe 2745 ATG TTC AAA Met Phe Lys 8259 CAA AAA Gin Lys 2750 ATA GGT Ile Gly 2765 AGC GTG GTG Ser Val Val CTA AGT GGG Leu Ser Gly CTA AAA Leu Lys 2755 ATG AAA Met Lys 2770 CCT CAA GTG Pro Gin Val GGC AAT GAT Gly Asn Asp GGT TTG Gly Leu 2760 GCC GCT Ala Ala 2775 AGC TAT CAT TTC Ser Tyr His Phe 8307 8355 TAC AAA CAA Tyr Lys Gin
TTC
Phe 2780 CTC ATG CAT Leu Met His TCA AAC Ser Asn 2785 CCC TCT AAC GAA TCG Pro Ser Asn Glu Ser 2790 GTT TTA ACG Val Leu Thr CTC AAC ATG Leu Asn Met 2795 8403 GGG TTG GAG AGC CGT AAA TAT Gly Leu Glu Ser Arg Lys Tyr 2800 TTT GGT Phe Gly 2805 AAA AAT TCC Lys Asn Ser TAT TAT TTT GTA Tyr Tyr Phe Val 2810 8451 ACG GCG AGA Thr Ala Arg 2815 CTA GGT AGG GAT CTT Leu Gly Arg Asp Leu 2820 TTG ATC AAA TCT AAA Leu Ile Lys Ser Lys 2825 GGC AGC AAT Gly Ser Asn 8499 ACG GTG CGT TTT GTG Thr Val Arg Phe Val 2830 GTT TTT AAC ACT TTT Val Phe Asn Thr Phe 2845 GGC GAA AAC Gly Glu Asn 2835 GCG AGC GTG Ala Ser Val 2850 ACT TTA TTG TAT CGC AAG GGG GAA Thr Leu Leu Tyr Arg Lys Gly Glu 2840 8547 ATT ACA GGG GGC GAA Ile Thr Gly Gly Glu 2855 ATG CAT TTG Met His Leu 2860 8595
TGG CGT TTG GTG TAT GTG AAT GCG GGG GTG GGG CTT AAG ATG GGC TTG Trp Arg Leu Val Tyr Val Asn Ala Gly Val Gly Leu Lys Met Gly Leu 2865 2870 2875 CAA TAC CAA GAT ATT AAT ATA ACC GGG AAT GTG GGC ATG CGA GTG GCG Gln Tyr Gln Asp Ile Asn Ile Thr Gly Asn Val Gly Met Arg Val Ala 2880 2885 2890 TTT TAGCTTTTTT GCTATPATGC TTCGTTCAAA TTTTATGGTT AGGTTTTTCT ATOT Phe 8643 8691 8748
INFORMATION FOR SEQ ID NO:166:
SEQUENCE CHARACTERISTICS: LENGTH: 2893 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:166:
Met Lys Lys Phe Lys Lys Lys Pro Lys Ser 1 Asn Ile Leu Gln Gly Ser Gly Thr Ser 145 Phe Arg Leu Thr Gln Ser Asn Tyr
Ala Ser Thr Ile 130 Phe Thr Val1 Asn Ser 210 Thr Phe Lys Ala His Asn 100 Ser Leu Ser Gly Ser 180 Asn Val 5 Ile Al a Val1 Trp Lys Tyr Leu Gly Tyr Ala 165 Gly Lys As n Arg Val1 Glu Phe Asp Ile Phe 120 Val1 Tyr Leu Thr Ile 200 Asn Pro 25 Tyr Tyr Ala Al a Ser 105 Thr Val Asn Asn His 185 Asn Al a Lys Leu Asn Trp Leu Al a Ser Val Leu 140 Lys Leu Thr Ile Val 220 Arg Met Leu Val Lys Ala Gly Gly 125 Ser Asp Giu Ala Asn 205 Ile Ser Pro Trp Lys As n Asn Asn 110 Gly Gly Asp Val Thr 190 Al a Thr His Leu Asp Gly Val1 Tyr Giu Asn Ser Val1 Gly 175 Leu Tyr Ile Gin Leu Leu Ser Trp Leu Thr Leu Asn Thr 160 Asn Asn Lys Gly Ser Val Ser Leu Ser Gly Asp Val Cys Ser Ser Leu Ala Ser Vai Gly 225 230 235; WO 98/21225 WO 9821225PCT/US97/21353 -331- Ile Thr Thr Thr Ser 305 Ser Asn Pro Gly Asn 385 Ala Phe Val1 Ser Thr 465 Asn As n Gin Thr Ser 545 Gly His Pro Ile Ser 625 Ile Gly -Val1 Giy Thr Phe Tyr 290 Ser Phe Ser Thr Asn 370 Asn Thr Asn Thr Thr 450 Phe Ser Gly Ser Gly 530 Ser Tyr Asn Pro Asn 610 Gin Trp Ser Ser Ala. 
Asn Uys 245 Asn Ala Thr 260 Giu Giu Asn 275 Thr Phe Asn Giy Ser Phe Ser Asn Ala 325 Ser Phe Asn 340 Asn Asn Ala 355 Aia Thr Thr Ser Asn His Phe Asn Asn 405 Asn Thr Thr 420 Gly Gly Vai 435 Leu Asp Phe Asn Leu Thr Ser Gly Giy 485 Leu Thr Ser 500 Phe Ala Gin 515 Gin Leu Leu Pro Ser Lys Lys Ile Giy 565 Ser Ile Ile -ai580 Va Ile Asn 595- Aia Asp Met Asn Phe Thr Gly Ser Tyr 645 Asn Leu Val 660 Ser Ser Gly ,ier- Thr Ser Gly Pro Ser Asn Al a Lys Asn 310 Ser Gly Gin Leu Gin 390 Thr Phe Thr Gly Ser 470 Ile Al a Gly Asn Ser 550 Asp Ile Giy Pro Giu 630 Thr Ile Thr Thr Thr Giu 295 Phe Tyr Gly His Lys 375 Leu Giy Asn Leu Ser 455 Leu Thr Leu Leu Giu 535 Ser Thr Gin Ser Trp 615 Ser As n Giy Val1 Phe 265 Ser Ser Gly Phe Phe 345 Gin Phe Ile Ile Ser 425 Gly Lys Ser Ser Thr 505 Asp Aila Asn Tyr Leu 585 Phe Asp Thr Phe Asn 665 Phe 250 Ser Giy Aia Val Asp 330 Thr Ile Val Gin Thr 410 Val1 Lys Ile Glu Asn 490 Asn Ile Ala Ser Lys 570 Giu Asp His Tyr Lys 650 Ser Gly Asn Al a Thr Ser 315 Asn Phe Gin Asn Asn 395 Ile Asp Asn Thr Lys 475 Leu Giu Ile Thr Thr 555 Leu Ser Leu Lys Tyr 635 Gin Thr Asp Tyr Ser Aia Ser Lys Trp 285 Asn Asn 300 Ser Phe Gin Ala Asn Asn Asn Ser 365 Phe Gin 380 Aia Ser Giu Lys Thr Asn Asp Leu 445 Leii Ala 460 Ser Val Leu Asn Ser Leu Thr Tyr 525 Ser Lys 540 Gin Val Gin Giu Giy Thr Ser Ala 605 Tyr Tyr 620 Leu Pro Thr Phe Trp Thr Thr Ser Phe Gly 270 Asn Thr Asn Thr Gin 350 Ser Gin Phe Asp Asn 430 Lys Gin Thr His Ser 510 Asn Pro Tyr Thr Tyr 590 Ser Ile Ser Ser Asp 670 Giy Lys Gly 255 Ser Phe Gly Gly Ala Phe Gly Thr 320 Phe Gin 335 Thr Asn Phe Ser Ala Phe Asn Asn 400 Ala Ser 415 Met Ser Asn Gly Gly Thr Ile Leu 480 Ala Ile 495 Asn Pro Gly Val Thr Asp Gin Val 560 Phe Ser 575 Thr Pro Asn Tyr Pro Val Gin 640 Ala Asn 655 His Asn Ser Ala WO 98/21225 PCT/US97/21353 -332- 675 680 685 Leu Asn Gly His Cys Gly Pro Trp Pro Tyr Tyr Gin Cys Thr Gly Thr 690 695 700 Thr Asn Gly Thr Tyr Ser Ala Tyr His Val Tyr Ile Thr Ala Asn Leu 705 710 715 720 Arg Ser Gly Asn Arg Ile Gly Thr Gly Gly Ala Ala 
Asn Leu Ile Phe 725 730 735 Asn Gly Val Asp Ser Ile Asn Ile Ala Asn Ala Thr Ile Thr Gin His 740 745 750 Asn Ala Gly Ile Tyr Ser Ser Ser Met Thr Phe Ser Thr Gin Ser Met 755 760 765 Asp Asn Ser Gin Asn Leu Asn Gly Leu Asn Ser Asn Gly Lys Leu Ser 770 775 780 Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp Gly Lys Phe Ile 785 790 795 800 Phe Asn Ala Gly Gin Ala Val Phe Glu Asn Thr Asn Phe Asn Gly Gly 805 810 815 Ser Tyr Gin Phe Ser Gly Asp Ser Leu Asn Phe Ser Asn Asn Asn Gin 820 825 830 Phe Asn Ser Gly Ser Phe Glu Ile Ser Ala Lys Asn Ala Ser Phe Asn 835 840 845 Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe Asn Phe Asn Asn Ser Asn 850 855 860 Ala Thr Thr Ser Phe Val Gly Asp Phe Thr Asn Ala Asn Ser Asn Leu 865 870 875 880 Gin Ile Ala Gly Asn Ala Val Phe Gly Asn Ser Thr Asn Gly Ser Gin 885 890 895 Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn Ile Ser Gly Asn 900 905 910 Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr Asn Thr Ser Val 915 920 925 Lys Gly Gin Val Thr Leu Asn Asn Ile Thr Leu Lys Asn Leu Asn Ala 930 935 940 Pro Leu Ser Phe Gly Asp Gly Thr Ile Thr Phe Asn Ala His Ser Val 945 950 955 960 Ile Asn Ile Ala Glu Ser Ile Thr Asn Gly Asn Pro Ile Thr Leu Val 965 970 975 Ser Ser Ser Lys Glu Ile Glu Tyr Asn Asn Ala Phe Ser Lys Asn Leu 980 985 990 Trp Gin Leu Ile Asn Tyr Gin Gly His Gly Ala Ser Ser Glu Lys Leu 995 1000 1005 Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Val Val Tyr Ser Phe Asn 1010 1015 1020 Asn Gin Thr Tyr Asn Phe Gin Glu Val Phe Ser Gin Asn Ser Ile Ser 025 1030 1035 1040 Ile Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr Val Asp Met Glu 1045 1050 1055 Lys Ser Asp His Leu Tyr Tyr Gin Asn Ala Leu Gly Phe Met Thr Tyr 1060 1065 1070 Met Pro Asn Ser Tyr Asn Asn Asn Leu Gly Asn Ala Asn Asn Thr Ile 1075 1080 1085 Tyr Tyr Tyr Asp Lys Ser Ile Asp Phe Tyr Ala Ser Gly Lys Thr Leu 1090 1095 1100 Phe Thr Lys Ala Glu Phe Ser Gin Thr Phe Thr Gly Gin Asn Ser Ala 105 1110 1115 1120 1 WO 98/21225 PCT/US97/21353 -333- Ile Val Phe Gly Ala Lys-Ser Ile Trp Thr Ser Leu Ser Asp Ala Pro 1125 
1130 1135 Gin Ser Asn Thr Ile Ile Arg Phe Gly Asp Asn Lys Gly Ala Gly Ser 1140 1145 1150 Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gin Cys Ile Gly Phe Ile 1155 1160 1165 Thr Gly His Tyr Glu Ala Gin Lys Ile Tyr Ile Thr Gly Ser Ile Glu 1170 1175 1180 Ser Gly Asn Arg Ile Ser Ser Gly Gly Gly Ala Ser Leu Asn Phe Asn 185 1190 1195 1200 Gly Leu Gin Gly Ile Leu Leu Thr Asn Ala Thr Leu Tyr Asn Arg Ala 1205 1210 1215 Ala Gly Thr Gin Ser Ser Ser Met Asn Phe Ile Ser Asn Ser Ala Asn 1220 1225 1230 Ile Gin Ala Gin Asn Ser Tyr Phe Ile Asp Asp Thr Ala Gin Asn Gly 1235 1240 1245 Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu Asp Phe Ser Asn 1250 1255 1260 Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gin Ser Val Phe Lys Phe 265 1270 1275 1280 Asn Ala Lys Asn Ala Ile Ser Phe Thr Asn Ser Thr Asn Leu Ser Ser 1285 1290 1295 Gly Leu Tyr Gin Met Gin Ala Lys Ser Val Leu Phe Asp Asn Ser Asn 1300 1305 1310 Leu Ser Val Ser Val Gly Thr Ser Ser Ile Lys Ala Asn Ala Ile Asn 1315 1320 1325 Leu Ser Gin Asn Ala Ser Ile Asn Ala Ser Asn His Ser Thr Leu Glu 1330 1335 1340 Leu Gin Gly Asp Leu Asn Val Asn Asp Thr Ser Ser Leu Asn Leu Asn 345 1350 1355 1360 Gin Ser Thr Ile Asn Val Ser Asn Asn Ala Thr Ile Asn Asp Tyr Ala 1365 1370 1375 Ser Leu Ile Ala Ser Asn Gly Ser His Leu Asn Phe Asn Gly Ala Val 1380 1385 1390 Asn Phe Asn Ser Ala Asn Ile Thr Thr Ser Leu Asn Asn Ser Ser Ile 1395 1400 1405 Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gin Phe Asn Leu Ser Asn 1410 1415 1420 Asn Ser Ser Leu Asp Phe Gin Gly Ser Ser Ala Ile Thr Ser Asn Thr 425 1430 1435 1440 Ala Phe Asn Phe Tyr Asp Asn Ala Phe Ser Gin Ser Pro Ile Thr Phe 1445 1450 1455 His Gin Ala Leu Asp Ile Lys Ala Pro Leu Ser Leu Gly Gly Asn Leu 1460 1465 1470 Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys Asn Ser Gin Leu 1475 1480 1485 Val Phe Gly Asp Gin Gly Ser Leu Asn Ile Ala Asn Ile Asp Leu Leu 1490 1495 1500 Ser Asp Leu Asn Asp Asn Lys Asn Arg Val Tyr Asn Ile Ile Gin Ala 505 1510 1515 1520 Asp Met Asn Ser Asn Trp Tyr Glu Arg Ile Ser Phe Phe Gly Met His 1525 1530 
1535 Ile Asn Asp Gly Ile Tyr Asp Ala Lys Asn Gin Thr Tyr Ser Phe Thr 1540 1545 1550 Asn Pro Leu Asn Asn Ala Leu Lys Ile Thr Glu Ser Phe Lys Asp Asn WO 98/21225 PCT/US97/21353 -334- 1555 1560 1565 Gin Leu Ser Val Thr Leu Ser Gin Ile Pro Gly Ile Lys Asn Thr Leu 1570 1575 1580 Tyr Asn Ile Gly Ser Glu Ile Phe Asn Tyr Gin Lys Val Tyr Asn Asn 585 1590 1595 1600 Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp Ala Gin Gly Val Phe Tyr 1605 1610 1615 Leu Thr Ser Asn Val Lys Gly Tyr Tyr Asn Pro Asn Gin Ser Tyr Gin 1620 1625 1630 Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn Leu Thr Ser Glu 1635 1640 1645 Ser Ser Ile Ile Ser Gin Thr Tyr Asn Ala Gin Gly Asn Pro Ile Ser 1650 1655 1660 Ala Leu His Ile Tyr Asn Lys Gly Tyr Asn Phe Asn Asn Ile Lys Ala 665 1670 1675 1680 Leu Gly Gin Met Ala Leu Lys Leu Tyr Pro Glu Ile Lys Lys Val Leu 1685 1690 1695 Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu Asn Ser Asn Ala 1700 1705 1710 Leu Asn Gin Leu Thr Lys Leu Ile Thr Pro Asn Asp Trp Lys Asn Ile 1715 1720 1725 Asn Glu Leu Ile Asp Asn Ala Asn Asn Ser Val Val Gin Asn Phe Asn 1730 1735 1740 Asn Gly Thr Leu Ile Val Gly Ala Thr Gin Ile Gly Gin Thr Asp Thr 745 1750 1755 1760 Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tyr Gin Thr Pro Cys Asp 1765 1770 1775 Tyr Thr Asp Ile Val Cys Gin Lys Phe Arg Gly Thr Tyr Leu Gly Gin 1780 1785 1790 Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly Tyr Ile Asp Thr Thr Phe 1795 1800 1805 Asn Ala Lys Glu Ile Tyr Leu Thr Gly Thr Leu Gly Ser Gly Asn Ala 1810 1815 1820 Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn Ser Gin Thr Ser 825 1830 1835 1840 Leu Ile Leu Asn Gin Ala Asn Ile Val Ser Ser Gin Thr Asp Gly Ile 1845 1850 1855 Phe Ser Met Leu Gly Gin Glu Gly Ile Asn Lys Val Phe Asn Gin Ala 1860 1865 1870 Gly Leu Ala Asn Ile Leu Gly Glu Val Ala Val Gin Ser Ile Asn Lys 1875 1880 1885 Ala Gly Gly Leu Gly Asn Leu Ile Val Asn Thr Leu Gly Ser Asn Ser 1890 1895 1900 Val Ile Gly Gly Tyr Leu Thr Pro Glu Gin Lys Asn Gin Thr Leu Ser 905 1910 1915 1920 Gin Leu Leu Gly Gin Asn Asn Phe Asp Asn Leu 
Met Asn Asp Ser Gly 1925 1930 1935 Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gin Lys Leu Gly Phe Trp 1940 1945 1950 Thr Gly Leu Val Gly Gly Leu Ala Gly Leu Gly Gly Ile Asp Leu Gin 1955 1960 1965 Asn Pro Glu Lys Leu Ile Gly Ser Met Ser Ile Asn Asp Leu Leu Ser 1970 1975 1980 Lys Lys Gly Leu Phe Asn Gin Ile Thr Gly Phe Ile Ser Ala Asn Asp 985 1990 1995 2000 WO 98/21225 PCT/US97/21353 -335- Ile Gly Gin Val lie Ser-Val Met Leu Gin Asp Ile Val Lys Pro Ser 2005 2010 2015 Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys Gin Met Ile Gly 2020 2025 2030 Glu Phe Leu Gly Gin Asp Thr Leu Asn Ser Leu Glu Ser Leu Leu Gin 2035 2040 2045 Asn Gin Gin Ile Lys Ser Val Leu Asp Lys Val Leu Ala Ala Lys Gly 2050 2055 2060 Leu Gly Pro Ile Tyr Glu Gin Gly Leu Gly Asp Leu Ile Pro Asn Leu 065 2070 2075 2080 Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser Gin Val Trp Gin 2085 2090 2095 Lys Gly Asp Phe Ser Phe Asn Ala Gin Gly Asn Val Phe Val Gin Asn 2100 2105 2110 Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser Phe Asn Ala Gly 2115 2120 2125 Asn Ser Leu Ile Phe Ala Gly Asn Asn His Ile Ala Phe Thr Asn His 2130 2135 2140 Ala Gly Thr Leu Gin Leu Leu Ser Asp Gin Val Ser Asn Ile Asn Ile 145 2150 2155 2160 Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn Ala Ala Asn Asn 2165 2170 2175 Asn Val Ser Val Ser Gin Gly Asn Leu Phe Val Ser Ala Ser Cys Ala 2180 2185 2190 Gin Gin Ser Asp Pro Thr Thr Ala Asn Ile Ala Asn Pro Cys Ala Leu 2195 2200 2205 Ser Ala Gin Ser Thr Asn Gly Ala Ser Ser Asn Asn Ala Ser Asn Asn 2210 2215 2220 Ala Pro Ile Ala Leu Ser Asn Asn Asp Glu Ser Leu Met Val Ala Ala 225 2230 2235 2240 Asn Asp Phe Asn Phe Ser Gly Asn Ile Tyr Ala Asn Gly Val Val Asp 2245 2250 2255 Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn Leu Tyr Leu Tyr 2260 2265 2270 Asn Asn Ala Gin Phe Gin Ala Asn Asn Leu Thr Ile Ser Asn Gin Ala 2275 2280 2285 Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn Leu Asn Ile Gin 2290 2295 2300 Gly Ala Phe Asn Asn Asn Ala Thr Gin Lys Ile Glu Val Leu Gin Asn 305 2310 2315 2320 Leu Val Ile Ala 
Ser Asn Ala Ser Leu Ser Thr Gly Ile Tyr Gly Leu 2325 2330 2335 Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala Ile His Phe Asn Leu 2340 2345 2350 Glu Asn Thr Gin Thr Pro Thr Pro Leu Ile Gin Ala Glu Gly Ile Ile 2355 2360 2365 Asn Leu Asn Thr Thr Gin Thr Pro Phe Met Asn Val Asn Asn Ser Met 2370 2375 2380 Ala Asn Asn Thr Thr Tyr Thr Leu Leu Lys Ser Ser Arg Tyr Ile Asp 385 2390 2395 2400 Tyr Asn Ile Asn Pro Asn Ser Leu Gin Ser Tyr Leu Asn Leu Tyr Thr 2405 2410 2415 Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys Asn Gly Ala Leu 2420 2425 2430 Thr Tyr Leu Gly Gin Arg Val Leu Leu Gin Asp Lys Gly Leu Leu Leu WO 98/21225 WO 981225PCTIUS97/21353 -336- 2435 2440 2445 Ser Val Ala Leu Pro Asn Ser Asn Asn Ala Ser Gin Asn Asn Ile Leu 2450 2455 2460 Ser Leu Ser Val Leu Tyr Asn Gin Val Lys Met Ser Cys Gly Asp Lys 465 2470 2475 2480 Ala Met Asp Phe Thr Pro Pro Thr Leu Gin Asp Tyr Ile Val Gly Ile 2485 2490 2495 Gin Giy Gin Ser Ala Leu Asn Gin Ile Glu Ala Val Gly Gly Asn Ala 2500 2505 2510 Ile Lys Trp Leu Ser Thr Leu Met Met Giu Thr Lys Glu Asn Pro Phe 2515 2520 2525 Phe Ala Pro Ile Tyr Leu Lys Asn His Ser Leu Asn Giu Ile Leu Gly 2530 2535 2540 Val Thr Lys Asp Leu Gin Asn Thr Ala Ser Leu Ile Ser Asn Pro Asn 545 2550 2555 2560 Phe Arg Asp Asn Ala Thr Asn Leu Leu Giu Leu Ala Ser Tyr Thr Gin 2565 2570 2575 Gin Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg Ser Arg Glu Gly 2580 2585 2590 Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys Arg Phe Ser Asp 2595 2600 2605 Pro Asn Pro Giu Val Phe Val Lys Tyr Ser Gin Leu Ser Lys His Pro 2610 2615 2620 Asn Asn Leu Trp Vai Gin Giy Val Gly Giy Ala Ser Phe Ile Ser Gly 625 2630 2635 2640 Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala Gly Tyr Asp Arg Leu Val 2645 2650 2655 Lys Asn Val Ile Leu Gly Gly Tyr Val Ala Tyr Giy Tyr Ser Asp Phe 2660 2665 2670 Asn Giy Asn Ile Met His Ser Leu Gly Asn Asn Val Asp Val Gly Met 2675 2680 2685 Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr Leu Ser Ala Asn 2690 2695 2700 Giu Thr Tyr Gly Gly Asn Ala Thr Ser Ile Asn Ser Ser Asn Ser 
Leu 2705 2710 2715 2720
Leu Ser Val Leu Asn Gln Arg Tyr Asn Tyr Asn Thr Trp Thr Thr Ser
2725 2730 2735
Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys Gln Lys Ser Val
2740 2745 2750
Val Leu Lys Pro Gln Val Gly Leu Ser Tyr His Phe Ile Gly Leu Ser
2755 2760 2765
Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gln Phe Leu Met His Ser
2770 2775 2780
Asn Pro Ser Asn Glu Ser Val Leu Thr Leu Asn Met Gly Leu Glu Ser
2785 2790 2795 2800
Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val Thr Ala Arg Leu
2805 2810 2815
Gly Arg Asp Leu Leu Ile Lys Ser Lys Gly Ser Asn Thr Val Arg Phe
2820 2825 2830
Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu Val Phe Asn Thr
2835 2840 2845
Phe Ala Ser Val Ile Thr Gly Gly Glu Met His Leu Trp Arg Leu Val
2850 2855 2860
Tyr Val Asn Ala Gly Val Gly Leu Lys Met Gly Leu Gln Tyr Gln Asp
2865 2870 2875 2880
Ile Asn Ile Thr Gly Asn Val Gly Met Arg Val Ala Phe
2885 2890

INFORMATION FOR SEQ ID NO:167:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 1376 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
NAME/KEY: Coding Sequence
LOCATION: 13...1338
OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:167:
TGGTAGTTAA GA ATG GGT AAT CAT TTT TCT AAA TTA GGA TTT GTT TTA GCC
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala
GCA TTA Ala Leu GGA AGC GCG ATA Gly Ser Ala Ile
GGT
Gly 20 TTA GGG CAT ATC TGG CGT TTC CCC TAC Leu Gly His Ile Trp Arg Phe Pro Tyr
ATG
Met ACT GGG GTG AGT Thr Gly Val Ser
GGT
Gly 35 GGG GGT GCT TTT Gly Gly Ala Phe GTT TTA Val Leu TTG TTT TTA Leu Phe Leu TTA TCT TTA AGC Leu Ser Leu Ser GGC GCG GCG ATG TTT ATC GCT GAA ATG Gly Ala Ala Met Phe Ile Ala Glu Met CTA TTA Leu Leu GGA CAA AGC Gly Gln Ser ATT AAC CCC Ile Asn Pro
ACT
Thr CAA AAA AAT GTA Gin Lys Asn Val
ACA
Thr 70 GAA GCT TTT AAA Glu Ala Phe Lys GAG CTT GAC Glu Leu Asp CTT GTT TCT Leu Val Ser 243 291 AAA AAA CGC TGG Lys Lys Arg Trp TAC GCA GGG CTT Tyr Ala Gly Leu
TTG
Leu GGG CCA Gly Pro TTA ATA CTG ACT Leu Ile Leu Thr
TTT
Phe 100 TAC GGC ACG ATT TTA GGT TGG GTG CTT Tyr Gly Thr Ile Leu Gly Trp Val Leu 339 387 TAT Tyr 110 TAT TTG GTG AGT GTT AGT TTT AAT TTG Tyr Leu Val Ser Val Ser Phe Asn Leu 115 CCT AAC Pro Asn 120 AAT ATC CAA Asn Ile Gin WO 98/21225 WO 9821225PCTIUS97/2 1353 -338- TCT GAA CAA ATT TTT ACT -CAA ACT TTG CAG TCT ATA GGG CTA Ser Glu Gin Ile Phe Thr Gin Thr Leu Gin Ser Ile Gly Leu 130 135 CAA TCC Gin Ser 140 435 ATA GGG CTT Ile Gly Leu GGG ATT AAA Giy Ile Lys 160
TTT
Phe 145 AGC GTT TTA TTG Ser Val Leu Leu
ATA
Ile 150 ACC GGA TGG ATT Thr Gly Trp Ile GTT TCT AGG Vai Ser Arg 155 ATG CCC TTA Met Pro Leu 483 531 GAA GGC ATT GAA Giu Gly Ile Giu CTC AAT TTG GTT Leu Asn Leu Val
TTA
Leu 170 CTC TTT Leu Phe 175 GCT ACT TTT TTT Ala Thr Phe Phe TTG CTT TTC TAT Leu Leu Phe Tyr
GCG
Al a 185 ATG AGC ATG GAT Met Ser Met Asp 579 627
TCT
Ser 190 TTT TCT AAA GCT TTT CAT TTC ATG TTT Phe Ser Lys Ala Phe His Phe Met Phe 195 TTC AAA CCA AAA Phe Lys Pro Lys
GAT
Asp 205 TTG ACC TCT CAA Leu Thr Ser Gin
GTG
Val 210 TTC ACT TAT TCC Phe Thr Tyr Ser
TTG
Leu 215 GGG CAG GTT TTC Gly Gin Val Phe TTT TCC Phe Ser 220 675 723 TTA AGC ATC Leu Ser Ile AAA ACG CAG Lys Thr Gin 240
GGT
Gly 225 TTA GGG ATC AAT Leu Gly Ile Asn
ATC
Ile 230 ACT TAC GCT GCG GTT ACG GAT Thr Tyr Ala Ala Val Thr Asp 235 AAT TTG CTT AAA Asn Leu Leu Lys ACT ATT TGG GTG Thr Ile Trp Val
GTT
Val1 250 TTA TCA GGA Leu Ser Gly ATT CTA Ile Leu 255 ATT TCT CTT GTG Ile Ser Leu Val GGA CTT ATG ATT Gly Leu Met Ile
TTC
Phe 265 ACT TTT GTG TTT Thr Phe Val Phe 819 867 915
GAA
Giu 270
TTA
Leu TAT GGG GCG AAT Tyr Gly Ala Asn
GTC
Val 275
GGC
Gly TCA CAA GGC ACA Ser Gin Gly Thr TTA ATC TTC ACT Leu Ile Phe Thr CCG GTG GTT Pro Vai Val
TTT
Phe 290 CAA ATG GGA Gin Met Giy
GCG
Al a 295 GGC ATT CTT Gly Ile Leu ATT CTT TTC- Ile Leu Phe GCT TTA TTG Ala Leu Leu 320
TTG
Leu 305 CTC GCG CTC GCT Leu Ala Leu Ala
TTT
Phe 310 GCT GGC ATC ACT Ala Gly Ile Thr TCT ACG GTG Ser Thr Val 315 TAT CAA TAC Tyr Gin Tyr 963 1011 GAG CCA AGC GTG Giu Pro Ser Val TAT CTT ACC GAA Tyr Leu Thr Glu
AGG
Arg 330 TCT CGT Ser Arg 335 TTT AAG GTT ACT Phe Lys Val Thr GGT CTT GTA GCA Gly Leu Val Ala
CTA
Leu 345 ATT TTT GTG GTA Ile Phe Val Val 1059 WO 98/21225 WO 98/ 1225PCTfUS97/2 1353 -339-
GGC
Gly 350 GTG GTG TTi.- ATT TTC TCG CTC CAT -AAG Val Val Leu Ile Phe Ser Leu His Lys 355
GAT
Asp 360 TAT AAA GAT TAT Tyr Lys Asp Tyr
CTC
Leu 365 1107 ACT TTC-TTT GAA Thr Phe Phe Glu ACT CTT TTT GAT Ser Leu Phe Asp TTG GAT TTT GCA Leu Asp Phe Ala TCA AGC Ser Ser 380 1155 ACC ATT ATC Thr Ile Ile TGG GTT TTG Trp, Val Leu 400
ATG
Met 385 CCT TTA GGC GGG Pro Leu Gly Gly GCA ACC TTT ATT Ala Thr Phe Ile TTT ATO GGT Phe Met Gly 395 1203 AAA AAA GAA AAA Lys Lys Glu Lys
TTG
Leu 405 CGT CTT TTG AGC GTG CAC TTT TTA Arg Leu Leu Ser Val His Phe Leu 410 1251 GGC CCT Gly Pro 415 AAA TTG TTT GCA Lys Leu Phe Ala
ACT
Thr 420 TGG TAT TTC TTG CTT AAA TAT ATC ACC Trp Tyr Phe Leu Leu Lys Tyr Ile Thr 42S 1299 TTA ATT GTG TTT Leu Ile Val Phe
TCC
Ser 435 ATT TGG TTG AGC Ile Trp Leu Ser AAG ATT TAT Lys Ile Tyr 440 TAAAATATTT GG 1350 CATGGGAAAA TTTTCTAAA.T TAGGCT INFORMATION FOR SEQ ID NO:i68: SEQUENCE CHARACTERISTICS: LENGTH: 442 amino acids TYPE: amino acid STRANqDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:i68: 1376 Met 1 Ser Val1 Gly Asn His Ala Ile Gly Ser Gly Gly Ser Lys Leu Gly Vai Leu Ala Ala Leu Gly Gly His Ile Phe Pro Tyr Met Thr Gly Leu Ser Leu Giy Gin Ser Gly Ala Phe Gly Val 40 Ile Leu Phe Leu Phe Leu Ser Val Thr Gin Ala Ala Met Ala Giu Met Leu Lys Asn Val Lys Ile Val1 Thr 70 Tyr Ala Phe Lys Glu Leu Leu Asp Ile Asn Lys Arg Trp Leu Thr Phe 100 Ser Val Ser Ala Gly Leu Val Ser Gly Pro Leu Gly Thr Ile Leu Trp, Val Leu Phe Asn Leu 11S Thr Pro 120 Ser Asn Asn Ile Gin Ile Gly Leu Gin 140 Glu 125 Ser Tyr Tyr Leu 110 Ser Glu Gin Ile Gly Leu Ile Phe 130 Gin Thr Leu WO 98/21225 WO 9821225PCTfUS97/21353 -340- Phe 145 Glu Thr Lys Gin Gly 225 Asn Ser Al a Val Leu 305 Glu Lys Leu Glu Met 385 Lys Leu Val1 Val Ile Phe Phe 195 Phe Gly Leu Val Val 275 Gly Ala Ser Thr Phe 355 Ser Leu Giu Ala Ser 435 Leu ieu ie-Thr Gly Trp Ile Val 150 Glu Gly 180 His Thr Ile Lys Al a 260 Ser Gin Leu Val Trp 340 Ser Leu Gly Lys Thr 420 Ile Leu Leu Met Ser Ile 230 Thr Leu Gly G ly Phe 310 Tyr Leu His Asp Met 390 Arg Tyr Leu As n Phe Phe Leu 215 Thr Ile Met Thr Al a 295 Ala Leu Val1 Lys Trp 375 Al a Leu Phe Ser Leu Tyr Asp 200 Gly Tyr Trp Ile Gly 280 Ile Giy Thr Ala Asp 360 Leu Thr Leu Leu Lys 440 155 Met Ser Pro Phe Val 235 Leu Phe Phe Leu Ser 315 Tyr Phe Asp Al a Phe 395 His Tyr Ser Pro Met Lys Phe 220 Thr Ser Val1 Thr Val1 300 Thr Gin Val Tyr Ser 380 Met Phe Ile Arg Leu Asp Asp 205 Ser Asp Gly Phe Ser 285 Ser Val1 Tyr Val1 Leu 365 Ser Gly Leu Thr Gly Leu Ser 190 Leu Leu Lys Ile Glu 270 Leu Ile Al a Ser Gly 350 Thr Thr Trp Gly Pro 430 Lys 160 Ala Ser Ser Ile Gin 240 Ile Gly Val1 Phe Leu 320 Phe Val1 Phe Ile Leu 400 Lys Ile INFORMATION FOR SEQ ID NO:169: Ci) SEQUENCE CHARACTERISTICS: 
LENGTH: 1392 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: .1356 OTHER INFORMATION: WO 98/21225 PCT/US97/21353 -341- (xi) SEQUENCE DESCRIPTION: SEQ ID NO:169: TTTAAAAGGT ATTTTATAAC G ATG AAA ATT TTT GGG ACT GAT GGC GTG AGG Met Lys Ile Phe Gly Thr Asp Gly Val Arg GGT AAA GCA GGG Gly Lys Ala Gly AAA CTC ACC CCC ATG TTT GTG ATG CGT Lys Leu Thr Pro Met Phe Val Met Arg TTA GGC Leu Gly ATT GCT GCC Ile Ala Ala CTA ATC GGT Leu Ile Gly TTG TAT TTT AAA Leu Tyr Phe Lys AAA CAT TCT CAA ACG AAT AAA ATT Lys His Ser Gin Thr Asn Lys Ile AGC GGC TAT ATG GTA GAA AAC GCT Ser Gly Tyr Met Val Glu Asn Ala AAA GAC ACC AGA Lys Asp Thr Arg
AAA
Lys 50 TTA GTG Leu Val AGC GCT CTA ACT Ser Ala Leu Thr ATA GGC TAT AAT GTG ATT CAA ATA GGG Ile Gly Tyr Asn Val Ile Gin Ile Gly 243 ATG CCC ACC CCT Met Pro Thr Pro ATT GCG TTT TTA Ile Ala Phe Leu GAA GAC ATG CGC Glu Asp Met Arg GAT GCG GGT ATT Asp Ala Gly Ile ATA AGC GCG AGC Ile Ser Ala Ser AAC CCT TTT GAA Asn Pro Phe Glu GAT AAT Asp Asn 105 GGC ATT AAG Gly Ile Lys GAA AAA GCG Glu Lys Ala 125
TTT
Phe 110 TTC AAT TCT TAT Phe Asn Ser Tyr
GGC
Gly 115 TAT AAG CTT AAA Tyr Lys Leu Lys GAA GAA GAA Glu Glu Glu 120 CTG CAT TCT Leu His Ser ATT GAA GAA ATC Ile Glu Glu Ile
TTT
Phe 130 CAT GAT GAA GAA His Asp Glu Glu AGC TAT Ser Tyr 140 AAA GTG GGT GAG AGC GTC GGT AGC GCT AAA AGG ATA GAC GAT Lys Val Gly Glu Ser Val Gly Ser Ala Lys Arg Ile Asp Asp
GTC
Val 155 ATA GGG CGC TAT Ile Gly Arg Tyr
ATT
Ile 160 GCA CAT TTA AAA Ala His Leu Lys TCT TTC CCC AAA Ser Phe Pro Lys
CAT
His 170 531 579 TTG AAT TTA CAG Leu Asn Leu Gin
AGT
Ser 175 TTA AGG ATC Leu Arg Ile GTG CTA Val Leu 180 TTT AGC Phe Ser 195 GAT ACG GCT AAT Asp Thr Ala Asn GGC GCG Gly Ala 185 GCT TAT AAG GTG GCT CCG GTC GTT Ala Tyr Lys Val Ala Pro Val Val GAG CTT GGG Glu Leu Gly GCT GAT GTG Ala Asp Val 200 WO 98/21225 WO 9821225PCTIUS97/21353 -342- TTA GTG ATT AAT GAT GAG -CCT AAC GGG TGT AAC ATT AAT GAT CAA TGC Leu Val Ile Asn Asp Giu Pro Asn Gly Cys Asn Ile Asn Asp Gin Cys GGG GCT Gly Ala 220 TTA CAC CCC AAC Leu His Pro Asn
CAA
Gin 225 TTA AGC CAG GAA Leu Ser Gin Giu AAA AAA TAC CGC Lys Lys Tyr Arg
GCA
Aia 23S GAT TTA GGC TTT Asp Leu Gly Phe TTT GAT GGC GAT Phe Asp Gly Asp
GCT
Ala 245 GAC AGG CTA GTG Asp Arg Leu Val
GTG
Val 250 723 771 819 GTG GAT AAT TTA Vai Asp Asn Leu
GGG
Giy 255 AAT ATC GTG CAT Asn Ile Val His GAT AAG CTT TTA Asp Lys Leu Leu GGG GTG Gly Val 265 TTA GGG OTT Leu Giy Vai GTC GCC ACA Val Ala Thr 285
TAT
Tyr 270 CAA AAA TCT AAA Gin Lys Ser Lys
AAC
As n 275 GCC CTT TCT TCT Ala Leu Ser Ser CAA GCG GTT Gin Aia Val 280 TTA AAA TCC Leu Lys Ser AAC ATG AGC AAT Asn Met Ser Asn
TTA
Leu 290 GCC CTT AAA GAA Ala Leu Lys Glu
TAT
Tyr 295 CAA GAT Gin Asp 300 TTG GAA TTG AAG Leu Giu Leu Lys TGC GCG ATT GGG Cys Ala Ile Gly AAG TTT GTG AGC Lys Phe Vai Ser
GAA
Giu 315
CAT
His 963 1011 1059 TGC ATG CAA TTG Cys Met Gin Leu
AAT
Asn 320
GAT
Asp AAA GCC A.AT TTT Lys Ala Asn Phe
GGA
Gly 325
GC
Gly GGC GAG CAA AGC Gly Giu Gin Ser
GGG
Gly 330
TGC
Cys ATC ATT TTT Ile Ile Phe TAC GCT AAA Tyr Ala Lys
ACA
Thr 340 GAT GGT TTG Asp Giy Leu
GTG
Vai 345 OCT TTG CAA Ala Leu Gin GTT GCG TTA Val Ala Leu 365 AGC GCG TTA GTG Ser Ala Leu Val
TTA
Leu 355 GAA AGC AAG CAG Giu Ser Lys Gin GTA AGC TCT Vai Ser Ser 360 GTG AAT TTG Vai Asn Leu 1107 1155 AAC CCC TTT GAA Asn Pro Phe Giu
TTA
Leu 370 TAC CCC CAA AGC Tyr Pro Gln Ser AAT GTC Asn Vai 380 CAA AAA AAG CCC Gin Lys Lys Pro
CCT
Pro 385 TTA GAA AGC CTG Leu Giu Ser Leu GOT TAT AGC GCT Gly Tyr Ser Ala 1203 CTT TTA AAA GAA TTA GAC AAG CTA GAA ATC Leu Leu Lys Giu Leu Asp Lys Leu Giu Ile 395 400 AGC GGC ACT GAA AAC AAA TTG CGA ATC CTT Ser Gly Thr Giu Asn Lys Leu Arg Ile Leu
CC
Arg 405 CAT TTG ATC CGT His Leu Ile Arg 1251 1299 TTA GAA GCT AAA Leu Giu Ala Lys OAT GAA Asp Giu 425 WO 98/21225 PCT/US97/21353 -343- AAG CTT TTA GAA TCC AAA-ATG CAA GAA TTA AAA GAG TTT TTT GAA GGG 1347 Lys Leu Leu Glu Ser Lys Met Gin Glu Leu Lys Glu Phe Phe Glu Gly 430 435 440 CAT TTG TGC TAAAAACCAC TAAAAAAAGC CTGTTGGTTT TTATGG 1392 His Leu Cys 445 INFORMATION FOR SEQ ID NO:170: SEQUENCE CHARACTERISTICS: LENGTH: 445 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:170: Met Lys Ile Phe Gly Thr Asp Gly Val Arg Gly Lys Ala Gly Val Lys 1 5 10 Leu Thr Pro Met Phe Val Met Arg Leu Gly Ile Ala Ala Gly Leu Tyr 25 Phe Lys Lys His Ser Gin Thr Asn Lys Ile Leu Ile Gly Lys Asp Thr 40 Arg Lys Ser Gly Tyr Met Val Glu Asn Ala Leu Val Ser Ala Leu Thr 55 Ser Ile Gly Tyr Asn Val Ile Gin Ile Gly Pro Met Pro Thr Pro Ala 70 75 Ile Ala Phe Leu Thr Glu Asp Met Arg Cys Asp Ala Gly Ile Met Ile 90 Ser Ala Ser His Asn Pro Phe Glu Asp Asn Gly Ile Lys Phe Phe Asn 100 105 110 Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu Glu Lys Ala Ile Glu Glu 115 120 125 Ile Phe His Asp Glu Glu Leu Leu His Ser Ser Tyr Lys Val Gly Glu 130 135 140 Ser Val Gly Ser Ala Lys Arg Ile Asp Asp Val Ile Gly Arg Tyr Ile 145 150 155 160 Ala His Leu Lys His Ser Phe Pro Lys His Leu Asn Leu Gin Ser Leu 165 170 175 Arg Ile Val Leu Asp Thr Ala Asn Gly Ala Ala Tyr Lys Val Ala Pro 180 185 190 Val Val Phe Ser Glu Leu Gly Ala Asp Val Leu Val Ile Asn Asp Glu 195 200 205 Pro Asn Gly Cys Asn Ile Asn Asp Gin Cys Gly Ala Leu His Pro Asn 210 215 220 Gin Leu Ser Gin Glu Val Lys Lys Tyr Arg Ala Asp Leu Gly Phe Ala 225 230 235 240 Phe Asp Gly Asp Ala Asp Arg Leu Val Val Val Asp Asn Leu Gly Asn 245 250 255 Ile Val His Gly Asp Lys Leu Leu Gly Val Leu Gly Val Tyr Gin Lys 260 265 270 WO 98/21225 PCT/US97/21353 -344- Ser Lys Asn Ala Leu Ser-Ser Gin Ala Val Val Ala Thr Asn Met Ser 275 280 285 Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser Gin Asp Leu Glu Leu Lys 290 295 300 
His Cys Ala Ile Gly Asp Lys Phe Val Ser Glu Cys Met Gln Leu Asn
305 310 315 320
Lys Ala Asn Phe Gly Gly Glu Gln Ser Gly His Ile Ile Phe Ser Asp
325 330 335
Tyr Ala Lys Thr Gly Asp Gly Leu Val Cys Ala Leu Gln Val Ser Ala
340 345 350
Leu Val Leu Glu Ser Lys Gln Val Ser Ser Val Ala Leu Asn Pro Phe
355 360 365
Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu Asn Val Gln Lys Lys Pro
370 375 380
Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala Leu Leu Lys Glu Leu Asp
385 390 395 400
Lys Leu Glu Ile Arg His Leu Ile Arg Tyr Ser Gly Thr Glu Asn Lys
405 410 415
Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu Lys Leu Leu Glu Ser Lys
420 425 430
Met Gln Glu Leu Lys Glu Phe Phe Glu Gly His Leu Cys
435 440 445

INFORMATION FOR SEQ ID NO:171:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:171:
GCCGGATCCA TGACTTATGG GTATGGGGAA 30

INFORMATION FOR SEQ ID NO:172:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 34 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:172:
GCCCTCGAGA CTTTTATTGA TTCACCATTT CATT 34

INFORMATION FOR SEQ ID NO:173:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:173:
GCCGGATCCA TCGCTGAAGA AAATGGGGCG 30

INFORMATION FOR SEQ ID NO:174:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:174:
GCCCGGCCGC CCTAAAAACT ATAAACATAA CTC 33

INFORMATION FOR SEQ ID NO:175:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:175:
GCCGGATCCG GTATTAGGAA GCTTATACCA TC 32

INFORMATION FOR SEQ ID NO:176:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 35 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:176:
GCCCTCGAGA AGTTCTATTT TTAATTCCTT GAGAG 35

INFORMATION FOR SEQ ID NO:177:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 36 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:177:
GCCGGATCCT CTGATAGCCA TAAAGAAAAA AAGGAC 36

INFORMATION FOR SEQ ID NO:178:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 34 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:178:
GCCCTCGAGA TCTTTAGAAA TCAACCCCCA AAGC 34

INFORMATION FOR SEQ ID NO:179:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:179:
GCCGGATCCG ACTTAGAACA TTTTAACACG CTC 33

INFORMATION FOR SEQ ID NO:180:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:180:
GCCCTCGAGT CATTTTAAAC GACTCAAAAC AAA 33

INFORMATION FOR SEQ ID NO:181:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:181:
GCCGGATCCG GCCAAAGCGT GCGCACTTAT TGG 33

INFORMATION FOR SEQ ID NO:182:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:182:
GCCCTCGAGT TATTGTTCCA ACCCCCACGC ATC 33

INFORMATION FOR SEQ ID NO:183:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:183:
GCCGGATCCA AGAGCAATGC TGATGACAAA CC 32

INFORMATION FOR SEQ ID NO:184:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:184:
GCCCTCGAGT TATGAGTTAA AGCCCCTTGT CC 32

INFORMATION FOR SEQ ID NO:185:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:185:
GCCGGATCCG AATCAGTAAA AACAGGAAAA AC 32

INFORMATION FOR SEQ ID NO:186:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:186:
GCCCTCGAGC GGCTCTTTGG AGTTTTATTG 30

INFORMATION FOR SEQ ID NO:187:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 31 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:187:
GCCGGATCCA TCATTCCCTC TCGCTCTATG G 31

INFORMATION FOR SEQ ID NO:188:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:188:
GCCCTCGAGA CCTTAATGCG TTGCGTTTTC TTT 33

INFORMATION FOR SEQ ID NO:189:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 36 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:189:
GCCGAGCTCC AAGCAAAAAA ATGTCAATTA AAAGGG 36

INFORMATION FOR SEQ ID NO:190:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:190:
GCCCTCGAGG TCTAAATTAG AATAAGTGTT GTT 33
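As a cross-check on the primer entries SEQ ID NO:171 to SEQ ID NO:190, the following is a minimal sketch, not part of the specification, that compares each declared LENGTH with the number of bases actually listed and reports which primers contain a GGATCC (BamHI) or CTCGAG (XhoI) recognition sequence. The names `PRIMERS`, `SITES`, `bases`, `length_mismatches` and `sites_in` are illustrative assumptions; only the sequences and declared lengths are taken from the listing, and the specification does not state here which enzymes were intended.

```python
# Illustrative sketch only; all names below are hypothetical helpers.
# SEQ ID NO -> (sequence as printed in 10-base groups, declared LENGTH).
PRIMERS = {
    171: ("GCCGGATCCA TGACTTATGG GTATGGGGAA", 30),
    172: ("GCCCTCGAGA CTTTTATTGA TTCACCATTT CATT", 34),
    173: ("GCCGGATCCA TCGCTGAAGA AAATGGGGCG", 30),
    174: ("GCCCGGCCGC CCTAAAAACT ATAAACATAA CTC", 33),
    175: ("GCCGGATCCG GTATTAGGAA GCTTATACCA TC", 32),
    176: ("GCCCTCGAGA AGTTCTATTT TTAATTCCTT GAGAG", 35),
    177: ("GCCGGATCCT CTGATAGCCA TAAAGAAAAA AAGGAC", 36),
    178: ("GCCCTCGAGA TCTTTAGAAA TCAACCCCCA AAGC", 34),
    179: ("GCCGGATCCG ACTTAGAACA TTTTAACACG CTC", 33),
    180: ("GCCCTCGAGT CATTTTAAAC GACTCAAAAC AAA", 33),
    181: ("GCCGGATCCG GCCAAAGCGT GCGCACTTAT TGG", 33),
    182: ("GCCCTCGAGT TATTGTTCCA ACCCCCACGC ATC", 33),
    183: ("GCCGGATCCA AGAGCAATGC TGATGACAAA CC", 32),
    184: ("GCCCTCGAGT TATGAGTTAA AGCCCCTTGT CC", 32),
    185: ("GCCGGATCCG AATCAGTAAA AACAGGAAAA AC", 32),
    186: ("GCCCTCGAGC GGCTCTTTGG AGTTTTATTG", 30),
    187: ("GCCGGATCCA TCATTCCCTC TCGCTCTATG G", 31),
    188: ("GCCCTCGAGA CCTTAATGCG TTGCGTTTTC TTT", 33),
    189: ("GCCGAGCTCC AAGCAAAAAA ATGTCAATTA AAAGGG", 36),
    190: ("GCCCTCGAGG TCTAAATTAG AATAAGTGTT GTT", 33),
}

# Recognition sequences of two common restriction enzymes.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def bases(printed: str) -> str:
    """Remove the grouping spaces used in the printed listing."""
    return printed.replace(" ", "")


def length_mismatches(primers):
    """Return the SEQ ID NOs whose base count differs from the declared LENGTH."""
    return [n for n, (seq, declared) in primers.items()
            if len(bases(seq)) != declared]


def sites_in(seq_id, primers=PRIMERS):
    """Return the names of the sites in SITES that occur in the given primer."""
    seq = bases(primers[seq_id][0])
    return [name for name, site in SITES.items() if site in seq]


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(PRIMERS))
    for n in sorted(PRIMERS):
        print(n, sites_in(n))
```

With the entries as listed, `length_mismatches(PRIMERS)` returns an empty list, so every declared LENGTH agrees with its sequence. Primers that contain neither of the two sites in `SITES` (SEQ ID NO:174 and SEQ ID NO:189) are reported with an empty list.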
Claims (27)
1. An isolated polynucleotide that encodes: (i) a polypeptide comprising an amino acid sequence that is homologous to the amino acid sequence of a Helicobacter polypeptide, wherein said amino acid sequence of said Helicobacter polypeptide is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 100 SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (GHPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of said polypeptide encoded by said polynucleotide.
2. The isolated polynucleotide of claim 1, which encodes a mature form of said polypeptide.
3. The isolated polynucleotide of claim 1 or 2, wherein the polynucleotide is a DNA molecule.
4. A compound, in a substantially purified form, that is the mature form or a derivative of a polypeptide comprising an amino acid sequence that is homologous to a Helicobacter amino acid sequence that is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 150 SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 100 SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of said polypeptide.
5. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a compound of claim 4.
6. The method of claim 5, further comprising administering to said mammal an antibiotic, an antisecretory agent, a bismuth salt, or a combination thereof.
7. The method of claim 6, wherein said antibiotic is selected from the group consisting of amoxicillin, clarithromycin, tetracycline, metronidazole, and erythromycin, and said bismuth salt is selected from the group consisting of bismuth subcitrate and bismuth subsalicylate.
8. The method of claim 6 or claim 7, wherein said antisecretory agent is a proton pump inhibitor, an H2-receptor antagonist, or a prostaglandin analog.
9. The method of claim 8, wherein said proton pump inhibitor is selected from the group consisting of omeprazole, lansoprazole, and pantoprazole; said H2-receptor antagonist is selected from the group consisting of ranitidine, cimetidine, famotidine, nizatidine, and roxatidine; and said prostaglandin analog is selected from the group consisting of misoprostol and enprostil.
10. The method of any one of claims 5 to 9, further comprising administering to said mammal a prophylactically or therapeutically effective amount of a second Helicobacter polypeptide or a derivative thereof.
11. The method of claim 10, wherein the second Helicobacter polypeptide is a Helicobacter urease or a subunit derivative thereof.
12. A composition comprising a compound of claim 4, together with a physiologically acceptable diluent or carrier.
13. The composition of claim 12, further comprising an adjuvant.
14. The composition of claim 12 or claim 13, further comprising a second Helicobacter polypeptide or a derivative thereof.
15. The composition of claim 14, wherein the second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
16. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a polynucleotide of any one of claims 1 to 3.
17. A composition comprising a viral vector, in the genome of which is inserted a DNA molecule of claim 1 or claim 2, said DNA molecule being placed under conditions for expression in a mammalian cell and said viral vector being admixed with a physiologically acceptable diluent or carrier.
18. A composition that comprises a bacterial vector comprising a DNA molecule of claim 1 or claim 2, said DNA molecule being placed under conditions for expression and said bacterial vector being admixed with a physiologically acceptable diluent or carrier.
19. The composition of claim 18, wherein said vector is selected from the group consisting of Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin, and Streptococcus.
20. A composition comprising a polynucleotide of any one of claims 1 to 3, together with a physiologically acceptable diluent or carrier.
21. The composition of claim 20, wherein the polynucleotide is a DNA molecule that is inserted in a plasmid that is unable to replicate and to substantially integrate in a mammalian genome and is placed under conditions for expression in a mammalian cell.
22. An expression cassette comprising a DNA molecule of any one of claims 1 to 3, said DNA molecule being placed under conditions for expression in a procaryotic or eucaryotic cell.
23. A process for producing a compound of claim 4, which comprises culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette of claim 22, and recovering said compound from the cell culture.
24. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of an antibody that binds to the compound of claim 4.
25. Use of a compound of claim 4 for the manufacture of a medicament for the prevention or treatment of Helicobacter infection in a mammal.
26. Use according to claim 25, in which the medicament additionally comprises an antibiotic, an antisecretory agent, a bismuth salt, or a combination thereof.
27. Use according to claim 26, in which the antibiotic is selected from the group consisting of amoxicillin, clarithromycin, tetracycline, metronidazole, and erythromycin, and said bismuth salt is selected from the group consisting of bismuth subcitrate and bismuth subsalicylate.
28. Use according to claim 26, in which the antisecretory agent is a proton pump inhibitor, an H2-receptor antagonist, or a prostaglandin analog.
29. Use according to claim 28, in which the proton pump inhibitor is selected from the group consisting of omeprazole, lansoprazole, and pantoprazole; said H2-receptor antagonist is selected from the group consisting of ranitidine, cimetidine, famotidine, nizatidine, and roxatidine; and said prostaglandin analog is selected from the group consisting of misoprostol and enprostil.
30. Use according to any one of claims 25 to 29, in which the medicament additionally comprises a second Helicobacter polypeptide or a derivative thereof.
31. Use according to claim 30, in which the second Helicobacter polypeptide is a Helicobacter urease, or a subunit or a derivative thereof.
32. Use of a polynucleotide according to any one of claims 1 to 3 for the manufacture of a medicament for the prevention or treatment of Helicobacter infection in a mammal.
33. An isolated polynucleotide according to claim 1, substantially as herein described with reference to the examples and drawings.
34. A compound according to claim 4, substantially as herein described with reference to the examples and drawings.
DATED this 7th day of December 1999
MERIEUX ORAVAX SOCIETE EN NOM COLLECTIF PASTEUR MERIEUX SERUMS ET VACCINS S.A.; MAX-PLANCK GESELLSCHAFT ZUR FORDERUNG DER WISSENSCHAFTEN E.V. BERLIN; and HUMAN GENOME SCIENCES, INC.
By Their Patent Attorneys: GRIFFITH HACK, Fellows Institute of Patent Attorneys of Australia
Applications Claiming Priority (13)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US74905196A | 1996-11-14 | 1996-11-14 | |
| US08/749051 | 1996-11-14 | ||
| US83345797A | 1997-04-01 | 1997-04-01 | |
| US83130997A | 1997-04-01 | 1997-04-01 | |
| US08/834,705 US20030023066A1 (en) | 1996-11-14 | 1997-04-01 | Helicobacter polypeptides and corresponding polynucleotide molecules |
| US08/831309 | 1997-04-01 | ||
| US08/834705 | 1997-04-01 | ||
| US08/833457 | 1997-04-01 | ||
| US88122797A | 1997-06-24 | 1997-06-24 | |
| US08/881227 | 1997-06-24 | ||
| US90261597A | 1997-07-29 | 1997-07-29 | |
| US08/902615 | 1997-07-29 | ||
| PCT/US1997/021353 WO1998021225A1 (en) | 1996-11-14 | 1997-11-14 | Helicobacter polypeptides and corresponding polynucleotide molecules |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| AU5266298A (en) | 1998-06-03 |
| AU735391B2 (en) | 2001-07-05 |
Family
ID=27560272
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU52662/98A Ceased AU735391B2 (en) | 1996-11-14 | 1997-11-14 | Helicobacter polypeptides and corresponding polynucleotide molecules |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP1021458A4 (en) |
| AU (1) | AU735391B2 (en) |
| CA (1) | CA2271774A1 (en) |
| WO (1) | WO1998021225A1 (en) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6503747B2 (en) * | 1998-07-14 | 2003-01-07 | University Of Hawaii | Serotype-specific probes for Listeria monocytogenes |
| AU3205000A (en) * | 1999-02-19 | 2000-09-04 | Astrazeneca Ab | Expression of helicobacter polypeptides in pichia pastoris |
| AU5399800A (en) * | 1999-05-31 | 2000-12-18 | Creatogen Aktiengesellschaft | Essential gene and gene products for identifying, developing and optimising immunological and pharmacological active ingredients for the treatment of microbial infections |
| WO2002040516A2 (en) * | 2000-11-15 | 2002-05-23 | Ludwig Deml | Helicobacter cysteine rich protein a (hcpa) and uses thereof |
| CN111793137A (en) * | 2019-12-12 | 2020-10-20 | 南京蛋球球生物医学技术合伙企业(有限合伙) | Hp tetravalent antigen and preparation method and application thereof |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB8928625D0 (en) * | 1989-12-19 | 1990-02-21 | 3I Res Expl Ltd | H.pylori dna probes |
| US5733740A (en) * | 1992-10-13 | 1998-03-31 | Vanderbilt University | Taga gene and methods for detecting predisposition to peptic ulceration and gastric carcinoma |
| AU723063B2 (en) * | 1995-04-28 | 2000-08-17 | Oravax, Inc | Multimeric, recombinant urease vaccine |
| SK165197A3 (en) * | 1995-06-07 | 1999-01-11 | Astra Ab | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
| WO1997019098A1 (en) * | 1995-11-17 | 1997-05-29 | Astra Aktiebolag | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
| TR199801939T2 (en) * | 1996-03-29 | 1999-02-22 | Astra Aktiebolag | Nucleic acid and amino acid sequences and their network compositions related to Helicobacter pylori. |
-
1997
- 1997-11-14 WO PCT/US1997/021353 patent/WO1998021225A1/en not_active Ceased
- 1997-11-14 EP EP97947620A patent/EP1021458A4/en not_active Withdrawn
- 1997-11-14 CA CA002271774A patent/CA2271774A1/en not_active Abandoned
- 1997-11-14 AU AU52662/98A patent/AU735391B2/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| AU5266298A (en) | 1998-06-03 |
| EP1021458A1 (en) | 2000-07-26 |
| WO1998021225A1 (en) | 1998-05-22 |
| EP1021458A4 (en) | 2001-12-12 |
| CA2271774A1 (en) | 1998-05-22 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| KR101078919B1 (en) | Novel Streptococcus antigens | |
| JP3380559B2 (en) | Helicobacter pylori antigen and vaccine composition | |
| US6258359B1 (en) | Immunogenic compositions against helicobacter infection, polypeptides for use in the compositions, and nucleic acid sequences encoding said polypeptides | |
| AU784193B2 (en) | Chlamydia antigens and corresponding DNA fragments and uses thereof | |
| WO2002018595A2 (en) | Moraxella polypeptides and corresponding dna fragments and uses thereof | |
| KR20010005893A (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
| US5837825A (en) | Campylobacter jejuni flagellin/Escherichia coli LT-B fusion protein | |
| JP2000125889A (en) | Protein from actinobacillus pleuropneumoniae | |
| AU734052B2 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof | |
| CZ297698A3 (en) | Nucleic acid and amino acid sequences relating to heliobacter pylori and vaccination composition thereof | |
| AU739641B2 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof | |
| AU735391B2 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
| US20030158396A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
| AU750792B2 (en) | 76 kDa, 32 kDa, and 50 kDa helicobacter polypeptides and corresponding polynucleotide molecules | |
| US20040219585A1 (en) | Nontypeable haemophilus influenzae virulence factors | |
| US20030124141A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
| US20020115078A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
| US20040047871A1 (en) | Recombinant fusobacterium necrophorum leukotoxin vaccine and prepaation thereof | |
| US20030023066A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
| JP2004531235A (en) | STREPTOCOCCUSPYOGENES polypeptides and corresponding DNA fragments | |
| AU2002211201A1 (en) | aopB Gene, protein, homologs, fragments and variants thereof, and their use for cell surface display | |
| WO2002036777A1 (en) | aopB GENE, PROTEIN, HOMOLOGS, FRAGMENTS AND VARIANTS THEREOF, AND THEIR USE FOR CELL SURFACE DISPLAY | |
| US20020160456A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
| US20020026035A1 (en) | Helicobacter ghpo 1360 and ghpo 750 polypeptides and corresponding polynucleotide molecules | |
| US20030069404A1 (en) | Helicobacter antigens and corresponding DNA fragments |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FGA | Letters patent sealed or granted (standard patent) | |