AU729228B2 - Insecticidal protein toxins from photorhabdus - Google Patents
Insecticidal protein toxins from Photorhabdus
- Publication number
- AU729228B2, AU10509/97A
- Authority
- AU
- Australia
- Prior art keywords
- seq
- toxin
- protein
- leu
- dna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
- 241001148062 Photorhabdus Species 0.000 title claims description 122
- 230000000749 insecticidal effect Effects 0.000 title claims description 42
- 231100000654 protein toxin Toxicity 0.000 title claims description 23
- 108700012359 toxins Proteins 0.000 title description 189
- 108090000623 proteins and genes Proteins 0.000 claims description 352
- 102000004169 proteins and genes Human genes 0.000 claims description 226
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 174
- 239000003053 toxin Substances 0.000 claims description 174
- 231100000765 toxin Toxicity 0.000 claims description 172
- 241000238631 Hexapoda Species 0.000 claims description 131
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 67
- 238000000034 method Methods 0.000 claims description 60
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 50
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 30
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 claims description 18
- 239000011159 matrix material Substances 0.000 claims description 16
- 239000002689 soil Substances 0.000 claims description 11
- 230000009261 transgenic effect Effects 0.000 claims description 11
- 229920001184 polypeptide Polymers 0.000 claims description 8
- 239000004480 active ingredient Substances 0.000 claims description 3
- 238000011160 research Methods 0.000 claims description 2
- 235000018102 proteins Nutrition 0.000 description 194
- 108020004414 DNA Proteins 0.000 description 149
- 235000010633 broth Nutrition 0.000 description 116
- 230000000694 effects Effects 0.000 description 109
- 235000001014 amino acid Nutrition 0.000 description 88
- 150000001413 amino acids Chemical class 0.000 description 88
- 229940024606 amino acid Drugs 0.000 description 87
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 79
- 239000013615 primer Substances 0.000 description 76
- 239000012634 fragment Substances 0.000 description 75
- 241000196324 Embryophyta Species 0.000 description 74
- 210000004027 cell Anatomy 0.000 description 61
- 108091005804 Peptidases Proteins 0.000 description 56
- 102000035195 Peptidases Human genes 0.000 description 56
- 239000004365 Protease Substances 0.000 description 56
- 239000000047 product Substances 0.000 description 56
- 238000004458 analytical method Methods 0.000 description 55
- 239000012528 membrane Substances 0.000 description 52
- 238000006243 chemical reaction Methods 0.000 description 51
- 239000000523 sample Substances 0.000 description 51
- 235000019419 proteases Nutrition 0.000 description 50
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 49
- 239000002609 medium Substances 0.000 description 44
- 239000000499 gel Substances 0.000 description 43
- 210000004379 membrane Anatomy 0.000 description 41
- 238000011282 treatment Methods 0.000 description 41
- 241000489976 Diabrotica undecimpunctata howardi Species 0.000 description 40
- 239000000243 solution Substances 0.000 description 40
- 239000000872 buffer Substances 0.000 description 39
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 38
- 108020004705 Codon Proteins 0.000 description 37
- 238000012163 sequencing technique Methods 0.000 description 36
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 35
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 35
- 238000003752 polymerase chain reaction Methods 0.000 description 34
- 239000000203 mixture Substances 0.000 description 33
- 229910001868 water Inorganic materials 0.000 description 32
- 101000702488 Rattus norvegicus High affinity cationic amino acid transporter 1 Proteins 0.000 description 30
- 239000013612 plasmid Substances 0.000 description 30
- 108091034117 Oligonucleotide Proteins 0.000 description 29
- 240000008042 Zea mays Species 0.000 description 28
- 235000005911 diet Nutrition 0.000 description 28
- 230000037213 diet Effects 0.000 description 28
- YNJBWRMUSHSURL-UHFFFAOYSA-N trichloroacetic acid Chemical compound OC(=O)C(Cl)(Cl)Cl YNJBWRMUSHSURL-UHFFFAOYSA-N 0.000 description 28
- 230000012010 growth Effects 0.000 description 27
- QTBSBXVTEAMEQO-UHFFFAOYSA-N acetic acid Substances CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 26
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 26
- 241001148064 Photorhabdus luminescens Species 0.000 description 25
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 24
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 24
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 24
- 230000001580 bacterial effect Effects 0.000 description 24
- 241000894006 Bacteria Species 0.000 description 23
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 23
- 241000244206 Nematoda Species 0.000 description 23
- 238000003556 assay Methods 0.000 description 23
- 239000000463 material Substances 0.000 description 23
- 239000013598 vector Substances 0.000 description 23
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 21
- 230000003321 amplification Effects 0.000 description 21
- 238000004519 manufacturing process Methods 0.000 description 21
- 238000003199 nucleic acid amplification method Methods 0.000 description 21
- 239000012064 sodium phosphate buffer Substances 0.000 description 21
- 230000006378 damage Effects 0.000 description 20
- 230000014509 gene expression Effects 0.000 description 20
- 230000002441 reversible effect Effects 0.000 description 20
- 230000002588 toxic effect Effects 0.000 description 20
- 239000011543 agarose gel Substances 0.000 description 19
- 238000004166 bioassay Methods 0.000 description 19
- 239000011780 sodium chloride Substances 0.000 description 19
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 18
- 238000012360 testing method Methods 0.000 description 18
- 241000588724 Escherichia coli Species 0.000 description 17
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 17
- 108700026244 Open Reading Frames Proteins 0.000 description 17
- 241001147398 Ostrinia nubilalis Species 0.000 description 17
- 108091006629 SLC13A2 Proteins 0.000 description 17
- 238000011534 incubation Methods 0.000 description 17
- 239000008188 pellet Substances 0.000 description 17
- 238000002360 preparation method Methods 0.000 description 17
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 16
- 239000004033 plastic Substances 0.000 description 16
- 229920003023 plastic Polymers 0.000 description 16
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 15
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 15
- 238000007792 addition Methods 0.000 description 15
- 239000003795 chemical substances by application Substances 0.000 description 15
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 14
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 14
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 14
- 238000012512 characterization method Methods 0.000 description 14
- 235000005822 corn Nutrition 0.000 description 14
- 235000013305 food Nutrition 0.000 description 14
- 238000009396 hybridization Methods 0.000 description 14
- 239000002917 insecticide Substances 0.000 description 14
- 238000002955 isolation Methods 0.000 description 14
- 241000894007 species Species 0.000 description 14
- 231100000331 toxic Toxicity 0.000 description 14
- 231100000419 toxicity Toxicity 0.000 description 14
- 230000001988 toxicity Effects 0.000 description 14
- 229920001817 Agar Polymers 0.000 description 13
- 101710182223 Toxin B Proteins 0.000 description 13
- 238000005119 centrifugation Methods 0.000 description 13
- 239000002773 nucleotide Substances 0.000 description 13
- 125000003729 nucleotide group Chemical group 0.000 description 13
- 108010009004 proteose-peptone Proteins 0.000 description 13
- 210000001519 tissue Anatomy 0.000 description 13
- 238000013519 translation Methods 0.000 description 13
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 12
- 229920002684 Sepharose Polymers 0.000 description 12
- 239000008272 agar Substances 0.000 description 12
- 150000007523 nucleic acids Chemical class 0.000 description 12
- 238000000746 purification Methods 0.000 description 12
- 238000012216 screening Methods 0.000 description 12
- 241000701447 unidentified baculovirus Species 0.000 description 12
- 241000255925 Diptera Species 0.000 description 11
- 241001147381 Helicoverpa armigera Species 0.000 description 11
- 206010061217 Infestation Diseases 0.000 description 11
- 101710084578 Short neurotoxin 1 Proteins 0.000 description 11
- 101710182532 Toxin a Proteins 0.000 description 11
- 238000001962 electrophoresis Methods 0.000 description 11
- 239000011521 glass Substances 0.000 description 11
- 108010010803 Gelatin Proteins 0.000 description 10
- 241000258937 Hemiptera Species 0.000 description 10
- 241000257303 Hymenoptera Species 0.000 description 10
- 241001414662 Macrosteles fascifrons Species 0.000 description 10
- ZPHBZEQOLSRPAK-UHFFFAOYSA-N Phosphoramidon Natural products C=1NC2=CC=CC=C2C=1CC(C(O)=O)NC(=O)C(CC(C)C)NP(O)(=O)OC1OC(C)C(O)C(O)C1O ZPHBZEQOLSRPAK-UHFFFAOYSA-N 0.000 description 10
- 241000607757 Xenorhabdus Species 0.000 description 10
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 10
- 239000008273 gelatin Substances 0.000 description 10
- 229920000159 gelatin Polymers 0.000 description 10
- 235000019322 gelatine Nutrition 0.000 description 10
- 235000011852 gelatine desserts Nutrition 0.000 description 10
- 108010050848 glycylleucine Proteins 0.000 description 10
- 230000009036 growth inhibition Effects 0.000 description 10
- 238000002347 injection Methods 0.000 description 10
- 239000007924 injection Substances 0.000 description 10
- PHTQWCKDNZKARW-UHFFFAOYSA-N isoamylol Chemical compound CC(C)CCO PHTQWCKDNZKARW-UHFFFAOYSA-N 0.000 description 10
- 230000001418 larval effect Effects 0.000 description 10
- 235000009973 maize Nutrition 0.000 description 10
- 230000004048 modification Effects 0.000 description 10
- 238000012986 modification Methods 0.000 description 10
- 108010072906 phosphoramidon Proteins 0.000 description 10
- BWSDNRQVTFZQQD-AYVHNPTNSA-N phosphoramidon Chemical compound O([P@@](O)(=O)N[C@H](CC(C)C)C(=O)N[C@H](CC=1[C]2C=CC=CC2=NC=1)C(O)=O)[C@H]1O[C@@H](C)[C@H](O)[C@@H](O)[C@@H]1O BWSDNRQVTFZQQD-AYVHNPTNSA-N 0.000 description 10
- LWIHDJKSTIGBAC-UHFFFAOYSA-K tripotassium phosphate Chemical compound [K+].[K+].[K+].[O-]P([O-])([O-])=O LWIHDJKSTIGBAC-UHFFFAOYSA-K 0.000 description 10
- 229920000936 Agarose Polymers 0.000 description 9
- 241000254173 Coleoptera Species 0.000 description 9
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 9
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 9
- 241000255908 Manduca sexta Species 0.000 description 9
- 235000010419 agar Nutrition 0.000 description 9
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 9
- 230000000295 complement effect Effects 0.000 description 9
- 230000006870 function Effects 0.000 description 9
- 239000003112 inhibitor Substances 0.000 description 9
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 9
- 230000009467 reduction Effects 0.000 description 9
- 108091008146 restriction endonucleases Proteins 0.000 description 9
- LZZYPRNAOMGNLH-UHFFFAOYSA-M Cetrimonium bromide Chemical compound [Br-].CCCCCCCCCCCCCCCC[N+](C)(C)C LZZYPRNAOMGNLH-UHFFFAOYSA-M 0.000 description 8
- 108010067770 Endopeptidase K Proteins 0.000 description 8
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 8
- 108010001267 Protein Subunits Proteins 0.000 description 8
- 102000002067 Protein Subunits Human genes 0.000 description 8
- 241000255588 Tephritidae Species 0.000 description 8
- 101710204001 Zinc metalloprotease Proteins 0.000 description 8
- 230000001413 cellular effect Effects 0.000 description 8
- 238000010276 construction Methods 0.000 description 8
- 238000005516 engineering process Methods 0.000 description 8
- 238000000605 extraction Methods 0.000 description 8
- 230000005714 functional activity Effects 0.000 description 8
- 238000002523 gelfiltration Methods 0.000 description 8
- 238000004128 high performance liquid chromatography Methods 0.000 description 8
- 230000001965 increasing effect Effects 0.000 description 8
- 239000007788 liquid Substances 0.000 description 8
- 108020004999 messenger RNA Proteins 0.000 description 8
- 230000036961 partial effect Effects 0.000 description 8
- 239000006228 supernatant Substances 0.000 description 8
- 238000013518 transcription Methods 0.000 description 8
- 230000035897 transcription Effects 0.000 description 8
- 108091026890 Coding region Proteins 0.000 description 7
- 241000255967 Helicoverpa zea Species 0.000 description 7
- 229910002651 NO3 Inorganic materials 0.000 description 7
- NHNBFGGVMKEFGY-UHFFFAOYSA-N Nitrate Chemical compound [O-][N+]([O-])=O NHNBFGGVMKEFGY-UHFFFAOYSA-N 0.000 description 7
- 239000004677 Nylon Substances 0.000 description 7
- 108010033276 Peptide Fragments Proteins 0.000 description 7
- 102000007079 Peptide Fragments Human genes 0.000 description 7
- 241000700605 Viruses Species 0.000 description 7
- 241000607479 Yersinia pestis Species 0.000 description 7
- DGEZNRSVGBDHLK-UHFFFAOYSA-N [1,10]phenanthroline Chemical compound C1=CN=C2C3=NC=CC=C3C=CC2=C1 DGEZNRSVGBDHLK-UHFFFAOYSA-N 0.000 description 7
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 7
- 235000011130 ammonium sulphate Nutrition 0.000 description 7
- 235000021405 artificial diet Nutrition 0.000 description 7
- 238000011161 development Methods 0.000 description 7
- 230000018109 developmental process Effects 0.000 description 7
- 235000014113 dietary fatty acids Nutrition 0.000 description 7
- 238000010790 dilution Methods 0.000 description 7
- 239000012895 dilution Substances 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 229930195729 fatty acid Natural products 0.000 description 7
- 239000000194 fatty acid Substances 0.000 description 7
- 150000004665 fatty acids Chemical class 0.000 description 7
- 238000000855 fermentation Methods 0.000 description 7
- 230000004151 fermentation Effects 0.000 description 7
- 230000002068 genetic effect Effects 0.000 description 7
- 238000003119 immunoblot Methods 0.000 description 7
- 239000002054 inoculum Substances 0.000 description 7
- 238000007726 management method Methods 0.000 description 7
- 229920001778 nylon Polymers 0.000 description 7
- 230000017854 proteolysis Effects 0.000 description 7
- 238000011218 seed culture Methods 0.000 description 7
- 238000000926 separation method Methods 0.000 description 7
- 238000012546 transfer Methods 0.000 description 7
- 238000011144 upstream manufacturing Methods 0.000 description 7
- 241000256113 Culicidae Species 0.000 description 6
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 6
- 108010085220 Multiprotein Complexes Proteins 0.000 description 6
- 102000007474 Multiprotein Complexes Human genes 0.000 description 6
- 238000002105 Southern blotting Methods 0.000 description 6
- 241001454293 Tetranychus urticae Species 0.000 description 6
- 229960000723 ampicillin Drugs 0.000 description 6
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 6
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 6
- 108010047857 aspartylglycine Proteins 0.000 description 6
- 238000005415 bioluminescence Methods 0.000 description 6
- 230000029918 bioluminescence Effects 0.000 description 6
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 6
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 6
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 6
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 6
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 6
- 230000034994 death Effects 0.000 description 6
- 238000013461 design Methods 0.000 description 6
- 230000029087 digestion Effects 0.000 description 6
- 238000001914 filtration Methods 0.000 description 6
- 230000037406 food intake Effects 0.000 description 6
- 238000001502 gel electrophoresis Methods 0.000 description 6
- 239000001963 growth medium Substances 0.000 description 6
- 238000010438 heat treatment Methods 0.000 description 6
- 230000005764 inhibitory process Effects 0.000 description 6
- 238000011081 inoculation Methods 0.000 description 6
- 108010003700 lysyl aspartic acid Proteins 0.000 description 6
- 239000003550 marker Substances 0.000 description 6
- 230000004899 motility Effects 0.000 description 6
- 230000003287 optical effect Effects 0.000 description 6
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 6
- 239000012465 retentate Substances 0.000 description 6
- 239000007787 solid Substances 0.000 description 6
- 238000001228 spectrum Methods 0.000 description 6
- 108010080629 tryptophan-leucine Proteins 0.000 description 6
- 238000000108 ultra-filtration Methods 0.000 description 6
- 238000005406 washing Methods 0.000 description 6
- 241000566547 Agrotis ipsilon Species 0.000 description 5
- 241000254175 Anthonomus grandis Species 0.000 description 5
- VGRHZPNRCLAHQA-IMJSIDKUSA-N Asp-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O VGRHZPNRCLAHQA-IMJSIDKUSA-N 0.000 description 5
- 108700003860 Bacterial Genes Proteins 0.000 description 5
- 108010077805 Bacterial Proteins Proteins 0.000 description 5
- 102000016938 Catalase Human genes 0.000 description 5
- 108010053835 Catalase Proteins 0.000 description 5
- 108700010070 Codon Usage Proteins 0.000 description 5
- 241001635274 Cydia pomonella Species 0.000 description 5
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 5
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 5
- 102000004190 Enzymes Human genes 0.000 description 5
- 108090000790 Enzymes Proteins 0.000 description 5
- 241000255777 Lepidoptera Species 0.000 description 5
- 241000880493 Leptailurus serval Species 0.000 description 5
- 241000258916 Leptinotarsa decemlineata Species 0.000 description 5
- 238000012408 PCR amplification Methods 0.000 description 5
- 239000002033 PVDF binder Substances 0.000 description 5
- 108700008625 Reporter Genes Proteins 0.000 description 5
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 5
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 5
- 241000256247 Spodoptera exigua Species 0.000 description 5
- 229930006000 Sucrose Natural products 0.000 description 5
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 5
- GXDLGHLJTHMDII-WISUUJSJSA-N Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(O)=O GXDLGHLJTHMDII-WISUUJSJSA-N 0.000 description 5
- 241000255993 Trichoplusia ni Species 0.000 description 5
- 108010087924 alanylproline Proteins 0.000 description 5
- 238000013459 approach Methods 0.000 description 5
- 230000008859 change Effects 0.000 description 5
- 229960005091 chloramphenicol Drugs 0.000 description 5
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 5
- 229940042399 direct acting antivirals protease inhibitors Drugs 0.000 description 5
- VHJLVAABSRFDPM-QWWZWVQMSA-N dithiothreitol Chemical compound SC[C@@H](O)[C@H](O)CS VHJLVAABSRFDPM-QWWZWVQMSA-N 0.000 description 5
- 235000013601 eggs Nutrition 0.000 description 5
- 238000010828 elution Methods 0.000 description 5
- 239000003623 enhancer Substances 0.000 description 5
- 229940088598 enzyme Drugs 0.000 description 5
- 238000005558 fluorometry Methods 0.000 description 5
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 5
- 238000002372 labelling Methods 0.000 description 5
- 108020004707 nucleic acids Proteins 0.000 description 5
- 102000039446 nucleic acids Human genes 0.000 description 5
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 5
- 229920002981 polyvinylidene fluoride Polymers 0.000 description 5
- 229910000160 potassium phosphate Inorganic materials 0.000 description 5
- 235000011009 potassium phosphates Nutrition 0.000 description 5
- 230000002797 proteolythic effect Effects 0.000 description 5
- 230000002829 reductive effect Effects 0.000 description 5
- 238000010186 staining Methods 0.000 description 5
- 239000005720 sucrose Substances 0.000 description 5
- 230000009466 transformation Effects 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- HKJKONMZMPUGHJ-UHFFFAOYSA-N 4-amino-5-hydroxy-3-[(4-nitrophenyl)diazenyl]-6-phenyldiazenylnaphthalene-2,7-disulfonic acid Chemical compound OS(=O)(=O)C1=CC2=CC(S(O)(=O)=O)=C(N=NC=3C=CC=CC=3)C(O)=C2C(N)=C1N=NC1=CC=C([N+]([O-])=O)C=C1 HKJKONMZMPUGHJ-UHFFFAOYSA-N 0.000 description 4
- OPIFSICVWOWJMJ-AEOCFKNESA-N 5-bromo-4-chloro-3-indolyl beta-D-galactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1OC1=CNC2=CC=C(Br)C(Cl)=C12 OPIFSICVWOWJMJ-AEOCFKNESA-N 0.000 description 4
- 241000238876 Acari Species 0.000 description 4
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 4
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 4
- HXWUJJADFMXNKA-BQBZGAKWSA-N Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O HXWUJJADFMXNKA-BQBZGAKWSA-N 0.000 description 4
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 4
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 4
- 229920000742 Cotton Polymers 0.000 description 4
- 241000255601 Drosophila melanogaster Species 0.000 description 4
- 241000588921 Enterobacteriaceae Species 0.000 description 4
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 4
- DNAZKGFYFRGZIH-QWRGUYRKSA-N Gly-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 DNAZKGFYFRGZIH-QWRGUYRKSA-N 0.000 description 4
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 4
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 4
- VWPJQIHBBOJWDN-DCAQKATOSA-N Lys-Val-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O VWPJQIHBBOJWDN-DCAQKATOSA-N 0.000 description 4
- JWBLQDDHSDGEGR-DRZSPHRISA-N Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWBLQDDHSDGEGR-DRZSPHRISA-N 0.000 description 4
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 4
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 4
- 239000007984 Tris EDTA buffer Substances 0.000 description 4
- 239000007983 Tris buffer Substances 0.000 description 4
- 108090000631 Trypsin Proteins 0.000 description 4
- 102000004142 Trypsin Human genes 0.000 description 4
- 238000002835 absorbance Methods 0.000 description 4
- 108010047495 alanylglycine Proteins 0.000 description 4
- 108010070944 alanylhistidine Proteins 0.000 description 4
- 238000005349 anion exchange Methods 0.000 description 4
- 238000000137 annealing Methods 0.000 description 4
- 230000003115 biocidal effect Effects 0.000 description 4
- 238000009835 boiling Methods 0.000 description 4
- 238000003776 cleavage reaction Methods 0.000 description 4
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 4
- 229960005542 ethidium bromide Drugs 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 239000000284 extract Substances 0.000 description 4
- 235000019387 fatty acid methyl ester Nutrition 0.000 description 4
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 4
- 238000003306 harvesting Methods 0.000 description 4
- 210000003000 inclusion body Anatomy 0.000 description 4
- 229930027917 kanamycin Natural products 0.000 description 4
- 229960000318 kanamycin Drugs 0.000 description 4
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 4
- 229930182823 kanamycin A Natural products 0.000 description 4
- 108010034529 leucyl-lysine Proteins 0.000 description 4
- 230000014759 maintenance of location Effects 0.000 description 4
- 238000005259 measurement Methods 0.000 description 4
- 230000000813 microbial effect Effects 0.000 description 4
- 108010051242 phenylalanylserine Proteins 0.000 description 4
- 230000019612 pigmentation Effects 0.000 description 4
- 229920002401 polyacrylamide Polymers 0.000 description 4
- 229920000136 polysorbate Polymers 0.000 description 4
- 108010090894 prolylleucine Proteins 0.000 description 4
- 230000007017 scission Effects 0.000 description 4
- 108010048818 seryl-histidine Proteins 0.000 description 4
- 108010026333 seryl-proline Proteins 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 4
- 239000012588 trypsin Substances 0.000 description 4
- 108010020532 tyrosyl-proline Proteins 0.000 description 4
- 230000004584 weight gain Effects 0.000 description 4
- 235000019786 weight gain Nutrition 0.000 description 4
- OSBLTNPMIGYQGY-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;2-[2-[bis(carboxymethyl)amino]ethyl-(carboxymethyl)amino]acetic acid;boric acid Chemical compound OB(O)O.OCC(N)(CO)CO.OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O OSBLTNPMIGYQGY-UHFFFAOYSA-N 0.000 description 3
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 3
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 3
- 241000256118 Aedes aegypti Species 0.000 description 3
- AAQGRPOPTAUUBM-ZLUOBGJFSA-N Ala-Ala-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O AAQGRPOPTAUUBM-ZLUOBGJFSA-N 0.000 description 3
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 3
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 3
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 3
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 3
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 3
- UAOSDDXCTBIPCA-QXEWZRGKSA-N Arg-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UAOSDDXCTBIPCA-QXEWZRGKSA-N 0.000 description 3
- SJUXYGVRSGTPMC-IMJSIDKUSA-N Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O SJUXYGVRSGTPMC-IMJSIDKUSA-N 0.000 description 3
- MQLZLIYPFDIDMZ-HAFWLYHUSA-N Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O MQLZLIYPFDIDMZ-HAFWLYHUSA-N 0.000 description 3
- PTSDPWIHOYMRGR-UGYAYLCHSA-N Asn-Ile-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O PTSDPWIHOYMRGR-UGYAYLCHSA-N 0.000 description 3
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 3
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 3
- ICZWAZVKLACMKR-CIUDSAMLSA-N Asp-His-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 ICZWAZVKLACMKR-CIUDSAMLSA-N 0.000 description 3
- 238000001712 DNA sequencing Methods 0.000 description 3
- 241000489947 Diabrotica virgifera virgifera Species 0.000 description 3
- 241000709823 Dictyoptera <beetle genus> Species 0.000 description 3
- 239000004471 Glycine Substances 0.000 description 3
- 244000299507 Gossypium hirsutum Species 0.000 description 3
- 241000256244 Heliothis virescens Species 0.000 description 3
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 3
- 229930010555 Inosine Natural products 0.000 description 3
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 3
- DLFAACQHIRSQGG-CIUDSAMLSA-N Leu-Asp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O DLFAACQHIRSQGG-CIUDSAMLSA-N 0.000 description 3
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 3
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 3
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 3
- FOBUGKUBUJOWAD-IHPCNDPISA-N Leu-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 FOBUGKUBUJOWAD-IHPCNDPISA-N 0.000 description 3
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 3
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 3
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 3
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 3
- 239000000020 Nitrocellulose Substances 0.000 description 3
- 206010058667 Oral toxicity Diseases 0.000 description 3
- 241000256682 Peregrinus maidis Species 0.000 description 3
- BIYWZVCPZIFGPY-QWRGUYRKSA-N Phe-Gly-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O BIYWZVCPZIFGPY-QWRGUYRKSA-N 0.000 description 3
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 3
- KWYUFKZDYYNOTN-UHFFFAOYSA-M Potassium hydroxide Chemical compound [OH-].[K+] KWYUFKZDYYNOTN-UHFFFAOYSA-M 0.000 description 3
- 238000012300 Sequence Analysis Methods 0.000 description 3
- OHKLFYXEOGGGCK-ZLUOBGJFSA-N Ser-Asp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OHKLFYXEOGGGCK-ZLUOBGJFSA-N 0.000 description 3
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 3
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 3
- VMHLLURERBWHNL-UHFFFAOYSA-M Sodium acetate Chemical compound [Na+].CC([O-])=O VMHLLURERBWHNL-UHFFFAOYSA-M 0.000 description 3
- 239000008051 TBE buffer Substances 0.000 description 3
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 3
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 3
- YKRQRPFODDJQTC-CSMHCCOUSA-N Thr-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN YKRQRPFODDJQTC-CSMHCCOUSA-N 0.000 description 3
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 3
- LYMVXFSTACVOLP-ZFWWWQNUSA-N Trp-Leu Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 LYMVXFSTACVOLP-ZFWWWQNUSA-N 0.000 description 3
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 3
- DLRZGNXCXUGIDG-KKHAAJSZSA-N Val-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O DLRZGNXCXUGIDG-KKHAAJSZSA-N 0.000 description 3
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 3
- 239000002253 acid Substances 0.000 description 3
- 230000004913 activation Effects 0.000 description 3
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 3
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 3
- 125000000539 amino acid group Chemical group 0.000 description 3
- 238000005571 anion exchange chromatography Methods 0.000 description 3
- 108010038633 aspartylglutamate Proteins 0.000 description 3
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 3
- 230000004071 biological effect Effects 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 3
- 239000005018 casein Substances 0.000 description 3
- 239000003153 chemical reaction reagent Substances 0.000 description 3
- 238000005352 clarification Methods 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- NKLPQNGYXWVELD-UHFFFAOYSA-M coomassie brilliant blue Chemical compound [Na+].C1=CC(OCC)=CC=C1NC1=CC=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C=CC(=CC=2)N(CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=C1 NKLPQNGYXWVELD-UHFFFAOYSA-M 0.000 description 3
- 238000001035 drying Methods 0.000 description 3
- 238000009472 formulation Methods 0.000 description 3
- 238000001641 gel filtration chromatography Methods 0.000 description 3
- 108010040030 histidinoalanine Proteins 0.000 description 3
- 230000007062 hydrolysis Effects 0.000 description 3
- 238000006460 hydrolysis reaction Methods 0.000 description 3
- 230000002401 inhibitory effect Effects 0.000 description 3
- 229960003786 inosine Drugs 0.000 description 3
- 108010057821 leucylproline Proteins 0.000 description 3
- 108010017391 lysylvaline Proteins 0.000 description 3
- ZIUHHBKFKCYYJD-UHFFFAOYSA-N n,n'-methylenebisacrylamide Chemical compound C=CC(=O)NCNC(=O)C=C ZIUHHBKFKCYYJD-UHFFFAOYSA-N 0.000 description 3
- 239000013642 negative control Substances 0.000 description 3
- 229920001220 nitrocellulos Polymers 0.000 description 3
- 239000006916 nutrient agar Substances 0.000 description 3
- 231100000418 oral toxicity Toxicity 0.000 description 3
- 239000001301 oxygen Substances 0.000 description 3
- 229910052760 oxygen Inorganic materials 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 239000008363 phosphate buffer Substances 0.000 description 3
- 239000013600 plasmid vector Substances 0.000 description 3
- 229920001223 polyethylene glycol Polymers 0.000 description 3
- 239000013641 positive control Substances 0.000 description 3
- 238000001556 precipitation Methods 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 230000003252 repetitive effect Effects 0.000 description 3
- 150000003839 salts Chemical class 0.000 description 3
- 108010071207 serylmethionine Proteins 0.000 description 3
- 239000001632 sodium acetate Substances 0.000 description 3
- 235000017281 sodium acetate Nutrition 0.000 description 3
- 239000001488 sodium phosphate Substances 0.000 description 3
- 229910000162 sodium phosphate Inorganic materials 0.000 description 3
- 238000010561 standard procedure Methods 0.000 description 3
- 238000003756 stirring Methods 0.000 description 3
- 239000011550 stock solution Substances 0.000 description 3
- 239000000758 substrate Substances 0.000 description 3
- 239000012134 supernatant fraction Substances 0.000 description 3
- 230000004083 survival effect Effects 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- RYFMWSXOAZQYPI-UHFFFAOYSA-K trisodium phosphate Chemical compound [Na+].[Na+].[Na+].[O-]P([O-])([O-])=O RYFMWSXOAZQYPI-UHFFFAOYSA-K 0.000 description 3
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 3
- 239000011800 void material Substances 0.000 description 3
- LWTDZKXXJRRKDG-KXBFYZLASA-N (-)-phaseollin Chemical compound C1OC2=CC(O)=CC=C2[C@H]2[C@@H]1C1=CC=C3OC(C)(C)C=CC3=C1O2 LWTDZKXXJRRKDG-KXBFYZLASA-N 0.000 description 2
- ALBODLTZUXKBGZ-JUUVMNCLSA-N (2s)-2-amino-3-phenylpropanoic acid;(2s)-2,6-diaminohexanoic acid Chemical compound NCCCC[C@H](N)C(O)=O.OC(=O)[C@@H](N)CC1=CC=CC=C1 ALBODLTZUXKBGZ-JUUVMNCLSA-N 0.000 description 2
- AUXMWYRZQPIXCC-KNIFDHDWSA-N (2s)-2-amino-4-methylpentanoic acid;(2s)-2-aminopropanoic acid Chemical compound C[C@H](N)C(O)=O.CC(C)C[C@H](N)C(O)=O AUXMWYRZQPIXCC-KNIFDHDWSA-N 0.000 description 2
- IKQRPFTXKQQLJF-IAHYZSEUSA-N (4s,4as,5as,6s,12ar)-4-(dimethylamino)-1,6,10,11,12a-pentahydroxy-6-methyl-3,12-dioxo-n-(pyrrolidin-1-ylmethyl)-4,4a,5,5a-tetrahydrotetracene-2-carboxamide Chemical compound OC([C@@]1(O)C(=O)C=2[C@@H]([C@](C3=CC=CC(O)=C3C=2O)(C)O)C[C@H]1[C@@H](C1=O)N(C)C)=C1C(=O)NCN1CCCC1 IKQRPFTXKQQLJF-IAHYZSEUSA-N 0.000 description 2
- RUFPHBVGCFYCNW-UHFFFAOYSA-N 1-naphthylamine Chemical compound C1=CC=C2C(N)=CC=CC2=C1 RUFPHBVGCFYCNW-UHFFFAOYSA-N 0.000 description 2
- ZIIUUSVHCHPIQD-UHFFFAOYSA-N 2,4,6-trimethyl-N-[3-(trifluoromethyl)phenyl]benzenesulfonamide Chemical compound CC1=CC(C)=CC(C)=C1S(=O)(=O)NC1=CC=CC(C(F)(F)F)=C1 ZIIUUSVHCHPIQD-UHFFFAOYSA-N 0.000 description 2
- AXAVXPMQTGXXJZ-UHFFFAOYSA-N 2-aminoacetic acid;2-amino-2-(hydroxymethyl)propane-1,3-diol Chemical compound NCC(O)=O.OCC(N)(CO)CO AXAVXPMQTGXXJZ-UHFFFAOYSA-N 0.000 description 2
- 238000013030 3-step procedure Methods 0.000 description 2
- HVBSAKJJOYLTQU-UHFFFAOYSA-N 4-aminobenzenesulfonic acid Chemical compound NC1=CC=C(S(O)(=O)=O)C=C1 HVBSAKJJOYLTQU-UHFFFAOYSA-N 0.000 description 2
- ZKHQWZAMYRWXGA-KQYNXXCUSA-N Adenosine triphosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-N 0.000 description 2
- 241001136249 Agriotes lineatus Species 0.000 description 2
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 2
- KQFRUSHJPKXBMB-BHDSKKPTSA-N Ala-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)C)C(O)=O)=CNC2=C1 KQFRUSHJPKXBMB-BHDSKKPTSA-N 0.000 description 2
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 2
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- YSMPVONNIWLJML-FXQIFTODSA-N Ala-Asp-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O YSMPVONNIWLJML-FXQIFTODSA-N 0.000 description 2
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 2
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 2
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 2
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 2
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 2
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 2
- AWNAEZICPNGAJK-FXQIFTODSA-N Ala-Met-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O AWNAEZICPNGAJK-FXQIFTODSA-N 0.000 description 2
- VQAVBBCZFQAAED-FXQIFTODSA-N Ala-Pro-Asn Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)N)C(=O)O)N VQAVBBCZFQAAED-FXQIFTODSA-N 0.000 description 2
- NZGRHTKZFSVPAN-BIIVOSGPSA-N Ala-Ser-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N NZGRHTKZFSVPAN-BIIVOSGPSA-N 0.000 description 2
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 2
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 2
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 2
- ALZVPLKYDKJKQU-XVKPBYJWSA-N Ala-Tyr Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ALZVPLKYDKJKQU-XVKPBYJWSA-N 0.000 description 2
- YCTIYBUTCKNOTI-UWJYBYFXSA-N Ala-Tyr-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCTIYBUTCKNOTI-UWJYBYFXSA-N 0.000 description 2
- 241000902876 Alticini Species 0.000 description 2
- 241001124076 Aphididae Species 0.000 description 2
- 241001507652 Aphrophoridae Species 0.000 description 2
- BHSYMWWMVRPCPA-CYDGBPFRSA-N Arg-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCCN=C(N)N BHSYMWWMVRPCPA-CYDGBPFRSA-N 0.000 description 2
- PQWTZSNVWSOFFK-FXQIFTODSA-N Arg-Asp-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)CN=C(N)N PQWTZSNVWSOFFK-FXQIFTODSA-N 0.000 description 2
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 2
- JEXPNDORFYHJTM-IHRRRGAJSA-N Arg-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCN=C(N)N JEXPNDORFYHJTM-IHRRRGAJSA-N 0.000 description 2
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 2
- QNYWYYNQSXANBL-WDSOQIARSA-N Arg-Trp-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QNYWYYNQSXANBL-WDSOQIARSA-N 0.000 description 2
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 2
- AILDTIZEPVHXBF-UHFFFAOYSA-N Argentine Natural products C1C(C2)C3=CC=CC(=O)N3CC1CN2C(=O)N1CC(C=2N(C(=O)C=CC=2)C2)CC2C1 AILDTIZEPVHXBF-UHFFFAOYSA-N 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- NTXNUXPCNRDMAF-WFBYXXMGSA-N Asn-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC(N)=O)C)C(O)=O)=CNC2=C1 NTXNUXPCNRDMAF-WFBYXXMGSA-N 0.000 description 2
- DNYRZPOWBTYFAF-IHRRRGAJSA-N Asn-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N)O DNYRZPOWBTYFAF-IHRRRGAJSA-N 0.000 description 2
- VYLVOMUVLMGCRF-ZLUOBGJFSA-N Asn-Asp-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VYLVOMUVLMGCRF-ZLUOBGJFSA-N 0.000 description 2
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 2
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 2
- SEKBHZJLARBNPB-GHCJXIJMSA-N Asn-Ile-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O SEKBHZJLARBNPB-GHCJXIJMSA-N 0.000 description 2
- IBLAOXSULLECQZ-IUKAMOBKSA-N Asn-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(N)=O IBLAOXSULLECQZ-IUKAMOBKSA-N 0.000 description 2
- ZMUQQMGITUJQTI-CIUDSAMLSA-N Asn-Leu-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZMUQQMGITUJQTI-CIUDSAMLSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 2
- KHCNTVRVAYCPQE-CIUDSAMLSA-N Asn-Lys-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O KHCNTVRVAYCPQE-CIUDSAMLSA-N 0.000 description 2
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 2
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 2
- YWFLXGZHZXXINF-BPUTZDHNSA-N Asn-Pro-Trp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CNC2=CC=CC=C12 YWFLXGZHZXXINF-BPUTZDHNSA-N 0.000 description 2
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 2
- XEGZSHSPQNDNRH-JRQIVUDYSA-N Asn-Tyr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XEGZSHSPQNDNRH-JRQIVUDYSA-N 0.000 description 2
- XLDMSQYOYXINSZ-QXEWZRGKSA-N Asn-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XLDMSQYOYXINSZ-QXEWZRGKSA-N 0.000 description 2
- QHAJMRDEWNAIBQ-FXQIFTODSA-N Asp-Arg-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O QHAJMRDEWNAIBQ-FXQIFTODSA-N 0.000 description 2
- WKGJGVGTEZGFSW-FXQIFTODSA-N Asp-Asn-Met Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O WKGJGVGTEZGFSW-FXQIFTODSA-N 0.000 description 2
- FRYULLIZUDQONW-IMJSIDKUSA-N Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FRYULLIZUDQONW-IMJSIDKUSA-N 0.000 description 2
- NAPNAGZWHQHZLG-ZLUOBGJFSA-N Asp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N NAPNAGZWHQHZLG-ZLUOBGJFSA-N 0.000 description 2
- QOVWVLLHMMCFFY-ZLUOBGJFSA-N Asp-Asp-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QOVWVLLHMMCFFY-ZLUOBGJFSA-N 0.000 description 2
- ZCKYZTGLXIEOKS-CIUDSAMLSA-N Asp-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N ZCKYZTGLXIEOKS-CIUDSAMLSA-N 0.000 description 2
- ACEDJCOOPZFUBU-CIUDSAMLSA-N Asp-Cys-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N ACEDJCOOPZFUBU-CIUDSAMLSA-N 0.000 description 2
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 2
- SCQIQCWLOMOEFP-DCAQKATOSA-N Asp-Leu-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SCQIQCWLOMOEFP-DCAQKATOSA-N 0.000 description 2
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 2
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 2
- MYOHQBFRJQFIDZ-KKUMJFAQSA-N Asp-Leu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYOHQBFRJQFIDZ-KKUMJFAQSA-N 0.000 description 2
- DINOVZWPTMGSRF-QXEWZRGKSA-N Asp-Pro-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O DINOVZWPTMGSRF-QXEWZRGKSA-N 0.000 description 2
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 2
- HRVQDZOWMLFAOD-BIIVOSGPSA-N Asp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N)C(=O)O HRVQDZOWMLFAOD-BIIVOSGPSA-N 0.000 description 2
- GXHDGYOXPNQCKM-XVSYOHENSA-N Asp-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GXHDGYOXPNQCKM-XVSYOHENSA-N 0.000 description 2
- ITGFVUYOLWBPQW-KKHAAJSZSA-N Asp-Thr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O ITGFVUYOLWBPQW-KKHAAJSZSA-N 0.000 description 2
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 2
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 2
- 241000193830 Bacillus <bacterium> Species 0.000 description 2
- 241000193388 Bacillus thuringiensis Species 0.000 description 2
- 231100000699 Bacterial toxin Toxicity 0.000 description 2
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 241000907223 Bruchinae Species 0.000 description 2
- 241000776777 Cacopsylla mali Species 0.000 description 2
- 241000426451 Camponotus modoc Species 0.000 description 2
- 241001491934 Camponotus pennsylvanicus Species 0.000 description 2
- 241000604356 Chamaepsila rosae Species 0.000 description 2
- 108050001186 Chaperonin Cpn60 Proteins 0.000 description 2
- 102000052603 Chaperonins Human genes 0.000 description 2
- 241000034870 Chrysoteuchia culmella Species 0.000 description 2
- 241001222599 Clania variegata Species 0.000 description 2
- 108091033380 Coding strand Proteins 0.000 description 2
- 241000084474 Contarinia pisi Species 0.000 description 2
- 238000007399 DNA isolation Methods 0.000 description 2
- 239000003155 DNA primer Substances 0.000 description 2
- 244000000626 Daucus carota Species 0.000 description 2
- 235000002767 Daucus carota Nutrition 0.000 description 2
- 241000084475 Delia antiqua Species 0.000 description 2
- 241001414892 Delia radicum Species 0.000 description 2
- 102000020897 Formins Human genes 0.000 description 2
- 108091022623 Formins Proteins 0.000 description 2
- 241001466042 Fulgoromorpha Species 0.000 description 2
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 2
- SBCYJMOOHUDWDA-NUMRIWBASA-N Glu-Asp-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SBCYJMOOHUDWDA-NUMRIWBASA-N 0.000 description 2
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 2
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 2
- FIQQRCFQXGLOSZ-WDSKDSINSA-N Gly-Glu-Asp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FIQQRCFQXGLOSZ-WDSKDSINSA-N 0.000 description 2
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 2
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 2
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 2
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 2
- PCPOYRCAHPJXII-UWVGGRQHSA-N Gly-Lys-Met Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O PCPOYRCAHPJXII-UWVGGRQHSA-N 0.000 description 2
- NZOAFWHVAFJERA-OALUTQOASA-N Gly-Phe-Trp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O NZOAFWHVAFJERA-OALUTQOASA-N 0.000 description 2
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 2
- JYGYNWYVKXENNE-OALUTQOASA-N Gly-Tyr-Trp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JYGYNWYVKXENNE-OALUTQOASA-N 0.000 description 2
- 229920002683 Glycosaminoglycan Polymers 0.000 description 2
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 2
- 241001523406 Heterorhabditis Species 0.000 description 2
- FRJIAZKQGSCKPQ-FSPLSTOPSA-N His-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CN=CN1 FRJIAZKQGSCKPQ-FSPLSTOPSA-N 0.000 description 2
- CJGDTAHEMXLRMB-ULQDDVLXSA-N His-Arg-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O CJGDTAHEMXLRMB-ULQDDVLXSA-N 0.000 description 2
- MHAJPDPJQMAIIY-UHFFFAOYSA-N Hydrogen peroxide Chemical compound OO MHAJPDPJQMAIIY-UHFFFAOYSA-N 0.000 description 2
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 2
- HERITAGIPLEJMT-GVARAGBVSA-N Ile-Ala-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HERITAGIPLEJMT-GVARAGBVSA-N 0.000 description 2
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 2
- CWJQMCPYXNVMBS-STECZYCISA-N Ile-Arg-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N CWJQMCPYXNVMBS-STECZYCISA-N 0.000 description 2
- WKXVAXOSIPTXEC-HAFWLYHUSA-N Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O WKXVAXOSIPTXEC-HAFWLYHUSA-N 0.000 description 2
- RPZFUIQVAPZLRH-GHCJXIJMSA-N Ile-Asp-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)O)N RPZFUIQVAPZLRH-GHCJXIJMSA-N 0.000 description 2
- JQLFYZMEXFNRFS-DJFWLOJKSA-N Ile-Asp-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N JQLFYZMEXFNRFS-DJFWLOJKSA-N 0.000 description 2
- DSDPLOODKXISDT-XUXIUFHCSA-N Ile-Leu-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O DSDPLOODKXISDT-XUXIUFHCSA-N 0.000 description 2
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 2
- LRAUKBMYHHNADU-DKIMLUQUSA-N Ile-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 LRAUKBMYHHNADU-DKIMLUQUSA-N 0.000 description 2
- TWVKGYNQQAUNRN-ACZMJKKPSA-N Ile-Ser Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@@H](CO)C([O-])=O TWVKGYNQQAUNRN-ACZMJKKPSA-N 0.000 description 2
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 2
- MUFXDFWAJSPHIQ-XDTLVQLUSA-N Ile-Tyr Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 MUFXDFWAJSPHIQ-XDTLVQLUSA-N 0.000 description 2
- OMDWJWGZGMCQND-CFMVVWHZSA-N Ile-Tyr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OMDWJWGZGMCQND-CFMVVWHZSA-N 0.000 description 2
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- QOOWRKBDDXQRHC-BQBZGAKWSA-N L-lysyl-L-alanine Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN QOOWRKBDDXQRHC-BQBZGAKWSA-N 0.000 description 2
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 2
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 2
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 2
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 2
- CLVUXCBGKUECIT-HJGDQZAQSA-N Leu-Asp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CLVUXCBGKUECIT-HJGDQZAQSA-N 0.000 description 2
- LCPYQJIKPJDLLB-UWVGGRQHSA-N Leu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C LCPYQJIKPJDLLB-UWVGGRQHSA-N 0.000 description 2
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 2
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 2
- LZHJZLHSRGWBBE-IHRRRGAJSA-N Leu-Lys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LZHJZLHSRGWBBE-IHRRRGAJSA-N 0.000 description 2
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 2
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 2
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 2
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 2
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 2
- LJBVRCDPWOJOEK-PPCPHDFISA-N Leu-Thr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LJBVRCDPWOJOEK-PPCPHDFISA-N 0.000 description 2
- URHJPNHRQMQGOZ-RHYQMDGZSA-N Leu-Thr-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O URHJPNHRQMQGOZ-RHYQMDGZSA-N 0.000 description 2
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 2
- YLMIDMSLKLRNHX-HSCHXYMDSA-N Leu-Trp-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YLMIDMSLKLRNHX-HSCHXYMDSA-N 0.000 description 2
- SXOFUVGLPHCPRQ-KKUMJFAQSA-N Leu-Tyr-Cys Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(O)=O SXOFUVGLPHCPRQ-KKUMJFAQSA-N 0.000 description 2
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 2
- RDFIVFHPOSOXMW-ACRUOGEOSA-N Leu-Tyr-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RDFIVFHPOSOXMW-ACRUOGEOSA-N 0.000 description 2
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 2
- GDBQQVLCIARPGH-UHFFFAOYSA-N Leupeptin Natural products CC(C)CC(NC(C)=O)C(=O)NC(CC(C)C)C(=O)NC(C=O)CCCN=C(N)N GDBQQVLCIARPGH-UHFFFAOYSA-N 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- 241001646976 Linepithema humile Species 0.000 description 2
- 108090001060 Lipase Proteins 0.000 description 2
- 102000004882 Lipase Human genes 0.000 description 2
- 239000004367 Lipase Substances 0.000 description 2
- 239000006142 Luria-Bertani Agar Substances 0.000 description 2
- 239000006137 Luria-Bertani broth Substances 0.000 description 2
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 2
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 2
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 2
- JPNRPAJITHRXRH-BQBZGAKWSA-N Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O JPNRPAJITHRXRH-BQBZGAKWSA-N 0.000 description 2
- CIOWSLJGLSUOME-BQBZGAKWSA-N Lys-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O CIOWSLJGLSUOME-BQBZGAKWSA-N 0.000 description 2
- ZAENPHCEQXALHO-GUBZILKMSA-N Lys-Cys-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZAENPHCEQXALHO-GUBZILKMSA-N 0.000 description 2
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 2
- PRSBSVAVOQOAMI-BJDJZHNGSA-N Lys-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN PRSBSVAVOQOAMI-BJDJZHNGSA-N 0.000 description 2
- WAIHHELKYSFIQN-XUXIUFHCSA-N Lys-Ile-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O WAIHHELKYSFIQN-XUXIUFHCSA-N 0.000 description 2
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 2
- MYZMQWHPDAYKIE-SRVKXCTJSA-N Lys-Leu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O MYZMQWHPDAYKIE-SRVKXCTJSA-N 0.000 description 2
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 2
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 2
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 2
- WGILOYIKJVQUPT-DCAQKATOSA-N Lys-Pro-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WGILOYIKJVQUPT-DCAQKATOSA-N 0.000 description 2
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 2
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 2
- BIWVMACFGZFIEB-VFAJRCTISA-N Lys-Trp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCCCN)N)O BIWVMACFGZFIEB-VFAJRCTISA-N 0.000 description 2
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 2
- 239000006154 MacConkey agar Substances 0.000 description 2
- 241001414659 Macrosteles Species 0.000 description 2
- CSNNHWWHGAXBCP-UHFFFAOYSA-L Magnesium sulfate Chemical compound [Mg+2].[O-][S+2]([O-])([O-])[O-] CSNNHWWHGAXBCP-UHFFFAOYSA-L 0.000 description 2
- 241000766511 Meligethes Species 0.000 description 2
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 2
- MXEASDMFHUKOGE-ULQDDVLXSA-N Met-His-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MXEASDMFHUKOGE-ULQDDVLXSA-N 0.000 description 2
- AFVOKRHYSSFPHC-STECZYCISA-N Met-Ile-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFVOKRHYSSFPHC-STECZYCISA-N 0.000 description 2
- FMYLZGQFKPHXHI-GUBZILKMSA-N Met-Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O FMYLZGQFKPHXHI-GUBZILKMSA-N 0.000 description 2
- JKXVPNCSAMWUEJ-GUBZILKMSA-N Met-Met-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O JKXVPNCSAMWUEJ-GUBZILKMSA-N 0.000 description 2
- MIXPUVSPPOWTCR-FXQIFTODSA-N Met-Ser-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MIXPUVSPPOWTCR-FXQIFTODSA-N 0.000 description 2
- QYIGOFGUOVTAHK-ZJDVBMNYSA-N Met-Thr-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QYIGOFGUOVTAHK-ZJDVBMNYSA-N 0.000 description 2
- 108010006035 Metalloproteases Proteins 0.000 description 2
- 102000005741 Metalloproteases Human genes 0.000 description 2
- 102000016943 Muramidase Human genes 0.000 description 2
- 108010014251 Muramidase Proteins 0.000 description 2
- 241000257159 Musca domestica Species 0.000 description 2
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 244000061176 Nicotiana tabacum Species 0.000 description 2
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 2
- IOVCWXUNBOPUCH-UHFFFAOYSA-M Nitrite anion Chemical compound [O-]N=O IOVCWXUNBOPUCH-UHFFFAOYSA-M 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 2
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 2
- 241000830147 Pediomelum cyphocalyx Species 0.000 description 2
- 239000001888 Peptone Substances 0.000 description 2
- 108010080698 Peptones Proteins 0.000 description 2
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 2
- LSXGADJXBDFXQU-DLOVCJGASA-N Phe-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 LSXGADJXBDFXQU-DLOVCJGASA-N 0.000 description 2
- HTTYNOXBBOWZTB-SRVKXCTJSA-N Phe-Asn-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HTTYNOXBBOWZTB-SRVKXCTJSA-N 0.000 description 2
- SWZKMTDPQXLQRD-XVSYOHENSA-N Phe-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWZKMTDPQXLQRD-XVSYOHENSA-N 0.000 description 2
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 2
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 2
- GYEPCBNTTRORKW-PCBIJLKTSA-N Phe-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O GYEPCBNTTRORKW-PCBIJLKTSA-N 0.000 description 2
- MGLBSROLWAWCKN-FCLVOEFKSA-N Phe-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MGLBSROLWAWCKN-FCLVOEFKSA-N 0.000 description 2
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 2
- 102000015439 Phospholipases Human genes 0.000 description 2
- 108010064785 Phospholipases Proteins 0.000 description 2
- 241000255969 Pieris brassicae Species 0.000 description 2
- 108700001094 Plant Genes Proteins 0.000 description 2
- 241000595629 Plodia interpunctella Species 0.000 description 2
- 241000254101 Popillia japonica Species 0.000 description 2
- 244000308495 Potentilla anserina Species 0.000 description 2
- 235000016594 Potentilla anserina Nutrition 0.000 description 2
- FELJDCNGZFDUNR-WDSKDSINSA-N Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FELJDCNGZFDUNR-WDSKDSINSA-N 0.000 description 2
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 2
- UGDMQJSXSSZUKL-IHRRRGAJSA-N Pro-Ser-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O UGDMQJSXSSZUKL-IHRRRGAJSA-N 0.000 description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 108010007131 Pulmonary Surfactant-Associated Protein B Proteins 0.000 description 2
- 102100032617 Pulmonary surfactant-associated protein B Human genes 0.000 description 2
- 235000014443 Pyrus communis Nutrition 0.000 description 2
- 240000001987 Pyrus communis Species 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 2
- HRNQLKCLPVKZNE-CIUDSAMLSA-N Ser-Ala-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O HRNQLKCLPVKZNE-CIUDSAMLSA-N 0.000 description 2
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 2
- ICHZYBVODUVUKN-SRVKXCTJSA-N Ser-Asn-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ICHZYBVODUVUKN-SRVKXCTJSA-N 0.000 description 2
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 2
- BXLYSRPHVMCOPS-ACZMJKKPSA-N Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO BXLYSRPHVMCOPS-ACZMJKKPSA-N 0.000 description 2
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 2
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 2
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 2
- PBUXMVYWOSKHMF-WDSKDSINSA-N Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO PBUXMVYWOSKHMF-WDSKDSINSA-N 0.000 description 2
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 2
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 2
- LDEBVRIURYMKQS-WISUUJSJSA-N Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO LDEBVRIURYMKQS-WISUUJSJSA-N 0.000 description 2
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 2
- BCAVNDNYOGTQMQ-AAEUAGOBSA-N Ser-Trp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O BCAVNDNYOGTQMQ-AAEUAGOBSA-N 0.000 description 2
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 2
- 241000256251 Spodoptera frugiperda Species 0.000 description 2
- 241001480238 Steinernema Species 0.000 description 2
- 101710137500 T7 RNA polymerase Proteins 0.000 description 2
- 108020005038 Terminator Codon Proteins 0.000 description 2
- SWIKDOUVROTZCW-GCJQMDKQSA-N Thr-Asn-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O SWIKDOUVROTZCW-GCJQMDKQSA-N 0.000 description 2
- JXKMXEBNZCKSDY-JIOCBJNQSA-N Thr-Asp-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O JXKMXEBNZCKSDY-JIOCBJNQSA-N 0.000 description 2
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 2
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 2
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- FIFDDJFLNVAVMS-RHYQMDGZSA-N Thr-Leu-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O FIFDDJFLNVAVMS-RHYQMDGZSA-N 0.000 description 2
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 2
- XSEPSRUDSPHMPX-KATARQTJSA-N Thr-Lys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O XSEPSRUDSPHMPX-KATARQTJSA-N 0.000 description 2
- WVVOFCVMHAXGLE-LFSVMHDDSA-N Thr-Phe-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O WVVOFCVMHAXGLE-LFSVMHDDSA-N 0.000 description 2
- BDYBHQWMHYDRKJ-UNQGMJICSA-N Thr-Phe-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)O)N)O BDYBHQWMHYDRKJ-UNQGMJICSA-N 0.000 description 2
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 2
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 2
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 2
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 2
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 2
- ZESGVALRVJIVLZ-VFCFLDTKSA-N Thr-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O ZESGVALRVJIVLZ-VFCFLDTKSA-N 0.000 description 2
- FBQHKSPOIAFUEI-OWLDWWDNSA-N Thr-Trp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O FBQHKSPOIAFUEI-OWLDWWDNSA-N 0.000 description 2
- WCRFXRIWBFRZBR-GGVZMXCHSA-N Thr-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WCRFXRIWBFRZBR-GGVZMXCHSA-N 0.000 description 2
- KVEWWQRTAVMOFT-KJEVXHAQSA-N Thr-Tyr-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O KVEWWQRTAVMOFT-KJEVXHAQSA-N 0.000 description 2
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 2
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 2
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 2
- 241000255901 Tortricidae Species 0.000 description 2
- 235000021307 Triticum Nutrition 0.000 description 2
- 244000098338 Triticum aestivum Species 0.000 description 2
- BDWDMRSGCXEDMR-WFBYXXMGSA-N Trp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N BDWDMRSGCXEDMR-WFBYXXMGSA-N 0.000 description 2
- PNHABSVRPFBUJY-UMPQAUOISA-N Trp-Arg-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PNHABSVRPFBUJY-UMPQAUOISA-N 0.000 description 2
- OSYOKZZRVGUDMO-HSCHXYMDSA-N Trp-Lys-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OSYOKZZRVGUDMO-HSCHXYMDSA-N 0.000 description 2
- LGEYOIQBBIPHQN-UWJYBYFXSA-N Tyr-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 LGEYOIQBBIPHQN-UWJYBYFXSA-N 0.000 description 2
- AYHSJESDFKREAR-KKUMJFAQSA-N Tyr-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AYHSJESDFKREAR-KKUMJFAQSA-N 0.000 description 2
- SFSZDJHNAICYSD-PMVMPFDFSA-N Tyr-His-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC3=CN=CN3)NC(=O)[C@H](CC4=CC=C(C=C4)O)N SFSZDJHNAICYSD-PMVMPFDFSA-N 0.000 description 2
- PRONOHBTMLNXCZ-BZSNNMDCSA-N Tyr-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PRONOHBTMLNXCZ-BZSNNMDCSA-N 0.000 description 2
- HSBZWINKRYZCSQ-KKUMJFAQSA-N Tyr-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O HSBZWINKRYZCSQ-KKUMJFAQSA-N 0.000 description 2
- FASACHWGQBNSRO-ZEWNOJEFSA-N Tyr-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FASACHWGQBNSRO-ZEWNOJEFSA-N 0.000 description 2
- VNYDHJARLHNEGA-RYUDHWBXSA-N Tyr-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 2
- QKXAEWMHAAVVGS-KKUMJFAQSA-N Tyr-Pro-Glu Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O QKXAEWMHAAVVGS-KKUMJFAQSA-N 0.000 description 2
- XYBNMHRFAUKPAW-IHRRRGAJSA-N Tyr-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CC=C(C=C1)O)N XYBNMHRFAUKPAW-IHRRRGAJSA-N 0.000 description 2
- MFEVVAXTBZELLL-GGVZMXCHSA-N Tyr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MFEVVAXTBZELLL-GGVZMXCHSA-N 0.000 description 2
- SQUMHUZLJDUROQ-YDHLFZDLSA-N Tyr-Val-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O SQUMHUZLJDUROQ-YDHLFZDLSA-N 0.000 description 2
- CCEVJBJLPRNAFH-BVSLBCMMSA-N Tyr-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N CCEVJBJLPRNAFH-BVSLBCMMSA-N 0.000 description 2
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 2
- IBIDRSSEHFLGSD-YUMQZZPRSA-N Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-YUMQZZPRSA-N 0.000 description 2
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 2
- ZSZFTYVFQLUWBF-QXEWZRGKSA-N Val-Asp-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N ZSZFTYVFQLUWBF-QXEWZRGKSA-N 0.000 description 2
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 2
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 2
- PYPZMFDMCCWNST-NAKRPEOUSA-N Val-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N PYPZMFDMCCWNST-NAKRPEOUSA-N 0.000 description 2
- OVBMCNDKCWAXMZ-NAKRPEOUSA-N Val-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N OVBMCNDKCWAXMZ-NAKRPEOUSA-N 0.000 description 2
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 2
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 2
- RQOMPQGUGBILAG-AVGNSLFASA-N Val-Met-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O RQOMPQGUGBILAG-AVGNSLFASA-N 0.000 description 2
- 206010052428 Wound Diseases 0.000 description 2
- 208000027418 Wounds and injury Diseases 0.000 description 2
- 108010081404 acein-2 Proteins 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 230000006978 adaptation Effects 0.000 description 2
- 239000002671 adjuvant Substances 0.000 description 2
- 238000000246 agarose gel electrophoresis Methods 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 2
- 108010070783 alanyltyrosine Proteins 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- QFAADIRHLBXJJS-ZAZJUGBXSA-N amastatin Chemical compound CC(C)C[C@@H](N)[C@H](O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O QFAADIRHLBXJJS-ZAZJUGBXSA-N 0.000 description 2
- 108010052590 amastatin Proteins 0.000 description 2
- 238000012870 ammonium sulfate precipitation Methods 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 2
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 235000009582 asparagine Nutrition 0.000 description 2
- 229960001230 asparagine Drugs 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 108010093581 aspartyl-proline Proteins 0.000 description 2
- 229940097012 bacillus thuringiensis Drugs 0.000 description 2
- 239000000688 bacterial toxin Substances 0.000 description 2
- 239000011324 bead Substances 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 229940098773 bovine serum albumin Drugs 0.000 description 2
- AIYUHDOJVYHVIT-UHFFFAOYSA-M caesium chloride Chemical compound [Cl-].[Cs+] AIYUHDOJVYHVIT-UHFFFAOYSA-M 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 238000004587 chromatography analysis Methods 0.000 description 2
- 238000007621 cluster analysis Methods 0.000 description 2
- 238000012790 confirmation Methods 0.000 description 2
- 238000004132 cross linking Methods 0.000 description 2
- 230000001351 cycling effect Effects 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 239000008367 deionised water Substances 0.000 description 2
- 229910021641 deionized water Inorganic materials 0.000 description 2
- 238000004925 denaturation Methods 0.000 description 2
- 230000036425 denaturation Effects 0.000 description 2
- 238000000502 dialysis Methods 0.000 description 2
- 108010054813 diprotin B Proteins 0.000 description 2
- 235000021186 dishes Nutrition 0.000 description 2
- 239000003480 eluent Substances 0.000 description 2
- 230000005284 excitation Effects 0.000 description 2
- 239000006260 foam Substances 0.000 description 2
- 238000005194 fractionation Methods 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 239000007789 gas Substances 0.000 description 2
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 2
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 2
- 108010015792 glycyllysine Proteins 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 210000004408 hybridoma Anatomy 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 238000007654 immersion Methods 0.000 description 2
- 230000000415 inactivating effect Effects 0.000 description 2
- 238000010348 incorporation Methods 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 239000002919 insect venom Substances 0.000 description 2
- 230000003834 intracellular effect Effects 0.000 description 2
- 238000004255 ion exchange chromatography Methods 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- 125000001972 isopentyl group Chemical group [H]C([H])([H])C([H])(C([H])([H])[H])C([H])([H])C([H])([H])* 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 230000003902 lesion Effects 0.000 description 2
- 231100000518 lethal Toxicity 0.000 description 2
- 230000001665 lethal effect Effects 0.000 description 2
- 231100000225 lethality Toxicity 0.000 description 2
- 108010091798 leucylleucine Proteins 0.000 description 2
- GDBQQVLCIARPGH-ULQDDVLXSA-N leupeptin Chemical compound CC(C)C[C@H](NC(C)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C=O)CCCN=C(N)N GDBQQVLCIARPGH-ULQDDVLXSA-N 0.000 description 2
- 108010052968 leupeptin Proteins 0.000 description 2
- 235000019421 lipase Nutrition 0.000 description 2
- 238000009630 liquid culture Methods 0.000 description 2
- 239000012160 loading buffer Substances 0.000 description 2
- 239000004325 lysozyme Substances 0.000 description 2
- 229960000274 lysozyme Drugs 0.000 description 2
- 235000010335 lysozyme Nutrition 0.000 description 2
- 108010054155 lysyllysine Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 230000002503 metabolic effect Effects 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 108010034507 methionyltryptophan Proteins 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 238000006386 neutralization reaction Methods 0.000 description 2
- 230000031787 nutrient reservoir activity Effects 0.000 description 2
- 239000003921 oil Substances 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 238000004806 packaging method and process Methods 0.000 description 2
- 230000035515 penetration Effects 0.000 description 2
- 235000019319 peptone Nutrition 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 239000011148 porous material Substances 0.000 description 2
- 239000008057 potassium phosphate buffer Substances 0.000 description 2
- 239000002244 precipitate Substances 0.000 description 2
- 230000037452 priming Effects 0.000 description 2
- 108010031719 prolyl-serine Proteins 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 239000013014 purified material Substances 0.000 description 2
- 239000011541 reaction mixture Substances 0.000 description 2
- 230000000384 rearing effect Effects 0.000 description 2
- 238000011084 recovery Methods 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 239000011347 resin Substances 0.000 description 2
- 229920005989 resin Polymers 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 230000009528 severe injury Effects 0.000 description 2
- 230000035939 shock Effects 0.000 description 2
- 229910052710 silicon Inorganic materials 0.000 description 2
- 239000010703 silicon Substances 0.000 description 2
- 238000004513 sizing Methods 0.000 description 2
- 239000011734 sodium Substances 0.000 description 2
- DJIKQWNNMSCYPY-UHFFFAOYSA-M sodium;3,9-diethyltridecan-6-yl sulfate Chemical compound [Na+].CCCCC(CC)CCC(OS([O-])(=O)=O)CCC(CC)CC DJIKQWNNMSCYPY-UHFFFAOYSA-M 0.000 description 2
- ATHGHQPFGPMSJY-UHFFFAOYSA-N spermidine Chemical compound NCCCCNCCCN ATHGHQPFGPMSJY-UHFFFAOYSA-N 0.000 description 2
- 238000003860 storage Methods 0.000 description 2
- 239000000725 suspension Substances 0.000 description 2
- 230000009897 systematic effect Effects 0.000 description 2
- 230000009885 systemic effect Effects 0.000 description 2
- 101150032575 tcdA gene Proteins 0.000 description 2
- 239000012085 test solution Substances 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 108010038745 tryptophylglycine Proteins 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- 108010003137 tyrosyltyrosine Proteins 0.000 description 2
- 108010073969 valyllysine Proteins 0.000 description 2
- 230000003442 weekly effect Effects 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- CNKBMTKICGGSCQ-ACRUOGEOSA-N (2S)-2-[[(2S)-2-[[(2S)-2,6-diamino-1-oxohexyl]amino]-1-oxo-3-phenylpropyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CNKBMTKICGGSCQ-ACRUOGEOSA-N 0.000 description 1
- RVLOMLVNNBWRSR-KNIFDHDWSA-N (2s)-2-aminopropanoic acid;(2s)-2,6-diaminohexanoic acid Chemical compound C[C@H](N)C(O)=O.NCCCC[C@H](N)C(O)=O RVLOMLVNNBWRSR-KNIFDHDWSA-N 0.000 description 1
- PIDRBUDUWHBYSR-UHFFFAOYSA-N 1-[2-[[2-[(2-amino-4-methylpentanoyl)amino]-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O PIDRBUDUWHBYSR-UHFFFAOYSA-N 0.000 description 1
- OWEGMIWEEQEYGQ-UHFFFAOYSA-N 100676-05-9 Natural products OC1C(O)C(O)C(CO)OC1OCC1C(O)C(O)C(O)C(OC2C(OC(O)C(O)C2O)CO)O1 OWEGMIWEEQEYGQ-UHFFFAOYSA-N 0.000 description 1
- HQRHFUYMGCHHJS-UHFFFAOYSA-N 2-[[2-[(2-aminoacetyl)amino]acetyl]amino]-5-(diaminomethylideneamino)pentanoic acid Chemical compound NCC(=O)NCC(=O)NC(C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-UHFFFAOYSA-N 0.000 description 1
- MLONYBFKXHEPCD-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol Chemical compound OCC(N)(CO)CO.OCC(N)(CO)CO MLONYBFKXHEPCD-UHFFFAOYSA-N 0.000 description 1
- BFSVOASYOCHEOV-UHFFFAOYSA-N 2-diethylaminoethanol Chemical compound CCN(CC)CCO BFSVOASYOCHEOV-UHFFFAOYSA-N 0.000 description 1
- SUGXUUGGLDCZKB-UHFFFAOYSA-N 3,4-dichloroisocoumarin Chemical compound C1=CC=C2C(Cl)=C(Cl)OC(=O)C2=C1 SUGXUUGGLDCZKB-UHFFFAOYSA-N 0.000 description 1
- UPMXNNIRAGDFEH-UHFFFAOYSA-N 3,5-dibromo-4-hydroxybenzonitrile Chemical compound OC1=C(Br)C=C(C#N)C=C1Br UPMXNNIRAGDFEH-UHFFFAOYSA-N 0.000 description 1
- CAAMSDWKXXPUJR-UHFFFAOYSA-N 3,5-dihydro-4H-imidazol-4-one Chemical class O=C1CNC=N1 CAAMSDWKXXPUJR-UHFFFAOYSA-N 0.000 description 1
- CTRXDTYTAAKVSM-UHFFFAOYSA-N 3-{[ethyl({4-[(4-{ethyl[(3-sulfophenyl)methyl]amino}phenyl)(2-sulfophenyl)methylidene]cyclohexa-2,5-dien-1-ylidene})azaniumyl]methyl}benzene-1-sulfonate Chemical compound C=1C=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C(=CC=CC=2)S(O)(=O)=O)C=CC=1N(CC)CC1=CC=CC(S(O)(=O)=O)=C1 CTRXDTYTAAKVSM-UHFFFAOYSA-N 0.000 description 1
- ZPLCXHWYPWVJDL-UHFFFAOYSA-N 4-[(4-hydroxyphenyl)methyl]-1,3-oxazolidin-2-one Chemical compound C1=CC(O)=CC=C1CC1NC(=O)OC1 ZPLCXHWYPWVJDL-UHFFFAOYSA-N 0.000 description 1
- 241000208140 Acer Species 0.000 description 1
- 108010085238 Actins Proteins 0.000 description 1
- 102000007469 Actins Human genes 0.000 description 1
- 241000079319 Aculops lycopersici Species 0.000 description 1
- 241000585703 Adelphia <angiosperm> Species 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- GJMNLCSOIHOLQZ-FXQIFTODSA-N Ala-Ala-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](C)NC(=O)[C@H](C)N)C(O)=O GJMNLCSOIHOLQZ-FXQIFTODSA-N 0.000 description 1
- SITWEMZOJNKJCH-WDSKDSINSA-N Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SITWEMZOJNKJCH-WDSKDSINSA-N 0.000 description 1
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 1
- HGRBNYQIMKTUNT-XVYDVKMFSA-N Ala-Asn-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HGRBNYQIMKTUNT-XVYDVKMFSA-N 0.000 description 1
- STACJSVFHSEZJV-GHCJXIJMSA-N Ala-Asn-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STACJSVFHSEZJV-GHCJXIJMSA-N 0.000 description 1
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 1
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 1
- GGNHBHYDMUDXQB-KBIXCLLPSA-N Ala-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)N GGNHBHYDMUDXQB-KBIXCLLPSA-N 0.000 description 1
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 1
- XYTNPQNAZREREP-XQXXSGGOSA-N Ala-Glu-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XYTNPQNAZREREP-XQXXSGGOSA-N 0.000 description 1
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 1
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- XZWXFWBHYRFLEF-FSPLSTOPSA-N Ala-His Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 XZWXFWBHYRFLEF-FSPLSTOPSA-N 0.000 description 1
- JEPNLGMEZMCFEX-QSFUFRPTSA-N Ala-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N JEPNLGMEZMCFEX-QSFUFRPTSA-N 0.000 description 1
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- ZKEHTYWGPMMGBC-XUXIUFHCSA-N Ala-Leu-Leu-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O ZKEHTYWGPMMGBC-XUXIUFHCSA-N 0.000 description 1
- VHVVPYOJIIQCKS-QEJZJMRPSA-N Ala-Leu-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VHVVPYOJIIQCKS-QEJZJMRPSA-N 0.000 description 1
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- FUKFQILQFQKHLE-DCAQKATOSA-N Ala-Lys-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O FUKFQILQFQKHLE-DCAQKATOSA-N 0.000 description 1
- RAAWHFXHAACDFT-FXQIFTODSA-N Ala-Met-Asn Chemical compound CSCC[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CC(N)=O)C(O)=O RAAWHFXHAACDFT-FXQIFTODSA-N 0.000 description 1
- DEWWPUNXRNGMQN-LPEHRKFASA-N Ala-Met-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N DEWWPUNXRNGMQN-LPEHRKFASA-N 0.000 description 1
- CJQAEJMHBAOQHA-DLOVCJGASA-N Ala-Phe-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CJQAEJMHBAOQHA-DLOVCJGASA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- DYJJJCHDHLEFDW-FXQIFTODSA-N Ala-Pro-Cys Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N DYJJJCHDHLEFDW-FXQIFTODSA-N 0.000 description 1
- YHBDGLZYNIARKJ-GUBZILKMSA-N Ala-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N YHBDGLZYNIARKJ-GUBZILKMSA-N 0.000 description 1
- MSWSRLGNLKHDEI-ACZMJKKPSA-N Ala-Ser-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O MSWSRLGNLKHDEI-ACZMJKKPSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- YNOCMHZSWJMGBB-GCJQMDKQSA-N Ala-Thr-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O YNOCMHZSWJMGBB-GCJQMDKQSA-N 0.000 description 1
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 1
- AENHOIXXHKNIQL-AUTRQRHGSA-N Ala-Tyr-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H]([NH3+])C)CC1=CC=C(O)C=C1 AENHOIXXHKNIQL-AUTRQRHGSA-N 0.000 description 1
- BGGAIXWIZCIFSG-XDTLVQLUSA-N Ala-Tyr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O BGGAIXWIZCIFSG-XDTLVQLUSA-N 0.000 description 1
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 1
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 1
- RFJNDTQGEJRBHO-DCAQKATOSA-N Ala-Val-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)[NH3+] RFJNDTQGEJRBHO-DCAQKATOSA-N 0.000 description 1
- USFZMSVCRYTOJT-UHFFFAOYSA-N Ammonium acetate Chemical compound N.CC(O)=O USFZMSVCRYTOJT-UHFFFAOYSA-N 0.000 description 1
- 239000005695 Ammonium acetate Substances 0.000 description 1
- 244000105624 Arachis hypogaea Species 0.000 description 1
- DFCIPNHFKOQAME-FXQIFTODSA-N Arg-Ala-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFCIPNHFKOQAME-FXQIFTODSA-N 0.000 description 1
- YYOVLDPHIJAOSY-DCAQKATOSA-N Arg-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N YYOVLDPHIJAOSY-DCAQKATOSA-N 0.000 description 1
- VBFJESQBIWCWRL-DCAQKATOSA-N Arg-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCNC(N)=N VBFJESQBIWCWRL-DCAQKATOSA-N 0.000 description 1
- SBVJJNJLFWSJOV-UBHSHLNASA-N Arg-Ala-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SBVJJNJLFWSJOV-UBHSHLNASA-N 0.000 description 1
- OMLWNBVRVJYMBQ-YUMQZZPRSA-N Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OMLWNBVRVJYMBQ-YUMQZZPRSA-N 0.000 description 1
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 1
- OVVUNXXROOFSIM-SDDRHHMPSA-N Arg-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O OVVUNXXROOFSIM-SDDRHHMPSA-N 0.000 description 1
- RWCLSUOSKWTXLA-FXQIFTODSA-N Arg-Asp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RWCLSUOSKWTXLA-FXQIFTODSA-N 0.000 description 1
- MFAMTAVAFBPXDC-LPEHRKFASA-N Arg-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O MFAMTAVAFBPXDC-LPEHRKFASA-N 0.000 description 1
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 1
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 1
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 1
- QEHMMRSQJMOYNO-DCAQKATOSA-N Arg-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QEHMMRSQJMOYNO-DCAQKATOSA-N 0.000 description 1
- NVCIXQYNWYTLDO-IHRRRGAJSA-N Arg-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCN=C(N)N)N NVCIXQYNWYTLDO-IHRRRGAJSA-N 0.000 description 1
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 1
- OFIYLHVAAJYRBC-HJWJTTGWSA-N Arg-Ile-Phe Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O OFIYLHVAAJYRBC-HJWJTTGWSA-N 0.000 description 1
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 1
- ROWCTNFEMKOIFQ-YUMQZZPRSA-N Arg-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N ROWCTNFEMKOIFQ-YUMQZZPRSA-N 0.000 description 1
- JOADBFCFJGNIKF-GUBZILKMSA-N Arg-Met-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O JOADBFCFJGNIKF-GUBZILKMSA-N 0.000 description 1
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 1
- FKQITMVNILRUCQ-IHRRRGAJSA-N Arg-Phe-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O FKQITMVNILRUCQ-IHRRRGAJSA-N 0.000 description 1
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 1
- OQPAZKMGCWPERI-GUBZILKMSA-N Arg-Ser-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OQPAZKMGCWPERI-GUBZILKMSA-N 0.000 description 1
- AIFHRTPABBBHKU-RCWTZXSCSA-N Arg-Thr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AIFHRTPABBBHKU-RCWTZXSCSA-N 0.000 description 1
- WCZXPVPHUMYLMS-VEVYYDQMSA-N Arg-Thr-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O WCZXPVPHUMYLMS-VEVYYDQMSA-N 0.000 description 1
- ZJBUILVYSXQNSW-YTWAJWBKSA-N Arg-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ZJBUILVYSXQNSW-YTWAJWBKSA-N 0.000 description 1
- XRNXPIGJPQHCPC-RCWTZXSCSA-N Arg-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)O)C(O)=O XRNXPIGJPQHCPC-RCWTZXSCSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- QMQZYILAWUOLPV-JYJNAYRXSA-N Arg-Tyr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)CC1=CC=C(O)C=C1 QMQZYILAWUOLPV-JYJNAYRXSA-N 0.000 description 1
- PJOPLXOCKACMLK-KKUMJFAQSA-N Arg-Tyr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O PJOPLXOCKACMLK-KKUMJFAQSA-N 0.000 description 1
- QHUOOCKNNURZSL-IHRRRGAJSA-N Arg-Tyr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O QHUOOCKNNURZSL-IHRRRGAJSA-N 0.000 description 1
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 1
- CNBIWSCSSCAINS-UFYCRDLUSA-N Arg-Tyr-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNBIWSCSSCAINS-UFYCRDLUSA-N 0.000 description 1
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 1
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 1
- CMLGVVWQQHUXOZ-GHCJXIJMSA-N Asn-Ala-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CMLGVVWQQHUXOZ-GHCJXIJMSA-N 0.000 description 1
- NUHQMYUWLUSRJX-BIIVOSGPSA-N Asn-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N NUHQMYUWLUSRJX-BIIVOSGPSA-N 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- QEYJFBMTSMLPKZ-ZKWXMUAHSA-N Asn-Ala-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O QEYJFBMTSMLPKZ-ZKWXMUAHSA-N 0.000 description 1
- VDCIPFYVCICPEC-FXQIFTODSA-N Asn-Arg-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O VDCIPFYVCICPEC-FXQIFTODSA-N 0.000 description 1
- HUZGPXBILPMCHM-IHRRRGAJSA-N Asn-Arg-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HUZGPXBILPMCHM-IHRRRGAJSA-N 0.000 description 1
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 1
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 1
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 1
- OOWSBIOUKIUWLO-RCOVLWMOSA-N Asn-Gly-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O OOWSBIOUKIUWLO-RCOVLWMOSA-N 0.000 description 1
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 1
- HDHZCEDPLTVHFZ-GUBZILKMSA-N Asn-Leu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O HDHZCEDPLTVHFZ-GUBZILKMSA-N 0.000 description 1
- QJMCHPGWFZZRID-BQBZGAKWSA-N Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O QJMCHPGWFZZRID-BQBZGAKWSA-N 0.000 description 1
- ALHMNHZJBYBYHS-DCAQKATOSA-N Asn-Lys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ALHMNHZJBYBYHS-DCAQKATOSA-N 0.000 description 1
- ZYPWIUFLYMQZBS-SRVKXCTJSA-N Asn-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZYPWIUFLYMQZBS-SRVKXCTJSA-N 0.000 description 1
- COWITDLVHMZSIW-CIUDSAMLSA-N Asn-Lys-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O COWITDLVHMZSIW-CIUDSAMLSA-N 0.000 description 1
- AYOAHKWVQLNPDM-HJGDQZAQSA-N Asn-Lys-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AYOAHKWVQLNPDM-HJGDQZAQSA-N 0.000 description 1
- QDXQWFBLUVTOFL-FXQIFTODSA-N Asn-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(=O)N)N QDXQWFBLUVTOFL-FXQIFTODSA-N 0.000 description 1
- OMSMPWHEGLNQOD-UWVGGRQHSA-N Asn-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OMSMPWHEGLNQOD-UWVGGRQHSA-N 0.000 description 1
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 1
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 1
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 1
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 1
- IIQIOFVDFOLCHP-UHFFFAOYSA-N Asn-Pro-Ser-Ser Chemical compound NC(=O)CC(N)C(=O)N1CCCC1C(=O)NC(CO)C(=O)NC(CO)C(O)=O IIQIOFVDFOLCHP-UHFFFAOYSA-N 0.000 description 1
- SUIJFTJDTJKSRK-IHRRRGAJSA-N Asn-Pro-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUIJFTJDTJKSRK-IHRRRGAJSA-N 0.000 description 1
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 1
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 1
- BIGRHVNFFJTHEB-UBHSHLNASA-N Asn-Trp-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O BIGRHVNFFJTHEB-UBHSHLNASA-N 0.000 description 1
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 1
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 1
- CGYKCTPUGXFPMG-IHPCNDPISA-N Asn-Tyr-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O CGYKCTPUGXFPMG-IHPCNDPISA-N 0.000 description 1
- DPSUVAPLRQDWAO-YDHLFZDLSA-N Asn-Tyr-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)N)N DPSUVAPLRQDWAO-YDHLFZDLSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- DVUFTQLHHHJEMK-IMJSIDKUSA-N Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O DVUFTQLHHHJEMK-IMJSIDKUSA-N 0.000 description 1
- UWMIZBCTVWVMFI-FXQIFTODSA-N Asp-Ala-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UWMIZBCTVWVMFI-FXQIFTODSA-N 0.000 description 1
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 1
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 1
- ILJQISGMGXRZQQ-IHRRRGAJSA-N Asp-Arg-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ILJQISGMGXRZQQ-IHRRRGAJSA-N 0.000 description 1
- UQBGYPFHWFZMCD-ZLUOBGJFSA-N Asp-Asn-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O UQBGYPFHWFZMCD-ZLUOBGJFSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- RYEWQKQXRJCHIO-SRVKXCTJSA-N Asp-Asn-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 RYEWQKQXRJCHIO-SRVKXCTJSA-N 0.000 description 1
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 1
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 1
- PXLNPFOJZQMXAT-BYULHYEWSA-N Asp-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O PXLNPFOJZQMXAT-BYULHYEWSA-N 0.000 description 1
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 1
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 1
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 1
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 1
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 1
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 1
- DTNUIAJCPRMNBT-WHFBIAKZSA-N Asp-Gly-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O DTNUIAJCPRMNBT-WHFBIAKZSA-N 0.000 description 1
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 1
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 1
- PSLSTUMPZILTAH-BYULHYEWSA-N Asp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PSLSTUMPZILTAH-BYULHYEWSA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- ILQCHXURSRRIRY-YUMQZZPRSA-N Asp-His-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)O)N ILQCHXURSRRIRY-YUMQZZPRSA-N 0.000 description 1
- NHSDEZURHWEZPN-SXTJYALSSA-N Asp-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CC(=O)O)N NHSDEZURHWEZPN-SXTJYALSSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 1
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 1
- AKKUDRZKFZWPBH-SRVKXCTJSA-N Asp-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N AKKUDRZKFZWPBH-SRVKXCTJSA-N 0.000 description 1
- HJCGDIGVVWETRO-ZPFDUUQYSA-N Asp-Lys-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O)C(O)=O HJCGDIGVVWETRO-ZPFDUUQYSA-N 0.000 description 1
- HXVILZUZXFLVEN-DCAQKATOSA-N Asp-Met-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O HXVILZUZXFLVEN-DCAQKATOSA-N 0.000 description 1
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 1
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 1
- VKPHBHGUUUPGAI-UHFFFAOYSA-N Asp-Phe-Tyr-Tyr Chemical compound C=1C=C(O)C=CC=1CC(C(=O)NC(CC=1C=CC(O)=CC=1)C(O)=O)NC(=O)C(NC(=O)C(CC(O)=O)N)CC1=CC=CC=C1 VKPHBHGUUUPGAI-UHFFFAOYSA-N 0.000 description 1
- UKGGPJNBONZZCM-WDSKDSINSA-N Asp-Pro Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 1
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 1
- RVMXMLSYBTXCAV-VEVYYDQMSA-N Asp-Pro-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMXMLSYBTXCAV-VEVYYDQMSA-N 0.000 description 1
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 1
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 1
- MGSVBZIBCCKGCY-ZLUOBGJFSA-N Asp-Ser-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MGSVBZIBCCKGCY-ZLUOBGJFSA-N 0.000 description 1
- XYPJXLLXNSAWHZ-SRVKXCTJSA-N Asp-Ser-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XYPJXLLXNSAWHZ-SRVKXCTJSA-N 0.000 description 1
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 1
- UTLCRGFJFSZWAW-OLHMAJIHSA-N Asp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UTLCRGFJFSZWAW-OLHMAJIHSA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- ZARXTZFGQZBYFO-JQWIXIFHSA-N Asp-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(O)=O)N)C(O)=O)=CNC2=C1 ZARXTZFGQZBYFO-JQWIXIFHSA-N 0.000 description 1
- MRYDJCIIVRXVGG-QEJZJMRPSA-N Asp-Trp-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O MRYDJCIIVRXVGG-QEJZJMRPSA-N 0.000 description 1
- LEYKQPDPZJIRTA-AQZXSJQPSA-N Asp-Trp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LEYKQPDPZJIRTA-AQZXSJQPSA-N 0.000 description 1
- GGBQDSHTXKQSLP-NHCYSSNCSA-N Asp-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N GGBQDSHTXKQSLP-NHCYSSNCSA-N 0.000 description 1
- RKXVTTIQNKPCHU-KKHAAJSZSA-N Asp-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O RKXVTTIQNKPCHU-KKHAAJSZSA-N 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 239000007989 BIS-Tris Propane buffer Substances 0.000 description 1
- 102100021277 Beta-secretase 2 Human genes 0.000 description 1
- 101710150190 Beta-secretase 2 Proteins 0.000 description 1
- 239000002028 Biomass Substances 0.000 description 1
- 241000238657 Blattella germanica Species 0.000 description 1
- 241001674044 Blattodea Species 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 description 1
- 235000006008 Brassica napus var napus Nutrition 0.000 description 1
- 240000000385 Brassica napus var. napus Species 0.000 description 1
- 235000006618 Brassica rapa subsp oleifera Nutrition 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 239000005489 Bromoxynil Substances 0.000 description 1
- 238000009631 Broth culture Methods 0.000 description 1
- 125000001433 C-terminal amino-acid group Chemical group 0.000 description 1
- RZZPDXZPRHQOCG-OJAKKHQRSA-O CDP-choline(1+) Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OCC[N+](C)(C)C)O[C@H]1N1C(=O)N=C(N)C=C1 RZZPDXZPRHQOCG-OJAKKHQRSA-O 0.000 description 1
- 241000294063 Cacopsylla bidens Species 0.000 description 1
- 101100283604 Caenorhabditis elegans pigk-1 gene Proteins 0.000 description 1
- 108090000565 Capsid Proteins Proteins 0.000 description 1
- 241000701489 Cauliflower mosaic virus Species 0.000 description 1
- 241001414720 Cicadellidae Species 0.000 description 1
- 240000001980 Cucurbita pepo Species 0.000 description 1
- 241000254171 Curculionidae Species 0.000 description 1
- AYKQJQVWUYEZNU-IMJSIDKUSA-N Cys-Asn Chemical compound SC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O AYKQJQVWUYEZNU-IMJSIDKUSA-N 0.000 description 1
- UPJGYXRAPJWIHD-CIUDSAMLSA-N Cys-Asn-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UPJGYXRAPJWIHD-CIUDSAMLSA-N 0.000 description 1
- DVIHGGUODLILFN-GHCJXIJMSA-N Cys-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N DVIHGGUODLILFN-GHCJXIJMSA-N 0.000 description 1
- NXTYATMDWQYLGJ-BQBZGAKWSA-N Cys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CS NXTYATMDWQYLGJ-BQBZGAKWSA-N 0.000 description 1
- QQOWCDCBFFBRQH-IXOXFDKPSA-N Cys-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CS)N)O QQOWCDCBFFBRQH-IXOXFDKPSA-N 0.000 description 1
- CMYVIUWVYHOLRD-ZLUOBGJFSA-N Cys-Ser-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CMYVIUWVYHOLRD-ZLUOBGJFSA-N 0.000 description 1
- ZAKOWWREFLAJOT-CEFNRUSXSA-N D-alpha-tocopherylacetate Chemical compound CC(=O)OC1=C(C)C(C)=C2O[C@@](CCC[C@H](C)CCC[C@H](C)CCCC(C)C)(C)CCC2=C1C ZAKOWWREFLAJOT-CEFNRUSXSA-N 0.000 description 1
- 108010066133 D-octopine dehydrogenase Proteins 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 238000007900 DNA-DNA hybridization Methods 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- NDUPDOJHUQKPAG-UHFFFAOYSA-N Dalapon Chemical compound CC(Cl)(Cl)C(O)=O NDUPDOJHUQKPAG-UHFFFAOYSA-N 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 206010012735 Diarrhoea Diseases 0.000 description 1
- 241001057636 Dracaena deremensis Species 0.000 description 1
- 241000305071 Enterobacterales Species 0.000 description 1
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 description 1
- 241000588722 Escherichia Species 0.000 description 1
- 101100377706 Escherichia phage T5 A2.2 gene Proteins 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108050001049 Extracellular proteins Proteins 0.000 description 1
- 229920001917 Ficoll Polymers 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- IGNGBUVODQLMRJ-CIUDSAMLSA-N Gln-Ala-Met Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IGNGBUVODQLMRJ-CIUDSAMLSA-N 0.000 description 1
- WMOMPXKOKASNBK-PEFMBERDSA-N Gln-Asn-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WMOMPXKOKASNBK-PEFMBERDSA-N 0.000 description 1
- VOLVNCMGXWDDQY-LPEHRKFASA-N Gln-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O VOLVNCMGXWDDQY-LPEHRKFASA-N 0.000 description 1
- JXFLPKSDLDEOQK-JHEQGTHGSA-N Gln-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O JXFLPKSDLDEOQK-JHEQGTHGSA-N 0.000 description 1
- XITLYYAIPBBHPX-ZKWXMUAHSA-N Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(N)=O XITLYYAIPBBHPX-ZKWXMUAHSA-N 0.000 description 1
- NHMRJKKAVMENKJ-WDCWCFNPSA-N Gln-Thr-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NHMRJKKAVMENKJ-WDCWCFNPSA-N 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 1
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 1
- DIXKFOPPGWKZLY-CIUDSAMLSA-N Glu-Arg-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O DIXKFOPPGWKZLY-CIUDSAMLSA-N 0.000 description 1
- VTTSANCGJWLPNC-ZPFDUUQYSA-N Glu-Arg-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VTTSANCGJWLPNC-ZPFDUUQYSA-N 0.000 description 1
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 1
- FYYSIASRLDJUNP-WHFBIAKZSA-N Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O FYYSIASRLDJUNP-WHFBIAKZSA-N 0.000 description 1
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 1
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 1
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 1
- LGYCLOCORAEQSZ-PEFMBERDSA-N Glu-Ile-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O LGYCLOCORAEQSZ-PEFMBERDSA-N 0.000 description 1
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 1
- GXMXPCXXKVWOSM-KQXIARHKSA-N Glu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N GXMXPCXXKVWOSM-KQXIARHKSA-N 0.000 description 1
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 1
- CUPSDFQZTVVTSK-GUBZILKMSA-N Glu-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O CUPSDFQZTVVTSK-GUBZILKMSA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 1
- SXGAGTVDWKQYCX-BQBZGAKWSA-N Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SXGAGTVDWKQYCX-BQBZGAKWSA-N 0.000 description 1
- ZKONLKQGTNVAPR-DCAQKATOSA-N Glu-Pro-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)O)N ZKONLKQGTNVAPR-DCAQKATOSA-N 0.000 description 1
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 1
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 1
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 1
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 1
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 1
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 1
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- 102000053187 Glucuronidase Human genes 0.000 description 1
- 108010060309 Glucuronidase Proteins 0.000 description 1
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 1
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 1
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 1
- LERGJIVJIIODPZ-ZANVPECISA-N Gly-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)C)C(O)=O)=CNC2=C1 LERGJIVJIIODPZ-ZANVPECISA-N 0.000 description 1
- FUESBOMYALLFNI-VKHMYHEASA-N Gly-Asn Chemical compound NCC(=O)N[C@H](C(O)=O)CC(N)=O FUESBOMYALLFNI-VKHMYHEASA-N 0.000 description 1
- AIJAPFVDBFYNKN-WHFBIAKZSA-N Gly-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN)C(=O)N AIJAPFVDBFYNKN-WHFBIAKZSA-N 0.000 description 1
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 1
- SCCPDJAQCXWPTF-VKHMYHEASA-N Gly-Asp Chemical compound NCC(=O)N[C@H](C(O)=O)CC(O)=O SCCPDJAQCXWPTF-VKHMYHEASA-N 0.000 description 1
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 1
- JNGJGFMFXREJNF-KBPBESRZSA-N Gly-Glu-Trp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JNGJGFMFXREJNF-KBPBESRZSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 1
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 1
- KGVHCTWYMPWEGN-FSPLSTOPSA-N Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CN KGVHCTWYMPWEGN-FSPLSTOPSA-N 0.000 description 1
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 1
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 1
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 1
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 1
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 1
- PFMUCCYYAAFKTH-YFKPBYRVSA-N Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CN PFMUCCYYAAFKTH-YFKPBYRVSA-N 0.000 description 1
- RVGMVLVBDRQVKB-UWVGGRQHSA-N Gly-Met-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)CN RVGMVLVBDRQVKB-UWVGGRQHSA-N 0.000 description 1
- DHNXGWVNLFPOMQ-KBPBESRZSA-N Gly-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN DHNXGWVNLFPOMQ-KBPBESRZSA-N 0.000 description 1
- JPVGHHQGKPQYIL-KBPBESRZSA-N Gly-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 JPVGHHQGKPQYIL-KBPBESRZSA-N 0.000 description 1
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- IMRNSEPSPFQNHF-STQMWFEESA-N Gly-Ser-Trp Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=CC=CC=C12)C(=O)O IMRNSEPSPFQNHF-STQMWFEESA-N 0.000 description 1
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 1
- NVTPVQLIZCOJFK-FOHZUACHSA-N Gly-Thr-Asp Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O NVTPVQLIZCOJFK-FOHZUACHSA-N 0.000 description 1
- KOYUSMBPJOVSOO-XEGUGMAKSA-N Gly-Tyr-Ile Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KOYUSMBPJOVSOO-XEGUGMAKSA-N 0.000 description 1
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 1
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 1
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 108700037728 Glycine max beta-conglycinin Proteins 0.000 description 1
- 239000005562 Glyphosate Substances 0.000 description 1
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 1
- 206010018910 Haemolysis Diseases 0.000 description 1
- 241000506680 Haemulon melanurum Species 0.000 description 1
- 244000020551 Helianthus annuus Species 0.000 description 1
- 235000003222 Helianthus annuus Nutrition 0.000 description 1
- 108010006464 Hemolysin Proteins Proteins 0.000 description 1
- 241001465746 Heterorhabditidae Species 0.000 description 1
- DZMVESFTHXSSPZ-XVYDVKMFSA-N His-Ala-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DZMVESFTHXSSPZ-XVYDVKMFSA-N 0.000 description 1
- NOQPTNXSGNPJNS-YUMQZZPRSA-N His-Asn-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O NOQPTNXSGNPJNS-YUMQZZPRSA-N 0.000 description 1
- JWTKVPMQCCRPQY-SRVKXCTJSA-N His-Asn-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JWTKVPMQCCRPQY-SRVKXCTJSA-N 0.000 description 1
- WZOGEMJIZBNFBK-CIUDSAMLSA-N His-Asp-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O WZOGEMJIZBNFBK-CIUDSAMLSA-N 0.000 description 1
- NQKRILCJYCASDV-QWRGUYRKSA-N His-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 NQKRILCJYCASDV-QWRGUYRKSA-N 0.000 description 1
- BDFCIKANUNMFGB-PMVVWTBXSA-N His-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 BDFCIKANUNMFGB-PMVVWTBXSA-N 0.000 description 1
- JJHWJUYYTWYXPL-PYJNHQTQSA-N His-Ile-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CN=CN1 JJHWJUYYTWYXPL-PYJNHQTQSA-N 0.000 description 1
- ORERHHPZDDEMSC-VGDYDELISA-N His-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ORERHHPZDDEMSC-VGDYDELISA-N 0.000 description 1
- MMFKFJORZBJVNF-UWVGGRQHSA-N His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 MMFKFJORZBJVNF-UWVGGRQHSA-N 0.000 description 1
- LVWIJITYHRZHBO-IXOXFDKPSA-N His-Leu-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LVWIJITYHRZHBO-IXOXFDKPSA-N 0.000 description 1
- VGYOLSOFODKLSP-IHPCNDPISA-N His-Leu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CN=CN1 VGYOLSOFODKLSP-IHPCNDPISA-N 0.000 description 1
- LNDVNHOSZQPJGI-AVGNSLFASA-N His-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNDVNHOSZQPJGI-AVGNSLFASA-N 0.000 description 1
- HTOOKGDPMXSJSY-STQMWFEESA-N His-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 HTOOKGDPMXSJSY-STQMWFEESA-N 0.000 description 1
- 101001062098 Homo sapiens RNA-binding protein 14 Proteins 0.000 description 1
- UFHFLCQGNIYNRP-UHFFFAOYSA-N Hydrogen Chemical compound [H][H] UFHFLCQGNIYNRP-UHFFFAOYSA-N 0.000 description 1
- 206010020649 Hyperkeratosis Diseases 0.000 description 1
- RCFDOSNHHZGBOY-ACZMJKKPSA-N Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(O)=O RCFDOSNHHZGBOY-ACZMJKKPSA-N 0.000 description 1
- JRHFQUPIZOYKQP-KBIXCLLPSA-N Ile-Ala-Glu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O JRHFQUPIZOYKQP-KBIXCLLPSA-N 0.000 description 1
- YPWHUFAAMNHMGS-QSFUFRPTSA-N Ile-Ala-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N YPWHUFAAMNHMGS-QSFUFRPTSA-N 0.000 description 1
- QICVAHODWHIWIS-HTFCKZLJSA-N Ile-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N QICVAHODWHIWIS-HTFCKZLJSA-N 0.000 description 1
- HLYBGMZJVDHJEO-CYDGBPFRSA-N Ile-Arg-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HLYBGMZJVDHJEO-CYDGBPFRSA-N 0.000 description 1
- DMHGKBGOUAJRHU-RVMXOQNASA-N Ile-Arg-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N DMHGKBGOUAJRHU-RVMXOQNASA-N 0.000 description 1
- HZYHBDVRCBDJJV-HAFWLYHUSA-N Ile-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O HZYHBDVRCBDJJV-HAFWLYHUSA-N 0.000 description 1
- QADCTXFNLZBZAB-GHCJXIJMSA-N Ile-Asn-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N QADCTXFNLZBZAB-GHCJXIJMSA-N 0.000 description 1
- NCSIQAFSIPHVAN-IUKAMOBKSA-N Ile-Asn-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NCSIQAFSIPHVAN-IUKAMOBKSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- GYAFMRQGWHXMII-IUKAMOBKSA-N Ile-Asp-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N GYAFMRQGWHXMII-IUKAMOBKSA-N 0.000 description 1
- JSZMKEYEVLDPDO-ACZMJKKPSA-N Ile-Cys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CS)C(O)=O JSZMKEYEVLDPDO-ACZMJKKPSA-N 0.000 description 1
- WTOAPTKSZJJWKK-HTFCKZLJSA-N Ile-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WTOAPTKSZJJWKK-HTFCKZLJSA-N 0.000 description 1
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 1
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 1
- QNBYCZTZNOVDMI-HGNGGELXSA-N Ile-His Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 QNBYCZTZNOVDMI-HGNGGELXSA-N 0.000 description 1
- ZXIGYKICRDFISM-DJFWLOJKSA-N Ile-His-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ZXIGYKICRDFISM-DJFWLOJKSA-N 0.000 description 1
- KYLIZSDYWQQTFM-PEDHHIEDSA-N Ile-Ile-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N KYLIZSDYWQQTFM-PEDHHIEDSA-N 0.000 description 1
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 1
- UYNXBNHVWFNVIN-HJWJTTGWSA-N Ile-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 UYNXBNHVWFNVIN-HJWJTTGWSA-N 0.000 description 1
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 1
- VEPIBPGLTLPBDW-URLPEUOOSA-N Ile-Phe-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VEPIBPGLTLPBDW-URLPEUOOSA-N 0.000 description 1
- IITVUURPOYGCTD-NAKRPEOUSA-N Ile-Pro-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IITVUURPOYGCTD-NAKRPEOUSA-N 0.000 description 1
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 1
- CAHCWMVNBZJVAW-NAKRPEOUSA-N Ile-Pro-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)O)N CAHCWMVNBZJVAW-NAKRPEOUSA-N 0.000 description 1
- MLSUZXHSNRBDCI-CYDGBPFRSA-N Ile-Pro-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)O)N MLSUZXHSNRBDCI-CYDGBPFRSA-N 0.000 description 1
- JHNJNTMTZHEDLJ-NAKRPEOUSA-N Ile-Ser-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JHNJNTMTZHEDLJ-NAKRPEOUSA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 1
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 1
- RKQAYOWLSFLJEE-SVSWQMSJSA-N Ile-Thr-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)O)N RKQAYOWLSFLJEE-SVSWQMSJSA-N 0.000 description 1
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 1
- HQLSBZFLOUHQJK-STECZYCISA-N Ile-Tyr-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HQLSBZFLOUHQJK-STECZYCISA-N 0.000 description 1
- YJRSIJZUIUANHO-NAKRPEOUSA-N Ile-Val-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(=O)O)N YJRSIJZUIUANHO-NAKRPEOUSA-N 0.000 description 1
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 1
- JCGMFFQQHJQASB-PYJNHQTQSA-N Ile-Val-His Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O JCGMFFQQHJQASB-PYJNHQTQSA-N 0.000 description 1
- APQYGMBHIVXFML-OSUNSFLBSA-N Ile-Val-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N APQYGMBHIVXFML-OSUNSFLBSA-N 0.000 description 1
- 108010060231 Insect Proteins Proteins 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 102100024319 Intestinal-type alkaline phosphatase Human genes 0.000 description 1
- 101710184243 Intestinal-type alkaline phosphatase Proteins 0.000 description 1
- OWYWGLHRNBIFJP-UHFFFAOYSA-N Ipazine Chemical compound CCN(CC)C1=NC(Cl)=NC(NC(C)C)=N1 OWYWGLHRNBIFJP-UHFFFAOYSA-N 0.000 description 1
- 108010025815 Kanamycin Kinase Proteins 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 1
- 241000258915 Leptinotarsa Species 0.000 description 1
- HSQGMTRYSIHDAC-BQBZGAKWSA-N Leu-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(O)=O HSQGMTRYSIHDAC-BQBZGAKWSA-N 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- ZTUWCZQOKOJGEX-DCAQKATOSA-N Leu-Ala-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O ZTUWCZQOKOJGEX-DCAQKATOSA-N 0.000 description 1
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 1
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 1
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- ZDSNOSQHMJBRQN-SRVKXCTJSA-N Leu-Asp-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZDSNOSQHMJBRQN-SRVKXCTJSA-N 0.000 description 1
- KTFHTMHHKXUYPW-ZPFDUUQYSA-N Leu-Asp-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KTFHTMHHKXUYPW-ZPFDUUQYSA-N 0.000 description 1
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 1
- DKEZVKFLETVJFY-CIUDSAMLSA-N Leu-Cys-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DKEZVKFLETVJFY-CIUDSAMLSA-N 0.000 description 1
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 1
- RVVBWTWPNFDYBE-SRVKXCTJSA-N Leu-Glu-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVVBWTWPNFDYBE-SRVKXCTJSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 1
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- XWOBNBRUDDUEEY-UWVGGRQHSA-N Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XWOBNBRUDDUEEY-UWVGGRQHSA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- XQXGNBFMAXWIGI-MXAVVETBSA-N Leu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 XQXGNBFMAXWIGI-MXAVVETBSA-N 0.000 description 1
- AZLASBBHHSLQDB-GUBZILKMSA-N Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C AZLASBBHHSLQDB-GUBZILKMSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 1
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 1
- NRFGTHFONZYFNY-MGHWNKPDSA-N Leu-Ile-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NRFGTHFONZYFNY-MGHWNKPDSA-N 0.000 description 1
- PDQDCFBVYXEFSD-SRVKXCTJSA-N Leu-Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 1
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 1
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 1
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 1
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 1
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 1
- NTISAKGPIGTIJJ-IUCAKERBSA-N Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C NTISAKGPIGTIJJ-IUCAKERBSA-N 0.000 description 1
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 1
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 1
- MVVSHHJKJRZVNY-ACRUOGEOSA-N Leu-Phe-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MVVSHHJKJRZVNY-ACRUOGEOSA-N 0.000 description 1
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 1
- XXXXOVFBXRERQL-ULQDDVLXSA-N Leu-Pro-Phe Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XXXXOVFBXRERQL-ULQDDVLXSA-N 0.000 description 1
- KZZCOWMDDXDKSS-CIUDSAMLSA-N Leu-Ser-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KZZCOWMDDXDKSS-CIUDSAMLSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 1
- MVHXGBZUJLWZOH-BJDJZHNGSA-N Leu-Ser-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVHXGBZUJLWZOH-BJDJZHNGSA-N 0.000 description 1
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 1
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- LRKCBIUDWAXNEG-CSMHCCOUSA-N Leu-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRKCBIUDWAXNEG-CSMHCCOUSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 1
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 1
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 1
- ARNIBBOXIAWUOP-MGHWNKPDSA-N Leu-Tyr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ARNIBBOXIAWUOP-MGHWNKPDSA-N 0.000 description 1
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- VUBIPAHVHMZHCM-KKUMJFAQSA-N Leu-Tyr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 VUBIPAHVHMZHCM-KKUMJFAQSA-N 0.000 description 1
- VQHUBNVKFFLWRP-ULQDDVLXSA-N Leu-Tyr-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 VQHUBNVKFFLWRP-ULQDDVLXSA-N 0.000 description 1
- CGHXMODRYJISSK-NHCYSSNCSA-N Leu-Val-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O CGHXMODRYJISSK-NHCYSSNCSA-N 0.000 description 1
- QQXJROOJCMIHIV-AVGNSLFASA-N Leu-Val-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O QQXJROOJCMIHIV-AVGNSLFASA-N 0.000 description 1
- 101001090725 Leuconostoc gelidum Bacteriocin leucocin-A Proteins 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- 241000721703 Lymantria dispar Species 0.000 description 1
- YIBOAHAOAWACDK-QEJZJMRPSA-N Lys-Ala-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YIBOAHAOAWACDK-QEJZJMRPSA-N 0.000 description 1
- KNKHAVVBVXKOGX-JXUBOQSCSA-N Lys-Ala-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KNKHAVVBVXKOGX-JXUBOQSCSA-N 0.000 description 1
- NPBGTPKLVJEOBE-IUCAKERBSA-N Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N NPBGTPKLVJEOBE-IUCAKERBSA-N 0.000 description 1
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- 108010062166 Lys-Asn-Asp Proteins 0.000 description 1
- BYPMOIFBQPEWOH-CIUDSAMLSA-N Lys-Asn-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BYPMOIFBQPEWOH-CIUDSAMLSA-N 0.000 description 1
- WLCYCADOWRMSAJ-CIUDSAMLSA-N Lys-Asn-Cys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(O)=O WLCYCADOWRMSAJ-CIUDSAMLSA-N 0.000 description 1
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- OVIVOCSURJYCTM-GUBZILKMSA-N Lys-Asp-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O OVIVOCSURJYCTM-GUBZILKMSA-N 0.000 description 1
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 1
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 1
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 1
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 1
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 1
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 1
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 1
- BEGQVWUZFXLNHZ-IHPCNDPISA-N Lys-Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 BEGQVWUZFXLNHZ-IHPCNDPISA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- YSZNURNVYFUEHC-BQBZGAKWSA-N Lys-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YSZNURNVYFUEHC-BQBZGAKWSA-N 0.000 description 1
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 1
- YCJCEMKOZOYBEF-OEAJRASXSA-N Lys-Thr-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YCJCEMKOZOYBEF-OEAJRASXSA-N 0.000 description 1
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 1
- NROQVSYLPRLJIP-PMVMPFDFSA-N Lys-Trp-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NROQVSYLPRLJIP-PMVMPFDFSA-N 0.000 description 1
- PELXPRPDQRFBGQ-KKUMJFAQSA-N Lys-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O PELXPRPDQRFBGQ-KKUMJFAQSA-N 0.000 description 1
- XYLSGAWRCZECIQ-JYJNAYRXSA-N Lys-Tyr-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 XYLSGAWRCZECIQ-JYJNAYRXSA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 1
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 1
- 241000255682 Malacosoma americanum Species 0.000 description 1
- GUBGYTABKSRVRQ-PICCSMPSSA-N Maltose Natural products O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1O[C@@H]1[C@@H](CO)OC(O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-PICCSMPSSA-N 0.000 description 1
- 241000256010 Manduca Species 0.000 description 1
- 240000004658 Medicago sativa Species 0.000 description 1
- 235000017587 Medicago sativa ssp. sativa Nutrition 0.000 description 1
- JHKXZYLNVJRAAJ-WDSKDSINSA-N Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(O)=O JHKXZYLNVJRAAJ-WDSKDSINSA-N 0.000 description 1
- YRAWWKUTNBILNT-FXQIFTODSA-N Met-Ala-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YRAWWKUTNBILNT-FXQIFTODSA-N 0.000 description 1
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 1
- MUYQDMBLDFEVRJ-LSJOCFKGSA-N Met-Ala-His Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 MUYQDMBLDFEVRJ-LSJOCFKGSA-N 0.000 description 1
- ULNXMMYXQKGNPG-LPEHRKFASA-N Met-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N ULNXMMYXQKGNPG-LPEHRKFASA-N 0.000 description 1
- IVCPHARVJUYDPA-FXQIFTODSA-N Met-Asn-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IVCPHARVJUYDPA-FXQIFTODSA-N 0.000 description 1
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 1
- DRXODWRPPUFIAY-DCAQKATOSA-N Met-Asn-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN DRXODWRPPUFIAY-DCAQKATOSA-N 0.000 description 1
- QTZXSYBVOSXBEJ-WDSKDSINSA-N Met-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O QTZXSYBVOSXBEJ-WDSKDSINSA-N 0.000 description 1
- ADHNYKZHPOEULM-BQBZGAKWSA-N Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O ADHNYKZHPOEULM-BQBZGAKWSA-N 0.000 description 1
- JPCHYAUKOUGOIB-HJGDQZAQSA-N Met-Glu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPCHYAUKOUGOIB-HJGDQZAQSA-N 0.000 description 1
- DJBCKVNHEIJLQA-GMOBBJLQSA-N Met-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCSC)N DJBCKVNHEIJLQA-GMOBBJLQSA-N 0.000 description 1
- PBOUVYGPDSARIS-IUCAKERBSA-N Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C PBOUVYGPDSARIS-IUCAKERBSA-N 0.000 description 1
- HZVXPUHLTZRQEL-UWVGGRQHSA-N Met-Leu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O HZVXPUHLTZRQEL-UWVGGRQHSA-N 0.000 description 1
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 1
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 1
- CNAGWYQWQDMUGC-IHRRRGAJSA-N Met-Phe-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CNAGWYQWQDMUGC-IHRRRGAJSA-N 0.000 description 1
- XPVCDCMPKCERFT-GUBZILKMSA-N Met-Ser-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XPVCDCMPKCERFT-GUBZILKMSA-N 0.000 description 1
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 1
- GHQFLTYXGUETFD-UFYCRDLUSA-N Met-Tyr-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N GHQFLTYXGUETFD-UFYCRDLUSA-N 0.000 description 1
- VWFHWJGVLVZVIS-QXEWZRGKSA-N Met-Val-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O VWFHWJGVLVZVIS-QXEWZRGKSA-N 0.000 description 1
- OVTOTTGZBWXLFU-QXEWZRGKSA-N Met-Val-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O OVTOTTGZBWXLFU-QXEWZRGKSA-N 0.000 description 1
- IQJMEDDVOGMTKT-SRVKXCTJSA-N Met-Val-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IQJMEDDVOGMTKT-SRVKXCTJSA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241000952627 Monomorium pharaonis Species 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 240000001307 Myosotis scorpioides Species 0.000 description 1
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 1
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 1
- 229920002274 Nalgene Polymers 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 101710202365 Napin Proteins 0.000 description 1
- 229930193140 Neomycin Natural products 0.000 description 1
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 1
- 101710089395 Oleosin Proteins 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 101000793655 Oryza sativa subsp. japonica Calreticulin Proteins 0.000 description 1
- 238000009004 PCR Kit Methods 0.000 description 1
- 238000010222 PCR analysis Methods 0.000 description 1
- 108091081548 Palindromic sequence Proteins 0.000 description 1
- 241000488581 Panonychus citri Species 0.000 description 1
- 241000488583 Panonychus ulmi Species 0.000 description 1
- 101710163504 Phaseolin Proteins 0.000 description 1
- DFEVBOYEUQJGER-JURCDPSOSA-N Phe-Ala-Ile Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O DFEVBOYEUQJGER-JURCDPSOSA-N 0.000 description 1
- MQWISMJKHOUEMW-ULQDDVLXSA-N Phe-Arg-His Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CC=CC=C1 MQWISMJKHOUEMW-ULQDDVLXSA-N 0.000 description 1
- MECSIDWUTYRHRJ-KKUMJFAQSA-N Phe-Asn-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O MECSIDWUTYRHRJ-KKUMJFAQSA-N 0.000 description 1
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 1
- MIICYIIBVYQNKE-QEWYBTABSA-N Phe-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N MIICYIIBVYQNKE-QEWYBTABSA-N 0.000 description 1
- GXDPQJUBLBZKDY-IAVJCBSLSA-N Phe-Ile-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GXDPQJUBLBZKDY-IAVJCBSLSA-N 0.000 description 1
- RORUIHAWOLADSH-HJWJTTGWSA-N Phe-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 RORUIHAWOLADSH-HJWJTTGWSA-N 0.000 description 1
- SMFGCTXUBWEPKM-KBPBESRZSA-N Phe-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 SMFGCTXUBWEPKM-KBPBESRZSA-N 0.000 description 1
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 1
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 1
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 1
- RVEVENLSADZUMS-IHRRRGAJSA-N Phe-Pro-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RVEVENLSADZUMS-IHRRRGAJSA-N 0.000 description 1
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- GMWNQSGWWGKTSF-LFSVMHDDSA-N Phe-Thr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMWNQSGWWGKTSF-LFSVMHDDSA-N 0.000 description 1
- BPIMVBKDLSBKIJ-FCLVOEFKSA-N Phe-Thr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BPIMVBKDLSBKIJ-FCLVOEFKSA-N 0.000 description 1
- MMPBPRXOFJNCCN-ZEWNOJEFSA-N Phe-Tyr-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MMPBPRXOFJNCCN-ZEWNOJEFSA-N 0.000 description 1
- BQMFWUKNOCJDNV-HJWJTTGWSA-N Phe-Val-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BQMFWUKNOCJDNV-HJWJTTGWSA-N 0.000 description 1
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 1
- BELBBZDIHDAJOR-UHFFFAOYSA-N Phenolsulfonephthalein Chemical compound C1=CC(O)=CC=C1C1(C=2C=CC(O)=CC=2)C2=CC=CC=C2S(=O)(=O)O1 BELBBZDIHDAJOR-UHFFFAOYSA-N 0.000 description 1
- IAJOBQBIJHVGMQ-UHFFFAOYSA-N Phosphinothricin Natural products CP(O)(=O)CCC(N)C(O)=O IAJOBQBIJHVGMQ-UHFFFAOYSA-N 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- 239000004698 Polyethylene Substances 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- 241000952063 Polyphagotarsonemus latus Species 0.000 description 1
- 229920001214 Polysorbate 60 Polymers 0.000 description 1
- IWNOFCGBMSFTBC-CIUDSAMLSA-N Pro-Ala-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IWNOFCGBMSFTBC-CIUDSAMLSA-N 0.000 description 1
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 1
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 1
- QVIZLAUEAMQKGS-GUBZILKMSA-N Pro-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 QVIZLAUEAMQKGS-GUBZILKMSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 1
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 1
- BEPSGCXDIVACBU-IUCAKERBSA-N Pro-His Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1NCCC1)C1=CN=CN1 BEPSGCXDIVACBU-IUCAKERBSA-N 0.000 description 1
- VWXGFAIZUQBBBG-UWVGGRQHSA-N Pro-His-Gly Chemical compound C([C@@H](C(=O)NCC(=O)[O-])NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 VWXGFAIZUQBBBG-UWVGGRQHSA-N 0.000 description 1
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 1
- ZKQOUHVVXABNDG-IUCAKERBSA-N Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 ZKQOUHVVXABNDG-IUCAKERBSA-N 0.000 description 1
- CLJLVCYFABNTHP-DCAQKATOSA-N Pro-Leu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O CLJLVCYFABNTHP-DCAQKATOSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- RVQDZELMXZRSSI-IUCAKERBSA-N Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 RVQDZELMXZRSSI-IUCAKERBSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- NTXFLJULRHQMDC-GUBZILKMSA-N Pro-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@@H]1CCCN1 NTXFLJULRHQMDC-GUBZILKMSA-N 0.000 description 1
- ZZCJYPLMOPTZFC-SRVKXCTJSA-N Pro-Met-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(O)=O ZZCJYPLMOPTZFC-SRVKXCTJSA-N 0.000 description 1
- ZUZINZIJHJFJRN-UBHSHLNASA-N Pro-Phe-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 ZUZINZIJHJFJRN-UBHSHLNASA-N 0.000 description 1
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 1
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 1
- MKGIILKDUGDRRO-FXQIFTODSA-N Pro-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 MKGIILKDUGDRRO-FXQIFTODSA-N 0.000 description 1
- PKHDJFHFMGQMPS-RCWTZXSCSA-N Pro-Thr-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKHDJFHFMGQMPS-RCWTZXSCSA-N 0.000 description 1
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 1
- IALSFJSONJZBKB-HRCADAONSA-N Pro-Tyr-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N3CCC[C@@H]3C(=O)O IALSFJSONJZBKB-HRCADAONSA-N 0.000 description 1
- YDTUEBLEAVANFH-RCWTZXSCSA-N Pro-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 YDTUEBLEAVANFH-RCWTZXSCSA-N 0.000 description 1
- 241001646398 Pseudomonas chlororaphis Species 0.000 description 1
- 101800001006 Putative helicase Proteins 0.000 description 1
- 102100029250 RNA-binding protein 14 Human genes 0.000 description 1
- PLXBWHJQWKZRKG-UHFFFAOYSA-N Resazurin Chemical compound C1=CC(=O)C=C2OC3=CC(O)=CC=C3[N+]([O-])=C21 PLXBWHJQWKZRKG-UHFFFAOYSA-N 0.000 description 1
- 239000011542 SDS running buffer Substances 0.000 description 1
- 239000012506 Sephacryl® Substances 0.000 description 1
- 229920005654 Sephadex Polymers 0.000 description 1
- 239000012507 Sephadex™ Substances 0.000 description 1
- LVVBAKCGXXUHFO-ZLUOBGJFSA-N Ser-Ala-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O LVVBAKCGXXUHFO-ZLUOBGJFSA-N 0.000 description 1
- WTWGOQRNRFHFQD-JBDRJPRFSA-N Ser-Ala-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WTWGOQRNRFHFQD-JBDRJPRFSA-N 0.000 description 1
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 1
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 1
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 1
- HBOABDXGTMMDSE-GUBZILKMSA-N Ser-Arg-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O HBOABDXGTMMDSE-GUBZILKMSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- ZOHGLPQGEHSLPD-FXQIFTODSA-N Ser-Gln-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZOHGLPQGEHSLPD-FXQIFTODSA-N 0.000 description 1
- HJEBZBMOTCQYDN-ACZMJKKPSA-N Ser-Glu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJEBZBMOTCQYDN-ACZMJKKPSA-N 0.000 description 1
- BRGQQXQKPUCUJQ-KBIXCLLPSA-N Ser-Glu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRGQQXQKPUCUJQ-KBIXCLLPSA-N 0.000 description 1
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 1
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 1
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- LQESNKGTTNHZPZ-GHCJXIJMSA-N Ser-Ile-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O LQESNKGTTNHZPZ-GHCJXIJMSA-N 0.000 description 1
- BKZYBLLIBOBOOW-GHCJXIJMSA-N Ser-Ile-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O BKZYBLLIBOBOOW-GHCJXIJMSA-N 0.000 description 1
- HBTCFCHYALPXME-HTFCKZLJSA-N Ser-Ile-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HBTCFCHYALPXME-HTFCKZLJSA-N 0.000 description 1
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 1
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 1
- XXNYYSXNXCJYKX-DCAQKATOSA-N Ser-Leu-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O XXNYYSXNXCJYKX-DCAQKATOSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- NQZFFLBPNDLTPO-DLOVCJGASA-N Ser-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CO)N NQZFFLBPNDLTPO-DLOVCJGASA-N 0.000 description 1
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 1
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 1
- NMZXJDSKEGFDLJ-DCAQKATOSA-N Ser-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CCCCN)C(=O)O NMZXJDSKEGFDLJ-DCAQKATOSA-N 0.000 description 1
- AZWNCEBQZXELEZ-FXQIFTODSA-N Ser-Pro-Ser Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O AZWNCEBQZXELEZ-FXQIFTODSA-N 0.000 description 1
- PPCZVWHJWJFTFN-ZLUOBGJFSA-N Ser-Ser-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPCZVWHJWJFTFN-ZLUOBGJFSA-N 0.000 description 1
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 1
- SOACHCFYJMCMHC-BWBBJGPYSA-N Ser-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N)O SOACHCFYJMCMHC-BWBBJGPYSA-N 0.000 description 1
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 1
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 1
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- BQCADISMDOOEFD-UHFFFAOYSA-N Silver Chemical compound [Ag] BQCADISMDOOEFD-UHFFFAOYSA-N 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 240000003768 Solanum lycopersicum Species 0.000 description 1
- 244000061456 Solanum tuberosum Species 0.000 description 1
- 235000002595 Solanum tuberosum Nutrition 0.000 description 1
- 240000003829 Sorghum propinquum Species 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 241001465745 Steinernematidae Species 0.000 description 1
- 241000187391 Streptomyces hygroscopicus Species 0.000 description 1
- 241000187191 Streptomyces viridochromogenes Species 0.000 description 1
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfisoxazole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 1
- 229940100389 Sulfonylurea Drugs 0.000 description 1
- 108700026226 TATA Box Proteins 0.000 description 1
- 101150006914 TRP1 gene Proteins 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 241000254109 Tenebrio molitor Species 0.000 description 1
- 241001454295 Tetranychidae Species 0.000 description 1
- 241000916142 Tetranychus turkestani Species 0.000 description 1
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 1
- HYLXOQURIOCKIH-VQVTYTSYSA-N Thr-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N HYLXOQURIOCKIH-VQVTYTSYSA-N 0.000 description 1
- XYEXCEPTALHNEV-RCWTZXSCSA-N Thr-Arg-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XYEXCEPTALHNEV-RCWTZXSCSA-N 0.000 description 1
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 1
- YLXAMFZYJTZXFH-OLHMAJIHSA-N Thr-Asn-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YLXAMFZYJTZXFH-OLHMAJIHSA-N 0.000 description 1
- JTEICXDKGWKRRV-HJGDQZAQSA-N Thr-Asn-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JTEICXDKGWKRRV-HJGDQZAQSA-N 0.000 description 1
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 1
- JVTHIXKSVYEWNI-JRQIVUDYSA-N Thr-Asn-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JVTHIXKSVYEWNI-JRQIVUDYSA-N 0.000 description 1
- IOWJRKAVLALBQB-IWGUZYHVSA-N Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O IOWJRKAVLALBQB-IWGUZYHVSA-N 0.000 description 1
- MFEBUIFJVPNZLO-OLHMAJIHSA-N Thr-Asp-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MFEBUIFJVPNZLO-OLHMAJIHSA-N 0.000 description 1
- NLSNVZAREYQMGR-HJGDQZAQSA-N Thr-Asp-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NLSNVZAREYQMGR-HJGDQZAQSA-N 0.000 description 1
- QWMPARMKIDVBLV-VZFHVOOUSA-N Thr-Cys-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O QWMPARMKIDVBLV-VZFHVOOUSA-N 0.000 description 1
- DXNUZQGVOMCGNS-SWRJLBSHSA-N Thr-Gln-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O DXNUZQGVOMCGNS-SWRJLBSHSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 1
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 1
- BVOVIGCHYNFJBZ-JXUBOQSCSA-N Thr-Leu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O BVOVIGCHYNFJBZ-JXUBOQSCSA-N 0.000 description 1
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 1
- XIULAFZYEKSGAJ-IXOXFDKPSA-N Thr-Leu-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XIULAFZYEKSGAJ-IXOXFDKPSA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 1
- MCDVZTRGHNXTGK-HJGDQZAQSA-N Thr-Met-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O MCDVZTRGHNXTGK-HJGDQZAQSA-N 0.000 description 1
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- ABWNZPOIUJMNKT-IXOXFDKPSA-N Thr-Phe-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O ABWNZPOIUJMNKT-IXOXFDKPSA-N 0.000 description 1
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 1
- KERCOYANYUPLHJ-XGEHTFHBSA-N Thr-Pro-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O KERCOYANYUPLHJ-XGEHTFHBSA-N 0.000 description 1
- PRTHQBSMXILLPC-XGEHTFHBSA-N Thr-Ser-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PRTHQBSMXILLPC-XGEHTFHBSA-N 0.000 description 1
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 1
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 1
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 1
- JNKAYADBODLPMQ-HSHDSVGOSA-N Thr-Trp-Val Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)=CNC2=C1 JNKAYADBODLPMQ-HSHDSVGOSA-N 0.000 description 1
- BEZTUFWTPVOROW-KJEVXHAQSA-N Thr-Tyr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O BEZTUFWTPVOROW-KJEVXHAQSA-N 0.000 description 1
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 1
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 1
- YOPQYBJJNSIQGZ-JNPHEJMOSA-N Thr-Tyr-Tyr Chemical compound C([C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 YOPQYBJJNSIQGZ-JNPHEJMOSA-N 0.000 description 1
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- RNFZZCMCRDFNAE-WFBYXXMGSA-N Trp-Asn-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O RNFZZCMCRDFNAE-WFBYXXMGSA-N 0.000 description 1
- IUFQHOCOKQIOMC-XIRDDKMYSA-N Trp-Asn-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N IUFQHOCOKQIOMC-XIRDDKMYSA-N 0.000 description 1
- BXKWZPXTTSCOMX-AQZXSJQPSA-N Trp-Asn-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BXKWZPXTTSCOMX-AQZXSJQPSA-N 0.000 description 1
- DZIKVMCFXIIETR-JSGCOSHPSA-N Trp-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O DZIKVMCFXIIETR-JSGCOSHPSA-N 0.000 description 1
- LVTKHGUGBGNBPL-UHFFFAOYSA-N Trp-P-1 Chemical compound N1C2=CC=CC=C2C2=C1C(C)=C(N)N=C2C LVTKHGUGBGNBPL-UHFFFAOYSA-N 0.000 description 1
- IMMPMHKLUUZKAZ-WMZOPIPTSA-N Trp-Phe Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=CC=C1 IMMPMHKLUUZKAZ-WMZOPIPTSA-N 0.000 description 1
- MBLJBGZWLHTJBH-SZMVWBNQSA-N Trp-Val-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 MBLJBGZWLHTJBH-SZMVWBNQSA-N 0.000 description 1
- XKTWZYNTLXITCY-QRTARXTBSA-N Trp-Val-Asn Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O)=CNC2=C1 XKTWZYNTLXITCY-QRTARXTBSA-N 0.000 description 1
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 1
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 1
- SGFIXFAHVWJKTD-KJEVXHAQSA-N Tyr-Arg-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SGFIXFAHVWJKTD-KJEVXHAQSA-N 0.000 description 1
- PEVVXUGSAKEPEN-AVGNSLFASA-N Tyr-Asn-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PEVVXUGSAKEPEN-AVGNSLFASA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- XMNDQSYABVWZRK-BZSNNMDCSA-N Tyr-Asn-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XMNDQSYABVWZRK-BZSNNMDCSA-N 0.000 description 1
- QZOSVNLXLSNHQK-UWVGGRQHSA-N Tyr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QZOSVNLXLSNHQK-UWVGGRQHSA-N 0.000 description 1
- PDSLRCZINIDLMU-QWRGUYRKSA-N Tyr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PDSLRCZINIDLMU-QWRGUYRKSA-N 0.000 description 1
- IMXAAEFAIBRCQF-SIUGBPQLSA-N Tyr-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N IMXAAEFAIBRCQF-SIUGBPQLSA-N 0.000 description 1
- ZRPLVTZTKPPSBT-AVGNSLFASA-N Tyr-Glu-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZRPLVTZTKPPSBT-AVGNSLFASA-N 0.000 description 1
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 1
- CNLKDWSAORJEMW-KWQFWETISA-N Tyr-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O CNLKDWSAORJEMW-KWQFWETISA-N 0.000 description 1
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 1
- AUEJLPRZGVVDNU-STQMWFEESA-N Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-STQMWFEESA-N 0.000 description 1
- QHLIUFUEUDFAOT-MGHWNKPDSA-N Tyr-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QHLIUFUEUDFAOT-MGHWNKPDSA-N 0.000 description 1
- KHCSOLAHNLOXJR-BZSNNMDCSA-N Tyr-Leu-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHCSOLAHNLOXJR-BZSNNMDCSA-N 0.000 description 1
- DMWNPLOERDAHSY-MEYUZBJRSA-N Tyr-Leu-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DMWNPLOERDAHSY-MEYUZBJRSA-N 0.000 description 1
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 1
- KYPMKDGKAYQCHO-RYUDHWBXSA-N Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KYPMKDGKAYQCHO-RYUDHWBXSA-N 0.000 description 1
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 1
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 1
- ZSXJENBJGRHKIG-UWVGGRQHSA-N Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZSXJENBJGRHKIG-UWVGGRQHSA-N 0.000 description 1
- RWOKVQUCENPXGE-IHRRRGAJSA-N Tyr-Ser-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RWOKVQUCENPXGE-IHRRRGAJSA-N 0.000 description 1
- QFXVAFIHVWXXBJ-AVGNSLFASA-N Tyr-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O QFXVAFIHVWXXBJ-AVGNSLFASA-N 0.000 description 1
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 1
- SYFHQHYTNCQCCN-MELADBBJSA-N Tyr-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O SYFHQHYTNCQCCN-MELADBBJSA-N 0.000 description 1
- UMSZZGTXGKHTFJ-SRVKXCTJSA-N Tyr-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UMSZZGTXGKHTFJ-SRVKXCTJSA-N 0.000 description 1
- PLVVHGFEMSDRET-IHPCNDPISA-N Tyr-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC3=CC=C(C=C3)O)N PLVVHGFEMSDRET-IHPCNDPISA-N 0.000 description 1
- WQOHKVRQDLNDIL-YJRXYDGGSA-N Tyr-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O WQOHKVRQDLNDIL-YJRXYDGGSA-N 0.000 description 1
- YMZYSCDRTXEOKD-IHPCNDPISA-N Tyr-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N YMZYSCDRTXEOKD-IHPCNDPISA-N 0.000 description 1
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 1
- KSGKJSFPWSMJHK-JNPHEJMOSA-N Tyr-Tyr-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSGKJSFPWSMJHK-JNPHEJMOSA-N 0.000 description 1
- AEOFMCAKYIQQFY-YDHLFZDLSA-N Tyr-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AEOFMCAKYIQQFY-YDHLFZDLSA-N 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- 102000044159 Ubiquitin Human genes 0.000 description 1
- HSRXSKHRSXRCFC-WDSKDSINSA-N Val-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(O)=O HSRXSKHRSXRCFC-WDSKDSINSA-N 0.000 description 1
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 1
- NMANTMWGQZASQN-QXEWZRGKSA-N Val-Arg-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N NMANTMWGQZASQN-QXEWZRGKSA-N 0.000 description 1
- CVUDMNSZAIZFAE-UHFFFAOYSA-N Val-Arg-Pro Natural products NC(N)=NCCCC(NC(=O)C(N)C(C)C)C(=O)N1CCCC1C(O)=O CVUDMNSZAIZFAE-UHFFFAOYSA-N 0.000 description 1
- WITCOKQIPFWQQD-FSPLSTOPSA-N Val-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O WITCOKQIPFWQQD-FSPLSTOPSA-N 0.000 description 1
- QPZMOUMNTGTEFR-ZKWXMUAHSA-N Val-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N QPZMOUMNTGTEFR-ZKWXMUAHSA-N 0.000 description 1
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 1
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 1
- DDNIHOWRDOXXPF-NGZCFLSTSA-N Val-Asp-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DDNIHOWRDOXXPF-NGZCFLSTSA-N 0.000 description 1
- LAYSXAOGWHKNED-XPUUQOCRSA-N Val-Gly-Ser Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LAYSXAOGWHKNED-XPUUQOCRSA-N 0.000 description 1
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 1
- PNVLWFYAPWAQMU-CIUDSAMLSA-N Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)C(C)C PNVLWFYAPWAQMU-CIUDSAMLSA-N 0.000 description 1
- CPGJELLYDQEDRK-NAKRPEOUSA-N Val-Ile-Ala Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C)C(O)=O CPGJELLYDQEDRK-NAKRPEOUSA-N 0.000 description 1
- BZMIYHIJVVJPCK-QSFUFRPTSA-N Val-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N BZMIYHIJVVJPCK-QSFUFRPTSA-N 0.000 description 1
- XCTHZFGSVQBHBW-IUCAKERBSA-N Val-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])C(C)C XCTHZFGSVQBHBW-IUCAKERBSA-N 0.000 description 1
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 1
- IEBGHUMBJXIXHM-AVGNSLFASA-N Val-Lys-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N IEBGHUMBJXIXHM-AVGNSLFASA-N 0.000 description 1
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 1
- RSGHLMMKXJGCMK-JYJNAYRXSA-N Val-Met-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N RSGHLMMKXJGCMK-JYJNAYRXSA-N 0.000 description 1
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 1
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 1
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 1
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 1
- PDASTHRLDFOZMG-JYJNAYRXSA-N Val-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 PDASTHRLDFOZMG-JYJNAYRXSA-N 0.000 description 1
- YKZVPMUGEJXEOR-JYJNAYRXSA-N Val-Val-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N YKZVPMUGEJXEOR-JYJNAYRXSA-N 0.000 description 1
- 241000907138 Xanthomonas oryzae pv. oryzae Species 0.000 description 1
- 235000007244 Zea mays Nutrition 0.000 description 1
- 229920002494 Zein Polymers 0.000 description 1
- 108010055615 Zein Proteins 0.000 description 1
- CTCBPRXHVPZNHB-VQFZJOCSSA-N [[(2r,3s,4r,5r)-5-(6-aminopurin-9-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate;(2r,3r,4s,5r)-2-(6-aminopurin-9-yl)-5-(hydroxymethyl)oxolane-3,4-diol Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O.C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O CTCBPRXHVPZNHB-VQFZJOCSSA-N 0.000 description 1
- 230000001594 aberrant effect Effects 0.000 description 1
- 230000001133 acceleration Effects 0.000 description 1
- AOZUYISQWWJMJC-UHFFFAOYSA-N acetic acid;methanol;hydrate Chemical compound O.OC.CC(O)=O AOZUYISQWWJMJC-UHFFFAOYSA-N 0.000 description 1
- 239000012445 acidic reagent Substances 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 229960001456 adenosine triphosphate Drugs 0.000 description 1
- 230000001464 adherent effect Effects 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 230000016571 aggressive behavior Effects 0.000 description 1
- 108010017893 alanyl-alanyl-alanine Proteins 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010011559 alanylphenylalanine Proteins 0.000 description 1
- 238000003277 amino acid sequence analysis Methods 0.000 description 1
- 229940043376 ammonium acetate Drugs 0.000 description 1
- 235000019257 ammonium acetate Nutrition 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 230000001887 anti-feedant effect Effects 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 239000008346 aqueous phase Substances 0.000 description 1
- 108010013835 arginine glutamate Proteins 0.000 description 1
- 108010086780 arginyl-glycyl-aspartyl-alanine Proteins 0.000 description 1
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 108010060035 arginylproline Proteins 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 108010077245 asparaginyl-proline Proteins 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 108010092854 aspartyllysine Proteins 0.000 description 1
- 230000003416 augmentation Effects 0.000 description 1
- 101150103518 bar gene Proteins 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 238000012742 biochemical analysis Methods 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 229920001222 biopolymer Polymers 0.000 description 1
- HHKZCCWKTZRCCL-UHFFFAOYSA-N bis-tris propane Chemical compound OCC(CO)(CO)NCCCNC(CO)(CO)CO HHKZCCWKTZRCCL-UHFFFAOYSA-N 0.000 description 1
- KGBXLFKZBHKPEV-UHFFFAOYSA-N boric acid Chemical compound OB(O)O KGBXLFKZBHKPEV-UHFFFAOYSA-N 0.000 description 1
- 239000004327 boric acid Substances 0.000 description 1
- 229940105402 brilliant blue Drugs 0.000 description 1
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 1
- 239000001273 butane Substances 0.000 description 1
- 244000309466 calf Species 0.000 description 1
- 229940041514 candida albicans extract Drugs 0.000 description 1
- 239000012159 carrier gas Substances 0.000 description 1
- 230000003833 cell viability Effects 0.000 description 1
- VJYIFXVZLXQVHO-UHFFFAOYSA-N chlorsulfuron Chemical compound COC1=NC(C)=NC(NC(=O)NS(=O)(=O)C=2C(=CC=CC=2)Cl)=N1 VJYIFXVZLXQVHO-UHFFFAOYSA-N 0.000 description 1
- 238000011208 chromatographic data Methods 0.000 description 1
- 238000011210 chromatographic step Methods 0.000 description 1
- 230000035071 co-translational protein modification Effects 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 235000008504 concentrate Nutrition 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 229940089639 cornsilk Drugs 0.000 description 1
- 239000013601 cosmid vector Substances 0.000 description 1
- 244000038559 crop plants Species 0.000 description 1
- 230000009260 cross reactivity Effects 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000032459 dedifferentiation Effects 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 239000007857 degradation product Substances 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 238000002716 delivery method Methods 0.000 description 1
- 238000000326 densiometry Methods 0.000 description 1
- 238000000432 density-gradient centrifugation Methods 0.000 description 1
- 230000027832 depurination Effects 0.000 description 1
- 230000004069 differentiation Effects 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- 230000003292 diminished effect Effects 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- 239000006185 dispersion Substances 0.000 description 1
- 239000012153 distilled water Substances 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 230000000408 embryogenic effect Effects 0.000 description 1
- 230000001804 emulsifying effect Effects 0.000 description 1
- 238000005538 encapsulation Methods 0.000 description 1
- 230000000967 entomopathogenic effect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 238000009585 enzyme analysis Methods 0.000 description 1
- SEACYXSIPDVVMV-UHFFFAOYSA-L eosin Y Chemical compound [Na+].[Na+].[O-]C(=O)C1=CC=CC=C1C1=C2C=C(Br)C(=O)C(Br)=C2OC2=C(Br)C([O-])=C(Br)C=C21 SEACYXSIPDVVMV-UHFFFAOYSA-L 0.000 description 1
- 239000006167 equilibration buffer Substances 0.000 description 1
- 230000008029 eradication Effects 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 210000000416 exudates and transudate Anatomy 0.000 description 1
- 210000003608 fece Anatomy 0.000 description 1
- 230000003031 feeding effect Effects 0.000 description 1
- 244000037666 field crops Species 0.000 description 1
- 239000000706 filtrate Substances 0.000 description 1
- 238000007667 floating Methods 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 239000012737 fresh medium Substances 0.000 description 1
- 235000012055 fruits and vegetables Nutrition 0.000 description 1
- 239000005350 fused silica glass Substances 0.000 description 1
- 108010074605 gamma-Globulins Proteins 0.000 description 1
- 210000001035 gastrointestinal tract Anatomy 0.000 description 1
- 239000010437 gem Substances 0.000 description 1
- 102000034356 gene-regulatory proteins Human genes 0.000 description 1
- 108091006104 gene-regulatory proteins Proteins 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- BRZYSWJRSDMWLG-CAXSIQPQSA-N geneticin Natural products O1C[C@@](O)(C)[C@H](NC)[C@@H](O)[C@H]1O[C@@H]1[C@@H](O)[C@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](C(C)O)O2)N)[C@@H](N)C[C@H]1N BRZYSWJRSDMWLG-CAXSIQPQSA-N 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- IAJOBQBIJHVGMQ-BYPYZUCNSA-N glufosinate-P Chemical compound CP(O)(=O)CC[C@H](N)C(O)=O IAJOBQBIJHVGMQ-BYPYZUCNSA-N 0.000 description 1
- 108010078144 glutaminyl-glycine Proteins 0.000 description 1
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 1
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 1
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010084760 glycyl-tyrosyl-glycyl-aspartate Proteins 0.000 description 1
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 1
- 108010037850 glycylvaline Proteins 0.000 description 1
- XDDAORKBJWWYJS-UHFFFAOYSA-N glyphosate Chemical compound OC(=O)CNCP(O)(O)=O XDDAORKBJWWYJS-UHFFFAOYSA-N 0.000 description 1
- 229940097068 glyphosate Drugs 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 244000000013 helminth Species 0.000 description 1
- 239000003228 hemolysin Substances 0.000 description 1
- 230000008588 hemolysis Effects 0.000 description 1
- 239000004009 herbicide Substances 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 108010092114 histidylphenylalanine Proteins 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000003053 immunization Effects 0.000 description 1
- 238000002649 immunization Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 230000002458 infectious effect Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 210000004347 intestinal mucosa Anatomy 0.000 description 1
- 231100000568 intoxicate Toxicity 0.000 description 1
- 230000009545 invasion Effects 0.000 description 1
- 238000005342 ion exchange Methods 0.000 description 1
- JEIPFZHSYJVQDO-UHFFFAOYSA-N iron(III) oxide Inorganic materials O=[Fe]O[Fe]=O JEIPFZHSYJVQDO-UHFFFAOYSA-N 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 108010060857 isoleucyl-valyl-tyrosine Proteins 0.000 description 1
- 108010078274 isoleucylvaline Proteins 0.000 description 1
- 230000002147 killing effect Effects 0.000 description 1
- 230000009571 larval growth Effects 0.000 description 1
- 108010071185 leucyl-alanine Proteins 0.000 description 1
- 108010009932 leucyl-alanyl-glycyl-valine Proteins 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 1
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 1
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 1
- 108010091871 leucylmethionine Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 238000004020 luminiscence type Methods 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- 235000019341 magnesium sulphate Nutrition 0.000 description 1
- 210000001161 mammalian embryo Anatomy 0.000 description 1
- 108010083942 mannopine synthase Proteins 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 235000012054 meals Nutrition 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- MYWUZJCMWCOHBA-VIFPVBQESA-N methamphetamine Chemical compound CN[C@@H](C)CC1=CC=CC=C1 MYWUZJCMWCOHBA-VIFPVBQESA-N 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 229960000485 methotrexate Drugs 0.000 description 1
- LAQFLZHBVPULPL-UHFFFAOYSA-N methyl(phenyl)silicon Chemical compound C[Si]C1=CC=CC=C1 LAQFLZHBVPULPL-UHFFFAOYSA-N 0.000 description 1
- 238000001471 micro-filtration Methods 0.000 description 1
- 108010009355 microbial metalloproteinases Proteins 0.000 description 1
- 230000002906 microbiologic effect Effects 0.000 description 1
- 239000011859 microparticle Substances 0.000 description 1
- 238000007431 microscopic evaluation Methods 0.000 description 1
- 239000002480 mineral oil Substances 0.000 description 1
- 235000010446 mineral oil Nutrition 0.000 description 1
- 238000001823 molecular biology technique Methods 0.000 description 1
- 238000000329 molecular dynamics simulation Methods 0.000 description 1
- 239000003068 molecular probe Substances 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 238000001426 native polyacrylamide gel electrophoresis Methods 0.000 description 1
- 239000006199 nebulizer Substances 0.000 description 1
- 229960004927 neomycin Drugs 0.000 description 1
- PGSADBUBUOPOJS-UHFFFAOYSA-N neutral red Chemical compound Cl.C1=C(C)C(N)=CC2=NC3=CC(N(C)C)=CC=C3N=C21 PGSADBUBUOPOJS-UHFFFAOYSA-N 0.000 description 1
- 238000011587 new zealand white rabbit Methods 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 108010058731 nopaline synthase Proteins 0.000 description 1
- 235000015097 nutrients Nutrition 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 238000012856 packing Methods 0.000 description 1
- 230000003071 parasitic effect Effects 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 235000020232 peanut Nutrition 0.000 description 1
- 239000003415 peat Substances 0.000 description 1
- 108010091212 pepstatin Proteins 0.000 description 1
- 229950000964 pepstatin Drugs 0.000 description 1
- FAXGPCHRFPCXOO-LXTPJMTPSA-N pepstatin A Chemical compound OC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)CC(C)C FAXGPCHRFPCXOO-LXTPJMTPSA-N 0.000 description 1
- 238000003359 percent control normalization Methods 0.000 description 1
- 210000001322 periplasm Anatomy 0.000 description 1
- 239000012466 permeate Substances 0.000 description 1
- 239000000575 pesticide Substances 0.000 description 1
- 239000012071 phase Substances 0.000 description 1
- 238000002135 phase contrast microscopy Methods 0.000 description 1
- LWTDZKXXJRRKDG-UHFFFAOYSA-N phaseollin Natural products C1OC2=CC(O)=CC=C2C2C1C1=CC=C3OC(C)(C)C=CC3=C1O2 LWTDZKXXJRRKDG-UHFFFAOYSA-N 0.000 description 1
- 229960003531 phenolsulfonphthalein Drugs 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 1
- 229910052698 phosphorus Inorganic materials 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 230000008659 phytopathology Effects 0.000 description 1
- INAAIJLSXJJHOZ-UHFFFAOYSA-N pibenzimol Chemical compound C1CN(C)CCN1C1=CC=C(N=C(N2)C=3C=C4NC(=NC4=CC=3)C=3C=CC(O)=CC=3)C2=C1 INAAIJLSXJJHOZ-UHFFFAOYSA-N 0.000 description 1
- 238000003976 plant breeding Methods 0.000 description 1
- 230000008654 plant damage Effects 0.000 description 1
- 210000002706 plastid Anatomy 0.000 description 1
- 238000007747 plating Methods 0.000 description 1
- 231100000614 poison Toxicity 0.000 description 1
- 239000002574 poison Substances 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920000573 polyethylene Polymers 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 239000001818 polyoxyethylene sorbitan monostearate Substances 0.000 description 1
- 235000010989 polyoxyethylene sorbitan monostearate Nutrition 0.000 description 1
- 239000001267 polyvinylpyrrolidone Substances 0.000 description 1
- 229920000036 polyvinylpyrrolidone Polymers 0.000 description 1
- 235000013855 polyvinylpyrrolidone Nutrition 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001323 posttranslational effect Effects 0.000 description 1
- 235000012015 potatoes Nutrition 0.000 description 1
- 230000003389 potentiating effect Effects 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 239000013587 production medium Substances 0.000 description 1
- 108010079317 prolyl-tyrosine Proteins 0.000 description 1
- 238000002731 protein assay Methods 0.000 description 1
- 239000012460 protein solution Substances 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 239000001054 red pigment Substances 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 108091035233 repetitive DNA sequence Proteins 0.000 description 1
- 102000053632 repetitive DNA sequence Human genes 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 239000012723 sample buffer Substances 0.000 description 1
- 239000004576 sand Substances 0.000 description 1
- 229920006298 saran Polymers 0.000 description 1
- 238000003345 scintillation counting Methods 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 239000006152 selective media Substances 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- 108091029842 small nuclear ribonucleic acid Proteins 0.000 description 1
- 238000002791 soaking Methods 0.000 description 1
- 239000001509 sodium citrate Substances 0.000 description 1
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 1
- 238000002798 spectrophotometry method Methods 0.000 description 1
- 229940063673 spermidine Drugs 0.000 description 1
- 210000004989 spleen cell Anatomy 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000007921 spray Substances 0.000 description 1
- 238000005507 spraying Methods 0.000 description 1
- 230000007480 spreading Effects 0.000 description 1
- 238000003892 spreading Methods 0.000 description 1
- 238000011146 sterile filtration Methods 0.000 description 1
- 239000008223 sterile water Substances 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 229950000244 sulfanilic acid Drugs 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 101150078747 tcaA gene Proteins 0.000 description 1
- 101150089436 tcdB gene Proteins 0.000 description 1
- 108091035539 telomere Proteins 0.000 description 1
- 102000055501 telomere Human genes 0.000 description 1
- 238000005382 thermal cycling Methods 0.000 description 1
- 239000001973 thioglycolate broth Substances 0.000 description 1
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 239000003104 tissue culture media Substances 0.000 description 1
- 231100000167 toxic agent Toxicity 0.000 description 1
- 239000003440 toxic substance Substances 0.000 description 1
- 231100000820 toxicity test Toxicity 0.000 description 1
- 230000007888 toxin activity Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- YWBFPKPWMSWWEA-UHFFFAOYSA-O triazolopyrimidine Chemical compound BrC1=CC=CC(C=2N=C3N=CN[N+]3=C(NCC=3C=CN=CC=3)C=2)=C1 YWBFPKPWMSWWEA-UHFFFAOYSA-O 0.000 description 1
- PIEPQKCYPFFYMG-UHFFFAOYSA-N tris acetate Chemical compound CC(O)=O.OCC(N)(CO)CO PIEPQKCYPFFYMG-UHFFFAOYSA-N 0.000 description 1
- 239000006150 trypticase soy agar Substances 0.000 description 1
- 239000012137 tryptone Substances 0.000 description 1
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 241000701366 unidentified nuclear polyhedrosis viruses Species 0.000 description 1
- 108010036320 valylleucine Proteins 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 238000003260 vortexing Methods 0.000 description 1
- 239000002023 wood Substances 0.000 description 1
- 239000012138 yeast extract Substances 0.000 description 1
- 239000001052 yellow pigment Substances 0.000 description 1
- 239000001231 zea mays silk Substances 0.000 description 1
- 239000005019 zein Substances 0.000 description 1
- 229940093612 zein Drugs 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/10—Cells modified by introduction of foreign genetic material
- C12N5/12—Fused cells, e.g. hybridomas
- C12N5/14—Plant cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8271—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
- C12N15/8279—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance
- C12N15/8286—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance for insect resistance
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01G—HORTICULTURE; CULTIVATION OF VEGETABLES, FLOWERS, RICE, FRUIT, VINES, HOPS OR SEAWEED; FORESTRY; WATERING
- A01G13/00—Protection of plants
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01N—PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
- A01N63/00—Biocides, pest repellants or attractants, or plant growth regulators containing microorganisms, viruses, microbial fungi, animals or substances produced by, or obtained from, microorganisms, viruses, microbial fungi or animals, e.g. enzymes or fermentates
- A01N63/50—Isolated enzymes; Isolated proteins
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/24—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/52—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
- Y02A40/146—Genetically Modified [GMO] plants, e.g. transgenic plants
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Zoology (AREA)
- Organic Chemistry (AREA)
- Wood Science & Technology (AREA)
- General Health & Medical Sciences (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Pest Control & Pesticides (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Cell Biology (AREA)
- Medicinal Chemistry (AREA)
- Environmental Sciences (AREA)
- Physics & Mathematics (AREA)
- Virology (AREA)
- Insects & Arthropods (AREA)
- Gastroenterology & Hepatology (AREA)
- Dentistry (AREA)
- Agronomy & Crop Science (AREA)
- Botany (AREA)
- Toxicology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Agricultural Chemicals And Associated Chemicals (AREA)
- Peptides Or Proteins (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
Description
WO 97/17432 PCT/US96/18003
INSECTICIDAL PROTEIN TOXINS FROM PHOTORHABDUS
Field of the Invention
The present invention relates to toxins isolated from bacteria and the use of said toxins as insecticides.
Background of the Invention
Many insects are widely regarded as pests to homeowners, to picnickers, to gardeners, and to farmers and others whose investments in agricultural products are often destroyed or diminished as a result of insect damage to field crops.
Particularly in areas where the growing season is short, significant insect damage can mean the loss of all profits to growers and a dramatic decrease in crop yield. Scarce supply of particular agricultural products invariably results in higher costs to food processors and, then, to the ultimate consumers of food plants and products derived from those plants.
Preventing insect damage to crops and flowers and eliminating the nuisance of insect pests have typically relied on strong organic pesticides and insecticides with broad toxicities.
These synthetic products have come under attack by the general population as being too harsh on the environment and on those exposed to such agents. Similarly in non-agricultural settings, homeowners would be satisfied to have insects avoid their homes or outdoor meals without needing to kill the insects.
The extensive use of chemical insecticides has raised environmental and health concerns for farmers, companies that produce the insecticides, government agencies, public interest groups, and the public in general. The development of less intrusive pest management strategies has been spurred along both by societal concern for the environment and by the development of biological tools which exploit mechanisms of insect management.
Biological control agents present a promising alternative to chemical insecticides.
Organisms at every evolutionary development level have devised means to enhance their own success and survival. The use of biological molecules as tools of defense and aggression is known throughout the animal and plant kingdoms. In addition, the relatively new tools of the genetic engineer allow modifications to biological insecticides to accomplish particular solutions to particular problems.
One such agent, Bacillus thuringiensis (Bt), is an effective insecticidal agent and is widely used commercially as such. In fact, the insecticidal agent of the Bt bacterium is a protein of such limited toxicity that it can be used on human food crops on the day of harvest. To non-targeted organisms, the Bt toxin is a digestible non-toxic protein.
Another known class of biological insect control agents are certain genera of nematodes known to be vectors of transmission for insect-killing bacterial symbionts. Nematodes containing insecticidal bacteria invade insect larvae. The bacteria then kill the larvae. The nematodes reproduce in the larval cadaver.
The nematode progeny then eat the cadaver from within. The bacteria-containing nematode progeny thus produced can then invade additional larvae.
In the past, insecticidal nematodes in the Steinernema and Heterorhabditis genera were used as insect control agents.
Apparently, each genus of nematode hosts a particular species of bacterium. In nematodes of the Heterorhabditis genus, the symbiotic bacterium is Photorhabdus luminescens.
Although these nematodes are effective insect control agents, it is presently difficult, expensive, and inefficient to produce, maintain, and distribute nematodes for insect control.
It has been known in the art that one may isolate an insecticidal toxin from Photorhabdus luminescens that has activity only when injected into Lepidopteran and Coleopteran insect larvae. This has made it impossible to effectively exploit the insecticidal properties of the nematode or its bacterial symbiont. What would be useful would be a more practical, less labor-intensive wide-area delivery method of an insecticidal toxin which would retain its biological properties after delivery. It would be quite desirable to discover toxins with oral activity produced by the genus Photorhabdus. The isolation and use of these toxins are desirable for reasons of efficacy. Until applicants' discoveries, these toxins had not been isolated or characterized.
Summary of the Invention
The native toxins are protein complexes that are produced and secreted by growing bacterial cells of the genus Photorhabdus; of particular interest are the proteins produced by the species Photorhabdus luminescens. The protein complexes, with a molecular size of approximately 1,000 kDa, can be separated by SDS-PAGE gel analysis into numerous component proteins. The toxins contain no hemolysin, lipase, type C phospholipase, or nuclease activities. The toxins exhibit significant toxicity upon exposure or administration to a number of insects.
The present invention provides an easily administered insecticidal protein as well as the expression of toxin in a heterologous system.
The present invention also provides a method for delivering insecticidal toxins that are functionally active and effective against many orders of insects.
According to a first embodiment of the invention, there is provided a method of protecting a plant from an insect which comprises incorporating with the plant an effective amount of a Photorhabdus protein toxin having oral insecticidal activity.
According to a second embodiment of the invention, there is provided an insect bait comprising, as active ingredient, a Photorhabdus toxin having oral insecticidal activity in combination with a conventional bait matrix.
Objects, advantages, and features of the present invention will become apparent from the following specification.
Brief Description of the Drawings
Fig. 1 is an illustration of a match of cloned DNA isolates used as a part of sequence genes for the toxin of the present invention.
Fig. 2 is a map of three plasmids used in the sequencing process.
Fig. 3 is a map illustrating the inter-relationship of several partial DNA fragments.
Fig. 4 is an illustration of a homology analysis between the protein sequences of TcbAii and TcaBii proteins.
Fig. 5 is a phenogram of Photorhabdus strains. Relationship of Photorhabdus Strains was defined by rep-PCR.
The upper axis of Fig. 5 measures the percentage similarity of strains based on scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0 [100% similarity]). At the right axis, the numbers and letters indicate the various strains tested: 14=W-14, Hm=Hm, H9=H9, 7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4, 9=WX-9, 8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 6=WX-6, 12=WX-12, x14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through 52=ATCC 43948 through ATCC 43952. Vertical lines separating horizontal lines indicate the degree of relatedness (as read from the extrapolated intersection of the vertical line with the upper axis) between strains or groups of strains at the base of the horizontal lines (e.g., strain W-14 is approximately 60% similar to strains H9 and Hm).
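As an illustration only of how a 0.0 to 1.0 similarity score of the kind plotted on the upper axis of Fig. 5 can arise from scored rep-PCR products, the following minimal sketch compares the band sets of two strains. It assumes a simple shared-band (Jaccard) coefficient; the specification at this point does not state which coefficient was used, and the function name band_similarity, the strain labels, and the band values are hypothetical and are not data from the Examples.

```python
def band_similarity(bands_a, bands_b):
    """Fraction of shared bands (Jaccard coefficient), from 0.0 to 1.0.

    Assumption: each strain is represented by the set of rep-PCR bands
    scored as present. This is an illustrative coefficient, not
    necessarily the one used to build Fig. 5.
    """
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        return 0.0  # no scorable bands in either strain
    return len(a & b) / len(union)


# Hypothetical band labels for two hypothetical strains.
strain_x = {1, 2, 3, 4}
strain_y = {1, 2, 3, 5}

# Three bands are shared out of five distinct bands overall.
print(band_similarity(strain_x, strain_y))  # 0.6
```

Under this assumed coefficient, two strains sharing three of five distinct bands would join on a phenogram at about 0.6 on the similarity axis, which is the kind of value read off the upper axis for a pair of strains.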
Fig. 6 is an illustration of the genomic maps of the W-14 Strain.
Detailed Description of the Invention
The present inventions are directed to the discovery of a unique class of insecticidal protein toxins from the genus Photorhabdus that have oral toxicity against insects. A unique feature of Photorhabdus is its bioluminescence. Photorhabdus may be isolated from a variety of sources. One such source is nematodes, more particularly nematodes of the genus Heterorhabditis. Another such source is human clinical samples from wounds; see Farmer et al. 1989 J. Clin. Microbiol. 27 pp. 1594-1600. These saprophytic strains are deposited in the American Type Culture Collection (Rockville, MD) ATCC #s 43948, 43949, 43950, 43951, and 43952, and are incorporated herein by reference. It is possible that other sources could harbor Photorhabdus bacteria that produce insecticidal toxins. Such sources in the environment could be either terrestrial or aquatic based.
The genus Photorhabdus is taxonomically defined as a member of the Family Enterobacteriaceae, although it has certain traits atypical of this family. For example, strains of this genus are nitrate reduction negative, yellow and red pigment producing and bioluminescent. This latter trait is otherwise unknown within the Enterobacteriaceae. Photorhabdus has only recently been described as a genus separate from Xenorhabdus (Boemare et al., 1993 Int. J. Syst. Bacteriol. 43, 249-255). This differentiation is based on DNA-DNA hybridization studies, phenotypic differences (e.g., presence (Photorhabdus) or absence (Xenorhabdus) of catalase and bioluminescence) and the Family of the nematode host (Xenorhabdus: Steinernematidae; Photorhabdus: Heterorhabditidae). Comparative cellular fatty-acid analyses (Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the separation of Photorhabdus from Xenorhabdus.
In order to establish that the strain collection disclosed herein was comprised of Photorhabdus strains, the strains were characterized based on recognized traits which define Photorhabdus and differentiate it from other Enterobacteriaceae and Xenorhabdus species (Farmer, 1984 Bergey's Manual of Systematic Bacteriology Vol. 1 pp. 510-511; Akhurst and Boemare 1988, J. Gen. Microbiol. 134 pp. 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255, which are incorporated herein by reference). The traits studied were the following: gram stain negative rods, organism size, colony pigmentation, inclusion bodies, presence of catalase, ability to reduce nitrate, bioluminescence, dye uptake, gelatin hydrolysis, growth on selective media, growth temperature, survival under anaerobic conditions and motility. Fatty acid analysis was used to confirm that the strains herein all belong to the single genus Photorhabdus.
Currently, the bacterial genus Photorhabdus is comprised of a single defined species, Photorhabdus luminescens (ATCC Type strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A variety of related strains have been described in the literature (e.g., Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz et al. 1990, Appl. Environ. Microbiol., 56, 181-186). Numerous Photorhabdus strains have been characterized herein. Such strains are listed in Table 18 in the Examples. Because there is currently only one species (luminescens) defined within the genus Photorhabdus, the luminescens species traits were used to characterize the strains herein. As can be seen in Fig. 5, these strains are quite diverse. It is not unforeseen that in the future there may be other Photorhabdus species that will have some of the attributes of the luminescens species as well as some different characteristics that are presently not defined as a trait of Photorhabdus luminescens. However, the scope of the invention herein extends to any Photorhabdus species or strains which produce proteins that have functional activity as insect control agents, regardless of other traits and characteristics.
Furthermore, as is demonstrated herein, the bacteria of the genus Photorhabdus produce proteins that have functional activity as defined herein. Of particular interest are proteins produced by the species Photorhabdus luminescens. The inventions herein should in no way be limited to the strains which are disclosed herein. These strains illustrate for the first time that proteins produced by diverse isolates of Photorhabdus are toxic upon exposure to insects. Thus, included within the inventions described herein are the strains specified herein and any mutants thereof, as well as any strains or species of the genus Photorhabdus that have the functional activity described herein.
There are several terms that are used herein that have a particular meaning and are as follows: By "functional activity" it is meant herein that the protein toxins function as insect control agents in that the proteins are orally active, or have a toxic effect, or are able to disrupt or deter feeding, which may or may not cause death of the insect.
When an insect comes into contact with an effective amount of toxin delivered via transgenic plant expression, formulated protein composition(s), sprayable protein composition(s), a bait matrix or other delivery system, the results are typically death of the insect, or the insects do not feed upon the source which makes the toxins available to the insects.
The protein toxins discussed herein are typically referred to as "insecticides". By insecticides it is meant herein that the protein toxins have a "functional activity" as further defined herein and are used as insect control agents.
By the use of the term "oligonucleotides" it is meant a macromolecule consisting of a short chain of nucleotides of either RNA or DNA. Such length could be at least one nucleotide, but is typically in the range of about 10 to about 12 nucleotides. The determination of the length of the oligonucleotide is well within the skill of an artisan and should not be a limitation herein. Therefore, oligonucleotides may be less than 10 or greater than 12 nucleotides in length.
By the use of the term "toxic" or "toxicity" as used herein it is meant that the toxins produced by Photorhabdus have "functional activity" as defined herein.
By the use of the term "genetic material" herein, it is meant to include all genes, nucleic acid, DNA and RNA.
Fermentation broths from selected strains reported in Table 18 were used to determine the breadth of insecticidal toxin production by the Photorhabdus genus and the insecticidal spectrum of these toxins, and to provide source material to purify the toxin complexes. The strains characterized herein have been shown to have oral toxicity against a variety of insect orders. Such insect orders include but are not limited to Coleoptera, Homoptera, Lepidoptera, Diptera, Acarina, Hymenoptera and Dictyoptera.
By the use of the term "Photorhabdus toxin" it is meant any protein produced by a Photorhabdus microorganism strain which has functional activity against insects, where the Photorhabdus toxin could be formulated as a sprayable composition, expressed by a transgenic plant, formulated as a bait matrix, delivered via a Baculovirus, or delivered by any other applicable host or delivery system.
As with other bacterial toxins, the rate of mutation of the bacteria in a population causes many related toxins slightly different in sequence to exist. Toxins of interest here are those which produce protein complexes toxic to a variety of insects upon exposure, as described herein. Preferably, the toxins are active against Lepidoptera, Coleoptera, Homoptera, Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions herein are intended to capture the protein toxins homologous to protein toxins produced by the strains herein and any derivative strains thereof, as well as any protein toxins produced by Photorhabdus. These homologous proteins may differ in sequence, but do not differ in function from those toxins described herein.
Homologous toxins are meant to include protein complexes of between 300 kDa and 2,000 kDa and are comprised of at least two subunits, where a subunit is a peptide which may or may not be the same as the other subunit. Various protein subunits have been identified and are taught in the Examples herein.
Typically, the protein subunits are between about 18 kDa to about 230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to 160 kDa; about 80 kDa to about 100 kDa; and about 50 kDa to about kDa.
As discussed above, some Photorhabdus strains can be isolated from nematodes. Some nematodes, elongated cylindrical parasitic worms of the phylum Nematoda, have evolved an ability to exploit insect larvae as a favored growth environment. The insect larvae provide a source of food for growing nematodes and an environment in which to reproduce. One dramatic effect that follows invasion of larvae by certain nematodes is larval death.
Larval death results from the presence of, in certain nematodes, bacteria that produce an insecticidal toxin which arrests larval growth and inhibits feeding activity.
Interestingly, it appears that each genus of insect parasitic nematode hosts a particular species of bacterium, uniquely adapted for symbiotic growth with that nematode. In the interim since this research was initiated, the bacterial genus Xenorhabdus was reclassified into the genera Xenorhabdus and Photorhabdus. Bacteria of the genus Photorhabdus are characterized as being symbionts of Heterorhabditis nematodes while Xenorhabdus species are symbionts of the Steinernema species. This change in nomenclature is reflected in this specification, but in no way should a change in nomenclature alter the scope of the inventions described herein.
The peptides and genes that are disclosed herein are named according to the guidelines recently published in the Journal of Bacteriology "Instructions to Authors" p. i-xii (Jan. 1996), which is incorporated herein by reference. The following peptides and genes were isolated from Photorhabdus strain W-14.
Peptide / Gene Nomenclature
Toxin complex (Tc)

Peptide Name    Gene Name
tca genomic region
  TcaA          tcaA
  TcaAiii       tcaA
  TcaBi         tcaB
  TcaBii        tcaB
  TcaC          tcaC
tcb genomic region
  TcbA          tcbA
  TcbAi         tcbA
  TcbAii        tcbA
  TcbAiii       tcbA
tcc genomic region
  TccA          tccA
  TccB          tccB
tcd genomic region
  TcdAi         tcdA
  TcdAii        tcdA
  TcdAiii       tcdA
  TcdB          tcdB

Patent Sequence ID# (values as extracted; row assignment not recoverable): 12 4 3 (19, 2 (pro-peptide) 1 (21, 22, 23, 24) (pro-peptide) 13, (38, 39 17, 18) 41, (42, 43) 14

(bracket sequence indicates internal amino acid sequence obtained by tryptic digests)
The sequences listed above are grouped by genomic region.
The tcbA gene was expressed in E. coli as two protein fragments TcbA and TcbAiii as illustrated in the Examples. It may be beneficial to have proteolytic clippage of some sequences to obtain the higher activity of the toxins for commercial transgenic applications.
The toxins described herein are unique in that the toxins have functional activity, which is key to developing an insect management strategy. In developing an insect management strategy, it is possible to delay or circumvent the protein degradation process by injecting a protein directly into an organism, avoiding its digestive tract. In such cases, the protein administered to the organism will retain its function until it is denatured, non-specifically degraded, or eliminated by the immune system in higher organisms. Injection into insects of an insecticidal toxin has potential application only in the laboratory, and then only on large insects which are easily injected. The observation that the insecticidal protein toxins herein described exhibit their toxic activity after oral ingestion or contact with the toxins permits the development of an insect management plan based solely on the ability to incorporate the protein toxins into the insect diet. Such a plan could result in the production of insect baits.
The Photorhabdus toxins may be administered to insects in a purified form. The toxins may also be delivered in amounts from about 1 to about 100 mg per liter of broth. This may vary upon formulation condition, conditions of the inoculum source, techniques for isolation of the toxin, and the like. The toxins may be administered as an exudate secretion or cellular protein originally expressed in a heterologous prokaryotic or eukaryotic host. Bacteria are typically the hosts in which proteins are expressed. Eukaryotic hosts could include but are not limited to plants, insects and yeast. Alternatively, the toxins may be produced in bacteria or transgenic plants in the field or in the insect by a baculovirus vector. Typically the toxins will be introduced to the insect by incorporating one or more of the toxins into the insects' feed.
Complete lethality to feeding insects is useful but is not required to achieve useful toxicity. If the insects avoid the toxin or cease feeding, that avoidance will be useful in some applications, even if the effects are sublethal. For example, if insect resistant transgenic crop plants are desired, a reluctance of insects to feed on the plants is as useful as lethal toxicity to the insects since the ultimate objective is protection of the plants rather than killing the insect.
There are many other ways in which toxins can be incorporated into an insect's diet. As an example, it is possible to adulterate the larval food source with the toxic protein by spraying the food with a protein solution, as disclosed herein. Alternatively, the purified protein could be genetically engineered into an otherwise harmless bacterium, which could then be grown in culture, and either applied to the food source or allowed to reside in the soil in an area in which insect eradication was desirable. Also, the protein could be genetically engineered directly into an insect food source. For instance, the major food source of many insect larvae is plant material.
By incorporating genetic material that encodes the insecticidal properties of the Photorhabdus toxins into the genome of a plant eaten by a particular insect pest, the adult or larvae would die after consuming the food plant. Numerous members of the monocotyledonous and dicotyledonous genera have been transformed. Transgenic agronomic crops as well as fruits and vegetables are of commercial interest. Such crops include but are not limited to maize, rice, soybeans, canola, sunflower, alfalfa, sorghum, wheat, cotton, peanuts, tomatoes, potatoes, and the like. Several techniques exist for introducing foreign genetic material into plant cells, and for obtaining plants that stably maintain and express the introduced gene. Such techniques include acceleration of genetic material coated onto microparticles directly into cells (U.S. Patents 4,945,050 to Cornell and 5,141,131 to DowElanco). Plants may be transformed using Agrobacterium technology, see U.S. Patent 5,177,010 to University of Toledo, 5,104,310 to Texas A&M, European Patent Application 0131624B1, European Patent Applications 120516, 159418B1 and 176,112 to Schilperoort, U.S. Patents 5,149,645, 5,469,976, 5,464,763 and 4,940,838 and 4,693,976 to Schilperoort, European Patent Applications 116718, 290799, 320500 all to Max Planck, European Patent Applications 604662 and 627752 to Japan Tobacco, European Patent Applications 0267159, and 0292435 and U.S. Patent 5,231,019 all to Ciba Geigy, U.S. Patents 5,463,174 and 4,762,785 both to Calgene, and U.S. Patents 5,004,863 and 5,159,135 both to Agracetus. Other transformation technology includes whiskers technology, see U.S. Patents 5,302,523 and 5,464,765 both to Zeneca. Electroporation technology has also been used to transform plants, see WO 87/06614 to Boyce Thompson Institute, 5,472,869 and 5,384,253 both to Dekalb, WO 92/09696 and WO 93/21335 both to PGS. All of these transformation patents and publications are incorporated by reference.
In addition to numerous technologies for transforming plants, the type of tissue which is contacted with the foreign genes may vary as well. Such tissue would include but would not be limited to embryogenic tissue, callus tissue type I and II, hypocotyl, meristem, and the like. Almost all plant tissues may be transformed during dedifferentiation using appropriate techniques within the skill of an artisan.
Another variable is the choice of a selectable marker. The preference for a particular marker is at the discretion of the artisan, but any of the following selectable markers may be used along with any other gene not listed herein which could function as a selectable marker. Such selectable markers include but are not limited to aminoglycoside phosphotransferase gene of transposon Tn5 (Aph II) which encodes resistance to the antibiotics kanamycin, neomycin and G418, as well as those genes which code for resistance or tolerance to glyphosate; hygromycin; methotrexate; phosphinothricin (bialaphos); imidazolinones, sulfonylureas and triazolopyrimidine herbicides, such as chlorsulfuron; bromoxynil, dalapon and the like.
In addition to a selectable marker, it may be desirable to use a reporter gene. In some instances a reporter gene may be used without a selectable marker. Reporter genes are genes which are typically not present or expressed in the recipient organism or tissue. The reporter gene typically encodes a protein which provides for some phenotypic change or enzymatic property.
Examples of such genes are provided in K. Weising et al. Ann. Rev. Genetics, 22, 421 (1988), which is incorporated herein by reference. A preferred reporter gene is the glucuronidase (GUS) gene.
Regardless of transformation technique, the gene is preferably incorporated into a gene transfer vector adapted to express the Photorhabdus toxins in the plant cell by including in the vector a plant promoter. In addition to plant promoters, promoters from a variety of sources can be used efficiently in plant cells to express foreign genes. For example, promoters of bacterial origin, such as the octopine synthase promoter, the nopaline synthase promoter, the mannopine synthase promoter; promoters of viral origin, such as the cauliflower mosaic virus (35S and 19S) and the like may be used. Plant promoters include, but are not limited to ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu), beta-conglycinin promoter, phaseolin promoter, ADH promoter, heat-shock promoters and tissue specific promoters. Promoters may also contain certain enhancer sequence elements that may improve the transcription efficiency.
Typical enhancers include but are not limited to Adh-intron 1 and Adh-intron 6. Constitutive promoters may be used. Constitutive promoters direct continuous gene expression in all cell types and at all times (e.g., actin, ubiquitin, CaMV 35S). Tissue specific promoters are responsible for gene expression in specific cell or tissue types, such as the leaves or seeds (e.g., zein, oleosin, napin, ACP) and these promoters may also be used.
Promoters may also be active during a certain stage of the plant's development as well as active in plant tissues and organs. Examples of such promoters include but are not limited to pollen-specific, embryo specific, corn silk specific, cotton fiber specific, root specific, seed endosperm specific promoters and the like.
Under certain circumstances it may be desirable to use an inducible promoter. An inducible promoter is responsible for expression of genes in response to a specific signal, such as: physical stimulus (heat shock genes); light (RUBP carboxylase); hormone metabolites; and stress. Other desirable transcription and translation elements that function in plants may be used. Numerous plant-specific gene transfer vectors are known to the art.
In addition, it is known that to obtain high expression of bacterial genes in plants it is preferred to reengineer the bacterial genes so that they are more efficiently expressed in the cytoplasm of plants. Maize is one such plant where it is preferred to reengineer the bacterial gene(s) prior to transformation to increase the expression level of the toxin in the plant. One reason for the reengineering is the very low G+C content of the native bacterial gene(s) (and consequent skewing towards high A+T content). This results in the generation of sequences mimicking or duplicating plant gene control sequences that are known to be highly A+T rich. The presence of some A+T rich sequences within the DNA of the gene(s) introduced into plants (e.g., TATA box regions normally found in gene promoters) may result in aberrant transcription of the gene(s). On the other hand, the presence of other regulatory sequences residing in the transcribed mRNA (e.g., polyadenylation signal sequences (AAUAAA), or sequences complementary to small nuclear RNAs involved in pre-mRNA splicing) may lead to RNA instability.
Therefore, one goal in the design of reengineered bacterial gene(s), more preferably referred to as plant optimized gene(s), is to generate a DNA sequence having a higher G+C content, and preferably one close to that of plant genes coding for metabolic enzymes. Another goal in the design of the plant optimized gene(s) is to generate a DNA sequence that not only has a higher G+C content, but in which the sequence changes are made so as not to hinder translation.
An example of a plant that has a high G+C content is maize. The table below illustrates how high the G+C content is in maize. As in maize, it is thought that G+C content in other plants is also high.
Table 1
Compilation of G+C contents of protein coding regions of maize genes

Protein Class (a)              Range %G+C    Mean %G+C (b)
Metabolic Enzymes (40)         44.4-75.3     59.0
Storage Proteins
  Group I (23)                 46.0-51.9     48.1 (1.3)
  Group II (13)                60.4-74.3     67.5 (3.2)
  Group I + II (36)            46.0-74.3     55.1 (9.6) (c)
Structural Proteins (18)       48.6-70.5     63.6 (6.7)
Regulatory Proteins            57.2-68.9     62.0 (4.9)
Uncharacterized Proteins       41.5-70.3     64.3 (7.2)
All Proteins (108)             44.4-75.3     60.8 (5.2)

(a) Number of genes in class given in parentheses.
(b) Standard deviations given in parentheses.
(c) Combined groups mean ignored in calculation of overall mean.
For the data in Table 1, coding regions of the genes were extracted from GenBank (Release 71) entries, and base compositions were calculated using the MacVector program (IBI, New Haven, CT). Intron sequences were ignored in the calculations. Group I and II storage protein gene sequences were distinguished by their marked difference in base composition.
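The base-composition figure reported in Table 1 can be illustrated with a minimal sketch. It is not the MacVector calculation itself; the function names gc_percent and coding_gc_percent are hypothetical, and the sketch assumes that only exon (coding) sequences are passed in, so that introns are ignored as described above.

```python
def gc_percent(seq: str) -> float:
    """Return the %G+C of a nucleotide sequence (DNA or RNA).

    Characters other than A, C, G, T (or U) are skipped, so ambiguity
    codes and gaps do not count toward the total.
    """
    bases = [b for b in seq.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def coding_gc_percent(exons: list[str]) -> float:
    """%G+C over the concatenated exons of one gene (introns are not supplied)."""
    return gc_percent("".join(exons))


if __name__ == "__main__":
    # Toy exons, not real maize sequence.
    print(round(coding_gc_percent(["ATGGCC", "GGCTAA"]), 1))
```

Averaging such per-gene values within a protein class would give a class mean of the kind tabulated above.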
Due to the plasticity afforded by the redundancy of the genetic code (i.e., some amino acids are specified by more than one codon), evolution of the genomes of different organisms or classes of organisms has resulted in differential usage of redundant codons. This "codon bias" is reflected in the mean base composition of protein coding regions. For example, organisms with relatively low G+C contents utilize codons having A or T in the third position of redundant codons, whereas those having higher G+C contents utilize codons having G or C in the third position. It is thought that the presence of "minor" codons within a gene's mRNA may reduce the absolute translation rate of that mRNA, especially when the relative abundance of the charged tRNA corresponding to the minor codon is low. An extension of this is that the diminution of translation rate by individual minor codons would be at least additive for multiple minor codons. Therefore, mRNAs having high relative contents of minor codons would have correspondingly low translation rates. This rate would be reflected by the synthesis of low levels of the encoded protein.
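As an illustration of how codon bias shows up in the third codon position, the following sketch tallies codons in a coding sequence and reports the percentage of codons ending in G or C. The function names codon_counts and gc3_percent are hypothetical, and the example sequence is a toy, not a Photorhabdus or maize gene.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count each codon in an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def gc3_percent(cds: str) -> float:
    """Percentage of codons whose third position is G or C."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    gc3 = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return 100.0 * gc3 / total


if __name__ == "__main__":
    # Three alanine codons: GCG, GCC, GCA. Two of the three end in G or C.
    print(round(gc3_percent("GCGGCCGCA"), 1))
```

A gene from a low G+C organism would be expected to score low on this measure, and a plant optimized version of the same gene would score higher.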
In order to reengineer the bacterial gene(s), the codon bias of the plant is determined. The codon bias is the statistical codon distribution that the plant uses for coding its proteins.
After determining the bias, the percent frequency of the codons in the gene(s) of interest is determined. The primary codons preferred by the plant should be determined as well as the second and third choice of preferred codons. The amino acid sequence of the protein of interest is reverse translated so that the resulting nucleic acid sequence codes for the same protein as the native bacterial gene, but the resulting nucleic acid sequence corresponds to the first preferred codons of the desired plant.
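The reverse-translation step can be sketched as follows. The PREFERRED table is a hypothetical ranking invented for illustration; it covers only a few amino acids and is not the codon bias table of maize or of any other plant. A real table would be compiled from the coding sequences of the target plant.

```python
# Hypothetical codon ranking per amino acid: first choice first, then second, etc.
# These rankings are placeholders, not measured codon usage data.
PREFERRED = {
    "M": ["ATG"],
    "K": ["AAG", "AAA"],
    "L": ["CTG", "CTC", "TTG"],
    "R": ["CGC", "AGG", "CGG"],
    "*": ["TGA", "TAG"],
}


def reverse_translate(protein: str, table=PREFERRED, rank: int = 0) -> str:
    """Reverse translate a protein using the codon of the given preference rank.

    rank=0 selects the first-choice codon for every residue. If an amino acid
    has fewer choices than rank + 1, its last listed codon is used.
    """
    codons = []
    for aa in protein:
        choices = table[aa]
        codons.append(choices[min(rank, len(choices) - 1)])
    return "".join(codons)


if __name__ == "__main__":
    print(reverse_translate("MKLR*"))          # first-choice codons only
    print(reverse_translate("MKLR*", rank=1))  # second-choice codons where available
```

The first call yields a sequence that is completely homogeneous in codon usage; later steps would swap individual codons for second or third choices where needed.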
The new sequence is analyzed for restriction enzyme sites that might have been created by the modification. The identified sites are further modified by replacing the codons with second or third choice preferred codons. Other sites in the sequence which could affect the transcription or translation of the gene of interest are the exon:intron 5' or 3' junctions, poly A addition signals, or RNA polymerase termination signals. The sequence is further analyzed and modified to reduce the frequency of TA or GC doublets. In addition to the doublets, G or C sequence blocks that have more than about four residues that are the same can affect transcription of the sequence. Therefore, these blocks are also modified by replacing the codons of first or second choice, etc. with the next preferred codon of choice. It is preferred that the plant optimized gene(s) contains about 63% of first choice codons, between about 22% and about 37% second choice codons, and between 15% and 0% third choice codons, wherein the total percentage is 100%. Most preferably, the plant optimized gene(s) contain about 63% of first choice codons, at least about 22% second choice codons, about 7.5% third choice codons, and about 7.5% fourth choice codons, wherein the total percentage is 100%. The method described above enables one skilled in the art to modify gene(s) that are foreign to a particular plant so that the genes are optimally expressed in plants. The method is further illustrated in pending provisional application U.S. 60/005,405 filed on October 13, 1995, which is incorporated herein by reference.
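The sequence checks described above (TA or GC doublets, and G or C blocks of more than about four identical residues) can be sketched as a simple scan. The function names gc_blocks and doublet_count are hypothetical, and the threshold of four is taken from the text as an adjustable parameter rather than a fixed rule.

```python
import re


def gc_blocks(seq: str, max_run: int = 4):
    """Return (start_index, block) for each run of identical G's or C's
    that is longer than max_run residues."""
    n = max_run + 1
    pattern = re.compile("G{%d,}|C{%d,}" % (n, n))
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


def doublet_count(seq: str, doublet: str) -> int:
    """Count occurrences of a two-base doublet, including overlapping ones."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == doublet)


if __name__ == "__main__":
    candidate = "ATGGGGGGTACCCCA"  # toy sequence
    print(gc_blocks(candidate))            # the run of six G's is flagged, the four C's are not
    print(doublet_count(candidate, "TA"))
    print(doublet_count(candidate, "GC"))
```

Each flagged position would then be a candidate for substituting the next preferred codon, followed by a re-scan to confirm that the substitution did not create a new site.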
Thus, in order to design plant optimized gene(s) the amino acid sequence of the toxins is reverse translated into a DNA sequence, utilizing a nonredundant genetic code established from a codon bias table compiled for the gene DNA sequence for the particular plant being transformed. The resulting DNA sequence, which is completely homogeneous in codon usage, is further modified to establish a DNA sequence that, besides having a higher degree of codon diversity, also contains strategically placed restriction enzyme recognition sites, desirable base composition, and a lack of sequences that might interfere with transcription of the gene, or translation of the product mRNA.
It is theorized that bacterial genes may be more easily expressed in plants if the bacterial genes are expressed in the plastids. Thus, it may be possible to express bacterial genes in plants, without optimizing the genes for plant expression, and obtain high expression of the protein. See U.S. Patent Nos. 4,762,785; 5,451,513 and 5,545,817, which are incorporated herein by reference.
One of the issues regarding commercially exploiting transgenic plants is resistance management. This is of particular concern with Bacillus thuringiensis toxins. There are numerous companies commercially exploiting Bacillus thuringiensis and there has been much concern about insects becoming resistant to Bt toxins. One strategy for insect resistance management would be to combine the toxins produced by Photorhabdus with toxins such as Bt, vegetative insect proteins (Ciba Geigy) or other toxins. The combinations could be formulated for a sprayable application or could be molecular combinations. Plants could be transformed with Photorhabdus genes that produce insect toxins together with other insect toxin genes such as Bt.
European Patent Application 0400246A1 describes transformation of a plant with 2 Bt genes, which could be any 2 genes.
Another way to produce a transgenic plant that contains more than one insect resistant gene would be to produce two plants, with each plant containing an insect resistant gene. These plants would be backcrossed using traditional plant breeding techniques to produce a plant containing more than one insect resistant gene.
In addition to producing a transformed plant containing plant optimized gene(s), there are other delivery systems where it may be desirable to reengineer the bacterial gene(s). Along the same lines, a genetically engineered, easily isolated protein toxin fusing together both a molecule attractive to insects as a food source and the insecticidal activity of the toxin may be engineered and expressed in bacteria or in eukaryotic cells using standard, well-known techniques. After purification in the laboratory such a toxic agent with "built-in" bait could be packaged inside standard insect trap housings.
Another delivery scheme is the incorporation of the genetic material of toxins into a baculovirus vector. Baculoviruses infect particular insect hosts, including those desirably targeted with the Photorhabdus toxins. Infectious baculovirus harboring an expression construct for the Photorhabdus toxins could be introduced into areas of insect infestation to thereby intoxicate or poison infected insects.
Transfer of the insecticidal properties requires nucleic acid sequences encoding the amino acid sequences for the Photorhabdus toxins integrated into a protein expression vector appropriate to the host in which the vector will reside.
One way to obtain a nucleic acid sequence encoding a protein with insecticidal properties is to isolate the native genetic material which produces the toxins from Photorhabdus, using information deduced from the toxin's amino acid sequence, large portions of which are set forth below. As described below, methods of purifying the proteins responsible for toxin activity are also disclosed.
Using N-terminal amino acid sequence data, such as set forth below, one can construct oligonucleotides complementary to all, or a section of, the DNA bases that encode the first amino acids of the toxin. These oligonucleotides can be radiolabeled and used as molecular probes to isolate the genetic material from a genomic genetic library built from genetic material isolated from strains of Photorhabdus. The genetic library can be cloned in plasmid, cosmid, phage or phagemid vectors. The library could be transformed into Escherichia coli and screened for toxin production by the transformed cells using antibodies raised against the toxin or direct assays for insect toxicity.
This approach requires the production of a battery of oligonucleotides, since the degenerate genetic code allows an amino acid to be encoded in the DNA by any of several three-nucleotide combinations. For example, the amino acid arginine can be encoded by nucleic acid triplets CGA, CGC, CGG, CGT, AGA, and AGG. Since one cannot predict which triplet is used at those positions in the toxin gene, one must prepare oligonucleotides with each potential triplet represented. More than one DNA molecule corresponding to a protein subunit may be necessary to construct a sufficient number of oligonucleotide probes to recover all of the protein subunits necessary to achieve oral toxicity.
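The size of the oligonucleotide battery implied by the degenerate code can be worked out with a short sketch. It uses the standard genetic code; the function names codons_for, pool_size and degenerate_oligos are hypothetical, and the peptide used in the example is a toy, not an N-terminal sequence from the toxins. The oligonucleotides are written as coding-strand sequences; a hybridization probe complementary to the gene would be the reverse complement of each.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(aa: str) -> list[str]:
    """All DNA triplets that encode the given one-letter amino acid."""
    return sorted(c for c, a in CODON_TO_AA.items() if a == aa)


def pool_size(peptide: str) -> int:
    """Number of distinct oligonucleotides needed to cover every codon choice."""
    size = 1
    for aa in peptide:
        size *= len(codons_for(aa))
    return size


def degenerate_oligos(peptide: str):
    """Yield every coding-strand oligonucleotide that could encode the peptide."""
    for combo in product(*(codons_for(aa) for aa in peptide)):
        yield "".join(combo)


if __name__ == "__main__":
    print(codons_for("R"))   # the six arginine triplets named in the text
    print(pool_size("MRK"))  # 1 x 6 x 2 oligonucleotides
```

Because the pool size is a product over positions, choosing a stretch of the amino acid sequence rich in methionine and tryptophan (one codon each) keeps the battery small.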
From the amino acid sequence of the purified protein, genetic materials responsible for the production of toxins can readily be isolated and cloned, in whole or in part, into an expression vector using any of several techniques well-known to one skilled in the art of molecular biology. A typical expression vector is a DNA plasmid, though other transfer means including, but not limited to, cosmids, phagemids and phage are also envisioned. In addition to features required or desired for plasmid replication, such as an origin of replication and antibiotic resistance or other form of a selectable marker such as the bar gene of Streptomyces hygroscopicus or viridochromogenes, protein expression vectors normally additionally require an expression cassette which incorporates the cis-acting sequences necessary for transcription and translation of the gene of interest. The cis-acting sequences required for expression in prokaryotes differ from those required in eukaryotes and plants.
A eukaryotic expression cassette requires a transcriptional promoter upstream to the gene of interest, a transcriptional termination region such as a poly-A addition site, and a ribosome binding site upstream of the gene of interest's first codon. In bacterial cells, a useful transcriptional promoter that could be included in the vector is the T7 RNA Polymerase-binding promoter.
Promoters, as previously described herein, are known to efficiently promote transcription of mRNA. Also upstream from the gene of interest the vector may include a nucleotide sequence encoding a signal sequence known to direct a covalently linked protein to a particular compartment of the host cells such as the cell surface.
Insect viruses, or baculoviruses, are known to infect and adversely affect certain insects. The effect of the viruses on insects is slow, and viruses do not stop the feeding of insects. Thus viruses are not viewed as being useful as insect pest control agents. Combining the Photorhabdus toxin genes into a baculovirus vector could provide an efficient way of transmitting the toxins while increasing the lethality of the virus. In addition, since different baculoviruses are specific to different insects, it may be possible to use a particular toxin to selectively target particularly damaging insect pests. A particularly useful vector for the toxin genes is the nuclear polyhedrosis virus. Transfer vectors using this virus have been described and are now the vectors of choice for transferring foreign genes into insects. The virus-toxin gene recombinant may be constructed in an orally transmissible form. Baculoviruses normally infect insect victims through the mid-gut intestinal mucosa. The toxin gene inserted behind a strong viral coat protein promoter would be expressed and should rapidly kill the infected insect.
In addition to an insect virus or baculovirus or transgenic plant delivery system for the protein toxins of the present invention, the proteins may be encapsulated using Bacillus thuringiensis encapsulation technology such as but not limited to U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595 which are all incorporated herein by reference. Another delivery system for the protein toxins of the present invention is formulation of the protein into a bait matrix, which could then be used in above and below ground insect bait stations. Examples of such technology include but are not limited to PCT Patent Application WO 93/23998, which is incorporated herein by reference.
As is described above, it might become necessary to modify the sequence encoding the protein when expressing it in a nonnative host, since the codon preferences of other hosts may differ from that of Photorhabdus. In such a case, translation may be quite inefficient in a new host unless compensating modifications to the coding sequence are made. Additionally, modifications to the amino acid sequence might be desirable to avoid inhibitory cross-reactivity with proteins of the new host, or to refine the insecticidal properties of the protein in the new host. A genetically modified toxin gene might encode a toxin exhibiting, for example, enhanced or reduced toxicity, altered insect resistance development, altered stability, or modified target species specificity.
In addition to the Photorhabdus genes encoding the toxins, the scope of the present invention is intended to include related nucleic acid sequences which encode amino acid biopolymers homologous to the toxin proteins and which retain the toxic effect of the Photorhabdus proteins in insect species after oral ingestion.
For instance, the toxins used in the present invention seem to first inhibit larval feeding before death ensues. By manipulating the nucleic acid sequence of Photorhabdus toxins or its controlling sequences, genetic engineers placing the toxin gene into plants could modulate its potency or its mode of action to, for example, keep the eating-inhibitory activity while eliminating the absolute toxicity to the larvae. This change could permit the transformed plant to survive until harvest without having the unnecessarily dramatic effect on the ecosystem of wiping out all target insects. All such modifications of the gene encoding the toxin, or of the protein encoded by the gene, are envisioned to fall within the scope of the present invention.
Other envisioned modifications of the nucleic acid include the addition of targeting sequences to direct the toxin to particular parts of the insect larvae for improving its efficiency.
Strains ATCC 55397, 43948, 43949, 43950, 43951, 43952 have been deposited in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA. Amino acid and nucleotide sequence data for the W-14 native toxin (ATCC 55397) is presented below. Isolation of the genomic DNA for the toxins from the bacterial hosts is also exemplified herein.
Standard molecular biology techniques were followed and are taught in the specification herein. Additional information may be found in Sambrook, J., Fritsch, E. F. and Maniatis, T. (1989), Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, which is incorporated herein by reference.
The following abbreviations are used throughout the Examples: Tris = tris(hydroxymethyl)aminomethane; SDS = sodium dodecyl sulfate; EDTA = ethylenediaminetetraacetic acid; IPTG = isopropylthio-β-galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactoside; CTAB = cetyltrimethylammonium bromide; kbp = kilobase pairs; dATP, dCTP, dGTP, dTTP, I = 2'-deoxynucleoside of adenine, cytosine, guanine, thymine, and inosine, respectively; ATP = adenosine 5'-triphosphate.
Example 1 Purification of toxin from P. luminescens and Demonstration of toxicity after oral delivery of purified toxin
The insecticidal protein toxin of the present invention was purified from P. luminescens strain W-14, ATCC Accession Number 55397. Stock cultures of P. luminescens were maintained on petri dishes containing 2% Proteose Peptone No. 3 (PP3, Difco Laboratories, Detroit MI) in 1.5% agar, incubated at 25°C and transferred weekly. Colonies of the primary form of the bacteria were inoculated into 200 ml of PP3 broth supplemented with polyoxyethylene sorbitan mono-stearate (Tween 60, Sigma Chemical Company, St. Louis MO) in a one liter flask. The broth cultures were grown for 72 hours at 30°C on a rotary shaker. The toxin proteins can be recovered from cultures grown in the presence or absence of Tween; however, the absence of Tween can affect the form of the bacteria grown and the profile of proteins produced by the bacteria. In the absence of Tween, a variant shift occurs insofar as the molecular weight of at least one identified toxin subunit shifts from about 200 kDa to about 185 kDa.
The 72 hour cultures were centrifuged at 10,000 x g for minutes to remove cells and debris. The supernatant fraction that contained the insecticidal activity was decanted and brought to 50 mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4. The pH was adjusted to 8.6 by adding potassium hydroxide. This supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4. The toxic activity was adsorbed to the DEAE resin. This mixture was then poured into a 2.6 x 40 cm column and washed with 50 mM K2HPO4 at room temperature at a flow rate of 30 ml/hr until the effluent reached a steady baseline UV absorbance at 280 nm. The column was then washed with 150 mM KCl until the effluent again reached a steady 280 nm baseline. Finally the column was washed with 300 mM KCl and fractions were collected.
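The phrase "an appropriate volume of 1.0 M K2HPO4" can be made concrete with a simple mass balance. The sketch below is illustrative only: the function name and the 200 ml sample volume are assumptions, and it further assumes the supernatant contributes negligible phosphate of its own, which the specification does not state.

```python
def stock_volume_to_add(sample_ml, target_mM, stock_mM, initial_mM=0.0):
    """Return the stock volume (ml) needed to bring a sample to target_mM.

    Mass balance: (sample_ml*initial_mM + v*stock_mM) / (sample_ml + v) = target_mM
    """
    if not (initial_mM <= target_mM < stock_mM):
        raise ValueError("require initial_mM <= target_mM < stock_mM")
    return sample_ml * (target_mM - initial_mM) / (stock_mM - target_mM)


# Hypothetical example: 200 ml of supernatant brought to 50 mM with a 1.0 M stock.
sample_ml = 200.0  # assumed volume, not given in the text
added_ml = stock_volume_to_add(sample_ml, target_mM=50.0, stock_mM=1000.0)
final_mM = added_ml * 1000.0 / (sample_ml + added_ml)
print(f"add {added_ml:.2f} ml of stock; final concentration {final_mM:.1f} mM")
```

Because the added stock also increases the total volume, the result is slightly more than the naive 1:20 estimate of 10 ml.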
Fractions containing the toxin were pooled and filter sterilized using a 0.2 micron pore membrane filter. The toxin was then concentrated and equilibrated to 100 mM KPO4, pH 6.9, using an ultrafiltration membrane with a molecular weight cutoff of 100 kDa at 4°C (Centriprep 100, Amicon Division-W.R. Grace and Company). A 3 ml sample of the toxin concentrate was applied to the top of a 2.6 x 95 cm Sephacryl S-400 HR gel filtration column (Pharmacia LKB Biotechnology). The eluent buffer was 100 mM KPO4, pH 6.9, which was run at a flow rate of 17 ml/hr, at 4°C. The effluent was monitored at 280 nm. Fractions were collected and tested for toxic activity.
Toxicity of chromatographic fractions was examined in a biological assay using Manduca sexta larvae. Fractions were either applied directly onto the insect diet (Gypsy moth wheat germ diet, ICN Biochemicals Division, ICN Biomedicals, Inc.) or administered by intrahemocelic injection of a 5 µl sample through the first proleg of 4th or 5th instar larvae using a 30 gauge needle. The weight of each larva within a treatment group was recorded at 24 hour intervals. Toxicity was presumed if the insect ceased feeding and died within several days of consuming treated insect diet or if death occurred within 24 hours after injection of a fraction.
The toxic fractions were pooled and concentrated using the Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This analysis revealed the toxin protein to be contained within a single sharp peak that eluted from the column with a retention time of approximately 33.6 minutes. This retention time corresponded to an estimated molecular weight of 1,000 kDa. Peak fractions were collected for further purification while fractions not containing this protein were discarded. The peak eluted from the HPLC absorbed UV light at 218 and 280 nm but did not absorb at 405 nm. Absorbance at 405 nm was shown to be an attribute of xenorhabdin antibiotic compounds.
Electrophoresis of the pooled peak fractions in a nondenaturing agarose gel (Metaphor Agarose, FMC BioProducts) showed that two protein complexes were present in the peak. The peak material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on an agarose stacking gel buffered with 100 mM Tris-HCl at pH and 1.9% agarose resolving gel buffered with 200 mM Tris-borate at pH 8.3 under standard buffer conditions (anode buffer 1 M Tris-HCl, pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The gels were run at 13 mA constant current at 15°C until the phenol red tracking dye reached the end of the gel. Two protein bands were visualized in the agarose gels using Coomassie brilliant blue staining.
The slower migrating band was referred to as "protein band 1" and the faster migrating band was referred to as "protein band 2." The two protein bands were present in approximately equal amounts. The Coomassie stained agarose gels were used as a guide to precisely excise the two protein bands from unstained portions of the gels. The excised pieces containing the protein bands were macerated and a small amount of sterile water was added. As a control, a portion of the gel that contained no protein was also excised and treated in the same manner as the gel pieces containing the protein. Protein was recovered from the gel pieces by electroelution into 100 mM Tris-borate, pH 8.3, at 100 volts (constant voltage) for two hours. Alternatively, protein was passively eluted from the gel pieces by adding an equal volume of 50 mM Tris-HCl, pH 7.0, to the gel pieces, then incubating at 30°C for 16 hours. This allowed the protein to diffuse from the gel into the buffer, which was then collected.
Results of insect toxicity tests using HPLC-purified toxin (33.6 min. peak) and agarose gel purified toxin demonstrated toxicity of the extracts. Injection of 1.5 µg of the HPLC-purified protein killed within 24 hours. Both protein bands 1 and 2, recovered from agarose gels by passive elution or electroelution, were lethal upon injection. The protein concentration estimated for these samples was less than ng/larva. A comparison of the weight gain and the mortality between the groups of larvae injected with protein bands 1 or 2 indicated that protein band 1 was more toxic by injection delivery.
When HPLC-purified toxin was applied to larval diet at a concentration of 7.5 µg/larva, it caused a halt in larval weight gain (24 larvae tested). The larvae began to feed, but after consuming only a very small portion of the toxin-treated diet they began to show pathological symptoms induced by the toxin and ceased feeding. The insect frass became discolored and most larvae showed signs of diarrhea. Significant insect mortality resulted when several 5 µg toxin doses were applied to the diet over a 7-10 day period.
Agarose-separated protein band 1 significantly inhibited larval weight gain at a dose of 200 ng/larva. Larvae fed similar concentrations of protein band 2 were not inhibited and gained weight at the same rate as the control larvae. Twelve larvae were fed eluted protein and 45 larvae were fed protein-containing agarose pieces. These two sets of data indicate that protein band 1 was orally toxic to Manduca sexta. In this experiment it appeared that protein band 2 was not toxic to Manduca sexta.
Further analysis of protein bands 1 and 2 by SDS-PAGE under denaturing conditions showed that each band was composed of several smaller protein subunits. Proteins were visualized by Coomassie brilliant blue staining followed by silver staining to achieve maximum sensitivity.
The protein subunits in the two bands were very similar. Protein band 1 contained 8 protein subunits of 25.1, 56.2, 60.8, 65.6, 166, 171, 184 and 208 kDa. Protein band 2 had an identical profile except that the 25.1, 60.8, and 65.6 kDa proteins were not present. The 56.2, 60.8, 65.6, and 184 kDa proteins were present in the complex of protein band 1 at approximately equal concentrations and represented 80% or more of the total protein content of that complex.
The native HPLC-purified toxin was further characterized as follows. The toxin was heat labile in that after being heated to 0 C for 15 minutes it lost its ability to kill or to inhibit weight gain when injected or fed to M. sexta larvae. Assays designed to detect lipase, type C phospholipase, nuclease or red blood cell hemolysis activities were performed with purified toxin. None of these activities were present. Antibiotic zone inhibition assays were also done and the purified toxin failed to inhibit growth of Gram-negative or Gram-positive bacteria, yeast or filamentous fungi, indicating that the toxin is not a xenorhabdin antibiotic.
The native HPLC-purified toxin was tested for ability to kill insects other than Manduca sexta. Table 2 lists insects killed by the HPLC-purified P. luminescens toxin in this study.
Table 2
Insects Killed by P. luminescens Toxin

Common Name        Order         Genus and species       Route of Delivery
Tobacco hornworm   Lepidoptera   Manduca sexta           Oral and injected
Mealworm           Coleoptera    Tenebrio molitor        Oral
Pharaoh ant        Hymenoptera   Monomorium pharaonis    Oral
German cockroach   Dictyoptera   Blattella germanica     Oral and injected
Mosquito           Diptera       Aedes aegypti           Oral

Example 2 Insecticide Utility
The Photorhabdus luminescens utility and toxicity were further characterized. Photorhabdus luminescens (strain W-14) culture broth was produced as follows. The production medium was 2% Bacto Proteose Peptone Number 3 (PP3, Difco Laboratories, Detroit, Michigan) in Milli-Q deionized water. Seed culture flasks consisted of 175 ml medium placed in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput and autoclaved for 20 minutes, T=250°F. Production flasks consisted of 500 ml in a 2.8 liter tribaffled flask with a Delong neck, covered by a Shin-etsu silicon foam closure. These were autoclaved for 45 minutes, T=250°F. The seed culture was incubated at 28°C at 150 rpm in a gyrotory shaking incubator with a 2 inch throw. After 16 hours of growth, 1% of the seed culture was placed in the production flask, which was allowed to grow for 24 hours before harvest. Production of the toxin appears to occur during log phase growth. The microbial broth was transferred to a 1 L centrifuge bottle and the cellular biomass was pelleted (minutes at 2500 RPM at 4°C, [~1600] HG-4L Rotor, RC3 Sorval centrifuge, Dupont, Wilmington, Delaware). The primary broth was chilled at 4°C for 8-16 hours and recentrifuged for at least 2 hours (conditions above) to further clarify the broth by removal of a putative mucopolysaccharide which precipitated upon standing. (An alternative processing method combined both steps and involved the use of a 16 hour clarification centrifugation, same conditions as above.) This broth was then stored at 4°C prior to bioassay or filtration.
Photorhabdus culture broth and protein toxin(s) purified from this broth showed activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm (larvae and adult), Colorado potato beetle, and turf grubs, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and weevils. Activity has also been observed against aster leafhopper, which is a member of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies, and spittle bugs, as well as numerous host specific aphid species. The broth and purified fractions are also active against beet armyworm, cabbage looper, black cutworm, tobacco budworm, European corn borer, corn earworm, and codling moth, which are members of the order Lepidoptera. Other typical members of this order are clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm, and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly, and various mosquito species. Activity is seen against carpenter ant and Argentine ant, which are members of the order Hymenoptera, which also includes fire ants, odorous house ants, and little black ants.
The broth/fraction is useful for reducing populations of insects and was used in a method of inhibiting an insect population. The method may comprise applying to a locus of the insect an effective insect inactivating amount of the active described. Results are reported in Table 3.
Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth (filter sterilized, cell-free) or purified HPLC fractions were applied directly to the surface (cm2) of 0.25 ml of artificial diet in 30 µl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from sterilized eggs, with second instar SCR grown on artificial diet or with second instar Diabrotica virgifera virgifera (Western corn rootworm, WCR) reared on corn seedlings grown in Metromix. Second instar larvae were weighed prior to addition to the diet. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (4 days for neonate and adult SCR, 2-5 days for WCR larvae, 7-14 days for second instar SCR). Mortality and weight determinations were scored as indicated. Generally, 16 insects per treatment were used in all studies. Control mortalities were as follows: neonate larvae, adult beetles,
Activity against Colorado potato beetle was tested as follows. Photorhabdus culture broth or control medium was applied to the surface (cm2) of 1.5 ml of standard artificial diet held in the wells of a 24-well tissue culture plate. Each well received 50 µl of treatment and was allowed to air dry. Individual second instar Colorado potato beetle (Leptinotarsa decemlineata, CPB) larvae were then placed onto the diet and mortality was scored after 4 days. Ten larvae per treatment were used in all studies. Control mortality was 3.3%.
Activity against Japanese beetle grubs and beetles was tested as follows. Turf grubs (Popillia japonica, 2-3rd instar) were collected from infested lawns and maintained in the laboratory in soil/peat mixture with carrot slices added as additional diet. Turf beetles were pheromone-trapped locally and maintained in the laboratory in plastic containers with maple leaves as food. Following application of undiluted Photorhabdus culture broth or control medium to corn rootworm artificial diet (µl/1.54 cm2, beetles) or carrot slices (larvae), both stages were placed singly in a diet well and observed for any mortality and feeding. In both cases there was a clear reduction in the amount of feeding (and feces production) observed.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 µl of aqueous solution (Photorhabdus culture broth, control medium or H2O) and approximately 20 1-day old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 2 hours after infestation and did not change over the three day observation period. No control mortality was seen.
Activity against fruitflies was tested as follows.
Purchased Drosophila melanogaster medium was prepared using dry medium and a 50% liquid of either water, control medium or Photorhabdus culture broth. This was accomplished by placing ml of dry medium in each of 3 rearing vials per treatment and adding 8.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 3, 7 and 10 days of exposure. Incorporation of Photorhabdus culture broth into the diet media for fruitfly maggots caused a slight but significant reduction in adult emergence as compared to water and control medium (3% reduction).
Activity against aster leafhopper was tested as follows.
The ingestion assay for aster leafhopper (Macrosteles severini) is designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M square is placed across the top of the dish and secured with an "O" ring. A 1 oz. plastic cup is then infested with approximately 7 leafhoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes.
In tests using undiluted Photorhabdus culture broth, the broth and control medium were dialyzed against water to reduce control mortality. Mortality is reported at day 2, where 26.5% control mortality was seen. In the tests using purified fractions (200 mg protein/ml), a final concentration of 5% sucrose was used in all treatments to improve survivability of the aster leafhoppers. The assay was held in an incubator at 28°C, 70% RH with a 16/8 photoperiod. The assay was graded for mortality at 72 hours. Control mortality was
Activity against Argentine ants was tested as follows. A ml aliquot of 100% Photorhabdus culture broth, control medium or water was pipetted into 2.0 ml clear glass vials. The vials were plugged with a piece of cotton dental wick that was moistened with the appropriate treatment. Each vial was placed into a separate 60 x 16 mm Petri dish with 8 to 12 adult Argentine ants (Linepithema humile). There were three replicates per treatment. Bioassay plates were held on a laboratory bench, at room temperature under fluorescent ceiling lights. Mortality readings were made after 5 days of exposure. Control mortality was 24%.
Activity against carpenter ant was tested as follows. Black carpenter ant workers (Camponotus pennsylvanicus) were collected from trees on DowElanco property in Indianapolis, IN. Tests with Photorhabdus culture broth were performed as follows. Each plastic bioassay container (7 1/8" x) held fifteen workers, a paper harborage and 10 ml of broth or control media in a plastic shot glass. A cotton wick delivered the treatment to the ants through a hole in the shot glass lid. All treatments contained sucrose. Bioassays were held in the dark at room temperature and graded at 19 days. Control mortality was
Assays delivering purified fractions utilized artificial ant diet mixed with the treatment (purified fraction or control solution) at a rate of 0.2 ml treatment/2.0 g diet in a plastic test tube. The final protein concentration of the purified fraction was less than 10 µg/g diet. Ten ants per treatment, a water source, harborage and the treated diet were placed in sealed plastic containers and maintained in the dark at 27°C in a humidified incubator. Mortality was scored at day 10. No control mortality was seen.
Activity against various lepidopteran larvae was tested as follows. Photorhabdus culture broth or purified fractions were applied directly to the surface (cm2) of 0.25 ml of standard artificial diet in 30 µl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate larvae. European corn borer (Ostrinia nubilalis) and corn earworm (Helicoverpa zea) eggs were supplied from commercial sources and hatched in-house, whereas beet armyworm (Spodoptera exigua), cabbage looper (Trichoplusia ni), tobacco budworm (Heliothis virescens), codling moth (Laspeyresia pomonella) and black cutworm (Agrotis ipsilon) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at days 5-7 for Photorhabdus culture broth and days 4-7 for the purified fraction. Generally, 16 insects per treatment were used in all studies. Control mortality ranged from 4-12.5% for control medium and was less than 10% for phosphate buffer.
Table 3
Effect of Photorhabdus luminescens (strain W-14) Culture Broth and Purified Toxin Fraction on Mortality and Growth Inhibition of Different Insect Orders/Species

                                   Broth             Purified Fraction
Insect Order/Species               Mort.    G.I.     Mort.    G.I.

COLEOPTERA
Corn Rootworm
  Southern/neonate larva           100      na       100      na
  Southern/2nd instar              na       38.5     nt       nt
  Southern/adult                   45       nt       nt       nt
  Western/2nd instar               na       35       nt       nt
Colorado Potato Beetle
  2nd instar                       93       nt       nt       nt
Turf Grub
  3rd instar                       na       a.f.     nt       nt
  adult                            na       a.f.     nt       nt

DIPTERA
Fruit Fly (adult emergence)        17       nt       nt       nt
Mosquito larvae                    100      na       nt       nt

HOMOPTERA
Aster Leafhopper                   96.5     na       100      na

HYMENOPTERA
Argentine Ant                      75       na       nt       na
Carpenter Ant                      71       na       100      na

LEPIDOPTERA
Beet Armyworm                      12.5     36       18.75    41.4
Black Cutworm                      nt       nt       0        71.2
Cabbage Looper                     nt       nt       21.9     66.8
Codling Moth                       nt       nt       6.25     45.9
Corn Earworm                       56.3     94.2     97.9     na
European Corn Borer                96.7     98.4     100      na
Tobacco Budworm                    13.5     52.5     19.4     85.6

Mort. = mortality, G.I. = growth inhibition, na = not applicable, nt = not tested, a.f. = anti-feedant

Example 3 Insecticide Utility Upon Soil Application
Photorhabdus luminescens (strain W-14) culture broth was shown to be active against corn rootworm when applied directly to soil or a soil-mix (Metromix). Activity against neonate SCR and WCR in Metromix was tested as follows (Table 4). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 50 gm of dry Metromix. Twenty neonate SCR or WCR were then placed directly on the roots of the seedling and covered with Metromix. Upon infestation, the seedlings were then drenched with 50 ml total volume of a diluted broth solution. After drenching, the cups were sealed and left at room temperature in the light for 7 days.
Afterwards, the seedlings were washed to remove all Metromix and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either or with representing no damage and representing severe damage.
Activity against neonate SCR in soil was tested as follows (Table 5). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After the roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 150 gm of soil from a field in Lebanon, IN planted the previous year with corn. This soil had not been previously treated with insecticides. Twenty neonate SCR were then placed directly on the roots of the seedling and covered with soil. After infestation, the seedlings were drenched with ml total volume of a diluted broth solution. After drenching, the unsealed cups were incubated in a high relative humidity chamber at 78°F. Afterwards, the seedlings were washed to remove all soil and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either or with representing no damage and representing severe damage.
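The "percentage of corn root remaining relative to the control plants" is a simple ratio of mean root weights. A minimal sketch follows, using two root weights from Table 4; the function name is hypothetical, and the pairing of each infested treatment with its uninfested counterpart as the control is an assumption inferred from the reported percentages, not something the text states explicitly.

```python
def percent_root_remaining(treated_root_g, control_root_g):
    """Root weight of a treatment expressed as a percentage of its control."""
    if control_root_g <= 0:
        raise ValueError("control root weight must be positive")
    return 100.0 * treated_root_g / control_root_g


# Southern corn rootworm, water treatment (Table 4 mean root weights, in grams).
# Assumed pairing: 0.4916 g is the uninfested control, 0.1410 g the infested plants.
water_pct = percent_root_remaining(0.1410, 0.4916)
print(f"water, infested: {water_pct:.1f}% of control root weight")
```

Under this pairing the result agrees with the 28.7 reported in Table 4 for the corresponding row.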
Table 4
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on Rootworm Larvae After Post-Infestation Drenching (Metromix)

Treatment               Larvae   Leaf Damage   Root Weight (g)    %
Southern Corn Rootworm
  Water                                        0.4916 ± 0.023     100
  Medium (v/v)                                 0.4416 ± 0.029     100
  Broth (6.25% v/v)                            0.4641 ± 0.081     100
  Water                                        0.1410 ± 0.006     28.7
  Media (v/v)                                  0.1345 ± 0.028     30.4
  Broth (1.56% v/v)                            0.4830 ± 0.031     104
Western Corn Rootworm
  Water                                        0.4446 ± 0.019     100
  Broth (v/v)                                  0.4069 ± 0.026     100
  Water                                        0.2202 ± 0.015     49
  Broth (v/v)                                  0.3879 ± 0.013

Table 5
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on Southern Corn Rootworm Larvae After Post-Infestation Drenching (Soil)

Treatment               Larvae   Leaf Damage   Root Weight (g)    %
  Water                                        0.2148 ± 0.014     100
  Broth (50% v/v)                              0.2260 ± 0.016     103
  Water                                        0.0916 ± 0.009     43
  Broth (50% v/v)                              0.2428 ± 0.032     113

Activity of Photorhabdus luminescens (strain W-14) culture broth against second instar turf grubs in Metromix was observed in tests conducted as follows (Table 6). Approximately 50 gm of dry Metromix was added to a 591 ml clear plastic cup. The Metromix was then drenched with 50 ml total volume of a 50% (v/v) diluted Photorhabdus broth solution. The dilution of crude broth was made with water, with 50% broth being prepared by adding 25 ml of crude broth to 25 ml of water for 50 ml total volume. A 1% solution of proteose peptone #3 (PP3), which is a dilution of the normal media concentration, was used as a broth control. After drenching, five second instar turf grubs were placed on the top of the moistened Metromix. Healthy turf grub larvae burrowed rapidly into the Metromix. Those larvae that did not burrow within 1 h were removed and replaced with fresh larvae.
The cups were sealed and placed in a 28°C incubator, in the dark. After seven days, larvae were removed from the Metromix and scored for mortality. Activity was rated as the percentage of mortality relative to control.
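The mortality percentages for the turf grub test follow directly from the larval counts given in Table 6. The sketch below is illustrative: the function name is hypothetical, and it assumes each count is dead larvae over total larvae scored, which is the reading that reproduces the percentages reported for the water and control medium rows.

```python
def percent_mortality(dead, total):
    """Percentage of scored larvae that were dead."""
    if total <= 0 or not 0 <= dead <= total:
        raise ValueError("require 0 <= dead <= total and total > 0")
    return 100.0 * dead / total


# Counts as listed in Table 6 (assumed to be dead / total scored).
counts = {"Water": (7, 15), "Control medium": (12, 19), "Broth": (17, 20)}
for treatment, (dead, total) in counts.items():
    print(f"{treatment}: {dead}/{total} = {percent_mortality(dead, total):.0f}%")
```

On this reading, the broth count of 17/20 works out to 85%.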
Table 6
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on Turf Grub After Pre-Infestation Drenching (Metromix)

Treatment                  Mortality*   % Mortality
Water                      7/15         47
Control medium (1% w/v)    12/19        63
Broth (50% v/v)            17/20

*expressed as a ratio of dead/living larvae

Example 4 Insecticide Utility Upon Leaf Application
Activity of Photorhabdus broth against European corn borer was seen when the broth was applied directly to the surface of maize leaves (Table 7). In these assays Photorhabdus broth was diluted 100-fold with culture medium and applied manually to the surface of excised maize leaves at a rate of ~6.0 µl/cm2 of leaf surface. The leaves were air dried and cut into equal sized strips approximately 2 x 2 inches. The leaves were rolled, secured with paper clips and placed in 1 oz plastic shot glasses with 0.25 inch of 2% agar on the bottom surface to provide moisture. Twelve neonate European corn borers were then placed onto the rolled leaf and the cup was sealed. After incubation for 5 days at 27°C in the dark, the samples were scored for feeding damage and recovered larvae.
Table 7
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on European Corn Borer Larvae Following Pre-Infestation Application to Excised Maize Leaves

Treatment         Leaf Damage   Larvae Recovered   Weight (mg)
Water             Extensive     55/120             0.42 mg
Control Medium    Extensive     40/120             0.50 mg
Broth (1% v/v)    Trace         3/120              0.15 mg

Activity of the culture broth against neonate tobacco budworm (Heliothis virescens) was demonstrated using a leaf dip methodology. Fresh cotton leaves were excised from the plant and leaf disks were cut with an 18.5 mm cork-borer. The disks were individually immersed in control medium (PP3) or Photorhabdus luminescens (strain W-14) culture broth which had been concentrated approximately 10-fold using an Amicon (Beverly, MA) Proflux M12 tangential filtration system with a 10 kDa filter.
Excess liquid was removed and a straightened paper clip was placed through the center of the disk. The paper clip was then wedged into a plastic, 1.0 oz shot glass containing approximately ml of 1% agar. This served to suspend the leaf disk above the agar. Following drying of the leaf disk, a single neonate tobacco budworm larva was placed on the disk and the cup was capped. The cups were then sealed in a plastic bag and placed in a darkened, 27°C incubator for 5 days. At this time the remaining larvae and leaf material were weighed to establish a measure of leaf damage (Table 8).
Table 8
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay

                        Final Weights (mg)
Treatment               Leaf Disk      Larvae
Control leaves          55.7 ± 1.3     na*
Control Medium          34.0 ± 2.9     4.3 ± 0.91
Photorhabdus broth      54.3 ± 1.4     0.0**

* not applicable, ** no live larvae found

Example 5, Part A Characterization of Toxin Peptide Components
In a subsequent analysis, the toxin protein subunits of the bands isolated as in Example 1 were resolved on a 7% SDS polyacrylamide electrophoresis gel with a ratio of 30:0.8 (acrylamide:BIS-acrylamide). This gel matrix facilitates better resolution of the larger proteins. The gel system used to estimate the Band 1 and Band 2 subunit molecular weights in Example 1 was an 18% gel with a ratio of 38:0.18 (acrylamide:BIS-acrylamide), which allowed for a broader range of size separation, but less resolution of higher molecular weight components.
In this analysis, 10, rather than 8, protein bands were resolved. Table 9 reports the calculated molecular weights of the 10 resolved bands, and directly compares the molecular weights estimated under these conditions to those of the prior example. It is not surprising that additional bands were detected under the different separation conditions used in this example. Variations between the prior and new estimates of molecular weight are also to be expected given the differences in analytical conditions. In the analysis of this example, it is thought that the higher molecular weight estimates are more accurate than in Example 1, as a result of improved resolution. However, these are estimates based on SDS-PAGE analysis, which is typically not analytically precise, and the apparent sizes of the peptides may have been further altered by post- and co-translational modifications.
Amino acid sequences were determined for the N-terminal portions of five of the 10 resolved peptides. Table 9 correlates the molecular weight of the proteins and the identified sequences. In SEQ ID NO:2, certain analyses suggest that the proline at residue 5 may be an asparagine (asn). In SEQ ID NO:3, certain analyses suggest that the amino acid residues at positions 13 and 14 are both arginine (arg). In SEQ ID NO:4, certain analyses suggest that the amino acid residue at position 6 may be either alanine (ala) or serine (ser). In SEQ ID certain analyses suggest that the amino acid residue at position 3 may be aspartic acid (asp).
Table 9

EXAMPLE 1 ESTIMATE    NEW ESTIMATE*    SEQ. LISTING
208                   200.2 kDa        SEQ ID NO:1
184                   175.0 kDa        SEQ ID NO:2
65.6                  68.1 kDa         SEQ ID NO:3
60.8                  65.1 kDa         SEQ ID NO:4
56.2                  58.3 kDa         SEQ ID
25.1                  23.2 kDa         SEQ ID

*New estimates are based on SDS PAGE and are not based on gene sequences. SDS PAGE is not analytically precise.
Example 5, Part B
Characterization of Toxin Peptide Components

A new N-terminal sequence, SEQ ID NO:15, Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further N-terminal sequencing of peptides isolated from native HPLC-purified toxin as described in Example 5, Part A, above. This peptide comes from the tcaA gene. The peptide, labeled TcaAii, starts at position 254 and goes to position 491, where the TcaAiii peptide (SEQ ID NO:4) starts. The estimated size of the peptide based on the gene sequence is 25,240 Da.
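As a rough cross-check of the size quoted above, a peptide's mass can be approximated from its residue count. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this specification, and reads the TcaAii peptide as spanning residues 254 through 490 (the residue before TcaAiii begins). The function and constant names are illustrative only.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This is an assumption, not a value stated in the specification.
AVG_RESIDUE_MASS_DA = 110.0

def approx_mass_da(first_residue, last_residue, avg_mass=AVG_RESIDUE_MASS_DA):
    """Approximate peptide mass in daltons from inclusive residue positions."""
    n_residues = last_residue - first_residue + 1
    return n_residues * avg_mass

# TcaAii is described as starting at position 254; TcaAiii starts at 491,
# so the last TcaAii residue is taken to be 490 (an interpretation).
tcaaii_estimate = approx_mass_da(254, 490)
reported = 25240.0  # Da, the size stated from the gene sequence
relative_gap = abs(tcaaii_estimate - reported) / reported
print(f"estimate {tcaaii_estimate:.0f} Da, relative difference {relative_gap:.1%}")
```

An estimate of this kind ignores amino acid composition, so agreement within a few percent of the sequence-derived figure is about as close as it can be expected to get.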
Example 6
Characterization of Toxin Peptide Components

In yet another analysis, the toxin protein complex was re-isolated from the Photorhabdus luminescens growth medium (after culture without Tween) by performing a 10%-80% ammonium sulfate precipitation followed by an ion exchange chromatography step (Mono Q) and two molecular sizing chromatography steps. These conditions were like those used in Example 1. During the first molecular sizing step, a second biologically active peak was found at about 100 ± 10 kDa. Based upon protein measurements, this fraction was 20- to 50-fold less active than the larger, or primary, active peak of about 860 ± 100 kDa (native). During this isolation experiment, a smaller active peak of about 325 kDa that retained a considerable portion of the starting biological activity was also resolved. It is thought that the 325 kDa peak is related to or derived from the 860 kDa peak.
A 56 kDa protein was resolved in this analysis. The N-terminal sequence of this protein is presented in SEQ ID NO:6. It is noteworthy that this protein shares significant identity and conservation with SEQ ID NO:5 at the N-terminus, suggesting that the two may be encoded by separate members of a gene family and that the proteins produced by each gene are sufficiently similar to both be operable in the insecticidal toxin complex.
A second, prominent 185 kDa protein was consistently present in amounts comparable to that of protein 3 from Table 9, and may be the same protein or protein fragment. The N-terminal sequence of this 185 kDa protein is shown at SEQ ID NO:7.
Additional N-terminal amino acid sequence data were also obtained from isolated proteins. None of the determined N-terminal sequences appears identical to a protein identified in Table 9. Other proteins were present in the isolated preparation. One such protein has an estimated molecular weight of 108 kDa and an N-terminal sequence as shown in SEQ ID NO:8. A second such protein has an estimated molecular weight of 80 kDa and an N-terminal sequence as shown in SEQ ID NO:9.
When the protein material in the approximately 325 kDa active peak was analyzed by size, bands of approximately 51, 31, 28, and 22 kDa were observed. As in all cases in which a molecular weight was determined by analysis of electrophoretic mobility, these molecular weights were subject to error effects introduced by buffer ionic strength differences, electrophoresis power differences, and the like. One of ordinary skill would understand that definitive molecular weight values cannot be determined using these standard methods and that each was subject to variation. It was hypothesized that proteins of these sizes are degradation products of the larger protein species (of approximately 200 kDa size) that were observed in the larger primary toxin complex.
Finally, several preparations included a protein having the N-terminal sequence shown in SEQ ID NO:10. This sequence was strongly homologous to known chaperonin proteins, accessory proteins known to function in the assembly of large protein complexes. Although the applicants could not ascribe such an assembly function to the protein identified in SEQ ID NO:10, it was consistent with the existence of the described toxin protein complex that such a chaperonin protein could be involved in its assembly. Moreover, although such proteins have not directly been suggested to have toxic activity, this protein may be important to determining the overall structural nature of the protein toxin and thus may contribute to the toxic activity or durability of the complex in vivo after oral delivery.
Subsequent analysis of the stability of the protein toxin complex to proteinase K was undertaken. It was determined that after 24 hour incubation of the complex in the presence of a fold molar excess of proteinase K, activity was virtually eliminated (mortality on oral application dropped to about These data confirm the proteinaceous nature of the toxin.
The toxic activity was also retained by a dialysis membrane, again confirming the large size of the native toxin complex.
Example 7
Isolation, Characterization and Partial Amino Acid Sequencing of Photorhabdus Toxins

Isolation and N-Terminal Amino Acid Sequencing: In a set of experiments conducted in parallel to Examples 5 and 6, ammonium sulfate precipitation of Photorhabdus proteins was performed by adjusting Photorhabdus broth, typically 2-3 liters, to a final concentration of either 10% or 20% by the slow addition of ammonium sulfate crystals. After stirring for 1 hour at 4°C, the material was centrifuged at 12,000 x g for 30 minutes. The supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The pellet was resuspended in one-tenth the volume of 10 mM Na2.PO4, pH 7.0, and dialyzed against the same phosphate buffer overnight at 4°C. The dialyzed material was centrifuged at 12,000 x g for 1 hour prior to ion exchange chromatography.
A HR 16/50 Q Sepharose (Pharmacia) anion exchange column was equilibrated with 10 mM Na2.PO4, pH 7.0. Centrifuged, dialyzed ammonium sulfate pellet was applied to the Q Sepharose column at a rate of 1.5 ml/min and washed extensively at 3.0 ml/min with equilibration buffer until the optical density (O.D. 280) reached less than 0.100. Next, either a 60 minute NaCl gradient ranging from 0 to 0.5 M at 3 ml/min, or a series of step elutions using 0.1 M, 0.4 M and finally 1.0 M NaCl for 60 minutes each, was applied to the column. Fractions were pooled and concentrated using a Centriprep 100. Alternatively, proteins could be eluted by a single 0.4 M NaCl wash without prior elution with 0.1 M NaCl.
Two milliliter aliquots of concentrated Q Sepharose samples were loaded at 0.5 ml/min onto a HR 16/50 Superose 12 (Pharmacia) gel filtration column equilibrated with 10 mM Na2.PO4 buffer. The column was washed with the same buffer for 240 min and 2 min samples were collected. The void volume material was collected and concentrated using a Centriprep 100.
Two milliliter aliquots of concentrated Superose 12 samples were loaded at 0.5 ml/min onto a HR 16/50 Sepharose 4B-CL (Pharmacia) gel filtration column equilibrated with 10 mM Na2.PO4 buffer. The column was washed with the same buffer for 240 min and 2 min samples were collected.
The excluded protein peak was subjected to a second fractionation by application to a gel filtration column that used a Sepharose CL-4B resin, which separates proteins ranging from kDa to 1000 kDa. This fraction was resolved into two peaks; a minor peak at the void volume (>1000 kDa) and a major peak which eluted at an apparent molecular weight of about 860 kDa.
Over a one week period subsequent samples subjected to gel filtration showed the gradual appearance of a third peak (approximately 325 kDa) that seemed to arise from the major peak, perhaps by limited proteolysis. Bioassays performed on the three peaks showed that the void peak had no activity, while the 860 kDa toxin complex fraction was highly active, and the 325 kDa peak was less active, although quite potent. SDS PAGE analysis of Sepharose CL-4B toxin complex peaks from different fermentation productions revealed two distinct peptide patterns, denoted and The two patterns had marked differences in the molecular weights and concentrations of peptide components in their fractions. The pattern, produced most frequently, had 4 high molecular weight peptides 150 kDa) while the "P" pattern had 3 high molecular weight peptides. In addition, the peptide fraction was found to have 2-3 fold more activity against European Corn Borer. This shift may be related to variations in protein expression due to age of inoculum and/or other factors based on growth parameters of aged cultures.
Milligram quantities of peak toxin complex fractions of either peptide pattern were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff) to PVDF membranes (ProBlott, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides in one of the patterns had unique N-terminal amino acid sequences compared to the sequences identified in the previous example. A 201 kDa (TcdAii) peptide, set forth as SEQ ID NO:13 below, shared 33% amino acid identity and 50% similarity with SEQ ID NO:1 (TcbAii) (Table 10; in Table 10, vertical lines denote amino acid identities and colons indicate conservative amino acid substitutions). A second peptide of 197 kDa, SEQ ID NO:14 (TcdB), had 42% identity and 58% homology with SEQ ID NO:2 (TcaC). Yet a third peptide of 205 kDa was denoted TcdAii. In addition, a limited N-terminal amino acid sequence, SEQ ID NO:16 (TcbA), of a peptide of at least 235 kDa was identical to the amino acid sequence, SEQ ID NO:12, deduced from a cloned gene (tcbA), SEQ ID NO:11, containing a deduced amino acid sequence corresponding to SEQ ID NO:1 (TcbAii). This indicates that the larger 235+ kDa peptide was proteolytically processed to the 201 kDa peptide (TcbAii) (SEQ ID NO:1) during fermentation, possibly resulting in activation of the molecule. In yet another sequence, the sequence originally reported as SEQ ID NO:5 (TcaBii) in Example 5 above was found to contain an aspartic acid residue (Asp) at the third position rather than glycine (Gly), and two additional amino acids, Gly and Asp, at the eighth and ninth positions, respectively. In yet two other sequences, SEQ ID NO:2 (TcaC) and SEQ ID NO:3 (TcaBi), additional amino acid sequence was obtained.
Densitometric quantitation was performed using a sample that was identical to the preparation sent for Nterminal analysis. This analysis showed that the 201 kDa and 197 kDa peptides represent 7.0% and respectively, of the total Coomassie brillant blue stained protein in the pattern and are present in amounts similar to the other abundant peptides.
It is speculated that these peptides may represent protein homologs, analogous to the situation found with other bacterial toxins, such as various CryI Bt toxins. These proteins vary from 40-90% homology at their N-terminal amino acid sequence, which encompasses the toxic fragment.
Internal Amino Acid Sequencing: To facilitate cloning of toxin peptide genes, internal amino acid sequences of selected peptides were obtained as follows. Milligram quantities of peak 2A fractions of either peptide pattern were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff) to PVDF membranes (ProBlott, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides, referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi (containing SEQ ID NO:3), were subjected to trypsin digestion by Harvard MicroChem followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal peptides were sequenced for the peptide TcdAii (205 kDa peptide), referred to as TcdAii-PT111 (SEQ ID NO:17) and TcdAii-PT79 (SEQ ID NO:18). Two internal peptides were sequenced for the peptide TcaBi (68 kDa peptide), referred to as TcaBi-PT158 (SEQ ID NO:19) and TcaBi-PT108 (SEQ ID NO:20). Four internal peptides were sequenced for the peptide TcbAii (201 kDa peptide), referred to as TcbAii-PT103 (SEQ ID NO:21), TcbAii-PT56 (SEQ ID NO:22), TcbAii-PT81(a) (SEQ ID NO:23), and TcbAii-PT81(b) (SEQ ID NO:24).
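The percent-identity figures quoted for the N-terminal comparisons can be reproduced by a position-by-position comparison. The sketch below is illustrative only: it uses the 12-residue single-letter sequences listed for the 197 kDa peptide (SEQ ID NO:14) and for TcaC (SEQ ID NO:2) in Table 10, and it assumes an ungapped alignment, which may not be how the original comparison was made. The function name is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length for an ungapped comparison")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)

seq_id_14 = "MQNSQTFSVGEL"  # 197 kDa peptide, as listed in Table 10
seq_id_2 = "MQDSPEVSITTL"   # TcaC N-terminus, as listed in Table 10
identity = percent_identity(seq_id_14, seq_id_2)
print(f"{identity:.1f}% identity")
```

Five of the twelve positions match, which rounds to the 42% identity reported for this pair.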
Table 10
N-Terminal Amino Acid Sequences

201 kDa (33% identity, 50% similarity to SEQ ID NO:1)
L I G Y N N Q F S G A              SEQ ID NO:13
F I Q G Y S D L F G N A            SEQ ID NO:1

197 kDa (42% identity, 58% similarity to SEQ ID NO:2)
M Q N S Q T F S V G E L            SEQ ID NO:14
M Q D S P E V S I T T L            SEQ ID NO:2

Example 8
Construction of a cosmid library of Photorhabdus luminescens W-14 genomic DNA and its screening to isolate genes encoding peptides comprising the toxic protein preparation

As a prerequisite for the production of Photorhabdus insect toxic proteins in heterologous hosts, and for other uses, it is necessary to isolate and characterize the genes that encode those peptides. This objective was pursued by two approaches in parallel. One approach, described later, was based on the use of monoclonal and polyclonal antibodies raised against the purified toxin, which were then used to isolate clones from an expression library. The other approach, described in this example, is based on the use of the N-terminal and internal amino acid sequence data to design degenerate oligonucleotides for use in PCR amplification. Either method can be used to identify DNA clones that contain the peptide-encoding genes so as to permit the isolation of the respective genes, and the determination of their DNA base sequence.
Genomic DNA Isolation

Photorhabdus luminescens (Xenorhabdus luminescens) strain W-14 (ATCC accession number 55397), deposited on 5 March 1993, was grown on 2% proteose peptone #3 agar (Difco Laboratories, Detroit, MI), and insecticidal toxin competence was maintained by repeated bioassay after passage, using the method described in Example 1 above. A 50 ml shake culture was produced in a 175 ml baffled flask in 2% proteose peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 hours. 15 ml of this culture was pelleted and frozen in its medium at -20°C until it was thawed for DNA isolation. The thawed culture was centrifuged (700 x g, 30 min) and the floating orange mucopolysaccharide material was removed. The remaining cell material was centrifuged (25,000 x g, 15 min) to pellet the bacterial cells, and the medium was removed and discarded.
Genomic DNA was isolated by an adaptation of the CTAB method described in section 2.4.1 of Current Protocols in Molecular Biology (Ausubel et al. eds, John Wiley & Sons, 1994) [modified to include a salt shock and with all volumes increased]. The pelleted bacterial cells were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12 ml of 5 M NaCl was added; this mixture was centrifuged 20 min at 15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 µl of 10% SDS and 60 µl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY; in sterile distilled water) were added to the suspension. This mixture was incubated at 37°C for 1 hr; then approximately 10 mg lysozyme (Worthington Biochemical Corp., Freehold, NJ) was added. After an additional 45 min, 1 ml of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This preparation was incubated 10 min at 65°C, then gently agitated and further incubated and agitated for approximately 20 min to assist clearing of the cellular material.
An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently and centrifuged. After two extractions with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v/v/v; equilibrated with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation, Kaysville, UT), the DNA was precipitated with 0.6 volume of isopropanol. The DNA precipitate was gently removed with a glass rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). This preparation contained 2.5 mg/ml DNA, as determined by optical density at 260 nm (OD260).
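For orientation, the conversion from an OD260 reading to a DNA concentration can be sketched as follows. The factor of about 50 µg/ml per absorbance unit for double-stranded DNA (1 cm path length) is the conventional value and is not stated in this example; the absorbance reading and dilution factor below are hypothetical numbers chosen only to illustrate how a figure of 2.5 mg/ml could arise.

```python
# Conventional conversion for double-stranded DNA: an absorbance of 1.0 at
# 260 nm (1 cm path) corresponds to roughly 50 micrograms per milliliter.
DSDNA_UG_PER_ML_PER_A260 = 50.0

def dsdna_mg_per_ml(a260, dilution_factor):
    """DNA concentration of the undiluted stock, in mg/ml."""
    ug_per_ml = a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0

# Hypothetical measurement: stock diluted 1:100, reading of 0.5 at 260 nm.
stock_mg_per_ml = dsdna_mg_per_ml(a260=0.5, dilution_factor=100)

# The DNA was dissolved in 2 ml STE, so the total yield follows directly.
total_mg = stock_mg_per_ml * 2.0
print(stock_mg_per_ml, total_mg)
```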
The molecular size range of the isolated genomic DNA was evaluated for suitability for library construction. CHEF gel analysis was performed in 1.5% agarose (SeaKem LE, FMC BioProducts, Rockland, ME) gels with 0.5X TBE buffer (44.5 mM Tris-HCl pH 8.0, 44.5 mM H3BO3, 1 mM EDTA) on a BioRad CHEF-DR I apparatus with a Pulsewave 760 Switcher (Bio-Rad Laboratories, Inc., Richmond, CA). The running parameters were: initial A time, 3 sec; final A time, 12 sec; 200 volts; running temperature, 4-18°C; run time, 16.5 hr. Ethidium bromide staining and examination of the gel under ultraviolet light indicated the DNA ranged from 30-250 kbp in size.
CONSTRUCTION OF LIBRARY: A partial Sau3A I digest was made of this Photorhabdus genomic DNA preparation. The method was based on section 3.1.3 of Ausubel (supra.). Adaptations included running smaller scale reactions under various conditions until nearly optimal results were achieved. Several scaled-up large reactions with varied conditions were run, the results analyzed on CHEF gels, and only the best large scale preparation was carried forward. In the optimal case, 200 µg of Photorhabdus genomic DNA was incubated with 1.5 units of Sau3A I (New England Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 ml total volume of 1X NEB 4 buffer (supplied as 10X by the manufacturer). The reaction was stopped by adding 2 ml of PCI and centrifuging at 8000 x g for 10 min. To the supernatant were added 200 µl of 5 M NaCl plus 6 ml of ice-cold ethanol. This preparation was chilled for 30 min at -20°C, then centrifuged at 12,000 x g. The supernatant was removed and the precipitate was dried in a vacuum oven at 40°C, then resuspended in 400 µl STE.
Spectrophotometric assay indicated about 40% recovery of the input DNA. The digested DNA was size fractionated on a sucrose gradient according to section 5.3.2 of CPMB (op. cit.). A linear sucrose gradient (up to 40%) was prepared with a gradient maker in Ultra-Clear tubes (Beckman Instruments, Inc., Palo Alto, CA) and the DNA sample was layered on top. After centrifugation (26,000 rpm, 17 hr, Beckman SW41 rotor, 20°C), fractions (about 750 µl) were drawn from the top of the gradient and analyzed by CHEF gel electrophoresis (as described earlier). Fractions containing Sau3A I fragments in the size range 20-40 kbp were selected and DNA was precipitated by a modification (amounts of all solutions increased approximately 6.3-fold) of the method in section 5.3.3 of Ausubel (supra.). After overnight precipitation, the DNA was collected by centrifugation (17,000 x g, 15 min), dried, redissolved in TE, pooled into a final volume of 80 µl, and reprecipitated with the addition of 8 µl of 3 M sodium acetate and 220 µl of ethanol. The pellet collected by centrifugation as above was resuspended in 12 µl TE. Concentration of the DNA was determined by Hoechst 33258 dye (Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer TKO100 fluorimeter (Hoefer Scientific Instruments, San Francisco, CA). Approximately 2.5 µg of the size-fractionated DNA was recovered.
Thirty µg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was digested to completion with 100 units of restriction enzyme BamH I (NEB) in the manufacturer's buffer (final volume of 200 µl, 37°C, 1 hr). The reaction was extracted with 100 µl of PCI and DNA was precipitated from the aqueous phase by addition of 20 µl of 3 M sodium acetate and 550 µl of -20°C absolute ethanol. After incubation at -70°C, the DNA was collected by centrifugation (17,000 x g, 15 min), dried under vacuum, and dissolved in 180 µl of 10 mM Tris-HCl, pH 8.0. To this were added 20 µl of 10X CIP buffer (100 mM Tris-HCl, pH 8.3; 10 mM ZnCl2; 10 mM MgCl2) and 1 µl (0.25 units) of 1:4 diluted calf intestinal alkaline phosphatase (Boehringer Mannheim Corporation, Indianapolis, IN). After incubation at 37°C, the following additions were made: 2 µl of 0.5 M EDTA, pH 8.0; 10 µl of 10% SDS; and 0.5 µl of 20 mg/ml proteinase K (as above), followed by incubation at 55°C for 30 min. Following sequential extractions with 100 µl of PCI and 100 µl of phenol (Intermountain Scientific Corporation, equilibrated with 1 M Tris-HCl), the dephosphorylated DNA was precipitated by addition of 72 µl of 7.5 M ammonium acetate and 550 µl of -20°C ethanol, incubation on ice for 30 min, and centrifugation as above. The pelleted DNA was washed once with 500 µl of -20°C ethanol, dried under vacuum, and dissolved in 20 µl of TE buffer.
Ligation of the size-fractionated Sau3A I fragments to the BamH I-digested and phosphatased pWE15 vector was accomplished using T4 ligase (NEB) by a modification (use of premixed 10X ligation buffer supplied by the manufacturer) of the protocol in section 3.33 of Ausubel. Ligation was carried out overnight in a total volume of 20 µl at 15°C, followed by storage.
Four µl of the cosmid DNA ligation reaction, containing about 1 µg of DNA, was packaged into bacteriophage lambda using a commercial packaging extract (Gigapack III Gold Packaging Extract, Stratagene), following the manufacturer's directions.
The packaged preparation was stored at 4°C until use. The packaged cosmid preparation was used to infect Escherichia coli XL1 Blue MR cells (Stratagene) according to the Gigapack III Gold protocols ("Titering the Cosmid Library"), as follows. XL1 Blue MR cells were grown in LB medium (g/l: Bacto-tryptone, 10; Bacto-yeast extract, 5; Bacto-agar, 15; NaCl, 5; [Difco Laboratories, Detroit, MI]) containing 0.2% maltose plus 10 mM MgSO4 at 37°C. After 5 hr growth, cells were pelleted at 700 x g (15 min) and resuspended in 6 ml of 10 mM MgSO4. The culture density was adjusted with 10 mM MgSO4 to OD600 = 0.5. The packaged cosmid library was diluted 1:10 or 1:20 with sterile SM medium (0.1 M NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01% w/v gelatin), and 25 µl of the diluted preparation was mixed with 25 µl of the diluted XL1 Blue MR cells. The mixture was incubated at 25°C (without shaking), then 200 µl of LB broth was added, and incubation was continued for approximately 1 hr with occasional gentle shaking. Aliquots (20-40 µl) of this culture were spread on LB agar plates containing 100 mg/l ampicillin (LB-Amp100) and incubated overnight at 37°C. To store the library without amplification, single colonies were picked and inoculated into individual wells of sterile 96-well microwell plates, each well containing 75 µl of Terrific Broth (TB media: 12 g/l Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v glycerol, 17 mM KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin (TB-Amp100), and incubated (without shaking) overnight at 37°C. After replicating the 96-well plate into a copy plate, 75 µl/well of filter-sterilized TB:glycerol (with, or without, 100 mg/l ampicillin) was added to the plate, it was shaken briefly at 100 rpm, 37°C, and then closed with Parafilm (American National Can, Greenwich, CT) and placed in a -70°C freezer for storage. Copy plates were grown and processed identically to the master plates.
A total of 40 such master plates (and their copies) were prepared.
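Whether a library of this size is likely to contain a given gene can be estimated with the standard Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size and N is the number of clones. The sketch below is an illustration only. The genome size of 5.7 Mbp is an assumed value that does not appear in this specification, the insert size of 30 kbp is simply the midpoint of the 20-40 kbp range selected above, and the function names are hypothetical.

```python
import math

def coverage_probability(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon probability that a given locus is in at least one clone."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones

def clones_needed(probability, insert_bp, genome_bp):
    """Smallest number of clones giving at least the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))

GENOME_BP = 5_700_000  # assumed genome size; not stated in the text
INSERT_BP = 30_000     # midpoint of the 20-40 kbp size-selected range

library_clones = 40 * 96  # 40 master plates of 96 wells each
p_full_library = coverage_probability(library_clones, INSERT_BP, GENOME_BP)
p_800_clones = coverage_probability(800, INSERT_BP, GENOME_BP)
n_for_99_percent = clones_needed(0.99, INSERT_BP, GENOME_BP)
print(library_clones, round(p_800_clones, 3), n_for_99_percent)
```

Under these assumptions even the 800 colonies represented on the screening membranes described below would be expected to contain most single-copy loci, which is consistent with screening only a subset of the 40 plates.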
SCREENING OF THE LIBRARY WITH RADIOLABELED DNA PROBES: To prepare colony filters for probing with radioactively labeled probes, ten 96-well plates of the library were thawed at 25°C (bench top at room temperature). A replica plating tool with 96 prongs was used to inoculate a fresh 96-well copy plate containing 75 µl/well of TB-Amp100. The copy plate was grown overnight (stationary) at 37°C, then shaken about 30 min at 100 rpm at 37°C. A total of 800 colonies was represented in these copy plates, due to nongrowth of some isolates. The replica tool was used to inoculate duplicate impressions of the 96-well arrays onto Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron, 220 x 250 mm) which had been placed on solid LB-Amp100 (100 ml/dish) in Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm; Curtin Mathison Scientific, Inc., Wood Dale, IL). The colonies were grown on the membranes at 37°C for about 3 hr.
A positive control colony (a bacterial clone containing a GZ4 sequence insert, see below) was grown on a separate Magna NT membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium supplemented with 35 mg/l chloramphenicol (LB-Cam35), and processed alongside the library colony membranes. Bacterial colonies on the membranes were lysed, and the DNA was denatured and neutralized according to a protocol taken from the Genius System User's Guide version 2.0 (Boehringer Mannheim, Indianapolis, IN). Membranes were placed colony side up on filter paper soaked with 0.5 N NaOH plus 1.5 M NaCl for 15 min to denature, and neutralized on filter paper soaked with 1 M Tris-HCl pH 8.0, 1.5 M NaCl for 15 min. After UV-crosslinking using a Stratagene UV Stratalinker set on auto crosslink, the membranes were stored dry at 25°C until use. Membranes were trimmed into strips containing the duplicate impressions of a single 96-well plate, then washed extensively by the method of section 6.4.1 in CPMB (op. cit.): 3 hr at 25°C in 3X SSC, 0.1% SDS, followed by 1 hr at 65°C in the same solution, then rinsed in 2X SSC in preparation for the hybridization step (20X SSC = 3 M NaCl, 0.3 M sodium citrate).

Amplification of a specific genomic fragment of a tcaC gene.
Based on the N-terminal amino acid sequence determined for the purified TcaC peptide fraction (disclosed herein as SEQ ID NO:2), a pool of degenerate oligonucleotides (pool S4Psh) was synthesized by standard β-cyanoethyl chemistry on an Applied BioSystem ABI394 DNA/RNA Synthesizer (Perkin Elmer, Foster City, CA). The oligonucleotides were deprotected 8 hours at 55°C, dissolved in water, quantitated by spectrophotometric measurement, and diluted for use. This pool corresponds to the determined N-terminal amino acid sequence of the TcaC peptide.
The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:

Amino Acid   Met Gln Asp Ser Pro Glu Val
S4Psh        5' ATG CA(A/G) GA(T/C) CCI GA(A/G) GT 3'

Another set of degenerate oligonucleotides was synthesized (pool P2.3.5R), representing the complement of the coding strand for the determined amino acid sequence of SEQ ID NO:17:

Amino Acid   Ala Phe Asn Ile Asp Asp Val
Codons       5' GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT 3'
P2.3.5R      3' CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA 5'

These oligonucleotides were used as primers in Polymerase Chain Reactions (PCR, Roche Molecular Systems, Branchburg, NJ) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 µl) contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng of genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and dTTP, 1X GeneAmp PCR buffer, and 2.5 units of AmpliTaq DNA polymerase (both from Roche Molecular Systems; 10X GeneAmp buffer is 100 mM Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin).
Amplifications were performed in a Perkin Elmer Cetus DNA Thermal Cycler (Perkin Elmer, Foster City, CA) using 35 cycles of 94°C (1.0 min), 55°C (2.0 min), 72°C (3.0 min), followed by an extension period of 7.0 min at 72°C. Amplification products were analyzed by electrophoresis through 2% w/v NuSieve 3:1 agarose (FMC BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA). A specific product of estimated size 250 bp was observed amongst numerous other amplification products by ethidium bromide (0.5 µg/ml) staining of the gel and examination under ultraviolet light.
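The reaction composition above is stated in absolute amounts; the corresponding working concentrations follow from the reaction volume. The sketch below takes the reaction volume as 50 µl and uses the amounts and cycling times given in the text. The variable and function names are illustrative, and the run time counts only the programmed hold times, not the ramping between temperatures.

```python
def micromolar(amount_pmol, volume_ul):
    """pmol per microliter is numerically equal to a micromolar concentration."""
    return amount_pmol / volume_ul

REACTION_UL = 50.0

primer_um = micromolar(125.0, REACTION_UL)         # each primer pool, 125 pmol
dntp_um = micromolar(10.0 * 1000.0, REACTION_UL)   # each dNTP, 10 nmol = 10,000 pmol
template_ng_per_ul = 253.0 / REACTION_UL           # genomic template DNA

# 35 cycles of 1.0 + 2.0 + 3.0 min, then a 7.0 min final extension.
cycle_min = 1.0 + 2.0 + 3.0
programmed_min = 35 * cycle_min + 7.0

print(primer_um, dntp_um, template_ng_per_ul, programmed_min)
```

Because each primer is a degenerate pool, the concentration of any single sequence within the pool is lower than the pool concentration by the pool's degeneracy.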
The region of the gel containing an approximately 250 bp product was excised, and a small plug (0.5 mm dia.) was removed and used to supply template for PCR amplification (40 cycles). The reaction (50 µl) contained the same components as above, minus genomic template DNA. Following amplification, the ends of the fragments were made blunt and were phosphorylated by incubation at 25°C for 20 min with 1 unit of T4 DNA polymerase (NEB), 1 nmol ATP, and 2.15 units of T4 kinase (Pharmacia Biotech Inc., Piscataway, NJ).
DNA fragments were separated from residual primers by electrophoresis through 1% w/v GTG agarose (FMC) in TEA. A gel slice containing fragments of apparent size 250 bp was excised, and the DNA was extracted using a Qiaex kit (Qiagen Inc., Chatsworth, CA).
The extracted DNA fragments were ligated to plasmid vector pBC (Stratagene) that had been digested to completion with restriction enzyme Sma I and extracted in a manner similar to that described for pWE15 DNA above. A typical ligation reaction (16.3 µl) contained 100 ng of digested pBC DNA, 70 ng of 250 bp fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of T4 DNA ligase (Collaborative Biomedical Products, Bedford, MA), in 1X ligation buffer (50 mM Tris-HCl, pH 7.4; 10 mM MgCl2; 10 mM dithiothreitol; 1 mM spermidine; 1 mM ATP; 100 µg/ml bovine serum albumin). Following overnight incubation at 14°C, the ligated products were transformed into frozen, competent Escherichia coli cells (Gibco BRL) according to the supplier's recommendations, and plated on LB-Cam35 plates containing IPTG (119 µg/ml) and X-gal (50 µg/ml). Independent white colonies were picked, and plasmid DNA was prepared by a modified alkaline-lysis/PEG precipitation method (PRISM Ready Reaction DyeDeoxy Terminator Cycle Sequencing Kit Protocols; ABI/Perkin Elmer).
The nucleotide sequence of both strands of the insert DNA was determined, using T7 primers (pBC bases 601-623: TAAAACGACGGCCAGTGAGCGCG) and LacZ primers (pBC bases 792-816: ATGACCATGATTACGCCAAGCGCGC) and protocols supplied with the PRISM sequencing kit (ABI/Perkin Elmer). Nonincorporated dye-terminator dideoxyribonucleotides were removed by passage through Centri-Sep 100 columns (Princeton Separations, Inc., Adelphia, NJ) according to the manufacturer's instructions. The DNA sequence was obtained by analysis of the samples on an ABI Model 373A DNA Sequencer (ABI/Perkin Elmer). The DNA sequences of two isolates, GZ4 and HB14, were found to be as illustrated in Figure 1.
This sequence illustrates the following features: i) bases 1-20 represent one of the 64 possible sequences of the S4Psh degenerate oligonucleotides; ii) the sequence of amino acids 1-3 and 6-12 corresponds exactly to that determined for the N-terminus of TcaC (disclosed as SEQ ID NO:2); iii) the fourth amino acid encoded is a cysteine residue rather than serine, a difference that is encoded within the degeneracy for the serine codons (see above); iv) the fifth amino acid encoded is proline, corresponding to the TcaC N-terminal sequence given as SEQ ID NO:2; v) bases 257-276 encode one of the 192 possible sequences designed into the degenerate pool; vi) the TGA termination codon introduced at bases 268-270 is the result of complementarity to the degeneracy built into the oligonucleotide pool at the corresponding position, and does not indicate a shortened reading frame for the corresponding gene.
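The figure of 192 possible sequences for the P2.3.5R pool can be checked by multiplying the number of alternatives at each position of the coding-strand codons given earlier for that pool. The sketch below is an illustration under that reading; the parser handles only the notation used in this example (single bases, N for any base, and parenthesized alternatives), and all function names are hypothetical.

```python
import re
from math import prod

# N is the IUPAC symbol for any of the four bases.
_ANY_BASE = "ACGT"
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def parse_positions(oligo):
    """Turn notation such as 'GCN TT(T/C)' into a list of per-position base sets."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", oligo):
        if group:
            positions.append(set(group.split("/")))
        elif single == "N":
            positions.append(set(_ANY_BASE))
        else:
            positions.append({single})
    return positions

def degeneracy(positions):
    """Number of distinct sequences represented by the pool."""
    return prod(len(choices) for choices in positions)

def complement(positions):
    """Complement each position, keeping the same left-to-right order."""
    return [{_COMPLEMENT[base] for base in choices} for choices in positions]

# Coding-strand codons for Ala Phe Asn Ile Asp Asp Val, written 5' to 3'.
coding = parse_positions("GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT")
pool_size = degeneracy(coding)
primer_3_to_5 = complement(coding)  # reads 3' to 5', as the pool is written above
print(len(coding), pool_size)
```

The product 4 x 2 x 2 x 3 x 2 x 2 gives 192 over 20 positions, matching feature v) above, and complementing position by position reproduces the P2.3.5R pool as written 3' to 5'.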
Labeling of a TcaC peptide gene-specific probe. DNA fragments corresponding to the above 276 bases were amplified by PCR® in a 100 µl reaction volume, using 100 pmol each of P2Psh and P2.3.5R primers, 10 ng of plasmids GZ4 or HB14 as templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, 5 units of AmpliTaq® DNA polymerase, and 1X concentration of GeneAmp® buffer, under the same temperature regimes as described above. The amplification products were extracted from a 1% GTG agarose gel by Qiaex kit and quantitated by fluorometry.
The extracted amplification products from plasmid HB14 template (approximately 400 ng) were split into five aliquots and labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) according to the manufacturer's instructions. Nonincorporated radioisotope was removed by passage through NucTrap Probe Purification Columns (Stratagene), according to the supplier's instructions. The specific activity of the labeled DNA product was determined by scintillation counting to be 3.11 x 10^8 dpm/µg. This labeled DNA was used to probe membranes prepared from 800 members of the genomic library.
Screening with a TcaC-peptide gene-specific probe. The radiolabeled HB14 probe was boiled approximately 10 min, then added to "minimal hyb" solution. [Note: The "minimal hyb" method is taken from a CERES protocol; "Restriction Fragment Length Polymorphism Laboratory Manual," sections 4-40 and 4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its successors operating as Linkage Genetics.] "Minimal hyb" solution contains 10% w/v PEG (polyethylene glycol, M.W. approx. 8000), 7% w/v SDS, 0.6X SSC, 10 mM sodium phosphate buffer (from a 1 M stock containing 95 g/l NaH2PO4.1H2O and 84.5 g/l Na2HPO4.7H2O), 5 mM EDTA, and 100 mg/ml denatured salmon sperm DNA. Membranes were blotted dry briefly; then, without prehybridization, 5 strips of membrane were placed in each of 2 plastic boxes containing 75 ml of "minimal hyb" and 2.6 ng/ml of radiolabeled HB14 probe. These were incubated overnight with slow shaking (50 rpm) at 60°C. The filters were washed three times for approximately 10 min each at 25°C in "minimal hyb wash solution" (0.25X SSC, 0.2% SDS), followed by two 30-min washes with slow shaking at 60°C in the same solution. The filters were placed on paper covered with Saran Wrap® (Dow Brands, Indianapolis, IN) in a light-tight autoradiographic cassette and exposed to X-Omat X-ray film (Kodak, Rochester, NY) with two DuPont Cronex Lightning-Plus C1 enhancers (Sigma Chemical Co., St. Louis, MO), for 4 hr at -70°C. Upon development (standard photographic procedures), significant signals were evident in both replicates amongst a high background of weaker, more irregular signals. The filters were again washed for about 4 hr at 68°C in "minimal hyb wash solution," then placed again in the cassettes, and film was exposed overnight at -70°C. Twelve possible positives were identified due to strong signals on both of the duplicate 96-well colony impressions. No signal was seen with negative control membranes (colonies of XL1 Blue MR cells containing pWE15), and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
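The probe concentration quoted above can be sanity-checked with simple arithmetic. The sketch below assumes that essentially all of the approximately 400 ng of labeled HB14 product was divided evenly between the two 75 ml boxes; the text does not state this explicitly, and the function name is illustrative only.

```python
def probe_concentration_ng_per_ml(total_probe_ng, n_boxes, volume_per_box_ml):
    """Probe concentration, assuming the probe is split evenly across boxes."""
    total_volume_ml = n_boxes * volume_per_box_ml
    return total_probe_ng / total_volume_ml


# Values taken from the text: ~400 ng probe, 2 boxes, 75 ml "minimal hyb" each.
conc = probe_concentration_ng_per_ml(400, 2, 75)
print(f"{conc:.2f} ng/ml")  # close to the stated 2.6 ng/ml
```

Under that assumption the result is within rounding of the 2.6 ng/ml given in the text.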
The twelve putative hybridization-positive colonies were retrieved from the frozen 96-well library plates and grown overnight at 37°C on solid LB-Amp100 medium. They were then patched (3/plate, plus three negative controls: XL1 Blue MR cells containing the pWE15 vector) onto solid LB-Amp100. Two sets of membranes (Magna NT nylon, 0.45 micron) were prepared for hybridization. The first set was prepared by placing a filter directly onto the colonies on a patch plate, then removing it with adherent bacterial cells, and processing as below. Filters of the second set were placed on plates containing LB-Amp100 medium, then inoculated by transferring cells from the patch plates onto the filters. After overnight growth at 37°C, the filters were removed from the plates and processed.
Bacterial cells on the filters were lysed and DNA denatured by placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N NaOH in a plastic plate for 3 min. The filters were blotted dry on a paper towel, then the process was repeated with fresh 0.5 N NaOH. After blotting dry, the filters were neutralized by placing each on a 1.0 ml pool of 1 M Tris-HCl, pH 7.5 for 3 min, blotted dry, and reneutralized with fresh buffer. This was followed by two similar soakings (5 min each) on pools of 0.5 M Tris-HCl pH 7.5 plus 1.5 M NaCl. After blotting dry, the DNA was UV crosslinked to the filter (as above), and the filters were washed (25°C, 100 rpm) in about 100 ml of 3X SSC plus 0.1% (w/v) SDS (4 times, 30 min each with fresh solution for each wash).
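Wash solutions such as the 3X SSC plus 0.1% SDS used above are normally made by diluting concentrated stocks. The following is a minimal sketch of that C1V1 = C2V2 calculation; the 20X SSC and 10% (w/v) SDS stock strengths are assumptions for illustration (the text specifies only the final concentrations), and the helper name is hypothetical.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final."""
    return final_conc * final_volume_ml / stock_conc


final_volume_ml = 100.0                                  # one wash, about 100 ml
ssc_ml = stock_volume_ml(3.0, 20.0, final_volume_ml)     # 3X from an assumed 20X stock
sds_ml = stock_volume_ml(0.1, 10.0, final_volume_ml)     # 0.1% from an assumed 10% stock
water_ml = final_volume_ml - ssc_ml - sds_ml

print(f"20X SSC: {ssc_ml:.1f} ml, 10% SDS: {sds_ml:.1f} ml, water: {water_ml:.1f} ml")
```

The same helper applies to the other SSC/SDS wash solutions in this example by changing the final concentrations.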
They were then placed in a minimal volume of prehybridization solution [6X SSC plus 1% w/v each of Ficoll 400 (Pharmacia), polyvinylpyrrolidone (av. M.W. 360,000; Sigma), and bovine serum albumin Fraction V (Sigma)] for 2 hr at 65°C, 50 rpm. The prehybridization solution was removed and replaced with the HB14 32P-labeled probe that had been saved from the previous hybridization of the library membranes and which had been denatured at 95°C for 5 min. Hybridization was performed at 60°C for 16 hr with shaking at 50 rpm.
Following removal of the labeled probe solution, the membranes were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC (about 150 ml each wash). They were then washed for 3 hr at 68°C with shaking in 0.25X SSC plus 0.2% SDS (minimal hyb wash solution), and exposed to X-ray film as described above for 1.5 hr at 25°C (no enhancer screens). This exposure revealed very strong hybridization signals to cosmid isolates 22G12, 25A10, 26A5, and 26B10, and a very weak signal with cosmid isolate 8B10. No signal was seen with the negative control (pWE15) colonies, and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
Amplification of a specific genomic fragment of a tcaB gene.
Based on the N-terminal amino acid sequence determined for the purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3), a pool of degenerate oligonucleotides (pool P8F) was synthesized as described for peptide TcaC. The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:

Amino Acid      Leu     Phe Thr Gln     Thr Leu     Lys Glu Ala Arg
P8F          5' (C/T)TI TTT ACI CA(A/G) ACI (C/T)TI AAA GAA GCI (A/C)G 3'

Another set of degenerate oligonucleotides was synthesized (pool P8.108.3R), representing the complement of the coding strand for the determined amino acid sequence of the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID NO:20):

Amino Acid      Met Tyr     Tyr     Ile       Gln     Ala         Gln     Gln
Codons          ATG TA(T/C) TA(T/C) AT(T/C/A) CA(A/G) GC(A/C/G/T) CA(A/G) CA(A/G)
P8.108.3R    3' TAC AT(A/G) AT(A/G) TA(A/G/T) GT(T/C) CGI         GT(T/C) GT 5'
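The relationship between the coding-strand codons and the complementary P8.108.3R pool can be illustrated in a few lines. This is a minimal sketch, not the authors' procedure: the codon list is read from the table above (an assumption where the scan is garbled), the last codon is truncated to its first two bases to match the printed primer end, and a four-fold degenerate position is replaced by inosine (I) and counted as a single species when estimating pool size. All function names are hypothetical.

```python
import re
from math import prod

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def parse_codon(codon):
    """Split e.g. 'TA(T/C)' into per-position lists of allowed bases."""
    positions = []
    for group, single in re.findall(r"\(([^)]*)\)|([ACGT])", codon):
        positions.append(group.split("/") if group else [single])
    return positions


def complement_codon(codon):
    """Complement each position; a four-base mix is written as inosine (I)."""
    out = []
    for bases in parse_codon(codon):
        if len(bases) == 4:
            out.append("I")
        elif len(bases) == 1:
            out.append(COMPLEMENT[bases[0]])
        else:
            out.append("(" + "/".join(COMPLEMENT[b] for b in bases) + ")")
    return "".join(out)


def pool_size(codons):
    """Number of distinct oligos in the pool, counting inosine positions once."""
    return prod(
        1 if len(bases) == 4 else len(bases)
        for codon in codons
        for bases in parse_codon(codon)
    )


# Met Tyr Tyr Ile Gln Ala Gln Gln; the final Gln codon is cut to 'CA'.
codons = ["ATG", "TA(T/C)", "TA(T/C)", "AT(T/C/A)", "CA(A/G)",
          "GC(A/C/G/T)", "CA(A/G)", "CA"]

primer_3_to_5 = " ".join(complement_codon(c) for c in codons)
print("3'", primer_3_to_5, "5'")
print("pool size:", pool_size(codons))
```

Written 3' to 5' under the codons, the complement reads directly off the coding strand, which is why the pool is tabulated in that orientation.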
These oligonucleotides were used as primers for PCR® using HotStart 50 Tubes™ (Molecular Bio-Products, Inc., San Diego, CA) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 µl) contained (bottom layer) 25 pmol of each primer pool P8F and P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X GeneAmp® PCR buffer, and (top layer) 230 ng of genomic template DNA, 8 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of AmpliTaq® DNA polymerase, in 1X GeneAmp® PCR buffer.
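The amounts given for the two layers translate into final concentrations once the layers mix. The sketch below assumes a final reaction volume of 50 µl and that the bottom-layer and top-layer dNTP amounts simply add; the helper name is illustrative.

```python
def micromolar(amount_nmol, volume_ul):
    """Concentration in µM for an amount in nmol dissolved in a volume in µl.

    1 nmol/µl equals 1 mM, i.e. 1000 µM.
    """
    return amount_nmol / volume_ul * 1000.0


reaction_volume_ul = 50.0                     # assumed final volume after the layers mix
dntp_each_nmol = 2 + 8                        # bottom layer + top layer, per dNTP
primer_pool_nmol = 25 / 1000                  # 25 pmol of each primer pool

dntp_uM = micromolar(dntp_each_nmol, reaction_volume_ul)
primer_uM = micromolar(primer_pool_nmol, reaction_volume_ul)

print(f"each dNTP: {dntp_uM:.0f} µM; each primer pool: {primer_uM:.2f} µM")
```

On these assumptions each dNTP ends up at 200 µM, the same dNTP concentration implied by 20 nmol in the 100 µl TcaC probe amplification described earlier.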
Amplifications were performed by 35 cycles as described for the TcaC peptide. Amplification products were analyzed by electrophoresis through 0.7% w/v SeaKem LE agarose (FMC) in TEA buffer. A specific product of estimated size 1600 bp was observed.
Four such reactions were pooled, and the amplified DNA was extracted from a 1.0% SeaKem LE gel by Qiaex kit as described for the TcaC peptide. The extracted DNA was used directly as the template for sequence determination (PRISM™ Sequencing Kit) using the P8F and P8.108.3R primer pools. Each reaction contained about 100 ng template DNA and 25 pmol of one primer pool, and was processed according to standard protocols as described for the TcaC peptide. An analysis of the sequence derived from extension of the P8F primers revealed the short DNA sequence (and encoded amino acid sequence):

GAT GCA TTG NTT   GCT
Asp Ala Leu (Val) Ala

which corresponds to a portion of the N-terminal peptide sequence disclosed as SEQ ID NO:3 (TcaBi).
Labeling of a TcaBi-peptide gene-specific probe.
Approximately 50 ng of gel-purified TcaBi DNA fragment was labeled with 32P-dCTP as described above, and nonincorporated radioisotopes were removed by passage through a NICK Column (Pharmacia). The specific activity of the labeled DNA was determined to be 6 x 10^9 dpm/µg. This labeled DNA was used to probe colony membranes prepared from members of the genomic library that had hybridized to the TcaC-peptide specific probe.
The membranes containing the 12 colonies identified in the TcaC-probe library screen (see above) were stripped of radioactive TcaC-specific label by boiling twice for approximately 30 min each time in 1 liter of 0.1X SSC plus 0.1% SDS. Removal of radiolabel was checked with a 6 hr film exposure. The stripped membranes were then incubated with the TcaBi peptide-specific probe prepared above. The labeled DNA was denatured by boiling for 10 min, and then added to the filters that had been incubated for 1 hr in 100 ml of "minimal hyb" solution at 60°C. After overnight hybridization at this temperature, the probe solution was removed, and the filters were washed as follows (all in 0.3X SSC plus 0.1% SDS): once for 5 min at 25°C, once for 1 hr at 60°C in fresh solution, and once for 1 hr at 63°C in fresh solution. After 1.5 hr exposure to X-ray film by standard procedures, 4 strongly hybridizing colonies were observed. These were, as with the TcaC-specific probe, isolates 22G12, 25A10, 26A5, and 26B10.
The same TcaBi probe solution was diluted with an equal volume (about 100 ml) of "minimal hyb" solution, and then used to screen the membranes containing the 800 members of the genomic library. After hybridization, washing, and exposure to X-ray film as described above, only the four cosmid clones 22G12, 25A10, 26A5, and 26B10 were found to hybridize strongly to this probe.
ISOLATION OF SUBCLONES CONTAINING GENES ENCODING TcaC AND TcaBi PEPTIDES, AND DETERMINATION OF DNA BASE SEQUENCE THEREOF: Three hybridization-positive cosmids in strain XL1 Blue MR were grown with shaking overnight (200 rpm) at 30°C in 100 ml TB-Amp100. After harvesting the cells by centrifugation, cosmid DNA was prepared using a commercially available kit (BIGprep™, 5 Prime 3 Prime, Inc., Boulder, CO), following the manufacturer's protocols. Only one cosmid, 26A5, was successfully isolated by this procedure. When digested with restriction enzyme EcoR I (NEB) and analyzed by gel electrophoresis, fragments of approximate sizes 14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp were detected. A second attempt to isolate cosmid DNA from the same three strains (8 ml cultures; TB-Amp100, 30°C) utilized a boiling miniprep method (Evans G. and G. Wahl, 1987, "Cosmid vectors for genomic walking and rapid restriction mapping," in Guide to Molecular Cloning Techniques, Meth. Enzymology, vol. 152, S. Berger and A. Kimmel, eds., pgs. 604-610). Only one cosmid, 25A10, was successfully isolated by this method. When digested with restriction enzyme EcoR I (NEB) and analyzed by gel electrophoresis, this cosmid showed a fragmentation pattern identical to that previously seen with cosmid 26A5.
A 0.15 µg sample of 26A5 cosmid DNA was used to transform E. coli DH5α cells (Gibco BRL), by the supplier's protocols. A single colony isolate of that strain was inoculated into 4 ml of TB-Amp100 and grown for 8 hr at 37°C. Chloramphenicol was added to a final concentration of 225 µg/ml, incubation was continued for another 24 hr, then cells were harvested by centrifugation and frozen at -20°C. Isolation of the 26A5 cosmid DNA was by a standard alkaline lysis miniprep (Maniatis et al., op. cit., p. 382), modified by increasing all volumes by 50% and with stirring or gentle mixing, rather than vortexing, at every step. After washing the DNA pellet in ethanol, it was dissolved in TE containing 25 µg/ml ribonuclease A (Boehringer Mannheim).
Identification of EcoR I fragments hybridizing to GZ4-derived and TcaBi probes. Approximately 0.4 µg of cosmid 25A10 (from XL1 Blue MR cells) and about 0.5 µg of cosmid 26A5 (from chloramphenicol-amplified DH5α cells) were each digested with about 15 units of EcoR I (NEB) for 85 min, frozen overnight, then heated at 65°C for five min, and electrophoresed in a 0.7% agarose gel (SeaKem LE, 1X TEA, 80 volts, 90 min). The DNA was stained with ethidium bromide as described above, and photographed under ultraviolet light. The EcoR I digest of cosmid 25A10 was a complete digestion, but the sample of cosmid 26A5 was only partially digested under these conditions. The agarose gel containing the DNA fragments was subjected to depurination, denaturation and neutralization, followed by Southern blotting onto a Magna NT nylon membrane, using a high salt (20X SSC) protocol, all as described in section 2.9 of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then UV-crosslinked to the nylon membrane as before.
A TcaC-peptide specific DNA fragment corresponding to the insert of plasmid isolate GZ4 was amplified by PCR® in a 100 µl reaction volume as described above. The amplification products from three such reactions were pooled, extracted from a 1% GTG agarose gel by Qiaex kit as described above, and quantitated by fluorometry. The gel-purified DNA (100 ng) was labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) as described above, to a specific activity of 6.34 x 10" dpm/µg.
The 32P-labeled GZ4 probe was boiled 10 min, then added to "minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane containing the digested cosmid DNA fragments was added and incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The membrane was then washed 3 times at 25°C for about 5 min each (minimal hyb wash solution), followed by two washes for 30 min each at 60°C. The blot was exposed to film (with enhancer screens) for about 30 min at -70°C. The GZ4 probe hybridized strongly to the 5.0 kbp (apparent size) EcoR I fragment of both cosmids, 26A5 and 25A10.
The membrane was stripped of radioactivity by boiling for about 30 min in 0.1X SSC plus 0.1% SDS, and absence of radiolabel was checked by exposure to film. It was then hybridized at 60°C for 3.5 hours with the (denatured) TcaBi probe in "minimal hyb" buffer previously used for screening the colony membranes (above), washed as described previously, and exposed to film for 40 min at -70°C with two enhancer screens. With both cosmids, the TcaBi probe hybridized lightly with the about kbp EcoR I fragment, and strongly with a fragment of approximately 2.9 kbp.
The sample of cosmid 26A5 DNA previously described (from DH5α cells) was used as the source of DNA from which to subclone the bands of interest. This DNA (2.5 µg) was digested with about 3 units of EcoR I (NEB) in a total volume of 30 µl for 1.5 hr, to give a partial digest, as confirmed by gel electrophoresis. Ten µg of pBC KS DNA (Stratagene) were digested for 1.5 hr with EcoR I in a total volume of 20 µl, leading to total digestion as confirmed by electrophoresis. Both EcoR I-cut DNA preparations were diluted to 50 µl with water; to each, an equal volume of PCI was added, the suspension was gently mixed and spun in a microcentrifuge, and the aqueous supernatant was collected. DNA was precipitated by 150 µl ethanol, and the mixture was placed in the cold overnight. Following centrifugation and drying, the EcoR I-digested pBC KS was dissolved in 100 µl TE; the partially digested 26A5 was dissolved in 20 µl TE. DNA recovery was checked by fluorometry.
In separate reactions, approximately 60 ng of EcoR I-digested pBC DNA was ligated with approximately 180 ng or 270 ng of partially digested cosmid 26A5 DNA. Ligations were carried out in a volume of 20 µl at 15°C for 5 hr, using T4 ligase and buffer from New England BioLabs. The ligation mixture, diluted to 100 µl with sterile TE, was used to transform frozen, competent DH5α cells (Gibco BRL) according to the supplier's instructions. Varying amounts (25-200 µl) of the transformed cells were plated on freshly prepared solid LB-Cam35 medium with 1 mM IPTG and 50 mg/l X-gal. Plates were incubated at 37°C for about 20 hr, then chilled in the dark for approximately 3 hr to intensify color for insert selection. White colonies were picked onto patch plates of the same composition and incubated overnight at 37°C.
Two colony lifts of each of the selected patch plates were prepared as follows. After picking white colonies to fresh plates, round Magna NT nylon membranes were pressed onto the patch plates, the membrane was lifted off, and subjected to denaturation, neutralization and UV crosslinking as described above for the library colony membranes. The crosslinked colony lifts were vigorously washed, including gently wiping off the excess cell debris with a tissue. One set was hybridized with the GZ4(TcaC) probe solution described earlier, and the other set was hybridized with the TcaBi probe solution described earlier, according to the 'minimal hyb' protocol, followed by washing and film exposure as described for the library colony membranes.
Colonies showing hybridization signals either only with the GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi probe, were selected for further work, and cells were streaked for single colony isolation onto LB-Cam35 media with IPTG and X-gal as before. Approximately 35 single colonies, from 16 different isolates, were picked into liquid LB-Cam35 media and grown overnight at 37°C; the cells were collected by centrifugation and plasmid DNA was isolated by a standard alkaline lysis miniprep according to Maniatis et al. (op. cit., p. 368). DNA pellets were dissolved in TE containing 25 µg/ml ribonuclease A, and DNA concentration was determined by fluorometry. The EcoR I digestion pattern was analyzed by gel electrophoresis. The following isolates were picked as useful. Isolate A17.2 contains religated pBC KS(+) only and was used as a (negative) control. Isolates D38.3 and C44.1 each contain only the 2.9 kbp, TcaBi-hybridizing EcoR I fragment inserted into pBC KS(+). These plasmids, named pDAB2000 and pDAB2001, respectively, are illustrated in Fig. 2.
Isolate A35.3 contains only the approximately 5 kbp, GZ4-hybridizing EcoR I fragment, inserted into pBC KS(+). This plasmid was named pDAB2002 (also Fig. 2). These isolates provided templates for DNA sequencing.
Plasmids pDAB2000 and pDAB2001 were prepared using the BIGprep™ kit as before. Cultures (30 ml) were grown overnight in TB-Cam35 to an OD600 of 2, then plasmid was isolated according to the manufacturer's directions. DNA pellets were redissolved in 100 µl TE each, and sample integrity was checked by EcoR I digestion and gel electrophoretic analysis.
Sequencing reactions were run in duplicate, with one replicate using pDAB2000 DNA as template, and the other replicate using pDAB2001 DNA as template. The reactions were carried out using the dideoxy dye terminator cycle sequencing method, as described above for the sequencing of the GZ4/HB14 DNAs. Initial sequencing runs utilized as primers the LacZ and T7 primers described above, plus primers based on the determined sequence of the TcaBi PCR amplification product (TH1: ATTGCAGACTGCCAATCGCTTCGG; TH12: GAGAGTATCCAGACCGCGGATGATCTG).
After alignment and editing of each sequencing output, each was truncated to between 250 and 350 bases, depending on the integrity of the chromatographic data as interpreted by the Perkin Elmer Applied Biosystems Division SeqEd 675 software.
Subsequent sequencing "steps" were made by selecting appropriate sequence for new primers. With a few exceptions, primers (synthesized as described above) were 24 bases in length with a G+C composition. Sequencing by this method was carried out on both strands of the approximately 2.9 kbp EcoR 1 fragment.
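When new walking primers are chosen from freshly determined sequence, their length and G+C content are the usual first checks. The snippet below is a minimal sketch of such a check applied to the two TcaBi-derived primers listed above (TH1 and TH12); the function name is illustrative, and no target G+C value is assumed because the figure is not legible in the source.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as printed in the text.
primers = {
    "TH1": "ATTGCAGACTGCCAATCGCTTCGG",
    "TH12": "GAGAGTATCCAGACCGCGGATGATCTG",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} bases, G+C = {gc_fraction(seq):.0%}")
```

TH1 has the 24-base length described for most of the walking primers, while TH12 is one of the longer exceptions.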
To further serve as template for DNA sequencing, plasmid DNA from isolate pDAB2002 was prepared by BIGprep™ kit. Sequencing reactions were performed and analyzed as described above.
Initially, a T3 primer (pBS SK bases 774-796: CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS bases 621-643: GCGCGTAATACGACTCACTATAG) were used to prime the sequencing reactions from the flanking vector sequences, reading into the insert DNA. Another set of primers (GZ4F: GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC; TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204: GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal sequences, which were determined previously by degenerate oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR products. From the data generated during the initial rounds of sequencing, new sets of primers were designed and used to walk the entire length of the ~5 kbp fragment. A total of 55 oligo primers was used, enabling the identification of 4832 total bp of contiguous sequence.
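Internal primers for walking in both directions are typically reverse complements of one another. As a small illustration using only the primer strings printed above, the following sketch shows that TH13 is the reverse complement of GZ4F, so the pair reads outward in opposite directions from the same site. It also shows that LW1-204 is printed with the same sequence as TH13, which may or may not reflect the original document. The helper name is an assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


primers = {
    "GZ4F": "GTATCGATTACAACGCTGTCACTTCCC",
    "TH13": "GGGAAGTGACAGCGTTGTAATCGATAC",
    "LW1-204": "GGGAAGTGACAGCGTTGTAATCGATAC",
}

opposing = reverse_complement(primers["GZ4F"]) == primers["TH13"]
duplicate = primers["LW1-204"] == primers["TH13"]

print("TH13 is the reverse complement of GZ4F:", opposing)
print("LW1-204 is printed identically to TH13:", duplicate)
```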
When the DNA sequence of the EcoR I fragment insert of pDAB2002 was combined with part of the determined sequence of the pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005 bp was generated (disclosed herein as SEQ ID NO:25). When long open reading frames were translated into the corresponding amino acids, the sequence clearly showed the TcaBi N-terminal peptide (disclosed as SEQ ID NO:3), encoded by bases 19-75, immediately following a methionine residue (start of translation). Upstream lies a potential ribosome binding site, and downstream, at bases 166-228, is encoded the TcaBi-PT158 internal peptide (disclosed herein as SEQ ID NO:19). Further downstream, in the same reading frame, at bases 1738-1773, exists a sequence encoding the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID NO:20). Also in the same reading frame, at bases 1897-1923, is encoded the TcaBii N-terminal peptide (disclosed herein as SEQ ID NO:5), and the reading frame continues uninterrupted to a translation termination codon at nucleotides 3586-3588.
The lack of an in-frame stop codon between the end of the sequence encoding TcaBi-PT108 and the start of the TcaBii encoding region, and the lack of a discernible ribosome binding site immediately upstream of the TcaBii coding region, indicate that peptides TcaBii and TcaBi are encoded by a single open reading frame of 3567 bp (beginning at base pair 16 in SEQ ID NO:25), and are most likely derived from a single primary gene product of 1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID NO:26) by post-translational cleavage. If the amino acid immediately preceding the TcaBii N-terminal peptide represents the C-terminal amino acid of peptide TcaBi, then the predicted mass of TcaBi (627 amino acids) is 70,814 Daltons (disclosed herein as SEQ ID NO:28), somewhat higher than the size observed by SDS-PAGE (68 kDa). This peptide would be encoded by a contiguous stretch of 1881 base pairs (disclosed herein as SEQ ID NO:27). It is thought that the native C-terminus of TcaBi lies somewhat closer to the C-terminus of TcaBi-PT108. The molecular mass of PT108 [3.438 kDa; determined during N-terminal amino acid sequence analysis of this peptide] predicts a size of 30 amino acids. Using the size of this peptide to designate the C-terminus of the TcaBi coding region [Glu at position 604 of SEQ ID NO:28], the derived size of TcaBi is determined to be 604 amino acids or 68,463 Daltons, more in agreement with experimental observations.
Translation of the TcaBii peptide coding region of 1686 base pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562 amino acids (disclosed herein as SEQ ID NO:30) with predicted mass of 60,789 Daltons, which corresponds well with the observed 61 kDa.
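The lengths quoted for the tcaB open reading frame can be cross-checked against one another. The sketch below uses only numbers stated in the text (the 3567 bp reading frame starting at base 16, the 1881 bp segment of SEQ ID NO:27, and the 1686 bp TcaBii region of SEQ ID NO:29); variable and function names are illustrative.

```python
ORF_START = 16          # first base of the open reading frame in SEQ ID NO:25
ORF_BP = 3567           # whole reading frame (SEQ ID NO:26 is its translation)
N_TERMINAL_BP = 1881    # segment disclosed as SEQ ID NO:27
TCABII_BP = 1686        # TcaBii coding region, SEQ ID NO:29


def residues(n_bp):
    """Number of amino acids encoded by n_bp bases (must be whole codons)."""
    if n_bp % 3:
        raise ValueError(f"{n_bp} bp is not a whole number of codons")
    return n_bp // 3


total_aa = residues(ORF_BP)
n_terminal_aa = residues(N_TERMINAL_BP)
tcabii_aa = residues(TCABII_BP)
tcabii_first_base = ORF_START + N_TERMINAL_BP

print(total_aa, n_terminal_aa, tcabii_aa)   # the two parts should sum to the total
print("TcaBii region starts at base", tcabii_first_base)
```

The two segments partition the reading frame exactly, and the computed start of the second segment coincides with base 1897, where the TcaBii N-terminal peptide is reported to begin.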
A potential ribosome binding site (bases 3633-3638) is found 48 bp downstream of the stop codon for the tcaB open reading frame. At bases 3645-3677 is found a sequence encoding the N-terminus of peptide TcaC (disclosed as SEQ ID NO:2). The open reading frame initiated by this N-terminal peptide continues uninterrupted to base 6005 (2361 base pairs, disclosed herein as the first 2361 base pairs of SEQ ID NO:31). A gene (tcaC) encoding the entire TcaC peptide (apparent size ~165 kDa; ~1500 amino acids) would comprise about 4500 bp.
Another isolate containing cloned EcoR I fragments of cosmid 26A5, E20.6, was also identified by its homology to the previously mentioned GZ4 and TcaBi probes. Agarose gel analysis of EcoR I digests of the DNA of the plasmid harbored by this strain (pDAB2004, Fig.) revealed insert fragments of estimated sizes 2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from primers designed from the sequence of plasmid pDAB2002 revealed that the 3.3 kbp EcoR I fragment of pDAB2004 lies adjacent to the ~5 kbp EcoR I fragment represented in pDAB2002. The 2361 base pair open reading frame discovered in pDAB2002 continues uninterrupted for another 2094 bases in pDAB2004 [disclosed herein as base pairs 2362 to 4458 of SEQ ID NO:31]. DNA sequence analysis using the parent cosmid 26A5 DNA as template confirmed the continuity of the open reading frame. Altogether, the open reading frame (TcaC, SEQ ID NO:31) comprises 4455 base pairs, and encodes a protein (TcaC) of 1485 amino acids [disclosed herein as SEQ ID NO:32]. The calculated molecular size of 166,214 Daltons is consistent with the estimated size of the TcaC peptide (165 kDa), and the derived amino acid sequence matches exactly that disclosed for the TcaC N-terminal sequence [SEQ ID NO:2].
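The tcaC coordinates and the size estimate can be followed with a short calculation. This is a sketch under stated assumptions: the coordinates are those given in the text, and the average residue mass of 110 Da is a common rule of thumb introduced here for illustration, not a value taken from the source.

```python
ORF_START = 3645        # first base encoding the TcaC N-terminus (SEQ ID NO:25)
FRAGMENT_END = 6005     # last base of the contiguous 6005 bp sequence
EXTENSION_BP = 2094     # further coding bases found in pDAB2004

first_part_bp = FRAGMENT_END - ORF_START + 1      # inclusive coordinates
orf_bp = first_part_bp + EXTENSION_BP
n_residues = orf_bp // 3
orf_with_stop_bp = orf_bp + 3                     # coding bases plus stop codon

# Rule-of-thumb estimate from the apparent mass (assumed ~110 Da per residue).
AVG_RESIDUE_DA = 110
est_residues = round(165_000 / AVG_RESIDUE_DA)
est_gene_bp = est_residues * 3

print(first_part_bp, orf_bp, n_residues, orf_with_stop_bp)
print("estimate from 165 kDa:", est_residues, "residues,", est_gene_bp, "bp")
print(f"mean residue mass from the sequence: {166_214 / n_residues:.1f} Da")
```

The coordinate arithmetic reproduces the 2361 bp, 4455 bp, and 1485 amino acid figures, and the last base of the stop codon falls at position 4458 of SEQ ID NO:31, matching the upper bound quoted for the pDAB2004 portion.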
The lack of an amino acid sequence corresponding to SEQ ID NO:17 (used to design the degenerate oligonucleotide primer pool) in the discovered sequence indicates that the PCR® products found in isolates GZ4 and HB14, which were used as probes in the initial library screen, were fortuitously generated by reverse-strand priming by one of the primers in the degenerate pool. Further, the derived protein sequence does not include the internal fragment disclosed herein as SEQ ID NO:18. These sequences reveal that plasmid pDAB2004 contains the complete coding region for the TcaC peptide.
Example 9
Screening of the Photorhabdus genomic library for genes encoding the TcbAii peptide
This example describes a method used to identify DNA clones that contain the TcbAii peptide-encoding genes, the isolation of the gene, and the determination of its partial DNA base sequence.
Primers and PCR reactions
The TcbAii polypeptide of the insect active preparation is ~206 kDa. The amino acid sequence of the N-terminus of this peptide is disclosed as SEQ ID NO:1. Four pools of degenerate oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and TH-7) were synthesized to encode a portion of this amino acid sequence, as described in Example 8, and are shown below.
Table 11
Amino Acid     Phe     Ile Gln     Gly Tyr     Ser     Asp     Leu     Phe
TH-4       5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) TCI     GA(T/C) CTI     TT-3'
TH-5       5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) AG(T/C) GA(T/C) CTI     TT-3'
TH-6       5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) TCI     GA(T/C) TT(A/G) TT-3'
TH-7       5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) AG(T/C) GA(T/C) TT(A/G) TT-3'

In addition, a primary and a secondary sequence of an internal peptide preparation (TcbAii-PT81) have been determined and are disclosed herein as SEQ ID NO:23 and SEQ ID NO:24, respectively. Four pools of degenerate oligonucleotides ("Reverse primers": TH-8, TH-9, TH-10 and TH-11) were similarly designed and synthesized to encode the reverse complement of sequences that encode a portion of the peptide of SEQ ID NO:23, as shown below.
Table 12
Amino Acid     Thr Tyr     Leu     Thr Ser     Phe     Glu     Gln     Val Ala Asn
TH-8       3'- TGI AT(A/G) GAI     TGI AGI     AA(A/G) CT(T/C) GT(T/C) CAI CGI
TH-9       3'- TGI AT(A/G) TT(A/G) TGI AGI     AA(A/G) CT(T/C) GT(T/C) CAI CGI TT(G/A)-5'
TH-10      3'- TGI AT(A/G) GAI     TGI TC(G/A) AA(A/G) CT(T/C) GT(T/C) CAI CGI
TH-11      3'- TGI AT(A/G) TT(A/G) TGI TC(G/A) AA(A/G) CT(T/C) GT(T/C) CAI CGI

Sets of these primers were used in PCR® reactions to amplify TcbAii-encoding gene fragments from the genomic Photorhabdus luminescens W-14 DNA prepared in Example 6. All PCR® reactions were run with the "Hot Start" technique using AmpliWax™ gems and other Perkin Elmer reagents and protocols. Typically, a mixture (total volume 11 µl) of MgCl2, dNTPs, 10X GeneAmp® PCR Buffer II, and the primers was added to tubes containing a single wax bead.
[10X GeneAmp® PCR Buffer II is composed of 100 mM Tris-HCl, pH 8.3; and 500 mM KCl.] The tubes were heated to 80°C for 2 minutes and allowed to cool. To the top of the wax seals, a solution containing 10X GeneAmp® PCR Buffer II, DNA template, and AmpliTaq® DNA polymerase was added. Following melting of the wax seal and mixing of components by thermal cycling, final reaction conditions (volume of 50 µl) were: 10 mM Tris-HCl, pH 8.3; 50 mM KCl; 2.5 mM MgCl2; 200 µM each in dATP, dCTP, dGTP, dTTP; 1.25 µM in a single Forward primer pool; 1.25 µM in a single Reverse primer pool; 1.25 units of AmpliTaq® DNA polymerase; and 170 ng of template DNA.
The reactions were placed in a thermocycler (as in Example 8) and run with the following program:

Table 13
Temperature    Time          Cycle Repetition
94°C           2 minutes     1X
94°C           15 seconds
55-65°C        30 seconds
72°C           1 minute
72°C           7 minutes     1X
hold           Constant

A series of amplifications was run at three different annealing temperatures (55°, 60°, 65°C) using the degenerate primer pools. Reactions with annealing at 65°C had no amplification products visible following agarose gel electrophoresis. Reactions having a 60°C annealing regime and containing primers TH-5+TH-10 produced an amplification product that had a mobility corresponding to 2.9 kbp. A lesser amount of the 2.9 kbp product was produced under these conditions with primers TH-7+TH-10. When reactions were annealed at 55°C, these primer pairs produced more of the 2.9 kbp product, and this product was also produced by primer pairs TH-5+TH-8 and TH-5+TH-11. Additional very faint 2.9 kbp bands were seen in lanes containing amplification products from primer pairs TH-7 plus TH-8, TH-9, TH-10, or TH-11.
To obtain sufficient PCR amplification product for cloning and DNA sequence determination, 10 separate PCR reactions were set up using the primers TH-5+TH-10, and were run using the above conditions with a 55 0 C annealing temperature. All reactions were pooled and the 2.9 kbp product was purified by Qiaex extraction from an agarose gel as described above.
Additional sequences determined for TcbAii internal peptides are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As before, degenerate oligonucleotides (Reverse primers TH-17 and TH-18) were made corresponding to the reverse complement of sequences that encode a portion of the amino acid sequence of these peptides.
Table 14
From SEQ ID NO:21
Amino Acid      Met Glu   Thr Gln   Asn   Ile Gln   Glu   Pro
TH-17        3'-TAC CTT/C TGI GTT/C TTA/G TAI GTT/C GTT/C

Table 15
From SEQ ID NO:22
Amino Acid      Asn     Pro Ile Asn     Ile Asn     Thr Gly Ile Asp
TH-18        3'-TT(A/G) GGI TAI TT(A/G) TAI TT(A/G) TGI CCI TAI

Degenerate oligonucleotides TH-18 and TH-17 were used in an amplification experiment with Photorhabdus luminescens W-14 DNA as template and primers TH-4, TH-5, TH-6, or TH-7 as the (Forward) primers. These reactions amplified products of approximately 4 kbp and 4.5 kbp, respectively. These DNAs were transferred from agarose gels to nylon membranes and hybridized with a 32P-labeled probe (as described above) prepared from the 2.9 kbp product amplified by the TH-5+TH-10 primer pair. Both the 4 kbp and the 4.5 kbp amplification products hybridized strongly to the 2.9 kbp probe. These results were used to construct a map ordering the TcbAii internal peptide sequences as shown in Fig. 3. Approximate distances between the primers are shown in nucleotides in Fig. 3.
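As an illustration of what a degenerate primer pool means in practice, the following Python sketch enumerates the individual oligonucleotides implied by the TH-17 and TH-18 notations in Tables 14 and 15. It assumes that each "X/Y" pair marks a two-fold degenerate position and that inosine (I) is a single non-degenerate base; the function name expand_pool is hypothetical and is not part of the described procedure.

```python
import re
from itertools import product

# Primer strings as printed in Tables 14 and 15 (written 3'->5'; I = inosine).
TH17 = "TAC CTT/C TGI GTT/C TTA/G TAI GTT/C GTT/C"
TH18 = "TT(A/G) GGI TAI TT(A/G) TAI TT(A/G) TGI CCI TAI"


def expand_pool(primer: str) -> list[str]:
    """Enumerate every oligonucleotide in a degenerate pool.

    An 'X/Y' pair (with or without parentheses) is treated as a two-fold
    degenerate position; inosine is kept as the single symbol 'I'.
    """
    compact = re.sub(r"[()\s]", "", primer)
    # Split into fixed stretches and 'X/Y' alternatives, keeping the latter.
    parts = re.split(r"([ACGTI]/[ACGTI])", compact)
    choices = [p.split("/") if "/" in p else [p] for p in parts]
    return ["".join(combo) for combo in product(*choices)]


th17_pool = expand_pool(TH17)
th18_pool = expand_pool(TH18)
print(len(th17_pool), len(th17_pool[0]))  # 32 24
print(len(th18_pool), len(th18_pool[0]))  # 8 27
```

Under these assumptions the TH-17 pool contains 32 distinct 24-mers and the TH-18 pool contains 8 distinct 27-mers, which is why inosine is used at the four-fold degenerate positions to keep the pools small.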
DNA Sequence of the 2.9 kbp TcbAii-encoding fragment
Approximately 200 ng of the purified 2.9 kbp fragment (prepared above) was precipitated with ethanol and dissolved in 17 µl of water. One-half of this was used as sequencing template with 25 pmol of the TH-5 pool as primers; the other half was used as template for TH-10 priming. Sequencing reactions were as given in Example 8. No reliable sequence was produced using the TH-10 primer pool; however, reactions with the TH-5 primer pool produced the sequence disclosed below:
  1 AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT
 61 TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AANTGGAATG AAAGAAGTTC AATTTNTTAC
121 CTAGATAAAC GTCGCCCGGN TTTAGAAAGN TTANTGNTCA GCCAGAAAAT TTTGGTTGAG
181 GAAATTCCAC CGNTGGTTCT CTCTATTGAT TNGGGCCTGG CCGGGTTCGA ANNAAAACNA
241 GGAAATNCAC AAGTTGAGGT GATGGNTTTG TNGCNANCTT NTCGTTTAGG TGGGGAGAAA
301 CCTTNTCANC ACGNTTNTGA AACTGTCCGG GAAATCGTCC ATGANCGTGA NCCAGGNTTN
361 CGCCATTGG

Based on this sequence, a sequencing primer (TH-21, 5'-CCGGGCGACGTTTATCTAGG-3') was designed to reverse complement bases 120-139, and to initiate polymerization towards the 5' end (TH-5 end) of the gel-purified 2.9 kbp TcbAii-encoding PCR fragment. The determined sequence is shown below, and is compared to the biochemically determined N-terminal peptide sequence of TcbAii, SEQ ID NO:1.
TcbAii 2.9 kbp PCR fragment Sequence Confirmation
(Underlined amino acids encoded by degenerate oligonucleotides)

SEQ ID NO:1      F I Q G Y S D L F G A
                 | | | | | | | | | |
2.9 kbp seq      GC ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT
                     M   Q   G   Y   S   D   L   F   G   N   R   A

From the homology of the derived amino acid sequence to the biochemically determined one, it is clear that the 2.9 kbp PCR fragment represents the TcbA coding region. This 2.9 kbp fragment was then used as a hybridization probe to screen the Photorhabdus W-14 genomic library prepared in Example 8 for cosmids containing the TcbAii-encoding gene.
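The relationship between the TH-21 primer and bases 120-139 of the TH-5-primed read, and the translation of the confirmed 2.9 kbp sequence, can be checked with a short Python sketch. The helper names (bases, reverse_complement, translate) are illustrative only; the sequence rows are copied from the read quoted above, and the standard genetic code is assumed.

```python
# Rows 61-120 and 121-180 of the TH-5-primed read (N = unresolved base).
ROW_61 = "TATTNGAGGGANTNGTCCCGTGAGGCCAAAAANTGGAATGAAAGAAGTTCAATTTNTTAC"
ROW_121 = "CTAGATAAACGTCGCCCGGNTTTAGAAAGNTTANTGNTCAGCCAGAAAATTTTGGTTGAG"
READ_61_180 = ROW_61 + ROW_121

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def bases(start: int, end: int) -> str:
    """Return bases start..end (1-based, inclusive) from the rows above."""
    return READ_61_180[start - 61 : end - 60]


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def translate(dna: str) -> str:
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(0, len(dna) - 2, 3))


th21 = reverse_complement(bases(120, 139))
print(th21)  # CCGGGCGACGTTTATCTAGG

# Codons of the confirmed 2.9 kbp sequence, without the leading 'GC'.
fragment = "ATGCAGGGGTATAGTGACCTGTTTGGTAATCGTGCT"
print(translate(fragment))  # MQGYSDLFGNRA
```

The reverse complement of bases 120-139 reproduces the stated TH-21 sequence, and the translated codons give the residues shown beneath the 2.9 kbp sequence.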
Screening the Photorhabdus cosmid library
The 2.9 kb gel-purified PCR fragment was labeled with 32P using the Boehringer Mannheim High Prime labeling kit as described in Example 8. Filters containing remnants of approximately 800 colonies from the cosmid library were screened as described previously (Example ), and positive clones were streaked for isolated colonies and rescreened. Three clones (8A11, 25G8, and 26D1) gave positive results through several screening and characterization steps. No hybridization of the TcbAii-specific probe was ever observed with any of the four cosmids identified in Example 8, which contain the tcaB and tcaC genes. DNA from cosmids 8A11, 25G8, and 26D1 was digested with restriction enzymes Bgl II, EcoR I or Hind III (either alone or in combination with one another), and the fragments were separated on an agarose gel and transferred to a nylon membrane as described in Example 8. The membrane was hybridized with 32P-labeled probe prepared from the 4.5 kbp fragment (generated by amplification of Photorhabdus genomic DNA with primers that included TH-17). The patterns generated from cosmid DNAs 8A11 and 26D1 were identical to those generated with similarly-cut genomic DNA on the same membrane. It is concluded that cosmids 8A11 and 26D1 are accurate representations of the genomic TcbAii-encoding locus. However, cosmid 25G8 has a single Bgl II fragment which is slightly larger than the corresponding genomic DNA fragment. This may result from positioning of the insert within the vector.
DNA sequence of the tcbA-encoding gene
The membrane hybridization analysis of cosmid 26D1 revealed that the 4.5 kbp probe hybridized to a single large EcoR I fragment (greater than 9 kbp). This fragment was gel purified and ligated into the EcoR I site of pBC KS as described in Example 8, to generate plasmid pBC-SI/Rl. The partial DNA sequence of the insert DNA of this plasmid was determined by "primer walking" from the flanking vector sequence, using procedures described in Example 8. Further sequence was generated by extension from new oligonucleotides designed from the previously determined sequence. When compared to the determined DNA sequence for the tcbA gene identified by other methods (disclosed herein as SEQ ID NO:11 as described in Example 12 below), complete homology was found to nucleotides 1-272, 319-826, 2578-3036, and 3068-3540 (total bases 1712). It was concluded that both approaches can be used to identify DNA fragments encoding the TcbAii peptide.
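The stated total of 1712 matching bases follows from summing the four nucleotide ranges as inclusive intervals, as this brief Python sketch shows (the variable names are illustrative):

```python
# Nucleotide ranges of SEQ ID NO:11 with complete homology (1-based, inclusive).
matched_ranges = [(1, 272), (319, 826), (2578, 3036), (3068, 3540)]

lengths = [end - start + 1 for start, end in matched_ranges]
total = sum(lengths)

print(lengths)  # [272, 508, 459, 473]
print(total)    # 1712
```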
Analysis of the derived amino acid sequence of the tcbA gene.
The sequence of the DNA fragment identified as SEQ ID NO:11 encodes a protein whose derived amino acid sequence is disclosed herein as SEQ ID NO:12. Several features verify the identity of the gene as that encoding the TcbAii protein. The TcbAii N-terminal peptide (SEQ ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala) is encoded as amino acids 88-100. The TcbAii internal peptide TcbAii-PT81(a) (SEQ ID NO:23) is encoded as amino acids 1065-1077, and TcbAii-PT81(b) (SEQ ID NO:24) is encoded as amino acids 1571-1592. Further, the internal peptide TcbAii-PT56 (SEQ ID NO:22) is encoded as amino acids 1474-1488, and the internal peptide TcbAii-PT103 (SEQ ID NO:24) is encoded as amino acids 1614-1639. It is obvious that this gene is an authentic clone encoding the TcbAii peptide as isolated from insecticidal protein preparations of Photorhabdus luminescens strain W-14.
The protein isolated as peptide TcbAii is derived from cleavage of a longer peptide. Evidence for this is provided by the fact that the nucleotides encoding the TcbAii N-terminal peptide SEQ ID NO:1 are preceded by 261 bases (encoding 87 N-terminal-proximal amino acids) of a longer open reading frame (SEQ ID NO:11). This reading frame begins with nucleotides that encode the amino acid sequence Met Gln Asn Ser Leu, which corresponds to the N-terminal sequence of the large peptide TcbA, and is disclosed herein as SEQ ID NO:16. It is thought that TcbA is the precursor protein for TcbAii.
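The coordinates quoted for the TcbAii N-terminal peptide are mutually consistent: 261 upstream bases correspond to 87 codons, so the 13-residue peptide of SEQ ID NO:1 occupies residues 88-100 of the open reading frame. A minimal Python sketch of that arithmetic, with hypothetical helper names, is:

```python
def first_residue_after(upstream_bases: int) -> int:
    """1-based residue number of the first codon after an in-frame upstream stretch."""
    if upstream_bases % 3 != 0:
        raise ValueError("upstream stretch is not a whole number of codons")
    return upstream_bases // 3 + 1


# SEQ ID NO:1 in one-letter code (Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala).
n_terminal_peptide = "FIQGYSDLFGNRA"

start = first_residue_after(261)
end = start + len(n_terminal_peptide) - 1
print(start, end)  # 88 100
```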
Relationship of tcbA, tcaB and tcaC genes.
The tcaB and tcaC genes are closely linked and may be transcribed as a single mRNA (Example ). The tcbA gene is borne on cosmids that apparently do not overlap the ones harboring the tcaB and tcaC cluster, since the respective genomic library screens identified different cosmids. However, comparison of the amino acid sequences encoded by the tcaB and tcaC genes with the tcbA gene reveals a substantial degree of homology. The amino acid conservation (Protein Alignment Mode of MacVector™ Sequence Analysis Software, scoring matrix pam250, hash value 2; Kodak Scientific Imaging Systems, Rochester, NY) is shown in Fig. 4.
On the score line of each panel in Fig. 4, up carets (^) indicate homology or conservative amino acid changes, and down carets (v) indicate nonhomology.
This analysis shows that the amino acid sequence of the TcbA peptide from residues 1739 to 1894 is highly homologous to amino acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino acids of P8; SEQ ID NO:28). In addition, the sequence of TcbA amino acids 1932 to 2459 is highly homologous to amino acids 12 to 531 of peptide TcaBii (520 of the total 562 amino acids; SEQ ID NO:30). Considering that the TcbA peptide (SEQ ID NO:12) comprises 2505 amino acids, a total of 684 amino acids at the C-proximal end of it is homologous to the TcaBi or TcaBii peptides, and the homologies are arranged colinear to the arrangement of the putative TcaB preprotein (SEQ ID NO:26). A sizeable gap in the TcbA homology coincides with the junction between the TcaBi and TcaBii portions of the TcaB preprotein.
Clearly the TcbA and TcaB gene products are evolutionarily related, and it is proposed that they share some common function(s) in Photorhabdus.
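The figure of 684 homologous amino acids can be reproduced from the two TcbA residue ranges given above, counted as inclusive intervals. The following Python sketch is illustrative only; the dictionary keys and variable names are not taken from the source.

```python
TCBA_LENGTH = 2505  # residues in SEQ ID NO:12

# TcbA residue ranges (1-based, inclusive) homologous to TcaBi and TcaBii.
tcba_segments = {"TcaBi-like": (1739, 1894), "TcaBii-like": (1932, 2459)}

segment_lengths = {name: end - start + 1 for name, (start, end) in tcba_segments.items()}
total_homologous = sum(segment_lengths.values())

# Residues lying between the two homologous segments.
gap = tcba_segments["TcaBii-like"][0] - tcba_segments["TcaBi-like"][1] - 1

print(segment_lengths)   # {'TcaBi-like': 156, 'TcaBii-like': 528}
print(total_homologous)  # 684
print(gap)               # 37
print(round(100 * total_homologous / TCBA_LENGTH, 1))  # 27.3
```

On this count, roughly 27% of TcbA, all within its C-proximal portion, aligns with the TcaB preprotein, with a 37-residue unaligned stretch between the two segments.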
Example
Characterization of zinc-metalloproteases in Photorhabdus Broth: Protease Inhibition, Classification, and Purification

Protease Inhibition and Classification Assays: Protease assays were performed using FITC-casein dissolved in water as substrate (0.08% final assay concentration). Proteolysis reactions were performed at 25 °C for 1 h in the appropriate buffer with 25 µl of Photorhabdus broth (150 µl total reaction volume). Samples were also assayed in the presence and absence of dithiothreitol. After incubation, an equal volume of 12% trichloroacetic acid was added to precipitate undigested protein.
Following precipitation for 0.5 h and subsequent centrifugation, 100 µl of the supernatant was placed into a 96-well microtiter plate and the pH of the solution was adjusted by addition of an equal volume of 4N NaOH. Proteolysis was then quantitated using a Fluoroskan II fluorometric plate reader at excitation and emission wavelengths of 485 and 538 nm, respectively. Protease activity was tested over a range from pH 5.0-10.0 in 0.5 unit increments. The following buffers were used at 50 mM final concentration: sodium acetate (pH 5.0 Tris-HCl (pH 7.0 and bis-Tris propane (pH 8.5-10.0). To identify the class of protease(s) observed, crude broth was treated with a variety of protease inhibitors (0.5 µg/µl final concentration) and then examined for protease activity at pH 8.0 using the substrate described above. The protease inhibitors used included E-64 (L-trans-epoxysuccinylleucylamido[4-guanidino]-butane), 3,4 dichloroisocoumarin, leupeptin, pepstatin, amastatin, ethylenediaminetetraacetic acid (EDTA) and 1,10 phenanthroline.
Protease assays performed over a pH range revealed that protease(s) were indeed present which exhibited maximal activity at pH 8.0 (Table 16). Addition of DTT did not have any effect on protease activity. Crude broth was then treated with a variety of protease inhibitors (Table 17). Treatment of crude broth with the inhibitors described above revealed that 1,10 phenanthroline caused complete inhibition of all protease activity when added at a final concentration of 50 µg, with the 5 µg in 100 µl of a 2 mg/ml crude broth solution. These data indicate that the most abundant protease(s) found in the Photorhabdus broth are from the zinc-metalloprotease class of enzymes.
Table 16
Effect of pH on the protease activity found in a Day 1 production of Photorhabdus luminescens (strain W-14).

pH      Flu. Units (a)     Percent Activity (b)
5.0     3013 ± 78          17
5.5     7994 ± 448
6.0     12965 ± 483        74
6.5     14390 ± 1291       82
7.0     14386 ± 1287       82
7.5     14135 ± 198
8.0     17582 ± 831        100
8.5     16183 ± 953        92
9.0     16795 ± 760        96
9.5     16279 ± 1022       93
10.0    15225 ± 210        87

a. Flu. Units = Fluorescence Units (Maximum ~28,000; background ~2200).
b. Percent activity relative to the maximum at pH 8.0.

Table 17
Effect of different protease inhibitors on the protease activity at pH 8 found in a Day 1 production of Photorhabdus luminescens (strain W-14).

Inhibitor                      Corrected Flu. Units (a)    Percent Inhibition (b)
Control                        13053                       0
E-64                           14259                       0
1,10 Phenanthroline (c)        15                          99
3,4 Dichloroisocoumarin (d)    7956                        39
Leupeptin                      13074                       0
Pepstatin (c)                  13441                       0
Amastatin                      12474                       4
DMSO Control                   12005                       8
Methanol Control               12125                       7

a. Corrected Flu. Units = Fluorescence Units - background (2200 flu. units).
b. Percent Inhibition relative to protease activity at pH 8.0.
c. Inhibitors were dissolved in methanol.
d. Inhibitors were dissolved in DMSO.
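The percentage columns of Tables 16 and 17 can be recomputed from the fluorescence values, as in the following Python sketch. It rests on two assumptions that are not stated outright in the tables: that the eleven Table 16 rows correspond to pH 5.0 through 10.0 in 0.5-unit steps, and that a treatment reading above the control is reported as 0% inhibition. The function names are hypothetical.

```python
# Mean fluorescence units from Table 16, keyed by pH (pH assignment assumed
# to follow the 0.5-unit steps described for the assay).
TABLE_16 = {
    5.0: 3013, 5.5: 7994, 6.0: 12965, 6.5: 14390, 7.0: 14386, 7.5: 14135,
    8.0: 17582, 8.5: 16183, 9.0: 16795, 9.5: 16279, 10.0: 15225,
}

# Background-corrected fluorescence units from Table 17.
CONTROL = 13053
TABLE_17 = {
    "E-64": 14259,
    "1,10 Phenanthroline": 15,
    "3,4 Dichloroisocoumarin": 7956,
    "Leupeptin": 13074,
    "Pepstatin": 13441,
    "Amastatin": 12474,
    "DMSO Control": 12005,
    "Methanol Control": 12125,
}


def percent_activity(units_by_ph: dict[float, float]) -> dict[float, float]:
    """Activity at each pH as a percentage of the highest reading."""
    peak = max(units_by_ph.values())
    return {ph: 100 * units / peak for ph, units in units_by_ph.items()}


def percent_inhibition(treated: float, control: float) -> float:
    """Loss of activity relative to the control, floored at zero."""
    return max(0.0, 100 * (1 - treated / control))


activity = percent_activity(TABLE_16)
for ph, pct in activity.items():
    print(f"pH {ph:>4}: {pct:5.1f}%")

for name, units in TABLE_17.items():
    print(f"{name:<24} {percent_inhibition(units, CONTROL):5.1f}%")
```

The recomputed activities agree with the tabulated percentages after rounding, and the recomputed inhibition values agree with Table 17 to within about one percentage point.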
The isolation of a zinc-metalloprotease was performed by applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose column equilibrated at 50 mM Na2PO4, pH 7.0 as described in Example 5 for Photorhabdus toxin. After extensive washing, a 0 to 0.5 M NaCl gradient was used to elute toxin protein. The majority of biological activity and protein was eluted from 0.15-0.45 M NaCl. However, it was observed that the majority of proteolytic activity was present in the 0.25-0.35 M NaCl fraction with some activity in the 0.15-0.25 M NaCl fraction. SDS PAGE analysis of the 0.25-0.35 M NaCl fraction showed a major peptide band of approximately 60 kDa. The 0.15-0.25 M NaCl fraction contained a similar 60 kDa band but at lower relative protein concentration. Subsequent gel filtration of this fraction using a Superose 12 HR 16/50 column resulted in a major peak migrating at 57.5 kDa that contained a predominant (90% of total stained protein) 58.5 kDa band by SDS PAGE analysis. Additional analysis of this fraction using various protease inhibitors as described above determined that the protease was a zinc-metalloprotease.
Nearly all of the protease activity present in Photorhabdus broth at day 1 of fermentation corresponded to the ~58 kDa zinc-metalloprotease.
In yet a second isolation of zinc-metalloprotease(s), W-14 Photorhabdus broth grown for three days was taken and protease activity was visualized using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) laced with gelatin as described in Schmidt, Bleakley, B. and Nealson, K.M. 1988. SDS running gels (5.5 x 8 cm) were made with 12.5% polyacrylamide (40% stock solution of acrylamide/bis-acrylamide; Sigma Chemical Co., St. Louis, MO) into which 0.1% gelatin final concentration (Biorad EIA grade reagent; Richmond CA) was incorporated upon dissolving in water. SDS-stacking gels (1.0 x 8 cm) were made with 5% polyacrylamide, also laced with 0.1% gelatin. Typically, 2.5 µg of protein to be tested was diluted in 0.03 ml of SDS-PAGE loading buffer without dithiothreitol (DTT) and loaded onto the gel. Proteins were electrophoresed in SDS running buffer (Laemmli, U.K. 1970. Nature 227, 680) at 0 °C and at 8 mA. After electrophoresis was complete, the gel was washed for 2 h in 2.5% Triton X-100. Gels were then incubated for 1 h at 37 °C in 0.1 M glycine (pH ). After incubation, gels were fixed and stained overnight with 0.1% amido black in methanol-acetic acid-water (30:10:60, vol./vol./vol.; Sigma Chemical Co.). Protease activity was visualized as light areas against a dark, amido black stained background due to proteolysis and subsequent diffusion of incorporated gelatin. At least three distinct bands produced by proteolytic activity at 58, 41, and 38 kDa were observed.
Activity assays of the different proteases in W-14 day three culture broth were performed using FITC-casein dissolved in water as substrate (0.02% final assay concentration). Proteolysis experiments were performed at 37 °C for 0-0.5 h in 0.1 M Tris-HCl (pH 8.0) with different protein fractions in a total volume of 0.15 ml. Reactions were terminated by addition of an equal volume of 12% trichloroacetic acid (TCA) dissolved in water.
After incubation at room temperature for 0.25 h, samples were centrifuged at 10,000 x g for 0.25 h and 0.10 ml aliquots were removed and placed into 96-well microtiter plates. The solution was then neutralized by the addition of an equal volume of 2 N sodium hydroxide, followed by quantitation using a Fluoroskan II fluorometric plate reader with excitation and emission wavelengths of 485 and 538 nm, respectively. Activity measurements were performed using FITC-casein with different protease concentrations at 37 °C for 0-10 min. A unit of activity was arbitrarily defined as the amount of enzyme needed to produce 1000 fluorescent units/min and specific activity was defined as units/mg of protease.
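The unit definition above translates directly into a small calculation: the rate of fluorescence increase (fluorescent units per minute) divided by 1000 gives units of activity, and dividing by the mass of protease gives specific activity. The Python sketch below illustrates this with a least-squares slope; the time course and the protease mass are invented for illustration and are not data from this example.

```python
def slope_per_min(times_min: list[float], fluorescence: list[float]) -> float:
    """Least-squares slope of fluorescence against time (flu. units/min)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_f = sum(fluorescence) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_min, fluorescence))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


def activity_units(rate_flu_per_min: float) -> float:
    """One unit is defined as 1000 fluorescent units produced per minute."""
    return rate_flu_per_min / 1000.0


def specific_activity(units: float, protease_mg: float) -> float:
    """Specific activity in units per mg of protease."""
    return units / protease_mg


# Hypothetical 0-10 min time course (illustrative values only).
times = [0, 2, 4, 6, 8, 10]
readings = [2200, 4200, 6200, 8200, 10200, 12200]
protease_mg = 0.002  # hypothetical amount of protease in the reaction

units = activity_units(slope_per_min(times, readings))
print(units)                                  # 1.0
print(specific_activity(units, protease_mg))  # 500.0
```

Fitting a slope over several time points, rather than using a single end point, is one design choice; it assumes the reaction is linear over the 0-10 min window.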
Inhibition studies were performed using two zinc-metalloprotease inhibitors, 1,10 phenanthroline and N-(α-rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon), with stock solutions of the inhibitors dissolved in 100% ethanol and water, respectively. Stock concentrations were typically mg/ml and 5 mg/ml for 1,10 phenanthroline and phosphoramidon, respectively, with final concentrations of inhibitor at 0.5-1.0 mg/ml per reaction. Treatment of three day W-14 crude broth with 1,10 phenanthroline, an inhibitor of all zinc metalloproteases, resulted in complete elimination of all protease activity, while treatment with phosphoramidon, an inhibitor of thermolysin-like proteases (Weaver, Kester, and Matthews, B.W. 1977. J. Mol. Biol. 114, 119-132), resulted in ~56% reduction of protease activity. The residual proteolytic activity could not be further reduced with additional phosphoramidon.
The proteases of three day W-14 Photorhabdus broth were purified as follows: 4.0 liters of broth were concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The flow-through material having native proteins less than 100 kDa in size (3.8 L) was concentrated to 0.375 L using an Amicon spiral ultrafiltration cartridge Type S1Y10 attached to an Amicon M-12 filtration device. The retentate material contained proteins ranging in size from 10-100 kDa. This material was loaded onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystems (Framingham, MA) Poros® 50 HQ strong anion exchange packing that had been equilibrated in 10 mM sodium phosphate buffer (pH ). Proteins were loaded on the column at a flow rate of 5 ml/min, followed by washing unbound protein with buffer until A280 = 0.00. Afterwards, proteins were eluted using a NaCl gradient of 0-1.0 M NaCl in 40 min at a flow rate of 7.5 ml/min. Fractions were assayed for protease activity, supra., and active fractions were pooled. Proteolytically active fractions were diluted with 10 mM sodium phosphate buffer (pH 7.0) and loaded onto a Pharmacia HR 10/10 Mono Q column equilibrated in 10 mM sodium phosphate. After washing the column with buffer until A280 = 0.00, proteins were eluted using a NaCl gradient of 0-0.5 M NaCl for 1 h at a flow rate of 2.0 ml/min. Fractions were assayed for protease activity. Those fractions having the greatest amount of phosphoramidon-sensitive protease activity, the phosphoramidon-sensitive activity being due to the 41/38 kDa protease, infra., were pooled. These fractions were found to elute at a range of 0.15-0.25 M NaCl. Fractions containing a predominance of phosphoramidon-insensitive protease activity, the 58 kDa protease, were also pooled. These fractions were found to elute at a range of 0.25-0.35 M NaCl.
The phosphoramidon-sensitive protease fractions were then concentrated to a final volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device NMWL membrane. This material was applied at a flow rate of 0.5 ml/min to a Pharmacia HR 10/30 column that had been packed with Pharmacia Sephadex G-50 equilibrated in 10 mM sodium phosphate buffer (pH ), 0.1 M NaCl. Fractions having the maximal phosphoramidon-sensitive protease activity were then pooled and centrifuged over a Millipore Ultrafree®-15 centrifugal filter device Biomax-50K NMWL membrane. Proteolytic activity analysis, supra., indicated this material to have only phosphoramidon-sensitive protease activity. Pooling of the phosphoramidon-insensitive protease, the 58 kDa protein, was followed by concentrating in a Millipore centrifugal filter device Biomax-50K NMWL membrane and further separation on a Pharmacia Superdex-75 column. Fractions containing the protease were pooled.
Analysis of the purified 58 kDa and 41/38 kDa proteases revealed that, while both types of protease were completely inhibited with 1,10 phenanthroline, only the 41/38 kDa protease was inhibited with phosphoramidon. Further analysis of crude broth indicated that 23% of the total protease activity of day 1 W-14 broth is due to the 41/38 kDa protease, increasing to 44% in day three W-14 broth.
Standard SDS-PAGE analysis for examining protein purity and obtaining amino terminal sequence was performed using 4-20% gradient MiniPlus SepraGels purchased from Integrated Separation Systems (Natick, MA). Proteins to be amino-terminal sequenced were blotted onto PVDF membrane following purification, infra., (ProBlott™ Membranes; Applied Biosystems, Foster City, CA), visualized with 0.1% amido black, excised, and sent to Cambridge Prochem; Cambridge, MA, for sequencing.
Deduced amino terminal sequences of the 58 kDa (SEQ ID NO:45) and 41/38 kDa (SEQ ID NO:44) proteases from three day old W-14 broth were DV-GSEKANEKLK (SEQ ID NO:45) and DSGDDDKVTNTDIHR (SEQ ID NO:44), respectively.
Sequencing of the 41/38 kDa protease revealed several amino termini, each one having an additional amino acid removed by proteolysis. Examination of the primary, secondary, tertiary and quaternary sequences for the 38 and 41 kDa polypeptides allowed for deduction of the sequence shown above and revealed that these two proteases are homologous.
Example 11, Part A
Screening of Photorhabdus Genomic Library via use of Antibodies for Genes encoding TcbA Peptide
In parallel to the sequencing described above, suitable probing and sequencing was done based on the TcbAii peptide (SEQ ID NO:1). This sequencing was performed by preparing bacterial culture broths and purifying the toxin as described in Examples 1 and 2 above.
Genomic DNA was isolated from the Photorhabdus luminescens strain W-14 grown in Grace's insect tissue culture medium. The bacteria were grown in 5 ml of culture medium in a 250 ml Erlenmeyer flask at 28 °C and 250 rpm for approximately 24 hours.
Bacterial cells from 100 ml of culture medium were pelleted at 5000 x g for 10 minutes. The supernatant was discarded, and the cell pellets then were used for the genomic DNA isolation.
The genomic DNA was isolated using a modification of the CTAB method described in Section 2.4.3 of Ausubel (supra.). The section entitled "Large Scale CsCl prep of bacterial genomic DNA" was followed through step 6. At this point, an additional chloroform/isoamyl alcohol (24:1) extraction was performed, followed by a phenol/chloroform/isoamyl alcohol (25:24:1) extraction step and a final chloroform/isoamyl alcohol (24:1) extraction. The DNA was precipitated by the addition of a 0.6 volume of isopropanol. The precipitated DNA was hooked and wound around the end of a bent glass rod, dipped briefly into 70% ethanol as a final wash, and dissolved in 3 ml of TE buffer.
The DNA concentration, estimated by optical density at 280/260 nm, was approximately 2 mg/ml.
Using this genomic DNA, a library was prepared.
Approximately 50 µg of genomic DNA was partly digested with Sau3A I. Then NaCl density gradient centrifugation was used to size fractionate the partially digested DNA fragments. Fractions containing DNA fragments with an average size of 12 kb, or larger, as determined by agarose gel electrophoresis, were ligated into the plasmid Bluescript (Stratagene, La Jolla, California) and transformed into an E. coli DH5α or DH10B strain.
Separately, purified aliquots of the protein were sent to the biotechnology hybridoma center at the University of Wisconsin, Madison for production of monoclonal antibodies to the proteins. The material that was sent was the HPLC purified fraction containing native bands 1 and 2 which had been denatured at 65 °C, and 20 µg of which was injected into each of four mice.
Stable monoclonal antibody-producing hybridoma cell lines were recovered after spleen cells from unimmunized mouse were fused with a stable myeloma cell line. Monoclonal antibodies were recovered from the hybridomas.
Separately, polyclonal antibodies were created by taking native agarose gel purified band 1 (see Example 1) protein which was then used to immunize a New Zealand white rabbit. The protein was prepared by excising the band from the native agarose gels, briefly heating the gel pieces to 65 °C to melt the agarose, and immediately emulsifying with adjuvant. Freund's complete adjuvant was used for the primary immunizations and Freund's incomplete was used for 3 additional injections at monthly intervals. For each injection, approximately 0.2 ml of emulsified band 1, containing 50 to 100 micrograms of protein, was delivered by multiple subcutaneous injections into the back of the rabbit. Serum was obtained 10 days after the final injection and additional bleeds were performed at weekly intervals for 3 weeks. The serum complement was inactivated by heating to 56 °C for 15 minutes and then stored at -20 °C.
The monoclonal and polyclonal antibodies were then used to screen the genomic library for the expression of antigens which could be detected by the epitope. Positive clones were detected on nitrocellulose filter colony lifts. An immunoblot analysis of the positive clones was undertaken.
An analysis of the clones as defined by both immunoblot and Southern analysis resulted in the tentative identification of five classes of clones.
In the first class of clone was a gene encoding the peptide designated here as TcbAii. Full DNA sequence of this gene (TcbA) was obtained. It is set forth as SEQ ID NO:11. Confirmation that the sequence encodes the internal sequence of SEQ ID NO:1 is demonstrated by the presence of SEQ ID NO:1 at amino acid number 88 from the deduced amino acid sequence created by the open reading frame of SEQ ID NO:11. This can be confirmed by referring to SEQ ID NO:12, which is the deduced amino acid sequence created by SEQ ID NO:11.
The second class of toxin peptides contains the segments referred to above as TcaBi, TcaBii and TcaC. Following the screening of the library with the polyclonal antisera, this second class of toxin genes was identified by several clones which produced different size proteins, all of which cross-reacted with the polyclonal antibody on an immunoblot and were also found to share DNA homology on a Southern Blot. Sequence comparison revealed that they belonged to the gene complex designated TcaB and TcaC above.
Three other classes of antibody toxin clones were also isolated in the polyclonal screen. These classes produced proteins that cross-react with a polyclonal antibody and also shared DNA homology with the classes as determined by Southern blotting. The classes have been designated Class III, Class IV and Class V. It was also possible to identify monoclonals that cross-reacted with Class I, II, III, and IV. This suggests that all have regions of high protein homology. Thus, it appears that the P. luminescens extracellular protein genes represent a family of genes which are evolutionarily related.
To further pursue the concept that there might be evolutionarily related variations in the toxin peptides contained within this organism, two approaches have been undertaken to examine other strains of P. luminescens for the presence of related proteins. This was done both by PCR amplification of genomic DNA and by immunoblot analysis using the polyclonal and monoclonal antibodies.
The results indicate that related proteins are produced by P. luminescens strains WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-., WX-11, WX-12, WX-15 and W-14.
Example 11, Part B
Sequence and analysis of Class III toxin clones - tcc
Further DNA sequencing was performed on plasmids isolated from Class III E. coli clones described in Example 11, Part A.
The nucleotide sequence was shown to be three closely linked open reading frames at this genomic locus. This locus was designated tcc, with the three open reading frames designated tccA (SEQ ID NO:56), tccB (SEQ ID NO:58) and tccC (SEQ ID NO:60) (Fig. 6B).
The deduced amino acid sequence from the tccA open reading frame indicates the gene encodes a protein of 105,459 Da. This protein was designated TccA. The first 12 amino acids of this protein match the N-terminal sequence obtained from a 108 kDa protein, SEQ ID NO:7, previously identified as part of the toxin complex.
The deduced amino acid sequence from the tccB open reading frame indicates this gene encodes a protein of 175,716 Da. This protein was designated TccB. The first 11 amino acids of this protein match the N-terminal sequence obtained from a protein with estimated molecular weight of 185 kDa, SEQ ID NO:8.
The deduced amino acid sequence of tccC indicated that this open reading frame encodes a protein of 111,694 Da, and the protein product was designated TccC.
Example 12
Characterization of Photorhabdus Strains
In order to establish that the collection described herein was comprised of Photorhabdus strains, the strains herein were assessed in terms of recognized microbiological traits that are characteristic of Photorhabdus and which differentiate it from other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J.J. 1984. Bergey's Manual of Systematic Bacteriology, vol 1. pp. 510-511. (ed. Krieg N.R. and Holt, Williams & Wilkins, Baltimore.; Akhurst and Boemare, 1988, Boemare et al., 1993). These characteristic traits are as follows: Gram's stain negative rods, organism size of 0.5-2 µm in width and 2-10 µm in length, red/yellow colony pigmentation, presence of crystalline inclusion bodies, presence of catalase, inability to reduce nitrate, presence of bioluminescence, ability to take up dye from growth media, positive for protease production, growth-temperature range below 37 °C, survival under anaerobic conditions and positively motile (Table 18). Reference Escherichia coli, Xenorhabdus and Photorhabdus strains were included in all tests for comparison.
The overall results are consistent with all strains being part of the family Enterobacteriaceae and the genus Photorhabdus.
A luminometer was used to establish the bioluminescence of each strain and provide a quantitative and relative measurement of light production. For measurement of relative light emitting units, the broths from each strain (cells and media) were measured at three time intervals after inoculation in liquid culture ( , 12, and 24 hr) and compared to background luminosity (uninoculated media and water). Prior to measuring light emission from the various broths, cell density was established by measuring light absorbance (560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using a sipper cell. Appropriate dilutions were then made (to normalize optical density to unit) before measuring luminosity. Aliquots of the diluted broths were then placed into cuvettes (300 µl each) and read in a Bio-Orbit 1251 Luminometer (Bio-Orbit Oy, Turku, Finland). The integration period for each sample was 45 seconds. The samples were continuously mixed (spun in baffled cuvettes) while being read to provide oxygen availability. A positive test was determined as being 5-fold background luminescence (~5-10 units). In addition, colony luminosity was detected with photographic film overlays and visually, after adaptation in a darkroom. The Gram's staining characteristics of each strain were established with a commercial Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with Gram's stain control slides (Fisher Scientific, Pittsburgh, PA). Microscopic evaluation was then performed using a Zeiss microscope (Carl Zeiss, Germany) 100X oil immersion objective lens (with ocular and 2X body magnification). Microscopic examination of individual strains for organism size, cellular description and inclusion bodies (the latter after logarithmic growth) was performed using wet mount slides (10X ocular, 2X body and objective magnification) with oil immersion and phase contrast microscopy with a micrometer (Akhurst, R.J. and Boemare, N.E. 1990. Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R. and Kaya, pp. 75-90. CRC Press, Boca Raton, USA.; Baghdiguian Boyer-Giglio Thaler, Bonnot G., Boemare N. 1993. Biol. Cell 79, 177-185.). Colony pigmentation was observed after inoculation on Bacto nutrient agar (Difco Laboratories, Detroit, MI) prepared as per label instructions.
Incubation occurred at 28 °C and descriptions were produced after 5-7 days. To test for the presence of the enzyme catalase, a colony of the test organism was removed on a small plug from a nutrient agar plate and placed into the bottom of a glass test tube. One ml of a household hydrogen peroxide solution was gently added down the side of the tube. A positive reaction was recorded when bubbles of gas (presumptive oxygen) appeared immediately or within 5 seconds. Controls of uninoculated nutrient agar and hydrogen peroxide solution were also examined.
To test for nitrate reduction, each culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI).
After 24 hours incubation at 28 °C, nitrite production was tested by the addition of two drops of sulfanilic acid reagent and two drops of alpha-naphthylamine reagent (see Difco Manual, 10th edition, Difco Laboratories, Detroit, MI, 1984). The generation of a distinct pink or red color indicates the formation of nitrite from nitrate. The ability of each strain to take up dye from growth media was tested with Bacto MacConkey agar containing the dye neutral red, Bacto Tergitol-7 agar containing the dye bromothymol blue and Bacto EMB Agar containing the dye eosin-Y (agars from Difco Laboratories, Detroit, MI, all prepared according to label instructions). After inoculation on these media, dye uptake was recorded after incubation at 28 °C for days. Growth on these latter media is characteristic for members of the family Enterobacteriaceae. Motility of each strain was tested using a solution of Bacto Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as per label instructions. A butt-stab inoculation was performed with each strain and motility was judged macroscopically by a diffuse zone of growth spreading from the line of inoculum. In many cases, motility was also observed microscopically from liquid culture under wet mount slides. Biochemical nutrient evaluation for each strain was performed using BBL Enterotube II (Becton Dickinson, Germany).
Product instructions were followed with the exception that incubation was carried out at 28 °C for 5 days. Results were consistent with previously cited reports for Photorhabdus. The production of protease was tested by observing hydrolysis of gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI) plates made as per label instructions. Cultures were inoculated and the plates were incubated at 28 °C for 5 days. To assess growth at different temperatures, agar plates [proteose peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked from a common source of inoculum.
Plates were sealed with Nesco® film and incubated at 20, 28 and 37 °C for up to three weeks. Plates showing no growth at 37 °C showed no cell viability after transfer to a 28 °C incubator for one week. Oxygen requirements for Photorhabdus strains were tested in the following manner. A butt-stab inoculation into fluid thioglycolate broth medium (Difco, Detroit, MI) was made.
The tubes were incubated at room temperature for one week and cultures were then examined for type and extent of growth. The indicator resazurin demonstrates the level of medium oxidation or the aerobiosis zone (Difco Manual, 10th edition, Difco Laboratories, Detroit, MI). Growth zone results obtained for the Photorhabdus strains tested were consistent with those of a facultative anaerobic microorganism.
Table 18
Taxonomic Traits of Photorhabdus Strains

Columns: Strain, followed by Traits Assessed* A B C D E F G H I J K L M N O P Q.
(The individual table entries are not legible in the source text. Strain rows legible in the source: W-14, WX-2, WX-3, WX-4, WX-6, WX-7, WX-8, WX-9, WX-11, WX-12, H9, Hb, Hm, HP88, NC-1, WIR, 43948, 43949, 43951 and 43952.)

* A=Gram's stain, B=Crystalline inclusion bodies, C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction, G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake, J=Pigmentation, K=Growth on EMB agar, L=Growth on MacConkey agar, M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at 20 °C, P=Growth at 28 °C, Q=Growth at 37 °C; +/- = positive or negative for trait, rd=rod, S=sized within Genus descriptors, RO=red-orange, LR=light red, R=red, O=orange, Y=yellow, T=tan, LY=light yellow, YT=yellow tan, and LO=light orange.
Cellular fatty acid analysis is a recognized tool for bacterial characterization at the genus and species level (Tornabene, T.G. 1985. Lipid Analysis and the Relationship to Chemotaxonomy in Methods in Microbiology, Vol 18, 209-2.4.; Goodfellow, M. and O'Donnell, A.G. 1993. Roots of Bacterial Systematics in Handbook of New Bacterial Systematics (ed. Goodfellow, M. O'Donnell, pp. 3-54. London: Academic Press Ltd.; these references are incorporated herein by reference) and was used to confirm that our collection was related at the genus level. Cultures were shipped to an external contract laboratory for fatty acid methyl ester (FAME) analysis using a Microbial ID (MIDI, Newark, DE, USA) Microbial Identification System (MIS). The MIS system consists of a Hewlett Packard HP5890A gas chromatograph with a 25 m x 0.2 mm methylphenyl silicone fused silica capillary column. Hydrogen is used as the carrier gas and a flame-ionization detector functions in conjunction with an automatic sampler, integrator and computer. The computer compares the sample fatty acid methyl esters to a microbial fatty acid library and against a calibration mix of known fatty acids. As selected by the contract laboratory, strains were grown for 24 hours at 28 °C on trypticase soy agar prior to analysis. Extraction of samples was performed by the contract lab as per standard FAME methodology.
There was no direct identification of the strains to any luminescent bacterial group other than Photorhabdus. When the cluster analysis was performed, which compares the fatty acid profiles of a group of isolates, the strain fatty acid profiles were related at the genus level.
The evolutionary diversity of the Photorhabdus strains in our collection was measured by analysis of PCR (Polymerase Chain Reaction) mediated genomic fingerprinting using genomic DNA from each strain. This technique is based on families of repetitive DNA sequences present throughout the genome of diverse bacterial species (reviewed by Versalovic, Schneider, de Bruijn, F.J. and Lupski, J.R. 1994. Methods Mol. Cell. Biol., 5, 25-40.).
Three of these, repetitive extragenic palindromic sequence (REP), enterobacterial repetitive intergenic consensus (ERIC) and the BOX element, are thought to play an important role in the organization of the bacterial genome. Genomic organization is believed to be shaped by selection, and the differential dispersion of these elements within the genome of closely related bacterial strains can be used to discriminate these strains (e.g. Louws, Fulbright, Stephens, C.T. and de Bruijn, F.J. 1994. Appl. Environ. Micro. 60, 2286-2295.). Rep-PCR utilizes oligonucleotide primers complementary to these repetitive sequences to amplify the variably sized DNA fragments lying between them. The resulting products are separated by electrophoresis to establish the DNA "fingerprint" for each strain.
To isolate genomic DNA from our strains, cell pellets were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml and 12 ml of 5 M NaCl was then added. This mixture was centrifuged 20 min. at 15,000 x g. The resulting pellet was resuspended in 5.7 ml of TE and 300 ul of 10% SDS and ul 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY) were added. This mixture was incubated at 37 °C for 1 hr, approximately 10 mg of lysozyme was then added and the mixture was incubated for an additional 45 min. One milliliter of 5 M NaCl and 800 ul of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then added and the mixture was incubated 10 min. at 65 °C, gently agitated, then incubated and agitated for an additional 20 min. to aid in clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently, then centrifuged. Two extractions were then performed with an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1).
Genomic DNA was precipitated with 0.6 volume of isopropanol. Precipitated DNA was removed with a glass rod, washed twice with ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by optical density at 260 nm. To perform rep-PCR analysis of Photorhabdus genomic DNA the following primers were used: REP1R-I, 5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'.
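As a rough illustration of quantitation by optical density at 260 nm, the sketch below uses the common rule of thumb that an A260 of 1.0 at a 1 cm path length corresponds to about 50 ug/ml of double-stranded DNA. The conversion factor, the dilution factor, the function names and the example readings are assumptions for illustration; the text does not state how the readings were converted.

```python
# Hypothetical helper: estimate double-stranded DNA concentration from A260.
# Assumes the rule of thumb 1.0 A260 unit ~ 50 ug/ml dsDNA at 1 cm path length.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_ug_per_ul(a260, dilution_factor=1.0, path_cm=1.0):
    """Return the estimated dsDNA concentration of the undiluted stock in ug/ul."""
    if a260 < 0 or dilution_factor <= 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0; dilution and path must be > 0")
    ug_per_ml = (a260 / path_cm) * DSDNA_UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0  # convert ug/ml to ug/ul


def template_mass_ug(conc_ug_per_ul, volume_ul=1.5):
    """Mass of DNA delivered by a given template volume (1.5 ul in the reaction above)."""
    return conc_ug_per_ul * volume_ul


if __name__ == "__main__":
    # Made-up reading: A260 = 0.2 on a 20-fold dilution of the stock.
    conc = dsdna_conc_ug_per_ul(0.2, dilution_factor=20)
    print(conc, template_mass_ug(conc))
```

A stock estimated this way can be checked against the 0.075-0.480 ug/ul range of template concentrations reported for the PCR reactions.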
PCR was performed using the following 25 ul reaction: 7.75 ul ul 10X LA buffer (PanVera Corp., Madison, WI), 16 ul dNTP mix mM each), 1 ul of each primer at 50 pM/ul, 1 ul DMSO, 1.5 ul genomic DNA (concentrations ranged from 0.075-0.480 ug/ul) and 0.25 ul TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR amplification was performed in a Perkin Elmer DNA Thermal Cycler (Norwalk, CT) using the following conditions: 95 °C/7 min. then cycles of 94 °C/1 min., 44 °C/1 min., 65 °C/8 min., followed by min. at 65 °C. After cycling, the 25 ul reaction was added to 5 ul of 5X gel loading buffer (0.25% bromophenol blue, 40% sucrose in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M Tris-borate, 0.002 M EDTA) using 8 ul of each reaction.
The gel was run for approximately 16 hours at 45 V. Gels were then stained in 20 ug/ml ethidium bromide for 1 hour and destained in TBE buffer for approximately 3 hours. Polaroid® photographs of the gels were then taken under UV illumination.
The presence or absence of bands at specific sizes for each strain was scored from the photographs and entered as a similarity matrix in the numerical taxonomy software program NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli strain HB101 and Xanthomonas oryzae pv. oryzae assayed at the same time produced PCR "fingerprints" corresponding to published reports (Versalovic, Koeuth, T. and Lupski, J.R. 1991. Nucleic Acids Res. 19, 6823-6831; Vera Cruz, Halda-Alija, Louws, Skinner, George, Nelson, de Bruijn, Rice, C. and Leach, J.E. 1995. Int. Rice Res. Notes, 20, 23-24.; Vera Cruz, Ardales, Skinner, D.Z., Talag, Nelson, Louws, Leung, Mew, T.W. and Leach, J.E. 1996. Phytopathology (in press), respectively). The data from Photorhabdus strains were then analyzed with a series of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a matrix of similarity coefficients (using the Jaccard coefficient) and SAHN (Sequential, Agglomerative, Hierarchical and Nested) clustering [using the UPGMA (Unweighted Pair-Group Method with Arithmetic Averages) method], which groups related strains and can be expressed as a phenogram (Figure). The COPH (cophenetic values) and MXCOMP (matrix comparison) programs were used to generate a cophenetic value matrix and compare the correlation between this and the original matrix upon which the clustering was based. A resulting normalized Mantel statistic was generated which is a measure of the goodness of fit for a cluster analysis (r=0.8-0.9 represents a very good fit). In our case r = 0.919. Therefore, our collection is comprised of a diverse group of easily distinguishable strains representative of the Photorhabdus genus.
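The analysis chain described above (Jaccard coefficients, UPGMA clustering, then a cophenetic correlation as the goodness-of-fit measure) can be sketched in plain Python. This is an illustrative sketch only, not the NTSYS-pc implementation: the band-scoring data and strain labels are invented, the function names are hypothetical, and the fit statistic is computed as the Pearson correlation between the original and cophenetic distances, which is the usual form of the normalized Mantel statistic.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band vectors (shared absences are ignored)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def upgma_cophenetic(names, dist):
    """Run UPGMA on pairwise distances; return the cophenetic distance per strain pair.

    dist maps frozenset({a, b}) -> distance. Names must not contain '+'.
    """
    clusters = {n: (n,) for n in names}  # cluster label -> member strains
    d = dict(dist)
    coph = {}
    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        for x in clusters[a]:
            for y in clusters[b]:
                coph[frozenset((x, y))] = height  # level at which x and y first join
        merged, na, nb = a + "+" + b, len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c not in (a, b):
                # Size-weighted mean = arithmetic average over all member pairs.
                d[frozenset((merged, c))] = (
                    na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
                ) / (na + nb)
        members = clusters.pop(a) + clusters.pop(b)
        clusters[merged] = members
    return coph


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def cophenetic_fit(bands):
    """bands: dict strain -> list of 0/1 band scores. Returns (cophenetic dict, r)."""
    names = sorted(bands)
    pairs = [frozenset(p) for p in combinations(names, 2)]
    dist = {p: 1.0 - jaccard(*(bands[s] for s in sorted(p))) for p in pairs}
    coph = upgma_cophenetic(names, dist)
    r = pearson([dist[p] for p in pairs], [coph[p] for p in pairs])
    return coph, r


if __name__ == "__main__":
    # Invented presence/absence scores for four hypothetical strains.
    demo = {
        "s1": [1, 1, 0, 1, 0],
        "s2": [1, 1, 0, 0, 0],
        "s3": [0, 0, 1, 1, 1],
        "s4": [0, 0, 1, 0, 1],
    }
    coph, r = cophenetic_fit(demo)
    print(round(r, 3))
```

Distances are taken as one minus the Jaccard similarity so that the clustering step can work on dissimilarities; the correlation is unchanged in magnitude if similarities are used throughout instead.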
Example 13
Insecticidal Utility of Toxin(s) Produced by Various Photorhabdus Strains

Initial "seed" cultures of the various Photorhabdus strains were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3) (Difco Laboratories, Detroit, MI) liquid media with a primary variant subclone in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput. Inoculum for each seed culture was derived from oil-overlay agar slant cultures or plate cultures. After inoculation, these flasks were incubated for 16 hrs at 28 °C on a rotary shaker at 150 rpm. These seed cultures were then used as uniform inoculum sources for a given fermentation of each strain. Additionally, overlaying the post-log seed culture with sterile mineral oil, adding a sterile magnetic stir bar for future resuspension and storing the culture in the dark at room temperature provided long-term preservation of inoculum in a toxin-competent state. The production broths were inoculated by adding 1% of the actively growing seed culture to fresh 2% PP3 media (1.75 ml per 175 ml fresh media).
Production of broths occurred in either 500 ml tribaffled flasks (see above), or 2800 ml baffled, convex bottom flasks (500 ml volume) covered by a silicone foam closure. Production flasks were incubated for 24-48 hrs under the above mentioned conditions. Following incubation, the broths were dispensed into sterile 1 L polyethylene bottles, spun at 2600 x g for 1 hr at 0 C and decanted from the cell and debris pellet. The liquid broth was then vacuum filtered through Whatman GF/D (2.7 um retention) and GF/B (1.0 um retention) glass filters to remove debris. Further broth clarification was achieved with a tangential flow microfiltration device (Pall Filtron, Northborough, MA) using a 0.5 um open-channel filter. When necessary, additional clarification could be obtained by chilling the broth (to 4 °C) and centrifuging for several hours at 2600 x g. Following these procedures, the broth was filter sterilized using a 0.2 um nitrocellulose membrane filter. Sterile broths were then used directly for biological assay or biochemical analysis, or concentrated (up to 15-fold) using a 10,000 MW cutoff, M12 ultra-filtration device (Amicon, Beverly, MA) or centrifugal concentrators (Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a 10,000 MW pore size. In the case of centrifugal concentrators, the broth was spun at 2000 x g for approximately 2 hr. The 10,000 MW permeate was added to the corresponding retentate to achieve the desired concentration of components greater than 10,000 MW. Heat inactivation of processed broth samples was achieved by heating the samples at 100 °C in a sand-filled heat block for 10 minutes.
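The permeate add-back step above is simple volume arithmetic, sketched here under stated assumptions: components larger than the 10,000 MW cutoff are taken to be fully retained, the permeate is taken to contain none of them, and the function name and example volumes are hypothetical.

```python
def permeate_to_add_ml(start_volume_ml, retentate_volume_ml, target_fold):
    """Volume of permeate to add back so the retentate ends at target_fold concentration.

    Assumes every component above the membrane cutoff stays in the retentate,
    so the fold concentration equals start volume divided by final volume.
    """
    if start_volume_ml <= 0 or retentate_volume_ml <= 0 or target_fold <= 0:
        raise ValueError("volumes and target_fold must be positive")
    final_volume_ml = start_volume_ml / target_fold
    to_add = final_volume_ml - retentate_volume_ml
    if to_add < 0:
        # The retentate is still more dilute than the target: concentrate further.
        raise ValueError("retentate volume exceeds the target final volume")
    return to_add


if __name__ == "__main__":
    # Made-up example: 150 ml of broth spun down to 8 ml, target 15-fold (10 ml final).
    print(permeate_to_add_ml(150.0, 8.0, 15.0))
```

Over-concentrating and then diluting back with permeate, rather than with fresh buffer, keeps the small-molecule background of the broth unchanged while fixing the fold concentration of the large components.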
The broth(s) and toxin complex(es) from different Photorhabdus strains are useful for reducing populations of insects and were used in a method of inhibiting an insect population which comprises applying to a locus of the insect an effective insect inactivating amount of the active described. A demonstration of the breadth of insecticidal activity observed from broths of a selected group of Photorhabdus strains fermented as described above is shown in Table 19. It is possible that additional insecticidal activities could be detected with these strains through increased concentration of the broth or by employing different fermentation methods. Consistent with the activity being associated with a protein, the insecticidal activity of all strains tested was heat labile (see above).
Culture broth(s) from diverse Photorhabdus strains show differential insecticidal activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm larvae and boll weevil larvae, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and Colorado potato beetle. Activity is also observed against aster leafhopper and corn planthopper, which are members of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies and spittle bugs, as well as numerous host-specific aphid species. The broths and purified toxin complex(es) are also active against tobacco budworm, tobacco hornworm and European corn borer, which are members of the order Lepidoptera. Other typical members of this order are beet armyworm, cabbage looper, black cutworm, corn earworm, codling moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly and various mosquito species. Activity with broth(s) and toxin complex(es) is also seen against two-spotted spider mite, which is a member of the order Acarina, which includes strawberry spider mites, broad mites, citrus red mite, European red mite, pear rust mite and tomato russet mite.
Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth(s) (0-15 fold concentrated, filter sterilized), 2% Proteose Peptone #3, purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (about 1.5 cm²) of artificial diet (Rose, R. I. and McCabe, J. M. (1973). J. Econ. Entomol. 66, 398-400) in 40 ul aliquots. Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from surface sterilized eggs. The plates were sealed, placed in a humidified growth chamber and maintained at 27 °C for the appropriate period (3-5 days). Mortality and larval weight determinations were then scored. Generally, 16 insects per treatment were used in all studies. Control mortality was generally less than
Activity against boll weevil (Anthonomus grandis) was tested as follows. Concentrated (1-10 fold) Photorhabdus broths, control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied in 60 ul aliquots to the surface of 0.35 g of artificial diet (Stoneville Yellow lepidopteran diet) and allowed to dry. A single, 12-24 hr boll weevil larva was placed on the diet, and the wells were sealed and held at 25 °C, 50% RH for days. Mortality and larval weights were then assessed. Control mortality ranged between 0-13%.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 ul of aqueous solution (10-fold concentrated Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), 10 mM sodium phosphate buffer, toxin complex(es) at 0.23 mg/ml or H2O) and approximately 20 1-day old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 3-4 days after infestation. Control mortality was between 0-20%.
Activity against fruitflies was tested as follows.
Purchased Drosophila melanogaster medium was prepared using dry medium and a 50% liquid of either water, control medium (2% Proteose Peptone #3), 10-fold concentrated Photorhabdus culture broth(s), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0. This was accomplished by placing ml of dry medium in each of 3 rearing vials per treatment and adding 4.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each 25 ml vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 15 days of exposure. Adult emergence as compared to water and control medium (0-16% reduction).
Activity against aster leafhopper adults (Macrosteles severini) and corn planthopper nymphs (Peregrinus maidis) was tested with an ingestion assay designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M® square is placed across the top of the dish and secured with an O-ring. A 1 oz. plastic cup is then infested with approximately 7 hoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using 10-fold concentrated Photorhabdus culture broth(s), the broth and control medium (2% Proteose Peptone #3) were dialyzed against 10 mM sodium phosphate buffer, pH 7.0 and sucrose (to was added to the resulting solution to reduce control mortality. Purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 was also tested. The assay was held in an incubator at 28 °C, 70% RH with a 16/8 photoperiod. The assays were graded for mortality at 72 hours (day 3). Control mortality was less than 6%.
Activity against lepidopteran larvae was tested as follows.
Concentrated (10-fold) Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface ( cm²) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 ul aliquots.
The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, neonate larva. European corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were obtained from commercial sources and hatched in-house, whereas tobacco budworm (Heliothis virescens) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27 °C for the appropriate period.
Mortality and weight determinations were scored at day Generally, 16 insects per treatment were used in all studies.
Control mortality generally ranged from 4-12.5% for control medium and was less than 10% for phosphate buffer.
Activity against two-spotted spider mite (Tetranychus urticae) was determined as follows. Young squash plants were trimmed to a single cotyledon and sprayed to run-off with concentrated broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0. After drying, the plants were infested with a mixed population of spider mites and held at lab temperature and humidity for 72 hr. Live mites were then counted to determine levels of control.
Table 19
Observed Insecticidal Spectrum of Broths From Different Photorhabdus Strains

Photorhabdus Strain (in table order): WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, NC-1, WIR, HP88, Hb, Hm, H9, W-14, ATCC 43948, ATCC 43949, ATCC 43950, ATCC 43951, ATCC 43952

Sensitive* Insect Species (in the order extracted; alignment with the strain rows is not recoverable from the source text): 4, 5, 6, 7, 8 2, 4 1, 4 1, 4 4 4 3, 4, 5, 6, 7, 8 1, 2, 4 1, 2, 4 4 1, 2, 4 2, 4, 5, 6, 7, 8 1, 2, 4 1, 2, 4 3, 4, 5, 8 1, 2, 3, 4, 5, 6, 2, 3, 5, 6, 7, 8 1, 3, 4, 5, 7, 8 3, 4, 5, 7, 8 1, 2, 3, 4, 5, 7, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 4 4 4 4 4 7, 8, 9 8 7, 8 7, 8,

* 25% mortality and/or growth inhibition vs. control
1; Tobacco budworm, 2; European corn borer, 3; Tobacco hornworm, 4; Southern corn rootworm, 5; Boll weevil, 6; Mosquito, 7; Fruit Fly, 8; Aster Leafhopper, 9; Corn planthopper, 10; Two-spotted spider mite.
Example 14
Non W-14 Photorhabdus Strains: Purification, Characterization and Activity Spectrum

Purification
The protocol, as follows, is similar to that developed for the purification of W-14 and was established based on purifying those fractions having the most activity against Southern corn rootworm (SCR), as determined in bioassays (see Example 13).
Typically, 4-20 L of broth that had been filtered, as described in Example 13, were received and concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The retentate contained native proteins of molecular sizes greater than 100 kDa, whereas the flow-through material contained native proteins less than 100 kDa in size. The majority of the activity against SCR was contained in the 100 kDa retentate. The retentate was then continually diafiltered with 10 mM sodium phosphate (pH 7.0) until the filtrate reached an A280 0.100. Unless otherwise stated, all procedures from this point were performed in buffer as defined by 10 mM sodium phosphate (pH 7.0). The retentate was then concentrated to a final volume of approximately 0.20 L and filtered using a 0.45 um Nalgene™ Filterware sterile filtration unit. The filtered material was loaded at 7.5 ml/min onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystem Poros® 50 HQ strong anion exchange matrix equilibrated in buffer using a PerSeptive Biosystem Sprint® HPLC system.
After loading, the column was washed with buffer until an A280 0.100 was achieved. Proteins were then eluted from the column at 2.5 ml/min using buffer with 0.4 M NaCl for 20 min for a total volume of 50 ml. The column was then washed using buffer with 1.0 M NaCl at the same flow rate for an additional 20 min (final volume 50 ml). Proteins eluted with 0.4 M and 1.0 M NaCl were placed in separate dialysis bags (Spectra/Por® Membrane MWCO: 2,000) and allowed to dialyze overnight at 4 °C in 12 L buffer.
The majority of the activity against SCR was contained in the 0.4 M fraction. The 0.4 M fraction was further purified by application of 20 ml to a Pharmacia XK 26/100 column that had been prepacked with Sepharose CL4B (Pharmacia) using a flow rate of 0.75 ml/min. Fractions were pooled based on A280 peak profile and concentrated to a final volume of 0.75 ml using a Millipore centrifugal filter device Biomax-50K NMWL membrane.
Protein concentrations were determined using a Biorad Protein Assay Kit with bovine gamma globulin as a standard.
Characterization
The native molecular weight of the SCR toxin complex was determined using a Pharmacia HR 16/50 column that had been prepacked with Sepharose CL4B in buffer. The column was then calibrated using proteins of known molecular size, thereby allowing for calculation of the toxin's approximate native molecular size. As shown in Table 20, the molecular size of the toxin complex ranged from 777 kDa with strain Hb to 1,900 kDa with strain WX-14. The yield of toxin complex also varied, from strain WX-12 producing 0.8 mg/L to strain Hb, which produced 7.0 mg/L.
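Sizing against a calibrated gel-filtration column is commonly done by fitting a straight line of log10(molecular weight) against elution volume for the standards and reading the unknown off that line. The sketch below illustrates that calculation only; the standards, their elution volumes, the function names and the linear-fit model are assumptions, since the text does not give the calibration proteins or the curve used.

```python
from math import log10


def fit_calibration(standards):
    """Least-squares fit of log10(MW) = a + b * Ve.

    standards: list of (elution_volume_ml, molecular_weight_da) pairs.
    Returns (a, b); b is expected to be negative (larger proteins elute earlier).
    """
    n = len(standards)
    if n < 2:
        raise ValueError("at least two standards are required")
    xs = [ve for ve, _ in standards]
    ys = [log10(mw) for _, mw in standards]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must have different elution volumes")
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return a, b


def estimate_mw(elution_volume_ml, a, b):
    """Approximate native molecular weight (Da) for a measured elution volume."""
    return 10 ** (a + b * elution_volume_ml)


if __name__ == "__main__":
    # Hypothetical calibration points (elution volume in ml, size in Da).
    a, b = fit_calibration([(50.0, 1_000_000), (60.0, 300_000), (70.0, 100_000)])
    print(round(estimate_mw(52.0, a, b)))
```

Estimates are only as good as the calibration range; a complex eluting ahead of the largest standard is an extrapolation and should be reported as approximate, as the table does.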
Proteins found in the toxin complex were examined for individual polypeptide size using SDS-PAGE analysis. Typically, mg protein of the toxin complex from each strain was loaded onto a 2-15% polyacrylamide gel (Integrated Separation Systems) and electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After completion of electrophoresis, the gels were stained overnight in Biorad Coomassie blue R-250 in methanol:acetic acid:water (40:10:40). Subsequently, gels were destained in methanol:acetic acid:water (40:10:40). The gels were then rinsed with water for 15 min and scanned using a Molecular Dynamics Personal Laser Densitometer®. Lanes were quantitated and molecular sizes were calculated as compared to Biorad high molecular weight standards, which ranged from 200-45 kDa.
Sizes of the individual polypeptides comprising the SCR toxin complex from each strain are listed in Table 21. The sizes of the individual polypeptides ranged from 230 kDa with strain WX-1 to a size of 16 kDa, as seen with strain WX-7. Every strain, with the exception of strain Hb, had polypeptides comprising the toxin complex that were in the 160-230 kDa range, the 100-160 kDa range, and the 50-80 kDa range. These data indicate that the toxin complex may vary in peptide composition and components from strain to strain; however, in all cases the toxin appears to consist of a large, oligomeric protein complex.
Table 20
Characterization of a Toxin Complex From Non W-14 Photorhabdus Strains

Strain    Approx. Native Molecular Wt.(a)    Yield Active Fraction (mg/L)(b)
H9        972,000                            1.8
Hb        777,000
Hm        1,400,000                          1.1
HP88      813,000
NC-1      1,092,000                          3.3
WIR       979,000
WX-1      973,000                            0.8
WX-2      951,000                            2.2
WX-7      1,000,000
WX-12     898,000                            0.4
WX-14     1,900,000                          1.9
W-14      860,000

(a) Native molecular weight determined using a Pharmacia HR 16/50 column packed with Sepharose CL4B.
(b) Amount of toxin complex recovered from culture broth.
Activity Spectrum
As shown in Table 22, the toxin complexes purified from strains Hm and H9 were tested for activity against a variety of insects, with the toxin complex from strain W-14 for comparison.
The assays were performed as described in Example 13. The toxin complex from all three strains exhibited activity against tobacco budworm, European corn borer, Southern corn rootworm, and aster leafhopper. Furthermore, the toxin complex from strains Hm and W-14 also exhibited activity against two-spotted spider mite. In addition, the toxin complex from W-14 exhibited activity against mosquito larvae. These data indicate that the toxin complex, while having similarities in activities between certain orders of insects, can also exhibit differential activities against other orders of insects.
Table 21
The Approximate Sizes (in kDa) of Peptides in a Purified Toxin Complex From Non W-14 Photorhabdus

H9    Hb    Hm    HP88  NC-1  WIR   WX-1  WX-2  WX-7  WX-12 WX-14 W-14
180   150   170   170   180   170   230   200   200   180   210   190
170   140   140   160   170   160   190   170   180   160   180   180
160   139   100   140   140   120   170   150   110   140   160   170
140   130   81    130   110   110   160   120   87    139   120   160
120   120   72    129   44    89    110   110   75    130   110   150
98    100   68    110   16    79    98    82    43    110   100   130

(Remaining entries, in the order extracted; their column alignment is not recoverable from the source text:) 87 98 49 100 74 76 64 33 92 95 120 84 88 46 86 62 58 37 28 87 80 110 79 81 30 81 51 53 30 26 80 69 93 72 75 22 77 40 41 23 73 49 68 69 20 73 39 35 22 59 41 77 60 19 60 37 31 21 56 33 69 57 57 58 33 28 19 51 52 54 45 30 24 18 37 63 46 49 39 28 22 16 33 44 35 27 32 51 37 39 25 26 46 37 23

Table 22
Observed Insecticidal Spectrum of a Purified Toxin Complex from Photorhabdus Strains

Photorhabdus Strain     Sensitive* Insect Species
Hm Toxin Complex        2, 3, 5, 6, 7, 8
H9 Toxin Complex        1, 2, 3, 6, 7, 8
W-14 Toxin Complex      1, 2, 3, 4, 5, 6, 7, 8

* 25% mortality or growth inhibition
1; Tobacco budworm, 2; European corn borer, 3; Southern corn rootworm, 4; Mosquito, 5; Two-spotted spider mite, 6; Aster Leafhopper, 7; Fruit Fly, 8; Boll Weevil

Example 15
Sub-Fractionation of Photorhabdus Protein Toxin Complex

The Photorhabdus protein toxin complex was isolated as described in Example 14. Next, about 10 mg toxin was applied to a MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow rate of 1 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0 until the optical density at 280 nm returned to baseline absorbance. The proteins bound to the column were eluted with a linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1 ml/min for 30 min. One ml fractions were collected and subjected to Southern corn rootworm (SCR) bioassay (see Example 13).
Peaks of activity were determined by a series of dilutions of each fraction in SCR bioassays. Two activity peaks against SCR were observed and were named A (eluted at about 0.2-0.3 M NaCl) and B (eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled separately and both peaks were further purified using a 3-step procedure described below.
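The relationship between elution time and salt concentration in the MonoQ step can be sketched as follows. This is a minimal illustration that assumes an ideal linear gradient (0 to 1.0 M NaCl over 30 min at 1 ml/min, as stated above) and ignores column and tubing dead volume; the function names are hypothetical and are not part of the original procedure.

```python
def nacl_at_time(t_min, start_m=0.0, end_m=1.0, duration_min=30.0):
    """NaCl molarity delivered at time t_min for an ideal linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient")
    return start_m + (end_m - start_m) * t_min / duration_min


def time_for_nacl(conc_m, start_m=0.0, end_m=1.0, duration_min=30.0):
    """Gradient time (min) at which the given NaCl molarity is reached."""
    if not min(start_m, end_m) <= conc_m <= max(start_m, end_m):
        raise ValueError("concentration lies outside the gradient")
    return (conc_m - start_m) / (end_m - start_m) * duration_min


# Peak A (about 0.2-0.3 M) and peak B (0.3-0.4 M), expressed as gradient time.
# At 1 ml/min with 1 ml fractions, one minute corresponds to one fraction.
peak_a_window = (time_for_nacl(0.2), time_for_nacl(0.3))
peak_b_window = (time_for_nacl(0.3), time_for_nacl(0.4))
print(peak_a_window, peak_b_window)
```

Under these assumptions peak A would be expected roughly 6 to 9 min into the gradient and peak B roughly 9 to 12 min; in practice the observed fractions are shifted by the system dead volume.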
Solid (NH4)2SO4 was added to the above protein fraction to a final concentration of 1.7 M. Proteins were then applied to a phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 0% ethylene glycol, 50 mM potassium phosphate, pH to 25% ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no (NH4)2SO4) at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The fractions with the highest activity were pooled and applied to a MonoQ 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in mM Tris-HCl, pH. For the final step of purification, the most active fractions above (determined by SCR bioassay) were pooled and subjected to a second phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a final concentration of 1.7 M. The solution was then loaded onto the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7 at 1 ml/min.
Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0 at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH. Activities in each fraction against SCR were determined by bioassay.
The final purified protein by the above 3-step procedure from peak A was named toxin A and the final purified protein from peak B was named toxin B.
Characterization and Amino Acid Sequencing of Toxin A and Toxin B

In SDS-PAGE, both toxin A and toxin B contained two major peptides (of total Coomassie-stained protein): 192 kDa (named A1 and B1, respectively) and 58 kDa (named A2 and B2, respectively). Both toxin A and toxin B revealed only one major band in native PAGE, indicating that A1 and A2 were subunits of one protein complex, and that B1 and B2 were subunits of one protein complex. Further, the native molecular weight of both toxin A and toxin B was determined to be 860 kDa by gel filtration chromatography. The molar ratio of A1 to A2 was judged to be 1 to 1, as determined by densitometric analysis of SDS-PAGE gels. Similarly, the B1 and B2 peptides were present at the same molar concentration.
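As a rough consistency check on these figures, the number of subunit pairs in the native complex can be estimated from the reported masses. The sketch below assumes that the complex contains only the two major peptides at the 1:1 molar ratio reported above; that is a simplification, and the function name is illustrative only.

```python
def copies_per_complex(native_kda, subunit_kda):
    """Estimate how many copies of a repeating subunit set fit in a complex.

    native_kda  -- native molecular weight from gel filtration
    subunit_kda -- masses of the peptides present at equimolar ratio
    """
    protomer_kda = sum(subunit_kda)
    return native_kda / protomer_kda


# Values reported in the text: 860 kDa native; 192 kDa and 58 kDa subunits.
estimate = copies_per_complex(860.0, [192.0, 58.0])
print(f"about {estimate:.2f} copies of each peptide pair")
```

The non-integer result (about 3.4) is within what one might expect from gel filtration estimates, which depend on molecular shape as well as mass, so it should be read as an order-of-magnitude indication of an oligomer of several subunit pairs rather than as a measured stoichiometry.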
Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and transblotted to PVDF membranes. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. The N-terminal amino acid sequence of B1 was determined to be identical to SEQ ID NO:1, the TcbAii region of the tcbA gene (SEQ ID NO:12, position 87 to 99). A unique N-terminal sequence was obtained for peptide B2 (SEQ ID NO:40). The N-terminal amino acid sequence of peptide B2 was identical to the TcbAiii region of the derived amino acid sequence for the tcbA gene (SEQ ID NO:12, position 1935 to 1945).
Therefore, the B toxin contained predominantly two peptides, TcbAii and TcbAiii, that were observed to be derived from the same gene product, TcbA.
The N-terminal sequence of A2 (SEQ ID NO:41) was unique in comparison to the TcbAiii peptide and other peptides. The A2 peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was determined to be a mixture of amino acid sequences SEQ ID and 41.
Peptides A1 and A2 were further subjected to internal amino acid sequencing. For internal amino acid sequencing, 10 µg of toxin A was electrophoresed in 10% SDS-PAGE and transblotted to PVDF membrane. After the blot was stained with amido black, peptides A1 and A2, denoted TcdAii and TcdAiii, respectively, were excised from the blot and sent to Harvard MicroChem and Cambridge ProChem. Peptides were subjected to trypsin digestion followed by HPLC chromatography to separate individual peptides.
N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal amino acid sequences of peptide A1 (TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were found to have significant homologies with deduced amino acid sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12).
Similarly, the N-terminal sequence (SEQ ID NO:41) and two internal sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with deduced amino acid sequences of the TcbAiii region of the tcbA gene (SEQ ID NO:12).
In summary of the above results, the toxin complex contains at least two protein toxin complexes that are active against SCR: toxin A and toxin B. Toxin A and toxin B are similar in their native and subunit molecular weights; however, their peptide compositions are different. Toxin A contains peptides TcdAii and TcdAiii as the major peptides, and toxin B contains TcbAii and TcbAiii as the major peptides.
Example 16
Cleavage and Activation of TcbA Peptide

In the toxin B complex, peptides TcbAii and TcbAiii originate from the single gene product TcbA (Example 15). The processing of TcbA peptide to TcbAii and TcbAiii is presumably carried out by Photorhabdus protease(s), most likely the metalloproteases described in Example 10. In some cases, it was noted that when Photorhabdus W-14 broth was processed, TcbA peptide was present in toxin B complex as a major component, in addition to peptides TcbAii and TcbAiii. Procedures identical to those described for the purification of toxin B complex (Example 15) were used to enrich peptide TcbA from the toxin complex fraction of W-14 broth. The final purified material was analyzed in a 4-20% gradient SDS-PAGE and major peptides were quantified by densitometry. It was determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and 6%, respectively, of total protein. The identities of these peptides were confirmed by their respective molecular sizes in SDS-PAGE and Western blot analysis using monospecific antibodies. The native molecular weight of this fraction was determined to be 860 kDa.
The cleavage of TcbA was evaluated by treating the above purified material with purified 38 kDa and 58 kDa W-14 Photorhabdus metalloproteases (Example 10), and Trypsin as a control enzyme (Sigma, MO). The standard reaction consisted of 17.5 µg of the above purified fraction, 1.5 unit protease, and 0.1 M Tris buffer, pH 8.0 in a total volume of 100 µl. For the control reaction, protease was omitted. The reaction mixtures were incubated at 37 °C for 90 min. At the end of the reaction, 20 µl was taken and boiled immediately with SDS-PAGE sample buffer for electrophoresis analysis in a 4-20% gradient SDS-PAGE. It was determined from SDS-PAGE that in both 38 kDa and 58 kDa protease treatments, the amount of peptides TcbAii and TcbAiii increased about 3-fold while the amount of TcbA peptide decreased proportionally (Table 23). The relative reduction and augmentation of selected peptides was confirmed by Western blot analyses. Furthermore, gel filtration of the cleaved material revealed that the native molecular size of the complex remained the same. Upon trypsin treatment, peptides TcbA and TcbAii were nonspecifically digested into small peptides. This indicated that the 38 kDa and 58 kDa Photorhabdus proteases can specifically process peptide TcbA into peptides TcbAii and TcbAiii. The remaining 80 µl of the protease-treated and untreated control reaction mixtures were serially diluted with 10 mM sodium phosphate buffer, pH 7.0 and analyzed by SCR bioassay. By comparing activity across several dilutions, it was determined that the 38 kDa protease treatment increased SCR insecticidal activity approximately 3 to 4 fold.
The growth inhibition of remaining insects in the protease treatment was also more severe than control (Table 23).
Table 23
Conversion and activation of peptide TcbA into peptides TcbAii and TcbAiii by protease treatment.

                                 Control    38 kDa protease treatment
TcbA (% of total protein)        58         18
TcbAii (% of total protein)      36         64
TcbAiii (% of total protein)     6          18
(µg protein)                     2.1        0.52
SCR Weight (mg/insect)*          0.2        0.1

* an indication of growth inhibition, measured as the average weight of live insects after 5 days on diet in the assay.
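The fold changes implied by Table 23 can be worked out directly from the tabulated values. The short sketch below uses only numbers that appear in the table; the row given in µg protein has no legible label in the source, so treating it as a dose measure (where a lower value means higher potency) is an assumption made here for illustration.

```python
def fold_change(control, treated):
    """Ratio of treated to control value."""
    return treated / control


# (control, 38 kDa protease treatment), as percent of total protein
table_23 = {
    "TcbA": (58.0, 18.0),
    "TcbAii": (36.0, 64.0),
    "TcbAiii": (6.0, 18.0),
}

for peptide, (control, treated) in table_23.items():
    print(f"{peptide}: {fold_change(control, treated):.2f}-fold of control")

# Row reported in ug protein: assumed to be a dose measure, so the potency
# gain is the control value divided by the treated value.
potency_gain = 2.1 / 0.52
print(f"potency gain: about {potency_gain:.1f}-fold")
```

On these numbers TcbAiii rises 3-fold and TcbAii about 1.8-fold, while the dose row changes by about 4-fold, which is in line with the 3 to 4 fold increase in SCR activity described in the text.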
Example 17
Screening of the library for a gene encoding the TcdAii Peptide

The cloning and characterization of a gene encoding the TcdAii peptide, described as SEQ ID NO:17 (internal peptide TcdAii-PT111 N-terminal sequence) and SEQ ID NO:18 (internal peptide TcdAii-PT79 N-terminal sequence) was completed. Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences of SEQ ID NO:17 (Table 24) and SEQ ID NO:18 (Table 25), and the reverse complements of those sequences, were synthesized as described in Example 8. The DNA sequence of the oligonucleotides is given below:

Table 24
Degenerate Oligonucleotide for SEQ ID NO:17

P2-PT111      1    2    3    4    5    6    7    8
Amino Acid    Ala  Phe  Asn  Ile  Asp  Asp  Val  Ser
Codons        5' GCN TT(T/C) AA(T/C) AT(T/C/A) GA(T/C) GA(T/C) GTN 3'
P2.3.6.CB     5' GC(A/C/G/T) TT(T/C) AAT ATT GAT GAT GT 3'
P2.3.5        5' GC(A/C/G/T) TT(T/C) AA(T/C) AT(T/C/A) GA(T/C) GA(T/C) GT 3'
P2.3.5R       5' AC (G/A)TC (G/A)TC (T/G/A)AT (G/A)TT (G/A)AA (A/C/G/T)GC 3'
P2.3.5RI      5' ACI TCI TCI ATI TTI AAI GC 3'
P2.3R.CB      5' CAG (A/G)CT (A/C)AC ATC ATC AAT ATT AAA 3'

Table 25
Degenerate Oligonucleotide for SEQ ID NO:18

P2-PT79       1    2    3    4    5    6    7    8    9    10   11   12   13
Amino Acid    Phe  Ile  Val  Tyr  Thr  Ser  Leu  Gly  Val  Asn  Pro  Asn  Asn
Codons*       5' TTY ATH GTN TAY ACN 6 6 GGN GTN AAY CCN AAY AAY 3'
P2.79.2       5' TTY ATY GTK TAT ACY TCI YTR GGY GTK AAT CCR AAT AAT 3'
P2.79.3       5' TTT ATT GTK TAT ACY AGY YTR GGY GTK AAT CCR AAT AAT 3'
P2.79.R.1     5' AT T YG TT MA RC A- -C -G -T AM A A
P2.79R.CB     5' ATT ATT YGG ATT MAC RCC YAR RCT RGT ATA MAC AAT AAA 3'

*According to IUPAC-IUB codes for nucleotides, Y = C or T, H = A, C or T, N = A, C, G or T, K = G or T, R = A or G, and M = A or C

Polymerase Chain Reactions (PCR) were performed essentially as described in Example 8, using as forward primers P2.3.5.CB or P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all forward/reverse combinations, using Photorhabdus W-14 genomic DNA as template.
In another set of reactions, primers P2.79.2 or P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5RI, and P2.3R.CB were used as reverse primers in all forward/reverse combinations. Only in the reactions containing P2.3.6.CB as the forward primer, combined with P2.79.R.1 or P2.79R.CB as the reverse primer, was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) of 2500 base pairs.
The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PT111 lies amino-proximal to the peptide fragment TcdAii-PT79.
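The relationship between the forward and reverse degenerate primers can be illustrated with a small reverse-complement routine that understands IUPAC ambiguity codes. This is a sketch rather than the method used in the original work: the primer string is transcribed from Table 25 as best it can be read, inosine (I) is simply kept as I in the complement (an assumption made for convenience), and all function names are hypothetical.

```python
# Complement of each IUPAC nucleotide code; "I" (inosine) is kept as-is.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
    "S": "S", "W": "W", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N", "I": "I",
}

# Number of bases each code stands for (inosine counted as one species).
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "K": 2, "M": 2, "S": 2, "W": 2,
    "B": 3, "V": 3, "D": 3, "H": 3, "N": 4,
}


def reverse_complement(seq):
    """Reverse complement of a sequence written 5' to 3' with IUPAC codes."""
    cleaned = seq.replace(" ", "").upper()
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(cleaned))


def pool_size(seq):
    """Number of distinct sequences in a degenerate oligonucleotide pool."""
    size = 1
    for base in seq.replace(" ", "").upper():
        size *= IUPAC_DEGENERACY[base]
    return size


# Forward primer P2.79.3 as read from Table 25.
p2_79_3 = "TTT ATT GTK TAT ACY AGY YTR GGY GTK AAT CCR AAT AAT"

print(reverse_complement(p2_79_3))
print(pool_size(p2_79_3))
```

Run on P2.79.3, the routine reproduces the sequence listed for the reverse primer P2.79R.CB, which is consistent with the statement that the reverse complements of the forward pools were synthesized. The pool size gives a feel for how many distinct oligonucleotides each degenerate primer represents.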
The 2500 bp PCR products were ligated to the plasmid vector pCRII (Invitrogen, San Diego, CA) according to the supplier's instructions, and the DNA sequences across the ends of the insert fragments of two isolates (HS24 and HS27) were determined using the supplier's recommended primers and the sequencing methods described previously. The sequence of both isolates was the same. New primers were synthesized based on the determined sequence, and used to prime additional sequencing reactions to obtain a total of 2557 bases of the insert [SEQ ID NO:36].
Translation of the partial peptide encoded by SEQ ID NO:36 yields the 845 amino acid sequence disclosed as SEQ ID NO:37.
Protein homology analysis of this portion of the TcdAii peptide fragment reveals substantial amino acid homology (68% similarity; 53% identity) to residues 542 to 1390 of protein TcbA [SEQ ID NO:12]. It is therefore apparent that the gene represented in part by SEQ ID NO:36 produces a protein of similar, but not identical, amino acid sequence as the TcbA protein, and which likely has similar, but not identical biological activity as the TcbA protein.
In yet another instance, a gene encoding the peptides TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described as SEQ ID NO:39 (internal peptide TcdAii-PK44 sequence), and SEQ ID NO:41 (TcdAiii 58 kDa N-terminal peptide sequence) was isolated.
Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences described as SEQ ID NO:39 (Table 27) and SEQ ID NO:41 (Table 26), and the reverse complements of those sequences, were synthesized as described in Example 8, and their DNA sequences are given below.
Table 26
Degenerate Oligonucleotide for SEQ ID NO:41

Codon         1 2 3 4 5 6 7 8 9 10 11 12 13 14
Amino Acid    Iao kg Bar h a Ag Sr A IaG 1 UM 7r ASI LuI I za Pro Gn
A2.1          5' YIR OGY AGY GCI AAT PyC YIR CI GAT YtIR MTIT Y1IR GXR CA 3'
A2.2          I I I I P I I AAT ACI YIER ACI GAY YIR TIY YIR CrI CA 3'
A2.3.R        5' 7 I YOG I YAR I AAA I YAR IC rI ~r IYAR I F3r Fa IM F& rM 3'
A2.4.R        5' MIt I AG I AAA I CAG RIC I IGr CAG I I W Icx 3'

Table 27
Degenerate Oligonucleotide for SEQ ID NO:39

Amino Acid    (10) (11) (12) (13) (14) (15) (16)
Codon         1    2    3    4    5    6    7    8    9
Amino Acid    Gly  Pro  Val  Glu  Ile  Asn  Thr  Ala  Ile
A1.44.1       5' GGY CCR GTK GAA ATT AAT ACC GCI AT 3'
A1.44.1R      5' ATI GCG GTA TTA ATT TCM ACY GGR CC 3'
A1.44.2       5' GGI CCI GTI GAR ATY AAY ACI GCI AT 3'
A1.44.2R      5' ATI GCI GTR TTR ATY TCI ACI GGI CC 3'

Polymerase Chain Reactions (PCR) were performed essentially as described in Example 8, using as forward primers A1.44.1 or A1.44.2, and reverse primers A2.3R or A2.4R, in all forward/reverse combinations, using Photorhabdus W-14 genomic DNA as template. In another set of reactions, primers A2.1 or A2.2 were used as forward primers, and A1.44.1R and A1.44.2R were used as reverse primers in all forward/reverse combinations.
Only in the reactions containing A1.44.1 or A1.44.2 as the forward primers combined with A2.3R as the reverse primer was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) of 1400 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PK44 lies amino-proximal to the 58 kDa peptide fragment of TcdAiii.
The 1400 bp PCR products were ligated to the plasmid vector pCRII according to the supplier's instructions. The DNA sequences across the ends of the insert fragments of four isolates were determined using primers similar in sequence to the supplier's recommended primers and using sequencing methods described previously. The nucleic acid sequence of all isolates differed as expected in the regions corresponding to the degenerate primer sequences, but the amino acid sequences deduced from these data were the same as the actual amino acid sequences for the peptides determined previously (SEQ ID NOS:41 and 39).
Screening of the W-14 genomic cosmid library as described in Example 8 with a radiolabeled probe comprised of the DNA prepared above (SEQ ID NO:36) identified five hybridizing cosmid isolates, namely 17D9, 20B10, 21D2, 27B10, and 26D1. These cosmids were distinct from those previously identified with probes corresponding to the genes described as SEQ ID NO:11 or SEQ ID NO:25. Restriction enzyme analysis and DNA blot hybridizations identified three EcoR I fragments, of approximate sizes 3.7, 3.7, and 1.1 kbp, that span the region comprising the DNA of SEQ ID NO:36. Screening of the W-14 genomic cosmid library using as probe the radiolabeled 1.4 kbp DNA fragment prepared in this example identified the same five cosmids (17D9, 20B10, 21D2, 27B10, and 26D1). DNA blot hybridization to EcoR I-digested cosmid DNAs also showed hybridization to the same subset of EcoR I fragments as seen with the 2.5 kbp TcdAii gene probe, indicating that both fragments are encoded on the genomic DNA.
DNA sequence determination of the cloned EcoR I fragments revealed an uninterrupted reading frame of 7551 base pairs (SEQ ID NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ ID NO:47). Analysis of the amino acid sequence of this protein revealed all expected internal fragments of peptides TcdAii (SEQ ID NOS:17, 18, 37, 38 and 39) and the TcdAiii peptide N-terminus (SEQ ID NO:41) and all TcdAiii internal peptides (SEQ ID NOS:42 and 43). The peptides isolated and identified as TcdAii and TcdAiii are each products of the open reading frame, denoted tcdA, disclosed as SEQ ID NO:46. Further, SEQ ID NO:47 shows, starting at position 89, the sequence disclosed as SEQ ID NO:13, which is the N-terminal sequence of a peptide of size approximately 201 kDa, indicating that the initial protein produced from SEQ ID NO:46 is processed in a manner similar to that previously disclosed for SEQ ID NO:12. In addition, the protein is further cleaved to generate a product of size 209.2 kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID NO:49 (TcdAii peptide), and a product of size 63.6 kDa, encoded by SEQ ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus, it is thought that the insecticidal activity identified as toxin A (Example 15) derives from the products of SEQ ID NO:46, as exemplified by the full-length protein of 282.9 kDa disclosed as SEQ ID NO:47, which is processed to produce the peptides disclosed as SEQ ID NOS:49 and 51. It is thought that the insecticidal activity identified as toxin B (Example 15) derives from the products of SEQ ID NO:11, as exemplified by the 280.6 kDa protein disclosed as SEQ ID NO:12. This protein is proteolytically processed to yield the 207.6 kDa peptide disclosed as SEQ ID NO:53, which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide having N-terminal sequence disclosed as SEQ ID NO:40, and further disclosed as SEQ ID NO:55, which is encoded by SEQ ID NO:54.
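The figures quoted for the tcdA reading frame can be cross-checked with simple arithmetic. The sketch below uses only values stated in the paragraph above; it assumes that the 7551 base pairs include one stop codon, and that the mass not accounted for by the two processed peptides corresponds to the 88 residues preceding position 89. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def coding_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


ORF_BP = 7551          # SEQ ID NO:46
FULL_LENGTH_KDA = 282.9  # SEQ ID NO:47
TCDAII_KDA = 209.2     # SEQ ID NO:49
TCDAIII_KDA = 63.6     # SEQ ID NO:51

residues = coding_residues(ORF_BP)
mean_residue_da = FULL_LENGTH_KDA * 1000 / residues

# Mass not present in the two processed peptides.
unaccounted_kda = FULL_LENGTH_KDA - (TCDAII_KDA + TCDAIII_KDA)

# Approximate mass of an 88-residue N-terminal segment (assumed leader).
leader_kda = 88 * mean_residue_da / 1000

print(residues, round(mean_residue_da, 1))
print(round(unaccounted_kda, 1), round(leader_kda, 1))
```

The reading frame length agrees with the stated 2516 amino acids, and the roughly 10 kDa difference between the full-length protein and the two processed peptides is close to the estimated mass of an 88-residue N-terminal segment, which fits the processing scheme described in the text.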
Amino acid sequence comparisons between the proteins disclosed as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have 69% similarity and 54% identity. This high degree of evolutionary relationship is not uniform throughout the entire amino acid sequence of these peptides, but is higher towards the carboxy-terminal end of the proteins, since the peptides disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID (derived from SEQ ID NO:12) have 76% similarity and 64% identity.
Example 18
Control of European Cornborer-Induced Leaf Damage on Maize Plants by Spray Application of Photorhabdus (Strain W-14) Broth

The ability of Photorhabdus toxin(s) to reduce plant damage caused by insect larvae was demonstrated by measuring leaf damage caused by European corn borer (Ostrinia nubilalis) infested onto maize plants treated with Photorhabdus broth. Fermentation broth from Photorhabdus strain W-14 was produced and concentrated approximately 10-fold using ultrafiltration (10,000 MW pore-size) as described in Example 13. The resulting concentrated broth was then filter sterilized using 0.2 micron nitrocellulose membrane filters. A similarly prepared sample of uninoculated 2% proteose peptone #3 was used for control purposes. Maize plants (a DowElanco proprietary inbred line) were grown from seed to vegetative stage 7 or 8 in pots containing a soilless mixture in a greenhouse (27 °C day; 22 °C night, about 50% RH, 14 hr daylength, watered/fertilized as needed). The test plants were arranged in a randomized complete block design (3 reps/treatment, 6 plants/treatment) in a greenhouse with temperature about 22 °C day; 18 °C night, no artificial light and with partial shading, about 50% RH and watered/fertilized as needed. Treatments (uninoculated media and concentrated Photorhabdus broth) were applied with a syringe sprayer, 2.0 mls applied from directly (about 6 inches) over the whorl and 2.0 additional mls applied in a circular motion from approximately one foot above the whorl.
In addition, one group of plants received no treatment. After the treatments had dried (approximately 30 minutes), twelve neonate European corn borer larvae (eggs obtained from commercial sources and hatched in-house) were applied directly to the whorl.
After one week, the plants were scored for damage to the leaves using a modified Guthrie Scale (Koziel, M., Beland, G. L., Bowman, Carozzi, N., Crenshaw, Crossland, Dawson, Desai, Hill, Kadwell, Launis, Lewis, K., Maddox, McPherson, Meghji, M., Merlin, Rhodes, R., Warren, G., Wright, M. and Evola, S. V. (1993) Bio/Technology, 11, 194-195) and the scores were compared statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range (HSD) Test]. The results are shown in Table 28. For reference, a score of 1 represents no damage, a score of 2 represents fine "window pane" damage on the unfurled leaf with no pinhole penetration and a score of 5 represents leaf penetration with elongated lesions and/or mid rib feeding evident on more than three leaves (lesions 1 inch). These data indicate that broth or other protein-containing fractions may confer protection against specific insect pests when delivered in a sprayable formulation or when the gene or derivative thereof, encoding the protein or part thereof, is delivered via a transgenic plant or microbe.
Table 28
Effect of Photorhabdus Culture Broth on European Corn Borer-Induced Leaf Damage on Maize

Treatment               Average Guthrie Score
No Treatment            5.02 a
Uninoculated medium     5.15 a
Photorhabdus Broth      2.24 b

Means with different letters are statistically different (p<0.05 or p<0.1).
Example 19
Genetic Engineering of Genes for Expression in E. coli

Summary of constructions

A series of plasmids was constructed to express the tcbA gene of Photorhabdus W-14 in Escherichia coli. A list of the plasmids is shown in Table 29. A brief description of each construction follows as well as a summary of the E. coli expression data obtained.
Table 29
Expression plasmids for the tcbA gene.

Plasmid          Gene    Vector/Selection    Compartment
pDAB634          tcbA    pBC/Chl             Intracellular
pAcGP67B/tcbA    tcbA    pAcGP67B/Amp        Baculovirus, secreted
pDAB635          tcbA    pET27b/Kan          Periplasm
pET15-tcbA       tcbA                        Intracellular

Abbreviations: Kan=kanamycin, Chl=chloramphenicol, Amp=ampicillin

Construction of pDAB634

In Example 9, a large EcoR I fragment which hybridizes to the TcbAii probe is described. This fragment was subcloned into pBC (Stratagene, La Jolla CA). Sequence analysis indicates that this fragment is 8816 base pairs. The fragment encodes the tcbA gene with the initiating ATG at position 571 and the terminating TAA at position 8086. The fragment therefore carries 570 base pairs of Photorhabdus DNA upstream of the ATG and 730 base pairs downstream of the TAA.
Construction of Plasmid pAcGP67B/tcbA

The tcbA gene was PCR amplified using the following primers: 5' primer (SlAc51) 5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3' and 3' primer (SlAc31) 5' TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG 3'. PCR was performed using a TaKaRa LA PCR kit from PanVera (Madison, Wisconsin) in the following reaction: 57.5 µl water, 10 µl 10X LA buffer, 16 µl dNTPs (2.5 mM each stock solution), 20 µl each primer at 10 pmoles/µl, 300 ng of the plasmid pDAB634 containing the W-14 tcbA gene and one µl of TaKaRa LA Taq polymerase. The cycling conditions were 98 °C/ sec, 68 °C/5 min, 72 °C/10 min for 30 cycles. A PCR product of the expected size, about 7526 bp, was isolated in a 0.8% agarose gel in TBE (100 mM Tris, 90 mM boric acid, 1 mM EDTA) buffer and purified using a Qiaex II kit from Qiagen (Chatsworth, California). The purified tcbA gene was digested with Nco I and Not I, ligated into the baculovirus transfer vector pAcGP67B (PharMingen, San Diego, California) and transformed into DH5α E. coli. The tcbA gene was then cut from pAcGP67B and transferred to pET27b to create plasmid pDAB635. A missense mutation in the tcbA gene was repaired in pDAB635.
The repaired tcbA gene contains two changes from the sequence shown in SEQ ID NO:11: an A>G at 212 changing an asparagine 71 to serine 71 and a G>A at 229 changing an alanine 77 to threonine 77. These changes are both upstream of the proposed TcbAii N-terminus.
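Two details of this construction lend themselves to a quick check: whether the PCR primers carry the Nco I and Not I recognition sites used for cloning, and whether the stated nucleotide positions of the two base changes fall in the stated codons. The sketch below assumes that nucleotide numbering starts at the A of the initiating ATG; the primer sequences are those given above, the recognition sequences are the standard ones for these enzymes, and the helper names are hypothetical.

```python
NCO_I = "CCATGG"     # Nco I recognition sequence
NOT_I = "GCGGCCGC"   # Not I recognition sequence

FIVE_PRIME_PRIMER = "TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC"
THREE_PRIME_PRIMER = "TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG"


def has_site(primer, site):
    """True if the recognition site occurs in the primer (spaces ignored)."""
    return site in primer.replace(" ", "").upper()


def codon_of(nt_position):
    """Map a 1-based coding-sequence position to (codon number, base in codon)."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


print(has_site(FIVE_PRIME_PRIMER, NCO_I), has_site(THREE_PRIME_PRIMER, NOT_I))
print(codon_of(212), codon_of(229))
```

With that numbering, position 212 is the second base of codon 71 and position 229 is the first base of codon 77. An A>G change at the second base of an asparagine codon (AAY) gives a serine codon (AGY), and a G>A change at the first base of an alanine codon (GCN) gives a threonine codon (ACN), so the stated amino acid changes are consistent with the stated positions.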
Construction of pET15-tcbA

The tcbA coding region of pDAB635 was transferred to the expression vector. This was accomplished using shotgun ligations; the DNAs were cut with restriction enzymes Nco I and Xho I. The resulting recombinant is called pET15-tcbA.

Expression of TcbA in E. coli from plasmid pET15-tcbA

Expression of tcbA in E. coli was obtained by modification of the methods previously described by Studier et al. (Studier, Rosenberg, Dunn, and Dubendorff, (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol., 185: 60-89.). Competent E. coli cells strain BL21(DE3) were transformed with plasmid pET15-tcbA and plated on LB agar containing 100 µg/ml ampicillin and 40 mM glucose. The transformed cells were plated to a density of several hundred isolated colonies/plate. Following overnight incubation at 37 °C the cells were scraped from the plates and suspended in LB broth containing 100 µg/ml ampicillin. Typical culture volumes were from 200-500 ml. At time zero, culture densities (OD600) were from 0.05-0.15 depending on the experiment. Cultures were shaken at one of three temperatures (22 °C, 30 °C or 37 °C) until a density of 0.15-0.5 was obtained, at which time they were induced with 1 mM isopropylthio-β-galactoside (IPTG). Cultures were incubated at the designated temperature for 4-5 hours and then were transferred to 4 °C until processing (12-72 hours).
Purification and characterization of TcbA expressed in E. coli from Plasmid pET15-tcbA

E. coli cultures expressing TcbA peptides were processed as follows. Cells were harvested by centrifugation at 17,000 x g and the media was decanted and saved in a separate container.
The media was concentrated about 8x using the M12 (Amicon, Beverly MA) filtration system and a 100 kD molecular mass cut-off filter. The concentrated media was loaded onto an anion exchange column and the bound proteins were eluted with 1.0 M NaCl. The NaCl elution peak was found to cause mortality against Southern corn rootworm (SCR) larvae (Table 30). The 1.0 M NaCl fraction was dialyzed against 10 mM sodium phosphate buffer pH 7.0, concentrated, and subjected to gel filtration on Sepharose CL-4B (Pharmacia, Piscataway, New Jersey). The region of the CL-4B elution profile corresponding to the same calculated molecular weight (about 900 kDa) as the native W-14 toxin complex was collected, concentrated and bioassayed against larvae. The collected 900 kDa fraction was found to have insecticidal activity (see Table 30 below), with symptomology similar to that caused by native W-14 toxin complex. When this fraction was subjected to Proteinase K or heat treatment, the activity in both cases was either eliminated or reduced, providing evidence that the activity is proteinaceous in nature. In addition, the active fraction tested immunologically positive for the TcbA and TcbAiii peptides in immunoblot analysis when tested with an anti-TcbAiii monoclonal antibody (Table 30).

Table 30
Results of Immunoblot and SCR Bioassays.

Columns: Fraction; SCR Activity (Mortality, Growth Inhibit.); Immunoblot Peptides Detected; Native Size (CL-4B Estimated Size)

Fraction                          Peptides Detected    Native Size
TcbA Media 1.0 M Ion Exchange     TcbA
TcbA Media CL-4B                  TcbA, TcbAiii        ~900 kDa
TcbA Media CL-4B Proteinase K     NT
TcbA Media CL-4B heat treatment   NT
TcbA Cell Sup CL-4B               NT                   ~900 kD

PK = Proteinase K treatment, 2 hours; Heat treatment = 100 °C for minutes; ND = None Detected; NT = Not Tested. Scoring system for mortality and growth inhibition as compared to control samples.

The cell pellet was resuspended in 10 mM sodium phosphate buffer, pH 7.0, and lysed by passage through a Bio-Neb cell nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were treated with DNase to remove DNA and centrifuged at 17,000 x g to separate the cell pellet from the cell supernatant. The supernatant fraction was decanted and filtered through a 0.2 micron filter to remove large particles and subjected to anion exchange chromatography. Bound proteins were eluted with 1.0 M NaCl, dialyzed and concentrated using Biomax (Millipore Corp, Bedford, MA) concentrators with a molecular mass cut-off of 50,000 Daltons. The concentrated fraction was subjected to gel filtration chromatography using Sepharose CL-4B beaded matrix.
Bioassay data for material prepared in this way is shown in Table 30 and is denoted as "TcbA Cell Sup".
In yet another method to handle large amounts of material, the cell pellets were re-suspended in 10 mM sodium phosphate buffer, pH 7.0 and thoroughly homogenized by using a Kontes Glass Company (Vineland, NJ) 40 ml tissue grinder. The cellular debris was pelleted by centrifugation at 25,000 x g and the cell supernatant was decanted, passed through a 0.2 micron filter and subjected to anion exchange chromatography using a Pharmacia 10/10 column packed with Poros HQ 50 beads. The bound proteins were eluted by performing a NaCl gradient of 0.0 to 1.0 M.
Fractions containing the TcbA protein were combined and concentrated using a 50 kDa concentrator and subjected to gel filtration chromatography using Pharmacia CL-4B beaded matrix.
The fractions containing TcbA oligomer, molecular mass of approximately 900 kDa, were collected and subjected to anion exchange chromatography using a Pharmacia Mono Q 10/10 column equilibrated with 20 mM Tris buffer pH 7.3. A gradient of 0.0 to 1.0 M NaCl was used to elute recombinant TcbA protein.
Recombinant TcbA eluted from the column at a salt concentration of approximately 0.3-0.4 M NaCl, the same molarity at which native TcbA oligomer is eluted from the Mono Q 10/10 column. The recombinant TcbA fraction was found to cause SCR mortality in bioassay experiments similar to those in Table 30.

SEQUENCE LISTING

GENERAL INFORMATION:
APPLICANT:
Ensign, Jerald C
Bowen, David J
Petell, James
Fatig, Raymond
Schoonover, Sue
ffrench-Constant, Richard
Orr, Gregory L
Merlo, Donald J
Roberts, Jean L
Rocheleau, Thomas A
Blackburn, Michael B
Hey, Timothy D
Strickland, James A
(ii) TITLE OF INVENTION: Insecticidal Protein Toxins From Photorhabdus
(iii) NUMBER OF SEQUENCES: 61
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Quarles & Brady
STREET: 1 South Pinckney Street
CITY: Madison
STATE: WI
COUNTRY: US
ZIP: 53703
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE:
CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/063,615
FILING DATE: 18-MAY-1993
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/395,497
FILING DATE: 28-FEB-1995
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 60/007,255
FILING DATE: 06-NOV-1995
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/608,423
FILING DATE: 28-FEB-1996
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/705,484
FILING DATE: 28-AUG-1996
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Seay, Nicholas J
REGISTRATION NUMBER: 27386
REFERENCE/DOCKET NUMBER: 960296.93804
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 608-251-5000 TELEFAX: 608-251-9166 INFORMATION FOR SEQ ID NO:1: SEQUENCE CHARACTERISTICS: LENGTH: 11 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn

INFORMATION FOR SEQ ID NO:2:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp

INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala Leu Val Ala

INFORMATION FOR SEQ ID NO:4:
SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn

INFORMATION FOR SEQ ID NO:5:
SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Ala Gly Asp Thr Ala Asn Ile Gly Asp

INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile
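The N-terminal sequences in this listing use three-letter residue codes and each record declares a LENGTH. As a minimal sketch, a listed sequence can be converted to one-letter notation and checked against its declared length as shown below. The helper names (`parse_residues`, `to_one_letter`, `check_length`) are hypothetical and are not part of the original disclosure; the mapping is the standard three-letter to one-letter code for the 20 common amino acids.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_residues(text):
    """Split a three-letter listing into residues, skipping position numbers."""
    residues = []
    for token in text.split():
        if token.isdigit():
            continue  # position markers such as "5" or "10"
        if token not in THREE_TO_ONE:
            # Catches misread codes, e.g. "Gin" in place of "Gln".
            raise ValueError(f"unrecognized residue code: {token!r}")
        residues.append(token)
    return residues


def to_one_letter(text):
    """Return the one-letter string for a three-letter listing."""
    return "".join(THREE_TO_ONE[r] for r in parse_residues(text))


def check_length(text, declared_length):
    """Return True if the residue count matches the declared LENGTH."""
    return len(parse_residues(text)) == declared_length


# SEQ ID NO:6 as listed (declared LENGTH: 15 amino acids).
SEQ_ID_NO_6 = "Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile"

if __name__ == "__main__":
    print(to_one_letter(SEQ_ID_NO_6))     # LGGAATLLDLLLPQI
    print(check_length(SEQ_ID_NO_6, 15))  # True
```

Rejecting unknown tokens rather than skipping them is a deliberate choice here: a transcription error in a residue code then surfaces as an exception instead of silently shortening the sequence.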
INFORMATION FOR SEQ ID NO:7:
SEQUENCE CHARACTERISTICS:
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu

INFORMATION FOR SEQ ID NO:8:
SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met Asn Leu Ala Ser Pro Leu Ile Ser

INFORMATION FOR SEQ ID NO:9:
SEQUENCE CHARACTERISTICS:
LENGTH: 16 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser

INFORMATION FOR SEQ ID NO:10:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys Met Leu Arg Gly Val Asn

INFORMATION FOR SEQ ID NO:11:
SEQUENCE CHARACTERISTICS:
LENGTH: 7515 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..7515
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
ATG
Met 1 CAA AAC TCA Gin Asn Ser TCA AGC ACT ATC GAT ACT ATT TGT CAG Ser Ser Thr Ile Asp Thr Ile Cys Gin 10 AAA CTG Lys Leu cAA TTA ACT Gin Leu Thr CGG GAA AAA Arg Glu Lys CCG GCG GAA Pro Ala Glu ATT GCT TTG Ile Ala Leu TAT CCC TTT Tyr Pro Phe GAT ACT TTC Asp Thr Phe AAA CGG ATT Lys Arg Ile ACT CGG GGA ATG GTT AAT TGG GGG GAA Thr Arg Gly Met Val Asn Trp Gly Glu TAT GAA Tyr Glu 50 ATT GCA CAA GCG Ile Ala Gin Ala
GAA
Glu 55 CAG GAT AGA AAC Gin Asp Arg Asn
CTA
Leu CTT CAT GAA AAA Leu His Glu Lys CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu -119- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/18003 GGT ACC CGG CAA ATG TTG GGT TTT ATA Gly Thr Arg Gin Met Leu Gly Phe Ile GGT TAT AGT GAT Gly Tyr Ser Asp OTG TTT Leu Phe C-GT AAT CGT Gly Asn Arg TTC TCA CC-- Phe Ser Pro 115 GAT ?A.O TAT CO GCG COG GGC TCG GTT Asp Asn Tyr Aia Aia Pro Giy Ser Val GCA TCG ATG Aia Ser Met 110 GOC AAA AAC Aia Lys Asn CG GOT TAT TTC Aia Ala Tyr Leu
AOG
Thr 120 GAA TTC TAO CGT Giu Leu ly r Arg TTG OAT Leu His 130 GAO AGO AGO TOA Asp Ser Ser Ser TAT TAO OTA CAT Tyr Tyr Leu Asp
AAA
Lys 140 OGT OGO OOG CAT Arg Arg Pro Asp
TTA
Leu 145 GOA AGO TTA ATG Ala Ser Leu Met AGO CAC AAA A.AT Ser Gin Lys Asn
ATG
Met 155 CAT GAG CG)A ATT Asp Ciu iu Ile AC OTG CT OTO Thr Leu Ala Leu
TOT
Ser 165 A.AT CAA TTC TC Asn Ciu Leu Cys COOC CCCG ATO GAA Ala Cly Ile Giu ACA AA: Thr Lys 175 ACA CCA AAA Thr Cly Lys TTA ACT CCA Leu Ser Cly 195
TOA
Ser 180 CAA CAT GAA CTC Gin Asp Clu Val
ATC
Met 185 CAT ATC TTC TCA Asp Met Leu Ser ACT TAT CT Thr Tyr Arg 190 CTT CT GAA Val Ara Ciu GAG ACA OCT TAT Ciu Thr Pro 74yr CAC GOT TAT CAA His Ala 'y r Ciu ATO GTT Ile Val 210 CAT CAA CT CAT OCA CCA TTT CT OAT His Ciu Arg Asp Pro Cly Phe Arg His TOA CAC CA C00 Ser Gin Ala Pro
ATT
Ilie 225 CTT CT CT AAC Val Ala Ala Lys
OTO
Leu 230 CAT OCT CTC ACT Asp Pro Val Thr
TTC
Leu 235 TTC COT ATT AGO TOO Leu Ciy Ile Ser Ser 240 CAT ATT TOG OCA His Ilie Ser Pro CTC TAT AAO ?I'C Leu Tyr Asn Leu
OTC
Leu 250 ATT GAG GAG ATO Ile Clu Ciu Ile CCC CAA Pro Ciu 255 CA 'AT CAA Lys Asp Clu ATT ACT ACT Ilie Thr Thr 275 CG CTT~ CAT AC Ala Leu Asp Thr
OTT
Leu 265 TAT AAA ACA A-AC Tyr Lys Thr Asn TTT CCC CAT Phe Cly Asp 270 CCC TAT TAT Arg Tyr Tyr CT CAC TTA ATC Ala Gin Leu Met OCA ACT TAT OTC Pro Ser Tyr Leu CCC GC Cly Val 290 TOA CCC CAA CAT Ser Pro Clu Asp CO TAO CTC AC Ala Tyr Vai Thr TOA TTA TOA OAT Ser Leu Ser His
CTT
Vai 305 GCA TAT AGO ACT Cly Tyr Ser Ser
CAT
Asp 310 ATT CTC CTT ATT Ile Leu Val Ile Pro 315 TTC GC CAT CCT Leu Val Asp Cly- 960 1008 COT A.AC ATC GA-A Gly Lys Met Clu
GTA
ValI 325 GTT CT CTT ACC Vai Arg Val Thr ACA OCA TOG CAT Thr Pro Ser Asp AAT TAT Asn Tyr 335 -12 0- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/1 8003 ACC ACT CAG ACC AAT TAT ATT GAG CTC TAT CCA CAG Thr Ser Gi1n TAT TTG ATC T ,r Leu Ile 355 Asn Tyr Ile Ciu Leu 345 Tyr Pro Gin AAA TAC A.AT CTA Lys Th'r Asn Leu AAT ACT TTT CCT Asn Ser Phe Gly CCT CCC GAC AAT i056 Gly Cly Asp asn 350 TTC CAT GAT TTT 1i1'4 Leu Asp Asp Phe 365 GAG ATT CCC CAT 1i52 Ciu Ile Ala His TAT CTC ryr Leu 3-70 CAA TAT AAA CAT Gin Tyr Lys Asp TCC CT CAT TGG Ser Aia Asp Trp
ACT
Thr 380 A AT As n 385 CCC TAT CCT G.AT Pro Tyr Pro Asp
ATG
Met 390 CTC ATA AAT CAA Vai Ile Asn Gin TAT CAA TCA CAC Tyr Glu Ser Gin CCG 1200 Aia 400 ACA ATC AAA CCT Thr Ilie Lys Arg
ACT
Ser 405 GAC TCT CAC AAT ATA CTC ACT ATA Asp Ser Asp Asn Ile Leu Ser Ile AGA TGC CAT Arg Trp His GGT ACT TAT AAT Ciy Ser Tyvr Asn
TTT
Phe 425 CCC CCC GCC .AAT Ala Ala Ala Asn CCC TTA CAkA 1248 Gly Leu Gin 415 TTT AAA ATT 1296 Phe Lys Ilie 430 AAG CCT 'TT 1344 Lys Ala Ile GAO CAA TAC Asp Gin 'I yr 435 CCC TTC CTC Arg Leu Leu 450 TCC CCC AAA CCT Ser Pro Lys Ala CTC CTT AAA ATC Leu Leu Lys Met AAA CCT Acc Lys Ala Thr
GCC
C ly 455 CTC TCT TTT GCT Leu Ser Phe Ala
ACG
Thr 460 TTG GAG CCT AT'r Leu Ciu Arg I e 1392
CTT
Vali 465 CAT ACT GTT AAT Asp Ser Val Asn ACC AAA TCC ATC Thr Lys Ser Ile CTT GAG GTA TTA Val Giu Val Leu A.AC 1440 As n 480 AAG GTT TAT Lys Val Tyr GAG ACA CC Ciu Thr Ala CCC AJ\T CAC Cly Asn Gin 515 AAA TTC TAT ATT Lys Phe Tyr Ile
CAT
Asp 490 CCT TAT GC Arg 7y r Cly CCT ATT TTG GCT AAT Ala Ile Leu Ala Asn
ATT
Ile 505 AAT ATC TCT CAC Asn Ile Ser Gin ATC ACT CAA 1488 Ile Ser Ciu 495 CAA CCT CTT 1536 Gin Ala Vai 510 CCG CCC CTC 1584 Pro Pro Leu CTT ACC CAG TTT Leu Ser Gin Phe CAA CTA TTT A.AT Gin Leu Phe Asn A.AT CCT Asn Gly 530 ATT CCC TAT GAA ATC ACT GAG GAC AAC Ile Arg Tyr Clu Ile Ser Giu Asp Asn
TCC
Ser 540 AAA CAT CTT COT 1632 Lys His Leu Pro
AAT
As n 545 CCT CAT CTG AAC Pro Asp Leu Asn
CTT
Leu 550 AAA CCA CAC ACT Lys Pro Asp Ser GCT CAT CAT CAA Ciy Asp Asp Gin 1680 AAC CC CTT TTA Lys Ala Val Leu CCC CC TTT CAG Arg Ala Phe Gin AAC CCC ACT Asn Ala Ser GAG TTG TAT 1728 Clu Leu Tyr 575 ATC AAA AAT 1776 Ile Lys Asn 590 CAC ATC TTA Cmn Met Leu ATC ACT CAT CGT AAA GA.A CAC GCT CTT Ile Thr Asp Arg Lys Ciu Asp Gly Val 585 AP AC TTA GAG AAT TTG TCT AT CTG TAT TTG GTT AT TTG CTG CC CAC 1824 Asn Leu Giu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gin -121- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/EUS96/1 8003 ATT CAT Ile His 610 AAC CTG ACT ATT Asn Leu Thr Ile
OCT
Al a 615 GAA TTG AAC ATT TTG TTG GTG ATT TGT 1872 Glu Leu Asn Ilie Leu Leu Val Ile Cys 620
GC
G, 1 625 TAT GGC GAC ACC Ty r Gly Asp Thr ATT TAT CAC ATT Ile Tyr Gin Ile
ACC
Thr 635 GAC GAT AAT TTA Asp Asp Asn Leu GCC 1920 Ala 640 A-AA ATA GTG GAA Lys Ile Val Giu TTG TTG TGG ATC Leu Leu Trp Ile
ACT
Thr 650 C AA: TOG TTG Gin Trp Leu TGG ACA Lys Trp Thr ACC ACT TTA Thr Thr Leu 675 ACC GAC CTG TTT Thr Asp Leu Phe ATG ACC ACC CC Met Thr Thr Ala AAG ACC CAA 1368 Lys Thr Gin 655 ACT TAC AGC 2016 Thr Tyr Ser 670 TTG TCT TCA 2064 Leu Ser Ser ACG CCA GA.A ATT Thr Pro Glu Ile AAT CTG AG GCT Asn Leu Thr Ala ACT TTC Thr Leu 090 CAT GCC A.AA GAG AGT CTG ATT GGG A His Gly Lys Glu Ser Leu Ile Gly Clu 695
CAT
Asp 700 CTG AAA AGA GCA 2112 Leu Lys Arg Ala CC CCT TG TTC Ala Pro Cys Phe
ACT
Thr 710 TCO GCT TTG CAT Ser Ala Leu His ACT TCT CAA GAA Thr Ser Gin Clu OTT 2160 Val1 720 GC TAT GAC CTO Ala Tyr Asp Leu
CTG
Leu 7,25 TTG TOG ATA GAC Leu Trp Ile Asp ATT CA.A CCC OCA Ilie Gin Pro Ala CAA ATA 2208 Gin Ile 735 ACT OTT CAT CCC TTT TCG GAA CAA Thr Val Asp Gly Phe Trp Clu Ciu 740 CAA ACA ACA CCA Gin Thr Thr Pro ACC ACC TTG 2256 Thr Ser Leu 750 CTC ATC TAT 2304 Leu Ile Tyr A.AG GTG ATT Lys Val Ilie 755 ACC TTT OCT CAC Thr Phe Ala Gin CTC CCA CAA TTC Leu Ala Gin Leu CGT CGT Arg Arg 770 ATT CCC TTA ACT Ilie Cly Leu Ser ACC CAA CTC TCA Thr Clu Leu Ser
CTG
Leu 780 ATC CTC ACT CAA 2352 Ile Val Thr Gin
TCT
S er 785 TCT CTC CTA OTG Ser Leu Leu Val
GCA
Ala 790 GCC AAkA AOC ATA Cly Lys Ser Ile CAT CAC OCT CTC Asp His Gly Leu TTA 2400 Leu 800 ACC CTC ATG CC Thr Leu Met Ala CAA GCT TTT CAT Ciu Cly Phe His TOO CTT AAT Trp Val Asn CAA CAT CC Gin His Ala OTT ACC CAT Val Thr Asp 835
TC
S er 820 TTC ATA TTG CC Leu Ile Leu Ala TTC AAA GCG A Leu Lys Asp Gly CGC TTC CCC 2448 Gly Leu Cly 815 CCC TTC ACA 2496 Ala Leu Thr 830 CTC CTA CAA 2544 Leu Leu Cmn OTA GA CAA OCT Val Ala Gin Ala A.AT AAG GAC GAA Asn Lys Ciu Oiu ATO CCA Met Ala 850 GCT AAT CAC OTC GAO AAC CAT CTA ACA Ala Asn Gin Val Glu Lys Asp Leu Thr
AAA
Ly s 860 CTC ACC ACT TOG 2592 Leu Thr Ser Trp -122- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003
ACA
Thr 865 CAG ATT GAC GCT Gin Ile Asp Ala ATT CTG Ile Leu 870 CAA TGG TTA Gin Trp Leu CAG ATG Gin Met 875 TCT TCG GCC TTG 2640 Ser Ser Ala Leu 880 GCG GTT TCT CCA Ala Val Ser Pro GAT CTG GCA GGG Asp Leu Ala Gly ATG GCC CTG AAA Met Ala Leu Lys TAT GGG 2688 Tyr Gly 895 CTG ATG 2736 Leu Met ATA GAT CAT AAC TAT GCT GCC TGG Ile Asp His Asn Tr Ala Ala Trp 300 CAA GCT GCG GCG GCT GCG Gin Ala Ala Ala Al Aa Ala 905 910 GCT GAT CAT Ala Asp His 915 GCT AAT CAG GCA Ala Asn Gin Ala AAA AAA CTG GAT Lys Lys Leu Asp GAG ACG TTC AGT 2784 Glu Thr Phe Ser 925 GAT AGT GCT GCT 2832 Asp Ser Ala Ala AAG GCA Lys Ala 930 TTA TGT AAC TAT Leu Cys Asn Tyr
TAT
Tyr 935 ATT AAT GCT GTT Ile Asn Ala Val
GGA
Gly 945 GTA CGT GAT CGT Val Arg Asp Arg
AAC
Asn 950 GGT TTA TAT ACC Gly Leu Tyr Thr TTG CTG ATT GAT Leu Leu Ile Asp AAT 2880 Asn 960 CAG GTT TCT GCC Gin Val Ser Ala GGT ATT CAA CTG Gly Ile Gin Leu GTG ATC ACT TCA Val Ile Thr Ser ATT GCA GAA GCT Ile Ala Glu Ala ATC GCC 2928 Ile Ala 975 TAC GTT AAC CGG GCT TTA AAC CGA GAT Tyr Val Asn Arg Ala Leu Asn Arg Asp 985 GAA GGT CAG 2976 Glu Gly Gin 990 CTT GCA TCG Leu Ala Ser 995 GAC GTT AGT ACC Asp Val Ser Thr CGT CAG TTC TTC ACT Arg Gin Phe Phe Thr 1000 GAC TGG GAA CGT 3024 Asp Trp Glu Arg 1005 TAC AAT AAA CGT TAC AGT Tyr Asn Lys Arg Tyr Ser 1010 ACT TGG GCT GGT GTC Thr Trp Ala Gly Val 1015 TCT GAA CTG GTC TAT 3072 Ser Glu Leu Val Tyr 1020 TAT CCA GAA AAC TAT Tyr Pro Glu Asn Tyr 1025 GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC Val Asp Pro Thr Gin Arg Ile Gly Gin Thr 1030 1035 AAA 3120 Lys 1040 ATG ATG GAT GCG Met Met Asp Ala CTG TTG CAA TCC ATC Leu Leu Gin Ser Ile 1045 AAC CAG AGC CAG CTA Asn Gin Ser Gin Leu 1050 AAT GCG 3168 Asn Ala 1055 GAT ACG GTG Asp Thr Val GAA GAT Glu Asp 1060 GCT TTC AAA Ala Phe Lys ACT TAT TTG ACC AGC TTT GAG CAG 3216 Thr Tyr Leu Thr Ser Phe Glu Gin 1065 1070 GTA GCA AAT CTG AAA GTA ATT Val Ala Asn Leu Lys Val Ile 1075 AGT GCT TAC CAC GAT Ser Ala Tyr His Asp 1080 AAT GTG AAT GTG 3264 Asn Val Asn Val 1085 GAT CAA GGA Asp Gin Gly 1090 TTA ACT TAT Leu Thr Tyr TTT ATC Phe Ile 1095 GGT ATC GAC CAA GCA GCT CCG GGT 3312 Gly Ile Asp Gin Ala Ala Pro Gly 1100 ACG TAT TAC TGG Thr Tyr Tyr Trp 1105 CGT AGT GTT GAT CAC AGC Arg Ser Val Asp His Ser 1110 AAA TGT GAA AAT GGC Lys Cys Glu Asn Gly 1115 AAG 3360 Lys 1120 TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC 3408 Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val -123- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 1125 1130 1135 AAT CCT TGG AAA AAT Asn Pro Trp Lys Asn 1140 ATC ATC CGT CCG GTT Ile Ile Arg Pro Val 1145 GTT TAT ATG TCC CGC TTA 3456 Val Tyr Met Ser Arg Leu 1150 TAT CTG CTA TGG T.r Leu Leu Trp 1155 CTG GAG CAG CAA TCA Leu Glu Gin Gin Ser 1160 AAG AAA AGT Lys 
Lys Ser GAT GAT GGT AAA 3504 Asp Asp Gly Lys 1165 ATT CGT TAC GAC 3552 Ile Arg Tyr Asp ACC ACG ATT Thr Thr Ile 1170 TAT CAA TAT Tyr Gin Tyr AAC TTA Asn Leu 1175 AAA CTG GCT Lys Leu Ala
CAT
His 1180 GGT AGT Gly Ser 1185 TGG AAT ACA Trp Asn Thr
CCA
Pro 1190 TTT ACT TTT GAT GTG ACA GAA AAG GTA AAA 3600 Phe Thr Phe Asp Val Thr Glu Lys Val Lys 1195 1200 AAT TAC ACG TCG Asn Tyr Thr Ser ACT ACT Ser Thr 1205 GAT GCT GCT Asp Ala Ala
GAA
Glu 1210 TCT TTA GGG Ser Leu Gly TTG TAT TGT 3648 Leu Tyr Cys 1215 TAT TCG ATG 3696 Tyr Ser- Met 1230 ACT GGT TAT Thr Gly Tyr CAA GGG Gin Gly 1220 GAA GAC ACT Glu Asp Thr CTA TTA GTT ATG TTC Leu Leu Val Met Phe 1225 CAG AGT AGT TAT Gin Ser Ser Tyr 1235 AGC TCC TAT Ser Ser Tyr
ACC
Thr 1240 GAT AAT AAT GCG CCG GTC ACT GGG 3744 Asp Asn Asn Ala Pro Val Thr Gly 1245 CTA TAT ATT Leu Tyr Ile 1250 TTC GCT GAT Phe Ala Asp ATG TCA TCA GAC AAT ATG ACG AAT GCA CAA 3792 Met Ser Ser Asp Asn Met Thr Asn Ala Gln 1255 1260 GCA ACT Ala Thr 1265 AAC TAT TGG Asn Tyr Trp AAT AAC Asn Asn 1270 AGT TAT CCG CAA Ser Tyr Pro Gln 1275 TTT GAT ACT GTG Phe Asp Thr Val ATG 3840 Met 1280 GCA GAT CCG GAT Ala Asp Pro Asp AGC GAC AAT AAA AAA Ser Asp Asn Lys Lys 1285 GTC ATA Val Ile 1290 ACC AGA AGA Thr Arg Arg GTT AAT 3888 Val Asn 1295 AAC CGT TAT Asn Arg Tyr AGT AAT TAT Ser Asn Tyr 1315 AGT GTT CCT Ser Val Pro 1330 GCG GAG GAT TAT GAA Ala Glu Asp Tyr Glu 1300 ATT CCT TCC TCT GTG Ile Pro Ser Ser Val 1305 ACA AGT AAC 3936 Thr Ser Asn 1310 TCT TGG GGT GAT CAC AGT TTA ACC ATG Ser Trp Gly Asp His Ser Leu Thr Met 1320 AAT ATT ACT TTT GAA TCG GCG GCA GAA Asn Ile Thr Phe Glu Ser Ala Ala Glu 1335 134( CTT TAT GGT GGT 3984 Leu Tyr Gly Gly 1325 GAT TTA AGG CTA Asp Leu Arg Leu 0 4032 TCT ACC Ser Thr 1345 AAT ATG GCA TTG AGT ATT ATT CAT Asn Met Ala Leu Ser Ile Ile His 1350 AAT GGA TAT GCG GGA Asn Gly Tyr Ala Gly 1355 ACC 4080 Thr 1360 CGC CGT ATA CAA Arg Arg Ile Gin TGT AAT Cys Asn 1365 CTT ATG AAA Leu Met Lys CAA TAC GCT TCA TTA Gin Tyr Ala Ser Leu 1370 GGT GAT 4128 Gly Asp 1375 AAA TTT ATA ATT TAT GAT TCA TCA TTT GAT GAT GCA AAC Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn CGT TTT AAT 4176 Arg Phe Asn 1390 1380 1385 -124- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/UJS96/18003 CTG GTG rOC-A TTG Leu Val Pro Leu 1395 ATT TGT ATA TAT Ile Cys Ilie Tlyr 1410 TTT TOT TCG AAA Pile :ser Ser Lys 1425 TTT ;LA-A TTO GGA A Phe Lys Phle Gly L*.*s 14-0 AAT GAA AAC CT TCOr Asn Giu Asn Pro Ser 1415 GAO' GAG AA-C TCA GAT Asp CIu Asn Ser ?.sp 1405 GAT 47--l Asp Ser TOT GAA CAT Ser Glu Asp 1420 ALAG AAG TGG TAT 427.
Lys Lys Trp Tyjr AAT GGT GGA ACT 432.
Asn Cly GlIy Thr 1440 CAT GAO !-T Asp Asp Asn 1430 AAA ACA CG Lys Thr Al a GAT TAT PAs p Tlyr 1435 CAA TGT ATA CAT Gin Cys Ilie Asp CT GGA ACC AGT AAO Ala Cly Thr Ser Asn 1445
AAA
Ly s 1450 CAT TTT TAT Asp Phe Ty.,r GGG TAT TGC Gly Tyr Trp TAT AAT OTO 4 36.7.
Tlyr Asn Leu 1455 TOG AGT TAT 441-- Ser Ser 'rj r 1470 CAG GAG ATT GAA GTA ATT AGT Gin Ciu Ile Clu Val Ile Ser 1460 GTT ACT GGT Val Thr Gly 1465 AAA ATA TOO AAO Lys Ile Ser Asn 1475 COG ATT AAT ATC A.AT ACG GGO ATT Pro Ile Asn Ile Asn Thr Cly Ile 1480 GAT ACT GOT AAA 4 46 4 Asp Ser Ala Ly.,s 1485 GTA AAA GTO ,Val1 Lys Val 1490 CAT AAT AGT Asp Asn Ser 1505 ACC GTA AAA Thr Val Lys GOG GGT GGT GAO CAT Ala Gly Gly Asp Asp 1495 COAA ATO Gin Ile 1500 TTT ACT GOT 4512- Phle Thr Ala ACC TAT GTT COT Thr Tyr Val Pro 1510 CAG CAA COG Gin Gin Pro GOA COO Ala Pro 1515 ACT TTT GAG Ser Phle Glu GAG 456)- Glu 1520 ATG ATT TAT CAG TTO AAT AAC CTG ACA ATA CAT TGT AAC AAT TTA AAT 4 605 Met Ile Tyr Gin Phe Asn Asn Leu Thr Ilie Asp Cys Lys Asn Leu Asn 1525 1530 1535 TTO ATO GAO AAT CAG GOA CAT Phe Ile Asp Asn Gin Ala His 1540 ATT GAG ATT Ile Giu Ile 1545.
CAT TTC ACC GOT AOG GOA 46C% Asp Pile Thr Ala Thr Ala 1550 CAA CAT CCC OGA Gin Asp Cly Arg 1555 TTO TTC GCT Phe Leu Giy GCA CAA Ala Clu 1560 ACT TTT ATT Thr Phle Ile ATO CCC CTA ACT 4704 Ile Pro Val Thr 1565 TAT AGO CAA AAT 4 7 52 Tyr Ser Clu Asn AAA AA.A GTT Lys Lys Val 1570 OTO CGT ACT Leu Cly Thr GAG AAC Giu Asn 1575 GTG ATT CC Val Ile Ala
TTA
Leu 1580 AAO CGT Asn Cly 1585 GTT CAA TAT Val Gin Tyr ATC CAA Met Gin 1590 ATT CCC GCA Ile Cly Ala TAT CT ACC CT TTC AAT 4800 7yr Arg Thr Arg Leu Asn 1595 1600 ACC TTA TTO CT Thr Leu Phe Ala CAA CAC TTG CTT AGO Gin Gin Leu Val Ser 1605
CT
Arg 1610 CT AAT CT Ala Asn Arg CCC ATT CAT 4848 Cly Ile Asp 1615 CAA TTA GCA 4896 Gin Leu Gly 1630 GCA CTG OTO Ala Val Leu ACT ATG GAA ACT CAC Ser Met Giu Thr Gin 1620 AAT ATT CAG CAA COG Asn Ile Gin Giu Pro 1625 -CC CCC ACA TAT GTC; CAG Ala Gly Thr Tyr Val Gin 1635 OTT GTG TTG CAT A.AA TAT CAT GAG TOT ATT 4944 Leu Val Leu Asp Lys TIyr Asp Giu Ser Ile 1640 1645 OAT CCC ACT AAT AAA. AGO TTT CT ATT GAA TAT CTT CAT ATA TTT AAA 4992 His Gly Thr Asn Lys Ser Phe Ala Ile Giu Tyr Val Asp Ile Phe Lys -125- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 1655 1660 GAG AAC Clu Asn 1665 CAT AGT TTT CTG ATT TAT CAA Asp Ser Phe Val Ile 7T. Gin 1670 CGA .3AA CTT Gly Glu Leu 1675 AGC GAA ACA ACT 5040 Ser Clu Thr Ser 1680 CAA ACT CTT GTC ;in Thr Val Val A.A GTT Lys Val 1685 TTC TTA TCC Phe Leu Ser TAT TTT rr Phe 1690 ATA CAG CCG Ile clu Ala ACT GCA 5088 Thr Gly L695 AAT AAC AAC Asn Lys Asn CA TTA His Leu 1700 TG GTA CGT Trp Val Arg OCT AAA Ala Lys 1705 TAC C AA AAG G.A ACG M-T 5136 Tyr Gin Lys Glu Thr Thr 1710 CAT AAG ATC TTC Asp Lys Ile Leu 1715 TTC CAC CGT Phe Asp Arg ACT CAT Thr Asp 1720 GAG AAA CAT Clu Lys Asp CCC CAC GGT TGG 5184 Pro His Cly Trp 1725 TTT CTC AGC Phe Leu Ser 1730 CAC CAT CAC Asp Asp His AAG ACC Lys Thr 1735 TTT ACT GT Phe Ser Cly CTC TCT Leu Ser 1740 TCC CCA CAG 5232 Ser Ala Cn GCA TTA Ala Leu 1745 AAG AAC CAC Lys Asn Asp AGT CAA CCG ATG GAT Ser Clu Pro Met Asp 1750 TTC TCT Phe Ser 1755 CC CCC AAT Cly Ala Asn Gc'r 5280 Ala 1760 CTC TAT TTC TGC Leu Tyr Phe Trp CAA CTO TTC TAT TAC Clu Leu Phe Tyr Tyr 1765
ACC
Thr 1770 CCC ATC ATG Pro Met Met GCG AAC CAT Ala Asn His CGT TTC TTC Arg Leu Leu CAG GAA Gin Glu 1780 CAC AAT TTT Gin Asn Phe CAT GCG Asp Ala 1785 ATG GCT CAT 5328 Met Ala His 1775 TGC TTC CCT 5376 Trp Phe Ara 1790 ATT OCT ATC 5424 Ile Ala Ile TAT GTC TGG AGT Tyr Val Trp Ser 1795 CCA TCC GGT Pro Ser Cly TAT ATC CTT CAT GGT Ty,,r Ile Val Asp Gly 1800
AAA
Lys 1805 TAC CAC TGG Tyr His Trp 1810 AAC GTG CCA Asn Val Arg CCC CTC Pro Leu 1815 CAA CAA CAC ACC ACT TC AAT CCA 5472 Clu Clu Asp Thr Ser Trp Asn Ala 1820 CAA CAA Gin Gin 1825 CTG GAC TCC Leu Asp Ser ACC CAT CCA CAT CCT Thr Asp Pro Asp Ala 1830 CTA CCC Val Ala 1835 CAA CAT CAT Gin Asp Asp CCC 5520 Pro 1840 ATC CAC TAG AAG GTG GCT ACC TTT ATC Met His Tyr Lys Val Ala Thr Phe Met 1845 GCG ACC Ala Thr 1850 TTC CAT CTC Leu Asp Leu CTA ATG 5568 Leu Met 1855 CCC CCT GGT CAT CCT CCT TAG CGC Ala Arg Cly Asp Ala Ala Tyr Arg 1860 CAC TTA Gin Leu 1865 GAG CCT CAT Clu Arg Asp ACC TTG GCT 56ii Thr Leu Ala 1870 CAA OCT AAA ATC TGG Glu Ala Lys Met Trp 1875 CCA CAA GTG ATC CTC Pro Gin Val Met Leu 1890 CCT CCT TCA AAA ACC Ala Ala Ser Lys Thr 1905 TAT ACA CAG GCG Tyr Thr Gin Ala 1880 CTT AAT CTC Leu Asn Leu TTG GGT CAT CAC 5664 Leu Cly Asp Glu 1885 AGT ACC ACT TGG GCT AAT Ser Thr Thr Trp Ala Asn 1895 ACA CAC CAC CTT CCT CAC Thr Gin Gin Val Arg Gin 1910 191' CCA ACA TTG GGT AAT 5712 Pro.Thr Leu Cly Asn 1900 CAA GTG CTT ACC CAC 5760 Gin Val Leu Thr Gin 5 1920 -126- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96/18003 TT-, C-ZT -TC AA-T A.GC AGO Leu A.rg Leu Asn Ser A-r., 1925 3TA AAA ACC CC73 TTG '.ai Lys Thr Pro Leu 1930 CTA CGA ACA C A.T 5,08 Leu cly Thr Ala Asn 1935 TCC CTG ACC OCT TTA Ser Leu Thr Ala Leu 1940 TTC CTG CCG CAG GAA: Phe Leu Pro Gin Giu 1945 A.AT AGC zuXG Asn Ser Lys CTC AAA CGC 5856 Lell Lvs Gly 1950 TAC TCG COG ACA 7r'r Trp Arg Thr 1955 CTG GC CAG Lqu Ala Gin COT ATC Arg Met 1960 TTT A.AT TTA Phe Asn Leu CCT CAT A.AT CTG 5904 Arg His Asn Leu 1965 TCG ATT GAC Ser Ile Asp 1970 CCC CAC CCG Gly Gin Pro CTC TCC Leu Ser 1975 TTG CCG' CTC Leu Pro Leu TAT OCT T,'r Ala 1980 A.AA CCC G-OCT 5952 Lys Pro Ala CAT CCA Asp Pro 1985 AAA GCT TTA Lys Ala Leu CT,- ACT Leu Ser 1990 C GCG GTT Ala Ala Val TCA GCT TCT CAA GG Ser Ala Ser Gin Gly 1995 GCA 6000 C ly 2000 GCC GAO TTC CCC Ala Asp Leu Pro AAC CC Lys Ala 2005 CCGCOTC ACT Pro Leu Thr
ATT
Ile 2010 CAC CCC TTC COT His Arg Phe Pro CAA ATC 6048 Gin Met 2015 OTA SAA CCC Leu Ciu Cly GCA CCC CCC TTC GTT Ala Arg Gly Leu Val 2020 AAC CAC Asn Gin 2025 OTT ATA Leu Ile CAG TTO GT ACT 6096 Gin Phe Gly Ser 2030 CT ATG ACT CXA 6144 Ala Met Ser Gin 2045 TCA CTA TTG CCC Ser Leu Leu Gly 2035 TAO ACT GAG Tlyr Ser Giu CCT CAC CAT CC C-AA Arg Gin Asp Ala Giu 2040 CTA CTC Leu Leu 2050 CAC CAT Gin Asp 205 CAA ACC CAA GCC G3n Thr Gin Ala AGC GAG TTA ATA OTO Ser Giu Leu Ilie Leu 2055 ACC ACT ATT CCT ATC 6192 Thr Ser Ile Arg Met 2060 A.AC CAA TTG Asn Gin Leu CCA GAG Ala Giu 2070 CTG CAT TOG Leu Asp Ser CAA AAA ACC CCC TTC Giu Lys Thr Ala Leu 2075 CAA 6240 Gin 2080 CTC TCT TTA OCT V.,al Ser Leu Ala GA CTG CAA CAA CCC Cly Val Gin Gin Arg 2085 Phe 2090 GAC AGC TAT Asp Ser Tyr TAT GAG GAG Tyr Ciu Giu AAC ATC Asn Ile 2100 AAC GCA Asn Ala GOT GAG Cly Giu 2105 CAG GA Gin Gly 2120 CAC CCA CCTC Gin Arg Ala Leu ACC CAA CTC 6288 Ser Gin Leu 2095 C TTA CCC 6336 Ala Leu Arg 2110 CT ATC GCA 6384 Arg Met Ala TCA CAA TOT OCT Ser Clu Ser Ala 2115 ATT GAG TCT Ile Ciu Ser GCC CAG ATT Alia Gin Ile
TOO
Ser 2125 CCC CC GOT Gly Ala Cly 2130 OTT CAT ATC Val Asp Met CCA CCA AAT ATC TTC Ala Pro Asn Ile Phe 2135 CCC CTC CCT CAT CCC 6432 Cly Leu Ala Asp Cly 2140 CCC ATC Gly Met 2145 CAT TAT GOT His Tyr Cly OCT ATT CCC TAT CC Ala Ile Ala Tyr Ala 2150
ATC
le 2155 OCT GAC GOT ATT GAG 6480 Ala Asp iy Ile Ciu 2160 TTC ACT OCT TCT Leu Ser Ala Ser CCC AAG ATC OTT CAT Ala Lys Met Val Asp 2165 CC GAG AAA GTT CCT Ala Giu Lys Val Ala 2170 CAG TOG 6528 Gin Ser 2175 GAA ATA TAT Ciu Ilie Ty'r CCC COT CCC CO-T CAA CAA TCG AAA ATT CAG CCT CAC AAC 6576 Arg Arg Arg Arg Gin Giu Trp Lys Ile Gin Arg Asp Asn -127- SUBSTITUTE SHEET (RULE 26 WO 97/17432 WO 97/ 7432PCT/US96/1 8003 2180 GCA CAA GCG GAG ATT AAC Ala Gin Ala Giu lie Asn 2195 21355 CAG TTA AAC Gin Leu Asn 2200 2190 CO CAA CTG GAA TCA CTG TCT A64 Ala Gin Leu Glu Ser Leu Ser 2205 ATT COC CGT Ile Arg Arg 2210 GAA G-CC OCT Glu Ala Ala GAA ATG G lu Met 2215 CAA AAA GAG Gin Ly~s Glu TAC OTGO III r Leu 2220 AAA, ACC CAG 6672 Lys Thr Gin CAA OCT Gin Ala 2225 C-AG GCC CAC Gin Ala Gin GCA CAA Ala Gin 2230 CTT ACT TTC TTA AGA AGO %A TTC Leu Thr Phe Leu Arg Ser Lys Phe 2235 AGT 6 I Ser 2240 AAT CAA GCC TTA Asn GIn Ala Leu TAT AGT TGG TTA CGA Tyr Ser Trp Leu Arg 2245
GGG
G ly 2250 COT TTG TOA GGT Arg Leu Ser Gly ATT TAT 6768 Ile Tyr 2255 TTC CAG TT- Phe Gin Phe TAT GAC r'1'yr Asp 2260 TTG 0CC GTA TCA COT Leu Ala Val Ser Arg 2265 ICC CTG Cys Leu TCC TAT CAA TGG Ser Tyr GIn Trp 2275 CAA OCT AAT Glu Ala Asn GAT AAT Asp Asn 2280 TCC ATT AGC Ser Ile Ser ATG GCA GAO CAA 6a16 Met Ala Giu Gin 2270 TTT GTC AAA CCG 6864 Phe Val Lys Pro 2285 GGA GAA. GCT TTG 6912 Gly Olu Ala Leu GOT GCA TOG, Gly Ala Trp 2290 CAA GGA ACT Gln Gly Thr TAC GCC GGC TTA TTG Tlyr Ala Gly Leu Leu 2295
TGT
Cy s 2300 ATA CAA 11e Gin 2305 A.AT CTG GCA Asn. Leu Ala CAA ATO GAA GAG OCA Gin Met Glu Giu Ala 2310 TAT CTG AAA TOO OAA Tyr Leu Lys Trp Olu 2315 TCT 6960 Ser 2320 COC OCT TTG OAA Arg Ala Leu Olu GTA GAA Val Oiu 2325 CGC ACG OTT TCA TTG OCA GTG OTT Arg Thr Val Ser Leu Ala Val Val 2330 TAT OAT Tyr Asp 2335 TCA CTO OAA Ser Leu Glu OT A.AT OAT COT TTT Gly Asn Asp Arg Phe 2340 AAT TrA OCO OAA CAA Asn Leu Ala Olu GIn 2345 ATA CCT OCA 7096 Ile Pro Ala 2350 AAT GOG TTA Asn Gly Leu TTA TTO OAT AAG Leu Leu Asp Lys 2355 000 GAG OGA Gly Olu 017
ACA
Thr 2360 OCA OA ACT AAA Ala Oly Thr Lys
OAA
Oiu 236~ TCA TTO OCT Ser Leu Ala 2370 A.AT OCT ATC Asn Ala Ile CTO TCA Leu Ser 2375 OCT TCO OTC Ala Ser Val AAA TTO TCC OAC TTG 7152 Lys Leu Ser Asp Leu 2380 AAA CTO Lys Leu 2385 OGA ACG OAT Gly Thr Asp TAT COA GAC AGT ATC Tyr Pro Asp Ser Ile 2390 OTT GOT AGC AAC AAO Val Gly Ser Asn Lys 2395 OTT 7200 Val1 2400 COT COT ATT AAO Arg Arg Ile Lys CAA ATC Gln Ile 2405 ACT O-TT TCG CTA CCT GCA. TTG OTT 000 COT 7248 Ser Val Ser Leu Pro Ala Leu Val OhIv Pro 2410 2415 TAT CAG OAT OTT CAG GO-T ATO CTC ryr Cln Asp Ilal Gln Ala Met Leu 2420 ACC TAT GOT 000 ACT Ser Tlyr 017 Gly Ser 2425 ACT CAA TTG 7296 Thr GIn Leu 2430 O-CG AAhA GOT Pro Lys 01I'l 2435 TOT TCA 000 TTG Cys Ser Ala Leu OCT OTO TCT CAT GOT Ala Val Ser His Gly 2440 ACC AAT OAT ACT 7344 Thr Asn Asp Ser 2445 -128- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 'GT CAG TTC Gly Gin Phe 2450 GGT ATT OCT Gly Ilie Ala 2465 GCT ACC GAC Ala Thr Asp TTG CAT ATT Leu His Ile CAG TTG OAT TTC AAT Gin Leu Asp Phe Asn 2455 CTT OAT OAT CAG GOT Leu Asp Asp Gin Oly 2470 AAG CAG AAA OCA ATA Lys Gin Lys Ala Ile 2485 COT TAT ACC ATC COT Arg Tyr Thr Ile Arg 2500 PCT/US96/1 8003 GAC GGC AAA TAC 'TG CCA TTT GAA 7392 Asp Oly Lys Tyjr Leu Pro Phe Giu 2460 ACA CTG AAT CTT CAA TTT CCO AAT 7440 Thr Leu Asn Leu Gin Phe Pro Asn 2475 2480 TTO CA.A ACT ATO AOC OAT ATT ATT 7488 Leu Gin Thr Met Ser Asp Ile Ile 2490 2495 TAA 7515 2505 INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 2505 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: met 1 Gin Arg Tyr Arg Oly Gly Phe Leu Leu 145 Thr Gir Leu GlU GlU 50 Ile Thr Asn Ser His 130 Ala Leu Asn Thr Lys 35 Ile Phe Arg Arg Pro 115 Asp Ser Ala Ser Leu Ser Ser Thr Cys Pro Thr Arg Ala Gin Ala Tyr Gin Met Ala Asp 100 Ala Ala Ser Ser Leu Met Leu Ser 165 Ala Gly Ala Ala 70 Leu As n Tyr Ser Leu 150 As n Glu Met Olu 55 As n Gly Tyr Leu Ile 135 Ser Glu Ile ValI Gin Pro Phe Ala Thr 120 Ty'r Gin Leu Ile Asp Ala Leu 25 Asn Trp Asp Arg Leu Leu Ile 
Gin 90 Ala Pro 105 Oiu Leu Tyr Leu Lys Asn Cys Leu 170 Met Asp 185 -129- Thr Ile Cys Gin Lys Leu Tyr Gly Asn Lys 75 Gly Gly Tyr Asp Met 155 Ala Met Pro GlU Leu Asn Tyr Ser Arg Lys 140 Asp Gly Leu Asp Lys His ValI Asp Ala 110 Ala Arg Olu Glu Thr 190 Thr Arg Oiu Arg Leu Ser Lys Pro Ile Thr 175 Tyr Phe Ile Lys Leu Phe Met Asn Asp Ser 160 Lys Arg Thr Oiy Lys Ser Gin Asp Glu Val SUBSTITUTE SHEET (RULE 26W WO 97/17432 PCT/US96/18003 Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 195 200 205 Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 210 215 220 Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser 225 230 235 240 His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu 245 250 255 Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 260 265 270 Ile Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr 275 280 285 Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His 290 295 300 Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val 305 310 315 320 Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr 325 330 335 Thr Ser Gin Thr Asn Tyr Ile Glu Leu Tyr Pro Gin Gly Gly Asp Asn 340 345 350 Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe 355 360 365 Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His 370 375 380 Asn Pro Tyr Pro Asp Met Val Ile Asn Gin Lys Tyr Glu Ser Gin Ala 385 390 395 400 Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gin 405 410 415 Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile 420 425 430 Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile 435 440 445 Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile 450 455 460 Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val Leu Asn 465 470 475 480 Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu 485 490 495 Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gin Gin Ala Val 500 505 510 Gly Asn Gin Leu Ser Gin Phe Glu 
Gin Leu Phe Asn His Pro Pro Leu 515 520 525 Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro 530 535 540 -130- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gin Arg 545 550 555 560 Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu Leu Tyr 565 570 575 Gin Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn 580 585 590 Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Vai Ser Leu Leu Ala Gin 595 600 605 Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val Ile Cys 610 615 620 Gly Tyr Gly Asp Thr Asn Ile Tyr Gin Ile Thr Asp Asp Asn Leu Ala 625 630 635 640 Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gin 645 650 655 Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 660 665 670 Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser 675 680 685 Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala 690 695 700 Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val 705 710 715 720 Ala Tyr Asp Leu Leu Leu Trp lie Asp Gin Ile Gin Pro Ala Gin Ile 725 730 735 Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 740 745 750 Lys Val Ile Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu Ile Tyr 755 760 765 Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gin 770 775 780 Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu 785 790 795 800 Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 805 810 815 Gin His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 820 825 830 Val Thr Asp Val Ala Gin Ala Met Asn Lys Glu Glu Ser Leu Leu Gin 835 840 845 Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 850 855 860 Thr Gin Ile Asp Ala Ile Leu Gin Trp Leu GIn Met Ser Ser Ala Leu 865 870 875 880 Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly 885 890 895 -131- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/1 8003 Ile Asp His Asn rfIyr Ala Ala Trp Gin Ala Ala Ala Ala 
Ala Lau Met 900 905 910 Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Giu Thr Phe Ser 915 920 925 Lys Ala Leu Cys Asn 71%r Tyr Ilie Asn Ala Val Val Asp Ser Ala Ala 930 935 940 Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr T yr Leu Leu Ilie Asp Asn 945 950 955 960 Gin Vai Ser Ala Asp Val Ilie Thr Ser Arg Ilie Ala Giu Ala Ilie Ala 965 970 975 Gly Ilie Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 980 985 990 Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Giu Arg 995 1000 1005 71yr Asn Lys Arg Tyr Sar Thr Trp Ala Gly Val Sex Giu Leu Val Tyr 1010 1015 1020 Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg Ile Gly Gin Thr Lys 1025 1030 1035 1040 Met Met Asp Ala Leu Leu Gin Ser Ile Asn Gin Ser Gin Leu Asn Ala 1045 1050 1055 Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Giu Gin 1060 1065 1070 Val Ala Asn Leu Lys Val Ilie Ser Ala Tyr His Asp Asn Val Asn Val 1075 1080 1085 Asp Gin Gly Leu Thr Tyr Phe Ile Giy Ile Asp Gin Ala Ala Pro Gly 1090 1095 1100 Thr Tyr Ty'r r e a s i er Lys Cys Glu Asn Gly Lys 1105 1110 1115 1120 Phe Ala Ala Asn Ala Trp Gly Giu Trp Asn Lys Ile Thr Cys Ala Val 1125 1130 1135 Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu 1140 1145 1150 Tyr Leu Leu Trp Leu Giu Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 1155 1160 1165 Thr Thr Ilie Tyr Gin Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp 1170 1175 1180 Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Vai Thr Giu Lys Val Lys 1185 1190 1195 1200 Asn Tyr Thr Ser Ser Thr Asp Ala Ala Giu Ser Leu Gly Leu Tyr Cys 1205 1210 1215 Thr Gly Tyr GIn Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 1220 1225 1230 Gin Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 1235 1240 1245 -132- SUBSTnTTE SHEET (RULE 260 WO 97/17432 PCT/US96/18003 Leu Tyr lle Phe Ala Asp Met Ser S3r Asp Asn Met Thr Asn Ala ;in 1250 1255 1260 Ala Thr Asn Tyr Trp Asn Asn Ser ,Tyr Pro Gin Phe Asp Thr Val Met 1265 1270 1275 1280 Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn 1285 1290 1295 Asn Arg Tyr 
Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn 1300 1305 1310 Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 1315 1320 1325 Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 1330 1335 1340 Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr 1345 1350 1355 1360 Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu Gly Asp 1365 1370 1375 Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 1380 1385 1390 Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 1395 1400 1405 Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 1410 1415 1420 Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr 1425 1430 1435 1440 Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 1445 1450 1455 Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 1460 1465 1470 Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys 1475 1480 1485 Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala 1490 1495 1500 Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu 1505 1510 1515 1520 Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn 1525 1530 1535 Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala 1540 1545 1550 Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr 1555 1560 1565 Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn 1570 1575 1580 Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn 1585 1590 1595 1600 Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp 1605 1610 1615 Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1620 1625 1630 Ala Gly Thr T'yr Val Gin Leu Val Leu Asp Lys 71'r Asp Glu Ser Ile 1635 1640 1645 His G ly Thr Asn Lys Ser Phe Ala Ile Giu Tyr Val Asp Ile Phe Lys 1650 16S5 1660 Giu Asn Asp Ser Phe Val Ile Tyr Gin Gly Glu Leu Ser Glu Thr Ser 1665 1670 1675 1680 Gin Thr Val Val Lys Val Phe Leu Ser rTyr Phe Ile Giu Ala Thr Gly 1685 1690 1695 Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Giu Thr Thr 1700 1705 1710 Asp Lys Ilie Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 1715 1720 1725 Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 1730 1735 1740 Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala 1745 1750 1755 1760 Leu Tyr Phe Trp, Glu Leu Phe Tyr Tyr Thr Pro Met.Met Met Ala His 1765 1770 1775 Arg Leu Leu Gin Giu Gin Asn Phe Asp Ala Ala Asn His Trp, Phe Arg 1780 1785 1790 Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile 1795 1800 1805 Tyr His Trp, Asn Val Arg Pro Leu Giu Glu Asp Thr Ser.Trp, Asn Ala 1810 1815 1820 Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp Asp Pro 1825 1830 1835 1840 Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met 1845 1850 1855 Ala Arg Gly Asp Ala Ala Tlyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 1860 1865 1870 Glu Ala Lys Met Trp, Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 1875 1880 1885 Pro Gin Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Giy Asn 1890 1895 1900 Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 1905 1910 1915 1920 Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 1925 1930 1935 Ser Leu Thr Ala Leu Phe Leu Pro Gin Giu Asn Ser Lys Leu Lys Gly 1940 1945 1950 -134- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 T .r Trp Arg Thr Leu Aja Gin Ara Met Phe Asn Leu Ara His Asn Leu 1955 1960 196 Ser Ilie Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 1970 1975 1980 Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 1985 1990 1995 2000 Ala Asp Leu Pro Lys Ala Pro Leu Thr Ilie His Arg Phe Pro 
Gln Met 2005 2010 2015 Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser 2020 2025 2030 Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln 2035 2040 2045 Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met 2050 2055 2060 Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln 2065 2070 2075 2080 Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser Gln Leu 2085 2090 2095 Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg 2100 2105 2110 Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala 2115 2120 2125 Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly 2130 2135 2140 Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu 2145 2150 2155 2160 Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser 2165 2170 2175 Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn 2180 2185 2190 Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser 2195 2200 2205 Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr Gln 2210 2215 2220 Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240 Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr 2245 2250 2255 Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gln 2260 2265 2270 Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro 2275 2280 2285 Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300 Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 2305 2310 2315 2320 Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 2325 2330 2335 Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala 2340 2345 2350 Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 2355 2360 2365 Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 2370 2375 2380 Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val 2385 2390 
2395 2400 Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro 2405 2410 2415 Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu 2420 2425 2430 Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 2435 2440 2445 Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu 2450 2455 2460 Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn 2465 2470 2475 2480 Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile 2485 2490 2495 Leu His Ile Arg Tyr Thr Ile Arg 2500 2505
INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 12 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Xaa Ala 1 5
INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 12 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu 1 5
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr 1 5
INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 5 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: Met Gln Asn Ser Leu 1
INFORMATION FOR SEQ ID NO:17: SEQUENCE CHARACTERISTICS: LENGTH: 10 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: Ala Phe Asn Ile Asp Asp Val Ser Leu Phe 1 5
INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 16 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1 5 10
INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 21 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser 1 5 10 Leu Gln Leu Phe Ile
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 12 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro 1 5
INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 26 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: Gly Ile Asp Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro 1 5 10 Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
INFORMATION FOR SEQ ID NO:22: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys 1 5 10
INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 13 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys 1 5
INFORMATION FOR SEQ ID NO:24: SEQUENCE CHARACTERISTICS: LENGTH: 22 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gln Tyr Met Gln Ile
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 6005 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: NAME/KEY: RBS LOCATION: 1..9 (ix) FEATURE: NAME/KEY: CDS LOCATION: 16..3585 OTHER 
INFORMATION: /product= "PS" (xi) SEQUENCE DESCRIPTION: SEQ ID AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA Met Ser Glu Ser Leu Phe Thr Gin Thr Leu Lys Glu GCG CGC CGT Ala Arg Arg 15 GAT GCA TTG GTT Asp Ala Leu Val GCT CAT TAT ATT GCT ACT CAG GTG CCC His Tyr Ile Ala Gin Val Pro GCA GAT Ala Asp TTA AAA GAG AGT Leu Lys Glu Ser CAG ACC GCG GAT Gin Thr Ala Asp
GAT
Asp CTG TAC GAA TAT Leu Tyr Glu Tyr TTG CTG GAT ACC Leu Leu Asp Thr ATT AGC GAT CTG Ile Ser Asp Leu ACT ACT TCA CCG Thr Thr Ser Pro TCC GAA GCG ATT Ser Glu Ala Ile AGT CTG CAA TTG Ser Leu Gin Leu ATT CAT CGT GCG Ile His Arg Ala ATA GAG Ile Glu GGC TAT GAC Gly Tyr Asp GAA CAG TTT Glu Gin Phe 95 ACG CTG GCA GAC Thr Leu Ala Asp GCA AAA CCC TAT Ala Lys Pro Tyr TTT GCC GAT Phe Ala Asp TAT AGC ACT Tyr Ser Thr TTA TAT AAC TGG Leu Tyr Asn Trp AGT TTT AAC CAC Ser Phe Asn His
CGT
Arg 105 TGG GCT Trp Ala 110 GGC AAG GAA CGG Gly Lys Glu Arg AAA TTC TAT GCC Lys Phe Tyr Ala
GGG
Gly 120 GAT TAT ATT GAT Asp Tyr Ile Asp
CCA
Pro 125 ACA TTG CGA TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA Thr Leu Arg Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gin -140- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 GCT. ATT TCT CAA. GGG A-A TTA -AA ACT GAA TTA GTC GAA TCT .AAA TTA 433 Gly Ile Ser Gin Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu 145 150 155 CGT GAT TAT CTA ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT 531 Arg Asp Tyr Leu Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile 160 165 170 ACT CGCC TGC CAA GCC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT 579 Thr Ala Cys Gin Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg 175 180 185 ACA CAG AAT GCA CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC 627 Thr Gin Asn Ala Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val 190 195 200 ACT GAT GGC GGT AAG TTG AAA CCA GAT CAA TGC TCA GAG TGG CGA GCA 675 Thr Asp Gly Gly Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala 205 210 215 220 ATT AAT CCC GGG ATT ACT GAG GCA TAT TCA GCC CAT GTC GAG CCT TTC 723 Ile Asn Ala Cly Ile Ser Glu Ala Tyr Ser Cly His Val Glu Pro Phe 225 230 235 TOG GAA AAT AAC AAG CTG CAC ATC CT TGG TTT ACT ATC TCG AAA CAA 771 Trp Glu Asn Asn Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys CGlu 240 245 250 GAT AAA ATA GAT TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT 819 Asp Lys Ile Asp Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp 255 260 265 TAT AGC TCOG CCA TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT CAC 867 Tyr Ser Trp Ala Ser Lys Lys Lys lie Leu Glu Leu Ser Phe Thr Asp 270 275 280 TAC AAT ACA CTT GA GCA ACA GOCA TCA TCA ACC CC ACT GAA CTA GCT 915 Tyr Asn Arg Val Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala 285 290. 
295 300 TCA CAA TAT GOT TCT CAT GCT CAG ATG AAT ATT TCT CAT CAT GGC ACT 963 Ser Gin Tyr Gly Ser Asp Ala Gin Met Asn Ile Ser Asp Asp Gly Thr 305 310 315 GTA CTT ATT TTT CAG AAT GCC GGC GGA GCT ACT CCC ACT ACT GGA GTC 1011 Val Leu Ile Phe Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val 320 325 330 ACG TTA TOT TAT GAC TCT GGC AAC GTC ATT AAG AAC CTA TCT AGT ACA 1059 Thr Leu Cys Tyr Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr 335 340 345 GGA ACT GCA AAT TTA TC TCA AAC GAT TAT GCC ACA ACT AAA TTA CGC 1107 Gly Ser Ala Asn Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg 350 355 360 ATG TGT CAT GGA CAA ACT TAC AAT CAT AAT AAC TAC TGC AAT TTT ACA 1155 Met Cys His Gly Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr 365 370 375 380 CTC TCT ATT AAT ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA 1203 Leu Ser Ile Asn Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser 385 390 395 GAT GGA AAA CAA TTT ACA CCA CCT TCT GCCT TCT CCC ATT GAT TTA CAC 1251 Asp Gly Lys Gin Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His -141- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 CTC. CCT Leu Pro TCA CTA Ser Leu 430 CTT GAT Va l Asp 445 TTC CAT Phe His TAC CAA Tyr Clu TAT CGC Tyr Arg TAT TC Tyr Trp 510 CAG CCC Gin Pro 525 CAT TAC His Tyr CGA GGC Arg Cly CCC AAA Ala Lys GAT ATC Asp Ile 590 OCT GGC Ala Cly 605 ACC TTC Thr Phe GGT CAT Gly Asp
CTC
Leu
GTT
Val 435
CCC
Pro
CTT
Val
TGG
Trp
CAC
Gln
TTC
Leu 515
CCA
Pro
TTC
Phe
CGT
Arg
CAC
Gin
ACT
Thr 595
CCC
Pro
AGC
Ser
TAC
Tyr
AAC
Asn 420
CAC
Gln
TAT
Tyr
ACC
Thr
TAC
Tyr
CTC
Leu 500
CAA
Gln
CAT
Asp
CTG
Leu
CAA
Gin
GCA
Ala 580
TCC
Trp
ACA
Thr
CCA
Ala
AAC
Asn 405
GC
Ala
GGG
Cly
CGT
Cly
CTC
Val
AAA
Lys 485
ATT
Ile
CTG
Leu
CTG
Vai
CAT
His
CTT
Leu 565
CAA
Gln
CCA
Pro
TTC
Phe
GGC
Gly
CAT
Asp 645
CTC
Leu
CTA
Leu
CAC
Gin
ATT
Ile
CGT
Arg 470
TAT
Tyr
ATC
Met
CAT
Asp
ATC
Ile
ACC
Thr 550
GAA
Glu
CAG
Gin
AAT
Asn
CTC
Leu
CAT
Asp 630
CTA
Val
CGC
Arg
CAT
Asp
GCC
Cly 440
CTA
Leu
CA.A
Gln
TTC
Phe
CGC
Cly
GCA
Ala 520
ATC
Met
CAT
Asp
CAT
Asp
CTC
Leu
ACC
Thr 600
TCA
Ser
CCA
Ala
CTC
Leu
AAT
Asn
ATT
Ile 425
GGA
Cly
TGG
Trp
ACC
Thr
CCC
Arg
AGT
Ser 505
TGG
Trp
GCG
Ala
CTA
Leu
ACT
Thr
OCA
Gly 585
TTC
Leu
CCG
Pro
AAT
Asn
CCT
Cly
CT
Leu 665 410
AGC
Ser
TCT
Ser
CAA
Clu
GAA
Glu
AGC
Ser 490
AAA
Lys
CAT
Asp
GAC
Asp
TTG
Leu
CTA
Leu 570
CCG
Pro
AGT
Ser
GAG
Clu
ATT
Ile
TAC
Tyr 650
AGT
Ser CTC GAT 1299 Leu Asp AAT CCC 1347 Asn Pro ATC TTC 1395 Ile Phe 460 CAA COT 1443 Gin Arg 475 CCC GGT 1491 Ala Gly CCA COT 1539 Pro Arg ACC ACA 1587 Thr Thr CCC ATC 1635 Pro Met 540 ATT CCC 1683 Ile Ala 555 CTC CAA 1731 Val Glu CCC.CCT 1779 Arg Pro AAA CAA 1827 Lys Glu GTG ATC 1875 Val Met 620 GGC CAC 1923 Cly Asp 635 TGG CAT 1971 Trp Asp CTG GAT 2019 Leu Asp AAA CTT GAG TTA CCC CTA TAC AAC Lys Leu Glu Leu Arg Leu Tyr Asn -142- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96/1 8003 GCT CACCC CTA AAT CTG CCA CTG TAT CC ACC CCC GTA CAC CCO AAA 2C-3 G ly G in Pro Leu Asn Leu Pro Leu 'Iyr Ala Thr Pro Val Asp Pro Lys 670 675 680 ACC CTG CAA CGC CAG CAA CC GGA GCG GAC CGT ACA GGC AGT ACT CCG 2115 Thr Leu Gin Arg Gin Gin Ala GIy Gly Asp Gly Thr Gly Ser Ser Pro 685 690 695 7100 GCT GGT GGT CAA GGC AGT GTT CAG GGC TGG CCC TAT CCC TTA TTG GTA 2163 Ala Gly Gly Gin Gly Ser Val Gin Cly Trp Arg T~yr Pro Leu Leu Val 705 710 715 CAA COC CCC CGC TCT GCC GTC AGT TTG TTC ACT CAC TTC GGC AAC ACC 2211 Giu Arg Ala Arg Ser Ala Val Ser Leu Leu Thr Gin Phe Cly Asn Ser 720 725 730 TTA CAA ACA ACC TTA CAA CAT CAC CAT ALAT GAA AAA ATC ACC ATA CTG 2259 Leu Gin Thr Thr Leu Giu His Gin Asp Asn Clu Lys Met Thr Ile Leu 735 740 745 TTC CAC ACT CAA CAG CAA CCC ATC CTC AAA CAT CAC CAC CAT ATA CAA 2307 Leu Gin Thr Gin Gin Clu Ala Ile Leu Lys His Gin His Asp Ile Gin 750 755 760 CAA AAT AAT CTA AAA CCA TTA CAA CAC ACC CTC ACC GCA TTA CAC CCT 2355 Gin Asn Asn Leu Lys Gly Leu Gin His Ser Leu Thr Ala Leu Gin Ala 765 770 775 '780 AGC CCT CAT CCC CAC ACA TTG CCC CAA AAA CAT TAC AGC GAC CTC ATT 2403 Ser Arg Asp Cly Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu Ile 785 790 795 AAC CCT CCT CTA TCT CC CCA CAA ATC CCC CGT CTC ACA CTA CCC ACC 2451 Asn Gly Cly Leu Ser Ala Ala Ciu Ile Ala Cly Leu Thr Leu Arg Ser 800 805 810 ACC CCC ATG AWN' ACC AAT CCC CTT GCA ACC CGA TTC CTG AT!' 
CCC CCC 2499 Thr Ala Met Ile Thr Asn Gly Val Ala Thr Cly Leu Leu Ile Ala Cly 815 820 825 CCA ATC CCC AAC C CTA CCT AAC GTC TTC CCC CTC CCT AAC GGT CGA 2547 Gly Ilie Ala Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Ciy Cly 830 835 840 TCC GAA TCG CCA CC CCA TTA ATT CCC TCC CCC CAA GCA ACC CAA GTT 2595 Ser Ciu Trp Cly Ala Pro Leu Ile Gly Ser Gly Gin Ala Thr Gin Val 845 850 855 860 CCC CCC CCC ATC CAG CAT CAG ACC GCC GGC ATT TCA CAA CTG ACA CCA 2643 Cly Ala Gly Ile Gin Asp Gin Ser Ala Gly Ile Ser Ciu Val Thr Ala 865 870 875 CCC TAT CAG CGT CGT CAG GAA CAA TGC CCA TI'C CAA CCC CAT ATT CCT 2691 Gly Tyr Cmn Arg Arg Gin Giu Glu Trp Ala Leu Gin Arg Asp Ile Ala 880 85890 CAT AAC CAA ATA ACC CAA CTG CAT CCC CAG ATA CAA ACC CTG CAA GAG 2739 Asp Asn Ciu Ile Thr Gin Leu Asp Ala Gin Ilie Gin Ser Leu Gin Ciu 895 900 905 CAA ATC ACG ATC CCA CAA AAA CAC ATC ACC CTC TCT CAA ACC GAA CAA 27a7 Gin Ile Thr Met '-da Gin Lys Gin Ile Thr Leu Ser Glu Thr Giu Gin 910 915 920 CC AAT CCC CAA CC ATT TAT CAC CTG CAA ACC ACT CCT TTT ACC CCC 2835 Ala Asn Ala Gin Ala Ile Tyr Asp Le'i Gin Thr Thr Arg Phe Thr Cly -143- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTIUS96I18003 925 .930 935 9-40C CAG GCA CTG TAT AAC TGG ATG GCC GGT CGT CTC TCC GCG CTC TAT TAC 2333 Gin Ala Leu Tyr Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tryr Ilir 945 950 955 CAA ATG TAT GAT TCC ACT CTG CCA ATC TGT CTC CAG CCA AAA GCC GCA 2931 Gin Met Tyr Asp Ser Thr Leu Pro Ilie Cys Leu Gin Pro Lys Ala Ala 960 965 970 TTA GTA CAG GAA TTA GGC GAG AAA GAG AGC GAC AGT CTT TTC CAG GTT 29-9 Leu Val Gin Glu Leu Gly Glu Lys Giu ser Asp Ser Leu Phe Gin Val 975 980 985 CCG GTG TGG AAT GAT CTG TGG C AA GGG CTG TTA GCA GGA GAA GGT TTA 3027 Pro Val Trp Asn Asp Leu Trp Gin Gly Leu Leu Ala Gly Giu Gly Leu 990 995 1000 AGT TCA GAG CTA CAG AAA CTG GAT GCC ATC TGG CTT GCA CGT GGT GGT 3075 Ser Ser Giu Leu Gin Lys Leu Asp Aia Ile Trp Leu Ala Arg Gly Gly 1005 1010 1015 1020 ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3123 Ile Gly Leu Glu Ala 
Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly 1025 1030 1035 ACA GGG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC GGG GAA ACG 3171 Thr Gly Thr Leu Ser Giu Asn Ilie Asn Lys Val Leu Asn Gly Giu Thr 1040 1045 1050 GTA TCT CCA TCC GGT GGC GTC ACT CTG GCG CTG ACA GOG GAT ATC 'ITC 3219 Val Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe 1055 1060 1065 CAA GCA ACA CTG GAT TTG AGT CAG CTA GGT TTG GAT AAC TCT TAC AAC 3267 Gin Ala Thr Leu Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyjr Asn 1070 1075 1080 7TTG GGT AAC GAG AAG AAA CGT COT ATT AAA CGT ATC GCC GTC ACC CTG 3315 Leu Gly Asn Giu Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu 1085 1090 1095 1100 CCA ACA CTT! CTG GOG CCA TAT CAA GAT CTT! GAA GCC ACA CTG GTA ATG 3363 Pro Thr Leu Leu Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Met 1105 1110 GGT GCG GAA ATC GCC GCC TTA TCA CAC GGT GTG AAT GAC GGA GGC CGG 3411 Gly Ala Glu Ile Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg 1120 1125 1130 TT'r GTT ACC GAC T'TT AAC GAC AGC CGT TT!T CTG CCT TTT GAA GGT CGA 3459 Phe Val Thr Asp Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg 1135 1140 1145 GAT GCA ACA ACC GGC ACA CTG GAG CTC A.AT ATT TTC CAT GCG GGT AA 3507 Asp Ala Thr Thr Gly Thr Leu Giu Leu Asn Ile Phe His Ala Gly Lys 1150 1155 1160 GAG GGA ACG CAA CAC GAG TTG GTC GCG AAT CTG AGT GAC ATC ATT GTG 3555 Giu Gly Thr Gin His Giu Leu Val Ala Asn Leu Ser Asp Ilie Ilie Val 1165 1170 1175 1180 CAT CTG AAT TAC ATC ATT CGA GAC GCG TAA ATTTCTT'ITC TTTGTCGATT 3605 His Leu Asn Tyr Ile Ile Arg Asp Ala 1185 1190 -144- SUBSTITUTE SHEET (RULE 26) WO 97/17432
?ACAGGTCCCT
TCGATTACAA
cTGAAkTGcTG
GGCAGJAGGGA
TTCGGCATCG
CCACAATACG
CTCAATGACC
TTGCCAATTT
ATCGAATACT
CCGGACGGGC
AATGACCAAC
GTCAGCTATC
CATCCCAATG
ACAAGCCAGC
TCTGGTCTT
GGTACAGCGC
GTGCGTACTC
GGAGAAGCCA
AAAAACGCCA
AGGCCAGTCA
CCGACATGGC
GTTGATCTGC
TATAAAGCTC
GCCCCACTGC
GACGGCCAAC
CCCGATGGAA
CCAAGCATCC
CCGAAAAGCG
CCCCAATCCA
TTCAGTGATA
ACCTGTTGGC
AGCCAGCCCC
GGCACCACCG
ATCAGGGCC
CGCTGTCACT
CCGGCCCTGA
CGGCTCCTGG
GCTGGCAATG
GTAATGACGA
AAGGGCAACC
CCTATACCGT
GGCAACCTGC
ATCTACACAT
AAATCGCCCA
AATATCGAGC
TTACCGCACA
CTGTTCGTAC
GACCACGGTG
AATGGTCTGT
GCCGCTTATG
GTACCAATGA
GCGTCACCAC
CCCAGCCACC
AACGCTTTGA
GGGGAGAAGG
CGCAACGTCA
CTACCCTACC
TGGA'N'GGGT
AGTGGACGCA
AGTI'CGCTGA
TGCGTCTATA
CAG4GTATCAC
TGCTCGGT'C
CGAATCTAGG
AAAATAGCTT
ACCTTATCTA
TGTTATTAAG
TCCCAAAGGT
TGGAATGGCC
ATTATCGCTG
CGGTGTTATG
CACGTTCCTA
TGATATCCGT
GACCCGCTAT
CTCCGGTCAA
CTTAGGGAAA
GTGGTTGCTG
CGAAGATGAA
GCGCTATCTG
TGGATAACGC
AGCGCGTACC
ACCCCCGGAT'
TCAACAAGTG
CGCCCCGGAA
GTTGATI'ACC
ACTAGAACTA
CGCACTAGAT
GTTGCCAGGT
GGAAGACGGA
CAATTI'GCAG
TGTTACCGCC
CTTTACGCCA
CCTACCGOG
TGCCAACCAG
CCTGCCTGTC
CGGTCAACAA
GCATGGCCGT
CAATCCCGAA
TGCGCAATCC
GAGTACTTTA
GGCGGTGCTA
TCCCTATCTC
ATTTACAGCA
TCCATTAGCC
TCCCCACAAG
CAAGACGTTA
CAAGCCCGCC
GAAGGACGCG
ACCGCGCAGG
GAAGAAACTG
GCCCATTGTG
GTACAGGTGA
ACCTCCCGCA
TCACTTCATA
ATCTrCTCTC
CTGATGTTTC
CTGGTTGGAC
ATCCGTCAAT
GCCTGGCAAC
AA'TTTAACT
ATGCTGTATC
GACAGCAATG
GATAATGCCT
TCCGGTATTC
ATCAATGCCT
GCAGGCTTAT
CGAAACGGCT
ACAGGGACCG
CATCTGGTGG
TCGGTCAAC
CGGCTGTC
GGCTCTTTGC
TOCAGGATTC
TCAATGGCAT
TGCCAT'IACC
ACAGTGCAGG
GACGCACCCA
GCGAGGTCAT
AAACGCTGCA
AGATCCTGGA
CTICTGGCT
CTTGTCTGGC
TGACCCCAGC
ACGACAATGA
ACTACAGGCA
CCGGAAGAGT
CCGTGCCAAC
GCTATGAATA
ACCGCACCGC
GCTTAATACT
TAAGCCATGA
GGTTTGATCT
CGCAGCAACG
A.AGATCGAGG
CCGTCACTTA
CATTGATGGA
GCGGATACCA
TGCCCGTGGA
CTGAT=AGT
GGCGTAAAGG
ATGCCCGCAA
AAATCAAGGG
CACTAACTCT
TGGCGGATAT
TCATTATCT
PCTIUS96/18003 ACCAGAAGTA 3655 GGGAGAAGCA 3725 CCTTCGACC 3785 TAATGGGCCT 3845 ACATGGCATT 3905 GA.ATATCGCC 3965 AGGCGTTACc 4025 =1'CAGTAAkA 4085 GATATCGACA 4145 AAATCCGCAA 4205 CGGTGAACAT 4265 AAAAACCGCT 4325 ACATCAAACC 4385 GGCTGTTCA 4445 ATGGGATGCA 4505 TGGrrTGAA 4565 GCTCATGGCC 4625 GGAATATGAC 4685 ATCGGACGGG 4745 GGAGAAAATC 4805 TTATCAACTG 4865 CGCT GGTGG 4925 CGACAAAATC 4985 TATCAACGGA 5045 TAGTCAGcAAk 5105 ATATTICAT 5165 GTTGATCGGG 5225 AGAAGATGTC 5285 ACTGGTGGCT 5345 TAATCGCGTc 5405 GTCAGGATTT 5465 CGACGGCTCC 5525 CAACCAAAGT 5585 -145- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PTU9/80 PCTIUS96/18003
GGTAATCAGT
ACTTCCCAAC
GTGCCACATA
TTGAATGTAA
CAATTCTGGT
CTGCCGTTTC
CGGCTCACCA
TTGATGCCCC
TTCAAGTCGC
TCGCGCCACA
TGAACAATAA
TGGATGAAAA
CA: ATGCATr
GTGAAGTCAA
OTTOACATTA
C ;ATATTCAG
TCACTGGCGT
CCGGGGCGCA
ATTACAGCTC
GCTATGGTAT
CTACAGCCAC
GCGTTGCCAG
GGATTAGGGA
TGTGACCTGT
CATCACACGC
ACCAAAGCAG
ACCGAAATTC
GGCGTCTGGG
A AGC-CGTACA
TAGCCAGCTT
CACTGACCAA
TACATTATCG
GCAAATCTCC
AGGATGAAAT
ATGGTAAAGA
?.TTTcAcAA;c
GATTCTGACT
ACCCTGGTTG
TAGTTCCGCG
GGCTTGTTAT
CAGCGGCAAC
GCGGGr'u'TTC 5.945 5825 5885 5945 6005 INFORMATION FOR SEQ ID NO:26: SEQUENCE CHARACTERISTICS: LENGTH: 1190 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein Met 1 Alia (xi) Ser Giu SEQUENCE DESCRIPTION: SEQ ID NO:26: Ser Leu Phe Thr Gin Thr Leu Lys Giu Ala Arg Arg Asp Leu Val Ala Ser Ile Gin His Tyr Ile Thr Ala Asp Ala Thr 25 Asp Leu Gin Val Pro Ala Leu Leu Asp Glu Tyr Giu Tyr Thr Lys Ile Ser Asp Leu Thr Ser Pro Leu Giu Ser Giu Ala Ile Giy Thr Ser Leu Gin Leu Ile His Arg Aia Gly Tyr Asp Leu Ala Asp Aia Lys Pro Tyr Tyr Asn Trp Giu Arg Leu 115 Leu Asn Lys Phe Asn His Phe Ala Asp Tyr Ser Thr Tyr Ile Asp Glu Gin Phe Leu Trp Phe Tyr Ala Ala Gly Lys 110 Thr Leu Arg Ile Ser Gin Thr Giu Ile Ala Phe Giu 130 Lys Gly 145 Ile Leu Lys Ser Val Giu Ser Arg Asp Tyr Ser Tyr Asp Ala Thr Leu Ile Thr Ala Gly Lys Asp Thr Ile Phe Phe 185 Leu Gly Arg Thr Gin Asn Ala 190 Asp Gly Gly Pro Tyr Ala 195 Tyr Trp Arg Thr Leu Val -146- SUBSTITUTE SHEET (RULE 26) WO 97/17432 Lys Leu Lys Pro Asp Gin 210 PCTUS96/18003 Ser Giu Trp Arg Ala Ile Asn Ala Giv 220 le 225 Ly s Ser Glu Ala Tyr Ser Gly His Val Glu Pr Phe Ser Gly ser 305 Gin Asp Leu Gin Thr 385 Phe Val Ty r Ser Phel 465 Asp Asn C Met 1 Thr P Ala I 545
S
3
I
A
A
4
I.
L!
r 230 Leu His Ile Arg Trp Phe Thr Ile 245 '/al Pr Lvs Asn 1le Trp Val Met 260 265 Lys Lys Lys Ile Leu Giu Leu Ser 275 280 Ala Thr Gly Ser Ser Ser Pro Thr 290 295 Asp Ala Gin Met Asn Ile Ser Asp 310 Asn Ala Gly Gly Ala Thr Pro Ser 325 3er Gly Asn Val Ile Lys Asn Leu 340 345 jer Ser Lys Asp Tyr Ala Thr Thr 355 360 ;er Tyr Asn Asp Asn Asn Tyr Cys 70 375 :le Giu Phe Thr Ser Tyr Gly Thr 390 'hr Pro Pro Ser Gly Ser Ala Ile 405 sp Leu Asn Ala Leu Leu Asp Ile 420 425 sp Val Gin Gly Gin Phe Gly Gly 435 440 ly Pro Tyr Giy Ile Tyr Leu Trp 50 455 eu Val Thr Val Arg Met Gin Thr 470 hr Trp Tyr Lys Tyr Ile Phe Arg 485 ly Gin Leu Ile Met Asp Gly Ser I 500 505 ro Leu Gin Leu Asp Thr Ala Trp 515 520 sp Pro Asp Val Ile Ala Met Ala 535 Ser 250 Ser Phe Glu Asp Thr 330 Ser Lys Asn Phe Asp 410 Ser Ser Glu lu Ser 190 .ys I ~sp I ~sp P 23' Ly Sei Thr Val Gly 315 Gly Ser Leu Phe Ser 395 Leu Leu Asn Ile ,In 175 kla Pro rhr 'ro o Phe S Glu Asp Asp Ala 300 Thr Val Thr Arg Thr 380 Ser His Asp Pro Phe 460 Arg Gly Arg Thr C 5 Met H 540 Trg Asp Tyr Tyr 285 Ser Val Thr Gly Met 365 Leu Asp Leu Ser Val 445 Phe Tyr 'yr ;In 'is Glu Lys Ser 270 Asn Gln Leu Leu Ser 350 Cys Ser Gly Pro Leu 430 Asp His Glu Arg Trp 510 Pro Tyr I Asn Ile 255 Trp Arg Tyr Ile Cys 335 Ala His lie Lys Asn 415 Leu Asn Ile Asp Asp 195 k.sn kla .ys Asn 240 Asp Ala Val Gly Phe 320 Tyr Asn Gly Asn Gin 400 Tyr Asn Phe Pro Ala 480 Ala Val Thr Leu le Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Giy Asp Set 550 555 CFrn -147- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/1 8003 Ala 2T'*r Arg In Leu Giu Arg Asp Thr L=eu Val Glu Ala Lys Met T'r 565 570 57 Trr Ilie Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp Ile His Thr 580 585 590 Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Giu Ala Gly Ala Ile 595 600 605 Ala Thr Pro Thr Phe Leu Ser Ser Pro Gi6 Val Met Thr Phe Ala Ala 610 615 620 Trp Leu Ser Ala Gly Asp Thr Ala Asn Ilie Gly Asp Gly Asp Phe Leu 625 630 635 640 Pro Pro T'yr Asn Asp Vai Leu Leu Giy 7y r Trp Asp Lys Leu Giu Leu 645 
650 655 Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu 660 665 670 Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg 675 680 685 Gln Gln Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln 690 695 700 Gly Ser Val Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg 705 710 715 720 Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr 725 730 735 Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln 740 745 750 Gln Glu Ala Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu 755 760 765 Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly 770 775 780 Asp Thr Leu Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu 785 790 795 800 Ser Ala Ala Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile 805 810 815 Thr Asn Gly Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn 820 825 830 Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly 835 840 845 Ala Pro Leu Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile 850 855 860 Gln Asp Gln Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg 865 870 875 880 Arg Gln Glu Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile 885 890 895 Thr Gln Leu Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met 900 905 910 Ala Gln Lys Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln 915 920 925 Ala Ile Tyr Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr 930 935 940 Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp 945 950 955 960 Ser Thr Leu Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu 965 970 975 Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn 980 985 990 Asp Leu Trp Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu 995 1000 1005 Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu 1010 1015 1020 Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu 1025 1030 1035 1040 Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser 
1045 1050 1055 Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu 1060 1065 1070 Asp Leu Ser Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu 1075 1080 1085 Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu 1090 1095 1100 Gly Pro Tyr Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile 1105 1110 1115 1120 Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp 1125 1130 1135 Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr 1140 1145 1150 Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln 1155 1160 1165 His Glu Leu Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr 1170 1175 1180 Ile Ile Arg Asp Ala 1185 1190
INFORMATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 1881 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..1881 OTHER INFORMATION: /product= "P8-1
ATG
Met 1
GCA
Ala
GAG
(xi) SEQUENCE TCT GAA TCT TTA Ser Giu Ser Leu 5 TTC GTT GCT CAT Leu Val Ala His AGT ATC CAG ACC DESCRIPTION: SEQ TTT ACA CAA ACG TTG ID NO:27: AAA GAA GCGC CGC Phe Thr Gin Thr Leu
ATT
Ile
CAT
OCT
Ala
GAT
Giu Ser Ile Gin Thr Ala Asp A
ACC
Thr
GGC
Gly
ACG
Thr
TAT
Tyr
GAA
Glu
TTC
Leu
GGC
Cly 145
ATT
Ile
CGC
Gly
CCC
Pro
.AAA
Lys 50
AGT
Ser
CTG
Leu
AAC
Asn
CCC
Arg
AAT
Asn 130
AAA
Lys
AGT
Ser Lys
'AT
'yr
ATT
Ile
CTG
Leu
CCA
Ala
TGG
Trp
TTG
Leu 115
AAG
Lys
TTA
Leu
TAT
Tyr
CAT
Asp
GCA
Ala I 195
AGC
Ser
CAA
Gin
CAC
Asp
CAT
Asp 100
AAA
Lys
ACC
Thr
AAA
Lys
GAC
Asp
AAT
ksn
PTT
Phe
CAT
Asp
TTC
Leu
TCA
Ser
AGT
Ser
TTC
Phe
GAG
Glu
AGT
Ser
ACT
Thr 165
AAA
Lys
TAT
Tlyr
CTG
Leu
TTI
Phe 70
GCA
Ala TrT Phe
TAT
Tyr
ATA
Ile
GAA
Giu 150
TTA
Leu 4CC hr rGG
CTT
Val 55
ATT
Ile
AAA
Lys
AAC
Asn
GCC
Ala
TTT
Phe 135
TTA
Leu
GCC
Ala
ATC
Ile
CGA
A
T
C
H
C
P
C
H
C
1.
At
TI
CG
Ve
AC
TI
F)
Pf ~sp Leu 40 ICT ACT 'hr Thr AT CGT is Arg CC TAT ro Tyr AC CGT is Arg 105 :C CAT iy Asp 20 CC CCA hr Ala 'C GAA 11 Clu :C CTT ir Leu 'C TTT ie Phe 185
~TTA
10
CAG
Gin
TAC
Tyr
TCA
Ser
GCC
Ala
TTT
Phe 90
TAT
Tyr
TAT
Tyr
TTT.
Phe
TCT
Ser
GAT
Asp 170
RTT
Ile
%CT
Val Pro Ala Asp Leu Lys 30
GAA
Glu
CCC
Pro
ATA
Ile 75
GCC
Ala
AGC
Ser
ATT
Ile
GAA
Glu
AAA
Ly s 155
TAT
Tyr 3GC ;iy
TAI
Tyr
CTG
Leu
GAG
Glu
CAT
Asp
ACT
Thr
GAT
Asp
CAA
Gin 140
TTA
Leu
ATT
lie
CGT
Arg
GTC
CTC
Leu
TCC
Ser
GGC
Gly
CAA
clu
TCC
Trp
CCA
Pro 125
GCT
Giy
CGT
Arg
ACT
Thr
ACA
Thr
ACT
Thr 205
TTC
Leu
GAA
Glu
TAT
Tyr
CAC
Gin
CCT
Ala 110
ACA
Thr ATr Ile
CAT
Asp
GCC
Ala
CAG
Gin 190
GAT
Asp
CTG
Leu
CCC
Ala
CAC
Asp
TTT
Phe
GGC
cly
TTG
Leu
TCT
Ser
TAT
Tyr
TGC
Cys 175
AAT
Asn
GGC
iv
CAT
Asp
ATT
Ile
GGC
Cly
TTA
Leu
AAC
Lys
CGA
Arg
CAA
Gin
CTA
Leu 160
CAA
Gin
CCA
Ala
GGT
Cly 144 192 240 288 336 384 432 480 528 576 624 Lys Giu Ala Arg GTG CCC CCA CAT COT CAT Arg Asp TTA AA rrp Arg Lys Leu Thr Leu Val 200 AAG TTG Lys Leu 210 AAA CCA CAT CAA Lys Pro Asp Gin
TGG
Trp 215 TCA GAG TGG CCA GCA ATT AAT CCC CCC Ser Clu Trp Arg Ala lie Asn Ala Gly -150- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 ATT .1CT CT/US96/18003 As n 3AG GCA 7*AT TCA G00 CAT Z3TC OAG C7T TTC Ile S.-er Glu Ala Tyr Ser Sly His ':al Giu Pro Phe 7- -G OAA- An'T Trp Glu Asri .AG CT- CAC ATC C5- Ly s
TTT
Phe
TCA
Ser
GGA
Oly
TCT
Ser 305
CAG
Gin
GAC
Asp
TTA
Leu
CAA
Gin
ACA
Thr 385
TTT
Phe 1 OTA C Val TAT C Ty r AGT G Ser G 604 TTC C Phe L 465 L e- C3T Va 1
AAG
Lys
GCA
Ala 290
GAT
Asp
AAT
As n
TCT
Ser rCG Ser
AGT
Ser 370 k.TA Ile kLCA rhr
;AT
~sp
;AC
~sp
;OT
;Iy ~50
:TT
,eu His
TAT
Tyr
~AAA
Lys 275
ACA
Thr
O'CT
Ala
GCC
Ala
GC
0 ly
TCA
Ser 355
TAC
Tyr
GAA
Giu
CCA
Pro
CTC
Leu
GTT
Val 435
CCC
Pro OTT2 Val Ile
AAA
Lys 260
AAA
Lys
GGA
Gly
CAG
Gln
GC
Gly
AAC
Asn 340
AAG
Lys
AAT
Asn
TTC
Phe
:CT
Pro Aksn 420
CAG
31n r'AT r'yr k.CG C Chr Ar~ 24~
AAC
Asn
ATC
Ile
TCA
Ser
ATG
Met
GGA
Gly 325
GTO
Val
OAT
Asp
OAT
Asp
ACC
Thr rT Ser 405
GCG
Ala ]oo .ly
.GT
ily
TC
7a I r' TG Trp
ATC
Ile
TTG
Leu
TCA
Ser
AAT
Asn 310
OCT
Ala
ATT
Ile
TAT
Tyr
AAT
Asn
TCC
Ser 390
GT
Gly
CTA
Leu
CAG
Gln
ATT
Ile
COT
Arg
TT]
Phe
TOG
Trp
GAA
Glu
AOC
Ser 295
ATT
Ile
ACT
Thr
AAG
Lys
GCC
Ala
AAC
Asn 375
TAC
Tyr
TCT
Ser
TTA
Leu
TTT
Phe
TAT
Tyr 455
ATG
Met
ACT
Thr
OTO
Val
CTT
Leu 280
CCG
Pro
TCT
Ser
CCC
Pro
AAC
Asn
ACA
Thr 360
TAC
Tyr
GGC
Gly
GCC
Ala
OAT
Asp
GGC
Gly 440
CTA
Leu
CAA
Gln
ATC
Ile
ATG
Met 265
TCT
Ser
ACT
Thr
GAT
Asp
AGT
Ser
CTA
Leu 345
ACT
Thr
TOC
Cys
ACA
Thr Ile
ATT
Ile 425
GGA
Gly rrp
ACC
T'hr T C Se2 25(
AG)
Ser
TTI
Phe
GAA
Glu
GAT
Asp
ACT
Thr 330
TCT
Ser
AAA
Lys
AAT
As n Phe
GAT
Asp 410
A.GC
Ser
TCT
Ser
GAA
Glu
;AA
Mlu *Lys *Ser
ACT
Thr
GTA
Val
'GG
315
GGA
Gly
*AGT
Ser Leu
TTT
Phe
TCA
Ser 395
TTA
Leu
CTC
Leu
AAT
Asn
ATC
Ile
CAA
Gln 475
GAA
Glu
GAT
Asp
GAC
Asp
OCT
Ala 300
ACT
Thr
GTG
Val
ACA
Thr
CGC
Arg
ACA
Thr 380
TCA
Ser
CAC
His
CAT
Asp
CO
Pro
I'TC
Phe 460
:GT
krg
GAT
Asp
TAT
Tyr
TAC
Tyr 285
TCA
Ser
GTA
Val
ACO
Thr
GGA
Gly
ATG
Met 365
CTC
Leu
GAT
Asp
CTC
Leu
TCA
Ser
GTT
Val 445
TTC
Phe
TAC
Tyr
AAA
Lys
AOC
Ser 270
AAT
Asn
CAA
Gln
CTT
Leu
TTA
Leu
AGT
Ser 350
TGT
Cys
TCT
Ser
GGA
Gly
CCT
Pro
CTA
Leu 430
GAT
Asp
CAT
His
GAA
Glu
ATA
Ile 255
TOG
Trp
AGA
Arg
TAT
Tyr
ATT
Ile
TOT
Cys 335
GCA
Ala
CAT
His
ATT
Ile
AAA
Lys
AAT
Asn 415
CTT
Leu
AAT
Asn
ATT
Ile
GAC
Asp OAT 6 Asp OCA 816 Ala OTT 864 Val1 GOT 912 O ly TTT 960 Phe 320 TAT 1008 Tyr AAT 1056 Asn OGA 1104 Gly AAT 1152 Asn CAA 1200 Gin 400 TAT 1248 Tyr AAT 1296 As n TTC 1344 Phe CCO 1392 Pro OCO 1440 Ala 480 470 GAC ACT TOG TAC AAA TAT ATT TTC COC AGC 0CC GOT TAT CGC OAT OCT 1488 SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96/1 8003 A:.sp Thr Trp iyr Ll .s 71,r Ile Phe Arg Ser Ala Gl*, Ty'r Arg Asp Ala 485 490 495 AAT GC Asn Gly ATG CCA Met Pro ACT GAT Thr Asp 530 C ATA Ala Ile 545 GCT TAC Ala Tyr TAC ATT Tyr Ile ACC AAT Thr Asn GCC ACA Ala Thr 610 TGC CTA Trp Leu 625
CAG
Gln
TTC
Leu 515
CCA
Pro
TTC
Phe
CGT
Arg
CAG
Gln
ACT
Thr 595
CCG
Pro
AGC
Ser
AT
Ile
CTG
Leu
CTG
Val
CAT
His
CTT
Leu 565
CAA
Gln
CCA
Pro Met
GAT
Asp
ATC
Ile
ACC
Thr 550
CAA
Glu
CAC
Gln
AAT
Asn
C
Gly
CCA
Ala 520
ATG
Met
CAT
Asp
CAT
Asp
CTC
Leu
ACC
Thr 600
ACT
Ser 505
TCC
Trp
CC
Ala
CTA
Leu
ACT
Thr
GGA
Gly 585
TTC
Leu
.AAA
Lys
CAT
Asp
CAC
Asp
TTG
Leu
CTA
Leu 570
CCC
Pro
ACT
Ser
CCA
Pro
ACC
Thr
CCG
Pro
AT
Ile 555
GTC
Val
CC
Arg
~AA
Lys
C=T
Arg
ACA
Thr
ATC
Met 540
CC
Ala
GAA
Glu
CCT
Pro
GAA
Glu
TAT
Tyr
CAC
Gln 525
CAT
His
CCA
Arg
CC
Ala
CAT
Asp
GCT
Ala 605
TG
Trp 510
CCC
Pro
TAC
Tyr
GC
Gly
AAA
Lys
ATC
Ile 590
GCC
Gly GC 1536 Vali ACC 1534 Thr CTC 1632 Leu ACC 1680 S er 560 TAG 1728 Tyr ACC i776 Thr ATT 1824 Ile ACA TTC CTC ACT TCA CCC GAG GTG ATG ACG TTC CCT CC 1872 Thr Phe Leu Ser Ser Pro Giu Val Met Thr Phe Ala Ala 615 620 1881 INFORMATION FOR SEQ ID NO:28: SEQUENCE CHARACTERISTICS: LENGTH: 627 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) Met Ser Glu Ala Leu Vai Giu Ser Ilie 35 Thr Lys Ilie Gly Ser Leu SEQUENCE DESCRIPTION: SEQ ID NO:28: Ser Leu Phe Thr Gin Thr Leu Lys Glu Ala 10 Ala His Tyr Ile Ala Thr Gin Val Pro Ala 25 Gin Thr Ala Asp Asp Leu Tyr Ciu Tyr Leu 40 Ser Asp Leu Val Thr Thr Ser Pro Leu Ser 55 Gin Leu Phe Ile His Arg Ala Ile Glu Gly -152- Arg Arg Asp Asp Leu Lys Leu Leu Asp Ciu Ala Ile Tyr Asp Gly SUBSTITUT SHEET (RULE WO 97/17432 Thr T'yr c1u Leu Gly.
145 Ile dly Pro Lys L 2 Ilies 225 Lys L PCTIUS96/1 8003 Le~ Asi Arc Asr- 130 Lys Ser ,y s yr eu 10 er eu U1 Ala "i Trp Leu 115 Lys Leu Ty1r Asp Ala 195 Lys Glu I2 His I As As r 100 Lys Thr Lys Asp Asnr 180 Phe Pro l1e pSer )Ser Phe Glu Ser Thr 165 Lys Tyr AspC Tyr S Arg TI 245 Al PhE -ir Ile Glu 150 Leu rhr rrp ln ;er ~30 'rp iLy~ As r *Ala *Phe 135 Leu Ala Ile Arg Trp 215 Gly Phe sPrc IHis Gi 1 120 Thr Val Thr Phe Lys 200 Ser His Thr Arc 105 Asp Ala Giu Leu Phe 185 Leu Giu Val1 Ile *Phe Ala Asp 90 r'E',r Ser Thr 1Tvr Ilie Asp *Phe Giu Gin 140 Ser Lys Leu 155 Asp Tyr Ile 170 Ile Gly Arg Thr Leu Val Trp Arg Ala 220 Giu Pro Phe 235 Ser Lys Glu A~ 250 Gb Tr~ Prc 125 C ly Arg Thr Thr rhr 205 I le rrp ~sp u' G1r ,Ale Thr Ile Asp Ala Gin 190 Asp As n Glu Lys i Phe Leu GlY Lys *Leu Arg *Ser Gin Tyr Leu 160 Cys Gin 1715 Asn Ala Gly Gly Ala Gly Asn Asn 240 Ile Asp 255 Phe Ser Gly Ser 305 Gin Asp Leu Gin Thr 385 Val1 Lys Ala 290 Asp As n Ser Ser Ser 370 Ile Lys 275 Thr Ala Ala Gly Ser 355 Tyr Glu *Lys 260 Lys *Gly Gin Gly Asn 340 Lys Asn Phe Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Ile Ser Met Gly 325 Val1 Asp Asp Thr Leu Ser As n 310 Ala Ile Tyr As n Ser 390 Giu Ser 295 Ile Thr Lys Ala Asn 375 Tyr Leu 280 Pro Ser Pro Asn Thr 360 Tyr 265 Ser Thr Asp Ser Leu 345 Thr Cys Phe Giu Asp Thr 330 Ser Lys Asn Thr Val Gly 315 Gly Ser Leu Phe Asp Ala 300 Thr ValI Thr Arg Thr 380 Tyr 285 Ser Val Thr Gly Met 365 Leu 270 Asn Gin Leu Leu Ser 350 Cys Ser Trp Arg Tyr Ile Cys 335 Ala His Ile Ala Val Giy Phe 320 Tr As n Gly As n Gly Thr Phe Ser Ser Asp Gly Lys 395 Gin 400 Phe Thr Pro Val Asp Leu Pro Ser 405 Asn Ala Ala Asp Ile Asp 410 Ile Ser -153- Leu Leu His Asp Leu Pro Ser Leu SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96/18003 rrAsp Ser Gly 450 Phe Leu 465 Asp Thr Asn Gly Met Pro Thr Asp 530 Ala Ilie 545 Ala T'yr 7-yr Ile C Thr Asn T Ala Thr P 610 Trp Leu S 625 Va 1 435 Pro Val1 Trp Gin Leu 515 Pro Phe krg ;n 1 'hr T 95 ro T e r 42
TY
Thi Leu 500 Gin Asp Lieu 31a 380 'rp hr 0 nl Gl r G P, -Va 1 *Lys 485 Ile Leu Vali His Leu 565 Gin Pro Phe y in Phe Gly l 440 Ile Leu Trp 455 Arg Met Gin Thr 470 Tyjr Ile Phe Arg Met Asp Gly Ser 505 Asp Thr Ala Trp 520 Ile Ala Met Ala 535 Thr Leu Asp Leu 550 Giu Arg Asp Thr Gin Leu Leu Gly 585 Asn Pro Thr Leu 600 Leu Ser Ser Pro C 615 Ser Glu Giu Ser 490 Ly s Asp Asp Leu Leu 570 Pro er ;iu Asn Pro Ile Phe 460 Gin Arg 475 Ala Giy Pro Arg Thr Thr Pro Met 540 Ile Ala 555 Vai Giu Arg Pro Lys Giu Val Met '1 620 43 0 Val Asp 445 Phe His 7T.r Giu Tyr Arg Tyjr Trp 510 G 1n Pro 525 His Tyr Arg Giy Alia Lys t ksp Ile F.
590 klia Gly A 305 'hr Phe
A
Asr I I s Asp 495 As n Ala Ly s ksp let 375 isa Ia Sh s Pro Ala 480 Ala ValI Thr Leu Ser 560 Tyr Thr Ile Ala INFORMATION FOR SEQ ID NO:29: SEQUENCE
CHARACTERISTICS:
LENGTH: 1689 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1. .1689 OTHER INFORMATION: /product= "S8" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: GCA GGC GAT ACC GCA AAT ATT GGC GAC GGT GAT TTC TTG CCA CCG TAC 48 Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr 1 5 10 -154- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 AT OTA CTA CTC GGT TAC TGG GZAT M-A CTT GAG TTA cGc CTA TAC sp 4l Lu Leu Gly lyr Trp Lys Leu Giu Leu Arg Leu Tysr
AAC
Asn
CTO
Leu
GGA
Gly
CAG
Gln
AGT
Ser
CAG
Gln
ATCC
Ile
CAA
Gin~ 145 CGG C Arg C GAA A Glu I GTT G Val A AAcC Asn V 2 ATT G IlieG 225 .!GC G Lei
GGC
Gly
TTG
Leu
GAT
Asp
:TG
Leu 130
:AC
Us In 5
TC
le
CA
lia
TC
al1 10
GC
i y
CG
GCCC
u Arg T GCC r Ala
SGAC
Asp
TGG
Trp
TTG
Leu
AAT
Asn 115
AAA
Lys
AGC
Ser AAAi Lys
GCC
Ala
ACG
Thr 195
TTCC
PheC
TCCC
Ser C GGC P CAC AAT CTG AGT CTG GAT GGT CAA CC His Asn Leu Ser Leu Asp GyGin Pr ACG CCG GTA CAC c Thj
GCI
CGC
Arg
ACT
Thr 100
GAA
Glu
CAT
His
CTG
Leu
CAT
His
.GGT
180 3GA
;GG
3ly
G
liy Pro
SACA
Thr
TAT
Tyr 85
CAG
Gln
A
Lys
CAG
Gln
ACC
Thr
TAC
Tyr 165
CTC
Leu Leu
CTG
Leu
CAA
Gin Va.
CCC
Pro
TTC
Phe
ATC
Met
CAC
His
GCA
Ala 150
AGC
Ser
ACA
Thr
CTG
Leu
GCT
Ala
.CA
Alia 230 1 Asp 55
ACT
i Ser
TTA
Leu
GGC
Gly
ACG
Thr
CAT
Asp 135
TTA
Leu
GAC
Asp
CTA
Leu Ile
AAC
Asn 215
ACC
Thr
GTG
Val
GAT
Asp CTG
C
CCC
Prc
AG'
Se,
TTC
Le.
Asn
ATA
Ile 120
ATA
Ile
CAG
Gln
CTG
Leu
CGC
Arg
GCC
Ala 200
GGT
Gly
CAA
'In k.CA rhr
TT
:ie
*AAA
Lys r CCG 7Pro
GTA
Val
AGC
Ser 105
CTG
Leu
CAA
Gln
GCT
Ala
ATT
Ile
AGC
Ser 185
GGCC
GiyC
GGA
Gly GTT C Val C CCA G Ala G 2 GCT G Ala A 2165 GAG
C
A-.
Thi
GC
Al
GA;
Git.
90
TT?
Leu
TTG
Leu
CAA
Gln
AGC
Ser As n 170
ASCC
rhr
;GA
ily rCG er
GC
;iy
GC
ly ~50
AT
.sp
AA
CCTG
r Leu r GGT i Gly 75
CGC
1Arg
CAA
Gln
CAG
Gin
AAT
Asn
CGT
Arg 155
GT
Gly
GCC
Ala
ATC
Ile
GAA
Glu
GCC
Ala 235
TAT
Tyr
AACC
Asn C ATC
CA,
GlI 64
GG~
G1l
GC(
A1L
AC;Z
Thi
ACI
Thr
AAT
Asn 140
GAT
Asp
GT
Gly
ATG
Met
CC
Ala
TGG
Trp 220
GGC
Gly
:AG
;In
;AA
3iu
.CG
G CTA Leu
~CCC
Arg 0 r' CAA (Gin
CC
iArg
SACG
Thr
CAA
Gln 125
*CTA
Leu
GC
Gly
*CTA
Leu Ile MAC4 Asn 205
GA
Gly
ATCC
CGT
Arg
ATA
Ile I ATG
G
AA
Ast
CAC
Gir
GGC
Gly
TCT
Ser
TTA
Leu 110
CAG
Gln
AMA
Lys
GAC
Asp
TCT
Ser
ACC
Thr 190
GCG
Ala
GCG
Ala
:AG
;In
:GT
krg
~CC
~hr
;CA
r' CTC 'i LeL
AGI
Ser
GCC
Ala
GAA
Glu
GAA
Glu
*GGA
Gly
ACA
Thr
GCG
Ala 175
AAT
Asn
GTA
Val
CCA
Pro
GAT
Asp
CAG
Gln 255
CAA
Gln
CAM
IPro Gcc
GTT
*Val G TG *Val
*CAT
His
CC
Ala
TTA
Leu
TTG
Leu 160
GCA
Ala
GGC
Gly
CCT
Pro
TTA
Leu
CAG
Gln 240
GAA
Glu
CTG
Leu
AAA
144 19 2 240 288 336 384 432 480 5 2 576 624 672 72)0 768 816 864 Ser Ala Gly Ile
GAA
Glu
GAT
Trp
GCC
TTG
Leu 260
ATA
TC ;AA Ser Giu 245 CMA CGG Gin Arg CMA AGC -155- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 CAG ATC 275
ACG
G i n
TCT
r 3in Ilie Thr Leu Ser CIt 290 As p
CC
Ala
CCA
Pro
AA.
Lys
CAA
Gln
CAT
Asp 385
ACC
Thr
ATC
Ile
ACT
Thr
CAGC
GinI CGT Arg 465 CAA C Gin TCA C Ser H- AGCGc Ser A C T Let
ATC
Ile
GAG
Giu Gly 370
GCC
Ala
CTG
Vali AlAT
CTG
:TA
.eu 150
~TT
Ele
,AT
~s p
AC
[is
CGT
r g G CAA ui Gin L' CCT tArg
TGT
Cys
AC
ISer 355
CTC
Leu
ATC
Ile
TCG
Ser
AAA
Lys
GCG
Aia 435
OCT
Gly
AAA(
Lys
CTT
Leu OT C Gly TTT C Phe L 515
AC'
Thi
CT~
Let
CTC
Let 340
GAC
Asp
TTA
Leu
TG
Trp
CTG
Leu
CTC
Val 420
CTG
Leu rTC Laeu
CT
krg
;AA
;lu
TG
Tal 300
ETC
eu
CACT
r Thr
'TC
j Se r 325
CAG
IGin
ACT
Ser
CCA
Ala
CTT
Leu
CAT
Asp 405
CIT
Leu
ACA
Thr
GAT
Asp
ATC
Ile
CC
Ala 485
AATC
Asn CCT ProE
CCT
Arg 310
CG
Ala
CCA
Pro
CTT
Leu
CGA
Gly
CCA
Ala 390
ACC
Thr
AAC
Asn
CCC
Gly
AAC
Asn
GCC
Ala 170 kCA r'hrI AC C t Isp C 7TT C ~he c Lei
AC(
Th~ 29~
TT~
PhE
CTC
Leu
AAA
Lys
TTC
Phe
GAA
Glu 375
CCT
Arg
CTC
Leu
CCC
Gly
GAT
Asp
TCT
Ser 455 3TC Tal
:TG
eu
CA
;iy
;AA
iu 280
GAA
Glu
ACC
SThr
TAT
CC
Ala
CAG
Gin 360
GGT
Gly
CCT
C ly Phe
CAA.
Glu
ATC
Ile 440
TAC
Tyr
ACC
ThrI
CTA
Val
GC
Cly GT C Cly 520 Ci1
CA;
Gi1
CC
Gi
TAG
GCA
Ala 345
OTT
Val
TTA
Leu
COT
ply
GC
Gly
ACC
Thr 425
TTC
Phe
AAC
Asn
CTC
Leu
ATG
4et
EGG
tkrg 305
:CA
u Gin G CGC 'i Ala
CAC
Gin C AA -Gin 330
TTA
Leu
CCC
Pro
ACT
Ser
ATT
Ile
ACA
Thr 410
CTA
Val
CAA
Gin
TTC
Leu
CCA
Pro
GGT
Cly 490 Phe CAT
C
Iii
AA~
As r
C;
Ala 315
ATG
Met
GTA
Val
CTC
Val
TCA
Ser
COG
Gly 395
CCC
Gly
TCT
Ser
GCA
Ala
%CT
['hr 475 klia Pal c-Th ~Al
CTC
LeL
TAI
CAC
Gin~
TG
Trp
GAG
Glu 380
CTA
Leu
ACC
Thr
CCA
Pro
ACA
Thr
AAC
Asn 460
CTT
Leu
CAA
Glu
ACC
Thr
ACA
rMet 285
ECAA
I Gin
TAT
CAT
*Asp
CAA
Glu
AAT
*Asn 365
CTA
Leu
CAA
Glu
TTA
Leu
TCC
Ser
CTC
Leu 445
GAG
CiuI CTC C Leu C ATC C Ile ;Z GAG Asp P 5 ACC G Thr G 525 Ala
C
Ala
AAC
As n
TCC
Ser
TTA
Leu 350
CAT
Asp
CAG
Gin
GCC
Ala
ACT
Ser
GT
Gly 430
;AT
Asp
LAC
y
GG
1y cc lia ~he GcC liy 31n
ATT
Ile *Trp
ACT
Thr 335
GCC
Gly
CTC
Leu
XLAA
Lys
ATC
Ile
CAA
Glu 415
C
Gly
TTG
Leu
A-A
Lys
CCA
Pro
CCC
Ala 495
AAC
Asn
ACA
Thr Ly s TAT91 Ty r ATc 960 Met 320 CTC 1008 Leu GAG 1056 Giu TCG 1104 Trp CTC' 1152 Leu CCC 1200 Ara 400 AAT 1248 Asn CTC 1296 Val1 ACT 1344 Ser COT 1392 Arg TAT 1440 Tyr 480 TTA 1488 Leu GAC 1536 Asp CTC 1584 Leu trg Asp Ala Thr GAG CTC Ciu Leu 530 AAT ATT TTC CAT Asn Ile Phe His COT AAA GAG OGA ACC Cly Lys Ciu Gly Thr 540 CAA CAC GAG TTC 1632 Gin His Ciu Leu -156- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT1US96/18003 -7TC GCG AAT CTG AGT GAC ATC ATT GTO CAT CTG TAC ATC ATT 1530 .al Ala Asn Leu S-?r Asp Ile Ile Val His Leu Asn Tyr Ile i Arg 545 550 555 GCG TAA 1~99 Asp Ala INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: Ala As n As n Leu Giy Gin Ser Gin Ile Gin 145 Arg Giu Asn Ilie Giy As Leu Gly G 1y Let Asg LeL 130 Gir Al Va 21( G 1
I:
I
I)
(ii) MOLECUI (xi) SEQUENC Asp Thr Ala Val Leu Leu IArg His Asn 35 Ala Thr Pro Asp Gly Thr Trp Arg Tyr Leu Thr Gin 100 Asn Giu Lys 115 Lys His Gin Ser Leu Thr I Lys His Tyr 165 SAla Giy Leu 180 i Thr Gly Leu 195 1Phe Gly Leu 0 Ser Gly Gin .ENGTH: 563 amino acids ?YPE: amino acid 'OPOLOGY: linear ~E TYPE: protein E DESCRIPTION: SEQ ID Asn Ile Gly Asp G ly Asp Phe Leu Pro Pro Tyr Gly Leu ValI Gly 70 Pro Phe Met His Ala 150 Ser Thr Leu Ala Ala T'y r Ser Asp Ser Leu G iy Thr Asp 135 Leu Asp Leu Ile Asn 215 Thr Trp Leu 40 Pro Ser Leu Asn Ile 120 Ilie Gin Leu Arg Ala 20 0 Gly Gin Asp Lys 25 Asp Gly Lys Thr Pro Ala Val Giu 90 Ser Leu 105 Leu Leu Gin Gin Ala Ser Ile Asn 170 Ser Thr 185 Giy Gly Gly Ser Val Gly -157- Leu Gin Leu Gly Arg Gin Gin As n Arg 155 Gly Ala Ile Giu Ala Giu Pro Gin Gly Ala Thr Thr Asn 140 Asp Gly Met Ala Trp 220 Gly Leu Leu Arg Gin Arg Thr Gin 125 Leu Gly Leu Ile As n 205 G ly Ile Arg As n Gin G ly Ser Leu 110 Gin Lys Asp Ser Thr 190 Ala Ala Gin Leu Leu Gin Ser Aia Giu GiU Gly Thr Ala 175 As n Val1 Pro Asp SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/18003 Ser Giu Asp Gin Asp 305 Ala Pro Lys Aia Trp Ala Ile 290 Leu Giy Ile iu Gly Ala Gin 275 Thr Gin Arg Cys Ser 355 Ile Leu 260 Ile Leu Thr Leu Leu 340 Asp Ser 2145 Gin Gin Ser Thr Ser 325 Gin Ser 230 Giu Arg Ser Giu Arg 310 Ala Pro Leu Vai Thr Ala Gly 250 Asp le Ala Asp 265 Leu Gin Giu Gin 280 Thr* Giu Gin Ala 295 Phe Thr Gly Gin Leu Tyr Tyr Gin 330 Lys Ala Ala Leu 345 Phe Gin Val Pro
T
1 r As n Ile Asn Ala 315 Met ValI ValI Gin Giu Thr Ala 300 Leu Tyr Gin Trp IArg Arg Gin 25 Ile Thr GIn 270 Met Ala Gin 285 Gi1n Ala Ile Tyr Asn Trp Asp Ser Thr 335 Glu Leu Gly 350 Asn Asp Leu 360 Gin Asp 385 Thr Ile Thr Gin Arg 465 Gin Ser Ser P~ Glu L Val1 A 545 Asp A Giy 370 Ala Val As n Leu Leu 450 Ile ksp i s Leu Leu Ala Gly GIU Gly Leu Set Ser Glu Leu Ile Ser Lys Ala 435 Gly Lys Leu dly Phe 515 Asn Asn I Trp Leu ValI 420 Leu Leu Arg GlU Val 500 Leu Ile .eu Leu Asp 405 Leu Thr Asp Ile Ala 485 Ala 390 Thr As n Gly Asn Ala 470 Thr 375 Arg Leu Gly Asp Ser 455 ValI Leu Gly Phe Giu Ile 440 Tyr Thr ValI Gly Gly Thr 425 Phe Asn Leu Met Ile Thr 410 Val1 Gin Leu Pro Gly 490 Gly 395 Gly Ser Ala Gly Thr 475 Ala 380 *Leu Thr Pro Thr As n 460 Leu GlU Glu Leu Ser Leu 445 Glu Leu Ile Gin Ala Ser Gly 430 Asp Lys Gly Ala Lys Ile Giu 415 Gly Leu Lys Pro Ala 495
V
S
A
Tr 4 24C Leu Ly s T'yr Met 320 Leu Giu Trp .eu ~rg 100 s n al1 e r rg yr eu Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp D05 510 Phe His Asp 550 Gly Arg 520 Gly Lys Ile Val Ala Gly Leu 555 -158- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 (21 1,JF-ORMTICN FOR SEQ ID (i SEQUENCE
CHARACTERISTICS:
LENGTH: 4458 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: l. .4458 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: ATG CAG GAT TCA CCA GAA GTA TCC ATr ACA ACG CTG TCA Met Gin Asp Ser Pro Giu Val Ser Ile Thr Thr Leu Ser CTT ccc AA Leu Pro Lys COT GGC GOT Cly Gly Gly cT GAT GGA Pro Asp Gly
OCT
Ala ATC AAT CCC ATO Ile Asn Gly met GAA GCA CTG AAT Ciu Ala Leu Asn OCT OCC GGc Ala Ala Cly TCG ACC GGC Ser Thr Gly ATG GCC TCC CTA Met Ala Ser Leu CTG CCA TTA CCC Leu Pro Leu Pro AGA GOO Arg Gly ACG OCT CCT GGA Thr Ala Pro Gly
TTA
Leu 55 TCG CTG ATT TAC Ser Leu Ile-Tyr
AGC
Ser AAC AGT GcA GGT Asn Ser Ala Gly GGG, CCT TTC GOC Gly Pro Phe Gly GGC TOO CAA TGC Giy Trp Gin cys OTT ATG TCC ATT Val Met Ser Ile CGA cGC ACC CAA Arg Arg Thr Gin 0CC ATT CCA CAA Gly Ile Pro Gin GOT AAT GAC GAC Gly Asn Asp Asp ACG TTC Thr Phe 96 144 192 240 288 336 384 432 480 528 CTA TCO CCA Leu Ser Pro CAA cCT GAT Gin Pro Asp 115
CAA
Gln
AAT
Asn 105 ATC Cc CTG AAT Ile Ala Leu Asn GAC CAA. CCC Asp Gin Cly 110 OTT ACC TTG Val Thr Leu ATC CGT CAA Cc Ile Arg Gin Asp AAA ACG CTC CAA Lys Thr Leu Gin CCA ATT Pro Ile 130 TCC TAT ACC GTO Ser Tyr Thr Val coc TAT CAA CC Arg Tyr Gin Ala
CC
Arg 140 CAG ATC CTC CAT Gin Ile Leu Asp
TTC
Phe 145 ACT AAA,_ ATC CAA Ser Lys Ile Giu TOC CAA CCT CC Trp Gin Pro Ala GOT CAA CAA GCA Gly Gin Giu Gly OCT TTC TOO CTG Aia Phe Trp Leu TCO ACA'*CO CAC Ser Thr Pro Asp CAT CTA CAC ATC His Leu His Ile TTA CCC Leu Ciy AAA ACC 0CC CAC CCT TCT CTC GCA AAT CCGCcAA AAT GAC CAA CAA ATC LYS Thr Aia Gin Ala cys Leu Ala Asn Pro Gin Asn Asp Gin Gin Ilie -159- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCTIUS96/1 8003 GC--CAG TGG TTG CTG GAA A Ala Gin Trp Leu Leu Glu Giu
ACT
Thr I100 OTG ACG CC Val Thr Pro Ala GAA CAT OTC 624 Glu His val AGC TAT -3e r T',r 210 CAA TAT CGA GCC Gin Arg Ala GAT GAA GCC CAT Asp Glu Ala His
TGT
Cys
A.XZL
Lys
CGC
Arg 235 TAT CTG GTA CAG Tyr Leu Val Gin ACTAC GGC AAC Asn Tyr Gly Asn
ATC
Ile
AGC
Ser 250 CTG TTC GTA CTG Leu Phe Val Leu GAT AAC Asp Asn 255 GCA CCT CCC Ala Pro Pro GOT GAG CGC Gly Glu Arg 2175
OCA
Ala 260 CCG OAA GAO TOO Pro Oiu Giu Trp TTT CAT CTG OTC Phe His Leu Val TTT GAC CAC Phe Asp His 270 OAT OCA GOT Asp Ala Gly OAT ACC TCA CTT Asp Thr Ser Leu ACC OTO CCA ACA Thr Val Pro Thr
TOO
Trp 285 ACA OCO Thr Ala 290 CALA TOO TCT OTA Gin Trp Ser Val
COC
Arg 295 CCG OAT ATC TTC Pro Asp Ile Phe
TCT
S er 300 COC TAT OAA TAT Arg Tyr Glu Tyr
GOT
Gly 305 TTT OAA OWG COT Phe Olu Val Arg
ACT
Thr 310 COC COC TTA TOT Arg Arg Leu Cys CAA OTO CTG ATO Gin Val Leu Met
TTT
Phe 320 912 960 1008 1056 CAC COC ACC OCO His Arg Thr Ala ATO 0CC OGA OAA Met Ala Gly Olu AOT ACC AAT GAC Ser Thr Asn Asp 0CC CCO Ala Pro 335 OAA CTO OTT Glu Leu Val ACC ACO TTO Thr Thr Leu 355
GGA
Gly
TCO
Ser 365 GAC 000 AGO 1104 Asp Oly Arg CCA OTC Pro Val 370 ACC CAG CCA CCA Thr Gin Pro Pro
CTA
Leu 375 OAA CTA 0CC TOO Oiu Leu Ala Trp COO TTT OAT CTG 1152 Arg Phe Asp Leu
GAG
Glu
CTO
Leu 410 COO OGA OAA 000 Arg Gly Glu Oly TTO CCA Leu Pro 415 1248 GOT ATO CTO oly Met Leu COT CAG GAA Arg Gin Olu 435
TAT
Tyr 420 CAA OAT COA OGC Gin Asp Arg Oly
OCT
Ala 425 TOO TOO TAT AAA Trp Trp Tyr Lys OCT CCO CA.A 1296 Ala Pro Gin 430 AAA ATC 0CC 1344 Lys Ile Ala OAC OGA OAC AOC Asp Gly Asp Ser
AAT
As n 440 0CC OTC ACT TAC Ala Val Thr Tyr 160- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 Pro Lsu 450 CT TA CCC Pro Thr Leu Pro Asn 455 TTC CAG GAT AAT Leu Gin Asp Asn CC TCA TTG ATGC GAT Ala Ser Leu Met Asp 460 ACC CCC TCC GCT AT? 1440 Thr Ala Ser Gly Ile 480 AA C CGA CAC GOC Asn Gly Asp Cly CTC GAT TGC GTT Leu Asp Trp Val
GTT
Val
CC
Ala 500 TTC CCC CTC CAA Leu Pro Val Ciu TTT CAT CCA AC Phe His Pro Ser ATC CAC TTC 1536 Ile Gin Phe 510 ATC CCC CCC 1584 Ile Ciy Pro ACC CCC CCA GC Thr Cly Ala Cly TCT CAT TTA CTC Ser Asp Leu Val AAA, AC Lys Ser 530 CTC CCT CTA TAT Val Arg Leu Tyr
CC
Ala 535 AAC CAC CCA AAC Asn Gin Arg Asri TCC CCT AAA CCA 16321 Trp Arg Lys Cly GAAk Ciu 545 CAT CTC CCC CAA Asp Val Pro Gin ACA CCT ATC ACC Thr Cly Ile Thr CCT CTC ACA CCC Pro Val Thr Cly 1680 CAT CCC CCC AAA Asp Aia Arg Lys CTC CC TTC ACT Vai Ala Phe Ser ATC CTC CC? TCC Met Leu Cly Ser CC? CAA 1728 Cly Gin 575 CAA CAT CTC CTC CAA ATC AAC CCT AAT CCC CTC ACC TCT Gin His Leu Val Ciu Ile Lys Cly Asn Arg Vai Thr Cys TCC CCC AAT 1776 Trp Pro Asn 590 CGA TTT ACC 1824 Cly Phe Ser OTA CCC CAT Leu Cly His 595 CCC CCT TTC CC? Cly Arg Phe Cly CCA CTA ACT CTC Pro Leu Thr Leu CAC CCC Gin Pro 610 CAA AAT ACC TTC Ciu Asn Ser Phe
AAT
Asn 615 CCC CAA CCC CTC Pro Ciu Arg Leu CTC CC CAT ATC 1872 Leu Ala Asp Ile
GAC
Asp 625 GCC TCC CCC ACC Gly Ser Cly Thr CAC CTT ATC TAT Asp Leu Ile Tyjr
CC
Ala 635 CAA TCC CCC TCT Cmn Ser Cly Ser TTC 1320 Leu 640 CTC AT? TAT CTC AAC CAA ACT CC? AAT Leu Ile Tyr Leu Asn Gin Ser Cly Asn 645 TTT CAT CC CCC Phe Asp Ala Pro TTC ACA 1968 Leu Thr 655 TTA CC TTC Leu Ala Leu CTC CCC CAT Val Ala Asp 675
CCA
Pro 660 CAA CCC CTA CAA Clu Cly Val Gin
TTT
Phe 665 GAC AAC ACT TC Asp Asn Thr Cys CAA CTT CAA 2016 Gin Leu Gin 670 CTC ACT CTC 2064 Leu Thr Val AT? CAC CCA TTA Ile Gin Cly Leu
CCC
Gly
CTC
Leu 700 TCA CTC ACC A.2A 2112 Ser Leu Thr Lys C!C-C TGG TTC TTC kAT CTA ATC AAC AAT AC CCC CCC CCA CAT CAC ACC 2160 Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Cly Ala His His Thr -161- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US9618003 CTA CAT TAT CGT Lau His Tyr Arg Ser 725 TC C 3C CAA TTC Ser Ala Gin Phe TTG GAT .A-A Lau Asp Giu CTC ACC AAAA Leu Thr Lys CAT TTGC CT.: His Leu Leu 755 GGC A TCT COG Cly Lys Ser Pro TGT TAT CTO CCG Cys 7TYr Leu Pro APA TTA GA6G 2208 Lys L,=u Gin -715 TTT ATG 22 5 6 Phe Pro Met 750 GCG~C COG 2304 Cly Asn Arg TCG TA' T AOGAA Trp Tyr Thr Giu
ATT
Ile 760 CAG GAT GA. ATC Gin Asp Ciu Ile CTC ACC Leu Thr 770 ACT CAA CTC AAC Ser Ciu Vai Asn ACC CAC CCC CTC Ser His Ciy Vai CAT OCT AAA GAG 2352 Asp Cly Lys Giu
CCC
Arg 785 CAA TTC ACA CCA Ciu Phe Arg Giy
TTT
Phe 790 CCC TCC ATC AAA Cly Cys Ilie Lys
CAC
Gin 795 ACA CAT ACC ACA Thr Asp Thr Thr ACG 2400 Thr 800 TTT TCT CAC GC Phe Ser His Gly CCC CCC GAA CAC Aia Pro Giu Gin CCA CCC TCG AiLa Pro Ser AGC TGC TTT Ser Trp Phe CAA TAT TCC Ciu Tyr Trp 835 ACC CCC AT GCAT Thr Giy Met Asp CTA GAC ACC CAA Vai Asp Ser Gin CTC ACT ATT 2448 Leu Ser Ile 815 TTA GCT AG* G 2496 Leu Ala Thr 830 CAA ACC CCT 2544 Ciu Thr Arg CAC CCA CAC ACC Gin Ala Asp Thr
CAA
Gin 840 OCT TAT ACC CCA Ala Tyr Ser Ciy
TTT
Phe 845 TAT ACC Tyr Thr 850 CTC TCG CAT CAC ACC Val Trp Asp His Thr 855 AAC CAC ACA GAC Asn Gin Thr Asp CCA TTT ACC CCC 2592 Ala Phe Thr Pro
AAT
Asn 865 GAG ACA CAA CCT Giu Thr Gin Arg
AAC
As n 870 TCG CTC ACC CCA Trp Leu Thr Arg
C
Ala 875 CTT AAA CCC CAA Leu Lys Cly Gin CTG 2640 Lau 880 CTA CCC ACT GAG Lau Arg Thr Ciu TAC CCT CTC CAC Tyr Gly Leu Asp ACA CAT AAC CAA Thr Asp Lys Gin ACA CTC 2688 Thr Val 895 CCT TAT ACC Pro Tyr Thr ACT CAA TCC CC Ser Giu Ser Arg TAT CAC CTA CCC TCT ATT CCC GTA 2736 Tyr Gin Val Arg 5cr Ile Pro Val 905 910 TGC GTC ACT GCT ATT CAA AAT CCC 2784 Trp Val Thr Ala Ile Giu Asn Arg AAT AAA CAA ACT CAA TTA TCT Asn Lys Giu Thr Giu Leu Ser ACC TAC Ser Tyjr 930 CAC TAT CAA COT His T'Ir Giu Arg ATC ACT CAC CCA CAC TTC ACC CAC ACT 2832 Ile Thr Asp Pro Gin Phe Ser Gin Ser 940
ATC
Ile
TCA
5cr 955 CTO CAA ACT CTC Leu Gin 5cr Val CAT 2880 Asp 960 ATT CCC TCC CCC Ile Ala Trp Pro CCC CAA AAA CCA Arg Ciu Lys Pro
CA
Ala
Ilie Ile Gin Thr Gin Asp Ala Ala Oly Leu Thr 1170 1175 1180 ACO CAA 0CC CAT TAC OAT TAT COT TTC CTT ACA CCO OTA CAA CTO ACA 3600 Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val GIn Leu Thr 1185 1190 1195 1200 OAT ATT AAT OAT AAT CAA CAT ATT OTO ACT CTG GAC GCG CTA GOT COC 3648 Asp Ile Asn Asp Asn Gin His Ile Val Thr Leu Asp Ala Leu Oly Arg 1205 1210 1215 OTA A'CC ACC AOC COG TTC TOG GOC ACA GAG OCA OA CAA 0CC OCA OGC 3696 Val Thr Thr Ser Arg Phe Trp Gly Thr Olu Ala Oly Gin Ala Ala Gly 1220 1225 1230 TAT TO-C AAC CAG CCC TTC ACA CCA CCO GAC TCC OTA OAT AAA OCO CTG 3744 Tyr Ser Asn GIn Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu -163- SUBSTITE SHEET (RULE.26) WO 97/1 7432 CA? TTA ACC GCC Ala Leu Thr Gly 1250 GAT AGC TOG ATG Asp Ser Trp Met 1265 PCT/US96/I 8003 1240 CTC =Z GTT GCC Leu Pro ValAl 1255 TCC TTA TCT TTG Ser Leu Ser Leu 1270 124 CAkA TOT TTA C;TC Cini Cys Leu val 1260 TCT CAG CTT TCT Ser Gin Leu SerC 1275
TT
Va, CAA 3840 Gin GAG CA GA.
Glu Glu Ala Glu GC CT?. TOG GCG CAA CTC COT GCC GCT CAT ATC ATT 3888 Ala Leu Trp Ala 12485
ACC
Thr
CAT
His
GAA
Clu
CAC
Gin CAT CCC AAA CTC TOT C Asp ClY Lys 1300 AAC CTC ACG Asn Leu Thr 1315 11A. CCG~ CCJA CZAT CT?. CTO CCC Arg Leu Pro Pro His Val Leu Gly 1330 1335 CAT CCC CAA CAC CAC CAC CAA CAG Asp Pro Gin Gin Gin His Gin Gin 1345 1350 CCC CCC TTA CTC CAG ACT TCA OCT Gly Arg Leu Leu Gin Ser Ser Ala 1365 CAA CCT AAA GAO CAT CCC COG CTG Gin Arg Lys Giu Asp Gly Gly Leu 1380 CTC ACT CCC CCT ACA GAC ACC CC?.
Val Ser Ala Pro Thr Asp Thr Arg 1395 1400 TAT CAC CAC AAA GGC CAA CCT OTG Tyr Asp Asp Lys Gly Gin Pro Val 1410 1415 AAT CAC TOG COT TAC OTT ACT CAT Asn Asp Trp Arg Tyr Val Ser Asp 1425 1430 CCC OAT ACC CAC CTT TAT CAT CCA Ala Asp Thr His Leu Tyr Asp Pro 1445 ACT OCT AAG AAA TAT TTG CGIGA Gin Leu Arg Ala Ala His Met Ile 1290 1295 TTA ACC CCC AAA CCA OCA AC?. ACC 3936 Leu Ser Cly Lys Arg Cly Thr Ser 1305 1310 ATT TCC CT. TTG C? ACT ATT CCC 3984 Ile Ser Leu Leu Ala Ser Ile Pro 0 1325 ATC ACC ACT CAT CC-C TAT CAT ACC 4032 Ile Thr Thr Asp Arg Tyr Asp Ser 1340 ACC CTC ACC TTT ACT CAC COT TTT 4080 Thr Val Ser Phe Ser Asp Cly Phe 1355 1360 COT CAT GAG TCA GOT OAT CCC TOO 4128 Arg His Olu Ser Cly Asp Ala Trp 1370. 1375 OTC OTO CAT C? AAT CGC OTT CTO 4176 Val Val Asp Ala Asn Cly Val Leu 1385 1390 TCO CCC OTT TCC OCT CCC AC?. CAA 4224 Trp Ala Val Ser Oly Arg Thr Clu 1405 COT ACT TAT CA?. CCC TAT TTT CTA 4272 Arg Thr Tyr Gin Pro Tyr Phe Leu 1420 CAC AGC C? CC?. CAT CAC CTO TTT 4320 Asp Ser Ala Arg Asp Asp Leu Phe 1435 1440 TTG OGA COG CAA TAC AAA OTC ATC 4368 Leu Oly Arg Olu Tyr Lys Val Ile 1450 1455 AAG CTC TAC ACC CCC TCC TTT ATT 4416 rm, I I Aa ys L~ys Tyr Leu Arg Olu Lys Leu Tyr Thr Pro Trp Phe Ile 1460 1465 1470 CTC ACT GAG Val Ser ClU 1475 CAT OAA AAC OAT Asp Olu ?.sn Asp ACA CCA TCA ACGA ACC Thr Ala Ser Arg Thr 1480 CC?. TAO Pro 1485 4458 INFORMATION FOR SEQ ID NO:32: -164- SUBSTITUT SHEET (RULE 2q~ WO 97/1 7432 PCT/US96/1 8003 ki) S,:EQUENCE CHAR-ACTERISTICS: LENGTH: 1486 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: Met Gin Asp Ser Pro Glu Val Ser Ilie Thr Thr Lau Ser Leu Pro Lys Gly Gly GlY Ala Pro Arg Asn Arg Lau Gin Pro Phe 145 Ala Lys Ala Ser Lys 225 Asn Ala Gly Thr.
Asp Gly Gly Arg Ser Pro Ile 130 Ser Phe Thr Gin Tyr 210 Thr T'yr Pro G lu Gly Thr Pro Thr Pro Asp 115 Ser Lys Trp Ala Trp 195 Gin Ala dly Pro Arg 275 Gin Met Ala Phe Gin Gin 100 Ile Tyr Ile Leu Gin 180 Leu Tyr His Asn Al1a 260 A.sp rrp Ile Ala Pro Gly His Gly Arg Thr GiU Ile 165 Ala Leu Arg Pro Ile 245 Pro Thr Ser As r Ser Gly Ile 70 dly Giu Gin Val Tyr 150 Ser Cys Glu Ala As n 230 Lys Glu Ser ValI Gly Leu Leu 55 Gly Ile Val1 Asp Thr 135 Trp Thr Leu Glu Glu 215 Val1 Pro Giu Leu Arg Met Ser 40 Ser Trp Pro Met Val1 120 Arg Gin Pro Ala Thr 200 Asp Thr Gin Trp His 280 Pro Gly 25 Leu Leu Gin Gin Asn 105 Lys Tyr Pro Asp As n 185 ValI Glu Ala Ala Leu 265 Thr *Glu *Pro Ile Cys Tyr 90 Ilie Thr Gin Ala Gly 170 Pro Thr Ala Gin Ser 2150 Phe Val1 Ala Leu Tyr Gly Gly Ala Leu Ala Ser 155 His Gin Pro His Arg 235 Leu His Pro Leu Pro Ser ValI As n Leu Gin Arg 140 Gly Leu Asn Ala Cys 220 Tyr Phe Leu Thr Asn Lau As n Met Asp As n Gly 125 Gin Gin His Asp Gly 205 Asp Leu ValI Val1 Trp Ala Se r Ser Ser Asp Asp 110 ValI Ile Giu Ile Gin 190 Glu Asp Val1 Lau Phe 270 Asp Ala Thr Ile Thr Gin Thr Lau Gly Leu 175 Gin H is Asn Gin Asp 255 Asp Ala Gly G ly Gly Ser Phe Gly Leu Asp Arg 160 Gly Ile ValI Giu Val1 240 As n His Gly 285 Asp Ile Phe Ser Arg Tyr Glu T'yr -165- SUBSTITTE SHEET (RULE 26) .WO 97117432 PCTIUS96/1 8003 290 295 300 Gly Phe Giu Vai Arg Thr Arg Arg Leu Cys Gin Gin Val1 Lau' Met Phe 305 310 315 320 His Arg Thr Ala Lau Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro 3 25 330 335 Giu Leu Val Giy Arg Leu Ile Leu Glu Ty,,r Asp Lys Asn Ala Ser 11alI 340 345 350 Thr Thr Leu Ile Thr Ile Arg Gin Leu Ser His Giu Ser Asp Gly Arg 355 360 365 Pro Val Thr Gin Pro Pro Leu Giu Leu Ala Trp Gin Arg Phe Asp Lau 370 375 380 Giu Lys Ilie Pro Thr Trp Gin Arg Phe Asp Ala Leu Asp Asn Phe Asn 385 390 395 400 Ser Gin Gin Arg Tyr Gin Leu Val Asp Leu Arg Gly Giu Giy Leu Pro 405 410 415 Giy Met Leu Tyr Gin Asp Arg Gly Ala Trp Trp Tyr Lys Aia Pro Gin 420 425 430 Arg Gin Giu Asp Gly Asp Ser Asn Ala Vai Thr Tyr Asp 
Lys Ile Ala 435 440 445 Pro Leu Pro Thr Leu Pro Asn Leu Gin Asp Asn Aia Ser Leu Met Asp 450 455 460 Ile Asn Gly Asp Gly Gin Leu Asp Trp Val Vai Thr Ala Ser Giy Ile 465 470 475 480 Arg Gly 7yr His Ser Gin Gin Pro Asp Giy Lys Trp Thr His Phe Thr 485 490 495 Pro Ile Asn Ala Leu Pro Val Giu Tyr Phe His Pro Ser Ile Gin Phe 500 505 510 Ala Asp Leu Thr Giy Ala Gly Leu Ser Asp Leu Val Leu Ile Giy Pro 515 520 525 Lys Ser Val Arg Leu Tyr Ala Asn Gin Arg Asn Gly Trp Arg Lys Gly 530 535 540 Giu Asp Val Pro Gin Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr 545 550 555 560 Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gin 565 570 575 Gin His Leu Val Glu Ilie Lys Gly Asn Arg Vai Thr Cys Trp Pro Asn 580 585 590 Lau Gly His Giy Arg Phe Gly Gin Pro Leu Thr Leu Ser Gly Phe Ser 595 600 605 Gin Pro Glu Asn Ser Phe Asn Pro Giu Arg Leu Phe Leu Ala Asp Ile 615 620 Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gin Ser Gly Ser Leu 625 630 635 640 Leu Ile Tyr Lau Asn Gin Ser Giy Asn Gin Phe Asp Ala Pro Leu Thr -166- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCTIUS96/1 8003 Leu Val1 Pro Pro 705 Leu Leu His Leu 257 Arg C 785 Phe S A la Ala His 690 Trp His Thr Leu rhr ~70 1 U er .645 Leu Pro Giu 660 Asp Ile Gin 675 Ile Ala Pro Lau Leu Asn Tyr Arg Ser 725 Lys Ala Gly 740 G ly Gly His Val1 710 Ser Lys Va I Leu His 695 Met Ala GJin G ly 680 Trp As n Gin Phe 665 Ile Arg As n Phe Ala Cys As n Trp 730 Cys Asn Thr Cys Ser Leu Ile 685 Asp Leu Ser 700 Arg GIy Ala 715 Leu Asp Glu Tyr Leu Pro Gin Liu 670 Lau Thr His His Lys Leu 735 Phe Pro Gin1 Val1 Lys Thr Gin Met 745 Lau 755 Ser Phe His Trp Giu Arg Gly Tyr ValI Gly Thr Asn Phe 790 Ala Giu Tyr 775 Gly Pro Gin Asp Glu Ile Ser Gly Asn Gly Val Lys Gir 799 Ala Ala 810 805 Ser Trp Phe Ala Thr Gly Met Asp Giu Val Giu Tyr As n 865 Lau Pro Asn Ser Ile 945 Ile Thr Leu Tyr Thr 850 Glu Arg Tyr Lys Tyr 930 Lys Trp 835 ValI Thr Thr Thr Glu 915 His Leu 820 Gin Trp Gin Giu ValI 900 Thr Tyr Gin Ala Asp Arg Leu 885 Ser Giu Glu His *Asp His Asn 870 Tyr 
Giu Leu Arg Asp 950 Thr Thr 855 Trp Gly Ser Ser Ile 935 Gin 840 Asn, Leu Leu Arg Ala 920 825 Ala Gin Thr Asp Tyr 905 Trp Tyr Thr Arg Gly 890 Gin Val1 Asp Ser Asp Ala 875 Thr Val1 Thr 765 Trp Asp 780 L Thr Asp Pro Ser Ser Gin Gly Phe 845 Gin Ala 860 Leu Lys Asp Lys Arg Ser Ala Ile 925 Gly Thr Leu Lau 830 Giu Phe Gly Gin Ile 910 3lu Ly s Thr Ser 815 Ala Thr Thr Gin rhr 895 Pro ks n Arg Giu Thr 800 Ile Thr Arg Pro Leu 880 ValI Val Arg Ile Thr Asp Pro Gin Phe Ser Gin Ser 940 Ile Phe Gly Gin Ser Leu GIn Ser Val Asp 955 960 Ala Leu Trj Pr Arc p P~ro Arg Arg Giu Lys Pro Ala 965 970 0 Giu Thr Leu Phe Asp Ser Ser 980 985 9 Leu Val Arg Gin Lys Asn Ser -167- SUBSTITUTE SHEET (RULE 26) 7Tyr Gin 990 Leu WO 97/17432 PCT/US96/18003 L9 1O00 1005 Gly Glu Asn Trp Arg Leu Giv Leu Pro Asn Ala Gin Arg Arg Asp 7ai 1010 1015 1020 Tyr Thr Tyr Asp Arg Ser Lys lle Pro Thr Glu Gly Ile Ser Leu Glu 1025 1030 1035 1040 Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val 1045 1050 1055 Tyr Leu Gly Gin Gin Gin Thr Phe Tyr Thr Ala Gly Gin Ala Glu Val 1060 1065 1070 Thr Leu Glu Lys Pro Thr Leu Gin Ala Leu Val Ala Phe Gin Glu Thr 1075 1080 1085 Ala Met Met Asp Asp Thr Ser Leu Gin Ala Tyr Glu Gly Val Ile Glu 1090 1095 1100 Glu Gin Glu Leu Asn Thr Ala Leu Thr Gin Ala Gly Tyr Gin Gin Val 1105 1110 1115 1120 Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg 1125 1130 1135 Gin Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gin Phe Trp Arg Pro Gin 1140 1145 1150 Ala Gin Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1155 1160 1165 Thr His His Cys Val Ile Ile Gin Thr Gin Asp Ala Ala Gly Leu Thr 1170 1175 1180 Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gin Leu Thr 1185 11190 1195 1200 Asp Ile Asn Asp Asn Gin His Ile Val Thr Leu Asp Ala Leu Gly Arg 1205 1210 1215 Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gin Ala Ala Gly 1220 1225 1230 Tyr Ser Asn Gin Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu 1235 1240 1245 Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val Tyr 
Ala Val 1250 1255 1260 Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin 1265 1270 1275 1280 Glu Glu Ala Glu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Met Ile 1285 1290 1295 Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310 His Gin Asn Leu Thr Ile Gin Leu Ile Ser Leu Leu Ala Ser Ile Pro 1315 1320 1325 Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser 1330 1335 1340 Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe -168- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTIUS96/18003 1345 1350 1355 Gly Arg Leu Leu Gin Ser Ser Ala Arg His Giu Ser Gly Asp 7 rp 1365 1370 17 Gin Arg Lys Glu Asp Gly Gly Leu Val Vai Asp Ala Asn Gly Val Lau 1380 1385 1390 Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr.3lu 1395 1400 1405 Tyr Asp Asp Lys Oly Gin Pro Val Arg Thr Tyr Gin Pro Ty 1 r Phe Lau 1410 1415 1420 Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe 1425 1430 1435 14 Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Giu Tyr Lys Val ile 1445 1450 1455 Thr Ala Lys Lys Tyr Leu Arg Oiu Lys Leu Tlyr Thr Pro Trp Phe Ile 1460 1465 1470 Vai Ser Olu Asp Giu Asn Asp Thr Ala Ser Arg Thr Pro* 1475 1480 1485 INFORMATION FOR SEQ ID NO:33: SEQUENCE CHARACTERISTICS: LENGTH: 3288 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: ATO GTG ACT GTr ATO CAA AAT AAA ATA TCA TT'T TTA TCA GOT ACA TCC 48 Met Val Thr Val Met Gin Asri Lys Ile Ser Phe Leu Ser Gly Thr Ser 1 5 10 GAA CAG ccc CTG CTT GAC GCC GGT TAT CAA AAC GTA ?ITr GAT ATC GCA 96 Giu. Gln Pro Leu. Leu, Asp Ala Gly Tyr Gin Asn Val Phe Asp Ile Ala 25 TCA ATC AGC COG GCT ACT TTC G 1TI CAA TCC GTT ccc ACC CTG CCC GTT 144 Ser Ilie Ser Arg Ala Thr Phe Val Gin Ser Val Pro Thr Leu, Pro Vai 35 40 AAA GAG GCT CAT ACC GTC TAT CGT CAG GCG CGG CAA CGT GCG GAA AT 192 Ly-,s Giu. 
Ala His Thr Val Tyr Arg Gin Ala Arg Gin Arg Ala Giu Asn 55 CTG AAA TCC CTC TAC CGA GCC TOG CAA TTO CGT CAG GAG CCG OTT ATT 240 Leu Lys Ser Leu Tyr Arg Ala Trp Gin Leu Arg Gin Giu Pro Val Ile 70 75 AAA GOG CTO; OCT AAA ClTP AAC CTA CAA TCC AAC OTT TCT GTO CTT CA' 1 238 Lys Gly Leu Ala Lys Leu Asn Lau Gin Ser Asn Vai Ser Val Leu G-'n 90 OAT OCT TTG OTA GAG AAT ATT GOC GOT OAT 000 OAT TTC AOC OAT TTA 336 Asp Ala Leu Val Giu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Lau -169- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTUS96/1 8003 100 -TG AAC CCT GCC Met Asn Arg Ala 115 AGT CAA TAT Ser Gin Tir 105 GCT GAC Ala Asp 120 110 GCT GCC TCT ATT CAA TCC CTA Ala Ala Ser Ile Gin 3cr Leu 125
TTT
Phe I
CTG
Leu 145
CTG
Leu
TCA
Ser 130
CAT
His
AAG
Lys CCG GGC CGT TAT GCT TCC GCA CTC TAC AGA GTT GCT AA (G1' rrA Lys
GAT
Asp G.L y
TCA
Ser
CTG
Leu
TCC
Ser 150
TTA
Leu TCC CTT CAT Ser Leu Asp 165 ACT GAG CTG .Thr
CAT
His
AAC
Asn 225
CAA
Glu
GGT
Gly
GTC
Val
GGG
Gly
CC
Ala 305 ALzL:, Lys
CCT
Gly
CCC
Pro Glu
CTG
Leu 210
CGT
Cly
CAA
Gin
AAT
Asn
GGA
Cly Asn 290
GCC
Ala
CAG
Gin
AAT
Asn
GAT
Asp Leu 195
TCC
Ser Val
ACC
Thr
CAA
Gin
TTC
Phe 275
GGC
Cly
GCC
Ala
TGT
Cys GT'r Val Lys 355 ATC TTC TTC Ile Leu Leu 180 TCC CC CCA Ser Cly Ala CAA ATC CAT Gin Ile Asp TGG AAT ACT Trp Asn Thr 230 AGT AAT ACG Ser Asn Thr 245 CAT ACA ?I'T Asp Thr Phe 260 ACC GGG CAA Ser Cly Gin ATT CTC CGC Ile Val Cly CCA ATA CTC Ala Ile Val 310 TAC TAC CTC T1yr Tyr Leu 325 CTG ACC GGC Leu Thr Gly 340 CAC CCT ATT Asp Gly Ile Ala 135
ACT
Ser
AGC
Ser
CAT
Asp
TTC
Phe
TCC
Ser 215
TTG
Leu
AAT
Asn
TTT
Phe
CCT
Pro
CCA
Ala 295
CCA
Ala GTC C Va. I TGT 'I Cys I 1TT c Phe 3 Cit Glu
CTC
Val
TTC
Phe 200
CCT
Ala
ACA
Thr
ACA
Thr
TCC
Ser
ATC
Met 280
CAA
GIn
:CG
?ro
;CT
kla
ITC
?he
CT
1Ia 60
ACC
Thr
CTA
Leu 185
CCA
Pro
TTA
Leu
CAT
Asp
ACG
Thr 170
CAA
Gin
ATG
Met
TCG
Ser
ACC
Thr 5cr Ala Lau TTG CAT ATT Leu His lie CGC AAA Arg Lys 250 GGA AAC Cly Asn 265 GTT TAC Val Tyr TTG ATT Leu Ile TTr AAA Leu Lys CCC CAT Pro Asp 330 TTA ACA Leu Arg 345 CAG GTA Gin Val Ty r Arg 140 CAT AA;T Asp Asn 155 ATG AAT Met Asn AAA GGC Lys Cly ACG TTA Thr Lau CCA CAA Ala Gin 220 ACG CCA Thr Ala 235 CTG TTC Leu Phe ACT TTT Thr Phe CTG TCA Leu Ser GCA GGT Ala Cly 300 CTC ACT Leu Thr 315 GGT ACA Cly Thr CCC AALC Gly Asn S CCC AAC Ala Asn L 3
CGC
Arg
AAA
Lys
CCT
Cly
CCT
Pro 205
GCC
Ala
CAA
Gin
GCT
Ala
TAT
Tyr
CAG
Gin 285
.AT
Asn rGG Prp kCG rhr
~GC
;er 'ys .65 Val Ala Lys Asp CGC CCT Arg Ala GAG CTC Glu Val 175 A-A GAT Lys Asp 190 TAT CAC ryjr Asp AGA ACG Arg Thr CCC CTT Ala Val CCC CAA Ala Gin 255 TTC AAA Phe Lys 270 TAC ACC Tyr Thr CCA CAC Pro Asp TCA ATG I Sar Met
CAT
Asp 160
ACT
Thr
ATT
Ile
CAT
Asp
CTC
Leu
TCA
Ser 240
CAT
Asp
CCC
Ala
AGC
Ser
CAA
Gin
GCA
Ala 320
GAC
Asp,
AAC
sn
AGT
S ar 432 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 -170- SUBSTITUTE SHEET (RULE 26W WO 97/17432 ACT CAG Thr Gin 370 CCT TTG CCA AGC TIC CAT CTG CCG GTC ACA CTG GAA Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu PCTUS96/18003 CAC AGC 1152 His Ser 375 380 GAG AAT AAA GAT CAG Glu Asn Lys Asp Gin
GTA
Val
GGG
GAT AGT TCC GGA Asp
ACA
Sei ANl Gly Thr Ly
GGC
Gly
ATT
Ile
TAT
Tyr 465
CAT
His
AGC
Ser
ACC
Thr
TAC
Tyr
TTT
Phe 545
CTG
Leu
CAG
Gin
TTA
Leu
CTA
Leu
GTGC
Val
ACT
Thr
CCA
Pro 450
CAA
Gin
TAT
Tyr
GAG
Glu
CGG
Arg
AGT
Ser 530
GGG
Gly
AAC
Asn
ACT
Thr
GCC
Ala
TCA
Ser I 510 CG C
CCC
Prc 435
TCC
Ser
TTG
Leu
GGT
Gly
TTG
Leu
TTA
Leu 515
CAA
Gln
GAA
Glu
AGC
Ser
GAT
Asp 3GT .,ly 595 ?he
'AC
r Ser Gly 405 GAC AAG Asp Lys 420 ACA AAC Thr Asn CCG CCA Pro Pro ATG ACC Met Thr TTT AAC Phe Asn 485 ACC AGC Thr Ser 500 AGC TTC Ser Phe AGC AGC Ser Ser ACC ACC Thr Thr ACA CTG C Thr Leu 2 565 GGC AAG Gly Lys 1 580 CGC GCT C Arg Ala C GAA GAA r Glu Giu L CAC CAC G i
Z
r
T
P
I
j TAC TAT Tyr Tyr 390 CAG TCA Gin Ser GGG CTG Gly Leu CCT GAT Pro Asp GCC CGC Ala Arg 455 AAT CCG Asn Pro 470 3GC GCT Uly Ala AA CTG .ys Leu kAT CAG ~sn Gin TT GAT :le Asp 535 CA ACC 4 ro Thr *50 CA GAC C la Asp 2 .GC CTA 2 er Leu I AA AAG C iu Lys r 6 TG GAC 1 eu Asp I 615 AC AAA A
CTC
Lei AA1 Asr
TTA
Leu
GAT
Asp 440
GAA
Glu
GCA
Ala
AGC
Ser
AAT
Asn
'TA
Leu 520
GCG
Ala
GC
krg
CG
kla
~AT
~sn
:TG
aeu i00 7GG rp rTT
SAAJ
Ly Trg
TTP
Leu 425
GTG
Val
ACA
Thr
CCG
Pro
TTA
Leu
TCT
Ser 505
ATG
Met
AAA
Lys
GTC
Val
GCT
Ala
TTC
Phe 585
GTA
Val
CTG
Leu
ACA
Thr
;AAA
Lys 410
ACC
Thr
ATT
Ile
CTG
Leu
ACA
Thr
CGG
Arg 490 ATC 4 Ile
GAT
Asp I
GCA
Ala
AAT
Asn T GAT C Asp C 570 ACT C Thr CGT 1 Arg L ATT G Ile A
GAC
G11 391
AAC
Asr Tin Phe
CCI
Pro
TCA
Ser
GAA
Glu 475
GCT
Ala
GAT
Asp
TTG
Leu
;CC
k 1 a
;TC
lai '55
;GT
;ly 7
AC
sp
'TA
.eu
CC
Ila
CAG
I Gin
GCG
1 Ala
TGC
Cys
CCC
Pro
CTG
Leu 460
GAT
Asp
TCT
Ser
ACT
Thr
ACC
Thr
AGC
Ser 540
TAC
Tyr
CAA
Gin GAT I Asp '9 TCA I Ser S 6 AAT G Asn A 620 GGT TAT ATC ACG Gly
CTC
Leu
AGC
Ser
GCT
Ala 445
ACG
Thr
GAT
Asp
CCA
Pro
TTC
Phe
GCT
Ala 525
CGC
Arg
"GT
ly rAT 'yr
ICG
hr
'CC
;er 405
CC
la 'Tyr
GTI
Val
GAT
Asp 430
ATC
Ile
CCG
Pro
ATT
Ile
TTG
Leu
TOT
Cys 510
CAG
Gin
TAT
Tyr
GCC
Ala
CTG
Leu
GTA
Val 590
CAG
Gin
AGT
Ser Ile
ATC
Ile 415
AGC
Ser
AAT
Asn
GTC
Val
ACC
Thr
TCA
Ser 495
GAG
Glu
CAA
Gin
GTT
Val
OCT
Ala
TGG
Trp 575 CTC C Vai 2 ACC C Thr C CGT Arg S Thr 400
AAT
Asn
TCA
Ser
GAT
Asp
AGT
Ser
AAC
Asn 480
ACC
Thr
AAG
Lys
TCT
Ser
CGT
Arg
TAT
ITyr 560
%TT
Ile
CC
kla
;GG
ly
GT
;er 1200 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1728 1776 1824 1872 1920 GTG CTG GAT AAC CCG GTC cTT GD.
Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu -171- WO 97/17432 WO 97/ 7432PCTIUS96/18003
GCA
Ala Asn CTG GCA GAG TAT AGC CTA AACAG TAT GGG CT7 Val Ser Leu Lys Gin Arg Tyr Gly Leu CAG ACA CCC AGT TTC Gin
CAT
His
GAT
Asp '705
GAA
Glu
ACC
Thr
CCC
Pro cTG Leu
AAA
Lys~ 785
GAT
Asp
GTA
Val
TTG
Leu
ACA
Thr CGT Gi1y E 865
GAG;
Giu L Thr
GTC
Va 1 690
GAG
GlU
CTG
Leu
TTG
Leu
CGT
Arg
ATG
M4et 770 rcc Ser
T'G
rrp kLGT Ser
;AA
;iu
~TG
e t
~TC
'he ~y s *Pro 675
ATT
Ile
TTA
Leu
CTC
Leu
CAT
Asp
TTG
Leu 755
GAA
Giu
CTG
Leu
ATG
Met
ACG
Thr
A.AC
As n 835
GAT
Asp
GCC
Giy
ATC
Ile Ser
C
Aia
CC
Aia
CGT
Arg
CAA
C lu 740 Trrr Phe
GC
Gly
CAA
Gin
TCG
Ser
CAA
Gin 820
CTT
Val
TCC
Ser
ATT
Ile
ACA
rhr Phe
CTA
Leu
CC
*Ala
*ATT
Ile 725
TAT
Tyr
CCC
Gly
CCA
Cly
CCA
Pro
TCC
Ser 805
TG
Trp
TCT
Cys
C
Ala
AAG
Lys
ATC
Ile 885
*TTC
Phe
TAT
Tyr
CCT
Giy
ATA
Ile 710
CGT
Cly
ACC
Thr
CTG
Leu
AAA
Lys
CTG
Leu 790
CTA
Val
AC
Ser
GAC
Asp
TTA
Leu
AC
Ser 870
GGT
Cly
ATT
Ile
GAA
Ciu
ACA
Thr 695
TC
Cys
CC
Arg
CC
Aia
ACA
Thr
CAT
Asp 775
CCT
Ala
AAT
Asn
CGT
Ciy
AC
Ser
CAC
Gin 855
AAT
As n
PAGT
Ser
AGT
Ser
ACC
Thr 680
GAG
Ciu
TC
Cys
TAT
Tyr
ACT
Ser
TTT
Phe 760
ATC
Ile
ATT
Ile
CTA
Leu
ACC
Thr
CTG
ValI 840
CAC
Gin
GTC
Val
CAT
Asp
GCA
Ala 665
GCT
Ala
GTC
Vali
AAA
Lys
TOC
Cy s
CAC
Gin 745
CC
Ala
TTA
Leu
TTA
Leu
ACT
Ser
CC
Ala 825
AAT
As n Lys
ATG
M4et bAsn
CTA
Val1
TTC
Phe
AA
Lys
CCA
Ala
TTC
Phe 730
TTC.
Leu
CAA
Gin TrG Leu
CCC
Arg
CTG
Leu 810
ACC
Thr
ACT
Ser
CTG
Val1
CGT
Cly
AAT
As n
CC
Arg
TAT
Tyr
TTG
Leu 715
CGT
Cly
TAT
Tyr
CC
Ala
CAA
Gin
CCT
Arg 795
ACT
Thr
CCT
Ala
CAA
Gin
CTC
Leu
ATC
Ile 875
CCT
Pro
TCT
Ser
CCA
Ala 700
CCT
Gly
NAT
As n
CC
Arg
CAA
Giu
CAC
Gin 780
ACC
Thr
TAT
Tyr
GAG
Giu
GCT
Ala
CGC
Arg 860
GTC
VTal
TAT
*Ty r
CC
Ala 685
CAA
C lu
CTC
ValI
CCA
Ala
TTC
Phe ArT Ile 765
TTA
Leu
GAG
Giu
CTC
Leu
ATG
Met.
CC
Ala 845
CC
Ala
ACC
Thr
ACG
Thr 670
GAC
Asp
AAT
As n
ACC
Thr
GC
Cly
GCC
Cly 750 Leu
CCT
Gly
CAG
Gin
CAA
Gin
TTC
Phe 830
ACT
Thr
CTA
Leu
TTC
Phe 54.0 '1;T G0C Asp Ala 655 CCA CAT Pro Asp CGT AAT Gly Asn GAG CAG Giu Gin ACT CAT Ser Asp 720 ACT TTT Ser Phe 735 CCC ATT Ala Ile TCC CGT Trp Arg CAC CCA Gin Ala CTC CTC Val Leu 800 CCC ATG Gly Met 815 AAT TTC Asn Phe AAA CAA Lys Glu ACC CC Ser Ala TCC CTC Trp Leu 880 201i 2064 2 112 2160 2208 2256 2304 2352 2400 2448 2496 2544 2592 2640 2688 CCT TTT ACA 'ITG CCA AAC TAC Pro Phe Thr Leu Ala Asn Tyr 890 172- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/1 8003 CA-T -,AT Trp His Asp 7CC TTA CAA .Der Leu Gin 915 -AA ACC CTG TTT In Thr Leu Phe CAT GAC AAT CCCC His Asp Asn Ala ACG TTA GAG Thr Leu Glu ACC GAC ACT TCT Thr Asp Thr Ser CTA ATT GCT ACT CAG CAA CTT AGC Val Ile Ala Thr Gin Gin Leu Ser 925 CAC CTA ',In Leu 930 CTG TTA ATT CTC Val Leu Ilie V/al
AAA
Ly s 935 TCG CTG AGC CTG Trp Leu Ser Leu GAG CAG GAT CTO Ciu Gin Asp Leu
CA
Gin 945 TTA CTG ACA ACC Leu Leu Thr Thr
TAT
950 CCC CAA CCT TTA Pro Giu Arg Leu AAC CCC ATC ACG Asn Gly Ile Thr
AAT
Asn ITT CCT GTA CCC Val Pro Vai Pro
AAT
As n 965 CCC GAG CTA TTA Pro Ciu Leu Leu
CTC
Leu 970 ACC CTA TCA CGT Thr Leu Ser Arg TTT AAG Phe Lys CAG TGC GAA Gin Trp Glu
ACT
Thr 980 CAA CTC ACC CTT Gin Val Thr Val COT CAT CAA Arg Asp Ciu ACC ACT CAA Thr Thr Giu TTC CAT CAA Phe Asp Gin 995 CTG ATC CC Leu Ile Ala 1010 TTA AAT CCC XAT Leu Asn Ala Asn CAT ATC Asp Met 1000 CC ATC CCC TOT Ala Met Arg Cys 990 AAT CCA CCT TCA Asn Ala Ciy Ser 1005 CCA GCC CAA CTT Cly Ala Gin Val 2784 2832 2280 2928 29716 3024 3072 3120 3168 3216 3264 3285 ACA TTC TAT Thr Leu Tyr GAG ATG Clu Met 1015 CAT AAA CCT Asp Lys Cly
ACG
Thr 1020 AAT ACC Asn Thr 1025 TTC CTA TTA Leu Leu Leu CCT CAA Cly Ciu 1030 AAT AAC TCC Asn Asn Trp CCC AAA ACT ?ITr ACC Pro Lys Ser Phe Thr 1035
TCT
Ser 1040 CTC TOG CAA CTT Leu Trp Gin Leu CTC ACC Leu Thr 1045 TCC TTA CC Trp Leu Arg CTC CCC Vai Gly 1050 CAA AGA CTC Gin Arg Leu AAT CTC Asn Val 1055 CCT ACT ACC Cly Ser Thr ACT CTC Thr Leu 1060 CCC AAT CTC Gly Asn Leu TTC TCC Leu Ser 1065 ATC ATC CAA Met Met Gin CCA CAC CCT Ala Asp Pro 1070 GCT CCC GAG ACT AC Ala Ala Ciu Ser Ser 1075 CCT TTA TTC Ala Leu Leu 1080 GCA TCA CTA CC Ala Ser Val Ala CAA AAC TTA ACT Gin Asn Leu Ser CCC CCA ATC ACC AAT CGT CAC TAA Ala Ala Ilie Ser Asn Arg Gin..
1090 1095
INFORMATION FOR SEQ ID NO:34:
SEQUENCE CHARACTERISTICS:
LENGTH: 1095 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION:
Features From To 254 267 254 492 SEQ ID NO:34: Descript ion SEQ ID TcaAii peptide -173- SUBSTITUTIE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 Met Val Thr Val Met Gin Asn Lys lle Ser Phe Leu Ser 1Gy Thr Ser 1 5 10 Giu Gin Pro Leu Leu Asp Ala Gly Tyr Gin Asn Val Phe Asp Ile Ala 25 Ser Ile Ser Arg Ala Thr Phe Val Gin Ser Val Pro Thr Leu Pro Val I0 35 40 Lys Glu Ala His Thr Val Tyr Arg Gin Ala Arg Gin Arg Ala Glu Asn 55 Leu Lys Ser Leu Tyr Arg Ala Trp Gin Leu Arg Gin Glu Pro Val Ile 70 75 Lys Gly Leu Ala Lys Leu Asn Leu Gin Ser Asn Val Ser Val Leu Gin 90 Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu 100 105 110 Met Asn Arg Ala Ser Gin Tyr Ala Asp Ala Ala Ser Ile Gin Ser Leu 115 120 125 Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp 130 135 140 Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp 145 150 155 160 Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr 165 170 175 Ser Leu Asp Ile Leu Leu Asp Val Leu Gin Lys Gly Gly Lys Asp Ile 180 185 190 Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp 195 200 205 His Leu Ser Gin Ile Asp Ser Ala Leu Ser Ala Gin Ala Arg Thr Leu 210 215 220 Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gin Ala Val Ser 225 230 235 240 Glu Gin Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gin Asp 50 245 250 255 Gly Asn Gin Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala 260 265 270 Val Gly Phe Ser Gly Gin Pro Met Val Tyr Leu Ser Gin Tyr Thr Ser 275 280 285 Gly Asn Gly Ile Val Gly Ala Gin Leu Ile Ala Gly Asn Pro Asp Gin 290 295 300 Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala 305 310 315 320 Lys Gin Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp 325 330 335 Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn -174- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 Pro Thr GiU 385 ValI Gly G ly Ile Tyr 465 His Ser Thr Tyr Phe 545 Leu Gin Leu Leu Vali 625 Ala Asn Gin His Asp Gin 370 Asn Asp Thr Thr Pro 450 Gin Tyr 
Giu Arg Ser 530 Giy Asn Thr Aia Ser 610 Pro Leu Thr Thr Vali G iy Pro Gin Giy 405 Lys As n Pro Thr As n 485 Ser Phe Ser Thr Leu 565 Lys Aia Giu His Tyr 645 Thr Phe Leu Ile Ser Ty r 390 Gin Giy Pro Aia Asn 470 G iy Ly s As n Ile Pro 550 Aia Ser Giu Leu Asp 630 Vali Phe Tyr Gly Phe Phe 375 Tyr Ser Leu Asp Arg 455 Pro Aia Leu Gin Asp 535 Thr Asp Leu Ly s Asp 615 Lys Ser Ilie Giu Thr Ala 360 His Leu As n Leu Asp 440 Giu Aia Ser Asn Lau 520 Ala Arg Ala Asn Leu 600 Trp Ilie Leu Ser Thr 680 Giu Gin Leu Ly s Trp Leu 425 ValI Thr Pro Leu Ser 505 Met Ly s Val1 Ala Phe 585 ValI Leu ValI Lys5 Ala 665 Ala ValI Pro Thr Ly s 410 Thr Ile Leu Thr Arg 490 Ile Asp Ala Asn Asp 570 Thr Arg Ilie Leu Gin 650 Val1 Phe Ala ValI Giu 395 As n Phe Pro Ser GiU 475 Ala Asp *Leu Ala ValI 555 Giy Asp Leu Aia Asp 635 Arg As n Arg 3 Lys Ser 365 Leu Glu Giy Tr Lau Vai Ser Asp 430 Aia Ile 445 Thr Pro Asp Ile Pro Leu W4o Phe Cys 510 Ala Gin 525 Arg Tyr Gly Aia Tyr Leu Thr Val 590 Ser Gin 605 Aia Ser Pro Vai Gly Leu 7 yr Thr 670 Ala Asp Gly Ser His Ser Ile Thr 400 Ile Asn 415 Ser Ser Asn Asp Vai Ser Thr Asn 480 Ser Thr 495 Giu Lys Gin Ser Vai Arg Aia Tyr 560 Trp Ile 575 Val Ala Thr Gly Arg Ser Leu Giu 640 Asp Ala 655 Pro Asp Giy Asn 685 Val Lys Tyr Ala Giu Asn Giu Gin -175- SUBSTITUT SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 700 Gly Leu Ala Ala Ile Cys Lys Ala Lau Tv al Thr ser Giu Leu Leu Arg Ile Gly Arg ?jyr Cys Phe Gly Asn Ala Gly Ser Phe ~7j 1 C Thr Pro Leu Lys 785 Leu Arg Met 770 Ser Asp Leu 755 GiU Leu Gil 74, Ph~ Gil Gir Asp Trp Met Ser Val Ser Leu Giu Thr met 850 Gly Phe 865 Giu Lys Trp His Ser Leu Gin Leu 930 Gin Leu 945 Val Pro Gin Trp Phe Asp Leu le 1010 Asn ThrI 1025 Laeu TrpC Thr Asn 835 Asp Giy Ile Asp Gin 915 ValI Leu Vai Glu Gin 995 Alia LauI in G in 820 Val Ser Ilie Thr Ile 900 Thr Leu Thr Pro Thr 980 Leu rhr Leu .eu u Ty~r Thr Ala Ser Gin o 4 a Gly Leu Thr Phe Ala 760 iGly Lys Asp Ile Leu 775 Pro Leu Ala Ile Leu 790 *Ser Vai Asn Leu Ser 805 Trp Ser Gly Thr Ala 825 Cys Asp 
Ser Val Asn 840 Ala Leu Gin Gin Lys 855 Lys Ser Asn Val Met 870 Ile Gly Ser Asp Asn 885 Gin Thr Leu Phe Ser 905 Asp Thr Ser Leu Val 920 Ilie Val Lys Trp Leu 935 Thr Tyr Pro Giu Arg 950 Asn Pro Giu Leu Leu 1 9659 Gin Val Thr Vai Ser 985 Asn Ala Asn Asp Met T 1000 Leu Tyr Giu Met Asp L i0i5 Leu Gly Giu Asn Asn T Gin Leu Arg Leu 810 Thr Ser ValI Gly Pro 890 Hlis Ile Ser eu .eu ~70 ~rg 'hr 'I ,ys C rp P
I
Ala Giu I1e Gin Arg 795 Thr Ala Gin Leu Ile 875 Phe Asp Ala Leu I le 955 ['hr ksp Gin 780 IThr Tyr Giu Ala Arg 860 Val1 Thr Asn Thr Thr 940 Asn Leu Glu 765 Leu Giu Lau Met Ala 845 Ala Thr Leu Ala Gin 925 Giu Mly Ser Lau Gly Gin Gin Phe 830 Thr Leu Phe Ala Thr 910 Gin Gin Ile Arg Trp Arg Gin Ala Val Leu 800 Gly Met 815 Asn Phe Lys Giu Ser Ala Trp Leu 880 Asn Ty r 895 Leu Giu Lau Ser Asp Leu Thr Asn 960 Phe Lys 975 7'1U 735 Leu Tyr Arg Phe Gly Ala Ile 750 Ala Met Arg Cys 990 Thr Giu Asn Ala 1005 ;iy Thr Gly Ala 1020 ~ro Lys Ser Phe .035 Gly Gin Thr Ser Val1 Ser 1040
I
Iu U U Thr Trp Leu Arg Val Gly Gin Arg Lau Asn Val -176- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96/1 8003 i045 105 0 .=z Glyr Ser Thr Thr Lau Gly Asn Lau Lau Ser Met Met G-in Ala Asp Pro 51060 1065 l0OT0 Ala Ala Giu Ser Ser Ala Leu Leu Ala Ser Val Ala Gln Asn Lau Ser 1075 1080 1085 Ala Ala Ile Ser Asri Arg GIn 1090 1095 INFORMATION FOR SEQ ID SEQUENCE
CHARACTERISTICS:
LENGTH: 603 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr 1 5 in Phe cys Giu Lys Thr Arg Leu Ser Phe Asr Ala Arg Gly Tyr Thr Ser Ala Pro 145 Gly Tyr Gin 74.,r Ala Leu Val1 Gin Ser 130 Val1 Leu rhr Gin Val Ala Trp ValI Thr 115 Arg Leu Asp Pro Ser Tyr Arg Phe Tyr Leu Ile Gin 85 Ala Leu 100 Gly Leu, Ser Val Glu Ala Ala Asn 165 Asp Gin 180 Asn His Ser Gly As n 70 Thr Ala Ser Pro Leu 150 Thr rhr Val1 Gin Glu 55 5cr Asp Gly Phe Asp 135 Ala Phe Pro Ile 25 Ser Ser Ile 40 Thr Thr Pro Thr Leu Ala Gly Lys Scr 90 Arg Ala Giu.
105 GlU Giu Leu 120 His His Asp Giu Tyr Val Ala Thr Phe 170 Ser Phe 74yr 185 Ala Leu Giy 200 Ala Ala Ile Gin Asp Thr Asp 75 Leu Lys Asp Lys Ser 155 Ile Glu rhr Cys Leu Ala Arg Ala As n Leu Trp Ile 140 Leu Ser rhr ;1u :ys IMet Lys Val1 Ala Phe Val1 Leu 125 ValI Lys Ala Ala Val 205 Lys Asp Ala As n Asp Thr Arg 110 Ile Leu Gin ValI Phe 190 Lys Ala Leu Ala Val1 Gly Asp Leu Ala Asp Arg As n 175 Arg ITyr Leu Thr Ser Ty r Gin Asp Ser Asn Ly s Ty r 160 Pro Ser Ala Gly Ala Asp Gly 195 Giu Asn 210 Giu Gin Asp Giu Leu 215 -177- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCr/US96I1 8003 225 Ala Phe ar Arg Ala~ Giu Leu 23 0 Thr Lau 245 Pro Arg Lau Arg Ile GlY Airg 235 Olu Tryr Thr ;la 250 Phe Gly Leu Thr 265 Cys Phe Gly Gin Leu Tyr Ala Gin Ala 260 Ilie Glu 305 Leu Met xx Ala Lau Gly 290 Gin Gin Phe Thr Leu Trp 275 Gin Val1 Giy As n Lys 355 Ser Arg Lau Met Gi Ala Leu Met Phe 340 Giu Ala Lys As p ValI 325 Lau Thr Gly Se Tr] 31 Se Git Met Phe 370 Thr Phe Trp Leu GIU 385 Leu Ala Gin Giu Giy 465 Ser Ala Ala Thr Gin Gin 450 Ile Arg Met Asn Leu Leu 435 Asp Thr Phe Arg Tyr Giu 420 Ser Leu Asn Lys Cys 500 Trp 405 Ser Gin Gin Val Gin 485 Phe ILys 390 His Leu Leu Leu Pro 470 Trp Asp r Le 29 p Me 0 r Th: .1 Asz AsE Giy 375 Ile1 Asp Gin Val Leu 455 Val1 Giu Gin Giy Gly 280 Gin Pro 5 t Ser Pro Gin Trp Val Cys 345 )Ser Ala 360 Ile Lys Thr Ile Ile Gin Thr Asp 425 Leu Ile 440 Thr Thr Pro Asn Thr Gin 4 Leu Asn A~ 505 Thr Leu 520 Leu Leu G Let ValI Ser 330 Asp~ Leu Ser Gly rhr 410 rhr ITa ['yr ?ro ~a 1 L90 .i Ala Asn 315 Gly Ser Gin As n Arg 395 Lau Ser Lys Pro GiuI 475 Thr Lys Asp Ile Let.
Ile 300 Leu Thr Val Gin Val1 380 Asp Phe Leu rrp Giu 460 La.u I Tl S 281 Leu Ser Ala Asn Lys 365 Met As n Ser Val1 lal 445 krg ,eu ;er 270 Arg Leu Thr Ser 350 ValI Gly Pro His Ile 430 Ser Leu Leu Arg Gi, Arc Thi Aia 335 Gin Leu Ile Phe Asp 415 Ala Leu Ile rhr k.sp 195 n1 Gin I Thr Tyr 320 Glu Ala Arg ValI Thr 400 Asn Thr Thr As n Leu 480 Glu la. Asn Asp Met Thr Thr Glu 510 Asn Ala Gly S~ 515 Gly. Ala 530 Ser Phe 545 Gin V~ Thr SE ?r Leu Ile Ala il Asn Thr Leu 535 ~r Leu Trp Gin 'yr ly G ly Pro Giv Met 575 Arg Lau Asn Va 550 1i Gly Ser Thr Thr Leu Gly 565 570 -178- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/1 8003 Gln Ala Asp Pro Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val A a 580 585 590 Gin Asn Leu Ser Ala Ala Ile Ser Asn Arg Gin 595 600 INFORMATION FOR SEQ ID NO:36: SEQUENCE
CHARACTERISTICS:
LENGTH: 2557 base pairs
TYPE: nucleic acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
GAATTCGGCT TGCG'ITTA3, ACCATGATA
TAAAGATGG;
TTGGAAT
ACTGGCAGAV.
TTGCCGTAG TGA6AGGAAaj TGATCAGAA
ACTCAATACI
AGCTAT'rrAT
CATGACCTCC
TGCTGGATAC
CGTCTACCAC
ATGTCATGGC CCCCTATAT1 CGGTACTCCT
TTGGGCAGAT
TCTGGGAC'rG GTrGAATACT AACATATCGT
TCAGTATTGT
GCATCAACGA
AAACGCCTTC
CTGGAGCAGC
GCCCGCGCAT
GGGTGAACGC
ACTAGGCGAA
TAACGGCAGA
ACAACTGGCT
GTATTCAAGC
ACAAAATCAT
GTTGGACATC
TATCAATACT
GCCCCACAGG
GCGTTTCCGC
CCGACCTAG
CCCAGTGGGA
ACAGGCTAAT
ACATTACAAC
ACTATATCCG
TCAAGTCCC
AATACTTACT
GATTCATMAT
CCATTGCCAG TATTCAACrG ATTCGGGGGT
TATCAGCCGC
GCACTTGGC
GGGTGTTTCT
TGCGTATCGG
ACAAACCAA
T.ACGCCGA
TACCGTCGA
CT.AATCTrA
AGTTATTAGC
ATTTTATCGG
ACTCAGTGAA
GTAAATrcAJ. CGACGG3TAA GTCCAAkTTA
CCCTTATAAA
r' ATTGATGAT( k AA'rrAAAJ r' ATTCATcAAJ kACTAATTTAI 7ATTACCAGCI
*ACCAGCTATP
GGTI'TACAAG
*GCGGCCACC'I
AAGTTACAGC
*AAGTATACGC
CAGGCTCTGG
CGTCTATTTG
GATGCCCT
AAAGCGTCCT
GATGCCATGA
CAACATCTTC
ATCCTGCAAT
T'N'GGTCGGG
AAACGCGGCA
GC IT I'lCTGG
AAGGCAGCGG
CAGGrITrCTG
TACGTCAACC
CAATTCTITrA
CAATTAGT'T
ATGATGGACG
GATGCCTTTA
GCATATCACG
ACTGATGCCG
TTCGCGGCTA
AGCACTATCC
;TCTCGCTCT'r k. ATAACCTAAAk 7 TAACCATTGA~ 7CCGCTATCAG
GOCTACATAC
ACAAAACGCT
GTTTTGATA
TGCAATTATC
CCGGCGACGG
CGGGTTCATC
CACAATTGA
TGACAAAACC
CACTOATTAT
CGGTGCTAGC
ATCTTGATGC
CCCCAGTAAC
GGGTTAATGT
CTGGATTATA
GGCGTATTAA
ATGAATCTCG
CGGCTATTAA
CGGCAATAAA
GGGCATI'GGA2 TCGACTGGGA
ACTACCCGGA
CATTACTGCA
TGTCTTATCT
G
ATAATATTA6A
T
GTGAATATrA
TI
ATGCCTGGAG
T
GTCCAGT'GAT A
CCGCCTGCT
GAATCTcc
TGAACTGGAT
TGATAAGcA6A
ACAGAAGTGG
AACGCCTGAA
AGACAAAGCA
ATCGGAAAAT
CGCAkATGACA
GGAAGCCGTA
AATGGTTTAC
AGAGATG-p
GCTGACACGT
GGCATTT'GAA
TAATMTG
TCCAGAAAAT
CGCACAACAA
TTCAATCAAT
CCGCCGGGTT
CAGTGCCGCA
A.AGCCGTGAT
UkACCACCCGG U ATGTGGAA
AAATACAAT
LAACTATATT C ~TCCGTCAGC
C
;ACATCGTTT c 'AACGATCAA
G
'TGGCGCAGT
G
GAATGGCAT
.TATAAMTCC c
AAAATTACCG
AATTTATATA
TTATTACTGA
TTGGCTACcc
AGTGTATTCC
ATTAAGAATTN
GATTTGCTAC
GTCGCCCACT
GCAGAGGGN
GAAACGCAGG
CATTCCACCG
GGCGCTGCAA
TTTGCGGATT
GCTAACTCGT
TTGCAAGCCA
GCGTTCTCCT
TTGAA)ATGTC
GAAAGAGACA
GAATTCAACA
TTAAGCACCT
GACTTGTATC
ATCGCCGAAG
;AAAATGCCA
kLAACGCTACA
~ATCCGACCA
-AAAGCCA.AT
~AACAAGTGG
,GGCTGACCT
*TCGATCACA
LAAATTGATT
CCCTGTAMTC
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860
"GCTCTGGrr GG.Lc;LAAAG GAGATCACCA AACAGACAGG AAATAGT.k.'A
GATGGCTATC
A-;L-CT-AAJAC GGATTATCGT
TATGAL-CT.:L-
GGAATACGCC AATCACCTT'r ATAGAGCGCC
CGGACTCTAT
TTTATAACCA
ACAAGACACA
TCTTTGCTCA
TATGGCATCC
ATAGCTATCA ACAATT'rGAT ATTATGAGAT
TCCTTCTTCG
TCAGCATGGT
ATATAACGGA
TAAAAATTA TA'I TCACCA GCAATCAT CAAITrTGAG CCAGCCTGG
CGTTAATCCG
GATGTCAATA
TGTGccGGTT
CTAGATAGTT
AAJAGATATGA
ACCAATAATG
GTAAGTAGCC
GATATTCCAA
AAATTAAGAA
AATAAATATG
AATAATAAGC
.AATTGGCGCA
AAAAAATATC
ATCAAGGTGA
ATAAAAACGC
CCCCAGAA
TCAGAAGAGT
GTAALAGACTA
CTATCAATTA
TTATTCATA
GCAAACTAG
CGAATTC
TATcCGCTAT GATGGCACTT CGAGCTAjA
AGATACGTTG
TTCAATGCA.A
GAGCAAkTGT
GAATAA-CC~C
TGGTTGGGGA
CAAGCCGCA
TGGATATGA
TGATAAATT
CTGGA.:k
CTGGTGATGT
GGACTATATA
TATCGGGATA
TATGCAGAGG
GATTATTACC
TCAAGTGATT
GGACkAArGC
ATTGTGTATA
1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2557
INFORMATION FOR SEQ ID NO:37:
SEQUENCE CHARACTERISTICS:
LENGTH: 845 amino acids TYPE: amino acids TOPOLOGY: linear (ii) MOLECULE TYPE: protein (partial) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr 1. C-- Asp His ASP Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn Leu Gin Leu Thr Ser Asn Leu Tyr Ile Giy Lys Leu Ala Asp Ile Ile Asp Giu Leu Asp Leu Leu Ile Ala Val Giu Gly Lys Thr Leu Ser Ala Ile Ser Asp Lys Gin Leu Ala Thr Leu Ile Arg Leu Asn Thr Ile Ser Trp Leu His Gin Lys Trp Ser Gin Leu Phe Giu Ile Lys 115 Met Thr Ser Thr Tyr Asn Lys Thr Leu Thr Pro 110 Gin Gly Phe Asn Leu Leu Asp Vai Tyr His Gly Asp Lys 130 Asp Lys Aia Asp Leu 135 Leu His Val Met Ala Pro Tyr Ile Ala 140 Ala Thr 145 Leu Gin Leu Ser 150 Ser Giu Asn Vai Aia His Ser Val Leu Leu 155 160 Trp Ala Asp Lys Leu Gin Pro Giy Asp Gly Aia met Thr Aia Giu Giy 165 170 175 Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Giy Ser Ser Glu Ala -180- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/1 8003 ValI Leu Leu 225 Pro Trp Glu Asp His 1 305 180 Glu Thr Gin 195 Giu Met Val 210 Phe Val. Thr Ala His Asp Val Asn Ala 260 Ala Asn Ser 275 kia Asn Leu 290 ~eu Pro Pro Git.
Tyr Lys Ala 245 Leu Leu Leu la 1 IHis Ile Val His Ser Thr 215 Pro Glu Met 230 Leu Ser Leu Gly Giu Lys Thr Ala Glu 280 Leu Gin Ala 295 Thr Pro Giu 310 185 IGin 7-yr Gly Ilie Phe Gly Ile met 250 Ala Ser 265 Gin Leu Ser Ile Asn Ala C'is Gin Ala 205 Asn Giu Asn 220 Ala Aia Thr 235 Leu Thr Arg Ser Val Leu Ala Asp Ala 285 Gin Aia Gin 300 Phe Ser Cys 19,D Leu Ala~ Gly Phe Ala 270 Met As n Trp IAla Gin Phe Arg Ala Aia Ala Asp 255 Ala Phe Asn Lou His Gin Thr Ser 315 Ilie Arg Asn Ile Phe 385 Gin Gin Arg Phe 465 Gly As Prc Giu Asn 370 Leu Val1 Tyr Ile ;lu 150 ?he 'al 1Th~ )Thi Arg 355 Arg Asp Ala Leu Ala 435 Asn Ile Ser Ile Ser 515 Thr Ile Leu 325 Gly Arg 340 Asp Thr Arg Vai Giu Ser Lys Ala.
405 Leu Ilie 420 Giu Ala Val Giu Asp Trp Gin Leu 485 Gir Phe Asp Giu Arg 390 Ala A~sp Ile ilu ~sp 170 Trp Arg Leu Phe 375 Ser Ala As n Ala Asn 455 Lys ValI Phe Cys 360 Asn Ala Ala Gin Ser 440 Ala 1'yr *Asn Gly 345 Pro As n Ala Ile Val 425 Ile Asn Asn Va 330: Arc Val Arg Leu Lys 410 Ser Gin Ser Lys 1 Ala Gin Gin Leu Lys Ala Gly Leu Ser 395 Ser Ala Lou Gly Arg 475 Gly Lys Ile 380 Thr Arg Ala Tyr ValI 460 Tyr *Leu Arg 365 His Tyr Asp Ile Val1 445 Ile Ser Tyr 350 Gly Ty r Tyr Asp Lys 430 As n Ser rhr 335 Ser Arg Asn Ile Lou 415 Thr Arg Arg rrp Cys Ile Arg Ala Arg 400 Tyr Thr Ala Gin Ala ral Tyr Tyr Pro Giu Asn Tyr Ile Asp Pro Thr 490 met Ser Tyr Arg Gin Leu
C
C
C
495 Met Asp 505 Thr Val Ala Asn -181- Leu Ala 525 Val Gin Ser 510 Phe Met Ile Ser SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96/18003 Tyjr 545 Lau Ser His val Ile 625 Asp Trp Lys Gly Asp 705 Met Asnr Arg Asp Ile 785 Ile Arg Phe His Asp Asn Ilie Asn Asn Asp Gin Gly Leu Thr Ty~r Phe Ile Ser Lys Ly s Ile 610 Thr Ty r As n Leu Glu 690 Ser Ala Ser Tyr Tyr 770 Pro Ser Asn Ile Giu Phe Ile 595 Tyr Lys Arg Thr Giu 675 Asp Tyr Ser Tyr Ala 755 G ly Thr Pro Gin Val1 835 Thr As n 580 Asp Lys Gin Tyr Pro 660 Lys5 Thr Lys Lys Gin 740 Giu Trp Ile Lys Cys 820 Tyr Asp 565 Asp Cys Ser Thr Giu 645 Ile As n Leu Asn Asp 725 Gin Asp Gly Asn Leu 805 As n Thr 550 Ala Gly Pro Arg Gly 630 Leu Thr Arg Lau Aia 710 Met Phe Tyr Asp Tyr 790 Arg Lau Ser Tyr Ala 585 Pro Leu Lys Ala Val1 665 Giy Phe Gin Giu Asn 745 Pro Leu Ala His Lys 825 Val T1yr 570 Ala Tyr Leu Asp His 650 Asn Leu Tyr Gly Gin 730 Asn Ser ser Ser Asn 810 Tyr As n 555 Trp Asn Lys Trp 0 ly 635 Ile Ly s Tyr As n Leu 715 Se r Val Ser Met Ser 795 Gly Gly Pro Ser Trp Thr 605 Giu Gin Tyr Ile Ala 685 Gin Ile Val1 Arg Ser 765 Tyr Leu Giu Leu Asn 845 Va 1 Ser 590 Ile Gin Thr Asp Ser 670 G ly Asp Phe Tyr Vali 750 S er As n Ly s Giy Gly 830 Giy 560 His Trp Pro Giu Thr 640 Thr Leu Gin Leu Asp 720 Asp Asn Lys Asp Tyr 800 Ly s Lys INFORMATION FOR SEQ ID NO:38: SEQUENCE CHARACTERISTICS: LENGTH: 16 amino acids TYPE: amino acid -182- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 STRANDNESS: single TOPOLOGY: linear (ii) MOLECULAR TYPE: protein FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: Arg Tyr TYr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly 1 5 10 Lys INFORMATION FOR SEQ ID NO:39: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDNESS: single TOPOLOGY: linear (ii) MOLECULAR TYPE: protein FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala 1 5 10 Ile Ser Pro Ala Lys INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: 
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID
Ala Asn Ser Leu Tyr Ala Leu Phe Leu Pro Gln
1               5                   10

INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln
1               5

INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 19 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr
1               5                   10
Ala Gly Leu Glu

INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys
1               5

INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 16 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His
1               5                   10
Arg

INFORMATION FOR SEQ ID
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 13 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID
Asp Val Xaa Gly Ser Glu Lys Ala Asn Glu Lys Leu Lys
1               5

INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 7551 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46 (CcdA): AAC GAG TCT Asn Glu Ser GTA AAA GAG Val Lys Giu ATA CCT GAT Ile Pro Asp 10 CdT TTr AAT Gly Phe Asn cGc CAG CAA Arg Gin Gin GTA TTA AAA AGC Vai Lau Lys Ser AGC TCT TTT AAT Ser Ser Phe Asri CAG TGT Gin Cys GAA TTT Giu Phe
TOT
Cys CTG ACA GAT ATT Leu Thr Asp Ile AGC CAC Ser His 25 GTA TCT GAG CAC CTC Tcc TGG TCC GAA Val Ser Giu His Leu Ser Trp Ser Giu 40
ACA
Thr CAC GAC TTA His Asp Leu TAT CAT ~Ty r H is 50 GAT GCA cAA cAG Asp Ala Gin Gin
GCA
Aia 55 CAA AAG GAT AAT Gin Lys Asp Asn CTG TAT GA-A GCG Lau Tyr Giu Aia
CGT
Arg ATT CTC AAA cGC Ile Leu Lys Arg 0CC Ala 70 AAT ccc CAA TTA Asn Pro Gin Leu CAA AAT GCG GTG CAT Gin Asn Aia Val His 75 CTT 240 Leu SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003
GC
Ala
AGC
Ser
TTC
Phe
TTA
Leu
CTC
Leu 145
ACA
Thr TCT2 Ser I
ATT
Ile
GGT
Gly
TCC
Ser
CAC
His 130
AAA
Lys
CTC
Leu -ys
CTC
L a
AG;
Arg
CCC
Pro 115
OCA
Ala
TCA
Ser
TCT
Ser
CTG
Leu
GCC
la 100
GCC
Ala
AGT
Ser
ATG
Met
TTG
Leu
GAA
Glu 180 r CCC AAT GCT GAA; CTG Pro Asn Ald Giu Leu ACT CAA TAT GTT CCC Ser Gin Tyr Val Ala 105 GCT TAT TTG ACT GAA Ala Tyr Leu Thr Glu 120 GAC TCC GTT TAT TAT Asp Ser Val Tyr Tyr 135 GCG CTC AGT CAG CAA Ala Leu Ser Gin Gln 150 TCC AAT GAG CTG TTA Ser Asn Giu Leu Leu 165 AAC TAT ACT AA GTG Asn Tyr Thr Lys Val 185 ATA GGC Ile Glv 90 CCC CGT Pro Gly CTT TAT Leu Tyr CTG GAT Leu Asp AAT ATG Asn Met 155 TTG GAA Leu Glu 170 ATC GAA
TA
T'lI
AC
Th
C'
Ar
AC
Th 14(
GA
As!
AGC
Ser
ATC
Met Giu Met
COT
Arg
GAA
Glu
CCC
Pro 225
GCT
Ala GAAi Glu
CCG
Pros
AGC
Ser CA C Gin C 305 AGT G Ser A AAT G Asn A TAT C Tyr A
CCI
Prc
GT?
Val 210
GCA
Ala
TCA
Ser
GGT
ly
GCC
Ala
.AT
ksp ?90
AG
;In
;AT
sp
TCC
Ser 195
ATC
Ile
ATT
Ile
ATC
Ile
AAT
Asn
TCA
Ser 275
GAA
Glu
CAA
Glu
GGC
Gly
GGC
Gly
CAG
Gin
GCC
Ala
TCG
Ser
GCT
Ala 260
TTO,
Leu
GAA
Glu
TAT
7TYr
ACG
Thr CA A Gin 1 340 CAT I Asp
GCA
Ala
CTA
Leu
GGG
Gly
CCT
Pro 245
GAG
Glu
GCT
Ala
CTT
Leu
ALGT
Ser 3TT 2 IaI I 325 kTG C et
'AT
I
ACC
Thr
CAA
Gin TrG Leu 230
GAG
Glu
GAA
Glu
ATG
Met
AGT
Ser
AAT
Asn 310 kAG -ys
CCT
Pro
CAT
Asp 215
ATG
Met
CTA
Leu
CTT
Leu
CCG
Pro
CAG
Gln 295
AAC
Asn
GTA
Val
TA
200
CCI
Pro
CAT
His
TT
Phe
TAT
Tyr
GAA
Glu 280
TTT
Phe
CAA
Gln
TAT
Tyr r CAT CAT His Asp GGA CTT Gly Leu CAA GCC Gin Ala AAT ATT Asn Ile 250 AAG AAA Lys Lys 265 TAC CTT Tyr Leu ATT GGT Ile Cly CTT ATT Leu Ile CGG ATC Arg Ile 330 CTA TTT Leu Phe I 345 AAT TTT I Asn Phe 1 -186-
GCT
Ala
GAG
Glu
TCC
Ser 235
CTG
Leu
AAT
Asn
AAA
Lys
AAA
Lys
ACT
Thr 315
ACC
TAT
Tyr
CAA
Gin 220
CTA
Leu
ACG
Thr
TTT
Phe
COT
Arg
GCC
Ala 300
CCC
Pro
CGC
*T AAC r Asn C GTT r Val T CAA 9 Glu 125
CGC
r Arg 0
ATA
SIle
ATT
Ile
CTC
Leu
GAA
Clu 205 CTC 2 Leu 2 TTG C Leu C GAG C Giu C GGT Gly A 2 TAT T Tyr r 285 AGC A Ser A GTA G Val V GAA T.
AAT 3A TT, Asn Gin Phi TCT TCC ATC Ser Ser Met 110 GCA CC AAT Ala Arg Asn CGC CCA GAT Arg Pro Asp GAA TTA TCC Glu Leu Ser 160 AAA ACT CAA Lys Thr Glu 175 TCC ACT TTC Ser Thr Phe 190 Asn
AT
'.sn
;GT
ly
;AG
lu
LAT
sn
AT
'yr
AT
sn
TC
al
AT
GTG
Val
GCA
Ala
AT'
Ile
AT
Ile 255
ATC
Ile
AAT
Asn
TTT
Phe
AAC
Asn
ACA
CGT
Arg
TCA
Ser
AAC
Asn 240
ACC
Thr
CAA
Glu
TTA
Leu
OCT
Gly
AGC
Ser 320
A.CC
288 336 384 432 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 rhr Arg Giu Tyr Thr Thr 335
TAT
Tyr
TTA
Leu 355 TTC GT GT Phe Gly Gly 350 AAT GCC TCT Asn Ala Ser 365 SUBSTIUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 Ser
CCT
Pro 385
CAT
Asp
CGT
Gly
CA
Gin
GCG
Ala
AAT
Asn I 465
ACT;
Thr L ATA C Ile L AGC C Ser G TTT T Phe S GAT T Asp T: 545 TCG ct Ser L AAA Al Lys Ii TTA Cl Leu Le CTG AI Leu 11 61 AAC CA Lys C 625 CTA CA Leu Hi
I
3
C.
C
A'
Il
TC
SE
TA
TY
AC
rh
T.
,el
LA
yy
TT
eL
.AP
In
CT
er 30 rp
'C
?U
1C
'T
e 0
A
n
T
s TC MG 1- Lys 70 AA GTC in Val rC ACT Le Ser T TCG !r Trp *C TCT 'r Ser 435 A GAA r Glu 0 A CMA Gin
TAT
Tyr k TGC I Cys
ITTT
Phe 515 ACC C Thr C CGA Arg L TTC C Phe A AAA A Lys A 5 GCA G.
Ala A 595 CCC C' Ala Ve TTC C Leu A: ACA C; Thr CI TTA AAT CAT AGA GA CT Leu Asn Asp Lys Arg Glu Le 375 AAT ATA GM TAC TOC CCA A.
Asn Ile Ciu Ty r Ser Ala As 390 CAA CCT TTT CM ATT CCC CT Gin Pro Phe Glu Ile Cly Le 405 41 CCA TAT CCC CCC CCA AAA T Ala Tyr Ala Ala Ala Lys Ph 420 425 TTT CTC CTA AAA CTT AAC AA( Phe Leu Leu Lys Leu Asn Ly 440 TTC TCA CCC ACO ATT CTC GA; Leu Ser Pro Thr Ile Leu GIL 455 CTM CAT ATC AAC ACA CAC GTP Leu Asp Ile Asn Thr Asp Val 470 TAT ATM CAG CCT TAT CCT AT' Tyr Met Gin Arg Tyr Ala Ile 485 490 AAC CCC CCT ATT TCA CAA CGT ksn Ala Pro Ile Ser Gin Arg 500 505 :AT CGC CTC TTT AAT ACG CCA sp Arg Leu Phe Asn Thr Pro 520 ;CC CAT GAG GAG AIT GAT TTA ;iy Asp Ciu Giu Ile Asp Leu 535 LAA ACC ATA CTT AAG CGT CCA ys Thr Ile Leu Lys Arg Ala 550 CC CTC CTT AAA ATT ACC CAC .rg Leu Leu Lys Ile Thr Asp 565 570 AT AAC CTA AAG AAT CTT TCC sn Asn Leu Lys Asn Leu Ser 80 585 AT ATT CAT CAA TTA ACC ATT sp Ile His Gin Leu Thr Ile 600 rA GGT GAA GG AAA ACT AAT al Gly Giu Cly Lys Thr Asn 615 :T ACC CTM ATC ACA AAA CTC La Thr Leu Ile Arg Lys Leu 630 .G M-C TCG ACT CTA TTC CAG .n Lys Trp Ser Val Phe Gln -187- SUBSTITUTE SHEET (RULE 26) 'T CG :u Vr .T Al n Ii 39 C AC u Th 0 r AC a Th G CC' Al.
GG(
a Cl'
STTI
Let 47'
CA']
His
TCP
Ser
TTA
Leu
AAT
Asn
TTT
Phe 555
CAT
His
MAT
Asn
CAT
Asp
TTA
Leu
AAT
Asn 635
CTA
Leu CCA ACT G GCCC 1I Arg Thr Giu Cly 380 'C ACA TTA MAT ACC .e Thr Leu Asn Thr '5 A CCA GTA CTT CCT r Arg Val Leu Pro 415 C CTT CAA GAG TAT r Val Glu Clu Tyr 430 T ATT CCT CTA TCA a Ile Arg Leu Ser 445 C ATT GTG CCC ACT Y Ile Val Arg Ser 460 k GCT MAA CTT TTT i Gly Lys Val Phe 5 r GCT CMA ACT GCC Ala Clu Thr Ala 495 TAT CAT AAT CAA Tyr Asp Asn Gin I 510 CT AAC GGA CAA I Leu Asn Gly Gin i 525 TCA GGT AGC ACC C Ser Cly Ser Thr G 540 AAT ATT CAT CAT C Asn Ile Asp Asp V 5 CAT AAT AAA CAT C Asp Asn Lys Asp C 575 TTA TAT ATT GGA A Leu Tyr Ile Gly L 590 GAA CTC CAT TTA T Clu Leu Asp Leu L 605 TCC CCT ATC ACT CG Ser Ala Ile Ser A 620 ACT ATT ACC AGC T( Thr Ile Thr Ser Ti 64 TTT ATC AT ACC TC Phe Ile Met Thr SE
CC
Ala
GCT
Ala 400
TCC
Ser
AAC
Asn
CGT
Arg
GTT
Val
CTM
Leu 480
CTG
Leu
:CT
Pro
'AT
~r
GC
'ly
TC
'a 1
GA
ly
AA
ys
TA
eu
AT
r p 00 rr 1152 1200 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1728 1776 1824 1872 1920 1968 WO 97/17432 PCT/US96/18003 ACC AGC TAT kC AA ACG Thr Ser Tyr Asn Lys Thr 660 CTA ACG CCT Leu Thr Pro 665 650 GAA ATT AAG AAT TTC CTG GAI Glu Ile Lys Asn Leu Lau Asp 670 GAT A.A GAC AAA GCA CAT TTG 2016
ACC
Thr
CTA
Leu
GAA
Glu 705
GGC
Gly
GTC
Va1
CAT
His 690
AAT
Asn
GAC
Asp
TAC
675
GTC
Val
GTC
Val
CGC
Cly CAC GGT His Gly ATG GCG Met Ala CCC CAC Ala His OCA ATC Ala Met 725
TTA
Leu
CCC
Pro
CAA
Ci
TAT
Tyr 695
CGT
Cly 680
ATT
Ile TCC GTA Ser Val 710 ACA GCA CTC CT Leu Le GAA AAJ Thr Ala Glu Ly AAC TAT ACC CCC Ly s
GTT
Val
ACC
Thr
ATG
Met 785
CTC
Leu AAAi Lys
CAA
GCC
Ala
GAA
Glu J 865 GTT val TTC C TYl
CAC
Gir
GGC
Cly 770
TTT
Phe
ATT
Ile
GCG
Ala
CAA
Gin
ACT
Ser 350 kAT ksn
LAT
~sn
;TC
r Thr
TAT
1Tyr 755
ATC
Ile
GCC
Gly
ATC
Met
TCC
Ser
CTG
Leu 835
ATT
Ile
GCC
Ala
CTC
Val
C
Pro 740
TCI
Cys
AAC
Asn
CCT
Ala
CTG
Leu
TCG
Ser 820
GCT
Ala
CAA
Gin
TTC
Phe
GCA
Ala
:TG
CGT
Cly
CAC
Gin
GAA
Clu
CCA
Ala
ACA
Thr 805
GTC
Val
CAT
Asp
GCA
Ala
TCC
Ser
CAA
Gin 885
CAT
TCA TCG GAJ Ser
GCI
Ala
AAC
Asn
ACT
Thr 790
CGT
Arg
CTA
Leu
GCC
Ala
CAA
Gin
TOT
Cys 870
CAA
;In Ser
CTG
Leu
GCC
Ala 775
GGA
Gly
TTT
Phe
CCC
Ala
ATC
Met
AAT
Asn 855
TG.
Trp
TTGJ
Leu Cit
GC
A14 760
TTC
Phe
GCA
Ala
GCG
Ala
CCA
Ala
AAT
Asn 840
CAT
His
ACA
rhr
AAT
Asn k CCC i Ali 745
CA.
Gir
CG
Arg
GCG
Ala
CAT
Asp ?r- Phe 825
CTT
Leu
CAA
Ci
TCT
Ser
CTC
Val
TCA
Ser 905
CTA
Val r TGC CC.
u Trp Ai 71' TTC TG( S Phe Tr 730 GTA GAJ I Val CI TTG GA; I Leu GIL CTA TI- Leu Phe CCC GCG Pro Ala 795 TGG GTC Trp Val 810 AA CT Clu Ala CAT GCT Asp Ala CAT CTT His Leu ATC AAT Ile Asn 875 CCC CCA Ala Pro 890 ATC AAA Met Lys TTA ACC Leu Thr s Asp Lys Ala Asp 685 C TTC CAA TTA TCA r Leu Gin Leu Ser 700 A CAT AAC TTA CAG a Asp Lys Leu Gin 5 J GAC TGG TTG AAT Asp Trp Leu Asn 735 I ACG CAC GAA CAT Thr Gin Gu His 750 ATG GTT TAC CAT I Met Val Tyr His 765 CTC ACA AAA CCA Val Thr Lys Pro 780 CAT AT CC CTT His Asp Ala Leu AAC GCA CTA GGC Asn Ala Leu Cly 815 AAC TCG TTA ACC Asn Ser Leu Thr 830 AAT TTC CTC TTG Asn Leu Leu Leu 845 CCC CCA GTA ACT Pro Pro Val Thr 860 ACT ATC CT CAA Thr Ile Leu Gin CAG GGC GTT TCC C Gin Gly Val Ser 895 GAG ACA CCC ACC I Glu Thr Pro Thr 910 CCC GGG TTG M.T T Ala Giy Leu Asn S 925 Leu
TCG
Ser
CCC
Pro 720
ACT
Thr
ATC
Ile
TCC
Ser
GAG
Clu
TCA
Ser 800
GAA
Glu
GCA
Ala Gln
CCA
Pro rOG rrp
CT
la
'AT
'yr
CA
er 2064 2112 2160 2208 2256 2304 2352 2400 2448 2496 2544 2592 2640 2688 2736 2784 2832 Leu Val Gly CCC CAC TGC Ala Gin Trp 915 Leu Asp 900 GAA AAC C Glu Asn CAT ATT CAA ryr Ile Gin CG CCA GGC la Ala Gly 920 CM% CAG GCT AAT ACA TTA CAC GCT TTT CTG -188- CAT GMA TCT CCC ACT GCC SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96/18003 Gin Gin Ala Asn Thr Lau 930 His Ala Phe Lau Asp Ciu Ser Arg Ser Ala GCA TTA AGC ACC TA C TAT ATC CGT CAA GTC GCC A.G GCA GOG GCG OCT Ala Leu Ser Thr Tyr 7y r Ile Arg Gin Val Ala Lys Ala Ala Ala AlIa 945 950 955 960 ATT AMA AGC CGT GAT GAC TTG TAT CAA TAC TTA CTG ATT GAT A.AT C '1'G Ile Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu Ile Asp Asn Gin 965 970 975 GTT TCT GCG GCA ATA AAA ACC ACC CGG ATC GCC GAA GCC ATT GCC ACT Val Ser Ala Ala Ilie Lys Thr Thr Arg Ile Ala Giu Ala Ile Ala Ser 980 985 990 ATr CMA CTG TAC GTC MAC CGG GCA TTC GMA AAT GTG GMA GMA AAT GCC Ile Gin Leu Tyr Val Asn Arg Ala Leu Giu Asn Val Giu Giu Asn Ala 995 1000 1005 AAT TCG COG GTT ATC AGC CGC CAA TI'C ?I'T ATC GAC TCO. 
GAC AA TA-*C Asn Ser Gly Val Ile Ser Arg Gin Phe Phe Ile Asp Trp Asp Lys Tyr 1010 1015 1020 AAT AAA CGC TAC AGC ACT TGG C GOT GTT TCT CAA TTA GTT TAC TAC Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val Tyr Tyr 1025 1030 1035 1040 CCG GMA AAC TAT ATT GAT CCC ACC ATG CGT ATC CCA CAA ACC ILAA ATG Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly GIn Thr Lys Met 1045 1050 1055 ATG GAC GCA 'ITA CTG CAA TCC GTC AGC CMA AGC CMA TTA AAC GCC GAT Met Asp Ala Leu Leu Gin Ser Val Ser Gin Ser Gin Leu Asn Ala Asp 1060 1065 1070 ACC CTC CMA GAT GCC TTT ATC TCT TAT CTG ACA TCC TTT GMA CMA GTG Thr Val Giu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Giu Gin Val 1075 1080 1085 GCT MAT C N' AMA CIT ATT AGC GCA TAT CAC CAT MAT ATT M.T MAC CAT Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp 1090 1095 1100 CMA COO CTG ACC TAT IT ATC COA CTC ACT GMA ACT CAT CCC GOT GMA Gin Gly Leu Thr Tyr Phe Ile Gly Leu Ser Giu Thr Asp Ala Cly Ciu 1105 1110 1115 1120 TAT TAT TOO CCC ACT GTC CAT CAC ACT AAA TTC MAC GAC GOT MAA 'ITC Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Cly Lys Phe 1125 1130 1135 C OCT MAT CCC TOG ACT GMA TOG CAT AMA ATT OAT TOT CCA ATT MAC Ala Ala Asn Ala Trp Ser Ciu Trp His Lys Ile Asp Cys Pro Ile Asn 1140 1145 1150 CCT TAT AMA ACC ACT ATC CCT CCA GTG ATA TAT AMA TCC CCC CTG; TAT Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr 1155 1160 1165 CTG CTC TGC TTC CMA CMA MG GAG ATC ACC AMA CAG ACA GGA AAT ACT Leu Leu Trp Leu Ciu Gin Lys Glu Ile Thr Lys Gin Thr Gly Asn 5cr 1170 1175 1180 MAA OAT CCC TAT CMA ACT GMA ACC CAT TAT CCT TAT GMA CTA AAA TCG Lys Asp Cly Tyr GIn Thr Glu Thr Asp Tyr Arg Tyr Ciu Lau Lys Leu 1185 1190 1195 1200 CC CAT ATC CCC TAT OAT CCC ACT TCC M.T ACC CCA ATC ACC TTT CAT Ala His Ilie Arg Tyr Asp Cly Thr Trp Asn Thr Pro Ile Thr Phe Asp 1205 1210 1215 2928 2976 3024 3 072 3120 3168 3216 3264 3312 3360 3408 3456 3504 3552 3600 3648 -189- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTIUS96/18003 GTC AAT AAA AAA ATA TCC GAG CTA AAA 
CTG GAA AA-A .AAT AGA GCG CC o366 Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230 CGA CTC TAT TGT GCC GGT TAT CAA GGT GAA CAT ACG TTC CTG GTG ATG 3744 Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245 TTT TAT AAC CAA CAA GAC ACA CTA GAT AGT TAT AAA AAC GCT TCA ATG 3792 Phe Tyr Asn Gin Gin Asp Thr Leu Asp Sar Tyr Lys Asn Ala Ser Met 1250 1255 1260 CAA GGA CTA TAT ATC TTT CCT GAT ATG GCA TCC AAA GAT ATG ACC CCA 3840 Gin Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280 GAA CAG AGC AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3888 Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 1285 1290 1295 AAT AAT GTC AGA AGA GTC AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936 Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile 1300 1305 1310 CCT TCC TCG GTA ACT AGC CGT AAA GAC TAT GGT TGG CGA GAT TAT TAC 3984 Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325 CTC AGC ATG GTA TAT AAC GGA CAT ATT CCA ACT ATC AAT TAC AAA GCC 4032 Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340 GCA TCA AGT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 4080 Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile 1345 1350 .1355 1360 CAT AAT GGA TAT GAA GGA CAG AAG CGC AAT CAA TGC AAT CTG ATG AAT 4128 His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 1365 1370 1375 AAA TAT GGC AAA CTA GT GAT AAA TTT ATT GTT TAT ACT AGC TTG GG 4176 Lys Tyr Gly Lys Leu CGly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390 GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATG TTT TAC CCC GTC TAT 4224 Val Asn Pro Asn Asn Ser Sar Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405 CAA TAT AGC GGCA AAC ACC AGT GGA CTC AAT CAA GGG AGCA CTA CTA TTC 4272 Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 1410 1415 1420 CAC CCT GAC ACC ACT TAT CCA TCT AAA CTA GAA GCT TGG ATT CCT GGA 4320 His Arg Asp Thr Thr Tyr Pro 
Ser Lys Val Glu Ala Trp Ile Pro Gly 1425 1430 1435 1440 GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT CAT TAT 4368 Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala Ile Gly Asp Asp Tyr 1445 1450 1455 GCT ACA GAC TCT CT AAT AA-A CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416 Ala :ir Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr Ile Phe 1460 1465 1470 ATG ACT GAC ACT AAA GGG ACT GCT ACT CAT GTC TCA GGC CCA GTA GAG 4464 Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485 ATT AAT ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA GTC AAA GCG 4512 Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gin Ile Ile Val Lys Ala 1490 1495 1500 -190- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCJIUS96I1 8003 GGT GGC A.AG GAG CAA ACT TTT ACC GCA GAT *AA GAT GTC TCC ATT CAG 4560 Gly Sly Lys Giu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gin 1505 1510 1515 1520 CCA TCA CCT AGC TTT GAT ATG AAT TAT CAA TTT AAT GCC CTT GAA 4608 Pro Ser Pro Ser Phe Asp Giu Met Asn ryr Gin Phe Asn Ala Leu Glu 1525 1530 1535 ATA GAC GGT TCT GGT CTG AA-T TTT ATT AAC A.AC TCA GCC ACT ATT GAT 4656 Ile Asp Giy Ser Giy Lau Asn Phe Ile Asn. 
Asn Ser Ala Ser Ile Asp 1540 1545 1550 GTT ACT TTT ACC GCA TITr GCG GAG GAT GGC CGC AAA CTG GGT TAT GAA 4704 Val Thr Phe Thr Aia Phe Ala Giu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565 AGT TTC AGT ATT CCT GTT ACC CTC AAG GTA AGT ACC GAT AAT GCC CTG 4752 Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580 ACC CTG CAC CAT AAT GAA AAT GGT GCG CAA TAT ATG CAA TGG CAA TCC 4800 Thr Leu His His Asn Giu Asn Gly Ala Gin Ty~r Met Gin Trp Gin Ser 1585 1590 1595 1600 TAT CGT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT GCA CGC 4848 Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gin Leu Val Ala Arg 1605 1610 1615 GCC ACC ACC GGA ATC GAT ACA ATT CTG ACT ATG GAA ACT CAG A.AT ATT 4896 Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Giu Thr Gin Asn Ile 1620 1625 1630 CAG GAA CCC CAG TTA CCC AAA GGT TTC TAT OCT ACG TTC GTC ATA CCT 4944 Gln Giu Pro Gin Leu Cly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 1635 1640 1645 CCC TAT .AAC CTA TCA ACT CAT GOT OAT CAA CGT TOG TT AAG CTN' TAT 4992 Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Lau Tyr 1650 1655 1660 ATC AAA CAT GTT GTT CAT AAT A.AT TCA CAT ATT ATC TAT TCA OGC CAG 5040 Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Oly Gin 1665 1670 1675 1680 CTA ACA GAT ACA AAT ATA AAC ATC ACA ?1'A TTT ATT CCT CT OAT OAT 5088 Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 1685 1690 1695 GTC CCA TTG AAT CAA GAT TAT CAC GCC AAO GTT TAT ATO ACC TTC AAG 5136 Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val TPyr Met Thr Phe Lys 1700 1705 1710 AAA TCA CCA TCA GAT GOT ACC TOO TGG GGC CCT CAC TTT GTT AGA OAT 5184 Lys Ser Pro Ser Asp Oly Thr Trp Trp Oly Pro His Phe Val Arg Asp 1715 1720 1725 GAT AAA OGA ATA GTA ACA ATA AAC CCT AAA TCC ATT TTG ACC CAT TTT 5232 Asp Lys Oly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740 GAG AGC GTC AAT GTC CTO AAT AAT ATT AGT AOC GAA CCA ATO OAT TTC 5280 Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Giu Pro Met Asp Phe 1745 1750 1755 1760 AGC GGC GCT AAC 
AGC CTC TAT TTC TGG GA CTG TTC TAC TAT ACC CCG 5328 Ser Gly Ala Asn Ser Leu Tyr Phe Trp Giu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775 ATO CTG GTT GCT CAA CGT TTG CTO CAT GAA CAG AAC TTC CAT OAA GCC 5376 Met Leu Val Aia Gin Arg Leu Leu His Giu Gin Asn Phe Asp Olu Ala -191- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96/18003 1790 1785 1790 CCT TGC CTC kA2 TAT CTC TCG AGT CCA TCC GGT TAT ATT GTC CAC 542 4 Asn Arg Trp Leu Lys I1ir Val Trp Ser Pro Ser CiY y Ir Ile Val His 1795 1800 1805 GGC CAG ATT CAG AAC TAC CAG TCG AAC GTC CCC CCG TTA CTG GAA GC*aC 5472 Gly Gin Ile Gin Asn Tyr Gin Trp As n Val Arg Pro Leu Leu Giu Asp 1810 1815 1820 ACAT TGG AAC AGT GAT CCT TTC CAT TCC GTC CAT CCT GAC CCG GTA 5520 Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840 GCA CAC CAC CAT CCA ATC; CAC TAC AAA OTT TCA ACT TTT ATG CCT ACC 5568 Ala Gin His Asp Pro Met His 1'/r Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855 TTC CAT CTA TTG ATA GCA CCC CCC GAG CAT CCT TAT CGC CAA CTC CAA 5616 Leu Asp Leu Leu Ile Ala Arg Cly Asp His Ala Tyr Arg Gin Leu Ciu 1860 1865 1870 CCA CAT ACA CTC AAC CA). C AAG ATG TOG TAT ATC CAA CC CTC CAT 5664 Arg Asp Thr Leu Asn Olu Ala Lys Met Trp Tyr Met Gin Ala Leu His 1875 1880 1885 CTA TTA CGT GAC AAA CCT TAT CTA CCC CTC ACT ACC ACA TGC ACT CAT 5712 Lau Leu Cly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900 CCA CCA CTA GAC AGA CCC CC CAT ATC ACT ACC CAA. 
AAT CCT CAC GAC 5760 Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gin Asn Ala His Asp 1905 1910 1915 1920 AOC CCA ATA CTC CCT CTG CCC CAG AAT ATA CCT ACA CCG GCA CCT TTA 5808 Ser Ala Ile Val Ala Leu Arg Gin Asn Ile Pro Thr Pro Ala Pro Leu 1925 1930 1935 TCA TTG; CCC AGC GCT AAT ACC CTO ACT CAT CTC TTC CTG CCC CAA ATC 5856 Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin Ile 1940 1945 1950 AAT CAA CTG ATO ATG AAT TAC TOG CAG ACA TTA GCT CAG AGA GTA TAC 5904 Asn Giu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 1955 1960 1965 AAT CTG CCT CAT AAC CTC TCT ATC GACGGCC CAC CCC ?TA TAT CTC CCA 5952 Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gin Pro Leu Tyr Leu Pro 1970 1975 1980 ATC TAT GCC ACA CCC CCC CAT CCC AAA CC TTA CTC ACC CCC CCC CTT 6000 Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000 CCC ACT TCT CAA. GOT CCA CCC AAO CTA CCG GA. TCA TTT ATG TCC CTG 6048 Ala Thr Ser Gin Gly Cly Gly Lys Leu Pro Giu Ser Phe Met Ser Leu 2005 2010 2015 TGG CGT TTC CCC CAC ATO CTC GAA AAT CC CCC CCC ATO GTT ACC CAG 6096 Trp Arg Phe Pro His Met Leu Ciu Asn Ala Arg Cly Met Val Ser Gin 2020 2025 2030 CTC ACC CAG TTC CCC TCC ACO TTA CAA A).T ATT ATC GAA CCT GAG GAC 6144 Leu Thr Gin Phe Cly Ser Thr Leu Gin Asn Ile Ile Giu Arg Gin Asp 2035 2040 2045 CC CAA 0CC CTC A.AT CC TTA TTA CAA AAT CAG CCC CCC GAG CTC ATA 6192 Ala Giu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Giu Leu Ile 2050 2055 2060 TTG ACT AAC CTG ACC ATT CAG GAC AAA ACC ATT GAA CAA TTC CAT CC 6240 -192- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCTIUS96/18003 Lau Thr Asn Lau Ser Ile Gin Asp LV.s Thr Ile '3iu Giu Lau A-.sp Ala 2065 20 -7 0 2075 2080 GAG AA ACG CTG TTG GAA A. A TCC z.k CCG GGA GCA CAA TCG CGC TTT 628o3 Clu Lys Thr Val Leu Glu Lys Ser. 
Lys Ala GIY Ala CGin Ser Arg Phe 2085 2090 2095 GAT AGC TAC CCC AAA CTG TAC GAT GAG ATC AAC CCC GGT CAA AAC 63336 Asp Ser Tyr Gly Lys Leu Tyr Asp Giu Asn Ile Asn Ala Gly Giu Asn 2100 2105 2110 CAAL~ CCC ATG ACC CTA CGA GCG TCC GCC CCC GGG CTT ACC ACC CCA GTT 6384 Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125 CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432 Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile 13 2135 2140 TTC CCC TTT CCC CGT CCC CCC AGC CGT TGG CCC CCT ATC CCT GAG CC 6480 Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Cly Ala Ile Ala Glu Ala 2145 2150 2155 2160 ACA GCT TAT GTG ATG GAA TTC TCC CC AAT GTT ATC AAC ACC GAA CC 6528 Thr Cly rIy r Val Met Giu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175 CAT AAA ATT ACC CAA TCT CAA ACC TAC CGT CGT CCC CGT CAC GAG TGG 6576 Asp Lys Ile Ser Gin Ser Glu Thr Tlyr Arg Arg Arg Arg Gin Giu Trp 2180 2185 2190 GAG ATC CAC CCC AAT AAT CCC CAA CC CAA TTG AAC CAA ATC CAT GCT 6624 Giu Ile Gin Arg Asn Asn Ala Ciu Ala Giu Laeu Lys Gin Ile Asp Ala 2195 2200 2205 CAC CTC AAA TCA CTC GCT CTA CCC CCC CAA CCC CCC CTA TTC CAC AAA 6672 Gin Leu Lys Ser Leu Ala Val Arg Arg Ciu. 
Ala Ala Val Leu Gin Lys 221 2215 2220 ACC ACT CTC AAA ACC CAA CAA CAA CAC ACC CAA TCT CAA TTC CCC TTC 6720 Thr Ser Leu Lys Thr Gin Gin Ciu Gin Thr Gin Ser Gin Leu Ala Phe 2225 2230 2235 2240 CTG CAA CGT AAC TTC ACC AAT CAC CC TTA TAC AAC TGG CTC CCT CCT 6768 Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Cly 2245 2250 2255 CGA CTC CC CC ATT TAC TTC CAC TTC TAC CAT TTC CCC CTC CC CCT 6816 Arg Leu Ala Ala Ile Tryr Phe Gin Phe Tyr Asp Leu Ala Vai Ala Arg 2260 2265 2270 TGC CTC AT'- CCA CAA CAA CCT TAC CGT TCC GA.A CTC AAT CAT CAC TCT 6864 Cys Leu Met Ala Ciu Gin Ala Tyr Arg Trp Ciu Leu Asn Asp Asp Ser 2275 2280 2285 CCC CCC TTC ATT AAA CCC CCC CCC TGC CAC CCA ACC TAT CCC GGT CTG 6912 Ala Arg Phe Ilie Lys Pro Cly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 2290 2295 2300 CTT GCA GGT GAA ACC TTG ATC; CTC ACT CTC GCA CAA ATC CAA GAC CCT 6960 Leu Ala Gly Ciu Thr Leu Met Leu Ser Leu. Ala Gin Met Clu Asp Ala 2305 2310 2315 2320 CAT CTC AAA CCC CAT AAA CCC CCA TTA GAG CTT CAA CCC ACA GTA TCG 7008 His Leu Lys Arg Asp Lys Arg Ala Leu. Ciu Val Glu Arg Thr Val Ser 2325 2330 2335 CTG CCC CAA CTT TAT CCA CCA TTA CCA AAA CAT AAC CCT CCA TTT TCC 7056 Leu Ala Giu Val Tyr Ala Cly Leu Pro Lys Asp Asn Cly Pro Phe Ser 2340 2345 2350 -193- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PC'r/US96/18003 CTG GCT CAG GAA Lau Ala Gin Glu 2355 GGC AGT GGT AAT Gly Ser Gly Asn 2370 ATT GAC AGCTG GTG Ile Asp Ly.,s Lau Val 2360 MAT AMT TTG GCG TTC Asn Asn Leu Ala Phe 2375 AGT CAAk GZ:T TCA GGC AGT GC: Ser Gin Gly Ser Gly Ser A-'a 2365 GGC GCC GGC ACG GAC ACT k~Ak GlY Ala Gly Thr Asp Thr Lys 2380 ACC TCT Thr Ser 2385 TT~G CAG GCA Leu Gin Ala TCA GTT Ser Val 2390 TCA TTC GCT Ser Phe Ala GAT TTG Asp Leu 2395 AAA ATT CGT Lys Ile Arg
GAA
Glu 2400 GAT TAC CCG Asp 'rjr Pro GTC ACT *ITG Val Thr Leu GCA TCG CTT Ala Ser Leu 2405 CCC GCG CTA Pro Ala Leu 2420 GGC AAA A?1' Gly Lys Ile CGA COT Arg Arg 2410 ATC AAA CAG Ile Lys Gin ATC AGC Ile Ser 2415 CTG GGA CCG TAT Lau Gly Pro Tyr 2425 CAG GAT GTA CAG GCA ATA Gin Asp Val Gin Ala Ile 2430 '71 D4 7 15 2 7200 7248 7296 7344 7392 7440 7488 7536 TTG TCT TAC GGC GAT A Leu Ser Tyr Gly Asp Lys 2435 CCC GGA TTA Ala Gly Leu 2440 GCT AAC GGC Ala Asn Gly TGT GAA GCG CTG Cys Giu Ala Leu 2445 GCA GTT TCT Ala Val Ser 2450 CAC GGT ATG His Gly Met AAT GAC Asn Asp 2455 AGC GGC CAA Ser Gly Gin TTC CAG Phe Gin 2460 CTC GAT TTC Leu Asp Phe AAC GAT Asn Asp 2465 GGC AAA TTC Gly Lys Phe CTG CCA Leu Pro 2470 TTC GAA GCC Phe Glu Gly ATC CC Ile Ala 2475 ATT CAT CAA Ile Asp Gin
GC
Gly 2480 ACG CTG ACA CTC Thr Leu Thr Leu AGC TT'c Ser Phe 2485 CCA A.AT GCA Pro Asn Ala TCT ATO CCG Ser Met Pro 2490 GAG AAA CCT AAA Clu Lys Gly Lys 2495 CAA CCC ACT Gin Ala Thr TAC ACC ATN Tyr Thr Ile ATO 'ITA AAA Met Leu Lys 2500 AAA TMA Lys..
2516 ACC CTG MAC GAT ATC ATT ?l'C Thr Leu Asn Asp Ile Ile Leu 2505 CAT ATT CGC His Ile Arg 2510 7551

INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 2516 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47 (TcdA):

Features    From    To  Description
Peptide        1  2516  TcdA protein
Peptide       89  1937  TcdAii peptide
Fragment      89   100  S2 N-terminus (SEQ ID NO:13)
Fragment     284   299  (SEQ ID NO:38)
Fragment     554   563  (SEQ ID NO:17)
Fragment    1080  1092  (SEQ ID NO:23; 12/13)
Fragment    1385  1400  (SEQ ID NO:18)
Fragment    1478  1497  (SEQ ID NO:39)
Fragment    1620  1642  (SEQ ID NO:21; 19/23)
Fragment    1938  1948  (SEQ ID NO:41)
Peptide     1938  2516  TcdAiii peptide
Fragment    2327  2345  (SEQ ID NO:42)
Fragment    2398  2408  (SEQ ID NO:43)

Met 1 Gly Arg Tlyr Arg 65 Ala Ser 7.sn Glu Ser Phe Asn Cys Gin Gin Val His Asp Ala Ile Leu Lys Ile Leu Ala Gly Arg Ala 100 Val Lys Glu Ile Pro 5 Leu Thr Asp Ile Ser 25 Ser Glu His Lau Ser 40 Gin Gin Ala Gin Lys 55 Arg Ala Asn Pro Gin 70 Pro Asn Ala Giu Leu Ser Gin Tyr Vai Ala 105 Asp 10 His Trp Asp Leu Ilie 90 Pro Val Lau Lys Ser Ser Phe Ser Giu Thr 45 Asn Arg Leu 60 Gin Asn Ala 75, Gly Tyr Asn Gly Thr Val Ser Asn His ValI As n Ser Gin cys Giu Phe Asp Leu Giu Ala His Lau Gin Phe Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg 115 120 Giu Ala Leu His Ala Ser Asp Ser Val Tyr Leu 145 Thr Ser Arg Giu Pro 225 Ala Giu 130 Lys Leu Lys Pro ValI 210 Ala Ser Gly Ser Ser Leu Ser 195 Ile Ile Ile Asn Met Leu Glu 180 G ly Gin Ala Ser Ala 260 Ala Ser 165 Asn Ala Gly Pro 245 Glu Lau 150 As n Tyr Thr Gin Leu 230 Giu Glu 135 Ser GiU Thr Pro Asp 215 Met Lau Leu Gin Leu Lys Tyr 200 Pro His Phe Tyr Tyr Gin Leu Val1 185 His Gly Gin As n Lys 265 Leu As n Leu 170 Met Asp Leu Ala Ile 250 Lys Asp Met 155 Giu Glu Ala Glu Ser 235 Leu Asn Thr 140 Asp Ser Met Tyr Gin 220 Leu Thr Phe 125 Arg Ile Ile Lau Giu 205 Leu Leu Giu Gly Arg GlU Lys
Ser 190 Asn As n Gly Glu Asn 270 Arg *Pro Lau Thr 175 Thr Val1 Ala Ile Ile 255 Ile As n Asp Ser 160 Giu Phe Arg Ser As n 240 Thr Glu Pro Ala Ser 275 Ser Asp Giu 290 Gin Gin Giu 305 Ser Asp Gly Asn Ala Tyr Tyr Arg Leu 355 Ser Ile Lys Leu Ala met Pro Giu Tyr Leu Lys Arg 280 Tyr Tyr Asn Leu 285 GiU Tyr Thr Gin 340 Asp Leu Ser Asn 310 Lys Asp Lys Asp GIn Phe Ilie Gly 295 Asn Gin Leu Ile Val Tyr Arg Ile 330 Val Giu Leu Phe 345 Phe Lys Asn Phe 360 Lys Arg Glu Lau -195- Lys Thr 315 Thr Pro Tyr ValI Ser Val1 Glu Gly Ala 365 Thr Asn ValI Tyr Gly 350 Ser Glu Phe As n Thr 335 Giu Giy Gly Ser 320 Thr As n Leu Ala SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/18003 Pro 385 Asp C ly Gin Ala As n 4 65 Thr Ile Ser Phe Asp 545 Ser Lys Leu Leu Ly s 625 Laeu Thr Thr Leu Giu 705 G ly Lys 37 0 Gin Ile Ser Thr 450 Leu Ly s Leu Gin Ser 530 Trp Leu Ile Leu Ile 610 Gin His Ser Val1 His 690 As n Asp Tyr Val1 S er Trp Ser 435 Giu Gin Tyr Cys Phe 515 Thr Arg Phe Lys Ala 595 Ala Leu Thr Tyr Tyr 675 ValI ValI G iy Thr Asn Gin Ala 420 Phe Leu Leu Tyr As n 500 Asp G ly Lys Arg As n 580 Asp Vai Ala Gin As n 660 His Met Ala Ala Pro 740 Ile Pro 405 Tyr Leu Ser Asp Met 485 Ala Arg Asp Thr Leu 565 As n Ile Gly Thr Lys 645 Lys Gly Ala His Met 725 Giy Tyr Giu Ala Lys Thr 455 Asn Arg Ile Phe Giu 535 Leu Lys Lys Gin Gly 615 Ile Ser Leu Gin Tyr 695 Val1 Ala Ser Ser Ile Ala Lau 440 Ile Thr Tyr Ser As n 520 Ile Lys Ile Asn Leu 600 Lys Arg Val1 Thr Gly 680 Ile Leu Giu Giu Ala Gly Lys 425 As n Leu Asp Ala Gin 505 Thr Asp Arg Thr Lau 585 Thr Thr Lys Phe Pro 665 Phe Ala Leu Lys Ala 745 As n Leu 410 Phe Ly s Giu Val1 Ile 490 Arg Pro Leu Ala Asp 570 Ser Ile Asn Leu Gin 650 Giu Asp Ala Trp Phe 730 Vali Ile 395 Thr Thr Ala Giy Leu 475 His Ser Leu Asn Phe 555 His As n Asp Leu As n 635 Leu Ile Lys Thr Ala 715 Trp Giu Leu ValI Giu Arg 445 Val Lys GiU Asp Asn 525 G iy Ile Asnr Tyr Leu 605 Ala Ile Ile Asn Lys 685 Gin Lys Trp Gin As n Leu Giu 430 Lau Arg ValI Thr Asn 510 G ly Ser Asp Lys Ile 590 Asp Ile Thr 
Met Leu 670 Ala Leu Leu Lau Giu 750 -196- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 .al Gin 'yr Cys Gin Ala Leu Ala Gin Lau Glu Met Val Tyr His Ser 755 760 765 Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 770 775 780 Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu 3er 785 790 795 800 Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 805 810 815 Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 820 825 830 Glu Gin Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gin 835 840 845 Ala Ser Ile Gln Ala Gin Asn His Gln His Leu Pro Pro Val Thr Pro 850 855 860 Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gin Trp 865 870 875 380 Val Asn Val Ala Gin Gin Leu Asn Val Ala Pro Gin Gly Val Ser Ala 885 890 895 Leu Val Gly Leu Asp Tyr Ile Gin Ser Met Lys Glu Thr Pro Thr Tyr 900 905 910 Ala Gin Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 915 920 925 Gin Gin Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala 930 935 940 Ala Leu Ser Thr Tyr Tyr Ile Arg Gin Val Ala Lys Ala Ala Ala Ala 945 950 955 960 Ile Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu Ile Asp Asn Gin 965 970 975 Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser 980 985 990 Ile Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 995 1000 1005 Asn Ser Gly Val Ile Ser Arg Gin Phe Phe Ile Asp Trp Asp Lys Tyr 1010 1015 1020 Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val Tyr Tyr 1025 1030 1035 1040 Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gin Thr Lys Met 1045 1050 1055 Met Asp Ala Leu Leu Gin Ser Val Ser Gin Ser Gin Leu Asn Ala Asp 1060 1065 1070 Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gin Val 1075 1080 1085 Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp 1090 1095 1100 Gin Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120 Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 1125 1130 1135 -197- SUBSTITUTE SHEET (RULE 
26) WO 97/17432 PCT/US96/18003 Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn 1140 1145 1150 Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr 1155 1160 1165 Leu Leu Trp Leu Glu Gin Lys Glu Ile Thr Lys Gin Thr Gly Asn Ser 1170 1175 1180 Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200 Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp 1205 1210 1215 Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230 Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245 Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260 Gin Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280 Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 1285 1290 1295 Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile 1300 1305 1310 Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325 Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340 Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile 1345 1350 1355 1360 His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 1365 1370 1375 Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390 Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405 Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 1410 1415 1420 His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly 1425 1430 1435 1440 Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala Ile Gly Asp Asp Tyr 1445 1450 1455 Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr Ile Phe 1460 1465 1470 Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485 lle Asn Thr Ala Ile Ser Pro Ala Lys Val Gin Ile Ile Val Lys Ala 1490 1495 1500 Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gin -198- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 1505 
1510 1515 1520 Pro Ser Pro Ser Phe Asp Glu Mse Asn Tyr Gin Phe Asn Ala Leu Glu 1525 1530 1535 Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp 1540 1545 1550 "-al Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565 Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580 Thr Leu His His Asn Glu Asn Gly Ala Gln T7r Met Gin Trp Gin Ser 1585 1590 1595 1600 Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gin Leu Val Ala Arg 1605 1610 1615 Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile 1620 1625 1630 Gin Glu Pro Gin Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 1635 1640 1645 Pro TIr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 1650 1655 1660 Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gin 1665 1670 1675 1680 Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 1685 1690 1695 Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 1700 1705 1710 Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 1715 1720 1725 Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740 Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 1745 1750 1755 1760 Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 50 1765 1770 1775 Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 1780 1785 1790 Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His 1795 1800 1805 Gly Gin Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp 1810 1815 1820 Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840 Ala Gin His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 5 1845 1850 1855 Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gin Leu Glu 1860 1865 1870 Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gin Ala Leu His 1875 1880 1885 -199- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96IJ 8003 Leu Leu Gb As p Lys Pro r/yr TLeu Pro Leu Ser Thr Thr Trp Ser Asp 1390 1895 
1900
Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp
1905 1910 1915 1920
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu
1925 1930 1935
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile
1940 1945 1950
Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro
1970 1975 1980
Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val
1985 1990 1995 2000
Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu
2005 2010 2015
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln
2020 2025 2030
Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp
2035 2040 2045
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe
2085 2090 2095
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn
2100 2105 2110
Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val
2115 2120 2125
Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile
2130 2135 2140
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala
2165 2170 2175
Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp
2180 2185 2190
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys
2210 2215 2220
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255
Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg
2260 2265 2270
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser
2275
2280 2285
Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350
Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala
2355 2360 2365
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys
2370 2375 2380
Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu
2385 2390 2395 2400
Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser
2405 2410 2415
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile
2420 2425 2430
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe
2450 2455 2460
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly
2465 2470 2475 2480
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495
Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg
2500 2505 2510
Tyr Thr Ile Lys
2516
INFORMATION FOR SEQ ID NO:48:
SEQUENCE CHARACTERISTICS:
LENGTH: 5547 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48 (tcdAii coding region):
CTG ATA CCC TAT AAC AAT CAA TT ACC CCT AGA GCC ACT CAA TAT CTT 48
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val
1 5 10
GCG CCG GGT ACC CTT TCT TCC ATC TTC TCC CCC GCC GCT TAT TTC ACT 96
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr
25
Glu
TAT
Tyr
CAA
Gin
TTA
Leu
GTG
Val
CAT
His
GGA
Gly I CA c Gin 145 AAT A Asn I AAG A Lys L TAC C 7yr L
CTT
Leu
CTG
Leu
AAT
Asn
TTC
Leu
ATC
Met
GAT
Asp :TT C eu C 130 TAT CCT Ty1r Arg CAT ACC Asp Thr ATG GAT Met Asp GAA GCA CCC Glu Ala Arg CGC CCC CCA Arg Arg Pro 55 ATA GAA TTA Ile Glu Leu 70 AAT TTA Asn Leu 40 CAT CTC Asp Leu TCC ACA Ser Thr
CAC
His Lys
CTC
Lau
OCA
Ala
TCA
Ser
TCT
Ser ACT CAC TCC Ser Asp Ser 45 ATC CCC CTC Met Ala Leu 60 TTG TCC AAT Leu Ser Asn OTT TAT 144 Val Tyr ACT CAG 192 Ser Gln GAG CTC 240 Glu Leu 75
GAA
Clu
GAA
Glu
GCT
Ala 115 3AG 1 u
AGC
Ser
ATO
Met 100
TAT
Tyr
CAA
Cii
ATT
Ile
CTC
Leu
CAA
Glu
CTC
Leu
A.A.
Ly.
TC(
Se
A.A
Asr As r CC TCC ia Ser rTT CTG le Leu AA AAT ys Asn TT AAA eu Lys 195 CTA TTG GGI Leu
ACC
Thr
TTT
Phe 180
CGT
Cly 150
GAG
Glu
AAT
Asn
TAT
A ACT s Thr C ACT r Thr r GTG I Val
CCA
1 Ala 135
AT?
Ile
ATT
Ile ATC C Ile C AAT I Asn I TTT C Phe G 215 AAC Asn S ACA A Thr T GAG A Glu A TAT T Tr L 2 GGC G
CAA
Clu
TTC
Phe
CGT
Arg 120
TCA
Ser Ser
CCT
Arg 105
GAA
Giu
CCC
Pro TCT AAA CTG Lys 90
CCT
Pro
CTT
Va1
CCA
Ala Leu
TCC
Ser
ATC
Ile
ATT
Ile
ATC
Ile 155
AAT
Asn
TCA
CAA
Glu
GGC
Cly
CAC
Gln
GCC
Ala 140
TCC
Ser
GCT
Ala
TTG
AAC CCT TCA AAC TAT ACT Asn Tyr Thr CCA ACC CCT Ala Thr Pro 110 CTA CAA GAT Leu Cln Asp 125 CC TTG ATG Gly Leu Met CCT GAG CTA Pro Glu Leu AAA 288 Ly s TAT 336 Tyr CCT 384 Pro CAT 432 His TTT 480 Phe Ala
GAA
Glu
CCC
Pro 185 Arg Tyr Tyr
ATT
Ile
CTT
Leu 225
CGC
Arg
CTA
Leu
AAT
Asn
CAA
Glu ccA Ala
GGT
Gly 210
ATT
Ile
ATC
Ile
TTT
Phe
TTT
Phe
CT?
Leu 290
AAT
Asn
AAA
Lys
ACT
Thr
ACC
Thr
CCC
Pro
TAT
Tyr 275
GTT
Val
ATC
Ile
GCC
Ala
CCC
Pro
CGC
Arg
TTC
Phe 260
AAT
Asn
CGA
Arg
ACA
rhr
AGC
Ser
GTA
Val
GAA
Glu 245
CCT
cly
GCC
Ala
ACT
Thr
AAT
Asn
GTC
Val 230
TAT
Tyr
CCT
dly
TCT
Ser
GAA
Glu rTA ACC.
.eu Ser !00 ;GT CAA ;iy Gin 6GC ACT er Ser CC AAT 'hr Asn AT TAT Sn Tyr 265 Ser
CGT
Gly 170
GCC
Ala
GAT
Asp
CAC
Gln
CAT
Asp
GCT
Ala 250 Ccc;
CAA
Glu
CAA
GlU
CC
Gly 235
TAT
Tyr
TTA
CAA
Glu
TAT
Tyr 220
ACC
Thr
CAA
Gin
GAT
CT?
Leu 205
ACT
Ser
GTT
Val
ATG
Met
TAT
Ser Leu Ala CAA CTT TAT 528 Clu Leu Tyr 175 ATC CCC GAA 576 Met Pro Glu 190 ACT CAC TT? 624 Ser Gin Phe AAT AAC CAA 672 Asn Asn Gin AAG CTA TAT 720 Lys Val Tyr 240 CAT GTG GAG 768 Asp Val Glu 255 AAA TTC AAA 816 Lys Phe Lys 270 Arg Leu Asp Tyr
TA
eu 80
CT
la TCC ATC Ser Ile CCT CAA Pro Gin AAC TTA AAT CAT Lys Leu Asn Asp 285 CTC AAT ATA GAA Val Asn le Clu 300
AAA
Lys
TAC
Tyr ACA 864 Arg TCC 912 Ser Gly A.
295 TTA AAT ACC GCT CAT ATC ACT CAA CCT TTT GAA ATT 960 Leu Asn Thr Aia Asp Ile Ser Gin Pro Phe Glu Ile -202- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 305 310 CCC CT-- ACA CGA CTA CTT CCT Cly Leu Thr Arg Val Leu Pro 325 -k2A TTT ACC CTT CMA GAG TAT Lys Phe Thr Val Giu Ciu Tyr 340 M-C AAG CCT ATT CCT CTA TCA Asn Lys Ala lie Arg Leu Ser 355 CTC GAA GGC ATT CTC CGC ACT Leu Ciu Gly Ile Val Arg Ser 370 375 GAC GTA TTA GCT AAA CTT TTT Asp Val Leu Gly Lys Val Phe I 385 390 GCT ATT CAT CCT CAA ACT GCCC Ala Ile His Ala Giu Thr Ala I 405 CAA CT TCA TAT GAT AA T CAA C Gin Arg Ser Tyr Asp Asn Gin P 420 ACG CCA TTA CTC AAC GCA CAA T Thr Pro Leu Leu Asn Cly Gin T 435 4 GAT TTA AAT TCA GGT AGC ACC CI Asp Leu Asn Ser Ciy Ser Thr C 450 455 CGT GCA TTT AAT ATT GAT GAT G Arg Ala Phe Asn Ile Asp Asp V 465 470 ACC CAC CAT GAT AAT AAA CAT CC Thr Asp His Asp Asn Lys Asp C 485 CTT TCC AAT TTA TAT ATT GCA AP Leu Ser Asn Leu Tyr Ile Cly Ly 500 ACC ATT CAT GAA CTG CAT TTA TT Thr Ile Asp Giu Leu Asp Leu Le 515 52 ACT AAT TTA TCC CCT ATC ACT CA' Thr Asn Leu Ser Ala Ile Ser Asl 530 535 AAA CTC AT ACT ATT ACC ACC TC Lys Leu Asn Thr Ile Thr Ser Tr 545 550 TTC CAG CTA TTT ATC AT ACC TCC Phe Gin Leu Phe Ile Met Thr Sex 565 CT GAA ATT AAG AAT TTC CTG CAT Pro Glu Ile Lys Asn Leu Leu Asp 580
TCC
TTT GAT AAAC GAC CAA GCA GAT TTG -j 32') TCC CCT TCT TCC CCA TAT CCC CCC .CA 1008 Ser Cly Ser Trp Ala Tyr Ala Ala Ala 330 335 M.AC CM- TAq TCT TTT CTC CTA MA CTT i056 Asn Gin ?Ir Ser Phe Leu Leu Lys Leu 345 350 CGT CCC ACA GAA TTC TCA CCC ACC ATT 1104 Arg Ala Thr Glu Leu Ser Pro Thr Ile 360 365 GTT MAT CTA CMJA CTC CAT ATC MAC ACA 1152 Val Asn Leu Gin Leu Asp Ile Asn Thr 380 CTC ACT AAA TAT TAT ATC CAG CCT TAT 1200 Leu Thr Lys Ty r Tyr Met Gin Arg Tyr 395 400 TG ATA CTA TCC AAC GCG CCT ATT TCA 1248 .eu Ile Leu Cys Asn Aia Pro Ile Ser 410 415 :CT ACC CAA TTT CAT CCC CTG TTT MAT 1296 ro Ser Gin Phe Asp Arg Leu Phe Asn 425 430 'AT TrT TCT ACC CCC CAT GAG GAG ATT 1344 yr Phe Ser Thr Cly Asp Giu Ciu Ile 40 445 GC CAT TGC CGA AAA ACC ATA CTT AAC 1392 ly Asp Trp Arg Lys Thr le Leu Lys 460 TC TCG CTC TTC CCC CTC CTT AAA ATT 1440 al Ser Leu Phe Arg Leu Leu Lys Ile 475 480 IA AAA ATT AAA AAT MAC CTA AAG AAT 1488 Ly Lys Ile Lys Asn Asn Leu Lys Asn 490 495 LA TTA CTC CCA CAT ATT CAT CAA TTA 1536 s Leu Leu Ala Asp Ile His Gin Leu 505 510 'A CTC ATT GCC CTA GGT CAA CCG AMA 1584 u Leu Ile Ala Vai Gly Ciu Cly Lys 0 525 T AAC CAA -TI GCT ACC CTC ATC AGA 1632 p Lys Gin Leu Ala Thr Leu Ile Arg 540 CTA CAT ACA CAC AAC TCC ACT CTA 1680 p Leu His Thr Gin Lys Trp Ser Val 555 560 ACC ACC TAT AAC MAA ACG CTA ACC 1728 Thr Sen Tyr Asn Lys Thr Leu Thr 570 575 ACC CTC TAC CAC CGT TTA CAA GCT 1776 Thr Val Tyr His Cly Leu Gin Gly 585 590 CTA CAT CTC ATG CCC CCC TAT ATT 1824 -203- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCTUS96/1 8003
GCC
.AlIa
CTT
Lau
AAA
Ly s
GCC
Gin
CGT
r g
GCG
Ala 3) 705
CAT
Asp
TTT
Phe cTT Lau Gin
TCT
Ser 785
GTC
V-A 1
TCA
Ser
GTA
TTT
Phe 365 Asp
CC
.Ila 610
TGC
Trp
TTC
Phe
GTA
ValI
TTC
Leu
CTA
Lau 690
CCC
Pro
TG
Trp
GAA
Glu
CAT
Asp
CAT
His 770
ATC
Ile
GCC
Ala
ATG
Met
TTA
Leu
CTG
Leu 850
OTC
Va 1 Ly s 595
ACC
Thr
GCA
Ala
TGG
Trp
CAA
Ciu
GAA
Ciu 675 Phe
GCG
Ala
GTG
ValI
GCT
Ala
CCT
Ala 755
CTT
Leu
AAT
Asn
CCA
Pro
AAA
Ly s
ACC
Thr 835
GAT
Asp ccc :Ala Asp
TTG
Leu
CAT
Asp
GAC
Asp
ACC
Thr 660
ATG
Met
G
Val1
CAT
His
AAC
As n
AAC
As n 740
AAT
As n
CCC
Pro
ACT
Thr
CAG
Gin
GAG
Ciu 820
GCC
Ala
CAA
Giu
AA
Ly s [Lys Al a CAA TTA Gin Leu .ACG TTA Lys Leu 630 TCC TTC Trp Leu 645 CAC GAA Gin Ciu CTT TAC Val Tyr ACA AAA Thr Lys CAT GCC Asp Ala 710 CCA CTA Ala Leu 725 TCC TTA Ser Leu TTG CTC Leu Leu CCA CTA Pro Val ATC CTC Ile Leu 790 CGC CTT Cly Val 805 ACA CCC Thr Pro GC 'TCG Gly Leu TCT CC Ser Arg GCA CCG Ala Ala 870 His
AAT
Asn
GAC
Asp
TAT
Tryr 650
CAG
Gin
GCC
C ly Phe
ATT
Ile
CC
Ala 730
CAA
Gin
ACT
Ser
AAT
Asn
AAT
Asn
CTC
ValI 810
CAG
Gin
CAC
Gin
TTA
Leu
AAA
Ly s Vai1
CTC
Ila I G GC Gly 635
ACC
Thr
TAT
T'y r
ATC
Ile
GC
Ciy
ATG
Met 715
TCC
Ser
CTC
Lau
ATT
Ile
CC
Ala
GTC
Val1 795
CCC
Gly
TCC
Trp
CCT
Ala
AC
Ser
AC
Ser 875 Met
GCC
A. .la 620
CCA
Ala
CCC
Pro
TCT
Cys
AAC
As n
CCT
Ala 700
CTC
Leu
TCC
Ser
GCT
Ala
CAA
Gin
'FTC
Phe 780
CCA
Ala
CTC
Leu
CAA
Giu
AAT
As n
ACC
Thr 860
CCT
Arg Pro
TCC
Ser
ACA
Thr
TCA
Ser
CCT
Ala 670
AAC
As n
ACT
Thr
CGT
Arg
CTA
Leu
GCC
Ala 750
CAA
Gin
TGT
Cys
CAA
Gin
TAT
Tyr
CC
Ala 830
TTA
Leu
TAT
Tyr
GAC
Asp I I a CTC 1872 Leu G.A 1920 Giu 640 CA?. 1963 Giu CCA 2016 Ala TTC 206-4 Phe CCA 2112 Ala CCG 2160 Ala 720 CCA 2208 Ala AAT 2256 As n CAT 2304 His ACA 2352 Thr AAT 2400 As n 800 CAA 2448 Gin CCC 2496 Cly CCT 2544 Ala CGT 2592 Arg TAT 2640 Try r 880 -204- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96/18003 :A TIC TTA rCTO ATT GAT AAT CAG GTT TCT GCG GC.A ATA I.A I=-CO; G In 71,r Lau Lau Ile Asp asn Gin Val Ser Ala Ala le Lys Thr Thr 385 890 895 COG ATC 0CC GAA GCC ITT GOO AGT ATT CAA CTG TAC 3TC AAC CGG OA 2736 Arg Ile Ala Giu Ala Ila A1la Ser Ile Gin Lau Ty Ir Val Asn Arg Ala 900 905 910 TTG OA AAT OTG GAA G AA- AAT GCO AA2T TCG GGG OTT ITC AGC COC 2784 1( Llu Glu Asn Val Giu Giu Asn Ala Asn Ser Gly Vai le Ser Arg Gin 915 920 92 TTIC TTT ATC GAC TOG GAC AA TAC AAT AA.A COC TAC AO ACT TOO 000 2832 Phe Phe Ile Asp Trp Asp Lys Ty'r Asn Lys Arg 7,'r Ser Thr Trp Ala 930 935 940 GOT OTT TCT CAA TTA OTT TAC TAC CCG OA.A AAC TAT ATT OAT COG ACC 2980 Oly Val Ser Gin Lau Val Tyr TIyr Pro Oiu Asn Tyr Ile Asp Pro Thr 945 950 955 960 ATO COT ATC GGA CAA ACC AAA ATO ATO GAC OA TTA CTG CAA TCC OTC 2928 Met Arg Ile Oly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val 965 970 975 AO CAA AGC CAA TTA A--C 0CC OAT ACC OTC OAA OAT 0CC TTT ATO TCT 2976 Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Olu Asp Ala Phe Met Ser 980 985 990 TAT CTG ACA TOG TTT GAA CAA TO OT A.AT CTT AAA OTT ATT AO GOA 3024 Tyr Leu Thr Ser Phe Oiu Gin Val Ala Asn Leu Lys Val Ile Ser Ala 995 1000 1005 TAT CAC OAT AAT ATT AAT AAC OAT CAA 000 CTO ACC TAT TTI' ATC OGA 3072 Tyr His Asp Asn Ile Asn Asn Asp Gin Oly Leu Thr Tyr Phe Ile Oly 1010 1015 1020 CTC AOT OAA ACT OAT 0CC GOT OAA TAT TAT TOO COC AOT OTC OAT CAC 3120 Leu Ser Glu Thr Asp Ala Oly Giu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040 AOT AAA TTC AAC GAC GOT AAA TTC OCO OCT AAT GCC TOG AOT OAA TOO 3i68 Ser Lys Phe Asn Asp Oly Lys Phe Ala Ala Asn Ala Trp Ser Olu Trp 1045 1050 1055 CAT AAA ATT OAT TOT CCA ATT AAC CCT TAT A.AA AO ACT ATC COT CCA 3216 
His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro 1060 1065 1070 OTO ATA TAT AAA TCC COC CTO TAT CTO CTC TOG TTG OAA CAA AAO GAG 3264 Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Oiu Gin Lys Giu 1075 1080 1085 ATC ACC AAA CAG ACA OGA AAT AOT AAA OAT GOC TAT CAA ACT GAA ACC 3312 Ile Thr Lys Gin Thr Oly Asn Ser Lys Asp Oly Tyr Gin Thr Oiu Thr 1090 1095 1100 OAT TAT COT TAT OAA CTA AAA TTO OCO CAT ATC COC TAT OAT GOC ACT 3360 Asp Tyr Arg Tyr Giu Leu Lys Leu Ala His Ile Arg Tlyr Asp Gly Thr 1105 1110 1115 1120 TOO AAT ACO CCA ATC ACC TTT OAT OTC AAT A-AA AAA ATA TCC GAG CTA 3408 Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Olu Leu 1125 1130 1135 AAA CTO GOAA A:A~ AAT AGA OCO, CCC OGA CTC TAT TOT 0CC GOT TAT CAA 3456 Lys Leu Olu Lys Asn Arg Ala Pro Oiy Leu Tyr Cys Ala Gly TNr Gin 1140 1145 1150 GOT OAA OAT ACO TTO CTO OTO ATO TTT TAT AAC CAA CAA OAC ACA CTA 3504 Gly Olu Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu 1155 1160 1165 -205- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCTIUS96/I 8003 CAT ACT TAT AAA G)C CT TCA 'ATO CA.A CCA C TA TAT ATC TTT OCT 3 CA'T 3552 Asp Ser Tyr Lys Asn Ala Sar Met Gin Gly Leu I'Ir Ile Phe Ala Asp 1170 1175 1180 ATG GCA TCC AA CAT ATC ACC CCA GCA CAC AGC A.AT CTT TAT CGC CAT 3600 Met Ala Ser Lys Asp Met Thr Pro Ciu Gin Ser Asn V/al ITr Arg Asp 1185 1190 .1195 1200 1) AAT ACC TAT CAA CAA TTT GAT ACC AT AAT CTC ACA AGA GTC A.T 3~ 36 48 Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 1205 1210 1215 CGC TAT GCA GAG CAT TAT GAG ATT CCT TCC TCC GTA ACT AGC COT AAA~ 3696 Arg Tyr Ala Ciu Asp Tyr Ciu Ile Pro Ser Ser Val 3cr Ser Arg Lys 1220 1225 1230 GAC TAT GT TOO GCA CAT TAT TAC CTC ACC ATG CTA TAT AAC GGA CAT 3744 Asp Tyr Cly Trp Ciy Asp Tyr Tyr Leu Ser Met Vai Tyr Asn Gly Asp 1235 1240 1245 ATT CCA ACT ATC .AAT TAC AAA GCC CCA TCA ACT GAT TTA AAA ATC TAT 3792 Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr 1250 12-55 1260 ATC TCA CCA AAA TTA AGA ATT ATT CAT AAT CCA TAT CA?. 
GOA CAC AAG 3840 Ile Ser Pro Lys Leu Arg Ile Ile His Asn Cly Tyr Clu Cly Gin Lys 1265 1270 1275 1280 CCC AAT CAA TGC AAT CTC ATC A.AT AAA? TAT CCC AAA CTA GCT CAT 3 88 8 Arg Asn Gin Cys Asn Leu Met Asn Lys Tyr Cly Lys Leu Gly Asp Lys 1285 1290 1295 TTT ATT CTT TAT ACT ACC TTG CCC OTC CCA AAT AAC TCC TCA AAT 3936 Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1300 1305 1310 AAC CTC ATC TTT TAC CCC CTC TAT CA. TAT AGC GA AAC ACC ACT CC?. 3984 Lys Leu Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 1315 1320 1325 CTC AAT CA. CCC AGA CT?. CTA TTC CAC COT GAC ACC ACT TAT CC?. TCT 4032 Leu Asn Gin Cly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340 AAA~ CT?. CA. OCT TCO AT? CCT CCOA C? AAA? COT TCT CTA ACC AAC CAA 4080 Lys Val Ciu Ala Trp Ilie Pro Cly Ala Lys Arg Ser Leu Thr Asn Gin 1345 1350 1355 1360 AA;,T CCC CCC AT? GOT CAT CAT TAT CT AC?. CAC TCT CTC AAT CCC 4128 Asn Ala Ala Ile Cly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 1365 1370 1375 CAT CAT CTT AAG CA. TAT ATC TTT ATG ACT CAC ACT AAA CCC ACT CCT 4176 As p Asp Leu Lys Gin Tyr Ilie Phe Met Thr Asp Ser Lys Cly Thr Ala 1380 1385 1390 ACT CAT CTC TCA CCC CC?. CTA GAG ATT AAT ACT C? ATT TCT CCA C? 4224 Thr Asp Vai Ser Cly Pro Vai Gu Ile Asn Thr Ala Ile Ser Pro Ala 1395 1400 1405 AA? OTT CAC AT?. ATA CTC AAA? CC GT CCC AAC GAG CAA. ACT TTT ACC 4272 Lys Val Gin Ile Ile Val Lys Aia Cly Cly Lys Giu Gin Thr Phe Thr 1410 1415 1420 CA CAT AkkA CAT OTC TCC ATT CAC CCA TCA CCT ACC TT? CAT GAA ATC 4320 Al' Ia Asp Lys Asp Val Ser Ilie Cmn Pro 5Cr Pro Ser Phe Asp Giu Met 1425 1430 1435 1440 AAkT TAT CAA. TTT AAT GCC CTT GAA AT?. 
CAC CCT TCT CCT CTC AAT TTT 4368 Asn Tyr Gin Phe Asn Ala Leu iu Ilie Asp Cly Ser Cly Leu Asn Phe -206- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCTIUS96/18003 1445 1450 1455 ATT AAC A.C TCA CCC ACT ATT CAT CTT ACT TTT ACC GCA TTT C GAG 4416 Ile Asn Asn Sar Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Giu 1460 1465 1470 GAT CCC CGC AA CTG CCT TAT CAA ACT TTC ACT ATT CCT GTT ACC CTC 4464 As~p Gly Arg Lys Leu Cly Ty~r Giu Ser Phe Ser Ile Pro Val Thr Leu 1475 1480 1485 G CTA ACT ACC GAT AAT CCC CTG ACC CTC CAC CAT GAA AAT GGT 4512 Lys Val Ser Thr Asp Asn Ala Lau Thr Lau His His Asn Giu Asn Cly 1490 1495 1500 CC CAA TAT ATC CAA~ TCG CAA TCC TAT CCT ACC CCC CTC AAT ACT OTA 4560 Ala Cln Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520 ?I'T CCC CCC CAC TTC CTT CCA CCC CCC ACC ACC GGA ATC CAT ACA ATT 4608 Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Cly Ile Asp Thr Ile 1525 1530 1535 CTG ACT ATC GAA ACT CAC AAT ATT CAG CAA CCC CAC TTA CCC AAA CCT 4656 Leu Ser Met Ciu Thr Gin Asn Ile Gin Ciu Pro Gin Leu Cly Lys Cly 1540 1545 1550 TTC TAT CCT ACC TTC CTG ATA CCT CCC TAT AAC CTA TCA ACT CAT CCT 4704 Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Cly 1555 1560 1565 CAT CAA CCT TCG TT AAG CTT TAT ATC AAA CAT CTT CTT CAT A.AT AAT 4752 Asp Ciu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn 1570 1575 1580 TCA CAT ATT ATC TAT TCA CCC CAC CTA ACA CAT ACA AAT ATA AAC ATC 4800 Ser His Ilie Ilie Tyr Ser Cly Gin Leu Thr Asp Thr Asn Ile Asn Ile 1585 1590 1595 1600 ACA TTA TTT ATT CCT CTT CAT CAT CTC CCA TTC AAT CAA CAT TAT CAC 4848 Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gin Asp Tyr His 1605 1610 1615 CCC AAG GTT TAT ATC ACC TTC AAG AAA TCA CCA TCA CAT CCT ACC TOG 4896 Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Cly Thr Trp 1620 1625 1630 TCC CCC CCT CAC TTT CTT AGA CAT GAT AAA CGA ATA CTA ACA ATA AAC 4944 Trp Cly Pro His Phe Val Arg Asp Asp Lys Gly Ilie Val Thr Ile Asn 1635 1640 1645 CCT AA TCC ATT TTG ACC CAT TTT GAG 
ACC CTC AAT CTC CTG AAT AAT 4992 Pro Lys Ser Ilie Laeu Thr His Phe Giu Ser Val Asri Val Leu Asn Asn 1650 1655 1660 ATT ACT ACC GAA. CCA ATC CAT TTC AGC CCC CCT AAC ACC CTC TAT TTC 5040 Ilie Ser Ser Ciu Pro Met Asp Phe Ser Cly Ala Asn Ser Leu TIyr Phe 1665 1670 1675 1680 TCC GA-A CTC TTC TAC TAT ACC CCC ATC CTOGCTT CCT CAA CCT TTC CTC 5088 Trp Ciu Leu Phe Tyir Tyr Thr Pro Met Leu Val Aia Gin Arg Leu Leu 1685 1690 1695 CAT CAA CAC AAC TTC CAT CAA CCC AAC CCT TCG CTG AAA TAT CTC TCC 5136 His Ciu Gin Asn Phe Asp Giu Aia Asn Arg Trp Laeu Lys Tyr Vai Trp 1700 1705 1710 ACT CCA TCC CCT TAT ATT CTC CAC CCC CAG ATT CAC AAC TAC CAC TOG 5184 .3er Pro Ser Cly Tyr Ilie Val His Cly Gin Ilie Gin Asn Tyr Gin Trp 1715 1720 1725 A-C CTC CCC CCC TTA CTC CA.A CAC ACC ACT TOG AAC ACT CAT CCT TTC 5232 -207- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCTIUS96I1 8003 Asn 'a zArg Pro Lau Leu Clu Asp Thr Ser Trp Asn ser Asp Pro Lau 17301735 1740 GAT TCC CTC CAT CCT CAC C GTA GCA CAG CAC GAT CCA ATG CAC TAC 5230 Asp Ser Val Asp Pro Asp Ala ValI Ala Gin His Asp Pro Met His r1Z'r 1745 1750 1755 1760 AAA GTT TCA ACT TTT~ ATG CCT ACC TTC GAT CTA TTC ATA GCA CGC GCC 5328 Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Cly 1765 1770 1775 CAC CAT GCT TAT CGC CAA CTG CA. 
CCA GAT ACA CTC AAC GAA GCC AAG 5376
Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys
1780 1785 1790
ATC TCC TAT ATG CAA CC CTC CAT CTA TTA CCT CAC AAA CCT TAT CTA 5424
Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu
1795 1800 1805
CCC CTC ACT ACC ACA TCC ACT CAT CCA CCA CTA CAC ACA CCC CC CAT 5472
Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp
1810 1815 1820
ATC ACT ACC CAA AAT GCT CAC CAC AGC CCA ATA CTC CCT CTC CCC CAC 5520
Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln
1825 1830 1835 1840
AAT ATA CCT ACA CCC CA CCT TTA TCA 5547
Asn Ile Pro Thr Pro Ala Pro Leu Ser
1845 1849
INFORMATION FOR SEQ ID NO:49:
SEQUENCE CHARACTERISTICS:
LENGTH: 1849 amino acids
TYPE: amino acids
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49 (TcdAii):
Features  From  To    Description
Peptide   1     1849  TcdAii peptide
Fragment  1     12    S2 N-terminus (SEQ ID NO:13)
Fragment  196   211   (SEQ ID NO:38)
Fragment  466   475   (SEQ ID NO:17)
Fragment  993   1004  (SEQ ID NO:23; 12/13)
Fragment  1297  1312  (SEQ ID NO:18)
Fragment  1390  1409  (SEQ ID NO:39)
Fragment  1532  1554  (SEQ ID NO:21; 19/23)
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val
1 5 10
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr
25
Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr
40
Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln
55
Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu
70 75
Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys
Val Met Glu Met Leu Ser Thr Phe Arg Pro 7?r Ser Gly Ala Thr His Asp Ala 115 Tyr Glu Asn Va 1 Gly Leu Glu Gln Leu Asn Al 130 Gln Ala 145 Asn Ile Lys Lys Tyr Leu Lys 19C Ile Gly Lys 210 Leu Ile Thr 225 Arg Ile Thr Leu Phe Pro Asn Phe Tyr 275 Glu Leu Val 290 Ala Asn Ile 305 Gly Leu Thr Lys Phe Thr As Lys Ala 355 Leu Glu Gly 370 Asp Val Leu 385
Ala Ile His Gin Arg Ser r Leu Leu Gly 150 1 Thr Giu Glu 165 i Phe Gly Asn 180 Arg Tyr Tyr Ala Ser As Pro Val Val 230 Arg Glu Tyr 245 Phe Gly Gly 260 Asn Ala Ser Arg Thr Glu Thr Leu Asn 310 Arg Val Leu 325 Val Glu Glu 340 Ile Arg Leu Ile Val Arg Gly Lys Val I 390 Ala Glu Thr I 405 Tyr Asp Asn C 420 135 Ile Ile Ile Asn Phe 215 Asn Thr Glu Tyr Gly 295 Thr Pro Tyr Ser Ser 375 ?he I lia I ;In P In 1 4 'hr G 55 Arg 120 Ser Asn Thr Glu Leu 200 Gly Ser Thr Asn Leu 280 Ala Ala Ser Asn Arg 360 alI .eu .eu I 'ro S 4 'r P 40 ly A Glu Val Ile Gir
'I
Prc Al Glu Prc 185 Ser Gin Ser Asn Tyr 265 Ser Pro Asp ly Gin 345 A1a ksn rhr le jer 25 'he Ala Ser Gly 170 Ala Asp Gin Asp Ala 250 Arg Ile Gin Ile Ser 330 Tyr Thr Leu Lys Leu C 410 Gin E Ser I 1I4 Iit 15' As Sex
GIL
GIU
Gly 235 Tyr Leu Lys Val Ser 315 Trp Ser 3lu 'In ryr 95 -ys The 'hr e Ala 140 c-Ser 5 i Ala Leu Glu Tyr 220 Thr Gin Asp Leu Asn 300 Gin Ala Phe Leu Leu 380 Tyr Asn 2 Asp Gly 4 Lys 'T 460 Leu 125 Gly Pro Glu Ala Leu 205 Ser Val Met Tyr Asn 285 Ile Pro Tyr Leu Ser 365 ksp 4et C la I ~rg 1 4 Isp G Gi Let Gl.
Glu Met 190 Ser Asn Lys Asp Lys 270 Asp Glu Phe Ala Leu 350 Pro le ;In ?ro .eu lu I Asp i Met I Leu Leu 175 Pro Gin Asn Vai Val 255 Phe Lys Tyr Glu Ala 335 Lys Thr Asn 1 Arg Ile S 415 Phe Giu I Pro His Phe 160 Tyr Glu Phe Gin Tyr 240 Glu Lys Arg Ser Ile 320 Ala Leu Ile rhr ryr 100 jer Ls n :le Thr Pro Leu Leu Asn Gly
C
4 Asp Leu 7) 450 Asn Ser Giy Ser sp Trp Arg 'hr Ile Leu Lys -209- SUBSTITUTE SHEET (RULE 26) WO 97/17432 Arg Al 465 Thr As Leu SE Thr I: Thr Ar Lys L 545 Phe G.
Pro G Phe A Ala A 6 Leu T 625 Lys P Ala V Gin L Arg L 6 Ala P 705 Asp T Phe G Leu A Gin 7 Ser I 785 Val Ser Le s n 30 In lu sp la 10 rp he al eu eu 90 ro rp lu rsp i s '70 ket 4et Phe Asn 1 His Asp 4 Asn Leu 500 Asp Glu I 515 Leu Ser Asn Thr Leu Phe Ile Lys 580 Lys Asp 595 Thr Leu Ala Asp Trp Asp Glu Thr 660 Glu Met 675 Phe Vai Ala His Val Asn Ala Asn 740 Ala Asn 755 Leu Pro Asn Thr i Pro Gin Lys Glu 820
I
~sn 185 Ili r ,eu kla Ile Ile 565 Asn Lys GIn Lys Trp 645 Gin Val Thr Asp Ala 725 Ser Leu Prc 1l Gil Th~ Asp 470 Ly s Ile Asp Ile Thr 550 Met Leu Ala Leu Leu 630 Leu Glu Tyr Lys Ala 710 Leu Leu Let Val Le 79( Va r Pre Asp V Asp G Gly L Leu L 5 Ser A 535 Ser T Thr S Leu Asp I Ser 615 Gin I Asn His His Pro 695 Leu Gly Thr Leu I Thr 775 j Gin 0 I Ser o Thr ai ly I's *eu 20 Isp 'rp ,er ~sp .eu )00 3er ?ro hr Ile Ser 680 Glu Ser Glu Ala Glr 76( PrC Tr Al Ser Ly s Leu 505 Leu Ly s Leu Thr Thr 585 Leu Glu Gly Lys Val 665 Thr Met Leu Lys GlL 745 Al
)GI
Va a Lei Leu P 4 Ile L 490 Leu P Ile Gin I His Ser 570 Val His Asn Asp Tyr 650 Gin Gly Phe Ile Ala 730 1 Gin i Ser i Asn I Asn u Val 810 he 75 ,ys la la .eu rhr 555 ['yr ['yr ial Val Gly 635 Thr Tyr Ile Gly Met 715 Ser Leu Ile Ale Va 79' Gil Arg L Asn Asp Val C Ala 540 Gin I Asn His Met Ala 620 Ala Pro cys Asn Ala 700 Leu Ser Ala Gin x Phe 780 i Ala 5 r' Leu ,eu L sn L :ie H 5 ly G 525 rhr L L'ys I Lys I ly I A.a I 605 His Met Gly Gin Glu 685 Ala Thr Vai Asp Ala 765 Sen Gin Asp eu eu is lu ,eu :rp rhr .eu 590 Pro Ser rhr Ser Ala 670 Asn Thr Arg Leu Ala 750 Gin Cys Glr 7;' PCT/US96/1 8003 Lys I- 480 Lys Asn 495 Gin Leu Gly Lys Ile Arg Sen Val 560 Leu Thr 575 Gin Gly Tyr Ile Val Leu Ala Glu 640 Ser Giu 655 Leu Ala Ala Phe Gly Ala Phe Ala 720 Ala Ala 735 Met Asn Asn His Trp Thr 1 Leu Asn 800 Ile Gin 815 r Ala Gin Trp Glu Asn 825 Gin Gin Ala Asn Thr Ala Ala Gly 830 Leu His Ala Val Leu Thn 835 Ala Gly Leu Asn -210- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 Phe Leu Asp Glu ar Arg Ser Ala Ala Leu Ser Thr Tyrr Tr le Arg 350 355 860 Gin Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu 1Tyr 365 870 875 880 Gin T'r Leu Leu Ile Asp Asn Gin Val Ser Ala Ala Ile Lys Thr Thr 885 890 895 Ara Ie Ala Glu Ala Ile Ala Ser Ile Gin Leu Tyr Val Asn Arg Ala 900 905 910 Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln 915 920 925 Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala 930 935 940 Gly Val Ser Gin Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr 945 950 955 960 Met Arg Ile Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val 965 970 975 Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 980 985 990 Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val Ile Ser Ala 995 1000 1005 Tyr His Asp Asn Ile Asn Asn Asp Gin Gly Leu Thr Tyr Phe Ile Gly 1010 1015 1020 Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040 Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 1045 1050 1055 His Lys Ile 
Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro
1060 1065 1070
Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu
1075 1080 1085
Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
1090 1095 1100
Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
1105 1110 1115 1120
Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
1125 1130 1135
Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
1140 1145 1150
Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
1155 1160 1165
Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
1170 1175 1180
Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
1185 1190 1195 1200
Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215
Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
1220 1225 1230
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245
Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
1250 1255 1260
Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1265 1270 1275 1280
Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295
Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1300 1305 1310
Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly
1315 1320 1325
Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser
1330 1335 1340
Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln
1345 1350 1355 1360
Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro
1365 1370 1375
Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala
1380 1385 1390
Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala
1395 1400 1405
Lys Val Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr
1410 1415 1420
Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met
1425 1430 1435 1440
Asn Tyr Gln Phe Asn
Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe
1445 1450 1455
Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu
1460 1465 1470
Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu
1475 1480 1485
Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly
1490 1495 1500
Ala Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu
1505 1510 1515 1520
Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile
1525 1530 1535
Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly
1540 1545 1550
Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly
1555 1560 1565
Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn
1570 1575 1580
Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile
1585 1590 1595 1600
Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His
1605 1610 1615
Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp
1620 1625 1630
Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn
1635 1640 1645
Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn
1650 1655 1660
Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe
1665 1670 1675 1680
Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu
1685 1690 1695
His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp
1700 1705 1710
Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp
1715 1720 1725
Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu
1730 1735 1740
Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr
1745 1750 1755 1760
Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly
1765 1770 1775
Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys
1780 1785 1790
Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu
1795 1800 1805
Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp
1810 1815 1820
Ile Thr Thr Gln Asn Ala His
Asp Ser Ala Ile Val Ala Leu Arg Gln
1825 1830 1835 1840
Asn Ile Pro Thr Pro Ala Pro Leu Ser
1845 1849
INFORMATION FOR SEQ ID NO:50:
SEQUENCE CHARACTERISTICS:
LENGTH: 1740 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50 (TcdAiii coding region):
TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC AAT 48
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn
1 5 10
GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC AAT 96
Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr Asn
25
CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA ATC 144
Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu
40
TAT
Tyr Thr 65
CT
Arg
ACC
Thr
GAA
Glu
ACT
Thr
A.AA
Lys 145
AGO
Ser
GCC
Ala
CCA'
Ala
GGC
Gly
GGT
Gly 225
AAA,
Ly s
ATC
Ile
CTC
Leu
AGT
Ser]
CAA~
Gin 305
GCC
Ala
TCT
Ser
TTC
Phe
CAG
Gin
GCG
Ala
AAC
As n 130
ACG
Thr
TAC
I'llr
ATG
Met rCC Ser
TTT
Phe 210
TAT
I'y r
ATT
Ile
CAG
Gin
AA
Ly s
CTG
Laeu 290
CGT
Arg
ACA
Thr
CAA
Gin
CCG
Pro
TTC
Phe
CTC
Leu 115
CTG
Leu
GTG
Val
GGC
G ly
ACG
Thr
CGT
Arg 195
GCC
Ala
GTG
Val
AGO
Ser
CG
Arg
TCA
Ser 275
AAA
Lys
AAG
Lys
CG
Pro
GGT
Gly
CAC
His
GC
Gly 100
AAT
As n
AGC
Ser
TTG
Leu
AAA
Lys
CTA
Leu 180
CTO
Leu
GGT
31y
PLTG
Met
CAA
Gin
PAAT
.s n 260
CTC
Leu
ACC
Thr
TTC
Phe
CC
Ala
GGA
ATG
Met
TCC
Ser
GCG
Ala
ATT
Ile
GAA
Giu
OTO
Leu 165
CGA
Arg
CO
Ala
GGC
C ly
GAA
Glu
TCT
Ser 245
AAT
Asn
GCT
Ala C AA Gin
AGC
Ser
GAT
Asp
GGC
Gly 70
CTG
Leu
ACG
Thr
TT~A
Leu
CAG
Gin
AAA
Lys 150
TAC
Tyr
GCG
Ala
GT
Gly
GOC
C ly TTrC Phe 230
GAA
Giu Pro 55 L~y s Glu
TTA
Leu
TTA
Leu
GAC
Asp 135
TOO
Ser
CAT
Asp
TOO
Ser
GOG
Ala
AGO
Ser 215
TOO
Ser
ACC
Thr
AAA
Ly s
OTA
Lau
A.AT
Asn
CAA
Gin
OA.A
Gin 120
AAAA
Ly s
AAA
Ly s
GAG
Glu
GC
Ala
GOG
Ala 200
OGT
Arg
CG
Ala
TAO
Tyr
GOG
Ala
CCC
Pro
GC
Al1a
AAT
Asn 105
AAT
As n
ACC
Thr
GOG
Ala
AAT
Asn
GC
Ala 185
GOT
Ala
TCG
Trp
AAT
Asn
OGT
Arg
GAA
Giu 265
GAA
Glu
ACC
Thr
TTA
Lau
TTA
Leu
GAA
Giu
CC
Arg 90
ATT
Ile C AG G in
ATT
Ile
GGA
Gly
ATO
Ile 170
GGG
Gly
GAT
Asp
GGG
Gly
GTT
Val
CT
Arg 250
TTC
Leu
GOC
Ala
CAA
Gln
TAC
Tyr
OTO
Leu
TCA
Ser 75
CGC
Gly
ATO
Ile
CO
Ala
GAA
Glu
GCA
Ala 155
AAC
Asn CTT Leu
CTC
Leu
GCT
Ala
ATG
Met 235
CC
Arg
AAG
Lys
CO
Ala
TCT
Ser
AAC
Asn 315
AGO
Ser 60
TTT
Phe
ATG
Met
GAA
Glu
GC
Ala
GAA
Glu 140
CAA
Gin
GCC
Ala
ACC
Thr
CTG
Val
ATO
Ile 220
AAO
As n
OGT
Arg
CAA
Gin
GTA
Val
CAA
Gin 300
TG
Trp Lau CO GC Ala Ala ATG TOO Met Ser GTT AGO Val Ser CT OAC Arg Gin 110 GAG OTC Ciu Leu 125 TTC CAT Leu Asp TOG CC Ser Arg CCT CAA Gly Glu AOG GOA Thr Ala 190 OCT AAO Pro Asn 205 CT GAG Ala Ciu ACC CAA Thr Ciu CAG GAG Gin Clu ATO CAT Ile Asp 270 TTC CAG Leu GIn 285 TTC CO Leu Ala OTC CT Lau Arg Pro Ile GTT CO 192 Val Al Ia COT,: TGG 240 Lau Trp CAG CTO 288 Gin Leu GAO CG 336 Asp Ala ATA TTG 384 Ile Leu CO GAG 432 Ala Giu TTT CAT 480 Phe Asp 160 AAO C AA 528 Asn Gin 175 GTT CAC 576 Val Gin ATO TTO 624 Ile Phe CG ACA 672 Ala Thr CG CAT 720 Ala Asp 240 TCC GAG 768 Trp Glu 255 CT CAG 816 Ala Gin AAA ACC 864 Lys Thr TTO OTG 912 Phe Laeu GGT OGA 960 Cly Arg 320 CO CAA CG Ala Glu Ala
GTA
Val
CAA
Gin
AAT
Asn 310
OGO
Arg
GAA
Glu 295
CAG
Gln
CTG
Leu
CTG
Leu
CCC
Arg
GGA
Ala
CTG
Leu 385
GCC
Ala
GCT
Ala
AGT
Ser
TCT
Ser
TAG
Tyr 465
ACT
Thr
TCT
Ser
GTT
Val
CAT
Asp
CTG
Leu 545
GCC
Ala loG Thr 3CG Ala
ATG
Met
TTC
Phe
GGT
Gly 370 Lys
GAA
Glu
GAG
Gin
GGT
Gly
TTG
Leu 450
CCG
Pro
TTG
Leu
TAG
Tyr
TCT
Ser
GGG
Gly 530
ACA
Thr
ACT
Thr
ATT
Ile
GCG
Ala
GCA
Ala
ATT
Ile 355
GAA
Glu
CGC
Arg
GTT
Val
GAA
Glu
AAT
Asn 435
GAG
Gln
GCA
Ala
CCC
Pro
GGG
Gly
CAC
His 515
AAA
Lys
GTG
Leu
ATG
Met
AAA
Lys 579 ATT TAG TTG Ile Tyr Phe 325 GAA CAA GCT Giu Gin Ala 340 AAA CCG GGG Lys Pro Gly ACG TTG ATG Thr Leu Met GAT AAA CGC Asp Lys Arg 390 TAT GGA GGA Tyr Ala Gly 405 ATT GAG AAG Ile Asp Lys 420 AAT AAT TTG Asn Asn Leu GGA TGA GTT Ala Ser Vai TCG CTT GGC Ser Leu Gly 470 GGG GTA GTG Ala Leu Leu 485 GAT AAA GCC Asp Lys Ala 500 GGT ATG AAT Gly Met Asn TTC CTG CCA Phe Leu Pro AGG TTC CCA Ser Phe Pro 550 TTA AAA ACC Leu Lys Thr 565 TAA 1740
GAG
Gin
TAG
Tyr
GCC
Ala
GTG
Leu 375
GCA
Ala
TTA
Leu
GTG
Leu
GCG
Ala
TCA
Ser 455
AAA
Lys
GGA
Gly
GGA
Gly
GAG
Asp
TTG
Phe 535
AAT
Asn
TTG
Phe
GGT
Arg
TGG
Trp 360
ACT
Ser
TTA
Leu
CCA
Pro
GTG
Val
TTC
Phe 440
TTC
Phe
ATT
Ile
CCG
Pro
TTA
Leu
AGG
Ser 520
GAA
Glu
GCA
Ala
TAG
TGG
Trp 345
GAG
Gin
GTG
Leu
GAG
Glu
AAA
Lys
ACT
Ser 425
GGC
Gly
GCT
Ala
GGA
Arg
TAT
Tyr
GCT
Ala 505
GGC
Gly
GGC
Gly
TCT
Ser
GAT
Asp 330
GAA
Glu
GGA
Gly
GCA
Ala
GTT
Val
GAT
Asp 410
GAA
Gin
GCC
Ala
GAT
Asp
GGT
Arg
GAG
Gin 490
AAC
Asn
CAA
Gin
ATG
Ile
ATG
Met
TTG
Leu
CTC
Leu
ACC
Thr
CAA
Gln
GAA
Glu 395
AAG
Asn
GGT
Gly
GGG
Gly
TTG
Leu
ATG
Ile 475
GAT
Asp
GOG
Gly
TTG
Phe
GCC
Ala
CCG
Pro 555 Ala
AAT
Asn
TAT
Tyr
ATG
Met 380
CGC
Arg
GGT
Gly
TCA
Ser
ACG
Thr
AAA
Lys 460 Lys
GTA
Val
TGT
Cys
CAG
Gln
ATT
Ile 540
GAG
Glu
GTC
Val
GAT
Asp
GCC
Ala 365
GAA
Glu
ACA
Thr
CCA
Pro
GGG
Gly
GAC
Asp 445
ATT
Ile
GAG
Gln
GAG
Gln
GAA
Glu
CTC
Leu 525
GAT
Asp
AAA
Lys
GCG
Ala
GAG
Asp 350
GGT
Gly
GAG
Asp
GTA
Val
TTT
Phe
AGT
Ser 430
ACT
Thr
GGT
Arg
ATG
Ile
GCA
Ala
GGC
Ala 510
GAT
Asp
C.A
Gln
GGT
Gly
GGT
Arg 335
TCT
Ser
GTG
Leu
GCT
Ala
TCG
Ser
TCC
Ser 415
GCC
Ala
AAA
Lys
GAA
Glu
AGC
Ser
ATA
Ile 495
CTG
Leu
TTG
Phe
GGG
Gly
AAA
Lys TGC 1003 Cv; s occ 1056 Ala CTT 1104 Leu CAT 1152 His CTG 1200 Leu 400 GTG 1248 Leu GGC 1296 Cly ACC 1344 Thr GAT 1392 Asp GTG 1440 Val 480 TTG 1488 Leu GCA 1536 Ala AAG 1584 Asn ACG 1632 Thr CAA 1680 Gin 560 CTG AAC GAT ATG Leu Asn Asp Ile 570 ATT TTG CAT ATT GGC TAG 1728 Ile Leu His Ile Arg Tyr 575 INFORMATION FOR SEQ ID NO:51: -215- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PTU9/80 PCTIUS96/18003 SEQUENCE CHARACTERISTICS: LENIGTH: 579 amino acids TYPE: amino acids STRANDEDNESS: single TOPOLOGY: linear MOLECULE TYPE: protein SEQUENCE DESCRIPTION: SEQ ID D1O:51 Leu
I
Glu Leu Tyr Thr Arg Thr Giu Thr Lys 145 Ser Ala Ala Gly G ly 225 Lys Ile Leu ser (ii) (x i) Arg Ser 'alI Met Arg His Ala Thr Ser Gin Phe Pro Gin Phe Ala Leu 115 Asn Leu 130 Thr Val Tyr Gly Met Thr Ser Arg 195 Phe Ala 210 Tyr Val Ile Ser Gin Arg Lys Ser 275 Leu Lys 2 90 Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin Ile Asn (TcdAiii): Met 20 Asn Pro G ly His Gly 100 Asn Ser Leu Lys Leu 180 Leu G ly Met Gin Asnf 260 Leu Thr As n Leu Ala Gly Met Ser Ala Ile G lu Leu 165 Arg Ala Gly Giu Ser 245 Asn Ala Gin Tyr Ser Asp G ly 70 Leu Thr Leu Gin Lys 150 Tyr Ala Gly G ly Phe 230 Giu Ala Val1 Gin Trp Ile Pro 55 Lys Giu Leu Leu Asp 135 Ser Asp Ser Ala Ser 215 Ser Thr Giu Arg Giu 295 Gin Asp 40 Lys Leu As n Gin Gin 120 Lys Lys Giu Ala Ala 200 Arg Ala Tyr Ala Arg 280 Gin Thr 25 Gly Ala Pro Ala As n 105 As n Thr Ala Asn Ala 185 Ala Trp As n Arg GiU 265 Giu Thr Leu Leu Ala Gin Pro Leu Leu Giu Ser 75 Arg Gly 90 Ile Ile Gin Ala Ile Glu Gly Ala 155 Ile Asn 170 Gly Leu Asp Leu Gly Ala Val Met 235 Arg Arg 250 Leu Lys Ala Ala Gin Ser 1 'yr Asn 315 Gin Leu Ser Phe Met Giu Ala Glu 140 Gin Ala Thr ValI Ile 220 As n Arg Gin Val1 Gin 300 Trp Arg Ty1r Ala Met Val Arg Giu 125 Leu Ser G ly Thr Pro 205 Ala Thr Gin Ile Leu 285 Leu Leu Val Leu Ala Ser Ser Gin 110 Leu Asp Arg Giu Ala 190 Asn G iu Giu Glu Asp 270 Gin Aia Arg Tyr Pro ValI Leu Gin Asp Ile Aia Phe As n 175 Val1 Ile Ala Ala Trp 255 Ala Ly s Phe G ly As n Ile Ala Trp Leu Ala Leu Giu Asp 160 Gin Gin Phe Thr Asp 240 Giu Gin Thr Leu Arg 320 Arg Lys Phe Ser Asn Gin Ala -216- SUBSTITUTE SHEET (RULE 26) WO 97/17432PCUS/180 PCT/US96/18003 =eu Ala Ala Ilie Leu Arg Ala Leu 385 Ala Ala Ser Ser Tyzr 465 Thr Ser ValI Asp Leu 545 la Thr Met Ala Glu 340 Phe Ile Lys 355 G1Y Glu Thr 370 Lys Arg Asp Giu Val Tyr Gin Glu Ile 420 Gly Asn Asn 435 Leu Gin Ala 450 Pro Ala Ser Leu Pro Ala Tyr Gly Asp 500 Ser His Gly 515 Gly Lys Phe 530 Thr Leu Ser Thr Met Leu Ile Lys..
579 Tr 325 Gin Pro Lau Lys Ala 405 Asp Asn Ser Leu Leu 485 Lys Met Leu Phe Lys 565 Phe Gin Phe Tyr Asp Lau Ala Val Al'a 330 Arg Cys 335 Tyjr Ala Leu 375 Ala Leu Leu Ala Ser 455 Ly s Gly G ly Asp Phe 535 Asn Leu Trp 345 Gin Lau Giu Lys Ser 425 Gly Ala Arg Ty r Ala 505 C ly G ly Ser Asp Asn Met 380 Arg Gly Ser Thr Ly s 460 Lys ValI Cy s Gin Ile 540 Giu Leu INFORMATION FOR SEQ ID NO:52: SEQUENCE CHARACTERISTICS: LENGTH: 5532 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52 (TcdAiii coding region): TTT ATA CAA CCT TAT AGT CAT CTG T'IT GGT AAT CT GCT CAT AAC TAT 48 Phe Ile Gin Cly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 1 5 10 GCC CC CCC CCC TCC CTT GCA TOG ATC TTC TCA CCG CC CT TAT TTC 96 -217- SUBSTITE SHEET (RULE 26) WO 97/1 7432 PCTIUS96/18003 Ala Ala Pro Gly Ser Ala Ser Met Phe Ser Pro Ala Ala 'Pr Leu 215 ACC GAA TTG TAC COT CAA CCC AAA %kAC TTG CAT CAC AGC AGC TCA ATT 144 Thr Clu Leu Tlyr Arg Glu Ala L',s Asn Leu His Asp Ser Ser Ser Ile 40 TAT TAC CTA CAT kzL- COT CCC CCG GAT TTA GCA ACC TTA ATG CTC ACC 192 r 1 'r Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser I) 50 55 CAG AAA AAT ATO OAT GAO OAA ATT TCA ACO CTO OCT CTC TCT AAT GAA 240 Gin Lys Asn Met Asp G1i Clu Ile Ser Thr Leu Ala Leu Ser Asn Clu 70 75 TTC TOC CTT CCC COO ATC CAA ACA AAA ACA OA AAA TCA CAA OAT GAA 288 Leu Cys Leu Ala Oly Ile Olu Thr Lys Thr Oly Lys Ser Gin Asp Clu 90 OTO ATO CAT ATO TTO TCA ACT TAT CO-T TTA ACT OCA GAO ACA CCT TAT 336 Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Cly Ciu Thr Pro Tyr 100 105 110 CAT CAC OCT TAT GAA ACT OTT COT OAA ATC OTT CAT OAA COT GAT CCA 384 His His Ala Tyr Olu Thr Val Arg Olu Ile Val His Clu Arg Asp Pro 115 120 125 OA TTT COT CAT TTG TCA CAGCOCA CCC ATT OTT OCT OCT AAG CTC CAT 432 Cly Phe Arg His Leu Ser Gin Ala Pro Ile Val Ala Ala Lys Leu Asp 130 135 140 CCT OTO ACT TTC TT OT ATT AOC TCC CAT ATT TCG CCA OAA CTO TAT 480 Pro Val Thr Leu Leu Gly Ile 
Scr Scr His Ilie Ser Pro Oiu Leu Tyr 145 150 155 160 AAC TTG CTC ATT GAG GAG ATC CCC CAA AAA CAT GA.A CCC CC CTT CAT 528 Asn Leu Leu Ile Giu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp 165 170 175 ACO CTT TAT AAA ACA AAC TTT CCC GAT ATT ACT ACT OCT CAC TTA ATO 576 Thr Leu Tyr Lys Thr Asn Phe Cly Asp Ile Thr Thr Ala Gin Leu Met 180 185 190 TCC CCA ACT TAT CTC 0CC COO TAT TAT 0CC CTC TCA CCC CAA GAT ATT 624 Ser Pro Scr Tyr Leu Ala Arg Tyr Tyr Gly Val 5cr Pro Glu Asp Ile 195 200 205 CCC TAC OTO ACG ACT TCA TTA TCA CAT OTT OCA TAT AOC ACT GAT ATT 672 Ala Tyr Val Thr Thr Ser Leu Ser His Val Cly Tyr Ser Ser Asp Ile 210 215 220 CTO OTT ATT CCC TTG CTC CAT CT OTO OCT AAG ATO GAA OTA OTT COT 720 Leu Val Ile Pro Leu Val Asp Cly Val Cly Lys Met Olu Val Val Arg 225 230 235 240 OTT ACC COA ACA CCA TCC CAT AAT TAT ACC AGT CAC ACG AAT TAT ATT 768 Val Thr Arg Thr Pro 5cr Asp Asn Tyr Thr Ser Cmn Thr Asn Tyr Ile 245 250 255 GAG CTO TAT CCA CAC COT CCC GAC AAT TAT TTC ATC AAA TAC A.AT CTA 816 Olu Leu Tyr Pro Gin Oly Cly Asp Asn Tyr Leu Ile Lys Tyr Asn Leu 260 265 270 AOC AAT ACT TTT OCT TTG OAT OAT TI'T TAT CTG CA.A TAT AAA OAT GOT 864 Ser Asn Ser Phe Cly Leu Asp Asp Phe Tyr Leu Gin Tyr Lys Asp 01y 275 280 285 TCC OCT CAT TOO ACT GAO ATT 0CC CAT AAT CCC TAT CCT OAT ATO GTC 912 Ser Ala Asp Trp Thr Ciu Ile Ala His Asfl Pro Tyr Pro Asp Met Val 290 295 300 -218- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCTIUS96II 8003 ATA A..T CAA -G TAT G TCA CAG GCG ACA ATC AA CGT AGT GAC T lie Asn Gin Lys Ty r Giu Ser Gin Ala Thr Ilie Lys Ara Ser Asp Ser 305 310 315 3 2 0 -AC A.AT ATA CTC AGT ATA GGG TTA CAA AGA TOO, CAT AGC GOT AGT TAT 1008 Asp Asn Ile Leu Ser 1ie Gly Leu Gin Arg Trp, His Ser Gly Ser ly r 325 330 335 AA T TTT CCC GCC CCC AAT TTT AAA ATT GAC CAA TAC TCC CCG AAA GCT 1056 I) Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gin T1yr Ser Pro Lys Ala 340 345 350 TTC CTG CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104 Phe Leu Leu Lys Met Asn Lys Ala Ilie Arg Leu Leu Lys Ala Thr Cly 355 
360 365 CTC TCT TTT GCT ACC TTG GAG CGT ATT OTT GAT ACT GTT AAT AGC ACC-1i52 Leu Ser Phe Ala Thr Leu Giu Arg Ile Val Asp Ser Vai Asri Ser Thr 370 375 380 AAA TCC ATC ACC GTT GAG GTA TTA AAC AAG OTT TAT COG GTA AAA TTC 1200 Lvs Ser Ile Thr Val Giu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 385 390 395 400 TAT ATT GAT COT TAT CCC ATC AGT GAA GAG ACA CCC CCT ATT TTC OCT 1248 'ryr Ile Asp Arg Ty r Gly Ile Ser Giu Ciu Thr Aia Ala Ile Leu Ala 405 410 415 AAT ATT AAT ATC TCT CAG CAA CCT OTT CCC AAT CAG CTT AGC CAG TTT 1296 Asn Ile Asn Ile Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 420 425 430 GAG CAA CTA TTT AAT CAC CCC CCC CTC AAT GCT ATT CCC TAT GAA ATC 1344 Giu Gin Leu Phe Asn His Pro Pro Leu Asn Gly Ile Arg Tyr Ciu Ile 435 440 445 ACT GAG GAC AAC TCC AAA CAT CTT CCT AAT CCT CAT CTG AAC CTT AAA 1392 Ser Giu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 450 455 460 CCA GAC ACT ACC GOT CAT CAT CAA CCC AAG CC OTT TTA AAA CCC CC 1440 Pro Asp Ser Thr Gly Asp Asp Gin Arg Lys Ala Val Leu Lys Arg Ala 465 470 475 480 TTT CAC OTT AAC CCC ACT GAG TTC TAT CAG ATO 'ITA TTG ATC ACT CAT 1488 Phe Gin Val Asn Ala Ser Giu Leu Tyr Gin Met Leu Leu Ile Thr Asp 485 490 495 COT AAA GAA GAC GOT OTT ATC AAA AAT AAC ITA GAG AAT TTC TCT CAT i536 Arg Lys Giu Asp Ciy Val Ile Lys Asn Asn Leu Ciu Asn Leu Ser Asp 500 505 510 CTC TAT TTC OTT ACT TTC CTG CCC CAC ATT CAT AAC CTG ACT ATT OCT 1584 Leu Ty r Leu Val Ser Leu Leu Aia Gin Ile His Asn Leu Thr Ile Ala 515 520 525 CAA TTC AAC ATT TTC TTC CTG ATT TOT CCC TAT CCC GAC ACC AAC ATT 1632 Giu Leu Asn Ilie Leu Leu Val Ile Cys Gly Tyr Cly Asp Thr Asn Ile 530 535 540 TAT CAC ATT ACC CAC CAT AAT TTA CCC AA ATA CTG GAA ACA TTC TTC 1680 Ty r Gin Ile Thr Asp Asp Asn Leu Ala Lys Ile Val Giu Thr Leu Leu 545 550 555 560 TOG ATC ACT CAA TCC; TTC AAG ACC CAA AA TCG ACA OTT ACC CAC CTC 1728 Trp Ile Thr Cmn Trp Leu Lys Thr Gin Lys Trp Thr Vai Thr Asp Leu 565 570 575 TTT CTC ATC ACC ACC CCC ACT TAC ACC ACC ACT TTA ACC CCA CA.A ATT 17706 Phe Leu Met Thr Thr 
Ala Thr Tyr Ser Thr Thr Leu Thr Pro Giu Ile 580 585 590 -219- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96II 8003 ACC AAT CT-- ACG GCT ACG TTG TCT Sar Asn Leu Thr Ala Thr Leu Ser 595 600 TCA ACT TTG CA'1'T GGC .AA G.AC ACT 1824 Ser Thr Laeu His Gly Lys G.,u 3cr 605
CTC
Lau
GCT
Ala 625
ATT
Ile 610
TTG
Leu CAA CAT Glu Asp TTIG ACT Lau Thr
CTG
Leu
TCT
.kAA AGA GCA Lys Arg Ala 615 CAA GAA- GTT Gin Glu Val ATM C CCT TCC TTC ACT Me
C
Al c a is 630
ATA
Ile
CAA
Ciu ValI
ACC
Thr :LkA Lys 705
TTT
Phe
CAC
Asp
CTC
ValI
CTC
Lau
GAA
Ciu 690
AC
Ser
CAT
His
CAC
Gin
CAA
Gin
GCA
Ala 675
CTG
Leu
ATA
Ile
ACC
Thr
AT'!
Ile
ACA
Thr 660
CAA
Gin
TCA
Ser
CTM
Leu
TOG
Trp
CAA
*Gin 645
ACA
Thr
TTC
Leu
CTC
Leu
GAT
Asp
CTT
ValI 725
CCC
Pro
CCA
Pro
AC
Ser
ATC
Ile
CAC
His 710
AAT
Asn
GCA
Ala
ACC
Thr
CTC
Leu
CTC
ValI 695
CCT
Cly
GCC
C ly C CC TTG Aia Ala Leu
AAA
ATC
Met
AAC
Ly s
CAA
Gin 785
CCA
Ala
TCC
Trp
CAC
Gin
ATT
Ile
AAT
Asn
CAT
Asp 770
TG
Trp
CCC
Gly
CAA
Gin
AAA
Lys
AAT
As n 850
AAG
Lys 755
CTA
Leu Leu
ATG
Met
CCT
Ala
AAA
Lys 835
CCT
Ala Lys 740
CAC
Ciu
ACA
Thr
CAC
Gin
ATG
Met
C
Ala 820
CMC
Leu
CTT
Val GAC GGA Asp Cly
CAA
Giu
AAA
Lys
ATG
Met
CC
Ala 805
C
Ala
GAT
Asp
GTC
Val
TCT
Ser
CMG
Lau
TCT
Ser 790
CMC
Leu
CCT
Ala
GAG
Glu
CAT
Asp
CC
Ala
CTC
Leu
ACC
Thr 775
TCC
Ser
AAA
Lys
CC
Ala
ACC
Thr *CAA ATA ACT Gin Ilie Thr 650 ACC TTC, AAC Ser Leu Lys 665 ATC TAT COT Ile Tyr Arg 680 ACT CAA TCT Thr Gin Ser CMTC'TA ACC Leu Leu Thr TTC CCC CAA Leu Cly Gin 730 TTO ACA CTT Leu Thr Val 745 CTA CAA ATC Leu Cmn Met 760 ACT TCO ACA Ser Trp Thr CCC TTO CC C Ala Leu Ala TAT CCC ATA C Tyr Cly IleP 810 CM ATO GCT G Leu Met Ala A 825 rTC ACT AAC C Al.
TA'
T3
GT
Va
CTC
Val
CC'!
Arg
TCT
Ser
CTC
Leu 715
CAT
His
ACC
rhr
:;CA
kla
:AG
.win
.TT
Pal 195
AT
~sp
~AT
~sp a Pro Cys Phe 620 T CAC CTC CTC r Asp Leu Leu r' CAT CCC TTT IAsp Gly Phe ATT ACC TTT Slie Thr Phe 670 A'PT CCC TTA Ile Cly Leu 685 *CTC CTA CTC *Leu Leu Val 700 ATC CCC TTC Met Ala Leu CCC TCC TTC Ala Ser Leu CAT CTA CCA Asp Val Ala 750 CCT AAT CAC Aia Asn Gin 765 ATT CAC CCT Ile Asp Ala 780 TCT CCA CTC C Ser Pro Leu; CAT AAC TAT G Thr
TTC
rLeU Trp 655
GCT
Ala
ACT
Ser
CCA
Aia
GAA
C iu
ATA
Ile 735
CAA
Gln
.TG
1 *TCC 1872 *Ser TCC i920 Trp 640 GAA 1968 Giu CAC 2016 Gin C.AA 2064 Glu CCC 2112 Cly CCT 2160 Gly 720 TTC 2208 Leu CCT 2256 Ala GAG 2304 Glu k.TT CTM 2352 Ile Leu ;AT CTM 2400 ~sp Leu 800 ;CT CCC 2448 la Ala AG GCA 2496 In Ala His
CAT
His As n
GCT
Ala Ty r A 8 AAT C Asn C 830 CA ?TTA TCT AAC TAT TAT 2544 Phe 340 Ser Lys Ala Leu Cys Asn Tyr Tyr 845 ACT CCT CCT CCA GTA CCT Ser Ala Ala Cly Val Arg 855 860 CAT CCT AAC GCT 2592 Asp Arg Asn Gly TTA TAT ACC TAT TTG CTM ATT CAT AAT CAC Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gin -220- SUBSTITUTE SHEET (RULE 26) GTT TCT CCC CAT GTM ATC 2640 Val Ser Ala Asp Val Ile WO 97/17432 WO 97/ 7432PCTIUS96/1 8003 865 870 8715 8 ACT TCA COT ATT GCA GAA GCT ATC GCC GOT ATT CAA CTG TAC GTT .AC 2 68 8 Thr Ser Arg Ile Ala Glu Ala Ilie Ala Gly Ile Gin Lau ?yr Val *Asn 885 890 895 COG GCT TTA AAC CGA GAT GAA GGT CAG CTT GCA TCG GAC CTT ACT ACC 2730' Arg Ala Leu Asn Arg Asp Clu Gly Gin Leu Ala Sar Asp Val Ser Thr 900 905 910 CGT CAC TTC TTC ACT CAC TOG GAA CCT TAC AAT A-k CCT TAC ACT ACT 2784 Arg Gin Phe Phe Thr Asp Trp Giu Arg Tyr Asn Lys Arg Tyr Ser Thr 915 920 925 TCC GCT CCT CTC TCT GCA CTC CTC TAT TAT CCA C.A AAC TAT CTT CAT 2832 Trp Ala Cly Val Ser Ciu Leu Val Tyr Tyr Pro Giu Asri T[yr Val Asp 930 935 940 CCC ACT CAC CGC ATT CCC CAA ACC AAA ATC ATC CAT CC CTC TTG CAA 2880 Pro Thr Gin Arg Ilie Cly Cin Thr Lys Met Met Asp Ala Laeu Lau Gin 945 950 955 960 TCC ATC AAC CAC ACC CAC CTA AAT CC CAT ACG CTC CAA CAT CCT TTC 2928 Ser Ilie Asn Gin Scr Gin Leu Asn Ala Asp Thr Val Ciu Asp Ala Phe 965 970 975 AAA ACT TAT TTC ACC ACC T'IT CAC CAC GTA CCA A.AT CTC AAA CTA ATT 2976 Lys Thr Tyr Leu Thr 5cr Phe Ciu Cmn Val Ala Asn Leu Lys Val Ile 980 985 990 ACT GCT TAC CAC CAT AAT CTG AAT CTC CAT CA.A CCA TTA ACT TAT TTT 3024 Ser Ala Tyr His Asp Asn Val Asn Val Asp Gin Ciy Leu Thr Tyr Phe 995 1000 1005 ATC CCT ATC CAC CAA CCA CCT CCC CCT ACC TAT TAC TCC CCT ACT GTT 3072 Ilie Gly Ilie Asp Gin Ala Ala Pro Cly Thr Tyr Tlyr Trp Arg Ser Val 1010 1oi5 1020 CAT CAC ACC AAA TCT CAA AAT CCC AAG, TTT[ CCC CCT AAT CCT TOC CCT 3120 Asp His Ser Lys Cys Ciu Asn Gly Lys Phe Ala Ala Asn Ala Trp Cly 1025 1030 1035 1040 GAG TOG AAT AAA ATT ACC TCT CCT CTC AAT CCT TGC AAA AAT ATC ATC 3168 Giu Trp Asn Lys Ile Thr Cys Ala 
Val Asn Pro Trp Lys Asn Ilie Ile 1045 1050 1055 CCT CCC CTT CTT TAT ATC TCC CCC TTA TAT CTG CTA TOO CTC GAG CAG 32i6 Arq Pro Val Val Tyr Met Scr Arg Leu Tyr Leu Leu Trp Leu Ciu Gin 1060 1065 1070 CAA TCA AAC AAA ACT CAT CAT GOT AAA ACC ACC ATT TAT CAA TAT AAC 3264 Gin Ser Lys Lys Ser Asp Asp Cly Lys Thr Thr Ile Tyr Gin Tyir Asn 1075 1080 1085 TTA AA.A CTG GCT CAT AT? COT TAC CAC GOT ACT TCG AAT ACA CCA TTT 3312 Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100 ACT 'i-IT CAT GTC ACA GAA AAG GTA AA AAT TAC ACG TCC ACT ACT CAT 3360 Thr Phe Asp Val Thr Giu Lys Vai Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120 GCT GCT GAA TCT TTA CCC TTG TAT TGT ACT GOT TAT CAA CCC GAA CAC 3408 Ala Ala Giu Ser Leu Ciy Leu Tyr Cys Thr Gly Tyr Gin Gly Giu Asp 1125 1130 1135 ACT CTA TTA ATG TTC TAT TCC ATC CAC ACT ACT TAT AGC TCC TAT 3456 Thr Leu Lau Val Met Phe T[yr Ser Met Gin Ser Ser T[yr Ser Ser Tyr 1140 1145 1150 ACC CAT AAT AAT CC CCC GTC ACT CCC CTA TAT ATT TTC GCT CAT ATG 3504 -221- SUBSiTruT SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/18003 Th~r Asp Asn ,a sn Ala Pro Val Thr dly Lau T lie Phe Ala isp Met 1155 116c0 1165 TCA TCA GAC AA-:T ATG ACG AAT GCA CAA~ OCA ACT AC TA*1'T TOO !AT AAC 3552 ser Ser Asp Asn met Thr Asn Ala Gin Ala Thr A-,sn TrP Asn !.sn 1170 1175 1180 ACT TAT CCG CA~A TTT OAT ACT GTG ATO OCA OAT CC GAT AOC GAC A.T 3600 Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 1185 1190 1195 1200 A kA A: GTC ATA ACC AGA AGA OTT AAT AAC COT TAT OCO GAO OAT TAT 3648 Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala 01u Asp Tyr 151205 1210 1215 GAA ATT CCT TCC TCT OTO ACA ACT AAC ACT AAT TAT TCT TOG GOT OAT 3696 Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Oly Asp 1220 1225 1230 CAC ACT TTA ACC ATO CTT TAT GOT GOT ACT OTT CCT AAT ATT ACT TTT 3744 His Ser Leu Thr Met Leu Tyr Gly Oly Ser Val Pro Asn Ile Thr Phe 1235 1240 1245 OXA TCO OCO OCA GAA OAT TTA AGO CTA TCT ACC AAT ATO OCA TTGO ACT 3792 01u Ser Ala Ala 
Olu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 1250 1255 1260 ATT ATT CAT A.AT OGA TAT OCO OGA ACC COC COT ATA CAA TOT AAT CTT 3840 Ile Ile His Asn Oly Tyr Ala Oly Thr Arg Arg Ile Gin Cys Asn Leu 1265 1270 1275 1280 ATG AAA CAA TAC OCT TCA TTA GOT OAT AAA TTT1 ATA ATT TAT OAT TCA 3888 Met Lys Gin Tyr Ala Ser Leu Oly Asp Lys Phe Ile Ile TIyr Asp Ser 1285 1290 1295 TCA TTT OAT OAT OCA AAC COT TTT AAT CTG OTO CCA TTO TTT AAA TTC 3936 Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310 OGA AAA GAC GAG AAC TCA OAT OAT AGT ATT TOT ATA TAT AAT GAA AAC 3984 Gly Lys Asp Glu Asn Ser Asp Asp Ser Ilie Cys Ile Tyr Asn Giu. Asn 1315 1320 1325 CCT TCC TCT GAA OAT AAG AAG TOG TAT TTT TCT TCG AAA OAT GAC AAT 4032 Pro Ser Ser Giu Asp Lys Lys Trp Tyr Phe Scr Scr Lys Asp Asp Asn 1330 1335 1340 AAA ACA OCO OAT TAT AAT GGT OGA ACT CAA TOT ATA OAT OCT OGA ACC 4080 Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys Ile Asp Ala Oly Thr 1345 1350 1355 1360 ACT AAC AAA OAT TTT TAT TAT AAT CTC CAG GAG ATT GAA OTA ATT ACT 4128 Scr Asn Lys Asp Phe Tyr Tyr Asn Leu Gin iu Ile Olu, Val Ile Scr 1365 1370 1375 OTT ACT GOT 000 TAT TG TCG AOT TAT AAA ATA TCC AAC CCO ATT AAT 4176 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn 1380 1385 1390 ATC A.AT ACO GOC ATT OAT ACT OCT AAA OTA AAA GTC ACC OTA AAA OCO 4224 Ile Asn Thr Gly Ile Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405 GOT GOT GAC OAT CAA ATC TTT ACT OCT OAT AAT ACT ACC TAT OTT CCT 4272 Oly Gly Asp Asp Gin lie Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420 CAG CAA CCG OCA CCC ACT TTT GAG GAO ATO ATT TAT CAG TTC A.AT AAC 4320 Gin Gin Pro Ala Pro Ser Phe Olu Olu Met Ile Tyr Gin Phe Asn Asn 1425 1430 1435 1440 -222- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/1 8003 :TG A 'CA *A 'TA GA 'T TGT kAG AAT TTA AAT TTC ATC GA-c A.AT CAG C? 
AT 4363 Lau Thr Ile Asp Cys Lys Asn Lau isn Phe Ile Asp Asn Gin Ala His 1445 1450 1455 ATT GAG ATT GAT TTC ACC GCT ACC GCA C?.A GAT GCC CGA TTC TTG GGT 4416 Ilie Giu Ilie Asp Phe Thr Ala Thr Ala Gin Asp Cly Arg Phe Leu Gly 1460 1465 1470 G CA GAA ACT TTT ATT ATC CCG GTA ACT A.AA A.AA GTT CTC GGT ACT GAG 4464 1) Ala Giu Thr Phe Ilie Ilie Pro Val Thr Lys Lys Val Leu Gly Thr Giu 1475 1480 1485 A-AC GTG ATT CC TTA TAT AGC GAA A.AT AAC CGT GTT CAA TAT ATG CAA 4512 Asn Val Ilie Ala Leu TPyr Ser Giu Asn Asn Gly Val Gin Tyr Met Gin 1490 1495 1500 ATT GCC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTC 4560 Ilie Gly Ala Tyr Arg Thr Arg Lau Asn Thr Leu Phe Ala Gin Gin Leu 1505 1510 1515 1520 GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA CTG CTC AGT ATG GAA ACT 4608 Val Ser Arg Ala Asn Arg Cly Ilie Asp Ala Val Leu Ser Met Giu Thr 1525 1530 1535 CAC .kAT ATT CAC CAA CCC CA.A TTA GCA GCG GGC ACA TAT GTC CAG CTT 4656 Gin Asn Ilie Gin Giu Pro Gin Leu Gly Ala Cly Thr Tyr Val Gin Leu 1540 1545 1550 GTG TTC GAT AAA TAT CAT GAG TCT ATT CAT CGC ACT A.AT AAA ACC TTT 4704 Val Leu Asp Lys Tyr Asp Ciu Ser Ile His Gly Thr Asn Lys Ser Phe 1555 1560 1565 GCT ATT GAA TAT GTT GAT ATA TTT AAA GAG AAC GAT ACT 'PIT GTC ATT 4752 Ala Ile Ciu Tyr Val Asp Ile Phe Lys Glu Asn Asp Ser Phe Val Ile 1570 1575 1580 TAT CAA GCA GAA CTT AGC GAA ACA ACT CAA ACT G'IT CTC AAA CTT TTC 4800 Tyr Gin Gly Ciu Leu Ser Giu Thr Ser Gin Thr Val Val Lys Val Phe 1585 1590 1595 1600 TTA TCC TAT TTT ATA GAC CC ACT GGA A.AT AAG AAC CAC TTA TGG CTA 4848 Lau Ser Tyr Phe Ile Clu Ala Thr Cly Asn Lys Asn His Leu Trp Val 1605 1610 1615 CCT CCT AAA TAC CAA AAC CAA ACC ACT CAT AAG ATC TTC TTC CAC CGT 4896 Arg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys Ilie Leu Phe Asp Arg 1620 1625 1630 ACT CAT CAG AAA~ CAT CCC CAC GGT TCG T'IT CTC ACC CAC CAT CAC AAG 4944 Thr Asp Glu Lys Asp Pro His Cly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1645 ACC 'PIT ACT GGT CTC TCT TCC GCA CAC GCA TTA AAG AAC CAC ACT CAA 4992 Thr Phe Ser Cly Leu Ser Ser Ala 
Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660 CCC ATC CAT TTC TCT CCC CCC AAT CCT CTC TAT TTC TGC CAA CTG TTC 5040 Pro Met Asp Phe Ser Cly Ala Asn Ala Leu Tyr Phe Trp Clu Leu Phe 1665 1670 1675 1680 TAT TAC ACC CCC ATG ATC ATC CCT CAT CGT TTG TTC CAG CAA CAC AAT 5088 Tyr Tlyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Giu Gin Asn 1685 1690 1695 TTT CAT CC CC KAC CAT TCG TTC CGT TAT GTC TCG ACT CCA TCC GGT 5136 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro 5cr Gly 1700 1705 1710 TAT ATC GTT CAT CGT AAA ATT GCT ATC TAC CAC TGG AAC GTC CCA CCC 5184 7;r Ile Val Asp Cly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro 1715 1720 1725 -223- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96/1 8003 CTG GAA GA-A GAC ACC AGT TGG AAT GCA CAA C.A CTG GAC TCC ACC GAT 5-732 Lau Giu Glu Asp Thr ser Trp Asn Ala Gin Gin Lau Asp Ser Thr Asp 1730 1735 1740 CCA; GAT GCT GTA GCC CAA GAT GAT CCG ATG CAC TAC AAG GTG GCT ACC 5280 Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760 TTT ATG GCG ACG TTG GAT CTG CTA ATG GCC CT GGT GAT GCT GCT TAC 5328 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Trr 1765 1770 1775 CGC CAG TTA GAG CGT GAT ACG TTG GCT GAA GCT AAA ATO TGG TAT ACA 5376 Arg Gin Lau Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 1780 1785 1790 CAG GCG CTT A.AT CTG TTG GGT CAT GAG CCA CAA GTG ATG CTG AGT ACG 5424 Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 1795 1800 1805 ACT TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAG 5472 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 1810 1815 1820 CAG GTT COT CAG CAA GTG CTT ACC CAG TTC CGT CTC AAT AGC AGG GTA 5520 Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840 AAA ACC CCG TTG 5532 Lys Thr Pro Leu 1844 INFORMATION FOR SEQ ID NO:53: SEQUENCE CHARACTERISTICS: LENGTH: 1844 amino acids TYPE: amino acids STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein Phe 1 Ala Thr Ili r Gin Leu 
(xi) SEQUENCE DESCRIPT Features From To Peptide 1 1844 Fragment 1 11 Fragment 978 990 Fragment 1387 1401 Fragment 1484 1505 Fragment 1527 1552 Ilie Gin Gly Tyr Ser Asp Leu 5 Ala Pro Gly Ser Val Ala Ser Glu Leu Tlyr Arg Glu Ala Lys 40 Tyr Leu Asp Lys Arg Arg Pro 55 Lys Asn Met Asp Glu Giu Ile 70 CYs Leu Ala Gly Ilie Glu Thr ION: SEQ ID NO:53 (TcbAii): Description TcbAii peptide (SEQ ID NO:1) (SEQ ID NO:23) (SEQ ID NO:22) (SEQ ID NO:24) (SEQ ID NO:21) Phe Gly Asn Arg Ala Asp Asn Tyr 10 Met Phe Ser Pro Ala Ala Tyr Leu 25 Asn Leu His Asp Ser Ser Ser Ile Asp Leu Ala Ser Leu Met Leu Ser 5cr Thr Leu Ala Leu 5cr Asn Giu 75 Lys Thr Gly Lys Ser Gin Asp Giu -224- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 Val Met His His Gly Phe 130 Pro Val 145 Asn Leu Thr Leu Ser Pro Ala Tyr 210 Leu Val 225 Val Thr Glu Leu Ser Asn Ser Ala 290 Ile Asn 305 Asp Asn Asn Phe Phe Leu Leu Ser 370 Lys Ser 385 Tyr Ile Asn Ile Glu Gin Ser Glu 450 Asp Ala 115 Arg Thr Leu Tyr Ser 195 Val Ile Arg Tyr Ser 275 Asp Gin Ile Ala Leu 355 Phe Ile Asp Asn Leu 435 Asp Met 100 Tyr His Leu Ile Lys 180 Tyr Thr Pro Thr Pro 260 Phe Trp Lys Leu Ala 340 Lys Ala Thr Arg Ile 420 Phe Asn Leu Glu Leu Leu Glu 165 Thr Leu Thr Leu Pro 245 Gin Gly Thr Tyr Ser 325 Ala Met Thr Val Tyr 405 Ser Asn Ser Ser Thr Ser Gly 150 Glu Asn Ala Ser Val 230 Ser Gly Leu Glu Glu 310 Ile Asn Asn Leu Glu 390 Gly Gin His Lys Thr Val Gin 135 Ile Ile Phe Arg Leu 215 Asp Asp Gly Asp Ile 295 Ser Gly Phe Lys Glu 375 Val Ile Gin Pro His 455 Arg 120 Ala Ser Pro Gly Tyr 200 Ser Gly Asn Asp Asp 280 Ala Gln Leu Lys Ala 360 Arg Leu Ser Ala Pro 440 Leu Arg 105 Glu Pro Ser Glu Asp 185 Tyr His Val Tyr Asn 265 Phe His Ala Gin Ile 345 Ile Ile Asn Glu Val 425 Leu Pro Leu Ile Ile His Lys 170 Ile Gly Val Gly Thr 250 Tyr Tyr Asn Thr Arg 330 Asp Arg Val Lys Glu 410 Gly Asn Asn Ser Val Val Ile 155 Asp Thr Val Gly Lys 235 Ser Leu Leu Pro Ile 315 Trp Gin Leu Asp Val 395 Thr Asn Gly Pro Gly His Ala 140 Ser Glu Thr Ser Tyr 220 Met Gin lle Gin Tyr 300 Lys His Tyr 
Leu Ser 380 Tyr Ala Gin Ile Asp 460 Glu Glu 125 Ala Pro Ala Ala Pro 205 Ser Glu Thr Lys Tyr 285 Pro Arg Ser Ser Lys 365 Val Arg Ala Leu Arg 445 Leu Thr 110 Arg Lys Glu Ala Gin 190 Glu Ser Val Asn Tyr 270 Lys Asp Ser Gly Pro 350 Ala Asn Val Ile Ser 430 Tyr Asn Pro Asp Leu Leu Leu 175 Leu Asp Asp Val Tyr 255 Asn Asp Met Asp Ser 335 Lys Thr Ser Lys Leu 415 Gin Glu Leu -225- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCT/US96/18003 Pro Asp Ser Thr 31y Asp 465 470 Asp Gin A-rg Lys Ala val L-u Lys Ar A la 475 430 Phe Arg 101 Laeu Giu Tyr4 545 Trp Phe Ser Leu Ala 625 Ile Glu Val Thr Lys 705 Phe Ala Met Gin Lys 'y r Leu 530 Gin Ilie Leu Asn Ilie 610 Leu Asp ValI Leu Glu 690 Ser His Ala As r Val Glu Leu 515 Asn Ile Thr Met Leu 595 Gly His Gin Gin Ala 675 Leu Ilie Thr Leu Lys 755 %.spC 500 Val rhr Thr 580 Thr Giu Leu Ile Thr 660 Gin Ser Lau Trp Lys 740 Glu ~85 ;ly ser Leu As p rrp 565 Thr Ala Asp Thr Gin 645 Thr Leu Lau As; Val 721 Asi Gli Ser G Val I Leu L Leu V 5 Asp A 550 Leu L Ala I1 Thr Leu Ser 630 Pro Pro Ser Ile His 710 Asn Gly u Ser s Leu t Ser 790 a Laeu 5 Iu le eu 'al .35 snf 'y [hr eu L~ys 615 ;in Alia Thr Leu Val1 695 Leu Lys Ala 520 Ile Leu Thr Tyr Ser 600 Arg GlU Gin Ser Ilie 680 Thr Z.,r Asn 505 Gin Cys Ala Gin Ser 585 Ser Ala ValI Ilie Leu 665 Tyr Gin Gin Met Lau Leu Ilie Thr Asp 490 As n Ile G ly Lys Lys 570 Thr Thr Met Ala Thr 650 Ly s Arg Ser Leu Giu A His Asn L.
rl'/r Gly A 540 Ilie Val G 555 Trp Thr V Thr Leu TI Leu His C Ala Pro C 620 Tyr AspI 635 Val AspC Val Ilie Arg Ilie *Ser Leu 700 *Leu Met 715 i His Aia 1 Thr Asp tAla Aia r Gin Ile 780 a Val Ser 795 e Asp His 0 a Asp His s n eu sp iu 'a 1 'hr ly 405 -ys eu ;Ay rhr Gly 685 Leu Ala Ser Val As 76~ As PrI As: Al Leu 510 Thr Thr Thr Thr Pro 590 Ly s Phe Leu Phe Phe 670 Leu Val1 Leu Leiu Al SGIr p Al o Lel n Ty1 a As 83 Ser Asp Ile Ala Asn Ile Lau Leu 560 Asp Leu 575 Giu Ilie Glu Ser Thr Ser Leu Trp 640 Trp Giu 655 Ala Gin Ser Glu Ala Gly Glu Gly 720 Ile Leu 735 x Gin Ala -i Val Giu Ile Leu Asp Leu 800 r Al.a Ala 815 n Gin Ala 0 Gly Leu Leu Thi Gly Ala Leu Thr 775 Ser Ly s Leu Leu Leu 760 Ser Ala Tyr Gly Thr 745 Gin Trp Leu G ly Gin 73( Va Mel Th: Lys Asp 770 Leu Thr Ly Gin 785 Ala Trp G iy Leu Met Gin Met Me Al1 s0 Trp Gin Ala Ala Aia Ala Ala Leu 820 met Ai 825 Gin Lys Lys 835 Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys 840 845 -226- Asn T~yr Tyr SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PTU9/80 PCT/US96/18003 Ile Asn Ala Val *,al Asp Ser Ala Ala Gly Val Arg Asp Arq Asri Gly 3 50 855 860 Leu -TyZr Thr 71,r Leu Leu Ile Asp Asn Gin Val Ser Ala Asp Val Ile 865 870 875 880 Thr Ser Arg Ilie Ala Giu Ala Ile Ala Gly Ile Gin Leu T yr Val Asn 885 890 895 1 0 Arg Ala Leu Asn Arg Asp Giu Giy Gin Leu Ala Ser Asp Val Ser Thr 900 905 910 Arg Gin Phe Phe Thr Asp Trp Giu Arg Tyr Asn Lys Arg Tyr Ser Thr 915 920 925 Trp Ala Gly Vai Ser Giu Leu Val Tyr Tyr Pro Giu Asn Tyr Val Asp 930 935 940 Pro Thr Gin Arg Ile Giy Gin Thr Lys Met Met Asp Ala Leu Leu Gin 945 950 955 960 Ser Ile Asn Gin Ser Gin Leu Asn Ala Asp Thr Vai Giu Asp Ala Phe 965 970 975 Lys Thr Tyr Leu Thr Ser Phe Giu Gin Val Ala Asn Lau Lys Val Ile 980 985 990 Ser Ala Tyr His Asp Asn Vai Asn Val Asp Gin Giy Leu Thr Tyr Phe 995 1000 i005 Ilie Gly Ile Asp Gin Aia Aia Pro Giy Thr Tyr Tyr Trp Arg Ser Val 1010 1015 1020 Asp His Ser Lys Cys Giu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 1025 1030 1035 1040 Giu Trp Asn Lys Ile Thr Cys Ala Val 
Asn Pro Trp Lys Asn Ile Ile 1045 1050 1055 Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Giu Gin 1060 1065 1070 Gin Ser Lys Lys Ser Asp Asp Giy Lys Thr Thr Ile Tyr Gin Tyr Asn 1075 1080 1085 Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100 Thr Phe Asp Val Thr Giu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120 Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Giu Asp 1125 1130 1135 Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 1140 1145 1150 Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr Ile Phe Ala Asp Met 1155 1160 1165 Ser Ser Asp Asn Met Thr Asn Ala Gin Ala Thr Asn Tyr Trp Asn Asn 1170 1175 1180 Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 1185 1190 1195 1200 Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala Giu Asp Tyr 1205 1210 1215 Giu Ilie Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp -227- SUBSTrrUTE SHEET (RULE WO 97/1 7432 PTU9/80 PCT/US96/18003 1220 1225 1230 His Ser Leu Thr Met Leu 7Tyr Sly Sly Ser Val Pro Asn Ile Thr Phe 1235 1240 1245 Giu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 1250 1255 1260 Ile Ile His Asn Giy rlyr Ala Sly Thr Arg Arg Ile GIn Cys Asn Leu 1265 1270 1275 1280 Met Lys Gin Prr Ala Ser Leu Gly Asp Lys Phe Ile Ile T1yr Asp Ser 1285 1290 1295 Ser Phe Asp Asp Ala Asn Arg* Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310 Gly Lys Asp Glu Asn Ser Asp Asp Ser Ile Cys Ile Tyr Asn Glu Asn 1315 1320 1325 Pro Ser Ser Giu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 1330 1335 1340 Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys Ile Asp Ala Sly Thr 1345 1350 1355 1360 Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Giu Ilie Glu Val Ile Ser 1365 '1370 1375 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn 1380 1385 1390 Ile Asn Thr Gly Ile Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405 Sly Gly Asp Asp Gin Ile Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420 GIn Gin Pro Ala Pro Ser Phe Giu Glu Met Ile Tyr 
Gin Phe Asn Asn 1425 1430 1435 1440 Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gin Ala His 1445 1450 1455 Ile Glu Ilie Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Sly 1460 1465 1470 Ala Glu Thr Phe Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Giu 1475 1480 1485 Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Sly Val Gin Tyr Met Gin 1490 1495 1500 Ile Sly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Sin Leu 1505 1510 1515 1520 Val Ser Arg Ala Asn Arg Sly Ile Asp Ala Val Leu Ser Met Slu Thr 1525 1530 1535 Sin Asn Ile Gin Glu Pro Sin Leu Sly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550 Val Leu Asp Lys Tyr Asp Glu Ser Ile His sly Thr Asn Lys Ser Phe 1555 1560 1565 Ala Ilie Giu Tyr Val Asp Ile Phe Lys Slu Asn Asp Ser Phe Val Ile 1570 1575 1580 Tyr Sin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1585 1590 1595 1600 -228- SUBSTITE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96/1 8003 Leu Ser T1yr Phe Ile Giu Ala Thr Gly Asn Lys sn His Lau Trp Val 1605 1610 1615 Arg Ala Lys Tyr Gin Lys Giu Thr Thr Asp Lys Ile Leu Phe Asp Arg 1620 162 5 1630 Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1645 Thr Phe Ser GIy Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660 Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Ty'r Phe Trp Giu Leu Phe 1665 1670 1675 1680 l?"jr TIyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Giu Gin Asn 1685 1690 1695 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710 TyIr Ilie Val Asp Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro 1715 1720 1725 Leu Giu Glu Asp Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 1730 1735 1740 Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 1765 .1770 1775 Arg Gin Leu Glu Arg Asp Thr Leu Ala Giu Ala Lys Met Trp Tyr Thr 1780 1785 1790 Gin Ala Leu Asn Leu Leu Gly Asp Giu Pro Gin Val Met Leu Ser Thr 1795 1800 1805 Thr Trp Ala Asn Pro Thr Leu Gly Asri Ala Ala 
Ser Lys Thr Thr Gin 1810 1815 1820 Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840 Lys Thr Pro Leu 1844 INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS: LENGTH: 1722 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54 (TcbAiii coding region;: CTA GGA ACA GCC AAT Tcc CTG ACC GCT TTA T'rc CTG ccG CAG GAA AAT 48 Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gin Giu Asn 1 5 10 AGC AAG CTC AAA GCG TAG TGG GGG ACA CTG GCG CAG CGT ATG TTT AAT 96 Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn 25 -229- SUBSTITUTE SHEET (RULE WO 97/17432 TTA Cc Leu Ai TAT G Tyr A GCT TC Ala se CGC T'I Arg Ph- ATA CA Ilie GI GAAz CC Glu Al ACC AG Thr Se 13 AAA AC Lys Th 145 AGC TA Ser Ty GCG CT Ala Le' ATT TCi Ile Se.
GGC CT( Gly Let 211 GCT CA( Ala Asi 225 AAA G7.
Lys Va.
ATT CA( Ilie GIr CTG GAJ Leu GlI TAC CTC Tlyr Le.
29( AGA AGCC Arg Sex 305
;T
'C
T
r
C
e
T
a
TI
r r
T
r1
CAT
His 35
AAA
Lys
CAA
Gln
CCT
Pro
TTC
Phe
ATG
Met 115
ATT
Ile
CC
Ala
AGC
Ser
GCG
Ala
CGT
Arg 195
GCT
Ala
GGT
Gly
GCT
Ala
CGT
Arg
TCA
Ser 275 AAA2 Lys
AA
Lys
AAT
Asn
CCG
Pro
GGG
Gly
CAA
Gln
GGT
Gly 100
AGT
Ser
CGT
Arg
TTG
Leu
CAA
Gln
TTA
Leu 180
ATG
Met
GAT
Asp
ATT
Ile
CAG
Gln
CAC
Asp 260
CTG
Leu ACC Thr
TTC
Phe
CTO
Leu
GCT
Ala
GGA
Gly
ATG
Met 85
ACT
Ser
CAA
Gln
ATG
Met
CAA
Gln
CTG
Leu 165
CGC
Arg
TCG
Ser
GAT
Asp
CC
Ala 70
CTA
Leu
TCA
Ser
CTA
Leu
CAG
Gln
GTC
Val 150
TAT
Tyr
TCA
Ser
ATT
Ile
CCA
Pro 55
GAC
Asp
GAA
Glu
CTA
Leu
CTG
Leu
CAT
Asp 135
TCT
Ser
GAG
Glu
GAA
Glu
GCG
Ala
ATG
Met 215
ACT
Ser
ATA
Ile
CAA
Gln
CC
Arg
GCT
Ala 295
CAAC
Gin C1AC Asp 40
AAA
Lys
TTG
Leu
GGG
Gly Leu
CAA
Gln 120
AAC
Asn
TTA
Leu
GAG
Glu
TCT
Ser
GGT
Gly 200
CAT
His dCT Ala
TAT
Tyr 3CG Ala
CGT
Arg 280
CAG
Gln
'CG
CCC CAG -1l1y Gin GCT TTA Ala Leu CCC AAG Pro Lys GCA CGG Ala Arg 90 CCC TAC Gly Tyr 105 ACC CAA Thr Gln CAA TTG GIn Leu GCT GGA Ala Cly AAC ATC Asn Ile 170 GCT A IT Ala Ile 185 GTT GAT Val Asp TAT CGT Tyr Gly TCT GCC Ser Ala CGC CGT Arg Arg 250 GAG ATI' Glu Ile 265 GAA GCC Glu Ala GCG CAG Ala Gin TTA TAT Leu Tyr -230- CC:3 Pro
CTG
Leu
GCG
Ala 75
GGC
Gly
ACT
Ser
CC
Ala
GCA
Ala
GTC
Val 155
AAC
Asn
GAG
Glu
ATG
Met
GCT
Ala
AAG
Lys 235
CGC
Arg
AAC
Asn
GCT
Ala
GCA
Ala
ACT
Ser 315
CTC
Leu
AGT
Ser 60
CCG
Pro
TTG
Leu
GAG
Glu
AGC
Ser
GAG
Glu 140
CAA
Gln
GCA
Ala
TCT
Ser
GCA
Ala
ATT
Ile 220
ATG
Met
CGT
Arg
CAG
Gln
GAA
Glu
CAA
Gln 300
TGG
Trp
TCC
Ser 45
GCG
Ala
CTG
Leu
GTT
Val
CGT
Arg
GAG
Glu 125
CTG
Leu
CAA
Gln
GGT
Gly
CAC
Gin
CCA
Pro 205
GCC
Ala
GTT
Val
CAA
Gln
TTA
Leu
ATG
Met 285
CTT
Leu
TTA
Leu
TT-,
Leu
CCG
Ala
ACT
Thr
AAC
Asn
CAG
Gln 110
TTA
Leu
GAT
Asp
CGG
Arg
GAG
Glu
GGA
Gly 190
AAT
Asn
TAT
Tyr
GAT
Asp
GAA
Glu
AAC
Asn 270
CAA
Gln
ACT
Thr
CGA
Arg PCT/US96/1 8003 CCGC OT 144 Pro Leu OTT TCA 192 Val Ser ATT CAC 240 Ile His CAG CTT 288 Gin Leu GAT GCG 336 Asp Ala ATA CTG 384 Ile Leu TCG GAA 432 Ser Giu TTT GAC 480 Phe Asp 160 CAG CGA 528 Gin Arg 175 GCG CAG 576 Ala Gin ATC TTC 624 Ilie Phe GCC ATC 672 Ala Ile GCG GAG 720 Ala Glu 240 TCG AAA 768 Trp Lys 255 GCG CAA 816 Ala Gin, A.AA GAG 864 Lys Giu TTC TTA 912 Phe Leu GGG CGT 960 Gly Arg 320 GCA GGC Ala Gly GGC GGC Gly Gly GAG TTG Giu Leu 230 rcG GAA Ser Glu 245 PLAC GCA Psn Ala TCT ATT Ser Ile CAG CAA ;In Gin kGT AAT Ser Asn 310 SUBSTmJUTE SHEET (RULE 26) WO 97/1 7432 TTC T--A Leu Ser CTG ATG Leu Met
GCT
Gly
GCA
Ala
ATT
Ile
GAG
Glu 340
TAT
Tyr 325
CAA
Gln
TAT
Tyr
TGG
Trp 345
GAC
Asp 330 G AA Giu
TTG
Leu
OCT
Ala
TCA
Ser
AAT
As n 350 PCTUS96/1 8003 CGT TC iOC,8 Arg cys 335 TCC ATT 1056 Ser Ile :Gcc Ser
TGT
Cys
CTG
Leu 385
GCA
Ala
GAA
Glu AAA Lys
TTI
Phe
GGA
Gly 370
AAA
Lys
GTG
Val
CAA
Gln
GAA
Glu
GTC
Val 355
GAA
Glu
TG
Trp
CTT
Val
ATA
Ile
AAT
Asn 435
AAA
Lys
CCT
Ala
GAA
Glu
TAT
Tyr
CCT
Pro 420
GGC
Gly
CCC
Pro
TTC
Leu
GT
Gly
ATA
Ile
CCA
Ala
CAA
Gln 375
ACI
Thr
CAA
Gin TAC CC GcC TCT CCC CCT Ser
GAT
Asp 405
GCA
Ala
TTA
Leu Arg 390
TCA
Ser Leu
TCA
Ser Ala
CTC
Leu
TTC
Leu
TTC
Leu Lys
OCT
Gly 465
CCA
Ala
GC
Gly
CCT
Gly
TAC
Tyr
CTT
Leu 545
ATC
Met TTG TCC CAC TTC AAA Leu 450
AC
Ser
TTC
Leu
ACT
Ser
ACC
Thr
CTC
Leu 530
CAA
Gln
AC
Ser Ser
AAC
Asn
CTT
Val
ACT
Thr
AAT
Asn 515
CCA
Pro Phe
CAT
Asp Asp Leu Lys AAC CTT Lys Val CCC CCT Gly Pro 485 CAA TTG Gin Leu 500 CAT ACT Asp Ser 'rrr CA? Phe Giu CCC AAT Pro Asn ATT ATT Ile Ile 565
CCT
Arg 470
TAT
Tyr
CCG
Pro
CCT
Gly
GGT
Gly
CCT
Ala 550
TTG
Leu
CTG
Leu 455
CCT
Arg
CAG
Gln
AAA
Lys
CAG
Gin AT'r Ile 535
ACC
Thr TTVG GAA CTA CAA Leu Giu Vai Ciu 395 CAA CGT AAT CAT GiU Cly Asri Asp 410 CAT AAG CCC GAG Asp Lys Cly Ciu 425 GCT AAT GCT ATC Aia Asn Ala.Ile 440 CCA ACG CAT TAT Gly Thr Asp Tyr A'IT AAC CAA ATC Ile Lys Gin Ile 475 CAT GTT CAG CCT Asp Vai Gin Ala 490 GGT TOT TCA C Cly Cys Ser Ala 505 TTC CAG ?I'G CAT Phe Gin Leu Asp 520 Tyzr Ala 365 ATG CA?.
Met Ciu 380 CCC ACC Arg Thr CCT Tm-r Arg Phe CCA ACA Cly Thr CTG TCA Leu Ser 445 CCA CAC Pro Asp 460 ACT CTT Ser Val PTG CTC Met Leu TTG GCT Leu Aia TTC AAT Phe Asn Gly
GAG
Glu
GTT
Val
AAT
Asn
GCA
Ala 430
GCT
Ala
ACT
Ser
TCG
Ser
AGC
Ser
GTG
Val 510
GAC
Asp
TT
Le~
GC)Z
IA14
TC;
Ser
TTA
Leu 415
CC?
Gly
TCC
Ser
ATC
Ile
CTA
Leu
TAT
Tyr 495
TCT
Ser
GC
Gly
CTG
Leu
CAA.
Gin kTTG 1104 .Leu STAT 1i52 ITyr TTC 1200 Leu 400 CC 1242 IAia ACT 1296 Thr CTC 1344 Val CTT 1392 Val CCT 1440 Pro 480 CCT 1488 Cly CAT 1536 His A.AA 1584 Ly s AAT 1632 As n ACT 1680 Thr 560 1722 GCT CTT CAT Ala Leu Asp ,AC AAG CAG ksp Lys Gin
CAT
Asp
AAA
Lys 555 CAT ATT CCT His Ile Arg TAT ACC ATC CCT TAA Tyr Thr Ilie Arg..
570 573 INFORMATION FOR SEQ ID NO:55: SEQUENCE CHARACTERISTICS:
LENGTH: 573 amino acids TYPE: amino acids -231- SUBSTITUTE SHEET (RULE 26), WO 97/17432 WO 9717432PCT/US96/18003 Leu Se r Leu Tyr Ala Arg Ile Glu Thr Lys 145 Ser Ala le Giy Ala 225 Lys Ile Leu T'yr Arg 305 Leu (ii) (xi) G17 Thr Lys Leu Arg His Ala Lys Ser Gin Phe Pro Gin Phe Ala Met 115 Ser Ile 130 Thr Ala Tyr Ser Leu Ala Ser Arg 195 Leu Ala 210 Asp Gly Val Ala Gin Arg Giu Ser 275 Leu Lys 290 Ser Lys Ser Giy STRANDEDNESS: sirngie, TOPOLOGY: linear MOLECULE TYPE: protein SEQUENCE DESCRIPTION: SEQ ID NO:55 (TcbAiii): Ala Asn Ser Leu Thr Ala Phe Leu Pro Gin Giu Asn Lys As n Pro G ly Gin Gly 100 Ser Arg Leu Gin Leu 180 Met Asp Ile Gin Asp 260 Leu Thr Phe Ile Gly Leu Ala G ly Met 85 Ser Gin Met Gin Leu 165 Arg Ala Gly Giu Ser 245 Asn S er Gin Ser Tyr 325 TI',?r Ser Asp Aia 70 Leu Ser Leu Gin Val1 150 Tyr Ser Gly Gly Leu 230 Giu Ala Ilie Gin As n 310 Phe Trp Ile Pro 55 Asp Glu Leu Leu Asp 135 Ser Giu Giu Ala Met 215 Ser Ile Gin Arg Ala 295 Gin Gin Arg Asp Lys Leu G ly Leu Gin 120 As n Leu Giu Ser Gly 200 His Ala Tyr Ala Arg 280 Gin Ala Phe Thr Leu Gly Gin Ala Leu Pro Lys Ala Arg Giy Tyr 105 Thr Gin Gin Leu Ala Gly Asn Ilie 170 Ala Ilie 185 Val Asp Tyr Gly Ser Ala Arg Arg 250 Giu Ile 265 Giu Ala Ala Gin Leu Ty r Tyr Asp 330 Ala Pro Leu Ala Gly Ser Ala Ala Val 155 Asn Giu Met Ala Lys 235 Arg Asn Ala Ala Ser 315 Gin Leu Ser Pro Leu Giu Ser Giu 140 Gin Ala Ser Ala Ile 220 Met Arg Gin Giu Gin 300 Trp Leu Ala Val Ser Arg Cys 335 -232- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PTU9/80 PCT/US96/18003 Lau Met Ala Giu Gin Ser 1 'r Gin TrP Giu Ala Asn Asp Asn Ser Ile 340 345 350 Ser Phe Vai Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu 355 360 365 Cys Gly Glu Ala Leu Ilie Gin Asn Leu Ala Gin Met Glu Giu Ala Tyr 370 375 330 Leu Lys Trp Giu 5cr Arg Ala Leu Giu Val Giu Arg Thr Val Ser Leu 385 390 395 400 Ala Val Val Tyr Asp 5cr Leu Giu Gly Asn Asp Arg Phe Asn Leu Ala 405 410 415 Giu Gin Ile Pro Ala Leu Leu Asp Lys Gly Giu Gly Thr Aia Gly Thr 420 425 430 Lys Giu Asn Giy Leu Ser Leu 
Ala Asn Ala Ile Leu Ser Ala Ser Val 435 440 445 Lys Leu Ser Asp Leu Lys Lau Gly Thr Asp Tyr Pro Asp Ser Ile Val 450 455 460 Gly Ser Asn Lys Val Arg Arg Ile Lys Gin Ile 5cr Val ser Leu Pro 465 470 475 480 Ala Leu Val Gly Pro Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly 485 490 495 Gly Ser Thr Gin Leu Pro Lys Gly CysSe5r Aia Leu Ala Val 5cr His 500 505 510 Gly Thr Asn Asp 5cr Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys 515 520 525 Tyr Leu Pro Phe Giu Gly Ile Ala Leu Asp Asp Gin Gly Thr Lau Asn 530 535 540 Leu Gin Phe Pro Asn Ala Thr Asp Lys Gin Lys Ala Ile Leu Gin Thr 545 550 555 560 Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg..
565 570 573 INFORMATION FOR SEQ ID NO:56 SEQUENCE CHARACTERISTICS: LENGTH: 2898 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56 (tccA) 1 ATG AAT CAA CTC GCC AGT ccc CTG ATT TCC cGc ACC GAA GAG ATC CAC 48 1 Met Asn Gin Leu Ala Ser Pro Leu Ile Scr Arg Thr Giu Glu Ile His 16 49 AAC TTA ccc GGT AAA TTG ACC GAT CTT GGT TAT ACC TCA GTG TTT GAT 96 17 Asn Leu Pro Gly Lys Leu Thr ASP Leu Gly Tyr Thr Ser Val Phe Asp 32 97 GTG OTA CGT ATG CCG CGT GAG CGT TTT ATT CGT GAG CAT CGT GCT GAT 144 -233- SUBSTITUE SHEET (RULE 26) WO 97/17432 WO 9717432PCT/US96I1 8003 33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 145 CTC GGG CGC AGT GCT GA. *AAA ATG TAT GAC CTG GCA GTG GGC TAT GCT 49 193 241 81 289 97 337 113 385 129 433 145 481 161 529 177 577 193 625 209 673 225 721 241 769 Leu
CAT
His
CAG
Gln
AAT
Asn
GGA
Gly
TAT
Tyr
AAT
Asn
GAT
Asp
ATT
Ile
GCG
Ala
TAT
Tyr
ACT
Thr
AAC
Asn
GCT
Gly
CAG
Gln
TT
Phe
CAG
Gln
TCA
Ser
CAA
Gln
ACG
Thr
AA
Lys
CTG
Leu
GTA
Val
CAT
His
ACG
Thr
TTC
Phe
TTG
Arg
GTG
Val
GGC
Gly Phe
CCG
Pro
TTG
Leu CTG3 Leu
GCA
Ala
TCC
Ser
AAC
Asn
TAT
Tyr Leu
TGG
Trp Ser
TTG
Leu
CTT
Leu
GAA
Glu
GCC
Ala
GCG
Ala
ATC
Ile
AAA
Lys
GCC
Ala
GGT
Gly
CAA
Gln
GCA
Ala Alia
CAC
His
AGA
Arg
GAT
Asp
GCC
Ala
CTT
Leu
GAG
Glu
AAT
Asn
GCT
Ala
AGA
Arg
CAT
His
GAT
Asp
AC?~
Thr Glu
CAT
His
AGT
Ser
GCA
Ala
AAT
Asn
GAA
Glu
CGT
Arg
GAG
Glu Ile Leu
CAG
Gln
ATC
Ile
GC;
*Aid Lys
TTT
Phe
CCG
Pro
AAC
Asn
GAT
Asp
CAG
Gln
CGC
Arg
GTG
Val
CAG
Gln
TCC
Ser
CAG
Gln
ACTI
Thr
~AA;
Lys Met CGC C Arg TTC 1I Phe S
ACGC
Thr C
GCG
Ala
GAA
Glu
CCC
Pro ATA4 Ile
AAG
Lys
ACT
Thr
ATT
Ile TrG Leu
GGA
Gly
ATG
Met r' GTC ,yr
GT
~rg 'cc ;er
;GT
ly
:CG
Pro
A.AG
GAT
Asp
CCG
Pro
AAA
Lys
ACC
Thr
CAG
Gln
CCA
Pro
AAJ
Lys
GCC
Ala
GG
Asp
AAT
Asn
GTA
Val
TGG
Trp
GTA
Val
AAT
Asn
CTG
Leu
CAA
Gln
CTG
Leu
CGT
Arg
ACA
Thr
CAC
Gir
CTC
Leu
AG~
SeJ r CA( Leu A TCT C Ser L TCA G Ser G AAA C Lys A~ GCC 'I Ala GGC C Gly
GGT
Gly Leu
AGT
Ser
TAC
Tyr
GCT
Ala
ACG
Thr
AGC
1Ser r' CAG r Gin
GAT
lia
TT
au
GC
ly
~AT
L5p
~AT
.~r
;CT
CAG
Gin rrG Leu
CCG
Pro
CAA
Gln
CTC
Leu
GA)
Asj PhE
TT(
ValI
AGT
Ser
CCG
Pro
AAA
Lys
CTG
Leu
ACT
Thr Leu Leu
ACT
Thr
AAT
Asn
TCG
Ser
GAT)
Asp
'ACC
Thi
~TCC
Sel
CTA'
Gly GAA G Giu A GAT TI Asp GCA C Ala E ACT C Thr ACC2 Thr
TTA
Leu
GTC
Val
GAT
Asp
AAT
Asn
GTA
Val
CTG
Leu
ACT
Thr
,CCA
*Pro r CAG r Gin -yr
CT
la 'Ac 'yr
CA
~ro
AT
4is
'%TT
Ile
ATT
Ile
AAT
Asn
CTG
Leu
CTG
Leu
TTG
Leu
CCC
Pro
GCC
Ala
GAC
GIl
CT)
Let Ala
GTT
Val
GCC
Ala
AGT
Ser
ATT
Ile
ATG
Met
AAT
Asn
GAA
Glu
GAA
Glu
CCG
Pro
GGT
Gly
CAA
Gln
AGT
Ser
CAG
Gln
)AAC
Asn 192 64 2 288 96 336 11.1 384 128 432 144 480 160 528 176 576 192 624 208 672 224 720 240 768 256 3 16 272 864 233 ACC CGA CTG CAA ATC 257 Ala Leu Thr Arg Leu Gin Il 817 CAG AAA ATC A'rr ACG GAG AC' 273 Gin Lys Ile Ilie Thr Giu Thr Vai Gly Gin Asp Phe Tyl 865 TAT GGT GAC 289 Tyr Giy Asp AGT TCG CTT ACT GTG AATA Ser Ser Leu Thr Val Asn S -234- SUBSTITUTE SHEET (RULE 26) .GT TTC AGC GAC ATG ACC ATA er Phe Ser Asp Met Thr Ile WO 97/17432 WO 97/ 7432PCTIUS96/1 8003 3013 321 1009 *337 1057 353 1105 369 1153 385 1201 401 1249 417 1297 433 1345 449 1393 465 1441 481 1489 497 1537 513 1585 529 1633 545 ATG A -,CT GAT CGA Met Thr Asp Arg TOT TCA ACT OTC Cys Ser Thr Val TCT COT CAC ACC Ser Gly Asp Thr CAT CCC GGT AAG His Ala Gly Lys GCO CAT TTT OCT Ala His Phe Ala COT ATT AAC CGC Arg Ile Asn Arg GAG OAT ATT CAC Oiu Asp Ile Asp AAT ACC GCO CTC Asn Thr Ala Leu TTC AAA CAT TAT Phe Lys His Tyr GCC TG CTC CC Cly Trp Leu Arg TTT TTA GAC CAA Phe Leu Asp Gin OTO ATA GAT AAT C Val Ile Asp Asn C 00G 0CC COT OTT OlY Ala Arg Val L CAG TTC CTC TTA Gin Phe Leu Leu L ACC CAA AGC ACA C Thr Gin Ser Thr L COT CTO OCT AAT T
ACA
Thr
GGA
Gly
AC.
Th.
ccC P r
CT
Let
AC;
Thr
CTC
Leu
TCC
Ser
CAG
Gln
GTA
Val
;TG
la 1
AG
in
L.AG
~ys
"TO
eu
TC
eu
OT
Gly A C r Ala G GAC o Giu
ACG
Thr
OTO
Val
TTA
Leu
ATC
Met
CCC
Ala
OTG
Val TTT2 Phe I CAT Asp P
CAT
His I 0CC G Ala A
A.ACT
Asn C TTLeu rCT Ser
ACC
Th;
C
Al Va
CCC
Arg
CTG
Val
AAC
Asn
~A
Lys ;cc .s n 'he
LTC
le
AT
.sp
CT
ys ACT OTA CCC Thr Val Pro ACG OTT OTT Thr Val Val G CCA TTT OCO r Pro Phe Ala 3 ATT ACC CTG I Ile Thr Leu rAAC AAT CTC IAsn Asn Leu *CTC CAA AAA Leu Cin Lys *ACT TCT CCT Thr Ser Ala GAC A.AT ACC Asp Asn Thr TAT OT OTT Tyr Gly Val CCG TTT CC Pro Phe Ala TCC OTC OGC Ser Vai Cly CTC TAT ACA Val Tyr Thr AGC ACC OCA Ser Thr Ala AAT ATT CCCc Asn Ile Ala AAT CTG TTT G Asn Leu Phe V Gin
AAG
Lys
TA'
AG'
Sel
AC'
Thi
TOC
Trp Met
CTC
Leu
AC
Ser
ATT
Ile
ACC
rhr
ITO
Laeu
:TG
.eu 0T ~rg
;TG
T
al GTA GA.A Val Glu TCT OAT Ser Asp T GOC CC r Gly Ala r CCC ACT r Arg Ser G. AT GAC *Asp Asp CTC AAT Leu Asn G AT CC Asp Ala COT ATG Arg Met OCT AAA Ala Lys ACA CCC Thr Pro TTT OAT Phe Asp ACC ACC Thr Thr GOC CTC2 Giy Leu CA6A CAG C Gin Gin G OTC TCA G Val Ser
A
CTC
Leu
AAT
Asn
CG
Ar
GI~
Ly
CTC
Leu
GA;
GIl
TTG
Leu
CAA
Gln
GCA
Ala
ACA
Thr
;CC
;ly k.AT ksn f
'CC
'ly
*CT
lia OTO AC-'T 1003 '/al Ser 336 CTTT ATT 105,5 9 Phe lie 352 T C GAG 11G4 y Ala Giu 368 3TTO G 'A sLau Asp 334 CCT TAT 12:00 Pro Tyr 400 ACA OGA 1248 IThr Gly 416 GGA OTO 1296 ly Val 432 *TTT OCT 1344 Phe Ala 448 ACO CCC 1392 Thr Pro 464 CCC TTT 1440 Pro Phe 480 CCC CAT 1488 Gly Asp 496 CAT COT 1536 His Arg 512 A-T OT:- 1584 Asn %*al1 523 TTC TA*O1 1632 Phe Ty.,r 544 TCT TTC 1680 Ser Phe 560 TO OCO CCC ACA TTG 000 ATA AAT CCA GA Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly Ilie Asn Pro Glu 1681 TOT CCC TTGOGTT CAT CGA TTA OAT GCA COT -235- SUBSTITUTE SHEET (RULE 26) ACA CCC ATC CTC TOG CIG ?'23 WO 97/17432 PCT/US96/18003 561 1729 577 1777 1824 593 608 1825 609 1873 625 1921 641 1969 657 2017 673 2065 689 2113 705 2161 721 2209 737 2257 753 2305 769 2353 785 Cys Ala Leu Val Asp Arg Leu Asp CAA TTG GCA GGG !k CCC ACA ATC Gin Leu Ala Gly Lys Pro Thr Ile CTG GCG GCG GAT ATT CTG AGT Leu Ala Ala Asp Ile Leu Ser Ala.l: Thr. Gly Ile Val Trp win ACG GTA CCA CAA AAA GAT TCC CCG Thr Val Pro Gin Lys Asp Ser Pro 592 TTG CTG CAA GCG CTA AGT GCG ATT GCT Leu Leu Gin Ala Leu Ser Ala Ile Ala
CAA
Gln
TTG
Leu
AAC
Asn
GCA
Ala
CAC
His
ATC
Ile
GCA
Ala
GCA
Ala
CAG
Gln
TCA
Ser
TGG
Trp
GAT
Asp
TGG
Trp
AGT
Ser
TTT
Phe
ACA
Thr
GCT
Ala
GAT
Asp
ACG
Thr
ATC
Ile
GGC
Gly
CTC
Leu
TTG
Leu
ATT
Ile
CAA
Gln
GAC
Asp
ATC
Ile
TTG
Leu
ATT
Ile
AAG
Lys
GTG
Val
ACT
Thr
GTG
Val
CCT
Pro
AGT
Ser
CCC
Pro
CAA
Gln
AAC
Asn
CGT
Arg
TTG
Leu
GAC
Asp
GTT
Val
GTC
Val
ACT
Thr
GCC
Ala
GCG
Ala
GCG
Ala
GCT
Ala
CAG
Gln
CCT
Pro
CAA
Gln
TCC
Ser
TGG
Trp
GGT
Gly
AAT
Asn
CTG
Leu
GTC
Val
TTA
Leu
ACT
Thr
GAC
Asp
CAC
His
ATT
Ile
GTG
Val
CGC
Arg
TTT
Phe
CTG
Leu
ACA
Thr
ACT
Thr
AGT
Ser
TTG
Leu
TGG
Trp
TAT
Tyr
CAT
Asp
TCT
Ser
TGG
Trp
AGT
Ser
GCT
Ala
GTG
Val
CAA
Gln
AAT
Asn
CTG
Leu
TTG
Leu
GCA
Ala
CTG
Leu
TTA
Leu
ACC
Thr
CAG
Gln
GGG
Gly
CTG
Leu
ACT
Thr
AGC
Ser
ACG
Thr
TTG
Leu
CGC
Arg
TTG
Leu
CGT
Arg
ACG
Thr
GAA
Glu
TCG
Ser
AAC
Asn
GCA
Ala
CTC
Leu
GAT
Asp
TTA
Leu
TTG
Leu
GCG
Ala
TGG
Trp
AAG
Lys
CAA
Gln
CTG
Leu
TTT
Phe
CAG
Gln
CTA
Leu
CCA
Pro
TCA
Ser
GCT
Ala
TCT
Ser
AAT
Asn
CAG
Gln
ACT
Ser
GAT
Asp
TTA
Leu
ACT
Ser
TCA
Ser
GGC
Gly
GGC
Gly
TTA
Leu
GCA
Ala
GGC
Gly
GAT
Asp
CAG
Gln
ACT
Thr
GGA
Gly
GCC
Ala
CGT
Arg
CCT
Pro
GCA
Ala
ACT
Thr
AGT
Ser
GTC
Val
GGT
Gly
ATA
Ile
GAA
Glu
GTA
Val
CTG
Leu
CAA
Gln
GTT
Val
GAA
Glu
CTG
Leu
GAC
Asp
ACG
Thr
GAT
Asp
AAT
Asn
CAA
Gln
GAT
Asp
CAG
Gln
AAC
Asn
ACA
Thr
AAG
Lys
GTG
Val
CTT
Leu
GAT
Asp
TTT
Phe
ACC
Thr
AGT
Ser
AGT
Ser
AAG
Lys
AAA
Lys
GTG
Val
ACC
Thr
ACT
Thr
GTA
Val
TTG
Leu
CAA
Gln
GTG
Val
AAC
Asn
CCG
Pro
GTT
Val
AAG
Lys
ACT
Thr
AGT
Ser
TAC
Tyr
GCC
Ala
CGC
Arg
CTG
Leu
TTG
Leu
GGT
Gly
GGC
Gly
CTT
Leu
ATA
Ile
CTG
Leu
CAA
Gln
CAG
Gln
CAG
Gln
GCC
Ala
CGC
Arg
ACC
Thr 1872 624 1920 640 1968 656 2016 672 2064 688 2112 704 2160 720 2208 736 2256 752 2304 768 2352 784 2400 800 2448 816 2401 TCC TTG TTG ACC CAA CAA TTC 801 Ser Leu Leu Thr Gin Gin Phe GCA ATG GTG CAA Ala Met Val GIn -236- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 2449 TTG CTC G'AC TAT CCA OCC TAT TTT C 317, Lau Leu Asp rir Pro Ala 'Ty~r Phe Gly GCT TCC GCA GAA ACA 07 Ala Ser Ala Glu Thr-.~ Thr 3 2497 833 2545 849 2593 865 2641 881 2689 897 2737 913 2785 929 2833 945 2881 961
CAT
Asp
TTG
Leu
TAC
Tyr
GCA
Ala
CC
Ala
OAT
Asp
OTT
Val
ATC
Ile
CTC
Leu
TTA
Leu
CAG
Gln
GCT
Ala
;CG
k.la
CA
'hr ACT TTO TOO ATO CTT TAT ACC CTC ACC TOT TAT Ser
CAA
Gln
CC
Arg
ACO
Thr
TOO
Trp
CTT
Leu
CAC
GCn Leu
ATO
Met
ACA
Thr
TTC
Leu
TCO
Ser
CTC
Leu
CAA
Gln Trp
CCT
Gly
OCT
Ala
OCA
Ala
OTA
Val
COT
Arg
CAG
Met
OAA
Glu
AAT
Asn
ACO
Thr
TTO
Leu Leu Leu
OCT
Ala
OCT
Ala
CTA
Leu
GC
Oly
CAA
Gln
GC
Tyr
OCT
Oly
ACC
Thr
TTC
Lau O ly
CAC
Gin Thr
OCT
Cly
ACA
Thr
OCT
Cly
ATT
Ile
OCA
Ala Leu
ACC
Thr
CCG
Pro
TOO
Trp
CC
Ala
CAC
CTG
Ser
GAA
Glu
TTC
Leu
GAC
Olu
AAA
Lys
AAC
Asn Cys
OAT
Asp
AOC
Ser
OTT
Val1
ACC
Thr
CAA
CGT
Tyr
CAT
Asp
CAA
Gln
AAC
Asn
ACA
Thr
ACT
Thr
AC
Ser
GTA
Val
TCT
Ser
GAG
Olu
CCC
Pro
OCT
Gly GAT Asp
CTC
Leu
CAT
Asp
TTO
Leu
CAA
Gln
CTT
Leu
TTA
Leu
CC
Ala
OCT
Ala
C;A
Cmn
CTC
Leu
GCC
O ly
TAT
Tyr 2544 348 21592 364 2640 880 2688 396 2736 912 2784 928 2832 944 2880 960
:AA
~AT CC CTOACT CTGAA Cmn Gin Cly Tyr Leu Leu Ser Arg Asp Ser Asp
ACC
Thr
OTC
Val CTT TOO Leu Trp AAO GCC Lys Gly CAA AC Cmn Ser ACT AAC Ser Asn ACC OCT CAC CC Thr Oly Cmn Ala TOA 2898 End 966 CTC OTO OCT GGC OTA TCC CAT Leu Val Ala Oly Val Ser His INFORMATION FOR SEQ ID NO:57 SEQUENCE CHARACTERISTICS: LENGTH: 965 amino acids TYPE: amino acid TOPOLOGY: iinea r (ii) MOLECULE TYPE: protein (xi) Features 33 49 31 Met Asn Asn Leu Val Val Leu Cly His Cmn Gin Phe SEQUENCE DESCRIPTION: SEQ ID NO:57 (TCCA peptide) From To Description 1 10 SEQ ID NO:8 Gin Leu Ala Ser Pro Leu Ile Ser Arg Thr Olu Olu Ile His Pro Cly Lys Leu Thr Asp Leu Cly Tyr Thr Ser Val Phe Asp Arg Met Pro Arg Olu Arg Phe Ilie Arg Ciu His Arg Ala Asp Arg Ser Ala Ciu Lys Met Tyr Asp Leu Ala Val Oly Ty.r Ala Val Leu His His Phe Arg Arg Asn Ser Leu Ser Clu Ala Val Oly Leu Arg Ser Pro Phe Ser Val Ser Cly Pro Asp Tyr Ala -237- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTIUS96/18003 97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 3 3 3 4: 113 Gly Ser Pro Glu Ala Asn 129 Tyr Gln Leu Ala Leu Glu 145 Asn Thr Leu Ala Glu Arg 161 Asp Lys Ala Ile Asn Glu 177 Ile Leu Ser Lys Ala Ile 193 Ala Val Asn Ala Arg Leu 209 Tyr His Tyr Gly His Gin 225 Thr Thr Leu Gln Asp Ile 241 Asn Phe Trp Ala Thr Ala 257 Ala Leu Thr Arg Leu Gln 273 Gln Lys Ile Ile Thr Glu 25 289 Tyr Gly Asp Ser Ser Leu 305 Met Thr Asp Arg Thr Ser 321 Cys Ser Thr Val Gly Gly 337 Ser Gly Asp Thr Thr Ala 353 His Ala Gly Lys Pro Glu 5 369 Ala His Phe Ala Leu Thr 385 Arg Ile Asn Arg Thr Val 401 Glu Asp Ile Asp Leu Leu 417 Asn Thr Ala Leu Ser Met 433 Phe Lys His Tyr Gln Ala I 5 449 Gly Trp Leu Arg Val Val 465 Phe Leu Asp Gln Val Phe P 481 Val Ile Asp Asn Gln Asp P 497 Gly Ala Arg Val Lys His I 513 Gln Phe Leu Leu Leu Ala A 529 Thr Gln Ser Thr Leu Asn C 545 Arg Leu Ala Asn Leu Ala A 561 Cys Ala Leu Val Asp Arg L 577 Gln Leu Ala Gly Lys Pro T 593 Leu Ala Ala Asp Ile Leu S Asp Gln Arg Val Gin Ser Gin Thr Lys Ile Thr Thr Leu Ser Thr Ala Val Arg Val Asn Lys Ala i sn S ?he V le S sp A ys A rg T eu A hr I er L Al Gl Pr
II.
Ly Th] Il< Let Gly Met Val Val Thr Thr Pro Ile Asn Leu Thr Asp ryr Pro er al er isn .sn hr sp le eu eu a Pro u Lys o Asp e Pro s Lys r Thr e Gln i Pro Lys Ala Gly Asn Val Val Phe Thr Asn Gin Ser Asn Gly Phe Val C Tyr I Thr A Ile A Leu P Leu G Ala G Thr V Leu G Glu P Gi Le Ar Thz Gi Leu Sei Gir Ser Pro Val Ala Leu Leu Lys Ala Thr Val la 1 ia ly 'hr la la he ly ly al in I Ala n Gly u Giy n Leu u Ser g Tyr r Ala i Thr I Ser Gln Asp Phe Gin Lys Tyr Ser Thr Trp I Met Leu Ser P Ile T Thr P Leu T Leu G Arg G Val V Ile A Thr G Pro G Ala L.
Ty Al Al Gir Leu Pro Gln Leu Asp Phe Phe Ser Val Ser Gly Arg Asp .eu Asp Arg la 'hr ?he 'hr ;ly In al sn ly in eu r Le a Th a Let Let Thr As, Se As; Thr Ser Tyr Asp Glu Asp Ala Ser Asp Asn Ala Met Lys Pro Asp Thr Leu Gln Ser Pro Ile Lys Ser u Thr r Thr u Leu u Val r Asp n Asn Val Leu Thr Pro Gin Met Leu Asn Arg Gly Lys Leu Glu Leu Gin Ala Thr Gly Asn Gly Ala I Glu- S Val Asp S Ala I Hi
II.
IlI As Leu Let Lea Pro Ala Glu Leu Thr Met Val Phe Ala Leu Pro Thr Gly Phe Thr Pro Gly His Asn Phe Ser 'rp ;er ile s Ile e Met Asn n Glu Glu a Pro 1 Gly SGin Ser Gin Asn Ile Leu Ser Ile Glu Asp Tyr Gly Val Ala Pro Phe Asp Arg Val Tyr Phe Gin Pro Ala 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 609 Gln Trp Gln Gln Gln His Asp L he Ser Ala Leu Leu Leu Leu -238- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PCTIUJS96/1 8003 -625 Lau Ser Asp A1sn Pro Ile Ser Thr Ser Gin Gly Thr Asp Asp Gin Lau ~4 i 641 Asn Phe Ile Arg Gin 'dalI Trp Gin Asn Leu Gly Ser Thr Phe ValI Gly 656 657 Aia Thr Leu Leu Ser Ar.; Ser Gly Ala Pro Leu Val Asp Thr Asn G Iv 672 673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688 689 Ilie Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gin Ser Val Ile 704 705 Ala Thr Val Val Asn Thr Gin Ser Leu Ser Asp Giu Asp Lys Lys Leu 720 721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gin Vai Gin Lys Thr Gin 736 737 GIn Gly Val Ala Val Ser Leu Leu Ala Gin Thr Leu Asn Val 5cr GIn 752 753 Ser Laeu Pro Ala Leu Leu Leu Arg Trp Ser Giy Gin Thr Thr Tyr Gin 768 769 Trp Leu Ser Ala Thr Trp, Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784 785 Asp Ile Pro Ala Asp Tlyr Leu Arg Gin Leu Arg Glu Val Val Arg Arg 800 301 Ser Leu Lau Thr Gin Gin Phe Thr Leu Ser Pro Ala Met Val Gin Thr 816 317 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala 5cr Ala Giu Thr Val Thr 232 833 Asp Ile Ser Lau Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848 849 Leu Leu Gin Met Gly Giu Ala Gly Giy Thr Glu Asp Asp Val Leu Ala 864 865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Lau Ser Gin Ser Asp Ala 880 881 Ala Gin Thr Leu Ala Thr Leu Leu Gly Trp Giu Val Asn Giu Leu Gin 896 897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gin Leu 912 913 Asp Ala Leu Leu Arg Leu Gin Gin Ala Gin Asn GIn Thr Gly Leu Gly 928 929 Val Thr Gin Gin Gin Gin Gly Tyr Leu Leu Ser Arg Asp 5cr Asp T',yr 944 945 Thr Leu Trp GIn Ser Thr Gly Gin Ala 
Laeu Val Ala Gly Val Ser His 960 961 Val Lys Gly Ser Asn 965 INFORMATION FOR SEQ ID NO:58 SEQUENCE CHARACTERISTICS: LENGTH: 4698 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58 (tccB) 1 ATG TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA Tcc CAG CGT GAT GCG 43 1 Met Leu 5cr Thr Met Giu Lys Gin Leu Asn Glu 5cr GIn Arg Asp Ala i13 49 TTG GTG ACT GGC TAT ATG A.AT TTT GTG GCG CCG ACG TTG AAA GGC GTC o 17 Lcu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Giy Val 32 -239- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 q-, 33 145 49 193 65 241 81 289 97 337 113 385 129 433 145 481 161 529 177 577 193 625 209 673 225 721 241 769 257 817 273
ACT
Ser
GAC
Asp
ATT
Ile
CCG
Pro
AAT
Asn
TAC
Tyr
TAT
Tyr
CAT
Asp
GTG
Val
GAC
Asp
CGC
Arg
GCA
Ala
ACT
Thr
GTA
Val
GCA
Ala
TAC
Tyr
CGT
Gly
CCG
Pro
GCC
Ala
GGG
Gly
GAT
Asp
GCT
Ala
TTC
Phe
CGT
Arg
AGT
Ser
CAA
Gln
TAC
Tyr
GGG
Gly
TTG
Leu
TTT
Phe
GTA
Val
AAC
Asn CAG c Gin
GAA
Glu
ACC
Ser
CGT
Arg
AAC
Asn
GAA
Glu
TCG
Ser
GTG
Val
AAT
Asn
GCT
Ala
TAC
Tyr
AAT
Asn
CCG
Pro
TAT
Tyr
CAG
Gln
ATA
Ile
CG
Pro
GTG
Val
ATA
Ile
AG
Gin
CAA
Gln
AAC
Asn
GAG
Glu
CAG
Gln
CTA
Leu
ATC
Ile
TGG
Trp
CCG
Pro
CTG
Leu
AAT
Asn
AAG
GTG
Val
GCT
Ala
CAG
Gln
GCG
Ala
TAT
Tyr
TAT
Tyr
CTG
Leu
GAT
Asp
TAT
Tyr
TAC
Tyr
CGT
Arg
GTG
Val
TCT
Ser
GAT
Asp
GAT
ACG
Thr
GAT
Asp
CAA
Gln
ATG
Met
GCT
Ala
ATT
Ile
GAG
Glu
GCT
Ala
GTG
Val
TAC
Tyr
CAG
Gln
ACG
Thr
GGT
Gly
CGA
Arg
GCT
GTG
Val
GAG
Glu
TAT
Tyr
GAG
Glu
ATC
Ile
TCA
Ser
ACG
Thr
GTT
Val
CTC
Leu
TTT
Phe
ATC
Met
CCA
Pro
GAT
Asp
CTA
Leu
GAC
GAA C Glu GTT C Val C
ATG
Met
CCT
Pro
TGG
Trp
CCC
Pro
ACT
Thr
TTG
Leu
ACT'
Ser
ATT
Ile
CAT
Asp
AAT
Asn
ACG
Thr
TAT
Tyr
GGT
Gly
AAA
;AT
Asp
AG
Glu
%CT
Thr
TCT
Ser
CT
Ala
ATC
Ile
TTA.
Leu
GCG
Ala
CGT
Gly
GGT
Gly
TTG
Leu
TGC
Cys
GTG
Val
GTG
Val
AAA
TTA
Leu
ACG
Thr
CGT
Arg
ACA
Thr
GCG
Ala
ACC
Thr
AAT
Asn
TAT
Tyr
TAT
Tyr
CGC
Arg
AGT
Ser
TGG
Trp
CTO
Leu
GCT
Ala
AAC
PAC
Tyr
AGT
Ser
:TG
Lu
'CT
Ala
GGG
Gly
CGG
Arg
CAG
Gln
CTC
Leu
ATT
Ile
ACT
Thr
AAG
Lys
AAT
Asn
GAG
Glu
TGG
Trp
ATC
GA-A 'T Glu T CGG G Arg V
GTC
Val A AAC G Asn G GCT C Ala C CAG C Gin
AAT
Asn
AAT
Asn
AAT
Asn
ACC
Thr
AAC
Asn
GAT
Asp
CAT
His
GTT
Val
GGT
AT
yr
TA
al
LAC
sn
AA
lu
;AG
Glu
GAA
Glu
:GA
Arg
GAG
Glu
CAG
Gln
ACT
Thr
CGT
Arg
TGG
Trp
ACA
Thr
GAG
Glu
AAA
TTG
Leu
GCA
Ala C
GGC
Gly
TGG
Trp
GTT
Val Lys
CTC
Leu
TTT
Phe
GAT
Asp
AAA
Lys
CAA
Gln
CAG
Gln
GTT
Val
CGT
Arg
ACC
:TG
'eu
CAA
In
TCT
Ser
CGT
Arg
CGA
Arg
AGC
Ser
GAT
Asp
GAG
Glu
AAA
Lys
CCG
Pro
GAT
Asp
GAA
Glu
CCC
Arg
GAC
Asp
CAT
ATT
Ile
GCG
Ala
GA.!%
Glu
GAT
Asp
AAT
Asn
CAT
His
CCG
Pro
GCA
Ala
TTT
Phe
TAT
Tyr
CCG
Pro
ATC
Ile
CCG
Pro
CCG
Pro
GCC
144 48 192 64 240 288 96 336 112 384 128 432 144 480 160 528 176 576 192 624 208 672 224 720 240 768 256 816 Lys Asp Ala Asp AAG TTT GGT TATI 865 CCG AAT ACG 289 Pro Asn Thr Lys Phe Gly Tyr Lys ACC ACG TTA ATG ACA Thr Thr Leu Met Thr -240- SUBSTITUTE SHEET (RU Lys Asn Ile Gly Lys Thr His Ala CGT TAT GAT GAT ACT TGG ACA GCG Arg Tyr Asp Asp Thr Trp Thr Ala CAA CA-A GCA GGG GAA AGT TCA GA-A Gin Gin Ala Gly Glu Ser Ser Glu LE 26) WO 97/17432 PCT/1JS96/18003 913 ACA CAG CGA TCC AGC CTG CTG ATT GAT GAA TCT AGC ACC ACA TTG 305 Thr Gin Arg Ser Ser Lau Leu Ile Asp Glu Ser Ser Thr Thr Lau Arg 320 961 CAA GTT AAT CTG TTG GCT ACC ACC GAT TTT AGT ATC GAT CCG ACC GAG 1008 321 Gin Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336 I0 1009 GAA ACG GAC AGT AAC CCG TAT GGC CGC CTA ATG 'ITG GGG GTG TTT GTC 1056 337 Giu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352 1057 CGT CAA 'N'T GAA GGT'GAT GGG GCC AAT AGA AAA AAT AAA CCC GTT GTT 1104 353 Arg Gin Phe Glu Giy Asp Giy Ala Asn Arg Lys Asn Lys Pro Val Val 368 1105 TAT GGT TAT CTC TAT TOT GAC TCA OCT TTC AAT CGT CAT GTT CTC AGG 1152 369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384 1153 CCG TT~A ACT AAG AAC TTT TTG ITrC ACT ACT TAC CGT GAT GAA ACG GAT 1200 385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Giu Thr Asp 400 1201 GGT CAA AAC AGC TTG CAA T GCG GTA TAC GAT AAA AA.G TAT GTA ATT 1248 401 Gly Gin Asn Ser Leu Gin Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416 1249 ACT AAG GTT OTT ACA GGT GCA ACG GAA GAT CCC GAA AAT ACA GGA TGG 1296 417 Thr Lys Val Val Thr Gly Ala Thr Giu Asp Pro Giu Asn Thr Gly Trp 432 1297 GTA AGT AAA GTT GAT GAC TTG AAA CAA GGC ACT ACT GGG GCC TAT GTG 1344 433 Val Ser Lys Val Asp Asp Lau Lys Gin Giy Thr Thr Gly Ala Tyr Val 448 1345 TAT ATC GAT CAA CAT GCC CTG ACG CTT CAT ATA CAA ACC ACA ACT AAT 1392 449 Tyr Ile Asp Gin Asp Gly Leu Thr Leu His Ile Gin Thr Thr Thr Asn 464 1393 GCC GAT TTT ATT AAC CGT CAT ACG TTT GGA TAT AAC GAT CTT GTA TAT 1440 465 Gy Asp 
Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tr 480 1441 CAT TCT AAG, TCT GGT TAT GGT TTC ACG TOG TCA GGA AAT GAA GGT TTT 1488 481 Asp Ser Lys Ser Gly Tyr Giy Phe Thr Trp Ser Gly Asn Giu Gly Phe 496 1489 TAT CTG GAT TAC CAT GAT GGA AAT TAT TAC ACC TTT CAT AAT GCA AT.; 1536 497 Tyr Leu Asp Tyr His Asp Giy Asn Tyr Tyr Thr Phe His Asri Ala Ilie 512 1537 ATC AAC TAC TAT CCG TCT GGA TAT GOT GOT OGA TCT GTT CCT AAT GGA 1584 513 Ile Asn Tyr Ty r Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn G 523 1585 ACG TOG GCG TTA GAG CAA AGO ATT A.AT GAG GOA TGG GCT ATT OCT CCC 1632 529 Thr Trp Ala Leu Glu Gin Arg Ile Asn Giu Gly Trp Ala Ile Ala Pro 544 1633 CTG CTT GAT ACT CTC CAT ACT OTT ACT GTG AAG GGC AGT TAT ATC GCT 1680 545 Leu Leu Asp Thr Leu His Thr Val Thr Vai Lys Gly Ser Ty r Ile Ala 560 -241- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 1681 561 1729 577 1777 593 1825 609 1873 625 1921 641 1969 657 2017 673 2065 689 2113 705 2161 721 2209 737 2257 753 2305 769 2353 785 2401 801 2449 817 TGG GAA GGG GAA ACA CCT ACC OGT TAT AAT CTG TAT ATT COA GAT GOT Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly ACC GTG TTG CTA Thr
AAT
Asn
ACT
Thr
GAA
Glu
AGT
Ser
TAT
Tyr
CTA
Leu
GGG
Gly
GTT
Val
TCC
Ser
ACC
Thr
ACT
Thr
ACC
Thr
AAA
Lys
GAA
Glu
GAA
Glu Val
AAG
Lys
ATC
Ile
ATC
Ile
ACT
Thr
ACT
Thr
CAG
Gln
AAA
Lys
GCG
Ala
CGT
Arg
TCA
Ser
TTG
Leu
TTG
Leu
AGC
Ser Leu
:AG
.In Leu
CTT
Leu
AAA
Lys
AAT
Asn
CAA
Gln
TTG
Leu
GTT
Val
AAA
Lys
TTG
Leu
TAC
Tyr
TCA
Ser
ATT
Ile Leu
GAG
Glu
AAT
Asn
OCT
Ala
ACT
Thr
TCT
Ser
TOT
Cys
GGG
Gly
CAA
Gln
GAT
Asp
TTA
Leu
GAG
Glu
GAT
Asp
TCT
Ser
TTC
Phe
GAG
Glu
TTC
Phe
GAG
Glu
TTG
Leu
GCT
Ala
GAT
Asp
AGT
Ser
CCC
Pro
AAG
Lys
GAT
Asp
ATG
Met
CAC
His
TCG
Ser
TGG
Trp
GTA
Val
ACT
Ser
ACG
Thr
GGA
Gly
GCG
Ala
AAT
Asn
TAT
Tyr
AGC
Ser
AAA
Lys
GCG
Ala
GCT
Ala
CCT
Pro
GAC
Asp
CTO
Leu
CCG
Pro
TTT
Phe
TTT
Phe
AAA
Lys
GCG
Ala
CTT
Leu
GAT
Asp
GTC
Val
TCT
Ser
AAA
Lys
CGT
Arg
AAA
Lys
AAT
Asn
TCT
Ser
TTT
Phe
CCG
Pro
GCA
Ala
GAT
Asp
ACG
Thr
ATC
Ile
GAT
Asp
ACC
Thr
TTC
Phe
OTG
Val
TGG
Trp
GCT
Ala
GGT
Gly
ACC
Thr
CTG
Leu
CTO
Leu
AAT
Asn
TTT
Phe
CAA
Gln
AAA
Lys
TCC
Ser
GCC
Ala
GGA
Gly
AC
Ser
TCC
Ser
TGG
Trp
GTC
Val
CCG
Pro
CTG
Leu
CGT
Arg
GGG
Gly
GAA
Glu
GOT
Gly rTG Leu
AAG
Lys
ATA
Ile
CCA
Pro
GAT
Asp
CGC
Arg
GGT
Gly
ACT
Thr
GAT
Asp
AGT
Ser
GAT
Asp
GTG
Val
CTT
Leu
CTG
Leu
GCA
Ala
TCA
Ser
GTT
Val
AGT
Ser
AAT
Asn
GAT
Asp
AAC
Asn
AAC
Asn
GCO
Ala
GAT
Asp
CAT
His
AAG
Lys
GCC
Ala
CAA
Gln
AAC
Asn
GAT
Asp
GAT
Asp
AAC
Asn
GCT
Ala
TTG
Leu
TTT
Phe
TGG
Trp
CGC
Arg
CTO
Leu
ACT
Thr
CCG
Pro
TAT
Tyr
TGG
Trp
ATT
Ile
TAT
Tyr
ACC
Thr
AGT
Ser
TTA
Leu
GGT
Gly
ACA
Thr
CAT
His
GCT
Ala
CCA
Pro
AAA
Lys
TTT
Phe
TAT
Tyr
GAC
Asp
GAC
Asp
TTT
Phe
CCT
Pro
CTG
Leu
ACC
Thr
TTG
Leu
GTG
Val
CTC
Leu
CGC
Arg
TAC
Tyr
ATT
Ile
ACA
Thr
TTC
Phe
AAA
Lys
TCT
Ser
AAA
Lys
COC
Arg
AAC
Asn
CGA
Arg
GAC
Asp
TTT
Phe
CTG
Leu
ACT
Thr
TAT
Tyr
TTT
Phe
ATC
Ile
COT
Gly
CTA
Leu
TAT
Tyr
CGT
Arg
ACA
Thr
AAC
Asn
CCG
Pro
GTC
Val
TTA
Leu
TTC
Phe
GTG
Val
GAT
Asp
GAC
Asp
TTC
Phe
GCC
Ala
TTT
Phe
CTT
Leu
ACC
Thr
CAG
Gln
TAC
Tyr
ACT
Thr
TAC
Tyr
TCA
Ser
TAT
Tyr
GTT
Val
TGG
Trp
CGT
Arg
TAC
Tyr
GGC
Gly
TGG
Trp
AAC
Asn
GAC
Asp 1-28 576 1776 592 1824 608 1872 624 1920 640 1968 656 2016 672 2064 688 2112 704 2160 720 2208 736 2256 752 2304 768 2352 784 2400 800 2448 316 2496 332 CAG GCA GIn Ala GAA CCA Glu Pro TTC TTT Phe Phe CAA TTT Gln Phe -242- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9717432PCTIUS96/1 8003 2497 CCG CG ATG AAA AC AAG CCA CAC AAT GOC CCG GOT TA'''T TGG AA-T 333 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala '11,r Trp Asn Val 2=I4 5 349 ,593 865 2641 881 2689 897 2737 913 2785 929 2833 945 2881 961 2929 977 297 7 993 3025 1009 3073 1025 3121 1041 3169 1057 3217 1073
CGT
Arg
TOT
Ser Lys
ATG
Met
TAT
Tyr
CTG
Leu
CAA
Gln
AC
Thr
GCA
Ala
CAC
His
ACC
Thr
GAT
Asp
XAT
As n
GOT
Ala
GGT
Gly
OC
Pro
ATA
Ile
CG
Ala
TGG
Trp
TAO
Tyr
AGT
Ser
AAA
Lys
GOT
Ala
GAT
Asp
TG
Trp
GTT
Val
COG
Pro
GGC
Gly
ATO
Met
CAG
Gln
CCCOA.A
Arg Gin CTC CO Leu Ala ATT TG Ile Trp OTT TTA Val Leu 000 GOA Pro Ala CCC TAC Gly Tyr AOG TTG Thr Leu CCC AAG Gly Lys CG TTC Ala Leu ACT GC Ser Gly COG CA Pro Arg
OTGCOTT
Leu Leu
CCA
Gly
ACT
Thr
CO
Ala
TTG
Leu
CT
Ala
AOG
Thr
CT
Arg Leu Phe
CAT
Asp
CG
Pro Leu
CO
Ala
CT
Ala
AGT
Ser
TTG
Leu
GOT
Ala
AAO
Asn
GCT
Gly
GGG
Gly
OTO
Leu
CAC
His
AAT
Asn
OTO
Leu
TAO
Ty r
CG
Pro
GOT
Ala
GTG
Val
GTC
Val
CT
Arg TOA CT CAT TTC G!AO Ser Arg His Leu Asp OAT CCG GTG ATA TA'O His Pro Val Ile Ty1r OTG ATT CT CAG CCA Leu Ile Ala Gin Gly OTC ACT CAC CCC CT Leu Thr Gin Ala Arg OCT CT CCC CAT CTA Pro Arg Pro Asp Val CAT ACC TTA CA CO Asp Thr Laeu Ala Ala CAC ?I'C CT AAT ACT Gin Leu Ala Asn Ser GTO AGO TAO TTG A-AA Val Ser Tyr Leu Lys A.AT GTT OTO ATG TTG Asn Val Leu Met Leu A.AT OTC CT OAT A.AC Asn Leu Arg His Asn OTG TAT CT C COT Leu Tyr Ala Ala Pro CAG TOO CCC AOG TTC Gin Ser Gly Thr Leu CCG OCA TAO CT TTO Pro Pro Tyr Arg Phe GCT AC TTG ACC ACT Gly Thr Leu Thr Ser AGOC? GA GA CO TCT Ser Glu Arg Ala Cys
CAT
Asp Gin Asp
CTZ
Va I TOO3 5c r
CCC
Cly
CAT
Asp
OTO
Lau
TO-T
Ser
OTG
Leu
GTT
Val
AGO
Ser -1TT Phe
CAA
Gln
3313 ACC TTG CkA CAA CAG GCG CTG GAT CCA TTG GCG GCA GAT CGT CTG GCG 3 36;) 1105 Thr Leu Gin Gin Gin Ala Leu Asp Cly Leu Ala Ala Asp Arg Leu la 12 336i CTG CTA GCT ACT CAG GCT ACG CCA CAA CAG CCT CAT GAG CAT TAT TAC 34G8 1121 Lau Lau AlIa Ser Gin Ala Thr Ala Gin Gin Arg His Asp His T 2 'r TI.,r i136 3409 ACT CTG TAT CAG AAC AAC ATC TCC ACT GCG GAA CAA CTG GTG ATO GAC 3456 1137 Thr Leu Tlyr Gin Asn Asn Ile Ser Ser Ala Ciu Gin Leu Val Met Asp 1±52 3457 ACC CA.A ACG TCA GCA CAA TCC CTC ATT TCT TCT TCC ACT CCT CTA CAA 3504 1153 Thr Gin Thr Ser Ala Gin Ser Leu Ile Ser Ser Ser Thr Cly Val Gin 1163-0 3505 ACT CCC ACT CCC CCA CTC AAA GTC ATG CCC AAT ATC TTT CCT TTC CCT 3552 1169 Thr Ala Ser Cly Ala Leu Lys Val Ile Pro Asn Ile Phe Cly Lau Ala 1134 3553 CAT CCC CCC TCC CCC TAT CAA CCA CTA ACC CAA CC ATT CCC ATC CCC 3600 1185 Asp Cly Cly Ser Arg Tyr Ciu Gly Val Thr Giu Ala Ile Ala Ile Gly 1200 3601 TTA ATCGCCT CCC CGA GAA CCC ACC AGCGCTG CTG CCC GAG CCT CTC CCA 3648 1201 Leu Met Ala Ala Cly Gin Ala Thr Ser Val Val Ala Ciu Arg Leu Ala 1216 3649 ACC ACC GAG AAT TAG CCC CCC CC CGGT CAA GAG TCC CAA ATC CAA TAC 3696 1217 Thr Thr Giu Asn Tyr Arg Arg Arg Arg Giu Giu Trp Gin Ile Gin TyIr 1232 3697 GAG GAG CA GAG TCT GAG CTC GAG GGA TTA GAG AAA GAG TTG CAT CC 3744 1233 Gin Gin Ala Gin Ser Giu Val Asp Ala Leu Gin Lys Gin Lau Asp Ala 1243 3745 CTC GGA CTC CCC GAG AAA CCA GGT CAA ACT TGG CTG CAA GAG CC AAC 3792 1249 Leu Ala Val Arg Ciu Lys Ala Ala Gin Thr Ser Leu Gin Gin Ala Lys 1264 3793 CA GAG GAG CTA CAA ATT CCC ACC ATO CTG ACT TAG TTA ACT ACT CCT 3840 1265 Ala Gin Gin Val Gin Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1230 3841 TTG ACC GAG CC ACT GTG TAG GAG TGC CTC ACT CCT CAA TTA TCC CC 3888 1281 Phe Thr Gin Ala Thr Leu Tyr Gin Trp Leu Ser Gly Gin Leu Ser Ala 12936 3389 TTC TAT TAT CAA. CC TAT CAT CCC GTC CTT GGT CTC TC GTG TCC CCC 3936 1297 Leu Tyr Tyr Gin Ala Tyr Asp Ala Val Val Ala Leu Gys Leu Ser Ala 1312 3937 CAA. 
CCT TGG TCC GAG TAT CAA TTC GGT CAT TACGCCT ACC ACT TTT ATC 3934 13i3 Gin Ala Cys Trp Gin Tyr Giu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1323 3985 GAG ACC GGT ACC TOG A).C GAG CAT TAG CGT CCT TTGCGA.A CTC CCC GAG 4032 1329 Gin Thr Cly Thr Trp Asn Asp His Tyr Arg Cly Leu Gin Val Gly Ciu 1344 4033 ACA CTGCA G C TC A).T TTC CAT GAG ATG GAA CC CCC TAT TTA CTT CCT 4030 1345 Thr Leu Gin Leu Asn Leu His Gin Met Ciu Ala Ala Tyr Leu Val Arg £360 -244- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTUS96/18003 4081 1361 4129 1377 4177 1393 4225 1409 4273 1425 4321 1441 4369 1457 4417 1473 4465 1489 4513 1505 4561 1521 4609 1537 4657 1553
CAC
His
TTG
Leu
TTT
Phe
TTG
Leu
CCGO
Pro
TTG
Leu
GGT
Gly
CAG
Gln
TTG
Leu
GTT
Val
GAC
Asp
AAT
Asn
GAA
Glu
GGT
Gly
CCA
Pro
CGC
Arg
TAT
Tyr
TTA
Leu
AAA
Lys
CAG
Gln
CGT
Arg
TCC
Ser
GAT
Asp
ATG
Met
CGC
Arg
GAT
Asp
TTA
Leu
CAG
Gln
CAA
Gln
GCA
Ala
GAG
Glu
GTG
Val
TTG
Leu
AAA
Lys
AAG
Lys
GAT
Asp COT CTT Arg Leu GAT GOT Asp Gly AGC GAA Ser Glu ATT AAA Ile Lys AAC GTG Asn Val GCA GAT Ala Asp GGT GAT Gly Asp GCG CTC Ala Leu GAA GAT Glu Asp TGG ACT Trp Thr ACA TTO Thr Leu GAT GTC Asp Val
AAT
Asn
TTT
Phe
AAG
Lys
ACT
Thr
AAG
Lys
ATC
Ile
GCG
Ala
TCT
Ser
GAG
Glu
CTT
Leu
AAA
Lys
CTG
Leu
GTC
Val
GOT
Gly
CTG
Leu
GTG
Val
GCA
Ala
AAT
Asn
ACG
Thr
TCT
Ser
CGC
Arg
AAC
Asn
GCG
Ala
GTG
Val
ATC
Ile
AAG
Lys
TTT
Phe
TCA
Ser
ACG
Thr
GGT
Gly
CAT
His
GGC
Gly
TAT
Tyr
TTC
Phe
GAT
Asp
CAG
Gln
GTC
Val
CGT
Arg
TTA
Leu
GAC
Asp
GTG
Val
CTC
Leu
GTT
Val
ATT
Ile
ATT
Ile
CTA
Leu
CCG
Pro
GAG
Glu
GTG
Val
ACT
Thr
AAA
Lys
AAC
Asn
ACG
Thr
ACT
Thr
AAA
Lys
GTC
Val
AAT
Asn
TCA
Ser
CGT
Arg
ATG
Met
CAT
His
GTC
Val
ACC
Thr
GAC
Asp
TTG
Leu
CAG
Gln
CCT
Arg
ACC
Thr
GAT
Asp
TTT
Phe
TCT
Ser
CAG
Gln
TAT
Tyr
TCG
Ser
GAA
Glu
TAT
Tyr
CCG
Pro
ACC
Thr
CTC
Leu
AAT
Asn
GCC
Ala
GAG
Glu
GTG
Val
GCC
Ala
ACC
Thr
CTC
Leu
GGC
Gly
CCC
Pro
ACG
Thr
AGC
Ser
AAT
Asn
CTG
Leu
GGT
Gly
GGG
Gly
GAT
Asp
GCA
Ala
GCC
Ala
AAA
Lys
AAA
Lys
GGG
Gly
TTA
Leu
ACC
Ser
CAT
Asp
CGT
Arg
AGC
Ser
ACT
Thr
GAG
Glu
CTG
Leu
TGC
Cys
AGC
Ser
GTC
Val
CAC
His
GTC
Val
ACT
Ser
CCG
Pro
CCC
Ala
TTT
Phe
GGA
Gly
CAT
His
TTG
Leu
GAC
Asp
CTA
Leu
GAC
Asp
TAT
Tyr
GGG
Gly
ATA
Ile
ACA
Thr
AGC
Ser
GAG
Glu
GCT
Ala
ATT
Ile
GCG
Ala
GGC
Gly 4 123 1376 4176 1332 4224 1408 4272 1424 4320 1440 4368 1456 4416 1472 4464 1488 4512 1504 4560 1520 4608 1536 4656 1552 GCC GCC ACT TTC GCA AAC CAG Gly Ala Ser Phe Ala Asn Gin AAG AAA ACA CTC TCT TAA Lys Lys Thr Leu Ser End 4698 1566 INFORMATION FOR SEQ ID NO:59 SEQUENCE CHARACTERISTICS: LENGTH: 1665 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59 (TCCB peptide) Features From To Description -245- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 1 11 SEQ ID 1O:7 17 33 49 81 97 113 129 145 161 177 193 209 225 241 257 273 289 305 321 337 353 369 385 401 417 433 449 465 481 497 Met Leu Ser Asp Ile Pro Asn Tyr Ty r Asp Val Asp Arg Ala Thr Val Ala Tyr Pro Thr Gin Glu Arg Tyr Pro Gly Thr Val Tyr Gly Asp Tyr Leu Val Gly Pro Ala Gly Asp Ala Phe Arg Ser Gin Tyr Gly Leu Phe Val Asn Asn Gin Val Thr Gin Gly Leu Gin Lys Ser Ile Asp Ser Leu Ser Thr Gin Glu Ser Arg Asn Glu Ser Val Asn Ala Tyr Asn Pro Tyr Gin Ile Thr Arg Asn Asp Phe Tyr Ser Asn Val Lys Asp Phe Lys Asp Thr Gly Pro Val Ile Gin Gin Asn Glu Gin Leu Ile Trp Pro Leu Asn Lys Lys Thr Ser Leu Ser Glu Leu Lys Ser Val Val Gin Ile Ser Tyr Met Tyr Val Ala Gin Ala Tyr Tyr Leu Asp Tyr Tyr Arg Val Ser Asp Asp Phe Thr Ser Leu Asn Gly Tyr Asn Leu Thr Asp Asp Asn Gly His Glu Met Thr Asp Gin Met Ala Ile Glu Ala Val Tyr Gin Thr Gly Arg Ala Gly Leu Leu Ala Pro Asp Cys Phe Gln Gly Asp Gly Arg Tyr Asp Lys 1sn Val Glu Tyr Glu Ile Ser Thr Val Leu Phe Met Pro Asp Leu Asp Tyr Met Leu Thr Tyr Gly Asp Leu Phe Ala Leu Leu His Gly Gly Gin Phe Glu Val C Met I Pro Trp Pro Thr I Leu Ser C Ile C Asp L Asn C Thr V Tyr
V
Gly
L
Lys
P
Thr G Ile P Thr P Gly
P
Ala Ser P Phe S Ala Thr G Lys
G
Thr L Thr P Phe T Asn -246- Lau Val Asp 1lu Thr er Aa Ile Leu 1la Ily ly Leu :ys al a 1 ys rg 1n sp sp rg As n la ;er a I iu ;In eu he hr ,yr Asn Ala Leu Thr Arg Thr Ala Thr Asn Tyr Tyr Arg Ser Trp Leu Ala Asn Tyr Gin Glu Phe Leu Arg Phe Thr Tyr Asp Gly His Gly Trp Ty r Glu Pro Tyr Ser Leu Ala Gly Arg Gln Leu Ile Thr Lys Asn Glu.
Trp Ile Asp Ala Ser Ser Met Lys Asn Tyr Asp Pro Thr Ile Tyr Ser Thr Ser Gin Arg Asp Ala Thr Glu Arg Val Asn Ala Gln Asn Asn Asn Thr Asn Asp His Val Gly Asp Gly Ser Ile Leu Asn Arg Arg Lys Glu Thr Gln Asn Gly Phe SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCTIUS96/18003 z51i3 Ile asn 1'*r 7 1 'r Pro Ser GlY T/yr Gly GlY Gly Ser Val Pro A-sn 1 52- 6 529 Thr Trp Ala Leu Giu Gin Arg lie Asn Glu Gly Trp Ala Ilie Ala Pr3 154 4 545 Lau Laeu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser .r Ile Ala 560 561 Trp Giu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576 I(0 577, Thr Val Leu Leu Asp Trp Phe Asp LYS Ile Asn Phe Ala Ilie Gly Lau 59 593 Asn Lys Leu Giu Ser Val Phe Thr ser Pro Asp Trp Pro Thr Lau Thr 608 609 Thr Ilie Lys Asn Phe Ser Lys Ilie Ala Asp Asn Arg Lys Phe Ty'r Gin 624 625 Glu Ilie Asn Ala Giu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg rlr 640 641 Ser Thr Gin Thr Phe Gly Lau Thr Ser Gly Ala Thr T1yr Ser Thr Thr '556 657 Tyr Thr Leu Ser Giu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tlyr 672 673 Leu Gin Vai Cys Leu Asn Vai Val Trp Asp His 'lIyr Asp Arg Pro Ser 688 689 Gly Lys Lys Gly Ala Tlyr Ser Trp Val Ser Lys Trp Phe Asn ValI Tyr 704 705 Val Ala Leu Gin Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Laeu Val 720 721 Ser Arg Tyr Asp Ser Lys Arg Giy Leu Val Gin Tyr Leu Asp Phe Trp 736 737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Vai Arg 752 753 Thr Leu Ilie Giu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp 71,r 768 769 Thr Leu Gin Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784 785 Lys Ser Giu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tlyr Phe Trp 800 801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816 817 Glu Gin Gin Phe Ser Pro Ala Gin Lys Ser Leu His Tyr Ile Phe Asp 832 833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 348 849 Arg Pro Leu Val Giu Gly Asn Ser Asp Leu Ser Arq His Leu Asp Asp 864 865 Ser Ile Asp Pro Asp Thr Gin Ala Tyr Ala His Pro Val Ile Tyr Gin 880 881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn 
Leu Ile Ala Gin Gly Asp 896 897 Met Trp Tyr Arg Gin Leu Thr Arg Asp Gly Leu Thr Gin Ala Arg Val 912 913 Tyr Tyr Asn Leu Ala Ala Giu Leu Leu Gly Pro Arg Pro Asp Val Ser 928 929 Leu Ser Ser Ilie Trp Thr Pro Gin Thr Leu Asp Thr Leu Ala Ala Gly 944 945 Gin Lys Ala Val Leu Arg Asp Phe Glu His Gin Leu Ala Asn Ser Asp 9-SO 961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val 5cr Tyr Leu Lys Leu 9 76 977 Ala Asp Asn Gly Tlyr Phe Asn Glu Pro Lau Asn Val Leu Met Leu Ser 992 993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Lau 1003 1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val1 1024 Asp Pro Val Ala Leu Leu Ala Gin Arg Ala Gin Ser G ly Thr Leu Thr 1640 -247- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 9/1 432PCT/US96/18003 1041 1057 1073 1089 1105 1121 1137 1153 1169 1185 1201 1217 1233 1249 1265 1281 1297 1313 1329 1345 1361 1377 1393 1409 1425 1441 1457 1473 1489 1505 1521 1537 1553 sn Gly Ala Met Gly Gin Giu Glu Thr Leu Leu Leu Thr Leu Thr Gin Thr Ala Asp Gly Leu Met Thr Thr Gin Gin Leu Ala Ala Gin Phe Thr Leu Tyr Gin Ala Gin Thr Thr Leu His Giu Leu Gly Phe Pro Leu Arg Pro Tyr Leu Leu Gly Lys Gin Gin Leu Arg Val Ser Asp Asp Asn Met Gly Ala ValI Leu As n Leu Gin Ala Tyr Thr Ser Gly Ala Giu Ala ValI Gin Gin Tyr Cys Gly Gin Arg Asp Leu Gin Gin Ala Giu Val Leu Lys Lys Asp S er Ser Pro Leu Ala Gin Ser Gin Ser Gly Ser Ala As n Gin Arg ValI Ala Gin Trp Thr Leu Arg Asp Ser Ile Asn Ala G ly Ala Giu Trp Thr Asp Phe Leu Ser Leu Leu Asp Ala Ser Leu Val1 Gly Thr Arg Arg Arg Arg Glu Val Asp Ala Lys Ala Ala Gin Ile Arg Thr Met Leu Tftjr Gin Trp Tyr Asp Ala Val Tyr Giu Leu Gly Asn Asp His Tyr Leu His Gin Met Asn Val Ile Arg Phe Gly Lys Leu Lys Leu Phe Asp Thr Val Ser Val Lys Ala Thr Leu Ile Asn Gly Val Ala Thr His Ile Ser Ser Gly Ile Glu Arg Tyr Leu Leu Asn Phe Pro Lys Ala Asp Glu Leu Val Gin Val Asn Gin Val Lys -248- ValI Val1 Arg Asp Lau Gin Ala Ser Pro Thr ValI Glu Leu Thr Leu Leu Val1 Asp Arg Glu Thr Lys As n Thr Thr Lys Val1 As n Ser Arg Met His Pro Pro 
T1yr Arg Phe Glv Thr Leu Thr Ser Ser Glu Arg Ala Cys Met Ser Ser Tyr Ala Ala Ala Asp Arg Lau Arg His Asp His '1'yr Glu Gin Leu Val Met Ser Ser Thr Gly Val Asn Ile Phe Gly Leu Giu Ala Ile Ala Ile Val Ala Glu Arg Leu Giu Trp Gin Ile Gin Gin Lys Gin Leu Asp Ser Leu Gin Gin Ala Thr Tyr Leu Thr Thr Ser Gly Gin Leu Ser Ala Leu Cys Leu Ser Tyr Ala Thr Thr Phe Gly Leu Gin Vai Gly Ala Ala Tyr Leu Val Val Ser Leu Lys Ser Thr Glu Gly Lys Val Asp Tyr Pro Gly His Leu Pro Thr Leu Val Gin Thr Ser Ser Ser Arg Leu Asn Asp Pro Thr Asn Leu Arg Ala Asp Ala Gly Ser Phe Phe Glu Gly Thr Gly Ser Val Asp Giu His Gin Ala Ala Leu Leu Tyr Thr Ala Cys Asp Phe Gin Ile Ala r Asp Gln Ala Gly Ala Ty r Ala Ly s Arg Ala Ala Ile Giu Arg Leu Asp G Ile Thr Ser Glu Ala Ile Ala G ly 1 05 ,6 1-072 1088 1104 1120 1136 1152 1168 1184 1200 1216 1232 1248 1264 1280 1296 1312 1328 1344 1360 1376 1392 1408 1424 1440 1456 1472 1488 1504 1520 1536 1552 Lys Thr Leu Ser 1565 SUBSTITUTE SHEET (RULE 26) WO 97/1 7432 PTU9/80 PCT/US96/18003 INFORMATION FOR SEQ !D 1.O:60 SEQUENCE CHARACTERISTICS: LENGTH: 3132 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) 193 241 81 289 97 337 113 385 129 433 145 481 161 529 177 577 193 (xi)
ATG
Met
GTG
Val
ATT
Ile
OAT
Asp
GCA
Ala
GAT
Asp
ACT
Thr
GCG
Ala
GGT
Gly
AAA
Lys
GAG
Glu
GTG
Val
TCT
Ser
TTA
Leu
GTA
Val
GCC
Ala
AAG
Lys
CTG
Leu
GTT
Val
ACC
Thr
CGC
Arg
OTO
Val
TAT
Tyr
ACC
Thr
CAC
His
GAT
Asp
ATC
Ile
COT
Arg
CAG
Gin 0CC Ala
OCA
Ala
GT
Gly
TTO
Leu
ACA
Thr
AAC
Asn
COO
Arg
CAA
Gin KAT COC Asn Arg 000 000 Gly Gly GOA CAC Gly His OCT OAT Ala Asp GOT CAT Oly His TTO AAT Leu Asn OTT COT Val Arg TTA TCT Leu Ser GAG COC Giu Arg CTC TCC Leu Ser TTO ATO Leu Met TTO CTO Leu Leu
GT
Gly
OAT
Asp
CTO
Leu
AAC
Asn 0CC Ala
OAT
Asp
CAG
Gln
OTO
ValI Phe
GOT
Gly
ACT
Ser
CTO
Leu
ACT
Thr
AAC
Asn
TCA
Ser
CTO
Leu
ATT
Ile
ACC
Thr
AGC
Ser
ATC
Ile
CTG
Leu
CAG
Gln
TCC
Ser
GAC
Asp
TAC
Tyr
GTA
Val
CGG
Arg
GAA
Glu
COT
Arg
GAG
Glu
TGG
Trp
TOT
Cys
TCA
Ser
ATT
Ile
ACC
Thr
AOT
Ser
AAO
Lys
ACA
Thr
GOT
Gly
COC
Arg
CAA
Gln
OCT
Ala
ATA
Ile
CTG
Leu
CAG
Gln
COT
Arg
COC
Arg
ATT
Ile
CCT
Pro
GAG
Glu
COT
Arg
TAT
Tyr
OTT
Val1 000 Gly
COC
Arg
OCO
Ala
GAO
Glu
OAT
Asp
OTC
Val
GAC
Asp
AAT
Asn
ACT
Ser
TCO
Ser
OAA
Glu
TTC
Phe
AAT
Asn
CAC
His
GGC
Gly
OCT
Ala
ATT
Ile
ACC
Thr
CCA
Pro Phe
OTC
Val
OTA
Val
GGC
Gly
AAC
Asn
ACA
Thr
TAC
Tyr 0CC Ala
AAC
Asn
GGT
Gly
COT
Arg
COC
Arg
OTC
Val
OAT
Asp
ATO
Met
AAC
Asn
CAA
Gln
ACC
Thr
GAC
Asp
ATO
Met
TG
Trp SEQUENCE DESCRIPTION: SEQ ID NO:60 (cccC) ACT CCO TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA OTC AGC Ser Pro Ser Giu Thr Thr Leu Ty 2 r Thr Gin Thr Pro Thr Val q.,r OCO GAA GO Aia Giu Gly -249- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/18003 625 GAC GAA ACT GTC TGG CAG GGA ATG CTG GCA AGT GAG GTC TAT ACG ACA 209 Asp Glu Thr Val Trp Gin Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 673 225 721 241 769 257 817 273 865 289 913 305 961 321 1009 337 1057 40353 1105 369 1153 385 1201 401 1249 417 1297 433 1345 449 1393 CAA AGT ACC Gin Ser Thr AAA GGC Lys Gly GGG AGT Gly Ser TCC CTG Ser Leu AAC GGC Asn Gly ATA GGT Ile Gly GTA TTG Val Leu AGT ATC Ser Ile GTG GAG Val Glu AGT GCG Ser Ala CTT CCC Leu Pro AAT TAC Asn Tyr ATC CGA Ile Arg ACC GTT Thr Val GAT CCA Asp Pro ATG TTA
AAT
Asn
TGG
Trp
AGC
Ser
GTG
Val
ATC
Ile
CAG
Gln
CAT
His
CCG
Pro
ACA
Thr
TCA
Ser
CTT
Leu
CAC
His
TCA
Ser
ACC
Thr
ATA
ACT
Thr
ATT
Ile
TTG
Leu
TGG
Trp
GTT
Val
ACC
Thr
GAT
Asp
AAT
Asn
GAG
Glu
GGG
Gly
CCC
Pro
CGT
Arg
AGT
Ser
AGC
Ser
CGA
Arg
CCG
AAT
Asn
CAG
Gln
ACG
Thr
TCA
Ser
ACG
Thr
ACC
Thr
CTA
Leu
GAT
Asp
AAT
Asn
CGT
Arg
GTT
Val
ACC
Thr
TCA
Ser
CGC
Arg
GTG
Val
GGG
ATC
Ile
CTG
Leu
AAA
Lys
GCA
Ala
TAC
Tyr
CGT
Arg
TAT
Tyr
GAA
Glu
TAT
Tyr
ATG
Met
CCT
Pro
ACT
Thr
GCG
Ala
AAC
Asn
GCG
Ala
AAT
GGG
Gly
GCT
Ala
GGC
Gly
GGT
Gly
ACT
Ser
GCC
Ala
AAG
Lys
GCT
Ala
GTT
Val
GCT
Ala
GTT
Val
TAT
Tyr
ACT
Thr
CGG
Arg
CTA
Leu
CTG
GCT
Ala
TAT
Tyr
CAG
Gln
CAT
His
TAT
Tyr
GAA
Glu
TAT
Tyr
ACC
Thr
TAT
Tyr
AAT
Asn
CCT
Pro
GAC
Asp
CAA
Gln
GCG
Ala
TTT
Phe
GAT
TTA
Leu
GAC
Asp
ACT
Ser
AAA
Lys
GAG
Glu GG43 Gly
GAT
Asp
CGC
Arg
GAT
Asp
ATC
Ile
ACT
Thr
CGT
Arg
AAT
Asn
GTA
Val
GAT
Asp
TGG
CTC
Leu
ATT
Ile
GAA
Glu
TTG
Leu
CCG
Pro
AGT
Ser
CCG
Pro
TTT
Phe
TCT
Ser
GGT
Gly
GAC
Asp
GGC
Gly
AGT
Ser
TTG
Leu
TCC
Ser
AAT
ACC
Thr
GCC
Ala
CAG
Gln
CGT
Arg
GAA
Glu
CAA
Gln
GTG
Val
TGG
Trp
CTC
Leu
CAG
Gln
GAC
Asp
GGT
Gly
TAC
Tyr
AGT
Ser
GGC
Gly
ATT
CAA
Gln
GGT
Gly
GTG
Val
GAA
Glu
ACT
Thr
TCA
Ser
GGG
Gly
CGT
Arg
TAT
Tyr
CAA
Gln
ACC
Ser
AAT
Asn
ACC
Thr
ACA
Thr
GGT
Gly
CGG
720 240 1- 3 256 816 272 864 288 912 304 960 320 1008 336 1056 352 1104 368 1152 384 1200 400 1243 416 1296 432 1344 448 1392 464 1440 -250- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCTIUS96/1 8003 1441 431 1489 497 1537 513 1585 529 1633 545 1681 561 1729 577 1777 593 1825 609 1873 625 1921 641 1969 657 2017 673 2065 689 2113 705 2161 '72 1 I et Laeu CAA CGA Gin Arg TAT CC I-jr Arg CAG ACG GIn Thr TTA GAG Leu Glu CAG, GTG GIn Val CAC TGG His Trp TAC AGC Tyr Ser GAA GGG Glu Gly GCG ATA Ala Ile CGT TAC Arg Tyr TAC CGT Tyr Arg GC GGA Ala Gly CCC ATC Pro Ilie AA-T CGA Asn Arg GAG GCA Glu Gly ATT GCC Ile Ala Ilie
GTC
Val1
TAT
GCC
Gly
CTA
Lau Ile
GAA
Giu
TAC
Tyr
CAG
Gin
TGG
Trp
TCC
Ser
TAT
Tyr
ACC
Thr
ACA
Thr
AAT
As n
ATG
Met
GCC
Gly Pro
ACA
Thr
AC
Ser
AAC
Asn
CCC
Arg
ACG
Thr
ACT
Ser
CAT
Asp
ATT
Ilie
C
Ala
COT
C ly
TAT
Tyr
CTC
ValI
TTG
Leu
ACA
Thr
TCC
Ser
CCC
C ly Cly Pro
ACT
Ser
ACT
Ser
ACA
Thr
GTA
ValI
GGT
Gly
AAT
Asn
CTC
Leu
CC
Ala
AAA
Lys
CAA
Gin
CAT
Asp
ACT
Thr
TTI'
Phe
CC
Alda Ile -in
GTG
Val
CAT
Asp
ACT
Thr
ACT
Thr
GGT
Cly
AAG
Ly s
CTG
Leu
ACT
Ser
ACA
Arg
GAG
Clu
CCT
Pro
CCC
C ly
CAC
Asp
TGG
Trp
TCA
Ser
CC
Ala As n
AC
Ser
GC
Cly
CAA
Gin
CCC
C ly
CAA
Ciu
CCG
Pro
CTT
Leu
CAC
Gin
AAT
Asn
CCC
Arg
TG
Trp
CTG
Leu
CAT
His Phe
ATG
Met
ATT
Ile Leu
CGT
Arg
ATG
Met
GTA
Val1
GTT
Val
CC
Ala
ACA
Thr
GC
C iy
CAA
Giu
CAG
Gin
CAT
Asp
GTG
ValI
AAT
As n
CAC
Asp
GCT
Ala
AGA
Arg
GCC
Cly Asp
GAA
Glu
CCC
Arg
CAA
Gin
GCA
Ala
CCT
C ly
CAT
Asp
TCC
Ser
GAG
Glu
ACA
Thr
CC
Ala
GGT
C ly
TTC
Leu
GGA
Gly
TCA
Ser
CCC
Arg Trp
AAT
As n
CTG
Leu
CCC
Arg
CAT
Asp
CC
Arg
ATT
Ile
AGC
Ser
TAT
Tyr
GAA
Ciu
ACT
Thr
CGA
Arg
TAC
Tyr
TTA
Lau
,TTT
Phe
GCA
Gly As n
AC
Ser
CTA
Leu
GTC
ValI
AAA
Lys
GCA
Ala
CAC
Asp
CAC
Gin
TAT
T17yr
CC
Ala
GGA
C ly
TGG
Trp
CGA
Arg
CCA
Aia
TTC
Leu
CAA
Gin Ile
ACT
Ser Lys
ACT
Thr
ACA
Thr
CAC
Gin
AAC
As n Leu
CCC
Pro
AC
Ser Leu
TTC
Leu
ATC
Met
CCC
Pro Phe
AAA
Lys Arg
GAC
Asp
GTG
Val1
TAT
Ty r
ACC
Thr
GTA
Val1
AAT
Asn
CAA
Giu
TAT
Tyr
TAC
Tyr
TAT
Tyr
ACT
Ser
GTG
Val
TCT
Ser
CGT
Arg
ATT
Ile Gly
ACT
Ser
ACT
Ser
CTC
Leu
GAA
Ciu
AGG
Arg
CAG
Gin
CTG
Lau
GCC
Gly
A.AA
Lys
TAT
Tyr
GCT
Ala Arg
CCA
Pro
AAA
Lys
GC
C ly Glu
GAA
G lu
CCC
Pro G AT Asp
CTA
ValI
GTG
ValI
CAT
Asp
CCT
Gly Phe
TAC
Tyr
CAT
Asp
A.AT
As n
AAXT
Asn
CCT
Pro
AGA
Arg
TG
Tr p
CAG
G In
GGA
Cly
TTO
Lau
TTG
Lau
CGC;
Arg
AGC
Ser
:ACG
Thr
ATT
Ile
GC
Gly Pro .As n
AGA
Arg p Gcc Al1a 1433 496 512 1584 528 1632 544 1680 560 1728 576 it776 592 1824 608 1872 6 24 1920 640 i968 656 2)016 6-72 2064 88 2112 '04 2160 2 -U 2 2 03 GGT CTT CC CCT ACC ATT CCC GOT Gly Leu Ala Ala Thr Ile Ala 1la 251- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/18003 2209 737 2257 2305 769 2353 785 2401 801 2449 817 2497 833 2545 849 2593 865 2641 881 2689 897 2737 913 2785 929 2833 945 2881 961 2929 977 ACG OCT CCC GCG GCT !TC CCC CTC ATT CTC COG OTT C GCC GTA C Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val CC GOG GGC GC TTC ATC Ala Gly Ile Cly Ali GGC GCC GCA TTA Lys Gly Gly CTA CAG TCG Val Gin Ser TAT CCC OCA Tyr Gly Ala GTA ACA GGC Val Thr Gly CCC GGC GCT Cly Cly Ala TTA GCC ACT Leu Cly Thr GC C C Gly Ala Ala GGT ATC CAT Cly Ile His GGT TTA CAT Oly Leu Asp GTC OCT TAT4 ATC COT OCT Met Cly Gly TAT CCC CC Tyr Ala Ala CCT CTC T'T Pro Val Phe ATT CCA ACTC Ile Cly ThrC Ala~
GCG
Ala
CCC
Arg
CCT
Ala
ATT
Ilie
GCC
Ala
OCT
dly
CCC
Ala
GTC
Val1
GCC
Ala
GGA
Gly
GT
,ly
;AG
;Mu
;GC
My kGT Leu
CCT
Ala
GCA
Ala
GTG
ValI Gly
TCT
Ser
GCG
Cly
OCT
Cly
OCT
Ala
CCT
Ala Phe
TTA
Leu
CCC
Pro
CTC
Leu CCT C Let
CT
Let
GCC
Gl
CAP
Cly
CC
Ala
ACC
Thr
ATC
Met
ATT
Ile
ACT
Ser
GCT
Gly 1TCG Leu
GCC
A.la kTA Ile
AC
iis cc ui Met r CCT Ala
CC
Ala
OCT
Cly
TCA
Ser
CCC
Gly
CTT
Leu
ATC
Ile
CC
Cly
AAC
As n Leu
ACT
Ser
AGA
Arg I-T T Phe AGA C Arg OCT
CCA
C ly
CCA
Arg
CCT
Ala
CTC
ValI
TG
Trp
ACT
Ser
ACC
Thr
ACC
Thr
ACC
Thr
C
Pro
GGT
Gly kGG krg
:AA~
Min kGCT 3er
;TGI
Pal D~
~CTC
TAT
CTC
Leu
CC
Ala
COT
Cly
ATA
Ile
CC
Ala
CAT
His
OCT
Cly
TAT
Tyr
GCC
Ala
GCT
:TC
Leu
TTA
.eu
~TC
let C CGT IZ
AAC
Asn
CTA
Val1 Cly Val
AAT
Asn
GTA
ValI
CAA
Ciu
ACC
Thr
TAT
Tyr
GA
Gly
GAA
Glu.
I'TA
Leu
GTA
ValI
CTC
Val
CAC
Gin
CC
Ala
CCA
Ala
AAT
Asn
GC
Cly
CTC
ValI
CAA
GIn
GC
Cly
CAT
His
ATO
Met
GC
Cly
CAT
His
CCT
Cly
COG
Cly
ACT
Ser
TCA
Ser
OCT
Ala
ACC
Thr
GCC
C ly
CCC
Cly
TCC
Ser
TTA
Leu
GCT
Ala
CC
PArg
TTC
Phe
AAP
Lys
TCA
Ser
CCC
Ala
CAT
Asp
ATT
Ile
CCA
Ala
ACT
Ser
TCC
Trp
GCG
Ala
GTC
ValI
GTT
ValI
%GT
Ser
TCC
Ser
CC
*Ala
CC
*Ala
*CCC
Arg
CAT
Asp
CC
Ala
ACT
Thr
ATT
Ile
MAT
As n
AAC
As n
CTC
ValI
CTC
ValI
CTC
Val
TCC
Trp
CCT
ACC CTG CTC GMA Ser Lau Leu Glu
ACG
Thr
C
Ala
CCC
Cly
CCC
Cly
ACT
Thr
CC
Ala
CCC
Arg
CCT
Cly
TAC
Tyr
AGA
Arg
AC
Sec-r
CC
Ala
GCT
Gly
ATT
Ile
C
TTA
Lau
OCT
Ala
C
Ala
AT?
Ile
ATG
Met
COT
Cly
CCA
Ala
TTT
Phe
OCA
Ala
ATA
Ile
CCA
Pro
AGA
Arg
OCT
Cly
TCC
Ser
ATG
7 52 2304 768 2352 784 2400 800 2443 316 2496 832 2544 848 2592 864 2'64 0 880 2688 896 2736 912 2784 928 2832 944 2830 960 2928 976 2 97 6 :TC CCC CCC CTT Aeu Oly Cly Leu 'GA ACA GAO ACT ly Arg Clu Ser .TA CAT CAT CTC 2977 AGA CCC TTA -252- SUBSTITUTE SHEET (RULE 26) WO 97/1 7432PCIS6l80 PCT/US96/18003 393 Arg Ala Leu Saar Al.a Ala G-ly Ser 3y Ile Asp His 7dl Ala GIy. Met 3025 ATT GGT A.AT CAG ATC AGA GGC GTC TTG ACC ACA ACC GGG ATC GCT 1009 Ile Gly Asn Gin Ilie Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 10-4 3073 AAT GCG ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC GCA CGA CGA GTT 3 120 25 Asn Ala Ile Asp Trlr Gly Thr Ser Ala Val Gly Ala Ala Arg !-rg Val 1040 3 12 1 TTT TCT TTG TAA 3132 1041 Phe ser Leu End 1043 INFORMATION FOR SEQ ID NO:6i SEQUENCE CHARACTERISTICS: LENGTH: 1043 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61 (TccC peptide) 1 Met Ser Pro Ser Giu Thr Thr Leu Tyr Thr Gin Thr Pro Thr Val Ser 16 17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32 33 Ile Val Ilie Gly Giy Asp Thr Asp Thr Arg Val Thr Arg His Gin Tyr 43 49 Asp Ala Arg Gly His Leu Asn rjr Ser Ile Asp Pro Arg Leu Trir Asp 64 65 Ala Lys Gin Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gin His 81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96 97 Thr Val Ala Leu Asn Asp Ile Giu Gly Arg Ser Val Met Thr Met Asn 112 113 Ala Thr Gly Val Arg Gin Thr Arg Arg Tyr GiU Gly Asn Thr Leu Pro 128 129 Gly Arg Leu Leu Ser Val Ser Giu Gin Val Phe Asn Gin Giu Ser Ala 144 145 Lys Val Thr Giu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Giu Lys 160 161 Glu Tyr Asn Leu Ser Gly Leu cys Ile Arg His Tyr Asp Thr Ala Gly 1-16 177 Val Thr Arg Leu Met Ser Gin Ser Leu Ala Gly Ala Met Leu Ser Gin 192 193 Ser His Gin Leu Leu Ala Giu Giy Gin Giu Ala Asn Trp Ser Gly Asp 2 03 209 Asp Giu Thr Val Trp Gin Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224 225 Gln Ser Thr Thr Asn Ala Ile Gly Aia Leu Leu Thr Gin Thr Asp Ala 240 241 Lys Gly Asn 
Ilie Gin Arg Leu Ala Ty r Asp Ile Ala Gly Gin Leu Lys 256 257 Gly Ser Trp Leu Thr Val Lys Gly Gin Ser Glu Gin Val Ile Val Lys 272 2713 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Lau Arg Giu Giu His Gly 23 289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Giu Pro Giu Thr Gln Arg Leu 304 305 Ile Gly Ile Thr Thr Arg Arg Ala Giu Gly Ser Gin Ser Gly Ala Arg 3120 -253- SUBSTITUTE SHEET (RULE 26) WO 97/17432 PCT/US96/1 8003 3211 *.al Lau -,3n Asp Leu A-rg T'rr Lys 71Ir Asp Pro Val lyAsn val I1f 337 Ser Ile His Asn Asp Ala Giu Ala Thr Arg Phe Trp Arg Asn Gin Lys 353 Val Giu Pro Giu Asn Arg 7T 1 ,r Val Tyjr Asp Ser Leu 'Ilyr Gin Lau Met 368 369 Ser Ala Thr Gly Arg Giu Met Ala Asn Ile Gly Gin Gin Ser Asn Gin 384 395 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr 7TYr Thr 400 401 Asn r1yr Leu Arg Thr '1'yr Thr Tyr Asp Arg Giy Gly Asn Leu Val Gin 416 417 Ile Arg His Ser Ser Pro Ala Thr Gin Asn Ser Tyr Thr Thr Asp Ile 432 433 Thr Val Ser Ser Arg Ser Asn Arg Aia Val Lau Ser Thr Leu Thr Thr 443 449 Asp Pro Thr Arg Vai Asp Ala Leu Phe Asp Ser Gly Giy His Gin Lys 464 465 Met Leu Ile Pro Giy Gin Asn Leu Asp Trp Asn Ile Arg Gly Giu Leu 480 481 Gin Arg Val Thr Pro Vai Ser Arg Giu Asn Ser Ser Asp Ser Giu Trp 49-5 497 Tyr Arg 7hyr Ser Ser Asp Gly Met Arg Leu Leu Lys Vai Ser Giu Gin 512 513 Gin Thr Gly Asn Ser Thr Gin Val Gin Arg Vai Thr Tyr Leu Pro Gly 528 529 Lau Giu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Giu Asp Leu 544 545 Gin Val Ile Thr Vai Gly Giu Ala Gly Arg Ala Gin Val Arg Val Leu 560 561 His Trp Giu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gin Val Arg 5.76 577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gin Leu Giu Leu Asp Ser 592 593 Giu Giy Gin Ile Leu Ser Gin Giu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608 609 Ala Ile Trp Ala Ala Arg Asn Gin Thr Glu Ala Ser Tyr Lys Phe Ile 624 625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Ty1r Ty r Gly 640 641 Tyr Arg Tyr Tyr Gin Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656 657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 
672 673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688 689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704 705 Giu Gly Met Ser Ala Ser Met Arg Arg Gly Gin Lys Ile Gly Arg Ala 720 721 Ilie Ala Gly Gly Ilie Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala 736 737 Thr Ala Giy Ala Ala Ilie Pro Val Ile Lau Giy Val Ala Ala Val Gly 752 753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu 76 769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gin Gly Lys Ser Thr Leu -34 785 Val Gin Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 300 801 ry r Gly Ala Arg Ala Gin Gly Val Giy Val Ala Ser Ala Ala Gly Ala 3 16: 817 Val Thr Giy Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ilie 3-,2 833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr met 4 -254- SUBSTITUTE SHEET (RULE 26) WO 97/17432 WO 97/ 7432PCT/US96/1 8003 ;4 965 881 397 313 929 5945 361 977 993 1009 1025 LEu Gly Gly G ly /Va 1 Met Ty r Pro Ile Arg Ile As n Glv Thr ser Thr Leu Thr His .3lu 'val 31y la Ala Ala Ile Leu dly Gly Ala ValI Gly Ala Gly Ala Ala His Asp Tyr Gly Ala Phe Thr Leu As n Ile Gly Ala ValI Ala Gly Cly GlU Gly Ser Gin Asp Ile 017 Asnr Leu Ser Arg Phe Arg Gly Gly Thr Thr Tyr Gly Giu Leu ValI Leu C ly Ile Leu ValI Giln Gly His Met Gly His C ly Arg Asp Thr Gly Ser Trp Ala ValI Val1 Ser Leu Ser Val1 Thr Ala 912 923 944 960 976 992 1008 1024 1040 1041 Phe Ser Leu 1043 -255- SUBSTITUTE SHEET (RULE
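The listings above pair each DNA codon with the three-letter residue it encodes and mark the stop codon as "End". The following is a minimal sketch of that pairing, assuming the standard genetic code (the listing does not name a translation table). The function names `translate` and `residue_count` are hypothetical and introduced only for illustration. The example input is the closing codon run of SEQ ID NO:60 as printed in the listing (TTT TCT TTG TAA), and the length check uses the stated 3132 base pairs and 1043 amino acids.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order
# (assumption: the listing uses the standard code; it does not say so).
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}

# One-letter to three-letter codes; the listing writes the stop codon as "End".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


def residue_count(n_bases):
    """Residues encoded by an open reading frame that ends in one stop codon."""
    return n_bases // 3 - 1


if __name__ == "__main__":
    print(translate("TTT TCT TTG TAA"))  # ['Phe', 'Ser', 'Leu', 'End']
    print(residue_count(3132))           # 1043
```

A check of this kind is mainly useful for spotting extraction damage: a codon whose translation disagrees with the printed residue, or a stated length that does not equal one third of the base count minus the stop codon, points to a garbled character rather than to a feature of the sequence.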
Claims (13)
- 2. The method of claim 1 wherein said toxin is sprayed on said plant.
- 3. The method of claim 1 wherein said plant is a transgenic plant that has been engineered to express a gene encoding said toxin.
- 4. The method of claim 3 wherein said gene has a DNA sequence selected from the group consisting of SEQ ID NO:11, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58 and SEQ ID
- 5. The method of claim 1 wherein said toxin is applied to soil in contact with said plant.
- 6. The method of claim 1 wherein said toxin comprises a polypeptide with an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, and SEQ ID NO:43.
- 7. The method of claim 1 wherein said toxin has an amino acid sequence selected from the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.
- 8. The method of claim 7 wherein said toxin has an amino acid sequence of SEQ ID NO:12.
- 9. The purified protein comprising an amino acid sequence of SEQ ID NO:12.
- 10. An insect bait comprising, as active ingredient, a Photorhabdus toxin having oral insecticidal activity in combination with a conventional bait matrix.
- 11. The insect bait of claim 10, wherein said toxin comprises a polypeptide with an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, and SEQ ID NO:43.
- 12. The insect bait of claim 10, wherein said toxin has an amino acid sequence selected from the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.
- 13. The insect bait of claim 10, wherein said toxin has an amino acid sequence of SEQ ID NO:12.
- 14. A method of protecting a plant from an insect which comprises incorporating with the plant an effective amount of a Photorhabdus protein toxin having oral insecticidal activity, substantially as hereinbefore described with reference to any one of the examples.
- 15. An insect bait comprising, as active ingredient, a Photorhabdus toxin having oral insecticidal activity in combination with a conventional bait matrix, substantially as hereinbefore described with reference to any one of the examples.

Dated 9 November, 2000. Wisconsin Alumni Research Foundation. Patent Attorneys for the Applicant/Nominated Person: SPRUSON & FERGUSON
Applications Claiming Priority (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US725595P | 1995-11-06 | 1995-11-06 | |
| US60/007255 | 1995-11-06 | ||
| US60842396A | 1996-02-28 | 1996-02-28 | |
| US08/608423 | 1996-02-28 | ||
| US08/705484 | 1996-08-28 | ||
| US70548496A | 1996-08-29 | 1996-08-29 | |
| PCT/US1996/018003 WO1997017432A1 (en) | 1995-11-06 | 1996-11-06 | Insecticidal protein toxins from photorhabdus |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| AU1050997A AU1050997A (en) | 1997-05-29 |
| AU729228B2 AU729228B2 (en) | 2001-01-25 |
Family
ID=27358315
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| AU10509/97A Ceased AU729228B2 (en) | 1995-11-06 | 1996-11-06 | Insecticidal protein toxins from photorhabdus |
Country Status (13)
| Country | Link |
|---|---|
| EP (1) | EP0797659A4 (en) |
| JP (2) | JP3482214B2 (en) |
| KR (1) | KR100354530B1 (en) |
| AU (1) | AU729228B2 (en) |
| BR (1) | BR9606889A (en) |
| CA (1) | CA2209659C (en) |
| HU (1) | HUP9900768A3 (en) |
| IL (1) | IL121243A (en) |
| MX (1) | MX9705101A (en) |
| PL (1) | PL186242B1 (en) |
| RO (1) | RO121280B1 (en) |
| SK (1) | SK93197A3 (en) |
| WO (1) | WO1997017432A1 (en) |
Families Citing this family (173)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB9618083D0 (en) * | 1996-08-29 | 1996-10-09 | Mini Agriculture & Fisheries | Pesticidal agents |
| ES2286850T3 (en) * | 1997-05-05 | 2007-12-01 | Dow Agrosciences Llc | XENORHABDUS INSECTICIATED PROTEIN TOXINS. |
| AU7974198A (en) * | 1997-06-20 | 1999-01-04 | Mycogen Corporation | Method to identify pesticidal microbes |
| AUPO808897A0 (en) | 1997-07-17 | 1997-08-14 | Commonwealth Scientific And Industrial Research Organisation | Toxin genes from the bacteria xenorhabdus nematophilus and photohabdus luminescens |
| JP2002504336A (en) * | 1998-02-20 | 2002-02-12 | ノバルティス アクチエンゲゼルシャフト | Insecticidal toxins from Photolabdus |
| US6281413B1 (en) | 1998-02-20 | 2001-08-28 | Syngenta Participations Ag | Insecticidal toxins from Photorhabdus luminescens and nucleic acid sequences coding therefor |
| US6174860B1 (en) * | 1999-04-16 | 2001-01-16 | Novartis Ag | Insecticidal toxins and nucleic acid sequences coding therefor |
| GB9901499D0 (en) * | 1999-01-22 | 1999-03-17 | Horticulture Res Int | Biological control |
| AUPP911399A0 (en) * | 1999-03-10 | 1999-04-01 | Commonwealth Scientific And Industrial Research Organisation | Plants and feed baits for controlling damage |
| EP1069134A1 (en) * | 1999-07-15 | 2001-01-17 | Wisconsin Alumni Research Foundation | Photorhabdus luminescens strains |
| WO2001016305A2 (en) * | 1999-09-02 | 2001-03-08 | Agresearch Limited | Nucleotide sequences encoding an insectidal protein complex from serratia |
| FR2803592A1 (en) | 2000-01-06 | 2001-07-13 | Aventis Cropscience Sa | NOVEL DERIVATIVES OF 3-HYDROXYPICOLINIC ACID, PROCESS FOR THEIR PREPARATION AND FUNGICIDAL COMPOSITIONS CONTAINING SAME |
| US8440880B2 (en) | 2000-06-30 | 2013-05-14 | Monsanto Technology Llc | Xenorhabdus sp. genome sequences and uses thereof |
| FR2815969B1 (en) | 2000-10-30 | 2004-12-10 | Aventis Cropscience Sa | TOLERANT PLANTS WITH HERBICIDES BY METABOLIC BYPASS |
| JP2005536198A (en) * | 2002-06-28 | 2005-12-02 | ダウ アグロサイエンス リミテッド ライアビリティー カンパニー | Insecticidal proteins and polynucleotides derived from Penibacillus sp. |
| WO2004067727A2 (en) | 2003-01-21 | 2004-08-12 | Dow Agrosciences Llc | Mixing and matching tc proteins for pest control |
| US7071386B2 (en) | 2003-01-21 | 2006-07-04 | Dow Agrosciences Llc | Xenorhabdus TC gene for pest control |
| US7319142B1 (en) | 2004-08-31 | 2008-01-15 | Monsanto Technology Llc | Nucleotide and amino acid sequences from Xenorhabdus and uses thereof |
| CN101903524A (en) * | 2007-08-31 | 2010-12-01 | 巴斯夫植物科学有限公司 | Pathogen control genes and methods of use in plants |
| EP2504442B1 (en) | 2009-11-24 | 2014-07-16 | Katholieke Universiteit Leuven, K.U. Leuven R&D | Banana promoters |
| ES2659086T3 (en) | 2009-12-23 | 2018-03-13 | Bayer Intellectual Property Gmbh | HPPD-inhibiting herbicide-tolerant plants |
| EA201290559A1 (en) | 2009-12-23 | 2013-01-30 | Байер Интеллектуэль Проперти Гмбх | PLANTS RESISTANT TO HERBICIDES - HPPD INHIBITORS |
| AR079883A1 (en) | 2009-12-23 | 2012-02-29 | Bayer Cropscience Ag | PLANTS TOLERANT TO HPPD INHIBITOR HERBICIDES |
| ES2659085T3 (en) | 2009-12-23 | 2018-03-13 | Bayer Intellectual Property Gmbh | HPPD Inhibitor Herbicide Tolerant Plants |
| WO2011076877A1 (en) | 2009-12-23 | 2011-06-30 | Bayer Cropscience Ag | Plants tolerant to hppd inhibitor herbicides |
| AR080105A1 (en) | 2010-02-02 | 2012-03-14 | Bayer Cropscience Ag | SOYBEAN TRANSFORMATION USING HYDROXYPHENYLPYRUVATE DIOXYGENASE (HPPD) INHIBITORS AS SELECTION AGENTS |
| EP2618668B1 (en) | 2010-09-20 | 2016-09-14 | Wisconsin Alumni Research Foundation | Mosquitocidal xenorhabdus, lipopeptide and methods |
| EP2669371A1 (en) | 2010-11-10 | 2013-12-04 | Bayer CropScience AG | HPPD variants and methods of use |
| MX2013010908A (en) | 2011-03-25 | 2013-10-07 | Bayer Ip Gmbh | Use of n-(tetrazol-4-yl)- or n-(triazol-3-yl)arylcarboxamides or their salts for controlling unwanted plants in areas of transgenic crop plants being tolerant to hppd inhibitor herbicides. |
| CA2830802A1 (en) | 2011-03-25 | 2012-10-04 | Bayer Intellectual Property Gmbh | Use of n-(1,2,5-oxadiazol-3-yl)benzamides for controlling unwanted plants in areas of transgenic crop plants being tolerant to hppd inhibitor herbicides |
| KR101246707B1 (en) * | 2012-08-01 | 2013-03-25 | ㈜엠알이노베이션 | Nematocide compound containing photorhabdus temperata subsp. temperata |
| WO2014043435A1 (en) | 2012-09-14 | 2014-03-20 | Bayer Cropscience Lp | Hppd variants and methods of use |
| RU2723717C2 (en) | 2013-03-07 | 2020-06-17 | Атеникс Корп. | Toxins genes and methods of using them |
| MX2016011745A (en) | 2014-03-11 | 2017-09-01 | Bayer Cropscience Lp | Hppd variants and methods of use. |
| EA201890696A1 (en) | 2015-09-11 | 2018-09-28 | Байер Кропсайенс Акциенгезельшафт | HPPD VARIANTS AND METHODS OF USE |
| BR112019010476A2 (en) | 2016-11-23 | 2019-09-10 | BASF Agricultural Solutions Seed US LLC | recombinant nucleic acid molecule, vector, host cell, transgenic plant, transgenic seed, recombinant polypeptide, composition, method for controlling a pest population, for killing pests, for producing a polypeptide, plant or plant cell, method for protecting a plant against a pest, to increase yield on a plant, use and primary product |
| KR20190095411A (en) | 2016-12-22 | 2019-08-14 | 바스프 아그리컬쳐럴 솔루션즈 시드 유에스 엘엘씨 | Use of CR14 for the control of nematode pests |
| US11286498B2 (en) | 2017-01-18 | 2022-03-29 | BASF Agricultural Solutions Seed US LLC | Use of BP005 for the control of plant pathogens |
| CN110431234B (en) | 2017-01-18 | 2024-04-16 | 巴斯夫农业种子解决方案美国有限责任公司 | BP005 toxin gene and method of use thereof |
| BR112019018056A2 (en) | 2017-03-07 | 2020-08-11 | BASF Agricultural Solutions Seed US LLC | recombinant nucleic acid molecule, expression cassette, host cell, plants, transgenic seeds, recombinant polypeptide, methods for checking tolerance and for controlling weeds, utility product and use of the nucleotide sequence |
| US20210032651A1 (en) | 2017-10-24 | 2021-02-04 | Basf Se | Improvement of herbicide tolerance to hppd inhibitors by down-regulation of putative 4-hydroxyphenylpyruvate reductases in soybean |
| WO2019083810A1 (en) | 2017-10-24 | 2019-05-02 | Basf Se | Improvement of herbicide tolerance to 4-hydroxyphenylpyruvate dioxygenase (hppd) inhibitors by down-regulation of hppd expression in soybean |
| MX2022000950A (en) | 2019-07-22 | 2022-02-14 | Bayer Ag | 5-amino substituted pyrazoles and triazoles as pest control agents. |
| BR112022000942A2 (en) | 2019-07-23 | 2022-05-17 | Bayer Ag | Heteroaryl-triazole compounds as pesticides |
| CN118561817A (en) | 2019-07-23 | 2024-08-30 | 拜耳公司 | New heteroaryl-triazole compounds as pesticides |
| EP3701796A1 (en) | 2019-08-08 | 2020-09-02 | Bayer AG | Active compound combinations |
| WO2021058659A1 (en) | 2019-09-26 | 2021-04-01 | Bayer Aktiengesellschaft | Rnai-mediated pest control |
| CA3156302A1 (en) | 2019-10-02 | 2021-04-08 | Bayer Aktiengesellschaft | Active compound combinations comprising fatty acids |
| UY38911A (en) | 2019-10-09 | 2021-05-31 | Bayer Ag | HETEROARYL-TRIAZOLE COMPOUNDS AS PESTICIDES, FORMULATIONS, USES AND METHODS OF USE OF THEM |
| AR120176A1 (en) | 2019-10-09 | 2022-02-02 | Bayer Ag | HETEROARYL-TRIAZOLE COMPOUNDS AS PESTICIDES |
| EP4461128A3 (en) | 2019-10-14 | 2025-03-26 | BASF Agricultural Solutions US LLC | Novel insect resistant genes and methods of use |
| US12241075B2 (en) | 2019-10-14 | 2025-03-04 | Basf Agricultural Solutions Us Llc | Insect resistant genes and methods of use |
| US20220380318A1 (en) | 2019-11-07 | 2022-12-01 | Bayer Aktiengesellschaft | Substituted sulfonyl amides for controlling animal pests |
| WO2021097162A1 (en) | 2019-11-13 | 2021-05-20 | Bayer Cropscience Lp | Beneficial combinations with paenibacillus |
| WO2021099271A1 (en) | 2019-11-18 | 2021-05-27 | Bayer Aktiengesellschaft | Active compound combinations comprising fatty acids |
| TW202134226A (en) | 2019-11-18 | 2021-09-16 | 德商拜耳廠股份有限公司 | Novel heteroaryl-triazole compounds as pesticides |
| TW202136248A (en) | 2019-11-25 | 2021-10-01 | 德商拜耳廠股份有限公司 | Novel heteroaryl-triazole compounds as pesticides |
| WO2021155084A1 (en) | 2020-01-31 | 2021-08-05 | Pairwise Plants Services, Inc. | Suppression of shade avoidance response in plants |
| PY2112437A (en) | 2020-02-18 | 2022-08-16 | Bayer Ag | NEW HETEROARYL-TRIAZOLE COMPOUNDS AS PESTICIDES |
| EP3708565A1 (en) | 2020-03-04 | 2020-09-16 | Bayer AG | Pyrimidinyloxyphenylamidines and the use thereof as fungicides |
| WO2021211926A1 (en) | 2020-04-16 | 2021-10-21 | Pairwise Plants Services, Inc. | Methods for controlling meristem size for crop improvement |
| WO2021209490A1 (en) | 2020-04-16 | 2021-10-21 | Bayer Aktiengesellschaft | Cyclaminephenylaminoquinolines as fungicides |
| US20230212163A1 (en) | 2020-04-21 | 2023-07-06 | Bayer Aktiengesellschaft | 2-(het)aryl-substituted condensed heterocyclic derivatives as pest control agents |
| TWI891782B (en) | 2020-05-06 | 2025-08-01 | 德商拜耳廠股份有限公司 | Novel heteroaryl-triazole compounds as pesticides |
| EP4146628A1 (en) | 2020-05-06 | 2023-03-15 | Bayer Aktiengesellschaft | Pyridine (thio)amides as fungicidal compounds |
| WO2021228734A1 (en) | 2020-05-12 | 2021-11-18 | Bayer Aktiengesellschaft | Triazine and pyrimidine (thio)amides as fungicidal compounds |
| EP4153566A1 (en) | 2020-05-19 | 2023-03-29 | Bayer CropScience Aktiengesellschaft | Azabicyclic(thio)amides as fungicidal compounds |
| MX2022015107A (en) | 2020-06-02 | 2023-03-01 | Pairwise Plants Services Inc | METHODS TO CONTROL THE SIZE OF THE MERISTEM TO IMPROVE CROPS. |
| US12565489B2 (en) | 2020-06-04 | 2026-03-03 | Bayer Aktiengesellschaft | Heterocyclyl pyrimidines and triazines as novel fungicides |
| CA3186659A1 (en) | 2020-06-10 | 2021-12-16 | Bayer Aktiengesellschaft | Azabicyclyl-substituted heterocycles as fungicides |
| WO2021257775A1 (en) | 2020-06-17 | 2021-12-23 | Pairwise Plants Services, Inc. | Methods for controlling meristem size for crop improvement |
| CN116157017A (en) | 2020-06-18 | 2023-05-23 | 拜耳公司 | 3-(Pyridazin-4-yl)-5,6-dihydro-4H-1,2,4-oxadiazine derivatives as fungicides for crop protection |
| US20230292747A1 (en) | 2020-06-18 | 2023-09-21 | Bayer Aktiengesellschaft | Composition for use in agriculture |
| WO2021255089A1 (en) | 2020-06-19 | 2021-12-23 | Bayer Aktiengesellschaft | 1,3,4-oxadiazole pyrimidines and 1,3,4-oxadiazole pyridines as fungicides |
| UY39276A (en) | 2020-06-19 | 2022-01-31 | Bayer Ag | USE OF 1,3,4-OXADIAZOL-2-ILPYRIMIDINE COMPOUNDS TO CONTROL PHYTOPATHOGENIC MICROORGANISMS, METHODS OF USE AND COMPOSITIONS. |
| WO2021255091A1 (en) | 2020-06-19 | 2021-12-23 | Bayer Aktiengesellschaft | 1,3,4-oxadiazoles and their derivatives as fungicides |
| UY39275A (en) | 2020-06-19 | 2022-01-31 | Bayer Ag | 1,3,4-OXADIAZOLE PYRIMIDINES AS FUNGICIDES, PROCESSES AND INTERMEDIARIES FOR THEIR PREPARATION, METHODS OF USE AND USES OF THE SAME |
| EP3929189A1 (en) | 2020-06-25 | 2021-12-29 | Bayer Animal Health GmbH | Novel heteroaryl-substituted pyrazine derivatives as pesticides |
| KR20230039665A (en) | 2020-07-02 | 2023-03-21 | 바이엘 악티엔게젤샤프트 | Heterocycle derivatives as pest control agents |
| WO2022033991A1 (en) | 2020-08-13 | 2022-02-17 | Bayer Aktiengesellschaft | 5-amino substituted triazoles as pest control agents |
| WO2022053453A1 (en) | 2020-09-09 | 2022-03-17 | Bayer Aktiengesellschaft | Azole carboxamide as pest control agents |
| WO2022058327A1 (en) | 2020-09-15 | 2022-03-24 | Bayer Aktiengesellschaft | Substituted ureas and derivatives as new antifungal agents |
| EP3974414A1 (en) | 2020-09-25 | 2022-03-30 | Bayer AG | 5-amino substituted pyrazoles and triazoles as pesticides |
| EP3915971A1 (en) | 2020-12-16 | 2021-12-01 | Bayer Aktiengesellschaft | Phenyl-s(o)n-phenylamidines and the use thereof as fungicides |
| WO2022129190A1 (en) | 2020-12-18 | 2022-06-23 | Bayer Aktiengesellschaft | (hetero)aryl substituted 1,2,4-oxadiazoles as fungicides |
| WO2022129188A1 (en) | 2020-12-18 | 2022-06-23 | Bayer Aktiengesellschaft | 1,2,4-oxadiazol-3-yl pyrimidines as fungicides |
| JP2023554063A (en) | 2020-12-18 | 2023-12-26 | バイエル・アクチエンゲゼルシヤフト | Use of DHODH inhibitors to control resistant phytopathogenic fungi in crops |
| WO2022129196A1 (en) | 2020-12-18 | 2022-06-23 | Bayer Aktiengesellschaft | Heterobicycle substituted 1,2,4-oxadiazoles as fungicides |
| EP4036083A1 (en) | 2021-02-02 | 2022-08-03 | Bayer Aktiengesellschaft | 5-oxy substituted heterocycles as pesticides |
| EP4291641A1 (en) | 2021-02-11 | 2023-12-20 | Pairwise Plants Services, Inc. | Methods and compositions for modifying cytokinin oxidase levels in plants |
| US12365910B2 (en) | 2021-02-25 | 2025-07-22 | Pairwise Plants Services, Inc. | Methods and compositions for modifying root architecture in plants |
| BR112023019400A2 (en) | 2021-03-30 | 2023-12-05 | Bayer Ag | 3-(HETERO)ARYL-5-CHLORODIFLUOROMETHYL-1,2,4-OXADIAZOLE AS A FUNGICIDE |
| WO2022207496A1 (en) | 2021-03-30 | 2022-10-06 | Bayer Aktiengesellschaft | 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide |
| WO2022233777A1 (en) | 2021-05-06 | 2022-11-10 | Bayer Aktiengesellschaft | Alkylamide substituted, annulated imidazoles and use thereof as insecticides |
| EP4337661A1 (en) | 2021-05-12 | 2024-03-20 | Bayer Aktiengesellschaft | 2-(het)aryl-substituted condensed heterocycle derivatives as pest control agents |
| US20220411813A1 (en) | 2021-06-17 | 2022-12-29 | Pairwise Plants Services, Inc. | Modification of growth regulating factor family transcription factors in soybean |
| UY39827A (en) | 2021-06-24 | 2023-01-31 | Pairwise Plants Services Inc | MODIFICATION OF UBIQUITIN LIGASE E3 HECT GENES TO IMPROVE PERFORMANCE TRAITS |
| US12529063B2 (en) | 2021-07-01 | 2026-01-20 | Pairwise Plants Services, Inc. | Methods and compositions for enhancing root system development |
| US12365911B2 (en) | 2021-08-12 | 2025-07-22 | Pairwise Plants Services, Inc. | Modification of brassinosteroid receptor genes to improve yield traits |
| JP2024529148A (en) | 2021-08-13 | 2024-08-01 | バイエル、アクチエンゲゼルシャフト | Active compound combinations and antifungal compositions containing them |
| PY2270221A (en) | 2021-08-17 | 2023-03-20 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR MODIFYING HISTIDINE KINASE CYTOKININ RECEPTOR GENES IN PLANTS |
| MX2024002386A (en) | 2021-08-25 | 2024-03-14 | Bayer Ag | Novel pyrazinyl-triazole compounds as pesticides. |
| EP4395531A1 (en) | 2021-08-30 | 2024-07-10 | Pairwise Plants Services, Inc. | Modification of ubiquitin binding peptidase genes in plants for yield trait improvement |
| EP4144739A1 (en) | 2021-09-02 | 2023-03-08 | Bayer Aktiengesellschaft | Anellated pyrazoles as parasiticides |
| AR126938A1 (en) | 2021-09-02 | 2023-11-29 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS TO IMPROVE PLANT ARCHITECTURE AND PERFORMANCE TRAITS |
| CA3232804A1 (en) | 2021-09-21 | 2023-03-30 | Pairwise Plants Services, Inc. | Methods and compositions for reducing pod shatter in canola |
| US20230108968A1 (en) | 2021-10-04 | 2023-04-06 | Pairwise Plants Services, Inc. | Methods for improving floret fertility and seed yield |
| AR127300A1 (en) | 2021-10-07 | 2024-01-10 | Pairwise Plants Services Inc | METHODS TO IMPROVE FLOWER FERTILITY AND SEED YIELD |
| CN118541353A (en) | 2021-11-03 | 2024-08-23 | 拜耳公司 | Bis (hetero) aryl thioether (thio) amides as fungicidal compounds |
| WO2023099445A1 (en) | 2021-11-30 | 2023-06-08 | Bayer Aktiengesellschaft | Bis(hetero)aryl thioether oxadiazines as fungicidal compounds |
| AR127904A1 (en) | 2021-12-09 | 2024-03-06 | Pairwise Plants Services Inc | METHODS TO IMPROVE FLOWER FERTILITY AND SEED YIELD |
| UY40132A (en) | 2022-01-31 | 2023-08-31 | Pairwise Plants Services Inc | SUPPRESSION OF THE SHADE AVOIDANCE RESPONSE IN PLANTS |
| CN118973394A (en) | 2022-02-01 | 2024-11-15 | 环球化学股份有限公司 | Methods and compositions for controlling insect pests on cereals |
| CN118984651A (en) | 2022-02-01 | 2024-11-19 | 环球化学股份有限公司 | Methods and compositions for controlling insect pests on corn |
| EP4472416A1 (en) | 2022-02-01 | 2024-12-11 | Globachem NV | Methods and compositions for controlling pests in cotton |
| EP4472419A1 (en) | 2022-02-01 | 2024-12-11 | Globachem NV | Methods and compositions for controlling pests in rice |
| CN119012913A (en) | 2022-02-01 | 2024-11-22 | 环球化学股份有限公司 | Methods and compositions for controlling pests |
| CN119156135A (en) | 2022-02-01 | 2024-12-17 | 环球化学股份有限公司 | Methods and compositions for controlling pests on soybeans |
| PY2314419A (en) | 2022-03-02 | 2023-09-25 | Pairwise Plants Services Inc | MODIFICATION OF BRASSINOSTEROID RECEPTOR GENES TO IMPROVE PERFORMANCE TRAITS |
| WO2023192838A1 (en) | 2022-03-31 | 2023-10-05 | Pairwise Plants Services, Inc. | Early flowering rosaceae plants with improved characteristics |
| AU2023251095A1 (en) | 2022-04-07 | 2024-10-17 | Monsanto Technology Llc | Methods and compositions for improving resistance to fusarium head blight |
| EP4511387A1 (en) | 2022-04-21 | 2025-02-26 | Pairwise Plants Services, Inc. | Methods and compositions for improving yield traits |
| UY40250A (en) | 2022-05-02 | 2023-11-15 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS TO IMPROVE PERFORMANCE AND DISEASE RESISTANCE |
| CN119522219A (en) | 2022-05-03 | 2025-02-25 | 拜耳公司 | Crystalline form of (5S)-3-[3-(3-chloro-2-fluorophenoxy)-6-methylpyridazin-4-yl]-5-(2-chloro-4-methylbenzyl)-5,6-dihydro-4H-1,2,4-oxadiazine |
| JP2025516324A (en) | 2022-05-03 | 2025-05-27 | バイエル、アクチエンゲゼルシャフト | Use of (5S)-3-[3-(3-chloro-2-fluorophenoxy)-6-methylpyridazin-4-yl]-5-(2-chloro-4-methylbenzyl)-5,6-dihydro-4H-1,2,4-oxadiazine for controlling undesirable microorganisms |
| UY40255A (en) | 2022-05-05 | 2023-11-15 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR MODIFYING ROOT ARCHITECTURE AND/OR IMPROVING PERFORMANCE RANGES |
| UY40326A (en) | 2022-06-27 | 2023-12-29 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR MODIFYING SHADE ESCAPE IN PLANTS |
| CN119487055A (en) | 2022-06-29 | 2025-02-18 | 成对植物服务股份有限公司 | Methods and compositions for controlling meristem size for crop improvement |
| US20240002873A1 (en) | 2022-06-29 | 2024-01-04 | Pairwise Plants Services, Inc. | Methods and compositions for controlling meristem size for crop improvement |
| WO2024030984A1 (en) | 2022-08-04 | 2024-02-08 | Pairwise Plants Services, Inc. | Methods and compositions for improving yield traits |
| CA3264244A1 (en) | 2022-08-11 | 2024-02-15 | Pairwise Plants Services, Inc. | Methods and compositions for controlling meristem size for crop improvement |
| EP4584282A1 (en) | 2022-09-08 | 2025-07-16 | Pairwise Plants Services, Inc. | Methods and compositions for improving yield characteristics in plants |
| EP4295688A1 (en) | 2022-09-28 | 2023-12-27 | Bayer Aktiengesellschaft | Active compound combination |
| WO2024068517A1 (en) | 2022-09-28 | 2024-04-04 | Bayer Aktiengesellschaft | 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide |
| WO2024068520A1 (en) | 2022-09-28 | 2024-04-04 | Bayer Aktiengesellschaft | 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide |
| WO2024068519A1 (en) | 2022-09-28 | 2024-04-04 | Bayer Aktiengesellschaft | 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide |
| WO2024068518A1 (en) | 2022-09-28 | 2024-04-04 | Bayer Aktiengesellschaft | 3-heteroaryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide |
| EP4385326A1 (en) | 2022-12-15 | 2024-06-19 | Kimitec Biogroup | Biopesticide composition and method for controlling and treating broad spectrum of pests and diseases in plants |
| AU2023408197A1 (en) | 2022-12-19 | 2025-06-26 | Basf Agricultural Solutions Us Llc | Insect toxin genes and methods for their use |
| KR102875344B1 (en) * | 2022-12-22 | 2025-10-27 | 주식회사 남보 | Photorhabdus cinerea NB-YG4-3 strain, composition and control method for wilt disease using the same |
| EP4665747A1 (en) | 2023-02-16 | 2025-12-24 | Pairwise Plants Services, Inc. | Methods and compositions for modifying shade avoidance in plants |
| UY40661A (en) | 2023-03-02 | 2024-10-15 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR MODIFYING SHADE AVOIDANCE IN PLANTS |
| UY40664A (en) | 2023-03-09 | 2024-10-15 | Pairwise Plants Services Inc | MODIFICATION OF BRASSINOSTEROID SIGNALING PATHWAY GENES TO IMPROVE PERFORMANCE TRAITS |
| UY40746A (en) | 2023-05-18 | 2024-12-13 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR IMPROVING PLANT PERFORMANCE CHARACTERISTICS |
| WO2025008446A1 (en) | 2023-07-05 | 2025-01-09 | Bayer Aktiengesellschaft | Composition for use in agriculture |
| WO2025008447A1 (en) | 2023-07-05 | 2025-01-09 | Bayer Aktiengesellschaft | Composition for use in agriculture |
| AU2024386241A1 (en) | 2023-07-07 | 2026-01-08 | Basf Agricultural Solutions Us Llc | Use of cry genes for the control of nematode pests |
| PY2455894A (en) | 2023-07-18 | 2025-03-28 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR MODIFYING ROOT ARCHITECTURE IN PLANTS |
| AR133310A1 (en) | 2023-07-27 | 2025-09-17 | Pairwise Plants Services Inc | METHODS AND COMPOSITIONS FOR MODIFYING PLANT PERFORMANCE TRAITS |
| WO2025026738A1 (en) | 2023-07-31 | 2025-02-06 | Bayer Aktiengesellschaft | 6-[5-(ethylsulfonyl)-1-methyl-1h-imidazol-4-yl]-7-methyl-3-(pentafluoroethyl)-7h-imidazo[4,5-c]pyridazine derivatives as pesticides |
| EP4501112A1 (en) | 2023-08-01 | 2025-02-05 | Globachem NV | Plant defense elicitors |
| KR20260042262A (en) | 2023-08-01 | 2026-03-30 | 글로바켐 엔브이 | insecticide mixture |
| WO2025031668A1 (en) | 2023-08-09 | 2025-02-13 | Bayer Aktiengesellschaft | Azaheterobiaryl-substituted 4,5-dihydro-1h-2,4,5-oxadiazines as novel fungicides |
| CN121693501A (en) | 2023-08-09 | 2026-03-17 | 拜耳公司 | Pyridazin-4-yl oxadiazines as novel fungicides |
| WO2025064734A1 (en) | 2023-09-21 | 2025-03-27 | Pairwise Plants Services, Inc. | Early flowering black raspberry plants with improved characteristics |
| WO2025080600A1 (en) | 2023-10-11 | 2025-04-17 | Pairwise Plants Services, Inc. | Methods and compositions for improving crop yield traits |
| WO2025078128A1 (en) | 2023-10-11 | 2025-04-17 | Bayer Aktiengesellschaft | Pyridazin-3-one-4-yloxadiazines as novel fungicides |
| WO2025090606A1 (en) | 2023-10-27 | 2025-05-01 | Basf Agricultural Solutions Us Llc | Use of novel genes for the control of nematode pests |
| WO2025098875A1 (en) | 2023-11-10 | 2025-05-15 | Bayer Aktiengesellschaft | Active compound combinations having insecticidal/acaricidal properties |
| WO2025098876A1 (en) | 2023-11-10 | 2025-05-15 | Bayer Aktiengesellschaft | Active compound combinations having insecticidal/acaricidal properties |
| WO2025098874A1 (en) | 2023-11-10 | 2025-05-15 | Bayer Aktiengesellschaft | Active compound combinations having fungicidal/insecticidal/acaricidal properties |
| WO2025168620A1 (en) | 2024-02-07 | 2025-08-14 | Bayer Aktiengesellschaft | Heteroaryl-substituted 4,5-dihydro-1h-2,4,5-oxadiazines as novel fungicides |
| WO2025178902A1 (en) | 2024-02-22 | 2025-08-28 | Pairwise Plants Services, Inc. | Methods and compositions for improving yield characteristics in plants |
| WO2025186065A1 (en) | 2024-03-05 | 2025-09-12 | Bayer Aktiengesellschaft | Heteroaryl-substituted (aza)quinoxaline derivatives as pesticides |
| WO2025190927A1 (en) | 2024-03-14 | 2025-09-18 | Bayer Aktiengesellschaft | Active compound combinations having insecticidal/acaricidal properties |
| EP4652842A1 (en) | 2024-05-21 | 2025-11-26 | Kimitec Biogroup S.L | Biopesticide composition, procedure of obtain thereof, and method for controlling and treating broad spectrum of pests, diseases and weeds in plants |
| EP4652843A1 (en) | 2024-05-21 | 2025-11-26 | Kimitec Biogroup S.L | Biopesticide composition, procedure of obtain thereof, and method for controlling and treating broad spectrum of pests in plants |
| WO2025257122A1 (en) | 2024-06-12 | 2025-12-18 | Bayer Aktiengesellschaft | Active compound combinations having insecticidal/acaricidal properties |
| WO2025257121A1 (en) | 2024-06-12 | 2025-12-18 | Bayer Aktiengesellschaft | Active compound combinations having insecticidal/acaricidal properties |
| WO2026010930A1 (en) | 2024-07-05 | 2026-01-08 | BASF Agricultural Solutions Seed US LLC | Use of axmi277 for the control of rotylenchulus reniformis nematode pests |
| WO2026027375A1 (en) | 2024-07-29 | 2026-02-05 | Bayer Aktiengesellschaft | Hydroxy-dihydropyridinone carboxamides as pesticides |
| EP4721566A1 (en) | 2024-10-07 | 2026-04-08 | Kimitec Biogroup S.L | Microbial composition based on bacillus infantis strain for agricultural use |
| WO2026082481A1 (en) | 2024-10-17 | 2026-04-23 | Bayer Aktiengesellschaft | Active compound combinations having insecticidal/acaricidal properties |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5254799A (en) * | 1985-01-18 | 1993-10-19 | Plant Genetic Systems N.V. | Transformation vectors allowing expression of Bacillus thuringiensis endotoxins in plants |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA1249234A (en) * | 1984-09-05 | 1989-01-24 | Bernard V. Mcinerney | Xenocoumacins |
| US5039523A (en) * | 1988-10-27 | 1991-08-13 | Mycogen Corporation | Novel Bacillus thuringiensis isolate denoted B.t. PS81F, active against lepidopteran pests, and a gene encoding a lepidopteran-active toxin |
| WO1995000647A1 (en) * | 1993-06-25 | 1995-01-05 | Commonwealth Scientific And Industrial Research Organisation | Toxin gene from xenorhabdus nematophilus |
| AU7513994A (en) * | 1993-07-27 | 1995-02-28 | Agro-Biotech Corporation | Novel fungicidal properties of metabolites, culture broth, stilbene derivatives and indole derivatives produced by the bacteria (xenorhabdus) and (photorhabdus) spp. |
| GB9618083D0 (en) * | 1996-08-29 | 1996-10-09 | Mini Agriculture & Fisheries | Pesticidal agents |
1996
- 1996-11-06 KR KR1019970704633A patent/KR100354530B1/en not_active Expired - Fee Related
- 1996-11-06 EP EP96941335A patent/EP0797659A4/en not_active Withdrawn
- 1996-11-06 SK SK931-97A patent/SK93197A3/en unknown
- 1996-11-06 BR BR9606889A patent/BR9606889A/en not_active Application Discontinuation
- 1996-11-06 PL PL96321212A patent/PL186242B1/en not_active IP Right Cessation
- 1996-11-06 MX MX9705101A patent/MX9705101A/en active IP Right Grant
- 1996-11-06 WO PCT/US1996/018003 patent/WO1997017432A1/en not_active Ceased
- 1996-11-06 CA CA002209659A patent/CA2209659C/en not_active Expired - Fee Related
- 1996-11-06 AU AU10509/97A patent/AU729228B2/en not_active Ceased
- 1996-11-06 HU HU9900768A patent/HUP9900768A3/en unknown
- 1996-11-06 JP JP51836997A patent/JP3482214B2/en not_active Expired - Fee Related
- 1996-11-06 RO RO97-01251A patent/RO121280B1/en unknown
- 1996-11-06 IL IL121243A patent/IL121243A/en not_active IP Right Cessation
2003
- 2003-07-16 JP JP2003197785A patent/JP3657593B2/en not_active Expired - Fee Related
Non-Patent Citations (2)
| Title |
|---|
| J.INVERT. PATH 66, 149-155 * |
| MEDLINE ABSTRACT 7986856, 8206856, 8206831, 8449874 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CA2209659C (en) | 2008-01-15 |
| WO1997017432A1 (en) | 1997-05-15 |
| IL121243A0 (en) | 1998-01-04 |
| RO121280B1 (en) | 2007-02-28 |
| IL121243A (en) | 2010-05-31 |
| KR19980701244A (en) | 1998-05-15 |
| PL321212A1 (en) | 1997-11-24 |
| HUP9900768A2 (en) | 1999-06-28 |
| MX9705101A (en) | 1997-10-31 |
| JP3657593B2 (en) | 2005-06-08 |
| SK93197A3 (en) | 1998-05-06 |
| JP3482214B2 (en) | 2003-12-22 |
| CA2209659A1 (en) | 1997-05-15 |
| EP0797659A4 (en) | 1998-11-11 |
| PL186242B1 (en) | 2003-12-31 |
| AU1050997A (en) | 1997-05-29 |
| EP0797659A1 (en) | 1997-10-01 |
| JP2002509424A (en) | 2002-03-26 |
| KR100354530B1 (en) | 2003-01-06 |
| BR9606889A (en) | 1997-10-28 |
| HUP9900768A3 (en) | 2002-10-28 |
| JP2004089189A (en) | 2004-03-25 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| AU729228B2 (en) | | Insecticidal protein toxins from photorhabdus |
| AU2829997A (en) | | Insecticidal protein toxins from (photorhabdus) |
| AU755389B2 (en) | | Insecticidal protein toxins from xenorhabdus |
| US7181884B2 (en) | | Materials and methods for controlling pests |
| US7569748B2 (en) | | Nucleic acid encoding an insecticidal protein toxin from photorhabdus |
| CN101686705A (en) | | Hemipteran- and Coleopteran-active toxin proteins from Bacillus thuringiensis |
| RU2225114C2 (en) | | Recombinant dna encoding protein that represents insecticide agent, insecticide agent (variants), strain of microorganism xenorhabdus nematophilus (variants), insecticide composition, method for control of insect-pests |
| US6528484B1 (en) | | Insecticidal protein toxins from Photorhabdus |
| US6280722B1 (en) | | Antifungal Bacillus thuringiensis strains |
| ES2348509T5 (en) | | New insecticidal proteins from Bacillus thuringiensis |
| AU9712501A (en) | | Insecticidal protein toxins from photorhabdus |
| AU2004313474A1 (en) | | Toxin complex proteins and genes from Xenorhabdus bovienii |
| MXPA99001288A (en) | | Insecticidal protein toxins from xenorhabdus |
| UA82485C2 (en) | | Insecticide protein toxins from photorhabdus |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FGA | Letters patent sealed or granted (standard patent) | |