AU770407B2

AU770407B2 - N-acetylglycosaminyl transferase genes

Info

Publication number: AU770407B2
Application number: AU51436/99A
Authority: AU
Inventors: Bozena Korczak; April Lew
Original assignee: Glycodesign Inc
Current assignee: Glycodesign Holdings Ltd
Priority date: 1998-08-07
Filing date: 1999-08-05
Publication date: 2004-02-19
Anticipated expiration: 2019-08-05
Also published as: WO2000008171A1; JP2002526037A; EP1102853A1; MXPA01001425A; CA2339352A1; AU5143699A

Description

Title: N-Acetylglycosaminyltransferase Genes

FIELD OF THE INVENTION

The invention relates to novel N-acetylglycosaminyltransferase V nucleic acids, proteins encoded by the nucleic acids, and uses of the nucleic acids and proteins.

BACKGROUND OF THE INVENTION

Any discussion of the prior art throughout the specification should in no way be considered as an admission that such prior art is widely known or forms part of common general knowledge in the field.

Protein glycosylation is mediated by a series of enzymes found in the Golgi apparatus. Many of the enzymes in this pathway are subject to regulation during embryogenesis, lymphocyte activation, and cancer progression. Structural diversity of carbohydrates on cell surfaces and secreted or non-secreted (e.g. receptors) proteins affects their function and the associated cell biology. Somatic mutations and drugs which block the biosynthesis of -GlcNAcβ1-6Manα1-6Man- branching of N-linked oligosaccharides also inhibit organ colonisation and invasion in vitro, and limit solid tumor growth in vivo.

Synthesis of GlcNAc-branched carbohydrate structures is dependent upon N-acetylglycosaminyltransferases, one of which is N-acetylglycosaminyltransferase V (GlcNAc-TV). GlcNAc-TV catalyzes the addition of β1-6GlcNAc to the trimannosyl core in the biosynthetic pathway for branched complex-type N-linked oligosaccharides found on some cell surface and secreted glycoproteins (Schachter, H. (1986) Biochem. Cell. Biol. 64:163-181). The β1-6GlcNAc product of GlcNAc-TV is the preferred antenna and rate-limiting substrate in the pathway for addition of terminal polylactosamine sequences which affect cell-cell and cell-substratum interactions (van den Eijnden, D.H. et al, (1988) 263:12461-12465; Yousefi, S. et al, (1991) J. Biol. Chem. 266:1772-1783; and Heffernan, M. et al, (1993) J. Biol. Chem. 268:1242-1251).

The rat (Shoreibah, M. et al (1993), 268:15381-15385) and human (Saito, H. et al, (1994) Biochem. Biophys. Res. Commun. 198:318-327) GlcNAc-TV sequences predict a 741 amino acid type II glycoprotein. The human GlcNAc-TV gene is located on human chromosome 2q21, has 17 exons and spans 155 kb (Saito et al., (1995) Eur. J. Biochem. 233:18-26). The putative promoter region of the GlcNAc-TV gene has AP1 and PEA3/ets binding sites, and is responsive to ras signaling pathways (Buckhaults, P., J. Biol. Chem. (1997) 272:19575-19581).

Oncogenic transformation of rodent fibroblasts by polyoma virus, v-src, H-ras or v-fps leads to increased GlcNAc-TV expression (Yamashita, K. et al, (1985) J. Biol. Chem. 260:3963-3969; Pierce, M. and Arango, J. (1986) J. Biol. Chem. 261:10772-10777; Dennis, J. et al. (1987) Science 236:582-585), and in human carcinomas of breast, colon and skin GlcNAc-TV-generated structures correlate with pathological staging of tumors (Fernandes, B. et al (1991) 51:718-723). The GlcNAc-TV message is also subject to increased frequency of alternate splicing in tumor cells, resulting in a peptide encoded by an intron sequence of the GlcNAc-TV gene which has been identified as a widely occurring "tumor-associated antigen". Fifty percent of tested human melanoma tumors expressed this antigen, while it is absent in normal tissues (Guilloux, Y. et al (1996) J. Exp. Med. 183:1173-1183). In a rat model of heritable liver cancer, GlcNAc-TV transcript levels are elevated in primary tumors and lymph node metastases (Miyoshi, E. et al, (1993) Cancer Res. 53:3899-3902). In addition, ectopic expression of GlcNAc-TV in epithelial cells results in morphological transformation and tumorigenesis (Demetriou, M. et al (1995) J. Cell Biol. 130:383), while tumor cell mutants selected for loss of GlcNAc-TV activity show reduced malignant potential in vivo (Lu, Y. et al (1994) Clin. Exp. Metastasis 12:47-54).

SUMMARY OF THE INVENTION

The present inventors have identified novel GlcNAc-TV nucleic acid molecules. The nucleic acids are herein designated "glcNAc-TV-b" or "GlcNAc-TV-b nucleic acid molecule" and "glcNAc-TV-c" or "GlcNAc-TV-c nucleic acid molecule". The proteins encoded by the nucleic acid molecules are herein designated "GlcNAc-TV-b" or "GlcNAc-TV-b Protein", and "GlcNAc-TV-c" or "GlcNAc-TV-c Protein".

Broadly stated the present invention contemplates an isolated nucleic acid molecule encoding a protein of the invention, including mRNAs, DNAs, cDNAs, genomic DNAs, PNAs, as well as antisense analogs and biologically, diagnostically, prophylactically, clinically or therapeutically useful variants or fragments thereof, and compositions comprising same.

In particular, the present invention contemplates an isolated GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecule comprising a sequence that comprises at least 18 nucleotides and hybridizes under stringent conditions to the complementary nucleic acid sequence of SEQ. ID. NO. 1, or a degenerate form thereof. Further embodiments of this aspect of the invention provide biologically, diagnostically, prophylactically, clinically or therapeutically useful variants thereof and compositions comprising same.

The invention also contemplates an isolated GlcNAc-TV-b or GlcNAc-TV-c protein encoded by a nucleic acid molecule of the invention, a truncation, an analog, an allelic or species variation thereof, or a homolog of a protein of the invention, or a truncation thereof. (Truncations, analogs, allelic or species variations, and homologs are collectively referred to herein as "GlcNAc-TV-b Related Proteins" or "GlcNAc-TV-c Related Proteins".)

The nucleic acid molecules of the invention permit identification of untranslated nucleic acid sequences or regulatory sequences which specifically promote expression of genes operatively linked to the promoter regions. Identification and use of such promoter sequences are particularly desirable in instances, such as gene transfer or gene therapy, which can specifically require heterologous gene expression in a limited environment (e.g. CNS environment). The invention therefore contemplates a nucleic acid encoding a regulatory sequence of a nucleic acid molecule of the invention such as a promoter sequence, preferably a regulatory sequence of glcNAc-TV-b or glcNAc-TV-c.

The nucleic acid molecules which encode a mature GlcNAc-TV-b or GlcNAc-TV-c protein may include only the coding sequence for the mature polypeptide (e.g. SEQ ID NO. 5); the coding sequence for the mature polypeptide and additional coding sequences (e.g. leader or secretory sequences, proprotein sequences); or the coding sequence for the mature polypeptide (and optionally additional coding sequence) and non-coding sequence, such as introns or non-coding sequences 5' and/or 3' of the coding sequence of the mature polypeptide (e.g. SEQ ID NO. 3).

Therefore, the term "nucleic acid molecule encoding a protein" encompasses a nucleic acid molecule which includes only coding sequence for the protein as well as a nucleic acid molecule which includes additional coding and/or non-coding sequences.

The nucleic acids of the invention may be inserted into an appropriate expression vector, and the vector may contain the necessary elements for the transcription and translation of an inserted coding sequence. Accordingly, recombinant expression vectors may be constructed which comprise a nucleic acid molecule of the invention, and where appropriate one or more transcription and translation elements linked to the nucleic acid molecule.

Vectors are contemplated within the scope of the invention which comprise regulatory sequences of the invention, as well as chimeric gene constructs wherein a regulatory sequence of the invention is operably linked to a nucleic acid sequence encoding a heterologous protein (i.e. a protein not naturally expressed in the host cell), and a transcription termination signal.

A recombinant expression vector can be used to transform host cells to express a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Proteins, a GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Proteins, or a heterologous protein. Therefore, the invention further provides host cells containing a recombinant molecule of the invention. The invention also contemplates transgenic non-human mammals whose germ cells and somatic cells contain a recombinant molecule comprising a nucleic acid molecule of the invention, in particular one that encodes an analog of GlcNAc-TV-b or GlcNAc-TV-c, or a truncation of GlcNAc-TV-b or GlcNAc-TV-c.

The proteins of the invention may be obtained as an isolate from natural cell sources, but they are preferably produced by recombinant procedures. In one aspect the invention provides a method for preparing a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein utilizing the purified and isolated nucleic acid molecules of the invention. In an embodiment a method for preparing a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein is provided comprising: transferring a recombinant expression vector of the invention having a nucleotide sequence encoding a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein, into a host cell; selecting transformed host cells from untransformed host cells; culturing a selected transformed host cell under conditions which allow expression of the GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein; and isolating the GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein.

The invention further broadly contemplates a recombinant GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein obtained using a method of the invention.

A GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention may be conjugated with other molecules, such as proteins, to prepare fusion proteins or chimeric proteins. This may be accomplished, for example, by the synthesis of N-terminal or C-terminal fusion proteins.

The invention further contemplates antibodies having specificity against an epitope of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention. Antibodies may be labeled with a detectable substance and used to detect proteins of the invention in biological samples, tissues, and cells.

The invention also permits the construction of nucleotide probes which are unique to the nucleic acid molecules of the invention or to proteins of the invention. Therefore, the invention also relates to a probe comprising a sequence encoding a protein of the invention, or a part thereof. The probe may be labeled, for example, with a detectable substance and it may be used to select from a mixture of nucleotide sequences a nucleic acid molecule of the invention including nucleic acid molecules coding for a protein which displays one or more of the properties of a protein of the invention.

In accordance with an aspect of the invention there is provided a method of, and products for, diagnosing and monitoring conditions mediated by a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein by determining the presence of nucleic acid molecules and proteins of the invention.

Still further the invention provides a method for evaluating a test compound for its ability to modulate the biological activity of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention. For example a substance which inhibits or enhances the catalytic activity of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein may be evaluated. "Modulate" refers to a change or an alteration in the biological activity of a protein of the invention. Modulation may be an increase or a decrease in activity, a change in characteristics, or any other change in the biological, functional, or immunological properties of the protein.

Compounds which modulate the biological activity of a protein of the invention may also be identified using the methods of the invention by comparing the pattern and level of expression of a nucleic acid molecule or protein of the invention in tissues and cells, in the presence, and in the absence of the compounds.

Methods are also contemplated that identify compounds or substances (e.g. proteins) which bind to glcNAc-TV-b or glcNAc-TV-c regulatory sequences (e.g. promoter sequences, enhancer sequences, negative modulator sequences).

The substances and compounds identified using the methods of the invention may be used to modulate the biological activity of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention, and they may be used in the treatment of conditions mediated by the proteins including but not limited to proliferative diseases such as cancer, viral, bacterial, and parasitic infections, to stimulate hematopoietic progenitor cell growth, or confer protection against chemotherapy or radiation therapy. Accordingly, the nucleic acid molecules and proteins of the invention, and substances and compounds may be formulated into compositions for administration to individuals suffering from one or more of these conditions.

Therefore, the present invention also relates to a composition comprising one or more of a nucleic acid molecule or protein of the invention, or a substance or compound identified using the methods of the invention, and a pharmaceutically acceptable carrier, excipient or diluent.

A method for treating or preventing these conditions is also provided comprising administering to a patient in need thereof, a composition of the invention.

The present invention provides the means necessary for production of gene-based therapies directed at the brain. These therapeutic agents may take the form of polynucleotides comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of glcNAc-TV-b or glcNAc-TV-c placed in appropriate vectors or delivered to target cells in more direct ways.

Having provided novel GlcNAc-TV proteins, and nucleic acids encoding same, the invention accordingly further provides methods for preparing oligosaccharides, e.g. two or more saccharides. In specific embodiments, the invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc, and an acceptor in the presence of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention.

In accordance with a further aspect of the invention, there are provided processes for utilizing proteins or nucleic acid molecules of the invention, for in vitro purposes related to scientific research, synthesis of DNA, and manufacture of vectors.

These and other aspects, features, and advantages of the present invention should be apparent to those skilled in the art from the following drawings and detailed description.

According to a first aspect, the present invention provides an isolated N-acetylglycosaminyltransferase V-b protein comprising an amino acid sequence of SEQ ID NO. 2, SEQ ID NO. 4 or SEQ ID NO. 6, or an isolated N-acetylglycosaminyltransferase V-c protein comprising an amino acid sequence of SEQ ID NO. 10, or SEQ ID NO. 12.

According to a second aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 9, or SEQ ID NO: 11.

According to a third aspect, the present invention provides an isolated nucleic acid molecule encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 10, or SEQ ID NO: 12.

According to a fourth aspect, the present invention provides a vector comprising a protein according to the first aspect or a nucleic acid molecule according to the second or third aspects.

According to a fifth aspect, the present invention provides a host cell comprising a protein according to the first aspect or a nucleic acid molecule according to the second or third aspects.

According to a sixth aspect, the present invention provides a method for preparing a protein comprising: transferring a vector as claimed in claim 5 into a host cell; selecting transformed host cells from untransformed host cells; culturing a selected transformed host cell under conditions which allow expression of the protein; and isolating the protein.

According to a seventh aspect, the present invention provides a protein prepared in accordance with the sixth aspect.

According to an eighth aspect, the present invention provides an antibody having specificity against an epitope of a protein according to the first or seventh aspects.

According to a ninth aspect, the present invention provides a probe comprising a sequence encoding a protein according to the first or seventh aspects, or a part thereof.

According to a tenth aspect, the present invention provides a method of diagnosing and monitoring conditions mediated by a protein according to the first or seventh aspects by determining the presence of a nucleic acid molecule according to the second or third aspects or a protein according to the first aspect.

According to an eleventh aspect, the present invention provides a method for identifying a substance which associates with a protein according to the first or seventh aspects comprising reacting the protein with at least one substance which potentially can associate with the protein, under conditions which permit the association between the substance and protein, and removing or detecting protein associated with the substance, wherein detection of associated protein and substance indicates the substance associates with the protein.

According to a twelfth aspect, the present invention provides a method for evaluating a compound for its ability to modulate the biological activity of a protein according to the first or seventh aspects comprising providing a known concentration of the protein with a substance which associates with the protein and a test compound under conditions which permit the formation of complexes between the substance and protein, and removing and/or detecting complexes.

According to a thirteenth aspect, the present invention provides a method for detecting a nucleic acid molecule encoding a protein comprising an amino acid sequence of SEQ ID NO: 2, 4, 6, 10 or 12 in a biological sample comprising the steps of: hybridising the nucleic acid molecule of the second or third aspect to nucleic acids of the biological sample, thereby forming a hybridisation complex; and detecting the hybridisation complex wherein the presence of the hybridisation complex correlates with the presence of a nucleic acid molecule encoding the protein in the biological sample.

According to a fourteenth aspect, the present invention provides a composition comprising one or more of a nucleic acid molecule according to the second or third aspect or a protein according to the first or seventh aspect, or a substance or compound identified using a method according to the invention, and a pharmaceutically acceptable carrier, excipient or diluent.

According to a fifteenth aspect, the present invention provides use of one or more of a nucleic acid molecule according to the second or third aspect or a protein according to the first aspect, or a substance or compound identified using a method according to the invention in the preparation of a pharmaceutical composition for treating a condition mediated by a protein according to the first or seventh aspects.

According to a sixteenth aspect, the present invention provides a gene-based therapy directed at the brain comprising a polynucleotide comprising all or a portion of a regulatory sequence of SEQ ID NO: 7 or 8.

According to a seventeenth aspect, the present invention provides a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc, and an acceptor in the presence of a protein according to the first or seventh aspects.

According to an eighteenth aspect, the present invention provides an oligosaccharide prepared by a method according to the seventeenth aspect.

Unless the context clearly requires otherwise, throughout the description and the claims, the words 'comprise', 'comprising', and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to".

DESCRIPTION OF THE DRAWINGS

The invention will be better understood with reference to the drawings in which:

Figure 1 is a reproduction of autoradiograms resulting from a Northern hybridization experiment in which mRNA isolated from different human tissues was size-fractionated and probed with radioactive human partial GlcNAc-TV clone (nucleotides 1508-1921) and human partial GlcNAc-TV-b (nucleotides 1959-2417);

Figure 2 is a reproduction of autoradiograms resulting from a Northern hybridization experiment in which mRNA isolated from different human brain tissues was size-fractionated and probed with radioactive human partial GlcNAc-TV clone (nucleotides 1508-1921) and human partial GlcNAc-TV-b (nucleotides 1959-2417); and

Figure 3 is a reproduction of a phosphoimager image resulting from a Northern hybridization experiment in which mRNA isolated from different human tumor cell lines was size-fractionated and probed with radioactive human partial GlcNAc-TV clone (nucleotides 1508-1921) and human partial GlcNAc-TV (nucleotides 1959-2417).

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, for example, Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY); DNA Cloning: A Practical Approach, Volumes I and II (Glover ed. 1985); Oligonucleotide Synthesis (Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. (1985)); Transcription and Translation (B.D. Hames & S.J. Higgins eds. (1984)); Animal Cell Culture (R.I. Freshney, ed. (1986)); Immobilized Cells and Enzymes (IRL Press, (1986)); and B. Perbal, A Practical Guide to Molecular Cloning (1984).

Nucleic Acid Molecules of the Invention

As hereinbefore mentioned, the invention provides isolated GlcNAc-TV-b and GlcNAc-TV-c nucleic acid molecules. The GlcNAc-TV-b and GlcNAc-TV-c nucleic acid molecules differ in their 3' ends.

The term "isolated" refers to a nucleic acid (or protein) removed from its natural environment, purified or separated, or substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical reactants, or other chemicals when chemically synthesized.

Preferably, an isolated nucleic acid molecule is at least 60% free, more preferably at least 75% free, and most preferably at least 90% free from other components with which they are naturally associated.

The term "nucleic acid" is intended to include modified or unmodified DNA, RNA, including mRNAs, DNAs, cDNAs, and genomic DNAs, or a mixed polymer, and can be either single-stranded, double-stranded or triple-stranded. For example, a nucleic acid sequence may be a single-stranded or double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, or single-, double- and triple-stranded regions, single- and double-stranded RNA, RNA that may be single-stranded, or more typically, double-stranded, or triple-stranded, or a mixture of regions comprising RNA or DNA, or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The DNAs or RNAs may contain one or more modified bases. For example, the DNAs or RNAs may have backbones modified for stability or for other reasons. A nucleic acid sequence includes an oligonucleotide, nucleotide, or polynucleotide. The term "nucleic acid molecule" and in particular DNA or RNA, refers only to the primary and secondary structure and it does not limit it to any particular tertiary forms.

In an embodiment of the invention an isolated nucleic acid is contemplated which comprises:
(i) a nucleic acid sequence encoding a protein having substantial sequence identity, preferably at least 70%, more preferably at least 75% sequence identity, with an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12;
(ii) nucleic acid sequences complementary to (i);
(iii) nucleic acid sequences differing from any of the nucleic acids of (i) or (ii) in codon sequences due to the degeneracy of the genetic code;
(iv) a nucleic acid sequence comprising at least 18 nucleotides and capable of hybridizing under stringent conditions to a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11 or to a degenerate form thereof;
(v) a nucleic acid sequence encoding a truncation, an analog, an allelic or species variation of a protein comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12; or
(vi) a fragment, or allelic or species variation of (i), (ii) or (iii).

In a specific embodiment, the isolated nucleic acid comprises:
(i) a nucleic acid sequence having substantial sequence identity, preferably at least 70%, more preferably at least 75% sequence identity, with a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11;
(ii) nucleic acid sequences complementary to (i), preferably complementary to a full nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11;
(iii) nucleic acid sequences differing from any of the nucleic acids of (i) to (ii) in codon sequences due to the degeneracy of the genetic code; or
(iv) a fragment, or allelic or species variation of (i), (ii) or (iii).

The term "complementary" refers to the natural binding of nucleic acid molecules under permissive salt and temperature conditions by base-pairing. For example, a single-stranded sequence binds to its complementary sequence. Complementarity between two single-stranded molecules may be "partial", in which only some of the nucleic acids bind, or it may be complete when total complementarity exists between the single-stranded molecules.
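The distinction between partial and complete complementarity can be illustrated with a minimal Python sketch. The function names and the toy sequences are illustrative assumptions and are not taken from the specification or from any SEQ. ID. NO.; only standard Watson-Crick DNA pairing (A with T, G with C) is assumed.

```python
# Illustrative sketch only: Watson-Crick pairing for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that fully base-pairs with seq, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length
    strands (both written 5'->3') are aligned antiparallel."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # antiparallel orientation
    paired = sum(1 for x, y in zip(a, b) if PAIRS.get(x) == y)
    return paired / len(a)


if __name__ == "__main__":
    probe = "AGT"  # hypothetical toy sequence
    print(reverse_complement(probe))
    print(complementarity(probe, reverse_complement(probe)))  # complete
    print(complementarity(probe, "ACA"))                      # partial
```

A value of 1.0 corresponds to complete complementarity; any value between 0 and 1 corresponds to partial complementarity in the sense used above.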

In a preferred embodiment the isolated nucleic acid comprises a nucleic acid sequence encoding an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 or comprises a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11 wherein T can also be U.

The terms "sequence similarity" or "sequence identity" refer to the relationship between two or more amino acid or nucleic acid sequences, determined by comparing the sequences, which relationship is generally known as "homology". Identity in the art also means the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences. Both identity and similarity can be readily calculated (Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M. and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, New York, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). While there are a number of existing methods to measure identity and similarity between two amino acid sequences or two nucleic acid sequences, both terms are well known to the skilled artisan (Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, New York, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H. and Lipman, D., SIAM J. Applied Math., 48:1073, 1988). Preferred methods for determining identity are designed to give the largest match between the sequences tested. Methods to determine identity are codified in computer programs. Preferred computer program methods for determining identity and similarity between two sequences include but are not limited to the GCG program package (Devereux, J. et al, Nucleic Acids Research 12(1):387, 1984), BLASTP, BLASTN, and FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403, 1990). Identity or similarity may also be determined using the alignment algorithm of Dayhoff et al., Methods in Enzymology 91:524-545 (1983).
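As a rough illustration of how a percent identity figure can be derived from "the largest match between the sequences tested", the following Python sketch performs a simple global (Needleman-Wunsch style) alignment and counts identical aligned positions. It is not the GCG, BLAST or FASTA implementation; the scoring values (match +1, mismatch -1, gap -1), the choice of alignment length as the denominator, and the function names are all assumptions made for illustration, and the named programs may report different values.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    al_a, al_b = global_align(a.upper(), b.upper())
    if not al_a:
        raise ValueError("sequences must not both be empty")
    matches = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * matches / len(al_a)


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any SEQ. ID. NO.
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("ACGT", "AGT"))
```

The same routine works on amino acid strings, since it only compares characters; a similarity score, as opposed to identity, would additionally need a substitution matrix, which this sketch does not include.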

Preferably, a nucleic acid molecule of the present invention has substantial sequence identity using the preferred computer programs cited herein, for example at least 70%, more preferably at least nucleic acid identity; still more preferably at least 80% nucleic acid identity; and most preferably at least 90% to 95% sequence identity to a sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11.

Isolated nucleic acid molecules encoding a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein, and having a sequence which differs from a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11, due to degeneracy in the genetic code are also within the scope of the invention. Such nucleic acid molecules encode equivalent proteins but differ in sequence from a sequence of SEQ. ID. NO. 1, 3, 9, or 11 due to degeneracy in the genetic code. As one example, DNA sequence polymorphisms within glcNAc-TV-b or glcNAc-TV-c may result in silent mutations which do not affect the amino acid sequence. Variations in one or more nucleotides may exist among individuals within a population due to natural allelic variation. Any and all such nucleic acid variations are within the scope of the invention. DNA sequence polymorphisms may also occur which lead to changes in the amino acid sequence of GlcNAc-TV-b Protein or GlcNAc-TV-c Protein. These amino acid polymorphisms are also within the scope of the present invention. In addition, species variations i.e. variations in nucleotide sequence naturally occurring among different species, are within the scope of the invention.
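The effect of degeneracy in the genetic code can be sketched in Python: two nucleic acid sequences that differ at a codon's third position may still encode the same amino acid sequence (a silent change), whereas other substitutions alter the protein. The standard genetic code is assumed; the function names and the six-nucleotide toy sequences are hypothetical and are not drawn from SEQ. ID. NO. 1 to 12.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a coding sequence (DNA, or RNA with U for T) to one-letter
    amino acids; '*' marks a stop codon."""
    dna = seq.upper().replace("U", "T")
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(reference: str, variant: str) -> bool:
    """True when the nucleotide sequences differ but the encoded
    amino acid sequences are identical."""
    ref = reference.upper().replace("U", "T")
    var = variant.upper().replace("U", "T")
    return ref != var and translate(ref) == translate(var)


if __name__ == "__main__":
    print(translate("ATGGCC"), translate("ATGGCT"))  # same protein
    print(is_silent_variant("ATGGCC", "ATGGCT"))     # degenerate codon
    print(is_silent_variant("ATGGCC", "ATGACC"))     # amino acid change
```

Because `translate` accepts U in place of T, the same check applies to an RNA form of a sequence.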

Another aspect of the invention provides a nucleic acid molecule which hybridizes under selective conditions, e.g. high stringency conditions, to a nucleic acid which comprises a sequence which encodes a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein of the invention. Preferably the sequence encodes an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 and comprises at least 18 nucleotides. Selectivity of hybridization occurs with a certain degree of specificity rather than being random. Appropriate stringency conditions which promote DNA hybridization are known to those skilled in the art, or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. Numerous equivalent conditions comprising either low or high stringency depend on factors such as the length and nature of the sequence (DNA, RNA, base composition), nature of the target (DNA, RNA, base composition), milieu (in solution or immobilized on a solid substrate), concentration of salts and other components (e.g., formamide, dextran sulfate and/or polyethylene glycol), and temperature of the reactions (within a range from about 5°C below the melting temperature of the probe to about 20°C to 25°C below the melting temperature). One or more factors may be varied to generate conditions of either low or high stringency different from, but equivalent to, the above listed conditions. For example, 6.0 x sodium chloride/sodium citrate (SSC) or 0.5% SDS at about 0 C, followed by a wash of 2.0 x SSC at 50°C may be employed. The stringency may be selected based on the conditions used in the wash step. By way of example, the salt concentration in the wash step can be selected from a high stringency of about 0.2 x SSC at 50°C. In addition, the temperature in the wash step can be at high stringency conditions, at about 65°C.
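The temperature range referred to above (from about 5°C below the melting temperature of the probe to about 20°C to 25°C below it) may be illustrated with the following sketch. The melting temperature is estimated here with the Wallace rule (2°C per A or T, 4°C per G or C), which is a rough approximation for short oligonucleotide probes and is an assumption of this example rather than a method described in the specification. The probe sequence is hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Rough melting temperature (°C) of a short DNA probe by the Wallace rule.

    Assumption: Tm = 2 * (A + T) + 4 * (G + C); only intended for short
    oligonucleotides and ignores salt and formamide concentration.
    """
    seq = probe.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def hybridization_window(probe: str, upper_offset: int = 5, lower_offset: int = 25):
    """Return (lowest, highest) reaction temperature in °C for the probe.

    The highest temperature is about 5°C below Tm and the lowest is about
    25°C below Tm, following the range given in the text.
    """
    tm = wallace_tm(probe)
    return tm - lower_offset, tm - upper_offset


# Hypothetical 18-nucleotide probe (5 A, 5 T, 4 G, 4 C).
probe = "ATGCATGCATGCATGCAT"
low, high = hybridization_window(probe)
```

Temperatures nearer the upper end of the window correspond to higher stringency; salt concentration in the wash step, which the sketch does not model, also affects stringency.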

It will be appreciated that the invention includes nucleic acid molecules encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, including truncations of the proteins, allelic and species variants, and analogs of the proteins as described herein. In particular, fragments of a nucleic acid molecule of the invention are contemplated that are a stretch of at least about 10, preferably at least 15, more preferably at least 18, and most preferably at least 20 nucleotides, more typically at least 50 to 200 nucleotides but less than 2 kb. It will further be appreciated that variant forms of the nucleic acid molecules of the invention which arise by alternative splicing of an mRNA corresponding to a cDNA of the invention are encompassed by the invention.

An isolated nucleic acid molecule of the invention which comprises DNA can be isolated by preparing a labeled nucleic acid probe based on all or part of a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11. The labeled nucleic acid probe is used to screen an appropriate DNA library (e.g., a cDNA or genomic DNA library). For example, a cDNA library can be used to isolate a cDNA encoding a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein by screening the library with the labeled probe using standard techniques. Alternatively, a genomic DNA library can be similarly screened to isolate a genomic clone encompassing a glcNAc-TV-b or glcNAc-TV-c gene. Nucleic acids isolated by screening of a cDNA or genomic DNA library can be sequenced by standard techniques.

An isolated nucleic acid molecule of the invention which is DNA can also be isolated by selectively amplifying a nucleic acid of the invention. "Amplifying" or "amplification" refers to the production of additional copies of a nucleic acid sequence and is generally carried out using polymerase chain reaction (PCR) technologies well known in the art (Dieffenbach, C. W. and G. S. Dveksler (1995) PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y.). In particular, it is possible to design synthetic oligonucleotide primers from a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 7, 8, 9, or 11 for use in PCR. A nucleic acid can be amplified from cDNA or genomic DNA using these oligonucleotide primers and standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. cDNA may be prepared from mRNA, by isolating total cellular mRNA by a variety of techniques, for example, by using the guanidinium-thiocyanate extraction procedure of Chirgwin et al., Biochemistry, 18, 5294-5299 (1979). cDNA is then synthesized from the mRNA using reverse transcriptase (for example, Moloney MLV reverse transcriptase available from Gibco/BRL, Bethesda, MD, or AMV reverse transcriptase available from Seikagaku America, Inc., St. Petersburg, FL).
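The design of a pair of oligonucleotide primers from a known nucleotide sequence may be sketched as follows, by way of example only. The forward primer is taken from the sense strand at the start of the region to be amplified, and the reverse primer is the reverse complement of the end of that region. The function names, the default primer length of 18 nucleotides, and the template sequence are assumptions of this example; practical primer design also considers melting temperature, secondary structure and specificity, which are not modelled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def primer_pair(template: str, start: int, end: int, length: int = 18):
    """Return (forward, reverse) primers flanking template[start:end].

    Both primers are written 5' to 3'. The region must be long enough to
    hold two non-overlapping primers of the requested length.
    """
    region = template[start:end].upper()
    if len(region) < 2 * length:
        raise ValueError("region too short for two primers of this length")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


# Hypothetical 18-nucleotide template amplified with 6-nucleotide primers.
forward, reverse = primer_pair("ATGCCCGGGTTTAAACGT", 0, 18, length=6)
```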

An isolated nucleic acid molecule of the invention which is RNA can be isolated by cloning a cDNA encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein into an appropriate vector which allows for transcription of the cDNA to produce an RNA molecule which encodes a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. For example, a cDNA can be cloned downstream of a bacteriophage promoter (e.g., a T7 promoter) in a vector, the cDNA can be transcribed in vitro with T7 polymerase, and the resultant RNA can be isolated by conventional techniques.

Nucleic acid molecules of the invention may be chemically synthesized using standard techniques. Methods of chemically synthesizing polydeoxynucleotides are known, including but not limited to solid-phase synthesis which, like peptide synthesis, has been fully automated in commercially available DNA synthesizers (See Itakura et al. U.S. Patent No. 4,598,049; Caruthers et al. U.S. Patent No. 4,458,066; and Itakura U.S. Patent Nos. 4,401,796 and 4,373,071).

Determination of whether a particular nucleic acid molecule is a GlcNAc-TV-b or GlcNAc-TV-c or encodes a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be accomplished by expressing the cDNA in an appropriate host cell by standard techniques, and testing the expressed protein in the methods described herein. A GlcNAc-TV-b or GlcNAc-TV-c cDNA or cDNA encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be sequenced by standard techniques, such as dideoxynucleotide chain termination or Maxam-Gilbert chemical sequencing, to determine the nucleic acid sequence and the predicted amino acid sequence of the encoded protein.

The initiation codon and untranslated sequences of a nucleic acid molecule of the invention may be determined using computer software designed for the purpose, such as PC/Gene (IntelliGenetics Inc., Calif.). The intron-exon structure and the transcription regulatory sequences of a nucleic acid molecule of the invention and/or encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be identified by using a nucleic acid molecule of the invention to probe a genomic DNA clone library. Regulatory elements can be identified using standard techniques. The function of the elements can be confirmed by using these elements to express a reporter gene such as the lacZ gene which is operatively linked to the elements. These constructs may be introduced into cultured cells using conventional procedures or into non-human transgenic animal models. In addition to identifying regulatory elements in DNA, such constructs may also be used to identify nuclear proteins interacting with the elements, using techniques known in the art.

In accordance with one aspect of the invention, a nucleic acid is provided comprising a GlcNAc-TV-b regulatory sequence such as a promoter sequence. In particular, an isolated nucleic acid molecule is contemplated which comprises: (i) a nucleic acid sequence having at least 75% sequence identity with a sequence of SEQ. ID. NO. 7 or 8; (ii) nucleic acid sequences complementary to (i); (iii) nucleic acid sequences differing from any of the nucleic acids of (i) or (ii) in codon sequences due to the degeneracy of the genetic code; (iv) a nucleic acid sequence comprising at least 10, most preferably 18 nucleotides and capable of hybridizing under stringent conditions to a nucleic acid sequence of SEQ. ID. NO. 7 or 8, or to a degenerate form thereof; (v) a fragment, or allelic or species variation of (ii) or (iii).

In a preferred embodiment, the isolated nucleic acid comprises a nucleic acid sequence of SEQ. ID. NO. 7 or 8, wherein T can also be U.

The invention contemplates nucleic acid molecules comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of a glcNAc-TV-b gene or a glcNAc-TV-c gene (e.g., SEQ ID Nos: 7 or 8) contained in appropriate vectors. The vectors may contain heterologous nucleic acid sequences. "Heterologous nucleic acid" refers to a nucleic acid not naturally located in the cell, or in a chromosomal site of the cell. Preferably, the heterologous nucleic acid includes a nucleic acid foreign to the cell.

In accordance with another aspect of the invention, the nucleic acids isolated using the methods described herein are mutant glcNAc-TV-b or glcNAc-TV-c gene alleles. For example, the mutant alleles may be isolated from individuals either known or proposed to have a genotype which contributes to the symptoms of cancer. Mutant alleles and mutant allele products may be used in therapeutic and diagnostic methods described herein. For example, a cDNA of a mutant glcNAc-TV-b gene may be isolated using PCR as described herein, and the DNA sequence of the mutant allele may be compared to the normal allele to ascertain the mutation(s) responsible for the loss or alteration of function of the mutant gene product. A genomic library can also be constructed using DNA from an individual suspected of or known to carry a mutant allele, or a cDNA library can be constructed using RNA from tissue known, or suspected to express the mutant allele. A nucleic acid encoding a normal glcNAc-TV-b gene or any suitable fragment thereof, may then be labeled and used as a probe to identify the corresponding mutant allele in such libraries. Clones containing mutant sequences can be purified and subjected to sequence analysis. In addition, an expression library can be constructed using cDNA from RNA isolated from a tissue of an individual known or suspected to express a mutant glcNAc-TV-b allele. Gene products from putatively mutant tissue may be expressed and screened, for example using antibodies specific for a GlcNAc-TV-b Protein or a GlcNAc-TV-b Related Protein as described herein.

Library clones identified using the antibodies can be purified and subjected to sequence analysis.

Antisense molecules and ribozymes are contemplated within the scope of the invention. "Antisense" refers to any composition containing nucleotide sequences which are complementary to a specific DNA or RNA sequence. Ribozymes are enzymatic RNA molecules that can be used to catalyze the specific cleavage of RNA. Antisense molecules and ribozymes may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.

Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize antisense RNA constitutively or inducibly can be introduced into cell lines, cells, or tissues. RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends of the molecule or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.
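The relationship between a target sequence and an antisense molecule complementary to it may be illustrated by the following sketch, provided as an example only. It derives an antisense RNA sequence from a sense-strand sequence by taking the reverse complement and writing it with uracil. The function names and the six-nucleotide target are hypothetical; backbone modifications and non-traditional bases are not modelled.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_rna(sense: str) -> str:
    """Antisense RNA (5' to 3') complementary to a sense DNA or mRNA sequence."""
    dna = sense.upper().replace("U", "T")
    reverse_complement = dna.translate(DNA_COMPLEMENT)[::-1]
    return reverse_complement.replace("T", "U")


def is_antisense_to(candidate: str, sense: str) -> bool:
    """True when the candidate is fully complementary to the sense sequence."""
    return candidate.upper().replace("T", "U") == antisense_rna(sense)


# Hypothetical target: the sense sequence ATGGCT.
example_antisense = antisense_rna("ATGGCT")
```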

Proteins of the Invention

The proteins of the invention are predominantly expressed in the central nervous system, with the exception of the spinal cord. The proteins are also expressed in different tumors such as cervical carcinoma, lung carcinoma, colon carcinoma, melanoma, and they have been specifically found in tumors from the breast and uterus.

The amino acid sequence of an isolated GlcNAc-TV-b Protein of the invention comprises a sequence of SEQ. ID. NO. 2, 4, or 6. The amino acid sequence of an isolated GlcNAc-TV-c Protein of the invention comprises a sequence of SEQ. ID. NO. 2, 10, or 12. In addition to proteins comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12, the proteins of the present invention include truncations, and analogs, allelic and species variations, and homologs of GlcNAc-TV-b or GlcNAc-TV-c and truncations thereof as described herein (GlcNAc-TV-b Related Proteins or GlcNAc-TV-c Related Proteins).

Truncated proteins may comprise peptides of between 3 and 70 amino acid residues, ranging in size from a tripeptide to a 70 mer polypeptide, preferably 12 to 20 amino acids. In one aspect of the invention, fragments of a GlcNAc-TV-b or GlcNAc-TV-c protein are provided having an amino acid sequence of at least five consecutive amino acids of SEQ.ID. NO. 2, 4, 6, 10, or 12 where no amino acid sequence of five or more, six or more, seven or more, or eight or more, consecutive amino acids present in the fragment is present in a protein other than GlcNAc-TV-b or GlcNAc-TV-c. In an embodiment of the invention the fragment is a stretch of amino acid residues of at least 12 to contiguous amino acids from particular sequences such as the sequences of SEQ.ID. NO. 2, 4, 6, 10, or 12. The fragments may be immunogenic and preferably are not immunoreactive with antibodies that are immunoreactive to proteins other than GlcNAc-TV-b or GlcNAc-TV-c.
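The requirement that a fragment contain no run of five or more consecutive amino acids found in another protein can be checked, in outline, as in the following sketch. It is given by way of example only: the sequences are hypothetical one-letter-code strings, the function names are assumptions, and in practice the comparison would be made against a comprehensive protein database rather than a short list.

```python
def kmers(sequence: str, k: int) -> set:
    """All runs of k consecutive residues in a protein sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_fragments(target: str, other_proteins: list, k: int = 5) -> list:
    """Runs of k consecutive residues of the target absent from every other protein.

    The result preserves the order in which the runs occur in the target.
    """
    seen_elsewhere = set()
    for other in other_proteins:
        seen_elsewhere |= kmers(other, k)
    seq = target.upper()
    return [
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] not in seen_elsewhere
    ]


# Hypothetical target and comparison protein sharing the run TAYIA.
candidates = unique_fragments("MKTAYIAK", ["TAYIAQ"])
```

Here the shared run TAYIA is excluded, and the remaining runs are candidates for fragments that distinguish the target from the comparison protein.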

The truncated proteins may have an amino group, a hydrophobic group (for example, carbobenzoxyl, dansyl, or t-butyloxycarbonyl), an acetyl group, a 9-fluorenylmethoxy-carbonyl (FMOC) group, or a macromolecule including but not limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the amino terminal end. The truncated proteins may have a carboxyl group, an amido group, a t-butyloxycarbonyl group, or a macromolecule including but not limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the carboxy terminal end.

The proteins of the invention may also include analogs of GlcNAc-TV-b or GlcNAc-TV-c, and/or truncations thereof as described herein, which may include, but are not limited to GlcNAc-TV-b or GlcNAc-TV-c, containing one or more amino acid substitutions, insertions, and/or deletions. Amino acid substitutions may be of a conserved or non-conserved nature. Conserved amino acid substitutions involve replacing one or more amino acids of the GlcNAc-TV-b or GlcNAc-TV-c amino acid sequence with amino acids of similar charge, size, and/or hydrophobicity characteristics. When only conserved substitutions are made the resulting analog is preferably functionally equivalent to GlcNAc-TV-b or GlcNAc-TV-c. Non-conserved substitutions involve replacing one or more amino acids of the GlcNAc-TV-b or GlcNAc-TV-c amino acid sequence with one or more amino acids which possess dissimilar charge, size, and/or hydrophobicity characteristics.
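The distinction between conserved and non-conserved substitutions may be illustrated with the following sketch, provided as an example only. The grouping of amino acids by charge, size and hydrophobicity used here is one common simplified classification and is an assumption of this example; the specification does not prescribe a particular grouping, and other groupings or substitution matrices could equally be used.

```python
# Assumed, simplified property groups (one-letter amino acid codes).
PROPERTY_GROUPS = {
    "aliphatic_hydrophobic": set("AVLIM"),
    "aromatic": set("FWY"),
    "polar_uncharged": set("STNQ"),
    "positively_charged": set("KRH"),
    "negatively_charged": set("DE"),
    "special": set("CGP"),
}


def property_group(amino_acid: str) -> str:
    """Name of the assumed property group containing the amino acid."""
    residue = amino_acid.upper()
    for name, members in PROPERTY_GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown amino acid: {amino_acid!r}")


def is_conserved_substitution(original: str, replacement: str) -> bool:
    """True when a different residue from the same property group is substituted."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return property_group(original) == property_group(replacement)
```

Under this grouping, replacing leucine with isoleucine or aspartate with glutamate counts as conserved, whereas replacing aspartate with lysine reverses the charge and counts as non-conserved.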

One or more amino acid insertions may be introduced into a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein. Amino acid insertions may consist of single amino acid residues or sequential amino acids ranging from 2 to 15 amino acids in length.

Deletions may consist of the removal of one or more amino acids, or discrete portions from the GlcNAc-TV-b or GlcNAc-TV-c amino acid sequence. The deleted amino acids may or may not be contiguous. The lower limit length of the resulting analog with a deletion mutation is about 10 amino acids, preferably 100 amino acids.

Allelic variants of GlcNAc-TV-b or GlcNAc-TV-c at the protein level differ from one another by only one, or at most a few, amino acid substitutions. A species variation of a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein is a variation which is naturally occurring among different species of an organism.

The proteins of the invention also include homologs of GlcNAc-TV-b or GlcNAc-TV-c and/or truncations thereof as described herein. Such GlcNAc-TV-b or GlcNAc-TV-c homologs include proteins whose amino acid sequences are comprised of the amino acid sequences of GlcNAc-TV-b or GlcNAc-TV-c regions from other species that hybridize under selective hybridization conditions (see discussion of selective and in particular stringent hybridization conditions herein) with a probe used to obtain a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein. These homologs will generally have the same regions which are characteristic of a GlcNAc-TV-b or GlcNAc-TV-c Protein. It is anticipated that a protein comprising an amino acid sequence which has at least 70% identity, more preferably at least 75% identity, most preferably 80 to 90% identity, with an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 will be a homolog of a protein of the invention. A percent amino acid sequence homology or identity is calculated using the methods described herein, preferably the computer programs described herein.

The invention also contemplates isoforms of the proteins of the invention. An isoform contains the same number and kinds of amino acids as the protein of the invention, but the isoform has a different molecular structure. The isoforms contemplated by the present invention preferably have the same properties as the protein of the invention as described herein.

The present invention also includes GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins conjugated with a selected protein, or a marker protein (see below), or other glycosyltransferase, to produce fusion proteins or chimeric proteins.

A GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention may be prepared using recombinant DNA methods.

Accordingly, the nucleic acids of the present invention having a sequence which encodes a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention may be incorporated in a known manner into an appropriate expression vector which ensures good expression of the protein. Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), so long as the vector is compatible with the host cell used.

The invention therefore contemplates a recombinant expression vector of the invention containing a nucleic acid molecule of the invention, and the necessary regulatory sequences for the transcription and translation of the inserted protein-sequence. Suitable regulatory sequences may be derived from a variety of sources, including bacterial, fungal, viral, mammalian, or insect genes (For example, see the regulatory sequences described in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990)). Selection of appropriate regulatory sequences is dependent on the host cell chosen as discussed below, and may be readily accomplished by one of ordinary skill in the art. The necessary regulatory sequences may be supplied by the native GlcNAc-TV Protein and/or its flanking regions.

The invention further provides a recombinant expression vector comprising a nucleic acid molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is linked to a regulatory sequence in a manner which allows for expression, by transcription of the DNA molecule, of an RNA molecule which is antisense to a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 7, 8, 9, or 11. Regulatory sequences linked to the antisense nucleic acid can be chosen which direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance a viral promoter and/or enhancer, or regulatory sequences can be chosen which direct tissue or cell type specific expression of antisense RNA.

The recombinant expression vectors of the invention may also contain a marker gene which facilitates the selection of host cells transformed or transfected with a recombinant molecule of the invention. Examples of marker genes are genes encoding a protein such as G418, dhfr, npt, als, pat and hygromycin which confer resistance to certain drugs, β-galactosidase, chloramphenicol acetyltransferase, firefly luciferase, trpB, hisD, herpes simplex virus thymidine kinase, adenine phosphoribosyl transferase, or an immunoglobulin or portion thereof such as the Fc portion of an immunoglobulin, preferably IgG. Visible markers such as anthocyanins, beta-glucuronidase and its substrate GUS, and luciferase and its substrate luciferin, can be used to identify transformants, and also to quantify the amount of transient or stable protein expression attributable to a specific vector system (Rhodes, C. et al. (1995) Mol. Biol. 55:121-131). The markers can be introduced on a separate vector from the nucleic acid of interest.

The recombinant expression vectors may also contain genes that encode a fusion moiety which provides increased expression of the recombinant protein; increased solubility of the recombinant protein; and aid in the purification of the target recombinant protein by acting as a ligand in affinity purification. For example, a proteolytic cleavage site may be added to the target recombinant protein to allow separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Typical fusion expression vectors include pGEX (Amrad Corp., Melbourne, Australia), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the recombinant protein.

The vectors may be introduced into host cells to produce a transformed or transfected host cell. The terms "transfected" and "transfection" encompass the introduction of nucleic acid (e.g., a vector) into a cell by one of many standard techniques. A cell is "transformed" by a nucleic acid when the transfected nucleic acid effects a phenotypic change. Prokaryotic cells can be transfected or transformed with nucleic acid by, for example, electroporation or calcium-chloride mediated transformation. Nucleic acid can be introduced into mammalian cells via conventional techniques such as calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for transforming and transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory textbooks.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained and expressed in a plasmid. HACs of 6 to 10M are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes.

Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For example, the proteins of the invention may be expressed in bacterial cells such as E. coli, insect cells (using baculovirus), yeast cells, or mammalian cells. Other suitable host cells can be found in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1991).

A host cell may also be chosen which modulates the expression of an inserted nucleic acid sequence, or modifies (e.g., glycosylation or phosphorylation) and processes (e.g., cleaves) the protein in a desired fashion. Host systems or cell lines may be selected which have specific and characteristic mechanisms for post-translational processing and modification of proteins. For example, eukaryotic host cells including CHO, VERO, BHK, A431, HeLa, COS, MDCK, 293, 3T3, and WI38 may be used. For long-term high-yield stable expression of the protein, cell lines and host systems which stably express the gene product may be engineered.

Host cells and in particular cell lines produced using the methods described herein may be particularly useful in screening and evaluating compounds that modulate the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein.

The proteins of the invention may also be expressed in non-human transgenic animals including but not limited to mice, rats, rabbits, guinea pigs, micro-pigs, goats, sheep, pigs, non-human primates (e.g., baboons, monkeys, and chimpanzees) (see Hammer et al. (Nature 315:680-683, 1985), Palmiter et al. (Science 222:809-814, 1983), Brinster et al. (Proc Natl. Acad. Sci USA 82:4438-4442, 1985), Palmiter and Brinster (Cell. 41:343-345, 1985) and U.S. Patent No. 4,736,866). Procedures known in the art may be used to introduce a nucleic acid molecule of the invention encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein into animals to produce the founder lines of transgenic animals. Such procedures include pronuclear microinjection, retrovirus mediated gene transfer into germ lines, gene targeting in embryonic stem cells, electroporation of embryos, and sperm-mediated gene transfer.

The present invention contemplates transgenic animals that carry the GlcNAc-TV-b or GlcNAc-TV-c gene in all their cells, and animals which carry the transgene in some but not all their cells. The transgene may be integrated as a single transgene or in concatamers. The transgene may be selectively introduced into and activated in specific cell types (See for example, Lasko et al, 1992 Proc. Natl. Acad. Sci. USA 89: 6236). The transgene may be integrated into the chromosomal site of the endogenous gene by gene targeting. The transgene may be selectively introduced into a particular cell type inactivating the endogenous gene in that cell type (See Gu et al Science 265: 103-106).

The expression of a recombinant GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein in a transgenic animal may be assayed using standard techniques. Initial screening may be conducted by Southern Blot analysis, or PCR methods to analyze whether the transgene has been integrated. The level of mRNA expression in the tissues of transgenic animals may also be assessed using techniques including Northern blot analysis of tissue samples, in situ hybridization, and RT-PCR. Tissue may also be evaluated immunocytochemically using antibodies against a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein of the invention.

Proteins of the invention may also be prepared by chemical synthesis using techniques well known in the chemistry of proteins such as solid phase synthesis (Merrifield, 1964, J. Am. Chem. Assoc. 85:2149-2154) or synthesis in homogenous solution (Houbenweyl, 1987, Methods of Organic Chemistry, ed. E. Wansch, Vol. 15 I and II, Thieme, Stuttgart). Protein synthesis may be performed using manual procedures or by automation. Automated synthesis may be carried out, for example, using an Applied Biosystems 431A peptide synthesizer (Perkin Elmer). Various fragments of the proteins of the invention may be chemically synthesized separately and combined using chemical methods to produce the full length molecule.

N-terminal or C-terminal fusion proteins or chimeric proteins comprising a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention conjugated with other molecules, such as proteins (e.g., markers or other glycosyltransferases), may be prepared by fusing, through recombinant techniques, the N-terminal or C-terminal of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein, and the sequence of a selected protein or marker protein with a desired biological function. The resultant fusion proteins contain a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein fused to the selected protein or marker protein as described herein. Examples of proteins which may be used to prepare fusion proteins include immunoglobulins, glutathione-S-transferase (GST), protein A, hemagglutinin and truncated myc.

Antibodies

A protein of the invention, or a portion thereof, can be used to prepare antibodies specific for the proteins. Antibodies can be prepared which bind a distinct epitope in an unconserved region of the protein. An unconserved region of the protein is one that does not have substantial sequence homology to other proteins. A region from a conserved region such as a well-characterized domain can also be used to prepare an antibody to a conserved region of a protein of the invention.

In an embodiment of the invention, oligopeptides, peptides, or fragments used to induce antibodies to a protein of the invention have an amino acid sequence consisting of at least 5 amino acids and more preferably at least 10 amino acids. The oligopeptides, etc. can be identical to a portion of the amino acid sequence of the natural protein, and they may contain the entire amino acid sequence of a small, naturally occurring molecule. Antibodies having specificity for a protein of the invention may also be raised from fusion proteins created by expressing fusion proteins in bacteria as described herein.

The invention can employ intact monoclonal or polyclonal antibodies, and immunologically active fragments (e.g., a Fab or (Fab)2 fragment), an antibody heavy chain, an antibody light chain, a genetically engineered single chain Fv molecule (Ladner et al, U.S. Pat. No. 4,946,778), or a chimeric antibody, for example, an antibody which contains the binding specificity of a murine antibody, but in which the remaining portions are of human origin. Antibodies including monoclonal and polyclonal antibodies, fragments and chimeras, etc. may be prepared using methods known to those skilled in the art.

Applications of the Nucleic Acid Molecules, Proteins, and Antibodies of the Invention

The nucleic acid molecules, GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins, and antibodies of the invention may be used in the prognostic and diagnostic evaluation of conditions requiring modulation of a nucleic acid or protein of the invention including cancer, and the identification of subjects with a predisposition to such conditions (See below). Methods for detecting nucleic acid molecules and proteins of the invention can be used to monitor conditions requiring modulation of the nucleic acids or proteins including cancer (e.g., solid tumors, such as breast and uterine cancer) by detecting and localizing the proteins and nucleic acids. It would also be apparent to one skilled in the art that the methods described herein may be used to study the developmental expression of the proteins of the invention and, accordingly, will provide further insight into the role of the proteins. The applications of the present invention also include methods for the identification of compounds which modulate the biological activity of a protein of the invention (See below). The compounds, antibodies, etc. may be used for the treatment of conditions requiring modulation of proteins of the invention including cancer (e.g., solid tumors, such as breast and uterine cancer) (See below).

Diagnostic Methods

A variety of methods can be employed for the diagnostic and prognostic evaluation of conditions requiring modulation of a nucleic acid or protein of the invention including cancer (e.g. solid tumors, breast and uterine cancer), and the identification of subjects with a predisposition to such conditions. Such methods may, for example, utilize nucleic acid molecules of the invention, and fragments thereof, and antibodies directed against proteins of the invention, including peptide fragments. In particular, the nucleic acids and antibodies may be used, for example, for: the detection of the presence of glcNAc-TV-b or glcNAc-TV-c mutations, or the detection of either over- or under-expression of glcNAc-TV-b or glcNAc-TV-c mRNA relative to a non-disorder state or the qualitative or quantitative detection of alternatively spliced forms of glcNAc-TV-b or glcNAc-TV-c transcripts which may correlate with certain conditions or susceptibility toward such conditions; and the detection of either an over- or an under-abundance of a protein of the invention relative to a non-disorder state or the presence of a modified (e.g. less than full length) protein of the invention which correlates with a disorder state, or a progression toward a disorder state.

The methods described herein may be performed by utilizing pre-packaged diagnostic kits comprising at least one specific nucleic acid or antibody described herein, which may be conveniently used, in clinical settings, to screen and diagnose patients and to screen and identify those individuals exhibiting a predisposition to developing a disorder.

Nucleic acid-based detection techniques and peptide detection techniques are described below.

The samples that may be analyzed using the methods of the invention include those which are known or suspected to express glcNAc-TV-b or glcNAc-TV-c or contain a protein of the invention. The methods may be performed on biological samples including but not limited to cells, lysates of cells which have been incubated in cell culture, chromosomes isolated from a cell (e.g. a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern analysis), RNA (in solution or bound to a solid support such as for northern analysis), cDNA (in solution or bound to a solid support), an extract from cells or a tissue, and biological fluids such as serum, urine, blood, and CSF. The samples may be derived from a patient or a culture.

Methods for Detecting Nucleic Acid Molecules of the Invention

A nucleic acid molecule encoding a protein of the invention may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; or in dipstick, pin, ELISA assays or microarrays utilizing fluids or tissues from patient biopsies to detect altered expression. Such qualitative or quantitative methods are well known in the art and some methods are described below.

The nucleic acid molecules of the invention allow those skilled in the art to construct nucleotide probes for use in the detection of nucleic acid sequences of the invention in biological materials. Suitable probes include nucleic acid molecules based on nucleic acid sequences encoding at least 5 sequential amino acids from regions of the GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecules (see SEQ. ID. No. 1, 3, 5, 7, 8, 9, or 11); preferably they comprise 15 to 30 nucleotides. A nucleotide probe may be labeled with a detectable substance such as a radioactive label which provides for an adequate signal and has sufficient half-life such as 32P, 3H, 14C or the like. Other detectable substances which may be used include antigens that are recognized by a specific labeled antibody, fluorescent compounds, enzymes, antibodies specific for a labeled antigen, and luminescent compounds. An appropriate label may be selected having regard to the rate of hybridization and binding of the probe to the nucleotide to be detected and the amount of nucleotide available for hybridization. Labeled probes may be hybridized to nucleic acids on solid supports such as nitrocellulose filters or nylon membranes as generally described in Sambrook et al, 1989, Molecular Cloning, A Laboratory Manual (2nd ed.).

The nucleic acid probes may be used to detect glcNAc-TV-b or glcNAc-TV-c genes, preferably in human cells. The nucleotide probes may also be useful for example in the diagnosis or prognosis of cancer, the staging of the cancer, and in monitoring the progression of these conditions, or monitoring a therapeutic treatment. The probes may also be useful for mapping the naturally occurring genomic sequence. Sequences can be mapped to a particular chromosome, to a specific region of a chromosome, or to an artificial chromosome construction (e.g. HACs, yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions or single chromosome cDNA libraries; see Price, C.M. 1993, Blood Rev. 7:127-134 and Trask, B.J. 1991, Trends Genet. 7:149-154).

The probe may be used in hybridization techniques to detect glcNAc-TV-b or glcNAc-TV-c genes. The technique generally involves contacting and incubating nucleic acids (e.g. recombinant DNA molecules, cloned genes) obtained from a sample from a patient or other cellular source with a probe of the present invention under conditions favourable for the specific annealing of the probes to complementary sequences in the nucleic acids. After incubation, the non-annealed nucleic acids are removed, and the presence of nucleic acids that have hybridized to the probe, if any, is detected.

The detection of nucleic acid molecules of the invention may involve the amplification of specific gene sequences using an amplification method such as PCR, followed by the analysis of the amplified molecules using techniques known to those skilled in the art. Suitable primers can be routinely designed by one of skill in the art.

Genomic DNA may be used in hybridization or amplification assays of biological samples to detect abnormalities involving glcNAc-TV-b or glcNAc-TV-c structure, including point mutations, insertions, deletions, and chromosomal rearrangements. For example, direct sequencing, single stranded conformational polymorphism analyses, heteroduplex analysis, denaturing gradient gel electrophoresis, chemical mismatch cleavage, and oligonucleotide hybridization may be utilized.

Genotyping techniques known to one skilled in the art can be used to type polymorphisms that are in close proximity to the mutations in a glcNAc-TV-b or glcNAc-TV-c gene. The polymorphisms may be used to identify individuals in families that are likely to carry mutations. If a polymorphism exhibits linkage disequilibrium with mutations in the glcNAc-TV-b or glcNAc-TV-c genes, it can also be used to screen for individuals in the general population likely to carry mutations. Polymorphisms which may be used include restriction fragment length polymorphisms (RFLPs), single-base polymorphisms, and simple sequence repeat polymorphisms (SSLPs).

A probe of the invention may be used to directly identify RFLPs. A probe or primer of the invention can additionally be used to isolate genomic clones such as YACs, BACs, PACs, cosmids, phage or plasmids. The DNA in the clones can be screened for SSLPs using hybridization or sequencing procedures.

Hybridization and amplification techniques described herein may be used to assay qualitative and quantitative aspects of glcNAc-TV-b or glcNAc-TV-c expression. For example, RNA may be isolated from a cell type or tissue known to express glcNAc-TV-b (e.g. brain) and tested utilizing the hybridization (e.g. standard Northern analyses) or PCR techniques referred to herein. The techniques may be used to detect differences in transcript size which may be due to normal or abnormal alternative splicing. The techniques may be used to detect quantitative differences between levels of full length and/or alternatively spliced transcripts detected in normal individuals relative to those individuals exhibiting symptoms of a disease such as cancer.

The primers and probes may be used in the above described methods in situ, i.e. directly on tissue sections (fixed and/or frozen) of patient tissue obtained from biopsies or resections.

Microarrays

Oligonucleotides derived from any of the nucleic acid molecules of the invention may be used as targets in microarrays. "Microarray" refers to an array of distinct polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon, or other type of membrane, filter, chip, glass slide, or any other suitable solid support.

The microarrays can be used to monitor the expression level of large numbers of genes simultaneously (to produce a transcript image) and to identify genetic variants, mutations, and polymorphisms. This information can be useful in determining gene function, understanding the genetic basis of disease, diagnosing disease, and in developing and monitoring the activity of therapeutic agents (Heller, R. et al. (1997) Proc. Natl. Acad. Sci. 94:2150-55).

In an embodiment of the invention, the microarray is prepared and used according to the methods described in PCT application WO95/11995 (Chee et al), Lockhart, D. J. et al (1996, Nat. Biotech. 14:1675-1680) and Schena, M. et al (1996, Proc. Natl. Acad. Sci. 93:10614-10619).

The microarray can be composed of a large number of unique, single-stranded nucleic acid sequences, usually either synthetic antisense oligonucleotides or fragments of cDNAs fixed to a solid support. The oligonucleotides can be about 6-60 nucleotides in length, preferably 15-30 nucleotides in length, and most preferably about 20-25 nucleotides in length. For some microarrays it may be preferred to use oligonucleotides which are about 7-10 nucleotides in length. The microarray can contain oligonucleotides covering the known 5' or 3' sequence, sequential oligonucleotides covering the full length sequence, or unique oligonucleotides selected from particular areas along the length of the sequence. Polynucleotides used in the microarray can be oligonucleotides specific to a gene(s) of interest in which at least a fragment of the sequence is known or that are specific to one or more unidentified cDNAs which are common to particular cell types, or developmental or disease state.

To produce oligonucleotides to a known sequence for a microarray, a gene of interest is examined using a computer algorithm which starts at the 5' or more preferably at the 3' end of the nucleotide sequence. The algorithm identifies oligomers of a defined length that are unique to the gene, have a GC content within a suitable range for hybridization, and lack predicted secondary structure that can interfere with hybridization. In some cases it may be appropriate to use pairs of oligonucleotides on a microarray. The "pairs" will be identical, except for a single nucleotide which can be located in the center of the sequence. The second oligonucleotide in the pair serves as a control. The number of oligonucleotide pairs may range from two to one million. The oligomers are synthesized at designated areas on a substrate using a light-directed chemical process.
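The selection steps described above can be illustrated with a minimal Python sketch. It is not the algorithm referred to in the specification, which is not disclosed in detail. The function names, the default length of 20 nucleotides, the 40-60% GC window, and the choice of the complementary base for the central mismatch are all illustrative assumptions, and the secondary-structure check is omitted.

```python
# Hypothetical sketch: scan a gene from its 3' end for oligomers of a fixed
# length that are unique and fall within an assumed GC-content window.
# Prediction of secondary structure is deliberately left out.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_oligos(gene, length=20, gc_min=0.4, gc_max=0.6, background=()):
    """List (start, oligo) pairs, starting at the 3' end of the gene.

    An oligo is kept if its GC fraction lies in [gc_min, gc_max], it occurs
    exactly once in the gene, and it is absent from every background sequence.
    """
    gene = gene.upper()
    others = [other.upper() for other in background]
    hits = []
    for start in range(len(gene) - length, -1, -1):
        oligo = gene[start:start + length]
        if not gc_min <= gc_fraction(oligo) <= gc_max:
            continue
        # find/rfind disagree when the oligo occurs more than once
        # (overlapping occurrences included).
        if gene.find(oligo) != gene.rfind(oligo):
            continue
        if any(oligo in other for other in others):
            continue
        hits.append((start, oligo))
    return hits


def mismatch_partner(oligo):
    """Return the control oligo of a pair: identical except for the central
    base, which is replaced here (by assumption) with its complement."""
    oligo = oligo.upper()
    mid = len(oligo) // 2
    return oligo[:mid] + COMPLEMENT[oligo[mid]] + oligo[mid + 1:]
```

In use, each oligomer returned by `candidate_oligos` would be paired with `mismatch_partner(oligo)` to form the oligonucleotide pairs mentioned above. A real design would also screen for predicted secondary structure and compare candidates against a full transcript database rather than a short background list.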

The oligomers can be synthesized on the surface of the substrate by using a chemical coupling procedure and an ink jet application apparatus, such as described in PCT application WO95/251116 (Baldeschweiler et al). A "gridded" array analogous to a dot (or slot) blot can also be used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical bonding procedures. An array can be produced by hand or using available devices (slot blot or dot blot apparatus), materials (any suitable solid support), and machines (including robotic instruments) and it can contain 8, 24, 96, 384, 1536 or 6144 oligonucleotides, or any other multiple between two and one million which lends itself to the efficient use of commercially available instrumentation.

Sample analysis using microarrays is conducted by making RNA or DNA from a biological sample into hybridization probes. The mRNA is isolated, and cDNA is prepared and used as a template to make antisense RNA (aRNA). The aRNA is amplified in the presence of fluorescent nucleotides, and labeled hybridization probes are incubated with the microarray so that the probe sequences hybridize to complementary oligonucleotides of the microarray. Incubation conditions are selected so that hybridization occurs with precise complementary matches or with various degrees of less complementarity. After removal of nonhybridized probes, a scanner determines the levels and patterns of fluorescence. The scanned images are examined to determine the degree of complementarity and the relative quantity of each oligonucleotide sequence on the microarray. The biological samples may be obtained from any bodily fluids (such as blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. A detection system can be used to measure the absence, presence, and amount of hybridization for all of the distinct sequences simultaneously. These data can be used for large scale correlation studies on the sequences, mutations, variants, or polymorphisms among samples.

Methods for Detecting Proteins

Antibodies specifically reactive with a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein, or derivatives, such as enzyme conjugates or labeled derivatives, may be used to detect GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins in various biological materials. They may be used as diagnostic or prognostic reagents and they may be used to detect abnormalities in the level of expression of GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins, or abnormalities in the structure, and/or temporal, tissue, cellular, or subcellular location of the proteins. Antibodies may also be used to screen potentially therapeutic compounds in vitro to determine their effects on a condition such as cancer. In vitro immunoassays may also be used to assess or monitor the efficacy of particular therapies. The antibodies of the invention may also be used in vitro to determine the level of GlcNAc-TV-b or GlcNAc-TV-c expression in cells genetically engineered to produce a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein.

The antibodies may be used in any known immunoassays which rely on the binding interaction between an antigenic determinant of a protein of the invention, and the antibodies. Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g. ELISA), immunofluorescence, immunoprecipitation, latex agglutination, hemagglutination, and histochemical tests. The antibodies may be used to detect and quantify proteins of the invention in a sample in order to determine their role in particular cellular events or pathological states, and to diagnose and treat such pathological states.

In particular, the antibodies of the invention may be used in immuno-histochemical analyses, for example, at the cellular and subcellular level, to detect a protein of the invention, to localise it to particular cells and tissues, and to specific subcellular locations, and to quantitate the level of expression.

Cytochemical techniques known in the art for localizing antigens using light and electron microscopy may be used to detect a protein of the invention. Generally, an antibody of the invention may be labeled with a detectable substance and a protein may be localised in tissues and cells based upon the presence of the detectable substance. Various methods of labeling polypeptides and glycoproteins are known in the art and may be used. Examples of detectable substances include, but are not limited to, the following: radioisotopes (e.g. 3H, 14C, 35S, 125I, 131I), fluorescent labels (e.g. FITC, rhodamine, lanthanide phosphors), luminescent labels such as luminol; enzymatic labels (e.g. horseradish peroxidase, β-galactosidase, luciferase, alkaline phosphatase, acetylcholinesterase), biotinyl groups (which can be detected by marked avidin, e.g. streptavidin containing a fluorescent marker or enzymatic activity that can be detected by optical or colorimetric methods), and predetermined polypeptide epitopes recognized by a secondary reporter (e.g. leucine zipper pair sequences, binding sites for secondary antibodies, metal binding domains, epitope tags). In some embodiments, labels are attached via spacer arms of various lengths to reduce potential steric hindrance. Antibodies may also be coupled to electron dense substances, such as ferritin or colloidal gold, which are readily visualised by electron microscopy.

The antibody or sample may be immobilized on a carrier or solid support which is capable of immobilizing cells, antibodies etc. For example, the carrier or support may be nitrocellulose, or glass, polyacrylamides, gabbros, and magnetite. The support material may have any possible configuration including spherical (e.g. bead), cylindrical (e.g. inside surface of a test tube or well, or the external surface of a rod), or flat (e.g. sheet, test strip). Indirect methods may also be employed in which the primary antigen-antibody reaction is amplified by the introduction of a second antibody, having specificity for the antibody reactive against a protein of the invention. By way of example, if the antibody having specificity against a protein of the invention is a rabbit IgG antibody, the second antibody may be goat anti-rabbit gamma-globulin labelled with a detectable substance as described herein.

Where a radioactive label is used as a detectable substance, a protein of the invention may be localized by radioautography. The results of radioautography may be quantitated by determining the density of particles in the radioautographs by various optical methods, or by counting the grains.

Methods for Identifying or Evaluating Substances/Compounds

The methods described herein are designed to identify substances that modulate the biological activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein including substances that interfere with, or enhance the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein.

The substances and compounds identified using the methods of the invention include but are not limited to peptides such as soluble peptides including Ig-tailed fusion peptides, members of random peptide libraries and combinatorial chemistry-derived molecular libraries including libraries made of D- and/or L-configuration amino acids, phosphopeptides (including members of random or partially degenerate, directed phosphopeptide libraries), antibodies [e.g. polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, single chain antibodies, fragments (e.g. Fab, F(ab)2 and Fab expression library fragments, and epitope-binding fragments thereof)], and small organic or inorganic molecules.

The substance or compound may be an endogenous physiological compound or it may be a natural or synthetic compound.

Substances which modulate a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be identified based on their ability to associate with (or bind to) a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Therefore, the invention also provides methods for identifying substances which associate with a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Substances identified using the methods of the invention may be isolated, cloned and sequenced using conventional techniques. A substance that associates with a protein of the invention may be an agonist or antagonist of the biological or immunological activity of a polypeptide of the invention.

The term "agonist" refers to a molecule that increases the amount of, or prolongs the duration of, the activity of the protein. The term "antagonist" refers to a molecule which decreases the biological or immunological activity of the protein. Agonists and antagonists may include proteins, nucleic acids, carbohydrates, or any other molecules that associate with a polypeptide of the invention.

Substances which can associate with a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be identified by reacting a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein with a test substance which potentially associates with a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, under conditions which permit the association, and removing and/or detecting GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein associated with the test substance. Substance-protein complexes, free substance, or non-complexed protein may be assayed. Conditions which permit the formation of substance-protein complexes may be selected having regard to factors such as the nature and amounts of the substance and the protein.

The substance-protein complex, free substance or non-complexed proteins may be isolated by conventional isolation techniques, for example, salting out, chromatography, electrophoresis, gel filtration, fractionation, absorption, polyacrylamide gel electrophoresis, agglutination, or combinations thereof. To facilitate the assay of the components, antibody against a protein of the invention or the substance, or labeled protein, or a labeled substance may be utilized. The antibodies, proteins, or substances may be labeled with a detectable substance as described above.

A GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, or the substance used in the method of the invention may be insolubilized. For example, a protein, or substance may be bound to a suitable carrier such as agarose, cellulose, dextran, Sephadex, Sepharose, carboxymethyl cellulose, polystyrene, filter paper, ion-exchange resin, plastic film, plastic tube, glass beads, polyamine-methyl vinyl-ether-maleic acid copolymer, amino acid copolymer, ethylene-maleic acid copolymer, nylon, silk, etc. The carrier may be in the shape of, for example, a tube, test plate, beads, disc, sphere etc. The insolubilized protein or substance may be prepared by reacting the material with a suitable insoluble carrier using known chemical or physical methods, for example, cyanogen bromide coupling.

The invention also contemplates a method for evaluating a compound for its ability to modulate the biological activity of a protein of the invention, by assaying for an agonist or antagonist (i.e. enhancer or inhibitor) of the association of the protein with a substance which associates with the protein. The basic method for evaluating if a compound is an agonist or antagonist of the association of a protein of the invention and a substance that associates with the protein, is to prepare a reaction mixture containing the protein and the substance under conditions which permit the formation of substance-protein complexes, in the presence of a test compound. The test compound may be initially added to the mixture, or may be added subsequent to the addition of the protein and substance. Control reaction mixtures without the test compound or with a placebo are also prepared. The formation of complexes is detected and the formation of complexes in the control reaction but not in the reaction mixture indicates that the test compound interferes with the interaction of the protein and substance.

The reactions may be carried out in the liquid phase or the protein, substance, or test compound may be immobilized as described herein.

It will be understood that the agonists and antagonists (i.e. inhibitors and enhancers) that can be assayed using the methods of the invention may act on one or more of the binding sites on the protein or substance including agonist binding sites, competitive antagonist binding sites, non-competitive antagonist binding sites or allosteric sites.

The invention also makes it possible to screen for antagonists that inhibit the effects of an agonist of the interaction of a protein of the invention with a substance which is capable of binding to the protein. Thus, the invention may be used to assay for a compound that competes for the same binding site of a protein of the invention.

Substances that modulate a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein of the invention can be identified based on their ability to interfere with or enhance the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Therefore, the invention provides a method for evaluating a compound for its ability to modulate the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein comprising (a) reacting an acceptor and a sugar donor for a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein in the presence of a test substance; (b) measuring the amount of sugar donor transferred to acceptor; and (c) carrying out steps (a) and (b) in the absence of the test substance to determine if the substance interferes with or enhances transfer of the sugar donor to the acceptor by the GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein.
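The comparison between reactions run with and without the test substance can be expressed as a simple calculation. The following Python sketch is illustrative only: the function names, the optional blank subtraction, and the 20% decision threshold are assumptions and are not specified in the description.

```python
# Hypothetical sketch: compare sugar-donor transfer measured in the presence
# and absence of a test substance. Units are arbitrary (e.g. counts of a
# labeled donor incorporated into acceptor) but must match across inputs.

def percent_change_in_transfer(with_substance, without_substance, blank=0.0):
    """Return the percent change in transfer caused by the test substance.

    Negative values mean less transfer than the control reaction; positive
    values mean more. `blank` is an optional background reading (assumed).
    """
    control = without_substance - blank
    if control <= 0:
        raise ValueError("control reaction shows no transfer above blank")
    test = with_substance - blank
    return 100.0 * (test - control) / control


def classify_effect(change, threshold=20.0):
    """Label the effect using an assumed, arbitrary percent threshold."""
    if change <= -threshold:
        return "interferes"
    if change >= threshold:
        return "enhances"
    return "no clear effect"
```

For example, a reaction giving 500 units with the test substance against 1000 units without it corresponds to a change of -50%, which this sketch would label as interfering with transfer. In practice the threshold would be set from replicate measurements rather than fixed in advance.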

Suitable acceptors for use in the method of the invention are a saccharide, oligosaccharides, polysaccharides, glycopeptides, glycoproteins, or glycolipids which are either synthetic with linkers at the reducing end or naturally occurring structures, for example, asialo-agalacto-fetuin glycopeptide.

The sugar donor may be a nucleotide sugar, dolichol-phosphate-sugar or dolichol-pyrophosphate-oligosaccharide, for example, uridine diphospho-N-acetylglucosamine (UDP-GlcNAc), or derivatives or analogs thereof. The GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be obtained from natural sources or produced using recombinant methods as described herein.

The acceptor or sugar donor may be labeled with a detectable substance as described herein, and the interaction of the protein of the invention with the acceptor and sugar donor will give rise to a detectable change. The detectable change may be colorimetric, photometric, radiometric, potentiometric, etc. The activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein of the invention may also be determined using methods based on HPLC (Koenderman et al., FEBS Lett. 222:42, 1987) or methods employing synthetic oligosaccharide acceptors attached to hydrophobic aglycones (Palcic et al, Glycoconjugate J. 5:49, 1988; and Pierce et al, Biochem. Biophys. Res. Comm. 146:679, 1987).

The GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein is reacted with the acceptor and sugar donor at a pH and temperature and in the presence of a metal cofactor, usually a divalent cation like manganese, effective for the protein to transfer the sugar donor to the acceptor, and where one of the components is labeled, to produce a detectable change. It is preferred to use a buffer with the acceptor and sugar donor to maintain the pH within the pH range effective for the proteins. The buffer, acceptor and sugar donor may be used as an assay composition. Other compounds such as EDTA and detergents may be added to the assay composition.

The reagents suitable for applying the methods of the invention to evaluate compounds that modulate a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be packaged into convenient kits providing the necessary materials packaged into suitable containers. The kits may also include suitable supports useful in performing the methods of the invention.

Compositions and Treatments

The nucleic acid molecules and proteins of the invention and substances or compounds identified by the methods described herein, antibodies, and antisense nucleic acid molecules of the invention may be used for modulating the biological activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, and they may be used to treat or prevent cancer, inhibit or treat tumor metastasis, stimulate hematopoietic progenitor cell growth, confer protection against chemotherapy and radiation therapy in a subject, and/or treat proliferative disorders, microbial or parasitic infections, or neurological disorders.

The substances, compounds, etc. of the invention may be especially useful in the treatment of various forms of neoplasia such as melanomas, adenomas, sarcomas, and particularly carcinomas of solid tissues in patients. In particular the composition may be used for treating cervico-uterine cancer, cancer of the kidney, brain, stomach, lung, rectum, breast, bowel, gastric, liver, thyroid, neck, cervix, salivary gland, bile duct, pelvis, mediastinum, urethra, bronchogenic, bladder, esophagus and colon, and Kaposi's Sarcoma which is a form of cancer associated with HIV-infected patients with Acquired Immune Deficiency Syndrome (AIDS).

Accordingly, the proteins, substances, antibodies, and compounds etc. may be formulated into pharmaceutical compositions for administration to subjects in a biologically compatible form suitable for administration in vivo. By "biologically compatible form suitable for administration in vivo" is meant a form of the substance to be administered in which any toxic effects are outweighed by the therapeutic effects. The substances may be administered to living organisms including humans, and animals. A therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an amount effective, at dosages and for periods of time necessary, to achieve the desired result. For example, a therapeutically active amount of a substance may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of antibody to elicit a desired response in the individual. Dosage regimens may be adjusted to provide the optimum therapeutic response. For example, several divided doses may be administered daily or the dose may be proportionally reduced as indicated by the exigencies of the therapeutic situation.

The active substance may be administered in a convenient manner such as by injection (subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or rectal administration. Depending on the route of administration, the active substance may be coated in a material to protect the compound from the action of enzymes, acids and other natural conditions that may inactivate the compound.

The compositions described herein can be prepared by per se known methods for the preparation of pharmaceutically acceptable compositions which can be administered to subjects, such that an effective quantity of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., USA 1985).

On this basis, the compositions include, albeit not exclusively, solutions of the substances or compounds in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and iso-osmotic with the physiological fluids.

After pharmaceutical compositions have been prepared, they can be placed in an appropriate container and labeled for treatment of an indicated condition. For administration of a composition of the invention the labeling would include amount, frequency, and method of administration.

The compositions, substances, compounds, etc. may be indicated as therapeutic agents either alone or in conjunction with other therapeutic agents or other forms of treatment (e.g. chemotherapy or radiotherapy). They can be used to enhance activation of macrophages, T cells, and NK cells in the treatment of cancer and immunosuppressive diseases. By way of example, they can be used in combination with anti-proliferative agents, antimicrobial agents, immunostimulatory agents, or anti-inflammatories. In particular, they can be used in combination with anti-viral and/or anti-proliferative agents, such as Th1 cytokines including interleukin-2, interleukin-12, and interferon-γ, and nucleoside analogues such as AZT and 3TC. They can be administered concurrently, separately, or sequentially with other therapeutic agents or therapies.

The nucleic acid molecules encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein or any fragment thereof, or antisense sequences may be used for therapeutic purposes. Antisense to a nucleic acid molecule encoding a protein of the invention may be used in situations to block the synthesis of the protein. In particular, cells may be transformed with sequences complementary to nucleic acid molecules encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Thus, antisense sequences may be used to modulate GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein activity, or to achieve regulation of gene function. Sense or antisense oligomers or larger fragments can be designed from various locations along the coding or regulatory regions of sequences encoding a protein of the invention.

Expression vectors may be derived from retroviruses, adenoviruses, herpes or vaccinia viruses or from various bacterial plasmids for delivery of nucleic acid sequences to the target organ, tissue, or cells. Vectors that express antisense nucleic acid sequences of glcNAc-TV-b or glcNAc-TV-c can be constructed using techniques well known to those skilled in the art (see for example, Sambrook et al. (supra)).

Genes encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be turned off by transforming a cell or tissue with expression vectors that express high levels of a nucleic acid molecule or fragment thereof which encodes a protein of the invention. Such constructs may be used to introduce untranslatable sense or antisense sequences into a cell. Even if they do not integrate into the DNA, the vectors may continue to transcribe RNA molecules until all copies are disabled by endogenous nucleases. Transient expression may last for extended periods of time (e.g. a month or more) with a non-replicating vector, or longer if appropriate replication elements are part of the vector system.

Modification of gene expression may be achieved by designing antisense molecules, DNA, RNA, or peptide nucleic acid (PNA), to the control regions of a glcNAc-TV-b or glcNAc-TV-c gene, i.e. the promoters, enhancers, and introns. Preferably the antisense molecules are oligonucleotides derived from the transcription initiation site (e.g. between positions -10 and +10 from the start site). Inhibition can also be achieved by using triple-helix base-pairing techniques. Triple helix pairing inhibits the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules (see Gee J.E. et al (1994) In: Huber, B.E. and B.I. Carr, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y.). An antisense molecule may also be designed to block translation of mRNA by inhibiting binding of the transcript to the ribosomes.
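The -10 to +10 window mentioned above can be illustrated with a short Python sketch that extracts the 20 nucleotides flanking a start site and returns their reverse complement as a candidate antisense oligonucleotide. The function names, the example sequence, and the assumed start-site index are hypothetical and are not taken from the sequences of the invention; the sketch makes no claim about the efficacy of any oligonucleotide.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(dna: str, start_index: int, flank: int = 10) -> str:
    """Antisense oligonucleotide spanning positions -flank..+flank.

    start_index is the 0-based index of position +1 (the start site).
    There is no position 0, so the window holds 2 * flank nucleotides.
    """
    begin = start_index - flank
    end = start_index + flank
    if begin < 0 or end > len(dna):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(dna[begin:end])


if __name__ == "__main__":
    # Made-up 40-nt sense strand; the start site is assumed to be at index 20.
    sense = "GCGTACGTTAGCCTAGGCATATGGCTTACCGGATTCAGCA"
    print(antisense_oligo(sense, start_index=20))
```

The window is taken on the sense strand and then reverse-complemented so that the returned oligonucleotide is written 5' to 3' and pairs with the sense-strand region around the start site.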

Ribozymes may be used to catalyze the specific cleavage of RNA. Ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, hammerhead motif ribozyme molecules may be engineered that can specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding a polypeptide of the invention.

Specific ribozyme cleavage sites within any RNA target may be initially identified by scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the cleavage site of the target gene may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
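The scanning step described above can be sketched in Python as follows. This is a minimal illustration, assuming the target is supplied as a plain RNA or DNA string; the function names and the default window length of 17 are hypothetical choices, and the sketch only locates GUA, GUU and GUC triplets and extracts a 15 to 20 nucleotide window around each one. It does not evaluate secondary structure or accessibility, which the text leaves to separate analysis and to ribonuclease protection assays.

```python
import re

# Triplets named in the text as ribozyme cleavage sites. The lookahead
# makes the match zero-width so that overlapping sites are all reported.
_SITE_PATTERN = re.compile(r"(?=(GUA|GUU|GUC))")


def to_rna(seq: str) -> str:
    """Upper-case the sequence and convert T to U."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start index, triplet) for every GUA/GUU/GUC site."""
    rna = to_rna(seq)
    return [(m.start(), m.group(1)) for m in _SITE_PATTERN.finditer(rna)]


def candidate_window(seq: str, site_start: int, length: int = 17) -> str:
    """Return a window of 15-20 nt roughly centred on the cleavage triplet.

    The window is clipped at the ends of the sequence, so it can be
    shorter than `length` when the whole sequence is shorter than that.
    """
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20 ribonucleotides")
    rna = to_rna(seq)
    centre = site_start + 1  # middle nucleotide of the triplet
    begin = max(0, centre - length // 2)
    end = min(len(rna), begin + length)
    begin = max(0, end - length)
    return rna[begin:end]


if __name__ == "__main__":
    target = "A" * 20 + "GUC" + "A" * 20  # made-up target, not a real transcript
    for start, triplet in find_cleavage_sites(target):
        print(start, triplet, candidate_window(target, start))
```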

The activity of the proteins, nucleic acid molecules, substances, compounds, antibodies, antisense nucleic acid molecules, and compositions of the invention may be confirmed in animal experimental model systems.

The invention also provides methods for studying the function of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Cells, tissues, and non-human animals lacking in glcNAc-TV-b or glcNAc-TV-c expression or partially lacking in glcNAc-TV-b or glcNAc-TV-c expression may be developed using recombinant expression vectors of the invention having specific deletion or insertion mutations in the glcNAc-TV-b or glcNAc-TV-c gene.

A recombinant expression vector may be used to inactivate or alter the endogenous gene by homologous recombination, and thereby create a glcNAc-TV-b or glcNAc-TV-c deficient cell, tissue or animal.

Null alleles may be generated in cells, such as embryonic stem cells, by deletion mutation. A recombinant glcNAc-TV-b or glcNAc-TV-c gene may also be engineered to contain an insertion mutation which inactivates glcNAc-TV-b or glcNAc-TV-c. Such a construct may then be introduced into a cell, such as an embryonic stem cell, by a technique such as transfection, electroporation, injection etc. Cells lacking an intact glcNAc-TV-b or glcNAc-TV-c gene may then be identified, for example by Southern blotting, Northern blotting or by assaying for expression of a protein of the invention using the methods described herein. Such cells may then be used to generate transgenic non-human animals deficient in glcNAc-TV-b or glcNAc-TV-c. Germline transmission of the mutation may be achieved, for example, by aggregating the embryonic stem cells with early stage embryos, such as 8 cell embryos, in vitro; transferring the resulting blastocysts into recipient females; and generating germline transmission of the resulting aggregation chimeras. Such a mutant animal may be used to define specific cell populations, developmental patterns and in vivo processes normally dependent on glcNAc-TV-b or glcNAc-TV-c expression.

A protein of the invention may be used to support the survival, growth, migration, and/or differentiation of cells expressing the polypeptide. Thus, a polypeptide of the invention may be used as a supplement to support, for example cells in culture.

Methods for Preparing Oligosaccharides

The invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc and an acceptor in the presence of a protein of the invention.

Examples of acceptors for use in the method for preparing an oligosaccharide are a saccharide, oligosaccharides, polysaccharides, glycopeptides, glycoproteins, or glycolipids which are either synthetic with linkers at the reducing end or naturally occurring structures, for example, asialoagalacto-fetuin glycopeptide. The activated GlcNAc may be part of a nucleotide-sugar, a dolicholphosphate-sugar, or dolichol-pyrophosphate-oligosaccharide.

In an embodiment of the invention, the oligosaccharides are prepared on a carrier that is nontoxic to a mammal, in particular a lipid isoprenoid or polyisoprenoid alcohol. An example of a suitable carrier is dolichol phosphate. The oligosaccharide may be attached to a carrier via a labile bond allowing for chemical removal of the oligosaccharide from the lipid carrier. In the alternative, the oligosaccharide transferase may be used to transfer the oligosaccharide from a lipid carrier to a protein.

The following non-limiting examples are illustrative of the present invention:

Example 1

Isolation of Human GlcNAc-TVb

A cDNA sequence of a human GlcNAc-TV homolog was identified by similarity matching using the GenBank EST database (accession number R87580). This EST cDNA clone (designated as hGTNVb) was sequenced (627 base pairs) and, when translated, was shown to be 67% identical to the 3' end of the human GlcNAc-TV amino acid sequence. This information initiated a search for the entire sequence of this human GlcNAc-TV-like cDNA using two different methods: screening a human brain cDNA library by colony plaque lifts, and 5' RACE (rapid amplification of cDNA ends).

A human brain 5'-STRETCH PLUS cDNA library (λgt10; CLONTECH, Cat. HL3002A) was screened (using standard protocols) with a 32P-dCTP labeled 203 base pair cDNA probe generated by restriction enzyme digestion of the hGlcNAc-TV-b EST cDNA with NotI and BamHI. Two million phage clones were screened and 4 positive clones were identified. Each of these clones was purified to homogeneity by three subsequent rounds of screening, and phage DNA was isolated from each of these clones using conventional methods. The cDNA insert was isolated from each of these clones, subcloned into the EcoRI site of the Bluescript vector (Stratagene) and sequenced. Two out of four clones had sequences that were identical to the EST clone and thereby provided no new information.

The other two clones were found to be similar to hGlcNAc-TV-b. One clone (1820 base pairs) was identical in sequence to the coding region of the EST clone with an additional 1295 base pairs of 3' untranslated sequence, and the other clone was 61% identical (amino acid comparison) with hGlcNAc-TV-b and was designated as hGlcNAc-TV-c. Interestingly, the 3' ends of hGlcNAc-TV-b and hGlcNAc-TV-c are very dissimilar, suggesting that one of these clones is a splice variant of the other.

The 5' RACE protocol was used to isolate the 5' region of the hGlcNAc-TV-b cDNA sequence. For first strand cDNA synthesis, primer TVB#1A (CCAGACCTGGTCGGCCCCTGCAGCCACAG) (SEQ ID NO. 13) (100 mM final concentration) was incubated with 2 µg of mRNA from PFSK-1 cells (ATCC CRL-2060, primitive neuroectodermal tumor) for 10 minutes at 85 °C and then chilled on ice for 1 minute. To this mixture were added, to final concentrations, 20 mM Tris-HCl (pH [illegible]), 50 mM KCl, 2.5 mM MgCl2, 10 mM DTT, 400 µM each dATP, dCTP, dGTP and dTTP, and 200 Units of Superscript II RT (GIBCO-BRL), and the reaction was incubated for 50 minutes at 42 °C. The reaction was terminated by placing it at 70 °C for 15 minutes; 2 Units of RNAse were then added and the incubation was continued for an additional 30 minutes. The generated cDNA was purified using GlassMax DNA spin cartridges following the manufacturer's instructions (GIBCO-BRL). The isolated cDNA was tailed with terminal deoxynucleotidyl transferase (TdT), which added homopolymeric dCTP tails to the 3' ends of the cDNA, in a reaction that was incubated for 10 minutes at 37 °C with a final composition of 10 mM Tris-HCl (pH [illegible]), 25 mM KCl, [illegible] mM MgCl2, 200 µM dCTP and 1 Unit of TdT. The TdT was heat inactivated for 10 minutes at [illegible] °C. The tailed cDNA (5 µl) was amplified by PCR using two primers (primer TVB#1B, GGAGGCAGCCCCGGGAGCTGGGAG (SEQ ID NO. 14), and an Abridged Anchor primer from GIBCO-BRL, sequence not provided), with a final reaction composition of 20 mM Tris-HCl (pH [illegible]), 50 mM KCl, 1.5 mM MgCl2, 400 mM primer TVB#1B, 400 mM Abridged Anchor primer, 200 µM each dATP, dCTP, dGTP and dTTP, and 2.5 Units of Taq DNA polymerase. This reaction was transferred to a thermal cycler pre-equilibrated to 94 °C.

Thirty-five cycles of PCR were performed with the following cycling protocol: predenaturation at 94 °C for 2 minutes, denaturation at 94 °C for 1 minute, annealing of primers at 58 °C for 1.5 minutes, primer extension at 72 °C for 2.5 minutes, and final extension at 72 °C for 10 minutes. The 5' RACE products were analyzed using standard agarose gel electrophoresis protocols. No visible bands were observed; therefore the region above the 1.6 kb marker was isolated using a DNA gel extraction kit from Stratagene and subcloned into the T/A Bluescript vector using standard procedures. Several cDNA fragments were subcloned into the Bluescript vector and sequenced. Only one clone, containing a 1.7 kb cDNA fragment, was similar to hGlcNAc-TV-b. The actual size of this cDNA fragment is 1676 base pairs, which did not encompass the entire hGlcNAc-TV-b clone; therefore a second round of 5' RACE was performed using the same protocol as above with different primers. To isolate the 5' end of hGlcNAc-TV-b, primer TVB#2A (GGTCAAGATAAATGCGTTTTTCCACCGATC) (SEQ ID NO. 15) was used in place of primer TVB#1A, and primer TVB#2B (GTGGATTATATCCTATGGCAGAAAAGCTTTATAT) (SEQ ID NO. 16) was used in place of primer TVB#1B. This set of primers generated three cDNA fragments ([illegible], 1.7 and 1.4 kb), which were isolated following the manufacturer's instructions using a DNA gel extraction kit from Stratagene and subcloned into the T/A Bluescript vector using standard procedures. Each of the cDNA fragments was sequenced, which revealed that only the 1.4 kb fragment was similar to hGlcNAc-TV and represents the 5' end of hGlcNAc-TV-b. The actual size of this fragment is 1440 base pairs.
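As a rough worked example of the cycling protocol above, the following Python sketch adds up the programmed hold times. It assumes, as an interpretation of the text rather than a stated fact, that the predenaturation and final extension steps are each performed once and that only denaturation, annealing and extension repeat over the 35 cycles. Ramp times between temperatures are ignored, and all names are hypothetical.

```python
# Hold times in minutes, taken from the cycling protocol in the text.
PREDENATURATION_MIN = 2.0      # 94 C, assumed to run once
FINAL_EXTENSION_MIN = 10.0     # 72 C, assumed to run once
CYCLES = 35
PER_CYCLE_STEPS_MIN = {
    "denaturation at 94 C": 1.0,
    "annealing at 58 C": 1.5,
    "extension at 72 C": 2.5,
}


def total_program_minutes(cycles: int = CYCLES) -> float:
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(PER_CYCLE_STEPS_MIN.values())
    return PREDENATURATION_MIN + cycles * per_cycle + FINAL_EXTENSION_MIN


if __name__ == "__main__":
    minutes = total_program_minutes()
    print(f"{minutes:.0f} min (about {minutes / 60:.1f} h) of hold time")
```

Under these assumptions each cycle holds for 5 minutes, so the program amounts to 2 + 35 × 5 + 10 = 187 minutes of hold time.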

The entire cDNA sequence of hGlcNAc-TV-b is 4541 base pairs and was reconstructed as follows. First, a 1431 base pair band (designated band A) was isolated (Stratagene gel extraction kit) from the 1440 base pair 5' end of hGTNV (from the second round of 5' RACE) by restriction enzyme digestion with HindIII. Second, the middle section of hGlcNAc-TV-b (1623 base pairs, designated band B) was isolated from the 1676 base pair hGlcNAc-TVb fragment (from the first round of 5' RACE) by restriction enzyme digestion with HindIII and SmaI and then ligated (using standard protocols) to band A. Finally, the 3' end of hGTNVb was isolated by using the SmaI restriction enzyme to isolate a 1487 base pair band (designated band C). Band C was then ligated to band A+B to generate the entire nontranslated and translated sequence of hGlcNAc-TV-b.
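The band sizes quoted above can be checked against the reported full length with a few lines of Python. The sketch assumes that bands A, B and C join end to end at the restriction sites without overlap or loss of bases, which is an assumption made for the arithmetic rather than something the text states; the variable and function names are hypothetical.

```python
# Band sizes in base pairs, as reported in the text.
FRAGMENTS = [
    ("band A (5' end, HindIII)", 1431),
    ("band B (middle, HindIII/SmaI)", 1623),
    ("band C (3' end, SmaI)", 1487),
]
REPORTED_FULL_LENGTH = 4541


def assembled_length(fragments: list[tuple[str, int]]) -> int:
    """Total length if the fragments are joined end to end."""
    return sum(length for _, length in fragments)


if __name__ == "__main__":
    total = assembled_length(FRAGMENTS)
    print(total, "bp assembled;", REPORTED_FULL_LENGTH, "bp reported")
```

Under that assumption the three bands sum to 1431 + 1623 + 1487 = 4541 base pairs, matching the length given for the complete hGlcNAc-TV-b cDNA.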

Example 2

Expression of GlcNAc-TV-b

Northern Blot Analysis of Human Tissues

Human multiple tissue and tumor cell line Northern blots were obtained from Clontech. The Northern blot containing mRNA from human breast and uterus cancer tissues as well as normal tissues was obtained from Invitrogen. All Northern blots contained 2 µg of mRNA/lane. These blots were hybridized with [α-32P]dCTP-labeled hGlcNAc-TV (nucleotides 1508-1921) and GlcNAc-TV-b (nucleotides 1959-2417) cDNAs. An Amersham multiprime DNA labeling kit and [α-32P]dCTP (3000 Ci/mol) were used for labeling. Northern blots were hybridized under stringent conditions following the recommended protocol (Clontech) and exposed to x-ray film or phosphoimager.

Results

The expression pattern of the two GlcNAc-TVs was examined in different human tissues. Hybridization of the GlcNAc-TV cDNA probe to Northern blots under stringent conditions revealed wide expression of two transcripts ranging in size from 7.4-9.5 kb (Figure). The major transcript (9.3 kb) was expressed in most tissues as well as in different parts of human brain (Figure). The 9.3 kb and 7.4 kb transcripts were not detected in human tumor cell lines with the exception of the human colorectal cell line SW480 (Figure), although in this case the 7.4 kb transcript was the predominant one. When the same blots were tested with the GlcNAc-TV-b cDNA probe, a very different pattern of tissue-specific expression was observed. High levels of a 4.5 kb transcript were expressed in brain tissue and low levels in testis (Figure). This transcript was not detected in other tested tissues. The GlcNAc-TV-b transcript was expressed throughout the adult brain with the exception of spinal cord (Figure). Four cell lines derived from solid tumors revealed expression of GlcNAc-TVb, whereas the 4.5 kb transcript was not detected in leukemia and lymphoma (Figure). High expression of GlcNAc-TVb was detected in two different human tumor tissues (breast and uterus), whereas normal tissue adjacent to the tumor tissues showed very low levels of GlcNAc-TVb transcript.

Having illustrated and described the principles of the invention in a preferred embodiment, it should be appreciated by those skilled in the art that the invention can be modified in arrangement and detail without departure from such principles. All modifications coming within the scope of the following claims are claimed.

All publications, patents and patent applications referred to herein are incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.

Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosures by virtue of prior invention.

EDITORIAL NOTE

APPLICATION NUMBER 51436/99

The following Sequence Listing pages 1-38 are part of the description.

The claims pages follow on pages 33-36.

SEQUENCE LISTING <110> GlycoDesign Inc.

<120> Novel N-Acetylglycosaminyltransferase Genes <130> pl74pct5 <140> <141> <150> U.S. 60/095,919 <151> 1998-08-07 <160> 16 <170> PatentIn Ver. <210> 1 <211> 2061 <212> DNA <213>
Homo sapiens <400> 1 atgtttttta tttggactca agcagtgtca gcagaggaaa gatctgaaaa caatctcaag aaaaaatatg tcccagaaat tgagtttact gttgcttgta tttggggatt gatgttactg cactatactt ttcaacaacc aagacatcaa 120 agttacgtga gcaaatacta gacttaagca aaagatatgt taaagctcta 180 ataagaacac agtggatgtc gagaacggtg cttctatggc aggatatgcg 240 gaacaattgc tgtccttctg gatgacattt tgcaacgatt ggtgaagctg 300 M q 1 Pro 1.m lmlm,-, WO 00/08171 WO 00/817 1PCT/CA99/0071 I gagaacaaag agtgggaatt atagcagttg aaagctttat ggtcaagata aagtggataa tccagttgca tggagacata gattttgaac cggagacggc aacgcagaga actgtatcta tggagtgatg ctggctgagc gtaggagaca aaaa ct ctag ggatcagatc tggggatggt aacacttttc aatgagatga aataagcata gactcctcta gaccatcgat cgttgcgctc ccacccaatt ttctcccagc tacaacaact ccctacctac cagcaccagg ttgactatat tggtgccagt aaaatcacct attgctggct aatgcgtttt atgacatgtg ctttttttat aaaatcctta ttctgtacag gaatggttga agaaaaaacg agattgctga taattacatc tcaaggatgt gaattgttga gtccaacctg ccgattttga ggaat ctgaa ttgggtttgc agaggcagaa tttacttcga cacccaatat t cc tcct ccg cgctggaagc caaggaagaa atccctacgc cagaggagtt cctacgagta acttctgcag tgttgtgaat aaccacaaat tgtgctgctc taggacagag tccaccgatc ccgttcggat atac ct cagt cgacgacgct tgtgattcat gggatgggcc gaaaaaggcc aacaggtttc tgcgtacgca cgtgaagaag gctactttac ggctcaacat acatgccaat ccctaataac gatcgagcag tcagacgctt aatcattcac tccctcttac agagaccttc catggcaaat tacagagttt ggagaacttc tgaagcagc c cacctgcgag agcttcagaa ggctcagcag aaaagaacga catccactgt gcaatacttt gacggttacc ccgtgcaagg gacgccgaca gagcataatt cataaggacg caaatcgcaa ctagttcacc agtgccgcac gcggggcatg attataggta gctgatgtaa cggtggatgg tatgcgcaaa ttttatacaa cacctaaact gtgtatggca aattacatcg tctcgaaacc ttgttactag cgatgcgtct ttacgaggca atcggcaagc atcaaggcca gggatgctgg cactgccacc ccaacaccac atgtctcggg ggattatatc acaataaaag cacactacga ctcattatgg atcattgtcc catgcgctga agttccattt agtccctagc tgggaat cat ctcttggtga acgttaggat accgatctgg ttggactcgg ttcgagtcct caaagggtca tgttccccca ccagtgatat aagtggatag aagtgcaagc acggtattct gactagggac ttctcaaacc agcccacctc cccacgtgtg 
ttatgagaac agcggatcac cacccagttt caatggtact cagtatcagg ctatggcaga cactaacgga gggaaaaatt tatagatggg ccatgcaccc aattcgtagt tatgagacta agataagcag taccaaggac cttagttcat cactgcatca ttgcccatct tcaattcaag tgaaacttt t caagagccct tactccagaa gcaccacctt cttctggaag aactgtgtat ttctggtcgg tccttacgaa gaagttcccc cagagaggtg gacagtcgac tcaggtagac cgcctacatc tataat ccgc 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 A 1~j I, ~Wb LW A A A A~ A~ ~A WO 00/08171 WO 0008171PCT/CA99/0071 I tccctctcca gggcaacccc a <210> 2 <211> 687 <212> PRT <213> Homo sapiens <400> 2 Met Phe Phe Thr Ile Ser Arg Lys Asn Met Ser Gin Lys Leu Ser Leu 1 5 10 Leu Leu Leu Val Phe Gly Leu Ile Trp, Gly Leu Met Leu Leu His Tyr 25 Thr Phe Gin Gin Pro Arg His Gin Ser Ser Val Lys Leu Arg Giu Gin 40 2061 Ile Leu Asp Leu Ser Lys Lys Asn Thr Vai Asp Val 70 Asp Leu Lys Arg Thr Ile Arg Tyr Val Lys Ala 55 Giu Asn Gly Ala Ser 75 Ala Vai Leu Leu Asp 90 Leu Ala Glu Giu Asn Met Ala Gly Tyr Ala Asp Ile Leu Gin Arg Leu Val Lys Leu Giu Asn Lys Vai Asp Tyr 100 105 Ile Val Val Asn Gly Ser 110 WO 00/08171 WO 0008171PCT/CA99/0071 I Ala Ala Asn Thr Thr Asn Gly 115 Thr Asn Lys Arg Thr Asn Val.

130 135 Asn His Leu Val Leu Leu His 145 150 Thr Ser Gly Asn Leu Val Pro Val Thr 120 125 Ser Gly Ser Ile Arg Ile Ala Val Glu 140 Pro Leu Trp Ile Ile Ser Tyr Gly 155 Lys Ala Leu Tyr Cys Trp, 165 Leu Arg Thr Giu Ala Ile Leu Tyr Asn Lys 170 175 Ser Thr Asn Gly Gly Gin Asp Lys 180 Val Phe Pro Pro Tyr Pro His Tyr Giu Gly Lys 195 Lys Trp Ile Asn Asp 205 Ile Asp Gly 190 Met Cys Arg Ser Cys Thr Ser Asp Pro Cys Lys Ala 210 Tyr Gly Ile Asp Gly Ser 220 Phe Phe Ile Tyr Leu 225 Trp Arg His Lys Asn 245 Giu Ile Arg Ser Asp 260 Ser Asp Ala Asp Asn His Cys Pro His Ala Pro 230 235 240 Pro Tyr Asp Asp Ala Glu His Asn Ser Cys Ala 250 255 Phe Giu Leu Leu Tyr Ser Val Ile His His Lys 265 270 WO 00/08171 WO 0008171PCT/CA99/0071 I Asp Giu Phe His Phe Met Arg 275 Leu Arg Arg Arg Arg 280 Met Val Giu Gly 285 Asn Ala Giu Lys Trp Ala 290 Gin Ilie Ala Lys Ser 295 Leu Ala Asp Lys Val His Leu Gly 315 Lys 305 Lys Arg Lys Lys Ala Leu 310 Ile Ile Thr Lys Asp 320 Thr Val Ser Lys Ile Ala Giu Thr Gly Phe Ser 325 330 Ala Ala Pro Leu Gly 335 Asp Leu Vai His Trp Ser Asp Val 340 Ile Thr Ser Aia Tyr 345 Ala Ala Gly 350 Asp Val Val His Asp Val Arg Ile Thr Ala 355 Leu Ala Glu Leu Lys 365 Lys Lys Ile Ile Gly Asn 370 Arg 375 Ser Gly Cys Pro Ser 380 Val Gly Asp Arg Ile 385 Val Glu Leu Leu Ala Asp Val Ile Gly 395 Leu Gly Gin Phe Lys 400 Trp Met Val Arg Val 415 Lys Thr Leu Gly Pro 405 Thr Trp Ala Gin His Arg 410 Leu Giu Thr Phe Gly Ser Asp Pro Asp Phe Glu His Ala Asn Tyr Ala Wom j* 'gm WO 00/08171 WO 0008171PCT/CA99/0071 I 420 Gin Thr Lys Gly His Lys Ser 435 Pro Trp Gly Trp Trp Asn 440 445 Pro His Thr Pro Giu Asn 460 Leu Asn Pro Thr Phe Leu Asn Asn Phe Tyr Thr Met 450 Gly Phe Ala Ile Giu 465 Gin 470 His Leu Asn Ser Ser Asp Met His His 475 Asn Glu Met Lys Arg 485 Gin Asn Gin Thr Leu Val Tyr Giy Lys 490 Val Asp 495 Ser Phe Trp Ile Giu Val 515 Ser Tyr Ser 530 Lys Asn Lys His Ile Tyr Phe Gu Ile Ile 500 505 His Asn Tyr Asn Ile Pro Gin Ala Thr Val Tyr Asp Ser Ser Thr 520 Pro 525 Arg Asn His Gly Ile Leu Ser Giy 535 Asp His 
Arg Phe Leu Leu Arg Giu Thr Phe Leu Leu Leu Gly Leu 545 550 555 Arg Cys Ala Pro Leu Giu Ala Met Aia Asn Arg 565 570 Giy Thr Pro Tyr Giu 560 Cys Vai Phe Leu Lys 575 kFAAMAO IRRO ,,XTRNW7-K.


WO 00/08171 Pro Lys Phe Pro Pro Pro Asn Ser Arg Lys Asn Thr Glu Phe Leu Arg 580 585 590 Gly Lys Pro Thr Ser Arg Glu Val Phe Ser Gin His Pro Tyr Ala Giu 595 600 605 Asn Phe 61.0 Ile Gly Lys Pro His Val Trp Thr Val 615 Asp Tyr Asn Asn Ser 620 Giu 625 Giu Phe Glu Ala Ala Ile Lys Ala Ile 630 Met 635 Arg Thr Gin Val Asp 640 Pro Tyr Leu Pro Thr Ala Tyr Ile 660 Tyr Giu Tyr Thr Cys 645 Giu 650 Giy Met Leu Giu Arg Ile 655 Gin His Gin Asp Phe Cys 665 Arg Ala Ser Giu His Cys 670 His Pro Pro Ser Phe Ile Ile Arg Ser 675 680 Leu Ser Arg Ala Thr Pro 685 <210> 3 <21i> 4541 <212> DNA <213> Homo sapiens <400> 3 ggctcttacc gcagcctgag tttcagcagc tgctgcgcaa. ggccaaactc ttcctcgggt WeYWWWA "WOW' WO 00/08171 WO 0008171PCT/CA99/0071 1 ttggcttcc c tgcagtcccg ccacctccag acgtgtggac tgagaactca ggatccacgc gaggcccacg gctcggaaca tggccgtgcc cctccttctt acagcaccga gtgctacctg cggctctgcc ttgcttttta ctcaagttac atct caagaa tggggattga ttacgtgagc aagaacacag acaattgctg gactatattg gtgccagtaa aatcaccttg tgctggctta tgcgtttttc gacatgtgcc ttttttatat aatccttacg ctacgagggc cttcagcccg agaggtgttc agtcgactac ggtagacccc ctacatccag ccccgcagag ccagcttggc tgggagggcc ccccttcctg gtcggagatg cagaaggagc cctgccgcga cgagtcgagt catttttcag aaaatatgtc tgttactgca aaatactaga tggatgtcga tccttctgga ttgtgaatgg ccacaaataa tgctgctcca ggacagaggc caccgatcga gttcggatcc acctcagtga acgacgctga cccgcccccc ccccacagct tcccagcatc aacaactcag tacctaccct caccaggact cccctttgtc tcctggggcc tgcaccgaca aacagccagg aaccacctgt ctctgctctt cttccgcaag tttttttctt tcaagtctgt ccagaaattg ctatactttt cttaagcaaa gaacggtgct tgacattttg ctcagcagcc aagaacgaat tccactgtgg aatactttac cggttaccca gtgcaaggct cgccgacaat gcataattca tggaggccat ccctcaacca cctacgcgga aggagtttga acgagtacac tctgcagagc ctggccccca tggcccccgc cctgcctgga acqccttcct actct cggcg cagtgcgccg cggaattccg ttttttttca ttgtttgctt agtttactgt caacaaccaa agatatgtta tctatggcag caacgattgg aacaccacca gtctcgggca attatatcct aataaaagca cactacgagg cattatggta cattgtcccc tgcgctgaaa cgccaatggt cgagttcttc gaacttcatc agcagccatc 
ctgcgagggg tccagaccac atgccaccca gcacaccctg ccacgggcta caagctgcag ttcgcccagc gctccaacac gccggaattc agtcttgatt cttcagaaat tgcttgtatt gacatcaaag aagctctagc gatatgcgga tgaagctgga atggtactag gtatcaggat atggcagaaa ctaacggagg gaaaaattaa tagatgggtc atgcaccctg ttcgtagtga tgcatcttcc ccaggcaagc ggcaagcccc aaggccatta atgctggagc tgccctacca cctcgagtgg cgggcctggc atctgtgagc gtgccctgtg ctggccagga caagtaccgc cggaattctt tgtggcttac gttttttaca tggactcatt cagtgtcaag agaggaaaat tctgaaaaga gaacaaagtt tgggaatttg agcagttgaa agctttatat tcaagataaa gtggataaat cagttgcact gagacataaa ttttgaactt 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 ctgtacagtg tgattcatca taaggacgag ttccatttta tgagactacg gagacggcga 1800 i~4~I ~i~1i ~A~i W~I~VI V~ WO 00/08171 WO 00/817 1PCT/CA99/0071 I 9 atggttgagg gatgggccca aatcgcaaag tccctagcag ataagcagaa cgcagagaag 1860 aaaaaacgga attgctgaaa attacatctg aaggatgtcg attgttgagc ccaac ctggg gattttgaac aatctgaacc gggtttgcga aggcagaatc tacttcgaaa cccaatattc ct CCtcCgag ctggaagcca aggaagaata Ccctacgcgg gaggagtttg tacgagtaca ttctgcagag gcaaccccac tgggagctgg cggcccctgc tt tgggggga aaataataaa aagcgcggcc ggcctgagct cacgtatgga cttctgaatt aaaaggCc~t caggtttcag cgtacgcagc tgaagaagat tactttacgc ctcaacatcg atgccaatta ctaataactt tcgagcagca agacgcttgt tcattcacaa cctcttactc agaccttctt tggcaaatcg cagagttttt agaacttcat aagcagccat cctgcgaggg cttcagaaca ccaccagcct tggaggggcc agccacagaa aagcaataga tattttattt gcaagcttat cagctaggac agagttcaat ctcattccta agttcacctg tgccgcacct ggggcatgac tataggtaac tgatgtaatt gtggatggtt tgcgcaaaca ttatacaatg cctaaactcc gtatggcaaa ttacatcgaa tcgaaaccac gttactagga atgcgtcttt acgaggcaag cggcaagccc caaggccatt gatgctggag ctgccacCca aggcctgctc aggctggacg ccacqatggc gacactcttt ggatgtgagg tccctttagt agtgactatt cttagagtag gcacattgtc ggaatCatta cttggtgact gttaggatca cgatctggtt ggactcggtc cgagtccttg aagggtcaca ttcccccata agtgatatgc gtggatagct gtgcaagcaa ggtattcttt ctagggactc ctcaaaccga cccacctcca 
cacgtgtgga atgagaactc cggatcaccg cccagtttta ctccaccttc cttcccgtgg aaaaaatcta ttctctcttt tgcagaagag gagggt taat taatatagtt acaccttgtg cttacagatt ccaaggacac tagttcattg ctgcatcact gccCatCtgt aattcaagaa aaacttttgg agagcccttg ctccagaaaa accaccttaa tctggaagaa ctgtgtatga ctggtcggga cttacgaacg agt tcc cccc gagaggtgtt cagtcgacta aggtagaccc cctacatcca taatccgctc cgggaggcag gagtcccctc tttgttctca ttttaaagat aaaaaaaaaa t taaaaagca aatgccagga aatacacaaa cccaggggac tgtatctaag gagtgatgta ggctgagctc aggagacaga aactctaggt atcagatccc gggatggtgg cacttttctt tgagatgaag taagcatatt ctcctctaca ccatcgattc ttgcgctccg acccaattca ctcccagcat caacaactca ctacctaccc gcaccaggac ccCt Ctccagg ccccgggagc c aga cCtggt aggactaacc ttatttcttt aaaaaaaaaa aaagaattcc actttcaccc ccaacactcc accaagaggt 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 WO 00/08171 WO 0008171PCT/CA99/0071 I ttttgcctat gggttaaaag aatgattgct agagcatatt catctccaca ccgttcattc ttacatcgtc tacctggttg gccttggtag tgtggcttct tcacccaggc actgcaacct ggatcacag4 ttcacgatgt cttcccaaaa tggcaaattc ataaaattaa ccagtcgtct gaatccacaa atcaacggtt ttcagacaca cccttggcaa aaggtagcta aaacgaaacc gagcttggaa aaagtttctt tggagtgcat ccacctccca catgtgccac tggccagact tgctggatta acatagctac ctagcaacag gctaagatgg ttcctctaaa aaaggatatc tcgtaaacaa aggcgggaga ctgccctcta gaactccaca aacatcatca tttttttttt tttcttgtgt ggttcaagag cacacccagc ggtctcgaac caggtgtgaa tttcatactt taaatggtga tgaagggtgt gttgatggga c caggc cc tc cagaggggca gagggctcac gtgttgatat aagtttttca ggtgaggata tttttttttt ccaaccaaga atgctcctgc taagttttgt tcctgaccta ccactgcacc gttaaaatac agtcctaatt ccccatcccc aagtttccat cagcaaatgc atactcatgc caaccttgga gggaataaag attactgatg ttgcactgga tgagacagag ctcacatacc cctagcctcc atttttagaa aagtgatcca tggcctccaa cgaaatgctt aaataagcat atgtttaata ctttcagata cttctggaat ttcgcaaaag gaagcctggt caaaaaagta tgtct cagca gctgacctct tctcactgtg atctcagctc caagtagctg gagatggggt cctgccttgg gatttctatt 
ccatac cagt 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4541 tagcaaaagg ccacccggaa ttcagcttgg acttaaccag g <210> 4 <211> 1485 <212> PRT <213> Homo sapiens <400> 4 Gly Ser Tyr Arg Ser Leu Ser Phe Ser Ser Cys Cys Ala Arg Pro Asn 1 5 10 Ser Ser Ser Gly Leu Ala Ser Pro Thr Arg Ala Pro Pro Pro Trp, Arg 25 WO 00/08171 WO 0008171PCT/CA99/0071 I 11 Pro Ser Pro Met Val Ala Ser Ser Cys Ser Pro Ala Ser Ala Arg Pro 40 Thr Ala s0 Arg Cys Pro Ser Thr Thr Ser Ser Ser Gin Ala 55 Ser Pro Pro Pro Glu Ser Ser Ala Ser Pro s0 Ser Pro Ser Ile Pro Thr Arg Arg 70 Thr 75 Thr Cys Gly Gin Ser Thr Thr Thr Thr Gin 90 Arg Ser Leu Lys Gin Pro Ser Arg Pro Ala Arg Gly 115 Leu Giu Leu Arg Thr Pro Thr 100 105 Tyr Pro Thr Ser Thr Pro 110 Cys Trp Ser Gly Ser Thr Pro Thr Ser 120 Ser Thr Arg Thr 125 Ser Ala 130 Ser Pro 145 Glu Leu Gin Thr Thr Ala Leu Pro Glu 135 His Ala Pro Gin Phe Val Leu Ala Pro Asn Ala Thr 150 His 155 Leu Glu Trp Ala Arg 160 Asn Thr Ser Leu Ala Pro Gly Ala Trp Pro Pro 165 170 Arg Thr Pro Cys Gly 175 Pro Gly Trp Pro Cys Leu Gly Giy Pro Ala Pro Thr Pro Ala Trp Thr ~p~P WO 00/08171 WO 0008171PCT/CA99/0071 1 180 Thr Gly Ser Val Ser Pro Pro 195 190 Ser Ser Pro Ser Thr Ala 200 205 Val Thr Ala Pro Ser Arg 220 Arg Thr Pro Arg Thr Thr Ser Ser 210 Ser Cys Arg Cys Pro 215 Cys Thr Leu Gly Val Arg 225 230 Gly Ala Ser Ala Leu Gin 245 Pro Ala Trp Pro Gly Val Leu Pro Ala Giu 235 240 Cys Ala Gly Ser Asn Thr Lys Tyr Arg Arg 250 255 Leu Cys Pro Cys Arg Asp Phe Arg Lys Arg Asn Ser Gly 260 265 Gly Ile Leu Leu Leu Phe Thr Ser Arg Val Phe Phe Leu 275 280 285 Arg Asn Ser 270 Phe Phe Phe Lys Ser 290 Leu Phe 305 Phe Val Ala Tyr Ala Ser Ser Giu 310 Leu Lys Leu Pro Phe 295 Met Phe Phe Thr Ile 315 Leu Leu Leu Val Phe 330 Phe Ser Gin Val Cys 300 Ser Arg Lys Asn Met 320 Gly Leu Ile Trp Gly 335 Ser Gin Lys Leu Ser Leu 325 L41 ft WO-T-10- 44 MtW 'AW RFOf4v1&wvk1WA lwgl jw *lftc WO 00/08171 WO 0008171PCT/CA99/0071 I Leu Met Leu Leu His Tyr Thr Phe Gin Gin Pro Arg His Gin Ser Ser 
340 345 350 Val Lys Leu Arg Giu Gin Ile Leu Asp Leu Ser Lys 355 360 Arg Tyr Vai Lys 365 Glu Asn Gly Aia Aia Leu 370 Ser Met 385 Ala Giu Glu Asn Lys Asn Thr Vai Asp 375 Val 380 Aia Gly Tyr Aia Asp Leu Lys Arg 390 Ile Ala Vai Leu Leu 400 Asp Asp Ile Leu Gin Arg Leu Vai Lys Leu 405 410 Gly Ser Aia Ala Asn Thr 425 Giu Asn Lys Vai Asp Tyr 415 Thr Asn Gly Thr Ser Gly 430 Ile Vai Vai Asn Leu Vai 435 Asn 420 Pro Vai Thr Thr Asn Lys Arg Thr Asn 440 Vai Ser Gly Ser 445 Ile Arg Ile Ala Vai Giu Asn His Leu Vai Leu Leu His Pro Leu Trp 450 455 460 Ile Ile Ser Tyr Gly Arg Lys Ala Leu Tyr Cys Trp Leu Arg Thr Giu 465 470 475 480 Ala Ile Leu Tyr Asn Lys Ser Thr Asn Gly Gly Gin Asp Lys Cys Val 485 490 495 WO 00/08171 WO 0008171PCT/CA99/0071 I 14, Phe Pro Pro Ile Asp Gly Tyr Pro His Tyr Glu Giy Lys Ile Lys Trp 500 505 510 Ile Asn Asp Met Cys Arg Ser Asp Pro Cys Lys Ala His Tyr Gly Ile 515 520 525 Asp Gly Ser Ser Cys Thr 530 His Cys Pro His Ala Pro 545 550 Phe Phe Ile Tyr Leu 535 Ser Asp Ala Asp Asn 540 Trp, Arg His Lys Pro Tyr Asp Asp Giu His Asn Ser Ser Val Ile His 580 Arg Arg Met Val 595 Cys 565 Ala Giu Ile Arg Ser 570 Asp Phe Glu Leu Leu Tyr 575 His Lys Asp Glu Phe 585 Glu Gly Trp Ala Gin 600 His Phe Met Arg Leu Arg Arg 590 Ile Ala Lys Ser Leu Ala Asp 605 Lys Gin Asn Ala Giu Lys Lys Lys Arg Lys Lys Ala Leu Val His Leu 610 615 620 Gly Ile Ile Thr Lys Asp Thr Val Ser Lys Ile Ala Giu Thr Gly Phe 625 630 635 640 Ser Ala Ala Pro Leu Gly Asp Leu Val His Trp Ser Asp Val Ile Thr U ~W4I i~ ,d i~ IM~ WO 00/08171 WO 00/817 1PCT/CA99/0071 I 645 655 Ser Ala Tyr Ala Ala Gly His Asp 660 Val 665 Arg Ile Thr Ala Ser Leu Ala 670 Giu Leu Lys Asp Val Val Lys 675 Lys 680 Ile Ile Gly Asn Arg Ser Gly Cys 685 Pro Ser Val Gly Asp Arg 690 Ile Val Giu 695 Leu Leu Tyr Ala Asp Val Ile 700 Gly Leu Gly Gin Phe Lys 705 710 Lys Thr Leu Gly Pro Thr Trp Ala Gin 715 Arg Trp Met Val Arg Val 725 Leu Glu Thr Phe Gly Ser Asp Pro 730 Asp Phe 735 Giu His Ala Asn Tyr Ala Gin Thr Lys Gly His Lys Ser 740 745 Trp Trp Asn Leu 
Asn Pro Asn Asn Phe Tyr Thr Met Phe 755 760 765 Pro Giu Asn Thr Phe Leu Gly Phe Ala Ile Glu Gin His 770 775 780 Pro Trp Gly 750 Pro His Thr Leu Asn Ser Ser Asp Met His His Leu Asn Giu Met Lys Arg Gin Asn Gin Thr Leu 785 790 795 800 AW 0M~ g~ f I 1A Il -VAM OP ^m WO 00/08171 WO 0008171PCT/CA99/0071 I Val Tyr Gly Lys Val Asp Ser Phe Trp 805 Lys Asn Lys His Ile 810 Tyr Phe 815 Glu Ile Ile His Asn Tyr Ile Giu 820 Val 825 Gin Ala Thr Val Tyr Asp Ser 830 Ser Thr Pro Asn Ile Pro Ser 835 Ser Arg Asn His Gly 845 Arg Giu Thr Phe Leu 860 Ile Leu Ser Leu Leu Gly Gly Arg 850 Asp His Arg Phe Leu Leu 855 Leu 865 Gly Thr Pro Tyr Giu Arg 870 Cys Ala Pro Leu Glu Ala Met Ala Asn 875 880 Arg Cys Val Phe Leu Lys Pro Lys Phe Pro Pro Pro Asn Ser 885 890 Arg Lys 895 Asn Thr Giu Phe Leu Arg Gly Lys 900 Pro Thr Ser Arg Giu 905 Val Phe Ser 910 Val Trp Thr Gin His Pro Tyr Aia Giu Asn 915 Phe 920 Ile Gly Lys Pro His 925 Val Asp 930 Tyr Asn Asn Ser Giu 935 Giu Phe Giu Ala Ala 940 Ile Lys Ala Ile Met Arg 945 Thr Gin Val Asp Pro 950 Tyr Leu Pro Tyr Giu Tyr Thr Cys Giu 955 960 f# Hif q WO 00/08171 WOOO/8171PCT/CA99/0071

Gly.Met Leu Giu Arg Ile Thr Ala Tyr 965 Arg Ala Ser Giu His Cys His Pro Pro 980 985 Ser Arg Ala Thr Pro Pro Thr Ser Leu 995 1000 Gly Gly Ser Pro Gly Ser Trp Giu Leu 1010 1015 Leu Pro Val Gly Val Pro Ser Arg Pro 1025 1030 Asn His Asp Gly Lys Lys Ser Ile Cys 1045 Gly Lys Ala Ile Giu Thr Leu Phe Phe 1060 1065 Ile Gin His Gin Asp 970 Phe Cys 975 Ser Phe Ile Ile Arg Ser Leu 990 Gly Leu Leu Leu His Leu Pro 1005 Val Glu Gly Pro Gly Trp Thr 1020 Gly Arg Pro Leu Gin Pro Gin 1035 1040 Ser Gin Gly Leu Thr Phe Gly 1050 1055 Ser Leu Phe Leu Lys Ile Tyr 1070 Phe Phe Lys Ile Phe Tyr Leu Asp Val Arg Cys Arg Arg Giu Lys Lys 1075 1080 1085 Lys Lys Lys Lys Lys Arg Giy Arg Lys Leu Ile Pro Phe Ser Giu Gly 1090 1095 1100 Phe Lys Lys Gin Lys Asn Ser Giy Leu Ser Ser Ala Arg Thr Val Thr WMR411. VAR WO 00/08171 WO 0008171PCT/CA99/0 0711 1105 1110 1115 1120 Ile Tyr Ser Cys Gin Giu Leu Ser Pro His Val Trp Lys Ser Ser Ile 1125 1130 1135 Leu Glu Thr Pro Cys Giu Tyr Thr Asn Gin His Ser Leu Leu Asn Ser 1140 1145 1150 His Ser His 1155 Ile Val Leu Thr Asp Ser 1160 Gin Gly Thr Pro Arg Gly Phe 1165 Cys Leu Tyr 1170 Lys Ile Asn Gin Gin Met Val Lys Ser Leu Asn Lys His 1175 1180 Gly Leu Lys Ala Ser Arg Leu Leu Arg Trp Arg Val Ser Pro Ser Pro 1185 1190 1195 1200 Cys Leu Ile Asn Asp Cys Ile His Asn Ser Ser Lys Val Asp Gly Lys 1205 1210 1215 Val Ser Ile Phe Gin Ile Arg Ala Tyr 1220 1225 Pro Gly Pro Pro Ala Asn Ala Phe Trp 1235 1240 Tyr Gin Arg Leu Lys Asp Ile 1230 Asn His Leu His Ile Gin Thr 1245 His Arg Lys Gin Gin Arg Gly Asn Thr His Ala Ser Gin Lys Pro Phe 1250 1255 1260 g ~iA ~fA WO 00/08171 WO 0008171PCT/CA99/0071 I 19 Ile Pro Leu Gly Lys Gly Gly Arg Giu Gly Ser Pro Thr Leu Giu Lys 1265 1270 1275 1280 Pro Giy Leu His Arg 1285 Giu Ser Lys Lys Val 1300 Gin Gly Ser Tyr Cys Pro Leu Val Leu Ile Trp 1290 1295 Tyr Leu Val Giu Thr Lys Pro Asn Ser Thr Lys 1305 1310 Phe Phe Asn Tyr Cys Val Ser Ala Ala Leu Val Gly Ala 1315 1320 1325 Trp Lys Thr Ser Ser Gly Giu Asp Ile Ala Leu Giu Leu Thr Ser Cys Gly Phe Ser 1330 
1335 1340 Phe Phe Phe Phe Phe Phe Phe Phe Leu Arg Gin Ser Leu Thr Vai Ser 1345 1350 1355 1360 Pro Arg Leu Giu Cys 1365 Ser Gin Leu Thr Ala 1380 Ile Phe Leu Cys Pro Thr Lys Thr His Ile Pro 1370 1375 Thr Ser Thr Ser Gin Val Gin Giu Met Leu Leu 1385 1390 Pro Pro Pro Lys Leu Gly Ser Gin Ala Cys Ala Thr Thr Pro Ser Val 1395 1400 1405 Leu Tyr Phe Lys Arg Trp Gly Phe Thr Met Leu Ala Arg Leu Val Ser 1410 1415 1420 WO 00/08171 WO 0008171PCT/CA99/0071 I Asn Ser Pro Lys Val Ile His Leu Pro Trp Leu Pro Lys Met Leu Asp 1425 1430 1435 1440 Tyr Arg Cys Glu Pro Leu His Leu Ala Ser Lys Ile Ser Ile Trp, Gin 1445 1450 1455 Ile His Ile Ala Thr Phe Ile Leu Val Lys Ile Pro Lys Cys Phe His 1460 1465 1470 Thr Ser Gin Lys Ala Thr Arg Asn Ser Ala Trp Thr Pro 1475 1480 1485 <210> <211> 2298 <212> DNA <213> Homo sapiens <400> atgtttttta tttggactca agcagtgtca gcagaggaaa gatctgaaaa gagaacaaag agtgggaatt atagcagttg aaagctttat caat ctcaag tttggggatt agttacgtga ataagaacac gaacaattgc ttgactatat tggtgccagt aaaatcacct attgctggct aaaaaatatg gatgttactg gcaaatacta agtggatqtc tgtccttctg tgttgtgaat aaccacaaat tgtgctgCtc taggacagag tcccagaaat cactatactt gacttaagca gagaacggtg gatgacattt qgctcagcag aaaagaacga catccactgt gcaatacttt tgagtttact ttcaacaacc aaagatatgt cttctatggc tgcaacgatt ccaacaccac atgtctcggg ggat tatatc acaataaaag gttgcttgta aagacatcaa taaagctcta aggatatgcg ggtgaagctg caatggtact cagtatcagg ctatggcaga cactaacgga 120 180 240 300 360 420 480 540 wq -W Q v qgk w wmmN ,ZTA gpcwp Nq WO 00/08171 WO 0008171PCT/CA99/O 0711 21 ggtcaagata aatgcgtttt tccaccgatc gacggttacc cacactacga gggaaaaatt 600 aagtggataa tccagttgca tggagacata gattttgaac cggagacggc aacgcagaga actgtatcta tggagtgatg ctggctgagc gtaggagaca aaaactctag ggatcagatc tggggatggt aacacttttc aatgagatga aataagcata gactcctcta gaccatcgat cgttgcgctc ccacccaatt ttctcccagc tacaacaact ccctacctac cagcaccagg tccctctcca agccccggga tccagacctg caaggactaa atgacatgtg ctttttttat aaaatcctta ttctgtacag gaatggttga agaaaaaacg agattgctga taattacatc tcaaggatgt 
gaattgttga gtccaacctg ccgattttga ggaatctgaa ttgggtttgc agaggcagaa tttacttcga cacccaatat tcctcctccg cgctggaagc caaggaagaa atccctacgc cagaggagtt cctacgagta acttctgcag gggcaacccc gctgggagct gtcggcccct cctttggggg ccgttcggat atacctcagt cgacgacgct tgtgattcat gggatgggcc gaaaaaggcc aacaggtttc tgcgtacgca cgtgaagaag gctactttac ggc tcaacat acatgccaat ccctaataac gatcgagcag tcagacgctt aatcattcac tccctcttac agagaccttc catggcaaat tacagagttt ggagaacttc tgaagcagcc cacctgcgag agcttcagaa acccaccagc ggtggagggg gcagccacag gaaagcaata ccgtgcaagg gacgccgaca gagcataatt cataaggacg caaatcgcaa ctagttcacc agtgccgcac gcggggcatg at tat aggta gctgatgtaa cggtggatgg tatgcgcaaa ttttatacaa cacctaaact gtgtatggca aattacatcg tctcgaaacc ttgttactag cgatgcgtct ttacgaggca atcggcaagc atcaaggcca gggatgctgg cactgccacc ctaggcctgc ccaggctgga aaccacgatg gagacactct ctcattatgg atcattgtcc catgcgctga agttccattt agtccctagc tgggaatcat ctcttggtga acgttaggat accgatctgg ttggactcgg ttcgagtcct caaagggtca tgttccccca ccagtgatat aagtggatag aagtgcaagc acggtattct gactagggac ttctcaaacc agcccacctc cccacgtgtg ttatgagaac agcggatcac cacccagttt tcctccacct cgcttcccgt gcaaaaaatc ttttctctct tatagatggg 660 ccatgcaccc 720 aattcgtagt 780 tatgagacta 840 agataagcag 900 taccaaggac 960 cttagttcat 1020 cactgcatca 1080 ttgcccatct 1140 tcaattcaag 1200 tgaaactttt 1260 caagagccct 1320 tactccagaa 1380 gcaccacctt 1440 cttctggaag 1500 aactgtgtat 1560 ttctggtcgg 1620 tccttacgaa 1680 gaagttcccc 1740 cagagaggtg 1800 gacagtcgac 1860 tcaggtagac 1920 cgcctacatc 1980 tataatccgc 2040 tccgggaggc 2100 gggagtcccc 2160 tatttgttct 2220 ttttttaaag 2280 g if -A M WO 00/08171PC/A9071 PCT/CA99/00711 atttatttct ttaaataa <210> 6 <211> 765 <212> PRT <213> Homo sapiens 2298 <400> 6 Met Phe Phe Thr Ile Ser Arg Lys Asn 1 5 Leu Leu Leu Vai Phe Gly Leu Ile Trp 25 Thr Phe Gin Gin Pro Arg His Gin Ser 40 Ile Leu Asp Leu Ser Lys Arg Tyr Val 55 Lys Asn Thr Vai Asp Val Glu Asn Gly 70 Met Ser Gin Lys Leu 10 Ser Leu Giy Leu Met Leu Leu His Tyr Ser Val Lys Leu Arg Giu Gin Lys Ala Leu 
Ala Glu Glu Asn Ala Ser Met Ala Giy Tyr Ala 75 Asp Leu Lys Arg Thr Ile Ala Val Leu Leu Asp Asp Ile Leu Gin Arg 90 Leu Val Lys Leu Giu Asn Lys Val Asp Tyr Ile Val Val Asn Gly Ser 100 105 110 i W MWM 4 V :I k NANt ~I WO 00/08171 WO 00/817 1PCT/CA99/0071 I Ala Ala Asn Thr Thr Asn Gly 115 Thr Ser Gly Asn Leu Val Pro Val Thr 120 125 Ser Gly Ser Ile Arg Ile Ala Val Glu 140 Thr Asn 130 Asn His 145 Lys Arg Thr Asn Val 135 Leu Val Leu Leu His 150 Pro Leu Trp Ile Ile Ser Tyr Gly 155 Lys Ala Leu Tyr Cys Trp, Leu Arg Thr 165 Glu Ala Ile Leu Tyr 170 Asn Lys 175 Ser Thr Asn Tyr Pro His 195 Gly Gin Asp Lys Val Phe Pro Pro Ile Asp Gly 190 Met Cys Arg Tyr Glu Gly Lys Ile 200 Lys Trp Ile Asn Asp 205 Ser Asp 210 Pro Cys Lys Ala His 215 Tyr Gly Ile Asp Gly Ser 220 Ser Cys Thr Phe Phe 225 Ile Tyr Leu Ser Asp 230 Ala Asp Asn His Cys Pro His Ala 235 Pro 240 Trp, Arg His Lys Asn Pro Tyr Asp Asp Ala Giu His Asn Ser Cys Ala 245 250 255 Glu Ile Arg Ser Asp Phe Glu Leu Leu Tyr Ser Val Ile His His Lys 260 265 270 WO 00/08171 WO 0008171PCT/CA99/0071 I Asp.Glu Phe His Phe Met 275 Trp, Ala Gin Ile Ala Lys 290 Arg Leu Arg Arg Arg A-rg Met Val Giu Gly 280 285 Ser Leu Ala Asp Lys Gin Asn Ala Giu Lys 295 300 Leu Val His Leu Gly Ile Ile Thr Lys Asp 315 320 Lys Lys Arg Lys Lys 305 Al a 310 Thr Val Ser Lys Ile 325 Ala Giu Thr Gly Phe Ser Ala Ala Pro 330 Leu Gly 335 Asp Leu Val His 340 His Asp Val Arg 355 Trp Ser Asp Val Ile Thr Ser Ala Tyr 345 Ile Thr Ala Ser Leu Ala Giu Leu Lys 360 365 Ala Ala Gly 350 Asp Val Val Lys Lys Ile Ile Gly Asn 370 Arg 375 Ser Gly Cys Pro Ser Val 380 Gly Asp Arg Val Giu Leu Leu Tyr Ala Asp Val Ile Gly Leu Gly Gin Phe Lys 390 395 400 Thr Trp Ala Gin His Arg Trp Met Val Arg Val 410 415 Lys Thr Leu Gly Pro 405 Leu Glu Thr Phe Gly Ser Asp Pro Asp Phe Giu His Ala Asn Tyr Ala WIN 'A A kwP-110-90 ANPO Qw- -MW4wfw WO 00/08171 WO 0008171PCT/CA99/0071 I 420 Gin Thr Lys Giy His Lys Ser 435 Pro Trp Gly Trp Trp 440 Asn Leu Asn Pro 445 Asn Thr Phe Leu Asn Asn Phe Tyr Thr Met 450 Phe 455 Pro 
His Thr Pro Gly 465 Phe Ala Ile Glu Gin 470 His Leu Asn Ser Ser 475 Asn Gin Thr Leu Val 490 Asp Met His His Leu 480 Tyr Gly Lys Val Asp 495 Asn Giu Met Lys Arg Gin 485 Ser Phe Trp Lys Asn Lys His Ile Tyr Phe Giu Ile Ile 500 505 His Asn Tyr Ile Giu Val Gin Ala Thr 515 Ser Tyr Ser Arg Asn His 530 Leu Leu Arg Giu Thr Phe 545 550 Arg Cys Ala Pro Leu Giu 565 Tyr Asp Ser Ser Thr 520 Asn Ile Pro Gly Ile Leu Ser Gly 535 Leu Leu Leu Gly Leu 555 Ala Met Ala Asn Arg 570 Arg Asp His Arg Phe 540 Gly Thr Pro Tyr Giu 560 Cys Val Phe Leu Lys 575 N M~ ti/4~~4 YY~M~~ i WO 00/08171 WO 0008171PCT/CA99/0071 I Pro Lys Phe Pro Pro Pro Asn Ser 580 Arg Lys Asn Thr Giu Phe Leu Arg 585 590 Phe Ser Gin His Pro Tyr Ala Giu 605 Gly Lys Pro Thr Ser Arg Giu 595 Asn Phe Ile Gly Lys Pro 610 His 615 Val Trp Thr Val Asp Tyr Asn Asn Ser 620 Giu Giu Phe Glu Ala 625 Ala 630 Ile Lys Ala Ile met Arg Thr Gin Val 635 Asp 640 Pro Tyr Leu Pro Tyr 645 Giu Tyr Thr Cys Giu Gly Met Leu Giu Arg Ile 650 655 Thr Ala Tyr Ile Gin 660 His Gin Asp Cys Arg Aia Ser Giu His Cys 670 His Pro Pro Ser Phe Ile Ile 675 Ser Leu Ser Arg Ala 685 Thr Pro Pro Pro Gly Ser Thr Ser Leu Gly Leu Leu 690 Leu 695 His Leu Pro Gly Gly Ser 700 Trp Giu Leu Vai Giu Giy 705 710 Ser Arg Pro Gly Arg Pro 725 Pro Gly Trp Thr Leu Pro Vai Gly Val Pro 715 720 Leu Gin Pro Gin Asn His Asp Gly Lys Lys 730 735 WO 00/08171 WO 0008171PCT/CA99/0071 I 27.

Ser. Ile Cys Ser Gin Gly Leu Thr Phe Giy Gly Lys Ala Ile Glu Thr 740 745 750 Leu Phe Phe Ser Leu Phe Leu Lys Ile Tyr Phe Phe Lys 755 760 765 <210> 7 <211> 948 <212> DNA <213> Homo sapiens <400> 7 cggctcttac tttggcttcc ctgcagtccc cccacctcca cacgtqtgga atgagaactc cggatccacg agaggcccac ggctcggaac ctggccgtgc ccctccttct gacagcaccg agtgctacct ccggctctgc tttgcttttt cgcagcctga cctacgaggg gcttcagccc gagaggtgtt cagtcgacta aggtagaccc cctacatcca gccccgcaga accagcttgg ctgggagggc tccccttcct agtcggagat gcagaaggag ccctgccgcg acgagtcgag gtttcagcag ccccgccccc gccccacagc ctcccagcat caacaactca ctacctaccc gcaccaggac gcccctttgt ctcctggggc ctgcaccgac gaacagccag gaaccacctg cctctgctct acttccgcaa ttttttttct ctgctgcgca ctggaggcca tccctcaacc ccctacgcgg gaggagtttg tacgagtaca ttctgcagag cc tggc cc cc ctggcccccg acctgcctgg gacgccttcc tactctcggc tcagtgcgcc gcggaattcc tttttttttc aggccaaact t cgccaatgg acgagttctt agaacttcat aagcagccat cctgcgaggg ctccagacca aatgccaccc cgcacaccct accacgggct tcaagctgca gttcgcccag ggct ccaaca ggccggaatt aagtcttgat cttcctcggg ttgcatcttc cccaggcaag cggcaagccc caaggccatt gatqctggag ctgccctacc acctcgagtg gcgggcctgg aatctgtgag ggtgccctgt cctggccagg ccaagtaccg ccggaattct ttgtggctta #11 WN WO 00/08171 WO 0008171PCT/CA99/0071 I cctcaagtta ccatttttca gtcaagtctg tttgtttgct tcttcaga <210> 8 <211> 1295 <212> DNA <213> Homo sapiens <400> 8 taaatatttt ggccgcaagc agctcagcta tggaagagtt aattctcatt ctatataaaa aaagccagtc tgctgaatcc tattatcaac cacattcaga attccccttg cgtcaaggta gttgaaacga gtaggagctt ttctaaagtt aggctggagt acctccacct caggcatgtg atgttggcca aaaatgctgg attcacatag atttggatgt ttattccctt ggacagtgac caatcttaga cctagcacat ttaactagca gtctgctaag acaattcctc ggttaaagga cacatcgtaa gcaaaggcgg gctactgccc aaccgaactc ggaaaacat c tctttttttt gcattttctt cccaggttca ccaccacacc gactggtctc attacaggtg ctactttcat gaggtgcaga tagtgagggt tatttaatat gtagacacct tgtccttaca acagtaaatg atggtgaagg taaagttgat tat cccaggc acaacagagg gagagagggc tctagtgttg cacaaagttt atcaggtgag tttttttttt gtgtccaacc agagatgctc 
cagctaagtt gaactcctga tgaaccactg acttgttaaa agagaaaaaa taatttaaaa agttaatgcc tgtgaataca gattcccagg gtgaagtcct gtgtccccat gggaaagttt cctccagcaa ggcaa tact c tcaccaacct atatgggaat ttcaattact gatattgcac tttttgagac aagactcaca ctgccctagc ttgtattttt cctaaagtga cacctggcct ataccgaaat aaaaaaaaaa agcaaaagaa aggaactttc caaaccaaca ggacaccaag aattaaataa ccccatgttt ccatctttca atgccttctg atgcttcgca tggagaagcc aaagcaaaaa gatgtgtctc tggagctgac agagtctcac taccatctca ctcccaagta agaagagatg tccacctgcc ccaagatttc gcttccatac aaaaaagcgc ttccggcctg accccacgta ctcccttctg aggtttttgc gcatgggtta aataaatgat gataagagca gaatcatctc aaagccgttc tggtttacat agtatacctg agcagccttg ctcttgtggc tgtgtcaccc gctcactgca gctgggatca gggtttcacg ttggcttccc tatttggcaa cagttagcaa 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 M Av-Ahvx Affr Jd ;t 14 1 Fri kj w m P w~o I o 4 k 4W I F f" h mA oc "'lF WO 00/08171 WO 0008171PCT/CA99/0071 I aaggccaccc ggaattcagc ttggacttaa ccagg <210> 9 <211> 2298 <212> DNA <213> Homo sapiens 1295 <400> 9 atgtttttta tttggactca agcagtgtca gcagaggaaa gatctgaaaa gagaacaaag agtgggaat t atagcagttg aaagctttat ggtcaagata aagtggataa tccagttgca tggagacata gattttgaac cggagacggc aacgcagaga actgtatcta tggagtgatg ctggctgagc gtaggagaca aaaactctag caatctcaag aaaaaatatg tcccagaaat tgagtttact gttgcttgta tttggggatt agttacgtga ataagaacac gaacaattgc ttgactatat tggtgccagt aaaatcacct attgctggct aatgcgtttt atgacatgtg ctttttttat aaaatcctta ttctgtacag gaatggttga agaaaaaacg agattgctga taattacatc tcaaggatgt gaattgttga gtccaacctg gatgttactg gcaaatacta agtggatgtc tgtccttctg tgttgtgaat aac cacaaat tgtgctgctc taggacagag tccaccgatc ccgttcggat atacctcagt cgacgacgct tgtgattcat gggatgggcc gaaaaaggcc aacaggtttc tgcgtacgca cgtgaagaag gctactttac ggc tcaacat cactatactt gacttaagca gagaacggtg gatgacattt ggctcagcag aaaagaacga catccactgt gcaatacttt gacggttacc ccgtgcaagg gacgccgaca gagcataatt cataaggacg caaatcgcaa ctagttcacc agtgccgcac gcggggcatg attataggta gctgatgtaa cggtggatgg ttcaacaacc 
aaagatatgt cttctatggc tgcaacgatt ccaacaccac atgtctcggg ggattatatc acaataaaag cacactacga ctcattatgg atcattgtcc catgcgctga agttccattt agtccctagc tgggaatcat ctcttggtga acgttaggat accgatctgg ttggactcgg ttcgagtcct aagacatcaa taaagctcta aggatatgcg ggtgaagctg caatggtact cagtatcagg ctatggcaga cactaacgga gggaaaaatt tatagatggg ccatgcaccc aattcgtagt tatgagacta agataagcag taccaaggac cttagttcat cactgcatca ttgcccatct tcaattcaag tgaaactttt 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 Wl~ VIAI~ L~ i I WO 00/08171 WO 0008171PCT/CA99/0071 I ggatcagatc tggggatggt aacacttttc aatgagatga aataagcata gactcctcta gac cat cgat cgttgcgctc ccacccaatt ttctcccagc tacaacaact ccctacctac cagcaccagg tccctctcca cttgttctac ggtaaaaaat ttatttttat gttcgttgtc ccgattttga ggaatctgaa ttgggtttgc agaggcagaa tttacttcga cacccaatat tcctcctccg cgctggaagc caaggaagaa atccctacgc cagaggagt t cctacgagta acttctgcag gggcaacccc cgccgtttcc tatattggtt ctttttttaa gtcgttaa acatgccaat ccctaataac gatcgagcag tcagacgctt aatcattcac tccctcttac agagaccttc catggcaaat tacagagttt ggagaacttc tgaagcagcc cacctgcgag agcttcagaa acctttccca agaactagcc ttctcgtact agatttattt tatgcgcaaa ttttatacaa cacctaaact gtgtatggca aattacatcg tctcgaaacc ttgttactag cgatgcgtct ttacgaggca atcggcaagc atcaaggcca gggatgctgg cactgccacc ttccagggta gggccttgta aatt tatggg ttagaaatta caaagggtca tgttccccca ccagtgatat aagtggatag aagtgcaagc acggtattct gactagggac ttctcaaacc agcccacctc cccacgtgtg ttatgagaac agcggatcac cacccagttt acccgactac gtcaccggaa gtgaatctaa ttaaatattt caagagccct tactccagaa gcaccacctt cttctggaag aactgtgtat ttctggtcgg tcctiacgaa gaagttcccc cagagaggtg gacagtcgac tcaggtagac cgcctacatc tataatccgc acggctaaga ccaccccggg tcgtgatact ttattgggat 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2298 <210> <211> 765 <212> PRT <213> Homo sapiens <400> Met Phe Phe Thr Ile Ser Arg Lys Asn Met Ser Gin Lys Leu Ser Leu 1 5 10 Leu Leu Leu Val Phe Gly Leu Ile Trp Gly Leu Met Leu Leu His Tyr WO 
00/08171 WO 0008171PCT/CA99/007 11 Thr Phe Gin Gin Pro Arg His Gin 40 Ile Leu Asp Leu Ser Lys Arg Tyr 55 Ser Ser Val Lys Leu Val Lys Ala Leu Ala Arg Giu Gin Giu Giu Asn Lys Asn Thr Vai Asp Vai Giu Asn Gly Aia Ser Met Ala Giy Tyr Aia 70 75 Asp Leu Lys Arg Thr Ile Aia Vai Leu Leu Asp Asp Ile Leu Gin Arg 90 Leu Vai Lys Leu Giu Asn Lys Vai 100 Asp Tyr Ile Vai Vai 105 Asn Gly Ser 110 Pro Val Thr Aia Aia Asn Thr Thr Asn Giy 115 Thr Asn Lys Arg Thr Asn Vai 130 135 Ser Giy Asn Leu Val 125 Ser Gly Ser Ile Arg 140 Pro Leu Trp Ile Ile 155 Ile Aia Vai Giu Ser Tyr Giy Arg 160 Asn 145 His Leu Vai Leu Leu His 150 Lys Ala Leu Tyr Cys Trp Leu Arg Thr Giu Ala Ile Leu Tyr Asn Lys 165 170 175 A w MW V~ W. w -w;h WOW VA4 A f WO 00/08171 WO 00/817 1PCT/CA99/0071 I Ser Thr Asn Gly Gly Gin Asp Lys 180 Tyr Pro His Tyr Giu Gly Lys Ile 195 200 Cys Val Phe Pro Pro Ile Asp Gly 185 190 Lys Trp Ile Asn Asp Met Cys Arg 205 Gly Ile Asp Giy Ser Ser Cys Thr 220 Ser Asp 210 Pro Cys Lys Ala His 215 Phe 225 Phe Ilie Tyr Leu Ser 230 Asp Ala Asp Asn His Cys Pro His Ala 235 Pro 240 Trp Arg His Lys Asn 245 Pro Tyr Asp Asp Ala Giu His Asn Ser 250 Cys Ala 255 Giu Ile Arg Ser Asp 260 Phe Giu Leu Leu Tyr Ser Val Ile 265 His His Lys 270 Asp Giu Phe His Phe Met Arg Leu Arg Arg Arg Arg 275 280 Trp Ala Gin Ile Ala Lys Ser Leu Aia Asp Lys Gin 290 295 300 Lys Lys Arg Lys Lys Ala Leu Val His Leu Giy Ile 305 310 315 Val Giu Gly Asn Ala Giu Lys Ile Thr Lys Asp 320 Thr Vai Ser Lys Ile Ala Giu Thr Giy Phe Ser 325 330 Ala Ala Pro Leu Gly 335 f~A~W~ N ~Ar~r~j ~t ~i 4 ~44N, V WO 00/08171 WO 0008171PCT/CA99/0071 I Asp Leu Val His Asp Val 355 His Trp, Ser Asp Val Ile Thr Ser Ala Tyr Ala Ala Gly 340 345 350 Arg Ile Thr Ala Ser Leu Ala Glu Leu Lys Asp Val Val 360 365 Lys Lys 370 Ile Vai 385 Ile Ile Gly Asn Arg Ser Gly Cys Pro 375 Ser Val Giy Asp Arg 380 Giu Leu Leu Tyr Ala Asp Vai Ile 390 Gly 395 Leu Gly Gin Phe Lys 400 Lys Thr Leu Gly Pro Thr Trp Ala Gin 405 Gly Ser Asp Pro Asp 425 Arg Trp Met Vai Arg Vai 415 Leu Giu Thr Gin 
Thr Lys 435 Phe Giu His Ala Asn Tyr Ala 430 Gly Trp Trp Asn Leu Asn Pro 445 Gly His Lys Ser Pro Trp 440 Asn Asn Phe Tyr Thr Met Phe Pro His Thr Pro Giu Asn Thr Phe Leu 450 455 460 Giy Phe Aia Ile Giu Gin His Leu Asn Ser Ser Asp Met His His Leu 465 470 475 480 Asn Giu Met Lys Arg Gin Asn Gin Thr Leu Val Tyr Gly Lys Val Asp 0 4 ~A~4~LW~ 'A ~t N,~1 ~J4W~~ ~I 4~iiij~f~ qA~i ~MI WO 00/08171 WO 0008171PCT/CA99/0071 I 485 495 His Asn Tyr 510 Ser Phe Trp Lys Asn Lys His Ile Tyr Phe Giu Ile Ile 500 505 Ile Giu Val Gin Ala Thr Val 515 Tyr Asp Ser Ser Thr 520 Ile Leu Ser Gly Arg 540 Pro 525 Asn Ile Pro Ser Tyr 530 Ser Arg Asn His Gly 535 Asp His Arg Phe Leu 545 Leu Arg Giu Thr Phe 550 Leu Leu Leu Gly Leu Gly 555 Thr Pro Tyr Giu 560 Arg Cys Ala Pro Leu Giu 565 Ala Met Ala Asn Arg Cys Val Phe 570 Leu Lys 575 Pro Lys Phe Pro Pro Pro Asn Ser Arg Lys Asn Thr Giu 580 585 Phe Leu Arg 590 Gly Lys Pro Thr Ser Arg Giu 595 Asn Phe Ile Gly Lys Pro His 610 615 Glu Giu Phe Giu Ala Ala Ile 625 630 Val Phe Ser Gin His 600 Val Trp Thr Val Asp 620 Lys Ala Ile Met Arg 635 Pro 605 Tyr Ala Glu Tyr Asn Asn Ser Thr Gin Vai Asp 640 WO 00/08171 WO 00/817 1PCT/CA99/0071 I Pro Tyr Leu Pro Tyr Giu Tyr Thr Cys 645 Giu Gly Met Leu Giu 650 Arg Ile 655 Thr Ala Tyr Ile 660 His Pro Pro Ser 675 Gin His Gin Asp Phe 665 Cys Arg Ala Ser Giu His Cys 670 Thr Pro Pro Phe Ile Ile Ser Leu Ser Arg Phe Pro Phe Gin Gly Asn 690 Pro Thr Thr Arg Leu Arg 695 700 Gly Pro Cys Ser His Arg 715 Leu Vai Leu Pro Asn His Pro Giy 720 Pro Phe Pro Giu Leu 705 Ala 710 Gly Lys Lys Leu Asn Arg Asp Thr 740 Ile Ile Lys Tyr 755 Tyr 725 Trp Phe Ser Arg Thr Asn Leu Trp Gly Giu Ser 730 735 Leu Phe Leu Ser Phe Phe Lys Asp Leu Phe Leu Giu 745 750 Phe Tyr Trp Asp Vai Arg Cys Arg Arg 760 765 <210> i1 <21i> 237 <212> DNA <213> Homo sapiens WO 00/08171 WO 00/817 1PCT/CA99/0071 I <400> 11 cctttcccat gaactagccg tctcgtacta gatttatttt tccagggtaa cccgactaca cggctaagac ttgttctacc gccgtttcca ggccttgtag tcaccggaac caccccgggg gtaaaaaatt atattggttt 120 
atttatgggg tgaatctaat cgtgatactt tatttttatc tttttttaaa 180 tagaaattat taaatatttt tattgggatg ttcgttgtcg tcgttaa 237

<210> 12
<211> 78
<212> PRT
<213> Homo sapiens

<400> 12
Pro Phe Pro Phe Gln Gly Asn Pro Thr Thr Arg Leu Arg Leu Val Leu
1               5                   10                  15
Pro Pro Phe Pro Glu Leu Ala Gly Pro Cys Ser His Arg Asn His Pro
            20                  25                  30
Gly Gly Lys Lys Leu Tyr Trp Phe Ser Arg Thr Asn Leu Trp Gly Glu
        35                  40                  45
Ser Asn Arg Asp Thr Leu Phe Leu Ser Phe Phe Lys Asp Leu Phe Leu
    50                  55                  60
Glu Ile Ile Lys Tyr Phe Tyr Trp Asp Val Arg Cys Arg Arg
65                  70                  75

<210> 13
<211> 28
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:primer

<400> 13
cagacctggt cggcccctgc agccacag

<210> 14
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:primer

<400> 14
ggaggcagcc ccgggagctg ggag

<210>
<211>
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:primer

<400>
ggtcaagata aatgcgtttt tccaccgatc

<210> 16
<211> 34
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:primer

<400> 16
gtggattata tcctatggca gaaaagcttt atat 34
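The relationship between the 237-nucleotide sequence of SEQ ID NO: 11 and the 78-residue sequence of SEQ ID NO: 12 can be checked with a short translation sketch. This is an illustration only, not part of the specification as filed. It assumes that the 10-nucleotide blocks of SEQ ID NO: 11, which appear out of column order in this copy of the listing, are read row by row (six blocks per row), and that the standard genetic code applies. The names `CODON_TABLE`, `SEQ_ID_11` and `translate` are hypothetical helpers introduced here.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# SEQ ID NO: 11 with its 10-nt blocks placed in row order.
# Assumption: this ordering is a reconstruction from the scrambled listing.
SEQ_ID_11 = (
    "cctttcccat" "tccagggtaa" "cccgactaca" "cggctaagac" "ttgttctacc" "gccgtttcca"
    "gaactagccg" "ggccttgtag" "tcaccggaac" "caccccgggg" "gtaaaaaatt" "atattggttt"
    "tctcgtacta" "atttatgggg" "tgaatctaat" "cgtgatactt" "tatttttatc" "tttttttaaa"
    "gatttatttt" "tagaaattat" "taaatatttt" "tattgggatg" "ttcgttgtcg" "tcgttaa"
)


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    peptide = translate(SEQ_ID_11)
    print(len(SEQ_ID_11), len(peptide))
    print(peptide)
```

Under the stated assumption, the reading frame starts at the first nucleotide and ends at the terminal `taa` codon, so the translated length agrees with the `<211> 78` entry given for SEQ ID NO: 12.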

Claims

1. An isolated N-acetylglycosaminyltransferase V-b protein comprising an amino acid sequence of SEQ ID NO. 2, SEQ ID NO. 4 or SEQ ID NO. 6, or an isolated N-acetylglycosaminyltransferase V-c protein comprising an amino acid sequence of SEQ ID NO. 10, or SEQ ID NO. 12.

2. An isolated nucleic acid molecule comprising a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 9, or SEQ ID NO: 11.

3. An isolated nucleic acid molecule encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: or SEQ ID NO: 12.

4. An isolated nucleic acid molecule as claimed in any of the preceding claims wherein the nucleic acid is fused to a nucleic acid which encodes a heterologous protein.

5. A vector comprising a protein according to claim 1 or a nucleic acid molecule of any of claims 2 or 3.

6. A host cell comprising a protein according to claim 1 or a nucleic acid molecule of any of claims 2 or 3.

7. A method for preparing a protein comprising: transferring a vector as claimed in claim 5 into a host cell; selecting transformed host cells from untransformed host cells; culturing a selected transformed host cell under conditions which allow expression of the protein; and isolating the protein.

8. A protein prepared in accordance with the method of claim 7.

9. An antibody having specificity against an epitope of a protein as claimed in claim 1 or claim 8.

10. An antibody as claimed in claim 9 labeled with a detectable substance and used to detect the protein in biological samples, tissues, and cells.

11. A probe comprising a sequence encoding a protein as claimed in claim 1 or 8, or a part thereof.

12. A method of diagnosing and monitoring conditions mediated by a protein as claimed in claim 1 or 8 by determining the presence of a nucleic acid molecule as claimed in any of the preceding claims or a protein as claimed in any of the preceding claims.

13. A method for identifying a substance which associates with a protein as claimed in claim 1 or 8 comprising reacting the protein with at least one substance which potentially can associate with the protein, under conditions which permit the association between the substance and protein, and removing or detecting protein associated with the substance, wherein detection of associated protein and substance indicates the substance associates with the protein.

14. A method as claimed in claim 13 wherein association of the protein with the substance is detected by assaying for substance-protein complexes, for free substance, for non-complexed protein, or for activation of the protein.

15. A method for evaluating a compound for its ability to modulate the biological activity of a protein as claimed in claim 1 or 8 comprising providing a known concentration of the protein with a substance which associates with the protein and a test compound under conditions which permit the formation of complexes between the substance and protein, and removing and/or detecting complexes.

16. A method for detecting a nucleic acid molecule encoding a protein comprising an amino acid sequence of SEQ ID NO: 2, 4, 6, 10 or 12 in a biological sample comprising the steps of: hybridising the nucleic acid molecule of claims 2 to 4 to nucleic acids of the biological sample, thereby forming a hybridisation complex; and detecting the hybridisation complex wherein the presence of the hybridisation complex correlates with the presence of a nucleic acid molecule encoding the protein in the biological sample.

17. A method as claimed in claim 16 wherein nucleic acids of the biological samples are amplified by the polymerase chain reaction prior to the hybridising step.

18. A composition comprising one or more of a nucleic acid molecule or protein claimed in any of the preceding claims, or a substance or compound identified using a method as claimed in any of the preceding claims, and a pharmaceutically acceptable carrier, excipient or diluent.

19. Use of one or more of a nucleic acid molecule or protein claimed in any of the preceding claims, or a substance or compound identified using a method as claimed in any of the preceding claims in the preparation of a pharmaceutical composition for treating a condition mediated by a protein as claimed in claim 1 or 8.

20. A gene-based therapy directed at the brain comprising a polynucleotide comprising all or a portion of a regulatory sequence of SEQ ID NO: 7 or 8.

21. A method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc, and an acceptor in the presence of a protein as claimed in claim 1 or 8.

22. An oligosaccharide prepared by a method according to claim 21.

23. An isolated N-acetylglycosaminyltransferase V-b protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

24. An isolated nucleic acid molecule, substantially as herein described with reference to any one of the examples but excluding comparative examples.

25. A vector, substantially as herein described with reference to any one of the examples but excluding comparative examples.

26. A host cell, substantially as herein described with reference to any one of the examples but excluding comparative examples.

27. A method for preparing a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

28. A protein prepared by a method, substantially as herein described with reference to any one of the examples but excluding comparative examples.

29. An antibody having specificity against an epitope of a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

30. A probe comprising a sequence encoding a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

31. A method of diagnosing and monitoring conditions mediated by a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

32. A method for identifying a substance which associates with a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

33. A method for evaluating a compound for its ability to modulate the biological activity of a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

34. A method for detecting a nucleic acid molecule encoding a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

35. A composition, substantially as herein described with reference to any one of the examples but excluding comparative examples.

36. Use of one or more of a nucleic acid molecule or protein, or a substance or compound identified by a method, in the preparation of a pharmaceutical composition for treating a condition mediated by a protein, substantially as herein described with reference to any one of the examples but excluding comparative examples.

37. A gene-based therapy directed at the brain, substantially as herein described with reference to any one of the examples but excluding comparative examples.

38. A method for preparing an oligosaccharide, substantially as herein described with reference to any one of the examples but excluding comparative examples.

39. An oligosaccharide prepared by a method, substantially as herein described with reference to any one of the examples but excluding comparative examples.

DATED this 28th day of November 2003
BALDWIN SHELSTON WATERS
Attorneys for: GlycoDesign Inc.