Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Algorithms and Computational Biology Lab
[go: Go Back, main page]

   
Algorithms and Computational Biology Lab
   

 

- Interests -

We are interested in all aspects of the design and analysis of combinatorial algorithms. Ongoing research includes
  • approximation algorithms,
  • on-line algorithms,
  • computational geometry,
  • graph drawing,
  • information retrieval,
  • average-case analysis of algorithms,
  • computational complexity.
  • We are especially interested in algorithmic problems arising in computational molecular biology, such as
  • (multiple) sequence alignment,
  • reconstruction of evolutionary trees,
  • physical mapping,
  • DNA microarray analysis,
  • genome-level gene dynamics.
  •  

    - People -

     

    - Talks: Spring 2004 -

      "Talks are generally held on Mondays, from 3:10-4:30am in Surge 349."
    Date Speaker Topic
    Apr. 12 Jie Zheng Title: "De novo Repeat Classification and Fragment Assembly."
    Authors: Pavel A. Pevzner, Haixu Tang, Glenn Tesler.
    RECOMB 2004.

    Abstract: "Repetitive sequences make up a significant fraction of almost any genome and an important and still open question in bioinformatics is how to represent all repeats in DNA sequences. We propose a radically new approach to repeat classficiation that is motivated by the fundamental topological notion of quotient spaces. A torus or Kein bottle are examples of quotient spaces that can be obtained from a square by gluing some points. Our new repeat classification algorithm is based on the observation that the alignment-induced quotient space of a DNA sequence compactly represents all sequence repeats. This observation leads to a simple and efficient solution of the repeat classification problem as well as new approaches to fragment assembly and multiple alignment."

    Apr. 19 No Meeting  
    Apr. 26 Qiaofeng Yang Title: "Discovering temporal relations in molecular pathways using protein-protein interactions."
    Authors: Martin Farach-Colton, Yang Huang, John L. Woolford.
    RECOMB 2004.

    Abstract: The availability of large-scale protein-protein interactin data provides us with many opportunities to study molecular pathways involving proteins. In this paper they propose to mine temporal relations in molecular pathways by protein-protein interaction data. In particular, they model the assembly pathways of protein complexes with interval graphs and determine the temporal order of joining the pathway for protein by ordering the vertices in the interaction graph. They develop a tool called XRONOS to perform such a computation. They then apply XRONOS to the ribosome assembly pathway and present validation results for the obtained ordering. The results are promising and show the potential usage for XRONOS in the study of molecular pathways.

    May 3 Qi Fu Title: "ALIGNING ALIGNMENTS EXACTLY"
    From: RECOMB 2004.

    Abstract: A basic computational problem that arises in both the construction and local-search phases of the best heuristics for multiple sequence alignment is that of aligning the columns of two multiple alignments. When the scoring function is the sum-of-pairs objecive and induced pairwise alignments are evaluated using linear gap-costs, this problem is called Aligning Alignments. While seemingly a straightforward extension of two-sequence alignment, it is actually proved to be NP-complete. As explained in the paper, this provides the first demonstration that minimizing linear gap-costs, in the context of multiple sequence alignment, is inherently hard.

    The paper also develop an exact algorithm for Aligning Alignments that is remarkably efficient in practice, both in time and space. Even though the problem is NP-complete, computational experiments on both biological and simulated data show we can compute optimal alignemtns for all benchmark instances in two standard datasets, and solve very large random instances with highly-gapped sequences.

    May 10 Zheng Liu Title: "Disigning Multiple Simultaneous Seeds for DNA Similarity Search"
    Authors: Yanni Sun, Jeremy Buhler
    From: RECOMB 2004.

    Abstract: The challegne of similarity search in massive DNA sequence databases has inspired major changes in BLAST-style alignment tools, which accelerate serach by inspecting only pairs of sequences sharing a common short "seed", or pattern of matching residues. Some of these changes raise the possibility of improving search performance by probing sequence pairs with several distinct seeds, any one of which is sufficient for a seed match. However, designing a set of seeds to maximize their combined sensitivity to biologically meaningful sequence alignments is computationally difficult, even given recent advances in designing single seeds.

    This work describes algorithmic improvements to seed design that address the problem of designing a set of n seeds to be used simultaneously. We give a new local search method to optimize the sensitivity of seed sets. The method relies on efficient incremental computation of the probability that an alignment contains a match to a seed, given that it has already faied to match any of the seeeds in a set. We demonstrate experimentally that multi-seed designs, even with relatively few seeds, can be significantly more sensitive than even optimized single-seed designs.

     

     

    - Talks: Winter 2004 -

      "Talks are generally held on Wednesdays, from 10:10-11:30am in Surge 349."
    Date Speaker Topic
    Jan. 07 Jing Li Title: "Efficient Haplotype Inference on Pedigrees and Applications"

    Ref: Extended Abstract

    Jan. 14 Marek Chrobak Title: "The Wake-Up Problem in Multi-Hop Radio Networks"

    Ref: M. Chrobak, L.Gasieniec, D.Kowalski from Proc. SODA'2004

    Jan. 21 Marek Chrobak Title: "The Analysis of the Greedy Algorithm for Minimum Common String Patrition"

    Ref: Work in Progress

    Jan. 28 Neal Young Title: "Lagrangian relaxation via randomized rounding"

    Abstract: Lagrangian relaxation algorithms have a long history (dating back at least to von Neumann). These algorithms are used mainly to find approximate solutions to packing and covering problems -- linear programs and integer linear programs, with non-negative coefficients. The algorithms build a solution in small steps, guided by a smooth penalty function that serves as a proxy for some of the underlying constraints of the original problem. In comparison to other methods of solving linear programs, Lagrangian relaxation algorithms can be easier to implement and faster, and they handle sparse problems gracefully. These algorithms have been extensively studied over the last 15 years. Their design and analysis is fairly technical. For example, designing the appropriate penalty functions is something of a black art.
    In the talk I will introduce a particular approach to the design of one class of Lagrangian relaxation algorithms (corresponding roughly to algorithms developed in the computer science literature over the last fifteen years). I will introduce techniques that help both to understand existing algorithms (including the underlying techniques and the relationships between the algorithms), and to design and analyze new algorithms The framework for this is randomized rounding -- an elegant technique, based on the probabilistic method, for designing approximation algorithms for combinatorial problems. This is a novel approach to Lagrangian relaxation algorithms and a novel use of randomized rounding, using a new kind of randomized rounding scheme based on random sampling.
    More information available at http://www.cs.ucr.edu/~neal/wiki/wiki.pl?LRviaRR

    Feb. 4 Jie Zheng Title: "Complexity and Approximation of Minimum Common Partition"

    Ref: Work in Progress

    Feb. 11 Keith Humphreys Title: "Approximate Quantiles in One Pass with Limited Memory"

    Ref: Approximate Medians and other Quantiles in One Pass and with Limited Memory,
    by G S Manku, S Rajagopalan and B G Lindsay,
    Proc. 1998 ACM SIGMOD, Vol 27, No 2, p 426-35, June 1998.

    Slides from talk here

    Feb. 18 Haifeng Li Title: "A Class of Edit Kernels for SVMs to Predict Translation Initiation Sites in Eukaryotic mRNAs"

    Ref:

    Feb. 25 Qiaofeng Yang Title: "Finding biclusters by random projections"

    Ref: Work in Progress

    Mar. 3 Wojciech Jawor Title: " Online Competitive Algorithms for Maximizing Weighted Throughput of Unit Jobs"

    (joint work with Yair Bartal, Francis Y. L. Chin, Marek Chrobak, Stanley P. Y. Fung, Ron Lavi, Jiri Sgall and Tomas Tichy)

    Abstract: We will consider an online problem of scheduling unit length jobs to maximize the weighted throughput. Each job will be specified by the release time, deadline, and weight. The goal of the algorithm is to schedule the maxium weighted number of jobs. Each job needs to be scheduled between its release time and deadline, otherwise it brings no profit. In the problem jobs arrive over time and the algorithm needs to make the decision of which job to schedule without the knowledge of the future. I will show upper and lower bounds for the problem, as well as some known results on some special cases of the problem.

    Mar. 10 John Noga Title: "Online List Batching"

    Ref:

     

     

    - Talks: Fall 2003 -

      "Talks are generally held on Thursdays, from 3:40-5:00pm in Surge 349."
    Date Speaker Topic
    Oct. 9 Jie Zheng Title: "Algorithmic Techniques of Sorting by Reversals"

    Ref: S. Hannenhalli, Pavel A. Pevzner.
    Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals. ACM STOC 1995.

    Oct. 16 Xin Chen Title: "Methods for the assignment of orthologous genes"

    Ref: Tatusov RL, Koonin EV, Lipman DJ.
    A genomic perspective on protein families. Science, 278(5338): 631-7, 1997.

    Oct. 23 Petr Kolman Title: " Algorithms for the Maximum Disjoint Paths Problem" (Joint work with Christian Scheideler, JHU)

    Abstract: The {\em disjoint paths problem} is defined as follows. Given an undirected graph $G=(V,E)$ and a set $T$ of $k$ pairs of nodes $(s_i, t_i)$, $1\leq i\leq k$, decide whether there exist $k$ edge disjoint paths $P_1,\cdots,P_k$ such that the path $P_i$ connects $s_i$ and $t_i$. This is an NP-complete problem. The optimization variant of this problem is called the {\em maximum disjoint paths problem} (MDPP), which is simply to find the maximum subset of $T$ for which there exist edge-disjoint paths. In the talk a simple approximation (and online) algorithm for this problem will be presented.

    Oct. 30 Avraham Goldstein (Yeshiva Univ) Title: "Some algorithmic and complexity issues in clustering fingerprint vectors with missing values."

    Nov. 6 Swastik Kopparty Title: "Linearity Testing"

    Nov. 13 Wojciech Jawor Title: "How to Guarantee Quality of Service on the Internet"

    Nov. 13 Jiri Sgall
    Mathematical Institute,
    Academy of Sciences of the Czech Republic, Prague
    Title: "Online preemptive scheduling" (based on joint work with Leah Epstein and Tomas Ebenlendr)

    Abstract: We survey the results on online preemptive scheduling on identical and uniformly related machines. (Preemptive means that a job can change machines or be idle for some time, related machines means that different machine may have different speeds.) While the problem for identical machines is completely solved and the optimal competitive ratio is about 1.58, for related machines there is still a gap between current lower bound of 2 and the upper bounds of 4 and 2.71 for deterministic and randomized algorithms, respectively. We present a semi-online algorithm which, if the optimal makespan is given in advance, produces an optimal schedule. We use it to build the currently best algorithms mentioned above.

     

     

    - Misc. -

  • Talks from last year are stored here.
  • Pictures from the FOCS2000 meeting, hosted by UCR, are located here.
  •