Chapter 1 presents fundamental concepts from molecular biology. We describe the basicstructure and function of proteins and nucleic acids, the mechanisms of molecular genetics,the most important laboratory techniques for studying the genome of organisms, andan overview of existing sequence databases.Chapter 2 describes strings and graphs, two of the most important mathematical objectsused in the book. A brief exposition of general concepts of algorithms and theiranalysis is also given, covering definitions from the theory of NP-completeness.The following chapters are based on specific problems in molecular biology. Chapter3 deals with sequence comparison. The basic two-sequence problem is studied andthe classic dynamic programming algorithm is given. We then study extensions of thisalgorithm, which are used to deal with more general cases of the problem. A section is devotedto the multiple-sequence comparison problem. Other sections deal with programsused in database searches, and with some other miscellaneous issues.Chapter 4 covers the fragment assembly problem. This problem arises when a DNAsequence is broken into small fragments, which must then be assembled to reconstitutethe original molecule. This is a technique widely used in large-scale sequencing projects,such as the Human Genome Project. We show how various complications make thisproblem quite hard to solve. We then present some models for simplified versions of theproblem. Later sections deal with algorithms and heuristics based on these models.Chapter 5 covers the physical mapping problem. This can be considered as fragmentassembly on a larger scale. Fragments are much longer, and for this reason assemblytechniques are completely different. The aim is to obtain the location of some markersalong the original DNA molecule. A brief survey of techniques and models is given.We then describe an algorithm for th
1