# Algorithms in Sequence Analysis - Assignment 1

## Question 1

Global pairwise alignment

Gaps in one sequence. Modify the dynamic programming recurrence below, in which g is the gap penalty, so that gaps are allowed only in sequence X.

M[i, j] = max { M[i-1, j-1] + score(X[i], Y[j]), M[i, j-1]-g, M[i-1, j]-g }
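As a reference point, the unmodified recurrence above can be sketched in Python. This is a minimal sketch, not a prescribed implementation: the names `global_fill` and `toy_score`, the +1/-1 toy scoring, and the boundary initialisation `M[i][0] = -g*i`, `M[0][j] = -g*j` are assumptions that do not appear in the assignment text.

```python
def global_fill(x, y, score, g):
    """Fill the DP matrix for the unmodified recurrence with linear gap penalty g."""
    n, m = len(x), len(y)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    # Assumed boundary conditions: leading gaps cost g each.
    for i in range(1, n + 1):
        M[i][0] = -g * i
    for j in range(1, m + 1):
        M[0][j] = -g * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i][j] = max(
                M[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # align X[i] with Y[j]
                M[i][j - 1] - g,                              # step left
                M[i - 1][j] - g,                              # step up
            )
    return M


def toy_score(a, b):
    """Hypothetical scoring: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


M = global_fill("AC", "AGC", toy_score, g=2)
print(M[-1][-1])  # score of the best global alignment of the toy pair
```

The strings are 0-indexed in Python, so `x[i - 1]` plays the role of X[i] in the recurrence.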

## Question 2

The edit distance between two words is the minimum number of operations needed to transform one word into the other. The operations available are:

- replacement of a single letter by another
- insertion of a single letter
- deletion of a single letter
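The three operations can be illustrated with a short Python sketch. The helper names `replace`, `insert` and `delete`, and the example words, are chosen here for illustration only and are not part of the assignment.

```python
def replace(word, i, letter):
    """Replace the letter at position i (0-based) by another letter."""
    return word[:i] + letter + word[i + 1:]


def insert(word, i, letter):
    """Insert a single letter before position i (0-based)."""
    return word[:i] + letter + word[i:]


def delete(word, i):
    """Delete the single letter at position i (0-based)."""
    return word[:i] + word[i + 1:]


# Transform "kitten" into "sitting" using three operations.
steps = ["kitten"]
steps.append(replace(steps[-1], 0, "s"))  # kitten -> sitten
steps.append(replace(steps[-1], 4, "i"))  # sitten -> sittin
steps.append(insert(steps[-1], 6, "g"))   # sittin -> sitting
print(" -> ".join(steps))
```

Each call counts as one operation, so this particular transformation uses three of them.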

Write down a new equation for M[i, j] such that the score of the alignment equals the edit distance.

## Question 3

Perform a global alignment of the protein sequences DARWIN and CRICK. First, complete the template matrix Q3 including the arrows. Use this scoring function:

M[i, j] = max { M[i-1, j-1] + blosum62(X[i], Y[j]), M[i, j-1]-2, M[i-1, j]-2 }

where blosum62(X[i], Y[j]) is the substitution score between residues X[i] and Y[j] according to the BLOSUM62 exchange matrix.

## Question 4

Provide the alignment after traceback from your matrix.
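The traceback step can be sketched as follows. This is an illustrative sketch under stated assumptions: instead of stored arrows, it re-derives each arrow by checking which predecessor cell reproduces M[i][j]. The function names, the +1/-1 toy scoring, the toy sequences AC and AGC, and the hard-coded toy matrix (filled with gap penalty 2 and boundary values -2i and -2j) are hypothetical and unrelated to DARWIN and CRICK.

```python
def traceback(x, y, M, score, g):
    """Walk from the bottom-right cell back to the top-left cell of M."""
    i, j = len(x), len(y)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and M[i][j] == M[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            # Diagonal arrow: X[i] is aligned with Y[j].
            top.append(x[i - 1])
            bottom.append(y[j - 1])
            i, j = i - 1, j - 1
        elif j > 0 and M[i][j] == M[i][j - 1] - g:
            # Left arrow: Y[j] is aligned with a gap.
            top.append("-")
            bottom.append(y[j - 1])
            j -= 1
        else:
            # Up arrow: X[i] is aligned with a gap.
            top.append(x[i - 1])
            bottom.append("-")
            i -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def toy_score(a, b):
    """Hypothetical scoring: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


# Toy matrix for x = "AC" (rows) and y = "AGC" (columns), gap penalty 2.
M = [
    [0, -2, -4, -6],
    [-2, 1, -1, -3],
    [-4, -1, 0, 0],
]
aligned_x, aligned_y = traceback("AC", "AGC", M, toy_score, g=2)
print(aligned_x)
print(aligned_y)
```

When several arrows lead into a cell, this sketch follows only one of them (diagonal first, then left, then up), so it reports a single optimal alignment even if more exist.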

## Question 5

What is the total alignment score?

## Question 6

Local pairwise alignment

Find two maximal-scoring local alignments between the sequences TGAGA and GAGGC using the following scoring function:

M[i, j] = max { M[i-1, j-1] +/- 1, M[i, j-1] - 2, M[i-1, j] - 2, 0 }

Here +/- 1 denotes +1 for a match and -1 for a mismatch between X[i] and Y[j].

First fill out the template matrix Q5.

Note: You should use the Waterman-Eggert method.

Provide your score matrix.
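The overall procedure can be sketched in Python on a different, toy pair of sequences. This is a simplified sketch in the spirit of the Waterman-Eggert method, not a faithful implementation: after each alignment is reported, the cells on its path are forced to zero and the whole matrix is refilled, whereas the original method recomputes only the affected cells. All function names, the `blocked` parameter, and the toy sequences AAGG and GGAA are assumptions made for illustration.

```python
def local_fill(x, y, blocked=frozenset(), match=1, mismatch=-1, gap=2):
    """Fill the local alignment matrix; cells listed in `blocked` stay 0."""
    n, m = len(x), len(y)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if (i, j) in blocked:
                continue
            s = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + s, M[i][j - 1] - gap, M[i - 1][j] - gap, 0)
    return M


def best_local(x, y, M, match=1, mismatch=-1, gap=2):
    """Trace back from a highest-scoring cell until a cell with score 0 is reached."""
    n, m = len(x), len(y)
    # Ties between equally scoring cells are broken by the largest (i, j).
    best, i, j = max((M[i][j], i, j) for i in range(n + 1) for j in range(m + 1))
    top, bottom, path = [], [], []
    while M[i][j] > 0:
        path.append((i, j))
        s = match if x[i - 1] == y[j - 1] else mismatch
        if M[i][j] == M[i - 1][j - 1] + s:
            top.append(x[i - 1])
            bottom.append(y[j - 1])
            i, j = i - 1, j - 1
        elif M[i][j] == M[i][j - 1] - gap:
            top.append("-")
            bottom.append(y[j - 1])
            j -= 1
        else:
            top.append(x[i - 1])
            bottom.append("-")
            i -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom)), path


def top_local_alignments(x, y, k=2):
    """Report up to k local alignments that do not share matrix cells."""
    blocked, results = set(), []
    for _ in range(k):
        M = local_fill(x, y, blocked)
        best, top, bottom, path = best_local(x, y, M)
        if best == 0:
            break  # nothing left to align
        results.append((best, top, bottom))
        blocked.update(path)  # forbid reuse of these cells in the next round
    return results


results = top_local_alignments("AAGG", "GGAA")
for best, top, bottom in results:
    print(best, top, bottom)
```

For the toy pair, both reported alignments score 2: the first pairs GG with GG and the second pairs AA with AA.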

## Question 7

Provide the two alignments and their scores.

Note: The question asks for local alignments, so make sure the alignments you hand in are local, not global.