Pattern recognition in genetic sequences by mismatch density (Q1070960)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Pattern recognition in genetic sequences by mismatch density | scientific article | |
Statements
Pattern recognition in genetic sequences by mismatch density (English)
1984
The main object of this paper is to formalize the concept of pattern similarity in genetic sequences and to give an algorithm that finds all such similarities in a given context. Consider two genetic sequences, represented by strings of symbols \(a_1 \ldots a_m\) and \(b_1 \ldots b_n\), respectively. A pair of segments \(a_p \ldots a_q\) and \(b_r \ldots b_s\) from these sequences, aligned so as to exhibit their similarity, is called an alignment. For an alignment \(A\), the length \(l(A)\) is defined as the average length of the two segments, and the degree of mismatch \(d(A)\) is defined via the minimum number of mutations and deletions needed to make the two segments identical. The degree of similarity between the two segments of alignment \(A\) is then measured by \(r\cdot l(A)-d(A)\), where \(r\) is a positive proportionality constant. The algorithm presented in this paper considers all alignments between any segment of one sequence and any segment of the other and finds every alignment \(A\) for which \(r\cdot l(A)-d(A)\) is a maximum in the neighbourhood of \(A\). The method is extensively illustrated by means of several examples.
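As a rough illustration of the score \(r\cdot l(A)-d(A)\), the following Python sketch evaluates it for a pair of segments and then searches all segment pairs by brute force. It is not the paper's algorithm. It assumes that \(d(A)\) can be approximated by a unit-cost edit distance (substitutions, insertions, and deletions each cost 1), and it reports only the single best-scoring pair rather than every local maximum, since the abstract does not define the neighbourhood of an alignment. The function names `edit_distance`, `similarity`, and `best_segment_pair` are hypothetical.

```python
def edit_distance(x: str, y: str) -> int:
    """Unit-cost edit distance between x and y (an assumed stand-in for d(A))."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]  # deleting the first i symbols of x
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def similarity(seg_a: str, seg_b: str, rate: float) -> float:
    """Score rate * l(A) - d(A), with l(A) the average segment length."""
    length = (len(seg_a) + len(seg_b)) / 2
    return rate * length - edit_distance(seg_a, seg_b)


def best_segment_pair(a: str, b: str, rate: float):
    """Brute-force search over all non-empty segment pairs (illustration only)."""
    best = None
    for p in range(len(a)):
        for q in range(p + 1, len(a) + 1):
            for s in range(len(b)):
                for t in range(s + 1, len(b) + 1):
                    score = similarity(a[p:q], b[s:t], rate)
                    if best is None or score > best[0]:
                        best = (score, a[p:q], b[s:t])
    return best


if __name__ == "__main__":
    print(best_segment_pair("TTACGTTT", "GGACGTCC", 0.5))
```

The exhaustive search recomputes an edit distance for every pair of segments, so it is only practical for very short strings. The "dynamic programming" keyword on this record suggests the paper avoids such enumeration, but the abstract gives no further detail.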
dynamic programming
pattern similarity
genetic sequences
algorithm
alignment
degree of mismatch
degree of similarity
examples