Pattern recognition in genetic sequences by mismatch density (Q1070960)

The main aim of this paper is to formalize the concept of pattern similarity in genetic sequences and to give an algorithm that finds all such similarities in the following setting. Consider two genetic sequences, represented by strings of symbols \(a_1 \dots a_m\) and \(b_1 \dots b_n\), respectively. A pair of segments \(a_p \dots a_q\) and \(b_r \dots b_s\) taken from these sequences and aligned so as to exhibit their similarity is called an alignment. For an alignment \(A\), the length \(l(A)\) is defined as the average length of the two segments, and the degree of mismatch \(d(A)\) is defined via the minimum number of mutations and deletions needed to make the two segments identical. The degree of similarity between the two segments of \(A\) is then measured by \(r\cdot l(A)-d(A)\), where \(r\) is a positive proportionality constant. The algorithm presented in the paper examines all alignments between any segment of one sequence and any segment of the other, and finds every alignment \(A\) for which \(r\cdot l(A)-d(A)\) is a maximum in the neighbourhood of \(A\). The method is illustrated by several examples.
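To make the score \(r\cdot l(A)-d(A)\) concrete, here is a minimal Python sketch. It is not the algorithm of the paper: it assumes that \(d(A)\) is the unit-cost edit distance (one unit per mutation, insertion, or deletion), and it searches for a single global maximum by brute force instead of finding every local maximum. The function names `mismatch`, `similarity`, and `best_segment_pair` are hypothetical and chosen only for illustration.

```python
def mismatch(x: str, y: str) -> int:
    """Unit-cost edit distance between x and y (assumed stand-in for d(A))."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx from x
                curr[j - 1] + 1,           # delete cy from y
                prev[j - 1] + (cx != cy),  # match (cost 0) or mutation (cost 1)
            ))
        prev = curr
    return prev[-1]


def similarity(x: str, y: str, r: float) -> float:
    """Score r * l - d, with l the average length of the two segments."""
    length = (len(x) + len(y)) / 2
    return r * length - mismatch(x, y)


def best_segment_pair(a: str, b: str, r: float):
    """Brute-force search over all non-empty segment pairs (tiny inputs only).

    Returns (score, segment_of_a, segment_of_b) for the first pair that
    attains the highest score.
    """
    best = (float("-inf"), "", "")
    for p in range(len(a)):
        for q in range(p + 1, len(a) + 1):
            for s in range(len(b)):
                for t in range(s + 1, len(b) + 1):
                    score = similarity(a[p:q], b[s:t], r)
                    if score > best[0]:
                        best = (score, a[p:q], b[s:t])
    return best


if __name__ == "__main__":
    print(similarity("ACGT", "AGT", r=1.0))
    print(best_segment_pair("TTACGTTT", "GGACGTGG", r=0.5))
```

The constant \(r\) controls the trade-off: a larger value rewards longer segments even when they contain more mismatches, while a smaller value favours short, nearly exact matches.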

zbMATH Keywords

dynamic programming

pattern similarity

genetic sequences

algorithm

alignment

degree of mismatch

degree of similarity

examples

reviewed by

Guenther Karigl

MaRDI profile type

MaRDI publication profile

cites work

An extreme value theory for long head runs

An algorithm for the distance between two finite sequences

On the Theory and Computation of Evolutionary Distances

The theory and computation of evolutionary distances: Pattern recognition

full work available at URL

https://doi.org/10.1007/bf02459499

Identifiers

zbMATH Open document ID

0584.92009

DOI

10.1007/BF02459499

Mathematics Subject Classification ID

Sitelinks

Mathematics (1 entry)

mardi Publication:1070960

OpenAlex ID

W4236413252