The tag has no wiki summary.

learn more… | top users | synonyms

7
votes
0answers
88 views

Simplifying the disjoint union of wildcard strings

Setting: patterns with "don't care" symbols, binary alphabet. For example, pattern $x = 001?$ represents the set $L(x) = \{0010, 0011\}$. We are given a set $P$ of disjoint patterns: $L(x) \cap L(y) ...
10
votes
2answers
231 views

Edit distance with move operations

Motivation: A coauthor edits a manuscript and I would like to see a clear summary of the edits. All "diff"-like tools tend to be useless if you are both moving text around (e.g., re-organising the ...
8
votes
4answers
1k views

Can Suffix trees be used to find all common substrings?

I am trying to use suffix trees to compare string sequences. I have found implementations/theory for the longest common sub string problem using suffix trees. However, What i am looking for is a ...
0
votes
0answers
383 views

suffix tree: about Ukkonen's algorithm

I have specific question about suffix trees. I am reading the book Algorithms on strings_trees and sequence. I cannot understand details of Ukkonen's algorithm for constructing suffix trees. Why ...
1
vote
2answers
234 views

Matching substrings within two strings

I'm currently thinking about the following problem: Given two strings $S,T$ over an alphabet $\Sigma$, return back a list $L$ of common substrings described by their length and positions within them. ...
17
votes
2answers
406 views

n-dimensional pattern matching

What are some known results for finding an exact n-dimensional subarray inside a n-dimensional array? In 1D, it is just a string matching problem, KMP does it in linear time. In 2D, this paper shown ...
0
votes
1answer
477 views

Algorithm: Find the first k longest substrings between two similar strings

Consider two strings, S and T. Find the first k longest common non-overlapping substrings of S and T.
-4
votes
3answers
116 views

Text Analyze and Processing

What algorithms do you know regarding text analyze/processing? Detailed, I need to "find out" the most used words, sentences, words' combinations, their category etc (these are just examples, moreover ...
7
votes
3answers
273 views

Permutation pattern matching in strings

Loosely speaking, permutation pattern matching deals with problems of the following kind: Given permutations $\pi$ in $S_n$ and $\sigma$ in $S_m$, with $m\leq n$, does $\pi$ contain a subsequence ...
6
votes
1answer
218 views

Fast algorithm for scoring a sequence

Let us define: $U = \left\{ u_j \right\}, 1 \leq j \leq N = 2^{L}$, the set of all different binary sequences of length $L$. $V = \left\{ v_i \right\}, 1 \leq i \leq M = \binom{L}{k}2^{k}$, the set ...
8
votes
1answer
295 views

Deciding if a wildcard string is completely matched by another wildcard string in a set

Here's a problem that has been bugging me for a while. Let's say a string is a sequence of 1s and 0s, and a wildcard string is a sequence of 1, 0, and ?s. All strings and wildcard strings have the ...
4
votes
1answer
191 views

How to calculate this string-dissimilarity function efficiently?

Migrated from stackoverflow. Hello, I was looking for a string metric that have the property that moving around large blocks in a string won't affect the distance so much. So "helloworld" is ...
13
votes
6answers
1k views

Computing the Levenshtein distance quickly

Given a huge database of allowed words (alphabetically sorted) and a word, find the word from the database that is closest to the given word in terms of Levenshtein distance. The naive approach is, ...
0
votes
0answers
376 views

Rabin karp string searching algorithm analysis [closed]

My prof said Rabin-Karp analysis in Introduction to algorithms by Cormen, Leiserson, Rivest, and Stein is wrong. Can someone point me to the correct analysis. Thanks
2
votes
1answer
267 views

Using compression to improve edit distance computation

I am doing a seminar on a paper titled "Unified Compression-Based Acceleration of Edit-Distance Computation" that uses straight-line programs to improve edit distance computation. It is a common ...

1 2
15 30 50 per page