The string-matching tag has no wiki summary.
7
votes
0answers
88 views
Simplifying the disjoint union of wildcard strings
Setting: patterns with "don't care" symbols, binary alphabet.
For example, pattern $x = 001?$ represents the set $L(x) = \{0010, 0011\}$.
We are given a set $P$ of disjoint patterns: $L(x) \cap L(y) ...
10
votes
2answers
231 views
Edit distance with move operations
Motivation: A coauthor edits a manuscript and I would like to see a clear summary of the edits. All "diff"-like tools tend to be useless if you are both moving text around (e.g., re-organising the ...
8
votes
4answers
1k views
Can Suffix trees be used to find all common substrings?
I am trying to use suffix trees to compare string sequences. I have found implementations/theory for the longest common sub string problem using suffix trees.
However, What i am looking for is a ...
0
votes
0answers
383 views
suffix tree: about Ukkonen's algorithm
I have specific question about suffix trees. I am reading the book Algorithms on strings_trees and sequence.
I cannot understand details of Ukkonen's algorithm for constructing suffix trees. Why ...
1
vote
2answers
234 views
Matching substrings within two strings
I'm currently thinking about the following problem:
Given two strings $S,T$ over an alphabet $\Sigma$, return back a list $L$ of common substrings described by their length and positions within them.
...
17
votes
2answers
406 views
n-dimensional pattern matching
What are some known results for finding an exact n-dimensional subarray inside a n-dimensional array?
In 1D, it is just a string matching problem, KMP does it in linear time.
In 2D, this paper shown ...
0
votes
1answer
477 views
Algorithm: Find the first k longest substrings between two similar strings
Consider two strings, S and T. Find the first k longest common non-overlapping substrings of S and T.
-4
votes
3answers
116 views
Text Analyze and Processing
What algorithms do you know regarding text analyze/processing? Detailed, I need to "find out" the most used words, sentences, words' combinations, their category etc (these are just examples, moreover ...
7
votes
3answers
273 views
Permutation pattern matching in strings
Loosely speaking, permutation pattern matching deals with problems of the following kind:
Given permutations $\pi$ in $S_n$ and $\sigma$ in $S_m$, with $m\leq n$, does $\pi$ contain a subsequence ...
6
votes
1answer
218 views
Fast algorithm for scoring a sequence
Let us define:
$U = \left\{ u_j \right\}, 1 \leq j \leq N = 2^{L}$, the set of all different binary sequences of length $L$.
$V = \left\{ v_i \right\}, 1 \leq i \leq M = \binom{L}{k}2^{k}$, the set ...
8
votes
1answer
295 views
Deciding if a wildcard string is completely matched by another wildcard string in a set
Here's a problem that has been bugging me for a while. Let's say a string is a sequence of 1s and 0s, and a wildcard string is a sequence of 1, 0, and ?s. All strings and wildcard strings have the ...
4
votes
1answer
191 views
How to calculate this string-dissimilarity function efficiently?
Migrated from stackoverflow.
Hello,
I was looking for a string metric that have the property that moving around large blocks in a string won't affect the distance so much. So "helloworld" is ...
13
votes
6answers
1k views
Computing the Levenshtein distance quickly
Given a huge database of allowed words (alphabetically sorted) and a word, find the word from the database that is closest to the given word in terms of Levenshtein distance.
The naive approach is, ...
0
votes
0answers
376 views
Rabin karp string searching algorithm analysis [closed]
My prof said Rabin-Karp analysis in Introduction to algorithms by Cormen, Leiserson, Rivest, and Stein is wrong. Can someone point me to the correct analysis.
Thanks
2
votes
1answer
267 views
Using compression to improve edit distance computation
I am doing a seminar on a paper titled "Unified Compression-Based Acceleration
of Edit-Distance Computation" that uses straight-line programs to improve edit distance computation. It is a common ...