The search tag has no wiki summary.
0
votes
1answer
64 views
Full Text Indexing Strategy for MS Excel Documents
Background
As part of a broader application that allows users to search thousands of MS Office documents on a private network, I need to index and make searchable Microsoft Excel files.
My basic ...
0
votes
2answers
35 views
Live search/filter as you type in client approach
As an exercise for myself to practice my JavaScript "skills" I'm trying to write (let's reinvent) a client-side filter. It should be able to filter "content blocks" as the client types.
With a ...
-2
votes
0answers
43 views
How to architect a grouped search
Let's say I have a movie-related application,
which consists of three tables/collections in MongoDB:
Movies
Actors
Directors
When a user starts searching, I want to provide results from all 3 tables ...
0
votes
0answers
32 views
Search substring using suffix array (and LCP)
I'm searching for the best way to improve suffix array run time using LCP.
My text (about 2 500 000 chars) looks like: 0ricco0eric0america0polo0....
My thoughts:
SA=suffixArray
char=firstChar(input)
s ...
4
votes
1answer
252 views
Writing a spell checker similar to “did you mean”
I'm hoping to write a spellchecker for search queries in a web application - not unlike Google's "Did you mean?" The algorithm will be loosely based on this: http://catalog.ldc.upenn.edu/LDC2006T13
...
0
votes
1answer
82 views
Natural Language to Search Criteria - Date Ranges
Consider an application that stores a set of records that contain:
Description
Cost
Purchase Date
I'd like to be able to allow users to utilize natural language to search the dataset.
For ...
3
votes
4answers
150 views
fast n-gram access data structure
TL;DR
Is there a data structure that'd quickly let me match words at any point (e.g., 'foo' matches 'foobar' and 'zoofoo'), and, ideally, returns a list of "characters that show up after the needle" ...
1
vote
1answer
54 views
Determining “value” in multi-agent microeconomical simulation
I am trying to determine an objective way for a self-interested agent to calculate the optimal buying/selling price for goods in a multi-agent simulation not dissimilar to Sugarscape ...
0
votes
0answers
39 views
Search technique - multiple hits, prioritise results
I have developed a search algorithm, which basically matches records on 3 different criteria types: Name, Address, and Keyword(s).
"doh", will find:
Name: doh
Address: re
Keywords: me
...
0
votes
2answers
116 views
Microeconomical simulation: coordination/planning between self-interested trading agents
In a typical perfect-information strategy game like Chess, an agent can calculate its best move by searching the state tree for the best possible move, while assuming that the opponent will also make ...
3
votes
1answer
272 views
modelling a restaurant availability search
I am looking into finding an efficient way which can scale up to thousands of restaurants, for doing a reservation search. Ideally, it would be efficient to answer queries like find a restaurant ...
0
votes
4answers
296 views
Binary Search seems superior, why did the committee of C++ still have Find in the algorithm library?
I wish to search for an integer in a vector of integers. I have two candidates for the job:
Binary Search
Find
It seems that Binary Search is the best candidate for the job as although I have to ...
3
votes
3answers
283 views
storing and retrieving millions of documents using c#
I am working on an integration project, where my “app/web service” will sit in the middle serving documents.
Basically, a request is sent with the document id as part of the query string, I check if ...
11
votes
1answer
445 views
How is machine learning incorporated into search engine design?
I am currently building a small in-house search engine based on Apache Lucene. Its purpose is simple - based on some keywords, it will suggest some articles written internally within our company. I am ...
1
vote
1answer
118 views
How to calculate the worst-case runtime of this search-algorithm
I've written a special indexOf function for a list of unsorted unique values.
I can search for one or multiple (unsorted) values, passed as array/list, and the function will return me an array/list ...
0
votes
0answers
81 views
Searching in a large dataset
I am working on a model of social interactions in mice. I have mice and boxes and a simulation that outputs which mouse stays in which box during which time period. The problem is how to obtain, in ...
2
votes
2answers
241 views
How should I make searching a relational database more efficient? [duplicate]
This is in the scope of a web application. I have a database which has a few nested relations. There is a feature which depicts the history of a large chain of relations. It is essentially a data ...
0
votes
0answers
67 views
Sharding / indexing strategy for multi-faceted search
I'm currently thinking about our database structure and how we modify it for scale. Specifically, we're thinking about using ElasticSearch to provide our search functionality.
One common pattern with ...
-1
votes
2answers
91 views
Web Search for a Hard Drive [closed]
Here is the situation. Our organization has a fair amount of data in the form of documents, images, and videos stored on an intranet server.
We need to be able to expose these documents via some sort of ...
-1
votes
3answers
507 views
Find all line segment intersections
I have a collection of line segments, represented by an array.
Ex: [3,7,13,6,9] is 4 line segments: [(3,7)(7,13)] , [(7,13)(13,6)] , [(13,6)(6,9)] , [(6,9)(9,3)]
I want to find all the lines ...
0
votes
2answers
1k views
Using Lucene and SQL Server together. Newbie needs directions [closed]
Basically the whole thing can be explained simply:
I need to index one or more SQL Server 2005 databases with Lucene so I can search the various records.
I found a lot of examples and documentation ...
0
votes
3answers
415 views
A* search for Sudoku
I have a homework problem for an Artificial Intelligence course that I am having trouble answering.
Consider solving the Sudoku problem using A* search. The start state
has some number of cells ...
1
vote
2answers
349 views
Search algorithm
I would like to create a site where users can post articles with the following optional parts:
A title
Contents (text)
Categories
Keywords
Articles will be stored in MongoDB and the site will be ...
2
votes
3answers
196 views
How to combine search words? AND or OR?
I have a basic search in my webpage. When I designed it, I chose to combine the search box inputs with OR. For example: A search for foo bar will be translated to foo OR bar, so every entry which ...
0
votes
2answers
201 views
How can I conceptually model a craigslist search?
I'm trying to understand how something like this works, but I'm inexperienced and I'm trying to understand how the process would work.
Say you have ten categories, a thousand zip codes, and ten ...
0
votes
0answers
42 views
Solving point in interval queries
There are n intervals given by starting (a[1], a[2], ..., a[n]) and ending points (b[1], b[2], ..., b[n]) and m queries of the form: given an integer x find the indices of the intervals which contain ...
2
votes
1answer
243 views
Use a search box that calls on a JSON file? [closed]
I use a JSON file to populate several drop down lists.
The format is:
{
"value" :"lightyear",
"name" :"Light Year(yl)"
},
{
"value" :"astronomicalUnit",
...
1
vote
2answers
678 views
Finding duplicate files? [duplicate]
I am going to be developing a program that detects duplicate files and I was wondering what the best/fastest method would be to do this? I am more interested in what the best hash algorithm would be ...
1
vote
0answers
72 views
How to map the English dictionary to UNSPSC codes?
Is there a db which maps the words from the English dictionary to the UNSPSC codes? (http://www.unspsc.org/)
My problem is the following:
I am building a search system. And the customer searches for ...
1
vote
2answers
443 views
Implementing search over large data set, PHP or Mysql stored procedure?
I'm building an Online Food Ordering System with PHP and MySQL. One of the features of the application is to allow users to search for restaurants by typing the area name.
I would like to know ...
0
votes
3answers
133 views
Binary Search Programming implementation
Binary search, as we all know, requires the elements to be sorted. But we have to take care of unsorted elements too, in the worst case. If the input size is very large, is it a good idea to sort the ...
0
votes
0answers
68 views
Considerations for beginning work on a unified search
I have become interested in creating a unified search for a corporate asset management database. My goal is to allow users to submit queries like:
stuff in building 3210
stuff in building 3210 owned ...
8
votes
1answer
274 views
heuristic for searching through non-perfectly sorted data
Given sorted data, the search solution is obvious. Given unsorted data, sensible options are sort then search or just linear search.
This question is about what to do if the data is somewhat sorted, ...
2
votes
4answers
673 views
Querystring Advanced Search where there are about 20 search fields
I am creating an advanced search page where there are about 20 search fields for a user to filter their search. My question deals with the query string: is it standard web development practice to have ...
0
votes
2answers
530 views
What is a good algorithm for priority allocation of work duties?
I am currently doing a project (in PHP) that has the following requirements:
There is a list of people, sorted in a certain priority. Work should be allocated to them by this priority. e.g. If the ...
2
votes
1answer
101 views
Auto-completion or Suggest
How do Google or Amazon implement auto-suggestion in their search boxes? I am looking for the most used algorithm and technology stack.
PS: I have searched over the net and found this and this ...
6
votes
1answer
223 views
What is “the right way” to do search on a website?
I'm talking the kind of search that auto-suggests your query as you type, the way Google does, the way Wikipedia does, the way Stack Exchange suggests other questions as you type the title, etc. And ...
-1
votes
1answer
245 views
Address search from large text file
Basically I want to develop an address lookup (part of my project) using C# (and I can use SQL if necessary). I have a very large text file which has all the UK addresses and postcodes. Addresses need ...
2
votes
2answers
159 views
doing a full permutation search and replace on a string
I'm writing an app that does something like a custom number (license) plate generator tool where if I ask for the plate "robin" it will suggest I try:
r0bin
rob1n
r0b1n
Are there any published ...
8
votes
1answer
774 views
Good technique for search text tokenization
We are looking for a way to tokenize some text in the same or similar way as a search engine would do it.
The reason we are doing this is so that we can run some statistical analysis on the tokens. ...
-1
votes
1answer
77 views
Verify uniqueness of new content
I'm working on a review site, where there is a minor issue with almost duplicate reviews across items. Just a few words are changed. It would be very nice to be able to uncover these duplicates before ...
5
votes
2answers
270 views
Is it possible (and practical) to search a string for arbitrary-length repeating patterns?
I've recently developed a huge interest in cryptography, and I'm exploring some of the weaknesses of ECB-mode block ciphers. A common attack scenario involves encrypted cookies, whose fields can be ...
3
votes
3answers
429 views
Data structure: sort and search effectively
I need to have a data structure with, say, 4 keys. I can sort on any of these keys. What data structure can I opt for? Sorting time should be very little.
I thought of a tree, but it will only help ...
3
votes
1answer
261 views
What technology/algorithm should be used to abstract meaning or keywords from a passage of text?
Hi and thanks for looking!
Background
I have a project wherein I need to abstract meaning from a passage of text to determine what the text is seeking and then match that text to a list of search ...
1
vote
2answers
143 views
Fuzzy search for a sub-string without tokens
Let's say I have the following lines:
Lorem ipsum dolor sit amet, (tag) consectetur adipiscing elit.
Phasellus congue nisi vel lorem dignissim tristique. (tag)
Etiam vulputate lacus nec velit ...
1
vote
1answer
149 views
DB technology for efficient search in tabular data?
We have a repository of tables. Around 200 tables, each table can be thousands of rows, all tables are originally in Excel sheets.
Each table has a different schema. All data is text or numbers.
We ...
12
votes
8answers
2k views
Find a “hole” in a list of numbers
What is the fastest way to find the first (smallest) integer that doesn't exist in a given list of unsorted integers (and that is greater than the list's smallest value)?
My primitive approach is ...
1
vote
1answer
415 views
Is there a more efficient way to filter large arrays than preg_match()?
I have a log that our web application builds. Each month it contains around 16,000 entries, each a string of about an average sentence's worth of text.
To filter/search through these in our admin panel ...
1
vote
1answer
792 views
Tineye.com search algorithm?
I was wondering how TinEye carries out a search. Does it store all the images and then extract EXIF data? Which in turn must be stored in a database and queried against. So probably it is using some ...
7
votes
1answer
198 views
How important is index size when searching?
My company has recently begun using Apache Solr to search its data. As we learn how to use it we have gone down the path of indexing multiple fields to get the results we need. Most of these are ...