Is there any way to populate a full-text search index without storing the backing information in the database? I don't mean using remote BLOB storage or FILESTREAM types. The backing information is a composite of some existing columns and the text of files on a UNC share; the number of files is quite large, and storing them in the database is not practical.
I need to be able to have a record like:
| col_a  | col_b |
|--------|-------|
| ccc dd | ee ff |
Then I need to be able to search for `"ccc AND ff"` without having to search for `"ccc"` and `"ff"` separately, because there is a significant performance hit when doing this with more than two values (we will frequently be searching for more than 10 words across ~15 columns). Additionally, the value of col_b might be stored in a file on the UNC share and not actually be in the database at all.
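For reference, the kind of single-query search described above could be written as the following sketch, assuming a hypothetical full-text indexed table named `docs` with the two columns shown:

```sql
-- A single CONTAINS call can search several full-text indexed
-- columns for all terms at once (table name "docs" is hypothetical).
SELECT id
FROM docs
WHERE CONTAINS((col_a, col_b), '"ccc" AND "ff"');
```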
The approach I am currently thinking of is to write a custom IFilter and register a new extension for it, so that my table for full-text searching looks like this:

```
id        BIGINT
extension CHAR(3)
content   VARBINARY(8)
```
The content column would hold the BIGINT key of another table; the IFilter would do a SQL read to fetch the backing data and then return the composite value to the FTS indexer.
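As a sketch, the stub table and its full-text index might be declared like this (the index, constraint, and catalog names are hypothetical); the `TYPE COLUMN` clause is what routes indexing of the `content` column through the IFilter registered for that extension:

```sql
CREATE TABLE fts_stub (
    id        BIGINT NOT NULL CONSTRAINT PK_fts_stub PRIMARY KEY,
    extension CHAR(3) NOT NULL,       -- made-up extension bound to the custom IFilter
    content   VARBINARY(8) NOT NULL   -- the 8-byte BIGINT key of the backing row
);

-- TYPE COLUMN tells the FTS indexer which IFilter to invoke for each row.
CREATE FULLTEXT INDEX ON fts_stub (content TYPE COLUMN extension)
    KEY INDEX PK_fts_stub ON fts_catalog;  -- hypothetical catalog name
```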
The only problem I can see with this is that if my backing data is updated, this table wouldn't be, and so FTS would not re-index the new values; but that can be fixed with an update trigger on the other table.
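A minimal version of that trigger might look like the following, assuming a hypothetical backing table named `backing` and the stub table `fts_stub` from above; touching the indexed column should be enough for FTS change tracking to see the row as modified and re-index it:

```sql
CREATE TRIGGER trg_backing_update ON backing
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Rewrite the indexed column on matching stub rows so the FTS
    -- change-tracking mechanism treats them as modified.
    UPDATE s
    SET s.content = s.content
    FROM fts_stub AS s
    JOIN inserted AS i ON s.id = i.id;
END;
```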
Can anyone see a problem with this implementation, or suggest a better one?
FILETABLE? – Jon Seigel May 21 at 17:19