SQL Server 2008 FileStream Full Text Search

I have a SQL Server 2008 database with a table that uses FILESTREAM. For the rest of this question, I will refer to this table as Tbl_FileStream.

Tbl_FileStream contains hundreds of thousands of files, from PDF to JPG to TXT files.

Also, Tbl_FileStream has a full-text index on the FILESTREAM column. The full-text index works wonderfully, and I have a stored procedure that does a full-text search (using CONTAINSTABLE and RANK), and it works great.

However, I am in a pickle regarding what full-text search can tell me about where a hit occurred inside the FILESTREAM data. For example, if we search for the phrase "et dolore", my search returns results indicating that 59 documents match the query. Of course, I can get the names of the matching documents, because I store the document titles in Tbl_FileStream, but I really need to get the text surrounding the search term in the actual file.

For example, suppose I have a text document with the following Latin text:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam unumy eirmod time invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. In vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

Using SQL Server's full-text search capabilities, if I search for the words "et dolore", what I really need returned is some arbitrary number of words (about 10 or so) around the place where the match was found in the document, so that I get a phrase like "... sed diam unumy eirmod time invidunt ut labore et dolore ...".

Is this possible in SQL Server 2008?


SQL Server 2008, , . , OCR , . , , - 2 .

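One option is to keep the text of each file in an additional column:

TextContents nvarchar(max) null

With that column in place, a query along these lines returns the text around the match: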

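-- CONTAINS does not accept nvarchar(max) variables, so use bounded lengths
Declare @SearchTerm nvarchar(200)
Declare @FullTextCondition nvarchar(210)
Declare @MaxResultTextLen int

Set @SearchTerm = 'et dolore'
-- A multi-word phrase must be wrapped in double quotes for CONTAINS
Set @FullTextCondition = '"' + @SearchTerm + '"'
Set @MaxResultTextLen = 100

Select  CharIndex(@SearchTerm, F.TextContents) As MatchPosition,
    Case 
    -- Match near the start (or not found literally): return the leading characters
    When CharIndex(@SearchTerm, F.TextContents) <= @MaxResultTextLen 
        Then Substring(F.TextContents, 1, @MaxResultTextLen) + '...'
    -- Otherwise return a window that ends at the end of the match
    Else Substring(F.TextContents
        , CharIndex(@SearchTerm, F.TextContents) 
                - @MaxResultTextLen + Len(@SearchTerm)
        , @MaxResultTextLen) + '...'
    End As TextContext
From Files As F
Where Contains(F.TextContents, @FullTextCondition)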
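
The query above returns a window that ends at the match. If context on both sides of the hit is wanted, as in the question's example, a variant can center the window instead. The following is only a sketch: the table variable @Docs, the sample row, and the @Padding value are hypothetical stand-ins so that it runs without a full-text index, and it locates the hit with a plain CHARINDEX rather than with CONTAINS.

```sql
-- Hypothetical sample data standing in for the extracted file text
DECLARE @Docs TABLE (Id int, TextContents nvarchar(max));
INSERT INTO @Docs VALUES
    (1, N'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam unumy eirmod time invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.');

DECLARE @SearchTerm nvarchar(200) = N'et dolore';
DECLARE @Padding int = 40;  -- characters of context on each side (assumed value)

SELECT  D.Id,
        -- Leading ellipsis only when the window does not start at character 1
        CASE WHEN P.Pos > @Padding + 1 THEN N'... ' ELSE N'' END
        + SUBSTRING(D.TextContents,
                    -- Clamp the window start to the beginning of the text
                    CASE WHEN P.Pos > @Padding THEN P.Pos - @Padding ELSE 1 END,
                    LEN(@SearchTerm) + 2 * @Padding)
        -- Trailing ellipsis is always appended, even if the text ends inside the window
        + N' ...' AS TextContext
FROM    @Docs AS D
CROSS APPLY (SELECT CHARINDEX(@SearchTerm, D.TextContents) AS Pos) AS P
WHERE   P.Pos > 0;  -- keep only rows where the literal term occurs
```

Because CHARINDEX does a literal comparison, rows that the full-text engine matches through word breaking or inflection may not be located this way. The window is also measured in characters, so it can cut a word in half at either edge.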

, , , - , . , , PDF OCR'd, . , dtSearch ( ), , " ", ​​ .

dtSearch


, , , Sql Server.


Source: https://habr.com/ru/post/1786019/

