Friday, February 17, 2012

Document stored as image, problem with inflectional search

Hello,
I'm storing documents (txt, html, etc) as images in my database, and can
run regular 'contains' searches on my data just fine, but when I use
formsof(inflectional,word), it will only return results that contain the
word exactly, not any inflectional forms of it.
Is this some known issue about the way indexing is done or how SQL
processes images? Or could something be wrong on my end?
My @@version is:
Microsoft SQL Server 2000 - 8.00.194 (Intel X86)
Aug 6 2000 00:57:48
Copyright (c) 1988-2000 Microsoft Corporation
Enterprise Edition on Windows NT 5.0 (Build 2195: Service Pack 4)
Thanks!
- Parhaum
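For reference, the two kinds of query being compared in this question look roughly like the sketch below. The table and column names (Documents, Content, DocId) are hypothetical placeholders, not names from the poster's database, and the example word "drive" is only an illustration.

```sql
-- Hypothetical names: table Documents, full-text indexed column Content,
-- key column DocId. Substitute your own full-text enabled table.

-- Exact-term search: matches only the literal word "drive".
SELECT DocId
FROM Documents
WHERE CONTAINS(Content, '"drive"');

-- Inflectional search: should also match forms such as "drives",
-- "drove", "driving" and "driven", provided the language configured
-- for the column supports stemming.
SELECT DocId
FROM Documents
WHERE CONTAINS(Content, 'FORMSOF(INFLECTIONAL, drive)');
```

If both queries return the same rows on data that is known to contain other forms of the word, the inflectional expansion is not taking effect.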
Parhaum,
First of all, thank you for providing the @@version info, as this is most
important in troubleshooting SQL FTS issues such as this one! Secondly, can
you post the exact formsof(inflectional,word) query and the word or phrase
that you are having problems with, along with the output of the following
SQL code?
EXEC sp_help_fulltext_columns
EXEC sp_help <your_FT-enabled_table_name_here>
Several words are known not to generate the correct or expected inflectional
forms when used with the US English word breaker; see this thread
(http://www.webservertalk.com/archive.../t-965368.html) for more details.
Thanks,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
|||Did you use the ms.locale meta tag to indicate the language your HTML
docs are written in?
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
|||First of all, thank you both for your quick replies! Regarding the locale,
I can probably check, but it's a database in my company that I'm developing
with, so I'd need to ask the DBA about it.
Second, background on the database. The table with the image data is
J_DOCU_CONTENT, and I have uploaded many files including 3 sample .txt
files with text like "park services" and "servicing the shuttle". There's
a table called T_DOCUMENT that contains some other information, but even
running the last query without the T_DOCUMENT information returns the same
result.
Trying to do an inflectional search on "service" returns 0 results. An
inflectional search on "services" returns the park services file, and so
on, as if it were a 'contains' exact match search.
:: Result of:
:: EXEC sp_help_fulltext_columns;
TABLE_OWNER              <x>
TABLE_ID                 <id>
TABLE_NAME               J_DOCU_CONTENT
FULLTEXT_COLUMN_NAME     DATA
FULLTEXT_COLID           4
FULLTEXT_BLOBTP_COLNAME  DOCUMENT_TYPE
FULLTEXT_BLOBTP_COLID    3
FULLTEXT_LANGUAGE        0
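One value in this output worth a closer look is FULLTEXT_LANGUAGE 0. As far as I know, LCID 0 is the neutral language, and the neutral word breaker does not produce inflectional forms, which would be consistent with FORMSOF behaving like an exact match. The sketch below shows how the column might be re-registered with US English (LCID 1033). It is untested against this database and makes assumptions: that 1033 is the right language for the documents, that the table may need to be deactivated before its full-text columns are changed, and that a full repopulation is acceptable. Try it on a non-production copy first.

```sql
-- Check the language currently registered for the full-text column.
EXEC sp_help_fulltext_columns 'J_DOCU_CONTENT', 'DATA';

-- Assumption: take the table out of full-text indexing while the
-- column definition is changed.
EXEC sp_fulltext_table 'J_DOCU_CONTENT', 'deactivate';

-- Drop the column from the index, then add it back with LCID 1033
-- (US English), keeping DOCUMENT_TYPE as the document type column.
EXEC sp_fulltext_column
     @tabname = 'J_DOCU_CONTENT',
     @colname = 'DATA',
     @action  = 'drop';

EXEC sp_fulltext_column
     @tabname      = 'J_DOCU_CONTENT',
     @colname      = 'DATA',
     @action       = 'add',
     @language     = 1033,
     @type_colname = 'DOCUMENT_TYPE';

-- Reactivate and rebuild the index so existing rows are re-broken
-- with the new language.
EXEC sp_fulltext_table 'J_DOCU_CONTENT', 'activate';
EXEC sp_fulltext_table 'J_DOCU_CONTENT', 'start_full';
```

After the population finishes, running sp_help_fulltext_columns again should show whether FULLTEXT_LANGUAGE changed, and the inflectional query can be retried.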
:: Result of:
:: EXEC sp_help J_DOCU_CONTENT;
Name            Owner  Type        Created_datetime
J_DOCU_CONTENT  <x>    user table  2005-04-14 15:38:09.013

Column_name    Type     Computed  Length  Prec  Scale  Nullable  TrimTrailingBlanks  FixedLenNullInSource  Collation
PRIMARY_KEY    int      no        4       10    0      no        (n/a)               (n/a)                 NULL
VERSION        int      no        4       10    0      no        (n/a)               (n/a)                 NULL
DOCUMENT_TYPE  varchar  no        250                  no        no                  no                    SQL_Latin1_General_CP1_CI_AS
DATA           image    no        7000                 no        (n/a)               (n/a)                 NULL

No identity column defined.
No rowguidcol column defined.

constraint_type
PRIMARY KEY (clustered)
:: Result of:
:: SELECT t1.NAME FROM J_DOCU_CONTENT t0, T_DOCUMENT t1 WHERE
::(contains(t0.DATA, ' FORMSOF(INFLECTIONAL, service) ')
::AND
::(t0.PRIMARY_KEY = t1.CONTENT_ID))
NAME
(0 row(s) affected)
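A possible way to narrow this down, sketched with the table and column names from this post: the first statement is the same query written with an explicit join, and the second spells out the inflectional forms by hand. The list of forms in the second query is my own guess at what the stemmer would be expected to generate, not something taken from SQL Server.

```sql
-- The query from the post, rewritten with an explicit join.
SELECT t1.NAME
FROM J_DOCU_CONTENT AS t0
INNER JOIN T_DOCUMENT AS t1
    ON t0.PRIMARY_KEY = t1.CONTENT_ID
WHERE CONTAINS(t0.DATA, 'FORMSOF(INFLECTIONAL, service)');

-- Diagnostic: list the expected forms manually (hand-picked list).
SELECT t0.PRIMARY_KEY
FROM J_DOCU_CONTENT AS t0
WHERE CONTAINS(t0.DATA,
      '"service" OR "services" OR "servicing" OR "serviced"');
```

If the second query returns the "park services" and "servicing the shuttle" files while the first returns nothing, the words are in the index and only the inflectional expansion is missing.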
Message posted via http://www.droptable.com
|||Yikes. Tried to format it from the direct SQL output, and not sure now
which ended up looking worse...
