Showing posts with label keyword. Show all posts
Showing posts with label keyword. Show all posts

Thursday, March 29, 2012

Does SQLSERVER support fuzzy text searching(like the function of agrep).

Does SQLSERVER support fuzzy text searching(like the function of agrep).
For example, given the input keyword [homogenos] and similarity
parameter [2 characters], the function will find out valid result from
datasource by either replacing,inserting or deleting upto two different
characters from word [homogenos].
Both homogenooos(homogeno[+oo]s) and homogeos(homoge[-n]os are valid
result.
thx.No, not directly. You need to build a function that does Levenstein Edit
distance. You will find an implementation of this in the Fuzzy functions
which ship with SSIS in SQL 2005.
You can also use the expansion options in the thesaurus capabilities in
FullText search so a search on homogenos could be expanded to search on
homogenos homogenoos, homogeneous or homogeos, but you have to know in
advance what all the expansions might be and hard code them into your
thesaurus file.
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"zlf" <zlfcn@.hotmail.com> wrote in message
news:uGyRG55YGHA.4944@.TK2MSFTNGP02.phx.gbl...
> Does SQLSERVER support fuzzy text searching(like the function of agrep).
> For example, given the input keyword [homogenos] and similarity
> parameter [2 characters], the function will find out valid result from
> datasource by either replacing,inserting or deleting upto two different
> characters from word [homogenos].
> Both homogenooos(homogeno[+oo]s) and homogeos(homoge[-n]os are valid
> result.
> thx.
>

Does SQL Server Support the "MINUS" Keyword?

Maybe there's another better way to do this... Anyway, here's what I'm trying to do:

I have two tables CarType & Cars. Table CarType has a column CarTypeId which is the primary key (int, identity). Table Cars includes a column CarTypeId (int) which is a foreign key into the CarType table.

Now I am trying to build a list of all those CarTypeId entries (from the CarType table) for which there are no Cars (i.e. there is no corresponding entry in the Cars table).

I read that there is an SQL "MINUS" keyword that you can use like this:

SELECT CarTypeId FROM CarType
MINUS
SELECT CarTypeId FROM Cars

So if CarType contains records with the keys {1, 2, 3, 4, 5} and Cars contains records with the foreign keys {2, 4} (using set notation), your result should be the set {1, 3, 5}. (Note: I have tried alternate versions of this where I selected the DISTINCT CarTypeId entries from Cars, and where I sorted them; results same in both cases).

I tried it in SQL Server 2005 Express. The result is just {1, 2, 3, 4, 5}.

My questions:

    Is the MINUS capability supported by SQL Server? I know that there are all sorts of different SQL dialects & implementations and that SQL has evolved & transmutated over the years; perhaps they opted to leave this out of SQL Server... If not supported, well then I guess I have to throw myself at the feet of the real SQL gurus and ask if there's another way to accomplish this.

Here's a reference to the website where I initially found out about MINUS:
http://www.1keydata.com/sql/sql-minus.html

Thanks in advance for your time/comments.

I think you want EXCEPT

ms-help://MS.VSCC.v80/MS.VSIPCC.v80/MS.SQLSVR.v9.en/tsqlref9/html/b1019300-171a-4a1a-854f-e1e751de3565.htm

|||You can also use a NOT EXISTS to find records that don't exist in the other table|||

Thanks Simon for your comments and pointing me in the right direction.

This is what worked for me:

SELECT CarTypeId FROM CarType
WHERE (CarTypeId NOT IN
(SELECT DISTINCT CarTypeId FROM Cars)
)

Still interested to know what became of the MINUS keyword though, because I think that syntax is slicker...

Ian

|||

SQL Server has never supported MINUS.

in SQL 2005 it now has the EXCEPT clause which does the same thing

|||

Caution: Microsoft talks about this issue at https://www.microsoft.com/technet/prodtechnol/sql/2000/reskit/part2/c0761.mspx?mfr=true. However, it is incomplete in that it does not address a curveball.

That is, the = operator does not address nulls.

The Microsoft example is as follow:

Oracle Microsoft SQL Server

SELECT CCODE, CNAME
FROM DEPT_ADMIN.CLASS
MINUS
SELECT C.CCODE, C.CNAME
FROM STUDENT_ADMIN.GRADE G,
DEPT_ADMIN.CLASS C
WHERE C.CCODE = G.CCODE

SELECT CCODE, CNAME
FROM DEPT_ADMIN.CLASSC
WHERE NOT EXISTS
(SELECT 'X' FROM
STUDENT_ADMIN.GRADE G
WHERE C.CCODE = G.CCODE)

This will work to identify students in one table but not in another. However, it will not work like Oracle's MINUS command to spot differences between tables. So, say you want to see this -- between the 2 tables, find where students are missing AND students whose names have changed -- either from A to B or from not null to null or from null to not null.

Try this:

SELECT CCODE, CNAME
FROM DEPT_ADMIN.CLASS C
WHERE NOT EXISTS
(SELECT 'X' FROM
STUDENT_ADMIN.GRADE G
WHERE C.CCODE = G.CCODE

and (C.CNAME = G.CNAME or (C.CNAME is null and G.NAME is null) )

That will mimic the full functionality of Oracle's MINUS command in SQL Server 2000 T-SQL.

|||

Hi Ian,

I havent come across the Minus in SQL server but to do what you want I would try the following

SELECT T.cartypeid

FROM cartype as T

LEFT OUTER JOIN cars C

ON T.cartypeID = C.cartypeID

WHERE c.caretypeID is null

This joins everything in the cartype table but the where clause restricts the result set to those records with a null car.cartypeID

i.e. no matching record

Does SQL Server Support the "MINUS" Keyword?

Maybe there's another better way to do this... Anyway, here's what I'm trying to do:

I have two tables CarType & Cars. Table CarType has a column CarTypeId which is the primary key (int, identity). Table Cars includes a column CarTypeId (int) which is a foreign key into the CarType table.

Now I am trying to build a list of all those CarTypeId entries (from the CarType table) for which there are no Cars (i.e. there is no corresponding entry in the Cars table).

I read that there is an SQL "MINUS" keyword that you can use like this:

SELECT CarTypeId FROM CarType
MINUS
SELECT CarTypeId FROM Cars

So if CarType contains records with the keys {1, 2, 3, 4, 5} and Cars contains records with the foreign keys {2, 4} (using set notation), your result should be the set {1, 3, 5}. (Note: I have tried alternate versions of this where I selected the DISTINCT CarTypeId entries from Cars, and where I sorted them; results same in both cases).

I tried it in SQL Server 2005 Express. The result is just {1, 2, 3, 4, 5}.

My questions:

    Is the MINUS capability supported by SQL Server? I know that there are all sorts of different SQL dialects & implementations and that SQL has evolved & transmutated over the years; perhaps they opted to leave this out of SQL Server... If not supported, well then I guess I have to throw myself at the feet of the real SQL gurus and ask if there's another way to accomplish this.

Here's a reference to the website where I initially found out about MINUS:
http://www.1keydata.com/sql/sql-minus.html

Thanks in advance for your time/comments.

I think you want EXCEPT

ms-help://MS.VSCC.v80/MS.VSIPCC.v80/MS.SQLSVR.v9.en/tsqlref9/html/b1019300-171a-4a1a-854f-e1e751de3565.htm

|||You can also use a NOT EXISTS to find records that don't exist in the other table|||

Thanks Simon for your comments and pointing me in the right direction.

This is what worked for me:

SELECT CarTypeId FROM CarType
WHERE (CarTypeId NOT IN
(SELECT DISTINCT CarTypeId FROM Cars)
)

Still interested to know what became of the MINUS keyword though, because I think that syntax is slicker...

Ian

|||

SQL Server has never supported MINUS.

in SQL 2005 it now has the EXCEPT clause which does the same thing

|||

Caution: Microsoft talks about this issue at https://www.microsoft.com/technet/prodtechnol/sql/2000/reskit/part2/c0761.mspx?mfr=true. However, it is incomplete in that it does not address a curveball.

That is, the = operator does not address nulls.

The Microsoft example is as follow:

Oracle Microsoft SQL Server

SELECT CCODE, CNAME
FROM DEPT_ADMIN.CLASS
MINUS
SELECT C.CCODE, C.CNAME
FROM STUDENT_ADMIN.GRADE G,
DEPT_ADMIN.CLASS C
WHERE C.CCODE = G.CCODE

SELECT CCODE, CNAME
FROM DEPT_ADMIN.CLASSC
WHERE NOT EXISTS
(SELECT 'X' FROM
STUDENT_ADMIN.GRADE G
WHERE C.CCODE = G.CCODE)

This will work to identify students in one table but not in another. However, it will not work like Oracle's MINUS command to spot differences between tables. So, say you want to see this -- between the 2 tables, find where students are missing AND students whose names have changed -- either from A to B or from not null to null or from null to not null.

Try this:

SELECT CCODE, CNAME
FROM DEPT_ADMIN.CLASS C
WHERE NOT EXISTS
(SELECT 'X' FROM
STUDENT_ADMIN.GRADE G
WHERE C.CCODE = G.CCODE

and (C.CNAME = G.CNAME or (C.CNAME is null and G.NAME is null) )

That will mimic the full functionality of Oracle's MINUS command in SQL Server 2000 T-SQL.

|||

Hi Ian,

I havent come across the Minus in SQL server but to do what you want I would try the following

SELECT T.cartypeid

FROM cartype as T

LEFT OUTER JOIN cars C

ON T.cartypeID = C.cartypeID

WHERE c.caretypeID is null

This joins everything in the cartype table but the where clause restricts the result set to those records with a null car.cartypeID

i.e. no matching record