Have you ever wondered how would you convert that SQL column which is comma delimited into Rows so that you can effectively join it with other tables or use that infomration better. This might happen for those of you who have saved the keywords lets say of a blog application in a single column delimited with comma, or you might be using an old DotNetNuke version where the RoleIDs are in a ; delimited string saved in one column and you want to use that information in a better and easier way, or for any other reasons that you might needed. Well here is an easy solution which you could do in one select, thanks to the XML Data Type functions.
So lets say you have a table like this
And your data is like this
And you want to generate a report like this
Keywords Count -------- ----------- AA 3 BB 3 CC 2 DD 2 EE 1 FF 3 GG 1 HH 1 JJ 1 KK 1 MM 2 TT 1 UU 1 WW 1 XX 1 (15 row(s) affected)
First you have to select the Keywords, you can already easily view the parsed column by doing this query
WITH myCommonTblExp AS ( SELECT CompanyID, CAST('<I>' + REPLACE(Keywords, ',', '</I><I>') + '</I>' AS XML) AS Keywords FROM Companies ) SELECT CompanyID, RTRIM(LTRIM(ExtractedCompanyCode.X.value('.', 'VARCHAR(256)'))) AS Keywords FROM myCommonTblExp CROSS APPLY Keywords.nodes('//I') ExtractedCompanyCode(X)
now knowing that you can do that, all you have to do is to group them and count, but you cannot group XML methods so my suggestion is create a view of the query above
CREATE VIEW [dbo].[DissectedKeywords] AS WITH myCommonTblExp AS ( SELECT CAST('<I>' + REPLACE(Keywords, ',', '</I><I>') + '</I>' AS XML) AS Keywords FROM Companies ) SELECT RTRIM(LTRIM(ExtractedCompanyCode.X.value('.', 'VARCHAR(256)'))) AS Keywords FROM myCommonTblExp CROSS APPLY Keywords.nodes('//I') ExtractedCompanyCode(X) GO
and perform your count on that view
SELECT Keywords, COUNT(*) AS KeyWordCount FROM DissectedKeywords GROUP BY Keywords ORDER BY Keywords
See its simple!
Hey, I was just wondering what the ExtractedCompanyCode(x) is.
I’m trying to apply this to a description column to find the most common words.
In my table: CompanyId=Code, Keywords=Description, Companies =Material. But I can’t understand what the ExtractedCompanyCode(x) is.
Thank you!