Parsing domain names form URL
1. Define a variable
2. Get the domain name from url using Request.Url.Host
3. Assign that value to the variable declared in Step1 (Like Dts.Variables("variableName")=<ValueFromStep2>).
March 21st, 2011 3:33pm
You could also use a regex function in a
Script Component. (but you need to understand .net)
You only need to find the correct regular expression. For example:
http://regexlib.com/Search.aspx?k=URL
Please mark the post as answered if it answers your question | My SSIS Blog:
http://microsoft-ssis.blogspot.com
Free Windows Admin Tool Kit Click here and download it now
March 21st, 2011 3:45pm
I have a need to parse domain names out of URL strings, Has anyone done this before or can point me to the right directions?
thanks,
Rok
March 21st, 2011 4:38pm
Are you using SSIS to achieve this? If there is a pattern in the url string (which should be there I suppose) then you can extract domain name from the url string using string functions.
Nitesh Rai- Please mark the post as answered if it answers your question
Free Windows Admin Tool Kit Click here and download it now
March 21st, 2011 4:41pm
Thanks all for your responses and tips! Although there were different ways to tackle, I needed the fastest way to accomplish it because of time constraint and it was for one time run only. I picked up this UDF from some website and tailored it to
my need, Now I cannot seem to find that address so that I can cite it here . But whoever the person who wrote it, saved me lot of time and got my small project done. So God Bless him!!!!
--I'm just pasting it here so somebody might use it later----
CREATE FUNCTION [dbo].[usp_Parse_domain_name_v5] (@strURL varchar(1000))
RETURNS varchar(1000)
AS
BEGIN
IF CHARINDEX('http://',@strURL) > 0 OR CHARINDEX('https://',@strURL) > 0
SELECT @strURL = REPLACE(@strURL,'https://','')
SELECT @strURL = REPLACE(@strURL,'http://','')
SELECT @strURL = REPLACE(@strURL,'www','')
SELECT @strURL = REPLACE(@strURL,'www.','')
SELECT @strURL = REPLACE(@strURL,'ftp://','')
-- Remove everything after "/" if one exists
IF CHARINDEX('/',@strURL) > 0 (SELECT @strURL = LEFT(@strURL,CHARINDEX('/',@strURL)-1))
--Optional: Remove subdomains but differentiate between www.google.com and www.google.com.au
--IF (LEN(@strURL)-LEN(REPLACE(@strURL,'.','')))/LEN('.') < 3 -- if there are less than 3 periods
--SELECT @strURL = PARSENAME(@strURL,2) + '.' + PARSENAME(@strURL,1)
--ELSE -- It's likely a google.co.uk, or google.com.au
--SELECT @strURL = PARSENAME(@strURL,3) + '.' + PARSENAME(@strURL,2) + '.' + PARSENAME(@strURL,1)
RETURN @strURL
END
GO
March 22nd, 2011 6:55pm