Issue with Encoding
Hi, I am new to SSIS. We have a utility that reads data from an Oracle database and uploads it to a shared folder. The data in the database is Korean. The data is first exported to a flat file, and that file becomes the input to a conversion tool.

Here is a snapshot of the data in the flat file (CSV):

Text qualifier: "
Delimiter: ,

"199963","10","1","2009-03-31","","480000.000","KRW","95327-0 육상운송비-운송료 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","4","1","2009-03-31","","480000.000","KRW","95629-0 육상운송비-운송료 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","25","1","2009-03-31","","-887300.000","KRW","95327-0 육상운송비-보관료 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","27","1","2009-03-31","","-573200.000","KRW","95327-0 육상운송비-기타비용 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","28","1","2009-03-31","","-379609.000","KRW","95327-0 육상운송비-기타비용 ","020700","","","","1168","1001","","1358750","KOREA"

This is the code used in the conversion utility (strDataFile is the source file containing the data above):

'Reading CSV as Korean file
Using s As StreamReader = New StreamReader(strDataFile, System.Text.Encoding.GetEncoding(949), False)
    ' Read the entire file
    unicodeString = s.ReadToEnd()
End Using
End If
'Get Bytes
rbyteShiftJIS = utf8WithoutBom.GetBytes(unicodeString.ToString())
' Get String Without BOM
decodedString = utf8WithoutBom.GetString(rbyteShiftJIS)
'overwrite the file with UTF-8 Encoded data
Using w As StreamWriter = New StreamWriter(strDataFile, False, utf8WithoutBom)
    w.Write(decodedString)

The same file is overwritten with UTF-8 encoding. After this process the lines look like this:

"199963","10","1","2009-03-31","","480000.000","KRW","95327-0 ????? ,"020700","","","","1168","1001","","1358750","KOREA"
"199963","25","1","2009-03-31","","-887300.000","KRW","95327-0 ?????," 020700","","","","1168","1001","","1358750","KOREA"
"199963","26","1","2009-03-31","","-10000.000","KRW","95327-0 ????? ,"020700","","","","1168","1001","","1358750","KOREA"
"199963","27","1","2009-03-31","","-573200.000","KRW","95327-0 ???? ","020700","","","","1168","1001","","1358750","KOREA"
"199963","28","1","2009-03-31","","-379609.000","KRW","95327-0 ???? ","020700","","","","1168","1001","","1358750","KOREA"

Notice that in the first three lines the description column does not have a double quote (") at the end, i.e. the closing text qualifier is missing, while in the last two lines the double quote is there. I am not sure why this is happening. Can anyone please help me retain the closing quotes after encoding the file to UTF-8 without BOM? I guess something is suppressing the quote when the description column has "?" as its last character.

Thanks in advance,
Sri
February 12th, 2011 3:40am

It is likely that the text was not imported correctly from the database, e.g. it was converted to code page 949 and then later concatenated as ASCII to create the file. Besides, the conversion code looks valid. To prove my point, open the file in any good Unicode editor (e.g. EmEditor) and check that it is displayed correctly. If it is good, then here is what you can do: use MLang objects instead of the standard routines.
Arthur
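
If opening the file in an editor is not conclusive, the same check can be sketched in code. The snippet below is only a minimal sketch, not part of the original utility: the module name EncodingCheck and the helper DecodesCleanly are hypothetical, and the file path is assumed to arrive as the first command-line argument. It decodes the raw bytes with strict decoders that throw on invalid input instead of silently substituting "?".

```vb
Imports System
Imports System.IO
Imports System.Text

Module EncodingCheck
    ' Hypothetical helper: True if every byte sequence is valid in the given encoding.
    Function DecodesCleanly(data As Byte(), enc As Encoding) As Boolean
        Try
            enc.GetString(data)
            Return True
        Catch ex As DecoderFallbackException
            Return False
        End Try
    End Function

    Sub Main(args As String())
        ' Assumption: the path of the exported flat file is passed as the first argument.
        Dim data As Byte() = File.ReadAllBytes(args(0))

        ' Strict decoders: raise an exception on invalid bytes rather than emitting "?".
        Dim strictUtf8 As Encoding = New UTF8Encoding(False, True)
        ' On .NET Core / .NET 5+, code page 949 is only available after
        ' Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) has been called.
        Dim strict949 As Encoding = Encoding.GetEncoding(949,
            EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback)

        Console.WriteLine("Valid UTF-8: {0}", DecodesCleanly(data, strictUtf8))
        Console.WriteLine("Valid code page 949: {0}", DecodesCleanly(data, strict949))
    End Sub
End Module
```

A file that contains only ASCII passes both checks, so the result is meaningful only for a file that really holds Korean text. If such a file passes the UTF-8 check, it is probably already UTF-8, and reading it with GetEncoding(949) would then be a likely source of the "?" characters. If it fails both checks, the file was most likely damaged before the conversion utility ever read it.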
March 23rd, 2011 9:25pm

