Converting Korean data to UTF8 Encoding - Data loss
Hi, I am new to SSIS. We have a utility that reads data from an Oracle database and uploads it to a shared folder. The data in the database is Korean. The data is first exported to a flat file, and this file becomes the input to the conversion tool.

Here is a snapshot of the data in the flat file (CSV):

Text qualifier: "
Delimiter: ,

"199963","10","1","2009-03-31","","480000.000","KRW","95327-0 육ìƒìš´ì†¡ë¹„-운송료 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","4","1","2009-03-31","","480000.000","KRW","95629-0 육ìƒìš´ì†¡ë¹„-운송료 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","25","1","2009-03-31","","-887300.000","KRW","95327-0 육ìƒìš´ì†¡ë¹„-보관료 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","27","1","2009-03-31","","-573200.000","KRW","95327-0 육ìƒìš´ì†¡ë¹„-기타비용 ","020700","","","","1168","1001","","1358750","KOREA"
"199963","28","1","2009-03-31","","-379609.000","KRW","95327-0 육ìƒìš´ì†¡ë¹„-기타비용 ","020700","","","","1168","1001","","1358750","KOREA"

This is the code used in the conversion utility:
----------------------------------------------------------
strDataFile -- source file containing the above data

'Reading CSV as Korean file
Using s As StreamReader = New StreamReader(strDataFile, System.Text.Encoding.GetEncoding(949), False)
    ' Read the entire file
    unicodeString = s.ReadToEnd()
End Using
End If

'Get Bytes
rbyteShiftJIS = utf8WithoutBom.GetBytes(unicodeString.ToString())

' Get String Without BOM
decodedString = utf8WithoutBom.GetString(rbyteShiftJIS)

'overwrite the file with UTF-8 Encoded data
Using w As StreamWriter = New StreamWriter(strDataFile, False, utf8WithoutBom)
    w.Write(decodedString)

After the same file has been overwritten with UTF-8 encoding, the lines are as follows:
------------------------------------------------
"199963","10","1","2009-03-31","","480000.000","KRW","95327-0 ????? 
,"020700","","","","1168","1001","","1358750","KOREA"
"199963","25","1","2009-03-31","","-887300.000","KRW","95327-0 ?????," 020700","","","","1168","1001","","1358750","KOREA"
"199963","26","1","2009-03-31","","-10000.000","KRW","95327-0 ????? ,"020700","","","","1168","1001","","1358750","KOREA"
"199963","27","1","2009-03-31","","-573200.000","KRW","95327-0 ???? ","020700","","","","1168","1001","","1358750","KOREA"
"199963","28","1","2009-03-31","","-379609.000","KRW","95327-0 ???? ","020700","","","","1168","1001","","1358750","KOREA"

Notice that in the first three lines the description column (the "95327-0 ?????" text) is missing its closing double quote ("), that is, the ending text qualifier, while the last two lines still have it. I am not sure why this is happening.

Can anyone please help me retain the closing quotes after encoding the file to UTF-8 without BOM? My guess is that something suppresses the quote when the description column has "?" as its last character.

Thanks in advance
February 9th, 2011 5:34am

Can anyone please help?
Sri
February 9th, 2011 11:31am

This is not the correct forum for your post. As an aside, I think your extraction from Oracle is not right: it produced some characters that could not be converted, namely Œ and perhaps the characters preceding it.
Arthur
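One way to check whether the exported file really is in code page 949 is to decode it strictly, so that invalid bytes raise an error instead of silently turning into "?". The following is only a sketch under stated assumptions, not the utility's actual code: the module name and the helpers IsValidUtf8 and ConvertFileToUtf8 are hypothetical, and it assumes the .NET Framework, where code page 949 is available without registering an extra encoding provider. If the flat file turns out to be valid UTF-8 already, reading it as code page 949 would be a plausible source of both the "?" characters and the swallowed closing quotes, although the thread does not confirm this.

```vb
Imports System
Imports System.IO
Imports System.Text

Module KoreanToUtf8Sketch

    ' Hypothetical helper: True when the bytes decode as valid UTF-8.
    ' A strict decoder throws on invalid sequences instead of emitting "?".
    Function IsValidUtf8(bytes As Byte()) As Boolean
        Dim strictUtf8 As New UTF8Encoding(False, True) ' no BOM, throw on invalid bytes
        Try
            strictUtf8.GetString(bytes)
            Return True
        Catch ex As ArgumentException ' DecoderFallbackException derives from this
            Return False
        End Try
    End Function

    ' Hypothetical helper: rewrites the file as UTF-8 without BOM,
    ' but only if it is not valid UTF-8 already.
    Sub ConvertFileToUtf8(filePath As String)
        Dim raw As Byte() = File.ReadAllBytes(filePath)

        ' Assumption: a file that is valid UTF-8 needs no conversion.
        If IsValidUtf8(raw) Then Return

        ' Strict code page 949 decoder: fails loudly on bytes it cannot map,
        ' so bad input is reported rather than written back as "?".
        Dim korean As Encoding = Encoding.GetEncoding(949,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback)
        Dim content As String = korean.GetString(raw)

        ' The String is already Unicode, so no GetBytes/GetString round trip is needed.
        File.WriteAllText(filePath, content, New UTF8Encoding(False))
    End Sub

    Sub Main(args As String())
        ConvertFileToUtf8(args(0))
    End Sub

End Module
```

If the strict code page 949 decoder throws on the exported file, the problem is upstream in the Oracle export (for example the client character set used for the extract) rather than in the conversion step.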
February 9th, 2011 11:35am

This topic is archived. No further replies will be accepted.
