best way to import big amount of data from the file into database
Dear, I have a question regarding the data import to the SQL Server 2008. I need to import big amount of data (100 k - 500 k of records at a time) from either XML or text files on a regular basis and if possible do not utilize the server too much. Should I choose XML or text files for best performance? Will it minimize the server load if I split the big file into couple smaller files and import them one by one? I read about the bulk import and it looks that the XML data need to be imported as one document into the column of XML type and then transforemd to internal representation with sp_xml_preparedocument sp to be finaly read with OPENXML statement. That seems to be a lot of unnecessary steps when compared to import from text file where data columns can be accessed at the time of reading data from the file without need of intermediary table and transformations. The structure of the data will be probably flat means one line in the file will be one record in the table and even if not then hierarchical data can be split into separate flat files. What is the best practice for this - bulk import, openrowset(bulk), bcp, SSIS and why? Thanks in advanceDzordz
June 7th, 2012 3:58pm

Hi Dzordz, A text file is a faster import. This is because the text has less overhead text and no validation is required. Use the Fast Parse option for best results. And then Fast Load with table lock on the destination side. This will equal to the bulk loading which is the best. There is sense in splitting, but it will not give much advantage, also can be quite the opposite when hitting the same table. PS: BCP is the command line interface to bulk load, SSIS is not a method, just an ETL engine. Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 7th, 2012 4:05pm

Hi Arthur, Thanks for quick response. I understand that you mean the bulk load statement and not the openrowset (bulk ...) function, right? Is there any difference between these? Yes, I know that SSIS is a ETL engine but the question is if using it will be more benefitial than mentioned bulk load? One more question regarding the delimiter. Is there any way to assure that the chosen delimiter character will not be included in the data itself and cause import errors? Thanks in advanceDzordz
June 7th, 2012 5:07pm

Hi, To your second question about delimiter character, no there is no way to do that. To the first question about using SSIS or bulk load- if you follow the steps Arthur mentioned above and use SSIS, you will get almost the same performance with either options. However, if you need to perform any complex transformations you may be better off with SSIS.Regards, Samuel Vanga Twitter Blog
Free Windows Admin Tool Kit Click here and download it now
June 7th, 2012 6:43pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics