
BCP Questions

1) What version of bcp can I use with my server?

We recommend that you use the version of bcp that is shipped with your server:

Server                  bcp version
SQL Server 11.0.x       10.0.2 or higher with the latest EBF
Adaptive Server 11.5    11.1.1

You can use any 10.0.x or 11.x bcp version with any 10.0.x or 11.x server version.

2) What is the form of storage of bcp data?

bcp can store data in either ASCII or native format. ASCII format is the same as storing in character mode, which is human readable. Native format stores data in operating system (binary) format, which is not human readable.

3) What privileges are required to bcp data?

To bcp data, you need:
* A valid SQL Server or Adaptive Server account
* Appropriate permissions on the database tables and operating system files:
  1) To copy in data, you need insert permission on the table.
  2) To copy out data, you need select permission on the table being copied, as well as on the system tables sysobjects, syscolumns, and sysindexes.

4) What is fast and slow bcp?

Fast bcp: Any copy into a table with no triggers or indexes is fast bcp, because the inserts are not logged.

Slow bcp: If the table contains indexes or triggers, bcp logs all inserts in the transaction log. This is called slow bcp because the logging slows down the operation. Remember, logging inserts can cause the transaction log to fill up very quickly.

5) Can bcp copy out an entire database?

No, not directly. You can, however, use a shell script to copy out every table in a database, one bcp command per table.

6) What database options must be set for bcp?

To enable fast bcp for tables without indexes or triggers, the system administrator or database owner turns on the select into/bulkcopy/pllsort option (called select into/bulkcopy before release 11.5). If select into/bulkcopy/pllsort is turned off (false is the default) when you try to bulk copy data in, the following error occurs:

    You cannot run the non-logged version of bulk copy in this database. Please check with the DBO.

To set the database option to true:
1. Use the isql command:
    sp_dboption dbname, "select into/bulkcopy/pllsort", true
2. Issue a checkpoint in the database.

Note: select into/bulkcopy/pllsort does not need to be turned on for bcp out or for slow bcp.
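Question 5 mentions scripting a copy-out of a whole database. The following is a minimal sketch of that idea in Python, not a supported tool: it only builds one "bcp ... out" command line per table, using the flags that appear in this FAQ (-c, -U, -P). The function name, the ".dat" file naming, and the sample database and table names are assumptions made for illustration. Obtaining the list of user tables (for example, from sysobjects) is left out.

```python
import shlex


def bcp_out_commands(database, tables, user, password, out_dir="."):
    """Return one 'bcp ... out' command line per table, in character format."""
    commands = []
    for table in tables:
        # dbname..table leaves the owner empty, as in the FAQ's tempdb..test example.
        source = f"{database}..{table}"
        # One datafile per table; the .dat suffix is an arbitrary choice.
        datafile = f"{out_dir}/{table}.dat"
        args = ["bcp", source, "out", datafile, "-c", f"-U{user}", f"-P{password}"]
        # Quote each argument so the line can be pasted into a shell script.
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    return commands
```

Calling bcp_out_commands("pubs2", ["authors", "titles"], "sa", "secret") returns two command lines that can be written to a script file and run in sequence.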

7) What are the advantages of one format over the other?

The advantages and disadvantages of each are:

ASCII Format (Recommended)
Advantages:
* Portable across all platforms
* Easy to read and, thus, easy to debug
* The only method recommended for importing data from external sources, such as COBOL or Microsoft Access
Disadvantages:
* Requires conversion between server datatypes and character format for every data value
* Takes slightly more space than data stored in native format

Native Format
Advantages:
* Takes less disk space than character format
* Does not require conversion to character format, which means better performance
Disadvantages:
* Not compatible across platforms
* Not compatible across external data sources
* Even across servers, types differ from release to release, which may result in incompatibilities
* Not human readable, thus very difficult to debug

8) How can I make my fast bcp process perform better under System 11?

There are several ways to increase performance. See the Performance and Tuning Guide and the Utility Programs manual for guidelines for performing these tasks:
* Increase extent allocation. Currently, extents (8 pages) are allocated one at a time. You can preallocate 2 to 31 extents at a time. Any unused extents that were preallocated within each bcp batch are de-allocated. For maximum performance, size your bcp batch and set the number of pre-allocated extents parameter (previously known as cpreallocext) to eliminate any de-allocations.
  For System 11, use:
    sp_configure "number of pre-allocated extents", nn
  For System 10, use:
    buildmaster -ycpreallocext=nn
* Partition the table. Also see "Can I bcp data simultaneously into several table partitions?"
* Configure the OAM page caching to reduce physical reads of OAM (Object Allocation Map) pages:
  o For 11.5:
    sp_configure "number of oam trips"
  o For pre-11.5:
    sp_configure "number of coamtrips"
* Set memory low and buffer wash high (80%), so that the I/O executes continually to take advantage of idle cycles. Normal bcp flushes pages during checkpoint or at the end of a batch. For more information on setting memory (total memory), see the Performance and Tuning Guide; for setting buffer wash (housekeeper free write percent), see the System Administration Guide.
* Configure a buffer pool to a large I/O size such as 16K:
    sp_poolconfig default, "16K"
* Increase the network packet size to the network I/O size with the -A flag:
    bcp -A 16384

Note: Depending upon the size of the bcp job and the tuning required, you may want to create a configuration file that maximizes bcp performance. You need to restart the server, specifying the new configuration file, for it to take effect. See the System Administration Guide for details on creating configuration files.

Will the row in sysindexes for the text or image column(s) affect whether I can do fast bulk copy?

No, text or image columns will not prevent fast bulk copy into a table that has no indexes. bcp checks the sysstat column of sysindexes. Bits in sysstat indicate whether the index is clustered (O_CLUST 0x10) or nonclustered (O_NONCLUST 0x020). Neither of these bits is set in sysindexes rows for text or image columns.

How does bcp delimit data? Are delimiters mandatory?

By default, bcp uses tabs to delimit columns and new lines as row terminators. To change them, use the -t (field terminator) and -r (row terminator) options. A terminator can be as long as 30 characters and must be enclosed in double quotes. For example:

    bcp tempdb..test out data -c -t "," -r "###" -Uuser -Ppassword

In this example, every column copied out of table test is delimited by a comma (,) and every row by three pound signs (###). You would then use the same options when copying the data back in.

Note: Be sure to use double quotes if the terminator is a special shell character.
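To make the effect of the -t and -r terminators concrete, here is a small Python sketch that writes and re-reads rows in the same general shape as a character-format datafile. It is an illustration only, not bcp's implementation; the function names are made up for this example.

```python
def encode_rows(rows, field_term="\t", row_term="\n"):
    """Join fields with field_term and end every row with row_term."""
    return "".join(field_term.join(row) + row_term for row in rows)


def decode_rows(text, field_term="\t", row_term="\n"):
    """Split text back into rows of fields using the same terminators."""
    records = text.split(row_term)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the final row terminator
    return [record.split(field_term) for record in records]
```

With field_term="," and row_term="###", the rows [["1", "hello"], ["2", "a,b"]] encode to the single string 1,hello###2,a,b### and the second row decodes into three fields instead of two, because the comma inside the value "a,b" is indistinguishable from a field terminator.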
The use of explicit delimiters is highly recommended, because character and text datatypes can have the default delimiters (tab and new line) embedded in them. bcp cannot distinguish between the actual delimiters and those embedded in the data. Always use a delimiter pattern that is unlikely to be in the data.

What is a format file and when should I use it?

A format file describes the layout of data in the datafile, such as:
* bcp version
* Number of columns in the host file to be copied
* Type of data that it contains for each column (sybchar for all types in an ASCII file)
* Length of the data
* Delimiters
* Server column into which the value is to be copied

Use a format file when the data is in a format different from the defaults, which are tab-delimited columns and newline-delimited rows. For example, you would use a format file when:
* The datafile is in fixed length format. Datafiles from mainframe sources are often defined this way.
* The sequence of data in the datafile does not match that in the corresponding server table.
* The number of columns in the datafile does not match that in the corresponding server table.

How does bcp handle null data?

For bcp to consider a value as null, there must be nothing between the column delimiters. A space is not considered a null. Depending on the datatype of the column in the server, the value is translated accordingly. Consider the following datafile:

    1,hello,01/01/96
    ,,
    ,,

If we copy this data into a table defined as:

    create table test (col1 int null, col2 char(10) null, col3 datetime null)

with the command:

    bcp test in data -c -t"," -Uuser -Ppassword

then a select from the table would contain the following values:

    col1  col2   col3
    1     hello  Jan 1 1996 12:00AM

As the example shows:
* A space in a fixed length field, such as an integer, is considered a NULL.
* A space in a char/varchar field is considered a space.
* A space in a datetime field is inserted as the default date, Jan 1 1900.

When I copy out text data, only the first 32K are copied. Why? How can I get all of the data copied?

By default, the server copies out only 32K of text or image data for bcp out. To retrieve all of the text or image data using bcp, use the -T option, specifying the size of text data to be retrieved. For example:
1. Find out the maximum size of the data in the table. In isql, enter:
    select max(datalength(text_col)) from table
   Use the returned value as the size of the text data to be copied out. Alternatively, you can skip this step and specify the largest text size possible, 2147483647.
2. Specify the size of the text data with the bcp command:
    bcp dbname.owner.table out datafile -c -Uuser -Ppassword -T100000

In step 2, the size of the text data copied out is 100000 bytes.

Why was my bcp job logged when my table has no indexes?

Even when a table has no indexes, adding data to the table with bcp can cause the log to fill up, because bcp logs page allocations. If a table has no index and the select into/bulkcopy/pllsort option is set for that database, there should be no logging of the new rows added. For details, see the Utility Programs manual.

If a table contains triggers but no index, the server uses slow bcp with full logging, but does not fire the triggers. You can either:
* Drop all triggers on the table before running bcp and then re-create them when bcp is finished.
* Use slow bcp to retain the indexes and triggers on the table.
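The rules this FAQ gives for when a copy-in runs as fast or slow bcp can be summarized as a small decision function. This is only a sketch of the stated rules, not the server's actual logic; the function name and the returned strings are hypothetical.

```python
def bcp_in_mode(has_indexes, has_triggers, bulkcopy_option_on):
    """Classify a bcp copy-in according to the rules described in this FAQ.

    Returns "slow" when inserts are fully logged, "fast" when only page
    allocations are logged, and "error" when fast bcp would be required
    but the select into/bulkcopy/pllsort option is off.
    """
    if has_indexes or has_triggers:
        # Indexes or triggers force logged inserts; the database option is not needed.
        return "slow"
    if bulkcopy_option_on:
        # No indexes, no triggers, option enabled: rows are not logged.
        return "fast"
    # No indexes or triggers, but the option is off: the server rejects the copy.
    return "error"
```

For example, bcp_in_mode(False, True, True) returns "slow", which matches the case of a table with triggers but no index.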

If a table has triggers on it, are the triggers fired in slow bcp? How about rules on a table?

For performance reasons, bcp does not fire triggers on tables or check for rules. For details, see the information on data integrity (defaults, rules, and triggers) in the Utility Programs manual.

Can I bcp data simultaneously into several table partitions?

A single bcp process cannot insert data into several table partitions as of release 11. However, you can use more than one bcp process to insert simultaneously into several partitions. The Utility Programs manual gives guidelines for bulk copying data into partitioned tables.

Parallel Bulk Copy

As of release 11.0, you can use parallel bcp to copy data into a specific partition. Splitting large bulk copy jobs into multiple client sessions that run concurrently is faster than running one large job. The number of parallel bcp sessions that you can start is limited by the number of partitions in the table. For example, if you start five bcp jobs on a table with four partitions, the first four jobs run in parallel; the fifth job starts when one of the other four jobs finishes. Also see TechNote 1271, FAQs about Table Partitioning, under "How Do I Take Advantage of Table Partitioning with Parallel bcp in?".

As of release 11.5, you can balance partitions by specifying partition numbers. This enables you to use bcp to greatly reduce partition skew; that is, to balance the amount of data among partitions. To benefit from release 11.5 parallel bulk copy, do not use tables with clustered indexes. If you do, slow bcp is used, and the clustered index, not bcp, determines data placement.

Note: Copying in very large tables, especially simultaneous copies into a partitioned table, can require a large number of locks. To avoid running out of locks, consider resetting the number of locks parameter and using the -b batchsize option to copy smaller batches.
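One way to prepare input for parallel bcp sessions is to split the rows of a datafile into one evenly sized chunk per partition, so that each session loads a similar amount of data. The Python sketch below shows only that splitting step, under the assumption that each chunk is then written to its own datafile and loaded by a separate bcp session. The function name is hypothetical, and the syntax for targeting a particular partition is not shown here; see the Utility Programs manual.

```python
def split_for_partitions(rows, partitions):
    """Split rows into `partitions` contiguous chunks whose sizes differ by at most one."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    base, extra = divmod(len(rows), partitions)
    chunks = []
    start = 0
    for index in range(partitions):
        # The first `extra` chunks take one additional row each.
        size = base + (1 if index < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks
```

Ten rows split for four partitions give chunks of 3, 3, 2, and 2 rows, which keeps partition skew to at most one row.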
Can I use two processes at once to bcp data into one table if the table is not partitioned?

Yes, two processes can copy data to the same table at the same time, whether or not the table is partitioned. If your table is not partitioned, there is a greater likelihood of contention; for example:
* If one bcp process escalates to a table lock, the other will be blocked.
* If both bcp processes try to insert into the same place, one will block the other. When adding rows to a table with no clustered index, the new rows are added to the end of the table (last page), and one bcp process will block the other.

It is possible for the two bcp processes to deadlock.

How does bcp handle IDENTITY columns?

For tables with IDENTITY columns, you can either let the server assign the identity values for the rows being bulk copied in, or you can obtain them from the bcp host file. By default, SQL Server or Adaptive Server assigns values for the IDENTITY column, which means that the datafile does not need to contain them. You can choose which values you want to use:
* Use datafile values. If the datafile contains identity values and you want to use them instead of the ones the server assigns, use the -E flag as follows:
    bcp dbname.owner.table in datafile -c -E -Uuser -Ppassword
  Note: The server does not guarantee uniqueness with the -E flag. You must guarantee unique values.
* Use server defaults. If the datafile contains identity values but you want the server to assign new ones, use the -N flag as follows:
    bcp dbname.owner.table in datafile -c -N -Uuser -Ppassword
  The server then assigns new identity values.

How do I use 11.5 parallel bcp for tables with IDENTITY columns?

When using multiple bcp processes to copy into a table with an IDENTITY column, prevent locking contention and ensure sort order by using the -E or -g parameters. Be careful to ensure unique values. For details on how to use the -E and -g parameters, see the Utility Programs manual.

Note: If you use native format and explicit identity values, copying out with bcp -n and in with bcp -n -E, you receive an error (Bug #125962). Use character format (-c) instead.

How can I trap errors in bcp? What kind of errors does bcp trap?

The -e option to bcp records errors that occur during a bulk copy operation. bcp traps errors in syntax, conversion, and format files. Not all errors are recorded: only errors that bcp itself recognizes are written to the error file.
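The identity and error-file options described above can be combined on one command line. As a hedged illustration, the Python helper below assembles the argument list for a character-format copy-in using only flags mentioned in this FAQ (-c, -E, -N, -e, -U, -P). The helper name and the values accepted by its identity parameter are invented for this sketch.

```python
def bcp_in_args(table, datafile, user, password, identity="server", errfile=None):
    """Build the argument list for a character-format 'bcp ... in' command.

    identity="server":   datafile has no identity values; the server assigns them.
    identity="datafile": use the identity values from the datafile (-E).
    identity="skip":     datafile has identity values, but the server assigns new ones (-N).
    """
    args = ["bcp", table, "in", datafile, "-c"]
    if identity == "datafile":
        args.append("-E")
    elif identity == "skip":
        args.append("-N")
    elif identity != "server":
        raise ValueError(f"unknown identity mode: {identity!r}")
    if errfile is not None:
        # -e names the file that receives the errors bcp itself detects.
        args += ["-e", errfile]
    args += [f"-U{user}", f"-P{password}"]
    return args
```

The returned list can be passed to subprocess.run on a host where bcp is installed. Remember that with -E the caller, not the server, is responsible for keeping the identity values unique.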
Errors reported by the server, such as duplicate rows or a full log, are not written to the error file but are returned to the screen.

How do I record rows that the server rejected during a bcp operation?

To trap rows rejected by the server for reasons such as duplicate rows or character conversion errors, write a bulk copy program using Client-Library to handle the exceptions. In the bulk copy program, do the following:
1. Set the batch size to 1.
2. Bind the data to local variables.
3. Send the row to the server.
4. If the return code is not successful, write the variables containing the data for the current row to an error file.
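The control flow of the four steps above can be sketched as follows. This Python outline is not Client-Library code: send_row is a hypothetical callable standing in for whatever routine binds one row and sends it as its own batch, returning True on success. The comma-joined error-file format is also just an assumption for the example.

```python
def copy_rows_one_at_a_time(rows, send_row, error_path):
    """Send rows individually and log every rejected row to error_path.

    Each row is its own batch (batch size 1), so a rejection affects
    only that row. Returns the number of rejected rows.
    """
    rejected = 0
    with open(error_path, "w") as error_file:
        for row in rows:
            # Steps 2 and 3: hand the current row's values to the sender.
            succeeded = send_row(row)
            if not succeeded:
                # Step 4: record the data of the row the server refused.
                error_file.write(",".join(row) + "\n")
                rejected += 1
    return rejected
```

Sending one row per batch is slower than a normal bulk copy, so this approach fits best as a second pass over data that has already failed to load.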