Creating External Table metadata, the easy way
To further illustrate the tight relationship between SQL*Loader and External Tables,
the SQL*Loader tool may be used to generate a script that creates an External Table
according to a pre-existing control file.
SQL*Loader has a command line option named EXTERNAL_TABLE, this can hold one
of three different parameters {NOT_USED | GENERATE_ONLY | EXECUTE}. If nothing is
set, it defaults to the NOT_USED option.
This keyword is used to generate the script to create an External Table, and the
options mean:
- NOT_USED: This is the default option, and it means that no External Tables are
to be used in this load. - GENERATE_ONLY: If this option is specified, then SQL*Loader will only read
the definitions from the control file and generate the required commands,
so the user can record them for later execution, or tailor them to fit his/her
particular needs. - EXECUTE : This not only generates the External Table scripts, but also executes
them. If the user requires a sequence, then the EXECUTE option will not only
create the table, but it will also create the required sequence, deleting it once
the data load is finished. This option performs the data load process against
the specified target regular by means of an External Table, it creates both
the directory and the External Table, and inserts the data using a SELECT AS
INSERT with the APPEND hint.
Let’s use the GENERATE_ONLY option to generate the External Table creation scripts:
$sqlldr exttabdemo/oracle employees external_table=GENERATE_ONLY
By default the log file is located in a file whose extension is .log and its name
equals that of the control file. By opening it we see, among the whole log processing
information, this set of DDL commands:
CREATE TABLE "SYS_SQLLDR_X_EXT_EMPLOYEES"
(
"EMPLOYEE_ID" NUMBER(6),
"FIRST_NAME" VARCHAR2(20),
"LAST_NAME" VARCHAR2(25),
"EMAIL" VARCHAR2(25),
"PHONE_NUMBER" VARCHAR2(20),
"HIRE_DATE" DATE,
"JOB_ID" VARCHAR2(10),
"SALARY" NUMBER(8,2),
"COMMISSION_PCT" NUMBER(2,2),
"MANAGER_ID" NUMBER(6),
"DEPARTMENT_ID" NUMBER(4)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY EXTTABDIR
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
BADFILE 'EXTTABDIR':'employees.bad'
LOGFILE 'employees.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM
REJECT ROWS WITH ALL NULL FIELDS
(
"EMPLOYEE_ID" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"FIRST_NAME" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"LAST_NAME" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"EMAIL" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"PHONE_NUMBER" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"HIRE_DATE" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"JOB_ID" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"SALARY" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"COMMISSION_PCT" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"MANAGER_ID" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"DEPARTMENT_ID" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
)
)
location
(
'employees.txt'
)
)
The more complete version is shown, some differences with the basic script are:
- All the column definitions are set to CHAR(255) with the delimiter character
defined for each column - If the current working directory is already registered as a regular DIRECTORY
at the database level, SQL*Loader utilizes it, otherwise, it creates a new
directory definition - The script specifies where the bad files and log file are located
- It specifies that an all-null column record is rejected
In the case of the EXECUTE keyword, the log file shows that not only are the scripts
used to create the External Table, but also to execute the INSERT statement with the
/*+ append */ hint. The load is performed in direct path mode.
All External Tables, when accessed, generate a log file. In the case of the
ORACLE_LOADER driver, this file is similar to the file generated by SQL*Loader. It has
a different format in the case of ORACLE_DATAPUMP driver. The log file is generated
in the same location where the external file resides, and its format is as follows:
<EXTERNAL_TABLE_NAME>_<OraclePID>.log
When an ORACLE_LOADER managed External Table has errors, it dumps the ‘bad’
rows to the *.bad file, just the same as if this was loaded by SQL*Loader.
The ORACLE_DATAPUMP External Table generates a simpler log file, it only contains the
time stamp when the External Table was accessed, and it creates a log file for each
oracle process accessing the External Table.
Unloading data to External Tables
The driver used to unload data to an External Table is the ORACLE_DATAPUMP access
driver. It dumps the contents of a table in a binary proprietary format file. This way
you can exchange data with other 10g and higher databases in a preformatted way to
meet the other database’s requirements. Unloading data to an External Table doesn’t
make it updateable, the tables are still limited to being read only.
Let’s unload the EMPLOYEES table to an External Table:
create table dp_employees
organization external(
type oracle_datapump
default directory EXTTABDIR
location ('dp_employees.dmp')
)
as
select * from employees;
This creates a table named DP_EMPLOYEES, located at the specified EXTTABDIR
directory and with a defined OS file name.
In the next example, at a different database a new DP_EMPLOYEES table is created,
this table uses the already unloaded data by the first database. This DP_EMPLOYEES
External Table is created on the 11g database side.
create table dp_employees(
EMPLOYEE_ID NUMBER(6),
FIRST_NAME VARCHAR2(20),
LAST_NAME VARCHAR2(25),
EMAIL VARCHAR2(25),
PHONE_NUMBER VARCHAR2(20),
HIRE_DATE DATE,
JOB_ID VARCHAR2(10),
SALARY NUMBER(8,2),
COMMISSION_PCT NUMBER(2,2),
MANAGER_ID NUMBER(6),
DEPARTMENT_ID NUMBER(4)
)
organization external
(
type oracle_datapump
default directory EXTTABDIR
location ('dp_employees.dmp')
);
This table can already read in the unloaded data from the first database.
The second database is a regular 11g database. So this shows the inter-version
upward compatibility between a 10g and an 11g database.
SQL> select count(*) from dp_employees;
COUNT(*)
----------
107
Inter-version compatibility
In, the previous example a 10g data pump generated an External Table that was
transparently read by the 11g release.
Let’s create an 11g data pump External Table named DP_DEPARTMENTS:
create table dp_departments
organization external(
type oracle_datapump
default directory EXTTABDIR
access parameters
(
version '10.2.0'
)
location ('dp_departments.dmp')
)
as
select * from departments
Table created.
SQL> select count(*) from dp_departments;
COUNT(*)
----------
27
In the previous example it is important to point out that the VERSION keyword
defines the compatibility format.
access parameters
(
version '10.2.0'
)
If this clause is not specified then an incompatibility error will be displayed.
SQL> select count(*) from dp_departments;
select count(*) from dp_departments
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-39142: incompatible version number 2.1 in dump file
"/home/oracle/external_table_dest/dp_departments.dmp"
ORA-06512: at "SYS.ORACLE_DATAPUMP", line 19
Now let’s use the 10g version to read from it.
SQL> select count(*) from dp_departments;
COUNT(*)
----------
27
The VERSION clause is interpreted the same way as the VERSION clause for the data
pump export, it has three different values:
- COMPATIBLE: This states that the version of the metadata corresponds to
the database compatibility level. - LATEST : This corresponds to the database version.
- VERSION NUMBER : This refers to a specific oracle version that the file is
compatible with. This value cannot be lower than 9.2.0.