Bulk Delete
The rule for deciding when to use a bulk delete: "Use it when you cannot do the delete in a single SQL statement".
Here, you probably have a table with many indexes, so each delete does a lot of work. Maintaining the index structures for 300,000 deletes can be "cumbersome". Additionally, a lot of undo and redo is generated.
If 300,000 was a large percentage of the table, it is sometimes better to:
create table temp nologging as
select * from t where id not in ( select id from a );
(keep the rows you want)
index temp (unrecoverable, in parallel);
grant on temp (as you had for t);
drop table t;
rename temp to t;
You quite simply have too much data for it to be deleted quickly. No matter what method you use, it will take a significant amount of time. If you are confident in your code, you can also disable constraints while pruning your data, which will save some time.
It might be easiest to create a new table with the rows you want to keep - then drop the old one but whatever you do will take a large amount of time.
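As a sketch of the constraint-disabling step mentioned above (the table and constraint names here are hypothetical, chosen only for illustration):

```sql
-- Disable a foreign key that references the table before the mass delete
alter table child_tab disable constraint fk_child_parent;

delete from parent_tab where year = '2008';
commit;

-- Re-enable and validate the constraint once the data is pruned
alter table child_tab enable validate constraint fk_child_parent;
```

Only do this when you are certain the delete cannot leave orphan rows, since the ENABLE VALIDATE step will fail if it does.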
Some helpful hints include:
1. Use partitioning: The fastest way to do a mass delete is to drop an Oracle partition.
2. Tune the delete subquery: Many Oracle deletes use a where clause subquery and optimizing the subquery will improve the SQL delete speed.
3. Use bulk deletes: Oracle PL/SQL has a bulk delete operator that often is faster than a standard SQL delete.
4. Drop indexes & constraints: If you are tuning a delete in a nighttime batch job, consider dropping the indexes and rebuilding them after the delete job is completed.
5. Small pctused: For tuning mass deletes you can reduce freelist overhead by setting Oracle to only re-add a block to the freelists when the block is dead empty by setting a low value for pctused.
6. Parallelize the delete job: You can run a massive delete in parallel with the parallel hint. If you have 36 processors, the full scan can run up to 35 times faster (cpu_count - 1).
7. Consider NOARCHIVELOG: Take a full backup first, bounce the database into NOARCHIVELOG mode for the delete, and bounce it again afterwards, back into ARCHIVELOG mode.
8. Use CTAS: Another option you can try would be to create a new table using CTAS(Create Table As Select) where the select statement filters out the rows that you want to delete. Then do a rename of the original followed by a rename of the new table and transfer constraints and indexes.
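As an illustration of hint 3, here is a minimal PL/SQL sketch using BULK COLLECT and FORALL; the table name, predicate, and batch size of 10,000 are assumptions for the example:

```sql
declare
  -- Collect the target ROWIDs in batches, then delete them with FORALL
  cursor c is select rowid from mytab where year = '2008';
  type t_rids is table of rowid index by pls_integer;
  l_rids t_rids;
begin
  open c;
  loop
    fetch c bulk collect into l_rids limit 10000;  -- batch size is tunable
    exit when l_rids.count = 0;
    forall i in 1 .. l_rids.count
      delete from mytab where rowid = l_rids(i);
    commit;  -- commit per batch to keep undo usage bounded
  end loop;
  close c;
end;
/
```

FORALL sends the whole batch of deletes to the SQL engine in one round trip, which is where the speedup over a row-by-row PL/SQL loop comes from.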
Options for deletion
1. Option
Use a simple DELETE command with a WHERE condition. Make sure the WHERE clause returns a limited number of rows, and commit after each deletion.
ex:
delete from mytab where year = '2008';
commit;
delete from mytab where year = '2009';
commit;
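When a single predicate still matches too many rows for one transaction, the same idea can be applied in fixed-size chunks; a sketch, where the batch size of 10,000 is an assumption:

```sql
begin
  loop
    delete from mytab
     where year = '2008'
       and rownum <= 10000;   -- delete at most 10,000 rows per pass
    exit when sql%rowcount = 0;
    commit;                   -- keep each transaction (and its undo) small
  end loop;
  commit;
end;
/
```

Each pass is a small transaction, so a failure partway through loses at most one batch of work rather than the whole delete.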
2. Option
CTAS (Create Table As Select)
Make sure you have the table's constraint and index scripts; you can get them through dbms_metadata.get_ddl. Create the new table in another tablespace with the required rows, then drop the original table and rename the new table to the old name.
create table new_mytab
tablespace new_tablespace
as
select * from mytab where year = '2012';
rename mytab to old_mytab;
rename new_mytab to mytab;
Then re-create the constraints and indexes on mytab.
3. Option
Delete and copy back into an existing tablespace.
STEP 1 – Copy the rows you want to keep into a new table using a WHERE clause:
create table new_mytab
tablespace new_tablespace
as
select * from mytab where year = '2012';
STEP 2 – truncate the original table:
truncate table mytab;
STEP 3 – Copy the rows back into the original table definition. Please note that this step may require a dedicated rollback or undo segment:
alter session set rollback_segment = 'HUGE_RBS';
insert into mytab
select * from new_mytab;
STEP 4 – Drop the working copy:
drop table new_mytab;
In sum, a create table as select (CTAS) approach can be faster than a vanilla delete when the majority of the table rows are being deleted. CTAS is fast because it can be parallelized: the required full scan can be run with parallel read processes, so that on a 64-CPU server the CTAS can scan the table up to 63 times faster. CTAS can also be used with the NOLOGGING option.
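A sketch of a parallel, nologging CTAS; the degree of parallelism (8) and the object names are assumptions for the example:

```sql
-- Build the keep-set in parallel, without generating redo for the load
create table new_mytab
  parallel 8
  nologging
as
  select /*+ parallel(m, 8) */ *
    from mytab m
   where year = '2012';
```

Because the load is not logged, take a backup afterwards: the new table cannot be recovered from the redo stream.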
Factors affecting massive deletes
1. Is the target table partitioned?
It is easier to delete individual partitions.
2. Can you reorganize the table after the delete to remove fragmentation?
3. What percentage of the table will be deleted?
If a large percentage of the rows is being deleted, create a table using CTAS with the required rows, drop the old table, and rename the new table back to the original name.
In cases where you are deleting more than 30-50% of the rows in a very large table, it is faster to use CTAS than to do a vanilla delete followed by a reorganization of the table blocks and a rebuild of the constraints and indexes.
4. Do you want to release the space consumed by the deleted rows?
If you want to release the space back to the tablespace, you will need to reorganize the table.
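For factor 1, dropping a partition is essentially a dictionary operation rather than row-by-row work. A sketch, assuming mytab is range-partitioned by year with a partition named p_2008:

```sql
-- Remove all of 2008's rows in a single DDL statement
alter table mytab drop partition p_2008
  update global indexes;  -- keep any global indexes usable afterwards
```

Without the UPDATE GLOBAL INDEXES clause, global indexes on the table are left unusable and must be rebuilt.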
dbms_metadata
In Oracle we have the dbms_metadata utility to display DDL directly from the data dictionary. Using this powerful utility, we can extract individual objects or an entire schema.
To extract the DDL for a table and its index (here the DEPT table and its primary key index in the SCOTT schema), we execute dbms_metadata.get_ddl:
set echo off;
set pages 999;
set long 90000;
spool ddl_list.sql
select dbms_metadata.get_ddl('TABLE','DEPT','SCOTT') from dual;
select dbms_metadata.get_ddl('INDEX','PK_DEPT','SCOTT') from dual;
spool off;
To extract the whole schema, we can modify the above script:
set pagesize 0
set long 90000
set feedback off
set echo off
spool scott_schema.sql
connect scott/tiger;
SELECT DBMS_METADATA.GET_DDL('TABLE',u.table_name)
FROM USER_TABLES u;
SELECT DBMS_METADATA.GET_DDL('INDEX',u.index_name)
FROM USER_INDEXES u;
spool off;