Deleting data is always an exciting (or nerve-wracking?) task! In this article, we will explore the TRUNCATE and DELETE commands. Specifically, we will examine how each is used in BigQuery and MSSQL. Additionally, we will discuss which business cases call for using TRUNCATE and in what situations DELETE is more appropriate. If you are not yet familiar with basic SQL commands, first take a look at SQL (1) Basic Commands – Fundamentals of Data Manipulation.
■ What are TRUNCATE and DELETE?
The TRUNCATE and DELETE commands in SQL are used to delete data from a table. While both commands serve the purpose of removing data, there are significant differences in how they operate and their intended uses.
What is TRUNCATE?
The TRUNCATE command is used to quickly delete all rows from a table. It is useful for initializing a table or removing temporary data. TRUNCATE operates on the entire table and does not allow specifying conditions.
What is DELETE?
The DELETE command is used to remove rows from a table that meet specific conditions. You can specify which rows to delete using a WHERE clause, allowing for the selective removal of certain data.
Comparison Table: TRUNCATE vs DELETE
Feature | TRUNCATE | DELETE |
---|---|---|
Scope of Deletion | Deletes all rows in the table | Deletes specific rows using WHERE clause |
Performance | Fast (deallocates data pages) | Relatively slow (deletes each row individually) |
Logging | Minimal (only data page deallocation is logged) | Detailed (each deleted row is logged) |
Auto-Increment Reset | Resets auto-increment counters | Does not reset auto-increment counters |
Triggers | DELETE triggers do not activate | DELETE triggers activate |
Locking Mechanism | Locks the entire table | Locks individual rows based on conditions |
Use Cases | Entire data reset, removing temporary data, etc. | Deleting specific data, cleaning up logs, etc. |
■ Differences between TRUNCATE and DELETE
1. Operating Mechanism
TRUNCATE:
- Deletes all rows: Removes all data from the table at once.
- Fast performance: Instead of deleting each row individually, it deallocates data pages, enabling rapid execution. Data is stored in fixed-size units called pages, where rows are contained within these pages. Multiple pages make up the entire table data. When TRUNCATE is executed, it deallocates the data pages instead of deleting each row.
- Resets auto-increment counters: Initializes the table’s auto-increment counters. Auto-increment generates unique numeric values automatically for a specific column in the table. For example, each time a new row is inserted, the number increases automatically.
Example:
Consider the following table:
ID | Name |
1 | Alice |
2 | Bob |
3 | Charlie |
If you use DELETE to remove ID = 2 (Bob) and then add a user named David, the ID will be 4 because the auto-increment counter does not reset. However, if you TRUNCATE the table and then add David, the table will have a single row with ID = 1 and Name = David.
After DELETE:
ID | Name |
1 | Alice |
3 | Charlie |
4 | David |
DELETE:
- Conditional deletion: Can delete only rows that meet specific conditions using the WHERE clause.
- Individual row deletion: Deletes each row one by one, and each deletion is logged.
- Maintains auto-increment counters: Auto-increment counters are not reset.
2. Logging
TRUNCATE:
- Minimal logging: Only the deallocation of data pages is logged; individual row deletions are not recorded.
- Recoverability: Can be rolled back if used within a transaction, but cannot be recovered outside of a transaction.
DELETE:
- Detailed logging: Each deleted row is recorded in the logs, allowing all deletion operations to be tracked for recovery purposes.
- Recoverability: Can be rolled back anytime if used within a transaction.
3. Performance and Locking
TRUNCATE:
- Fast performance: Can quickly delete large amounts of data.
- Table locking: Locks the entire table during the operation.
DELETE:
- Slow performance: Takes more time when deleting large amounts of data.
- Row-level locking: Locks individual rows based on conditions.
4. Triggers
Triggers are database objects that automatically execute when specific events (e.g., INSERT, UPDATE, DELETE) occur. Triggers are used to maintain data integrity, record change logs, or execute additional business logic. For example, when deleting a customer’s information from the Customers table, the trigger can also delete all orders associated with that customer.
Example: Creating a trigger to delete related orders when a customer is deleted
CREATE TRIGGER trg_AfterCustomerDelete ON Customers AFTER DELETE AS BEGIN DELETE FROM Orders WHERE CustomerID IN (SELECT CustomerID FROM deleted); END;
The trg_AfterCustomerDelete
trigger automatically executes whenever a customer is deleted from the Customers table. It deletes all orders corresponding to the deleted customers (from the deleted
table) in the Orders table, thereby maintaining data integrity.
TRUNCATE:
- Trigger not activated: DELETE triggers do not activate when using TRUNCATE, so no additional logic is executed during deletion (since data pages are deallocated, there is no opportunity for triggers to run).
DELETE:
- Trigger activated: DELETE triggers are activated, allowing additional logic to execute automatically when data is deleted.
5. Permissions
TRUNCATE:
- Requires higher permissions: Typically requires ALTER TABLE permissions.
DELETE:
- Requires lower permissions: Only DELETE permissions are needed.
■ Usage of TRUNCATE and DELETE
BigQuery is a comprehensive data warehouse, and instead of the TRUNCATE command, you can use the DELETE command to remove all data from a table. BigQuery is optimized for handling large-scale data, making it efficient even when deleting entire datasets. In MSSQL, both TRUNCATE and DELETE commands are supported, allowing you to utilize their respective features.
-- Delete all rows in BigQuery DELETE FROM dataset.table_name WHERE TRUE; -- Delete rows with specific conditions in BigQuery DELETE FROM dataset.table_name WHERE condition; -- Example of using TRUNCATE in MSSQL TRUNCATE TABLE dbo.table_name; -- Example of using DELETE in MSSQL DELETE FROM dbo.table_name WHERE condition;
■ Real Business Cases
1. Data Initialization and Removal of Temporary Data (Using TRUNCATE)
In data warehouses, when using tables that periodically store temporary data, there is a need to initialize the data to free up storage space. In such cases, the TRUNCATE command can be used to quickly delete all data.
-- Initialize temporary data table TRUNCATE TABLE dbo.temp_table;
2. Deleting Data Based on Specific Conditions (Using DELETE)
In user management systems, the DELETE command is used to remove inactive user accounts or log data that is older than a certain period. This allows for the selective deletion of only the necessary data.
-- Delete inactive users older than one year DELETE FROM dbo.users WHERE last_active < DATEADD(year, -1, GETDATE());
3. Preparing for Data Migration (Using TRUNCATE)
During the process of migrating data to the cloud, after successfully replicating existing SQL Server data to GCP, TRUNCATE can be used to clean up the data in SQL Server. This helps in quickly deallocating data pages and recovering storage space.
-- Initialize table after data migration TRUNCATE TABLE dbo.prod_tracking_users;
4. Log Data Management (Using DELETE)
When managing log data for web applications, the DELETE command can be used to remove only old logs that meet specific conditions, thereby controlling the size of the database.
-- Delete logs older than 30 days DELETE FROM dbo.event_logs WHERE event_date < DATEADD(day, -30, GETDATE());
■ Conclusion
TRUNCATE and DELETE are important commands used for deleting data in databases. TRUNCATE is useful for quickly deleting all data and resetting the auto-increment counter, while DELETE is suitable for selectively deleting specific data that meets certain conditions. By appropriately utilizing each command in BigQuery and MSSQL, you can enhance the efficiency of data management. It is hoped that this article helps in effectively managing databases according to business requirements.