Schema changes in a high-volume PostgreSQL database can be critical and challenging. One misstep could lead to an outage, making it essential to handle these changes carefully. Traditional methods require precise execution and can cause significant cognitive load for product engineers. This is where pg_osc (PostgreSQL Online Schema Change) comes into play. Inspired by pt-online-schema-change for MySQL and pg_repack, pg_osc offers a streamlined and safer way to perform schema changes without substantial downtime.
Understanding pg_osc
pg_osc facilitates schema changes by using shadow tables. The process involves creating a shadow table that mirrors the primary table, performing the schema change on this shadow table, copying data from the primary table to the shadow table, and finally swapping the table names. This method ensures minimal locking and continuous operation during schema changes.
By utilizing pg_osc, you can ensure that your PostgreSQL schema changes are performed safely, efficiently, and with minimal disruption to your database operations. Whether you’re adding new columns, renaming existing ones, or performing multiple alterations, pg_osc provides a robust solution that enhances the reliability and performance of your PostgreSQL database management tasks.
Key Features
- Minimal Locks: pg_osc acquires minimal locks throughout the process, reducing the risk of database downtime.
- Index and Foreign Key Management: Copies over indexes and foreign keys from the primary table to the shadow table.
- Optional Old Table Management: Optionally drop or retain old tables after the schema change.
- Backend Management: Can terminate competing backends to facilitate quick setup and smooth operations.
- Custom Copy Statement: Allows custom SQL for copying data, enabling additional operations like backfilling during the migration.
- Data Consistency: Maintains data consistency by ensuring that all changes made during the schema modification are accurately reflected in the new structure.
- Safety and Reliability: Ensures the safety and consistency of schema changes, reducing the risk of data corruption or loss during the operation.
Benefits of pg_osc
- Minimize Downtime: Traditional schema changes in PostgreSQL can require exclusive locks on the table being modified, leading to database downtime. pg_osc allows these changes with minimal interruption.
- Avoid Blocking: Concurrent transactions can continue to read and write data to the affected table without being blocked, improving overall database performance.
- Performance Improvements: Designed to minimize the impact on database performance during schema changes, pg_osc optimizes the change process.
- Safety and Reliability: Engineered to ensure the safety and consistency of schema changes, reducing the risk of data corruption or loss.
- Flexibility: Offers the option to drop or retain old tables after schema changes, providing flexibility in managing historical data.
Installation
Installing pg_osc is straightforward, especially with Docker. Use the following command to fetch the latest version:
How pg_osc Works
The process involves several key steps to ensure safe and reliable schema changes:
- Create an Audit Table: Log changes using triggers for inserts, updates, and deletes on the primary table.
- Acquire Lock: Obtain an ACCESS EXCLUSIVE lock on the primary table.
- Develop Shadow Table: Create a new shadow table with the desired schema changes.
- Migrate Data: Copy all rows from the primary table to the shadow table.
- Build Indexes: Enhance query performance by building indexes on the shadow table.
- Replay Changes: Ensure consistency by replaying changes from the audit table to the shadow table.
- Clean Up: Delete rows from the audit table that have been replayed.
- Finalize Transition: Acquire an ACCESS EXCLUSIVE lock again when delta changes are minimal.
- Activate New Structure: Swap table names between the shadow table and the primary table.
- Safeguard Data Integrity: Update references in other tables with NOT VALID constraints.
- Optimize Performance: Analyze the new table.
- Validate Foreign Keys: Ensure data integrity by validating NOT VALID foreign keys.
- Optionally Drop Old Table: Drop the old table if no longer needed.
Examples
Adding a Column
To add a column member
to the team_member
table:
Renaming a Column
To rename a column from birthdate to new_birthdate:
Multiple Alter Statements
To add, rename, and drop columns in a single operation:
Advanced Features
pg_osc also offers advanced features like data consistency during column changes, minimal locks, and custom copy statements for more control during data migration. It ensures that indexes and foreign keys are handled correctly, providing flexibility in managing historical data.
Limitations
While pg_osc is powerful, it does have some limitations:
- Partitioned Tables: Does not currently support partitioned tables.
- Primary Key Requirement: Requires a primary key for schema changes.
- Lock Acquisition: Acquires an ACCESS EXCLUSIVE lock twice during the operation.
- Disk Space: Adequate disk space is needed due to table duplication.
- Index and Constraint Names: Indexes, constraints, and sequence names are altered during the change.
Future Enhancements
- Reversible Changes: Support for reversing changes post table name swap without data loss.
- Refactoring and Testing: Further refactoring and more extensive testing.
- Event Triggers: Exploring actions on specific events by plugging into the trigger system.
- Docker Image: Introducing a Docker image to simplify running on containers.
- Logging Improvements: Cleaning up logging for better readability and debugging.
- Prepared Statements: Migrating to prepared statements for enhanced performance and security.
Conclusion
pg_osc revolutionizes schema changes in PostgreSQL by minimizing downtime, avoiding blocking, and ensuring data consistency. Its seamless process and advanced features make it an indispensable tool for efficient database management. Consider using pg_osc for your PostgreSQL schema changes to ensure a smooth and efficient database evolution.
For more information or assistance, reach out to our Managed PostgreSQL Services and Consulting expertise. Contact us to explore how pg_osc can optimize your PostgreSQL database management.