## Fundamentals and Storage

I have spent more than ten years as a backend developer optimizing MySQL databases, and I have watched many applications run into performance problems because their developers misunderstood primary keys. In this article, I explain both the fundamentals and the internals of MySQL primary keys that determine application performance.

### What Are Primary Keys, Really?

The simple definition: a primary key is one or more columns that uniquely identify each row in a table. But a primary key is much more than that. In MySQL, a primary key is:

- A unique identifier for each record
- Automatically indexed
- Unable to contain NULL values
- Limited to one per table
- The foundation of your table's physical storage structure (this is crucial)

Creating a primary key does more than constrain your data: it determines how MySQL stores and retrieves data on disk.

### The Critical Role of Primary Keys in MySQL

InnoDB, the default storage engine since MySQL 5.5, uses the primary key for both logical and physical data storage and access. This concept is called "clustered indexing", and it is fundamental to understanding MySQL. Within InnoDB, your table data is physically ordered on disk by primary key values. The performance implications of this arrangement are significant.
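Clustered ordering can be sketched with a toy in-memory model (illustrative only; real InnoDB uses B+Tree pages, not Python lists): rows are kept physically sorted by primary key, so a range query reads one contiguous run.

```python
import bisect

class ClusteredTable:
    """Toy sketch of a clustered index: rows stay physically sorted by PK."""

    def __init__(self):
        self.keys = []   # primary key values, kept sorted
        self.rows = []   # row data, stored in the same physical order

    def insert(self, pk, row):
        i = bisect.bisect_left(self.keys, pk)  # find the physical position
        self.keys.insert(i, pk)
        self.rows.insert(i, row)

    def range_scan(self, lo, hi):
        # A PK range query is one contiguous slice -- no scattered lookups.
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        return self.rows[i:j]

t = ClusteredTable()
for pk in [30, 10, 20, 40]:
    t.insert(pk, f"customer-{pk}")
print(t.range_scan(10, 30))   # ['customer-10', 'customer-20', 'customer-30']
```

Note how the rows ended up adjacent in key order regardless of insertion order; that adjacency is what makes primary key range scans cheap.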
Consider this example:

```sql
CREATE TABLE customers (
    customer_id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(100) NOT NULL,
    PRIMARY KEY (customer_id)
);
```

When rows are inserted into this table, MySQL arranges them on disk in `customer_id` order. This means:

- Records with adjacent primary key values are stored physically near each other on disk.
- Range queries on the primary key are highly efficient.
- Non-primary-key indexes store primary key values instead of row pointers.

### How InnoDB Stores Primary Keys Under the Hood

Let's dive deeper into how MySQL actually stores data.

#### B+Tree Storage Structure

Within InnoDB, all indexes, including the primary key, use a B+Tree data structure. A B+Tree is a balanced tree that supports fast searching, sequential access, and efficient insertions and deletions. For a primary key index:

- Internal nodes store only key values.
- Leaf nodes hold the key values together with the actual row data.
- Leaf nodes are linked in a doubly-linked list, which enables efficient range scans.

(Note: this is a conceptual description; actual implementation details may vary.)
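Some rough arithmetic shows why B+Tree lookups stay cheap even for large tables. The fanout value below is purely illustrative, and the row-capacity estimate ignores per-row and per-page header overhead:

```python
import math

PAGE_SIZE = 16 * 1024          # InnoDB's default page size: 16 KB

def rows_per_page(row_bytes):
    # Rough upper bound; real pages lose some space to headers and pointers.
    return PAGE_SIZE // row_bytes

print(rows_per_page(4096))     # 4   -- large rows: few per page
print(rows_per_page(100))      # 163 -- small rows: many per page

def btree_depth(n_rows, fanout):
    # With a branching factor `fanout`, reaching a leaf takes about
    # log_fanout(n_rows) page reads.
    return math.ceil(math.log(n_rows, fanout))

print(btree_depth(10_000_000, 500))   # 3 -- a handful of reads for 10M rows
```

This is the concrete meaning of "O(log n) disk reads": even ten million rows are reachable in roughly three page reads when each internal node fans out to hundreds of children.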
A primary key lookup lets MySQL locate a row with minimal disk reads, typically O(log n) in the number of rows.

#### Page Structure

InnoDB stores data in pages, which are 16 KB by default. Each page can hold multiple rows, depending on row size. Defining a primary key leads MySQL to:

- Place rows with nearby primary key values in the same page.
- Store data pages following the sequence of primary key values.

This clustering makes range queries highly efficient. Understanding page-based storage is crucial for optimizing large tables: if your rows are 4 KB each, you get about 4 rows per page; if they are only 100 bytes, around 160 rows fit per page, dramatically reducing the number of disk I/O operations needed.

### Hidden Primary Keys: What Happens When You Don't Specify One

Many people mistakenly believe primary keys are optional in MySQL. You can create a table without explicitly defining one, but InnoDB will then create a hidden 6-byte "Row ID" to serve as the clustered key. Consider this table:

```sql
CREATE TABLE notes (
    content TEXT,
    created_at TIMESTAMP
);
```

InnoDB automatically generates an internal hidden primary key because no primary key is defined. This approach has major disadvantages:

- The hidden key cannot be accessed directly by your application.
- The internal key has no meaning related to your data.
- The physical data arrangement is outside your control.

Building tables without primary keys is a common mistake I have seen many times. Such tables function properly at first, but they eventually cause problems as they grow. All tables need an explicitly defined primary key.

### Auto-Increment Primary Keys: Benefits and Hidden Costs

Auto-increment primary keys appear in almost every MySQL database:

```sql
CREATE TABLE orders (
    order_id INT NOT NULL AUTO_INCREMENT,
    customer_id INT NOT NULL,
    order_date DATETIME NOT NULL,
    PRIMARY KEY (order_id)
);
```

The advantages are clear:

- Unique values are generated automatically.
- Sequential values mean new rows are appended at the end of the table, which performs well.
- The keys are small, so they occupy little space in secondary indexes.
- Random-insert fragmentation is avoided.

But these keys have several hidden implications that need attention.

#### Insertion Bottlenecks in High-Write Systems

Auto-increment primary keys can create bottlenecks in highly concurrent systems: the "end" of the table becomes a point of contention because every new row is inserted there.
Before MySQL 8.0, MySQL could hold a table-level auto-increment lock for the duration of a statement, which could cause lock contention. MySQL 8.0's auto-increment lock mode is more efficient, releasing the lock as soon as the value is generated, yet it does not eliminate all contention risk.

#### The "Hot Spot" Problem

Because all new insertions land at the end of the index, that area sees the most I/O. Under many concurrent writes, page lock contention and reduced performance become a problem.

#### UUID Alternatives and Their Trade-offs

Some developers choose UUIDs instead:

```sql
CREATE TABLE sessions (
    session_id CHAR(36) NOT NULL,
    user_id INT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (session_id)
);
```

UUIDs spread writes across the index, which reduces contention, but the random insertion points cause index fragmentation and degrade range scan performance. Secondary indexes also require more space because UUIDs are larger.

## Secondary Indexes and Optimization Strategies

We have explored how InnoDB physically organizes data around the primary key. Now let's look at how your primary key choice affects secondary indexes, query performance, and overall database efficiency.

### The Crucial Relationship Between Primary Keys and Secondary Indexes

Your primary key design influences the performance of every secondary index in your MySQL table. Database performance relies fundamentally on this relationship.

### How Secondary Indexes Work in InnoDB

InnoDB stores secondary indexes with a different structure from the primary key index:
- At the leaf level of the primary key index, you find the actual row data.
- Secondary indexes store the indexed columns together with the associated primary key values.
- Retrieving a full row through a secondary index therefore takes two lookups: first a search in the secondary index, then a full row lookup in the primary key index using the primary key value found in the first step.

This two-step process is called a "bookmark lookup", and it is an essential concept in performance optimization.
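The bookmark lookup can be sketched with two toy dictionaries standing in for the two B+Trees (the `sku`/`product_id` names are illustrative):

```python
# Clustered (primary key) index: primary key -> full row data.
primary = {
    1: {"sku": "ABC123", "name": "Widget", "price": 9.99},
    2: {"sku": "XYZ789", "name": "Gadget", "price": 19.99},
}

# Secondary index: indexed column -> primary key (NOT a direct row pointer).
idx_sku = {"ABC123": 1, "XYZ789": 2}

def find_by_sku(sku):
    pk = idx_sku[sku]       # step 1: search the secondary index
    return primary[pk]      # step 2: "bookmark" lookup via the primary key

print(find_by_sku("ABC123")["name"])   # Widget
```

The key point the sketch captures: the secondary index never stores the row itself, only the primary key, so every secondary-index read pays for a second traversal of the primary key index.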
Consider this example:

```sql
CREATE TABLE products (
    product_id INT NOT NULL AUTO_INCREMENT,
    sku VARCHAR(50) NOT NULL,
    name VARCHAR(100) NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    PRIMARY KEY (product_id),
    INDEX idx_sku (sku)
);
```

When you query by SKU:

```sql
SELECT * FROM products WHERE sku = 'ABC123';
```

MySQL:

1. Searches `idx_sku` for 'ABC123'.
2. Retrieves the `product_id` value stored in the index entry.
3. Uses `product_id` to locate the entire row in the primary key index.

This is why your primary key choice affects every secondary index:

- The size of your primary key directly affects the size of every secondary index.
- The more secondary indexes you have, the more your primary key choice multiplies storage costs.
- Every secondary index must contain the primary key data.

### The Hidden Cost of Large Primary Keys

Let's quantify this with an example. Imagine these two primary key choices:

```sql
-- Option 1: INT primary key (4 bytes)
CREATE TABLE orders (
    order_id INT NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (order_id),
    ...
);

-- Option 2: UUID primary key (16 bytes)
CREATE TABLE orders (
    order_id BINARY(16) NOT NULL,
    PRIMARY KEY (order_id),
    ...
);
```

With 5 secondary indexes and 10 million rows:

- The INT key contributes 4 bytes per index entry: 4 bytes × 10 million rows × 5 indexes ≈ 200 MB of primary key data in secondary indexes. The BINARY(16) key contributes about 800 MB, roughly 600 MB of extra index storage.
- Every secondary index lookup must also process 16-byte primary keys instead of 4-byte ones.

In high-performance systems, choosing integer primary keys over UUIDs can significantly shrink the database and speed up read operations.

### Composite Primary Keys: When and How to Use Them

A composite primary key uses multiple columns together to make each row unique. These keys are powerful but need careful consideration.

#### Anatomy of a Composite Primary Key

```sql
CREATE TABLE order_items (
    order_id INT NOT NULL,
    line_number INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    PRIMARY KEY (order_id, line_number)
);
```

Here the combination of `order_id` and `line_number` serves as the primary key. This has several implications:

- Rows are physically sorted first by `order_id`, then by `line_number`.
- All items in a single order are stored together physically.
- Range queries on `order_id` are extremely fast.
- The primary key is inefficient for queries that filter only on `line_number`.

#### Natural Ordering Benefits

Properly designed composite primary keys offer built-in ordering benefits. In time-series data, for example:

```sql
CREATE TABLE temperature_readings (
    sensor_id INT NOT NULL,
    timestamp DATETIME NOT NULL,
    temperature DECIMAL(5, 2) NOT NULL,
    PRIMARY KEY (sensor_id, timestamp)
);
```

This structure means:

- All readings from a single sensor are stored together.
- Within each sensor, readings are physically sorted by time.
- Retrieving time-based data for a specific sensor is highly efficient.

#### Column Order Matters

The order of columns in a composite primary key is vital. InnoDB follows the "leftmost prefix rule" for efficient lookups:

- Queries filtering on `sensor_id` can use the index efficiently.
- Queries filtering on both `sensor_id` and `timestamp` are highly efficient.
- Queries filtering only on `timestamp` cannot use the primary key efficiently.
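A minimal sketch of the leftmost prefix rule follows. This is a simplification: a real optimizer also accounts for range conditions, index statistics, and covering indexes.

```python
def usable_prefix(index_cols, filter_cols):
    """Return how many leading index columns the query's filters can use."""
    n = 0
    for col in index_cols:
        if col in filter_cols:
            n += 1
        else:
            break          # a gap in the prefix stops index usage
    return n

pk = ("sensor_id", "timestamp")
print(usable_prefix(pk, {"sensor_id"}))               # 1 -> efficient
print(usable_prefix(pk, {"sensor_id", "timestamp"}))  # 2 -> highly efficient
print(usable_prefix(pk, {"timestamp"}))               # 0 -> index not usable
```

The zero in the last case is the whole story: `timestamp` alone never matches a leftmost prefix of `(sensor_id, timestamp)`, so the clustered index cannot narrow the search.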
In one system, reorganizing the primary key column order cut query times from seconds to milliseconds without modifying any other component.

### Surrogate vs. Natural Keys: Making the Right Choice

A fundamental design choice in database development is between surrogate keys and natural keys:

- **Surrogate keys** are artificial values, such as auto-increment IDs, that exist solely to identify records.
- **Natural keys** are inherent attributes of the data that can uniquely identify records.
#### The Case for Surrogate Keys

```sql
CREATE TABLE countries (
    country_id INT NOT NULL AUTO_INCREMENT,
    country_code CHAR(2) NOT NULL,
    name VARCHAR(100) NOT NULL,
    PRIMARY KEY (country_id),
    UNIQUE INDEX (country_code)
);
```

Advantages:

- Compact (usually 4-8 bytes)
- Never needs to change
- No business meaning that could force a change later
- Consistent format across all tables
- Typically sequential, minimizing fragmentation

#### The Case for Natural Keys

```sql
CREATE TABLE countries (
    country_code CHAR(2) NOT NULL,
    name VARCHAR(100) NOT NULL,
    PRIMARY KEY (country_code)
);
```

Advantages:

- No additional storage overhead
- Enforces data integrity directly through the schema
- More meaningful in query results

#### When Each Makes Sense

I established these guidelines after years of optimizing databases.

Use surrogate keys when:

- Natural key candidates might change (email addresses, phone numbers).
- No suitable natural candidate exists (large text fields, for example).
- You need to maintain relationships even if identifying attributes change.
- Tables are frequently joined to many others.

Consider natural keys when:

- Natural identifiers are permanent, such as ISO codes or certain scientific IDs.
- Storage efficiency is paramount.
- The natural key is frequently used in queries.
- The data has an established industry-standard identifier (ISBN for books, for example).

### Real-World Primary Key Optimization Strategies

Across many MySQL optimization projects, I have settled on these practical methods for different situations.

#### High-Write Workloads: Reducing Contention

Standard auto-increment keys become performance bottlenecks in systems that process many insertions. Options to consider:

**1. Sharded Auto-Increment Keys**

Give each application server its own offset and a shared increment, so the servers generate disjoint ID sequences:

```sql
-- Server 1 generates: 1, 3, 5, ...
-- Server 2 generates: 2, 4, 6, ...
SET SESSION auto_increment_increment = 2;
SET SESSION auto_increment_offset = 1;  -- use 2 on server 2
```

This reduces contention while maintaining most of the benefits of sequential IDs.
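A standard way to realize sharded auto-increment is MySQL's `auto_increment_increment` and `auto_increment_offset` settings; the resulting interleaving can be sketched as:

```python
def id_stream(offset, increment, count):
    """IDs a server generates under auto_increment_offset/_increment."""
    return [offset + i * increment for i in range(count)]

server1 = id_stream(1, 2, 5)   # [1, 3, 5, 7, 9]
server2 = id_stream(2, 2, 5)   # [2, 4, 6, 8, 10]
print(server1, server2)

# The sequences interleave but never collide.
assert not set(server1) & set(server2)
```

Each server's stream is still monotonically increasing, so inserts remain append-friendly within that server's pages while the global hot spot is split across shards.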
**2. Time-Based Sorting with Composite Keys**

```sql
CREATE TABLE events (
    day_partition DATE NOT NULL,
    event_id BIGINT NOT NULL AUTO_INCREMENT,
    event_data JSON,
    PRIMARY KEY (day_partition, event_id),
    UNIQUE KEY (event_id)
);
```

This approach:

- Distributes storage across multiple physical areas according to the date value.
- Maintains chronological ordering.
- Reduces contention at the index's active end.

#### Read-Heavy Analytics: Optimizing for Query Patterns

In data warehousing and analytics, a well-designed composite key leads to significant performance improvements:

```sql
CREATE TABLE user_actions (
    user_id INT NOT NULL,
    action_time DATETIME NOT NULL,
    action_type TINYINT NOT NULL,
    action_data JSON,
    PRIMARY KEY (user_id, action_time)
);
```

All of a user's actions are stored together and presorted by time, so user history queries are extremely efficient.
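This clustering effect can be sketched with Python tuples, whose ordering mirrors how a composite key would physically sort the rows (toy data, not a real storage engine):

```python
import bisect

# Physical row order under PRIMARY KEY (user_id, action_time):
# tuples sort by user first, then by time.
rows = sorted([
    (7, "2024-01-03", "login"),
    (3, "2024-01-01", "login"),
    (7, "2024-01-01", "purchase"),
    (3, "2024-01-02", "logout"),
])

def user_history(user_id):
    # One user's history is a contiguous, already time-ordered slice.
    lo = bisect.bisect_left(rows, (user_id,))
    hi = bisect.bisect_left(rows, (user_id + 1,))
    return rows[lo:hi]

print(user_history(7))
# [(7, '2024-01-01', 'purchase'), (7, '2024-01-03', 'login')]
```

Because the history comes back as a single contiguous slice, no per-row lookups and no sort step are needed, which is exactly the benefit the composite key buys.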
#### Time-Series Data: Special Considerations

For IoT sensors, monitoring systems, and other time-series workloads:

```sql
CREATE TABLE metrics (
    metric_type TINYINT NOT NULL,
    device_id INT NOT NULL,
    timestamp DATETIME(3) NOT NULL,
    value FLOAT NOT NULL,
    PRIMARY KEY (metric_type, device_id, timestamp)
);
```

This design:

- Groups related metrics together physically.
- Arranges data chronologically within each group.
- Provides fast access to a given device's most recent data.

### Common Primary Key Anti-Patterns to Avoid

Over the years I have seen several primary key anti-patterns that consistently cause performance issues.

#### 1. Random UUID Primary Keys Without Proper Indexing

Random UUIDs cause:

- Fragmentation from random write locations
- Poor range scan performance

If you must use UUIDs, prefer ordered variants such as ULID or UUID v6/v7, which preserve time-based ordering.

#### 2. Multi-Column Natural Keys That Change

One system I encountered used (`client_code`, `year`, `document_number`) as the primary key for invoices. A client code change triggered cascading updates that modified millions of rows across multiple tables.

#### 3. Using VARCHAR Primary Keys When a Numeric Type Would Suffice

Comparing strings is slower than comparing numeric values.
Converting VARCHAR customer codes to INT surrogate keys can dramatically improve join performance.

#### 4. No Primary Key at All

Every table needs an explicit primary key. Left to its own devices, InnoDB generates an invisible one, which:

- Prevents you from referencing it in foreign keys
- Makes the physical storage order unpredictable
- Makes secondary indexes less efficient

## Advanced Strategies and Refactoring

### Primary Keys and Table Partitioning: Critical Considerations

Table partitioning divides big tables into smaller, logically connected sections. Your primary key strategy must match your partitioning strategy exactly.

#### The Partitioning Key Constraint

MySQL requires **all unique keys, including primary keys, to contain the partitioning key as one of their components**. This is because unique constraints must be verifiable within a single partition.
```sql
-- This works because order_date is part of the primary key
CREATE TABLE orders (
    order_id INT NOT NULL,
    order_date DATE NOT NULL,
    customer_id INT NOT NULL,
    amount DECIMAL(10, 2) NOT NULL,
    PRIMARY KEY (order_date, order_id)
) PARTITION BY RANGE (TO_DAYS(order_date)) (
    PARTITION p_2022_q1 VALUES LESS THAN (TO_DAYS('2022-04-01')),
    PARTITION p_2022_q2 VALUES LESS THAN (TO_DAYS('2022-07-01')),
    PARTITION p_2022_q3 VALUES LESS THAN (TO_DAYS('2022-10-01')),
    PARTITION p_2022_q4 VALUES LESS THAN (TO_DAYS('2023-01-01'))
);
```

In this design:

- `order_date` is part of the primary key and appears in the partitioning expression.
- MySQL can direct queries to particular partitions when filtering by date.
- Each partition keeps its ordering through the composite key.

#### Optimizing Partition Pruning

During query execution, MySQL uses partition pruning to skip partitions it does not need. A good primary key design supports efficient pruning.
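Pruning against the quarterly partitions defined above can be sketched as an interval-overlap check (a simplification of what MySQL's planner actually does):

```python
from datetime import date

# (name, exclusive upper bound), mirroring the VALUES LESS THAN clauses above.
partitions = [
    ("p_2022_q1", date(2022, 4, 1)),
    ("p_2022_q2", date(2022, 7, 1)),
    ("p_2022_q3", date(2022, 10, 1)),
    ("p_2022_q4", date(2023, 1, 1)),
]

def prune(start, end):
    """Return the partitions whose date range overlaps [start, end]."""
    hit, lower = [], date.min
    for name, upper in partitions:
        if start < upper and end >= lower:   # the intervals overlap
            hit.append(name)
        lower = upper
    return hit

print(prune(date(2022, 4, 1), date(2022, 6, 30)))   # ['p_2022_q2']
```

A filter on `order_date` narrows the scan to overlapping partitions; a filter on `order_id` alone gives the planner no interval at all, so every partition must be scanned.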
```sql
-- This query can use partition pruning
SELECT * FROM orders
WHERE order_date BETWEEN '2022-04-01' AND '2022-06-30'
  AND customer_id = 1001;

-- This query cannot use partition pruning
SELECT * FROM orders WHERE order_id = 5000;
```

The first query eliminates three partitions from consideration and searches only `p_2022_q2`. The second must scan all partitions because `order_id` alone cannot identify where the data lives. Throughout my career, I have seen partition designs that align the partitioning scheme with the primary key and query patterns improve large analytical queries from minutes to sub-second.

### Refactoring Problematic Primary Key Designs

Changing a primary key in production is one of the most difficult tasks in database management. Here are battle-tested approaches.

#### The Shadow Table Method

This method minimizes downtime while changing a primary key structure:
1. Create a new table with the desired primary key structure:

```sql
CREATE TABLE products_new (
    product_id INT NOT NULL AUTO_INCREMENT,  -- New surrogate key
    sku VARCHAR(50) NOT NULL,                -- Old primary key
    name VARCHAR(255) NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    PRIMARY KEY (product_id),
    UNIQUE KEY (sku)
);
```

2. Populate the new table with existing data:

```sql
INSERT INTO products_new (sku, name, price)
SELECT sku, name, price FROM products;
```

3. Create triggers to keep the tables synchronized during the transition:

```sql
DELIMITER //

CREATE TRIGGER products_after_insert AFTER INSERT ON products
FOR EACH ROW
BEGIN
    INSERT INTO products_new (sku, name, price)
    VALUES (NEW.sku, NEW.name, NEW.price);
END //

CREATE TRIGGER products_after_update AFTER UPDATE ON products
FOR EACH ROW
BEGIN
    UPDATE products_new
    SET name = NEW.name, price = NEW.price
    WHERE sku = NEW.sku;
END //

CREATE TRIGGER products_after_delete AFTER DELETE ON products
FOR EACH ROW
BEGIN
    DELETE FROM products_new WHERE sku = OLD.sku;
END //

DELIMITER ;
```
4. Update application code to use the new table structure.

5. Once validation is complete, rename the tables and drop the triggers:

```sql
RENAME TABLE products TO products_old, products_new TO products;

DROP TRIGGER products_after_insert;
DROP TRIGGER products_after_update;
DROP TRIGGER products_after_delete;
```

6. Update dependent foreign keys and adjust application code as needed.

This approach enables primary key changes on active production systems without major disruption.

## Conclusion: Principles for Primary Key Design

Having covered MySQL primary keys from basics to advanced techniques, these fundamental design principles will guide your decisions:

- Start from your query patterns: the most common data access paths should drive the primary key design. (This is true for all indexes.)
- Design for current requirements while leaving room for future business growth.
- Decide whether write or read performance matters more in your application, and let that priority shape the key design.
- Remember that your primary key determines InnoDB's physical data organization.
- Test with realistic data volumes: primary key performance characteristics change fundamentally at scale.
The time I have spent designing primary keys carefully has paid off substantially as systems have grown. A primary key is not just a database constraint: it is the foundation that determines performance, scalability, and maintainability. I hope the principles and techniques in this article help you create primary keys that work well today and accommodate tomorrow's requirements.