Database Notes

Prepared Statement
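- Better performance: the database parses, compiles, and optimizes the statement once, then reuses the stored query plan on later executions
- Prevents SQL injection attacks, because parameter values are sent separately from the SQL text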
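
As a minimal sketch of the injection point, the following uses Python's built-in sqlite3 module and a hypothetical users table (both are assumptions; the notes do not name a driver or schema). The ? placeholder keeps the input as data rather than as SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious = "alice' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so its quote characters
# change the structure of the WHERE clause and every row matches.
unsafe_rows = conn.execute(
    "SELECT name FROM users WHERE name = '" + malicious + "'"
).fetchall()

# Safe: the statement is compiled with a placeholder and the value is bound
# afterwards, so it is only ever compared as a plain string.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every row, while the parameterized one returns none, since no user is literally named "alice' OR '1'='1".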

Normal Forms
1NF - all columns contain atomic values only
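2NF - every non-key attribute is fully dependent on the whole primary key, not on just part of a composite key
3NF - every non-key attribute is non-transitively dependent on the primary key, that is, it does not depend on another non-key attribute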

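Primary Key vs Unique Key
- A unique key column allows NULL: SQL Server permits only one NULL value, while MySQL permits several
- Primary key creates a clustered index (by default in SQL Server, always in MySQL InnoDB)
- Unique key creates a nonclustered index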

Right Join - Return all rows from the right table, even if there are no matches in the left table
Left Join  - Return all rows from the left table, even if there are no matches in the right table
Full Join  - Return all rows from both tables, with NULLs filling in the side that has no match
Self-join  - Used to join a table to itself. Aliases must be used to tell the two copies of the table apart.
Cross Join - Return every combination of rows, where each row from the first table is combined with each row from the second table.
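
The join types can be tried out with a small sketch like the one below. It assumes Python's sqlite3 module and two hypothetical tables, dept and emp, which are not part of the original notes. RIGHT and FULL joins are left out because older SQLite versions do not support them; a right join is the same as a left join with the table roles swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp  (id INTEGER PRIMARY KEY, name TEXT,
                   dept_id INTEGER, manager_id INTEGER);
INSERT INTO dept VALUES (1, 'Eng'), (2, 'Ops');
INSERT INTO emp  VALUES (1, 'Ann', 1, NULL), (2, 'Bob', 1, 1), (3, 'Cy', NULL, 1);
""")

# Left join: every employee appears, even Cy, who has no department.
left_rows = conn.execute("""
    SELECT e.name, d.name FROM emp e
    LEFT JOIN dept d ON d.id = e.dept_id
    ORDER BY e.id
""").fetchall()

# Self-join: aliases e and m distinguish the two roles of the same table.
self_rows = conn.execute("""
    SELECT e.name, m.name FROM emp e
    JOIN emp m ON m.id = e.manager_id
    ORDER BY e.id
""").fetchall()

# Cross join: 3 employees x 2 departments = 6 combinations.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM emp CROSS JOIN dept"
).fetchone()[0]

print(left_rows, self_rows, cross_count)
```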

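Clustered vs Nonclustered Index
- There can be only one clustered index per table
- A clustered index defines how records are physically stored
- A clustered index is usually faster for lookups and range scans on its key, because the index holds the row data itself; a nonclustered index needs an extra step to reach the row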

Creating and Using Indexes
The primary optimization technique is to create good indexes and use them effectively in your SQL statements.
Good indexes matter for all SQL operations: not only queries, but also updates and deletes, which must first find the rows to change.
Principles for creating indexes
1. If a column is often used in comparisons, consider indexing it.
2. Avoid over-indexing. If you never refer to a column in comparisons, don't index it. Every index has overhead: each table update must also update the table's indexes, so unnecessary indexes slow down writes.
3. If a column has only a few distinct values, don't index it; for example, a gender column.
4. Declare an indexed column NOT NULL if possible. An index without NULL values can be processed more simply and thus faster.
Indexing Column Prefixes
In some cases, it's sufficient to index just a prefix of the column values rather than the complete values; this can improve performance, as short index values can be processed more quickly than long ones.
CREATE TABLE t(name CHAR(255),INDEX (name(15)));
Leftmost Index Prefixes
A leftmost prefix of a composite index consists of one or more of the initial columns of the index. For example, an index on (a, b, c) can serve lookups on a, on (a, b), and on (a, b, c), but not on b alone. MySQL's ability to use leftmost index prefixes lets you avoid creating unnecessary indexes.
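
The leftmost-prefix rule can be observed with a sketch like the following. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server), and the people table, the idx_name index, and the plan helper are hypothetical names. SQLite's EXPLAIN QUERY PLAN plays the role that EXPLAIN plays in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT, first_name TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_name ON people (last_name, first_name)")

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)

# last_name is the leftmost column, so the composite index can be searched.
plan_leftmost = plan("SELECT * FROM people WHERE last_name = 'Smith'")

# first_name alone is not a leftmost prefix, so the planner scans the table.
plan_second = plan("SELECT * FROM people WHERE first_name = 'Ann'")

print(plan_leftmost)
print(plan_second)
```

The exact wording of the plan text varies between SQLite versions, so the point to check is only whether idx_name is mentioned.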
SQL optimization
1. Optimizing Queries by Limiting Output
A simple but effective technique is to reduce the amount of output a query produces.
Select only columns and rows you need, don’t use SELECT * if possible.
Use the TOP or LIMIT keyword, or the SET ROWCOUNT statement in SQL Server, to select only the first n rows.
2. Use views and stored procedures instead of heavy-duty queries.
3. Using Indexes Effectively
If columns are frequently used in SQL comparisons, consider indexing them.
Don't refer to an indexed column within an expression that must be evaluated for every row in the table.
SELECT * FROM t WHERE YEAR(d) >= 1994; -- inefficient: YEAR(d) is computed for every row
SELECT * FROM t WHERE d >= '1994-01-01'; -- can use an index on d
When comparing an indexed column to a value, use a value that has the same data type as the column.
For a string value, MySQL must perform a string-to-number conversion, which might cause an index on the id column not to be used.
WHERE id = '18' -- id is an INT column, inefficient
WHERE id = 18
Use prefix pattern matching with the LIKE operator if possible. A pattern that starts with a literal prefix can use an index; a pattern with a leading wildcard, such as '%de', cannot.
WHERE name LIKE 'de%'
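
The effect of wrapping an indexed column in an expression can be checked with a sketch like this one. It again assumes SQLite via Python's sqlite3 module rather than MySQL, so strftime('%Y', d) stands in for YEAR(d); the idx_d index and the plan helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, d TEXT, note TEXT)")
conn.execute("CREATE INDEX idx_d ON t (d)")

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# The column is hidden inside a function call, so the index on d cannot be
# searched and every row has to be examined.
plan_expr = plan("SELECT * FROM t WHERE strftime('%Y', d) >= '1994'")

# The bare column is compared to a constant, so the index on d is usable.
plan_bare = plan("SELECT * FROM t WHERE d >= '1994-01-01'")

print(plan_expr)
print(plan_bare)
```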
4. Optimizing Updates
Optimization also applies to update operations.
Use multiple-row INSERT statements instead of multiple single-row INSERT statements.
INSERT INTO t (id, name) VALUES(1,'Bea'),(2,'Belle'),(3,'Bernice');
Group multiple update commands within a transaction rather than executing them with auto-commit mode enabled.
Using a transaction allows the server to flush the changes to disk once at commit time instead of after each statement.
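
Both techniques can be sketched as follows, assuming Python's sqlite3 module as the client (an assumption; the notes target MySQL). The "with conn" block is the sqlite3 way of committing once at the end, or rolling back if anything fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

# One multiple-row INSERT instead of three single-row statements.
conn.execute(
    "INSERT INTO t (id, name) VALUES (?, ?), (?, ?), (?, ?)",
    (1, "Bea", 2, "Belle", 3, "Bernice"),
)
conn.commit()

# Group several updates in one transaction: the block commits once at the
# end, or rolls everything back if an exception is raised.
with conn:
    conn.execute("UPDATE t SET name = 'Beatrice' WHERE id = 1")
    conn.execute("UPDATE t SET name = 'Bella' WHERE id = 2")

# A failing batch leaves no partial changes behind.
try:
    with conn:
        conn.execute("UPDATE t SET name = 'Changed' WHERE id = 3")
        # This insert violates the primary key and aborts the whole batch.
        conn.execute("INSERT INTO t (id, name) VALUES (1, 'Duplicate')")
except sqlite3.IntegrityError:
    pass

names = [row[0] for row in conn.execute("SELECT name FROM t ORDER BY id")]
print(names)
```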
Using Scheduling Modifiers - MySQL
For an application that uses MyISAM tables, you can change the priority of statements that retrieve or modify data.
Normally, the server gives updates to a table priority over retrievals.
If the application needs a particular retrieval to run as quickly as possible, it can use scheduling modifiers to alter the usual priorities: HIGH_PRIORITY on a SELECT, or LOW_PRIORITY on an INSERT, UPDATE, or DELETE.
Avoid the DISTINCT clause whenever the query already returns unique rows; removing duplicates takes extra work.
Use EXPLAIN to check the query execution plan in MySQL and optimize queries accordingly.