scandalz.net
 
 
 
BETA (Google AJAX Search)

MSSQL

Who needs instructions for this, it's point and click right?.

A Follow Up on the SQL Puzzle

by Jay Pipes at 14:12 PM, 03/11/2010

Or…What the Heck is Wrong with CREATE TABLE IF NOT EXISTS ... SELECT?

So, earlier this week, I blogged about an SQL puzzle that had come up in my current work on Drizzle’s new transaction log.

I posed the question to readers what the “correct” result of the following would be:

CREATE TABLE t1 (a int, b int);
INSERT INTO t1 VALUES (1,1),(1,2);
CREATE TEMPORARY TABLE t2 (a int, b int, PRIMARY KEY (a));
BEGIN;
INSERT INTO t2 VALUES (100,100);
CREATE TEMPORARY TABLE IF NOT EXISTS t2 (PRIMARY KEY (a)) SELECT * FROM t1;
 
# The above statement will correctly produce an ERROR 23000: Duplicate entry '1' for key 'PRIMARY'
# What should the below result be?
 
SELECT * FROM t2;
COMMIT;

A number of readers responded, and, to be fair, most everyone was “correct” in their own way. Why? Well, because the way that MySQL deals with calls to CREATE TABLE ... SELECT, CREATE TABLE IF NOT EXISTS ... SELECT and their temporary-table counterparts is completely stupid, as I learned this week. Rob Wultsch essentially sums up my feelings about the behaviour of DDL statements in regards to transactions in a session:

Implicit commit is evil and stupid. Ideally we the server should error and roll back, imho.

The Officially Correct Answer (at least in MySQL)

OK, so here’s the “official” correct answer:

CREATE TABLE IF NOT EXISTS ... SELECT does not first check for the existence of the table in question. Instead, if the table in question does exist, CREATE TABLE IF NOT EXISTS ... SELECT behaves like an INSERT INTO ... SELECT statement. Yep, you heard right. So, instead of throwing a warning when it notices that the table exists, MySQL instead attempts to insert rows from the SELECT query into the existing table.

Here is the official MySQL explanation:

For CREATE TABLE … SELECT, if IF NOT EXISTS is given and the table already exists, MySQL handles the statement as follows:
* The table definition given in the CREATE TABLE part is ignored. No error occurs, even if the definition does not match that of the existing table.
* If there is a mismatch between the number of columns in the table and the number of columns produced by the SELECT part, the selected values are assigned to the rightmost columns. For example, if the table contains n columns and the SELECT produces m columns, where m < n, the selected values are assigned to the m rightmost columns in the table. Each of the initial n – m columns is assigned its default value, either that specified explicitly in the column definition or the implicit column data type default if the definition contains no default. If the SELECT part produces too many columns (m > n), an error occurs.
* If strict SQL mode is enabled and any of these initial columns do not have an explicit default value, the statement fails with an error.

So, given the above manual explanation, the correct answer to the original blog post is:

  a  |  b
100 | 100

partly because there is an implicit COMMIT directly before the CREATE TABLE is executed (committing the 100,100 record to the table) and the primary key violation kills off the INSERTs of 1,1 in InnoDB. For a MyISAM table, the 1,1 record would be in the table, since MyISAM has no idea what a ROLLBACK is.

I Think Drizzle Should Follow PostgreSQL’s Example Here

On implicit commits before DDL operations, I believe they should all go bye-bye. DDL should be transactional in Drizzle and if a statement cannot be executed in a transaction, it should throw an error if there is an active transaction. Period.

For behaviour of CREATE TABLE ... SELECT acting like an INSERT INTO ... SELECT, that entire code path should be ripped out.

PostgreSQL’s DDL operations, IMHO, are sane. Sane is good. PostgreSQL allows quite a bit of flexibility by implementing the SQL standard’s CREATE TABLE and CREATE TABLE AS statements. I believe Drizzle should scrap all the DDL table-creation code and instead implement PostgreSQL’s much-nicer DDL methods. There, I said it. Slashdot MySQL haters, there ya go.


PlanetMySQL Voting: Vote UP / Vote DOWN

Drizzle’s Data Dictionary and Global Status

by Ronald Bradford at 13:33 PM, 03/11/2010

With the recent news by Brian about the Data Dictionary in Drizzle replacing the INFORMATION_SCHEMA, I was looking into the server status variables (aka INFORMATION_SCHEMA.GLOBAL_STATUS) and I came across an interesting discovery.

select * from data_dictionary.global_status;
...
| Table_locks_immediate      | 0              |
| Table_locks_waited         | 0              |
| Threads_connected          | 8134064        |
| Uptime                     | 332            |
| Uptime_since_flush_status  | 332            |
+----------------------------+----------------+
51 rows in set (0 sec)

This only retrieved 51 rows, which is way less then previous. What I wanted was clearly missing, all the old com_ status variables. Looking at what the data_dictionary actually has available revealed a new table.

drizzle> select * from data_dictionary.global_statements;
+-----------------------+----------------+
| VARIABLE_NAME         | VARIABLE_VALUE |
+-----------------------+----------------+
| admin_commands        | 0              |
| alter_db              | 0              |
| alter_table           | 0              |
| analyze               | 0              |
| begin                 | 0              |
| change_db             | 1              |
| check                 | 0              |
| checksum              | 0              |
| commit                | 0              |
| create_db             | 0              |
| create_index          | 0              |
| create_table          | 0              |
| delete                | 0              |
| drop_db               | 0              |
| drop_index            | 0              |
| drop_table            | 0              |
| empty_query           | 0              |
| flush                 | 0              |
| insert                | 0              |
| insert_select         | 0              |
| kill                  | 0              |
| load                  | 0              |
| release_savepoint     | 0              |
| rename_table          | 0              |
| replace               | 0              |
| replace_select        | 0              |
| rollback              | 0              |
| rollback_to_savepoint | 0              |
| savepoint             | 0              |
| select                | 10             |
| set_option            | 0              |
| show_create_db        | 0              |
| show_create_table     | 0              |
| show_errors           | 0              |
| show_warnings         | 0              |
| truncate              | 0              |
| unlock_tables         | 0              |
| update                | 0              |
+-----------------------+----------------+
38 rows in set (0 sec)

Kudos to this. Looking at list I saw an obvious omission, of “ping”. Something that caught me out some years ago with huge (300-500 per second admin_commands). I’m also a fan of Mark’s recent work An evening hack – Com_ping in MySQL.


PlanetMySQL Voting: Vote UP / Vote DOWN

Liveblogging at Confoo: Blending NoSQL and SQL

by Sheeri K. Cabral at 08:11 AM, 03/11/2010

Persistence Smoothie: Blending NoSQL and SQL – see user feedback and comments at http://joind.in/talk/view/1332.

Michael Bleigh from Intridea, high-end Ruby and Ruby on Rails consultants, build apps from start to finish, making it scalable. He’s written a lot of stuff, available at http://github.com/intridea. @mbleigh on twitter

NoSQL is a new way to think about persistence. Most NoSQL systems are not ACID compliant (Atomicity, Consistency, Isolation, Durability).

Generally, most NoSQL systems have:

  • Denormalization
  • Eventual Consistency
  • Schema-Free
  • Horizontal Scale

NoSQL tries to scale (more) simply, it is starting to go mainstream – NY Times, BBC, SourceForge, Digg, Sony, ShopWiki, Meebo, and more. But it’s not *entirely* mainstream, it’s still hard to sell due to compliance and other reasons.

NoSQL has gotten very popular, lots of blog posts about them, but they reach this hype peak and obviously it can’t do everything.

“NoSQL is a (growing) collection of tools, not a new way of life.”

What is NoSQL? Can be several things:

  • Key-Value Stores
  • Document Databases
  • Column-oriented data stores
  • Graph Databases

Key-Value Stores


memcached is a “big hash in the sky” – it is a key value store. Similarly, NoSQL key-value stores “add to that big hash in the sky” and store to disk.

Speaker’s favorite is Redis because it’s similar to memcached.

  • key-value store + datatypes (list, sets, scored sets, soon hashes will be there)
  • cache-like functions (like expiration)
  • (Mostly) in-memory

Another interesting key-value store is Riak

  • Combination of key-value store and document database
  • heavy into HTTP REST
  • You can create links between documents, and do “link walking” that you don’t normally get out of a key-value store
  • built-in Map Reduce

Map Reduce:


  • Massively parallel way to process large datasets
  • First you scour data and “map” a new set of dataM
  • Then you “reduce” the data down to a salient result — for example, map reduce function to make a tag cloud: map function makes an array with a tag name and a count of 1 for each instance of that tag, and the reduce tag goes through that array and counts them…
  • http://en.wikipedia.org/wiki/MapReduce

Other key-value stores:

Document Databases


Some say that it’s the “closest” thing to real SQL.
  • MongoDB – Document store that speaks BSON (Binary JSON, which is compact). This is the speaker’s favorite because it has a rich query syntax that makes it close to SQL. Can’t do joins, but can embed objects in other objects, so it’s a tradeoff

    • Also has GridFS that can store large files efficiently, can scale to petabytes of data
    • does have MapReduce but it’s deliberate and you run it every so often.

  • CouchDB
    • Pure JSON Document Store – can query directly with nearly pure javascript (there are auth issues) but it’s an interesting paradigm to be able to run your app almost entirely through javascript.
    • HTTP REST interface
    • MapReduce only to see items in CouchDB. Incremental MapReduce, every time you add or modify a document, it dynamically changes the functions you’ve written. You can do really powerful queries as easy as you can do simple queries. However, some things are really complex, ie, pagination is almost impossible to do.
    • Intelligent Replication – CouchDB is designed to work with offline integration. Could be used instead of SQLite as the HTML5 data store, but you need CouchDB running locally to be doing offline stuff w/CouchDB

Column-oriented store


Columns are stored together (ie, names) instead of rows. Lets you be schema-less because you don’t care about a row’s consistency, you can just add a column to a table very easily.

Graph Databases


speaker’s opinion – there aren’t enough of these.
Neo4J – can handle modeling complex relationships – “friends of friends of cousins” but it requires a license.

When should I use this stuff?





If you have:Use
Complex, slow joins for an “activity stream”Denormalize, use a key-value store.
Variable schema, vertical interactionDocument database or column store
Modeling multi-step relationships (linkedin, friends of friends, etc)Graph

Don’t look for a single tool that does every job. Use more than one if it’s appropriate, weigh the tradeoffs (ie, don’t have 7 different data stores either!)

NoSQL solves real scalability and data design issues. But financial transactions HAVE to be atomic, so don’t use NoSQL for those.

A good presentation is http://www.slideshare.net/bscofield/the-state-of-nosql.

Using SQL and NoSQL together


Why? Well, your data is already in an SQL database (most likely).

You can blend by hand, but the easy way is DataMapper:
Generic, relational ORM (adapters for many SQL dbs and many NoSQL stores)
Implements Identity Map
Module-based inclusion (instead of extending from a class, you just include into a class).

You can set up multiple data targets (default is MySQL, example sets up MongoDB too).
DataMapper is:

  • Ultimate Polyglot ORM
  • simple r’ships btween persistence engines are easy
  • jack of all, master none
  • Sometimes perpetuates false assumptions –
  • If you’re in Ruby, your legacy stuff is in ActiveRecord, so you’re going to have to rewrite your code anyway.

Speaker’s idea to be less generic and better use of features of each data store – Gloo – “Gloo glues together different ORMs by providing relationship proxies.” this software is ALPHA ALPHA ALPHA.

The goal is to be able to define relationships on the terms of any ORM from any class, ORM or not
Right now – partially working activeRecord relationships
Is he doing it wrong? Is it a crazy/stupid idea? Maybe.

Example:





NeedUse
Assume you already have an auth systemit’s already in SQL, so leave it there.
Need users to be able to purchase items from the storefront – Can’t lose transactions, need full ACID complianceuse MySQL.
Social Graph – want to have activity streams and 1-way and 2-way relationships. Need speed, but not consistencyuse Redis
Product Listings — selling moves and books, both have different properties, products are pretty much non-relationaluse MongoDB

He wrote the example in about 3 hours, so integration of multiple data stores can be done quickly and work.


PlanetMySQL Voting: Vote UP / Vote DOWN

Liveblogging at Confoo: [not just] PHP Performance by Rasmus Lerdorf

by Sheeri K. Cabral at 06:29 AM, 03/11/2010

Most of this stuff is not PHP specific, and Python or Ruby or Java or .NET developers can use the tools in this talk.

The session on joind.in, with user comments/feedback, is at http://joind.in/talk/view/1320.

Slides are at http://talks.php.net/show/confoo10

“My name is Rasmus, I’ve been around for a long time. I’ve been doing this web stuff since 1992/1993.”

“Generally performance is not a PHP problem.” Webservers not config’d, no expire headers on images, no favicon.

Tools: Firefox/Firebug extension called YSlow (developed by yahoo) gives you a grade on your site.

Google has developed the Firefox/Firebug pagespeed tool.

Today Rasmus will pick on wordpress. He checks out the code, then uses Siege to do a baseline benchmark — see the slide for the results.

Before you do anything else install an opcode cache like APC. Wordpress really likes this type of caching, see this slide for the results. Set the timezone, to make sure conversions aren’t being done all the time.

Make sure you are cpu-bound, NOT I/O bound. Otherwise, speed up the I/O.

Then strace your webserver processs. There are common config issues that you can spot in your strace code. grep for ENOENT which shows you “No such file or directory” errors.

AllowOverride None to turn off .htaccess for every directory, just read settings once from your config file….(unless you’re an ISP).

Make sure DirectoryIndex is set appropriately, watch your include_path. All this low-hanging fruit has examples on the common config issues slide.

Install pecl/inclued and generate a graph – here is the graph image (I have linked it because you really want to zoom in to the graph…)

In strace output check the open() calls. Conditional includes, function calls that include files, etc. need runtime context before knowing what to open. In the example, every request checks to see if we have the config file, once we have config’d we can get rid of that stuff. Get rid of all the conditionals and hard-code “include wp-config.php”. Examples are on the slide.

His tips to change:
Conditional config include in wp-load.php (as just mentioned)
Conditional did-header check in wp-blog-header.php
Don’t call require_wp_db() from wp-settings.php
Remove conditional require logic from wp_start_object_cache

Then check strace again, now all Rasmus sees is theming and translations, which he decided to keep, because that’s the good benefit of Wordpress – Performance is all about costs vs. flexibility. You don’t want to get rid of all of your flexibility, but you want to be fast.

Set error_reporting(-1) in wp-settings.php to catch all warnings — warnings slow you down, so get rid of all errors. PHP error handling is very slow, so getting rid of errors will make you faster.

The slide of warnings that wordpress throws.

Look at all C-level calls made, using callgrind, which sits under valgrind, a CPU emulator used for debugging. See the image of what callgrind shows.

Now dive into the PHP executor, by installing XDebug.

Check xhprofFacebook open sourced this about a year ago, it’s a PECL extension. The output is pretty cool, try it on your own site, Rasmus does show you how to use it. It shows you functions sorted by the most expensive to the least expensive.

For example, use $_SERVER[REQUEST_TIME] instead of time(). Use pconnect() if MySQL can handle the amount of webserver connections that will be persistent, etc.

After you have changed a lot of the stuff above, benchmark again with siege to see how much faster you are. In this case there is not much gained so far.

So keep going….the blogroll is very slow — Rasmus gets rid of it by commenting out in the sidebar.php file. I’d like to see something to make it “semi-dynamic” — that is, make it a static file that can be re-generated, since you might want the blogroll but links are not changed every second…..

At this point we’re out of low-hanging fruit.

HipHop is a PHP to C++ converter & compiler, including a threaded, event-driven server that replaces apache. Rasmus’ slide says “Wordpress is well-suited for HipHop because it doesn’t have a lot of dynamic runtime code. This is using the standard Wordpress-svn checkout with a few tweaks.”

Then, of course, benchmark again.

The first time you compile Wordpress with HipHop, you give it a list of files to add to the binary, it will complain about php code that generate file names, so you do have to fix that kind of stuff. There’s a huge mess of errors the first time you run it (”pages and pages”), and Rasmus had to patch HipHop (and Wordpress) but the changes in HipHop have been put back into HipHop, so you should be good for the most part.

Check out the errors, lots of them show logical errors like $foo.”bar” instead of $foo.=”bar” and $foo=”bar” instead of $foo==”bar” in an if statement. Which of course is nice for your own code, to find those logical errors.

(Wordpress takes in a $user_ID argument and immediately initializes a global $user_ID variable, which overwrites the argument passed in, so you can change the name of the argument passed in….)

You can also get rid of some code, things that check for existence of the same thing more than once. So it will take a bit of tweaking, but it’s worth it.

There are limitations to HipHop, for example:

  • It doesn’t support any of the new PHP 5.3 language features
  • Private properties don’t really exist under HipHop. They are treated as if they are protected instead.
  • You can’t unset variables. unset will clear the variable, but it will still be in the symbol table.
  • eval and create_function are limited
  • Variable variables $$var are not supported
  • Dynamic defines won’t work: define($name,$value)
  • get_loaded_extensions(), get_extension_funcs(), phpinfo(), debug_backtrace() don’t work
  • Conditional and dynamically created include filenames don’t work as you might expect
  • Default unix-domain socket filename isn’t set for MySQL so connecting to localhost doesn’t work

and HipHop does not support all extensions — see the list Rasmus has of extensions HipHop supports.

Then Rasmus showed an example using Twit (which he wrote) including the benchmarks. He shows that you can see what’s going on, like 5 MySQL calls on the home page and what happens when you don’t have a favicon.ico (in yellow).

In summary, “performance is all about architecture”, “know your costs”.

Be careful, because some tools (like valgrind and xdebug) you don’t want to put it on production systems, you could capture production traffic and replay it on a dev/testing box, but “you just have to minimize the differences and do your best”.


PlanetMySQL Voting: Vote UP / Vote DOWN

Writing A Storage Engine for Drizzle, Part 2: CREATE TABLE

by Stewart Smith at 06:27 AM, 03/11/2010

The DDL code paths for Drizzle are increasingly different from MySQL. For example, the embedded_innodb StorageEngine CREATE TABLE code path is completely different than what it would have to be for MySQL. This is because of a number of reasons, the primary one being that Drizzle uses a protobuf message to describe the table format instead of several data structures and a FRM file.

We are pretty close to having the table protobuf message format being final (there’s a few bits left to clean up, but expect them done Real Soon Now (TM)). You can see the definition (which is pretty simple to follow) in drizzled/message/table.proto. Also check out my series of blog posts on the table message (more posts coming, I promise!).

Drizzle allows either your StorageEngine or the Drizzle kernel to take care of storage of table metadata. You tell the Drizzle kernel that your engine will take care of metadata itself by specifying HTON_HAS_DATA_DICTIONARY to the StorageEngine constructor. If you don’t specify HTON_HAS_DATA_DICTIONARY, the Drizzle kernel stores the serialized Table protobuf message in a “table_name.dfe” file in a directory named after the database. If you have specified that you have a data dictionary, you’ll also have to implement some other methods in your StorageEngine. We’ll cover these in a later post.

If you ever dealt with creating a table in MySQL, you may recognize this method:

virtual int create(const char *name, TABLE *form, HA_CREATE_INFO *info)=0;

This is not how we do things in Drizzle. We now have this function in StorageEngine that you have to implement:

int doCreateTable(Session* session, const char *path,
                  Table& table_obj,
                  drizzled::message::Table& table_message)

The existence of the Table parameter is largely historic and at some point will go away. In the Embedded InnoDB engine, we don’t use the Table parameter at all. Shortly we’ll also get rid of the path parameter, instead having the table schema in the Table message and helper functions to construct path names.

Methods name “doFoo” (such as doCreateTable) mean that there is a method named foo() (such as createTable()) in the base class. It does some base work (such as making sure the table_message is filled out and handling any errors) while the “real” work is done by your StorageEngine in the doCreateTable() method.

The Embedded InnoDB engine goes through the table message and constructs a data structure for the Embedded InnoDB library to create a table. The ARCHIVE storage engine is much simpler, and it pretty much just creates the header of the ARZ file, mostly ignoring the format of the table. The best bet is to look at the code from one of these engines, depending on what type of engine you’re working on. This code, along with the table message definition should be more than enough


PlanetMySQL Voting: Vote UP / Vote DOWN

Surveying MySQL’s Popular Storage Engines

by Sean Hull at 05:00 AM, 03/11/2010

In this month’s Database Journal piece we look at the spectrum of MySQL storage engines available, and examine what some of their strengths and weaknesses are.

View the article here: Survey of MySQL Storage Engines


PlanetMySQL Voting: Vote UP / Vote DOWN

Emulating a 'top' CPU summary using /proc/stat and MySQL

by Justin Swanhart at 00:54 AM, 03/11/2010

In my last blog post, I showed how we can get some raw performance information from /proc into the MySQL database using a LOAD DATA INFILE (LDI) command.

I've modified that LDI call slightly to set the `other` column to equal the sum total of the CPU counters for those rows which begin with 'cpu'.

original:
other = IF(@the_key like 'cpu%', NULL , @val1);

new:
other = IF(@the_key like 'cpu%', user + nice + system + idle + iowait + irq + softirq + steal + guest, @val1);


Top provides a useful output that looks something like the following:

top - 04:59:14 up 14 days,  3:34,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 216 total,   1 running, 215 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8172108k total,  5115388k used,  3056720k free,   315180k buffers
Swap:  2097144k total,        0k used,  2097144k free,  3630748k cached

The information I'm currently concerned with presenting is the CPU summary:
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st


In order to emulate this display, we will need to sample two data points from /proc/stat.


  • Load the data from proc_stat
  • Sleep 1 second
  • Load the data again
  • Compare the values


You should end up with something similar to the following:
mysql> select * from test.proc_stat where the_key = 'cpu';
+-----+---------+--------+-------+--------+------------+--------+------+---------+-------+-------+------------+
| seq | the_key | user   | nice  | system | idle       | iowait | irq  | softirq | steal | guest | other      |
+-----+---------+--------+-------+--------+------------+--------+------+---------+-------+-------+------------+
|   1 | cpu     | 440022 | 36207 |  94583 | 1976124562 |  89082 |  858 |   27243 |     0 |     0 | 1976812557 | 
|  24 | cpu     | 440024 | 36207 |  94583 | 1976130493 |  89082 |  858 |   27243 |     0 |     0 | 1976818490 | 
+-----+---------+--------+-------+--------+------------+--------+------+---------+-------+-------+------------+
2 rows in set (0.00 sec)


To display the CPU utilization, run the following query:
select 100 * ( ( new.user - old.user )  / ( new.other - old.other ) ) user,
       100 * ( ( new.nice - old.nice ) / ( new.other - old.other ) ) nice, 
       100 * ( ( new.system - old.system ) / ( new.other - old.other ) ) system, 
       100 * ( ( new.idle - old.idle ) / ( new.other - old.other ) ) idle, 
       100 * ( ( new.iowait - old.iowait ) / ( new.other - old.other ) ) iowait, 
       100 * ( ( new.irq - old.irq ) / ( new.other - old.other ) ) irq, 
       100 * ( ( new.softirq - old.softirq ) / ( new.other - old.other ) ) softer,
       100 * ( ( new.steal - old.steal ) / ( new.other - old.other ) ) steal, 
       100 * ( ( new.guest - old.guest ) / ( new.other - old.other ) ) guest
from test.proc_stat old, 
         test.proc_stat new
where new.seq > old.seq
     and old.the_key = 'cpu'
     and new.the_key = old.the_key;

+--------+--------+--------+---------+--------+--------+--------+--------+--------+
| user   | nice   | system | idle    | iowait | irq    | softer | steal  | guest  |
+--------+--------+--------+---------+--------+--------+--------+--------+--------+
| 0.0337 | 0.0000 | 0.0000 | 99.9663 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 
+--------+--------+--------+---------+--------+--------+--------+--------+--------+
1 row in set (0.01 sec)


edit:

for completeness sake, here is the SQL script I execute to load the data from proc:

CREATE TABLE IF NOT EXISTS test.proc_stat (
  seq tinyint auto_increment primary key, 
  the_key char(25) NOT NULL, 
  user bigint,
  nice bigint, 
  system bigint,
  idle bigint, 
  iowait bigint,
  irq bigint,
  softirq bigint, 
  steal bigint, 
  guest bigint, 
  other bigint
);  

/* MySQL treats consecutive delimiters as separate fields, so some fancy footwork
   is required to load the file successfully.  The file includes a cpu field followed
   by two spaces which is the sum of all the individual CPUs in the system.  

   To account for this each row is read into some MySQL variables. Those variables 
   are examined to determine which field holds the correct value.
*/
LOAD DATA INFILE '/proc/stat' 
   IGNORE INTO TABLE test.proc_stat 
   FIELDS TERMINATED BY ' ' 
   (@the_key, @val1, @val2, @val3, @val4, @val5, @val6, @val7, @val8, @val9, @val10)
SET 
    the_key = @the_key, 
    user = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val1, 0), IFNULL(@val2,0))),
    nice = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val2, 0), IFNULL(@val3,0))),
  system = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val3, 0), IFNULL(@val4,0))),
    idle = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val4, 0), IFNULL(@val5,0))),
  iowait = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val5, 0), IFNULL(@val6,0))),
     irq = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val6, 0), IFNULL(@val7,0))),
 softirq = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val7, 0), IFNULL(@val8,0))),
   steal = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val8, 0), IFNULL(@val9,0))),
   guest = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val9, 0), IFNULL(@val10,0))),
   other = IF(@the_key like 'cpu%', user + nice + system + idle + iowait + irq + softirq + steal + guest, @val1);


PlanetMySQL Voting: Vote UP / Vote DOWN

Continuing the journey

by Stewart Smith at 23:42 PM, 03/10/2010

A couple of months ago (December 1st for those playing along at home) it marked five years to the day that I started at MySQL AB (now Sun, now Oracle). A good part of me is really surprised it was for that long and other parts surprised it wasn’t longer. Through MySQL and Sun, I met some pretty amazing people, worked with some really smart ones and formed really solid and awesome friendships. Of course, not everything was perfect (sometimes not even close), but we did have some fun.

Up until November 2008 (that’s 3 years and 11 months for those playing at home) I worked on MySQL Cluster. Still love the product and love how much better we’re making Drizzle so it’ll be the best SQL interface to NDB :)

The ideas behind Drizzle had been talked about for a while… and with my experience with internals of the MySQL server, I thought that some change and dramatic improvement was sorely needed.

Then, in 2008, Brian created a tree. I was soon sending in patches at nights, we announced to the whole world at OSCON and it captured a lot of attention.

Since November 2008 I’ve been working on Drizzle full time. It was absolutely awesome that I had the opportunity to spend all my days hacking on Drizzle – both directly with fantastic people and for fantastic people.

But… the Sun set… which was exciting and sad at the same time.

Never to fear! There were plenty of places wanting Drizzle hackers (and MySQL hackers). For me, it came down to this: “real artists ship”. While there were other places where I would no doubt be happy and work on something really cool, the only way I could end up working out where I should really be was: what is the best way to have Drizzle make a stable release that we’d see be suitable for deployment? So, Where Am I Now?

Rackspace.

Where I’ll again be spending all my time hacking Drizzle.


PlanetMySQL Voting: Vote UP / Vote DOWN

SQL syntax with /*! c-style comments in MySQLdump

by Kedar at 23:27 PM, 03/10/2010

In mysql we have — , /* and /*! comments.  This post is mainly about very basic c-style comments. /*! : C-Style comments in MySQL We normally see comments in MySQLdump as follows: /*!40000 ALTER TABLE `a` DISABLE KEYS */; Or /*!50013 DEFINER=`root`@`localhost` SQL SECURITY DEFINER */ These are actually C-Style comments which has embeded sql and treated specially [...] Related posts:

  1. Calculate Mysql Memory Usage – Quick Stored Proc
  2. MySQL Memory Usage Limits on 32 bit Linux
  3. Perl Script for Analyze – Optimize – Repair Mysql Databases

PlanetMySQL Voting: Vote UP / Vote DOWN

Proper SQL table alias use conventions

by Shlomi Noach at 23:10 PM, 03/10/2010

After seeing quite some SQL statements over the years, something is bugging me: there is no consistent convention as for how to write an SQL query.

I’m going to leave formatting, upper/lower-case issues aside, and discuss a small part of the SQL syntax: table aliases. Looking at three different queries, I will describe what I find to be problematic table alias use.

Using the sakila database, take a look at the following queries:

Query #1

SELECT
 R.rental_date, C.customer_id, C.first_name, C.last_name
FROM
 rental R
 JOIN customer C USING (customer_id)
WHERE
 R.rental_date >= DATE('2005-10-01')
 AND C.store_id=1;

The above looks for film rentals done in a specific store (store #1), as of Oct. 1st, 2005.

Query #2

SELECT
 F.title, C.name
FROM
 film AS F
 JOIN film_category AS S ON (F.film_id = S.film_id)
 JOIN category AS C ON (S.category_id = C.category_id)
WHERE F.length > 180;

The above lists the title and category for all films longer than three hours.

Query #3

SELECT c.customer_id, c.last_name
FROM
  customer c
  INNER JOIN address a ON (c.address_id = a.address_id)
  INNER JOIN (
    SELECT
      c.city_id
    FROM
      city AS c
      JOIN country s ON (c.country_id = s.country_id)
    WHERE
      s.country LIKE 'F%'
  ) s1 USING (city_id)
WHERE
  create_date >= DATE('2005-10-01');

The above lists customers created as of Oct. 1st, 2005, and who live in countries starting with an ‘F’. The query could be solved without a subquery, but there’s a good reason why I made it so.

The problems

I used very different conventions on any one of the queries, and sometimes within each query. And it’s common that I see the same on a customer’s site, what with having many programmers do the SQL coding. Again, I will only discuss the table aliases conventions. I’ll leaver the rest to the reader.

Here’s where I see problems:

  • Query #1: In itself, it looks fine. Rental turns to R, Customer turns to C. I will comment on this slightly later on when I provide my full opinion.
  • Query #2: So film turns to F, category turns to C. What should film_category turn into? Out of letters? Let’s just go for S, shall we? But S has nothing do with film_category. Yet it’s so commonly seen.
  • Query #2: We’re using the AS keyword now. We didn’t use it before.
  • Queries #1, #2: Hold on. Wasn’t C taken for customer in Query #1? Now, in Query #2 it stands for category? I’m beginning to get confused.
  • Query #3: Now aliases are lower case; I was just getting used to them being upper case.
  • Query #3: But, hey, c is back to customer!
  • Query #3: Or, is it? Take a look at the subquery. Theres another c in there! This time it’s city! And it’s perfectly valid syntax. We actually have two identical aliases in the same query.
  • Query #3: If I could, I would name country with c as well. But I can’t. So why not throw in s again?
  • Query #3: and now I don’t even bother using the alias when accessing the create_date. Well, there’s no such column in any of the other tables!

Proper conventions

What I find so disturbing is that whenever I read a complex query, I need to go back and forth, back and forth between table aliases (found everywhere in the query) and their declaration point. Such irregularities make the queries difficult to read.

Any of the above issues could be justified. But I wish to make some suggestions:

  • Decide whether you’re going for upper or lower case.
  • Do not use the same alias twice in your query, even if it’s valid.
  • Aliases do not have to be single character. film_category may just as well be FC.
  • Do not alias something that is hard to interpret. s does not stand for country.
  • Think ahead: use same aliases throughout all your queries, as far as you can. If uniqueness is a problem, make for longer aliases. Use cust instead of c.

The above should make for more organized and readable SQL code. Remember: what one programmer finds as a very intuitive alias, is unintuitive to another!

My own convention

Simple: I only use aliases when using self joins. I am aware that queries are much longer what with long table names. I go farther than that: I prefer fully qualifying questionable columns throughout the query. Yes, it makes the query even longer.

I know this does not appeal to many. But there’s no confusion. And it’s easily searchable. And it’s consistent. And if properly formatted, as in the above queries, is well readable.

Now please join me in asking Oracle if they can add multi-line Strings for java, as there are for python.


PlanetMySQL Voting: Vote UP / Vote DOWN

[RH]acker

by Monty Taylor at 16:39 PM, 03/10/2010

As I'm sure everyone has figured out by now, I've joined Rackspace where I will continue to work on Drizzle. I'm honestly thrilled with my new home, and there are a myriad of reasons for that. I think the one that I'm most excited about is that they are already the thing that all of the hype was about MySQL and RedHat and IBM wanting to become:
 
A Service Company
 
Rackspace doesn't want you to run Rackspace-Apache or RackspaceDB or EC-Rackspace. They want you to be able to run bog-standard Apache. And Linux. And MySQL. And PHP. And Drizzle. Then, Rackspace wants to be the best at providing you the service you need around those.
 
No ludicrous MySQL Enterprise "we'll sell you a license to a free product, and then we'll include bundled with that a subscription a piece of non-free monitoring software" upselling. Rackspace actually wants to provide you a valuable service, and they want to do such a good job at it that you will happily pay them to do it.
 
For developers, there is a wonderful upside to this: Rackspace doesn't want a special internal Rackspace-only version of anything. It has no value that way. They want the good software to be ubiquitous so that they can compete in the service arena. This means that they don't want assignment of copyright. This means they don't have crazy policies about what Free Software projects you can and cannot contribute to.
 
Rackspace goes one step further than "do no evil" ... they actually want you to try to improve the state of the art - which goes right to the core of why I'm involved with Free Software in the first place.
 
I truly believe that Free will always win over Restricted, that Open beats Closed and that Sharing will always improve the world before Hoarding. I've always contended that a company can be successful and make the world a better place and that the two are not mutually exclusive.
 
I am thrilled to now be a part of a company where I can do my best to prove it. 

PlanetMySQL Voting: Vote UP / Vote DOWN

Giving a talk at the University of Utah on everything from scaling, clustering, mysql, mysql internals, noSQL (Cassandra) to how to manage all this stuff. If you are there at University I'm bringing some Swag!

Also I will upload the slides and put them here.

Peter Gulutzan at the O’Reilly MySQL Conference

by Peter Gulutzan at 13:47 PM, 03/10/2010

I will be doing two talks at the O’Reilly MySQL Conference & Expo in Santa Clara CA.
Performance Schema Tuesday April 13, 11:55am.
Demos Of All The Big New Features Thursday April 15, 11:55am, with Konstantin Osipov.

The other MySQL server engineers giving talks are:
Alexander Barkov (globalization)
Chuck Bell (backup)
Mattias Jonsson (partitions)
Mats Kindahl (replication)
Konstantin Osipov (runtime)
Inaam Rana (InnoDB)
Mikael Ronstrom (partitions)
Calvin Sun (InnoDB)
Lars Thalmann (replication)
Jimmy Yang (InnoDB)

Maybe we’ll have a Birds of a Feather session too.


PlanetMySQL Voting: Vote UP / Vote DOWN

mk-schema-change? Check out ideas from oak-online-alter-table

by Shlomi Noach at 10:28 AM, 03/10/2010

In response to Mark Callaghan’s post mk-schema-change.

I apologize for not commenting on the post itself, I do not hold a Facebook account. Anyway this is a long write, so it may as well deserve a post of its own.

Some of the work Mark is describing already exists under openark kit’s oak-online-alter-table. Allow me to explain what I have gained there, and how the issue can be further pursued. There is relevance to Mark’s suggestion.

oak-online-alter-table uses a combination of locks, chunks and triggers to achieve an almost non-blocking ALTER TABLE effect. I had a very short opportunity to speak with Mark on last year’s conference, in between bites. Mark stated that anything involving triggers was irrelevant in his case.

The triggers are a pain, but I believe a few other insights from oak-online-alter-table can be of interest.

The first attempt

My first attempt with the script assumed:

  • Table has an AUTO_INCREMENT PRIMARY KEY column
  • New rows always gain ascending PRIMARY KEY values
  • PRIMARY KEY never changes for an existing row
  • PRIMARY KEY values are never reused
  • Rows may be deleted at will
  • No triggers exist on the table
  • No FOREIGN KEYs exist on the table.

So the idea was: when one wants to do an ALTER TABLE:

  1. Create a ghost table with the new structure.
  2. Read the minimum and maximum PK values.
  3. Create AFTER INSERT, AFTER UPDATE, AFTER DELETE triggers on the original table. These triggers will propagate the changes onto the ghost table.
  4. Working out slowly, and in small chunks, copy rows within recorded min-max values range into the ghost table. The interesting part is where the script makes sure there’s no contradiction between these actions and those of the triggers, (whichever came first!). This is largely solved using INSERT IGNORE and REPLACE INTO in the proper context.
  5. Working out slowly and in chunks again, we remove rows from the ghost table, which are no longer existent in the original table.
  6. Once all chunking is complete, RENAME original table to *_old, and ghost table in place of the original table.

Steps 4 & 5 are similar in concept to transactional recovery through redo logs and undo logs.

The next attempt

Next phase removed the AUTO_INCREMENT requirement, as well as the “no reuse of PK”. In fact, the only remaining constraints were:

  • There is some UNIQUE KEY on the table which is unaffected by the ALTER operation
  • No triggers exist on the table
  • No FOREIGN KEYs exist on the table.

The steps are in general very similar to those listed previously, only now a more elaborate chunking method is used with possible non-integer, possible multi-column chunking algorithm. Also, the triggers take care of changes in UNIQUE KEY values themselves.

mk-schema-change?

Have a look at the wiki pages for OnlineAlterTable*. There is some discussion on concurrency issues; on transactional behavior, which explains why oak-online-alter-table performs correctly. Some of these are very relvant, I believe, to Mark’s suggestion. In particular, making the chunks copy; retaining transactional integrity, etc.

To remove any doubt, oak-online-alter-table is not production ready or anywhere near. Use at your own risk. I’ve seen it work, and I’ve seen it crash. I got little feedback and thus little chance to fix things. I also didn’t touch the code for quite a few months now, so I’m a little rusty myself.


PlanetMySQL Voting: Vote UP / Vote DOWN

Google Summer of Code projects, Drizzle

by Brian Aker at 10:25 AM, 03/10/2010

I've been doing Google Summer of Code projects with students since its creation. As far as intern programs go, it has been one of the most successful I have ever worked with.

Last year was particularly awesome in that with Drizzle we were able to have students work on projects that made it back into Drizzle. While I have always seen good work created, it has always been hit or miss on whether the student's work has made it back into the project. Last year though we got more code in then ever before and I believe this year will be the same. We have had students go on to jobs thanks to the work they did on Drizzle.

Interning gives you real experience, and it provides resume material which differentiates students who are going on to work in the software engineering field. Working on open source means that you have real experience on your resume, experience that an employer can see. There are many positions open in the Drizzle/MySQL ecosystem and students who have real world experience should have any easy time finding work with the knowledge you will gain from this program.

For Drizzle we have worked out a partial list for this year:
http://drizzle.org/wiki/Soc

Don't see anything you like? I am happy to add new projects or work with students on libmemcached or Gearman.

Are you interested in working on a different project? Apache, Linux, Postgres? Talk to those projects and ask them to either participate or suggest ideas on projects to them.
PlanetMySQL Voting: Vote UP / Vote DOWN

mk-schema-change

at 07:05 AM, 03/10/2010

I want a tool to make some long-running schema changes almost non-blocking. They should block access to a table for no more than a few seconds. I also want to do some of these in place on a master rather than on a slave that has been taken offline. I think this will work for most schema changes. It doesn't have to work for all of them and there are restrictions. This will not work when statements that modify the table for which the schema change is done reference other tables and the other tables are modified during the schema change. If production SQL cannot be changed to meet this restriction, then the schema change can be done on a slave that has been taken offline. Is anyone else interested in such a tool? A hand-waving description of the process is:

  1. Create the new table on the master. The new table might use MyISAM without indexes initially to make the insert as fast as possible and reduce the load on InnoDB.
  2. Run set sql_log_bin=0 as what follows should not be written to the binlog
  3. Run start transaction with consistent innodb snapshot to start an Innodb transaction and get current binlog offset of the master
  4. Run insert into new_table select * from original_table on the master. Alas, this will get a transaction duration read lock on every row in original_table unless you use row based replication or hack InnoDB or set innodb_locks_unsafe_for_binlog.
  5. Convert new_table to InnoDB and create indexes on it
  6. Replay changes from the binlogs after the point in time recorded in step #3. This should extract changes to original_table and replay them against new_table.

PlanetMySQL Voting: Vote UP / Vote DOWN

Together with Berndt I’ll be presenting on the new features in MySQL Cluster 7.1 at this year’s MySQL Cluster User Conference – Santa Clara, on April 12th. If you’re interested in using MySQL Cluster but aren’t sure how to get started (or you’ve used it but would like some tips) then this is a great opportunity. Check out the presentation description.

If you register by 15 March then you get the early-bird price and if you use this ‘friend of a speaker’ code then you get an additional 25% off: mys10fsp

mys10fsp

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Cluster on Windows – webinar replay available

by Andrew Morgan at 01:58 AM, 03/10/2010

If you missed the recent webinar on running MySQL Cluster on Windows then you can watch/listen to the replay at http://www.mysql.com/news-and-events/on-demand-webinars/display-od-517.html


PlanetMySQL Voting: Vote UP / Vote DOWN

Things to monitor on MySQL, the user’s perspective

by Shlomi Noach at 01:12 AM, 03/10/2010

Working on mycheckpoint, I have the intention of adding custom monitoring. That is, letting the user define things to monitor. I have my own thoughts, I would be grateful to get more input!

What would the user want to monitor?

Monitoring for the number of SELECT statements per second, InnoDB locks, slave replication lag etc. is very important, and monitoring utilities provide with this information. But what does that tell the end user? Not much.

The experienced DBA may gain a lot. The user would be more interested in completely other kind of information. In between, some information is relevant to both.

Say we were managing an on-line store. We want to monitor the health of the database. But the health of the database is inseparable from the health of the application. I mean, having little to no disk usage is fine, unless… something is wrong with the application, which leads to no new purchases.

And so a user would be interested in monitoring the number of purchases per hour, or the time passed since last successful purchase. This kind of data can only be generated by a user’s specific query. Looking at the charts, the user would then feel safer and confident in the wellness of his store app.

But let’s dig further. We want the store’s website to provide with good response. In particular, the query which returns the items in a customer’s cart must react quickly. Our user would not only want to see that purchases get along, but also that page load times (as in our example) are quick for those critical parts. And so a user should be able to monitor the time it took to execute a given query.

It can be of further interest to know how many times per second a given query is executed. This part is not easily done on the server side, and requires the user’s cooperation (or else we must analyze the general log, sniff, or set up a proxy). If the user is willing, she can log to some table each time she executes a certain query. Then we’re back to monitoring a regular table, as with the first example.

It is also possible to monitor for a query’s execution plan. Is it full scan? How many rows are expected? But given that we can monitor the time it took to execute a query, I’m not sure this is useful. If everything runs fast enough — who cares about how it executes?

Some of the above can be monitored on an altogether higher level: if  we’re talking about some web application, then we can use our Apache logs to determine load time for pages, or number of requests to our “cart items” page. But not always do we work with web servers, and we may be interested in checking the specific queries behind the scenes.

Summary

Custom monitoring can include:

  • User defined queries (number of concurrent visitors; count of successful operations per second; number of rows per given table or condition; …)
  • Execution time for user defined queries (time it takes to return cart items; find rows matching condition; sort a table; …)
  • Number of executions for a given query, per second.

I intend to incorporate the above into mycheckpoint as part of its standard monitoring scheme.

Please share your thought below.


PlanetMySQL Voting: Vote UP / Vote DOWN

System administrators familiar with the Linux operating system use the tools in the 'procps' toolset all the time. Tools which read from /proc include top, iostat, vmstat, sar and others. The files in /proc contain useful information about the performance of the system. Most of the files are documented in the Linux kernel documentation. You can also check man 5 proc.

Most performance monitoring tools invoke other tools like iostat to collect performance information instead of reading from the /proc filesytem itself. This begs the question, what can you do if you don't have access to those tools? Perhaps you are using a hosted Linux database and have no access to the underlying shell to execute tools like iostat or top? How could you gather information about the performance of the actual system without being allowed to run the tools?

MySQL includes a command called LOAD DATA INFILE which can read the contents of a delimited text file and store the contents into a database table. The contents of /proc are world readable, so your MySQL database should have access to this information as long as it is running on a Linux server.

Lets start by collecting and reporting on some CPU performance information.

CREATE TEMPORARY TABLE test.proc_stat (
  seq tinyint auto_increment primary key, 
  the_key char(25) NOT NULL, 
  user bigint,
  nice bigint, 
  system bigint,
  idle bigint, 
  iowait bigint,
  irq bigint,
  softirq bigint, 
  steal bigint, 
  guest bigint, 
  other bigint
);  

/* MySQL treats consecutive delimiters as separate fields, so some fancy footwork
   is required to load the file successfully.  The file includes a cpu field followed
   by two spaces which is the sum of all the individual CPUs in the system.  

   To account for this each row is read into some MySQL variables. Those variables 
   are examined to determine which field holds the correct value.
*/
LOAD DATA INFILE '/proc/stat' 
   IGNORE INTO TABLE test.proc_stat 
   FIELDS TERMINATED BY ' ' 
   (@the_key, @val1, @val2, @val3, @val4, @val5, @val6, @val7, @val8, @val9, @val10)
SET other = IF(@the_key like 'cpu%', NULL, @val1),  
    the_key = @the_key, 
    user = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val1, 0), IFNULL(@val2,0))),
    nice = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val2, 0), IFNULL(@val3,0))),
  system = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val3, 0), IFNULL(@val4,0))),
    idle = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val4, 0), IFNULL(@val5,0))),
  iowait = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val5, 0), IFNULL(@val6,0))),
     irq = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val6, 0), IFNULL(@val7,0))),
 softirq = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val7, 0), IFNULL(@val8,0))),
   steal = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val8, 0), IFNULL(@val9,0))),
   guest = IF(@the_key NOT LIKE 'cpu%', NULL, IF(@the_key != 'cpu', IFNULL(@val9, 0), IFNULL(@val10,0)));

Depending on your kernel version you may get 1 or more warnings about unexpected numbers of columns. You can safely ignore these.

mysql> select * from test.proc_stat;
+-----+---------------+--------+-------+--------+------------+--------+------+---------+-------+-------+------------+
| seq | the_key       | user   | nice  | system | idle       | iowait | irq  | softirq | steal | guest | other      |
+-----+---------------+--------+-------+--------+------------+--------+------+---------+-------+-------+------------+
|   1 | cpu           | 378340 | 33588 |  82489 | 1838257830 |  75444 |  750 |   23065 |     0 |     0 |       NULL |
|   2 | cpu0          |   4152 |   125 |   1613 |  114920899 |    624 |    0 |     869 |     0 |     0 |       NULL | 
|   3 | cpu1          |   2182 |    78 |   1474 |  114924477 |     50 |    2 |       3 |     0 |     0 |       NULL | 
|   4 | cpu2          |   6037 |  5418 |   2289 |  114914024 |     55 |   34 |     401 |     0 |     0 |       NULL | 
|   5 | cpu3          |   3519 |    55 |    842 |  114923794 |     37 |    1 |       1 |     0 |     0 |       NULL | 
|   6 | cpu4          |  71851 |  5443 |   6656 |  114840363 |   3197 |   11 |     720 |     0 |     0 |       NULL | 
|   7 | cpu5          |   2435 |     5 |    801 |  114924963 |     29 |    2 |       0 |     0 |     0 |       NULL | 
|   8 | cpu6          | 136246 |  4711 |  36628 |  114690032 |  46119 |   20 |   14471 |     0 |     0 |       NULL | 
|   9 | cpu7          |   1119 |     2 |    366 |  114926691 |     40 |    1 |       0 |     0 |     0 |       NULL | 
|  10 | cpu8          |   4126 |    34 |   2772 |  114920032 |     92 |    1 |    1153 |     0 |     0 |       NULL | 
|  11 | cpu9          |   1618 |     2 |    694 |  114925811 |     77 |    1 |       0 |     0 |     0 |       NULL | 
|  12 | cpu10         |  18096 |  8735 |   6823 |  114891588 |    396 |  179 |    2379 |     0 |     0 |       NULL | 
|  13 | cpu11         |   7243 |  2583 |   3559 |  114914559 |    241 |    1 |       2 |     0 |     0 |       NULL | 
|  14 | cpu12         |   5215 |  2380 |   2776 |  114915814 |    417 |  342 |    1237 |     0 |     0 |       NULL | 
|  15 | cpu13         |   3224 |    28 |   1507 |  114923336 |     77 |    2 |       0 |     0 |     0 |       NULL | 
|  16 | cpu14         | 109818 |  3979 |  13071 |  114775431 |  23901 |  143 |    1823 |     0 |     0 |       NULL | 
|  17 | cpu15         |   1450 |     1 |    612 |  114926010 |     83 |    1 |       0 |     0 |     0 |       NULL | 
|  18 | intr          |   NULL |  NULL |   NULL |       NULL |   NULL | NULL |    NULL |  NULL |  NULL | 1176485951 | 
|  19 | ctxt          |   NULL |  NULL |   NULL |       NULL |   NULL | NULL |    NULL |  NULL |  NULL |  171220339 | 
|  20 | btime         |   NULL |  NULL |   NULL |       NULL |   NULL | NULL |    NULL |  NULL |  NULL | 1267061074 | 
|  21 | processes     |   NULL |  NULL |   NULL |       NULL |   NULL | NULL |    NULL |  NULL |  NULL |     168510 | 
|  22 | procs_running |   NULL |  NULL |   NULL |       NULL |   NULL | NULL |    NULL |  NULL |  NULL |          1 | 
|  23 | procs_blocked |   NULL |  NULL |   NULL |       NULL |   NULL | NULL |    NULL |  NULL |  NULL |          0 | 
+-----+---------------+--------+-------+--------+------------+--------+------+---------+-------+-------+------------+
23 rows in set (0.00 sec)

Now that you know you can collect that information, then you can emulate top to calculate the current total CPU usage. I'll show you how to do that in my next blog post.
PlanetMySQL Voting: Vote UP / Vote DOWN

Imagine you tried to use the slow query log to debug a performance problem. Does the current format have enough details?

# Time: 100309 18:48:23
# User@Host: root[root] @ localhost []
# Query_time: 0  Lock_time: 0  Rows_sent: 1  Rows_examined: 1
I have added Thread_id, Errno, Start and End. Thread_id can be used to find similar data from SHOW PROCESSLIST and the binlog. Errno is useful in many cases. Start and End are there for convenience. Can you suggest anything else that would be easy to add? Note that Rows_sent and Rows_examined are always zero for insert, update and delete statements. Feature request 49756 is open to change that. Maybe that is easy to fix.
# Query_time: 0  Lock_time: 0  Rows_sent: 1  Rows_examined: 1\
           Thread_id: 3 Errno: 0 Start: 18:48:23 End: 18:48:23
Update, I found more data that is easy to add and the proposed output is:
# Time: 100310  7:51:28
# User@Host: root[root] @ localhost []
# Query_time: 0  Lock_time: 0  Rows_sent: 1  Rows_examined: 0 Thread_id: 1 Errno: 0 \
Killed: 0 Bytes_received: 104 Bytes_sent: 161 Read_first: 0 Read_last: 0 Read_key: 0 \
Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 \
Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 \
Created_tmp_tables: 0 Start:  7:51:28 End:  7:51:28

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL User Conference


I'll be presenting "The Thinking Person's Guide to Data Warehouse Design" at the upcoming MySQL User conference. While a lot of people think that bad SQL code is the #1 wrecking ball of data warehouses and marts, the fact is that poor database design is the first cause of both downtime and bad performance. In my presentation, I'll do my best to show how up-front worRead More...
PlanetMySQL Voting: Vote UP / Vote DOWN

Hotsos Symposium 2010 — Battle Against Any Guess Is Won

by Alex Gorbachev at 12:16 PM, 03/09/2010

Video fragments of my session posted at the end — read on.

I arrived at Omni Mandalay Hotel on Sunday evening with Dan Norris. I was flying through Chicago and it turned out that Dan was on the same flight and only few rows behind me. Small world.

Preparations for the conference were very chaotic on my part and, of course, I didn’t have either of my presentations ready. I was very stressed and getting sick as well — it looked like a complete disaster waiting to happen. I’d like to say that I was feeling like Doug Burns as he often managed to get sick just before a conference. Of course, I worked on my slides for the last few days as well as on the flight and presentation was slowly getting there but boy was I tired!

I quickly said hello to the crowd in the bar on the way to my room and rushed away to do some more damage to my slides. And then I had a brilliant idea — I could still see one of my best mates and do something good about my presentation! I asked Doug if he was interested in the preview (he probably wasn’t interested but he couldn’t say it to me) especially that my session wasn’t on his original agenda. Of course, that would mean that he had to leave a bunch of other good friends and spend some time tete-a-tete. Knowing Doug, this is some of the hardest thing to ask from him but it shows how good of a friend he is! (Plus, everyone thinks that he is anti-social anyway. Shhhh!)

Doug has made my day — while he provided lots of ideas and feedback on few things that I was lucking, he generally approved the idea and confirmed that it wasn’t totally crazy. I guess that was all I needed back then and Doug knew how nervous I was about it. (Thanks mate!)

So I called Sunday a day very early and went to bed before midnight. I really needed some sleep. Woken up by the alarm at 5AM (I woke up few times during the night looking at the clock — making sure I didn’t sleep through) and slides were ready just before lunch. I even managed to do a test run and it took 65 minutes — a wee bit too long for one hour session. But it was good test and I knew I had to be just a bit more concise in few parts.

Mi morning was very productive. Unfortunately, I missed the opening keynote from Tom Kyte. Such a pity! If what Doug wrote is true, Tom was talking about the mistakes we make *because* of our experience and our assumptions. This was exactly one of the points I was making in my Battle Against Any Guess — experience is danger. I wish I could see Tom’s example. Oh well, maybe another time.

I managed to attend half of the Richard Foote’s session on indexes but my mind was far away — with my own slides. Though, I did manage to focus on bitmap indexes part and the myth of bitmap indexes not working well for columns with high cardinality. Very interesting conclusions. I’m still wondering how much overhead updates will do to such bitmap index.

After lunch, it was my turn. I ordered few copies of the latest OakTable book — Expert Oracle Practices: Oracle Database Administration from the Oak Table — that I co-authored with the bunch of other Oakies. I contributed chapter 1 in the book titled just like my presentation — Battle Against Any Guess. The plan was to give a copy away during the presentation and do a draw for another one at the end of the session. I was so nervous that I forgot about it until the end of the session so I just did a draw for two copies. The lucky winners were Lynn-Georgia Tesch and Surendra Anchula. Congratulations! For the rest of you who left the contact details — please stay tuned and we’ll organize few things online.

Now the main topic of this post — my presentation. What’s unusual about this session is that it’s not some technical stuff that I usually do but a more conceptual and motivational talk. Could I pull it off? Well, I think it went fairly well in general even though I did identify few rough places and my lack of English language mastering. Might need to work a little bit more on the flow of the presentation.

We had quite a few good laughs. Later, people in the next hall were asking about it and Dan was making the jokes on the stage so it must have been loud. Anyway, I think nobody fell asleep and I managed to get people thinking about the topic. I received many “thank you” notes yesterday and compliments on a good session so by the end of the day I was more and more pleased. Thanks everyone for attending and especially big thanks to those of you who brought to my attention examples from their own battles. If you have more to discuss — contact me by email (my last name) {at} pythian.com.

Thanks to Marco Gralike for recording some fragments and sharing them. I think he has more to come.

This is the introductory couple minutes. You can definitely notice how nervous I am starting on the stage:

Solving the wrong problem example:

That’s all for now. Stay tuned — more to come.


PlanetMySQL Voting: Vote UP / Vote DOWN

How do I identify the MySQL my.cnf file?

by Ronald Bradford at 11:14 AM, 03/09/2010

As part of my upcoming FREE my.cnf check advice I first need to ask people to provide the current MySQL configuration file commonly found as a file named my.cnf

If only that question was easy to answer!

Use of configuration files

MySQL will by default use at least one configuration file from the following defaults. MySQL also uses a cascade approach for configuration files. When you have multiple files in the appropriate paths you can see unexpected behavior when you override certain values in different files.

You can however for example specify –no-defaults to use no configuration file, or add options to your command line execution, so even looking at all configuration files is no guarantee of your operating configuration.

However for most environments, these complexities do not exist.

Default Location

By default and on single instance MySQL servers you are most likely to find this file called my.cnf and found at:

  • /etc/my.cnf
  • /etc/mysql/my.cnf

These are known as the global options files.

Alternative Locations

MySQL has both instance specific and user specific locations. For the inclusion of an instance specific file, the location is:

  • $MYSQL_HOME/my.cnf

where MYSQL_HOME is a defined environment variable. Historical MySQL versions also looked at [datadir]/my.cnf however I am unaware if this is applicable in 5.x versions.

You can also specific options on a per user basis for default inclusion. These are found at:

  • $HOME/.my.cnf

Distro specific locations

Ubuntu for example also provides an ability to add options via an include directory.

Specifying a configuration at runtime

While you may have these default files, you may elect to start mysql with a specific configuration file as specified by –defaults-file. This option will override all global/instance/user locations and use just this configuration file. You can also specify additional configuration that supplements and not overrides the default with –defaults-extra-file.

What files are on my system?

Again, assuming the default names you can perform a brute force check with:

$ sudo find / -name "*my*cnf"

This is actually worthwhile, especially if you find a /root/.my.cnf file which is default MySQL settings for the Operating System ‘root’ user.

MySQL recommendations

MySQL by default provides a number of recommended files however these are generally outdated especially for newer hardware. These files include my-huge.cnf, my-large.cnf, my-medium.cnf, my-small.cnf and my-innodb-heavy-4G.cnf. Don’t assume replacing your configuration with one of these files will make your system perform better.

MySQL made some attempt to correct these and at least some very poor defaults with MySQL 5.4 however I am unsure what’s in MySQL 5.5

MySQL Configuration at runtime

While several commands can help with identifying your configuration files and print defaults etc, it’s also possible to change your configuration at runtime. It’s possible that these changes are not reflected in your configuration files and pose an additional mismatch.

References


PlanetMySQL Voting: Vote UP / Vote DOWN

Data Comparison Methods Overview

by Julia Samarska at 07:38 AM, 03/09/2010

Data comparison is a difficult and resource-intensive process. For convenience, this process can be divided into several steps.
First, you should compare tables from one database on one server with the database on the other server. You should choose columns for data comparison, and also choose a column that will be a comparison key.
The next step is to choose all data from these tables or some specified part of the data.
The third and the most important step is comparison of the two tables by the selected comparison key itself. During this process the status of each record is set to “only in source”, “only in target”, “different”, or “equal”.
The final steps of the data comparison process are including records to the synchronization and synchronization itself. During these steps records needed for synchronization are chosen, update script is created, and after that the script is executed.
You can read a detailed description of the comparison process here.

Now let’s look at the third step (data comparison) thoroughly.

There are several ways of data comparison that differ only by the side where data comparison is going to be performed – on the server side or on the client PC.

Data comparison on the server side is performed using the resources of the server.
The algorithm of comparison is the following:
1. For each record of each of the two tables its checksum is calculated;
2. Then the checksum of every record from one table is compared to the checksum of the corresponding record from another table and conclusion if the records are equal or different is made;
3. The comparison result is stored in a temporary table on the server.

Performance indicators:
1. The speed of data comparison directly depends on the server capacity and occupancy;
2. The maximal size of database for comparison is limited by the resources of the server itself.

Advantages:
1. There is no need to transfer large amounts of data for comparison to the client PC through network. This way we save network traffic;
2. The speed of comparison does not depend on the client PC resources;
3. Ability to compare blob data of any size.

Disadvantages:
1. Because of the record checksum calculation algorithm in some cases different data can result in equal checksum, and instead of the expected “different” status the “equal” status will be received;
2. There is no flexibility in the synchronization and comparison options usage;
3. There is no possibility to view records differences and exclude a part of the records from the synchronization manually;
4. During the synchronization script creation you should perform data transfer from the server to the client side;
5. The control checksum calculation of a large amount of records consumes all server resources;
6. One should provide extra space on the server for the comparison results storage in the temporary table.

As we can see, this way of comparison has more disadvantages than advantages, that’s why this way is rarely used.

Data comparison on the client PC is performed using the client machine resources, and the server only provides data for comparison. In turn, this way of comparison can be divided into several more ways depending on the way how comparison information will be stored.

Comparing Data on local PC when comparison result is stored in RAM.
The comparison algorithm is the following:
1. Server passes all data from both tables to the local PC;
2. Every record of every table is placed to RAM and is compared without checksum calculation;
3. If a record gets “only in source”, “only in target” or “equal” status, only comparison key is stored in RAM. If records get “different” status, they are placed to RAM for storage completely.

Performance indicators:
1. The speed of data comparison directly depends on the client PC resources and on the speed of data transfer through the network;
2. Maximum size of the database for comparison is limited by the size of RAM on the client PC, and this maximum size also depends on the degree to which the databases that should be compared are different – the smaller is the amount of different records, the larger databases can be compared.

Advantages:
1. Minimal server occupancy – server performs only simple data selection;
2. The simplest algorithm of data comparison because records are sorted on the client side;
3. Flexibility in the comparison options usage;
4. Minimal size of the comparison data store;
5. Status of every record for any data is always correct.

Disadvantages:
1. To view records with “only in source”, “only in target”, or “equal” status an extra data selection is needed;
2. An extra data selection is needed to create a synchronization script;
3. OutOfMemory Exception may be arisen when there are a lot of differences in data in databases;
4. Possibility to compare blob data only of the size that equals to the size of free RAM.

This way of comparison is implemented in dbForge Data Compare for SQL Server v1.10, dbForge Data Compare for MySQL v2.00 and allows to compare databases of any size if data in these databases does not differ a lot.

Comparing Data on local PC when comparison result is stored as a cashed file on the disk.
The algorithm of comparison is the following:
The server passes all data from both tables sorted by comparison key to a local PC. Data is read by bytes, compared without checksum calculation and written to a file on the disk.

Performance indicators:
1. The speed of data comparison directly depends on the client PC resources and on the speed of data transferring through the network;
2. The maximum size of a database to compare is limited by free disk space and does not depend on the degree of data difference in databases.

Advantages:
1. Medium server occupancy – server performs data sorting and selection;
2. To view records and synchronization script creation extra requests to the server are not necessary;
3. The status for every record is always correct for any data;
4. Possibility to compare blob data of the size equal to the size of free space available on the disk.

Disadvantages:
1. Difficult algorithm of data comparison for the records comparison key of which is of the string data type;
2. Difficult algorithm of disk cash for temporary information storage creation.

We can see that in this case the only disadvantage of this way of comparison is the difficulty of implementation. There are more advantages than in the ways of comparison listed above. That’s why this way of comparison will be used in the new version of dbForge Data Compare for SQL Server v2.00 and dbForge Data Compare for MySQL v3.00 for data comparison.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Version Updates

by Michal Hrušecký at 07:19 AM, 03/09/2010

 

Few weeks ago I was at FOSDEM.  It was really amazing experience. I meet many interesting people, learned quite some thing and I returned full of enthusiasm. Open Source events are really great.

But all the fun wasn't over even after the FOSDEM. I spent few more days in Bruxelles attending MySQL packagers meeting organized by SUN/Oracle. We spent quite some time talking to each other. We learned what MySQL people are doing and how. And they learned how do we deal with MySQL and what is troubling us. And many good things will come from this.

First but certainly not last of them is about to appear now. One very interesting thing we learned at meeting was MySQL release policy. What openSUSE and Ubuntu and maybe some others are doing is that after release date there is generaly no version updates allowed. We are only fixing serious bugs and security related issues. It takes quite some work. What we learned is that new releases in stable branch of MySQL are in fact maintanance updates. If you update from 5.1.43 to 5.1.44 you wouldn't get any new features. All you will get are bugfixes. And only bugfixes of serious or security related issues. Does it sound familiar? Yes it is the same thing we are doing! So I discussed it with our maintanance team. And we came to the conclusion that we want to give our users all serious fixes. Not only these few selected. And the best way to do it is to use maintanance updates provided by MySQL people themself. I'm not saying that I don't have enough confidence to play with MySQL sources, but I think that MySQL people can do it better face-wink.png

Yes, you are guessing right. What I'm trying to say is that we are going to update MySQL to the latest available version. This means 5.1.44 for openSUSE 11.2 and 5.0.90 for older openSUSE. We will start with 11.2 as version gap is smaller there and if everything will proceed smoothly, we will continue with 11.1 and 11.0. For 11.2 you can help by testing update. Currently 5.1.44 update is prepared for 11.2 in server: database: STABLE and I'm running some final tests. If you want, you can try it too (not recomended on production servers yet) and if you'll find any problems, please report them before it will hit official updates.

Remember, this is just the beginning. I've got some bigger plans regarding MySQL in 11.3 face-wink.png


PlanetMySQL Voting: Vote UP / Vote DOWN

SQLyog MySQL GUI 8.3 Has Been Released

by Webyog at 06:04 AM, 03/09/2010

Changes (as compared to 8.22) include:

Features:
* Added an option to define a ‘color code’ for a connection. The color will be used as background color in the Object Browser.
* A Query Builder session can now be saved and resumed.
* In Query Builder a table alias can be defined for any table by double-clicking the title bar of the table symbol.
* In RESULT tab results can now be retrieved page-wise. This is ON as default with this build with a defined LIMIT of 1000 rows. For a specific query user can change and for this specific query the setting is persistent across sessions. Also read ‘miscellaneous’ paragraph below.
* Added a context menu to Query Builder canvas.

Bug Fixes:
* Deleting a user would leave non-global privileges orphaned in the ‘mysql’ database. Now we use DELETE USER syntax if server supports.
* Also using EDIT USER dialogue to change host or user specifier for a user would not move non-global privileges. We have split the old ALTER USER dialogue into two: a EDIT USER and RENAME USER dialogue. The latter will use RENAME USER syntax if server supports.
* On Wine Data Sync could generate a malformed XML-string what would case Data Sync to abort.
* Fixed an issue where SSH-tunneling failed with public/private key authentication. Technically the fix is in the PLINK binary shipped with SQLyog.
* SJA failed to send notification mails if Yahoo SMTP servers were used. Note that the fix disables encryption option with Yahoo SMTP servers – but it won’t work anyway due to a non-standard SMTP implementation server-side.
* When importing data from a Universe ODBC-source string data could be truncated.
* The fix in 8.22 for the issue that horizontal scrollbar in GRID would sometime not appear was not complete. It could still happen.
* SQLyog will now trim trailing whitespaces in Connection Manager and Create object dialogs to avoid MySQL Errors..
* Opening a file from ‘recent files’ list could crash SQLyog if a Query Builder or Schema Designer tab was selected and the file specified was not a valid XML file for that tab. This bug was introduced in beta 1.
* When calling a Stored Procedure with more than one SELECT statement from ‘Notifications Services’ only one result set was sent by mail.
* The sja.log file had no line-breaks between what was recorded for two jobs.
* On multi-monitor system resizable dialogues could open on the wrong monitor. New implementation is like this: on multi-monitor systems main program dialogue and ‘first child dialogue’ (example: ALTER TABLE) will open where they were closed (if possible), second and higher child dialogues (example: table advanced properties) will always open on top of its ‘parent’ dialogue. Non-resizable dialogues (such as confirmation boxes) will always open on top of their ‘parent’.
* With multiple SSH-tunnelled connections open stopping and re-executing queries in multiple connections in a fast manner could crash SQLyog.
* If more than one comment occurred before a SELECT statement in the editor, the statement was not identified as a SELECT statement by the Query Profiler and the Query Profiler TAB would not display.
* We did not validate client-side if user checked atoincrement option for a bit column with Create/Alter table dialog.
* If an error occurred while renaming a trigger then trigger was lost as SQLyog was not recreating it back.
* Small GUI fixes.

Miscellaneous:
* The default LIMIT setting for DATA tab has been removed. The setting is not required since we introduced table-level persistence for number of rows displayed. The default for new tables that have not been opened before is 50 – but when user changes the value and next ‘refresh’es SQLyog will save the LIMIT for that particular table persistently across sessions. This in combination with page-wise display in RESULT tab results in a more uniform User Interface for DATA and RESULT tabs.

Downloads: http://webyog.com/en/downloads.php
Purchase: http://webyog.com/en/buy.php


PlanetMySQL Voting: Vote UP / Vote DOWN

Maatkit BoF session at the MySQL conference

by Baron Schwartz (xaprb) at 05:15 AM, 03/09/2010

I’ve submitted a Birds of a Feather session for Maatkit at the upcoming MySQL conference. It’s not on the public schedule yet, but it has been accepted and scheduled for 19:00 on 13 Apr 2010. See you there!

Related posts:

  1. Presentation uploaded for Maatkit talk at MySQL Conference The slides
  2. I’ll be speaking at the O’Reilly MySQL Conference 2010 I’m
  3. Learn about Maatkit at the MySQL Conference I’m

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Tip: faster than TRUNCATE

by Shlomi Noach at 03:37 AM, 03/09/2010

TRUNCATE is usually a fast operation (much faster than DELETE FROM). But sometimes it just hangs; I’ve has several such uncheerful events with InnoDB (Plugin) tables which were extensively written to. The TRUNCATE hanged; nothing else would work; minutes pass.

TRUNCATE on tables with no FOREIGN KEYs should act fast: it translate to dropping the table and creating a new one (and it all depends on the MySQL version, see the manual).

What’s faster than TRUNCATE, then? If you don’t have triggers nor FOREIGN KEYs, a RENAME TABLE can come to the rescue. Instead of:

TRUNCATE log_table

Do:

CREATE TABLE log_table_new LIKE log_table;
RENAME TABLE log_table TO log_table_old, log_table_new TO log_table;
DROP TABLE log_table_old;

I found this to work well for me. Do note that AUTO_INCREMENT values can be tricky here: the “new” table is created with an AUTO_INCREMENT value which is immediately taken in the “working” table. If you care about not using same AUTO_INCREMENT values, you can:

ALTER TABLE log_table_new AUTO_INCREMENT=some high enough value;

Just before renaming.

I do not have a good explanation as for why the RENAME TABLE succeeds to respond faster than TRUNCATE.


PlanetMySQL Voting: Vote UP / Vote DOWN


O'Reilly MySQL Conference & Expo 2010
I'm happy to announce that my talk "Making MySQL administration a breeze - a look into a MySQL DBA's toolchest" has been accepted for this year's edition of the MySQL Conference & Expo in Santa Clara, which will take place on April 12-15, 2010. The session is currently scheduled for Wednesday 14th, 10:50 in Ballroom E.

My plan is to provide an overview over the most popular utilities and applications that a MySQL DBA should be aware of to make his life easier. The focus will be on Linux/Unix applications available under opensource licenses that ease tasks related to user administration, setting up and administering replication setups, performing backups and security audits.

Of course I will cover the usual suspects (e.g. Maatkit), some of these are actually collections of different utilities by themselves. As it's impossible to go over each individual component in the given time frame, I will try to pick out the most popular/useful parts related to the scopes mentioned above. But I will also cover some lesser known gems that migh be worth taking a look at. What's your the most valued tool in your toolchest? I am still looking for more inspiration.

I look forward to being at the conference again and meeting with colleagues and friends in the MySQL community. Judging from the current schedule, it will be a very interesting mix of talks.

If you're interested in attending, you should consider registering soon! The early registration ends on March 15th. Until then, I encourage you to make use of this "Friend of Speaker" discount code (25% off): mys10fsp


PlanetMySQL Voting: Vote UP / Vote DOWN

Drizzle BoF at the MySQL Conference and Expo

by Stewart Smith at 20:46 PM, 03/08/2010

At the 2010 O’Reilly MySQL Conference and Expo there will be a Drizzle BoF!

It’s currently scheduled for 7pm on April 13th.

Come along, it will be awesome.


PlanetMySQL Voting: Vote UP / Vote DOWN

Speaking At The MySQL Users Conference

by Gerardo Narvaja at 16:10 PM, 03/08/2010

My proposal has been accepted, yay!

I'll be speaking on a topic that I feel passionate about: MySQL Server Diagnostics Beyond Monitoring. MySQL has limitations when it comes to monitoring and diagnosing as it has been widely documented in several blogs.

My goal is to share my experience from the last few years and, hopefully, learn from what others have done. If you have a pressing issue, feel free to comment on this blog and I'll do my best to include the case in my talk and/or post a reply if the time allows.

I will also be discussing my future plans on sarsql. I've been silent about this utility mostly because I've been implementing it actively at work. I'll post a road map shortly based on my latest experience.

I'm excited about meeting many old friends (and most now fellow MySQL alumni) and making new ones. I hope to see you there!


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Drizzle team joins Rackspace

by David Van Couvering at 15:38 PM, 03/08/2010

Well, more defections from Oracle, it's clear where the wind is blowing. It's as if all the cool and interesting stuff is quickly shedding itself from Oracle.


Jay Pipes has a good blog post about the announcement and the history behind them ending up at Rackspace.

Interesting quote:
"Rackspace is also heavily invested in Cassandra, and sees integration of Drizzle and Cassandra as being a key way to add value to its platforms and therefore for its customers".
I look forward to seeing what that's about.

I also liked this from Jay:
"I don't know whether Larry understands that cloud computing and infrastructure-as-a-service, platform-as-a-service, and database-as-a-service will eventually put his beloved Oracle cash cow in its place or not. I don't know whether Oracle is planning on embracing the cloud environments which will continue to eat up the market share of more traditional in-house environments upon which their revenue streams depend. I really don't."

PlanetMySQL Voting: Vote UP / Vote DOWN

Chris is speaking at the MySQL 2010 Conference

by Chris Schneider at 13:21 PM, 03/08/2010

I'll be presenting two talks this year:

Faster Than Alter - Less Downtime

"This will be a informative talk about real world problem solving and the powerful yet sometimes overlooked LOAD DATA INFILE command. This talk is for MySQL DBAs who want to expand their
knowledge, improve performance and decrease customer facing downtime"

Get Your Replication On: Advanced Techniques, Tips and Tricks

Co-speaking with Sarah Sproehnle for this one! We have lots of interesting uses for replication and some best practices up our
sleeves. Warning: we won't be covering how to set up basic replication
- that's a prerequisite for this talk!

Hope to see you there!

PlanetMySQL Voting: Vote UP / Vote DOWN

Qsh.pl: distributed query tool

by Gavin Towey at 12:48 PM, 03/08/2010

I've written quite a few tools over time to connect to many mysql servers and run queries. Most of these have been pretty specific to a small set of tasks such as running an alter across many servers. Any sysadmin that is in charge of many servers is probably familiar with dsh, and as I was using recently I realized how all those specific tools I've written for mysql could be generalized into a dsh like tool. Thus, Qsh.pl was born! (download at launchpad)

Usage should be familiar to anyone who has used dsh before, it even will read group files made for dsh in /etc/dsh/group/or /usr/local/etc/group/.

Here's an example where this tool was quite useful. I was getting a query error for SHOW GLOBAL STATUS. This was a curious result since we're running mysql 5.0 everywhere. So what better way to find out which machines are complaining than just run it everywhere:


# qsh.pl -Mcg all_servers --user root --ask-pass --db=test -e 'SHOW GLOBAL STATUS' 2>error.log
{snip ... lots of output}
Done. Total time 2.919

My group file for all_servers includes 120 mysql servers, executing that query on all of them took a total of 2.9 seconds, not bad. I also redirected stderr to a file, so any query errors are easy to find:
cat error.log
myserver1: Query Error (1064) You have an error in your SQL syntax ...
myserver2: Query Error (1064) You have an error in your SQL syntax ...

Ok, we found all the servers that return an error, why do they complain?
# qsh.pl -Mcm myserver1,myserver2,myserver3 --user root --ask-pass --db=test -e 'SELECT VERSION()'
Password:
myserver1: +-----------------+
myserver1: | VERSION() |
myserver1: +-----------------+
myserver1: | 4.1.21-standard |
myserver1: +-----------------+
myserver3: +----------------------------+
myserver3: | VERSION() |
myserver3: +----------------------------+
myserver3: | 5.0.66a-enterprise-gpl-log |
myserver3: +----------------------------+
myserver2: +-----------------+
myserver2: | VERSION() |
myserver2: +-----------------+
myserver2: | 4.1.21-standard |
myserver2: +-----------------+
Done. Total time 0.063
Ooops! That's right, still a few old versions for legacy reasons.

That's just one example of how I used it. There are probably lots of use cases out there, but since it's new I'm still learning to rely on it. It certainly makes things faster when I can think about querying many servers at once, and is a more efficient way to work when dealing with many machines. It might be useful for:

+ comparing explain plan between many machines
+ altering large tables across many slaves, before promoting one to master.
+ grabbing status output from many machines to feed into awk or sed

PlanetMySQL Voting: Vote UP / Vote DOWN

An SQL Puzzle?

by Jay Pipes at 12:04 PM, 03/08/2010

Dear Lazy Web,

What should the result of the SELECT be below? Assume InnoDB for all storage engines.

CREATE TABLE t1 (a int, b int);
INSERT INTO t1 VALUES (1,1),(1,2);
CREATE TEMPORARY TABLE t2 (a int, b int, PRIMARY KEY (a));
BEGIN;
INSERT INTO t2 VALUES (100,100);
CREATE TEMPORARY TABLE IF NOT EXISTS t2 (PRIMARY KEY (a)) SELECT * FROM t1;
 
# The above statement will correctly produce an ERROR 23000: Duplicate entry '1' for key 'PRIMARY'
# What should the below result be?
 
SELECT * FROM t2;
COMMIT;

PlanetMySQL Voting: Vote UP / Vote DOWN

International Women’s Day

by Sheeri K. Cabral at 11:52 AM, 03/08/2010

If you do not know what International Women’s Day is: http://www.internationalwomensday.com/

Start planning your blog posts for Ada Lovelace day now (March 24th, http://findingada.com/ Ada Lovelace Day is an international day of blogging (videologging, podcasting, comic drawing etc.!) to draw attention to the achievements of women in technology and science.)

To that end, I would like to point out all the women currently in science and tech fields that I admire and think are doing great things. I think it would be great if everyone, male or female, made a list like this:

The women that have taught me science/tech along the way:

High School:
Mary Lou Ciavarra (Physics)
Maria Petretti (Pre-Algebra, and Academic Decathlon)
Reneé Fishman (Biology)
Lisa Acquaire (Economics during Academic Decathlon)

College:
Professor Kalpana White (Biology), and in whose fruit fly lab I worked for 2 semesters.
Professor Eve Marder (Introductory Neuroscience)

Though Brandeis does have female faculty in the Computer Science department, I did not manage to have any classes with female Computer Science faculty members.

My current female DBA co-workers at Pythian: Isabel Pinarci (Oracle), Michelle Gutzait (SQL Server), Catherine Chow (Oracle) and Jasmine Wen (Oracle).

And to folks in the greater MySQL/tech community and tech co-workers past and present, especially those I have been inspired and helped by: Tracy Gangwer, Leslie Hawthorn, Selena Deckelmann (Postgres), Amy Rich, Anne Cross, and more (If I have forgotten you, I apologize!).


PlanetMySQL Voting: Vote UP / Vote DOWN

InnoDB plugin training

by Dave Stokes at 10:27 AM, 03/08/2010

MySQL is a constantly moving target and keeping up on the latest changes can be a difficult chore. A lot of folks seeking certification are often swamped by evolutions in the MySQL product. On Tuesday there is a FREE web seminar where MySQL Professional Services experts will walk you through best practices for achieving performance and scalability improvements using MySQL 5.1 and the new InnoDB Plugin. This is part two of the series and you do not have to have seen part one to join.

List of Web Seminars


PlanetMySQL Voting: Vote UP / Vote DOWN

Don’t Assume – Per Session Buffers

by Ronald Bradford at 10:24 AM, 03/08/2010

MySQL has a number of global buffers, i.e. your SGA. There are also a number of per session/thread buffers that combined with other memory usage constitutes an unbounded PGA. One of the most common errors in mis-configured MySQL environments is the setting of the 4 primary per session buffers thinking they are global buffers.

Global buffers include:

    The four important per session buffers are:

    I have seen people see these values > 5M. The defaults range from 128K to 256K. My advice for any values above 256K is simple. What proof do you have this works better? When nothing is forthcoming, the first move is to revert to defaults or a maximum of 256K for some benchmarkable results. The primary reason for this is MySQL internally as quoted by Monty Taylor – for values > 256K, it uses mmap() instead of malloc() for memory allocation.

    These are not all the per session buffers you need to be aware of. Others include thread_stack, max_allowed_packet,binlog_cache_size and most importantly max_connections.

    MySQL also uses memory in other areas most noticeably in internal temporary tables and MEMORY based tables.

    As I mentioned, there is no bound for the total process memory allocation for MySQL, so some incorrectly configured variables can easily blow your memory usage.

    References

    About “Don’t Assume”

    “Don’t Assume” is a series of posts to help the Oracle DBA understand, use and appreciate the subtle differences and unique characteristics of the MySQL RDBMS in comparison to Oracle. These points as essential to operate MySQL effectively in a production environment and avoid any loss of data or availability.

    For more posts in this series be sure to follow the mysql4oracledba tag and also watch out for MySQL for Oracle DBA presentations.

    The MySQLCamp for the Oracle DBA is a series of educational talks all Oracle DBA resources should attend. Two presentations from this series IGNITION and LIFTOFF will be presented at the MySQL Users Conference 2010 in Santa Clara, April 2010 This series also includes JUMPSTART and VELOCITY. If you would like to here these presentations in your area, please contact me.


    PlanetMySQL Voting: Vote UP / Vote DOWN

[Finally] Slides posted for the DW Breakfast in London

by Ivan Zoratti at 09:49 AM, 03/08/2010

It took a while, but here they are. We have posted only the slides from Sun/MySQL, since the other material is copyright by Infobright and Talend.


PlanetMySQL Voting: Vote UP / Vote DOWN

Free advice on your my.cnf

by Ronald Bradford at 09:36 AM, 03/08/2010

Today, while on IRC in #pentaho I came across a discussion and a published my.cnf. In this configuration I found some grossly incorrect values for per session buffers (see below).

It doesn’t take a MySQL expert to spot the issues, however there is plenty of bad information available on the Internet and developers not knowing MySQL well can easily be mislead. This has spurred me to create a program to rid the world of bad MySQL configuration. While my task is potential infinite, it will enable me to give back and hopefully do a small amount of good. You never know, saving those CPU cycles may save energy and help the planet.

Stay tuned for more details of my program.

[mysqld]
...
sort_buffer_size = 6144K
myisam_sort_buffer_size = 1G
join_buffer_size = 1G
bulk_insert_buffer_size = 1G
read_buffer_size     = 6144K
read_rnd_buffer_size = 6144K
key_buffer_size		= 1024M
max_allowed_packet	= 32M
thread_stack		= 192K
thread_cache_size       = 256

query_cache_limit	= 512M
query_cache_size        = 512M
...

PlanetMySQL Voting: Vote UP / Vote DOWN

Drizzling from the Rackspace Cloud

by Eric Day at 08:57 AM, 03/08/2010

Since I left Sun back in January, folks have been asking what was next. I’m happy to say that I’m going to continue hacking on open source projects like Drizzle and Gearman, but now at the Rackspace Cloud. Not only will I be there, but I get to continue working closely with a few of the amazing Drizzle hackers who have also joined, including Monty Taylor, Jay Pipes, Stewart Smith, and Lee Bieber.

Why Rackspace Cloud? Late last year I was considering what I wanted to do next with the Oracle acquisition looming near, and this was one of the options that presented itself. Rackspace had been a supporter of Drizzle from early on by offering virtual machines to develop and test on, and when talking to some folks more closely, something really hit home. Rackspace provides first-class service and “fanatical” support – they are not a software company. One might ask why an open source software developer would be interested in a company that doesn’t create software or vice-versa, and the answer is that Rackspace wants to find ways to offer the best possible service now and into the future. What better way than to help develop the next generation of service software and get a jump start into integrating this into their architecture? Both the open source community and Rackspace win.

Another thing I learned while talking with Rackspace is that one of their core principles is transparency. This applies to both customer and employees, and anyone within an open source community can appreciate this. The more I learned about the company and the folks within it, the more impressed I was at the lack of internal barriers or “need-to-know” information. One of Drizzle’s core goals is also transparency, from discussing design decisions on public mailing lists and IRC, to having the entire project management infrastructure hosted out in the open at Launchpad.

What does this mean for the Drizzle project? It means continued support for a number of core developers, more infrastructure for development, and most importantly in my eyes, more context. One of the Drizzle tag-lines is “A Lightweight SQL Database for Cloud and Web,” so what better place to develop a database designed for the cloud than on one of the fastest growing cloud platforms. We’ll get a detailed look at the demands, get feedback from cloud customers, and have the perfect test bed for offering new services. We’ll also be able to work closely with a top-notch group of DBAs, developers, and sysadmins in one of the most demanding service architectures out there. This invaluable context will help the Drizzle developers make more informed decisions moving forward, which also means better software for the community.

Personally, this also means getting back to my hosting roots. Before Sun, I worked at Concentric for almost 10 years in a clustered hosting environment. I’m very familiar with many of the multi-tenant scalability concerns Rackspace has, and I’m excited to be working in this type of environment again. We’ve already been working closely with the MySQL DBAs at Rackspace to learn what the biggest pain points are for a multi-tenant architecture, and we’ll be taking steps to address these as it will help anyone wanting to run Drizzle in a cloud-like environment. Drizzle’s modular architecture has already proved useful, as some of these concerns are easily answered with “oh, we have a plugin point for that.”

I’m excited, this is going to be a fun ride.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL is crashing, what do I do?

by Ronald Bradford at 08:34 AM, 03/08/2010

Let me start by saying the majority of environments never experience problems of MySQL crashing. I have seen production environments up for years. On my own server I have seen 575 days of MySQL uptime and the problem was hardware, not MySQL.

However it does occur, and the reasons may be obscure.

Confirming mysqld has crashed

To the unsuspecting, MySQL may indeed be crashing and you never know about it. The reason is because most MySQL installations have two running processes, these are mysqld and mysqld_safe.

ps -ef | grep mysqld
root     28822     1  0 Feb22 ?        00:00:00 /bin/sh bin/mysqld_safe
mysql    28910 28822  0 Feb22 ?        00:30:08 /opt/mysql51/bin/mysqld --basedir=/opt/mysql51 --datadir=/opt/mysql51/data --user=mysql --log-error=/opt/mysql51/log/error.log --pid-file=/opt/mysql51/data/dc1.onegreendog.com.pid

One of the functions of mysqld_safe is to restart mysqld if it fails. Unless you review your mysql error log and for low volume systems you will never know. Hint Have you checked your MySQL error log today?

You can determine quickly via SQL your instance uptime.

mysql> SHOW GLOBAL STATUS LIKE '%uptime%';
+---------------+---------+
| Variable_name | Value   |
+---------------+---------+
| Uptime        | 1033722 |
+---------------+---------+
1 row in set (0.00 sec)

This is the number of seconds since start time. While not easily readable for humans, this is more user friendly display. (NOTE: Works for 5.1+ only)

mysql> SELECT FROM_UNIXTIME(UNIX_TIMESTAMP() - variable_value) AS server_start
FROM INFORMATION_SCHEMA.GLOBAL_STATUS
WHERE variable_name='Uptime';
+---------------------+
| server_start        |
+---------------------+
| 2010-02-22 15:22:13 |
+---------------------+
1 row in set (0.07 sec)

Debugging a mysqld core file

When correctly configured, mysqld will generate a core file (See How to crash mysqld intentionally for background information on required settings).

Your first check is to determine if the mysqld binary used has debugging information and symbols stripped. You need this information not stripped for identifying symbol names.

$ file bin/mysqld
bin/mysqld: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.0,
dynamically linked (uses shared libs), for GNU/Linux 2.4.0, not stripped

You can use gdb and with a backtrace command (bt) you can see a stack trace of calls. This won’t help the average DBA without C or MySQL internal knowledge greatly, however it’s essential information to get to the bottom of the problem.

In the following example I’m going to use Bug #38508 to intentionally crash my test instance.

mysql> drop table if exists t1,t2;
mysql> create table t1(a bigint);
mysql> create table t2(b tinyint);
mysql> insert into t2 values (null);
mysql> prepare stmt from "select 1 from t1 join  t2 on a xor b where b > 1  and a =1";
mysql> execute stmt;
mysql> execute stmt;
ERROR 2013 (HY000): Lost connection to MySQL server during query

Lost connection is the first sign of a problem. We check the error log to confirm.

$ tail data/`hostname`.err
100306 14:51:49 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8384512
read_buffer_size=131072
max_used_connections=1
max_threads=151
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338301 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x521f160
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x401b6100 thread_stack 0x40000
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(my_print_stacktrace+0x2e)[0x8abfbe]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(handle_segfault+0x322)[0x5df252]
/lib64/libpthread.so.0[0x35fb00de80]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_ZN9Item_cond10fix_fieldsEP3THDPP4Item+0x7f)[0x5654ff]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_ZN9Item_cond10fix_fieldsEP3THDPP4Item+0xb8)[0x565538]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z11setup_condsP3THDP10TABLE_LISTS2_PP4Item+0xf6)[0x621f96]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_ZN4JOIN7prepareEPPP4ItemP10TABLE_LISTjS1_jP8st_orderS7_S1_S7_P13st_select_lexP18st_select_lex_unit+0x2db)[0x645f3b]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x7a4)[0x654d24]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x16c)[0x659f9c]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld[0x5ec92a]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z21mysql_execute_commandP3THD+0x602)[0x5efb22]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_ZN18Prepared_statement7executeEP6Stringb+0x3bd)[0x66587d]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_ZN18Prepared_statement12execute_loopEP6StringbPhS2_+0x7c)[0x66874c]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z22mysql_sql_stmt_executeP3THD+0xa7)[0x668c27]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z21mysql_execute_commandP3THD+0x1123)[0x5f0643]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x357)[0x5f5047]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xe93)[0x5f5ee3]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(_Z10do_commandP3THD+0xe6)[0x5f67a6]
/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld(handle_one_connection+0x246)[0x5e9146]
/lib64/libpthread.so.0[0x35fb006307]
/lib64/libc.so.6(clone+0x6d)[0x35fa4d1ded]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x5249320 = select 1 from t1 join  t2 on a xor b where b > 1  and a =1
thd->thread_id=1
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
100306 14:51:49 mysqld_safe Number of processes running now: 0
100306 14:51:49 mysqld_safe mysqld restarted
100306 14:51:49 [Note] Plugin 'FEDERATED' is disabled.
100306 14:51:50  InnoDB: Started; log sequence number 0 44233
100306 14:51:50 [Note] Event Scheduler: Loaded 0 events
100306 14:51:50 [Note] /home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld: ready for connections.
Version: '5.1.38'  socket: '/tmp/mysql.sock.3999'  port: 3999  MySQL Community Server (GPL)

Confirming we got the “Writing a core file” line, we can find and use this.

$ find . -name "core*"
./data/core.23290
$ gdb bin/mysqld data/core.23290
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/libthread_db.so.1".

Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libnsl.so.1...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libgcc_s.so.1...done.
Loaded symbols for /lib64/libgcc_s.so.1
Core was generated by `/home/rbradfor/mysql/mysql-5.1.38-linux-x86_64-glibc23/bin/mysqld --defaults-fi'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000035fb00b142 in pthread_kill () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000035fb00b142 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005df285 in handle_segfault (sig=11) at mysqld.cc:2552
#2  
#3  0x00000000005654ff in Item_cond::fix_fields (this=0x5249dd0, thd=0x521f160, ref=) at item_cmpfunc.cc:3900
#4  0x0000000000565538 in Item_cond::fix_fields (this=0x52435b8, thd=0x521f160, ref=) at item_cmpfunc.cc:3912
#5  0x0000000000621f96 in setup_conds (thd=0x521f160, tables=, leaves=0x52494d0, conds=0x5244e38) at sql_base.cc:7988
#6  0x0000000000645f3b in JOIN::prepare (this=0x5243770, rref_pointer_array=0x5248a90, tables_init=, wild_num=, conds_init=,
    og_num=, order_init=0x0, group_init=0x0, having_init=0x0, proc_param_init=0x0, select_lex_arg=0x52488c0, unit_arg=0x5248498) at sql_select.cc:412
#7  0x0000000000654d24 in mysql_select (thd=0x521f160, rref_pointer_array=0x5c3fd0, tables=0x4, wild_num=0, fields=@0x52489c8, conds=0x52435b8, og_num=0, order=0x0, group=0x0, having=0x0,
    proc_param=0x0, select_options=0, result=0x52688a0, unit=0x5248498, select_lex=0x52488c0) at sql_select.cc:2377
#8  0x0000000000659f9c in handle_select (thd=0x521f160, lex=0x52483f8, result=0x52688a0, setup_tables_done_option=0) at sql_select.cc:268
#9  0x00000000005ec92a in execute_sqlcom_select (thd=0x521f160, all_tables=0x52494d0) at sql_parse.cc:5011
#10 0x00000000005efb22 in mysql_execute_command (thd=0x521f160) at sql_parse.cc:2206
#11 0x000000000066587d in Prepared_statement::execute (this=0x5245d60, expanded_query=, open_cursor=false) at sql_prepare.cc:3579
#12 0x000000000066874c in Prepared_statement::execute_loop (this=0x5245d60, expanded_query=0x401b43c0, open_cursor=false, packet=, packet_end=)
    at sql_prepare.cc:3253
#13 0x0000000000668c27 in mysql_sql_stmt_execute (thd=) at sql_prepare.cc:2524
#14 0x00000000005f0643 in mysql_execute_command (thd=0x521f160) at sql_parse.cc:2215
#15 0x00000000005f5047 in mysql_parse (thd=0x521f160, inBuf=0x5243520 "execute stmt", length=12, found_semicolon=0x401b6060) at sql_parse.cc:5931
#16 0x00000000005f5ee3 in dispatch_command (command=COM_QUERY, thd=0x521f160, packet=0x525fde1 "execute stmt", packet_length=) at sql_parse.cc:1213
#17 0x00000000005f67a6 in do_command (thd=0x521f160) at sql_parse.cc:854
#18 0x00000000005e9146 in handle_one_connection (arg=dwarf2_read_address: Corrupted DWARF expression.
) at sql_connect.cc:1127
#19 0x00000035fb006307 in start_thread () from /lib64/libpthread.so.0
#20 0x00000035fa4d1ded in clone () from /lib64/libc.so.6
(gdb) quit

You can use gdb to obtain additional information based on the type of information available.

Now what?

Is the problem a bug? Is it data corruption? Is it hardware related?

Gathering the information is the first step in informing you of more detail that will enable you to search, discuss and seek professional advice to address your problem.

References


PlanetMySQL Voting: Vote UP / Vote DOWN

Actually, the Relational Model doesn't scale

by Robert Treat at 08:20 AM, 03/08/2010

Before all my fellow DBAs' heads explode, let me just say that I am a relational guy. I like the relational model, think it's the best tool for the job, and think every programmer (not just DBA's) should aspire to be as familiar with it as they are with AJAX, MVC, or whatever other technology pattern you think is important. I'll even take that a step further; I think the NoSQL movement is mostly a re-hash of failed technologies from the last century. Object and document databases had their run in the market (some might say "they had their time"), and they were pretty thoroughly beaten by the RDBMS; that some people have reinvented that wheel doesn't change the game.

That said, I find the recent comments from Jeff Davis on the relational model and scalability to be overlooking some things. The state of computing tasks has changed over the past two decades, and what we know about computer engineering has also changed. Working on highly scalable systems like we do at OmniTI, you can't escape some of the inherent problems that you face when working in these types of environments. As much as I'd like the answer to every problem to be "just use an RDBMS", Brewer's CAP theorem just isn't something you can ignore.

When most people think about the relational model, they think of it in terms of parent-child relationships between tables. Without getting too deep in the details of it, I think it's pretty fair to say that Primary Keys and Foreign Keys are very large part of any relational implementation, and that pretty much all RDBMS strive to allow you to add these constraints to your model; it's what helps keep the data consistent. But there's the rub. CAP theorem points out that as we strive for tighter and tighter consistency, we are pulling away from availability, and sacrificing partition tolerance. Two theoretical systems that run smack dab into each other in the real world. This isn't really something new; if you have ever de-normalized, dropped a foreign key, or split data across multiple nodes, you've run into this before.

Now, where CAP theorem falls on it's face (imho) is that it also ignores another holy trinity of software development; Cheap, Fast, and Good. The size of your problem is dictated by the resources you have available; if you can afford decent tools (and let's be clear, decent is not your web dev throwing up MySQL on an EC2 instance) it is quite likely that the stressors of the relational model will never impact you in a way that most CAP folks are worried about. This is also one of the places the NoSQL movement fails; by throwing the baby out with the bath water. Giving up your data integrity before you have scalability issues is a form of premature optimization. The trick, as Theo would say, is having the experience to know when such optimizations are and aren't premature.

So what's the take away? I like to say that you use the relational model because it is best, and you use something else because it is necessary. Most SQL implementations can scale very well, and they should be your first choice when starting a new project. But we also can't pretend that there aren't inherent problems as these systems grow larger; let's understand the trade-offs and engineer appropriately.

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Performance Schema is in the trunk

by Marc Alff at 07:41 AM, 03/08/2010

It is in

As of 2010-03-06, the performance schema is merged into MySQL version 5.5.3-m3, in mysql-trunk.

The documentation for 5.5 has also been updated, and contains a new MySQL Performance Schema chapter.

Disclaimer

Please note that MySQL 5.5 is not a GA product. The performance schema feature may still change at any time, for any reason, and without notice. Customers should not make purchasing decisions based on the availability of the performance schema.

Marc Alff, Oracle.

PlanetMySQL Voting: Vote UP / Vote DOWN

Happiness is a Warm Cloud

by Jay Pipes at 06:31 AM, 03/08/2010

Although a few folks knew about where I and many of the Sun Drizzle team had ended up, we’ve waited until today to “officially” tell folks what’s up. We — Monty Taylor, Eric Day, Stewart Smith, Lee Bieber, and myself — are all now “Rackers”, working at Rackspace Cloud. And yep, we’re still workin’ on Drizzle. That’s the short story. Read on for the longer one :)

An Interesting Almost 3 Years at MySQL

I left my previous position of Community Relations Manager at MySQL to begin working on Brian Aker’s newfangled Drizzle project in October 2008.

Many people at MySQL still think that I abandoned MySQL when I did so. I did not. I merely had gotten frustrated with the slow pace of change in the MySQL engineering department and its resistance to transparency. Sure, over the 3 years I was at MySQL, the engineering department opened up a bit, but it was far from the ideal level of transparency I had hoped to inspire when I joined MySQL.

For almost 3 years, I had sent numerous emails to the MySQL internal email discussion lists asking the engineering and marketing departments, both headed by Zack Urlocker, to recognize the importance and necessity of major refactoring of the MySQL kernel, and the need to modularize the kernel or risk having more modular databases overtake MySQL as the key web infrastructure database. The focus was always on the short term; on keeping up with the Jones’ as far as features went, and I railed against this kind of roadmap, instead pushing the idea of breaking up the server into modules that could be blackboxed and developed independently of the kernel. My ideas were met with mostly kind responses, but nothing ever materialized as far as major refactoring efforts were concerned.

I remember Jim Winstead casually responding to one of my emails, “Congratulations, you’ve just reinvented Apache 2.0″. And, yes, Jim, that was kind of the point…

The MySQL source code base had gotten increasingly unmaintainable over the years, and key engineers were extremely resistant to changing the internals of MySQL and modernizing it. There were some good reasons for being resistant, and some poor reasons (such as “this is the way we’ve always done it”). Overall, it’s tough to question the strategy that Zack, Marten Mickos, and others had regarding the short term gains. After all, they managed to maneuver MySQL into a winning position that Sun Microsystems thought was worth one billion dollars. Because of this, it’s tough to argue with them. :|

Working on Drizzle since October 2008 (officially)

I’m not the kind of person which likes to wait for years to see change, and so the Drizzle project interested me because it was not concerned with backwards compatibility with MySQL, it wasn’t concerned with having a roadmap that was dependent on the whims of a few big customers, and it was very much interested in challenging the assumptions built into a 20 year-old code base. This is a project I could sink my teeth into. And I did.

Many folks have said that the only reason Drizzle is still around is because Sun continued to pay for a number of engineers to work on Drizzle as “an experiment of sorts” and that Drizzle has no customers and therefore nothing to lose and everything to gain. This was true, no doubt about it. At Sun CTO Labs, the few of us did have the ability to code on Drizzle without the pressure-cooker of product marketing and sales demands. We were lucky.

4 6 9 10 Months in Purgatory

So, around rolls April 2009. The stock market and worldwide economy had collapsed and recession was in the air. There’s one thing that is absolutely certain in recession economies: companies that have poor leadership and direction and are beholden to the interests of a large stockholder will seek an end to their misery through acquisition by a larger, stronger firm.

And Sun Microsystems was no different. JAVA stock plummeted to two dollars a share, and Jonathan Schwartz and the Sun board began shopping Sun around to the highest bidder. IBM was courted along with other tech giants. So was Oracle.

And it was with a bit of a hangover that I awoke at the MySQL conference in April 2009 to the news that Oracle had purchased Sun Microsystems. Joy. We’d just gone through 14 months of ongoing integration with Sun Microsystems and now it was going to start all over again.

Anyone who follows PlanetMySQL knows about the ensuing battle in the European Commission’s court regarding monopoly of Oracle in the database market with its acquisition of MySQL. Monty Widenius, Eben Moglen, even Richard Stallman, weighed in on the pros and cons of Oracle’s impending control over MySQL.

All the while, us Sun Microsystems employees had to hold our tongues and try to keep our jobs as Sun laid off thousands more workers while the EC battle ensued. Not fun. It was the employment equivalent of purgatory. And the time just dragged on, with many employees, including myself and the Sun Drizzle team, not having a clue as to what would happen to us. Management was completely silent about future plans. Oracle made zero attempts to outline its future strategy regarding software, and thus most software employees simply kept on doing their work not knowing if the pink slip was arriving tomorrow or not. Lots of fun that was.

Oracle Doesn’t Need Our Services — Larry Don’t Need No Stinkin’ Cloud

The acquisition finally closed and very shortly afterwards, I got a call from my boss, Lee Bieber, that Oracle wouldn’t be needing our services. Monty, Eric, and Stewart had already resigned; none of them had any desire to work for Oracle. Lee and I had decided to see what they had in mind for us. Apparently, not much.

Larry Ellison has gone on record that the whole “cloud thing” is faddish. I don’t know whether Larry understands that cloud computing and infrastructure-as-a-service, platform-as-a-service, and database-as-a-service will eventually put his beloved Oracle cash cow in its place or not. I don’t know whether Oracle is planning on embracing the cloud environments which will continue to eat up the market share of more traditional in-house environments upon which their revenue streams depend. I really don’t.

But what I do know is that Rackspace is betting that providing these services is what the future of technology will be about.

Happiness is a Warm Cloud

Our team has landed at Rackspace Cloud. I’ve now been down to San Antonio twice to meet with key individuals with whom we’ll be working closely. Rackspace is not shy about why the wanted to acquire our team. They see Drizzle as a database that will provide them an infrastructure piece that will be modular and scalable enough to meet the needs of their very diverse Cloud customers, of which there are many tens of thousands.

Rackspace recognizes that the pain points they feel with traditional MySQL cannot be solved with simple hacks and workarounds, and that to service the needs of so many customers, they will need a database server that thinks of itself as a friendly piece of their infrastructure and not the driver of its applications. Drizzle’s core principles of flexibility and focus on scalability align with the goals Rackspace Cloud has for its platform’s future.

Rackspace is also heavily invested in Cassandra, and sees integration of Drizzle and Cassandra as being a key way to add value to its platforms and therefore for its customers.

Rackspace is all about the customers, and this is a really cool thing to experience. It’s typical for companies to claim they are all about the customer — in fact, every company I’ve ever worked for has claimed this. Rackspace is the first company I’ve worked for where you actually feel this spirit, though. You can see the fanaticism of Rackers and how they view what they do always in terms of service to the customer. It’s infectious, and I’m pretty psyched to be on their team.

Anyway, that’s my story and I’m stickin’ to it. See y’all on the nets.


PlanetMySQL Voting: Vote UP / Vote DOWN

NoSQL doesn’t mean non-relational

by Baron Schwartz (xaprb) at 04:13 AM, 03/08/2010

It seems that a lot of people equate non-SQL databases with non-relational-ness, or malign the word relational. This is pretty much pure ignorance. If you’ve ever uttered a sentence that includes the phrase “…non-relational database…” then I have two suggestions for you.

  1. Study relational algebra. At a bare minimum, read the Wikipedia article on relational algebra. There is much more you could do — take a class on the topic, or read C.J. Date’s SQL and Relational Theory (my review). Ask yourself how similar SQL is to the relational algebra. How is relational algebra different from SELECT and GROUP BY? Is relational theory about relationships between data? What part do transactions play in relational algebra? Is MySQL a relational database? What about PostgreSQL, Oracle, or DB2?
  2. Now that you understand relational theory more, choose a database that you think is non-relational and write a formal proof that it is not relationally complete. Please do post a link to the proof in the comments.

The truth is, a non-relational database would be of very little use. In layman’s terms, it would mean you have some data that represents true statements, and a piece of software designed to answer questions using those facts, and you can’t answer simple first-order logic questions with the software. How is this an improvement? How is this useful?

Related posts:

  1. A review of SQL and Relational Theory by C. J. Date SQL and Re
  2. InnoDB is a NoSQL database As long as
  3. On the unhelpfulness of NoSQL My favorit

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

The history of MySQL AB

by Dries Buytaert at 04:00 AM, 03/08/2010

MySQL, the open source database product that puts the "M" in LAMP, was created by MySQL AB, a company founded in 1995 in Sweden. In 2008, MySQL AB announced that it had agreed to be acquired by Sun Microsystems for approximately $1 billion.

The story of MySQL AB is pretty amazing, so I unleashed my "inner academic", did some research and compiled a timeline of MySQL AB's history. This timeline is assembled based on different resources online, such as MySQL press releases (example 1) and interviews with MySQL AB executives (example 2, example 3), etc.

Things to add? Let me know in the comments and I'll update the post.

1995

  • MySQL AB founded by Michael Widenius (Monty), David Axmark and Allan Larsson in Sweden.

2000

  • MySQL goes Open Source and releases software under the terms of the GPL. Revenues dropped 80% as a result, and it took a year to make up for it.

2001

  • Mårten Mickos elected CEO at age 38. Mårten was the CEO of a number of Nordic companies before joining MySQL, and comes with a sales and marketing background.
  • 2 million active installations.
  • Raised series A with undisclosed amount from Scandinavian venture capitalists. Estimated to be around $1 to $2 million.

2002

  • MySQL launched US headquarters in addition to Swedish headquarters.
  • 3 million active users.
  • Ended the year with $6.5 million in revenue with 1,000 paying customers.

2003

  • Raised a $19.5 million series B from Benchmark Capital and Index Ventures.
  • 4 million active installations and over 30,000 downloads per day.
  • Ended the year with $12 million in revenue.

2004

  • With the main revenue coming from the OEM dual-licensing model, MySQL decides to move more into the enterprise market and to focus more on recurring revenue from end users rather than one-time licensing fees from their OEM partners.
  • Ended the year with $20 million in revenue.

2005

  • MySQL launched the MySQL Network modeled after the RedHat Network. The MySQL Network is a subscription service targeted at end users that provides updates, alerts, notifications, and product-level support designed to make it easier for companies to manage hundreds of MySQL servers.
  • MySQL 5 ships and includes many new features to go after enterprise users (e.g. stored procedures, triggers, views, cursors, distributed transactions, federated storage engines, etc.)
  • Oracle buys Innobase, the 4-person Finnish company behind MySQL's InnoDB storage backend.
  • Ended the year with $34 million in revenue based on 3400 customers.

2006

  • Mårten Mickos confirms that Oracle tried to buy MySQL. Oracle' CEO Larry Ellison commented: "We've spoken to them, in fact we've spoken to almost everyone. Are we interested? It's a tiny company. I think the revenues from MySQL are between $30 million and $40 million. Oracle's revenue next year is $15 billion."
  • Oracle buys Sleepycat, the company that provides MySQL with the Berkeley DB transactional storage engine.
  • Mårten Mickos announces that they are making MySQL ready for an IPO in 2008 on an projected $100 million in revenues.
  • 8 million active installations.
  • MySQL has 320 employees in 25 countries, 70 percent of whom work from home.
  • Raised a $18 million Series C based on a rumored valuation north of $300 million.
  • MySQL is estimated to have a 33% market share measured in install base and 0.2% market share measured in revenue (the database market was a $15 billion market in 2006).
  • Ended the year with $50 million in revenue.

2007

  • Ended the year with $75 million in revenue.

2008

  • Sun Microsystems acquired MySQL AB for approximately $1 billion.
  • Michael Widenius (Monty) and David Axmark, two of MySQL AB's co-founders, begin to criticize Sun publicly and leave Sun shortly after.

2009

  • Mårten Mickos leaves Sun and becomes entrepreneur-in-residence at Benchmark Capital. Sun has now lost the business and spiritual leaders that turned MySQL into a success.
  • Sun Microsystems and Oracle announced that they have entered into a definitive agreement under which Oracle will acquire Sun common stock for $9.50 per share in cash. The transaction is valued at approximately $7.4 billion.

PlanetMySQL Voting: Vote UP / Vote DOWN

When indexes are created in internal temporary tables

by Venu Anuganti at 03:40 AM, 03/08/2010

During my previous post on how to improve derived tables performance, I patched the code to add indexes forcefully on internal derived table results, which made a huge difference in the performance. It was just an experiment and a thought to see if it really works without re-writing the queries, so that the logic can be pushed towards the engine rather than query re-write. \

But I got few emails in my inbox today asking whether MySQL really create any keys on internal temporary tables.

The answer is YES; and MySQL does create two keys on internal temporary tables namely ‘group_key‘ and ‘distinct_key‘ on the following conditions:

  • If there is any aggregate function and/or group-by (group_key)
  • Distinct column name(group_key)
  • Distinct in combination with group-by/aggregation functions (distinct_key)

Provided the query results are yielded in temporary table (Using temporary from the explain), else they get optimized away by the existing indexes from the regular table itself. These keys are added to both memory and disk based (MyISAM) internal temporary tables; so it does not matter if the internal temporary table is in memory or disk.

Here is a simple dump of internal temporary table index stats for some of the basic queries related to Information schema [Warning: these queries are really bad, and can't be used for any production use as they are meant for demonstration of different internal keys ]. This is a patch that I might be using for SHOW TEMPORARY TABLES when internal tables are included in the second version. The first version of the patch is already pushed to Maria branch, hoping that it gets pushed to 5.1.

-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_1f
   temp type  : MEMORY
   index count: 1
    key 1-1   : distinct_key
    field     : (null)
    key 1-2   : distinct_key
    field     : ENGINE
 query: select count(distinct engine) from information_schema.tables
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_21
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_NAME
 query: select table_name, sum(data_length+index_length) from information_schema.tables 
        where table_schema='mysql' group by 1
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_24
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_SCHEMA
    key 1-2   : group_key
    field     : TABLE_NAME
    key 1-3   : group_key
    field     : COLUMN_NAME
 query: select tab.table_schema, tab.table_name, column_name, index_name, seq_in_index 
        from Information_schema.tables tab join information_schema.statistics stast 
        using(table_schema,table_name) group by  1,2,3
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_bd
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_NAME
    key 1-2   : group_key
    field     : TABLE_TYPE
    key 1-3   : group_key
    field     : ENGINE
    key 1-4   : group_key
    field     : INDEX_SCHEMA
    key 1-5   : group_key
    field     : INDEX_NAME
 query: select tab.table_schema,  tab.table_name, table_type, engine, index_schema, 
        index_name from information_schema.tables tab join information_schema.statistics
        stats using(table_schema, table_name) where table_schema='mysql' group by 
        1,2,3,4,5, 6 order by 4,3,2,1
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_e0
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_NAME
 query: select  table_name, sum(data_length+index_length) from information_schema.tables
        where table_schema='mysql' group by 1
-----------------------------

PlanetMySQL Voting: Vote UP / Vote DOWN

Netapp Data ONTAP fail - maxfiles

by Trent Hornibrook at 00:52 AM, 03/08/2010

Netapp, enterprise network attached storage devices with OS, Data ONTAP has a per volume specific variable called 'maxfiles'. Basically the maximum number of inodes the volume can consume independent of disk utilization.

Unfortunately this variable must be set per volume, it cannot be 'unlimited' and it cannot be downsized.

According to the man page:

DESCRIPTION
maxfiles increases the number of files that a volume can
hold, as close as possible to max. File inodes are stored
in blocks, and the filer may round the requested max num-
ber of files to the nearest block.

Once increased, the value of max can never be lowered, so
the new value must be larger than the current value.




Further increasing this value to be an unlimited-like variable consumes filer RAM and will result in less usable filer RAM after a Data ONTAP OS upgrade.

Moral of the story - also monitor inode usage as well as disk usage on Netapp volumes!

PlanetMySQL Voting: Vote UP / Vote DOWN

Sic Transit...

by Anatoly Lubarsky at 21:08 PM, 03/07/2010

friendster, hi5 on decline in 2009


It is very sad to see friendster go :( Their main problems have been always ops and security and not a competition with facebook or others.

How many fsync / sec FusionIO can handle

by Vadim at 08:09 AM, 03/04/2010

I recently was asked how many fsync / sec ( and therefore durable transactions / sec) we can get on FusionIO card.

It should be easy to test, let's take sysbench fileio benchmark and run, the next command should make it:


./sysbench --test=fileio --file-num=1 --file-total-size=50G --file-fsync-all=on --file-test-mode=seqrewr --max-time=100 --file-block-size=4096 --max-requests=0 run

CODE:
  1. Operations performed:  0 Read, 922938 Write, 922938 Other = 1845876 Total
  2. Read 0b  Written 3.5207Gb  Total transferred 3.5207Gb  (36.052Mb/sec)
  3.  9229.35 Requests/sec executed

So that's 9229.35 req/sec, which is pretty impressive.

For comparison the same run on PERC 6i RAID10 with BBU:

CODE:
  1. Operations performed:  0 Read, 4832661 Write, 4832661 Other = 9665322 Total
  2. Read 0b  Written 18.435Gb  Total transferred 18.435Gb  (20.975Mb/sec)
  3.  5369.62 Requests/sec executed

which gives us 5369.62 req/sec.

Note this is for single thread, and in MySQL/InnoDB multi-thread load you may get more transactions per second with group commit ( which is back to live in InnoDB-plugin / XtraDB )


Entry posted by Vadim | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Maximal write througput in MySQL

by Vadim at 22:53 PM, 02/28/2010

I recently was asked what maximal amount transactions per second we can get using MySQL and XtraDB / InnoDB storage engine if we have high-end server. Good questions, though not easy to answer, as it depends on:

- durability setting ( innodb_flush_log_at_trx_commit = 0 or 1 ) ?
- do we use binary logs ( I used ROW based replication for 5.1)
- do we have sync_binlog options.

So why would not take these as variable parameters and run simple benchmark.
I took sysbench update_key scenario ( update indexed field on simple table)
and used Dell PowerEdge R900 with 16 cores, FusionIO as storage for table and RAID 10 with BBU as storage for innodb log files, innodb system table space and binary logs. And I used Percon-XtraDB-5.1.43-9.1 for benchmarks. All used partitions are formatted in XFS and mounted with nobarrier option.

I run update key for various threads and with next parameters

  • trx_commit=0 : innodb_flush_log_at_trx_commit = 0 and no binary logs
  • trx_commit=1 : innodb_flush_log_at_trx_commit = 1 and no binary logs
  • trx_commit=0 & binlog : innodb_flush_log_at_trx_commit = 0 and binary logs
  • trx_commit=1 & binlog : innodb_flush_log_at_trx_commit = 1 and binary logs
  • trx_commit=1 & binlog & sync_bin : innodb_flush_log_at_trx_commit = 1 and binary logs and sync_binlog=1
  • There are results I get:

    I found results being quite interesting.
    with innodb_flush_log_at_trx_commit = 0 maximal tps is 36332.02 tps, which drops to 23115.04 tps as
    we switch to innodb_flush_log_at_trx_commit = 1. As we use RAID10 with BBU, I did not expect the drops is going to be significant. In second case InnoDB spends

    With enabling binary logs, the results drops to 17451.01 tps with innodb_flush_log_at_trx_commit = 0 and to 12097.39 tps with innodb_flush_log_at_trx_commit = 1. So with binary logs serialization is getting even worse.

    Enabling sync_binlog makes things really bad, and maximal results I have is
    3086.7 tps. So this is good decision if binary log protection is worth such drop.

    UPDATE ( 3/4/2010 )

    Results with innodb_flush_log_at_trx_commit = 2

    Results with innodb_flush_log_at_trx_commit = 2 and binlogs


    Entry posted by Vadim | 6 comments

    Add to: delicious | digg | reddit | netscape | Google Bookmarks

MySQL 5.5-m2 scalability

by Vadim at 09:48 AM, 02/28/2010

Oracle recently announcent MySQL-5.5.2-m2 milestone, and I have a predition that MySQL-5.5 will be announced as GA on MySQL UC 2010 in April.

So let's make quick on scalability characteristics we should expect from new release.

I made sysbench oltp benchmarks on 10 mln rows (worth 2.5GB of data), on our Dell R900 system ( 16 cores, 32GB of RAM, FusionIO + RAID10, but disk in this case does not matter).

There is results for various threads (results are in tps, more is better)

I think it is good results, and there couple conclusions we can come with

  • InnoDB now can scale up to 16 threads pretty well
  • InnoDB can keep load up to 128 threads, and for 256 threads the result is still acceptable, but drops by 25% comparing to peak in 16 threads

If you interested what is issue in scaling over 16 threads, that in current
benchmarks - this is concurrency on rollback segment (which is global shared structure in InnoDB)

sysbench command for reference:

./sysbench --test=oltp --oltp-table-size=10000000 --init-rng=1 --num-threads=$i --max-requests=0 --oltp-dist-type=uniform --max-time=180 run


Entry posted by Vadim | 10 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Why you should ignore MySQL’s key cache hit ratio

by Baron Schwartz at 09:34 AM, 02/28/2010

I have not caused a fist fight in a while, so it's time to take off the gloves. I claim that somewhere around of 99% of advice about tuning MySQL's key cache hit ratio is wrong, even when you hear it from experts. There are two major problems with the key buffer hit ratio, and a host of smaller ones. If you make some assumptions that are very hard to prove, there actually is a very limited use for the statistics from which the ratio is derived (but not the ratio itself, which you should ignore). Read on for the details.

In this article, I will use key buffer, key_buffer, and key cache interchangeably. However, I will be careful about the difference between "rate" and "ratio". In this article, the key cache miss rate is defined as the number of misses per unit of time, with the units of operations per second. The key cache miss ratio is the ratio between reads from the disk and reads from the cache; it is a dimensionless number because the units are the same in the numerator and denominator, and they cancel out. The key cache hit ratio is simply one minus the miss ratio.

The key_buffer hit ratio

The metrics we're interested in are all defined in terms of counters you can get from SHOW GLOBAL STATUS. I'll start out by copying and pasting from the MySQL manual:

  • Key_read_requests

    The number of requests to read a key block from the cache.

  • Key_reads

    The number of physical reads of a key block from disk.

So far, so good. All of the above is mostly factual (more on this later). Here's another fact from the manual:

The cache miss rate can be calculated as Key_reads/Key_read_requests. [Note: their use of 'rate' doesn't match mine; I would call this the 'miss ratio'].

The problem occurs when you start to assign any importance to this ratio. The MySQL manual doesn't fall into this trap, but if you search Google for Key_read_requests, you will find lots of advice on "tuning by ratio," including phrases such as "The ratio of key_reads to key_read_requests should be as low as possible, no more than 1:100" or "your key cache hit ratio should be very high, ideally above 99.9%" or "if your key cache hit ratio is low, your key_buffer_size is too small."

So here's a summary of two pieces of bad advice:

  • Bad advice #1: you should care about the key cache hit ratio.
  • Bad advice #2: you should set your key_buffer_size according to this ratio.

Tuning by ratio is one of those things that is widely accepted because of "proof by repeated assertion," but as you know, that doesn't make it valid. Let's see why the above two things are bad advice.

Problem 1: Ratios don't show magnitude

Ratios of counters are virtually meaningless for two major reasons. The first is that ratios obscure magnitude. Look at the following and tell me what you think:

  • Server #1 has a key cache miss ratio of 23%
  • Server #2 has a key cache miss ratio of 0.001%

Which server is badly tuned? Maybe you're objecting that you don't know enough about the servers. OK, I'll give you more information. Both servers are real production servers, with powerful hardware and heavy query workloads. Both have a key_buffer_size of 4GB. Now can you tell which server is badly tuned? -- No, you can't tell anything meaningful based on a ratio of counters, because the process of dividing one counter by the other to get the ratio has discarded vital information. You don't know how many Key_reads and Key_read_requests those servers have done.

Just for fun, consider these fake but entirely possible scenarios: server #1 has 23 Key_reads and 100 Key_read_requests (23% miss ratio). Server #2 has one trillion Key_reads, and one hundred quatrillion Key_read_requests (1/100th of a percent). Given that information, which server is badly tuned? If you said "I still can't tell," maybe you want more information, so I'll tell you that both servers are identically tuned, and they have identical data, hardware, and workload. Even that doesn't help, though.

Problem 2: Counters don't measure time

The reason you still can't tell which server is badly tuned is because, even if you know the absolute numbers, you are missing the element of time, in two important ways. First, you don't know how long of an interval I used to measure the statistics on those two servers. Maybe I measured the first server immediately after starting it, and that's why its counters are so small. The second server has been online practically forever, and that's why its counters are big. Let's say this is the case. Now, you've got all the information you need to form an opinion, right? Instead of asking the same annoying question, let me ask it a different way: is either of these servers badly tuned?

There's still not enough information -- I hope you're beginning to appreciate that tuning by ratio is a waste of time! The Oracle folks arrived at this conclusion a long time before the MySQL world started to come around. There are even tuning utilities (anti-tuning anti-utilities?) for Oracle, specifically designed to mock and frustrate those who would tune by ratio. They are capable of creating any buffer hit ratio the user desires by running silly queries that do nothing but cause buffer hits, skewing the result towards "this ratio looks great!"

The second kind of time information you're lacking is how much time each buffer hit or miss takes. If you approach application performance optimization from the standpoint of response time measurements, which you should, you will eventually arrive at this question. "I have a query I know is slow and is a problem for my application. I have profiled it with SHOW STATUS and I know it causes a great many Key_reads to occur. How much of this query's execution time is consumed by those operations? Should I try to reduce Key_reads?"

There is no way to know. All you get is counters -- you don't get the time elapsed. In technical terms, counters are surrogate measures. They are not helpful. And as Cary Millsap says, the unfortunate problem is that surrogate measures work sometimes, simply because there is sometimes a correlation (but not a cause) relationship between the counter events and the query's execution time. Alas, that correlation fools us into thinking it's a cause, and we optimize-by-surrogate-measure a time or two and it appears to work -- so we turn into little Pavlovian DBAs and try to do that every time. It would be better if optimizing-by-counter never worked!

A partially valid use of Key_reads

There is a partially valid reason to examine Key_reads, assuming that we care about the number of physical reads that occur, because we know that disks are very slow relative to other parts of the computer. And here's where I return to what I called "mostly factual" above, because Key_reads actually aren't physical disk reads at all. If the requested block of data isn't in the operating system's cache, then a Key_read is a disk read -- but if it is cached, then it's just a system call. However, let's make our first hard-to-prove assumption:

  • Hard-to-prove assumption #1: A Key_read might correspond to a physical disk read, maybe.

If we take that assumption as true, then what other reason might we have for caring about Key_reads? This assumption leads to "a cache miss is significantly slower than a cache hit," which makes sense. If it were just as fast to do a Key_read as a Key_read_request, what use would the key buffer be anyway? Let's trust MyISAM's creators on this one, because they designed a cache hit to be faster than a miss.

  • Hard-to-prove assumption #2: A key cache miss is probably slower than a hit, maybe.

What else? Maybe this physical I/O operation is randomly positioned, which is a worst-case scenario for spinning disks. This is also very hard to prove, but seems reasonable based on the structure of a B-tree index, so let's assume anyway:

  • Hard-to-prove assumption #3: A Key_read might cause a random I/O operation, maybe.

Now, given those assumptions, we can further assume the following:

  • It is good to minimize Key_reads because they are slow and cause random disk I/O.

Notice that we still don't know anything about any relationship between Key_reads and the execution time of our query. All we can do is guess, like good Pavlovian DBAs, that there is a relationship. However, we can again reason that random I/O can cause collateral damage: if the disk head is seeking all over for random I/O, then other I/O (including non-random I/O) is likely to be impacted. If we manage to reduce Key_reads, we might make the database server faster overall, and perhaps the query of interest will accidentally get faster too, and we'll get a treat.

There is one interesting question that we haven't really addressed yet. How bad is bad? This is where we return to the notion of the key cache miss rate in units of operations per second. Given our assumed correlation between a Key_read and a random physical disk I/O, it is partially valid to say that we are going to get in trouble when Key_reads gets close to the number of random I/Os our disk can do. Here's another formula for you:

Key_cache_miss_rate = Key_reads / Uptime

Note the conspicuous absence of Key_read_requests in the formula. The number of requests is absolutely irrelevant -- who cares how often the key is requested? What's relevant is that our assumed connection between Key_reads and random I/Os means that Key_reads/Uptime is assumed to be the same as "random I/Os per second."

And now, I would finally like to show you something partially useful you can do with Key_reads:

CODE:
  1. [baron@localhost ~]$ mysqladmin ext -ri10 | grep Key_reads
  2. | Key_reads                         | 6030962       |
  3. | Key_reads                         | 98            |
  4. | Key_reads                         | 89            |
  5. | Key_reads                         | 104           |

This server is doing approximately 100 Key_reads every ten seconds, so we can assume Key_reads are causing about ten random I/Os per second. Compare that to what your disks are capable of, and draw your own conclusions about whether this is a performance problem. I know what I'd like: I'd like to ask the disk itself how much random I/O it's doing. But alas, that's virtually impossible on most systems I work on. So there you have it -- yet another surrogate measure.

How to choose a key_buffer_size

Let's recap. So far I've shown you the fallacy of tuning by ratio, and told you to ignore the ratio and in fact, ignore Key_read_requests altogether. I've explained that counters are a surrogate measure, but the fact that they're easy to get and sometimes correlated with the true problem causes people to mistake counter analysis for a true performance optimization method. I've shown that if we make some assumptions that are hard to prove, we can compare Key_reads to the disk's physical capacity for random I/O and get an idea of whether index I/O might be causing a performance problem.

But I haven't shown you how to choose an appropriate key_buffer_size. Let's look at that now.

This topic deserves an entire blog post, because there are many subtleties including the possibility of having multiple key caches. But I'll give the simple version here. In my opinion, you should choose a key_buffer_size that is large enough to hold your working set -- the index blocks that are frequently used. How large is that? This is yet another thing that's really hard to measure, alas! So we need to either pick a surrogate, or pull a number out of thin air. Here are some suggestions that are about as good as any:

  1. Just set it really big and forget it. If you have enough memory, who cares. The memory isn't allocated until it's used -- if you set it to 4GB, that doesn't mean that 4GB is actually used. This is not as much of an abdication of responsibility as it might sound like on a machine that's dedicated to MyISAM tables.
  2. Consider your mixture of storage engines (some InnoDB, some MyISAM, which is more important to you, etc) and choose an amount of memory based on how important those tables are to you, how big the indexes are on disk, and so on.
  3. Raise the key_buffer_size until, when the buffer is full, Key_reads/Uptime reduces to a number you're comfortable with.
  4. Set key_buffer_size really big, and then measure its size as it fills up, in something like 1 minute intervals. Pull this into a spreadsheet and graph it as a curve. When the curve's growth tapers off, pick that point on the curve and use it as a heuristic for how big your working set is. Set the key_buffer_size to that.

If the above methods shock you with their unscientific-ness, they shouldn't. The reality is that this server setting is very subjective, and there is no good instrumentation in MySQL to guide your decisions. It is also not the be-all and end-all of MySQL performance, and people frequently obsess over it far out of proportion. But again, 99% of the advice I've seen is based on something much worse: a red herring that only sounds scientific and authoritative -- the "key cache hit ratio." This is a shame. When you are new to MySQL, trying to configure my.cnf, and you have heard guidance that seems so definite, mathematical, and authoritative, but still makes no sense, why wouldn't you obsess over it?

What about InnoDB tuning?

You might be wondering, what about InnoDB tuning? What is the best way to choose an innodb_buffer_pool_size setting? This is a topic that deserves its own article too, but the short version is: ratio-based tuning is just as wrong for InnoDB as it is for MyISAM. Ratio-based tuning is invalid and wrong in general, not just for specific things. All of the above points (loss of magnitude, lack of timing information, etc) apply to all types of ratio-based and counter-based tuning techniques.

Summary

Major points in this article:

  • Counter ratios are meaningless as a performance analysis metric because they are a) ratios of b) counters.
    • You need to look at absolute magnitude, not ratios.
    • For performance analysis, you need to measure elapsed time, not just the number of times something happens.
  • In the absence of timing information, and if you trust the cache's creator and assume that misses are more expensive than hits, then you care about cache misses, not cache hits.
  • A Key_read is not guaranteed to be a random physical I/O, but it might be. Gather Key_reads/Uptime over 10-second or 60-second intervals and compare the results to your IO system's capabilities.
  • MySQL doesn't have good instrumentation for scientifically choosing a key_buffer_size setting, but there are many unscientific approaches that are better than ratio-based tuning.
  • Counter ratios suck for everything, not just for MyISAM tuning.

I've had some heated arguments over these points, so I don't expect the above to pass without controversy. But really, it's time to stop with the bad advice about counter ratios. The sooner we do that, the sooner we can move on to better things.


Entry posted by Baron Schwartz | 18 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Running many benchmarks on fast storage (FusionIO, SSDs) and multi-cores CPUs system I constantly face two contention problems.

So I suspect it's going to be next biggest issues to make InnoDB scaling on high-end system.

This is also reason why in benchmarks I posted previously CPU usage is only about 50%, leaving other 50% in idle state.

First problem is index->lock mutex.
InnoDB uses single mutex per index, so when you run mixed read / write queries, InnoDB locks index for write operation and thus keeps all selects waiting when update/insert is done. This is implemented in this way because write operation may cause B-Tree page split, and InnoDB needs to move records between pages to finish operation. It is getting even worse when for write you need to perform some additional IO to bring page into buffer_pool.

What could be done there internally: there is B-Tree lock free or only page-level lock algorithms, so operation does not need to block whole B-Tree.
From end user point, to fight with this problem, you may need to partition (manually or using 5.1 partitions) table with big index into couple smaller table. It's ugly, but it can help while main problem is not solved.

Second problem is adaptive_search index.
It appears when you have some scanning by secondary key select queries and write queries at the same time.
InnoDB again uses single global mutex for adaptive_search (single mutex for ALL table and ALL indexes), so write query blocks ALL select queries.
Usually first action is to disable adaptive_search (it is possible via global variable), but it rarely helps actually. With disabled adaptive index InnoDB needs to perform much more operations reading secondary keys.

How it can be solved internally: I think some hashing algorithms may be applied to not lock select queries. We may look how to implement it.

Until that InnoDB basically can't utilize powerful hardware.
For example even in IO intensive load I am getting the same result
for single FusionIO card and for two FusionIO cards coupled in RAID0 (which theoretically doubles througput).


Entry posted by Vadim | 9 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Brian Aker, a brilliant helpful duder, who I learn a lot from. Gives a great talk about what is NoSQL explained in a way for database guys. I warn you, there are some points in this video where you can't hear Brian due to the audience "participation" but you should get the content.


I'll be giving a web presentation Jan. 19th 2010. If you would like to check it out please do!


Scaling to 200K Transactions per Second with Open Source - MySQL, Java, curl, PHP


I thought it went well. I will post the link to the slides here :)

Scaling to 200K Transactions per Second with Open Source - MySQL, Java, curl, PHP

Missleading Innodb message on recovery

by peter at 19:03 PM, 02/15/2010

As I wrote about 2 years ago the feature of Innodb to store copy of master's position in Slave's Innodb tablespace got broken. There is a lot of discussions at the corresponding bug report while outcome of the fix remained uncertain for me (the bug is market duplicate while the bugs it seems to be duplicate for describe different issues).
Anyway. The customer came to me today having the following message in the error log after Slave crash while running MySQL 5.1.41 (extra virgin version without XtraDB or Percona Patches)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 10000000, file name mysql-bin.005000
InnoDB: Last MySQL binlog file position 0 20000000, file name ./mysql-bin.003000

He has tried to restart replication from mysql-bin.005000 position 10000000 which failed with "Could not find first log file name in binary log index file" error message

Looking at the Master for this slave I could see its binary log files going only to about 2000, so the binary log file mentioned did not ever exist on this master. What is going on ?
The thing is Innodb does not update this information any more however if it is stored in the tablespace the code is still there to print it. This database was running older MySQL version a while back which was updated to MySQL 5.1 months ago, moved to the new hardware by physical copy and the log file numbers restarted back from 1 but tablespace still contained the ancient data.

I reported the bug on this which should be easy to fix. Otherwise it is easy mistake to make. We also have a patch which restores this information and uses it on slave crash recovery.


Entry posted by peter | 2 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

READ-COMMITED vs REPETABLE-READ in tpcc-like load

by Vadim at 13:27 PM, 02/11/2010

Question what is better isolation level is poping up again and again. Recently it was discussed in InnoDB : Any real performance improvement when using READ COMMITED isolation level ? and in Repeatable read versus read committed for InnoDB .
Serge in his post explains why READ COMMITED is better for TPCC load, so
why don't we take tpcc-mysql bencmark and check on results.

I took 3 different datasets 1000w (100GB of data), 300w (30GB) and 10w (1GB) for box with 32GB of RAM and buffer_pool 26GB. Latest case 10w is interesting as I expect a lot of contention on row level having small dataset.
I used as usually tpcc-mysql benchmark with 16 and 32 (for 10w) concurrent users.

Also I had binary log enabled on RBR mode (as READ-COMMITED does not support STATEMENT based replication in 5.1), and everything was run under XtraDB-9 ( based on InnoDB-plugin 1.0.6)

So there are results:

For 1000w:

1000w

There READ-COMMITED seems more preferable, however difference is very small.

For 300w:
300w

I would say both modes are even there, there is no winner for me.

for 10w:
10w

Almost indentical resutls in this case.

However, however.
For 10w run I got 60 Deadlock errors "1213, Deadlock found when trying to get lock; try restarting transaction" in READ-COMMITED mode, and 0 Deadlocks in REPEATABLE-READ mode.

I understand that 60 deadlocks for total 2704687 transactions can be ignored, but it seems you have better chance to get DEADLOCK in READ-COMMITED then in REPEATABLE-READ.

So both modes looks even for me, though some facts to consider:

  • READ-COMMITED is used rare than REPEATABLE-READ (default), that is less-tested
  • READ-COMMITED does not work with statement-based replication in 5.1
  • with READ-COMMITED you may have more DEADLOCKS

Entry posted by Vadim | 5 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Blob Storage in Innodb

by peter at 22:54 PM, 02/09/2010

I'm running in this misconception second time in a week or so, so it is time to blog about it.
How blobs are stored in Innodb ? This depends on 3 factors. Blob size; Full row size and Innodb row format.

But before we look into how BLOBs are really stored lets see what misconception is about. A lot of people seems to think for standard ("Antelope") format first 768 bytes are stored in the row itself while rest is stored in external pages, which would make such blobs really bad. I even seen a solution to store several smaller blobs or varchar fields which are when concatenated to get the real data. This is not exactly what happens :)

With COMPACT and REDUNDANT row formats (used in before Innodb plugin and named "Antelope" in Innodb Plugin and XtraDB) Innodb would try to fit the whole row onto Innodb page. At least 2 rows have to fit to each page plus some page data, which makes the limit about 8000 bytes. If row fits completely Innodb will store it on the page and not use external blob storage pages. For example 7KB blob can be stored on the page. However if row does not fit on the page, for example containing two 7KB blobs Innodb will have to pick some of them and store them in external blob pages. It however will keep at least 768 bytes from each of the BLOBs on the row page itself. With two of 7KB blobs we will have one blob stored on the page completely while another will have 768 bytes stored on the row page and the remainder at external page.

Such decision to store first 768 bytes of the BLOB may look strange, especially as MySQL internally has no optimizations to read portions of the blob - it is either read completely or not at all, so the 768 bytes on the row page is a little use - if BLOB is accessed external page will always have to be read. This decision seems to be rooted in desire to keep code simple while implementing initial BLOB support for Innodb - BLOB can have prefix index and it was easier to implement index BLOBs if their prefix is always stored on the row page.

This decision also causes strange data storage "bugs" - you can store 200K BLOB easily, however you can't store 20 of 10K blobs. Why ? Because each of them will try to store 768 bytes on the row page itself and it will not fit.

Another thing to beware with Innodb BLOB storage is the fact external blob pages are not shared among the blobs. Each blob, even if it has 1 byte which does not fit on the page will have its own 16K allocated. This can be pretty inefficient so I'd recommend avoiding multiple large blobs per row when possible. Much better decision in many cases could be combine data in the single large Blob (and potentially compress it)

If all columns do not fit to the page completely Innodb will automatically chose some of them to be on the page and some stored externally. This is not clearly documented neither can be hinted or seen. Furthermore depending on column sizes it may vary for different rows. I wish Innodb would have some way to tune it allowing me to force actively read columns for inline store while push some others to external storage. May be one day we'll come to implementing this in XtraDB :)

So BLOB storage was not very efficient in REDUNDANT (MySQL 4.1 and below) and COMPACT (MySQL 5.0 and above) format and the fix comes with Innodb Plugin in "Barracuda" format and ROW_FORMAT=DYNAMIC. In this format Innodb stores either whole blob on the row page or only 20 bytes BLOB pointer giving preference to smaller columns to be stored on the page, which is reasonable as you can store more of them. BLOBs can have prefix index but this no more requires column prefix to be stored on the page - you can build prefix indexes on blobs which are often stored outside the page.

COMPRESSED row format is similar to DYNAMIC when it comes to handling blobs and will use the same strategy storing BLOBs completely off page. It however will always compress blobs which do not fit to the row page, even if KEY_BLOCK_SIZE is not specified and compression for normal data and index pages is not enabled.

If you're interested to learn more about Innodb row format check out this page in Innodb docs:

It is worth to note I use BLOB here in a very general term. From storage prospective BLOB, TEXT as well as long VARCHAR are handled same way by Innodb. This is why Innodb manual calls it "long columns" rather than BLOBs.


Entry posted by peter | 11 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Introducing percona-patches for 5.1

by Vadim at 21:03 PM, 02/09/2010

Our patches for 5.0 have attracted significant interest.  You can read about SecondLife's experience here, as well as what Flickr had to say on their blog.  The main improvements come in both performance gains and improvements to diagnostics (such as the improvements to the slow log output, and INDEX_STATISTICS).

Despite having many requests to port these patches to 5.1, we simply haven't had the bandwidth as our main focus has been on developing XtraDB and XtraBackup.  Thankfully a customer (who prefers to stay unnamed) as stood up and sponsored the work to move the patches to 5.1.

To refresh, the most interesting patches are:

Two new features which not available for 5.0:

  • In slow.log for Stored Procedure call you can see profiling for each individial query from this procedure, not just call storproc()
  • With userstat you can get additional THREADS_STATISTICS which show similar information to USER/CLIENT_STATISTICS but per THREAD granularity (it's useful if you have connection pool)

On this stage the patches are available only in source code, you
can get them from Launchpad https://code.launchpad.net/~percona-dev/percona-patches/5.1.43.  Binaries are also on the way, and will be ready soon. We are running intensive stress testing loads on them to provide stable and quality packages.

And to finalize are results for tpce-like benchmark, where I compare MySQL-5.1.43 vs percona-5.1.43.

The results made for TPCE configuration with 2000 customers and 300 tradedays and 16 concurrent users on our R900 server. The dataset is about 25GB, fully fitting into buffer_pool, so disk does not really matter, but data was stored on FusionIO 320GB MLC card.

On chart with results I show amount of TradeResults transactions per 10 sec during 3600 session (more is better)
tpce-like_2000c_300d

As you see with percona patches you can get just about 10x improvement.
Yeah, that sounds too cool, but let me explain where difference comes from.

As I mentioned in tpce workload details the load is very SELECT intensive and these SELECTS are mainly scans by secondary keys ( not Primary Keys), so it hits problems in InnoDB rw-lock implementations and in buffer_pool mutex contention, which alredy fixed in percona-patches ( and in XtraDB and InnoDB-plugin also).

So you are welcome to try it!


Entry posted by Vadim | 18 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Introducing tpce-like workload for MySQL

by Vadim at 09:30 AM, 02/08/2010

We have been using tpcc-mysql benchmark for long time, and there many results published in our blog, but that's just single workload. That's why we are looking into different benchmarks, and one
of them is TPCE. Yasufumi made some efforts to make TPCE working with MySQL, and we are making it available for public consideration.

You can download it from our Lauchpad Percona-tools project, it's
bzr branch lp:~percona-dev/perconatools/tpcemysql

Important DISCLAIMER:
Using this package you should agree with TPC-E License Agreement,
which in human words is:

  • You can't name results as "TPC Benchmark Results"
  • You can't compare results with results published on http://www.tpc.org/ and you can't pretend the results are compatible with published by TPC.

And we are not going to do anything from that, your primary goals is XtraDB/InnoDB performance research and/or compare with available Storage Engines for MySQL.

The workload in tpce is quite different from tpcc. Tpcc is write intensive, while tpce
is read oriented.
To give more details, there is stats for 10 seconds:

CODE:
  1. | Com_select                            | 46272       |
  2. | Com_update                            | 5214        |
  3. | Com_delete                            | 385         |
  4. | Com_insert                            | 3468        |
  5. | Com_commit                            | 5404        |

The result is quite chatty,

CODE:
  1. |    |     [MEE]    | [DM] |                         [CE]                          |
  2. sec. |    TR,    MF |   DM |   BV,    CP,    MW,    SD,    TL,    TO,    TS,    TU | MEEThreads, ReqQueue
  3.       (1st line: count, 2nd line: 90%ile response [msec.])
  4.  260 |   402,    39,     0,   195,   532,   749,   588,   342,   415,   816,    88 | 30, 0
  5.           20,    60,     0,    30,    20,    20,    20,    50,    20,   310,    60
  6.  
  7.  270 |   395,    40,     0,   201,   608,   842,   608,   358,   449,   833,    89 | 30, 0
  8.           30,    40,     0,    30,    20,    20,    20,    50,    20,   300,    50

but it allows you to see count of 11 different transactions per 10 secs and 90% response time.

and final result

CODE:
  1. [TradeResult(TR) transaction]
  2. Succeed: 150243
  3. Lated:   0
  4. Retried: 3
  5. Failed:  0
  6.  
  7. 41.7342 TpsE

where you can see count of successful TR (TradeResult) transactions, and
the summary result in TpsE (transactions per seconds).

Expect our results soon!


Entry posted by Vadim | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks

INNODB has some irritating gotchas that makes disk space management hard. In 2002ish INNODB, added innodb_file_per_table to get around allot of these issues, but it does not fix everything.

If you are running innodb_file_per_table, you will notice in your database directory


  • db.opt - database characteristics file.

  • tablename.frm - the table structure.

  • tablename.ibd - the actual innodb table space file



Imagine that you have a table with 10 million rows and you delete say 5 million rows in multiple chunks around 400K chunks, because deletes are slow. Next, you notice that the table space file did not shrink. So what do you do? OPTIMIZE tablename, tada all the wasted space is reclaimed, but here is the PROBLEM the ibdata file grew!

ibdata stores all of the UNDO LOGS thus GROWS due to the deletes and space is never reclaimed.


Ok, lets try coping the tablespace file to another directory and re-import the tablespace file after wiping the data dictionary (ibdata).

For instance
/etc/init.d/mysql stop
cd /var/lib/mysql/DB
cp * /tmp/hold
rm /var/lib/mysql/ib*
cp /tmp/hold/* /var/lib/mysql/DB
/etc/init.d/mysql start // create the ibdata file

ALTER TABLE tablename IMPORT TABLESPACE
ERROR 1146 (42S02): Table 'DB.tablename' does not exist

Really, there is nothing that you can do to force innodb to purge the undo logs in the ibdata file without a full dump of the data to text and then reimport the data.

So here are the steps to shrink all table spaces and the ibdata file

mysqldump --all-databases (or use mk-parallel-dump)
stop mysql
rm -f /var/lib/mysql/ib* /var/lib/mysql/DB/*
start mysql
mysqlimport (or use mk-parallel-restore)

Takes a bunch of time but there is no other recourse. If you know of another way please share :)

Finland-based Aito Technologies, developer of a customer experience analytics product suite, today announced its plans to employ Sun Microsystems’ MySQL Embedded Database Server software to successfully analyze up to billions of phone calls, text messages, and mobile data sessions on a daily basis.

Tips for Windows Server 2008 and IIS7 Tuning

by Anatoly Lubarsky at 08:49 AM, 01/04/2010

Recently I moved from windows servers 2003 32 bit and IIS6 to windows servers 2008 64 bit and IIS7. My experience was rather painful, windows 2008 seems like a wild animal after windows 2003 experience. 2008 has lots networking and scalability issues and it took me quiet a while and many tweaks and hacks to achieve reliability and good performace plus scalability and solve many issues.


If you own a windows 2008 server under some significant load from the web issues described in this post maybe relevant for you. Maybe it may save you some time in desperate searching around the net for solutions...


Before we continue to windows 2008 and IIS7 tips - some relevant info about IIS7.


IIS7 breaking changes

IIS7 introduced new integrated pipeline model where asp.net pipeline is integrated into IIS which has extensibility and performance benefits. This comes together with some breaking changes to configuration and asp.net.


Follow this link to learn more:


asp.net 2 breaking changes on-iis 7


OK, now some tips...


Disable static compression

IIS7 has static compression turned on by default and dynamic compression turned off by default. My first tip is to disable static compression (website - compression in IIS7 manager).


It can be the problem when you have static XML files on you server for example. XML files become unreadable by XML parsers with IIS7 static compression. Browsers and other 3rd parties will not be able to read them anymore because XML is malformed (possible bug?).


Change max concurrent requests per CPU setting

By default IIS7 has a limit of handling 12 concurrent requests per CPU and will queue requests above this limit. If you have some significant web load and many AJAX style requests to your server - this setting maybe very restrictive and it is hard to find out the root of the problem when you server performance is suddenly degraded.


See this post to get the picture how it can cause performance issues on your server.

Some relevant info about asp.net thread usage on IIS7 here:


asp.net thread usage on IIS7 and IIS6


Thomas Marquardt advice is to change this default limit. Recommended settings:


"All of this may be a little confusing, but for nearly everyone, my recommendation is that for ASP.NET 2.0 you should use the same settings as the defaults in ASP.NET v4.0; that is, set maxConcurrentRequestsPerCPU = "5000" and maxConcurrentThreadsPerCPU="0".


This is done by adding DWORD MaxConcurrentRequestsPerCPU to the registry under


HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ASP.NET\2.0.50727.0

called MaxConcurrentRequestsPerCPU (DWORD). This key doesn't exist by default. Or/and in aspnet.config section which overrides registry setting (also doesn't exist by default) aspnet.config is here on windows 64bit:


%windir%\Microsoft.NET\Framework64\v2.0.50727\aspnet.config

You need to add the following section under "configuration" section (here I used default values) - don't forget change maxConcurrentRequestsPerCPU to 5000.


< system.web>
    < applicationPool 
        maxConcurrentRequestsPerCPU="12" 
        maxConcurrentThreadsPerCPU="0" 
        requestQueueLimit="5000" />
< /system.web>

Disable offload network enhancements if you have network issues

Different unexplained network issue on your windows 2008 server could be related to TCP Chimney and related "networking enhancements" in windows 2008 and certain hardware vendors when hardware doesn't play well with these enhancements.


If you have networking problems similar to these below - try to disable these features.


Information about the TCP Chimney Offload, Receive Side Scaling, and Network Direct Memory Access features in Windows Server 2008


Related threads:


  1. IIS 7 http status 400 errors
  2. impacts to sql server workloads
  3. the effect of tcp chimney offload on viewing network traffic

Set Connection "close" request header explicitly posting to AJAX web services

When moving from IIS6 to IIS7 integrated mode you may experience issues similar to below (POST request time-outs and request aborted errors) and all related to AJAX POST requests to IIS7 while AJAX GET requests don't have any problem.


Related threads (all unresolved)


  1. a few "Request timed out" exceptions every day
  2. "Request timed out" errors moving from IIS6 to IIS7
  3. Request Timed Out on Windows Server 2008/IIS7
  4. IIS7 Integrated Mode Request timed out

I experienced the same issue. Some HTTP POST requests got aborted in IIS7 pipeline and never reach server code. It can happen one per 1000 requests. However it may be still a significant number and if you have many users it can degrade performance for other requests.


In HTTPERR logs you can notice Timer_EntityBody errors related. POST requests usually arrive split in 2 segments: headers and following request body while GET requests arrive in one segment. On network monitor you can notice that server received POST request headers but request body never arrives to the server and after some timeout IIS aborts such request.


The solution which can help minimize the number of these errors is to set Connection "close" request header on the caller explicitly. Somehow IIS7 doesn't treat ajax POST requests very well, while IIS6 is more robust and able to handle missing Connection "close" header. Go figure.


Example for XMLHTTP javascript client:


oXmlHttp.setRequestHeader("Connection", "close");

Enjoy :)

Happy New Year

by Anatoly Lubarsky at 19:51 PM, 12/30/2009

Thanks for all who was around here in 2009 and helped us with great suggestions, feedback and support. We wish you all happy New Year!

Sun Microsystems today announced Cashpoint, the Austrian betting and gaming company, has selected Sun's MySQL™ database to manage its customer, Web, and results data. In order to protect its systems' high-availability, Cashpoint uses both a MySQL Enterprise™ Unlimited subscription and the MySQL Cluster database product. Employing an enterprise-wide MySQL Enterprise subscription along with MySQL Cluster contributes to considerable cost savings for the company.

Sun Microsystems, Inc. today announced the details of the Sun Tech Days worldwide developer conference for 2009-10 at http://developers.sun.com/events/techdays/index.jsp. Sun Tech Days is a multi-city world tour designed to showcase how the developer community can leverage Sun technologies, services and products to drive the next generation of industry innovation. Since 1998, the Sun Tech Days worldwide conference has brought together hundreds of thousands of developers and students to learn about technology from industry experts.

Moving

by Anatoly Lubarsky at 04:03 AM, 12/02/2009

Hello


We are moving to new servers and occasional brief downtimes are expected during December on our sites and apps. Sorry for the inconvenience.


In a procedural language without the use of threads (or Inter Process Communication via forks), to execute I/O requests they are done one after another. Synchronous Queries produce at best a Big-O of N such that N is an element of I/O communication (queries) and N equals the number of queries needed to achieve the requested dataset.
With IPC or threads we can speed up common O(N) problems to reduce the N with parallelism, its still functionally a O(N) yet from a single instance point of view N is much less because threads (IPC) takes that Serial computing component and executes the code in parallel. To better explain what I am talking about lets look at some PHP code:


foreach($friends as $friend){
$data[] = getMySQLData(“SELECT * FROM AccountData WHERE userid = $friend);
}


The Primary key for the AccountData table is userid. Assuming that you have 5000 friends, the query has to be executed 5000 times.
We can reduce the O(N) and change it to a O(nlogn) (Binary Tree - doesn't take into account other factors) by switching the query to


$data = getMYSQLData(“SELECT * FROM AccountData WHERE userid IN (….)”);


We just sped up the retrieval of the data significantly, yet we just introduced a bottle neck on the datalayer. Our architecture requires that the data is located in a single location.

What if AccountData’s data is spread across many servers federated by userid? This means that userid belongs to a server, so the server contains a shard of the AccountData’s Data.

Now we are back to a O(N) where each query needs to be executed on the corresponding shard. The logical next step is to group queries per shard and run across them all. For instance


$multiShardIDs = $genericShard->getMultipleShardIDs($objIds);
foreach ($multiShardIDs as $shardID => $shardUserIDs) {
if (stripos($orgQuery, " WHERE ") !== false){
$query = $orgQuery." AND {$column} IN (".implode(',', $shardUserIDs).") ";
}
else{
$query = $orgQuery." WHERE {$column} IN (".implode(',', $shardUserIDs).") ";
}

$shard_to_sql[$shardID] = $query;

.... more stuff ....



Yet this is still a O(N) its just that N is smaller. Each query is still executed serially.
Let’s look at some stats of synchronous queries of SELECT 1; This query is executed across 35 shards and the timings are from PHP point of view.

FieldEnd ValueStart ValueDelta
ru_oublock00 0
ru_inblock00 0
ru_msgsnd00 0
ru_msgrcv00 0
ru_maxrss00 0
ru_ixrss00 0
ru_idrss00 0
ru_minflt98729865 7
ru_majflt00 0
ru_nsignals00 0
ru_nvcsw1134411114 230
ru_nivcsw977968 9
ru_nswap00 0
ru_utime.tv_usec865054849053 16001
ru_utime.tv_sec1616 0
ru_stime.tv_usec556097552097 4000
ru_stime.tv_sec11 0
Total Execution Time0.18323707580566




As you can see, to execute this from PHP it took 100 ms, 100s pages reclaimed and 200s voluntary context switches to query 35 servers.

Now let’s look at Asynchronous execution of SELECT 1; // the query generation is from PHP yet the execution is performed on a server that executes the query in parallel
FieldEnd ValueStart ValueDelta
ru_oublock00 0
ru_inblock00 0
ru_msgsnd00 0
ru_msgrcv00 0
ru_maxrss00 0
ru_ixrss00 0
ru_idrss00 0
ru_minflt91319121 10
ru_majflt00 0
ru_nsignals00 0
ru_nvcsw38913889 2
ru_nivcsw290290 0
ru_nswap00 0
ru_utime.tv_usec596287596287 0
ru_utime.tv_sec44 0
ru_stime.tv_usec460028460028 0
ru_stime.tv_sec00 0
Total Execution Time0.019363880157471




As you can see from the table above executing the query asynchronously produced results with less context switching, less pages reclaimed and almost 10 times execution improvement over the synchronous query counterpart.
How is the asynchronous query executed? Lets take a look at the figure below.

Async


So A user comes through the firewall / load balancer with a HTTP Request to the www pool that runs PHP. PHP now makes a CURL request to the Async Shard Servers (through a LB same LB different PORT). The HTTP Request to the Async Shard Server contains the SQL we wish to execute. The Async Shard Servers has a thread per shard and executes the request in parallel. The results are merged and sent to the calling CURL process via JSON. The returned JSON is then converted into a PHP object. This is a typical three-tier environment.

When having to query multiple servers using an Asynchronous Tier is dramatically faster; in fact its as fast as the slowest server. This is the main sticking point of why asynchronous queries are faster then synchronous queries (in this context) since the total execution time for serial queries is the SUM of all the query execution.

The current version of the server is used for Friend Query execution across the datalayer. Its been solid for a few months now, and I'm currently getting permission to release it as an Open Source Product. The features this server contains:

  • Lightweight

  • CPU bounded

  • Scales Linearly

  • A Timer Thread to keep the database config up to date in memory and fetching the config from PHP so if PHP changes connections to the shards so does Java

  • Uses Java-6 Executor Service

  • Merges the result set prior to sending it to the calling process

  • Communicates via JSON

  • Uses MySQL Connector/J

  • Supports a high concurrency

  • Optimized thread usage

Sun Microsystems, Inc. today announced that CASPUR, a non-profit, consortium of Italian universities focused on scientific supercomputing and innovative technologies, has subscribed to Sun's MySQL Enterprise™ database service.

E.ON ES, a subsidiary within E.ON Group, today announced that it has purchased a multi-year MySQL Enterprise™ database subscription from Sun Microsystems to help power its advanced work order management system, used for the service and maintenance of Sweden's electricity distribution grid.

I am not a fan of upgrading mySQL unless I need to. I am of the mind if it is not broke don't fix it, but when I do upgrade I follow these general steps.

If I have run into a mySQL bug, I look to see if that bug is fixed by searching the mySQL bug database.

If I've notice a performance bottleneck, I look to see if the performance bottleneck has been fixed by searching the same database.

I will NOT upgrade to the latest and greatest version of mySQL (5.4) I stay within my branch (5.0).

These are my three general motivations that drive my upgrade decisions. Anytime I upgrade I also make a list of things that might affect my environment for the stuff I use.

    Here are my steps:
  1. Check the change log
  2. Ignore all the NDB changes... I don't use it and that's the majority of fixes. This is also, why I do not use it.
  3. List the changes that will affect the production environment
  4. Deploy the version that I picked on a few servers running my original config
  5. Do data corruption tests (make sure my checksum scripts return the same data)
  6. Verify that the problem I'm trying to fix is fixed
  7. Deploy to more boxes
  8. Let the new server bake for a period of no less than a week
  9. Deploy everyplace



So now, I'm upgrading from 5.0.56 to 5.0.86. What I'm trying to fix is mysql memory overhead at high levels of ram.

For instance, I have a slew of 48GB boxes. I set the bufferpool to 40GB; the OS uses 1 GB of memory (roughly) leaving an overhead of 7GB for the system cache and various spikes of sort buffers. Over a period, I see that mySQL will consume and hold onto 47GB of memory for an unknown reason even with some tight my.cnf settings. (I'm certain they are tight since I know what each buffer does). Therefore, testing some later versions of mySQL we found that these later versions do not grow past the settings defined yet performs the same.

Next, since I decided that upgrading is a good solution, now it’s time to list all the changes that fixes things.


  • 5.0.58 - INNODB performance fix
  • 5.0.60 - various problems that I should be affected by but havn't noticed so it’s fair to assume that said problems were introduced after my build.
  • 5.0.62 - nothing major noticed the sp releases that's why I wait.
  • 5.0.64 - nothing major
  • 5.0.66 - security fixes and fixes to fix the bugs introduced from this build.
  • 5.0.67 - two INNODB performance fixes and crash bug fixes.
  • 5.0.68 - changes show status and fixes an innodb crash bug.
  • 5.0.70 - fix another INNODB crash bug and security fixes
  • 5.0.72 - more general bug fixes
  • 5.0.74 - more stuff I don't care about
  • 5.0.75 - stuff given to Enterprise users now in community
  • 5.8.76 - more bug fixes that I do not need
  • 5.0.78 - more bug fixes I do not care about (run MS Access on windows not mySQL)
  • 5.0.80 - problem with error messages for concurrency limits that caused an assert failure
  • 5.0.82 - Fixes to fix fixes for this build.
  • 5.0.83 more minor fixes that I don't seem to have a problem from
  • 5.0.84- more bug fixes for INNODB and latches
  • 5.0.85 - looks like windows fixes
  • 5.0.86 - fixes that I'm not having problems with


Therefore, overall, upgrading should give me a boost in performance. My own internal testing sees some tighter memory usage, even though this is not fixed explicitly, the product has matured overall so I can account for the reduction in memory to that.

Request time is proportional to server load. If the application response time is big so will be the server's load. To reduce server load, reduce the wait time in the application's response time to serve the request. Below are some steps that I took to remove 1.8 ms overhead on every request to my web server farm.


Tools Needed:
vi
strace
top

Use vi to look at your include path in php.init.
Next use top to find which apache process is consuming the most cpu resources.
Use strace -p [TOP HTTPD PROCESS] -T (-T is for deltas).

In my example the include path is
/usr/share/pear:/usr/lib64/pear:.:/var/www/html/httdocs/


lstat("/usr", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000033>
lstat("/usr/share", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000034>
lstat("/usr/share/pear", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000034>
lstat("/usr/share/pear/ams", 0x7fbfff1690) = -1 ENOENT (No such file or directory) <0.000033>
open("/usr/share/pear/ams/include/FreqCapInfo.php", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000036>
lstat("/usr", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000037>
lstat("/usr/lib64", {st_mode=S_IFDIR|0755, st_size=65536, ...}) = 0 <0.000034>
lstat("/usr/lib64/php", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000034>
lstat("/usr/lib64/php/pear", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000038>
lstat("/usr/lib64/php/pear/ams", 0x7fbfff1690) = -1 ENOENT (No such file or directory) <0.000037>
open("/usr/lib64/php/pear/ams/include/FreqCapInfo.php", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000038>
open("/var/www/html/ams/include/FreqCapInfo.php", O_RDONLY) = 24 <0.000043>
fstat(24, {st_mode=S_IFREG|0775, st_size=6707, ...}) = 0 <0.000031>
stat("./ams/include/FreqCapInfo.php", 0x7fbfff4778) = -1 ENOENT (No such file or directory) <0.000037>
stat("/usr/share/pear/ams/include/FreqCapInfo.php", 0x7fbfff4778) = -1 ENOENT (No such file or directory) <0.000038>
stat("/usr/lib64/php/pear/ams/include/FreqCapInfo.php", 0x7fbfff4778) = -1 ENOENT (No such file or directory) <0.000036>
stat("/var/www/html/ams/include/FreqCapInfo.php", {st_mode=S_IFREG|0775, st_size=6707, ...}) = 0 <0.000040>
close(24) = 0 <0.000033>
mlock(0x552b876be0, 24) = 0 <0.000092>
mlock(0x552b8df910, 10624) = 0 <0.000038>
munlock(0x552b876be0, 24) = 0 <0.000036>
munlock(0x552b8df910, 10624) = 0 <0.000032>
mlock(0x552b876be0, 24) = 0 <0.001707>
mlock(0x552b8df910, 10624) = 0 <0.000009>
munlock(0x552b876be0, 24) = 0 <0.000007>
munlock(0x552b8df910, 10624) = 0 <0.000007>



Looking at the strace, 15 unneeded system calls are made on every request, each request roughly takes 30 micro seconds, for a total of a few ms wasted on every request. False positives are adding overhead to ever requests since the include path is not optimized. A Bloom Filter in Shared Memory would be perfect for for this part of PHP-but that's besides the point.


So, change your include path
.:/var/www/html/httdocs/:/usr/share/pear:/usr/lib64/pear



For my example I changed the include path to the above. In my environment we don't do many PEAR loads, so it makes sense to use our directories 1st.

By doing this 15 erroneous system calls have been removed. Note: If you use the php feature __autoload make sure to protect your calls with file exist or you may be doing a require once on a file that is in a different directory which is a PHP fatal Error.


UPDATED: Some more goodies to reduce load on PHP boxes:

For PHP 5.2 there is a nice new feature that you can play with

realpath_cache_size. This is a directive native to PHP by default it's set to 16K, this means that PHP expects there to be very few files, but big files. In most environments I have been in this is not the case. The case is there are a lot of files, a lot of directories thus 16K is not enough.

I've tested a few values but 128K seems to be my sweet spot.

realpath_cache_size = 128K

Fosimo — Backup Facebook Inbox Available

by Anatoly Lubarsky at 15:46 PM, 09/11/2009

As announced previously backup facebook inbox functionality is now available on Fosimo. It took some time until the app was whitelisted on Facebook in terms of reading users inbox permission but now it should be operational. If you already have Fosimo 1.0.8 it should work.


Download here. Hurry up because maybe it is the first app on Facebook which is capable of doing this. For now your inbox is saved in XML format.


Enjoy :)

What is Nagios? Nagios IMHO is the best Open Source monitoring system out there. It supports hosts checks, a level to determine on a host level if a box is considered "up". It supports service check, a level to determine if a particular service such as mySQL is up. It has features to log all events to a flat file or to a DB. It can notify you when a service is in a warning state, error state or unknown state.

For the purpose of this article, I am going to talk about handling events such as a clearing up swap.

First, let us look at some configuration of Nagios. We are going to define a command, then service acting on that command. Let us assume that the nagios install is in /usr/local/nagios.

Therefore, in /usr/local/nagios/ a few configuration files are key:
- /usr/local/nagios/etc/objects/commands.cfg - the command file where the checks are defined
- /usr/local/nagios/etc/hosts/*/hosts.cfg - the services file where the checks are defined for execution based on other directives in this file.


A command:


# 'check_local_swap' command definition
define command{
command_name check_local_swap
command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
}


This says that check_local_swap executes check_swap with a warning threshold of $ARG1 and a critical threshold or $ARG2


Next when defining a service for a host

define service{
use generic-service; Name of service template to use
host_name dbfacebook34b ; hostname
service_description SYS:Swap ; what shows up in alerts
is_volatile 0
check_period 24x7 ; threshold when to check (all the time)
max_check_attempts 4 ; threshold to check before marking state
event_handler handle-swap ; handle an event (another command)
normal_check_interval 5 ; in seconds
retry_check_interval 1 ; only try once before reporting the state
contact_groups itops ; contact group to send notifications to
notification_options w,u,c,r ; need to look this up for all defs
notification_interval 600 ; retry sending notifs every 8 mins
notification_period 24x7 ; keep sending them
check_command check_nrpe!check_local_swap!80%!55% ; execute the event handler and warn like hell
}



Lots of goodies as you can see. Let us look at the event handler

define command{
command_name handle-swap
command_line /home/scripts/handle_swap.pl
}


This means execute this script whenever any event for swap occurs (I decided to make this simple and not put a threshold on this).


What does handle_swap.pl do - well it’s a perl script that looks at free memory and if only a few 100K of swap is in use, swapoff -a; swapon -a;

In this case, it is a bit safe to do this. Why do this? Why not just turn of swap. I have talked in depth about this subject-but for a minor recap. Linux needs swap else, kswapd will freak out. Swap in DB's is bad so I clean it up automatically since O_DIRECT on my SAN is not an option.

Why not just run a cron job? Nagios keeps a log, I like to review what is happening from a central location, and nagios is freaking COOL.

Sun Microsystems, Inc. today announced an update to its MySQL Enterprise™ subscription, designed to assist corporate users improve the speed and uptime of their MySQL™ database applications. Thirty-day trial subscriptions of the new MySQL Enterprise Fall 2009 Release are now available free-of-charge from http://www.mysql.com/trials.

Fosimo 1.0.8 — Backup Facebook Inbox

by Anatoly Lubarsky at 14:43 PM, 09/01/2009

Fosimo 1.0.8 is available. Fosimo is a desktop app for Facebook which helps you control your Facebook account without browsing through Facebook.


Facebook has rolled out several important features recently including new inbox functionality today and new authentication scheme for desktop applications a couple of weeks ago so there are new features and adjustments.


New in Fosimo 1.0.8

  • Improved permissions configuration and authentication (supports recently introoduced "Publish Stream" and "Read Inbox" permission).
  • Improved photo upload functionality.
  • Improved set status (removed "is" verb). This is an old bug on Fosimo but finally fixed.
  • Many fixes and enhancements.
  • Backup facebook inbox: now you can backup your inbox on facebook to your hard drive with 1 click. Please note: Facebook is rolling out new inbox functionality now so it may take a couple of days until this feature is fully available for everyone.


Enjoy

Running MSI on Windows 7

by Anatoly Lubarsky at 14:17 PM, 09/01/2009

I didn't try Windows 7 yet. But seems that Windows 7 inherited user access control feature from Windows Vista as is. It maybe useful info for people running MSIs on Windows 7.


I learned it because lifehacker linked yesterday to my 2-year-old post about MSI Error 2869 on Windows Vista where I explained the error and how to deal with it (lifehacker had the same MSI problem on Windows 7). In short — the error is caused by user access control feature and you need to run MSI from command line as administrator as a workaround (or patch the MSI). I was a bit surprised yesterday while reading that 2-year-old post since it seems very techy and it took me a while to fully understand it :). Good coverage after all...

For the first time ever I took a two-week vacation. How can a person who has 100s of database servers, 100s of web servers, and a system that supports 100K tps across 20 TB of data is able to take a vacation? Easy, I have a great team that is very competent in managing the platform by following our cookbook routines and guidelines for new application interaction with the databases.

Where did I go? I went to NYC-stayed in Tribeca and only ate Pizza, Hotdogs, White Castle and Hala Food from vendors that are nearly on every street corner. I also lost 6 pounds! After three months of half-ass dieting to prepare for my Pizza diet, who would have known that just going for it was the key to success! How did I do it? Well, first I walked to Little Italy, bought 2-4 slices, ate them, and then walked around the city. That is it. As a Database Dude, I do not do much walking except for when I am too tired to run on the treadmill for 20 mins. I was averaging around 10 miles a day for a 12-hour period walking around NYC taking pictures, enjoying the sites, eating, drinking, and then repeat. I even walked from Canal Street all the way to the Staten Island Ferry just because. Man it was great.
After NYC, I went to Puerto Rico for the second time. My family and I had a freaking awesome time. We went to the only US National Park that is a Rain Forest. Went to Vieques to the Bio Luminous Bay, where the single cell organisms light up when kinetic energy is applied to them. Went to Culebra, to Flamenco Beach – which is off the hook: much better than any other beach I have been to. Drove around the entire Main Puerto Rican Island stopping off in Rincon and eating every 3-4 hours.
I am fully rested and feel like a 20 year old again. So, expect a lot more posts. I think the order will be:

Migrate 20 TB of data into a new Format without downtime.
How to make Friend Queries work with a database.
Throughput increase from Asynchronous Queries.

X2line's Audience is Over 2 Million Users (July 2009)

by Anatoly Lubarsky at 16:54 PM, 08/12/2009

Back in February 2009 we crossed 1 million users mark across all our applications on all platforms combined (link).


In July (or even in June) we reached 2 million. Right now the number is roughly 2243K total users (with more than 1 million users on Friendster alone).


Right now we have 13 applications with more than 50K users each.

Fosimo 1.0.7 — New Facebook Authentication Scheme

by Anatoly Lubarsky at 10:15 AM, 08/10/2009

Fosimo 1.0.7 is available. Fosimo is a desktop tool for Facebook which helps you control your Facebook account without browsing through Facebook.


New in Fosimo 1.0.7

  • Quick launch on double-click and slowing down animated windows just like in a Fosimo.TR for Twitter lates version.
  • New Facebook authentication scheme for desktop apps (which works through Facebook Connect) is implemented starting from Fosimo 1.0.7.
  • Many small fixes and enhancements.

Enjoy

Fosimo.TR 0.5.6 — Quick Tweet and My @responses

by Anatoly Lubarsky at 04:12 AM, 08/10/2009

Fosimo.TR 0.5.6 is available for download. Fosimo.TR is a light and simple Twitter client for windows which helps you keep track of your Twitter account without browsing through Twitter.


New in Fosimo.TR 0.5.6

  • My @responses: now one can view all @responses on twitter in real time (or "mentions" as twitter calls it). It is one of the items of the main menu.
  • Quick tweet: double-click on the tray icon will open a tweet window for a quick tweet.
  • When moving the mouse over animating icon it slows down. Users complained that small boxes move too fast and we finally address this issue slowing down each box after the mouse was over it.

We have more on our TODO list. We will roll up new versions regularly.


Download now!


Your feedback is appreciated. Don't forget to follow @fosimo on twitter. Enjoy.

Sun Microsystems, Inc. today announced that TweetMeme, the innovative online application that tracks the popularity of links sent via Twitter, has subscribed to Sun's MySQL Enterprise database offering in order to keep up with the microblogging service's explosive growth.

Created in 2008 by pioneering Web 2.0 startup Fav.or.it, TweetMeme gives the rapidly expanding Twitter community a means of easily seeing and sorting the most popular links on Twitter. TweetMeme also provides a realtime search facility of fully expanded, qualified and indexed links so that users can get answers to queries about very recent events or news.

mysql UC 2009 Talk

by Dathan Vance Pattishall at 13:30 PM, 06/29/2009

Scribe is a bit buggy with displaying this presentation:

Scaling a Widget Company

So, when you connect on the fly to a database your subject to a variety of issues, like when the db is not available and when the db does not have a route.

One of the main reasons why a dev may want to connect on the fly is because they have too many front ends to hold a persistent connection on the backends. Since mySQL does not use libevent, holding open threads to mySQL is much more costly. Threads == Memory.

But, that's here nor there. The main purpose of this post is to talk about how to recover from failed connections that block apache threads.

Common Failures:
No route to Host
Flapping NIC
Locked Tables
Recovering from a Crash
more of the same.


My Environment:
I have a bunch of webservers (200+) that all have 300 possible threads (60000 possible connections to a single DB) behind a load balancer that uses the LB least connections protocol to distribute load across all 200+ webservers.

Since PHP is used in a stateless mode (no guarentee that the same user will hit the same server), I have to have some way of telling all other apache procs for said box that a server is down. I use stateless on the fly connections, so each apache proc will test the connection.

Here in lies my most common problem. If a box in the backend dies, all apache threads will block for a predefined time-out.

In /etc/php.ini (it could be in /etc/php.d/mysql.ini) I set

mysql.connect_timeout = 5 // the default is 60


In my common_db class when connecting to a database, the connect routine returns a database handle object


if (PEAR::isError($dbh) && $delta_to_connect >= 5){

#
# mark ip as dead for 15 min
#

apc_store($ip, array('DEAD'), 900);
return false;
}

return $dbh;




Now PRIOR to calling the database connect code I check to see if the IP is up.

$status = apc_fetch($ip);
if ($status[0] === 'DEAD'){
return false;
}

... do connect ...


But, there could be a variety of issues that can call false positives, like network flaps, someone blocking the db for some time etc. So, I'll allow one request .1% of the time bypass the status check to try again.


if ($status[0] === 'DEAD' && (mt_rand(0, 1000) != 1)){
return false;
}



But, if your app can't connect to the DB aren't you down anyway?

Yes, although I don't have to restart all the httpds.
Also each DB has a redundant pair to when returning false, the app code will try the other set of servers.

I have a variety of methods to deal with these scenarios but this is the quickest to implement.

Open Source 'Leeches'

by Kristian Köhntopp at 07:24 AM, 06/01/2009

Infoworld has an article "The fight over open source 'leeches', about the failure of enterprises to contribute the changes they make to open source software back upstream to the project.

"When it comes to open source communities, individuals are much better citizens than institutions. The enlightened self-interest that causes individuals to send back bug fixes, contribute ideas for new features, and write documentation is much harder to find in institutions," Dan Woods, CTO of Evolved Media, wrote in Forbes earlier this year.
I have a very old article in my german language blog about Open Source Software and companies, and the infoworld article triggered me now to translate it:

Many companies do use open source, and that is a good thing. Most of that software is well suited to the task, and usually it is much easier to understand, more flexible and also easier to debug than their closed source equivalents. But open source requires that enterprises adjust their process with regard to changes and deployment of software in order to be successful open source users.

The fact that you can change open source software lures many companies into making changes to the software. But making the changes is just a small part of the work, and the most inexpensive one.

What usually happens is often this:



Continue reading "Open Source 'Leeches'"

There are two PRIMARY directions to sort data in SQL: Ascending (ASC) and Descending DESC.
When these two sort definitions are put together in a single statement a filesort is produced.

Why do we want to avoid filesorts?

Filesorts are bad. 1st they tickle a thread based buffer called sort_buffer_size. Additionally filesorts reads the data twice, unless max_length_for_sort_data limit is reached and as a result the Filesort runs slower to reduce disk I/O. If you want filesorts to run faster at the expense of the disk increase the default max_length_for_sort_data. You can read the filesort algorithm here.

So, here is an example


CREATE TABLE `ABCD` (
`A` int(10) unsigned NOT NULL default '0',
`B` int(10) unsigned NOT NULL default '0',
`C` int(10) unsigned NOT NULL default '0',
`D` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`a`,`b`,`c`,`d`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1



mysql> explain SELECT * FROM ABCD WHERE a=1 AND b=1 ORDER BY c DESC, d ASC\G

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ABCD
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: const,const
rows: 2
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)



Notice the filesort? So how does one get around this filesort?

Well

Let's define some roles for columns C and D. C is the parent while D is the child.

  • We want all the latest parents (C)
  • We want all the oldest children (D)

    We require pagination of all the PARENTS (show 10 parents per page) so Queries like this is PRODUCED

    SELECT * FROM ABCD WHERE A=? AND B=? ORDER BY C DESC
    explain SELECT * FROM ABCD WHERE a=1 AND b=1 ORDER BY c DESC LIMIT 10\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: ABCD
    type: ref
    possible_keys: PRIMARY
    key: PRIMARY
    key_len: 8
    ref: const,const
    rows: 2
    Extra: Using where; Using index
    1 row in set (0.00 sec)




    Now

    FOREACH($C_parent as $i => $c_id) {

    $C_parent[$i] = SELECT SQL_CALC_FOUND_ROWS * FROM ABCD WHERE A=? AND B=? AND C=$c_id ORDER BY D ASC LIMIT 1;


    }

    So, we changed 1 query into 11 queries (10 parents per page) to make the page load happen faster, by getting rid of the filesort.

    What 11 queries is faster then 1? Yes, for this case it is. The reason is because filesorts are SLOOOOW, they chew up a lot of limited resources and they should be avoided. I've see filesorts take close to 50-60% of the query time.

    This is the translation of an article from my german language blog. It is not a literal translation, but has been amended and changed a bit to take more recent information into account.

    It started out as a discussion within the german language MySQL group in USENET. There the eternal question came up why phpMyAdmin gets no love at all from the helpers and regulars in that group. My answer was:

    phpMyAdmin (PMA) like many other GUI tools for MySQL has a number of limitations. For a web tool such as PMA these come from its operating principles and can hardly be changed. But let's start at the beginning:

    In MYSQL the connection is a special context or scope for many things. At least the following things are part of the connection scope:

    • Transactions. A disconnect implies a ROLLBACK.
    • Locks. Transactions generate locks with writing statements or SELECT for UPDATE. Table locks are generated by LOCK TABLES. Disconnect releases the locks.
    • The number returned by LAST_INSERT_ID() is cached within the connection context. It is unavailable after a disconnect.
    • Tables created by CREATE TEMPORARY TABLE. They are deleted on disconnect.
    • Prepared Statements and the parsed bytecode of stored routines are kept per connection, even if that is the wrong solution from a performance POV.
    • @-variables. SET @bla = 10 or SELECT @bla := count(*) FROM cookie; are defining variables within the context of a connection. They are lost on disconnect.
    • SESSION-parameters. SET SESSION mysiam_sort_buffer_size = 1024*1024*64 or SET @@session.myisam_sort_buffer_size = 1024*1024*64 are setting configuration parameters with session scope. They are lost on disconnect.
    • Replication specific SET-commands such as SET TIMESTAMP or SET LAST_INSERT_ID can affect the behavior of functions such as SELECT NOW() or SELECT LAST_INSERT_ID().
    I am calling all of that connection scoped state.

    A client that is disconnecting in an uncontrolled way or that does not report a disconnect properly is defective in the sense that all functionality dependent on connection scoped state is unavailable. The opposite case also exists - a client that reuses existing connections may have connections that have an unclean state, which may or may not be a security risk.

    Continue reading "Connection Scoped State in MySQL"

    So say you have two mySQL servers called A and B. A and B are in a circular replication ring meaning

    A replicates from B
    B replicates from A

    In addition, log-slave-updates is on, so if one server of either A or B goes down you can recover all the data from either side.

    Therefore, here are some steps.
    Take a snapshot of a server in the ring, with your favorite backup tool (mine is ibbackup)

    When either server A or B dies, restore from said backup. Here are the steps. Let us assume server A goes down, and backups are done from server B.

    restore the backup of server B to server A

    On server-A turn off log-slave-updates (you will see why on the next line)
    Add replicate-same-server-id on server A to my.cnf

    CHANGE MASTER TO to the position and file of the binary log that is reported from your tool on server-A.

    IF the binary log did not get corrupted on server-B your cool, else flip the binary log to the next binary log from server-A on server-B.

    Let server-A catch up
    Stop server-A
    Add log-slave-updates on server-A
    remove replicate-same-server-id on server A
    Start server-A

    done.

    MySQL, Sun and Oracle

    by Kristian Köhntopp at 01:13 AM, 04/22/2009

    This is the translation of a text I wrote for my german language blog two days ago.

    Now Sun has been bought, and not by IBM or Cisco, but by Oracle. In the aftermath everbody is asking themselves - what happens to MySQL?

    Well, firstly MySQL and InnoDB are now part of the same company. Oracle has been maintaining InnoDB pretty well in the past, and that can only improve.

    Will Oracle let MySQL die and try to push their on products into the market? Hardly so. Why should Oracle do that, and if so, using what products?



    Continue reading "MySQL, Sun and Oracle"

    Sun Microsystems, Inc. today announced MySQL™ 5.4, a new version of the world's most popular open source database, designed to deliver significant performance and scalability improvements to MySQL applications. A preview version of MySQL 5.4 is available now for download at http://www.mysql.com/5.4.

    Today's announcement was made at the seventh annual MySQL Conference & Expo being held this week at the Santa Clara Convention Center. With more than 2,000 attendees, it is the world's largest event for open source database developers, DBAs, vendors and corporate IT managers.

    Oracle to Buy Sun

    at 18:00 PM, 04/19/2009

    Sun Microsystems and Oracle Corporation announced today they have entered into a definitive agreement under which Oracle will acquire Sun common stock for $9.50 per share in cash. The transaction is valued at approximately $7.4 billion, or $5.6 billion net of Sun's cash and debt.

    Want to learn how open source software helped Barack Obama get elected president? Don't miss the closing keynote at the MySQL Conference & Expo 2009, "Database We Can Believe In: Stories from the Front Lines (and Server Rooms) of Barack Obama's Online Presidential Campaign."

    Co-presented by Sun Microsystems and O'Reilly Media, the seventh annual MySQL Conference & Expo explores "Innovation Everywhere" April 20-23, 2009 at the Santa Clara Convention Center in Santa Clara, CA. With more than 2,000 attendees, it is the world's largest community event for open source database developers, DBAs, vendors, and corporate IT managers. For more information, please visit www.mysqlconf.com.

    There are a finite set of cases where mySQL clients will hang on a connection-for a small period of time (seconds) or indefinitely. Most notably no-route to host causes a timeout to occur which in most clients are on the order of seconds to recovery.

    In a web environment if a database is connected to on the fly, each connection made should take milliseconds-when the net is healthy. However, when a database server crashes a timeout for each connection takes seconds and there are cases where connections can just hang (recovery of INNODB as an example). For instance have you ever had a ssh session hang and a kill (SIGHUP..) does not work? This happens for mySQL client connections too.

    I want the front ends to recover gracefully from a no-route to host, or more notably a hanged connection condition. I want to avoid that damn timeout all together on stateless connections. Why? Because if you use up all your worker httpd threads requests Fail-things crash, bosses get mad, everyone wakes up and you have a meeting about it later. So what are ways around this?

    Use a Load balancer

    This is cool and all, but is expensive since two or three are needed for HA purposes. In addition, this increases the complexity of managing servers, and most LBs are optimized for HTTPD traffic, not raw TCP traffic.

    Use a memcache layer to keep DB state

    This is cool as well, but the state server, which collects the state of your servers and updates memcache, is now a single point of failure-and a lot of time is spent to make this resilient. Not to mention all clients across Programming languages (bash scripts, Java, PHP, Python, etc) need to have access to this memcache layer and logically handle the connection the same. This is doable, but there is an easier way.

    Use IRON DNS:
    IRON DNS is a term I use to build an HA Resolver. IF a box fails a health check, a nagios event handler can tell IRON DNS to set the domain names IP address to 127.0.0.1 or a routable interface that produces a Connection Refused (111 instead of 110).


    For an internal network, all database entries should be able to fit in memory. All resolves should take less then 2ms, and updates to DNS entries can happen in just a few seconds across your entire farm. The only drawback is if the DNS server fails to respond your site is boned. Making DNS Failure proof is rather easy-which is another post entirely.

    For my solution, I am using DNS. This allows me to recover servers in a shard on different boxes without having to change code. This also allows my environment to recover from blocked I/O events that can spike load on the front-end that make requests slow.

      Here are some conditions where connections take to long:
    • No Route to Host
    • INNODB recovery
    • Disk Fails (disk fills up)
    • Switch Fails / Flaps
    • Plug-in storage engine crashes, yet mySQL is up
    • DNS Resolve fails


    If you have a better method, or want to advise me on flaws I should watch out for, drop a comment.

    Hadoop Elastic MapReduce by AWS

    by Frank at 07:04 AM, 04/02/2009

    Amazon today launched a beta of it's Elastic MapReduce (hosted hadoop). This is exciting and just in time for my upcoming, Hadoop and MySQL: Friends with benefits, session at the MySQL Conference & Expo.

    I can't wait to try it out!

    Memcache is a fantastic Hash table-very fast and one of the great successes of Brad Fitzpatrick-who in my opinion has done more for the open social movement as an individual then anyone else. I use memcache quite extensively, now I am thinking about adding ZLIB native to compress the value of each key-much like how INNODB does with the Barracuda file format. The theory is with a CPU hit, we can store more data per memcache instance. I've talked to the Northscale guys and they love the idea. What do you think?


    Update: Well what do you know

    http://us3.php.net/manual/en/function.memcache-setcompressthreshold.php

    for PHP for instance compresses the data on the client side.

    There still might be some value compressing the data on the server-side, but now I'm not as motivated.

    What might be a good alternative is to compress keys into 8 byte longs in memcached automatically instead of the actual string that can be huge. To give some more detail,
    8 byte longs is a 64-bit int. A string can easily be converted into a big int by bit manipulation - and the address space is huge so key conflict is effectively removed.

    Cloud computing is the big thing now days, weather you are an app developer using EC2 or the Google APP engine, or a new company trying to build your own cloud product. If you are hosting, or using a platform it costs money. I hate to spend money, especially money that is for my company. If I find an idle server, I use it to 100% utilization (prior to the saturation point).

    I needed to build a new application that periodically crawls a website to update various lists. Building a crawler is expensive, especially from scratch. First, you have to define the amount of lag that is allowed from the crawl copy and the real copy. Of course the Project Manager does not want any lag, all events must be caught and near real-time without overloading the source of the data-but I am not hating, it is a challenge. Next, what technology to use, what language to write the app in-what considerations are left to be defined? How does one crawl Gigs, Tera, and amount of data in a guarantee period? On top of that, how much additional hardware is this going to cost. In addition, to be a cloud it needs to have an API so app developers can set, and get consistent data in an expected period. This is a lot of freaking requirements.

    Therefore, to solve this issue, I know that mySQL will store the data, but getting the data is the hard part. This is what is going to cost money, lots of it. I looked around common architectures and found that nothing would do what I wanted to do in a cost effective manner. So, I designed my own using Seti @ Home as the basis for the design.

    Get to the point already Dathan:

    I have turned ever user who views my applications into a collector, using idle bandwidth without knowing who is collecting the data. My user base on spare cycles will fetch a feed of my choosing, and sends that data to my servers without any personal information. Instead of using an Amazon or Google service, I have turned my user base into a cloud to service their needs.

    It is rather awesome-I must say. I am able to service the needs of more than 60 million users at the cost of development time, and NO NEW HARDWARE. The cloud does not have to be a service provider-it can be the end user as long as the end user is not impacted by the requests. BTW the team that I manage is freaking awesome-they built my vision with trial and error and a hand waved spec.

    Currently the system scales as long as there is enough end users. If I lose all my users then well I am boned, but to support the feeds all I need is 100K nodes at the current rate. With 60 million end nodes, I am cool.

    Imagine if Google with Adsense used this install base to tell Google if the data has changed for an arbitrary web address. All it needs is a few people to hit the same url, inform Google that the web address in question has a different checksum, then at that point Google’s crawlers can go fetch it, instead of constantly crawling data that doesn't change. Google would be able to reduce overall server cost significantly, if it just knew what data has changed instead of guessing what data has changed.

    By next years Velocity conferance I hope to have a full disclosure on what technology my team used, how my team get around cross domain issues, and how to compute checksums to validate the data.

    PS - I designed this, with my team we made it much better and one person implemented it and owns the product from this point on.

    Where I work, Merlin is an important tool for us and provides a lot of insight that other, more generic monitoring tools do not provide. We love it, and in fact love it such much that we have about 140 database agents reporting into Merlin 2.0 from about 120 different machines. That results in a data influx of about 1.2G a day without using QUAN, and in a data influx of about 6G a day using QUAN on a set of selected machines.

    It completely overwhelms the Merlin data purge process, so the merlin database grows out of bounds, which is quite unfortunate because our disk space is in fact very bounded.

    The immediate answer to our purge problem was to disable the merlin internal purge and with the kind help of MySQL support to create a script which generates a list of record ids to delete. These ids end up in a number of delete statements with very large WHERE ... IN (...) clauses that do the actual delete.

    This is a band-aid fix, which does work in a way, but also has unintended consequences, though. Or, as we use to say around here: 'That also breaks, but in a different and interesting way.'
    Continue reading "DELETE, innodb_max_purge_lag and a case for PARTITIONS"

    I will be attending Community One tomorrow and on Thursday at Marriott Marquis Hotel, New York, NY. I am especially looking forward to the announcements tomorrow which sound very interesting :)

    The first day is a free event featuring:

    * Cloud Platforms – Development and deployment in the cloud.
    * Social and Collaborative Platforms – Social networks and Web 2.0 trends.
    * RIAs and Scripting – Rich Internet Applications, scripting and tools.
    * Web Platforms – Dynamic languages, databases, and Web servers.
    * Server-side Platforms – SOA, tools, application servers, and databases.
    * Mobile Development – Mobile platforms, devices, tools and application development.
    * Operating Systems and Infrastructure – Operating systems and virtualization.
    * Free and Open – Open-source projects, business models, and trends.

    The second day of the event is focused on Deep Dives with two half-day sessions on MySQL and two full-day sessions on Java and Web development. I will be attending the session, "Using Java EE and SOA to Architect and Design Robust Enterprise Applications."

    Following the conference, I will be a panelist at a Cloud Computing Seminar at Microsoft office in NY. It's going to be a long but exciting day!

    It will be great to catch up with old and new friends at the event.

    Walking a table means, traversing each row, commonly used in building queues, fixing data, or dumping a table. I've recently ran into a problem-caused by an assumption, where walking a table was taking way to long using the method


    $pos = 0;
    do {

    $result = SELECT col FROM TABLE LIMIT $pos, 1000;
    $pos += 1000;
    } while ($result);

    The assumption was since INNODB uses a cluster index, this would traverse the table using the PRIMARY key. This is not the case, its not a problem in INNODB but a bad assumption, that I fell victim to. A table scan to each $pos occurs producing a Big-O of N^2. So, when the query:

    SELECT col FROM TABLE LIMIT 1000000, 1000 is executed mySQL will scan all the rows up to row position 1001000 and for each subsequent iteration.

    This is SLOOOOW. IMHO since the table is sorted by the primary key, mySQL should optimize this case - but it does not and will not. So, to walk an INNODB table fast, and keep liner time or a Big-O of N an alternative is

    $last_id = 0
    do {

    $result = SELECT col FROM TABLE USE INDEX(PRIMARY) WHERE pkey_part > $last_id LIMIT 1000
    $last_id = $result[count($result) - 1]->pkey_part

    }while($result);


    This dumps a table very fast, almost as fast as doing a count(*) on the PRIMARY KEY.

    Another method is to

    SELECT col INTO OUTFILE "/dir/file.ids" FROM TABLE;

    but the data is local to the database - thus the need for the application to grab data. Another draw back of this method is that the dump produces more disk IO then walking a table off of a key, slowing down access to this table.


    In conclusion, even if the storage engine keeps the table order consistent like INNODB does, do not assume that LIMIT 100000, 1000 is equivalent to a file seek of position 100000, without telling the Optimizer to use an index.

    Cloud Computing - Executive Seminar

    by Frank at 19:42 PM, 03/02/2009

    Tomorrow, I'll be attending the Executive Seminar on Cloud Computing at NASDAQ MarketSite (NY). Speakers include Dr. Werner Vogels and Mårten Mickos (ex-CEO of MySQL). Big thanks to Amazon and RightScale who were able to accommmodate my RSVP even when the registration had formally closed.

    I hope to be able to catch up with Mårten Mickos during the event. In case I do succeed in catching up, is there any question you want me to ask him? You can email me or post a comment.

    It's funny that the event site still shows Mårten's title as "SVP of Sun Microsystems’ Database Group."

    Came across an interesting post by Bret (co-founder of FriendFeed) about how FriendFeed uses MySQL to store "schema-less" data. According to the post, they weren't having issues with scaling existing features but rather they were experiencing pain when trying to add features.

    Now the way they are using MySQL is interesting and bizarre at the same time. At a very high level, it seems their approach is to use a RDBMS as if it is a column-oriented database. Of course, it makes me wonder why not just use a column-oriented database? I need to read the post again in the morning (too tired right now so just gave it a quick glance).

    I am very interested in hearing thoughts from my peers at Planet MySQL regarding this approach. They seem to have gone great lengths to go this route. What issues and benefits you see of this approach and whether you ever see yourself taking this route? I, for one, am not entirely convinced of this approach and whether it can really scale down the road. Also, if it was someone other than Friend feed going down that route, I might have actually lost my tempered and yelled :)

    Side note: Friendfeed is growing fast, and it would have been cool if Bret was speaking at one of the three upcoming MySQL events in April.

    I'll be going into detail what is Sharding, how to Shard, pitfalls of Sharding, performance/throughput gains, shard roles, and performance scaling in general. I hope to make this the most comprehensive talk to date on the subject in 45 min.

    The topic is called Scaling a Widget Company. I'll detail how I setup the data layer for Rockyou. How many transactions per second Rockyou is at, what the infrastructure is comprised of, how 99.999% uptime is achieved and hopefully get into BCP which I probably will not have time to go over.

    If you want me to focus on specific aspects on the subject of shard'ing let me know and I will :).

    Another project that I am doing introduces a new shard type. First, think of a Shard as RAID-10 on your database layer. The data is striped across N servers and mirrored for failure recovery.

    Shard Types are Shards that serve specific purposes. For instance, I have an archive shard. This shard keeps data the is rarely ever read on even cheaper hardware-with slower disks and the data is naively compressed using the innodb_file_format=Barracuda
    option. If and/or when a person requests data that is archived, my software layer detects this case and migrates the archive data from the super crappy RAID-5 servers to the less crappy RAID-10 servers with 32 GB of RAM, all within 5-10 seconds.


    Another shard type is the sub-shard. I use this terminology to define a different way of federating data for a predefined global object_id. A predefined global object_id is what you are federating by, for instance userid.


    userid X maps to Shard 3


    What if you had an application that is expected to take up 8TB of data? Sticking all of userid X's data on a single shard is very costly since current size of the cluster + 8TB means more servers are needed and is costly (more data, larger ranges, slower the avg query speed). So why not put it on another set of servers that have big slow'ish disks, i.e. the Sub-Shard.

    Now for userid X

    userid X maps to Shard 3
    userid X maps to SubShard 1


    Since some shard profiles do not require 100% uptime, I can run builds of mySQL that is cutting edge.

    Using the Percona 5.1 build, I ran across new tweaks for XtraDB. The patch is from the Google Patch made by Mark Callaghan's Google Team. Mark's architecture is purely disk I/O bound, and from his tests, INNODB does not use I/O effectively. He added a bunch of code dealing with the I/O performance bottlenecks that innodb native has. The three new tweaks addressing the I/O bottleneck specifically are

    innodb_io_capacity = 100 // If running without the patch this is the equivalent default behavior
    innodb_write_io_threads = 1
    innodb_read_io_threads = 1


    From what I gather, this means that DEFAULT INNODB uses one write thread 1 read thread and will only work on 100 pending iops at a time-which is probably why INNODB takes sooo long to shutdown since by default 90% of the buffer pool contains dirty pages.

    If your servers have a large innodb_buffer_size, on the order of a few Gigs and the server has many spindles (greater then 1) then increasing these params may help your application purge dirty pages faster. Be warned; do not increase this to high. If your box has a lot of pending I/O and this new code does not have the bandwidth to flush based on your settings, the box will freeze. That means you will have to remove traffic off of the server, let the pending io finish, lower the settings and restart.

    So, what is a good size to set this to?

    Percona says 100*Number of spindles you have on the db server. That seems a bit high for me. I would say 100*Number of spindles / 2 - to start out with, then over time ramp it up.

    Currently I am about to run 10 Shard or 20 servers with the Percona build. This will purely I/O bound load.

    Optimizer tricks

    by Dathan Vance Pattishall at 18:46 PM, 01/07/2009

    When you write good SQL, that use indexes properly there is one more obstacle that can slow down your app. The mySQL optimizer. From versions 3.23 to 5.1 the optimizer has been a problem for me. In mySQL 6.0 SUN/mySQL has resources improving it.

    I wrote a post detailing how to pick indexes to get the most out of mySQL here.

    Here is a post about the mySQL optimizer and what you can do to speed up your SQL SELECT statements.

    What I would like to share with you today, is that UPDATE and DELETE statements can also use optimizer tricks that SELECT uses. Its not documented on the mysql.com but it is possible to do something like


    UPDATE [YOUR TABLE] USE INDEX(`your_index_name`) SET col='val' WHERE [columns that satisfy your index].


    To see if you have problems with your UPDATE statements taking to much time I recommend 1st running INNODB as your storage engine since SHOW INNODB STATUS will indicate what index your long running UPDATE is using. This can be achieved by looking for /RECORD LOCKS space id.* index/ of SHOW INNODB STATUS.

    Another indicator that you might need to force the optimizer to do the right thing is to track "Deadlock detected" from concurrent update statements; look at the where clause of your UPDATE statement.

    Next solve your issues by telling mySQL to use the correct index when setting exclusive locks via the USE INDEX statement above.

    Hope this helps.

    Oh how I love register_shutdown_function of PHP. This bad baby will execute at the end of the scripts in call order. So, how is this useful?

    Say you want to log some action, but that logging DB is on another server. Additionally you do not want to take into account that writing to the other server may break your transaction because of timeouts if logged in the middle of a transaction. Another use case: you do not want to change a bunch of code in different places to batch all logging routines up.


    public function __construct($platform){
    parent::__construct($platform);
    register_shutdown_function(array($this, 'afterProcess'));
    }


    public function afterProcess(){

    foreach($this->getDataThatChange() as $key => $values){
    $insert_data[] = $values;
    }

    $this->getDBClass()->DB_logserver_insert_query("LoggingTable", $insert_data, 'delayed');


    }


    The function afterProcess will batch all the changes and do a bulk insert into a myISAM table so it can used the DELAYED functionality.

    This is done outside of all transactions, and at the end of the script as the data is returned.

    But, why not just use __deconstructor() in PHP? I want it to happen before the deconstructor is called.

    http://us3.php.net/manual/en/language.oop5.decon.php#76710

    I have been using register_shutdown_function for years now, especially to clean up connections at the end of script execution. Since someone asked me if this was possible in PHP, I decided to post this quick usage of a cool php method.

    CREATE TEMPORARY TABLE

    by Kristian Köhntopp at 05:15 AM, 11/26/2008

    If you have a slave, that slave is probably running with the read-only flag set in the mysqld-section of your my.cnf. To be able to write to a read-only slave you need to be the replication SQL_THREAD or have SUPER privilege.

    Since 5.0.16, it is still possible to execute CREATE TEMPORARY TABLE on a read-only slave, so CREATE TEMPORARY TABLE privilege also allows you to write to a read-only slave in a limited and controlled way.

    If you want to process a lot of data in a temporary table, you are probably creating the temporary table without any indices, then INSERT ... SELECT data into it, and then ALTER TABLE ... ADD INDEX afterwards, because that is usually faster than to insert data into a table with indices. Only that you cannot ALTER TABLE a temporary table, even on a server that is not read-only - in order to run ALTER TABLE on any table, even temporary onces, you need ALTER TABLE privilege which you might not want to give out lightly.

    There is no reason at all to check ALTER TABLE privilege for an alter table operation on a temporary table, because that table is visible only to your connection and cannot be shared. It is also deleted when you disconnect. In fact there is no reason at all to check permissions for temporary tables. But it is done.

    Because it is done, you can grant ALTER TABLE to a single table that does exist or even to a table that does not yet exist. If you grant ALTER TABLE privilege to an existing table on a read-only server, you cannot alter that table, because the server is read-only. If you then use CREATE TEMPORARY TABLE to create a table with that name, the temporary table will shadow the existing persistent table and for your connection the persistent table will become inaccessible.

    The semantics of the GRANT will change, though, and will now apply to the temporary table, which is writeable on a read-only server because it is temporary, and is alterable because of the grant which was not meant to apply to it in the first place. Problem solved: I now can ALTER TABLE my temporary table on the read-only server after I have finished my data load.

    All is well? Not!

    There are multiple things at work here which I consider broken:

    1. GRANTs are applied to temporary tables. This is not making any sense at all in my book. Temporary tables are connection-local objects and they cannot have grants applied to them which were always referring to persistent objects when they were made.
    2. Temporary tables can shadow persistent tables in the namespace of a connection. Because GRANTS are tied to objects via the objects name and not an objects UUID or another form of truly unique object-identifier, GRANTS can refer to changing objects even when the grant does not change. This feels somehow broken, as in "not properly normalized". Does RENAME TABLE edit grant tables as well? I have to check!
    3. By granting CREATE TEMPORARY TABLE privilege to a user I am allowing that user to shadow any other object within a schema. The temporary table will then pick up any rights granted to the shadowed object for the duration of its lifetime. This cannot be good.

    OpenSQL Camp Starts Tomorrow!

    by Frank at 20:47 PM, 11/13/2008

    So the good news is that the inaugural OpenSQL Camp is going to be an awesome event with mouth watering sessions by noted experts. The bad news (for me) is that I won't be attending it, which makes me sad. I cannot leave my town because my wife can go into labor anytime now.

    The session list looks great! Congratulations and thanks to Baron, Sheeri, Ronald, all the sponsors and contributors for organizing the first OpenSQL Camp.

    I will be watching PlanetMySQL closely for juicy blog posts. Hopefully, the one and only Sheeri is taking her camcorder!

    Stack Overflow: Q&A Site

    by Frank at 07:37 AM, 11/10/2008

    Today I discovered Stack Overflow, a collaborative site that focuses on technical Questions. You can ask questions related to any language, apparently without having to register. The site is currently in beta. There are also a few MySQL questions that are currently unanswered.

    I am currently tasked with writing Software Requirements Specification (SRS) document for a project. Effective sharding (based on specific criterion) and Scalability are key requirements of the project.

    Scalability is traditionally classified as a non-functional requirement. My question to the community is that if scalability is crucial to a project, would it still be classified as a non-functional requirement? Are their cases when scalability requirements would be best classified as functional requirements?

    Open Source Pony Tail

    by Frank at 10:54 AM, 10/31/2008

    Sorry for not updating this blog regularly. My wife's due date is soon so I've busy.

    Anyway, I wanted to share this very funny interview with Jonathan Schwartz (puppet):

    I wrote an application that is able to send out 3-8 million messages an hour with only 10 CPU's. This application is a part of an Offline Task system that scales linearly.

    How is this done, I'll go into detail hopefully at the mySQL conference if they accept my proposal.

    The scope of this blog post is to go over building the "Task Queues". Currently I have 13 Queues, one queue for each Shard that I run. The data is federated by user or randomly federated with a GUID that lives as long as the job. A request came in to add 20 million jobs to the queue all at once. The problem is with this list, will I cause deadlocks in innodb as I add the jobs to the queue as one transaction? Can live traffic still write to this queue?

    To verify that Deadlocks will not occur - having an understanding about how locks work in INNODB is key. I suggest reading this page.

    To build the queue I dumped the data source by


    SELECT identifier, 16 INTO OUTFILE "/data/mysql/BuildQueue.log" FROM SOURCE_TABLE WHERE CONDITION.


    The isolation level is REPEATABLE-READ; I'm setting a shared lock to get the most current version of the data. Writes are not blocked.

    Next:

    I create a table on each shard where the queue is located.

    CREATE TABLE IF NOT EXISTS OfflineTasksHold (
    object_id bigint(20) NOT NULL DEFAULT 0,
    object_type smallint unsigned NOT NULL DEFAULT 0,
    PRIMARY KEY(object_id,object_type)
    ) ENGINE=INNODB;



    Then I issue a command on each Shard


    LOAD DATA INFILE "/data/mysql/BuildQueue.log" INSERT INTO TABLE OfflineTasksHold;
    START TRANSACTION;
    INSERT INTO OfflineTasks (object_id, object_type) SELECT * FROM OfflineTasksHold;
    COMMIT;


    Each shard is getting around 4-5 million rows, while accepting real-time traffic of 20-60 tasks a second to the OfflineTasks table. The OfflineTasksHold table does not have any real-time requirements and is solely used to keep the queue in Primary Key order, plus there is a CHANCE that LOAD DATA could set an exclusive lock on the OfflineTasks table-shutting down adding data to the table by the live site. The INSERT sets an exclusive lock on the rows that are being added, so the Offline Task Sheppard - the process that pops tasks off the queue is blocked for a small period - which is acceptable. Why are they blocked? Well, the massive insert sets an Exclusive Lock. The Sheppard is trying to grab the rows that are locked waiting up to 50 seconds, until innodb_wait_timeout is reached. This condition is acceptable. All other inserts are able to go into the queue without a slow down.

    In summary, the job queue is built and can be automated with confidence knowing that death to the various app will not occur. Processing slows down for a bit but speed right back up.

    Today's "commercial quality disks" are amazing but they follow the same limitations as yesterday's disks. mySQL scales very well, but disks do not. So if you're IO bound when will your expectation of speed fail?

    Test setup:

    DELL 2950 PERC-6 HWRaid BBC 6 DISK 15K RPM 3.5" RAID-10 256K stripe across two channels-using WRITE THROUGH CACHE on mkfs.ext3 -T largefile4 Linux Filesystem.


    The theory is that the outer part of the spindles is the fastest, and the inner portion is slower - since the outer is where the data starts (thanks for the info Benjamin Schweizer). Thus one can conclude that the more disk space your application(s) use the slower the throughput, since the heads have to move more. Brad F. my co-worker did a benchmark to prove this. Our goal is to find out at what is the saturation point if our expectation is to have 22 MB / sec of random access.

    Why do we want 22MB / sec of random access throughput? We want to guarantee a certain level of performance when adding new apps to a common backend-which is I/O bound: we need to know when things will break.

    Here is what Brad found: Total disk size for our RAID-10 setup =~ 800G. What point does it FAIL to achieve our expectations of sustained 22MB/s?


    rndrw test across 100G test / 750G LV =~ 35 MB/s # outer part of the spindles
    rndrw test across 100G test / 300G LV =~ 32 MB/s # outer part of the spindles
    rndrw test across 250G test / 300G LV =~ 24 MB/s # sweet spot
    rndrw test across 350G test / 384G LV =~ 21 MB/s # saturation point
    rndrw test across 750G test / 800G LV =~ 14 MB/s # waste of space


    In conclusion these test show that even though a RAID-10 setup with 800G of space is available, the expected performance drops when data exceeds the sweet spot of 250G-300G of 800G usable-data array.

    Disclaimers: There are many factors that can raise or lower the bar, like different file systems, different I/O schedulers, flushing. For my setups I like

    Deadline I/O scheduler
    256K Stripe
    few inodes (don't need them)
    ext3 since that’s what stable and available.

    Tomorrow morning I am presenting a session, Startup Scalability Strategies, at Startonomics, a conference being organized by Dave McClure and Deal Maker Media. The sessions will be streamed live using UStream. Check the Startonomics website at http://startonomics.com for more details.

    Also check out my guest blog post titled How Important Is Scalability written for Startonomics blog.

    Let me first start of with the disclaimer, that I do not use memcache to scale, I use it to reduce latency. I'm of firm belief that the database layer should be able to handle the requests, while memcache is used to keep frequent requests returning in a consistent time frame i.e. reduce I/O spikes.

    Capacity planning is key to making sure your site can serve the requests to users. If the site is slow, or down that is loss revenue in any revenue model used to monetize your product.


    How to determine when you need to add more memcache servers.

    The stats I look at are system stats and memcache stats.

    Memcache is Memory / network heavy. CPU spikes are very low, and if the CPU starts maxing out that is probably due to some sort of network driver issue or huge context switching or large values stored in memcache.

    So on the system side I look at vmstat


    [root@memcached1 ~]# vmstat 5
    procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
    r b swpd free buff cache si so bi bo in cs us sy id wa
    0 0 208 207400 41452 50552 0 0 10 198 0 0 2 9 87 2
    0 1 208 207408 41452 50552 0 0 0 81 15617 17671 3 13 84 0
    0 0 208 207352 41452 50552 0 0 0 56 15508 17514 3 13 84 0
    1 0 208 207248 41452 50620 0 0 0 310 15295 16762 3 12 84 0
    0 0 208 207248 41452 50620 0 0 0 31 15512 17167 2 13 84 0
    0 0 208 207256 41452 50620 0 0 0 3 15925 18214 3 14 84 0
    0 0 208 207264 41452 50620 0 0 0 0 15456 16923 3 13 85 0
    0 0 208 207264 41452 50620 0 0 0 213 15782 17604 3 13 84 0
    0 0 208 207264 41452 50620 0 0 0 40 15860 18036 2 13 84 0
    2 0 208 207272 41452 50620 0 0 0 214 15926 18248 3 14 84 0
    0 0 208 207288 41452 50620 0 0 0 77 15781 17617 3 13 84 0


    This server dedicated to memcache. The context switching is huge due to all of the constant requests-but we are talking about modern day CPU's which can context switch like crazy. The thing that is bugging me is that requests are starting to go into the run queue, not at a alarming rate but still this is an indication of some possible issue.

    This is something that is graphed on ganglia. If the run queue on average starts increasing, there is some problem.


    Next stats from memcache.


    /**
    * @desc get extended status from all servers
    */
    public function CacheGetStats(){

    if ($GLOBALS[cfg][disable_feature_memcache]){

    return true;

    }

    return $this->memcache_obj->getExtendedStats();
    }



    I have a class called Cache which is a wrapper around memcache class calls. Cal Henderson would kill me if I was using classes at Flickr. Don't get me wrong I agree with Cal 100% but the environment I am in now requires classes-so I have to use it. The reason why we don't like classes is for another post.

    So the output.

    [pid] => 17696
    [uptime] => 2748911
    [time] => 1221850214
    [version] => 1.2.2
    [pointer_size] => 64
    [rusage_user] => 135944.231335
    [rusage_system] => 420733.419798
    [curr_items] => 6012187
    [total_items] => 2362145406
    [bytes] => 4737438938
    [curr_connections] => 654
    [total_connections] => 4128179078
    [connection_structures] => 7293
    [cmd_get] => 12681552588
    [cmd_set] => 2362145408
    [get_hits] => 9880855733
    [get_misses] => 2800696855
    [evictions] => 0
    [bytes_read] => 2564412782739
    [bytes_written] => 12893067371405
    [limit_maxbytes] => 5242880000
    [threads] => 4



    Notice on this server we have a good hit rate and no evictions. Yet looking at one server is not good enough, look at them all- the reason more memcache servers means more memory to store data for your application. The CRC32 hash that the PHP memcache client uses is not very even and some keys may be requested more.



    [pid] => 13956
    [uptime] => 4228079
    [time] => 1221850213
    [version] => 1.2.2
    [pointer_size] => 64
    [rusage_user] => 268369.193681
    [rusage_system] => 711491.537845
    [curr_items] => 5219411
    [total_items] => 3686853272
    [bytes] => 4751658935
    [curr_connections] => 675
    [total_connections] => 4154000955
    [connection_structures] => 9981
    [cmd_get] => 19489963453
    [cmd_set] => 3686853275
    [get_hits] => 15062084538
    [get_misses] => 4427878915
    [evictions] => 11210410
    [bytes_read] => 3908139025173
    [bytes_written] => 10744393525089
    [limit_maxbytes] => 5242880000
    [threads] => 4



    Take a look at this server. The evictions are high, indicating that memcache needs to make room for new objects. This is not good, its an indication that the LRU is evicting objects out faster then their expire time. Additionally the memcache gets are much greater then the hits. This is an indication that memcache is not really working as good as it can.

    But one server is not an indication that there is a problem. Looking at the system as a whole is to determine if a problem exists. My rule of thumb is if the 30-40% of the servers have a high eviction rate, its time to add 30-40% more servers or memory.

    Now allot of this can be tuned by changing the slab size, but learning from John Allspaw, don't make a plan based on a possible gain, make a plan based on the current usage. Then if the possible gain works your golden.

    How do you base your stats on adding more memcache servers?

    I had a lot of good experience using fabforce's DB designer. Now that I upgraded to mySQL workbench, I have nothing but good things to say.

    In 5 mins I was up and running after install. I took my mysqldump of my schema


    mysqldump -d --no-add-drop-tables


    Clicked File->Import->Reverse Engineer MySQL Script

    Tada a picture of my schema.

    Pictures are always nice to represent what your conveying to individuals who may not be as experienced with the db structure as yourself. Just for this feature alone I recommend using Work Bench.

    I attended several events in June of this year including Graphing Social Patterns East, Velocity and Structure 08. At each of these events, I tried to take some notes and posted them to my personal blog. I received a few pings from readers of this blog to point them to a list of these posts. It took some time but here is the list of my notes. In some cases, I have linked directly to the presentation files.

    High-performance Ajax Applications: Julien Lecomte (Yahoo!) talked about how to effectively use AJAX in your applications without compromising performance.
    Slideshare: High performance Ajax Applications

    Stress, Load and Performance Testing in Quality Assurance: Excellent tips on stress and performance testing by Goranka Bjedov of Google.

    Actionable Logging for Smoother Operation and Faster Recovery: Mandi Walls from AOL talked about logging in general including actionable logging, why it's important, logging goals, log file management, things to avoid in logs and more. (Presentation slides)

    Clouds are No Substitute for Competence: Presented by Javier Soltero of Hyperic

    Energy Efficient Operations: Some Challenges and Opportunities: Luiz Barroso from Google presented this very interesting and informative session about making operations energy efficient.

    Innovation That Drives Opportunity for the Web Infrastructure: John Folwer (Sun Microsystems) was the speaker at this talk about Web 2.0 architectures. (Presentation slides)

    Importance of Operations and Performance: Artur Bergman of Wikia talked about lessons learned while running 7000 wikis.

    Jiffy: Real World Performance Measurement: In this session Scott Ruthfield talks about Jiffy, an open source tool for performance measurement and instrumentation. (Presentation slides)

    KITE: Keynote Internet Testing Environment Launch: KITE was one of the interesting products launched at Velocity. KITE allows you to test from desktop to the Internet cloud. At the time of launch KITE was free. Don't know the current pricing model. (Presentation slides)

    Harnessing Explosive Growth: Infrastructure Strategies and Tactics: Panelists including Sandy Jen, Akash Garg, Jeremiah Robinson, Jonathan Heiliger and James Barrese discussed strategies and tactics for handling explosive growth.

    The Race to the Next Database: Overclocking and Analytics Augment Your Data Layer: At Structure 08, panelists on this session included Mayank Bawa (Aster Data Systems), Doug Judd (Zvents), Luke Lonergan (Greenplum), Damian Black (SQLstream), Dave Schrader (Teradata) and Scott Wiener (Cloud9Analytics). Each panelist provided insight into the ground breaking work their company is doing in solving data processing and handling BI challenges faced by consumers today.

    Working the Clouds: NextGen Infrastructure for New Entrepreneurs: This panel on cloud computing featured panelists including Geva Perry (GigaSpaces), Jason Hoffman (Joyent), Tony Lucas (XCalibre), Lew Moorman (Rackspace), Christophe Bisciglia (Google / AppEngine) and Joe Weinman (AT&T). Christophe got grilled heavily by other panelists but he handled it pretty darn well.

    Werner Vogels: Keynote at Structure 08: Dr. Vogels keynote was one of the highlights of Structure 08. He presented case study of Animoto and talked about the 70/30 switch among other things.

    The Platform Revolution: A Look into disruptive technologies: Jonathan Yarmis AMR Research (VP of Disruptive Technologies) talked about technology trends, social networking, mobility, mobile , cloud computing, stream computing, business models, user 2.0 and the new enterprise reality.

    Green Data Centers: Bill Coleman (Cassatt Corporation) presented this session. Bill is known for being responsible for the B in BEA. (Presentation slides)

    Creating Bebo Applications: Bebo is now part of OpenSocial and this presentation presented at Graphing Social Patterns talks about how to create applications for Bebo.

    Open Social and Google App Engine: Patrick Chanezon (API Evangelist) and Paul McDonald (Product Manager for Google App Engine) presented a technical overview of OpenSocial and Google App Engine at Graphing Social Patterns East.

    OpenSocial: Open for Business: In this session, panelists were Patrick Chanezon (Google), Paul Lindner (hi5), Max Newbould (MySpace) and Sachin Rekhi (imeem). (Also see)

    Viral Marketing and Advertising Strategies for social networks: One of the best sessions at the Graphing Social Patterns conference presented by Kevin Barenblat and Jeff Ragovin. (Also see)

    Mobile Social Networks: A Comparison: Benjamin Joffe's excellent eye opening session for anyone interested in using mobile platform for creating social networking solutions.

    Top 5 Things that fail and win on social networks: Dave McClure, chair of Graphing Social Patterns, presented this concise but every effective presentation on what fails and what wins on social networks.

    Geek Metrics: Using App Analytics to Drive Distribution, Engagement, & Monetization: Dave McClure (500 Hats) moderated this panel which included Hiten Shah (CrazyEgg / KISSmetrics), Ian Swanson (Sometrics, Inc.), Albert Lai (Kontagent) and Roy Pereira (Refresh)

    Social + Mobile = Sociable (Social Networks for SMS, IM & Mobile Devices: Panelists in this session included Benjamin Joffe, Ben Keighran, Gregory Cypes, Craig Dalton and Chris Butler.

    Widget Strategies & Social Platforms: Hooman Radfar, CEO of Clearspring Technologies discussed the new role of widgets and how to go about creating them.

    Facebook Business and Marketing Solutions: Kent Schoen talked about how to use Facebook for business and marketing.

    Developing and Promoting Social Network Applications: Rules of thumb: What does FACEBOOK means when it comes to creating and promoting applications for social networks?

    Social Networks for Business and Marketing Managers: Ro Choy of Rock You! gave an overview of social networks for business managers:

    Scaling MySQL - powered Web Sites by Sharding and Replication: Slides from Peter Zaitsev's session at Velocity. (Presentation slides)

    Capacity Management: John Allspaw's signature presentation on capacity management. John also has a book coming out on this topic. (Presentation slides)

    Structure 08 on demand: Watch the Structure 08 conference on demand at Mogulus.

    LinkedIn Communication Architecture: Slides about LinkedIn's platform built in Java. (Presentation slides)

    SOX Compliance: A presentation by Skye Rogers. I missed this presentation but then caught up with Skye at dinner. I wish Skye would have received more time to discuss SOX Compliance. (Presentation slides)

    There were several sessions I didn't get to go to which is a sad thing. You may want to check the conference websites directly (linked at top of this post) to see if there are presentation slides available. Also, if you took notes at these sessions, please feel free to drop the links as comments to this post.

    S3 suffers major outage

    by Frank at 20:23 PM, 07/20/2008

    Funny how Amazon doesn't use S3 to store any assets for amazon.comtweet by @gruber


    Amazon's S3 suffered a major outage today knocking many websites offline. S3 outage started at approximately 12:00 PM EST and the last time I checked at 11:11PM EST, Smugmug, a popular photo hosting site that extensively uses S3, was still down.

    - S3 down for more than 7 hours
    - S3 outage, 7 hours and counting
    - S3 down again
    - Amazon failure downs Web 2.0 sites
    - Amazon's S3 experiencing outage

    Currently, there are several great opportunities with exciting companies available in the New York area. If you're a rock star Java/PHP/Ruby developer or a pixel-obsessed designer, contact me at your earliest convenience.

    Web Developer:

    Give Real is a well-funded startup in the midst of an exciting period of growth and success. Our technology uses a patent pending platform that combines the ubiquity of credit card transactions and the power of social networks to create a new gifting experience.

    Our primary platform is Rails, but there are programming challenges that range from SOAP APIs to Facebook application development. We are searching for full-time developers with expertise and broad experience in:

    * Ruby on Rails (we also use rSpec, Starling, Memcache)
    * MySQL
    * xHTML & CSS, and comfort with Javascript
    * Team development with tools like Git & Trac

    In addition, we are also interested in candidates who have:

    * Expert Javascript skills
    * Java & SOAP experience
    * Experience scaling with Rails, or any other web platform
    * Comprehensive Linux knowledge
    * UI and graphic design backgrounds

    We are willing to pay top-notch developers very competitively (plus the possibility of options) to join our team and help write code that will be used by hundreds of thousands of users within a few months. We are ideally located in downtown Manhattan less than a minute walk from the BDFV and NRQW lines

    Also, if you know someone who may be a good fit for us (developer or graphic designer), we are offering a $1000 referral reward for anyone we hire.

    Please contact us at jobs@givereal.com

    Graphic Design:

    Give Real is a well-funded startup in the midst of an exciting period of growth and success. Our technology uses a patent pending platform that combines the ubiquity of credit card transactions and the power of social networks to create a new gifting experience.

    We're searching for full-time designers with experience in:

    * Design for advertisements
    * Design for consumer focused websites & applications
    * xHTML & CSS coding
    * HTML & design for emails
    * Working on top of an MVC or template system (we use Rails)

    In addition, we are also interested in candidates who have:

    * Team development with tools like Git & Trac
    * Comfort with Javascript programming
    * Rails programming experience

    We are willing to pay top-notch developers very competitively (plus the possibility of options) to join our team and help design the look and feel of a service that will be used by hundreds of thousands of users with a few months.

    Also, if you know someone who may be a good fit for us (RoR developer or graphic designer), we are offering a $1000 referral reward for anyone we hire.

    Please contact us at jobs@givereal.com

    Please help save Ivan, son of Andrii Nikitin (MySQL Support Engineer), who needs a bone marrow transplant. Andrii's message is below:

    "My family got bad news - doctors said allogenic bone marrow transplantation is the only chance for my son Ivan.

    "8 months of heavy and expensive immune suppression brought some positive results so we hoped that recovering is just question of time.

    "Ivan is very brave boy - not every human meets so much suffering during whole life, like Ivan already met in his 2,5 years. But long road is still in front of us to get full recover - we are ready to come it through.

    "Ukrainian clinics have no technical possibility to do such complex operation, so we need 150-250K EUR for Israel or European or US clinic. The final decision will be made considering amount we able to find. Perhaps my family is able to get ~60% of that by selling the flat where parents leave and some other goods, but we still require external help."

    -- Andrii Nikitin, MySQL Engineer


    Please remember, every little bit will help the family pay for Ivan's operation! Be as generous as you can.

    For donation: Donation can be made through PayPal (via MySQL/Sun website)

    Andrii and Ivan, our prayers are with you.

    Today at 1PM EST I am presenting the second part of memcached for MySQL webinar. I was told that the registration numbers look as good as the previous one. This one will be a bit more technical than the previous webinar. Sorry for the late notice but hope you can join!

    Chad Hurley at Startup2Startup Dinner

    by Frank at 16:03 PM, 06/26/2008

    Tonight, I am attending Startup2Startup Dinner on Dave McClure's invitation (Thanks, Dave!). Chad Hurley, CEO and co-founder of YouTube will be speaking at this invitation only event. I will post more updates on my personal blog or you can follow me on Twitter.

    Graphing Social Patterns - East

    by Frank at 14:47 PM, 06/06/2008

    Graphing Social Patterns - East 2008In just a few minutes, I will be leaving for Graphing Social Patterns East, a conference by Oreilly. Dave McClure of 500 Hats is the conference chair. I plan to meet old friends and make new ones. It should be a lot of fun. More about Graphing Social Patterns.

    Goosh: Google Shell for Geeks

    by Frank at 21:11 PM, 06/02/2008

    Ever wish you could have a browser based shell for Google? One that was clutter and advertising free? Say hello to Goosh, one of the coolest service to hit the web.



    It even recognizes 'clear' :) For now, I am addicted to it.

    Disaster is really inevitable. Even with all the redundant power investments, ThePlanet (formerly EV1 and RackShack), had to shut down their backup generators at their H1 data center on the instructions of the fire crew. This happened after a wire-short in fault transformer led to an explosion that knocked off one of their walls, ultimately bringing 9,000 servers down. Luckily no one was injured.

    This just goes on to show that just because a data center has redundant power and backup generators, it does not mean that a disaster cannot happen. IIRC, ThePlanet's last disaster was blamed on backup generators not kicking off properly.

    While there was no damage to servers, I wonder how many MyISAM repairs need to be triggered once the servers do come back online?

    - The Planet Status Update

    Michael Arrington of TechCrunch asks Twitter a few questions. I have only included a sample list below but you should read his blog post for all the questions:

    • Is it true that you only have a single master MySQL server running replication to two slaves, and the architecture doesn’t auto-switch to a hot backup when the master goes down?
    • Do you really have a grand total of three physical database machines that are POWERING ALL OF TWITTER?
    • Is it true that the only way you can keep Twitter alive is to have somebody sit there and watch it constantly, and then manually switch databases over and re-build when one of the slaves fail?

    A 'yes' answer to any of these questions by Twitter would be disturbing to say the least. However, it won't be surprising as companies expect databases to just somehow magically work without creating and supporting a proper architecture. High availability doesn't comes cheap and reputation for companies is everything.

    I find it amusing that Twitter isn't even looking for a DBA. May be that's considered a job for the SA over there :)

    Memcached Webinar - 560+ registrants

    by Frank at 20:09 PM, 05/29/2008

    A big thank you to all those who attended the memcached webinar today on which I was a panelist. I was told that there were more than 560 registrants.

    The feedback I received directly and indirectly shows that there is a lot of interest about memcached. In the future, I hope to work again with MySQL/Sun on more memcached related webinars.

    If you attended the webinar and have some suggestions, comments or questions, please contact me at fmashraqi at yahoo dot com or post a comment on this blog.

    Special thanks to Jimmy Guerrero, Monty Taylor, Rich Taylor, Edwin DeSouza and Alex Roedling for their hard work in arranging the webinar. Also thanks to Brian Aker, Matt Ingenthron and Trond Norbye for their assistance at various phases.

    In case you missed the webinar:

    Regarding my earlier post on memcached webinar, I was informed today that more than 420 registrants have signed up. Space is limited and filling up fast so if you are interested in memcached and haven't registered yet, click on the following link to register now!

    Designing and Implementing Scalable Applications with Memcached and MySQL (June 29)


    Quick link: register for Designing and Implementing Scalable Applications with Memcached and MySQL webinar (June 29)

    Ever since its introduction, memcached has been changing the way cost-efficient caching is perceived. Some passionately love it, others cynically hate it.

    Today, many large scale web 2.0 properties (including my employer) save millions of dollars by depending on memcached to bring their application response time under control and to offload pressure from databases.

    There are several success stories about using memcached to speed up database driven websites. Facebook, for instance, runs the largest memcached installation and the numbers only keep increasing. In May 2007, Facebook was reportedly running 200 dedicated servers with 3TB of memory in their memcached cluster. At the "Scaling MySQL Up or Out" Keynote, Facebook revealed they are now using 805 dedicated memcached servers. That's more than a 400% increase in less than a year!

    Twitter, digg, Wikipedia, SourceForge, and even Slashdot depend on memcached to keep their users happy.

    For my employer, memcached has been a crucial component of the infrastructure that has been instrumental in handling explosive growth in a cost-efficient manner. In addition, memcached has helped us offload billions of queries from our database.

    To highlight several real-life use cases of memcached (see below), I will be presenting a memcached webinar on Thursday, June 29 at 1 PM EST (10 AM PST). Monty Taylor (Senior Consultant, Sun Microsystems) and Jimmy Guerrero (Sr Product Marketing Manager, Sun Microsystems - Database Group) will also be speaking at the event. Space is limited and filling up fast (200+ registrants already) so I highly recommend registering now.

    In this webinar, I will be covering several use cases for memcached including (but not limited to):
    • deterministic cache
    • non-deterministic cache
    • proactive cache
    • "state" cache
    • filesystem cache replacement
    Hope to "see" you at the webinar.

    Note
    : This memcached webinar is not to be confused with the memcached webinar being presented by Ivan Zoratti on June 28.

    Interview by Sun TV at MySQL Conference

    by Frank at 19:10 PM, 05/21/2008

    At the MySQL Conference and Expo, right after my participation in scaling up or scaling out keynote panel, I talked to Sun's Multimedia team about Sun and MySQL in our environment.

    Recently, I found the interview on Sun's Multimedia page. The video of my discussion is embedded below:

    Over the weekend I took some notes from a presentation and did some research from various sources. The result was a blog post about Internet trends that I posted on my personal blog. There are some very interesting statistics about Internet usage and social networking. Also, Facebook fans will find some interesting facts as well.

    The official MySQL Drink. How to make it:

    Continue reading "Salmiakki - the official MySQL Drink"

    This is the english translation of another article in my german language blog.

    How are transactions organized physically

    When InnoDB creates a new transaction it is not yet committed. The database has not yet made any promises to the application and so we do not really have to make anything persistent so far.

    To be fast InnoDB tries to assemble the transaction in a memory buffer, the innodb_log_buffer. It should be sufficiently large that you actually can assemble such a transaction in memory without needing to write it out in part into the redo log. A size of 1M to 8M is normal.

    Once a transaction is to be committed InnoDB has to read the page from disk which contains the image of the row that is being changed. It then has to actually make that change in memory. The changed page is cached in a memory pool called the innodb_buffer_pool. This pool also caches unchanged pages that have been accessed by read operations. All of these pages on disk and in memory are 16K in size and the innodb_buffer_pool_size determines how much RAM we will use as a cache for such pages - usually as much as we can spare.

    Continue reading "Configuring InnoDB - An InnoDB tutorial"

    This is an english translation of the second part of an article in my german language blog.

    Transactions - An InnoDB tutorial

    InnoDB does transactions. Meaning: It collects statements working on InnoDB tables and applies them on COMMIT to all tables "at once". Either all of these statements inside one transaction succeed ("commit") or all of them fail ("rollback"), changing nothing.

    By default, the database is in AUTOCOMMIT mode. Meaning: The server sees a virtual COMMIT command after each statement. You can disable autocommit completely, or you are starting an explicit transaction inside autocommit using the BEGIN statement.

    Continue reading "Transactions - An InnoDB Tutorial"

    An InnoDB tutorial

    by Kristian Köhntopp at 00:43 AM, 02/07/2008

    MySQL offers a variety of storage engines giving you a lot of flexibility in managing your storage and data access needs. Still I encounter customers who are not using this flexibility when they should, because they lack information about the advantages of Non-MyISAM storage engines or which are using storage engines like InnoDB as if they were using MyISAM.

    This is the introductory article in a series of texts that will hopefully once become an InnoDB tutorial. A german version of this article is available in my german language blog.

    An InnoDB Tutorial

    The InnoDB storage engine is an engine that can be operated ACID compliant, does transactions and foreign key constraints. It is useful for all applications that do online transaction processing or have a high rate of concurrent write accesses for other reasons.


    Continue reading "An InnoDB tutorial"

    PHP PDO V2 CLA

    by Kristian Köhntopp at 02:50 AM, 01/26/2008

    PHP is a nice programming language for web applications with a large number of databases supported.

    PDO is one of many database access abstractions trying to unify the way PHP talks to databases. Unlike most of these, PDO is written in C and not in PHP.

    Wez Furlong wants to improve PDO "Version 2". In particular he and others want to produce an open spec that can be used by anybody including database vendors to code against, add unit testing for the spec and the drivers that claim to support it and improve metadata handling for PDO.

    In order to get database vendors, notably IBM, into the game he proposes a Contributor License Agreement to make it easier for vendors to commit into the PDO part of PHP. There exist corporate and individual versions of the CLA, and the PDO license is yet another license that is different from the PHP license. Wez has prepared a FAQ covering questions regarding all that.
    Continue reading "PHP PDO V2 CLA"

    LDAP is not relational

    by Kristian Köhntopp at 04:12 AM, 12/05/2007

    My thinking was that the topic is already dead, but people have strange ideas off and on again. Have a look at the S9Y boards where you'll find someone who wants to have a storage backend "LDAP" for S9Y. This is sick and wrong! Let me explain why.


    Continue reading "LDAP is not relational"

    According to my findings in Bug #31876, MySQL does not commit data to disk in Windows using the same method MS SQL Server and DB/2 are using. The method MySQL uses appears to be seven times slower in pathological scenarios.

    The bug report contains a patch - thanks to the MySQL WTF (The Windows Task Force) and the lab provided by the customer for helping me to find that.

    Does this work for you? I want to hear about your test results.

    Rubyisms

    by Kristian Köhntopp at 09:25 AM, 07/11/2007

    Lately, I have had opportunity to evaluate a very large Ruby installation that also was growing very quickly. A lot of the work performed on site has been specific to the site, but other observations are true for the platform no matter what is being done on it. This article is about Ruby On Rails and its interaction with MySQL in general.

    Continue reading "Rubyisms"

    Replication - now and then

    by Kristian Köhntopp at 22:42 PM, 07/10/2007

    One of the major contributing factors to the success of MySQL is the ease and simplicity of its replication. Read-slaves for scaleout and backup-slaves for noninterrupting backups are the norm in any MySQL installation I have seen in the last two years.

    So how does replication work? And how shall it be expanded in the future?

    What is available?

    The binlog written by MySQL currently logs all statements changing the tablespace. It is a serialization of all tablespace changes. The binlog position, expressed as (binlog name, offset), is a database global timestamp - a timestamp expressed in seconds.fraction does not work for any precision at all, because on a multi-core machine multiple things can happen concurrently.

    If you want to make a consistent full backup of the database, the database must not change during the backup. That is, it must be possible to associate one and exactly one binlog position with the backup.

    In fact, if you have such a backup - one associated with a binlog position - and you happen to have the binlogs from that time until now, it is possible to do a point-in-time (PIT) recovery. You'd recover from the full backup and you'd then replay the binlog from the backups binlog position until now. That is why it is important to store the binlog in a filesystem that fails independently from the rest of your MySQL. That's also why you must not filter the binlog that is written by MySQL using binlog-do-db and binlog-ignore-db - if you do, you'll get an incomplete binlog that will fail to be useful in a PIT recovery scenario.

    A slave in MySQL is now nothing but a binlog downloader and executor: The slave must be restored from a PIT-capable full backup. It is then being told the current binlog position and where to log in to get the missing binlog. The slaves IO_THREAD will then log into the master server and download the binlog to the local disk as fast as possible, storing it as the relay log. The slaves SQL_THREAD will then start to execute the relay log as fast as possible. Replication can thus be thought of as an ongoing live recovery.


    Continue reading "Replication - now and then"

    In MyISAM, we do have LOAD INDEX INTO CACHE. In InnoDB this does not work. For benchmarking I often require a way to preload the innodb_buffer_pool with the primary key and data after a server restart to shorten warmup phases.

    According to Blackhole Specialist Kai, the following should work:

    CODE:
    mysql> create table t like innodbtable;
    mysql> alter table t engine = blackhole;
    mysql> insert into t select * from innodbtable;
    Another win for the unbreakable BLACKHOLE storage engine.

    Coming back to activity

    by Mike Hillyer at 14:29 PM, 04/10/2007

    Hi All;

    Sorry for the low maintenance level on this site, personal matters have limited my available time. I plan to now come back to more active maintenance of this site, beginning with an update to the forums.

    The forums now allow for registration without admin approval but have a new requirement that those registering must answer a simple question to submit their registration. This should hopefully allow new members to join without waiting for their approval, while also preventing spam registration.

    When I return from the MySQL Conference I’ll be restoring the long-missing samplecode section to the site and restoring the three-parts of the Yatta tutorial.

    One of the recent issues wiped out my copies of Yatta that were sent to me by you the readers. If you previously sent me Yatta, please re-send to mike@thisdomain.

    Do You Have Yatta?

    by Mike Hillyer at 06:58 AM, 02/20/2007

    During the site failure the articles and source for the Yatta series was lost, so I thought I would put a call out: do you have a local copy of the articles and source for the Yatta series? If so, please comment here or send me an email!

    vbmysqldirect online

    by Mike Hillyer at 22:25 PM, 11/18/2006

    I didn’t have a chance to restore the full structure to vbmysqldirect’s project pages yet, but I have posted an archive of the project pages to our new Projects page.

    See http://www.vbmysql.com/projects/vbmysqldirect/

    Mostly Restored

    by Mike Hillyer at 14:18 PM, 11/11/2006

    Good news everyone! At this point most of the articles are restored. I’ll be finishing the restoration of the last two or three articles soon and moving on to a restoration of the sample code and the project pages (vbmysqldirect in particular).

    Thank you for waiting so far, and if you find any errors in the restored articles please let me know.

    Help Wanted

    by Mike Hillyer at 09:29 AM, 10/26/2006

    Hi Folks;

    Those of you who have been around for a while will likely know that part 4 of the tutorial had some code that became outdated when VB2005 came out of Beta. I’ve updated the code samples in the article and its zip file but lack the time right now to do a full validation due to work.

    As such, I need your help. If someone out there could be so kind as to start at Part One and work through the tutorial using the current release of VB2005 to check for issues it will be much appreciated.

    I’ll try to get parts 5-7 up soon and move on to the remaining article content. Then I’ll work on getting samplecode online.

    Also, new content is always appreciated. If you have an article or sample you want to submit feel free to let me know and I’ll create a contributor account for you in the CMS.

    I’m also looking for someone to manage the Spanish language articles and moderate the new Spanish language topic in the forums.

    Thanks!

    New Forums

    by Mike Hillyer at 09:29 AM, 10/26/2006

    I’ve installed forums to http://www.vbmysql.com/forums. Because of the new hosting arrangements and the difficulty in restoring the previous forum software, I’ve changed backends. This means you’ll need to re-register and recreate any posts you were still looking for answers to.

    There’s an RSS feed for the forums at http://www.vbmysql.com/forums/rss.php.

    Please Provide Feedback

    by Mike Hillyer at 09:29 AM, 10/26/2006

    Hi All;

    I’ve restored two articles so far and would appreciate any feedback as to the appearance/style of the articles posted so far. Since I’m rebuilding site design as I restore articles, now is a good time to voice your opinion.

    Site Restore in Progress

    by Mike Hillyer at 09:29 AM, 10/26/2006

    Due to uncontrollable circumstances, this site is down. Restoration is in progress, please be patient.

    Point and click what you want to do.
    2008 scandalz.net
    Bradley's Bromide: If computers get too powerful, we can organize them into a committee -- that will do them in.
    CountryUS
    IP Address38.107.191.99
    User AgentCCBot/1.0 (+http://www.commoncrawl.org/bot.html)