Planet Drizzle

We’ve released an updated version of the MySQL Configuration Wizard we announced at the end of last year. If you don’t remember that announcement, here’s the short version: this is a tool to help you generate my.cnf files based on your server’s hardware and other characteristics.

We’ve gotten really good feedback on this tool, including this nice mention on Stack Exchange:

Percona just built a tool to do just that called the Configuration Wizard. I tested it out once just to see what it would return and the results were pretty darn close to what we were using on our servers, whose cnf’s were put together by highly trained mysql certified dba’s.

So what’s changed in the new version of the Configuration Wizard? Quite a few things. We’ve rolled out the first iteration of the account and profile features. Now you get a homepage with your configuration files, so you can manage them and return to them anytime you like.

From this page you can do things like sharing configuration files and emailing them to yourself. The new release also adds features like downloading the configuration files so you don’t have to copy-paste them.

If you share a configuration file, then the URL can be loaded by anyone, even if they’re not logged in. It’s kind of like sending someone a link to a pastebin or something like that.

Another new feature is something I’ve wanted for a long time: the ability to generate a more strict, safer configuration file. There’s a new page in the Wizard that lets you set a lot of sanity/safety options to prevent common problems MySQL users run into because of too-permissive MySQL behaviors. These are the kinds of things that Drizzle fixes, and that should be fixed by default in MySQL but never will be, because they might break applications that rely on the default behaviors. If you’re building an application from the ground up, now you can prevent bad things from getting a nose under the tent.

In addition to these things, we have added a number of other features you might not notice, which I won’t spend much time on — they’re things like an integrated feedback form at the left of the page and so on.

What’s next? Well, next I think we’re going to turn our attention to adding new tools, rather than improving this one. I have a list of tools that people have requested or suggested: a SQL formatter, a visual EXPLAIN tool, a configuration advisor, a query analysis tool, a way to register a server’s essential characteristics and then get advice when there’s a new release that might be beneficial for you, and so on. I have selected the next priorities, but I don’t want to spoil the surprise or promise something if it turns out to be harder than I think it will be. What ideas do you have? Let me know by leaving your feedback in the comments.

We hope this suite of free browser-based tools helps you become a more productive MySQL user and administrator!

I am pleased to announce that the schedule for Percona Live: MySQL Conference And Expo 2012 is now published. This is a truly great selection of talks, with something for MySQL developers, DBAs, and managers, for people just starting to use MySQL as well as those looking for advanced topics. We have talks about running MySQL at extremely large scale on the Web as well as running MySQL in enterprise environments. Some speakers have spent over a decade pushing MySQL to its limits; others have in-depth experience working on the MySQL code.

We have many talks covering Oracle MySQL, and forks such as MariaDB, Drizzle and Percona Server are well covered too. You will also have a chance to learn about commercial MySQL alternatives such as Clustrix and SchoonerSQL from our sponsors.

At the same time, this is the conference for the MySQL community. We’re talking about other database systems only when it comes to migration to MySQL, and about NoSQL technologies such as Memcached, Redis, and Sphinx, which are commonly used to supplement MySQL.

Space was very tight this year and competition was very tough. We had over 300 proposals for approximately 60 slots. As a result, the committee had to make a lot of very tough choices, and many great talks could not be accommodated.

We have a great Conference Committee this year, which has done a great job putting the schedule together. I can’t thank them enough!

See you in April in Santa Clara for the MySQL Conference, and let’s make this event an amazing success!

I spent last week at linux.conf.au in Ballarat, Victoria (that’s the Victoria in Australia, not wherever else there may be one) which is only a pleasant two hour drive from my home town of Melbourne (Australia, not Florida). I sent an email internally to our experts detailing bits of the conference that may interest them – and I thought that it may also interest our wider readers who are interested in all levels of the software stack.

For those who don’t know: linux.conf.au is one of the most awesome technical conferences (if not the most awesome) in the free software space. It consistently attracts a wide variety of very knowledgeable speakers and a large number of attendees.

Every year it is put together by a (different) set of volunteers, and this means it also tours around the country (and sometimes even New Zealand). This year it was in Ballarat – a regional city a couple of hours drive out of Melbourne. One of the great things about LCA is that you are not always at the same hotel, in the same city stuck with the same two restaurants.

This year had a bit of an increased focus on privacy, security and basic freedoms and human rights. This is no doubt a reaction to the increased attacks on freedom of speech and the internet that have been going on in recent months.

That being said, there were a huge number of great talks on a variety of topics: everything from filesystem performance to open hardware, from repurposing existing hardware to upcoming challenges for the kernel, to how to be a better sysadmin. In fact… for those who weren’t there and spend any of their life helping people admin machines – go and watch those talks.

linux.conf.au (for me) is one of the cannot-miss events in the year. It’s an opportunity to learn things that directly apply to my work, may apply in the future and most certainly will never apply but are rather cool anyway.

All the videos from the conference are already up! This is an amazing effort from the (volunteer) AV team. I’ve included links to a selection of talks below that I especially think are worth watching:

Watch no matter what:

Talks that could be quite interesting for you, depending on your interests:

Talks I shall be watching the videos of as I was in another talk at the time:

My Talk:

All the videos are going up at:

In my previous post I explained why I believe the production of RPM and DEB packages should be more integrated with the rest of your development process. Now it's time to look into how you can put the RPM build scripts inside your main source code repository, and in particular how I did that to produce RPM packages for Drizzle.


Last weekend I released rpm files for the latest Drizzle Fremont beta (announcement). As part of that work I've also integrated the spec file and other files used by the rpmbuild into the main Drizzle bzr repository (but not yet merged into trunk). In this post I want to explain why I think this is a good thing, and in a follow up post I'll go into what I needed to do to make it work.

(And speaking of stuff you can download, phpMyAdmin 3.5.0-alpha1 now supports Drizzle!)


Hot on the heels of this week's Drizzle 2012.01.30 source release, we are now also releasing beta quality binary RPMs for your downloading and testing pleasure! While we are revitalizing the Drizzle yum repository, you can download the rpm-bundle from our Launchpad project page: https://launchpad.net/pkg-drizzle/+download

As the Fremont development cycle is ending its beta cycle and heading for release candidates, we are also starting to publish binary packages for download. Today we are releasing RPM packages for RHEL/CentOS 6.0. Fedora 16 packaging has unfortunately been postponed. We will eventually support Fedora 16 RPMs as well, but in the meantime you can possibly spot some Drizzle developers on twitter or irc making snarky comments about systemd ...

In the following week(s) we will also publish Debian and Ubuntu packages.

Note that these packages are a first release of their kind and still a bit experimental. What this means is that the underlying Drizzle release itself is stabilizing quite nicely, but the packaging process is currently not completely integrated nor automated.

Please report errors at the main Drizzle bug tracker, and direct your general feedback to the drizzle-discuss mailing list.

Details:

  • These packages were built from the Drizzle bzr trunk at tag 2012.01.30.
  • In addition a branch that supports "make rpm" was merged from lp:~hingo/drizzle/drizzle-integrate-packaging-rpm. This is based on the same RPM spec as was used for Drizzle 7, but the intent is to integrate it with the main Drizzle source code and then Jenkins builds. The current release was however still a manual process.
  • When building this way, an additional bzr tag --force 2012.01.30 was needed to maintain the same version number.
  • Along with the RPMs we have uploaded a source tar file fake-drizzle7-2012.01.30.tar.gz. This corresponds to the real Drizzle 2012.01.30 release, plus the above changes.
  • The RPM files are made available as a tar bundle. The process to publish into the Drizzle yum repository is currently broken but will resume in the near future.
  • A related issue is that the packages are not signed.
  • Known issue: If you install the plugin rpms, they will be automatically loaded at startup. However, some of them (in particular, many auth plugins, the policy plugins and possibly the slave plugin) don't actually work without further configuration and will thus prevent the server from starting. If you encounter this problem, you should either complete the configuration (in /etc/drizzle/conf.d/[plugin].conf) or, if you don't want to use the plugin, uninstall it.

Drizzle source tarball 2012.01.30 has been released. This is the third beta in our Fremont series.

In this release:
* continuing refactoring, restructuring, and code quality improvements
* many more documentation improvements
* additional bugs fixed
* improvements in test suite

The Drizzle download file can be found here.


I'm starting the first leg of a literal 'round-the-world business trip.

I'm flying from my home in Seattle, via LAX, to Melbourne, Australia. There I will meet up with my friend Stewart Smith, the Director of Engineering at Percona. He is a fellow survivor of MySQL & Sun, and a fellow contributor to Drizzle.

Other good friends of mine who are converging on Australia for LCA 2012 are Sarah Novotny, Monty Taylor, and Jacob Appelbaum.

Why am I going to Australia? Geeks into open source who are "in the know" know the Linux.conf.au conference to be one of the best open source conferences in the world. This year it will be held in Ballarat, which is not far from Melbourne. This will be my 4th LCA, having attended past ones in Brisbane, Wellington, and Tasmania.

At that conference, I will be speaking in the SysAdmin Miniconf, to demo OpenShift, Red Hat's cloud PaaS.  (Sign up with the promo code LCA2012.)


This is only the first leg of this trip. After LCA, I will be heading to Bangalore, India to present at JUDCon:India.

And after that, I will keep heading west to Europe, to present at FOSDEM in Brussels.

And then who knows where I will go next?


Wow, what an insane year. Looking back on it, a lot of stuff has happened this year for me. Let's look back:
The new job has kept me unbelievably busy, so I haven't had the time to contribute more to Drizzle, other than a bug fix here or there. That makes me a sad panda, but priorities have changed. Perhaps next year will offer more opportunity.

I don't know what 2012 has in store for me, but hopefully it is just as exciting. I hope everyone has a Happy New Year.




One of the best things that can happen to a piece of software is for people to actually use it.

I’ve been fortunate enough to have received feedback on the tool from several members of both the Percona and Drizzle teams.  The most common and strongly emphasized comments were in regards to what a terrible, terrible name dbqp really is in terms of saying, seeing, and typing it ; )

As that isn’t something that can be disputed (it’s really annoying to use in conversations *and* to type several dozen times a day), the project has been renamed to kewpie.  For those that follow such things, I did present on another tool with that name at the last MySQL Conference, but *that* tool is a nice-to-have, while the test-runner sees daily use.  Better to save the good names for software that actually stands a chance of being used, I say : )

While there are probably 1*10^6 other things I need to do (Stewart is a merciless slave driver as a boss, btw…heheh), the fact that we are merging the tool into the various Percona branches meant it should be done sooner rather than later.  The tool is currently in our 5.1 branch and I have merge requests up for both Drizzle and Xtrabackup (dbqp was living there too).

I have several other interesting things going on with the tests and tool, which I’ll be blogging about over at MySQL Performance Blog.  Later this week, I’ll be talking about what we’ve been doing to work on this bug ; )

 

Also, the Percona Live MySQL Conference in DC is just around the corner.  There are going to be some great speakers and attendees

You could use this in a Vagrant setup if you like (I’ve done so for testing).

Step 1) Set the following in your Vagrantfile:

Vagrant::Config.run do |config|
  config.vm.box = "lucid32"
  config.vm.box_url = "http://files.vagrantup.com/lucid32.box"
  config.vm.provision :puppet
end

Step 2) Get puppet-apt helper.

I used https://github.com/evolvingweb/puppet-apt and put it in a manifests/ directory like so:

$ mkdir manifests
$ cd manifests
$ git clone git://github.com/evolvingweb/puppet-apt.git

Step 3) Write your puppet manifest:

import "puppet-apt/manifests/init.pp"
import "puppet-apt/manifests/ppa.pp"
class drizzlebuild {
        apt::ppa { "ppa:drizzle-developers/ppa": }
        package { "drizzle-dev":
                  ensure => latest,
        }
}
include drizzlebuild

Step 4) “vagrant up” and you’re done! Feel free to build Drizzle inside this VM.

I’m sure there may be some more proper way to do it all, but that was a pretty neat first intro to me to Puppet and friends :)

Drizzle is known for its plugins, but the DATA_DICTIONARY.PLUGINS table has a column for MODULE_NAME and there’s a MODULES table. What is the relation between Drizzle modules and plugins? A Drizzle module provides one or more plugins of potentially various types. Or, analogously: a Drizzle module is a bookshelf and each plugin is a book. [...]

or… “the case of Stewart recognizing parameters to the read() system call in strace output”.

Last week, a colleague asked a question:

I have an instance of MySQL with 100 tables and the table_definition_cache set to 1000. My understanding of this is that MySQL won’t revert to opening the FRM files to read the table definition, but we can see from strace:

[pid 19876] open("./db/t1.frm", O_RDONLY) = 32 <0.000013>
[pid 19876] read(32, ""..., 10)         = 10 <0.000011>
[pid 19876] close(32)                   = 0 <0.000012>
[pid 19876] open("./db/t2.frm", O_RDONLY) = 32 <0.000014>
[pid 19876] read(32, ""..., 10)         = 10 <0.000012>
[pid 19876] close(32)                   = 0 <0.000012>
[pid 19876] open("./db/t3.frm", O_RDONLY) = 32 <0.000014>
[pid 19876] read(32, ""..., 10)         = 10 <0.000011>
[pid 19876] close(32)                   = 0 <0.000011>
[pid 19876] open("./db/t4.frm", O_RDONLY) = 32 <0.000013>

So, why is this? It turns out that this triggered a memory for me from several years ago. I’ve since discovered the blog post in which I mention it: drop table fail (on the road to removing the FRM). That blog post is from 2008, almost three years ago to the day.

Since we completely reworked how metadata works in Drizzle, it has enabled us to do some truly wonderful things, including more in depth testing of the server. Amazingly enough, spin-offs from this work included being able to find out and then test that the ENUM limit of 65,535 has never been true (but now is in Drizzle), produce a CREATE TABLE statement that took over four minutes to execute and get a more complete view of how the Storage Engine API is called.

But back to what the above strace shows. In MySQL 5.5 you can find in sql/datadict.cc a function named dd_frm_type(). In MySQL 5.1, for some reason yet unknown to humans, it lives in sql/sql_view.cc as mysql_frm_type(). What this code snippet does is:

  • open the FRM
  • read 10 bytes (“header”)
  • check if it’s a view by doing a string compare for “TYPE=VIEW\n” being the first bytes of the FRM file. This is due to VIEWs being stored as the plain text of the SQL query inside the FRM file instead of the normal binary format FRM.
  • some legacy check for a generic table type (I think, I haven’t gone back into the deep history of the FRM file format to confirm)
  • return the fourth byte for the DB_TYPE. i.e. what storage engine it is.
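
To make those steps concrete, here is a minimal Python sketch of the same logic. It is not the MySQL code, just an illustration: the function name frm_type, the helper write_temp and the sample file contents are made up, and the legacy generic-table check is skipped.

```python
import os
import tempfile

VIEW_MARKER = b"TYPE=VIEW\n"  # exactly 10 bytes, the size of the header read


def frm_type(path):
    """Return ("view", None) or ("table", db_type_byte) for an FRM-like file."""
    with open(path, "rb") as frm:
        header = frm.read(10)
    if header == VIEW_MARKER:
        return ("view", None)
    # The legacy generic-table check mentioned in the list is skipped here.
    # header[3] is the fourth byte, which identifies the storage engine.
    return ("table", header[3])


def write_temp(data):
    """Demo helper (hypothetical): write bytes to a temp file, return its path."""
    fd, path = tempfile.mkstemp(suffix=".frm")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


# A view is stored as plain text starting with the marker.
view_path = write_temp(b"TYPE=VIEW\nquery=select 1\n")
# Illustrative binary header: only the fourth byte (12 here) matters to the sketch.
table_path = write_temp(bytes([254, 1, 10, 12, 0, 0, 0, 0, 0, 0]))

view_result = frm_type(view_path)
table_result = frm_type(table_path)
print(view_result, table_result)

os.remove(view_path)
os.remove(table_path)
```

The point to notice is that every call opens the file, reads ten bytes and closes it again, which is the open/read/close triple visible in the strace output.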

We can ignore the upper limit on number of storage engines for MySQL and understanding the relationship between the range of numbers for dynamic assignment and what this means for on-disk compatibility of data directories is left as an exercise for the reader.

This code is called from several code paths in the server:

  • DROP TABLE
  • RENAME TABLE
  • DROP VIEW
  • open table
  • filling INFORMATION_SCHEMA tables (I think it is actually the TABLES table, but didn’t look closely)

An example of how this is used is that in the DROP TABLE code path, MySQL uses this magic byte to work out which Storage Engine to ask to drop the table. The main consequence of this bit of code is that MySQL may cause unnecessary disk IO for information it already has cached (often at least twice – in InnoDB itself and in the table_definition_cache).

Further reading:

Way back in 2009, Monty Taylor got fed up with maintaining a set of common autotools foo across several projects (one of which was Drizzle) and started the pandora-build project.  Basically, it’s a collection of the foo you need for autotools to do things like: use it properly, detect a bunch of common libraries, enable crap-tons of compiler warnings (and -Werror) and write an application/library with plugins (that are auto-discovered and built).

(and don’t worry, there are also modes to disable -Werror and different compiler warnings if you’re working on an old code base that really doesn’t build cleanly)

There are also templates for Quickly to get you up and started really quickly.

Basically, for the past 3 years, whenever I’ve gone to write some small project (or got sufficiently annoyed with the broken build system on an old one), I’ve turned to pandora-build to solve my problems.

Recently, I’ve had the need to use the plugin infrastructure of pandora-build in a new project (I’ve used it extensively in Drizzle of course). The one bit that pandora does not take care of for you is the dlopen() code to load plugins at run time… although I do wonder about turning some of that code into a bit of a library just because a bunch of it is pretty common….

Of course, a task for me is to write up a blog post on how I did it all, but for the moment I thought I’d just share :)

I’ve just arrived at ANU in Canberra for the Open Source Developers Conference 2011 (OSDC). I’ve spoken at several of the past OSDCs that have been held around Australia: 2005, 2007, 2008, 2010 and now 2011. It’s one of those conferences with great energy and great people that’s organised by dedicated members in the community who build the conference they want to go to.

I’ll be giving two talks this year:

I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, I would also like to think that the way we have packaged our randgen tests into unittest format for dbqp played some small part, but I might be mildly biased.

The best description of the randgen’s power comes courtesy of Andrew Hutchings – “fishing with dynamite“. This is a very apt metaphor for how the tool works – it can be quite effective for stressing a server and finding bugs, but it can also be quite messy, possibly even fatal if one is careless. ; ) However, I am not writing this to share any horror stories, but glorious tales of bug hunting!

The randgen uses yacc-style grammar files that define a realm of possible queries (provided you did it right…the zen of grammar writing is a topic for another day). Doing this allows us to produce high volumes of queries that are hopefully interesting (see previous comment about grammar-writing-zen).
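
For readers who have not seen a grammar-driven generator before, here is a toy Python sketch of the general idea. This is not the randgen and not its grammar file format; the rules, table names and column names are all invented for illustration.

```python
import random

# Toy grammar: each rule maps to a list of alternatives, and each alternative
# is a list of tokens. Tokens that name a rule are expanded recursively;
# anything else is emitted as-is.
GRAMMAR = {
    "query": [["select"], ["update"]],
    "select": [["SELECT", "column", "FROM", "table", "where"]],
    "update": [["UPDATE", "table", "SET", "column", "=", "value", "where"]],
    "where": [[], ["WHERE", "column", "comparison", "value"]],
    "comparison": [["="], ["<"], [">"], ["!="]],
    "column": [["col_int"], ["col_char"]],
    "table": [["t1"], ["t2"], ["t3"]],
    "value": [["1"], ["100"], ["'abc'"]],
}


def expand(symbol, rng):
    """Expand one symbol into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]
    alternative = rng.choice(GRAMMAR[symbol])
    tokens = []
    for token in alternative:
        tokens.extend(expand(token, rng))
    return tokens


def generate_queries(count, seed=0):
    """Generate `count` random queries; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [" ".join(expand("query", rng)) for _ in range(count)]


queries = generate_queries(5)
for query in queries:
    print(query)
```

Even this tiny grammar describes a few hundred distinct statements, which is why a well-written grammar can cover far more ground than hand-written queries produced in similar time.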

It takes a certain amount of care to produce a grammar that is useful and interesting, but the gamble is that this effort will produce more interesting effects on the database than the hand-written queries that could be produced in similar time. This is especially useful when you aren’t quite sure where a problem is and are just trying to see what shakes out under a certain type of stress. Another win is that a well-crafted grammar can be used for a variety of scenarios. The transactional grammars that were originally written for testing Drizzle’s replication system have been reused many times (including for two of these bugs!)

This brings us to our first bug:
mysql process crashes after setting innodb_dict_size

The basics of this were that the server was crashing under load when innodb_dict_size_limit was set to a smaller value. In order to simulate the situation, Stewart suggested we use a transactional load against a large number of tables. We were able to make this happen in 4 easy steps:
1) Create a test case module that we can execute. All of the randgen test cases are structured similarly, so all we had to do was copy an existing test case and tweak our server options and randgen command line as needed.

2) Make an altered copy of the general, percona.zz gendata file. This file is used by the randgen to determine the number, composition, and population of any test tables we want to use and generate them for us. As the original reporter indicated they had a fair number of tables:

$tables = {
rows => [1..50],
partitions => [ undef ]
};

The value in the ‘rows’ section tells the data generator to produce 50 tables, with sizes from 1 row to 50 rows.

3) Specify the server options. We wanted the server to hit similar limits as the original bug reporter, but we were working on a smaller scale.
To make this happen, we set the following options in the test case:

server_requirements = [["--innodb-dict-size-limit=200k --table-open-cache=10"]]

Granted, these are insanely small values, but this is a test and we’re trying to do horrible things to the server ; )

4) Set up our test_* method in our testcase class. This is all we need to specify in our test case:

def test_bug758788(self):
    test_cmd = ("./gentest.pl "
                "--gendata=conf/percona/innodb_dict_size_limit.zz "
                "--grammar=conf/percona/translog_concurrent1.yy "
                "--queries=1000 "
                "--threads=1")
    retcode, output = execute_randgen(test_cmd, test_executor, servers)
    self.assertTrue(retcode==0, output)

The test is simply to ensure that the server remains up and running under a basic transactional load.

From there, we only need to use the following command to execute the test:
./dbqp.py --default-server-type=mysql --basedir=/path/to/Percona-Server --suite=randgen_basic innodbDictSizeLimit_test
This enabled us to reproduce the crash within 5 seconds.

The reason I think this is interesting is that we were unable to duplicate this bug otherwise. The combination of the randgen’s power and dbqp’s organization helped us knock this out with about 15 minutes of tinkering.

Once we had a bead on this bug, we went on to try a couple of other bugs:

Crash when query_cache_strip_comments enabled

For this one, we only modified the grammar file to include this as a possible WHERE clause for SELECT queries:

WHERE X . char_field_name != 'If you need to translate Views labels into other languages, consider installing the <a href=\" !path\">Internationalization</a> package\'s Views translation module.'

The test value was taken from the original bug report.
Similar creation of a test case file + modifications resulted in another easily reproduced crash.
I will admit that there may be other ways to go about hitting that particular bug, but we *were* practicing with new tools and playing with dynamite can be quite exhilarating ; )

parallel option breaks backups and restores

For this bug, we needed to ensure that the server used --innodb_file_per_table and that we used Xtrabackup’s --parallel option. I also wanted to create multiple schemas, and we did so via a little randgen / python magic:

# populate our server with a test bed
test_cmd = "./gentest.pl --gendata=conf/percona/bug826632.zz "
retcode, output = execute_randgen(test_cmd, test_executor, servers)
# create additional schemas for backup
schema_basename='test'
for i in range(6):
    schema = schema_basename+str(i)
    query = "CREATE SCHEMA %s" %(schema)
    retcode, result_set = execute_query(query, master_server)
    self.assertEqual(retcode, 0, msg=result_set)
    retcode, output = execute_randgen(test_cmd, test_executor, servers, schema)

This gave us 7 schemas, all with 100 tables per schema (with rows 1-100). From here we take a backup with --parallel=50 and then try to restore it. These are basically the same steps we use in our basic_test from the xtrabackup suite. We just copied and modified the test case to suit our needs for this bug. With this setup, we get a crash / failure during the prepare phase of the backup. Interestingly, this only happens with this number of tables, schemas, and --parallel threads.

Not too shabby for about 30 minutes of hacking + explaining things, if I do say so myself. One of the biggest difficulties in fixing bugs comes from being able to recreate them reliably and easily. Between the randgen’s brutal ability to produce test data and queries and dbqp’s efficient test organization, we are now able to quickly produce complicated test scenarios and reproduce more bugs so our amazing dev team can fix them into oblivion : )

So I’m back from the Percona dev team’s recent meeting.  While there, we spent a fair bit of time discussing Xtrabackup development.  One of our challenges is that as we add richer features to the tool, we need equivalent testing capabilities.  However, it seems a constant in the MySQL world that available QA tools often leave something to be desired.  The randgen is a literal wonder-tool for database testing, but it is also occasionally frustrating / doesn’t scratch every testing itch.  It is based on technology SQL Server was using in 1998 (MySQL began using it in ~2007, IIRC).  So this is no knock, it is merely meant to be an example of a poor QA engineer’s frustrations ; )  While the current Xtrabackup test suite is commendable, it also has its limitations. Enter the flexible, adaptable, and expressive answer: dbqp.

One of my demos at the dev meeting was showing how we can set up tests for Xtrabackup using the unittest paradigm.  While this sounds fancy, basically, we take advantage of Python’s unittest and write classes that use their code.  The biggest bit dbqp does is search the specified server code (to make sure we have everything we should), allocate and manage servers as requested by the test cases, and do some reporting and management of the test cases.  As the tool matures, I will be striving to let more of the work be done by unittest code rather than things I have written : )

To return to my main point, we now have two basic tests of xtrabackup:

Basic test of backup + restore:

  1. Populate server
  2. Take a validation snapshot (mysqldump)
  3. Take the backup (via innobackupex)
  4. Clean datadir
  5. Restore from backup
  6. Take restored state snapshot and compare to original state
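
As a rough illustration of step 6, here is a small Python sketch that compares two text snapshots and produces a readable diff when they differ. The function name snapshot_diff and the sample dump text are made up for this example; the real tests take their snapshots with mysqldump.

```python
import difflib


def snapshot_diff(original, restored):
    """Return a unified diff of two text snapshots, or "" if they match."""
    diff = difflib.unified_diff(
        original.splitlines(),
        restored.splitlines(),
        fromfile="original",
        tofile="restored",
        lineterm="",
    )
    return "\n".join(diff)


# Stand-ins for the validation snapshot and the restored-state snapshot.
original = "CREATE TABLE t1 (a INT);\nINSERT INTO t1 VALUES (1);\n"
restored_ok = original
restored_bad = original.replace("(1)", "(2)")

print("identical:", snapshot_diff(original, restored_ok) == "")
print(snapshot_diff(original, restored_bad))
```

An empty diff means the restore reproduced the original state; a non-empty one can be handed straight to an assertion as the failure message.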

Slave setup

  1. Similar to our basic test except we create a slave from the backup, replicating from the backed up server.
  2. After the initial setup, we ensure replication is set up ok, then we do additional work on the master and compare master and slave states

One of the great things about this is that we have the magic of assertions.  We can insert them at any point of the test we feel like validating and the test will fail with useful output at that stage.  The backup didn’t take correctly?  No point going through any other steps — FAIL! : )  The assertion methods just make it easy to express what behavior we are looking for.  We want the innobackupex prepare call to run without error?
Boom goes the dynamite!:

# prepare our backup
cmd = ("%s --apply-log --no-timestamp --use-memory=500M "
       "--ibbackup=%s %s" % (innobackupex, xtrabackup, backup_path))
retcode, output = execute_cmd(cmd, output_path, exec_path, True)
self.assertEqual(retcode, 0, msg=output)
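
If you want to see the same pattern outside of dbqp, here is a self-contained sketch using only Python's standard library. The helper run_command is a hypothetical stand-in for dbqp's own command helper, and the commands it runs are trivial Python one-liners rather than innobackupex.

```python
import subprocess
import sys
import unittest


def run_command(args):
    """Hypothetical stand-in for a command helper: return (retcode, output)."""
    proc = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


class CommandAssertionExample(unittest.TestCase):
    def test_command_succeeds(self):
        retcode, output = run_command(
            [sys.executable, "-c", "print('prepared ok')"]
        )
        # Passing the output as msg means a failure shows what the command said.
        self.assertEqual(retcode, 0, msg=output)

    def test_failure_message_carries_output(self):
        retcode, output = run_command(
            [sys.executable, "-c",
             "import sys; print('prepare failed'); sys.exit(3)"]
        )
        with self.assertRaises(AssertionError) as caught:
            self.assertEqual(retcode, 0, msg=output)
        self.assertIn("prepare failed", str(caught.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CommandAssertionExample)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test shows why the msg argument is worth the extra typing: the command's own output ends up inside the assertion error.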

From these basic tests, it will be easy to craft more complex test cases.  Creating the slave test was simply a matter of adapting the initial basic test case slightly.  Our plans include: *heavy* crash testing of both xtrabackup and the server, enhancing / expanding replication tests by creating heavy randgen loads against the master during backup and slave setup, and other assorted crimes against database software.  We will also be porting the existing test suite to use dbqp entirely…who knows, we may even start working on Windows one day ; )

These tests are by no means the be-all-end-all, but I think they do represent an interesting step forward.  We can now write actual, honest-to-goodness Python code to test the server.  On top of that, we can make use of the included unittest module to give us all sorts of assertive goodness to express what we are looking for.  We will need to and plan to refine things as time moves forward, but at the moment, we are able to do some cool testing tricks that weren’t easily do-able before.

If you’d like to try these tests out, you will need the following:
* dbqp (bzr branch lp:dbqp)
* DBD::mysql installed (the tests use the randgen and this is required…hey, it is a WONDER-tool!) : )
* Innobackupex, a MySQL / Percona server and the appropriate xtrabackup binary.

The tests live in dbqp/percona_tests/xtrabackup_basic and are named basic_test.py and slave_test.py, respectively.

To run them:
$ ./dbqp.py --suite=xtrabackup_basic --basedir=/path/to/mysql --xtrabackup-path=/mah/path --innobackupex-path=/mah/other/path --default-server-type=mysql --no-shm

Some next steps for dbqp include:
1)  Improved docs
2)  Merging into the Percona Server trees
3)  Setting up test jobs in Jenkins (crashme / sqlbench / randgen)
4)  Other assorted awesomeness

Naturally, this testing goodness will also find its way into Drizzle (which currently has a 7.1 beta out).  We definitely need to see some Xtrabackup test cases for Drizzle’s version of the tool (mwa ha ha!) >: )

Replication in Drizzle is very simple and multi-source replication is supported. For a walk through of multi-master (multi-source) replication see David Shrewsbury’s excellent post here. Because it was very succinctly written, here I am quoting a lot of his provisioning a new slave post on replication here. But I have added in some detail on the slave.cfg file for clarity for newbies like me, as well as some more detail on the options and their purpose.

A lot of this can also be found in the documentation but here I’m going to walk through the steps. Also see the slave docs here for any questions you may have.

For our purposes, we will walk through setting up basic replication between a master and a slave server.

You will need to set up your slave.cfg file before you do anything else. It is expected under the “/usr/local” directory, but it can live anywhere you like. Mine is at /tmp/slave.cfg.

This is a typical setup.

master-host = "your ip address"
master-port = 4427
master-user = kent
master-pass = samplepassword
io-thread-sleep = 10
applier-thread-sleep = 10
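
If you provision slaves often, you may prefer to generate this file rather than type it. Here is a minimal Python sketch that writes the same keys shown above. The function name `render_slave_cfg` and the address 192.0.2.10 are made up for illustration and are not part of Drizzle.

```python
def render_slave_cfg(settings):
    """Render a dict of options as 'key = value' lines, one per option."""
    return "\n".join("%s = %s" % (key, value) for key, value in settings.items()) + "\n"


# Hypothetical values: substitute your own master address and credentials.
settings = {
    "master-host": "192.0.2.10",
    "master-port": 4427,
    "master-user": "kent",
    "master-pass": "samplepassword",
    "io-thread-sleep": 10,
    "applier-thread-sleep": 10,
}

cfg_text = render_slave_cfg(settings)
print(cfg_text, end="")
```

Writing `cfg_text` to /tmp/slave.cfg (or wherever you keep the file) is then a single `open(...).write(...)` call.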

Setting up the master is the next step. The master Drizzle database server must be started with the --innodb.replication-log option and, in most circumstances, a few other options as well. More options can be found in the options documentation. These are the most common options needed for a replication master:

The InnoDB replication log must be running:

--innodb.replication-log

The PID file must be set:

--pid-file=/var/run/drizzled/drizzled.pid

The address binding for Drizzle’s default port (4427):

--drizzle-protocol.bind-address=0.0.0.0

The address binding for systems replicating through MySQL’s default port (3306):

--mysql-protocol.bind-address=0.0.0.0

The data directory can be set to something other than the default:

--datadir=$PWD/var

For more complex setups, the server id option may be appropriate to use:

--server-id=?

To run Drizzle in the background, thereby keeping the database running if the user logs out:

--daemon

So the start command looks like this on my server:

master> /usr/local/sbin/drizzled \
--innodb.replication-log \
--pid-file=/var/run/drizzled/drizzled.pid \
--drizzle-protocol.bind-address=0.0.0.0 \
--mysql-protocol.bind-address=0.0.0.0 \
--daemon

Starting the slave is very similar to starting the master but there are a couple of steps before you are ready to start it up. The following is quoted from David’s blog post on simple replication.

1. Make a backup of the master databases.
2. Record the state of the master transaction log at the point the backup was made.
3. Restore the backup on the new slave machine.
4. Start the new slave and tell it to begin reading the transaction log from the point recorded in #2.

Steps #1 and #2 are covered with the drizzledump client program. If you use the --single-transaction option to drizzledump, it will place a comment near the beginning of the dump output with the InnoDB transaction log metadata. For example:
master> drizzledump --all-databases --single-transaction > master.backup
master> head -1 master.backup
-- SYS_REPLICATION_LOG: COMMIT_ID = 33426, ID = 35074

The SYS_REPLICATION_LOG tells the slave where to start reading from. It has two pieces of information:

• COMMIT_ID: This value is the commit sequence number recorded for the most recently executed transaction stored in the transaction log. We can use this value to determine proper commit order within the log. The unique transaction ID cannot be used since that value is assigned when the transaction is started, not when it is committed.
• ID: This is the unique transaction identifier associated with the most recently executed transaction stored in the transaction log.
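
If you script your slave provisioning, you can pull both values out of that header comment instead of copying them by hand. Here is a small Python sketch. The function name is my own, and the pattern only assumes the comment layout shown in the example above.

```python
import re

# Matches the metadata comment drizzledump writes with --single-transaction.
HEADER_RE = re.compile(
    r"SYS_REPLICATION_LOG:\s*COMMIT_ID\s*=\s*(\d+),\s*ID\s*=\s*(\d+)"
)


def parse_replication_header(line):
    """Return (commit_id, transaction_id) from the dump's header comment."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("not a SYS_REPLICATION_LOG header: %r" % line)
    return int(match.group(1)), int(match.group(2))


commit_id, trx_id = parse_replication_header(
    "-- SYS_REPLICATION_LOG: COMMIT_ID = 33426, ID = 35074"
)
print(commit_id, trx_id)  # prints: 33426 35074
```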

Now you need to start the server without the slave plugin, then import the backup from the master, then shut down and restart the server with the slave plugin. This is straight out of the docs:

slave> sbin/drizzled --datadir=$PWD/var &
slave> drizzle < master.backup
slave> drizzle --shutdown

Now that the backup is imported, restart the slave with the replication slave plugin enabled and use a new option, --slave.max-commit-id, to force the slave to begin reading the master’s transaction log at the proper location.

Two options are always required: one to add the slave plugin and one to point at the slave.cfg file. So the most basic start command is:

slave> /usr/local/sbin/drizzled \
--plugin-add=slave \
--slave.config-file=/usr/local/etc/slave.cfg

A more typical startup will need more options. My startup looks like this:

slave> /usr/local/sbin/drizzled \
--plugin-add=slave \
--datadir=$PWD/var \
--slave.config-file=/usr/local/etc/slave.cfg \
--pid-file=/var/run/drizzled/drizzled.pid \
--drizzle-protocol.bind-address=0.0.0.0 \
--mysql-protocol.bind-address=0.0.0.0 \
--daemon \
--slave.max-commit-id=33426

The slave.max-commit-id value is the COMMIT_ID found in the dump file that we made from the master; it tells the slave where to start reading from.
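
For anyone automating this, here is a rough Python sketch that assembles the slave start command from the commit id. Only the drizzled options come from the commands above. The helper name `slave_start_argv` and its parameters are invented for this example.

```python
import shlex


def slave_start_argv(config_file, max_commit_id,
                     drizzled="/usr/local/sbin/drizzled", extra=()):
    """Build the argument list for starting a slave at a given commit id."""
    argv = [
        drizzled,
        "--plugin-add=slave",
        "--slave.config-file=%s" % config_file,
    ]
    argv.extend(extra)  # e.g. --daemon, --datadir=..., bind addresses
    argv.append("--slave.max-commit-id=%d" % max_commit_id)
    return argv


argv = slave_start_argv("/usr/local/etc/slave.cfg", 33426, extra=["--daemon"])
# Print a copy-and-paste friendly version of the command.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The list form can be handed straight to `subprocess.Popen(argv)`, which avoids shell quoting problems.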

If you need more information for your particular setup, the SYS_REPLICATION_LOG and INNODB_REPLICATION_LOG tables hold a lot of detail.

Two tables in the DATA_DICTIONARY schema provide the different views into the transaction log: the SYS_REPLICATION_LOG table and the INNODB_REPLICATION_LOG table.

drizzle> SHOW CREATE TABLE data_dictionary.sys_replication_log\G
*************************** 1. row ***************************
Table: SYS_REPLICATION_LOG
Create Table: CREATE TABLE `SYS_REPLICATION_LOG` (
`ID` BIGINT,
`SEGID` INT,
`COMMIT_ID` BIGINT,
`END_TIMESTAMP` BIGINT,
`MESSAGE_LEN` INT,
`MESSAGE` BLOB,
PRIMARY KEY (`ID`,`SEGID`) USING BTREE,
KEY `COMMIT_IDX` (`COMMIT_ID`,`ID`) USING BTREE
) ENGINE=InnoDB COLLATE = binary

drizzle> SHOW CREATE TABLE data_dictionary.innodb_replication_log\G
*************************** 1. row ***************************
Table: INNODB_REPLICATION_LOG
Create Table: CREATE TABLE `INNODB_REPLICATION_LOG` (
`TRANSACTION_ID` BIGINT NOT NULL,
`TRANSACTION_SEGMENT_ID` BIGINT NOT NULL,
`COMMIT_ID` BIGINT NOT NULL,
`END_TIMESTAMP` BIGINT NOT NULL,
`TRANSACTION_MESSAGE_STRING` TEXT COLLATE utf8_general_ci NOT NULL,
`TRANSACTION_LENGTH` BIGINT NOT NULL
) ENGINE=FunctionEngine COLLATE = utf8_general_ci REPLICATE = FALSE

There you are: you should now be up and running with replication set up.

For more details you can always check the online documentation. And make sure you check out dshrewsbury.blogspot.com.

The Fremont beta2, version 2011.11.29, is out and ready to be tested.

In this release:
* continuing refactoring, restructuring, and code quality improvements
* many more documentation improvements
* documentation available at http://docs.drizzle.org
* fixes to libdrizzle .pc support
* fixes to build scripts
* additional bugs fixed

The Drizzle download file can be found here

I’m surprised and delighted to see that the Drizzle documentation was updated recently. Last time I looked, it was the original documentation which was missing, among other things, information about its 70+ plugins. So Henrik and I began filling in missing pieces of crucial information like administering Drizzle. I even generated skeleton documentation for every [...]

The entire drizzle.org domain was unavailable for about 10 hours today. This made our website, documentation, Jenkins master and mail server inaccessible. On the other hand, because we use public services such as Launchpad and Freenode for our code repository, bug tracking, mailing list and IRC, development work continued as actively as ever. In fact, I think it was the most active day on the #drizzle IRC channel in a while!

The DNS outage was related to our transferring of the drizzle.org domain from an individual Drizzle developer to Software in the Public Interest, Inc, our umbrella non-profit corporation. We don't know exactly why, but something went wrong between the registrars, so that the Whois record listed Tucows, the sponsoring registrar used by SPI, as the new registrar, but all other information was still pointing to the old registrar, including some Godaddy nameservers. As Godaddy eventually stopped answering DNS queries for drizzle.org - as they should - the drizzle.org domain became unavailable. 10 hours later the issue was fixed, and the correct SPI nameservers started to propagate through the DNS system. At the time of this writing, everything should have been working normally for some hours already.

And yes, in related news, drizzle.org has now been transferred to the ownership of Software in the Public Interest. This is yet another step in our process of becoming a solid non-profit community project, with fiscal services provided by the SPI. So far the experience has been enjoyable and we've really felt a warm welcome into the family of SPI hosted free and open source software projects.

On that note I'd like to thank Ganneff, Solver, Hydroxide and weasel from the #spi channel for actively helping in troubleshooting and fixing the problem today.

So I’m back from the Percona dev team’s recent meeting.  While there, we spent a fair bit of time discussing Xtrabackup development.  One of our challenges is that as we add richer features to the tool, we need equivalent testing capabilities.  However, it seems a constant in the MySQL world that available QA tools often leave something to be desired.  The randgen is a literal wonder-tool for database testing, but it is also occasionally frustrating / doesn’t scratch every testing itch.  It is based on technology SQL Server was using in 1998 (MySQL began using it in ~2007, IIRC).  So this is no knock, it is merely meant to be an example of a poor QA engineer’s frustrations ; )  While the current Xtrabackup test suite is commendable, it also has its limitations. Enter the flexible, adaptable, and expressive answer: dbqp.

One of my demos at the dev meeting was showing how we can set up tests for Xtrabackup using the unittest paradigm.  While this sounds fancy, basically, we take advantage of Python’s unittest and write classes that use its code.  The main things dbqp does are search the specified server code (to make sure we have everything we should), allocate and manage servers as requested by the test cases, and do some reporting and management of the test cases.  As the tool matures, I will be striving to let more of the work be done by unittest code rather than things I have written : )

To return to my main point, we now have two basic tests of xtrabackup:

Basic test of backup + restore:

  1. Populate server
  2. Take a validation snapshot (mysqldump)
  3. Take the backup (via innobackupex)
  4. Clean datadir
  5. Restore from backup
  6. Take restored state snapshot and compare to original state

Slave setup

  1. Similar to our basic test except we create a slave from the backup, replicating from the backed up server.
  2. After the initial setup, we ensure replication is set up ok, then we do additional work on the master and compare master and slave states
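
To make the shape of the basic test concrete, here is a toy Python sketch of the same six steps. It is a stand-in, not the real test: a directory of files plays the server, `shutil.copytree` plays innobackupex, and a hash of every file plays the mysqldump snapshot. All names in it are invented for illustration.

```python
import hashlib
import os
import shutil
import tempfile
import unittest


def snapshot(datadir):
    """Map each file's relative path to a SHA-1 of its contents."""
    state = {}
    for root, _dirs, files in os.walk(datadir):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            state[os.path.relpath(path, datadir)] = digest
    return state


class BackupRestoreSketch(unittest.TestCase):
    def setUp(self):
        self.workdir = tempfile.mkdtemp()
        self.datadir = os.path.join(self.workdir, "datadir")
        self.backup = os.path.join(self.workdir, "backup")
        os.mkdir(self.datadir)

    def tearDown(self):
        shutil.rmtree(self.workdir)

    def test_backup_and_restore(self):
        # 1. Populate the "server" (here: just a file on disk).
        with open(os.path.join(self.datadir, "t1.dat"), "w") as handle:
            handle.write("row1\nrow2\n")
        # 2. Take the validation snapshot.
        original = snapshot(self.datadir)
        # 3. Take the backup (stand-in for innobackupex).
        shutil.copytree(self.datadir, self.backup)
        # 4. Clean the datadir.
        shutil.rmtree(self.datadir)
        # 5. Restore from the backup.
        shutil.copytree(self.backup, self.datadir)
        # 6. Compare the restored state with the original snapshot.
        self.assertEqual(snapshot(self.datadir), original)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(BackupRestoreSketch)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```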

One of the great things about this is that we have the magic of assertions.  We can insert them at any point of the test we feel like validating and the test will fail with useful output at that stage.  The backup didn’t take correctly?  No point going through any other steps — FAIL! : )  The assertion methods just make it easy to express what behavior we are looking for.  We want the innobackupex prepare call to run without error?
Boom goes the dynamite!:

# prepare our backup
cmd = ("%s --apply-log --no-timestamp --use-memory=500M "
       "--ibbackup=%s %s" % (innobackupex, xtrabackup, backup_path))
retcode, output = execute_cmd(cmd, output_path, exec_path, True)
# fail right here, with the tool's output as the message, if prepare did not succeed
self.assertEqual(retcode, 0, msg=output)
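
If you want to play with this pattern outside of dbqp, here is a self-contained sketch. It is an assumption-laden toy: `run_cmd` is a simplified helper of my own (not dbqp's `execute_cmd`), and a harmless Python one-liner stands in for the innobackupex call.

```python
import subprocess
import sys
import unittest


def run_cmd(cmd):
    """Run cmd (a list of arguments); return (retcode, combined output)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.returncode, proc.stdout


class PrepareStepSketch(unittest.TestCase):
    def test_command_succeeds(self):
        # Stand-in for the innobackupex prepare call.
        cmd = [sys.executable, "-c", "print('prepare ok')"]
        retcode, output = run_cmd(cmd)
        # On failure, the captured output becomes the assertion message.
        self.assertEqual(retcode, 0, msg=output)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PrepareStepSketch)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Swap the one-liner for a command that exits non-zero and the failure report will include whatever the command printed.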

From these basic tests, it will be easy to craft more complex test cases.  Creating the slave test was simply a matter of adapting the initial basic test case slightly.  Our plans include: *heavy* crash testing of both xtrabackup and the server, enhancing / expanding replication tests by creating heavy randgen loads against the master during backup and slave setup, and other assorted crimes against database software.  We will also be porting the existing test suite to use dbqp entirely…who knows, we may even start working on Windows one day ; )

These tests are by no means the be-all-end-all, but I think they do represent an interesting step forward.  We can now write actual, honest-to-goodness Python code to test the server.  On top of that, we can make use of the included unittest module to give us all sorts of assertive goodness to express what we are looking for.  We will need to refine things as time moves forward (and we plan to), but at the moment, we are able to do some cool testing tricks that weren’t easily do-able before.

If you’d like to try these tests out, you will need the following:
* dbqp (bzr branch lp:dbqp)
* DBD::mysql installed (the tests use the randgen and this is required…hey, it is a WONDER-tool!) : )
* Innobackupex, a MySQL / Percona server and the appropriate xtrabackup binary.

The tests live in dbqp/percona_tests/xtrabackup_basic and are named basic_test.py and slave_test.py, respectively.

To run them:
$ ./dbqp.py --suite=xtrabackup_basic --basedir=/path/to/mysql --xtrabackup-path=/mah/path --innobackupex-path=/mah/other/path --default-server-type=mysql --no-shm

Some next steps for dbqp include:
1)  Improved docs
2)  Merging into the Percona Server trees
3)  Setting up test jobs in Jenkins (crashme / sqlbench / randgen)
4)  Other assorted awesomeness

Naturally, this testing goodness will also find its way into Drizzle (which currently has a 7.1 beta out).  We definitely need to see some Xtrabackup test cases for Drizzle’s version of the tool (mwa ha ha!) >: )

The Drizzle project regularly gets people asking what they can do to get involved in the project.

One very easy way to brush up on your C++ skills and dip your toe into our open development process is to fix minor warnings.

We are very proud that Drizzle builds with zero warnings with "gcc -Wall -Wextra".

But we can be even better!  Our JenkinsCI system has a target that is even more picky, and also a target that runs cppcheck.

Go to one of those pages, pick a build log off the build history, find a warning that you think you can fix, and then ask us in the #drizzle channel on Freenode how to send your fix to us.

After you've done that a few times, you'll be ready to fix some low-hanging fruit.

We've had people graduate from this process to become Google Summer of Code students, and eventually land full-time paying jobs hacking on Drizzle and other open source software.

And it all starts with writing a simple warning fix.

Here are the slides to my second talk at last week's Percona Live event in London:

It is finally here!

The Fremont beta is out and ready to be tested.

In this release:

The Drizzle download file can be found here

Just wanted to blog about some of the latest updates to dbqp.  We just merged some interesting changes into Drizzle (just in time for the impending Fremont beta).  In addition to general code cleanup / reorganization, we have the following goodies:

Randgen in the Drizzle tree

One of the biggest things is that the random query generator (aka randgen) is now part of the Drizzle tree.  While I did some of the work here, the major drivers of this happening were Brian and Stewart:

  1. Brian makes a fair argument that the easier / more convenient it is to run a test, the greater the likelihood of it being run.  Additional tools to install, etc = not so much.  Having something right there and ready to go = win!
  2. Stewart is also a fan of convenience, lotsa testing, and working smarter, not harder.  As a result, he did the initial legwork on merging the randgen.  I do suspect there is still much for me to learn about properly bzr joining trees and whatnot, but we’ll get it right soon enough ; )

This doesn’t mean we won’t be contributing any changes we make back to the main randgen project / branch, it is strictly to facilitate more testing for Drizzle.  As we already have our randgen tests packaged into dbqp-runnable suites, running these tests is even easier : )

--libeatmydata

Another request fulfilled in this update is the ability to use Stewart’s libeatmydata to speed up testing.  By default, dbqp uses shared memory as a workdir, similar to mysql-test-run’s --mem option (this can be bypassed in dbqp with --no-shm, fyi).  However, this isn’t always perfect or desirable to do.

An alternative is to use libeatmydata, which disables fsync() calls.  As the name implies, you don’t want to use it if you care about your data, but for general testing purposes, it can greatly speed up test execution.

If you have the library installed / on your machine, you can use it like so:  ./dbqp --libeatmydata [--libeatmydata-path ] …

By default, libeatmydata-path is /usr/local/lib/libeatmydata.so (as if you used make install)
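
For the curious: libeatmydata is normally activated by preloading it into a process through the LD_PRELOAD environment variable. The sketch below shows how a test runner could set that up before launching a server. Treat it as an assumption about the general technique rather than a copy of dbqp's code. The function name is made up.

```python
import os

DEFAULT_LIBEATMYDATA = "/usr/local/lib/libeatmydata.so"


def env_with_libeatmydata(base_env, lib_path=DEFAULT_LIBEATMYDATA):
    """Return a copy of base_env with lib_path placed first in LD_PRELOAD."""
    env = dict(base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by spaces or colons.
    env["LD_PRELOAD"] = lib_path if not existing else lib_path + " " + existing
    return env


env = env_with_libeatmydata(os.environ)
print(env["LD_PRELOAD"])
```

The resulting dict can be passed as `env=` to `subprocess.Popen` when starting the server, so only the server under test runs with fsync() disabled.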

Multiple server types

IMHO, this is one of the coolest new tricks.  dbqp can now handle more than just Drizzle servers / source!  The ultimate idea is to allow tests that utilize more than one type / version of a server to have more interesting tests : )  This will be useful for scenarios like testing Drizzledump migration as we can feed in one (or more) MySQL servers and a Drizzle tree and make sure we can migrate data from all of them.

We also intend to utilize dbqp for testing a variety of Percona products, and it is kind of handy to be able to run the code you are testing ; )  I already have the tool running Percona / MySQL servers and have some randgen tests working:


$ ./dbqp.py --default_server_type=mysql --basedir=/percona-server/Percona-Server --mode=randgen
Setting --no-secure-file-priv=True for randgen usage...
20111013-163443 INFO Linking workdir /dbqp/workdir to /dev/shm/dbqp_workdir_pcrews_9dbc7e8a-2872-45a9-8a07-f347f6184246
20111013-163443 INFO Using mysql source tree:
20111013-163443 INFO basedir: /percona-server/Percona-Server
20111013-163443 INFO clientbindir: /percona-server/Percona-Server/client
20111013-163443 INFO testdir: /dbqp
20111013-163443 INFO server_version: 5.5.16-rel21.0
20111013-163443 INFO server_compile_os: Linux
20111013-163443 INFO server_platform: x86_64
20111013-163443 INFO server_comment: (Percona Server with XtraDB (GPL), Release rel21.0, Revision 188)
20111013-163443 INFO Using default-storage-engine: innodb
20111013-163443 INFO Using testing mode: randgen
20111013-163443 INFO Processing test suites...
20111013-163443 INFO Found 5 test(s) for execution
20111013-163443 INFO Creating 1 bot(s)
20111013-163449 INFO Taking clean db snapshot...
20111013-163452 INFO bot0 server:
20111013-163452 INFO NAME: s0
20111013-163452 INFO MASTER_PORT: 9307
20111013-163452 INFO SOCKET_FILE: /dbqp/workdir/bot0/s0/var/s0.sock
20111013-163452 INFO VARDIR: /dbqp/workdir/bot0/s0/var
20111013-163452 INFO STATUS: 1
20111013-163506 ===============================================================
20111013-163506 TEST NAME [ RESULT ] TIME (ms)
20111013-163506 ===============================================================
20111013-163506 main.blob [ pass ] 8624
20111013-163516 main.create_drop [ pass ] 2862
20111013-163524 main.many_indexes [ pass ] 1429
20111013-163547 main.optimizer_subquery [ pass ] 17153
20111013-163558 main.outer_join [ pass ] 4243
20111013-163558 ===============================================================
20111013-163558 INFO Test execution complete in 69 seconds
20111013-163558 INFO Summary report:
20111013-163558 INFO Executed 5/5 test cases, 100.00 percent
20111013-163558 INFO STATUS: PASS, 5/5 test cases, 100.00 percent executed
20111013-163558 INFO Spent 34 / 69 seconds on: TEST(s)
20111013-163558 INFO Test execution complete
20111013-163558 INFO Stopping all running servers...

Expect to see this up and running tests against Percona Server in the next week or so.  I’ll be writing more about this soon.

Native / unittest mode

This hasn’t made it into the Drizzle tree yet.  To ease merging the code with Percona Server / Xtrabackup, I’ve created a separate launchpad project.  One of the things we needed was the ability to write complex tests directly.  It is currently easy to plug new tools into dbqp, but we essentially needed a new tool for certain testing needs.

Our solution for this was to allow dbqp to run python unittest modules.  We still have a bit of work to do before we have some demo tests ready, but we will be creating some expanded Xtrabackup tests using this system very soon.  So far, it is turning out to be pretty neat:


./dbqp.py --default_server_type=mysql --basedir=/percona-server/Percona-Server --mode=native
20111013-190744 INFO Killing pid 1747 from /dbqp/workdir/bot0/s0/var/run/s0.pid
20111013-190744 INFO Linking workdir /dbqp/workdir to /dev/shm/dbqp_workdir_pcrews_9dbc7e8a-2872-45a9-8a07-f347f6184246
20111013-190744 INFO Using mysql source tree:
20111013-190744 INFO basedir: /percona-server/Percona-Server
20111013-190744 INFO clientbindir: /percona-server/Percona-Server/client
20111013-190744 INFO testdir: /dbqp
20111013-190744 INFO server_version: 5.5.16-rel21.0
20111013-190744 INFO server_compile_os: Linux
20111013-190744 INFO server_platform: x86_64
20111013-190744 INFO server_comment: (Percona Server with XtraDB (GPL), Release rel21.0, Revision 188)
20111013-190744 INFO Using default-storage-engine: innodb
20111013-190744 INFO Using testing mode: native
20111013-190744 INFO Processing test suites...
20111013-190744 INFO Found 1 test(s) for execution
20111013-190744 INFO Creating 1 bot(s)
20111013-190749 INFO Taking clean db snapshot...
20111013-190750 INFO bot0 server:
20111013-190750 INFO NAME: s0
20111013-190750 INFO MASTER_PORT: 9306
20111013-190750 INFO SOCKET_FILE: /dbqp/workdir/bot0/s0/var/s0.sock
20111013-190750 INFO VARDIR: /dbqp/workdir/bot0/s0/var
20111013-190750 INFO STATUS: 1
20111013-190756 ===============================================================
20111013-190756 TEST NAME [ RESULT ] TIME (ms)
20111013-190756 ===============================================================
20111013-190756 main.example_test [ pass ] 1
20111013-190756 test_choice (example_test.TestSequenceFunctions) ... ok
20111013-190756 test_sample (example_test.TestSequenceFunctions) ... ok
20111013-190756 test_shuffle (example_test.TestSequenceFunctions) ... ok
20111013-190756
20111013-190756 ----------------------------------------------------------------------
20111013-190756 Ran 3 tests in 0.000s
20111013-190756
20111013-190756 OK
20111013-190756
20111013-190756 ===============================================================
20111013-190756 INFO Test execution complete in 6 seconds
20111013-190756 INFO Summary report:
20111013-190756 INFO Executed 1/1 test cases, 100.00 percent
20111013-190756 INFO STATUS: PASS, 1/1 test cases, 100.00 percent executed
20111013-190756 INFO Spent 0 / 6 seconds on: TEST(s)
20111013-190756 INFO Test execution complete
20111013-190756 INFO Stopping all running servers...
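
The test names in that output (test_choice, test_sample, test_shuffle in TestSequenceFunctions) match the well-known example from the Python unittest documentation. I am assuming example_test.py looks roughly like the following. This is a Python 3 rendering of that example with an explicit runner added so it can be run on its own, not the exact file from the tree.

```python
import random
import unittest


class TestSequenceFunctions(unittest.TestCase):
    def setUp(self):
        self.seq = list(range(10))

    def test_shuffle(self):
        # A shuffled sequence must still contain exactly the same elements.
        random.shuffle(self.seq)
        self.seq.sort()
        self.assertEqual(self.seq, list(range(10)))

    def test_choice(self):
        element = random.choice(self.seq)
        self.assertIn(element, self.seq)

    def test_sample(self):
        # Asking for more items than the population holds is an error.
        with self.assertRaises(ValueError):
            random.sample(self.seq, 20)
        for element in random.sample(self.seq, 5):
            self.assertIn(element, self.seq)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSequenceFunctions)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```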

This really only scratches the surface of what can happen, but I’ll be writing more in-depth articles on what kind of tricks we can pull off as the code gets more polished.

Three non-testing bits:

1)  Percona Live London is just around the corner and members of the Drizzle team will be there.

2)  We are *this* close to Fremont beta being ready.  The contributions and feedback have been most welcome.  Any additional testing / etc are most appreciated.

3)  Drizzle is now part of the SPI!

 

Drizzle source tarball, version 2011.10.27 has been released.

In this release:

  • Continued code refactoring
  • Cleanup of test system
  • Document cleanup

 

The Drizzle download file can be found here

It’s been a while since I’ve blogged, but I couldn’t think of a better time to resume than when Drizzle officially became associated with Software in the Public Interest. I’ve been a part of Drizzle for almost a year and a half now, and my passion for Drizzle seems to grow every day. It is always good to see changes that happen for the good, and this is one of them. Now that Drizzle is a part of SPI, it has a legal entity behind it, which is always good. How can you benefit from this, you may ask? If you are a US taxpayer, any donation that you make will be tax deductible, and all your valuable contributions will be used towards the betterment of Drizzle. The easiest way to donate is using a credit card at Click & Pledge. The SPI website lists some alternative methods such as using a cheque.  So, please do make your valuable contributions towards Drizzle. As always, I feel proud to be a part of the Drizzle family and will continue to strive for the betterment of Drizzle. :) :) :)


UPDATE:  Our donation link was broken, but has been fixed.  Apologies for any inconvenience this may have caused.

Since its inception in 2008, Drizzle has taken the approach of an open source community project. It has been run as a meritocracy by its developer community, with developers from various companies as well as individuals. Even though first Sun and then Rackspace sponsored a core team to work full time on the project, the project is not owned by any particular company; Drizzle is developed by a vibrant and diverse community. We've been proud to have between 20 and 40 active contributors each month.

With the first stable release in March 2011, interest in the project grew. At this point we also decided it was time to solidify and clarify the status of the project as a non-profit community project. Out of a couple of options available, we chose to become an Associated Project at the Software in the Public Interest. The SPI is a charitable US non-profit corporation that acts as an umbrella organization for many open source projects, including some well known ones like Debian and PostgreSQL. Drizzle was accepted as an SPI Associated Project by the Board on August 10th.

At this point, we would like to extend our gratitude to Josh Berkus (a PostgreSQL lead developer) who guided and sponsored Drizzle through the application process! We appreciate not just your experience and expertise with the SPI, but also the gesture of friendly help between two open source database projects.

Having a legal entity behind Drizzle has a number of useful benefits and any open source project that is bigger than just one or two guys should seriously consider using one of these umbrella organizations.

One benefit that is available to us starting this week is that you can now donate money to Drizzle via the SPI. The easiest way to donate is using a credit card at Click & Pledge. The SPI website lists some alternative methods such as using a cheque. For US tax payers donations are tax-deductible and if you are a business you can of course write the donation off as an expense.

Donations made via the SPI bank account will be properly accounted for by the SPI treasurer. Please read the links above on how to "earmark" your donation for Drizzle. Available funds will be used by the Drizzle project primarily for expenses such as legal or IT infrastructure, and secondarily for arranging developer meetings and, if possible, sponsoring attendance at conferences. Note that if you want to sponsor development of Drizzle, such as a particular feature, and you have a large enough budget for that purpose, we recommend that you instead contract such development directly via one of the commercial service providers that employ Drizzle developers.

So, it has been a while since I’ve blogged.  As some of you may have read, I have a new job and Stewart and I have been busy planning all kinds of testing goodness for Percona >: ) (I’ve also been recovering from trying to keep up with Stewart!)

Rest assured, gentle readers, that I have not forgotten everyone’s favorite modular, community-driven database ; )  Not by a long-shot.  I have some major improvements to dbqp getting ready for a merge (think randgen in-tree / additional testing modes / multiple basedirs of multiple types).  Additionally, I’ve been cooking up some code to test the mighty Mr. Shrews’ multi-master code (mwa ha ha!)

What I’ve done is allow for a new option to be used with a test’s .cnf file (this is a dbqp thing, won’t work with standard drizzle-test-run).  If the runner sees this request, it will generate a multi-master config file from the specified servers’ individual slave.cnf files. 

Here is a sample config:

[test_servers]
servers = [[--innodb.replication-log],[--innodb.replication-log],[--plugin-add=slave --slave.config-file=$MASTER_SERVER_SLAVE_CONFIG]]

[s2]
# we tell the system that we want
# to generate a multi-master cnf file
# for the 3rd server to use, that
# has the first two servers as masters
# the final file is written to the first
# server's general slave.cnf file
gen_multi_master_cnf= 0,1

A good rundown of the file’s contents can be found on Shrews’ blog here, but the end result looks like this:

ignore-errors

[master1]
master-host=127.0.0.1
master-port=9306
master-user=root
master-pass=''

[master2]
master-host=127.0.0.1
master-port=9312
master-user=root
master-pass=''
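
To give a feel for what the generation step involves, here is a small Python sketch that produces a file of this shape from a list of masters. It is not the actual dbqp code. The function name and the dict keys are invented for illustration, and only the output format is taken from the sample above.

```python
def render_multi_master_cfg(masters, ignore_errors=True):
    """Render one [masterN] section per master, numbered from 1."""
    lines = []
    if ignore_errors:
        lines.extend(["ignore-errors", ""])
    for index, master in enumerate(masters, start=1):
        lines.append("[master%d]" % index)
        lines.append("master-host=%s" % master["host"])
        lines.append("master-port=%d" % master["port"])
        lines.append("master-user=%s" % master["user"])
        lines.append("master-pass=%s" % master["password"])
        lines.append("")  # blank line between sections
    return "\n".join(lines)


# The two masters from the sample output above.
masters = [
    {"host": "127.0.0.1", "port": 9306, "user": "root", "password": "''"},
    {"host": "127.0.0.1", "port": 9312, "user": "root", "password": "''"},
]
cfg_text = render_multi_master_cfg(masters)
print(cfg_text)
```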

I tried cooking up a basic test case where we spin up 3 servers – 2 masters and one slave.  On master 1, we create table t1:


CREATE TABLE t1 (a int not null auto_increment, primary key(a));

On master 2, table t2:


CREATE TABLE t2 (a int not null auto_increment, primary key(a));

We insert some records into both tables, then check that our slave has everything! Sounds simple, right?

Sigh. If only. It seems that we are running into some issues when we try to record the test – you can read the bug here

We see some interesting output in the slave’s logs before it crashes:

$ cat workdir/bot0/s2/var/log/s2.err
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: 127 rollback segment(s) active.
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
(SQLSTATE 00000) Duplicate entry '772-1' for key 'PRIMARY'
Failure while executing:
INSERT INTO `sys_replication`.`queue` (`master_id`, `trx_id`, `seg_id`, `commit_order`, `originating_server_uuid`, `originating_commit_id`, `msg`) VALUES (2, 772, 1, 1, 'ac9c8ac0-8f10-474b-9bbd-b61d2cdb2b93' , 1, 'transaction_context {
server_id: 1
transaction_id: 772
start_timestamp: 1317760732106016
end_timestamp: 1317760732106017
}
event {
type: STARTUP
}
segment_id: 1
end_segment: true
')

Replication slave: Unable to insert into queue.
Replication slave: drizzle_state_read:lost connection to server (EOF)
Lost connection to master. Reconnecting.
Replication slave: drizzle_state_connect:could not connect
111004 16:39:05 InnoDB: Starting shutdown...

Additionally, you can just try the setup with --start-and-exit:

$ ./dbqp --suite=slave --start-and-exit multi_master_basic
20111004-170033 INFO Using Drizzle source tree:

20111004-170033 INFO Taking clean db snapshot...
20111004-170033 INFO Taking clean db snapshot...
20111004-170033 INFO Taking clean db snapshot...
20111004-170035 INFO bot0 server:
20111004-170035 INFO NAME: s0
20111004-170035 INFO MASTER_PORT: 9306
20111004-170035 INFO DRIZZLE_TCP_PORT: 9307
20111004-170035 INFO MC_PORT: 9308
20111004-170035 INFO PBMS_PORT: 9309
20111004-170035 INFO RABBITMQ_NODE_PORT: 9310
20111004-170035 INFO VARDIR: /drizzle_mm_test/tests/workdir/bot0/s0/var
20111004-170035 INFO STATUS: 1
20111004-170035 INFO bot0 server:
20111004-170035 INFO NAME: s1
20111004-170035 INFO MASTER_PORT: 9312
20111004-170035 INFO DRIZZLE_TCP_PORT: 9313
20111004-170035 INFO MC_PORT: 9314
20111004-170035 INFO PBMS_PORT: 9315
20111004-170035 INFO RABBITMQ_NODE_PORT: 9316
20111004-170035 INFO VARDIR: /drizzle_mm_test/tests/workdir/bot0/s1/var
20111004-170035 INFO STATUS: 1
20111004-170035 INFO bot0 server:
20111004-170035 INFO NAME: s2
20111004-170035 INFO MASTER_PORT: 9318
20111004-170035 INFO DRIZZLE_TCP_PORT: 9319
20111004-170035 INFO MC_PORT: 9320
20111004-170035 INFO PBMS_PORT: 9321
20111004-170035 INFO RABBITMQ_NODE_PORT: 9322
20111004-170035 INFO VARDIR: /drizzle_mm_test/tests/workdir/bot0/s2/var
20111004-170035 INFO STATUS: 1
20111004-170035 INFO User specified --start-and-exit. dbqp.py exiting and leaving servers running...
pcrews@mister:/drizzle_mm_test/tests$ ps -al
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 1000 18652 1 2 80 0 - 112094 poll_s pts/2 00:00:00 lt-drizzled
0 S 1000 18688 1 3 80 0 - 112096 poll_s pts/2 00:00:00 lt-drizzled
0 S 1000 18721 1 3 80 0 - 156326 poll_s pts/2 00:00:00 lt-drizzled
0 R 1000 18780 15985 0 80 0 - 3375 - pts/2 00:00:00 ps
0 S 1000 32463 30047 0 80 0 - 11272 poll_s pts/1 00:00:01 ssh

From here, we can connect to the slave and check out sys_replication.applier_state:

$ drizzle -uroot -p9318 test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the Drizzle client.. Commands end with ; or \g.
Your Drizzle connection id is 216
Connection protocol: mysql
Server version: 2011.09.26.2427 Source distribution (drizzle_mm_test)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

drizzle> use sys_replication;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Schema changed
drizzle> show tables;
+---------------------------+
| Tables_in_sys_replication |
+---------------------------+
| applier_state |
| io_state |
| queue |
+---------------------------+
3 rows in set (0.001641 sec)

drizzle> select * from applier_state;
+-----------+------------------------+--------------------------------------+-----------------------+---------+-----------+
| master_id | last_applied_commit_id | originating_server_uuid | originating_commit_id | status | error_msg |
+-----------+------------------------+--------------------------------------+-----------------------+---------+-----------+
| 1 | 0 | f716781f-8c00-4b81-82c6-62039136d616 | 0 | RUNNING | |
| 2 | 3 | df7f2f6e-dba4-43ea-a674-fa4a3709865b | 3 | RUNNING | |
+-----------+------------------------+--------------------------------------+-----------------------+---------+-----------+
2 rows in set (0.000928 sec)

drizzle> select * from io_state;
+-----------+---------+-----------+
| master_id | status | error_msg |
+-----------+---------+-----------+
| 1 | STOPPED | |
| 2 | RUNNING | |
+-----------+---------+-----------+
2 rows in set (0.000839 sec)

drizzle>

So, it looks like the slave knows about both masters, but for some reason, the IO thread for master 1 is stopped : (
At any rate, there is a bug open on this and it could be something in my config(?) It’s been a while since I’ve played with replication and I know there has been some tinkering under the hood since then : )
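
As a rough sketch, the check done by eye in the session above could be scripted. This assumes the rows have already been fetched from the slave by some MySQL-protocol client and arrive as tuples in the column order shown in the query output; the function name stopped_masters is hypothetical and not part of dbqp or Drizzle.

```python
# Sketch: report masters whose applier or IO thread is not RUNNING.
# The rows below are hard-coded copies of the query output shown above;
# fetching them from a live slave is left out (assumption: any
# MySQL-protocol client would return them as tuples like these).

def stopped_masters(applier_rows, io_rows):
    """Return {master_id: [thread names that are not RUNNING]}."""
    problems = {}
    # applier_state columns: master_id, last_applied_commit_id,
    # originating_server_uuid, originating_commit_id, status, error_msg
    for row in applier_rows:
        master_id, status = row[0], row[4]
        if status != "RUNNING":
            problems.setdefault(master_id, []).append("applier")
    # io_state columns: master_id, status, error_msg
    for master_id, status, _error_msg in io_rows:
        if status != "RUNNING":
            problems.setdefault(master_id, []).append("io")
    return problems


applier_state = [
    (1, 0, "f716781f-8c00-4b81-82c6-62039136d616", 0, "RUNNING", ""),
    (2, 3, "df7f2f6e-dba4-43ea-a674-fa4a3709865b", 3, "RUNNING", ""),
]
io_state = [
    (1, "STOPPED", ""),
    (2, "RUNNING", ""),
]

print(stopped_masters(applier_state, io_state))
```

A test case could assert that this mapping is empty once the slave has had time to connect to every master.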

The branch with the test code can be found here:
lp:~patrick-crews/drizzle/dbqp_multi_master_test

At the very least, we can now create tests that use this feature, which will help ensure that it stays on the path of solid code in the future! How about anyone out there? Has anyone been using multi-master? If so, can you share any setups / tests? Extra information would be most appreciated : )

I am very happy to welcome Patrick Crews to the Percona development team. Patrick joins Percona at a very exciting time for the development team. We are getting regular releases of Percona Server and Percona Xtrabackup out the door, we have been heavily using the Jenkins continuous integration system to maintain and improve the quality of the products we ship and we just upgraded our documentation publishing platform for both Percona Server (5.1 and 5.5) and Percona Xtrabackup.

We are at the natural point to expand our QA efforts – and that’s where Patrick joins us.

Patrick has been doing QA in the MySQL world for a while now, and has extensive experience with both MySQL and Drizzle. His work has included use of a variety of testing tools such as the randgen (random query generator) project to which he contributes.

As a Drizzle developer, he saw the code get to its first GA release. This included testing a completely rewritten replication system, drizzledump’s evolution to a migration tool, as well as creating a new pluggable testing system for the project (dbqp – expect to hear a lot more on this in the months to come).

Patrick’s role will have him working on Percona Server, XtraDB, and XtraBackup. This will include creating more advanced tests and test systems for our development needs, which will naturally also improve the testing of Drizzle due to sharing of common code.

Patrick has a blog over at www.wc220.com where he writes about Drizzle QA and other topics.

We’ve just gone live with a new way of publishing and maintaining documentation for Percona Server and Percona XtraBackup. We are now using Sphinx to generate the documentation that we publish on the web site. Sphinx was originally created for the new Python documentation, and has won us over due to the simple markup, ease of cross-referencing, ability to keep our documentation source along with our source code and support for multiple output formats.

This moves our documentation workflow to be the same as for source code. A developer (or writer) will create a bzr branch on launchpad and submit a merge request. This means documentation review now goes through the same system as code review.

We have also set up a Jenkins job to automatically update the documentation on the percona.com web site when changes are made to the source repositories. This means that bug fixes and improvements to the documentation can transparently and automatically be published with a minimal amount of turn-around.

An added benefit for XtraBackup is that it will enable us to easily share documentation for the Drizzle version of XtraBackup, as Drizzle uses Sphinx for its documentation too. The new setup for Percona documentation was based on that of the Drizzle project.

The end result? Better and more frequently updated documentation.

Both Henrik and I will be at Percona Live London 2011 in late October speaking on the wonderful Drizzle database server.

Other speakers at the conference will be talking about a wide range of topics surrounding the MySQL ecosystem including performance monitoring, backup, search, scaling and data recovery.

P.S. I do have a discount code – ask me in the comments for it!

I’m going to be speaking at the Highload++ conference on October 3 and 4 in Moscow, Russia. This is a great conference that gathers high-quality speakers from Russia and around the world, and I usually learn a lot and enjoy talking to a lot of great people at this event.

My talk is going to be about new developments in MySQL Server 5.5, 5.6, Percona Server, MariaDB and Drizzle as they relate to high volume/large scale projects. There are a lot of really cool things happening in the MySQL space recently and I would love to share those with you.

I’m also doing a training session on the 5th of October, which will be an in-depth training session based on Percona’s Training for MySQL Developers. This is a great learning experience for both MySQL developers and DBAs, and a unique opportunity to attend this course given in Russian. Bring your laptop if you have one for a more hands-on experience.

If you would like to do business with Percona, I will be available for business development and consulting meetings on Thursday and Friday next week.

Finally, note that we’re hiring worldwide for most positions, so if you’re interested in joining the Percona team, come and talk to me. We have both technical and non-technical positions; in particular we’re looking for a Sales and Business Development person in Russia.

Why such a late announcement? If you must know, I did not know whether I would be able to make it until the very last minute. My passport was stalled in the British Consulate, getting a UK visa for Percona Live London in late October. It is great that we could get the documents back in time for my travel.

(Reposted from blog.krow.net)

From the 451 Group:
“MySQL flirted with the open core licensing model in early 2008 with plans to introduce new features into Enterprise Edition that would not be available under an open source license.”

MySQL didn’t flirt with it; it was going to do it.

Why? Because we were asking the question, “how do we pull in customers to make more money”.

MySQL was going to put the new backup API, which never materialized, into an Enterprise branch.

It was a lousy idea for the following reasons:

1) There was no internal API in the server for this, so the engineering was going to be messy and expensive.

2) We didn’t own the technology that was needed to even do this (Oracle owned Hot Backup)

3) Percona has an awesome tool for doing this, that is Open Source (http://www.percona.com/software/percona-xtrabackup/)

4) Backup is a core feature everyone needs, and some of those “everyones” are the folks who manufacture tools that you want to have work with your product.

5) When we were going to announce it, we hadn’t even written it/completed it. It was vaporware.

It would have been a horrible move, and would have caused chaos for no particular reason. It was dead on arrival when it was to be announced as a strategy, since it didn’t even exist.

Let’s look at Oracle’s move. Both the authentication module and the Thread Pool come into the MySQL server as plugins. If the engineering of the MySQL server continues in the current direction (which is somewhat flattering to Drizzle I might add), then they are on a good path (if I can find my blog entry where I talked about this as a good strategy, I’ll link back to it here).

Much of the hubbub around Open Source, Community, etc., in regards to this is a bit inflated, I feel. They haven’t touched the core product, and they are creating APIs. Are they possibly hurting themselves in regards to ubiquity?

Doubtful.

Would I pick those two pieces? No, but they aren’t the last two I would pick either. If Sun had continued as a company?

Something similar to this would have been done as well.

From an engineering and usage stand point?

The first person who sniffs at the authentication mechanism who knows anything about security is going to freak.

The Thread Pool can only be used by a very limited number of users (and there are some restrictions on what can be done in the server while it is in use). MySQL’s IO was never designed for the Thread Pool, and there is a lot of engineering work that would need to be done to make it work.

Still? People will use both, and I am betting some customers will want them badly enough to pay.

If they are really badly needed? Well then someone will write an open source version of both.

I have no great love of Oracle, but this is really not a big deal at all. The original GPL’ing of the Public Domain/LGPL clients was a much bigger deal.

Thinking of attending the upcoming Percona Live event in London, but not yet registered?  Use discount code DrizzlePLUK and save 40 pounds off normal registration.

Community members Henrik Ingo and Stewart Smith will be there and both will be presenting on Drizzle.  Hope to see you there!

 

Drizzle source tarball, version 2011.09.26 has been released.  This is one of the final releases before the Fremont beta and consists of mostly polish.  Many thanks to those who have contributed time and effort into testing things.

In this release:

The Drizzle download file can be found here.

Update: I won't be in Moscow after all. I was denied visa on grounds that my passport is beginning to fall apart and there wasn't time to get new passport, invitation and visa. Maybe next year - I was excited to go.

October brings 2 very interesting conferences. I will be speaking first on Oct 3rd at HighLoad++ in Moscow and a few weeks later on Oct 25 at Percona Live in London. I will give a talk called Choosing a MySQL Replication / High Availability Solution, which is based on the thinking I developed in my recent blog post The ultimate MySQL high availability solution and on the many benchmarks and functional tests I've done while evaluating these technologies.

At Percona Live I will also give a second talk, Fixed in Drizzle: No more GOTCHA's. It looked like none of the Drizzle core team would be able to attend the conference, and as I was going to be there I volunteered to cover a Drizzle topic at the same time. This is a talk Stewart Smith has given a few times at earlier conferences, which I liked and proposed to Percona. As it turns out, Stewart will also be in London after all, so there will be 2 Drizzle talks; I will still give the one I'm committed to.

Drizzle trunk as of 2011-09-20 (r2422) has a new revision of my query_log plugin which is important because that revision works with the latest trunk revision of mk-query-digest (wget maatkit.org/trunk/mk-query-digest). I made the query log format truly consistent and then wrote DrizzleQueryLogParser for mk-query-digest and added the command line option --type drizzlelog. I don’t intend [...]

Just in case anybody missed it: http://blogs.oracle.com/MySQL/entry/new_commercial_extensions_for_mysql

MySQL has long been an open source product, not an open source project…. and this really is the final nail in that.

To me, this was expected, but it’s still sad to see it.

I am very, very glad we have diverse copyright ownership in Drizzle so that this could not happen easily at all.

For months I’ve been perplexed that Drizzle authentication plugins did not seem to work. I filed bug 823637, asked on the mailing list, and made it need #1. I finally discovered how to authenticate with authentication plugins like auth_pam: drizzle -u daniel -P --protocol mysql-plugin-auth. The final command line option is the key: --protocol mysql-plugin-auth. [...]

My recent post on a new 95 percentile algo mentions a Python version of the algo. I’ve chosen Python 3 for development of new Drizzle tools. Problem is: the Python version of that algo is almost 3 times slower than the Perl version; here are the results (where “OLD” is Perl and “NEW” is Python): [...]

Drizzle source tarball, version 2011.08.25 has been released.

In this release:

  • NOTE: Drizzle behavior now allows 0 to mean NULL, but does not allow NULL to mean 0
  • Work on IPV6 data type - thanks to Muhammad Umair for hacking on this
  • Continued code refactoring - thanks to Olaf for his work as always
  • Various bug fixes

The Drizzle download file can be found here

Just wanted to record for the history books that:


drizzle> select js_eval('var d = new Date(); "Drizzle started running JavaScript at: " + d;')\g
+----------------------------------------------------------------------------------+
| js_eval('var d = new Date(); "Drizzle started running JavaScript at: " + d;') |
+----------------------------------------------------------------------------------+
| Drizzle started running JavaScript at: Mon Aug 29 2011 00:23:31 GMT+0300 (EEST) |
+----------------------------------------------------------------------------------+
1 row in set (0.001792 sec)

I will push this onto launchpad tomorrow, after a good night's sleep and final code cleanups.

The Drizzle developers merged my query_log plugin into Drizzle 2011.08.24. This is great news not only because it addresses need #3 but because Drizzle query_log solves every problem that ever plagued the MySQL slow log. Since I had free rein to design whatever I wanted, I designed the log format to be consistent, logical, and [...]

Drizzle source tarball, version 2011.08.24 has been released.

We are getting closer to the Drizzle 7.1 / Fremont beta.  Please keep the feedback coming!

In this release:

The Drizzle download file can be found here

Drizzle source tarball, version 2011.08.23 has been released.

We are working hard to get ready for Drizzle 7.1 beta.  Please feel free to help us out by giving things a spin : )

NOTE:  HailDB engine has been removed from the tree.

In this release:

  • Cleanup of build and test system
  • Removed haildb engine from tree
  • Updated and improved replication documentation
  • PBMS fixes to work with gcc 4.6 + code cleanup

The Drizzle download file can be found here

It’s August 1st, 2011, and five years ago on or about this date (who can remember clearly?) Peter Zaitsev and Vadim Tkachenko founded Percona. What’s happened in the last five years?

We’re a privately held, privately funded company of over 50 employees distributed globally, serving 1200 customers worldwide with support, consulting, training, and engineering services for MySQL. Our revenue isn’t public, but we’re proud that we’re able to keep all those people busy and help provide for their families. We’ve contributed significantly to improved performance and advanced functionality in the MySQL server and InnoDB storage engine, and created the first and only opensource hot-backup tool for InnoDB, as well as many other open-source software engineering projects.

We’re happy that we’ve achieved this much, but our future plans are ambitious too. We’d rather first lay eggs and then cackle, than the other way around. So watch this blog for news about what’s next from Percona.

A heartfelt thanks to all of Percona’s customers, without whom there’d be no company. And likewise, to our team members. To the MySQL community, including MariaDB and Drizzle; and the business ecosystem and third-party providers, both closed and open source. Finally, much gratitude is due to Oracle, and also to MySQL’s previous stewards: Sun Microsystems, MySQL AB, and of course Monty Widenius and David Axmark, and Heikki Tuuri, who started it all so many years ago.

Here’s to the next five years.

Sincerely,

Peter Zaitsev, CEO
Vadim Tkachenko, CTO
Baron Schwartz, Chief Performance Architect
Tom Basil, COO
Bill Schuler, VP of Sales
Espen Braekken, VP of Global Services

Drizzle 7 2011.03.13 was a GA release but far from a production-ready database server, in my opinion. I really like Drizzle, and I believe in its future, so I’m going to criticize what I think are its weak points because as Charles Kettering said, “A problem well stated is a problem half solved.” 1. Authentication [...]

The Codership team announced the availability of MySQL/Galera 0.8.1, which is a minor release, but it actually has a bunch of improvements that make Galera replication more user friendly (many bugs were fixed, including ones I reported personally, which had annoyed me a lot).

As part of my evaluation activity I ported MySQL/Galera 0.8.1 to Percona Server/Galera 0.8.1, and you can get the source code on Launchpad.

I appreciate the fact that not everybody enjoys compiling source code (hint, hint for Drizzle developers), which is why I also made binaries for RHEL 6.1 / Oracle Linux 6.1:
http://www.percona.com/downloads/TESTING/Galera/percona-5.1.57-galera-0.8.1.tar.gz

This is ABSOLUTELY NOT a production-quality release, but you are welcome to play with it.

It's my turn to apologize. Andrew and I apparently really angered people by being upset about something last week, and for that, as he already has, I apologize. I don't like making people angry or upset.

I believe Henrik made an excellent point, which is that for various different reasons, there are those of us who were upset when Oracle bought MySQL and yet felt compelled to not communicate this publicly. To be honest, emotions related to a business transaction ARE a little weird, so I'm not sure it's completely odd that people don't know how to appropriately express them. But as Henrik rightly pointed out, the Oracle takeover has been the elephant in the room (sorry Postgres - it's not you) and we've all been spending a good amount of energy NOT talking about it, because talking about it only leads to people getting upset. As I said before, I don't like making people upset, so I'll try to keep my comments there to myself for the most part.

I'd also like to apologize for writing a blog post with too many thoughts. I only included the discussion of the naming as what I thought was a humorous take on the backstory of why I was writing in the first place; I see the folly of my ways there. In the future, if what I want to talk about is annoyance at people eye-rolling at my passion for Open Source, I will endeavor to only talk about that. That way, with a single topic post, when it's referenced other places, there will be no confusion.

To sum up, I am sorry for causing any confusion or any anger or for making anyone upset.

I have been running the test and merge process for DrizzleDB using Jenkins.

Jenkins is a pretty standard Java-based web app. The configuration settings are stored in XML files, and that configuration is manipulated using an "easy to use" Web GUI.

The "old skool" UNIX-like way to keep configuration settings is in a text file, which is edited with an ordinary text editor, and is read by the program daemon on start or SIGHUP. This is considered "scary", "hard to learn", and "hard to use" by novices.

There is a big problem with GUI-only managed configuration, an issue where text file configuration has a major advantage.

I did not set up the Jenkins server or nodes. I am not the only person with admin access to it. Several other people have set it up, set up various projects in it, and added new nodes and new types of nodes.

As I work on it and look at the existing configuration, I often find things that are "surprising", things that make me say "Is that right? That can't be right? Can it?". And then I have to spend time digging into it. Sometimes it IS right, for reasons I didn't know at that moment. Sometimes it used to be right, but isn't necessary any more. And sometimes, it just wasn't right.

In a textual configuration file, you can put comments. The purpose of a comment is to communicate into the future, to tell those who came after you (including your future self) what you were intending to do, and why you selected some "surprising" option or way of doing things.

There is no good way to put comments into GUI or WebGUI configuration, even if it has a freeform field labelled "comments".
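
To make the point about comments concrete, here is a minimal sketch using Python's standard configparser module. The section name, option names and the reasons given in the comments are all made up for illustration; they are not real Jenkins settings. The program reads only the values, while the comments stay in the file for whoever edits it next.

```python
import configparser

# Hypothetical build-node settings. Every name and every stated reason
# below is invented for this example.
CONFIG_TEXT = """
[node-debian]
# Only one executor: the test suite binds fixed ports, so two concurrent
# runs on the same host would collide.
executors = 1
; Looks too high, but the full test run on this node takes over an hour.
timeout_minutes = 180
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Full-line comments starting with '#' or ';' are skipped by the parser,
# so only the two options are visible to the program.
node = parser["node-debian"]
print(node.getint("executors"), node.getint("timeout_minutes"))
```

The "surprising" values carry their explanation with them, and a later admin does not have to rediscover why they were chosen.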


Update, this post is being discussed at Reddit.

Update, this post is being discussed at Forrst Podcast ep 122.

In the world of software development on Linux, especially in the open source world, there is a set of packages called "autotools", which abstracts and manages portability between compilers and other aspects of the software build environment. Nobody likes it. In fact, most people who use it hate it. There are many attempts to replace it.

And the problems with autotools are getting worse, because it was never designed to have cleanly portable control files between versions. Back when most everyone just FTPed down a tarball, and then ran the prebuilt ./configure, it worked pretty well. Now that people pull a project's raw repo over SVN, BZR, or GIT, and then have to run libtoolize, aclocal, automake, autoconf, etc. themselves, and who knows what version of autotools is locally installed, all hell breaks loose.

With respect to all the autotools replacements, such as cmake, Ant, etc, and all the other ones mentioned in Eric Raymond's recent blog post: they ALL are some multiple combination of horribly slow, enforce their own special "one true way" of laying out source trees, are specific to what languages they will deign to handle, have abysmally bad error messages when there is a problem (and being worse than autotools in this respect is an amazing achievement), require the installation of a JVM and a huge pile of buggy poorly documented class files, require the installation of a huge pile of buggy poorly documented Python modules, require the installation of a huge pile of buggy poorly documented Perl modules, cannot intelligently detect and handle optional build dependencies, cannot cross compile, cannot build out of a read-only source tree, cannot build out of tree, and/or cannot build shared object files.

I wish to gods and monsters that this was not true, but it is. And until the writers of the competing build chain systems understand why all this stuff is important, and are willing to support it, autotools will stick around, and people will continue to use it.

This is not to say that it cannot be used better. One of Monty Taylor's herculean tasks on the Drizzle project has been pandora build, which is a refactoring and rewrite of the years of cargo-cult accumulated cruft that has infested most autotools based open source projects.

Drizzle source tarball, version 2011.07.22 has been released.

In this release:

  • Continued code refactoring (thanks again to Olaf van der Spek)
  • Multi-master replication is back
  • Various bug and documentation fixes

The Drizzle download file can be found here

There's been quite the thread on Google+ (my how technology changes quickly...) over a comment Andrew Hutchings made on an Oracle MySQL Blog Announcement for their new "Meet The MySQL Experts" Podcast. I should have ignored it - because I honestly could not give two shits one way or the other about Oracle or any podcasts that they may or may not decide to broadcast. But to be straightforward about it ... the title of the podcast is ludicrous. In case you were wondering, "The" in English is the definite article and implies a singular quality to the thing that it describes... effectively implying that Oracle's MySQL Experts are, in fact, the only MySQL Experts. We all know that's false - Percona and SkySQL are both full of experts as well, and likely have more MySQL Experts per-capita than Oracle does, as if a per-capita measure were important. Of course, as Matt Montgomery pointed out, there is absolutely no reason for Oracle to point people towards someone else's experts ... and that's fine. It's just that there are other ways to phrase the title that still assert Oracle's product and trademark and which are not, from a purely grammatical sense, lies. "Meet Our MySQL Experts" or even "Meet MySQL Experts" or "MySQL Experts Talk to You" or "Hey! Look! MySQL Experts are going to drink Black Vodka!" (ok, probably not the last, since that would point people to MariaDB - but it is at least a true statement... MySQL Experts WILL, inevitably, drink Black Vodka)

As I said earlier though - I don't really care about Oracle... they have no impact or meaning in my life... so if they want to either play silly grammatical games OR be unaware as to the actual meaning of words in English - that's fine. But then Matt Lord said something that really pissed me off:

 Any religion and its dogma can be problematic in the real world, whether or not it involves any kind of deism or not. :)

Too often people confuse FOSS with the cathedral and the bazaar, shared development, shared ownership and other high minded ideals and frameworks. In the end, it's a trademarked and in-house developed product that is released as FOSS. It's not a cross, don't try to impale yourself on it. :D

It's not that big of a deal people! We're surrounded by beauty and tragedy, this is just work.

Now, first of all, I like Matt Lord. And with that in mind, I have the following to say:

I am fully in support of trademarks and trademark protection. I am fully in support of people making a living doing what they do - especially if they are doing it by providing a service. I recognize that Oracle owns the trademark MySQL and can do with it as they see fit. Oracle does, in fact, own the product called MySQL, with all of the rights that go along with that... and honestly I do not think they are being bad shepherds of that product. Whether I like Oracle or not, it is undeniable that they are now a part of the MySQL picture, and I say good for them.

The reason I get pissed off is the attitude that it's not that big of a deal. The MySQL trademark and the business around MySQL is a BIG DEAL to Oracle, and if I were to try to put forward the opinion that they should just, you know, stop caring about it, people would think I was crazy. Why is it so unreasonable then for me to care about the portion of this that I happen to care about? Why is it not ok for me to NOT be in this for the money, for me to NOT be in this just as work?

I think it might be worthwhile reading The Cathedral and The Bazaar again - because it describes the two different models you are talking about rather than being a single entity that one might confuse FOSS with. The Cathedral, as described in the book, is the model traditionally taken by the MIT and GNU-derived projects (although emacs has a more open dev model now), and is currently also employed by Oracle on MySQL. In fact, it has been the MySQL model for quite some time - well before Oracle entered the picture. It involves a mostly closed dev process from which code drops are made unannounced and at the whim of the folks in the Cathedral. It's not de-facto a bad thing, it's just a description of a process. With the Cathedral, ironically enough, it is the ideals of Free Software (that the software itself be free) that are more important, and an open development process is less important. The Bazaar, on the other hand, is the process Linux uses - where all of the development is done in a distributed manner and in the open. The assertion in the book, and one of the philosophical differences between Free Software and Open Source (which makes the use of FLOSS or FOSS completely ludicrous), is that having an open development process is more valuable than just the software being free, although the by-product of an open development process is that your software sort of has to be Open Source. The irony here that I mentioned earlier is that, of course, Oracle approaching its Free Software offerings via the Cathedral model gets none of the benefits you would think a corporation might want from an arrangement such as Eric Raymond's Open Source Bazaar model, and instead chooses to operate under a set of zealous ideals much more akin to Richard Stallman's.

I'm sure that analogy is not pleasing to either Stallman or Ellison.

Although I understand that the ideals behind Free Software may not be important to you, I do not think there is any constructive reason, in a discussion about Oracle's business practices in asserting trademark ownership, to imply that my subscribing to those ideals is silly. It would be very difficult to accurately describe the success of any of the currently valuable pieces of Free Software as not being due in large part to those of us who routinely impale ourselves on the cross of Free Software. MySQL AB's business strategy itself, which involved attaching FUD to discussions of the GPL to incite people to buy licenses that they quite simply did not need (a perfectly valid if devious business strategy), was predicated on the existence of such an enormous shit-ton of users that they could focus on converting a percent of a percent of those users into customers and still wind up selling the business for a billion dollars. That shit-ton of users grew out of the emergence of LAMP as the dominant pattern for the Web. LAMP arose because it was technically much better than any of the alternatives, and the pieces of LAMP became dominant because of the work of a set of people who do, in fact, care about the ideals of either Free Software or Open Source.

You seem to be quick to put things in business perspectives and to remind people that it's ok for Oracle to do business. I agree. It's ok. But we wouldn't have had MySQL to work on in the first place if it wasn't for a bunch of people for whom it was not just a job, for whom it was not just work, and for whom the ideals you are looking down on are not silly things.

So disagree with me all you want to about the effects of Oracle's choices on the health of MySQL. Defend Oracle all you want to on whatever terms you want, in whatever way you want to define a set of values such that they are positive. I'm right there with you on some of it, I might disagree with you on other bits, and that's just life and how we go on being people ... but please do not smirk and snicker and roll your eyes and tell me that the things that I think are important are not. I assure you, I find them to be very important and I do not believe I am the only person who does.

There are a number of different, and very valid, patterns for handling objects of different types. This is not about that; this is about how not to mix patterns.


A very, very common bit of code that is in MySQL (and can therefore be found in Drizzle):


    if ((cached_result_type == DECIMAL_RESULT) or (cached_result_type == INT_RESULT))
    {
      do_something();
    }
    else
    {
      do_something_else();
    }



DECIMAL_RESULT and INT_RESULT are each possible result types.


Are there more?


Why yes there are. In the above bit of code the original author thought about two cases, and assumed all other cases could just be lumped into the else.


I’ve fixed dozens of bugs over the last few years based on similar assumptions.


What assumptions? 


1) That no one would ever add another result type.


2) That no other bug fix might create a case where the else no longer held true.


3) That the else was ever correct in the first place.


Without changing the entire design, what would be better?


Use a switch and make a case for each enum value. That way, if a new value is added, every place in the code where logic depends on the enum will be caught when you compile (assuming you have your warning flags turned up in your compiler).


Also? Skip “default”. Unless you are taking something off the wire/file/etc you can skip default because you aren’t going to end up with an invalid enum. If you are doing one of these actions?


Sanitize the data first, don’t just cast it.
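
A minimal sketch of both suggestions, under stated assumptions: ResultType, describe() and result_type_from_wire() are hypothetical names invented for illustration, not the actual MySQL or Drizzle definitions, and the numeric wire codes are arbitrary.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the server's result-type enum.
enum class ResultType { String, Real, Int, Row, Decimal };

// One case per enumerator and no "default". With GCC or Clang and -Wall
// (which enables -Wswitch), adding a new enumerator without handling it
// here produces a compile-time warning.
const char *describe(ResultType type)
{
  switch (type)
  {
  case ResultType::Int:
  case ResultType::Decimal:
    return "exact numeric";
  case ResultType::Real:
    return "approximate numeric";
  case ResultType::String:
    return "string";
  case ResultType::Row:
    return "row";
  }
  // Only reachable if an out-of-range value was cast into ResultType.
  throw std::logic_error("invalid ResultType");
}

// Sanitize a value read off the wire or from a file instead of casting it.
// Returns false and leaves "out" untouched when the code is unknown.
// The codes 0..4 are assumptions made for this example.
bool result_type_from_wire(int raw, ResultType &out)
{
  switch (raw)
  {
  case 0: out = ResultType::String;  return true;
  case 1: out = ResultType::Real;    return true;
  case 2: out = ResultType::Int;     return true;
  case 3: out = ResultType::Row;     return true;
  case 4: out = ResultType::Decimal; return true;
  }
  return false;
}
```

Callers that read external data would go through result_type_from_wire() first and only hand describe() a value that is known to be valid, which is what makes skipping "default" safe.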

The SQL standard says that table and column names are case insensitive. Drizzle table names are in Unicode with UTF8 encoding. boost::to_upper et al mangle UTF8 case. Down this path there is going to be a lot of pain. My own inclination is to tell the SQL Standard to realize that it's 2011, no longer 1961, and break all the apps that are lazy about identifier case, but other people will probably disagree.
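
To illustrate the kind of pain involved, here is a small sketch of byte-at-a-time upper-casing applied to a UTF8 identifier. ascii_upper() is a hypothetical helper written for this example; it is not Boost's implementation, only a stand-in for byte-oriented case folding.

```cpp
#include <string>

// Hypothetical helper: upper-case one byte at a time, ASCII letters only.
// Bytes belonging to multi-byte UTF8 sequences are left untouched.
std::string ascii_upper(std::string s)
{
  for (char &c : s)
  {
    if (c >= 'a' && c <= 'z')
      c = static_cast<char>(c - 'a' + 'A');
  }
  return s;
}

// Case-insensitive comparison built on the byte-wise fold above.
bool same_identifier(const std::string &a, const std::string &b)
{
  return ascii_upper(a) == ascii_upper(b);
}
```

With this fold, "users" and "USERS" compare equal, but "caf\xC3\xA9" (lower-case e-acute in UTF8) and "CAF\xC3\x89" (upper-case E-acute) do not, because the two-byte sequences never get folded. A correct comparison needs Unicode-aware case folding rather than a per-byte transform.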

This last March, David Shrewsbury put together a basic implementation of multi-master replication for DrizzleDB. We were able to actually merge it into the mainline trunk, but then had to remove it until we had refactored and fixed more things. Today, we were able to put it back in.

It's worth reading Shrew's original blog posts, and then trying it out.

Drizzle source tarball, version 2011.07.21 has been released.
In this release:

  • Continued code refactoring (thanks again to Olaf van der Spek)
  • Continued work on the stored procedure interface (yay Vijay!)
  • Improvements to the Storage Engine API tester from Stewart
  • Various bug fixes

The Drizzle download file can be found here