Planet Drizzle

Open Source and the Cloud: Where’s the LAMP?

Posted by: sogrady, July 03, 2009 12:16 AM

“My challenge to everyone competing with Amazon, Google and Microsoft is to remember that you’re competing with Amazon, Google and Microsoft. These are strong technology companies, and if you’re going to compete with them, open source is the only way to do that. Otherwise, you have no leverage.” – Matt Mullenweg

Let’s accept up front that the next Amazon, Google or Microsoft is not going to be able to purchase hardware as cheaply as the last Amazon, Google and Microsoft. That’s strike one. Bandwidth is also going to be a bit more dear. Strike two. Consider the challenges of managing all of the above, and that’s strike three.

But before we call them – and count them – out, let’s consider for a moment the history of the software industry. Before the cloud, before software as a service there was this weird little trend called open source. This bizarre practice involved opening (read: giving away) your source code (read: your software) so that anyone, your competitors included, could use it. For free.

Odd as this might have seemed at the time, of course, open source allowed the small to compete with the big by leveraging rather than submitting to their weaknesses. It’s sometimes difficult to remember in this Google-obsessed age, but during Windows 95’s heyday, it was natural to conclude that Microsoft was the once and future provider of all the technology that one might reasonably require. Of course we’d once thought the same about IBM, but this was different. Microsoft was different.

My how things change. And stay the same, to be fair, as Microsoft hasn’t exactly gone the way of SGI. But anyone who’s watched the Microsoft business over the past decade or so will tell you that open source has been a disruptive influence on the firm, top to bottom. As if it wasn’t enough that monopolies like the browser and operating system markets were threatened by open source alternatives, its biggest and most terrifying competitors were building their own businesses on software they didn’t have to develop. Not that Microsoft’s been alone in feeling the corrosive disruption of free software, of course; it could and has been argued, in fact, that the biggest single reason that Sun is about to be subsumed into Oracle is the LAMP stack.

David versus Goliath, indeed.

To explore the specifics of how open source might impact the cloud, let’s indulge in a bit of Q&A.

Q: Before we begin, do you have anything to disclose?
A: Yes indeed. Folks with relevant technologies like Canonical, Cloudera, Convirture, Dell, IBM, Reductive Labs, Red Hat, Microsoft, Sun and so on are RedMonk customers, while we ourselves are customers of providers such as Amazon and Google.

Q: To continue the above: could history repeat itself? Could David beat Goliath – again – in the cloud space, on the backs of free software?
A: Frankly, I doubt it.

Free software is not, by itself, enough to overcome the aforementioned economy of scale advantages enjoyed by the Amazons, Googles, and Microsofts of the world, let alone the larger, enterprise focused systems players like HP, IBM, and Oracle (why not you too, Cisco?). But that, to me, is not the interesting question.

Q: What is the interesting question, then?
A: What we should be asking is not whether free software can replace Amazon et al, but whether or not it can power a viable cloud alternative. An alternative sufficiently viable to keep the big guys honest and prevent lockin. On the answer to that question, to me, hinges nothing less than the future of the cloud market.

Q: Why is that question so important?
A: First, there’s the aforementioned question of lockin. Neither customers nor the governments that tax them can be trusted to stave off damaging monopolies, in my opinion. History demonstrates conclusively that IT staffs, necessarily focused on the present, will happily sacrifice the future for the sake of Getting Things Done today. Equally clear is the fact that governments, when finally awakened to anticompetitive threats, generally do too little, too late. Meaning that the best hope for an open and vibrant playing field – i.e. a market of cloud providers not intent on locking you in at the first opportunity – in future is competition for the existing players.

Besides its monopoly-resistant properties, open source cloud software could play an important role in the rise of so-called private clouds – cloud infrastructures that are run on-premise. Whether or not one agrees with the concept of private clouds, they’re coming. For compliance, privacy, uptime and a host of other reasons. Given that one can’t replicate the platforms of an Amazon, a Google or a Microsoft internally, it would seem to make public to private or vice versa transitions challenging in the extreme.

(Re)enter open source. Though some might point to interoperability and standards conversations as the most promising candidates for ensuring adequate competition in the cloud space, my experience in other standards arenas leads me to assign greater value to reference implementations of said standards. Open source implementations, more specifically, because at the end of the day the entire interoperability and standards discussion is about ensuring a level playing field. Throw in the fact that open source could potentially allow replication of the public cloud stack privately and you might yet see enterprises and governments pushing for open source.

Q: Are the benefits for open source cloud offerings strongest within the customer, then?
A: Not at all. Lost in discussion of cloud development has been the fact that the development platforms and targets are changing, and quickly. The level of interoperability that even unwieldy standards like J2EE offer is generally absent in the cloud. Platform as a service (PaaS) customers are writing applications, typically, to a completely proprietary abstraction layer, whether it’s offered by Google, Salesforce or someone else. And even Infrastructure as a Service (IaaS) customers deploying to enterprise standard platforms like RHEL will find their deployments regrettably unique, be that in the way that storage is accessed or the instances themselves are managed.

As Matt points out above, then, open source is going to be the primary mechanism with which startups compete, in my view. In the two primary styles of cloud implementations, IaaS and PaaS – what I’ve previously termed instance and fabric – we’ve dramatically different economic opportunities. With IaaS, the opportunities for developers and vendors have typically been to abstract the infrastructure via management, clustering and provisioning type applications. These opportunities are, frankly, likely to dwindle as Amazon increasingly offers these services itself. Within PaaS ecosystems such as Google or Salesforce, there is even less opportunity, in that the fabric is responsible for many of the tasks currently being serviced by vendors operating in the IaaS ecosystem. Most cloud vendors are building on Google or Salesforce rather than around them.

In other words, it’s a competitive market, and it’s only going to get more competitive as the bigger systems players rapidly pivot and reposition their wares for use in the cloud.

So how do you compete? Realistically, unless your company letterhead reads Google, IBM, Microsoft, Oracle or Salesforce, you’re probably going to have a hard time convincing even medium size cloud customers to write to something other than Amazon.

Unless, of course, you can develop a credible alternative that is popular enough to assuage concerns about longer term viability. Which pretty much means you’re going the open source route, in my view. Thus it is that the combination of open source and cloud is even more important for developers than it is for customers.

Q: Are any developers seeing things in those terms?
A: Sure. Take Cloudera, who’s offering a suite of commercial services around the open source Hadoop platform. Or the folks from Reasonably Smart – recently acquired by the folks from Joyent – who offer up the code from their Git and Javascript based PaaS layer with the following explanation: “we see [open source] as the only real way to make our platform truly attractive. Other Platform-as-a-Service providers may state a desire to be open, we’ve been that way from day one.”

Elsewhere, Red Hat is throwing a virtual conference strictly on the topic of open source cloud computing.

Q: Is the cloud a natural ally for open source?
A: Not at all. One of the godfathers of the free software movement, Richard Stallman, has called cloud computing “stupidity.” Others have argued that software deployed to the cloud obsoletes open source licenses, undermining the point of the software itself, with some even going so far as to call the loophole that permits this “a cancer.”

Q: Is there evidence to support these concerns?
A: Not much that I can see, candidly. Though the thinking is sound, in practice there are a great many healthy open source projects that are primarily deployed in network settings. From Hadoop to WordPress, well managed open source projects are succeeding without resorting to the more severe restrictions of the AGPL.

Q: Besides customers like enterprises and governments, who might most benefit from an open source cloud stack?
A: In a word: hosts. Given the stark economic reality that the major cloud providers’ economic advantages will expand with the growth they’re currently experiencing, what are smaller providers to do? Embracing open source seems to be the clearest response. Much as smaller and medium sized hosts worldwide today run Debian, Fedora, CentOS or Ubuntu as a means of minimizing their expense, so too are tomorrow’s would be cloud providers likely to embrace open cloud stacks in an effort to remain competitive in the burgeoning cloud market.

Besides, it’s not clear how big a cloud market will be left when the big guys are finished carving it up. If you assume (as you probably should) that IBM customers are more than likely to leverage an IBM cloud, HP customers an HP cloud and so on, you’ve already lost an important portion of the Global 100. Then consider the entrenched strength of the category’s market pioneer in Amazon and the relative strengths of communities that the likes of Google and Salesforce.com can sell into, and the addressable market is dwindling rapidly.

LAMP, with its flexibility, simplicity and – perhaps most importantly – lack of upfront licensing costs, fueled an explosion in the hosting services market once upon a time. It’s entirely possible that a similarly open source cloud stack could do the same, particularly since far more software is delivered via the network than when the hosting industry first expanded.

Q: What is this cloud LAMP stack going to look like?
A: What we’re going to see, what we’re beginning to see, I think, is a loose coalition or confederation of projects and vendors that will together comprise an increasingly viable top to bottom alternative to some of the cloud providers today. We’re clearly not going to see an Amazon or a Google spring forth, complete, overnight, but the fact is from management to virtualization to operating systems to cloud provisioning the open source alternatives to the current proprietary cloud stacks are more credible by the day.

Q: Which projects and vendors will be part of this “coalition?”
A: Ultimately, there will have to be a variety of participants with varying aims and interests, but they’re probably going to look a lot like the recent Eucalyptus/Ubuntu partnership. Besides Linux (all flavors) and Eucalyptus, examples of projects I would expect to see considered for various roles in an open source cloud stack would be things like ConVirt, Drizzle, Hadoop, Puppet, Reasonably Smart and so on. Which is not to mention critical enabling technologies like KVM or potential API candidates like the one GoGrid made available under a CC license.

As you can tell, it’s far too early to begin casting for the new acronym, but it’s clear to me that there are going to be options for those that wish to pursue open source cloud computing. Which should be obvious, since most of the existing clouds are built on open source.

Q: What about timeframes: what are your expectations in terms of when the open source cloud will arrive?
A: It’s far too early to tell. What I would say instead is that the clock is ticking, and that the network effects favor the incumbents, so if I were an open source provider with cloud ambitions, I’d be ramping up the partnership and alliance conversations as quickly as possible.

If you happen to be one such developer or vendor, drop us a line and we’ll do what we can to help connect you to similarly interested parties.

by-nc-sa

Dogfooding a pastebin

Posted by: MacPlusG3, July 02, 2009 03:52 PM

http://pastebin.flamingspork.com/

A pastebin running Drizzle and the Drizzle PHP Extension (which is on top of libdrizzle).

libdrizzle and PHP Extension Released

Posted by: Eric Day, July 02, 2009 07:14 AM

Version 0.4 of libdrizzle has been released. This was mostly a maintenance release with build system changes and small bug fixes. This is the client and protocol library for Drizzle and MySQL that provides both client and server interfaces.

Version 0.4.1 of the Drizzle PHP Extension has also been released. James Luedke has moved development and releases of the extension into PECL, and has also fixed a number of bugs, extended the interface, and worked with the PHP/PECL developers to get the extension up to the proper PHP coding standards. Thanks James!

Benchmarking Drizzle with MyBench(DBD::drizzle)

Posted by: Ronald, July 01, 2009 05:14 PM

With thanks to Patrick Galbraith and his DBD::drizzle 0.200 I am now able to test client benchmarks side by side with MySQL and Drizzle.

For simple benchmarking with clients, generally when I have little time, I use a simple Perl framework mybench. I was able to change just the connection string and run tests.

The diff of my two scripts was:

---
> my $user      = $opt{u} || "appuser";
> my $pass      = $opt{p} || "password";
> my $port      = $opt{P} || 3306;
> my $dsn       = "DBI:mysql:$db:$host;port=$port";
---
 my $user      = $opt{u} || "root";
 my $pass      = $opt{p} || "";
 my $port      = $opt{P} || 4427;
 my $dsn       = "DBI:drizzle:$db:$host;port=$port";
---

It’s too early to tell what improvement Drizzle will make. My first run, with both single-threaded and multi-threaded tests, shows Drizzle improving on MySQL in all figures; however, I will need to run this against various versions of MySQL, including the latest 5.0, to confirm.
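The only difference between the two scripts is the set of connection defaults, which can be sketched as a small lookup. This is a Python illustration rather than the Perl that mybench uses; the `build_dsn` helper and the `DEFAULTS` table are hypothetical names, and the values are copied from the diff above.

```python
# Connection defaults copied from the diff above; the helper itself is illustrative.
DEFAULTS = {
    "mysql": {"user": "appuser", "password": "password", "port": 3306},
    "drizzle": {"user": "root", "password": "", "port": 4427},
}


def build_dsn(driver, db, host, port=None):
    """Return a Perl DBI-style DSN string for the given driver."""
    if driver not in DEFAULTS:
        raise ValueError(f"unknown driver: {driver}")
    if port is None:
        port = DEFAULTS[driver]["port"]
    return f"DBI:{driver}:{db}:{host};port={port}"


if __name__ == "__main__":
    for driver in ("mysql", "drizzle"):
        print(build_dsn(driver, "test", "localhost"))
```

Keeping the per-driver defaults in one table means the benchmark body stays identical for both servers, which is the property that makes a side-by-side comparison meaningful.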

Drizzle, Rethinking MySQL for the Web (Video from OSBridge)

June 30, 2009 04:21 PM

Here is the link to the video of the talk I gave at OSBridge ->

http://blip.tv/file/2296093

It is the standard "this is Drizzle" talk.

Drizzle and Gearman in Boston next week

Posted by: lbieber, June 30, 2009 04:21 PM

Eric Day will be heading back to his home state (Maine) next week for a visit, and while there both Eric and Patrick Galbraith will be talking at the Boston MySQL Meetup Group on Monday night about Drizzle, Gearman, and how to combine the two with projects like Narada. If you are in the Boston area, be sure to check it out!

Drizzle and Gearman in Boston Next Week

Posted by: Eric Day, June 30, 2009 08:46 AM

I’ll be heading back to my home state (Maine) this week for a visit, and while I’m back there Patrick Galbraith and I will be talking at the Boston MySQL Meetup Group on Monday night about Drizzle, Gearman, and how to combine the two with projects like Narada. If you are in the Boston area, be sure to check it out!

Why You Won’t See a Drizzle Proxy

Posted by: Eric Day, June 29, 2009 08:21 PM

I’ve been following the excellent work that Jan, Kay, and others have been doing with MySQL Proxy; it has really matured into a great piece of software. I talked to Jan at the MySQL UC and toyed with the idea of integrating libdrizzle into MySQL Proxy. I’ve also been asked by a number of folks when a Drizzle Proxy project will be started and if it will be as feature rich as MySQL Proxy. For a while I just said “Someday, I just don’t have the time.” Lately, though, I am hoping we never have a Drizzle Proxy project.

Let me explain.

One of the fundamental ideas in software engineering is code reuse through libraries or modules. Rather than create a Drizzle Proxy project, why not add a proxy module into the Drizzle server? This way, at any point during the query execution path, you could toss the query to the proxy module to deal with, and the main execution engine would be done. You could of course run the Drizzle server in a “proxy only” mode where new queries may only be parsed and then a post-parsing module determines where and how that query is proxied. Post-proxy hooks will be needed as well for result processing. Functionally, it’s the same thing as the proxy, but without having to reinvent the components needed in the proxy. (Just as a side note, I understand this may not have been an option for the MySQL Proxy folks).
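To make the idea concrete, here is a minimal Python sketch of an execution path with a post-parse hook that decides whether a query runs locally or is proxied, plus a post-proxy hook for result processing. Nothing here is Drizzle code: the `Server` class, the hook names, and the `route_writes_to_primary` policy are all hypothetical, and the "parser" is just a whitespace split.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Query:
    text: str
    parsed: Optional[List[str]] = None  # stand-in for a real syntax tree


@dataclass
class Server:
    # Hypothetical hook: returns a backend name to proxy to, or None to run locally.
    post_parse_hook: Optional[Callable[[Query], Optional[str]]] = None
    # Hypothetical hook: lets a module post-process rows that came back from a backend.
    post_proxy_hook: Optional[Callable[[List[str]], List[str]]] = None
    log: List[str] = field(default_factory=list)

    def parse(self, query: Query) -> None:
        query.parsed = query.text.split()

    def execute_locally(self, query: Query) -> List[str]:
        self.log.append("local")
        return [f"local result for: {query.text}"]

    def forward(self, backend: str, query: Query) -> List[str]:
        # A real module would speak the wire protocol to the backend here.
        self.log.append(f"proxied:{backend}")
        return [f"{backend} result for: {query.text}"]

    def run(self, text: str) -> List[str]:
        query = Query(text)
        self.parse(query)
        backend = self.post_parse_hook(query) if self.post_parse_hook else None
        if backend is None:
            return self.execute_locally(query)
        rows = self.forward(backend, query)
        return self.post_proxy_hook(rows) if self.post_proxy_hook else rows


def route_writes_to_primary(query: Query) -> Optional[str]:
    """Example policy: run SELECTs locally, proxy everything else."""
    if not query.parsed or query.parsed[0].upper() == "SELECT":
        return None
    return "primary"
```

A "proxy only" mode would correspond to a hook that never returns `None`, while a mixed mode is what `route_writes_to_primary` models: some queries answered locally, others handed off.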

So, to be clear, I still want to have proxy functionality, just not as an independent project.

Even with a proxy module inside of the server, I’d like to address some of the reasons proxies are created and used. These are not necessarily specific to database proxies; many of these reasons apply to other server types as well. In the case of a database proxy, especially with Drizzle, I would like to address the list of reasons below in a different way. Why? In most architectures, I see a proxy server as a fix for a shortcoming with another component, possibly in the client, server, or maybe even in the application data model. It also introduces latency and another failure point that may not be necessary. The less code and the fewer machines your application has to run through, the better. Don’t get me wrong, there are reasons to use proxies, but sometimes they are used as a hack.

  • Query processing and rewriting - In Drizzle we plan to add query rewrite plugin hooks, both pre-parser and post-parser. At some point we want to add pluggable parser support and clean up the abstract syntax tree. These plugins would enable rewriting of queries at a few different levels, both with the raw strings or with rearranging the syntax tree before the optimizer takes over.
  • Query multi-cast, data partitioning, result merging - In my opinion, this could probably be done at the client library layer or through another system such as Gearman. If pushing that logic into the client is not an option, you could still accomplish this through the proxy module I mentioned above, possibly running the server in a mixed-mode (some queries answered locally, some proxied).
  • Connection Pooling/Concentration - People often confuse these two terms. Pooling is the reuse of connections on the client side. This should be pushed to client APIs whenever possible. When this is not possible, you need to use a generic TCP proxy or database proxy, but these should only be run locally (not on a separate machine). Concentration is a piece of software that acts as a connection multiplexer. It takes multiple client side connections and allows them to map onto a single connection to the server. This is usually because the server does not have an efficient threading or file descriptor handling model to withstand thousands of connections. It’s not always an option to re-architect a server to handle this, but it should be preferred over creating another layer to do the concentration for you. In Drizzle, this is one thing I have a particular interest in. It involves improving or re-writing the pool-of-threads scheduler and making the execution engine more stateful so it can yield a thread when it knows it will block.
  • Sharding, HA/failover - Again, something I think belongs in the client library, and is part of the new Drizzle protocol. I’ll be adding support into libdrizzle to manage sharding and connection failover shortly.
  • Debugging layer - At some point we should be adding probes into the server where output can be piped to a module of your choice. For example, you can register for a set of events and have a module send those out into a Gearman network for processing and debugging. This will give you the flexibility to process probe output however you want and does not introduce another layer just for debugging.
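As a rough illustration of the pooling half of the pooling/concentration distinction in the list above, here is a small client-side pool in Python. It is a sketch under stated assumptions, not libdrizzle's API: `ConnectionPool` is a hypothetical class, and `connect` is any caller-supplied factory that returns a connection object.

```python
import queue
import threading


class ConnectionPool:
    """Client-side pool: hand out idle connections and take them back for reuse."""

    def __init__(self, connect, max_size):
        self._connect = connect            # hypothetical factory returning a connection
        self._idle = queue.LifoQueue()     # most recently used connection is reused first
        self._max_size = max_size
        self._created = 0
        self._lock = threading.Lock()      # guards the _created counter

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection if there is one
        except queue.Empty:
            pass
        with self._lock:
            if self._created >= self._max_size:
                raise RuntimeError("pool exhausted")
            self._created += 1
        return self._connect()

    def release(self, conn):
        self._idle.put(conn)

    @property
    def created(self):
        return self._created
```

Pooling of this kind lives entirely in the client process and only saves connection setup cost. Concentration is a different thing: many client connections multiplexed onto fewer server connections, which this sketch does not attempt.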

These are things I plan to work on at some point or would like to help someone else work on inside Drizzle. Also, these are my own thoughts and may not be shared by fellow Drizzle developers. Treat this as an invitation for discussion. :)

Drizzle source tarball 1063 has been released

Posted by: lbieber, June 22, 2009 05:17 PM

Drizzle source tarball based on build 1063 has now been released. The change log can be viewed at https://launchpad.net/drizzle/trunk/aloha.

Recap of Southeast Linux Fest 2009

Posted by: Xaprb, June 21, 2009 01:48 PM

Last weekend, my brother and I attended SELF 2009. A few thoughts on it:

The mixture of sessions was interesting. There were some really good ones. I think the best session I attended was an OpenSolaris/NetBeans/Glassfish/Virtualbox/ZFS session, given by a Sun employee. He was an excellent presenter, and really showed off the strengths of the technologies in a nice way. He started up enough VMs to make his OpenSolaris laptop chew into swap, and I thought it was fun to see how it dealt with that. I’ve heard Solaris and OpenSolaris do a lot better at avoiding and managing swapping than GNU/Linux, but I couldn’t form an opinion from watching. I did think it was odd to have this session at a “Linux” (yes, they left off the GNU) conference. But I thought the session was a good addition to the conference. In other sessions, and in the hallways and expo, there was a lot more slant towards open-source software and gadgetry in general than there was towards GNU/Linux. The sessions that were about Linux or GNU/Linux were top-heavy towards topics like educational initiatives.

The Free Software Foundation had a booth in the expo hall. It was funny that they didn’t boycott the event, because I know RMS won’t speak at so-called “Linux User Groups” and insists they be called “GNU/Linux User Groups.” I guess the FSF is not unified behind that banner. Regardless, I used the opportunity to renew my membership perpetually. I’m so lazy that I need something like this to stay involved!

The expo hall was dominated by Red Hat, Fedora, and SUSE; PostgreSQL was there, but not MySQL. There was a good variety and number of vendors. It was great to see the healthy support of the event, which was free, by the way.

Clemson, SC is not easy to get to, and while the Clemson campus was attractive and functioned fine, it’s nothing you can’t find elsewhere. I ended up driving over 9 hours to get to it. I’d have preferred the technology triangle, which if nothing else is close to major airports, bus and train stops, and Red Hat.

Richard Hipp talked about the great fsync() bug, a similar talk to the one he gave at the first OpenSQL Camp. Someone asked about Tokyo Cabinet and he responded that he hasn’t found any fsync() calls in its source code. *cough* Something worth thinking about for on-disk usage (I haven’t looked at its source much myself). TC can also be used in-memory-only, and a while back I suggested that usage of it for Drizzle to replace the Memory engine; I don’t know what became of that.

WordPress on Drizzle - Beaten To the Punch

Posted by: mike, June 17, 2009 11:52 PM

Looks like Jeff Waugh actually beat me to it.

Haven't seen the code... but he's done it and sounds like he's done a somewhat thorough job.

Sadly I learned this from Brian's presentation on Drizzle at OS Bridge. Doh.

Article on Drizzle in Linux Magazine

Posted by: lbieber, June 16, 2009 04:37 PM

Great article by Jeremy Zawodny in Linux Magazine. Thanks Jeremy!

Eric Day Speaks About Gearman and Drizzle July 6, 2009 in Boston

Posted by: Sheeri Cabral, June 16, 2009 01:40 PM

The July meeting of the Boston MySQL User Group will feature Eric Day, a prominent Drizzle developer, talking about Drizzle and Gearman:

In this talk we will discuss two growing technologies: Drizzle and Gearman.

We will explain what the Drizzle project is, what we aim to accomplish, and an overview of where we are at. We will also be introducing the fundamentals of how to leverage Gearman, an open-source, distributed job queuing system. Gearman’s generic design allows it to be used as a building block for almost any use - from speeding up your website to building your own Map/Reduce cluster. We will tie Drizzle and Gearman together and demonstrate how they work in a custom Search Engine application.

————————

Here is the URL for MIT’s Map with the location of this building:
http://whereis.mit.edu/map-jpg?selection=E51&Buildings=go

This map shows the MBTA Kendall Stop:
http://whereis.mit.edu/map-jpg?selection=L5&Landmarks=go
(the stop is in red on that map, and you can see E51 in the bottom right)

Here are the URLs for the parking lots (free and open to the public after 3 pm):
http://whereis.mit.edu/map-jpg?selection=P4&Parking=go
http://whereis.mit.edu/map-jpg?selection=P5&Parking=go

Free pizza and soda will be served, so please RSVP accurately.

To RSVP anonymously, please login to the Meetup site with the e-mail address “admin at sheeri dot com” and the password “guest”.

For more information, see: http://mysql.meetup.com/137/

Storage Engine Dev Journal #3 : Supporting variable width tables

Posted by: Toru Maesaka, June 16, 2009 12:14 PM

Something I’ve added to BlitzDB recently that was pretty high on my todo list is support for variable width tables. So what is a variable width table? It is a table that contains columns that can vary in size, namely BLOB and TEXT types.

Going back to the basics, when a new row is to be written, a storage engine is given a pointer to the row data in MySQL format that it must somehow store for later lookup/retrieval. By “somehow”, I mean that the storage engine is given the freedom to do whatever it likes with the row.

Writing a row for a fixed length table (a table with columns that are always the same size) is dead easy. A storage engine can choose not to tamper with the row and simply write or copy the data to its storage mechanism. This is because the storage engine is given a row that contains all the data. Rows for variable width tables, however, are treated differently since things aren’t as simple (it’s variable!).

The difference is that columns for BLOB and TEXT types are represented by two parts inside a MySQL/Drizzle row:

  • length of the data
  • pointer to the actual data

This is simple to understand since we need to know the size of the data to copy it.

Minor Complication

The minor complication, as you would expect, is that you can’t directly write the provided row to your engine like you can with fixed length tables. The data that you want to copy/write exists elsewhere (hence the pointer), so directly writing the row has no meaning (the data would have disappeared by your next access to that row). You need to make sure that the actual data for the BLOB/TEXT column(s) is arranged appropriately in your engine’s row buffer and written out to its storage mechanism.

This process is commonly referred to as row packing (converting to your engine format) and unpacking (converting back to MySQL format). So how is this done? It’s actually pretty simple!

The solution is actually simple

As much as it sounds like a bother to support variable length rows, it’s actually not that bad. First you need to understand what a MySQL row looks like internally.

A MySQL row begins with a bitset that represents which fields are NULL. The length of this data obviously depends on the number of NULLable columns you have but this is easy to handle with Drizzle since we’re given all the relevant information by the TableShare object (same goes for MySQL from a different object).
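A toy model of such a NULL bitset, in Python, may help picture it. This is a simplification and not the exact MySQL/Drizzle layout: it assumes every column is nullable, uses one bit per column with the lowest bit first, and the `null_bitmap` function name is made up for the illustration. In a real engine the byte count comes from the table metadata rather than being recomputed.

```python
def null_bitmap(values):
    """Pack one 'is NULL' flag per column into bytes, lowest bit first."""
    bitmap = bytearray((len(values) + 7) // 8)  # round up to whole bytes
    for i, value in enumerate(values):
        if value is None:
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


if __name__ == "__main__":
    # Three columns, the second and third are NULL.
    print(null_bitmap([1, None, None]).hex())
```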

After this data comes the actual column data in the order that appears in your CREATE TABLE statement. What you need to do to get packing working with this row is the not-so-obvious part that you really need an example to look at. Fortunately Tweeting about this attracted Brian’s attention which helped me move forward.

Loop the fields!

So, let’s take row insertion to a variable width table as an example. Imagine this table:

CREATE TABLE t1 (
  id int PRIMARY KEY NOT NULL,
  description text,
  arbitrary_data blob
) engine=your_engine;

and let’s imagine that we need to process this query:

INSERT INTO t1 VALUES (1, "hello world", "blobbbbb");

Now, the storage engine needs to “pack” the data for each column into its buffer in the write_row() function. Conveniently, Drizzle/MySQL provides a pack() function for its column types (fields) that will do the data packing for you. That is, you do not have to inspect the provided row for pointers to the actual data and do the packing/copying yourself.

How? Well, the table object (which is visible from your engine) conveniently holds a list of fields in the appropriate order. The actual pack() function is a member of these fields so you just need to call it as you loop over the list:

/* make sure row_buffer has enough memory */
unsigned char *pos = row_buffer;
 
/* copy NULL bits, "table->s" is the TableShare object */
memcpy(pos, row, table->s->null_bytes);
pos += table->s->null_bytes;
 
/* "row" is the MySQL formatted row given by the core */
for (Field **field = table->field; *field; field++) {
  if (!((*field)->is_null()))
    pos = (*field)->pack(pos, row + (*field)->offset(row));
}

The above code snippet will populate “row_buffer” with the actual data that you want to write to your storage mechanism. You do not have to advance the “pos” pointer yourself because pack() returns a pointer to the end of what it wrote in the buffer (think Pascal Strings). This is precisely why we created the pos pointer: so that row_buffer itself is never advanced.

For the opposite situation (when retrieving a row), an unpack() function is provided for each field so you just need to take advantage of it like we did with the pack() snippet above.
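The pack/unpack round trip can be modelled outside the server in a few lines of Python. This is only a sketch of the idea (copy the variable-length data inline, prefixed by its length, instead of storing a pointer); the byte layout, the `SCHEMA` list, and the `pack_row`/`unpack_row` names are assumptions made for the example and do not reproduce what the Field classes actually emit.

```python
import struct

# Hypothetical schema mirroring t1: (name, kind) where kind is "int" or "var".
SCHEMA = [("id", "int"), ("description", "var"), ("arbitrary_data", "var")]
NULL_BYTES = (len(SCHEMA) + 7) // 8


def pack_row(row):
    """Serialize a dict row: NULL bitmap first, then each non-NULL field in order."""
    bitmap = bytearray(NULL_BYTES)
    body = bytearray()
    for i, (name, kind) in enumerate(SCHEMA):
        value = row.get(name)
        if value is None:
            bitmap[i // 8] |= 1 << (i % 8)  # mark NULL, write no data
            continue
        if kind == "int":
            body += struct.pack("<i", value)
        else:
            # Copy the data itself, prefixed by its length, instead of a pointer.
            body += struct.pack("<I", len(value)) + value
    return bytes(bitmap) + bytes(body)


def unpack_row(buf):
    """Inverse of pack_row: walk the buffer with a moving position."""
    pos = NULL_BYTES
    row = {}
    for i, (name, kind) in enumerate(SCHEMA):
        if buf[i // 8] & (1 << (i % 8)):
            row[name] = None
        elif kind == "int":
            (row[name],) = struct.unpack_from("<i", buf, pos)
            pos += 4
        else:
            (length,) = struct.unpack_from("<I", buf, pos)
            pos += 4
            row[name] = bytes(buf[pos:pos + length])
            pos += length
    return row
```

The moving `pos` in `unpack_row` plays the same role as the `pos` pointer in the C++ loop: each field consumes its own bytes and leaves the position at the start of the next one.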

Little bit more on fields

The actual pack() function that gets called depends on the type of column since the Field class is an abstract base class for the subclasses that actually represent column types inside Drizzle/MySQL. If you want to know what a pack() function looks like for a BLOB type, grep for “Field_blob” in the source tree and there will be a pack() member function for it.

The code layout for the field subsystem in MySQL is rather difficult to comprehend since everything is crammed into the “sql/field.cc” and “sql/field.h” files (at least as of 5.4). So, if you want to get a good grasp of how things are architected, you should take a look at Drizzle. Field subclasses are located individually in the “drizzled/field/” directory and the base class is located in “drizzled/field.h”.

So, that’s about it! Hopefully this information will help other engine developers when they come across a need to support variable width tables :)

Gearman Pluggable Protocol

Posted by: Eric Day, June 12, 2009 10:30 PM

I just finished adding pluggable protocol support to the Gearman job server; this will enable even more methods of submitting jobs into Gearman. If all the various Gearman APIs, MySQL UDFs, and Drizzle UDFs are not enough, it’s now fairly easy to write a module that takes over the socket I/O and parsing hooks to map any protocol into the job server. As an example module, I added basic HTTP protocol support:

> gearmand -r http &
[1] 29911
> ./examples/reverse_worker > /dev/null &
[2] 29928
> nc localhost 8080
POST /reverse HTTP/1.1
Content-Length: 12

Hello World!

HTTP/1.0 200 OK
X-Gearman-Job-Handle: H:lap:1
Content-Length: 12
Server: Gearman/0.8

!dlroW olleH

I’ve added a few headers for setting things like background, priority, and unique key. For example, if you want to run the above job in the background:

POST /reverse HTTP/1.1
Content-Length: 12
X-Gearman-Background: true

Hello World!

HTTP/1.0 200 OK
X-Gearman-Job-Handle: H:lap:2
Content-Length: 0
Server: Gearman/0.8

So what protocols are we looking at? HTTP and memcached were on the top of the list, but I’m guessing other folks may have better ideas or perhaps could use it for custom integration with their existing infrastructure. This is now tested in my development branch and will be pushed to trunk in the next couple days. If anyone is interested in working on the HTTP module, please hack away, patches are welcome! :) It may be interesting to map a worker interface in as well depending on headers, along with better support for client requests and HTTP error codes.
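
For anyone who would rather script the request than type it into nc, here is a rough sketch of building the same POST and reading the reply headers in Python. The only header names used are the ones shown in the transcripts above; the helper names build_request and parse_head are made up for illustration, and the CRLF line endings are an assumption (the nc session above used whatever the terminal sent).

```python
def build_request(function, payload, background=False):
    """Build the raw bytes of a job submission like the nc examples above."""
    lines = [
        "POST /%s HTTP/1.1" % function,
        "Content-Length: %d" % len(payload),
    ]
    if background:
        # Header taken from the background example in the post.
        lines.append("X-Gearman-Background: true")
    head = "\r\n".join(lines) + "\r\n\r\n"  # CRLF is an assumption
    return head.encode("ascii") + payload


def parse_head(head):
    """Split a response head into (status line, dict of lower-cased headers)."""
    lines = head.decode("ascii").split("\r\n")
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only: job handles such as H:lap:1 contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers
```

The bytes from build_request("reverse", b"Hello World!") can be written to a connection opened with socket.create_connection(("localhost", 8080)), assuming the job server was started with the HTTP module as shown earlier.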

Here is another quick example that shows how this can be useful. With the job server we started above still running, use the gearman command line client/worker to start up a worker that can do the function ‘proto’ and responds by dumping the file PROTOCOL (use any other file you have around):

> gearman -w -f proto cat PROTOCOL

If you’ve not used this command line tool before, -w makes the process act like a worker, -f function specifies which function the worker should register as, and everything after is executed every time a job is run (it fork()s, remaps stdin/out to pass the payload/read result, and then exec()s).
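
The "payload on stdin, result on stdout" contract is easy to try outside of Gearman. The sketch below is not the gearman tool's code; run_job is a hypothetical helper that uses Python's subprocess module to do the equivalent of the fork/remap/exec step for a single job.

```python
import subprocess
import sys


def run_job(command, payload):
    """Run command once for one job: payload goes to stdin, stdout is the result."""
    completed = subprocess.run(
        command,
        input=payload,           # becomes the child's stdin
        stdout=subprocess.PIPE,  # captured as the job result
        check=True,              # a non-zero exit raises CalledProcessError
    )
    return completed.stdout


# A stand-in for the reverse worker: a child process that reverses its input.
REVERSE_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read()[::-1])",
]
```

Any program that reads stdin and writes stdout can serve as the command, which is why something as plain as cat works as a worker in the example above.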

Now point your browser to http://localhost:8080/proto and you will see the contents of the file (assuming you are running all this on your local machine). This may not seem too useful, but now imagine more complex workers running on a distributed cluster. We now have a simple web server with distributed CGI scripts! :)

Log Buffer #150

Posted by: Sheeri Cabral, June 12, 2009 04:11 PM

This is the 150th edition of Log Buffer, the weekly review of database blogs. Someone accidentally left Dave Edwards‘ cage unlocked, and he escaped, thus leaving me with the pleasurable duty of compiling the 150th weekly Log Buffer.

Many people other than Dave are finding release this week. Giuseppe Maxia explains some details of MySQL’s New Release Model. Andrew Morgan announces a New MySQL Cluster Maintenance Release. Aleksandr Kuzminsky of the MySQL Performance Blog releases build16 of the Percona MySQL binaries (versions 5.0.77 and 5.0.82), which adds some 5.4 features and fixes some bugs.

Darran Cassar, the MySQL Preacher, has created a package for Security Roles and Password Expiry on MySQL. And for a future MySQL release, look for Two New Status Variable Patches, for query invalidation count and a last received datetime for replication heartbeat. These patches were contributed by MySQL Support Team member Andrew Hutchins.

Dave Beulke points out a new feature in DB2 9.7 — DB2 Compatible with Oracle.

To prepare for a future SQL Server release where CREATE DEFAULT, sp_bindefault and sp_unbindefault will be deprecated, Martin Bell advocates Changing Bound Defaults to Default Constraints. If you are going to upgrade SQL Server, definitely look at the notes from SQL Master of SQL Server QA’s presentation on SQL Server Upgrade Issues and How To Evaluate Potential Issues.

Stewart Smith lets us know that Drizzle Tarballs for the Next Milestone - Aloha are being released weekly. Meanwhile, Jay Pipes and the rest of the Drizzle team find, fix and explain the cause of a performance regression in Drizzle Performance Regression Solved - TCMalloc vs. No TCMalloc.

Lenz Grimmer has started organizing OpenSQLCamp 2009, Aug 22-23 in Germany; he posts details and links in Speaking at FrOSCon and Organizing the OpenSQLCamp 2009, European Edition. If you want to speak, Lenz also lets you know that the Call for Papers for the OpenSQLCamp 2009 is Now Open! In other conference news, Ronald Bradford gives out a discount code and reminds us that we can still attend OSCon 2009 at a Discounted Rate (until June 23rd). OSCon 2009 will be held July 20-24 in San Jose, California. And in Iowa, Michelle Ufford sends out the East Iowa SQL Saturday Call for Speakers to be held on Saturday, October 10, 2009.

Getting back to basics, Richard Foote explains Oracle’s cost-based optimizer in CBO and Indexes, an Introduction for Absolute Beginners. Speaking of optimizations, Valcora has Another Way To Do Performance Tuning — make sure you actually need the queries that are running against your system!

Tanel Poder points to a blog post on Using Perfsheet and TPT Scripts for Solving Real Life Performance Problems in an Oracle RAC environment. And Jonathan Lewis provides a script you can run if you are concerned about the potential of Oracle PGA leaks. Over at Oraclue, Miladin Modrakovic shows how to discover memory “leaks and other problems with allocations of memory” in Memory Annotations and Oradebug.

If you are migrating a database from Oracle to MySQL, you may be interested in George Trujillo’s process of Converting an Oracle Schema to MySQL.

Kimberly L. Tripp reveals a lot of information about how SQL Server optimizes queries and common myths when she reveals The Tipping Point Query Answers. David Fetter shares Materialized Views Performance Tips in Postgres, and Leo Hsu and Regina Obe talk about Planner Statistics in the Postgres optimizer.

In the land of DB2, Henrik Loeser shares a PureXML Performance Tip: A Sequence of Good Indexes.

Coskan explains How to Use Sysman Schema Without Oracle Enterprise Manager. John Hallas notes that using Oracle’s EM to migrate a database to ASM is easy, but seems slow, in ASM Metadata and Migrating a Database to ASM. He then goes on to share a coworker’s Script to Backup ASM Metadata. J. Arneil shows how to go about Fixing up ASM Disk Header Corruption, should you find yourself in a rough spot.

Aaron Alton has a great article telling us that in Defensive Programming, Assumptions Must be Guaranteed or Tested, and another one on handling tags efficiently in Full Text Search vs. Denormalized Tables. Remus Rusanu provides a Transact-SQL stored procedure template for Exception Handling and Nested Transactions.

I’ll end with a link to another survey on What’s the hardest part of becoming an involuntary DBA? It’s one simple question, so go fill it out! You have the time, especially since Craig Mullins points out that on average, we got a 4.6% salary increase in 2008 in Salaries for Data Professionals Inching Upward. To learn more and become even better in your field, get a 15-day free trial to Safari Books Online from O’Reilly, with a 15% discount if you continue past the free trial, courtesy of Susan Visser.

Problems compiling MySQL 5.4

Posted by: Ronald, June 11, 2009 07:05 PM

It seems that in the year Sun has had for improving MySQL, and even with an entirely new 5.4 branch, the development team could not fix the autoconf and compile dependency problems that have been in MySQL for all the years I’ve been compiling it. Drizzle has got it right, thanks to the great work of Monty Taylor.

I’m working on the Wafflegrid AWS EC2 AMIs for Matt Yonkovit, and while compiling 5.1 was straightforward under Ubuntu 8.10 Intrepid, compiling 5.4 was more complicated.

For MySQL 5.1 I needed only to do the following:

apt-get install -y build-essential
apt-get install libncurses5-dev
./configure
make
make install

For MySQL 5.4, I elected to use the BUILD scripts (based on Wafflegrid recommendations). I didn’t get far before I needed:

apt-get install -y automake libtool

You then have to compile MySQL 5.4 for 10+ minutes only to get an obscure error, and then work out which dependencies may be missing.
I don’t like to do a blanket apt-get of a long list of proposed packages unless I know they are actually needed.

The error was:

make[1]: Entering directory `/src/mysql-5.4.0-beta/sql'
make[1]: warning: -jN forced in submake: disabling jobserver mode.
/bin/bash ../ylwrap sql_yacc.yy y.tab.c sql_yacc.cc y.tab.h sql_yacc.h y.output sql_yacc.output -- -d --verbose
make -j 6 gen_lex_hash
make[2]: Entering directory `/src/mysql-5.4.0-beta/sql'
rm -f mini_client_errors.c
/bin/ln -s ../libmysql/errmsg.c mini_client_errors.c
make[2]: warning: -jN forced in submake: disabling jobserver mode.
rm -f pack.c
../ylwrap: line 111: -d: command not found
/bin/ln -s ../sql-common/pack.c pack.c
....
make[1]: Leaving directory `/src/mysql-5.4.0-beta/sql'
make: *** [all-recursive] Error 1

What a lovely error: ../ylwrap: line 111: -d: command not found

ylwrap is a wrapper script around yacc, and by default in this instance yacc is not even an installed package. I’ve compiled MySQL long enough to know that it requires yacc (actually bison), but do you think it would hurt if configure told the user this?
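
Until configure does that check itself, a few lines run before the build can save the ten minutes. This is only a sketch: the tool list below is my guess based on the packages mentioned in this post (plus bison), not an authoritative list of MySQL 5.4 build requirements, and missing_tools is a hypothetical helper name.

```python
import shutil

# Assumed list for illustration, derived from the packages named above.
REQUIRED_TOOLS = ["gcc", "make", "automake", "libtool", "bison"]


def missing_tools(tools, lookup=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if lookup(tool) is None]


def report(tools=REQUIRED_TOOLS):
    """Build a one-line summary that can be printed before starting the build."""
    missing = missing_tools(tools)
    if missing:
        return "missing build tools: " + ", ".join(missing)
    return "all build tools found"
```

The lookup parameter exists only so the check can be exercised without touching the real PATH.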

It’s also been some time since I’ve compiled the MySQL source, having focused on Drizzle instead. I had forgotten just how many compile warnings MySQL throws. Granted, a warning is not an error, but you should not just ignore them when building a quality product.

Drizzle’s Regression Issue Discovered (a.k.a. Eric [Saves The] Day!)

Posted by: mike, June 11, 2009 05:48 PM

I won’t re-post what has already been said on the other blogs, but this seemed like a great central place to share the news.

After a couple months of trying to hunt down the mysterious little gnome causing headache to the team, it seems Eric stumbled upon the solution, probably in the shower, where all good ideas come to us. Take a look at his blog post for all the sexy statistics (last link) – I assume he did all the work after the shower, of course. Safety first, Eric!

Links:

Jay, along with help from many of you, has done a great job automating a lot of our tools to provide Drizzle metrics so we can get a handle on how we are doing. You’ve seen the regular emails that now report sysbench numbers for each build; we also have regular automated builds to generate data for valgrind, lcov, doxygen and sloc. Oh, and let’s not forget OpenGrok, which was set up by Trond.

If you look at lcov we still have a few rough edges to work out and actually could use some help if any of you are familiar with the inner workings of lcov.

The sloc graph below is showing data going back to March. Nice steady progress downwards which in this case is great! The big drop around build 960 was due to plugin clean up work by Monty and the slight increase around build 984 was from protocol work from Eric (all necessary and important!). Now if we could get one of you to help generate a nice dynamic web page to display this information……any takers?

drizzle-lines-of-code
See here for a pdf of the graph

Compiling Drizzle on OpenSolaris 2009.06

Posted by: trond, June 11, 2009 10:58 AM

I thought it would be appropriate to write a new and updated blog post on how to compile Drizzle with the release of OpenSolaris 2009.06. To make the blog more copy'n'paste friendly I have removed the prompt from all of the commands I am displaying :-)

The first thing we need to do is to install a compiler, and all of the common tools used to build open source projects. Drizzle also requires libevent and gperf, and precompiled packages exist for them. So let's go ahead and install the software with the following command:

   pfexec pkg install ss-dev SUNWlibevent SUNWgnu-gperf

I like to put the software I compile in separate ZFS filesystems, so let's go ahead and create:

  • /opt/dscm - To hold the scm systems
  • /opt/drizzle - This is where we want our Drizzle installation
  • /opt/gearman - This is where we want our Gearman installation
  • /opt/google - This is where we want our Google Protocol Buffers installation

"Why not just put everything in /usr/local?" you may ask. Well, I don't like that because then I have a hard time figuring what files to remove when I want to uninstall a package. "This must turn into a long and complex path?" would probably be your next question. The answer is no. Just create the appropriate symbolic links and you are good to go :-)

So let's go ahead and create the ZFS filesystems:

for f in dscm drizzle gearman google
do
   pfexec zfs create -o mountpoint=/opt/$f rpool/$f
   pfexec chown `/usr/bin/id -u`:`/usr/bin/id -g` /opt/$f
done
  

Drizzle, Gearman and libmemcached all use Bazaar for development, and there isn't a package available for OpenSolaris, so we need to install it ourselves. The Bazaar team is really active and uses the "release early, release often" model, and I want an easy way to keep up with the versions. Instead of having zombie files / versions lying around, I ended up with a model where I install each version into its own directory, and I have a symbolic link to the version I want to use. Because we install in a "nonstandard" location, we need to create a startup script so that Python can find the modules. So let's go ahead and install Bazaar (1.15 is the latest stable version right now):

wget --no-check-certificate http://launchpad.net/bzr/1.15/1.15final/+download/bzr-1.15.tar.gz
gtar xfz bzr-1.15.tar.gz
cd bzr-1.15
python setup.py install --prefix=/opt/dscm/bazaar-1.15
mkdir /opt/dscm/bin
cat > /opt/dscm/bin/bzr <<EOF
#! /bin/ksh
export PYTHONPATH=/opt/dscm/bazaar/lib/python2.4/site-packages
exec /opt/dscm/bazaar/bin/bzr "\$@"
EOF
chmod a+x /opt/dscm/bin/bzr
ln -s bazaar-1.15 /opt/dscm/bazaar
cd ..
rm -rf bzr-1.15.tar.gz bzr-1.15

The next time you want to upgrade Bazaar, all you need to do is to move the symbolic link /opt/dscm/bazaar to point to the new version. You can now either put /opt/dscm/bin into your path, or you can create something like /opt/local/bin and create a symbolic link to /opt/dscm/bin/bzr from there (and then put /opt/local/bin in your path). To avoid path problems, I'll keep on referring to bzr with its absolute path throughout the example.
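
Moving the link can be done in one step, so there is never a moment where /opt/dscm/bazaar points nowhere. The sketch below shows one way to do that from Python: create the new link under a temporary name and rename it over the old one. The function name switch_version and the directory name bazaar-1.16 are hypothetical, and the demo runs in a scratch directory rather than /opt/dscm.

```python
import os
import tempfile


def switch_version(link_path, target):
    """Point the symbolic link at target by renaming a fresh link over the old one."""
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)          # clear a leftover from an interrupted run
    os.symlink(target, tmp)     # a relative target, like "ln -s bazaar-1.15 ..."
    os.replace(tmp, link_path)  # rename replaces the old link in a single step


# Demo in a throwaway directory standing in for /opt/dscm.
with tempfile.TemporaryDirectory() as root:
    for version in ("bazaar-1.15", "bazaar-1.16"):  # 1.16 is a made-up upgrade
        os.mkdir(os.path.join(root, version))
    link = os.path.join(root, "bazaar")

    switch_version(link, "bazaar-1.15")
    first_target = os.readlink(link)

    switch_version(link, "bazaar-1.16")
    second_target = os.readlink(link)
```

The wrapper script in /opt/dscm/bin/bzr keeps working across the switch because it only ever refers to the link, never to a versioned directory.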

For some reason OpenSolaris doesn't contain a prebuilt 64-bit version of GNU readline, so we need to compile it ourselves (it is scheduled for an upcoming build AFAIK). To keep the example simple, I'll just install the readline library into /opt/drizzle. So just execute the following commands to download, build and install:

wget http://ftp.gnu.org/gnu/readline/readline-6.0.tar.gz
gtar xfz readline-6.0.tar.gz
cd readline-6.0
./configure --disable-static --prefix=/opt/drizzle 
gmake all install
gmake clean
./configure --disable-static --prefix=/opt/drizzle --libdir=/opt/drizzle/lib/`isainfo -k` CFLAGS="-m64"
gmake all install
ln -s `isainfo -k` /opt/drizzle/lib/64
ln -s . /opt/drizzle/lib/32
cd ..
rm -rf readline-6.0.tar.gz readline-6.0

"Stop! Why do you build it two times?" If you look at the options, you will see that I compile one version with "-m64", and that option creates a 64-bit binary. Most people would probably not care about the 32-bit binary, but I like to build both versions when I build a library, so that I don't have problems later on if I want to build a 32-bit (or 64-bit) binary using the library. The reason for the two symbolic links I create at the end is explained in the chapter 32-bit and 64-bit Libraries.

Drizzle uses Google Protocol Buffers in the communication protocol, so let's go ahead and compile them. I don't use the latest version, because there is a compilation error in that version (and I haven't had the time to look at that yet):

wget http://protobuf.googlecode.com/files/protobuf-2.0.3.tar.gz
gtar xfz protobuf-2.0.3.tar.gz
cd protobuf-2.0.3
./configure --disable-static --with-zlib --prefix=/opt/google CPPFLAGS="-fast -m32" LDFLAGS="-fast" \
            --bindir=/opt/google/bin/i86
gmake all install
gmake clean
./configure --disable-static --with-zlib --prefix=/opt/google CPPFLAGS="-fast -m64" LDFLAGS="-fast -m64" \
            --libdir=/opt/google/lib/`isainfo -k` --bindir=/opt/google/bin/`isainfo -k`
gmake all install
cd ..
ln -s `isainfo -k` /opt/google/lib/64
ln -s . /opt/google/lib/32
cp /usr/lib/isaexec /opt/google/bin/protoc
rm -rf protobuf-2.0.3.tar.gz protobuf-2.0.3

With all the dependencies installed, we can go ahead and grab the source for libmemcached, libdrizzle, Gearman and Drizzle:

for f in libdrizzle gearmand libmemcached drizzle 
do
   /opt/dscm/bin/bzr branch lp:$f
done

So let's go ahead and start building them. libdrizzle is first up:

cd libdrizzle
./config/autorun.sh
./configure --disable-static --prefix=/opt/drizzle CFLAGS="-fast -m32" LDFLAGS="-fast"
gmake all install
./configure --disable-static --prefix=/opt/drizzle --libdir=/opt/drizzle/lib/`isainfo -k` CFLAGS="-fast -m64" LDFLAGS="-fast"
gmake clean
gmake all install
cd ..

The next one on the list is libmemcached:

cd libmemcached
./config/bootstrap
PATH=$PATH:/usr/perl5/bin ./configure --disable-static --prefix=/opt/drizzle CFLAGS="-fast -m32" LDFLAGS="-fast" \
    --without-memcached --bindir=/opt/drizzle/bin/i86
gmake all install
PATH=$PATH:/usr/perl5/bin ./configure --enable-64bit --disable-static --prefix=/opt/drizzle \
    --libdir=/opt/drizzle/lib/`isainfo -k` CFLAGS="-fast" LDFLAGS="-fast" --without-memcached --bindir=/opt/drizzle/bin/`isainfo -k`
gmake clean
gmake all install
for f in memcat memrm memcp memerror memflush memslap memstat
do
cp /usr/lib/isaexec /opt/drizzle/bin/$f
done
cd ..

There is a problem with the configure script for Gearman, so it is not able to create a 32 bit binary on a machine capable of running in 64 bit mode, so from now on we will only create 64 bit binaries (I will work on a patch for this):

cd gearmand
./config/bootstrap
./configure --prefix=/opt/gearman --disable-static --sbindir=/opt/gearman/sbin/`isainfo -k` --libdir=/opt/gearman/lib/`isainfo -k` \
            --bindir=/opt/gearman/bin/`isainfo -k` CFLAGS="-fast -I/opt/drizzle/include -m64" \
            LDFLAGS="-L/opt/drizzle/lib/64 -R/opt/drizzle/lib/64"
gmake clean
gmake all install
cd ..
cp /usr/lib/isaexec /opt/gearman/sbin/gearmand
cp /usr/lib/isaexec /opt/gearman/bin/gearman

Before we can start compiling Drizzle we need to make sure that Drizzle can detect our PCRE installation. OpenSolaris ships with a version that is too new for the Drizzle configure script, so we need to create a symbolic link to make sure it is detected properly:

pfexec ln -s pcre/pcre.h /usr/include/pcre.h

Now all is set for compiling Drizzle:

cd drizzle
PATH=$PATH:/opt/dscm/bin ./config/autorun.sh
PATH=$PATH:/opt/google/bin ./configure CPPFLAGS="-I/opt/google/include -I/opt/gearman/include -I/opt/drizzle/include" \
   LDFLAGS="-L/opt/google/lib/64 -L/opt/gearman/lib/64 -L/opt/drizzle/lib/64 -R/opt/drizzle/lib/64:/opt/gearman/lib/64:/opt/google/lib/64" \
   --prefix=/opt/drizzle --libdir=/opt/drizzle/lib/`isainfo -k` 
PATH=$PATH:/opt/google/bin gmake all install

Now you should have Drizzle installed in /opt/drizzle. If you look in some of my previous blog posts you should be able to find out how to install it as an SMF service :-)

Cheers

Regression, The Bummer

June 10, 2009 07:14 PM

Wahoo!

We finally found the regression problem in Drizzle that we have been looking for over the last couple of months.

In the process of doing this we have walked every line of code. I sat the other night doing a single step through the entire sysbench run looking for anything out of place. Nothing came up at all.

Eric was the person who finally asked the question "could it be tcmalloc?" No one had suspected it because it typically is a good solution (we will be looking into why it turned out to be at fault, and we will probably now push more aggressively to remove the MEMROOT system we inherited, since we suspect it, and it doesn't play well with C++).

We have not been able to push any patches in the last couple of months that really fixed other performance issues that we know exist.

Why?

Because we feared complicating the problem of finding the original problem. We have all spent time looking through our ancestors to see if there was something we missed.

1) Could it be C++?
2) Could reducing the number of locks be creating a traffic jam around a single lock?
3) Was UTF-8 at fault?

In the end it was none of these :)

So for us?

We have patches coming soon to optimize the UTF-8 system, to minimize LOCK_open, to optimize/simplify the THR lock system, and to partition caches internally.
chart.png

Here is a bit of code I worked up recently for Drizzle:

drizzle> DELIMITER |
Note that there is no semicolon after the '|' symbol, which we will use as the delimiter for our purposes. You have to choose a delimiter that does not appear in your procedure, and it can be more than one character.

drizzle> CREATE PROCEDURE perl_hello (param1 string)
-> return "Hello " . $_[0] . "!"
-> |
Query OK, 0 rows affected (0.05 sec)

drizzle> CALL perl_hello('Brian');
-> |
Query OK, 1 row affected (0.00 sec)

drizzle> DELIMITER ;
drizzle> SELECT @perl\G
*************************** 1. row ***************************
@perl: Hello Brian!
1 row in set (0.00 sec)


Stored Procedures!?!

In an actual language!?!

About a week ago I was talking to the CTO of a company that is looking at adopting Drizzle. One of the things he came back with was "I don't need stored procedures, but I do need server side scripting".

Back at the very first MySQL User's Conference we had a debate over the future of stored procedures in MySQL. I and some others really wanted the first stored procedure language to be external; David really wanted it to be PHP. I didn't see the value in implementing a single language. I thought people would be more interested in writing code in whatever language they wanted. Also, I figured that an external system would allow for different groups to develop languages more rapidly.

Fast forward to when we began Drizzle. Parsers are where you spend a lot of your time. The smaller the parser the better off you are. So I set about removing all of the signs of the SP language from Drizzle. We have been free of them for over a year now (yes, long before we went public). Things are finally shaping up so that when we begin on Bell, our next milestone, stored procedures, or something like them, will be on our list.

Though are they stored procedures, or is this server-side scripting?

A few premises of the design:

  • Any language should be pluggable. We won't have a native language.
  • We only support re-entrant engines. This solves all of the pre-locking problems that exist currently. We haven't removed 40% of the locks in Drizzle, only to have to come up with a bunch of new ones to support engines which were never built to handle this stuff.
  • While it won't be required, we will focus first on enabling scripting languages that are not in-process. Why is that? We don't want anyone to crash the database. Informix had this problem early on and got a bad rep for it. We want to avoid this.
  • We will enable driver writers to be able to communicate in a native way. That is, if you are writing something in Java, you will be able to use a JDBC interface inside of the database; for Perl, DBI; and so on. I want to be able to test my SPs in any environment. The difference between running inside of the database and outside of it should be trivial or non-existent.

    I am a little bit torn about using the SP call/creation SQL commands in Drizzle. You won't be doing the typical SP language (well... unless someone wants to write a plugin for them!). I would also like to encourage people to think differently about what writing server side code should look like. Personally I don't feel that stored procedures are the right solution for a lot of the cases (keep your business logic in your application layer!), but we also know that users expect to be able to run code locally. Triggering/Callback mechanisms can be very useful though, and enabling them is a part of this. Doing Triggers today in C is simple, but that is not something that everyone should/would/could want to do.

    Putting this in the plugin structure means no overhead to the parser or the rest of the database. Keeping them out of process means no drain or memory expansion of the Database. SMP boxes will benefit because you can confine the language VM to a particular set of processors/amount of memory.

    We don't want the database to ever blow up because of bugs in the execution language!

    And if you never want them? You never load the plugin in the first place.

    Why Perl? I've embedded Perl for years and know how to make it work. I've only done Java once, so I will leave that to other experts.

    I suspect I can find a Java person somewhere inside of Sun :)
    Drizzle Performance Regression Solved - TCMalloc vs. No TCMalloc

    Posted by: Jay Pipes, June 10, 2009 05:03 PM

    As many of you who follow the drizzle-discuss mailing list know, for the last three months, Drizzle developers have been hot (and many times cold) on the tail of a performance problem that we were seeing when comparing Drizzle with MySQL (any version of MySQL, not just 5.4).

    Briefly, on certain machines, we were seeing Drizzle performing at approximately 50% of MySQL, with throughput measured in transactions per second on both a readonly and readwrite workload with Sysbench. The frustrating part of the results was that on other machines, even other machines with virtually identical architectures, compilers, and operating system, we were seeing Drizzle outperforming MySQL by around 20-30%.

    So, the Sun Drizzle team, friends in the Sun PAE (performance applications engineering?) team and various contributors set out to iteratively analyze exactly what was going on during sysbench runs in both MySQL and Drizzle. We analyzed the call stacks of both servers using callgrind, cachegrind, vmstat, and other tools, looking for differences which could explain the differences in performance.

    These efforts ultimately showed either red herrings, or showed results which did not indicate where such a dramatic performance difference was coming from. The efforts did, however, lead us to a much better understanding of the calling patterns, lock contention, and other parts of the servers, and this work will, I am sure, prove very valuable in the coming months as we continue to refine the execution of the server and remove more global contention points.

    So, what was the eventual culprit? Turns out that Drizzle was, by default, linking the TCMalloc library from Google (libgoogle-perftools-dev package on Debian systems) when it was found installed on the machine. This makes sense. We thought that the TCMalloc library would provide a benefit to our Session-mem_root-based allocation strategy adopted from the MySQL core kernel. Unfortunately, it turns out that TCMalloc dramatically degrades the throughput of the server. When we disabled tcmalloc in our build and re-ran the benchmarks, our numbers went through the roof.

    The machine these results are from has the following specifications:

    • 16 cores: four quad-core Intel Xeon processors
    • 4M L2 cache size
    • 32GB RAM - though this is irrelevant as the benchmark was fully in-memory for a 1M row table and 256M innodb buffer pool...
    • Ubuntu 8.10 OS - Linux kernel: 2.6.27-11-server x86_64
    • GCC 4.3.2
    • Disk system: software RAID 5 across three 146GB 15K RPM disks - though this is irrelevant as the benchmark was fully in-memory for a 1M row table and 256M innodb buffer pool...

    The benchmarks were run for a minimum of 10 iterations of 60 seconds each on a readonly and readwrite workload with 1M rows in the table. The configuration was:

    • --innodb_buffer_pool_size=256M
    • --innodb_log_file_size=128M
    • --innodb_log_buffer_size=8M
    • --innodb_additional_mem_pool_size=16M
    • --innodb_thread_concurrency=0

    The Drizzle Automation Suite was used for benchmarking.

    Below, you'll see the before and afters of disabling TCMalloc on both a readonly and readwrite workload. We think you'll agree the results are, well, dramatic.

    READONLY workload

    With TCMalloc linked:

    +------+-------------+
    | c    | tps         |
    +------+-------------+
    |    2 | 1067.444444 | 
    |    4 | 1435.190000 | 
    |    8 | 1937.624444 | 
    |   16 | 2601.817778 | 
    |   32 | 3367.795556 | 
    |   64 | 3930.240000 | 
    |  128 | 3940.764444 | 
    |  256 | 3071.503333 | 
    |  512 | 2003.308889 | 
    | 1024 | 1224.704444 | 
    | 2048 |  530.994444 | 
    +------+-------------+
    

    Without TCMalloc linked:

    +------+-------------+
    | c    | tps         |
    +------+-------------+
    |    2 | 1511.681000 | 
    |    4 | 2714.570000 | 
    |    8 | 4408.986000 | 
    |   16 | 5795.430000 | 
    |   32 | 5619.712000 | 
    |   64 | 4988.760000 | 
    |  128 | 4483.512000 | 
    |  256 | 3914.125000 | 
    |  512 | 2541.946000 | 
    | 1024 | 1325.511000 | 
    | 2048 |  643.446000 | 
    +------+-------------+
    

    And for you pretty graph people:

    Google Chart

    READWRITE workload

    With TCMalloc linked:

    +------+-------------+
    | c    | tps         |
    +------+-------------+
    |    2 |  589.006667 | 
    |    4 |  842.576667 | 
    |    8 | 1142.627778 | 
    |   16 | 1561.532222 | 
    |   32 | 2160.194444 | 
    |   64 | 2169.077778 | 
    |  128 | 1793.243333 | 
    |  256 | 1241.846667 | 
    |  512 |  860.450000 | 
    | 1024 |  491.360000 | 
    +------+-------------+
    

    Without TCMalloc linked:

    +------+-------------+
    | c    | tps         |
    +------+-------------+
    |    2 |  676.238000 | 
    |    4 | 1096.708000 | 
    |    8 | 1661.204000 | 
    |   16 | 2210.335000 | 
    |   32 | 2353.749000 | 
    |   64 | 2202.926000 | 
    |  128 | 2087.273000 | 
    |  256 | 1717.978000 | 
    |  512 | 1361.468000 | 
    | 1024 | 1000.169000 | 
    | 2048 |  288.299000 | 
    +------+-------------+
    

    Again, pretty graph:

    Google Chart

    Note: With TCMalloc linked, the readwrite workload would not complete at 2048 connections. Without TCMalloc linked, it did complete, although with a significant reduction in throughput. This is likely because of the known InnoDB issue regarding 1024 active transactions limit...

    Open Invitation to Benchmark Drizzle, MySQL, XtraDB and PostgreSQL

    The above results compare only Drizzle to itself with and without TCMalloc. We also benchmarked against MySQL 5.4, but as we've previously stated, we don't think comparison numbers should be published unless by a third, objective party. This serves as an open invitation to benchmark Drizzle against MySQL 5.4, XtraDB, PostgreSQL 8.4 and anything else. We'd love to see a validation that our principles of smaller, cleaner code with fewer global contention points, using standard libraries and having features live in a module ecosystem truly do enable a faster, leaner query-running machine. By the same token, if published benchmarks identify cases where Drizzle underperforms compared to another RDBMS, we'd love to tackle the performance problems the benchmarks show. The more data, the better :-)

    Drizzle Regression Hunting

    Posted by: Eric Day, June 09, 2009 10:32 PM

    We’ve been looking for a Drizzle regression for some time now, and today I decided I would take a step back and make another attempt to find it. The first step in doing this was to reproduce this consistently and find a baseline. We’ve noticed it most dramatically with a 16 concurrent connection test from sysbench in read-only mode. I used two 16-core Intel machines running Linux we have for development. We’ve noticed the regression on certain machines but not all, and these two machines provided one of each. I also set up a MySQL 5.1.35 server to use as a baseline to give some comparisons outside of Drizzle. So first, a few more details on the machines:

    Machine 1: 16 core, 16GB RAM, cache sizes from dmesg:
    [    0.010000] CPU: L1 I cache: 32K, L1 D cache: 32K
    [    0.010000] CPU: L2 cache: 4096K
    From /proc/cpuinfo:
    cache_alignment : 64
    
    Machine 2: 16 core, 40GB RAM, cache sizes from dmesg:
    [    0.010000] CPU: Trace cache: 12K uops, L1 D cache: 16K
    [    0.010000] CPU: L2 cache: 1024K
    [    0.010000] CPU: L3 cache: 16384K
    From /proc/cpuinfo:
    cache_alignment : 128
    

    For Drizzle I used the latest trunk in Launchpad (r1058), and for MySQL I downloaded mysql-5.1.35-linux-x86_64-glibc23.tar.gz from mysql.com.

    For sysbench, I grabbed the Drizzle branch at lp:~drizzle-developers/sysbench/trunk since it has the libdrizzle driver. The libdrizzle driver also supports the MySQL protocol, so I used it to test against both. The sysbench commands I used were:

    Drizzle: sysbench --test=oltp --oltp-read-only=on --max-time=15 --max-requests=0 --oltp-table-size=1000000 --num-threads=16 --db-ps-mode=disable --db-driver=drizzle --drizzle-host=127.0.0.1 --drizzle-port=4427 --drizzle-db=test --drizzle-user=root --drizzle-table-engine=innodb run

    MySQL: sysbench --test=oltp --oltp-read-only=on --max-time=15 --max-requests=0 --oltp-table-size=1000000 --num-threads=16 --db-ps-mode=disable --db-driver=drizzle --drizzle-host=127.0.0.1 --drizzle-port=3306 --drizzle-db=test --drizzle-user=root --drizzle-table-engine=innodb --drizzle-mysql=on run

    I started Drizzle and MySQL with the following options. These are not meant to be finely tuned options, but just enough to get the servers running with some sane comparable defaults and able to reproduce the regression.

    bin/mysqld --no-defaults --server-id=1 --port=3306 --socket=/home/eday/other/mysql.data/sock.master --basedir=/home/eday/other/mysql --datadir=/home/eday/other/mysql.data/db.master --log-error=/home/eday/other/mysql.data/db.master/error --innodb-buffer-pool-size=128M --innodb-log-file-size=64M --innodb-log-buffer-size=8M --innodb-thread-concurrency=0 --innodb-additional-mem-pool-size=16M --character-set-server=utf8 --table-open-cache=4096 --open-files-limit=4096 --pid-file=/home/eday/other/mysql.data/db.master/pid

    drizzled --datadir=/home/eday/other/drizzle.data --innodb-buffer-pool-size=128M --innodb-log-file-size=64M --innodb-log-buffer-size=8M --innodb-thread-concurrency=0 --innodb-additional-mem-pool-size=16M --table-open-cache=4096 --table-definition-cache=4096

    Now with everything up and running, I gathered some data. Column headings are <machine>-<server>:

                        1-drizzle    1-mysql  2-drizzle    2-mysql
    TPS                  1335       2434       1559       1239
    vmstat
      in                   6k       110k        60k        50k
      cs                 100k       210k       120k       100k
      us                   22         75         72         78
      sy                    6         20         25         18
      id                   72          5          3          4
      wa                    0          0          0          0
    valgrind with cachegrind tool
      TPS                5.21       3.15       3.55       1.96
      iref               858M      1011M       668M       789M

    As you can see, we hit the major regression in column one. Our interrupts and context switches are way out of line, and the CPU is mostly idle. Note though that when run under cachegrind (valgrind --tool=callgrind), we see the normal pattern and don’t notice the regression. This means that to reproduce it we can’t use any intrusive debugging tools. I also tried counting system calls as a sanity check and found (using strace -fc):

    1-drizzle: 402 TPS
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     55.47  106.807940        2867     37252     11127 futex
     13.97   26.907878      220556       122           select
      5.67   10.913999    10913999         1           rt_sigtimedwait
      5.29   10.181982      565666        18           poll
      2.16    3.1154314          11    362463           read
      1.63    2.1146381        1479      2128           pread
      1.55    2.981563          16    181220           write
      0.67    0.1297864         122     10598           sched_yield
      0.67    0.1288412        1394       924           nanosleep
    
    1-mysql: 245 TPS
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     60.03  108.1509367        4594     23836      5835 futex
     15.42   27.1119721      100788       279         1 select
      5.25    8.1579991     1064443         9           rt_sigtimedwait
      1.65    2.1005637         637      4716           pread
      1.30    1.1367183          21    110986           write
      1.07    1.950471           9    221889    221889 sched_setscheduler
      1.06    1.929137           9    223097      1044 read
    
    2-drizzle: 276 TPS
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     65.93  115.556211        3704     31198      8101 futex
     15.30   26.810000      203106       132           select
      5.89   10.330000    10330000         1           rt_sigtimedwait
      5.80   10.170000      565000        18           poll
      2.65    4.647123          37    125060           write
      2.35    4.116470          16    250158           read
      1.72    3.021068        1483      2037           pread
      0.27    0.477662       20768        23           fsync
      0.04    0.072760           4     18218           madvise
      0.02    0.043264          18      2357           sched_yield
    
    2-mysql: 168 TPS
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     68.25  100.851565        6704     15044      3328 futex
     21.22   31.350000      116111       270         1 select
      5.17    7.642591      347391        22           rt_sigtimedwait
      1.93    2.853778         619      4612           pread
      1.15    1.694888          11    153040       346 read
      0.90    1.327743           9    152529    152529 sched_setscheduler
      0.78    1.145964          15     76305           write
      0.32    0.468542          37     12705           madvise
    

    Again, when tracing the process to count system calls, we see the regression disappear, which leaves us with a smaller set of tools to use.

    So what next? One theory we were tossing around is a cache alignment issue. This seems like a pretty dramatic drop in performance to be caused by that, but I ran a test to see how a process behaves when it is being throttled by a shared cache line. The results showed idle CPU, but the interrupts and context switches did not drop off. This does not follow the same pattern we saw in Drizzle (interrupts and context switches did drop off). Our cache line size is also smaller on the machine showing the regression, so that did not help support this theory.

    While stabbing at a few other ideas, I ran ldd to see which libraries were being used by the drizzled binary on each machine. Suppressing some common libs:

    1-drizzle: ldd drizzled/drizzled
            ...
            libpcre.so.3 => /lib/libpcre.so.3 (0x00007f9a88532000)
            libtbb.so.2 => /usr/lib/libtbb.so.2 (0x00007f9a88320000)
            libtcmalloc.so.0 => /usr/lib/libtcmalloc.so.0 (0x00007fcced5bc000)
            ...
    
    2-drizzle: ldd drizzled/drizzled
            ...
            libpcre.so.3 => /lib/libpcre.so.3 (0x00007fc0af803000)
            libtbb.so.2 => /usr/lib/libtbb.so.2 (0x00007fc0af5e9000)
            ...
    

    The machine showing the regression is linking with tcmalloc. Looking at the drizzle configure.ac, we use libtcmalloc by default if it is found (machine 2 does not have tcmalloc installed). I relinked drizzled without tcmalloc and received these results:

                        1-drizzle    1-mysql  2-drizzle    2-mysql
    TPS                  2751       2434       1559       1239

    There it is! For some reason tcmalloc was giving us a 51% performance drop. Perhaps this is due to the tcmalloc version or settings we need to tweak for performance (something to look into later), but for now disabling it by default is the solution. We’re verifying the fix now and it should be in the Drizzle trunk shortly.
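
    As a quick check of that 51% figure, here is the arithmetic on the two machine 1 numbers from the tables above (1335 TPS with tcmalloc, 2751 TPS without). The class and method names are made up for illustration.

    ```java
    // Worked arithmetic for the relink result, using the TPS figures quoted above.
    class TcmallocDrop {
        // Percentage by which `slower` falls short of `faster`.
        static double percentDrop(double faster, double slower) {
            return (faster - slower) / faster * 100.0;
        }

        public static void main(String[] args) {
            double withTcmalloc = 1335;    // 1-drizzle TPS, linked with tcmalloc
            double withoutTcmalloc = 2751; // 1-drizzle TPS, relinked without tcmalloc
            System.out.printf("drop: %.0f%%%n", percentDrop(withoutTcmalloc, withTcmalloc));
            // prints: drop: 51%
        }
    }
    ```

    The drop is measured relative to the faster (non-tcmalloc) build, which is the reading that matches the 51% quoted above.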

    Drizzle pluggable MetadataStore (or: no table definition file on disk)

    Posted by: MacPlusG3, June 09, 2009 03:25 AM

    My code is shaping up rather nicely (see https://code.launchpad.net/~stewart-flamingspork/drizzle/discovery) and I’m planning to submit a merge-request for it later today.

    I’m about to commit code that implements a MetadataStore for the ARCHIVE engine. This means that for ARCHIVE tables, you only have the .ARZ file on disk. The table definition protobuf is stored in the ARZ during createTable() and the ARCHIVE MetadataStore can read it.

    The StorageEngine now gets the drizzled::message::Table (i.e. the table definition protobuf) as a parameter. Eventually, we will fully be using this to tell the engine about the table structure (this is a work-in-progress). The advantages of using the proto as the standard way of passing around table definitions are numerous. I see it as almost essential to get this into the replication log for cross-DBMS replication.

    We still have the default way of storing table metadata: in a table definition file (for MySQL it’s the FRM, for Drizzle it’s the table proto serialized into a file ending in ‘.dfe’). However, in my discovery branch, if an engine provides its own MetadataStore, then it is the StorageEngine that is responsible for storing the table definition (either in its data file or data dictionary). It is also then responsible for making sure rename works and that the definition is cleaned up on drop table.

    The MetadataStore provided by the StorageEngine is also used when searching for metadata such as for SHOW CREATE TABLE, SHOW TABLES, INFORMATION_SCHEMA, CREATE LIKE and when getting the table definition before opening the table.

    The way the ARCHIVE MetadataStore works is that it reads the table proto out of the header of the ARZ file when asked for it. This has the side effect of now being able to copy ARZ files between servers and have it “just work”.

    It will be really nice if we directly interface to the InnoDB Data Dictionary (or even just store the table protos in an InnoDB table manipulated in the same transaction as the DDL) as then we move a lot closer to closing a number of places where we (and MySQL) are not crash-safe.

    Wanting a quick build-and-play way to get Drizzle? We’re dropping weekly-ish tarballs for the Aloha milestone. The latest milestone also has preliminary GCC 4.4 support.

    You can see regular announcements on:

    Drizzle source tarball 1055 has been released

    Posted by: lbieber, June 08, 2009 08:49 PM

    Drizzle source tarball based on build 1055 has now been released. The change log can be viewed at https://launchpad.net/drizzle/trunk/aloha.

    For this release we continue to focus on code clean up, build improvements, increased test coverage and performance improvements. We also removed LOCK TABLES, BIT_COUNT and BIT_LENGTH, made several logging improvements, started the first several phases of refactoring JOIN, and provided initial support for gcc 4.4, although for now you still need Monty’s patch for protobuf to use gcc 4.4.

    -Lee

    OSCON 2009 at a discounted rate

    Posted by: Ronald, June 05, 2009 09:33 PM

    OSCON moves this year from Portland to San Jose.

    As one of the community panel for Drizzle: Status, Principles, and Ecosystem, I also have a speaker discount, which you can combine with O’Reilly's extended early bird registration, open until June 23.

    Be sure to add the os09fos code for an additional 20% off, and be sure to shout me a drink there.

    Pluggable Batch Update Handlers

    Posted by: Marcus Eriksson (noreply@blogger.com), June 04, 2009 08:07 PM

    Reading Mark Matthews's awesome blog post on batch insert performance last week (http://www.jroller.com/mmatthews/entry/speeding_up_batch_inserts_for) got me thinking: why has this not been done before? Connector/J must be the most deployed JDBC driver in the world and batch inserts are a common use case, so why hasn't the community stepped up and implemented the query rewrite feature before? Most likely because it is a complex issue that requires deep knowledge of the rest of the driver. I have been a happy Connector/J user myself for several years and never considered doing something like this.


    To handle this complexity in drizzle-jdbc the batch query functionality is pluggable, i.e. you can implement a small interface and tell the connection to use that implementation. So, if anyone out there has some crazy ideas about how to improve performance of batch inserts/updates, it should be fairly easy.


    First you need to implement the ParameterizedBatchHandler interface. It has two methods:

    void addToBatch(ParameterizedQuery query);
    int[] executeBatch(Protocol protocol) throws QueryException;


    • addToBatch is called when addBatch() is called on the PreparedStatement, that is, when someone wants to add the current set of parameters in a prepared statement to the current batch. The query parameter contains all the information you need to do something smart.

    • executeBatch is called when executeBatch() is called on the PreparedStatement. The protocol sent to this method should be used to send the query to the server (though, you could make new connections to the server, fork up a few threads and send queries to the server in parallel).

    Then, to make the connection use your handler:

    Connection connection = DriverManager.getConnection("jdbc:drizzle://localhost:4427/test_units_jdbc");
    // Only Drizzle connections accept a custom batch handler.
    if (connection.isWrapperFor(DrizzleConnection.class)) {
        DrizzleConnection dc = connection.unwrap(DrizzleConnection.class);
        dc.setBatchQueryHandler(VerrrryFastBatchHandler.class);
    }
    PreparedStatement ps = connection.prepareStatement("insert into asdf (somecol) values (?)");
    ps.setString(1, "aa");
    ps.addBatch();      // calls addToBatch() on the handler
    ps.executeBatch();  // calls executeBatch() on the handler



    The current implementation in drizzle-jdbc simply stores all queries in a list and when doing executeBatch, the queries are sent, one-by-one, to the server. I'm planning on doing a rewrite handler in the near future.
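
    To give a rough idea of what a rewriting handler could do, here is a minimal sketch that collapses several batched parameter sets for a single-row INSERT into one multi-row INSERT. It deliberately does not use the drizzle-jdbc types: InsertRewriteSketch and its addToBatch(List<String>) and buildStatement() methods are hypothetical stand-ins, and the quoting is naive. A real handler would implement ParameterizedBatchHandler and send the result through the Protocol it is given.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.StringJoiner;

    // Hypothetical stand-in, NOT the drizzle-jdbc API: it only shows the
    // string rewriting a batch handler could perform before sending one query.
    class InsertRewriteSketch {
        private final String insertPrefix; // e.g. "insert into asdf (somecol) values"
        private final List<List<String>> rows = new ArrayList<>();

        InsertRewriteSketch(String insertPrefix) {
            this.insertPrefix = insertPrefix;
        }

        // Plays the role of addToBatch(): remember one set of parameters.
        void addToBatch(List<String> params) {
            rows.add(new ArrayList<>(params));
        }

        // Plays the role of executeBatch(): build a single multi-row INSERT
        // from everything batched so far, then clear the batch.
        String buildStatement() {
            StringJoiner all = new StringJoiner(", ", insertPrefix + " ", "");
            for (List<String> row : rows) {
                StringJoiner one = new StringJoiner(", ", "(", ")");
                for (String value : row) {
                    // Naive quoting for illustration only: doubles single quotes.
                    one.add("'" + value.replace("'", "''") + "'");
                }
                all.add(one.toString());
            }
            rows.clear();
            return all.toString();
        }

        public static void main(String[] args) {
            InsertRewriteSketch batch = new InsertRewriteSketch("insert into asdf (somecol) values");
            batch.addToBatch(Arrays.asList("aa"));
            batch.addToBatch(Arrays.asList("bb"));
            System.out.println(batch.buildStatement());
            // prints: insert into asdf (somecol) values ('aa'), ('bb')
        }
    }
    ```

    The sketch does not handle an empty batch or a maximum packet size; a real handler would need to split very large batches into several statements.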

    Look at these files for more information:

    Drizzle, State of Testing

    June 04, 2009 05:05 PM

    Testing, Testing, Testing...

    I've gotten a number of questions about how we are doing testing, and even how our methodology for accepting code works :)

    A lot of this comes from running open source projects for almost a couple of decades (hell, if I toss in uploading public domain software to BBSes for the Commodore 64 it is a bit longer!).

    One of the most important rules I have learned over the years is that anything that is not automated and not required will get skipped.

    Today Drizzle runs 213 tests, the entire MySQL test suite minus tests that are for features we don't have. We don't allow for any regression, meaning that no one is allowed to disable a test in order to get their code pushed. Our test suite was also modified so that we can run all of the tests against a particular engine. Today we do this with both InnoDB and PBXT. So instead of having "engine specific" tests, we can test everything. Feedback we are getting from storage engine vendors is that this is golden. Even if they never release for Drizzle, they can use it to vastly increase the testing they do today.

    We also do not allow any code to be pushed that causes a compiler to toss a warning. We do this for a wide set of versions of gcc, and also for Sun Studio. We treat warnings as errors :)

    We are also enormously proud of this fact. This took a lot of effort :)

    Our most recent change is that we now include a regression test for performance for sysbench. For each push into our "staging" tree we run a full test at different steps of "connections". We test both read-only and read-write workloads. My only real complaint right now about this system is that we look at absolute numbers, and being a math geek, I would really like some more information on standard deviation :)
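
    For what it's worth, the standard deviation wished for above is cheap to compute once each step is run more than once. A minimal sketch, assuming the TPS of repeated runs at one connection count is available as an array; the class name and the sample numbers are made up and are not real sysbench results.

    ```java
    // Mean and sample standard deviation over repeated benchmark runs.
    class TpsStats {
        // Arithmetic mean of the samples.
        static double mean(double[] samples) {
            double sum = 0.0;
            for (double s : samples) {
                sum += s;
            }
            return sum / samples.length;
        }

        // Sample standard deviation (n - 1 in the denominator); needs >= 2 samples.
        static double sampleStdDev(double[] samples) {
            if (samples.length < 2) {
                throw new IllegalArgumentException("need at least two samples");
            }
            double m = mean(samples);
            double squares = 0.0;
            for (double s : samples) {
                squares += (s - m) * (s - m);
            }
            return Math.sqrt(squares / (samples.length - 1));
        }

        public static void main(String[] args) {
            double[] tps = {2700, 2750, 2800}; // made-up numbers for illustration
            System.out.println("mean=" + mean(tps) + " stddev=" + sampleStdDev(tps));
            // prints: mean=2750.0 stddev=50.0
        }
    }
    ```

    With a spread like this attached to each step, a push could be flagged only when its mean falls more than a chosen number of deviations below the previous baseline, rather than on any change in the absolute number.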

    The development process we use maximizes our use of tools. For Drizzle we accept no patches via email. Your patch must come through Launchpad, where you must have an account. We do this so that we can always track down "who did what". This is done almost entirely without exception, the exception being that I am sure someone at some time has either pointed out a bug in code to me or has made a comment on how something should be written while reading over my shoulder.

    We don't require anyone to sign an "all your bases belong to us" sort of contract. Speaking as myself, I personally find them unpalatable and frequently suggest developers think about them before ever signing. Code flowing is what I find important. With Drizzle we have had 100+ contributors, and I suspect that number would be much smaller if we had taken a different point of view on this. One of the most interesting conversations I ever had about this topic was with Rasmus, who has continued to refuse to sign the Apache Foundation's agreement. If someone was to ever put code in from their employer, we can strip the code back out in seconds. We can also hand over all of their information to the original copyright holder for them to prosecute. The ones of us who do final review of code are very nitpicky about this. One of the nice things about "source code" is that it has fingerprints. Anything that doesn't follow our coding style, which is very specific, stands out.

    Code which was not original tends to always show its roots. I know from talking to Theodore Tso that IBM gives its developers who take in code to its open source projects a class in how to identify copyright violations. IBM's general handling of open source always tends to impress me.

    Having a modular approach in our design also means that any "large" sort of change can be reduced down to a plugin. Our average patches are bug fixes or refactoring bits. If you find that you enjoy making code readable, fast, and standard looking, you will probably enjoy working on Drizzle.

    If you want to write fun and new code, then you should look at writing plugins!

    On a personal note, if I was to write a database from scratch, I would go after a completely different problem domain. I still happen to need a relational database though, which is why I work on Drizzle.

    That and the people who are working on the project are both awesome and fun :)

    Code flow in Drizzle is pretty simple. You write a patch, you submit a tree to Launchpad, and one of the captains reviews it and puts it into their tree. I pull from one of their trees and merge into a local tree. Before I push the code it has to pass all tests/etc on 64bit Intel Fedora, OpenSolaris Sparc, Solaris Sparc, and OSX. When I get around to buying myself a new desktop Mac I am going to spin up a few more platforms in virtual boxes, so that I can test more platforms. There is a simple Gearman system I use to populate and run the code on all of these systems.

    Once the code passes on all of the above it goes to our staging tree on Launchpad. From there the automated system gives me feedback on regression. If we pass then the code goes to trunk. Our turnaround time for code is frequently about 24 hours. Since all of the above testing is done, we drop tarballs anytime we want to.

    It is pretty much clockwork for us. If a human wasn't involved I suspect we could just set a cronjob to handle it every two weeks (and who knows... if Lee gets bored with this, maybe he will do exactly that!).

    What is the future?

    Today we generate automated reports for cachegrind, callgrind, and valgrind. We run pahole by hand. We are told that there is a new tool for generating random queries for MySQL for crash testing; we need to look into this.

    I've also got a set of tests written around drizzleslap (aka mysqlslap) that we need to toss in soon.

    All of these need to go into the process. We should never regress on L2 misses or branch predictions. We should never see holes in our structures/classes.

    We don't have enough tests for failure cases. Our test coverage is public:
    http://drizzle.org/lcov/index.html

    I want to see that increased. The problem is that there has never been a test system for "this is how to fail at X". A few hacks have been written, but we need to come up with a complete methodology for this. The one open source database that has an awesome test suite for this is SQLite, and we need to spend some time learning from them all of what they do.

    Over dinner recently with Josh Berkus, he mentioned the work they are doing on pgbench. I am hoping that we can get a patch in it so that it will support libdrizzle. That way we could use one of the Postgres tools as well.

    These are our next big steps :)

    Have any suggestions? Want to contribute? All of our tools are open source. We welcome extensions and they are general enough that almost any other database could use them as well!

    Drizzle-JDBC 0.3

    Posted by: Marcus Eriksson (noreply@blogger.com), June 01, 2009 09:54 AM

    I just pushed up 0.3 of drizzle-jdbc to the maven repository, go here to download. It will soon be synced to the official maven repository.


    Changes from 0.2 to 0.3:

    • Add the services file to make the driver autoload.

    • Throw proper JDBC4 SQLExceptions.

    • Make blobs + getObject() work against new versions of drizzled.

    • Fix bug with generated keys and prepared statements.

    • Fix bug with prepared statements and adding parameters several times (fix by Trond Norbye).

    • Name the packet fetcher thread.

    • Fix bug with getSchemas() - returned columns in wrong order.

    • Rework of the packet fetching by Trond.

    • Pluggable batch query handlers (I'll blog about this soon)

    JavaOne/Community One next week

    Posted by: lbieber, May 31, 2009 04:13 AM

    We will be showing a great demo next week at Community One and Java One which will highlight a scalable open source search engine using Drizzle, Memcached, Gearman and Sphinx. See Eric’s write up for more details.

    If you are going, please drop by the Drizzle booth and check out the demo and say hello. Hope to see you there.

    libdrizzle 0.3 Released

    Posted by: Eric Day, May 29, 2009 11:11 PM

    I’m pleased to announce a new version of libdrizzle! This is mostly a bug fixing and maintenance release before I start in on more significant development. One of the new features I added was a hook to be able to use your own I/O event mechanism rather than the default poll(). This will allow you to use libraries like libevent, which can be useful when dealing with a large number of file descriptors, or to mix with other file descriptors in your application (for example, you could listen on other fd’s alongside the non-blocking Drizzle/MySQL socket connections). There are not many examples or much documentation for this feature yet, but for now you can email me or find me in #drizzle on irc.freenode.net if you would like to know more.

    One of the next steps with libdrizzle is a better protocol abstraction, since the Drizzle protocol is diverging quite a bit from how the MySQL protocol works. With these abstractions, it will also be possible to easily add other database-like protocols (where column/row/fields make sense). I’m also going to start looking into more memory optimizations and performance tuning.

    JavaOne sessions

    Posted by: Marcus Eriksson (noreply@blogger.com), May 29, 2009 03:15 PM

    I'm attending JavaOne this year; it starts on Tuesday. These are the sessions I look forward to the most (in no particular order):


    Return of the Puzzlers: Schlock and Awe (TS-5186)
    Josh Bloch and Neal Gafter present a couple of interesting programming puzzles; will be fun.

    Effective Java: Still effective after all these years (TS-5217)
    Again, Joshua Bloch. His book "Effective Java" is one of the best programming books I've read.

    Java Platform Concurrency Gotchas (TS-4863)
    Concurrency is a difficult issue, this session presents issues with code examples.

    Ghost in the virtual machine: A reference to references (TS-5245)
    Bob is the author of Google Guice (see below) and from what I've heard, he is also a great speaker.

    Defective Java Code: Mistakes that matter (TS-5335)
    About bugs in java code, static analysis etc.

    Drizzle: A new database for the cloud (TS-5410)
    Drizzle is a great project, positive, helpful people.

    JDBC? We don't need no stinkin' JDBC: How LinkedIn scaled with memcached, SOA and a bit of SQL. (TS-4696)
    Always interesting to hear how big companies handle scalability issues; really job related for me.

    Introduction to Google Guice: The Java Programming language is fun again (TS-5434)
    Google Guice really opened my eyes to what you can do with Java. The subject of the talk is so true: it shows what Java is capable of and that Java can be fun. This is how modern Java development should be done.

    Narada - A Scalable Open Source Search Engine

    Posted by: Eric Day, May 28, 2009 12:55 AM

    I’ve been working with Patrick Galbraith for the past couple of weeks on a new project that started as an example in his upcoming book. It is a search engine built using Gearman, Sphinx, Drizzle or MySQL, and memcached. Patrick wrote the first implementation in Perl to tie all these pieces together, but there is also a Java version underway, being written by Trond Norbye and Eric Lambert, that will be shown at the CommunityOne and JavaOne conferences next week. I’ve been helping get the system set up on a new cluster and with the port to Drizzle.

    Narada provides interfaces that allow you to submit URLs to be indexed and crawled, and then to search those indexes and get a result set back. This allows you to index and search your own set of URLs, possibly for a single website or just for your own personal archive. The crawler in the back-end will be able to stop after some recursion limit from the original URL and also be able to apply URL filters (for example, only index pages under the domain “oddments.org”). Other filters and extensions should be easy to add. Narada is interesting because it is:

    • Open Source - You can modify it to fit your own needs, hopefully in a modular way so that changes can be contributed back to the project.
    • Easy to Scale - The system is built on a number of asynchronous queues, and the processes to perform that work can run on any number of machines. Increasing your capacity is now trivial: simply start up more machines with new workers.
    • Language Agnostic - While the first versions are in Perl and Java, it is easy to mix in other languages. For example, if a certain component was slow, we could rewrite it in C for better performance. The APIs to index and search can also be wrapped for any language since it will mostly just involve wrapping the Gearman client API. I’m thinking of hacking up a PHP API.

    So, how does Narada work under the hood?


    [Figure: Narada architecture diagram]

    The blue boxes represent your front-end applications that use Narada through the Gearman client API. The yellow boxes represent Gearman workers that perform one of the tasks in the chain. The orange boxes represent the storage mechanisms such as Drizzle, MySQL, Sphinx index, or memcached.

    When a URL is submitted, it will first be queued in a Drizzle table for later processing. A Gearman job is started during the table INSERT to notify a Fetch Worker that a new URL is ready. Once a free Fetch Worker is available, it downloads the page and looks for more URLs to index. This is where recursion limits and filters are implemented. Next, it takes the resulting document and pushes it into memcached and notifies the Document Worker a new document is ready to be stored and indexed. The Document Worker then stores this inside of another Drizzle table and will start the Sphinx indexer if it hasn’t been run in a while. We don’t want to index on every URL since this would be wasteful and expensive. At this point the document is stored, indexed, and memcached is primed with the content.
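
    The recursion limit and URL filter applied by the Fetch Worker could look roughly like the sketch below. This is not Narada code: CrawlFilterSketch, shouldFetch() and the convention that the submitted URL has depth 0 are all assumptions made for illustration.

    ```java
    import java.net.URI;
    import java.util.Locale;

    // Hypothetical sketch of a fetch-time filter: a depth limit plus a
    // "only pages under this domain" rule. Not taken from the Narada source.
    class CrawlFilterSketch {
        private final int maxDepth;         // links followed from the original URL
        private final String allowedDomain; // e.g. "oddments.org"

        CrawlFilterSketch(int maxDepth, String allowedDomain) {
            this.maxDepth = maxDepth;
            this.allowedDomain = allowedDomain.toLowerCase(Locale.ROOT);
        }

        // depth is 0 for the submitted URL, 1 for links found on it, and so on.
        boolean shouldFetch(String url, int depth) {
            if (depth > maxDepth) {
                return false; // recursion limit reached
            }
            String host;
            try {
                host = URI.create(url).getHost();
            } catch (IllegalArgumentException e) {
                return false; // not a parsable URL
            }
            if (host == null) {
                return false; // e.g. mailto: links have no host
            }
            host = host.toLowerCase(Locale.ROOT);
            // Accept the domain itself and any subdomain of it.
            return host.equals(allowedDomain) || host.endsWith("." + allowedDomain);
        }

        public static void main(String[] args) {
            CrawlFilterSketch filter = new CrawlFilterSketch(2, "oddments.org");
            System.out.println(filter.shouldFetch("http://www.oddments.org/a", 1)); // prints: true
            System.out.println(filter.shouldFetch("http://example.com/", 1));       // prints: false
        }
    }
    ```

    Checking the host suffix with a leading dot keeps a look-alike such as "notoddments.org" from passing the filter.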

    When a search request comes in, the client will dispatch a search job to the Search Worker. This worker is responsible for performing the Sphinx search and gathering the necessary information from memcached or Drizzle so the client can return some meaningful results. In the future we will most likely be sharding the data and indexes, so the Search Worker will also be responsible for aggregating multiple shard searches into one set for the caller.
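
    Since sharding is still future work, the following is only a guess at what aggregating shard searches into one set might involve: merge the per-shard hit lists and keep the best few. The Hit class, the merge() method and the assumption that scores from different shards are directly comparable are all hypothetical.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    // Hypothetical sketch of combining search results from several shards.
    class ShardMergeSketch {
        static final class Hit {
            final String url;
            final double score; // assumed comparable across shards

            Hit(String url, double score) {
                this.url = url;
                this.score = score;
            }
        }

        // Concatenate all shard results, sort by descending score, keep `limit`.
        static List<Hit> merge(List<List<Hit>> shardResults, int limit) {
            List<Hit> all = new ArrayList<>();
            for (List<Hit> shard : shardResults) {
                all.addAll(shard);
            }
            all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
        }

        public static void main(String[] args) {
            List<Hit> shardA = Arrays.asList(new Hit("u1", 0.9), new Hit("u2", 0.2));
            List<Hit> shardB = Arrays.asList(new Hit("u3", 0.5));
            for (Hit h : merge(Arrays.asList(shardA, shardB), 2)) {
                System.out.println(h.url + " " + h.score);
            }
            // prints "u1 0.9" then "u3 0.5"
        }
    }
    ```

    Sorting the concatenated list is the simplest approach; with many shards that each return sorted results, a k-way merge would avoid re-sorting everything.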

    The code is still rough around the edges, but we’ve set it up on a couple clusters so far and it is working quite well. We’ll be actively working on it and refining the install process so it is easier to get it up and running.