Forum OpenACS Q&A: mysql passes the acid test?

Collapse
Posted by Anthony Barker on
From what I have been reading a new database format InnoDB
for mysql offers many of the key features of a good RDBMS. Do any of
you have experience with it?

Changes in release 3.23.x (Stable)

The 3.23 release has several major features that are not present in
previous versions. We have added three new table types:

MyISAM
A new ISAM library which is tuned for SQL and supports large files.
BerkeleyDB or BDB
Uses the Berkeley DB library from Sleepycat Software to implement
transaction-safe tables.
Innodb
A transaction-safe table handler that supports row level locking, and
many Oracle-like features.

Collapse
Posted by Andrei Popov on
But have they included sub-selects? Unions? Flexible function definition? Views? Foreign keys?

MySQL is nice when something small is needed, something that you could use Access for on Windows, without an overhead of an OS and an application that throuws a CPU into 100% utilization state whether it does anything or just sits idly. Actually, I run it (along with Postgres) on my laptop and do store some data there, but I would not think that MySQL is ready for [Open]ACS-type use...

Collapse
Posted by Brad Ford on
Definitely doesn't support foreign keys and subselects. A client is requesting that I use it for a project and the lack of sub-selects has wasted a lot of my time. From the documentation:
The following will not yet work in MySQL: 

SELECT * FROM table1 WHERE NOT EXISTS (SELECT id FROM table2 where table1.id=table2.id);

However, in many cases you can rewrite the query without a 
sub-select: 

SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.id=table2.id where table2.id IS NULL
A whole new syntax for me. Access is the same but, in its favour, if you directly type in sql using subselects, it will translate it into the 'left join' syntax. I've found this a useful little tool to figure out the other syntax. Just seems to me that sub-selects are so much easier to wrap my head around. As for foreign keys, here's what the docs say:
Note that foreign keys in SQL are not used to join tables, but are
used mostly for checking referential integrity (foreign key 
constraints). If you want to get results from multiple tables from a 
SELECT statement, you do this by joining tables: 

SELECT * from table1,table2 where table1.id = table2.id;

See section 7.20 JOIN Syntax. See section 9.3.6 Using Foreign Keys. 

The FOREIGN KEY syntax in MySQL exists only for compatibility with 
other SQL vendors' CREATE TABLE commands; it doesn't do anything. The 
FOREIGN KEY syntax without ON DELETE ... is mostly used for 
documentation purposes. Some ODBC applications may use this to 
produce automatic WHERE clauses, but this is usually easy to 
override. FOREIGN KEY is sometimes used as a constraint check, but 
this check is unnecessary in practice if rows are inserted into the 
tables in the right order. MySQL only supports these clauses because 
some applications require them to exist (regardless of whether or not 
they work). 
And then they go on to explain why foreign keys shouldn't be used...
There are so many problems with foreign key constraints that we don't 
know where to start: 

Foreign key constraints make life very complicated, because the 
foreign key definitions must be stored in a database and implementing 
them would destroy the whole ``nice approach'' of using files that 
can be moved, copied, and removed. 

The speed impact is terrible for INSERT and UPDATE statements, and in 
this case almost all FOREIGN KEY constraint checks are useless 
because you usually insert records in the right tables in the right 
order, anyway. 

There is also a need to hold locks on many more tables when updating 
one table, because the side effects can cascade through the entire 
database. It's MUCH faster to delete records from one table first and 
subsequently delete them from the other tables. 

You can no longer restore a table by doing a full delete from the 
table and then restoring all records (from a new source or from a 
backup). 

If you use foreign key constraints you can't dump and restore tables 
unless you do so in a very specific order. 

It's very easy to do ``allowed'' circular definitions that make the 
tables impossible to re-create each table with a single create 
statement, even if the definition works and is usable. 

It's very easy to overlook FOREIGN KEY ... ON DELETE rules when one 
codes an application. It's not unusual that one loses a lot of 
important information just because a wrong or misused ON DELETE rule. 

The only nice aspect of FOREIGN KEY is that it gives ODBC and some 
other client programs the ability to see how a table is connected and 
to use this to show connection diagrams and to help in building 
applicatons. 

And I was under the impression that referential integrity was kind of important... Just my two cents worth.
Collapse
Posted by Roberto Mello on
"The only nice aspect of FOREIGN KEY is that it gives ODBC and some other client programs the ability to see how a table is connected and to use this to show connection diagrams and to help in building applicatons "

This paragraph shows some distorted minds working on MySQL. That explains why it is what it is. https://openacs.org/philosophy/why-not-mysql.html, is beatiful reading.

Collapse
Posted by Don Baccus on
Rewriting "exists (subselect)" into joins works, but you lose one of
the benefits of the original construct - the query can stop execution
the moment the "exists" is satisfied (regardless of whether or not
there's a "not" in front).  The savings using "exists (suhselect)" can
be very significant.

Just ignore what they say about foreign keys, they don't know what
they're talking about.  I thought it was some sort of warped April
Fool's joke or a vandal's hack on their website when I first read it.

As far as answering the actual question raised by Anthony, though -
InnoDB looks like it could be a very good back end.  My guess is that
it will require some seasoning in the field before it lives up to its
potential or can really be considered reliable.  This isn't a knock on
Innobase, it's just reality - RDBMS engines are hard to get right and
concurrency problems are subtle and frequently only crop up
occasionally and are very difficult to reproduce.

I've not looked at their website in a couple of months, but in some
regards its still incomplete.  There's an 8KB limit on either column
or row length, for instance (sound familiar, ye olde Postgres
veterans?).  They plan to work their way past various restrictions but
expect it to take some time.

The InnoDB back end is an Oracle-style storage manager, with row-level
locking, following an overwriting paradigm with rollbacks etc done
from redo logs.

One of the reasons I think it might be good is that originally it had no
connection with MySQL.  In fact, I think they intended to write their
own full-blown RDBMS at one point.  Whatever the specifics, it's been
glued to the MySQL front end.  The folks doing InnoDB obviously have a
somewhat more reasoned and mature notion of what an RDBMS should be
than Monty and the rest of the MySQL crowd.
They firmly believe in the importance of ACIDity, for instance, and
probably laugh as heartily as we do at some of the stupid things the
MySQL folk
have said in the past.

There's still the little problem of the weakness of the SQL dialect
supported by MySQL, as others have pointed out.

Collapse
Posted by Todd Gillespie on
Don:
Heikki Tuuri, the InnoDB author, hangs out on comp.databases.  I asked him why he wrote such a nice back-end for MySQL rather than something more mature, eg, Postgres.  His reponse:
(note: before I catch any heat from anyone, I'm just reposting this, not advocating it, and particularly not his questionable benchmarking methods)

your posting from April 19th is not visible at newsone.net yet for some
reason. Therefore I reply to your earlier post.

You asked why I installed InnoDB under MySQL and not PostgreSQL.

The first reason was that in MySQL there is a standardized interface
to lower level storage managers (called table handlers in MySQL) which
MyISAM, BDB, and InnoDB use. I have not studied PostgreSQL
source at that detail, but the object-relational features
there may require something extra from the storage manager.

Another reason is speed: InnoDB was designed to be the fastest
disk-based database engine in the world. Also MySQL SQL interpreter
is fast. The combination MySQL/InnoDB is thus only 30 % slower than
a standalone InnoDB and in many tests faster than the standard table
type of MySQL.

Last summer I measured PostgreSQL 7.0 against a standalone InnoDB.
I got the following results:
                            CPU time in seconds

Test                        InnoDB PostgreSQL 7.0

-----------------------------------------------------

Insert of 100 000 rows,
copied from a table          15          160

Count(*) from a join of
10 000 rows                    0.5          9

-----------------------------------------------------

Test details: tables had two integer columns, and a unique, primary
index on the first integer column. The machine was a 100 MHz Pentium
with 32 MB of RAM, NT 4.0. CPU times are from Task Manager of NT.

The tests above measure the basic internal speed of the database
engine. The time spent to communication with the client is negligible
because the tests above consisted of a single SQL statement. Both
InnoDB and PostgreSQL are transactional databases with row-level
locking and thus comparable.

The third reason was that Monty Widenius, the writer of MySQL
lives in the same town as I :).

Collapse
Posted by Don Baccus on
The benchmarking methodology is rather simplistic, true :)  And I kinda wish  he'd said things like "it's designed to be the most robust, reliable, and scalable disk-based RDBMS system" rather than simply "the fastest".  People don't buy Oracle thinking it's fast.  They buy Oracle thinking it will never bite them in the ass.

Folks who get bad numbers with PG frequently are using default configuration values,  in particular the default value of 64 blocks of shared buffer cache.  Also PG 7.1 gives us about a 10x improvement in raw insertion speed, due to the addition of REDO logging, so it's not clear that InnoBase is actually faster than PG 7.1 from his example.

Monty's published benchmarks were achieved by configuring MySQL in order to improve performance, i.e. not in its "out of the box" configuration.  His PG numbers were run using default values.  That was one of the things that caused me to think of him as being somewhat sleazy - particularly when he defended the practice.

The speed of the MySQL interpreter is partly due to the fact that it does very little optimization.  Simple queries have always been fast in MySQL, but its performance on queries with complex joins on several  tables has always been poor.  Monty's benchmarks have always emphasized simple queries.

PG has a storage manager abstraction, but it's at a lower level than the abstraction in MySQL.  The tuple storage format's pretty much hard-wired, so it wouldn't really be possible to drop the InnoBase backend into PG at this level.

On the other hand the parser and semantic analyzer present another layer one could plug into.  It's not as cleanly defined and frankly the PG parser's a royal kludge and an ugly mess anyway.

Of course, before doing so one would needs to ask whether or not there's actually any thing inherently wrong with the PG backend in the first place.  I don't think there is, myself.

Collapse
Posted by Roberto Mello on
As usual, the MySQL crowd is doing unrealistic benchmarks that will obviously will favour their "speed is all that matters. who cares about everything else a database should do?" approach.

If they'd remember their undergraduate computer architecture class, they'd know that the only benchmarks that truly matter are the one with real applications.

Just think about it: How many times in the lifetime of a project, do you insert 100,000 rows all at once? How many times do you do a count(*) on a table you _know_ is huge? These are not realistic uses of a database, unless you have no clue of what you're doing, or are still learning.

Now, how many times do you have users entering bad data and maybe you forgot to put a check? How many times do you do queries that are not so trivial as a count(*)? How many times do you use a subselect in your queries? How many times do you need to write stored procedures to make your life easier and programming faster?

You want a real benchmark? Go look at the tests Tim Perdue did with MySQL and PostreSQL this past january (beore the improvementh in 7.1 that Don mentions) at http://www.phpbuilder.com/columns/tim20001112.php3. THAT is a real benchmark. Look at the results and decide what's more important: being fast on unrealistic operations or being safe and faster or realist operations.

Collapse
Posted by Todd Gillespie on
Whoa! Slow down, Robert.  They're just silly queries intended to show some full-table perf - which is what you might want if you wrote the backend and don't trust the parser.  All the same, it's a pretty poor comparison.  IIRC, Postgres has historically not been as fast on Windows; we don't know the filesystem; 32MegRAM is pretty poor.  If InnoDB was designed for windows filesystems, it would show a non-trivial advantage there and a loss on Unices.  Also, NT's 'Task Manager' is hardly a good piece of system reporting software -- I place no faith in those numbers.  I know nothing about his tuning parameters.

The numbers mean nothing - I was just posting Tuuri's post to develop some of Don's statements about InnoDB and refuting some others.  (Namely that the InnoDB folks are closer to the MySQL folks than Don thought.)

Funny thing about performance claims - the rule of thumb I've developed is that groups won't change tools until there's a clear 1.7 to 2 orders-of-magnitude improvement.  Anything less is just too much bother for porting.  MySQL's harping is fairly irrelevant b/c of that (even ignoring the fact that it's a kid DB).

Collapse
Posted by Roberto Mello on
I wasn't saying anything against you or trying to. Not at all. I
understood your point that you were just reposting from the newsgroup.

It's just that the MySQL crowd has used the same type of arguments for
as far as I can remember. They mislead and miseducate _a lot_ of
people into thinking that their distorted view of things is the right one.

My peeve with MySQL is not so much that it's a weak product. There are
plenty of applications for MyCRAP^H^H^H^HSQL. But their misleading
docs, statements and "benchmarks" are what really tick me off, simply
because they are a load of lies.

Collapse
Posted by Anthony Barker on
Robert:

I actually wasn't trying to start a flame war at all. I was relating
what I had read in the mysql mailing list regarding Innodb and wanted
a true, unbiased answer. I have found that most commercial software
companies expand the truth as do fledgling open source companies such
as GreatBridge and MySql.

Pointing me to a URL over a year old does not really answer the
question of whether the new database format is any good or not. For
the application that I am working on which is mostly reporting, mysql
is actually ok (execept for adding columns which can take a while).
Multi-version concurrency control, for my app really has no value. The
company I am doing this for has a site license for MS SQL Server, but
I would rather avoid it if possible.

Microsoft research as a book on their website (their monopolistic
profits actually bought some research for once).
http://research.microsoft.com/pubs/ccontrol/

And oh, I was thinking of putting openacs on the same machine...

Collapse
Posted by Roberto Mello on
"I actually wasn't trying to start a flame war at all. I was relating what I had read in the mysql mailing list regarding Innodb and wanted a true, unbiased answer."

That's exactly what you got. The truth hurts sometimes. I've used MySQL before, and can tell that my responses are not biased since I have no involvement with neither companies (MySQL and GreatBridge).

"Pointing me to a URL over a year old does not really answer the question of whether the new database format is any good or not."

The URL was not the only thing in one of my posts (I assume you are referring to the why-not-mysql URL). And almost everything that is in that article applies to MySQL still. Though I'd agree that it doesn't directly answer your InnoDB question. But my other posts do.

"For the application that I am working on which is mostly reporting, mysql is actually ok (execept for adding columns which can take a while)."

Notice that I said that there are plenty of applications where MySQL is adequate.

"Multi-version concurrency control, for my app really has no value."

So why are you asking about InnoDB? If it has no value for your app, then you shouldn't need to worry about it.

"And oh, I was thinking of putting openacs on the same machine..."

Cool.I should come easier on MySQL inquiries. But I've seen them so many times, for the same questions, that it's hard. But I'll try.

Collapse
Posted by Edmund Lian on
<blockquote>>The only nice aspect of FOREIGN KEY is that it gives ODBC and some
</blockquote>
other client programs the ability to see how a table is connected and
to use this to show connection diagrams and to help in building
applicatons<<

I've always wondered whether there was some fundamental stupidity lurking behind the scenes within the MySQL camp that makes them unable to understand ACID issues...

Collapse
Posted by Don Baccus on
Well...my response is considerably different than Roberto's.

Consider that for years the MySQL folks sneered at, belittled, and lied about the value of the ACID test, and transactions (which are part and parcel to it).

Now MySQL finds itself in the interesting position of offering interfaces to two different back ends that claim to pass the ACID test.

If the MySQL statements about ACIDity were true, why are there now two  ACID backends and only one flat-file back end?

Maybe MySQL was wrong.

Armed with this amazing backpedaling (which has been thoroughly forgiven by fans making those in love with Philip look like a bunch of  pikers), certainly it is time to point out that their idiocy in regard to foreign keys and other SQL requirements are equally out of whack, mindless, stupid, and harmful to small children who try to build websites on MySQL?

By allying themselves with two ACID backends, they've essentially stabbed themselves in the back as far as credibility.

I don't care enough to chase after them.  But the PG folks and others who get a clue (to some degree, the PG crew misses amazing clues as well at times) and want to make money supporting their more robust RDBMS engines should fling past statements by Monty et al about how horrible ACID compliancy is right in his face.

Collapse
Posted by Talli Somekh on
oh for crissakesoutload already, can we please stop with the mysql flames? it does nothing for anybody. personally, i can't wait for the day that mysql is adequate for OpenACS because then we can tap into a whole slew of open source developers that would be eager to use a toolkit like this ours.

everybody here knows the community's attitude about mysql. so their developers lie and they don't know what they're talking about. maybe they finally got a clue. i checked and that's not quite a crime. in the grand scheme of things, they have neither soiled our good names
nor insulted our families (and beleive me, i have a very lenient scale for determining whether someone's done such a thing.)

there are probably a lot of people coming to the site with very sincere hopes of using it to build sites. they already know apache, mysql and perl and would like to use those skills. but they get here and they find out they need to learn AOLserver, PG and Tcl. Immediately they're probably intimidated by the necessity of learning three new tools. And they're probably more intimidated by seeing these fanatics crap all over one of their favorite tools, MySQL.

I love seeing Apache made available for these people as well as our intent to support other RDBMS's, which we are doing and will be absolutely great. If MySQL ever gets it's act together I think we would jump at the chance of folding it in.

so why do we keep looking like assholes whenever someone asks whether mysql is finally good enough?

talli

Collapse
Posted by Don Baccus on
everybody here knows the community's attitude about mysql. so their developers lie and they don't know what they're talking about. maybe they finally got a clue. i checked and that's not quite a crime. in the grand scheme of things, they have neither soiled our good names nor insulted our families (and beleive me, i have a very lenient scale for determining whether someone's done such a thing.)
Actually, they have caused a lot of harm. They've misled countless people, people too lazy to research database issues on their own, true enough, but that doesn't exonerate Monte (in particular) in my book.

I'm *glad* they're yielding to pressure and adding two backends that have ACID attributes. That's what they've done, though, they've yielded to pressure. This has been quite clear from some their ramblings.

Without that pressure - and personally I think the "Why Not MySQL?" flamefest here at openacs.org was a measurably important part of the awakening that led to that pressure - MySQL would still be the same old MySQL.

Now that they've yielded on one of their early "principles" (you never need ACIDity) maybe a few more sticks and stones will force them to yield on some of their other misrepresentations of reality.

Lying's not a crime, at least not in all contexts. Lying about a product is actionable in some contexts, come to thing about it.

Crime or not, I was raised to believe that lying is reprehensible...

Collapse
Posted by Janne Blomqvist on
The innoDB site has some benchmarks (essentially the same as
mentioined previously) with pgSQL 7.1.1 at
http://www.innodb.com/bench.html
According to this, InnoDB smoked postgreSQL quite clearly. Of
course, this is a really simple test, and postgreSQL support a lot
more of SQL92. But anyway, InnoDB might finally impove MySQL:s
performance in high concurrency situations (it has essentially the
same locking system as Oracle & PostgreSQL). I'm really glad at how
much the situation has changed regarding open source RDBMS systems
in the last year. Think about it, a year ago pgSQL 7.0.x was the
only open source DB that had anything close to the feature set
offered by commercial databases. Today, we have pgSQL 7.1.x,
Interbase/Firebird, SAPDB and MySQL+InnoDB. We should be happy. :)
Collapse
Posted by Patrick Giagnocavo on
re:  the InnoDB benchmarks.  You're well aware that the "100,000
rows insert" was essentially done with PG 7.1.1 doing a
transaction on each INSERT, while for MySQL/Innodb all 100,000
rows were inserted inside ONE transaction, aren't you?
Collapse
Posted by Mark Armstrong on
As a new reader to these sites, specifically regarding mySQL vs postgreSQL.  I must say that there is a lot of emotion tied into most of the responses I have read.  I understand the desire to defend ones point of view, but as an unbiased reader trying to determine the pro's and con's to both mySQL and PostgreSQL the emotional bloat tends to hamper that process. Please just list the facts, be objective and support ones arguments. Allow the reader to make his/her conclusion them selves.
Collapse
Posted by Ben Adida on
Mark,

The reason you see emotion is that there have been some
egregious claims by some MySQL supporters, claims that
sometimes insulted others, but most importantly claims that
greatly misled people like you who are just trying to find out the
facts.

What is particularly disturbing is how these claims continue to
be made even though many explanations have been published
to prove them wrong. I've gone to big conferences where people
pay serious money to sit through a "MySQL Transactions
Tutorial" and are fed completely incorrect information about what
it means to be ACID-compliant, what it means to do reliable data
storage, etc...

For people who care about the integrity of computer scientists in
industry, it's very disheartening.

That said, we'll continue to try to provide "just the facts" on
OpenACS.org, and hopefully you can forgive the emotional
exuberance. Thanks for reminding us that the facts are more
important, though.

Collapse
Posted by Roberto Mello on
I agree with Ben.

For those looking at InnoDB, you might find the interview with Heikki Tuuri, its implementor, at dbdebunk.com (excellent site with frequent contributions by C.J. Date) very interesting: http://www.dbdebunk.com/innodb1.htm