Looking at MySQL 8 with PostgreSQL goggles on

A big new feature in MySQL v8 is the role system....

Great post! Very informative and balanced, which is a rarity when people compare things close to their heart like databases or text editors 🙂

I've been using MySQL for almost 20 years now, but I really like PostgreSQL too and try to keep up with what's going on with it as much as I can. I actually got started with MySQL because PG was "too correct": there was no way to disable WAL, and back in 2000 or 2001 I needed to do bulk imports of data for warehouse-style queries while nothing else was writing to that database. The WAL overhead was too much, given that if anything went wrong I could easily wipe the day's data and start the import again. Funny how a small accident in requirements can dictate your career: had there been a way to disable WAL (or had I been working on OLTP instead of OLAP), I might have ended up spending 20 years on PG instead 🙂

A couple of comments on your MySQL points:
- I think the closest to pgbench in our part of the world is 'mysqlslap', which should come bundled when you install the mysql-client packages. For more advanced uses there's sysbench (which also supports PG), but I think mysqlslap would get you most if not all of what pgbench can give.

- Clustering in MySQL is complex from a product naming / licensing point of view. This is not made easier by the fact that there's an (Open Source) product called MySQL Cluster, which is not the only way to do clustering with MySQL ... But my point is, there are Open Source clustering options, even from Oracle:
- MySQL Cluster, as mentioned, uses a different storage engine (NDB) and while mostly a niche product, it's a fantastic fit for its niche (it does automatic sharding and routing of queries, for example).
- MySQL Group Replication / InnoDB Cluster is a way to do HA (but not sharding, at least for now) with InnoDB. It is official from Oracle and is Open Source. I have not seen it in the wild much yet, but it should be gaining traction. I think some features may require an Enterprise license, but you can get a cluster running using only Open Source software.
- There are a few Galera-based clustering solutions that are functionally very similar to Group Replication, though more mature, as Galera has been around for a few years already. Galera is not available in the official Oracle Community release (for obvious reasons, I guess), but it can be obtained as Open Source Software under a few names (at least Percona XtraDB Cluster, MariaDB Galera Cluster, and Galera Cluster).
- If sharding and scale-out are needed, TiDB is a new database that's MySQL protocol-compatible, uses RocksDB, and does sharding/HA/routing for you. In fact, while there are important architectural differences, as a PostgreSQL user you could think of it as MySQL's CockroachDB and you wouldn't be too far off IMHO 🙂

Sorry for the long rant and cheers!


Great post! For scaling, though, you can look in our own backyard. You noted, "The remaining 1% percent would be then for cases where some global start-up scaling would be required, due to native multi-master support."

You may want to check out a little company called Pex. Their requirements were so big that they blew past the limits of MongoDB, Hadoop, Cassandra, and HBase. They are using PostgreSQL with the CitusDB extension; they use 20 nodes, 1280 cores, and 2.4TB of RAM to update 80B rows per day and ingest 60k rows per second, all while being responsive to ad-hoc queries. Microsoft also has a similarly-sized Citus cluster that handles all Windows telemetry data.
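To put those totals in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted above; the even spread of load across nodes and across a 24-hour day, the decimal reading of "TB", and the helper name `cluster_summary` are assumptions made purely for illustration, not anything reported about the actual cluster.

```python
# Back-of-the-envelope arithmetic on the cluster figures quoted above.
# Assumptions (not from the source): load is spread evenly over nodes and
# over a 24-hour day, and "TB" means decimal terabytes (1 TB = 1000 GB).

SECONDS_PER_DAY = 24 * 60 * 60


def cluster_summary(nodes, cores, ram_gb, updates_per_day, ingest_per_sec):
    """Return per-node and per-second averages for the quoted totals."""
    updates_per_sec = updates_per_day / SECONDS_PER_DAY
    return {
        "cores_per_node": cores / nodes,
        "ram_gb_per_node": ram_gb / nodes,
        "updates_per_sec": updates_per_sec,
        "updates_per_sec_per_node": updates_per_sec / nodes,
        "ingested_rows_per_day": ingest_per_sec * SECONDS_PER_DAY,
    }


# Hypothetical helper applied to the numbers from the comment.
summary = cluster_summary(
    nodes=20,
    cores=1280,
    ram_gb=2400,                     # 2.4 TB, read as decimal terabytes
    updates_per_day=80_000_000_000,  # 80B row updates per day
    ingest_per_sec=60_000,           # 60k rows ingested per second
)

for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
```

Under those assumptions, that works out to 64 cores and 120 GB of RAM per node, an average of roughly 926k row updates per second across the cluster (about 46k per node), and about 5.2B newly ingested rows per day.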
