Interview with Josh Berkus & Joshua Drake from PostgreSQL

SCALE interviewed two PostgreSQL developers, Joshua D. Drake and Josh Berkus, on the eve of PostgreSQL appearing at SCALE. They were kind enough to answer questions about their favorite database.

SCALE: Gareth Greenaway, SCALE Community Relations
JD: Joshua D. Drake
JB: Josh Berkus

SCALE: What role do you play in the PostgreSQL community? In the development of PostgreSQL?
JD: I am the PostgreSQL SPI Liaison. The majority of my focus is within fundraising and advocacy. However, I also contribute to docs and the website. Command Prompt, Inc (my employer) is also one of the most pervasive contributors to the PostgreSQL community.
JB: I’m on the PostgreSQL core team, which is a kind of “steering committee” for the project. Mainly we set the dates for releases, and provide leadership on the rare occasions when consensus doesn’t work. I mainly do PR for the project, public speaking, and benchmarks.

SCALE: What do you think is the single most important feature from the 8.3 release? What feature are you most proud of?
JD: The buzzword feature is HOT. I personally favor integrated Tsearch2.
SCALE: Can you tell us a little more about those features? And why you like the integrated Tsearch2 feature.
JD: HOT greatly reduces the VACUUM requirement for relations that qualify. Basically, if you are updating columns that are not indexed, a dead tuple will not be created. Session tables are a good example of relations that will greatly benefit from this technology: they incur a great many updates, but generally not on an indexed column.

Tsearch2 is something we have had for a while, but always as a module. Tsearch2 provides full-text indexing (think search engine) using GiST indexes. It is now integrated into core, which makes it easier to use for people wanting access to the feature.

JB: It’s actually hard to pick one specific one out; a lot of the features are interrelated. HOT, spread checkpoints, var-varlena, lazy XIDs … there’s a whole host of performance features which push PostgreSQL up to the point of being the single fastest DBMS on a 4-to-8 core commodity system.

SCALE: Now that 8.3 is out the door, what exciting new things are on the horizon for 8.4?
JD: There is a lot of talk about formal partitioning and WITH/RECURSIVE support.
SCALE: Would you mind elaborating on that?
JD: Currently PostgreSQL partitioning is a bit fragile. It works very well in very defined ways but is limited in its abilities. Adding formal partitioning will remove a number of limitations that currently exist. WITH/RECURSIVE gives you a great deal more flexibility in your SQL. A classic example is representation of a hierarchical flow such as: x manages y, y manages z, find all the people managed by x or by a child manager of x.
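To make the "x manages y, y manages z" example concrete, here is a minimal sketch of a recursive query. It is an illustration only: it runs against SQLite through Python's standard `sqlite3` module as a stand-in, since SQLite accepts the standard `WITH RECURSIVE` form, and the `staff` table, its columns, and the `reports_of` helper are hypothetical names invented for this example rather than anything from PostgreSQL itself.

```python
import sqlite3

# Hypothetical data mirroring the interview's example: x manages y, y manages z.
# "w" is an unrelated person with no reports. Table and column names are
# illustrative assumptions, not part of any real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT PRIMARY KEY, manager TEXT)")
conn.executemany(
    "INSERT INTO staff (name, manager) VALUES (?, ?)",
    [("x", None), ("y", "x"), ("z", "y"), ("w", None)],
)


def reports_of(conn, boss):
    """Return everyone managed by `boss`, directly or through a chain of managers.

    The first SELECT in the CTE picks the direct reports; the second SELECT
    repeatedly joins back to the CTE to pick up reports of reports.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE reports(name) AS (
            SELECT name FROM staff WHERE manager = ?
            UNION ALL
            SELECT s.name FROM staff s
            JOIN reports r ON s.manager = r.name
        )
        SELECT name FROM reports ORDER BY name
        """,
        (boss,),
    ).fetchall()
    return [name for (name,) in rows]


print(reports_of(conn, "x"))  # prints ['y', 'z']
```

Without a recursive query, the same result typically needs one self-join per level of the hierarchy or a loop in application code, which is the flexibility gap being described.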
JB: Well, it’s always hard to accurately predict what features will actually be completed, and which ones will show up out of the blue. However, here’s some I know people are working on:

  • PL/PSM, the stored procedure language used by DB2 and MySQL
  • PL/proxy, our distributed table interface
  • More performance improvements, including clustered indexes and the Dead Space Map
  • Recursive queries
  • Improved SMP scaling, to 64 cores or more
  • Hot standby databases

SCALE: What are your thoughts on the recent news of Sun buying MySQL?
JD: I believe that Sun paid entirely too much based on the annual sales of MySQL.
SCALE: What do you believe the reasoning behind the purchase was?
JD: It is difficult for a traditionally closed source company to invest great sums of money into a technology that they can’t control. MySQL also has a great deal of developer mind share and, much like the Facebook or YouTube acquisitions, being able to point shareholders at a large group of potential customers is usually seen as a good move. Unfortunately for Sun, it means they will have to execute on the acquisition quickly, and that is not normally considered a Sun strong point.

SCALE: Will the two teams share code and ideas more freely now?
JD: Doubtful. PostgreSQL puts correctness first.
JB: PostgreSQL has been open source for 12 years and MySQL has been open source for 7. So we’ve always been able to browse each other’s code and borrow good ideas. However, the architecture of the two databases is pretty radically different and what we can directly borrow is limited. Brian and I have talked about a PostgreSQL tabletype for MySQL, but that would be basically a stunt; it wouldn’t really be useful for anything.

SCALE: What is your opinion of some of the commercial database products?
JD: Which commercial database products? Do you mean Oracle or MSSQL? I find that PostgreSQL can do 95% of any of the enterprise commercial products and the other 5% can be achieved through some creative thinking.
SCALE: Let’s go with Oracle. How does PostgreSQL compare with Oracle on features?
JD: Very well, actually. We lack some key things that Oracle has such as Packages and Materialized views. We make up for the missing features through our extensibility. There is not a single database in the market that has the extensibility that PostgreSQL has.

SCALE: Do you ever use them to compare features and performance against Postgres?
JD: Yes, and depending on workload PostgreSQL will perform better than them. There are of course exceptions and it is very easy to put up bogus benchmark numbers.
JB: All the time. Since Oracle is still considered the industry leader, we compare performance with them all the time; if we can match or beat them on a benchmark workload, we feel pretty good about it. Also, there are a lot of potential database features which still aren’t defined in the SQL specifications; if we want to implement one of these, often we’ll use the syntax from a proprietary system just to be compatible.

SCALE: Do you believe that MySQL could possibly improve because of the purchase by Sun?
JD: Well, there is always the possibility, but Sun has to walk a fine line here. The technology that makes MySQL even remotely enterprise class is owned by Oracle. Sun does have engineering prowess, however, and if they can execute, I expect to see good things from the project.

SCALE: What do you like most & like least about open source software?
JD: I believe that Open Source brings a level playing field for anyone wishing to capitalize on a particular technology. However, that level playing field comes with a price in terms of usability and sometimes user-visible tools.
JB: What I like most is the teamwork. Everything we do is a group effort, and most other jobs don’t provide you social satisfaction as well as accomplishment at the same time. What I hate most is when the teamwork breaks down and we waste weeks pointlessly arguing.

SCALE: What is the hardest thing about working on an open source project?
JD: Go to the local animal shelter, adopt 200 cats, herd them. Get back to me with the question.
JB: Realizing that everything you do is public. If you come from another environment, it can take some getting used to working in a glass office.

SCALE: What other open source projects are you most excited about, besides Postgres?
JD: Well I am primarily a PostgreSQL person. However LedgerSMB (www.ledgersmb.org), CentOS and Ubuntu also show great promise.
JB: Hmmm. There’s a couple I can’t talk about because they’re not open source yet, so I won’t. As one exciting project, we’re “launching” Bytesfree at SCALE. Bytesfree is an attempt to mix open source and politics, and harness some of the passion of geeks to accomplish political goals.

SCALE: Thanks! We’ll see you at SCALE.
JD: You are welcome. See you in a couple of days.
