As a newly minted Packt author, it makes sense that I might get a request to review one of their books from time to time. On this particular occasion, I had the opportunity to take a look at Instant PostgreSQL Starter by fellow PostgreSQL user and author Daniel K. Lyons.
I’ll be straightforward with a giant caveat: I’m not the target audience for this booklet. I tried to read with the perspective of a new user, since we’ve all been there once, but please bear with me if I break character.
Many new users will find that the examples given using pgAdmin are easy to follow and perform as expected. Users who are new to PostgreSQL likely don’t want to fiddle with the command line for basic functionality, especially if they are coming from another database such as SQL Server, Oracle, or even MySQL. And for more complex cases such as hstore, XML manipulation, or full-text searching, we’re treated to function and view syntax that helps abstract away some of the ugly or annoying details.
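The book’s exact wrappers aren’t reproduced here, but the idea looks roughly like this (a minimal sketch with made-up table and view names):

CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE items (
    id    serial PRIMARY KEY,
    attrs hstore
);

-- Hide the arrow syntax behind a view so a new user can select plain columns.
CREATE VIEW items_flat AS
SELECT id,
       attrs -> 'color' AS color,   -- hstore key lookup returns text
       attrs -> 'size'  AS size
  FROM items;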
That said, the pgAdmin result listings were somewhat skimpy, especially when more advanced features are introduced. Seeing the output of some of these would have been nice, considering the rather cumbersome and advanced syntax. A new user might have trouble fully understanding these and, if reading without following along, would have no basis for comparison. Additionally, Daniel’s adherence to pgAdmin extends only to using it as a connection method. When creating a user, table, or database, he prefers to use pure SQL instead of pgAdmin’s interface to create or modify these objects. Considering this book is for new users, and there is an entire section on basic SQL syntax for interacting with tables, why omit this?
Speaking of interacting with tables, for the newest of new users, Instant PostgreSQL Starter launches into a quick introduction to SQL syntax. Most readers can generally skip this section, but it’s good to know it’s included. Why? One commonly accepted aspect of marketing is to Get ‘em While They’re Young. If new use…
Learning PostgreSQL, and SQL in general, probably begins with the concept of TABLES. Then probably VIEWS, INDEXES and maybe TRIGGERS. Some users might not ever go any further, which is sad, because there is so much more to explore!
I thought it would be cool to automatically generate a graph showing the dependencies between objects and their types. It shows the order in which the different types of objects can be created; it is perhaps mentally useful to think in terms of “after we have created a TABLE we can create an INDEX”. Something like this would be nice to include in the online PostgreSQL documentation, as I think it would be helpful when learning about PostgreSQL’s different object types. The graph below was produced using the GraphViz dot command and live data from pg_catalog.pg_depend:
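(The author’s exact query isn’t shown; the following rough sketch of the idea prints DISTINCT DOT edges that can be wrapped in “digraph g { … }” and fed to dot. The object-type labels are my own mapping from pg_depend’s classid/refclassid columns.)

WITH labelled AS (
    SELECT CASE d.classid
               WHEN 'pg_class'::regclass THEN
                    (SELECT CASE c.relkind WHEN 'r' THEN 'TABLE'
                                           WHEN 'i' THEN 'INDEX'
                                           WHEN 'v' THEN 'VIEW'
                                           WHEN 'S' THEN 'SEQUENCE' END
                       FROM pg_class c WHERE c.oid = d.objid)
               WHEN 'pg_proc'::regclass      THEN 'FUNCTION'
               WHEN 'pg_type'::regclass      THEN 'TYPE'
               WHEN 'pg_language'::regclass  THEN 'LANGUAGE'
               WHEN 'pg_namespace'::regclass THEN 'SCHEMA'
               WHEN 'pg_trigger'::regclass   THEN 'TRIGGER'
           END AS dep_obj,
           CASE d.refclassid
               WHEN 'pg_class'::regclass THEN
                    (SELECT CASE c.relkind WHEN 'r' THEN 'TABLE'
                                           WHEN 'i' THEN 'INDEX'
                                           WHEN 'v' THEN 'VIEW'
                                           WHEN 'S' THEN 'SEQUENCE' END
                       FROM pg_class c WHERE c.oid = d.refobjid)
               WHEN 'pg_proc'::regclass      THEN 'FUNCTION'
               WHEN 'pg_type'::regclass      THEN 'TYPE'
               WHEN 'pg_language'::regclass  THEN 'LANGUAGE'
               WHEN 'pg_namespace'::regclass THEN 'SCHEMA'
               WHEN 'pg_trigger'::regclass   THEN 'TRIGGER'
           END AS ref_obj
      FROM pg_depend d
)
-- The referenced object must exist first, so edges point referenced -> dependent.
SELECT DISTINCT format('"%s" -> "%s";', ref_obj, dep_obj) AS edge
  FROM labelled
 WHERE dep_obj IS NOT NULL AND ref_obj IS NOT NULL;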
As we can see, before anything else we need a SCHEMA, which is the root node. Once we have a SCHEMA, we can create TABLES, TYPES, VIEWS, SEQUENCES and FUNCTIONS. Some users might not even know about SCHEMAs, as the schema “public” is pre-installed. To create an INDEX, we first need a TABLE. Etc, etc, etc…
You might be surprised that FUNCTION and LANGUAGE have arrows pointing in both directions. It turns out you need some functions, such as plperl_call_handler, before you can create a language like plperl. The self-referencing arrow from/to FUNCTION is less surprising, as some functions can of course call other functions.
(Not all object types are included in this graph as I’m not using them all in my system.)
PgCon 2013 was attended by 256 people from across the globe. Attendees had the opportunity to enjoy tutorials, talks and an excellent unconference (the last of which deserves special mention).
I gave a talk on full-text search using Sphinx and Postgres (you can find the slides at http://t.co/lgFoLq37EC, and all of the talks have been recorded). The quality of the talks in general was quite good, but I don't want to repeat what you will find in other posts.
The unconference ran quite late into the evening. You can find its schedule, as well as minutes from some of the talks that happened (and some that didn't), here.
There was a special emphasis on the pluggable storage feature, although most agree that it will be very difficult to implement in the next few versions. A related topic was the Foreign Data Wrapper enhancements. The pluggable storage discussion continued afterwards. The main reason everybody agrees on this feature is that an API for storage would allow companies to collaborate on code rather than fork off into other projects.
There was also a long hallway discussion about migrations using pg_upgrade. On the replication side, the featured topics were bi-directional and logical replication.
The full-text search unconference discussion was pretty interesting. Oleg Bartunov and Alexander showed really interesting upcoming work on optimizing GIN indexes. According to their benchmarks, Postgres could improve performance significantly.
There were a lot of discussions I missed, due to the wide number of tracks and "hall spots". But the majority of attendees I heard from agreed that the unconference was quite exciting and opened the door to many new ideas.
During the closing session of PGCon this year, the core team announced the addition of four new committers to PostgreSQL:
These have all been involved in both writing new code for PostgreSQL and reviewing other people's patches over the last couple of development cycles. With this addition, we will increase our capacity to handle the rising number of contributions we get, and get even more features into the upcoming versions of PostgreSQL.
Welcome to the team!
PGCon certainly had some energizing talks and meetings this week. First, Jonathan Katz gave a tutorial about Postgres data types. Though I missed his talk, I just reviewed his slides and it is certainly a great tour of our powerful type system.
Second, Oleg Bartunov and Teodor Sigaev gave a great presentation about merging the very popular hstore and JSON data types into a new hierarchical hash data type, humorously code-named 'jstore'. (Their work is funded by Engine Yard.) This generated a huge amount of discussion, which has continued into today's unconference.
Third, Alvaro Hernandez Tortosa's talk The Billion Tables Project (slides) really stretched Postgres to a point where its high-table-count resilience and limitations became more visible. A significant amount of research was required to complete this project.
A while back I posted some SQL which helps keep track of changes to the PostgreSQL settings file. I've found it useful when benchmarking tests with different settings, but unfortunately the pg_settings_log() function needs to be run manually after each setting change. However, that sounds like something which a custom background worker (new in 9.3) could handle: basically, all the putative background worker would need to do is execute the pg_settings_log() function whenever the server starts (or restarts) or receives SIGHUP.
This turned out to be surprisingly easy to implement. Based on the example contrib module and Michael Paquier's excellent posts, this is the code. Basically, all it does is check for the presence of the required database objects (a function and a table) on startup, execute pg_settings_log() on startup, and add a signal handler for SIGHUP which also calls pg_settings_log().
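(The actual function and table are defined in the earlier post linked above; a simplified sketch of the kind of objects the worker checks for might look like this, with a snapshot-everything function standing in for the real change-tracking one:)

CREATE TABLE IF NOT EXISTS settings_log (
    logged_at timestamptz NOT NULL DEFAULT now(),
    name      text        NOT NULL,
    setting   text
);

-- Simplified stand-in: record every current setting on each call.
CREATE OR REPLACE FUNCTION pg_settings_log() RETURNS void AS $$
    INSERT INTO settings_log (name, setting)
    SELECT name, setting FROM pg_settings;
$$ LANGUAGE sql;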
This post was originally posted on Medium, a new blogging platform made up mostly of people who aren’t necessarily subscribed to Planet. So, please forgive the obvious statements, as the target audience is people who don’t know very much about Postgres.
Wednesday May 23, with no fanfare, Tom Lane’s move to Salesforce.com was made public on the Postgres developer wiki.
For 15 years, Tom has contributed code to Postgres, an advanced open source relational database that started development around the same time as MySQL but has lagged behind it in adoption amongst web developers. Tom’s move is part of a significant pattern of investment by large corporations in the future of Postgres.
For the past few years, Postgres development has accelerated. Built with developer add-ons in mind, Postgres offers things like PLV8 and an extensible replication system, which have held the interest of companies like NTT and captured the imagination of Heroku.
Tom has acted as a tireless sentry for this community. His role for many years, in addition to hacking on the most important core bits, was to defend quality and a “policy of least surprise” when implementing new features.
Development for this community is done primarily on a mailing list. Tom responds to so many contributor discussions that he’s been the top overall poster on those mailing lists since 2000, with over 85k messages.
Really, he’s a cultural touchstone for a community of developers that loves beautiful, correct code.
Someone asked: “What does [Tom’s move] mean for Postgres?”
You probably don’t remember this:
“Salesforce.com bases its entire cloud on Oracle database,” Ellison said, “but its database platform offering is PostgreSQL. I find that interesting.”
When I read that last October, I was filled with glee, quickly followed by terror. I love my small database community, my friends and my job. What if Oracle shifted its attention to our community and attacked it directly? So far, that hasn’t happened.
Instead, Salesforce advertised they were hiring “5 new engineers…and 40 to 50 mo…
Running accurate database benchmark tests is hard. I’ve managed to publish a good number of them without being embarrassed by errors in procedure or results, but today I have a retraction to make. Last year I did a conference talk called “Seeking PostgreSQL” that focused on worst-case situations for storage. And that, it turns out, had a giant error. The results for the Intel 320 Series SSD were much lower in some cases than they should have been, because the drive’s NCQ feature wasn’t working properly. When presenting this talk I had a few people push back that the results looked weird, and I was suspicious too. I have a correction to publish now, and I think the way this slipped by me is itself interesting. The full updated SeekingPostgres talk is also available, with all of the original graphs followed by an “Oops!” section showing the new data.
Native Command Queueing is an important optimization for seek-heavy workloads. When trying to optimize work for a mechanical disk drive, it’s very important to know where the drive’s head currently is when deciding where to go next. If you have a queued read for that same area of the drive, you want to service that one now, getting the I/O out of the way while you’re nearby, before moving to another physical area of the disk.
However, on an SSD, you might think that re-ordering commands isn’t that important. If reads are always inexpensive, taking a constant and small period of time on a flash device, their order doesn’t matter, right? Well, that’s wrong on a few counts. The idea that reads always take the same amount of time on an SSD is a popular misconception. There’s a bit of uncertainty around what else is happening in the drive. Flash cells are made of blocks larger than a single database read. What happens if you are reading 8K of a cell that is being rewritten right now, because someone is updating another 8K section? Coordinating that is likely to pause your read for a moment. It doesn’t take much lag at SSD speeds to result in a noticeable slowdown.
Thanks to Shaun M. Thomas, I was offered a digital copy of the “Instant PostgreSQL Backup” book from Packt Publishing, and was provided with the “Instant PostgreSQL Starter” book to review. Considering my current work situation, doing a lot of PostgreSQL advocacy and basic teaching, I was interested in reviewing this one…
As the Instant series motto says, it’s short and fast. I somewhat disagree with the “focused” part for this one, but that’s perfectly fine considering the aim of the book.
Years ago, when I was a kid, I discovered databases with a tiny MySQL-oriented book. It taught you the basics: how to install, basic SQL queries, some rudimentary PHP integration. This book reads a bit like its PostgreSQL-based counterpart. It’s a quick trip through installation, basic manipulation, and the (controversial) “Top 9 features you need to know about”. And that’s exactly the kind of book we need.
So, what’s inside? I’d say: what you need to kick-start with PostgreSQL.
The installation part is straightforward: download, click, done. Now you can launch pgAdmin, create a user and a database, and you’re done. Next time someone tells you PostgreSQL isn’t easy to install, show them this book.
The second part is a fast SQL discovery, covering a few PostgreSQL niceties. It’s damn simple: Create, Read, Update, Delete. You won’t learn about indexes, functions, or advanced queries here. For someone discovering SQL, it’s just what you need to know to get started…
The last part, “Top 9 features you need to know about”, is a bit harder to describe. PostgreSQL is an RDBMS with batteries included; choosing only 9 features must have been really hard for the author, and nobody can be blamed for not picking this or that feature you happen to like: too much choice… The author spends some time on pgcrypto, the RETURNING clause with serial, hstore, XML, even recursive queries… This is, from my point of view, the troublesome part of the book: mentioning all these features means introducing complicated SQL queries. I would never te…
I’ve been hacking on a tool to allow resynchronizing an old master server after failover. Please take a look: https://github.com/vmware/pg_rewind.
We just concluded the PgCon Developer Meeting. I had two big items. First, EnterpriseDB has dedicated staff to start work on parallelizing Postgres queries, particularly in-memory sorts; I have previously expressed the importance (and complexity) of parallelism. Second, EnterpriseDB has also dedicated staff to help improve Pgpool-II. Pgpool is the Swiss army knife of replication tools, and I am hopeful that additional development work will further increase its popularity.
The Developer Meeting notes (summary) have lots of additional information about the big things coming from everyone next year.
One of OmniTI's clients requested help with a sample application that inserts JSON data into Postgres using the Java JDBC driver. I'm not a Java expert, so it took me a while to write a simple Java program to insert data. TBH, I got help writing the test application from one of our Java engineers at OmniTI. Now the test application is ready, and the next step is to make it work with the JSON datatype! After struggling a little to find a workaround for string escaping in the Java code, I stumbled upon a data type issue!
The test application connects to my local Postgres installation and inserts JSON data into this sample table:
postgres=# \d sample
Table "public.sample"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
data | json |
denishs-MacBook-Air-2:java denish$ java -cp $CLASSPATH PgJSONExample
-------- PostgreSQL JDBC Connection Testing ------------
PostgreSQL JDBC Driver Registered!
You made it, take control your database now!
Something exploded running the insert: ERROR: column "data" is of
type json but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 42
After some research, I found out that there is no standard JSON type on the Java side, so adding support for json to the Postgres JDBC driver is not straightforward! A StackOverflow answer helped me test the JSON datatype handling at the psql level. As Craig mentioned in the answer, the correct way to solve this problem is to write a custom Java mapping type that uses the JDBC setObject method. This can be a bit tricky, though. A simpler workaround is to tell PostgreSQL to cast implicitly from text to json:
postgres=# create cast (text as json) without function as implicit;
CREATE CAST
The WITHOUT FUNCTION clause is used because text and json have the same on-disk and in-memory representation; they're basically just aliases for the same data type. AS IMPLICIT tells PostgreSQL it can convert without being explicitly told to, allowing things like this to work:
postgres=# prepare test(tex…
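(The statement above is cut off in the original; a hypothetical completion, assuming the sample table from earlier, might look like this:)

postgres=# prepare test(text) as insert into sample (id, data) values (1, $1);
PREPARE
postgres=# execute test('{"name": "test"}');
INSERT 0 1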
One of the things that really frustrated me about the KNN GIST distance operators (bounding box and box centroid) that came in PostgreSQL 9.1 and PostGIS 2.0 was the fact that one of the elements needed to be a constant to take advantage of the index. In PostGIS speak, this meant you couldn't put it in the FROM clause and could only enjoy it in one of two ways.
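(To illustrate the restriction with a made-up PostGIS table: the first query can use the KNN index because one side of the operator is a constant, while the second, with both sides coming from the FROM clause, cannot.)

-- Index-assisted: the right-hand side of <-> is a constant point.
SELECT name
  FROM places
 ORDER BY geom <-> ST_SetSRID(ST_MakePoint(-71.06, 42.36), 4326)
 LIMIT 10;

-- Not index-assisted in 9.1/2.0: both operands come from the FROM clause.
SELECT a.name, b.name
  FROM places a, places b
 ORDER BY a.geom <-> b.geom
 LIMIT 10;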
Having recently posted some thoughts on Shaun Thomas' "PostgreSQL Backup and Restore How-to", I was asked by Packt if I'd like to review the new "Instant PostgreSQL Starter" by Daniel K. Lyons and was kindly provided with access to the ebook version. As I'm happily in a situation where I may need to introduce PostgreSQL to new users, I was interested in taking a look, and here's a quick overview.
It follows the same "Instant" format as the backup booklet, which I quite like as it provides a useful way of focussing on particular aspects of PostgreSQL without getting bogged down in reams of tl;dr documentation. "Instant Pg Starter" is divided into three sections:
- Installation
- Quick start – creating your first table
- Top 9 features you need to know about

It occurs to me I forgot to congratulate the winners of the free ebooks. So without further ado:
Congrats to the winners. But more than that, I call upon them to pay it forward by contributing to the community, either by corresponding on the excellent PostgreSQL mailing lists, or maybe submitting a patch or two to the code. There’s a lot of ground to cover, and more warm bodies always help.
Thanks again, everyone!