Speaking as long-term (>15 years) user of Postgres in science, I am getting worried about the lack of columnar type of storage in Postgresql. As the datasets become bigger and bigger, the limitations of PG's storage are becoming more and more significant. I know there are various extensions (i.e. cetus) that may offer such functionality, but then you depend on that extension being supported in the future, as well additional complexity.
Postgres's whole origin story is basically to think outside the box and don't be constrained by existing thinking. Stonebraker thought existing databases were too limited in terms of their data types and expressiveness. He started Postgres as an evolution of Ingres (Postgres = Post Ingres) that added rich custom data types and a rewriting system based on rules.
Columnar and all the other fun stuff (JSON, GIS, inverted indexes, embedding vectors) is a natural progression of that thinking. With TimescaleDB, Hydra, Citus, pg_mooncake, etc. becoming very popular the last few years, there is a clear demand for an integrated experience.
(Stonebraker also thought one database shouldn't do everything, as described in his early 2000s "One Size Does Not Fit All" paper, and Stonebraker branched out into HStore/Vertica for columnar. In hindsight, I think that was appropriate for the time, but no longer a significant concern.)
PostgreSQL was a right tool for my task for many years. It is a question for PostgreSQL can adapt to a new reality of much bigger datasets or I have to switch to a new tool. And I am not the only user of Postgresql in this context. So it is easy to say in vacuum 'you are using the wrong database', but it's not something that can be easily changed with 100s of Tb of data, existing user workflows etc.
Just because it's easy to say doesn't mean it's wrong.
By the same logic, you could say Microsoft Access should have all the capabilities of Postgres because it's painful for small businesses to move off of it when it's no longer a good fit for their needs.
Age old product question. The Honda Civic is much larger today than the original, because the original targeted young people on a budget. As they aged and had kids, they needed more room in the car.
Most peoples’ insinct is to buy a newer version of the car they love, so the civic grew to accommodate the aging market.
No real point here other than an observation about how the installed base’s needs change, across industries.
I mean I don't disagree with you, but they did just add a graph database feature, which is about as orthogonal to relational database design as you can get.
I've used Postgres, Oracle, MsSql Server, and MySql in serious projects, no extensive experience with Sqlite, which I know is an amazing player.
These days, I do myself a favor and always avoid Oracle and MySql/MariaDB.
Postgres is amazing, and the two big things I wished it had:
1. lightweight connection; connection bouncers improve the situation, but you still have an unreasonably high memory footprint per concurrent connection.
2. Synchronously updated materialized views (Sql Server calls them indexed views). These are incredible tools in complex data situations. I saw a project struggle with complex technical implementations that would be elegant, trivial and always correct with indexed views.
Sql Server can be costly, but in many cases the benefits it provides are totally worth the cost.
Choosing the data store carefully prevents lots of future trouble.
SQLite and MSSQL are my two solutions for relational storage problems.
If I am going to use a "free" provider, SQLite is impossible to beat. They cover a majority of use cases today. SQLite starts to fall apart with backup, replication and tooling. If I am on the hook for things like system availability and disaster recovery, I don't have a problem spending money to cover my ass.
If I am going to pay any amount of money at all, I am going all the way. The developer experience around MSSQL is untouchable. SSMS and VS with sql projects runs circles around contemporary entity framework crap. Sprinkle in 3rd party tools from vendors like RedGate and you can replace multi-million dollar consulting packages.
I wouldn't ever advocate for standing up a new Oracle or DB2 machine, but if one was already in place I'd probably die on the hill of not trying to refactor it away. These databases typically come with multi-volume ghost stories attached. Reinventing all those weird effects on a new engine will typically kill the business if there are no other options available.
1. "materialize" the view as a full table, then index that. Any reasonable pipeline/ETL tool can provide incremental updates between tables. Obviously, anything materialized requires considerations around storage, replication, backup/restore, I/O, etc.
2. use a regular VIEW and index (precisely) the underlying expressions mentioned in the view, i.e. so when the view is used, then the indexes get used.
Both require rewriting SQL, though I've used VIEWs to make the change transparent.
Care to share some examples where SQL Server's indexed views would shine?
In my eyes they're similar to triggers, which incur a high performance overhead in OLTP systems and are shunned by developers. In OLAP systems custom ETL code will likely outperform them.
Indexed views are much faster than trying to achieve the same result with triggers. Triggers have serious concurrency limitations, and you do recalculations even when the fields you depend on are not touched.
Indexed views are not much worse than indexes. Of course, when they refer to other tables there are underlying data lookups, but in our experience when we moved from triggers to indexed views, large scale data ingestion went way faster.
Where we used it: While revamping a large scale sales program, we stored the warehouse in/out in one table, and several things like current stock were calculated using indexed views.
Bonus: Using Snapshot concurrency control, you can do many things concurrently, and only when they both updates to a certain product in the same store you'll get the second transaction failing (which could be retried on the backend).
The fact that they are completely in-sync with your data is amazing.
These things exist to eliminate the risk of ever serving stale information from a materialised view. I.e., their benefit is political/reputational as much as they are technical in the sense that they save you effort like remembering to invalidate a MV after an ingest operation.
Stale MV is a thing you only ever burn your fingers on once. Like how "It's not DNS" is a common meme in networking.
Don't know of anything wrong with MariaDB, but there used to be plenty wrong with MySQL. To give the most egregious example (THANKFULLY fixed in MariaDB, but was present in MySQL for the longest time), inserting the value 128 into a TINYINT column (signed 8-bit int) would clamp the value rather than returning an error. Which might be what you want... except if that was a primary key column. Marvel at the following, which used to be how MySQL behaved:
CREATE DATABASE foo;
USE foo;
CREATE TABLE one ( id TINYINT NOT NULL PRIMARY KEY ) TYPE=InnoDB ;
CREATE TABLE two (
id TINYINT NOT NULL PRIMARY KEY,
INDEX (id),
CONSTRAINT id_fkey FOREIGN KEY (id) REFERENCES one(id)
) TYPE=InnoDB ;
Now that we've created both tables, let's insert a record into table one:
INSERT INTO one VALUES (127);
And now let's insert a record with a different primary key into table two:
INSERT INTO two VALUES (128);
MariaDB will give you an error at this point (ERROR 1264 (22003): Out of range value for column 'id' at row 1), but MySQL (at least back when I tried this about ten years ago, which was the last time I was forced to work with MySQL — and I am so glad I never have to go back!) would return no error message and just say "Query OK, 1 row affected (0.009 sec)".
Now let's select the value we inserted into table "two":
SELECT * FROM two;
And what do we see? The value 127, even though we inserted 128. Which has created a foreign-key relationship to table "one" that we never intended to put in there.
There are other reasons why MySQL was inadequate, but I no longer remember them. Probably MariaDB has fixed them by now. But I no longer have to use MySQL/MariaDB for anything, and I never want to go back. I have a VERY strong averse reaction, caused by past pain, when I think of using MariaDB. (I actually spun up a virtual machine to test what I wrote here, because there's no way I was going to install MariaDB on my primary work machine).
Just a set of things too minor to move off of it but annoying enough to not want to start with it.
My list:
No `explain (analyze,buffers)`. Instant DDL has some warts (e.g. fk, metadata locks). Query planning bugs (actually... query planning in general is disappointing). Exiting the repl doesn't stop queries. Implicit type casting. Replication lag from large DDL (e.g. creating an index). Lack of two phase DDL (creating constraints NOT VALID and then VALIDATE later). Lack of extensions (e.g. pg_vector). No safe access to inspect buffer cache. AWS Aurora seems to only add shiny new things to Postgres. And more.
Again, none of this is quite enough to migrate off of it for an established system, but certainly enough to avoid it on a new project.
I am currently fighting my way off SQL Server towards PostgreSQL.
Windows Server is a real pain to operate and the SQL Server ecosystem expects you to run a lot of add-ons on the server alongside your database. Those don’t translate to managed database services, so you lose a lot of functionality if you jump to RDS or similar.
The first party tools are also aging poorly. SSIS and SSRS are not fun. SSMS is ok for what it is but can’t compete with the ecosystem around PostgreSQL.
Maybe I’m missing something but I can’t wait to ditch it.
Agree about Windows Server. You can run SqlServer on Linux though. I'm not aware of your specific addons, but the Sql Server itself works perfectly well on Linux.
The graph database feature looks interesting, but I wonder...
SELECT customer_name FROM GRAPH_TABLE (myshop MATCH (c IS customers)-[IS customer_orders]->(o IS orders WHERE o.ordered_when = current_date) COLUMNS (c.name AS customer_name));
That is _awful_ syntax; it is reminiscent of neo4j, which is surely not a tool anyone serious should copy from outright in 2026.
And of course the final thing I am left wondering is if it's fast. Row-level security is such a useful feature and yet only a fool would contemplate building anything serious with Postgres', as the planner goes haywire and does per-row-matching, nuking performance.
i like the COPY and logical replication improvements. Currently I back up my PG database with a sidecar Databasus instance that is heavier than my entire backend + DB + Caddy!
(LLM writing rant below)
---
> That alone tells you something: Users had a real need, and the ecosystem filled the gap.
> This sounds straightforward, but it solves a real operational problem.
> None of these change the world. All of them make day-to-day data workflows better.
> The easy thing to do here is list planner changes and call it done. But the more useful takeaway is this: Postgres keeps getting better at recognizing the shape of common queries and doing less unnecessary work.
> [Proceed to list planner changes]
If Orwell were alive today, he might declare himself illiterate in English and learn Klingon just to avoid having to read these.
Wow. Incredible how this was not mentioned in the OP. I had done it with tcn triggers and adding "_archive" shadown tables manually with tcn (https://www.postgresql.org/docs/current/tcn.html), but doing it natively is gonna be, as per most postgresql implementations, wonderful.
I'm wondering. Just wondering? Will they ever support multiple storage engines like MariaDB? Having a storage engine that support OLTP or OLAP or append-only would be cool. I totally understand if they don't want to do that.
I'm dreaming of block compression in Postgresql, instead of only row compression, too limited to be effective.
I know you can store your data on a Zfs pool with block compression, but having it native would remove the burdain of setting this up and maybe better perf.
I can't decide whether this person writes in the type of style that was apparently overrepresented in LLM training, or whether they heavily used AI to spruce up their writing. I'm learning towards the latter.
Spruce up is unreasonably charitable. I'm more irritated that the authorship information is misleading. craig-kerstiens is not available on Huggingface, and yet not a single sentence in this article seems to have been typed on a keyboard.
When Claude writes things like "as someone who has spent a lot of time doing X", I think this is also a kind of failure of alignment. LLMs shouldn't write as if they had personal experience. It's something a person might say in the training data, but I just think LLMs shouldn't claim life experience they don't have, even if that's a statistically likely sequence of tokens.
These low effort constant comments about style or formatting are against Hackernews guidelines for discussions and something needs to be done to clean up the comment section. Getting to a ridiculous point
They aren't actually. They make me feel like I'm not going crazy - that I'm not the only person noticing that the quality of the average article on hacker news has dropped off a cliff in the last 6 months. Links from different people with different cultures, life experiences, and languages have the same tone of voice, the same sentence structure, and the same breathless, boring, staccato yet arrhythmic, emotive yet soulless style.
I hope everyone keeps pointing it out. Even better, change the site guidelines to make AI generated articles a flaggable offense. It's already been done for comments.
It has extremely low false positive rate in various tests (e. g. https://arxiv.org/html/2501.15654v1), I've never seen it mark something I knew was human written as AI.
I get it's in vogue to take stabs at whether or not AI was used. But I think the more useful approach is to instead be critical of the end product, if you have criticisms of it.
This actually gave me an amusing idea: a book review club that strictly reviews the cover art, book binding, hand feel, paper weight, font, etc. of the book.
Interesting, I was wondering why Snowflake was investing in PostgreSQL. Looks like Snowflake bought Crunchy Data and Databricks bought NEON… so the two leading DWaaS companies have managed PostgreSQL offerings now.
It’s like saying that you’re getting worried Apple doesn’t sell washing machines.
Columnar and all the other fun stuff (JSON, GIS, inverted indexes, embedding vectors) is a natural progression of that thinking. With TimescaleDB, Hydra, Citus, pg_mooncake, etc. becoming very popular the last few years, there is a clear demand for an integrated experience.
(Stonebraker also thought one database shouldn't do everything, as described in his early 2000s "One Size Does Not Fit All" paper, and Stonebraker branched out into HStore/Vertica for columnar. In hindsight, I think that was appropriate for the time, but no longer a significant concern.)
By the same logic, you could say Microsoft Access should have all the capabilities of Postgres because it's painful for small businesses to move off of it when it's no longer a good fit for their needs.
No real point here other than an observation about how the installed base’s needs change, across industries.
There was a big fanfare about orioledb a while ago, and i think it got bought by people that wanted to push that into mainstream postgres?
Did it die somewhere along the road?
https://supabase.com/blog/orioledb-launch
And they continue to work on it.
https://supabase.com/blog/orioledb-patent-free
These days, I do myself a favor and always avoid Oracle and MySql/MariaDB.
Postgres is amazing, and the two big things I wished it had:
1. lightweight connection; connection bouncers improve the situation, but you still have an unreasonably high memory footprint per concurrent connection.
2. Synchronously updated materialized views (Sql Server calls them indexed views). These are incredible tools in complex data situations. I saw a project struggle with complex technical implementations that would be elegant, trivial and always correct with indexed views.
Sql Server can be costly, but in many cases the benefits it provides are totally worth the cost.
Choosing the data store carefully prevents lots of future trouble.
If I am going to use a "free" provider, SQLite is impossible to beat. They cover a majority of use cases today. SQLite starts to fall apart with backup, replication and tooling. If I am on the hook for things like system availability and disaster recovery, I don't have a problem spending money to cover my ass.
If I am going to pay any amount of money at all, I am going all the way. The developer experience around MSSQL is untouchable. SSMS and VS with sql projects runs circles around contemporary entity framework crap. Sprinkle in 3rd party tools from vendors like RedGate and you can replace multi-million dollar consulting packages.
I wouldn't ever advocate for standing up a new Oracle or DB2 machine, but if one was already in place I'd probably die on the hill of not trying to refactor it away. These databases typically come with multi-volume ghost stories attached. Reinventing all those weird effects on a new engine will typically kill the business if there are no other options available.
1. "materialize" the view as a full table, then index that. Any reasonable pipeline/ETL tool can provide incremental updates between tables. Obviously, anything materialized requires considerations around storage, replication, backup/restore, I/O, etc.
2. use a regular VIEW and index (precisely) the underlying expressions mentioned in the view, i.e. so when the view is used, then the indexes get used.
Both require rewriting SQL, though I've used VIEWs to make the change transparent.
In my eyes they're similar to triggers, which incur a high performance overhead in OLTP systems and are shunned by developers. In OLAP systems custom ETL code will likely outperform them.
Indexed views are not much worse than indexes. Of course, when they refer to other tables there are underlying data lookups, but in our experience when we moved from triggers to indexed views, large scale data ingestion went way faster.
Where we used it: While revamping a large scale sales program, we stored the warehouse in/out in one table, and several things like current stock were calculated using indexed views.
Bonus: Using Snapshot concurrency control, you can do many things concurrently, and only when they both updates to a certain product in the same store you'll get the second transaction failing (which could be retried on the backend).
The fact that they are completely in-sync with your data is amazing.
Stale MV is a thing you only ever burn your fingers on once. Like how "It's not DNS" is a common meme in networking.
So what's wrong with MySQL or MariaDB?
Note: the below taken nearly verbatim from https://sql-info.de/mysql/referential-integrity.html#3_5
Now that we've created both tables, let's insert a record into table one: And now let's insert a record with a different primary key into table two: MariaDB will give you an error at this point (ERROR 1264 (22003): Out of range value for column 'id' at row 1), but MySQL (at least back when I tried this about ten years ago, which was the last time I was forced to work with MySQL — and I am so glad I never have to go back!) would return no error message and just say "Query OK, 1 row affected (0.009 sec)".Now let's select the value we inserted into table "two":
And what do we see? The value 127, even though we inserted 128. Which has created a foreign-key relationship to table "one" that we never intended to put in there.There are other reasons why MySQL was inadequate, but I no longer remember them. Probably MariaDB has fixed them by now. But I no longer have to use MySQL/MariaDB for anything, and I never want to go back. I have a VERY strong averse reaction, caused by past pain, when I think of using MariaDB. (I actually spun up a virtual machine to test what I wrote here, because there's no way I was going to install MariaDB on my primary work machine).
My list:
No `explain (analyze,buffers)`. Instant DDL has some warts (e.g. fk, metadata locks). Query planning bugs (actually... query planning in general is disappointing). Exiting the repl doesn't stop queries. Implicit type casting. Replication lag from large DDL (e.g. creating an index). Lack of two phase DDL (creating constraints NOT VALID and then VALIDATE later). Lack of extensions (e.g. pg_vector). No safe access to inspect buffer cache. AWS Aurora seems to only add shiny new things to Postgres. And more.
Again, none of this is quite enough to migrate off of it for an established system, but certainly enough to avoid it on a new project.
Windows Server is a real pain to operate and the SQL Server ecosystem expects you to run a lot of add-ons on the server alongside your database. Those don’t translate to managed database services, so you lose a lot of functionality if you jump to RDS or similar.
The first party tools are also aging poorly. SSIS and SSRS are not fun. SSMS is ok for what it is but can’t compete with the ecosystem around PostgreSQL.
Maybe I’m missing something but I can’t wait to ditch it.
SELECT customer_name FROM GRAPH_TABLE (myshop MATCH (c IS customers)-[IS customer_orders]->(o IS orders WHERE o.ordered_when = current_date) COLUMNS (c.name AS customer_name));
That is _awful_ syntax; it is reminiscent of neo4j, which is surely not a tool anyone serious should copy from outright in 2026.
And of course the final thing I am left wondering is if it's fast. Row-level security is such a useful feature and yet only a fool would contemplate building anything serious with Postgres', as the planner goes haywire and does per-row-matching, nuking performance.
(LLM writing rant below)
---
> That alone tells you something: Users had a real need, and the ecosystem filled the gap.
> This sounds straightforward, but it solves a real operational problem.
> None of these change the world. All of them make day-to-day data workflows better.
> The easy thing to do here is list planner changes and call it done. But the more useful takeaway is this: Postgres keeps getting better at recognizing the shape of common queries and doing less unnecessary work.
> [Proceed to list planner changes]
If Orwell were alive today, he might declare himself illiterate in English and learn Klingon just to avoid having to read these.
https://news.ycombinator.com/item?id=48413655
Something like rocksdb as PG backend would be fantastic. Yugabyte does this but it's not PG.
https://vldb.org/cidrdb/2026/a-multi-tenant-relational-oltp-...
Context: https://www.orioledb.com/docs#:~:text=OrioleDB%20currently%2...
When Claude writes things like "as someone who has spent a lot of time doing X", I think this is also a kind of failure of alignment. LLMs shouldn't write as if they had personal experience. It's something a person might say in the training data, but I just think LLMs shouldn't claim life experience they don't have, even if that's a statistically likely sequence of tokens.
I hope everyone keeps pointing it out. Even better, change the site guidelines to make AI generated articles a flaggable offense. It's already been done for comments.