diff --git a/content/blog/2025-09-29-datafusion-50.0.0.md b/content/blog/2025-09-29-datafusion-50.0.0.md new file mode 100644 index 00000000..e8548b75 --- /dev/null +++ b/content/blog/2025-09-29-datafusion-50.0.0.md @@ -0,0 +1,464 @@ +--- +layout: post +title: Apache DataFusion 50.0.0 Released +date: 2025-09-29 +author: pmc +categories: [release] +--- + + + +[TOC] + + + +## Introduction + +We are proud to announce the release of [DataFusion 50.0.0]. This blog post +highlights some of the major improvements since the release of [DataFusion +49.0.0]. The complete list of changes is available in the [changelog]. +Thanks to [numerous contributors] for making this release possible! + +[DataFusion 50.0.0]: https://crates.io/crates/datafusion/50.0.0 +[DataFusion 49.0.0]: https://datafusion.apache.org/blog/2025/07/28/datafusion-49.0.0/ +[changelog]: https://github.com/apache/datafusion/blob/branch-50/dev/changelog/50.0.0.md +[numerous contributors]: https://github.com/apache/datafusion/blob/branch-50/dev/changelog/50.0.0.md#credits + + +## Performance Improvements 🚀 + +DataFusion continues to focus on enhancing performance, as shown in ClickBench +and other benchmark results. + +ClickBench performance results over time for DataFusion + +**Figure 1**: Average and median normalized query execution times for ClickBench queries for each git revision. +Query times are normalized using the ClickBench definition. See the +[DataFusion Benchmarking Page](https://alamb.github.io/datafusion-benchmarking/) +for more details. + +Here are some noteworthy optimizations added since DataFusion 49: + +**Dynamic Filter Pushdown Improvements** + +The dynamic filter pushdown optimization, which allows runtime filters to cut +down on the amount of data read, has been extended to support **inner hash +joins**, dramatically improving performance when one relation is relatively +small or filtered by a highly selective predicate. 
More details can be found in +the [Dynamic Filter Pushdown for Hash Joins](#dynamic-filter-pushdown-for-hash-joins) section below. +The dynamic filters in the TopK operator have also been improved in DataFusion +50.0.0, further increasing the effectiveness and efficiency of the optimization. +More details can be found in this +[ticket](https://github.com/apache/datafusion/pull/16433). + +**Nested Loop Join Optimization** + +The nested loop join operator has been rewritten to reduce execution time and memory +usage by adopting a finer-grained approach. Specifically, we now limit the +intermediate data size to around a single `RecordBatch` for better memory +efficiency, and we have eliminated redundant conversions from the old +implementation to further improve execution speed. +When evaluating this new approach in a microbenchmark, we measured up to a 5x +improvement in execution time and a 99% reduction in memory usage. More details and +results can be found in this +[ticket](https://github.com/apache/datafusion/pull/16996). + +**Parquet Metadata Caching** + +DataFusion now automatically caches the metadata of Parquet files (statistics, +page indexes, etc.), to avoid unnecessary disk/network round-trips. This is +especially useful when querying the same table multiple times over relatively +slow networks, allowing us to achieve an order of magnitude faster execution +time when running many small reads over large files. More information can be +found in the [Parquet Metadata Cache](#parquet-metadata-cache) section. + +## Community Growth 📈 + +Between `49.0.0` and `50.0.0`, we continue to see our community grow: + +1. Qi Zhu ([zhuqi-lucas](https://github.com/zhuqi-lucas)) and Yoav Cohen + ([yoavcloud](https://github.com/yoavcloud)) became committers. See the + [mailing list] for more details. +2. 
In the [core DataFusion repo] alone, we reviewed and accepted 318 PRs + from 79 different committers, created over 235 issues, and closed 197 of them + 🚀. All changes are listed in the detailed [changelogs]. +3. DataFusion published several blogs, including *[Using External Indexes, Metadata Stores, Catalogs and + Caches to Accelerate Queries on Apache Parquet]*, *[Dynamic Filters: + Passing Information Between Operators During Execution for 25x Faster + Queries]*, and *[Implementing User Defined Types and Custom Metadata + in DataFusion]*. + + + + +[core DataFusion repo]: https://github.com/apache/arrow-datafusion +[changelogs]: https://github.com/apache/datafusion/tree/main/dev/changelog +[mailing list]: https://lists.apache.org/list.html?dev@datafusion.apache.org +[Using External Indexes, Metadata Stores, Catalogs and Caches to Accelerate Queries on Apache Parquet]: https://datafusion.apache.org/blog/2025/08/15/external-parquet-indexes/ +[Dynamic Filters: Passing Information Between Operators During Execution for 25x Faster Queries]: https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/ +[Implementing User Defined Types and Custom Metadata in DataFusion]: https://datafusion.apache.org/blog/2025/09/21/custom-types-using-metadata/ + +## New Features ✨ + +### Improved Spilling Sorts for Larger-than-Memory Datasets + +DataFusion has long been able to sort datasets that do not fit entirely in memory, +but still struggled with particularly large inputs or highly memory-constrained +setups. Larger-than-memory sorts in DataFusion 50.0.0 have been improved with the recent introduction +of multi-level merge sorts (more details in the respective +[ticket](https://github.com/apache/datafusion/pull/15700)). It is now +possible to execute almost any sorting query that would have previously triggered *out-of-memory* +errors, by relying on disk spilling. 
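+
+For example, you can exercise the spilling path by capping DataFusion's memory
+budget. In `datafusion-cli`, the memory limit and pool type can be set at
+startup (`large.parquet` here is a hypothetical file whose sort state does not
+fit in the configured budget):
+
+```sql
+-- started with: datafusion-cli --memory-limit 1g --mem-pool-type fair
+-- a sort whose intermediate state exceeds the 1 GB budget now spills to
+-- disk instead of failing with a resources-exhausted error
+> SELECT * FROM 'large.parquet' ORDER BY k DESC;
+```
+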
Thanks to [Raz Luvaton], [Yongting You], and +[ding-young] for delivering this feature. + +[Raz Luvaton]: https://github.com/rluvaton +[Yongting You]: https://github.com/2010YOUY01 +[ding-young]: https://github.com/ding-young + +### Dynamic Filter Pushdown for Hash Joins + +The [dynamic filter pushdown +optimization](https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/) +has been extended to inner hash joins, dramatically reducing the amount of +scanned data in some workloads—a technique sometimes referred to as +[*Sideways Information Passing*](https://www.cs.cmu.edu/~15721-f24/papers/Sideways_Information_Passing.pdf). + +These filters are automatically applied to inner hash joins, while future work +will introduce them to other join types. + +For example, given a query that looks for a specific customer and +their orders, DataFusion can now filter the `orders` relation based on the +`c_custkey` of the target customer, reducing the amount of data +read from disk by orders of magnitude. + +```sql +-- retrieve the orders of the customer with c_phone = '25-989-741-2988' +SELECT * +FROM customer +JOIN orders ON c_custkey = o_custkey +WHERE c_phone = '25-989-741-2988'; +``` + +The following shows an execution plan in DataFusion 50.0.0 with this optimization: + +```sql +HashJoinExec + DataSourceExec: <-- read customer + predicate=c_phone@4 = 25-989-741-2988 + metrics=[output_rows=1, ...] + DataSourceExec: <-- read orders + -- dynamic filter is added here, filtering directly at scan time + predicate=DynamicFilterPhysicalExpr [ o_custkey@1 >= 1 AND o_custkey@1 <= 1 ] + -- the number of output rows is kept to a minimum + metrics=[output_rows=11, ...] +``` + +Because there is a single customer in this query, +almost all rows from `orders` are filtered out by the join. 
+In previous versions of DataFusion, the entire `orders` relation would be +scanned to join with the target customer, but now the dynamic filter pushdown can +filter it right at the source, minimizing the amount of data decoded. + +More information can be found in the respective +[ticket](https://github.com/apache/datafusion/pull/16445) and the next step will be to +[extend the dynamic filters to other types of joins](https://github.com/apache/datafusion/issues/16973), such as `LEFT` and +`RIGHT` outer joins. Thanks to [Adrian Garcia Badaracco], [Qi Zhu], [xudong963], [Daniël Heres], and [Lía Adriana] +for delivering this feature. + +[Adrian Garcia Badaracco]: https://github.com/adriangb +[Qi Zhu]: https://github.com/zhuqi-lucas +[xudong963]: https://github.com/xudong963 +[Daniël Heres]: https://github.com/Dandandan +[Lía Adriana]: https://github.com/LiaCastaneda + +### Parquet Metadata Cache + +The metadata of Parquet files (statistics, page indexes, etc.) is now +automatically cached when using the built-in [ListingTable], which reduces disk/network round-trips and repeated decoding +of the same information. With a simple microbenchmark that executes point reads +(e.g., `SELECT v FROM t WHERE k = x`) over large files, we measured a 12x +improvement in execution time (more details can be found in the respective +[ticket](https://github.com/apache/datafusion/pull/16971)). This optimization +is production ready and enabled by default (more details in the +[Epic](https://github.com/apache/datafusion/issues/17000)). +Thanks to [Nuno Faria], [Jonathan Chen], [Shehab Amin], [Oleks V], [Tim Saucer], and [Blake Orth] for delivering this feature. 
+ +[ListingTable]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html +[Nuno Faria]: https://github.com/nuno-faria +[Jonathan Chen]: https://github.com/jonathanc-n +[Shehab Amin]: https://github.com/shehabgamin +[Oleks V]: https://github.com/comphead +[Tim Saucer]: https://github.com/timsaucer +[Blake Orth]: https://github.com/BlakeOrth + +Here is an example of the metadata cache in action: + +```sql +-- disabling the metadata cache +> SET datafusion.runtime.metadata_cache_limit = '0M'; + +-- simple query (t.parquet: 100M rows, 3 cols) +> EXPLAIN ANALYZE SELECT * FROM 't.parquet' LIMIT 1; +DataSourceExec: ... metrics=[..., metadata_load_time=229.196422ms, ...] +Elapsed 0.246 seconds. + +-- enabling the metadata cache +> SET datafusion.runtime.metadata_cache_limit = '50M'; + +> EXPLAIN ANALYZE SELECT * FROM 't.parquet' LIMIT 1; +DataSourceExec: ... metrics=[..., metadata_load_time=228.612µs, ...] +Elapsed 0.003 seconds. -- 82x improvement in this specific query +``` + + +The cache can be configured with the following runtime parameter: + +```sql +datafusion.runtime.metadata_cache_limit +``` + +The default [`FileMetadataCache`](https://docs.rs/datafusion/latest/datafusion/execution/cache/cache_manager/trait.FileMetadataCache.html) uses a +least-recently-used eviction algorithm and up to 50MB of memory. +If the underlying file changes, the cache is automatically invalidated. +Setting the limit to 0 will disable any metadata caching. As with most APIs in +DataFusion, users can provide their own behavior using a custom +[`FileMetadataCache`](https://docs.rs/datafusion/50.0.0/datafusion/execution/cache/cache_manager/trait.FileMetadataCache.html) +implementation when setting up the [`RuntimeEnv`](https://docs.rs/datafusion/latest/datafusion/execution/runtime_env/struct.RuntimeEnv.html). 
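+
+To verify the limit currently in effect, the setting can be inspected like any
+other configuration value (assuming `information_schema` is enabled, as it is
+by default in `datafusion-cli`):
+
+```sql
+> SELECT value FROM information_schema.df_settings
+  WHERE name = 'datafusion.runtime.metadata_cache_limit';
+```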
+
+For users with a custom [`TableProvider`](https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html):
+
+* If the custom provider uses the
+[`ParquetFormat`](https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetFormat.html), caching will work
+without any changes.
+
+* Otherwise, the
+[`CachedParquetFileReaderFactory`](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/parquet/struct.CachedParquetFileReaderFactory.html)
+can be provided when creating a
+[`ParquetSource`](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/struct.ParquetSource.html).
+
+Users can inspect the cache contents through the
+[`FileMetadataCache::list_entries`](https://docs.rs/datafusion/latest/datafusion/execution/cache/cache_manager/trait.FileMetadataCache.html#tymethod.list_entries)
+method, or with the
+[`metadata_cache()`](https://datafusion.apache.org/user-guide/cli/functions.html#metadata-cache)
+function in `datafusion-cli`:
+
+
+```sql
+> SELECT * FROM metadata_cache();
++---------------+-------------------------+-----------------+--------------------------+---------+---------------------+------+-----------------+
+| path          | file_modified           | file_size_bytes | e_tag                    | version | metadata_size_bytes | hits | extra           |
++---------------+-------------------------+-----------------+--------------------------+---------+---------------------+------+-----------------+
+| .../t.parquet | 2025-09-21T17:40:13.650 | 420827020       | 0-63f5331fb4458-19154f8c | NULL    | 44480534            | 27   | page_index=true |
++---------------+-------------------------+-----------------+--------------------------+---------+---------------------+------+-----------------+
+1 row(s) fetched.
+Elapsed 0.003 seconds.
+``` + + +### `QUALIFY` Clause + +DataFusion now supports the `QUALIFY` SQL clause +([#16933](https://github.com/apache/datafusion/pull/16933)), which simplifies +filtering window function output (similar to how `HAVING` filters +aggregation output). + +For example, filtering the output of the `rank()` function previously +required a query like this: + +```sql +SELECT a, b, c +FROM ( + SELECT a, b, c, rank() OVER(PARTITION BY a ORDER BY b) as rk + FROM t +) +WHERE rk = 1 +``` + +The same query can now be written like this: +```sql +SELECT a, b, c, rank() OVER(PARTITION BY a ORDER BY b) as rk +FROM t +QUALIFY rk = 1 +``` + +Although it is not part of the SQL standard (yet), it has been gaining +adoption in several SQL analytical systems such as DuckDB, Snowflake, and +BigQuery. Thanks to [Huaijin] and [Jonah Gao] for delivering this feature. + +[Huaijin]: https://github.com/haohuaijin +[Jonah Gao]: https://github.com/jonahgao + +### `FILTER` Support for Window Functions + +Continuing the theme, the `FILTER` clause has been extended to support +[aggregate window functions](https://github.com/apache/datafusion/pull/17378). +It allows these functions to apply to specific rows without having to +rely on `CASE` expressions, similar to what was already possible with regular +aggregate functions. + +For example, we can gather multiple distinct sets of values matching different +criteria with a single pass over the input: + +```sql +SELECT + ARRAY_AGG(c2) FILTER (WHERE c2 >= 2) OVER (...) -- e.g. [2, 3, 4] + ARRAY_AGG(CASE WHEN c2 >= 2 THEN c2 END) OVER (...) -- e.g. [NULL, NULL, 2, 3, 4] +... +FROM table +``` + +Thanks to [Geoffrey Claude] and [Jeffrey Vo] for delivering this feature. 
+ + +[Geoffrey Claude]: https://github.com/geoffreyclaude +[Jeffrey Vo]: https://github.com/Jefffrey + +### `ConfigOptions` Now Available to Functions + +DataFusion 50.0.0 now passes session configuration parameters to User-Defined +Functions (UDFs) via +[ScalarFunctionArgs](https://docs.rs/datafusion/latest/datafusion/logical_expr/struct.ScalarFunctionArgs.html) +([#16970](https://github.com/apache/datafusion/pull/16970)). This allows +behavior that varies based on runtime state; for example, time UDFs can use the +session-specified time zone instead of just UTC. + +Thanks to [Bruce Ritchie], [Piotr Findeisen], [Oleks V], and [Andrew Lamb] for delivering this feature. + +[Bruce Ritchie]: https://github.com/Omega359 +[Piotr Findeisen]: https://github.com/findepi +[Oleks V]: https://github.com/comphead +[Andrew Lamb]: https://github.com/alamb + +### Additional Apache Spark Compatible Functions + +Finally, due to Apache Spark's impact on analytical processing, many DataFusion +users desire Spark compatibility in their workloads, so DataFusion provides a +set of Spark-compatible functions in the [datafusion-spark](https://crates.io/crates/datafusion-spark) crate. +You can read more about this project in the [announcement] and [epic]. 
+DataFusion 50.0.0 adds several new such functions: + +- [`array`](https://github.com/apache/datafusion/pull/16936) +- [`bit_get/bit_count`](https://github.com/apache/datafusion/pull/16942) +- [`bitmap_count`](https://github.com/apache/datafusion/pull/17179) +- [`crc32/sha1`](https://github.com/apache/datafusion/pull/17032) +- [`date_add/date_sub`](https://github.com/apache/datafusion/pull/17024) +- [`if`](https://github.com/apache/datafusion/pull/16946) +- [`last_day`](https://github.com/apache/datafusion/pull/16828) +- [`like/ilike`](https://github.com/apache/datafusion/pull/16962) +- [`luhn_check`](https://github.com/apache/datafusion/pull/16848) +- [`mod/pmod`](https://github.com/apache/datafusion/pull/16829) +- [`next_day`](https://github.com/apache/datafusion/pull/16780) +- [`parse_url`](https://github.com/apache/datafusion/pull/16937) +- [`rint`](https://github.com/apache/datafusion/pull/16924) +- [`width_bucket`](https://github.com/apache/datafusion/pull/17331) + +Thanks to [David López], [Chen Chongchen], [Alan Tang], [Peter Nguyen], and [Evgenii Glotov] for delivering these functions. We are looking for additional help +reviewing and implementing more functions; please reach out on the [epic] if you are interested. 
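+
+As a small taste of the new functions (assuming the `datafusion-spark` function
+set has been registered with the session, since it ships as a separate crate
+and is not part of the default function registry):
+
+```sql
+SELECT
+  next_day(DATE '2025-09-29', 'Friday'),  -- first Friday after the given date
+  date_add(DATE '2025-09-29', 7),         -- seven days later
+  luhn_check('4111111111111111');         -- validate a Luhn checksum
+```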
+ + +[David López]: https://github.com/davidlghellin +[Chen Chongchen]: https://github.com/chenkovsky +[Alan Tang]: https://github.com/Standing-Man +[Peter Nguyen]: https://github.com/petern48 +[Evgenii Glotov]: https://github.com/SparkApplicationMaster +[announcement]: https://datafusion.apache.org/blog/2025/07/16/datafusion-48.0.0/#new-datafusion-spark-crate +[epic]: https://github.com/apache/datafusion/issues/15914 + + +## Known Issues / Patchset + +As DataFusion continues to mature, we regularly release patch versions to fix issues +in major releases. Since the release of `50.0.0`, we have identified a few +issues, and expect to release `50.1.0` to address them. You can track progress +in this [ticket](https://github.com/apache/datafusion/issues/17594). + + +## Upgrade Guide and Changelog + +Upgrading to 50.0.0 should be straightforward for most users. Please review the +[Upgrade Guide](https://datafusion.apache.org/library-user-guide/upgrading.html) +for details on breaking changes and code snippets to help with the transition. +Recently, some users have reported success automatically upgrading DataFusion by +pairing AI tools with the upgrade guide. For a comprehensive list of all +changes, please refer to the [changelog]. + +## About DataFusion + +[Apache DataFusion] is an extensible query engine, written in [Rust], that uses +[Apache Arrow] as its in-memory format. DataFusion is used by developers to +create new, fast, data-centric systems such as databases, dataframe libraries, +and machine learning and streaming applications. While [DataFusion’s primary +design goal] is to accelerate the creation of other data-centric systems, it +provides a reasonable experience directly out of the box as a [dataframe +library], [Python library], and [command-line SQL tool]. 
+ +[apache datafusion]: https://datafusion.apache.org/ +[rust]: https://www.rust-lang.org/ +[apache arrow]: https://arrow.apache.org +[DataFusion’s primary design goal]: https://datafusion.apache.org/user-guide/introduction.html#project-goals +[dataframe library]: https://datafusion.apache.org/user-guide/dataframe.html +[python library]: https://datafusion.apache.org/python/ +[command-line SQL tool]: https://datafusion.apache.org/user-guide/cli/ + +DataFusion's core thesis is that, as a community, together we can build much +more advanced technology than any of us as individuals or companies could build +alone. Without DataFusion, highly performant vectorized query engines would +remain the domain of a few large companies and world-class research +institutions. With DataFusion, we can all build on top of a shared foundation +and focus on what makes our projects unique. + +## How to Get Involved + +DataFusion is not a project built or driven by a single person, company, or +foundation. Rather, our community of users and contributors works together to +build a shared technology that none of us could have built alone. + +If you are interested in joining us, we would love to have you. You can try out +DataFusion on some of your own data and projects and let us know how it goes, +contribute suggestions, documentation, bug reports, or a PR with documentation, +tests, or code. A list of open issues suitable for beginners is [here], and you +can find out how to reach us on the [communication doc]. 
+ +[here]: https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22 +[communication doc]: https://datafusion.apache.org/contributor-guide/communication.html diff --git a/content/images/datafusion-50.0.0/performance_over_time_clickbench.png b/content/images/datafusion-50.0.0/performance_over_time_clickbench.png new file mode 100644 index 00000000..e6a8f14c Binary files /dev/null and b/content/images/datafusion-50.0.0/performance_over_time_clickbench.png differ