About Narwhals
Narwhals is an open source Python library that serves as a lightweight, extensible compatibility layer between DataFrame libraries. It enables the data ecosystem to become DataFrame-agnostic: a tool written against Narwhals can accept whichever supported DataFrame library its users already work with. By promoting a shared API standard, Narwhals leaves DataFrame authors free to innovate on the implementation side. Since its first release in February 2024, the project has become a dependency of major data science tools such as Altair, Marimo, Plotly, Py-Shiny, and Vegafusion.
About the project with POSSEE
The proposed deliverables of the project with POSSEE will propel Narwhals’ mission to strengthen the open source data science ecosystem.
Featuring DataFrame libraries beyond Polars and pandas in the docstring examples will showcase Narwhals’ ability to integrate with a variety of data processing ecosystems.
The migration and developer guides will give users and contributors clear, detailed documentation on how to upgrade to newer versions of Narwhals and how to extend its functionality for their specific use cases.
Improving backend support for lazy execution frameworks like DuckDB and Dask will enable more efficient processing of large datasets, which is essential for modern data science workflows that require high performance and scalability.
Allowing Narwhals to output SQL without requiring a backend installation will further streamline data science pipelines by removing dependencies and simplifying integration with various data sources.
Regular performance benchmarking will ensure that Narwhals meets the demands of high-performance computing while minimizing overhead, providing users with reliable insights into Narwhals’ performance under different conditions.
Project scope
Additional information
POSSEE promotes equitable education and open source sustainability, fostering technological and social progress across the globe.