Releases: aws/aws-sdk-pandas

AWS SDK for pandas 2.18.0

02 Dec 16:30
eeba51e

Noteworthy

Features & enhancements

Bug fixes

Documentation

Tests

New Contributors

Thanks

We thank the following contributors/users for their work on this release:
@lucasasmith, @vikramsg, @mycaule, @pal0064, @LeonLuttenberger, @cnfait, @malachi-constant, @kukushking, @jaidisido

Full Changelog: 2.17.0...2.18.0

3.0.0rc2

23 Nov 12:54
Pre-release

What's Changed

  • (enhancement): Enable missing unit tests and Redshift, Athena, LF load tests by @jaidisido in #1736
  • (enhancement): configure scheduling options, remove dependencies on internal ray impl by @kukushking in #1734
  • (testing): Enable Athena and Redshift tests, and address errors by @LeonLuttenberger in #1721
  • (feat): Make tqdm progress reporting opt-in by @kukushking in #1741

Full Changelog: 3.0.0rc1...3.0.0rc2

3.0.0rc1

27 Oct 18:25
3bd0670
Pre-release

What's Changed

Full Changelog: 3.0.0b3...3.0.0rc1

3.0.0b3

12 Oct 15:28
ad40148
Pre-release

What's Changed

Full Changelog: 3.0.0b2...3.0.0b3

3.0.0b2

30 Sep 09:42
715f163
Pre-release

What's Changed

Full Changelog: 3.0.0b1...3.0.0b2

3.0.0b1

22 Sep 16:38
Pre-release

What's Changed

Full Changelog: 3.0.0a2...3.0.0b1

AWS SDK for pandas 2.17.0

20 Sep 23:11
3bcd8d3

New Functionalities

Enhancements

  • Return an empty DataFrame for an empty Timestream query #1430
  • Added support for INSERT IGNORE for mysql.to_sql #1429
  • Added use_column_names to redshift.copy akin to redshift.to_sql #1437
  • Enable passing kwargs to redshift.connect #1467
  • Add timestream_endpoint_url property to the config #1483
  • Add support for upserting to an empty Glue table #1579
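As an illustration of the INSERT IGNORE semantics mentioned above (rows that would violate a unique key are skipped instead of raising an error), here is a minimal sketch. It does not call awswrangler or MySQL: it assumes SQLite's analogous `INSERT OR IGNORE` statement from the Python standard library so that it runs without a database server. The table, column names, and the helper `insert_ignoring_duplicates` are hypothetical.

```python
import sqlite3


def insert_ignoring_duplicates(rows):
    """Insert rows, silently skipping any that collide with an existing key.

    Uses SQLite's INSERT OR IGNORE as a stand-in for MySQL's INSERT IGNORE.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    # Pre-existing row that the incoming data will collide with.
    con.execute("INSERT INTO users VALUES (1, 'alice')")
    # Rows whose primary key already exists are skipped, not overwritten.
    con.executemany("INSERT OR IGNORE INTO users VALUES (?, ?)", rows)
    result = con.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    con.close()
    return result


stored = insert_ignoring_duplicates([(1, "duplicate"), (2, "bob")])
print(stored)
```

The existing row with id 1 is left untouched and only the new row is added, which is the behaviour an "ignore" style insert is meant to provide.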

Documentation

  • Fix typos in documentation #1434

Bug Fix

  • validate_schema=True for wr.s3.read_parquet breaks with partition columns and dataset=True #1426
  • wr.neptune.to_property_graph failing for Neptune version 1.1.1.0 #1407
  • ValueError when using opensearch.index_df with documents with an array field #1444
  • Missing catalog_id in wr.catalog.create_database #1480
  • Check for pair of brackets in query preparation for Athena cache #1529
  • Fix wrong type hint for TagColumnOperation in quicksight.create_athena_dataset #1570
  • s3.to_json compression parameter is passed twice when dataset=True #1585
  • Cast Athena array, map & struct types to pandas object #1581
  • In the OpenSearch module, use SSL only for HTTPS (port 443) #1603

Noteworthy

AWS Lambda Managed Layers

Since the last release, the library has been accepted as an official SDK for AWS and rebranded as AWS SDK for pandas 🚀. The module names in Python remain the same. One noteworthy change, however, is that the AWS Lambda Managed Layer has been renamed from AWSDataWrangler to AWSSDKPandas.

You can view the ARN value for the layers here.
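To find configurations that still point at the old layer name, a small check like the following may help. It is only a sketch: the helper `find_old_layer_arns` is hypothetical, and the account ID, region, runtime suffix, and version numbers in the sample ARNs are placeholders, not the published values. Replacement ARNs should be taken from the published list rather than derived by string substitution, since version numbers may differ under the new name.

```python
OLD_LAYER_PREFIX = "AWSDataWrangler"


def find_old_layer_arns(layer_arns):
    """Return the ARNs whose layer name still starts with the old prefix.

    A Lambda layer version ARN has the form
    arn:aws:lambda:<region>:<account>:layer:<name>:<version>.
    """
    outdated = []
    for arn in layer_arns:
        parts = arn.split(":")
        # Position 5 is the literal "layer", position 6 is the layer name.
        if len(parts) == 8 and parts[5] == "layer":
            if parts[6].startswith(OLD_LAYER_PREFIX):
                outdated.append(arn)
    return outdated


# Placeholder ARNs for illustration only.
configured = [
    "arn:aws:lambda:us-east-1:123456789012:layer:AWSDataWrangler-Python39:1",
    "arn:aws:lambda:us-east-1:123456789012:layer:AWSSDKPandas-Python39:1",
]
outdated = find_old_layer_arns(configured)
print(outdated)
```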

PyArrow 7 Support

⚠️ For platforms without PyArrow 7 support (e.g. MWAA, EMR, Glue PySpark Job):

pip install pyarrow==2 awswrangler

Thanks

We thank the following contributors/users for their work on this release:

@bechbd, @maxispeicher, @timgates42, @aeeladawy, @KhueNgocDang, @szemek, @malachi-constant, @cnfait, @jaidisido, @LeonLuttenberger, @kukushking

3.0.0a2

17 Aug 10:35
b471c5c
Pre-release

This is a pre-release for the Wrangler@Scale project

What's Changed

Full Changelog: 3.0.0a1...3.0.0a2

3.0.0a1

17 Aug 10:06
b4d13bf
Pre-release

This is a pre-release for the Wrangler@Scale project

What's Changed

  • (feat): Add distributed config flag and initialise method by @jaidisido in #1389
  • (feat): Add distributed Lake Formation read by @jaidisido in #1397
  • (feat): Distribute S3 select over multiple paths and scan ranges by @jaidisido in #1445
  • (refactor): Refactor threading/ray; add single-path distributed s3 select impl by @kukushking in #1446

Full Changelog: 2.16.1...3.0.0a1

2.16.1

28 Jun 16:39

Noteworthy

🐛 Fixed an issue introduced in 2.16.0 that affected s3.read_parquet()

Patch

  • Fix bug in s3.read_parquet(): pq_file.schema.names() raised TypeError: 'list' object is not callable #1412
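The error message in this patch is what Python raises when an attribute that already holds a list is called like a method. The sketch below reproduces that class of error with a hypothetical stand-in class, `FakeSchema`, whose `names` is a property returning a list. It does not use PyArrow or awswrangler and is not the library's actual code.

```python
class FakeSchema:
    """Hypothetical stand-in for a schema object exposing `names` as a property."""

    def __init__(self, names):
        self._names = list(names)

    @property
    def names(self):
        # Accessing the property already returns the list of column names.
        return self._names


schema = FakeSchema(["id", "value"])

# Correct: read the property without parentheses.
column_names = schema.names

# Incorrect: the parentheses try to call the returned list.
try:
    schema.names()
except TypeError as exc:
    error_message = str(exc)

print(column_names)
print(error_message)
```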

P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload them and run, or use them from our public S3 bucket!

Full Changelog: 2.16.0...2.16.1