Python News: What's New From February 2023

by Kate Finegan

February is the shortest month, but it brought no shortage of activity in the Python world! Exciting developments include a new company aiming to improve cloud services for developers, publication of the PyCon US 2023 schedule, and the first release candidate for pandas 2.0.0.

In the world of artificial intelligence, OpenAI has continued to make strides. But while the Big Fix has worked to reduce vulnerabilities for programmers, more malicious programs showed up on PyPI.

Read on to dive into the biggest Python news from the last month.

Pydantic Launches Commercial Venture

With over 40 million downloads per month, pydantic is the most-used data validation library in Python. So when its founder, Samuel Colvin, announces successful seed funding, there’s plenty of reason to believe there’s a game changer in the works.

In his announcement, Colvin compares the current state of cloud services to that of a tractor fifteen years after its invention. In both cases, the technology gets the job done, but without much consideration for the person in the driver’s seat.

Colvin’s new company builds on what pydantic has learned about putting developer experience and expertise first. The exact details of this new venture are still under wraps, but it sets out to answer these questions:

What if we could build a platform with the best of all worlds? Taking the final step in reducing the boilerplate and busy-work of building web applications — allowing developers to write nothing more than the core logic which makes their application unique and valuable? (Source)

This doesn’t mean that the open-source project is going anywhere. In fact, pydantic V2 is on its way and will be around seventeen times faster than V1, thanks to being rewritten in Rust.

To stay up to date on what’s happening with pydantic’s new venture, subscribe to the GitHub issue.

OpenAI Continues to Develop Its Technologies

OpenAI, the company behind DALL·E 2 and ChatGPT, has continued to bring the power of artificial intelligence to programmers.

In February, OpenAI published its tutorials page. Currently, there’s only one tutorial, on using embeddings to answer questions, but more are on the way. As you wait for new tutorials, you can check out the examples gallery and the OpenAI Cookbook on GitHub to get ideas.

Also in February, OpenAI launched ChatGPT Plus for twenty US dollars a month. While free access is still available, the paid version offers use during peak hours, as well as faster response times and priority access to new features and improvements.

If you’re interested in learning more about AI, then check out Real Python’s tutorials on machine learning. You might also enjoy building a chatbot with Python or creating an unbeatable tic-tac-toe player with AI.

PyCon US 2023 Schedule Announced

PyCon US is coming up in April, but you can already register for the in-person or online conference and start planning your learning journey.

The full schedule was released in February, and you may notice some familiar names. In fact, Real Python team member Geir Arne Hjelle is giving a tutorial on Python decorators that you won’t want to miss.

This year marks the twentieth anniversary of PyCon US. To celebrate, the conference team is putting together a slideshow, and you’re invited to be part of it. This invitation is open to all attendees, including first-timers, so be sure to contribute!

If you’re planning to attend the conference and want to make sure you have a good experience, then check out How to Get the Most Out of PyCon US.

Malicious PyPI Packages Continue to Appear

The team over at Phylum is staying busy as wrongdoers continue to target programmers, largely through typosquatting and impersonating legitimate packages. Back in August 2022, we reported malware attacks on PyPI, and in November, Phylum reported a separate incident targeting cryptocurrency programs:

After installation, a malicious Javascript file is dropped to the system and executed in the background of any web browsing session. When a developer copies a cryptocurrency address, the address is replaced in the clipboard with the attacker’s address. (Source)

At that time, there were just over two dozen malicious packages. But in early February, Phylum reported a new attack involving over 451 unique packages, largely in cryptocurrency, finance, and web development.

These attacks work similarly to those in November, but automation allowed malicious PyPI users to register several packages almost simultaneously. This is how they were able to target so many packages. For a full list of affected packages, see the Malicious Package List on Phylum’s blog. And to learn more about safety when using PyPI, check out How to Evaluate the Quality of Python Packages.
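Because typosquatting relies on names that are one slip of the finger away from a legitimate package, a quick similarity check can flag a suspicious name before you install it. The following is a minimal sketch using only the standard library. The allowlist, the `likely_typosquat()` helper, and the 0.8 cutoff are all hypothetical choices for illustration, not a method used by Phylum or PyPI:

```python
import difflib

# Hypothetical allowlist; a real check would use the packages your project depends on.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "colorama"]


def likely_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None."""
    if name in known:
        return None  # An exact match is the legitimate package itself.
    # get_close_matches() ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(likely_typosquat("reqeusts"))  # requests
```

A check like this only catches near-miss spellings. It says nothing about whether a correctly spelled package is trustworthy, so it complements rather than replaces a proper review of the package.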

The Big Fix Boosts Software Security

From February 14 to March 14, the Big Fix was on. Its mission was to fix vulnerabilities in open- and closed-source software. The event aimed to fix over 200,000 vulnerabilities, and it did. At the time of writing, the event boasted 275,924 fixes!

Participants could join a Discord group, watch a fix-a-thon live stream, and win prizes. Anyone who fixed at least one security vulnerability was awarded a limited edition Big Fix t-shirt.

While the event is just about over for 2023, be sure to watch for it in 2024. In the meantime, you can learn about creating more secure applications by checking out the learning paths that event sponsor Snyk offers. You can also explore the Open Web Application Security Project (OWASP) playlist that Snyk created for the Big Fix.

Release Candidate for pandas 2.0.0 Announced

If you work with data, you’ve probably pip installed pandas into countless virtual environments. So a new version of pandas is cause for celebration! In late February, pandas maintainers announced a version 2.0.0 release candidate and strongly encouraged developers who rely on pandas to run their test suites with the release candidate and report any breaking changes before the official release.

This new version brings a handful of exciting developments: interchangeable backends, nullable datatypes, and copy-on-write improvements.

Traditionally, pandas has stored data in NumPy arrays. Over time, pandas has decoupled more and more from NumPy and can now use Apache Arrow for in-memory data storage instead. Some advantages of using Arrow are richer datatypes, better interoperability with other DataFrame libraries, and faster operations. For now, you need to opt in to use Arrow datatypes, for example by setting a global mode in the release candidate:

Python
>>> import pandas as pd
>>> pd.options.mode.dtype_backend = "pyarrow"

Learn more about the Arrow backend in pandas 2.0 and the Arrow revolution.

When you’re working with real-world datasets, you’ll often run into missing data. Previously, missing data posed a challenge if it represented, say, a Boolean or integer value. That’s because only floating-point data had a null value (NaN). Now, in the release candidate, you can set use_nullable_dtypes to True to automatically convert values to nullable dtypes:

Python
>>> import pandas as pd
>>> pd.read_csv("numbers.csv")
        name      value
0   thousand     1000.0
1    million  1000000.0
2  bajillion        NaN

>>> pd.read_csv("numbers.csv", use_nullable_dtypes=True)
        name    value
0   thousand     1000
1    million  1000000
2  bajillion     <NA>

The first example shows that the integer column value gets converted to floating-point numbers because of the missing value in the last row. The second example shows how this is handled better with the new data types.
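If you’d rather not depend on a new keyword argument, you can get a similar result by requesting a nullable dtype for a specific column. Here’s a minimal sketch that uses an in-memory string as a hypothetical stand-in for numbers.csv:

```python
import io

import pandas as pd

# Hypothetical stand-in for numbers.csv, with a missing value in the last row.
csv_text = "name,value\nthousand,1000\nmillion,1000000\nbajillion,\n"

# Default parsing: the gap forces the whole column to float64 with NaN.
default_df = pd.read_csv(io.StringIO(csv_text))

# Requesting the nullable "Int64" dtype keeps integers and marks the gap as <NA>.
nullable_df = pd.read_csv(io.StringIO(csv_text), dtype={"value": "Int64"})

print(default_df["value"].dtype)
print(nullable_df["value"].dtype)
```

Note the capital letter in "Int64", which selects the nullable extension dtype rather than NumPy’s int64.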

The final big change is increased support for copy-on-write, or lazy copying, meaning that pandas will only copy an object when it’s modified. Copying an object is memory intensive, yet previous implementations of pandas were inconsistent about when an operation would return a view vs a copy. Lazy copying brings a couple of enhancements:

1) a clear and consistent user API (a clear rule: any subset or returned series/dataframe always behaves as a copy of the original, and thus never modifies the original) and 2) improving performance by avoiding excessive copies (eg a chained method workflow would no longer return an actual data copy at each step). (Source)
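To see the first rule in action, you can opt in to copy-on-write and modify a column that you selected from a DataFrame. This is a minimal sketch with made-up data, and it enables the mode through the mode.copy_on_write option only for the duration of the with block:

```python
import pandas as pd

# Opt in to copy-on-write only inside this block.
with pd.option_context("mode.copy_on_write", True):
    df = pd.DataFrame({"value": [1, 2, 3]})

    # Selecting a column doesn't copy any data yet.
    subset = df["value"]

    # Modifying the subset triggers the copy, so df is left untouched.
    subset.iloc[0] = 100

    print(subset.tolist())
    print(df["value"].tolist())
```

With copy-on-write enabled, the change to subset doesn’t leak back into df, which is the consistent behavior that the quote above describes.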

For a full list of updates, check out What’s new in 2.0.0. Note that some of these changes are also partially available in version 1.5 of pandas.

Conclusion

The Python news desk is always brimming with updates! What’s on your radar from this past month? Are you excited to try the new version of pandas or keep up with what’s happening over at pydantic? Are you building something exciting with OpenAI or working to improve your app’s security? Will we see you at PyCon US 2023? Let us know in the comments!

About Kate Finegan

Kate relishes a well-placed comma as Tutorial Editor for Real Python.
