.py in the sky

Musings on Python, Astronomy, and Open Science

Stop writing code that will break on Python 4!

With the end of support for Python 2 on the horizon (in 2020), many developers have made their packages compatible with both Python 2 and Python 3 using constructs such as:

import sys

if sys.version_info[0] == 2:
    # Python 2-only code goes here, e.g.:
    from StringIO import StringIO
else:
    # Python 3-only code goes here, e.g.:
    from io import StringIO

in places where things have changed between Python 2 and 3.
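The title alludes to the inverted form of this check: code that tests for Python 3 and treats everything else as Python 2 will route a future Python 4 down the Python 2 branch. Here is a minimal sketch of that fragile variant (the StringIO import is just an illustrative assumption, not taken from any particular package):

import sys

if sys.version_info[0] == 3:
    from io import StringIO
else:
    # Intended as the Python 2 branch, but a future Python 4
    # would also land here and fail on this Python 2-only import.
    from StringIO import StringIO

Testing for the legacy version explicitly, as in the construct shown earlier, keeps the default branch pointing at current and future versions of Python.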

Python 3 in Science: the great migration has begun!

Back in 2012, I carried out a survey to find out which Python, NumPy, and SciPy versions scientists were using for their daily work, in order to better understand which versions should be supported. The main finding was that a large fraction of people had reasonably up-to-date Python installations, although virtually no-one was using Python 3 for daily work.

This year, I decided to repeat the experiment: last January I advertised a survey that asked users to provide information about their Python installation(s) for research/production work, as well as more general information about their Python experience, which packages they used regularly, why they were not yet using Python 3 (for those still on Python 2), and so on.

There is a lot to be learned from this data, and there is no way I can cover all the results in a single blog post, so I will focus on just a few points here and write several more posts over the next couple of weeks to highlight other results.

For this post, I thought it would be fun to look specifically at which Python versions users in the scientific Python community are running, and in particular at the state of Python 3 adoption. I am making an anonymized and cleaned-up version of the subset of the data used in this post available in this GitHub repository, and will add to the data over time with future blog posts.

Are we acknowledging tools and services enough in Astronomy papers?

A couple of weeks ago, I attended the 5th .Astronomy meeting, which took place in Boston. For anyone not familiar with this series of conferences, the aim is to bring together researchers, developers, and educators/outreach specialists who use or are interested in using the web as a tool for their work (I like to think of it as an astro-hipster conference!).

One of the topics that comes up regularly at .Astronomy meetings is the question of credit: how do we, as scientists, get credit for work that is not considered 'traditional', such as (but not limited to) creating or contributing to open source software, outreach activities, or refereeing? Sarah Kendrew has already summarized the discussions on this topic in her blog, so I won't repeat them here. However, given that I contribute to a number of open source projects (such as Astropy, APLpy, and many others), this got me wondering: how often do authors actually acknowledge the tools that they use in their papers?

I had previously played around with the NASA/ADS full-text search, but what I wanted was a way to do this automatically for any keyword/phrase, and to see the evolution of acknowledgments over time. With the release of the ADS developer API (which Alberto Accomazzi presented on the Monday at .Astronomy), I finally had the tool I needed! This was a fun post-dotastro hack, and I present the results below.
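To give a concrete idea of the kind of query this involves (a rough sketch, not the actual hack code), here is a minimal example against the current ADS search API; it assumes a personal API token stored in an ADS_TOKEN environment variable, and the endpoint and field names are those of today's API, which differs from the one available at the time:

import os
import requests

# Personal ADS API token (assumed to be set in the environment).
token = os.environ["ADS_TOKEN"]

def count_mentions(phrase, year):
    # Count papers whose full text mentions the phrase in a given year.
    response = requests.get(
        "https://api.adsabs.harvard.edu/v1/search/query",
        params={"q": 'full:"{0}" year:{1}'.format(phrase, year),
                "rows": 1},  # only the total count is needed
        headers={"Authorization": "Bearer " + token},
    )
    response.raise_for_status()
    return response.json()["response"]["numFound"]

for year in range(2005, 2014):
    print(year, count_mentions("astropy", year))

Dividing such counts by the total number of papers published each year would then give the fraction of papers mentioning a given tool.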

Astropy: Google Summer of Code!


As one of the co-ordinators of the Astropy project, I am very happy to announce that two talented students will be joining the Astropy project as part of this year's Google Summer of Code (GSoC)!

For anyone not familiar with GSoC, it is a great program that allows students around the world to spend the summer contributing to an open source project (the students receive a stipend from Google for their work). Astropy is participating in GSoC as a sub-organization of the Python Software Foundation.

What Python installations are scientists using?

Back in November 2012, I asked Python users in science to fill out a survey to find out which Python, NumPy, and SciPy versions they were using, and how they maintain their installations. My motivation was to collect quantitative information to inform discussions among developers about which versions to support, since such discussions are otherwise based only on guesswork and personal experience. In particular, there had been some discussion in the Astropy project about whether we should drop support for NumPy 1.4, but we had no quantitative information about how many users this would affect (which is what motivated this study).

In this post, I'll give an overview of the results, as well as access to the (anonymized) raw data. First, I should mention that, given my area of research and my networks, the only community from which I obtained a significant amount of data was astronomers, so the results presented here include only those responses (though I also provide the raw data for the remaining users, for anyone interested).
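For anyone curious about what they would have reported, the versions in question can be printed with a few lines of Python (a trivial sketch; not part of the survey itself):

import sys
import numpy
import scipy

# Print the version information the survey asked about.
print("Python: {0}.{1}.{2}".format(*sys.version_info[:3]))
print("NumPy:  " + numpy.__version__)
print("SciPy:  " + scipy.__version__)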