Friday, April 21, 2017

Community Service Award Recipient Ian Cordasco

The Python Software Foundation depends on its board of directors in order to function. Board members are elected every year by PSF voting members in a process run internally by non-board members. Ian Cordasco has been the PSF’s Election Administrator since 2015, volunteering his efforts for this important role. Cordasco is also a valuable member of the Python community, frequently mentoring newer coders and supporting their Python endeavors. For these reasons, the PSF is delighted to award the 2017 Q1 Community Service Award to Ian Cordasco:

RESOLVED that the Python Software Foundation award the 2017 Q1 Community Service Award to Ian Cordasco for his contributions to PSF elections and active mentoring of women in Python community.

PSF Elections

Cordasco began as the PSF’s Election Administrator during a time of turmoil. “The first year I ran the election was something of a nightmare,” he recalls. Due to unforeseen circumstances, the previous Election Administrator stepped down on short notice and was unavailable to relaunch the election efforts. “Many people did not get ballots via email as they should. Some people were accidentally excluded from the voting rolls. Further, there was a lot of confusion because I stepped in at the last minute.” Without the aid of documentation and prior experience, Ian threw himself into the cause. The PSF has since reviewed, solidified, and documented the election procedures.

Since his dramatic start as Election Administrator, Cordasco’s work with PSF elections has been much smoother. Mark Mangoba, PSF’s IT Manager, works closely with Cordasco during the election process. Mangoba notes, “Ian is a great volunteer. He does an excellent job with the elections, assuring that all votes are accounted for and that there is no fraud or issues of any kind.” Cordasco has also gotten creative with how he manages elections. For example, to reduce bias, he uses Python code to break ties and to randomize the order in which candidates appear to voters. Additionally, those who work with Cordasco describe him as an enjoyable collaborator. Mangoba explains, “Ian is energetic and thoughtful. His passion and enthusiasm for the PSF shows through, he’s always available to help and answer questions.”
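The post does not show Cordasco's actual election code. As a minimal sketch of the idea only, the following shuffles the ballot order and breaks a tie at random. The function names and candidate names are hypothetical, and using random.SystemRandom (which draws from the operating system's entropy source) is an assumption rather than something the source states.

```python
import random


def ballot_order(candidates, rng):
    """Return a shuffled copy so no candidate benefits from list position."""
    shuffled = list(candidates)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


def break_tie(tied, rng):
    """Pick one of the tied candidates uniformly at random."""
    # Sort first so the choice does not depend on set iteration order.
    return rng.choice(sorted(tied))


# Assumption: SystemRandom is used so results cannot be replayed from a seed.
rng = random.SystemRandom()
candidates = ["Candidate A", "Candidate B", "Candidate C"]

print(ballot_order(candidates, rng))
print(break_tie({"Candidate A", "Candidate C"}, rng))
```

Each voter could be shown a freshly shuffled list, so that no single candidate is always listed first.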


Cordasco has a history of going out of his way to support and encourage female developers. When Carol Willing, a developer for the Jupyter project, wanted to work on the Requests library, she got in touch with Cordasco. “We worked together on the project and my first commit to the Requests library got accepted!” Cordasco later wrote a fantastic post about it on his blog.

Cordasco has also found newer coders to mentor at Python events, such as Anna Ossowski. “I met Ian at PyTennessee 2015, a day before I was scheduled to give my very first ever conference talk. Ian’s encouragement and support helped me a lot and it’s thanks to him and Carol [Willing] that I had the confidence to go up on stage and deliver my talk.” But his support didn’t stop there. Ossowski goes on to say, “every week he would reserve an hour for me where we would program together, he would answer questions, and just generally help me with any programming issues I experienced. Ian helped me get the PyLadies Remote website up and running, something I would have never managed without his help.”

Adrienne Lowe, a developer at Emma, has also enjoyed Cordasco’s support and encouragement. She recalls, “He models the kind of developer that we all want to be in terms of being encouraging and open.” She continues, “he sets himself apart by being genuine, welcoming, and happy to explain anything from simple things to more complex concepts, all in an ego-less way.”

The Python community as a whole is very lucky to count Cordasco as a member, and we hope he continues to help others contribute and achieve their goals.

CSA 2017 Q1 Winner Ian Cordasco

In his free time, you can find Cordasco blogging on his website, riding his bike, or reading books.

Wednesday, April 05, 2017

Pay What You Want for "The Humble Book Bundle: Python" and Benefit the PSF

Pay what you want for a stack of Python ebooks from No Starch Press, and decide what portion goes to the PSF. This deal is presented by Humble Bundle, which sells ebooks and games to raise money for nonprofits. When you buy a bundle you choose how much to pay, and how the money is divided among the creators, Humble Bundle, Inc., and the nonprofit organization.

The Humble Book Bundle: Python is available now through April 19th. Pay a dollar or more for these three books:

If you choose to pay $8 or more, you also receive:

If you pay more than $15 you get all of the above, plus:

To help the PSF and get a stack of fun and useful Python books at a price you decide, buy the bundle before April 19th!

Tuesday, March 28, 2017

Python at Google Summer of Code: Apply by April 3

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three-month period. Since 2005, the Python Software Foundation (PSF) has served as an "umbrella organization" to a variety of Python-related projects, as well as sponsoring projects related to the development of the Python language.

April 3rd is the last date for student applications for GSoC 2017. You can view all the sub-orgs under PSF and see what projects are seeking applications, then go to the Google Summer of Code site to submit your application.


To ask questions about specific projects, go to the sub-orgs page and click "Contact" under the project you want to ask about.

The student application deadline is April 3, and decisions will be announced on May 4.

Timeline for Google Summer of Code 2017

Friday, March 10, 2017

Ernest takes the call - Community Service Award Recipient Ernest Durbin III

The day was October 23, 2015, Friday afternoon. The PSF and PyCon organizers were busy pulling together sponsors for the upcoming PyCon conference when suddenly, the ancient mail server 'albatross' suffered a hard disk crash. Email was down, grant requests would not go through, and PyCon planning was at a standstill. To make matters worse, most of the volunteers who had helped set up the initial mail server were away. Something had to be done, and fast. Ernest Durbin, a volunteer systems admin, took the call. With no documentation on how to fix the existing mail server, he worked diligently through the weekend to rebuild it. Thanks to Ernest’s hard work and dedication, the PSF and PyCon US were able to resume operations before the following Monday.

For his enthusiasm and years of volunteering, the Python Software Foundation awards the 4th Quarter 2016 Community Service Award to Ernest Durbin III:

RESOLVED, that the Python Software Foundation award the 4th Quarter 2016 Community Service Award to Ernest W. Durbin III. Ernest has been a dedicated volunteer of the PSF for several years. Countless times he has triaged PSF infrastructure. Beyond that, Ernest has been a key person in creating structure for our infrastructure. Not only does that include internal infrastructure such as, that also includes external infrastructure such as PyPI. Recently, Ernest has also accepted the position of PyCon 2017 US co-chair and PyCon 2018/19 conference chair.

Durbin’s involvement in PSF began in 2012. He recalls, “A friend of mine submitted a proposal and we were selected for the task.” After a few months of doing paid work for the PSF, he realized that he would be more comfortable volunteering his time, and has been doing so ever since. “Ernest has been a huge help with the growth of PSF's infrastructure,” says Ewa Jodlowska, Director of Operations at PSF. “[He] ensures that we are keeping best practices and that the knowledge of proper processes is passed on. I am grateful that he has been able to lend his expertise in such a way.”

In addition to providing volunteer technical support for the PSF, Durbin has also become a Python community organizer. He will serve as co-chair of PyCon 2017 and has taken on the responsibilities of full conference chair for PyCon 2018 and 2019. He is also an organizer for his local Python meetup group.

When asked why he chooses to promote Python and the PSF, Durbin responded, “Python has been my language of choice for most of my career,” adding that he “has always appreciated the great breadth and depth of experience in the Python community as represented by the available packages on PyPI. It is such a testament to the community's collective knowledge and generosity when nine times out of ten you can find something that fulfills your need.”

As for the email server incident, Durbin simply brushes off the stress, explaining, “it was a great way to meet new folks in the Python community and work with them towards a common goal.”

Ernest Durbin, CSA Winner 2016 Q4

When not programming, you can find Durbin out in his garage working on his 1960s-era Saabs or hosting Taco Tuesdays for large groups of friends.

Friday, February 10, 2017

Discovering the Python Community in Zimbabwe at their first PyCon

On the heels of attending a successful PyCon in Namibia in 2015, a small group of Python enthusiasts in Harare, Zimbabwe vowed to organize the first-ever PyCon held in Zimbabwe.

After months of planning, on November 25th, 2016, they achieved their goal in dramatic fashion with an enormously successful sold-out conference at the ZESA National Training Center in Harare. I was privileged to give the keynote to an extremely attentive audience. For an hour, we had a tremendous time discussing how to contribute to open source successfully and how to grow ideas into successful open-source projects.

In all my years of speaking, I've never had such an incredible audience. Often at technical conferences audience members are more engaged with their smartphones than the speaker. Not so at PyConZim! Questions were thoughtful and engaging. Truly a pleasure.

Throughout the conference many enjoyable talks were given. I enjoyed Dennis Murekachiro's inspiring talk on how to be a game-changer as he encouraged Zimbabwe technologists not to settle for "good enough" but to work hard to use technology to better themselves and the communities they live in. Tendai Marengerke's talk on how to create reproducible research in Python was absolutely fascinating; it's a must-watch for anybody using Python in an academic setting.

Petrus Janse van Rensburg from South Africa gave an outstanding overview of challenges that low-bandwidth connections create in Africa and how he is working to solve them by re-designing the way e-commerce platforms operate. I can virtually guarantee we'll be hearing more about him and his work in the coming months and years.

One of the most astonishing things about PyConZim is the way in which every single attendee is brilliant and, without fail, engaged with pragmatic ideas about how to use Python to make a better life for their communities. One could go to every PyCon on Earth and never find one as inspiring as PyCon Zimbabwe.

The highlight for me, though, was having the chance to meet Marlene Hangami and Ronald Maravanyika.
Marlene and Ronald have single-handedly started an organization to teach Python to young girls across Zimbabwe.

Fueled by a desire to simply improve the lives of girls in their country, they've started free workshops in community centers and now operate in over forty community centers across the country.
They've had to battle a number of difficult obstacles that would discourage most people but they're continuing on.
As a direct result of my trip to PyConZim, I've started working with Ronald and Marlene to start a program to bring female software developers to Zimbabwe to work with selected girls on Python-based projects to help out in their communities.

Mentors participate in projects that girls work on by volunteering as little as four hours of their time and conduct their mentorship via video-conference and email. It's a very simple way to advance the case for women in technology in Africa and beyond. More information on mentorship programs and application information is available here.

My humble thanks to everybody at the Python Software Foundation for sponsoring my trip to Zimbabwe and for sponsoring the conference itself.

Friday, February 03, 2017

Pythonistas (and a Python!) at PyCon Jamaica

This past November marked the first PyCon Jamaica. Held in the capital, Kingston, the conference began on November 17th with a day of tutorials followed by a single track of talks on November 18th. I attended both as a representative of the Python Software Foundation, which sponsored the conference, and as a speaker.

Python in Kingston’s Higher Education 

Kingston, home to approximately 33% of Jamaicans, boasts several institutions of higher learning, including the Caribbean Maritime Institute and the Mona campus of the University of the West Indies. PyCon Jamaica kicked off with tutorials at the University of the West Indies. Most of the tutorials focused on introductory topics (e.g. Introduction to Plone). Participants came from a wide range of backgrounds, from mechanical engineers to undergraduates with a marketing concentration. Interestingly, I was informed that Python isn’t part of the standard computer science offering at the university, yet it has become a language of considerable interest in many of Kingston’s professional sectors.

David Bain, organizer of PyCon Jamaica and the local Python Jamaica user group, explained that he thinks the interest in Python has risen as students have become increasingly exposed to web technologies. Bain added that PyCon Jamaica is a way to help demonstrate to students and professionals the various applications Python has. "Jamaica wants to be seen as a viable source for local and North American nearshore developer talent, our event signals that software development talent is here," Bain explained.

Modernizing the Public Sector with Python

Conference talks were held at the Hope Zoo, a facility housing vast botanical gardens, a zoo, and a community center. There were three international speakers, Joir-dan Gumbs of IBM, Star Ying of the US Dept of Commerce, and myself, alongside several local speakers. The tutorials had been more student-centric, but the conference catered to those using Python in the Jamaican public sector.

A common theme from local speakers highlighted how Python has helped local professionals modernize outdated practices. Marc Murray of the Jamaican Ministry of Health described how he has used Python throughout his career of fifteen-plus years to automate processes and enable better data collection and data sharing. More than one speaker acknowledged the struggle of institutional knowledge silos in the local government. With Python, though, these knowledge silos have started to be disrupted. Agencies are able to share the same data sets with greater ease and promote transparency.

Python's data-processing power was the star in a talk by student Dominic Mills. Mills recently completed an internship at CERN, where he built a Django prototype for debugging hardware in future experiments. Crucial to this project was not only the collection of data via Celery but the capacity to analyze it. Mills used Bokeh for real-time analysis of the sensor data, permitting monitoring and alarms to be raised if unfavorable conditions were found.

Collectively the speakers at PyCon Jamaica reflect how Jamaican programmers are embracing Python for data collection and analysis in a variety of specialties. Python’s open source packages and rich community support seemed to be its biggest selling points. Speaker Joir-dan Gumbs commented that, “the best part for me was the presentations of how Python is enhancing the lives of Jamaicans, as well as the networking.”

I’m excited to see what PyCon Jamaica 2017 will hold. Already the conference is rich in data science and data visualization content. After all, if PyCon Jamaica 2016 included an appearance from the Hope Zoo’s own python, what will we see next? Perhaps two pythons, and of course many more Jamaican Pythonistas.

Tuesday, January 31, 2017

Time To Upgrade Your Python: TLS v1.2 Will Soon Be Mandatory

If you're using an older Python without the most secure TLS implementation, this is the year to get serious about upgrading. Otherwise next June you may not be able to "pip install" packages from PyPI.

PyPI's maintainer Donald Stufft recently announced that and related sites will begin disabling the old TLS versions 1.0 and 1.1. This change was imposed on us by our content delivery network, Fastly, in response to a change imposed on them by the Payment Card Industry Security Standards Council. In order to continue serving websites that take credit card payments, Fastly is required to disable the old, insecure versions of TLS. Since the PSF's servers, including PyPI, use Fastly, the old versions of TLS will be disabled as well.

Fastly wrote in October 2015:

"There have been serious and systemic security issues with earlier versions of TLS and its predecessor, SSL, including POODLE, Heartbleed, and LOGJAM. These threatened to break trust in fundamental methods of secure communication, exposing both you and your customers to breaches in security. The actions of the PCI DSS Council to maintain a high minimum bar are a step towards ensuring the security of all online business transactions."

There are two deadlines to upgrade your Python to a version with the latest TLS. The first comes soon, on April 30, 2017, when sites without Extended Validation Certificates will stop supporting TLS 1.0 and 1.1. These sites include:


Warehouse, the future successor to PyPI, will also be affected by April's deadline, since Warehouse serves files from

The more crucial deadline comes June 30, 2018. On that date all remaining sites, including PyPI, will no longer support TLS 1.0 and 1.1. Older Python versions that do not implement TLSv1.2 will be prohibited from accessing PyPI.

See below for instructions to check your interpreter's TLS version. 1

Stufft writes, "I am going to see about possibly organizing some scheduled 'brown outs' of TLSv1.0 and TLSv1.1 prior to the cut off dates to try and help folks find places that will need updates. Any scheduled brownouts will be posted to prior to happening."

Mac users should pay special attention. So far, the system Python shipped with macOS does not yet support TLSv1.2 in any macOS version; beginning next June these system Pythons will no longer be able to "pip install" packages. 2 Fortunately, it's easy to install a modern Python alongside the macOS system Python. Either download Python 3.6 from, or for Python 2.7 with the latest TLS, use Homebrew. Both methods of installing Python will continue working after June 2018.

1. To check your Python interpreter's TLS version, install the "requests" package and run a command. For example, for Python 2:

python2 -m pip install --upgrade requests
python2 -c "import requests; print(requests.get('', verify=False).json()['tls_version'])"

Or Python 3:

python3 -m pip install --upgrade requests
python3 -c "import requests; print(requests.get('', verify=False).json()['tls_version'])"

If you see "TLS 1.2", your interpreter's TLS is up to date. If you see "TLS 1.0" or an error like "tlsv1 alert protocol version", then you must upgrade.
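As a complement to the network check above, here is a minimal offline sketch that only inspects the interpreter's own ssl module. It reports whether the module claims TLS 1.2 support, not whether a real connection will negotiate it. The helper name tls12_supported is hypothetical. The attributes it reads are part of the standard library: ssl.HAS_TLSv1_2 exists on Python 3.7 and later, and ssl.PROTOCOL_TLSv1_2 on Python 2.7.9+ and 3.4+.

```python
import ssl


def tls12_supported():
    """Best-effort local check: does the ssl module report TLS 1.2 support?"""
    # Python 3.7+ exposes an explicit flag.
    if hasattr(ssl, "HAS_TLSv1_2"):
        return bool(ssl.HAS_TLSv1_2)
    # Older interpreters only define this constant when the linked
    # OpenSSL is new enough to speak TLS 1.2.
    return hasattr(ssl, "PROTOCOL_TLSv1_2")


# The OpenSSL build the interpreter is linked against.
print(ssl.OPENSSL_VERSION)
print(tls12_supported())
```

If this prints False, the interpreter needs upgrading. If it prints True, the network check above is still the more reliable confirmation.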

2. The reason Python's TLS implementation is falling behind on macOS is that Python continues to use OpenSSL, which Apple has stopped updating on macOS. In the coming year, the Python Packaging Authority team will investigate porting pip to Apple's own "SecureTransport" library as an alternative to OpenSSL, which would allow old Python interpreters to use modern TLS with pip only. "This is a non-trivial amount of effort," writes Stufft, "I'm not sure it's going to get done."

In the long run, the Python interpreter itself would easily keep up with TLS versions, if it didn't use OpenSSL on platforms like macOS and Windows where OpenSSL is not shipped with the OS. Cory Benfield and Christian Heimes propose to redesign the standard library's TLS interfaces to make it easier to swap OpenSSL with platform-native TLS implementations.

Thursday, January 26, 2017

“I use Python to help build the kind of world I want to live in” - Shannon Turner, Community Service Award Winner Q4

Shannon Turner has been fascinated with programming since she was a child, thanks in part to her grandmother, who loved video games. Watching her grandmother play, Turner would draw pictures on paper and ask, “Wouldn't this be cool if this were part of the game?” “Yes,” her grandmother would agree, “so you’ll need to get very good with computers if you want to make games of your own someday.”

As an adult, Turner’s interest in programming grew. She taught herself to program and attended tech events but it didn't feel right. She grew frustrated at being one of the only women in the room, being talked down to, and not taken seriously. Then, after speaking with other women at the events, she would realize that it wasn’t just her, “...[that] we all had this shared experience of being talked down to and not taken seriously. That's when I decided to start teaching other women what I'd taught myself.” This is what motivated Turner to start Hear Me Code (HMC), a group that offers free, beginner-friendly coding classes for women in the Washington DC area.

The Python Software Foundation awards the 4th Quarter 2016 Community Service Award to Shannon Turner for her work on Hear Me Code:

RESOLVED, that the Python Software Foundation award the 4th Quarter 2016 Community Service Award to Shannon Turner. Shannon is the founder of Hear Me Code, an organization offering free, beginner-friendly Python coding classes for over 2000 women in Washington, DC. She teaches all the classes with the help of women who have previously taken the classes. She empowers hundreds of women to code with Python by lowering barriers to entry. More than just a class where women learn to build websites, Hear Me Code focuses on leadership development, peer mentoring, and turning students into teachers.

Hear Me Code

What started in 2013 as an informal class with a few friends seated at the kitchen table has grown to a group of over 2000 in the Washington DC area. Turner developed the curriculum, slides, and exercises for five lessons, making incremental changes and improvements each time she taught. In the beginning she taught all the classes herself, but quickly realized she would do even more good by helping her students become instructors themselves. To date, over 100 women who started as students have moved on to be teaching assistants and teachers in the group. “In our first two years,” says Turner, “over two dozen women credited Hear Me Code with providing them the skills and experience they needed to land a job in tech.”

At HMC, programming courses are taught with Python. Why Python? As Turner was teaching herself to code with a variety of languages, Python felt different. “I still struggled to learn it,” Turner recalls, “but it was much more intuitive than other languages I'd used.”

Helping Female Developers

HMC student Sonia Hinson started taking classes in January 2014. Since then she has completed most of the courses and moved on to being a teaching assistant and teacher. She says Turner encourages her students to become teachers by “promoting the idea that you learn best from teaching someone else programming and working with your neighbors to solve bugs and problems in code.”

Student Haynes Bunn would agree. She values Turner’s ability to identify people’s strengths and encourage students to get involved in teaching positions. By doing this, says Bunn, Turner is not just teaching women to code, “she’s also helping them to teach, to help others, and to be leaders.”

Turner would rather spend her time helping women through her networks than seek praise for all of her work. That’s not what motivates her, says Stephanie Nguyen, “her impact in the Python community and the women who she has empowered to code are all examples that speak loudly for her.”

Other Projects

“Now, in addition to running Hear Me Code,” says Turner, “I use Python to help build the kind of world I want to live in.” Some of her other projects include a visualization of 500 schools that aren't taking campus sexual assault seriously and a searchable database of 6000 museums across the US.

Shannon Turner, CSA Winner Q4

Turner lives in Washington DC with her pet bird, who she keeps tabs on with her Raspberry Pi.

Thursday, January 19, 2017

Sheila Miguez and Will Kahn-Greene and their love for the Python Community: Community Service Award Quarter 3 2016 Winners

There are two elements which make Open Source function:
  1. Technology
  2. An active community

The primary need for a successful community is a good contributor base. The contributors are our real heroes, who work persistently, on many (if not most) occasions without any financial benefits, just for the love of the community. The Python Community is blessed with many such heroes. The PSF's quarterly Community Service Award honors these heroes for their notable contributions and dedication to the Python ecosystem.

The PSF is delighted to give the 2016 Third Quarter Community Service Award to Sheila Miguez and Will Kahn-Greene:
Sheila Miguez and William Kahn-Greene for their monumental work in creating and supporting PyVideo over the years.

Community Service Award for 3rd Quarter

Will Kahn-Greene (photo taken by Erik Rose, June 2016)

The PSF funds a variety of conferences and workshops worldwide throughout the year to educate people about Python. But not everyone can attend all of these events. Two people, Sheila Miguez and Will Kahn-Greene, wanted to solve this problem for Pythonistas. Will came up with the brilliant idea of PyVideo, and Sheila later joined the mission. PyVideo works as a warehouse of videos from Python conferences, local user groups, screencasts, and tutorials.

The Dawn of PyVideo

Back in 2010, Will started a Python video site using the Miro Community video-sharing platform. PSF encouraged his work with an $1800 grant the following year. As Will recalls, "I was thinking there were a bunch of Python conferences putting out video, but they were hosting the videos in different places. Search engines weren't really finding it. It was hard to find things even if you knew where to look." He started with Miro Community, and later wrote a whole new codebase for generating the data and another codebase for the front end of the website.
With these tools he started "This new infrastructure let me build a site closer to what I was envisioning."

When Sheila joined the project she contributed both to its technology and by helping the community find Python videos easier. Originally, she intended to only work on the codebase, but found herself dedicating a lot of time to adding content to the site.

What is PyVideo?
PyVideo is a repository that indexes and links to thousands of Python videos. It also provides a website where people can browse the collection, which is more than 5000 Python videos and growing. The goals for PyVideo are:

  1.  Help people get to Python presentations easier and faster
  2.  Focus on education
  3.  Data collection and categorization.
  4.  Aim to give people an easy, enjoyable experience contributing to open source on PyVideo's GitHub repo

The Community Response

The Python community has welcomed Will and Sheila's noble endeavor enthusiastically. Pythonistas around the world never have to miss another recorded talk or tutorial. Sheila and Will worked relentlessly to give shape to their mammoth task. When I asked Will about the community’s response, he said, "Many learned Python by watching videos they found on Many had ideas for different things we could do with the site and other related projects. I talked with some folks who later contributed fixes and corrections to the data."

Will and Sheila worked on only in their spare time, but it has become a major catalyst in the growth of the Python community worldwide. According to Will, has additional, under-publicized benefits:

  • PyVideo is a primary source to survey diversity trends among Python conference speakers around the globe.
  • Since its videos are solely Python, it is easily searchable and provides more helpful results than other search engines.
  • It offers a preview of conferences: By watching past talks people can choose if they want to go.

PyVideo : The End?

With a blog post Will and Sheila announced the end of "I'm pretty tired of working on pyvideo and I haven't had the time or energy to do much on it in a while," Will wrote.

Though they were shutting down the site, they never wanted to lose or waste the valuable data. Will says, "In February 2016 or so, Sheila and I talked about the state of things and I just felt bad about everything. So we decided to focus on extracting the data from PyVideo and make sure that even if the site didn't live on, the data did. We wrote a bunch of tools and infrastructure for a community of people to add to, improve and otherwise work on the data. We figured someone could take the data and build a static site around it." Will did a blog post about the status of the data of, and invited new maintainers to replace the site.

The end of broke the hearts of many Pythonistas, including Paul Logston. Paul used to begin his mornings by watching a talk on the site, and he couldn't renounce his morning entertainment. He resolved to replace To begin, he wrote a project called "PyTube" for storing videos. Though initially his interest was personal, its educational outreach aspect drove him to finish and publicize the project. Sheila remembers first noticing Paul when she saw his fork of the pyvideo data repository. She was excited to see that he'd already built a static site generator based on PyVideo data. She read Paul’s development philosophy and felt he was the right person to carry on the mission.

In May 2016, at PyCon US, there was a lightning talk on PyVideo and its situation. Paul met some fellow PyVideo followers who, just like him, did not want to lose the site. They decided to work on it during the Sprints. Though the structure of the website was ready, a lot remained to be done, such as gathering and curating data and designing the website. So the contributors divided the work among themselves.

Both Sheila and Will were committed to PyVideo's continued benefit for the community, while passing PyVideo to new hands. They were satisfied by Paul's work and transferred the domain to his control. Paul's PyTube code became the replacement of on August 13, 2016.

Emergence of the Successor : The Present Status of PyVideo

Now the project has 30 contributors, with Paul serving as project lead. These contributors have kept the mission alive. Though PyVideo's aim is still the same, there is a difference in its technology. The old Django app is replaced with a static site generated with Pelican, and it now has a separate repository for data in JSON files. The team's current work emphasizes making the project hassle-free to maintain.
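To illustrate the general shape of that architecture (a data repository of JSON files feeding a static page), here is a minimal sketch in plain Python. It is not Pelican's API and not PyVideo's actual schema: the file layout, the "title" and "speakers" fields, and the function names are all assumptions made for the example.

```python
import json
import tempfile
from html import escape
from pathlib import Path


def load_videos(data_dir):
    """Read every *.json file under data_dir into a list of dicts, sorted by title."""
    videos = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in Path(data_dir).rglob("*.json")
    ]
    return sorted(videos, key=lambda video: video["title"])


def render_index(videos):
    """Build a minimal static HTML list with one entry per video."""
    items = "\n".join(
        "  <li>{} ({})</li>".format(
            escape(video["title"]), escape(", ".join(video["speakers"]))
        )
        for video in videos
    )
    return "<ul>\n{}\n</ul>\n".format(items)


# Demo with two made-up records written to a temporary "data repository".
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    data.mkdir()
    (data / "talk-b.json").write_text(
        json.dumps({"title": "Talk B", "speakers": ["Speaker Two"]}), encoding="utf-8"
    )
    (data / "talk-a.json").write_text(
        json.dumps({"title": "Talk A", "speakers": ["Speaker One"]}), encoding="utf-8"
    )
    page = render_index(load_videos(data))
    # A static site is just files on disk; no application server is involved.
    (Path(tmp) / "index.html").write_text(page, encoding="utf-8")

print(page)
```

Keeping the data in its own repository means contributors can fix a record by editing a JSON file, without touching the code that builds the site.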

Listen to Paul talking about PyVideo and its future on Talk Python to Me.

The Wings to Fly

Every community needs someone with a vision for its future. Will and Sheila showed us a path to grow and help the community. It is now our responsibility to take the new PyVideo further. Paul describes its purpose beautifully: "PyVideo's deeper 'why' is the desire to make educating oneself as easy, affordable, and available as possible." Contributors: please come and join the project, give a hand to Paul and the team to help move this great endeavor forward.

Wednesday, January 04, 2017

"Weapons of Math Destruction" by Cathy O'Neil

In a 1947 lecture on computing machinery, Alan Turing made a prediction: "The new machines will in no way replace thought, but rather they will increase the need for it."

Someday, he said, machines would think for themselves, but the computers of the near future would require human supervision to prevent malfunctions:

"The intention in constructing these machines in the first instance is to treat them as slaves, giving them only jobs which have been thought out in detail, jobs such that the user of the machine fully understands in principle what is going on all the time." 1

It is unclear now whether machines remain slaves, or if they are beginning to be masters. Machine-learning algorithms pervasively control the lives of Americans. We do not fully understand what they do, and when they malfunction they harm us, by reinforcing the unjust systems we already have. Usually unintentionally, they can make the lives of poor people and people of color worse.

In "Weapons of Math Destruction", Cathy O'Neil identifies such an algorithm as a "WMD" if it satisfies three criteria: it makes decisions of consequence for a large number of people, it is opaque and unaccountable, and it is destructive. I interviewed O'Neil to learn what data scientists should do to disarm these weapons.

Automated Injustice

Recidivism risk models are a striking example of algorithms that reinforce injustice. These algorithms purport to predict how likely a convict is to commit another crime in the next few years. The model described in O'Neil's book, called LSI-R, assesses offenders with 54 questions, then produces a risk score based on correlations between each offender's characteristics and the characteristics of recidivists and non-recidivists in a sample population of offenders.

Some of LSI-R's factors measure the offender's past behavior: Has she ever been expelled from school, or violated parole? But most factors probably aren't under the individual's control: Does she live in a high-crime neighborhood? Is she poor? And many factors are not under her control at all: Has a family member been convicted of any crimes? Did her parents raise her with a "rewarding" parenting style?

Studies of LSI-R show it gives worse scores to poor black people. Some of its questions directly measure poverty, and others (such as frequently changing residence) are proxies for poverty. LSI-R does not know the offender's race. It would be illegal to ask, but, O'Neil writes, "with the wealth of detail each prisoner provides, that single illegal question is almost superfluous." For example, it asks the offender's age when he was first involved with the police. O'Neil cites a 2013 New York Civil Liberties Union study finding that young black and Hispanic men were ten times as likely to be stopped by the New York City police, even though only a tiny fraction were doing anything criminal.

So far, the LSI-R does not automatically become destructive. If it is accurate, and used for benign choices like spending more time treating and counselling offenders with high risk scores, it could do some good. But in many states, judges use the LSI-R and models like it to decide how long the offender's sentence should be. This is not LSI-R's intended use, and it is certainly not accurate enough for it: a study this year found that LSI-R misclassified 41% of offenders. 2

Success, According to Whom?

O'Neil told me that whether an algorithm becomes a WMD depends on who defines success, and according to whom. "Over and over again, people act as if there's only one set of stakeholders."

When a recidivism risk model is used to sentence someone to a longer prison term, the sole stakeholder respected is law enforcement. "Law enforcement cares more about true positives, correctly identifying someone who will reoffend and putting them in jail for longer to keep them from committing another crime." But our society has a powerful interest in preventing false positives. Indeed, we were founded on a constitution that considered a false positive—that is, being punished for a crime you did not commit—to be extremely costly. Principles including the presumption of innocence, the requirement that guilt is proven beyond reasonable doubt, and so on, express our desire to avoid unjust punishment, even at the cost of some criminals being punished too little or going free.

However, this interest is ignored when an offender is punished for a bad LSI-R score. His total sentence accounts not only for the crime he committed, but also for future crimes he is thought likely to commit. Furthermore, he is punished for who he is: Being related to a criminal or being raised badly are circumstances of birth, but for many people facing sentencing, such circumstances are used to add years to their time behind bars.

Statistically Unsound

Cathy O'Neil says weapons of math destruction are usually caused by two failures. The first is when only one stakeholder's interests define success. LSI-R is an example of this. The other is a lack of actual science in data science. For these algorithms, she told me, "We actually don't have reasonable ways of checking to see whether something is working or not."

A New York City public school program begun in 2007 assessed teachers with a "value added model", which estimated how much a teacher affected each student's progress on standardized tests. To begin, the model forecast students' progress, given their neighborhood, family income, previous achievement, and so on. At the end of the year their actual progress was compared to the forecast, and the difference was attributed to the teacher's effectiveness. O'Neil tells the story of Tim Clifford, a public school teacher who scored only 6 out of 100 the first year he was assessed, then 96 out of 100 the next year. O'Neil writes, "Attempting to score a teacher's effectiveness by analyzing the test results of only twenty-five or thirty students is statistically unsound, even laughable." One analysis of the assessment showed that a quarter of teachers' scores swung by 40 points in a year. Another showed that, with such small samples, the margin of error made half of all teachers statistically indistinguishable.
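The instability described above can be illustrated with a minimal simulation sketch. This is not the value-added model New York City used. It is a deliberately simplified assumption in which every teacher has exactly the same true effect, so any change in a teacher's percentile rank from one year to the next is pure sampling noise. All names and parameters (`class_size`, `noise_sd`, `swing`, and so on) are hypothetical.

```python
import random


def percentile_ranks(values):
    """Map each value to a 0-100 percentile rank within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * position / (len(values) - 1)
    return ranks


def simulate_year(rng, n_teachers, class_size, noise_sd):
    """Return each teacher's average (actual - forecast) student progress.

    Assumption: every teacher's true effect is zero, so the averages
    differ only because of per-student noise.
    """
    return [
        sum(rng.gauss(0.0, noise_sd) for _ in range(class_size)) / class_size
        for _ in range(n_teachers)
    ]


def share_with_big_swing(seed=0, n_teachers=1000, class_size=25,
                         noise_sd=10.0, swing=40.0):
    """Fraction of teachers whose rank moves by `swing` points or more."""
    rng = random.Random(seed)
    year1 = percentile_ranks(simulate_year(rng, n_teachers, class_size, noise_sd))
    year2 = percentile_ranks(simulate_year(rng, n_teachers, class_size, noise_sd))
    big = sum(1 for a, b in zip(year1, year2) if abs(a - b) >= swing)
    return big / n_teachers


if __name__ == "__main__":
    print(f"Share of identical teachers with a 40-point swing: "
          f"{share_with_big_swing():.0%}")
```

Because the simulated teachers are identical, any large swing the sketch reports comes from noise alone. Real teachers do differ, so the sketch shows only how much movement small samples can produce by themselves, not how the actual assessment behaved.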

Nevertheless, the score might determine if the teacher was given a bonus, or fired. Although its decision was probabilistic, appealing it required conclusive evidence. O'Neil points out that time and again, "the human victims of WMDs are held to a higher standard of evidence than the algorithms themselves." The model is math so it is presumed correct, and anyone who objects to its scores is suspect.

New York Governor Andrew Cuomo put a moratorium on these teacher evaluations in 2015. We are starting to see that some questions require too subtle an intelligence for our current algorithms to answer accurately. As Alan Turing said, "If a machine is expected to be infallible, it cannot also be intelligent."

Responsible Data Science

I asked Cathy O'Neil about the responsibilities of data scientists, both in their daily work and as reformers of their profession. Regarding daily work, O'Neil drew a sharp line: "I don't want data scientists to be de facto policy makers." Rather, their job is to explain to policy makers the moral tradeoffs of their choices. The same as any programmer gathers requirements before coding a solution, data scientists should gather requirements regarding the relative cost of different kinds of errors. Machine learning algorithms are always imperfect, but they can be tweaked for either more false positives or more false negatives. When the stakes are high, the choice between the two is a moral one. Data scientists must pose these questions frankly to policy makers, says O'Neil, and "translate moral decisions into code."
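The tradeoff between false positives and false negatives can be made concrete with a small sketch. It assumes a model that outputs a risk score between 0 and 1 and a cutoff above which a person is flagged. The function name, the scores, and the labels below are all invented for illustration and do not come from LSI-R or any real model.

```python
def confusion_counts(scores, labels, threshold):
    """Count false positives and false negatives at a given cutoff.

    scores: predicted risk in [0, 1] (hypothetical model output)
    labels: True if the person actually reoffended
    """
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    return false_pos, false_neg


# Invented toy data: nine people, four of whom actually reoffended.
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9]
labels = [False, False, True, False, False, True, False, True, True]

if __name__ == "__main__":
    for threshold in (0.3, 0.5, 0.75):
        fp, fn = confusion_counts(scores, labels, threshold)
        print(f"threshold={threshold}: {fp} false positives, {fn} false negatives")
```

On this toy data, raising the cutoff from 0.3 to 0.75 moves the errors from three false positives and no false negatives to no false positives and two false negatives. The code cannot say which cutoff is right. That choice is the moral decision O'Neil wants posed to policy makers.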

Tradeoffs in the private sector often pit corporate interests against human ones. This is especially dangerous to the poor because, as O'Neil writes, "The privileged are processed more by people, the masses by machines." She told me that when the boss asks for an algorithm that optimizes for profit, it is the data scientist's duty to mention that the algorithm should also consider fairness.

"Weapons of Math Destruction" tells us how to recognize a WMD once it is built. But how can we predict whether an algorithm will become a WMD? O'Neil told me, "The biggest warning sign is if you're choosing winners and losers, and if it's a big deal for losers to lose. If it's an important decision and it's a secret formula, then that's a set-up for a weapon of math destruction. The only other ingredient you need in that setup is actually making it destructive."


Cathy O'Neil says the top priority, for data scientists who want to disarm WMDs, is to develop tools for analyzing them. For example, any EU citizen harmed by an algorithmic decision may soon have the legal right to an explanation, but so far we lack the tools to provide one. We also need tools to measure disparate impact and unfairness. O'Neil says, "We need tools to decide whether an algorithm is being racist."
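As one example of what a simple measurement tool might look like, the sketch below compares how often two groups receive a favorable decision. This is only one of several possible fairness measures, and it is not a tool O'Neil describes. The function names, group labels, and records are hypothetical.

```python
from collections import defaultdict


def favorable_rates(records):
    """Return the share of favorable decisions per group.

    records: iterable of (group, got_favorable_decision) pairs.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, got_favorable in records:
        totals[group] += 1
        if got_favorable:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records, protected, reference):
    """Favorable rate of the protected group divided by the reference group's.

    A ratio well below 1.0 suggests the protected group fares worse.
    Raises ZeroDivisionError if the reference group has no favorable decisions.
    """
    rates = favorable_rates(records)
    return rates[protected] / rates[reference]


# Invented decisions: group "a" is approved 4 times out of 8, group "b" 2 out of 8.
records = (
    [("a", True)] * 4 + [("a", False)] * 4
    + [("b", True)] * 2 + [("b", False)] * 6
)

if __name__ == "__main__":
    ratio = disparate_impact_ratio(records, protected="b", reference="a")
    print(f"Disparate impact ratio: {ratio:.2f}")
```

A ratio like this says nothing about why the groups differ, and it needs access to group membership that a model's operators may not record. It is a starting point for asking questions, not a verdict.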

New data scientists should enter the field with better training in ethics. Curricula usually ignore questions of justice, as if the job of the data scientist were purely technical. Data-science contests like Kaggle also encourage this view, says O'Neil. "Kaggle has defined the success and the penalty function. The hard part of data science is everything that happens before Kaggle." O'Neil wants more case studies from the field, anonymized so students can learn from them how data science is really practiced. It would be an opportunity to ask: When an algorithm makes a mistake, who gets hurt?

If data scientists take responsibility for the effects of their work, says O'Neil, they will become activists. "I'm hoping the book, at the very least, gets people to acknowledge the power that they're wielding," she says, "and how it could be used for good or bad. The very first thing we have to realize is that well-intentioned people can make horrible mistakes."

1. Quoted in "Alan Turing: The Enigma", by Andrew Hodges. Princeton University Press.

2. See also ProPublica's analysis of bias in a similar recidivism model, COMPAS.