The addition, which follows GitHub’s introduction of Large File Storage last month, could make GitHub a better place for data scientists to share and collaborate on their statistical work, rather than solely a virtual destination for developers working together on projects.
Sharing an actual notebook on a GitHub page is certainly much better than sharing a screenshot of a notebook with colleagues, Arfon Smith, GitHub’s head of science, told VentureBeat in an interview.
The idea to support Jupyter came about a year or so ago, right around the time that startups like Domino Data Lab and Sense were emerging to focus on the data science collaboration market.
But GitHub’s support for .ipynb files won’t instantly make the company a competitor to those startups, or to others like Plotly or Yhat, Smith said. GitHub isn’t performing the computations as a cloud service; it only provides a place to share results. That’s the major difference.
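To illustrate that distinction: an .ipynb file is a JSON document that stores each cell’s code together with the output saved the last time the notebook was run, so a site can show results without running anything. The Python sketch below is illustrative only and is not GitHub’s implementation; the sample notebook and the `stored_outputs` helper are made up for this example.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 JSON layout.
# The cell contents are invented for illustration.
NOTEBOOK_JSON = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Mean of a small sample"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["sum([2, 4, 6]) / 3"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "metadata": {},
                    "data": {"text/plain": ["4.0"]},
                }
            ],
        },
    ],
})


def stored_outputs(notebook_text):
    """Return (source, saved plain-text output) pairs for code cells.

    Nothing is executed: outputs are read back exactly as they were saved.
    """
    notebook = json.loads(notebook_text)
    pairs = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue  # markdown cells carry no outputs
        source = "".join(cell["source"])
        saved = []
        for output in cell.get("outputs", []):
            if output["output_type"] == "stream":
                # stdout/stderr text captured during the original run
                saved.append("".join(output["text"]))
            elif "data" in output:
                # rich results keep a plain-text form under "text/plain"
                saved.append("".join(output["data"].get("text/plain", [])))
        pairs.append((source, "\n".join(saved)))
    return pairs


if __name__ == "__main__":
    for source, saved in stored_outputs(NOTEBOOK_JSON):
        print(source, "->", saved)
```

The helper only reads what the file already contains, which is why displaying a notebook needs no compute service behind it.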
The move follows Google Research’s release last summer of a Chrome browser app that installs IPython, a component of Jupyter, to let people work with data in a way that’s integrated with Google Drive.
Support for Jupyter notebooks will ship in version 2.3 of the GitHub Enterprise software, which can run in companies’ on-premises data centers and on clouds like Amazon Web Services and Microsoft Azure. Version 2.3 should arrive in July, Smith said.
Data scientists at Stitch Fix, a startup that relies on stylists to deliver clothes and accessories to women, are already using Jupyter notebooks.
And in academia, where Jupyter notebooks are often used, GitHub could also become a better place for sharing information.
“It’s a really compelling option for people who want to combine analysis and visualization into an easily sharable format,” Smith said.
See Smith’s blog post for more detail.