Data Lifecycles at Massive Scale Using Python

Audience:

Topic:

Flipboard is a social-network aggregation mobile app that aims to transform how people discover, view and share content by combining the beauty and ease of print with the power of social media. The company has hundreds of instances running in the cloud, and that number keeps growing along with its user base and feature set.

Qubole and Flipboard will provide a look inside Flipboard’s data lifecycle and how the company uses Python, Apache Hive, cloud technologies such as AWS and Qubole, and its own algorithms to ultimately provide content recommendations to its audience. They will delve into the innovative work Flipboard is doing with in-memory processing, and how the latest cloud technologies enable it to manage and analyze large data sets continuously, cost-effectively and at scale.

The session will give an overview of the data lifecycle. Ashish and Rob will discuss the data strategy and architecture, the power of Python and in-memory processing, and the role of Qubole in Flipboard’s Python SDK and other cloud technologies that let the company quickly and easily access data, analyze it and feed it into other models.

Presentation:

SoCal_Linux_Expo_v1.pdf

Room:

Room 211

Time:

Sunday, January 24, 2016 - 13:30 to 14:30