DevOps for Data as a Service
Modern data engineering needs to be agile and able to respond quickly to a changing business landscape without sacrificing data quality.
DevOps revolutionized software engineering by adopting agile and lean practices and by fostering collaboration. Data engineering needs the same shift.
In this talk, we will go over how to adopt DevOps best practices in data engineering, and what challenges arise given the different skill sets and needs of data engineers. In particular:
- What is the API for data?
- What types of SLOs and SLAs do data engineers need to track?
- How do we adapt and automate the DevOps cycle (plan, code, build, test, release, deploy, operate, monitor) for data?
These are challenging questions, and the data engineering space does not yet have good answers to them.
We will demonstrate how a new open source project, Versatile Data Kit (https://github.com/vmware/versatile-data-kit), addresses these questions and helps introduce DevOps practices in data engineering.
At VMware, we have had to tackle these problems ourselves and have come up with a solution that we think would benefit the whole community.
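To make the SLO question above more concrete, here is a minimal Python sketch of one kind of objective a data team might track: data freshness. It is an illustration only and does not use or represent the Versatile Data Kit API. The names `FreshnessSlo`, `is_fresh`, and `attainment`, as well as the table name and the 24-hour threshold, are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FreshnessSlo:
    """Hypothetical freshness objective for one table."""
    table: str
    max_staleness: timedelta  # how old the newest load may be


def is_fresh(slo: FreshnessSlo, last_loaded_at: datetime, now: datetime) -> bool:
    """Return True if the latest load is within the allowed staleness."""
    return now - last_loaded_at <= slo.max_staleness


def attainment(slo: FreshnessSlo, observations: list) -> float:
    """Fraction of checks that met the objective.

    Each observation is a (check_time, last_loaded_at) pair.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    met = sum(1 for now, loaded in observations if is_fresh(slo, loaded, now))
    return met / len(observations)


# Assumed example: the table must be no more than 24 hours stale.
slo = FreshnessSlo(table="sales_daily", max_staleness=timedelta(hours=24))
check_time = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
observations = [
    (check_time, check_time - timedelta(hours=1)),   # fresh
    (check_time, check_time - timedelta(hours=30)),  # stale
]
print(f"{slo.table}: {attainment(slo, observations):.0%} of checks met the SLO")
```

An SLA could then be phrased on top of such a measurement, for example as a minimum attainment over a reporting period; the exact targets would depend on the data consumers.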