Session: 2 for 1: Open Source Data Annotation Platform for NLP, CV, Tabular, and Log Data / Deploying Anything as a Service (XaaS) Using Operators on Kubernetes

Open Source Data Annotation Platform for NLP, CV, Tabular, and Log Data – Julia Li & Steve Liang

We introduce Data Annotator for Machine Learning (DAML), a data annotation platform for creating high-quality labeled datasets for supervised machine learning. This all-in-one platform helps machine learning teams collaborate on creating and managing data annotation projects, so they can quickly build custom training datasets while maintaining data quality. The tool supports all major data and task types (text, tabular, image, named entity recognition, and log data) and uses active learning to intelligently query annotators for labels on the data that matters most, reducing the amount of labeled data needed to reach similar accuracy.
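
To make the active learning idea concrete, here is a minimal sketch of one common query strategy, entropy-based uncertainty sampling. The abstract does not say which strategy DAML uses, so the strategy itself, the function names, and the sample predictions are all assumptions for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, batch_size):
    """Return the ids of the `batch_size` unlabeled items the model is least sure about."""
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical model outputs: item id -> predicted class probabilities.
predictions = {
    "ticket-1": [0.98, 0.01, 0.01],
    "ticket-2": [0.40, 0.35, 0.25],
    "ticket-3": [0.70, 0.20, 0.10],
    "ticket-4": [0.34, 0.33, 0.33],
}

# The two most ambiguous items are sent to annotators first.
queue = select_for_annotation(predictions, batch_size=2)
print(queue)
```

Items the model already classifies confidently (such as the hypothetical `ticket-1`) are left for later, which is how this kind of strategy can reduce the number of labels needed.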

Deploying Anything as a Service (XaaS) Using Operators on Kubernetes – Jeff Spahr

Kubernetes has long since solved compute as a service, but what if you want to deploy higher-level services without reimplementing the finer details of how to scale, cluster, and upgrade those services? Custom Resource Definitions (CRDs) allow users to extend the Kubernetes API with new resource types such as ‘kind: elasticsearch’ or ‘kind: mariadb’. Operators watch the custom resources those CRDs define and take on orchestration and lifecycle management of the underlying services.
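
The core of an operator is a reconcile step that compares the desired state declared in a custom resource with what is actually running. The sketch below illustrates that idea in plain Python with no Kubernetes client; the resource fields (`replicas`, `version`), the action names, and the `reconcile` function are hypothetical and are not taken from any real operator.

```python
def reconcile(desired, observed):
    """Compare a custom resource's spec with observed state and return the actions to take."""
    if observed is None:
        # Nothing is running yet, so the service must be created.
        return [("create", desired["replicas"])]

    actions = []
    if observed["version"] != desired["version"]:
        actions.append(("upgrade", desired["version"]))

    diff = desired["replicas"] - observed["replicas"]
    if diff > 0:
        actions.append(("scale_up", diff))
    elif diff < 0:
        actions.append(("scale_down", -diff))
    return actions


# Hypothetical custom resource, mirroring a manifest with `kind: mariadb`.
custom_resource = {
    "kind": "mariadb",
    "metadata": {"name": "orders-db"},
    "spec": {"replicas": 3, "version": "10.6"},
}

# Hypothetical state reported by the cluster.
observed_state = {"replicas": 1, "version": "10.5"}

plan = reconcile(custom_resource["spec"], observed_state)
print(plan)
```

A real operator runs this kind of comparison continuously in response to cluster events and applies the resulting actions through the Kubernetes API, so end users only edit the declarative resource.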

In this talk I’ll cover the what and why of Operators on Kubernetes, focusing on the real-world problems they solve for Kubernetes end users. I’ll walk through deploying operators for common high-level services that make up a production application.

The XaaS walkthrough and demo will include some of the following technologies:

  • Cloud Services (EC2, S3)
  • Databases (MariaDB, Vitess, Elasticsearch)
  • Load balancers (F5, NGINX)
  • Streaming (Kafka, RabbitMQ)

You’ll leave this session with a foundation to start offering XaaS to your end users.

Presenters: