Vertex AI — Google's ML platform

A unified platform to train, tune, deploy, and call ML models — including Google's Gemini family via the Gemini API. Covers the full ML lifecycle.

Easy Technical

1 min read

Vertex AI is a managed end-to-end MLOps platform: Workbench (managed Jupyter), Data Labeling, Feature Store, Training (custom / AutoML / hyperparameter tuning with Vizier), Model Registry (versioned models with metadata), Endpoints (online + batch prediction), Pipelines (Kubeflow-compatible DAGs), Model Monitoring (drift + skew detection), Explainable AI (feature attributions).

Gemini API / Vertex AI Generative AI Studio: REST/SDK access to Google's foundation models — Gemini (multi-modal LLM: text + image + video + audio), Imagen (text-to-image), Veo (text-to-video), embedding models (text-embedding-005, multimodalembedding). Supports function calling, grounding (Google Search, custom data stores), streaming, tool use — same feature surface as Anthropic/OpenAI APIs.

Training options: (a) AutoML — upload data, select task type, Google searches architectures and hyperparameters; (b) Custom training — submit a container or Python package, pick machine type (N1/A2/A3 with A100/H100 GPUs, or TPU v4/v5), Vertex runs the job and saves artifacts; (c) Pre-built containers — TensorFlow, PyTorch, scikit-learn, XGBoost versions maintained by Google.

Model deployment: a model artifact is registered (Model Registry), then deployed to an Endpoint which provisions replicas on chosen machine types. Autoscaling uses CPU / accelerator utilization targets. Traffic splitting (90/10) supports canary releases. Private Endpoints avoid public exposure.

Feature Store: centralized low-latency storage for ML features (online serving in ms) + offline store (BigQuery) for training consistency. Solves training/serving skew — you compute features once, use them in both places.

Pricing considerations: per-node-hour for endpoints (charged while deployed, not per request), per-vCPU/GPU/TPU-hour for training, per-1k-character for Gemini API calls, separate tier for Vertex AI Workbench notebook VMs. Undeploy endpoints aggressively when not used — they bill 24/7.

Grounded on https://cloud.google.com/vertex-ai/docs/start/introduction-unified-platform

Next up

Firestore — Serverless NoSQL document database

A real-time NoSQL database with offline sync, live listeners, and granular security rules. Ideal for mobile/web apps that want live updates without building a backend.