Google Cloud Certified - Professional Cloud Developer
Cloud computing has five fundamental attributes
- Customers get resources on demand and self-service
- Customers can access the resources over the network
- The provider has a pool of resources, so the customer doesn't know the exact location of a resource
- Customers can scale resources rapidly
- Customers pay only for what they use
Waves of the trend towards cloud computing
- Colocation: you own the servers, but they are hosted in a provider's facility
- Virtualized data centers: you still control the infrastructure, but it is virtualized
- Container-based architecture: more versatile than virtualization
Kinds of services
Traditional on-premises (legacy)
Infrastructure as a service (IaaS)
- You rent the hardware, but you are still in charge of everything else. You deploy your app in a virtual machine.
- You pay for what you allocate
- (Amazon EC2, DigitalOcean)
Platform as a service (PaaS)
- You rent the hardware and the platform software, and you build your application on top of the platform
- You pay for what you use
- (Heroku, Salesforce)
Software as a Service (SaaS)
- You rent everything; you just use the software (PayPal, Facebook)
Regions and Zones
Regions are independent areas that contain zones. The round-trip latency between two points in the same region is under 5 milliseconds. A zone is the minimum area for a failure, so to get a resilient application, it must be deployed across multiple zones.
As of 2019 there were 20 regions.
A Compute Engine VM instance resides in a specific zone
Resources
There are regional and multi-regional resources. Multi-regional resources have higher latency but are more fault-tolerant. These services offer regional and multi-regional deployments:
- Google App Engine
- Google Cloud Datastore
- Google Cloud Storage
- Google BigQuery
Resource hierarchy
- Organization node
- A policy can be defined
- When first created, all users can create projects and billing accounts
- Folder (optional)
- A policy can be defined
- Can contain more Folders
- If a folder exists, an organization node must exist as well
- Project
- A policy can be defined
- It is the basis for enabling and using services
- It can have different owners and users
- They are billed and managed separately.
- It has to have:
- Project ID (chosen by you, globally unique, immutable)
- Project Name (need not be unique)
- Project number (assigned by Google, unique, immutable)
- Some resources allow a policy to be defined on them
- A resource belongs to only one project
Policies are inherited across the tree
A policy applied at a lower level cannot remove access granted at a higher level. If several rules apply, the less restrictive one prevails.
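The inheritance rule can be sketched in a few lines of Python (an illustrative model, not a real GCP API; the accounts are made-up examples):

```python
# Illustrative sketch: effective IAM access is the union of the policies
# granted at each level of the resource hierarchy. Grants are additive;
# a lower level cannot revoke what a higher level granted.
def effective_members(role, policies_along_path):
    """policies_along_path: list of {role: set(members)} dicts, ordered
    from the organization node down to the resource."""
    members = set()
    for policy in policies_along_path:
        members |= policy.get(role, set())
    return members

org = {"roles/viewer": {"user:alice@example.com"}}
project = {"roles/viewer": {"user:bob@example.com"}}
# Both alice (granted at the org) and bob (granted at the project) can view.
print(effective_members("roles/viewer", [org, project]))
```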
Google Cloud billing
- Billed in seconds (compute and data processing):
- Compute Engine
- Kubernetes Engine
- Cloud Dataproc (Hadoop, the open source big data system, as a service)
- App Engine flexible environment VMs
- Discount for every machine used more than 25% of a month (sustained use)
- Discount for long-term workloads (committed use)
- Discount for preemptible use
- Pay only for the resources you need
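The sustained-use discount can be made concrete with a small calculation; the tier rates below match the historically published N1 tiers, but treat the exact numbers as illustrative:

```python
# Sketch of sustained-use discount math: once usage passes 25% of the
# month, each additional quarter of the month is billed at a steeper
# discount. Rates here are the historical N1 tiers (illustrative).
TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]

def effective_multiplier(fraction_of_month):
    """Average price multiplier for a VM running this fraction of a month."""
    billed, remaining = 0.0, fraction_of_month
    for width, rate in TIERS:
        used = min(width, remaining)
        billed += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return billed / fraction_of_month

# A VM that runs the entire month pays an average of 70% of the base
# rate, i.e. the 30% full-month discount mentioned later in these notes.
print(round(1 - effective_multiplier(1.0), 2))  # 0.3
```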
Open APIs
- Cloud Bigtable uses the Apache HBase interface
- Cloud Dataproc offers Hadoop as a managed service
- TensorFlow: software library for machine learning
- Kubernetes and Google Kubernetes Engine can mix microservices across different clouds.
- Google Stackdriver lets customers monitor workloads across multiple cloud providers
Security
- Cryptographic signatures over the boot stack:
- BIOS
- bootloader
- base operating system image
- Physical security in data centers
- Google also hosts some servers in third-party data centers, with extra physical protections
- Encryption of inter-service communication
- Services communicate with RPC calls
- Cryptographic privacy for RPCs
- Use of hardware cryptographic accelerators
- User identity protection, beyond login and password:
- device used to log in
- location of login
- second-factor authentication
- Encryption of the storage media
- Google Front End (GFE)
- TLS encryption
- Protection against Denial of Service
- absorb many DoS attacks by nature of the scale
- multi-tier and multi-layer protection
- Intrusion detection
- Software development practices:
- central review control
- two-party review of new code
Service categorization
- Compute
- Storage
- Big Data
- Machine Learning
- Networking
- Operations/Tools
Budgets and alerts
- Budgets are defined at the billing account
- Alerts at a % of the budget
- Billing data can be exported, for example to BigQuery or Cloud Storage
- Reports can be defined per project or per service
- Quotas are applied at the project level
- A rate quota resets after a specific time
- An allocation quota caps the number of resources
- Increases to either kind of quota are requested from Google Cloud Support
Google Cloud Identity and Access Management
Defines who can do what on which resource; like an ACL, it allows somebody to perform an action on a resource.
Who:
- Google account
- Google group
- Service account
- They authenticate using keys
- Cloud Identity or G Suite Domain
Can do what:
- Managed by roles; a permission has the form service.resource.verb
- Primitive roles (apply to the whole project):
- Owner
- Editor
- Viewer
- Billing Administrator
- Predefined roles grant particular permissions on particular services
- Custom roles can only be used at the project or organization level
On which resource
Identities can be synced one-way from an existing LDAP directory (e.g., with Google Cloud Directory Sync)
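As a sketch of how "who", "can do what", and "on which resource" tie together, this is the general JSON shape of an IAM policy as returned by getIamPolicy (the member identities and etag value are made-up examples):

```python
# An IAM policy attached to a resource: each binding maps one role to
# the list of identities ("who") that hold it on that resource.
policy = {
    "bindings": [
        {
            "role": "roles/storage.objectViewer",
            "members": [
                "user:alice@example.com",
                "serviceAccount:app@my-project.iam.gserviceaccount.com",
                "group:auditors@example.com",
            ],
        }
    ],
    "etag": "BwWKmjvelug=",  # made-up example value
}

for binding in policy["bindings"]:
    print(binding["role"], "->", len(binding["members"]), "members")
```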
Interaction with Google Cloud Platform
- Cloud Platform Console
- Cloud Shell / Cloud SDK
- gcloud tool
- gsutil tool
- bq tool
- Gives a temporary Compute Engine virtual machine instance running Debian
- 5GB of persistent disk storage mounted in $HOME
- Built in authorization for access to projects and resources
- Cloud Console Mobile App
- REST-based API
- use JSON as interchange format
- use OAuth 2.0 for authentication and authorization
- Cloud Client Libraries / Google API Client Libraries (Latest)
- Cloud marketplace
- offered by Google and third parties
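The REST-based API interaction can be illustrated by constructing (without sending) a Cloud Storage JSON API request; the bucket name and token here are placeholders:

```python
# Sketch: every GCP service exposes a JSON REST API. This builds (but
# does not send) the Cloud Storage JSON API request to list objects in
# a bucket; in a real call the OAuth 2.0 access token goes in the
# Authorization header shown.
from urllib.parse import urlencode

def list_objects_request(bucket, prefix=None, token="<oauth2-access-token>"):
    url = f"https://storage.googleapis.com/storage/v1/b/{bucket}/o"
    if prefix:
        url += "?" + urlencode({"prefix": prefix})
    headers = {"Authorization": f"Bearer {token}"}  # OAuth 2.0 bearer token
    return url, headers

url, headers = list_objects_request("my-bucket", prefix="logs/")
print(url)  # https://storage.googleapis.com/storage/v1/b/my-bucket/o?prefix=logs%2F
```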
Virtual Private Cloud Networking
VPC networks are global; subnets are regional.
A VPC belongs to a Google Cloud Platform Project.
Virtual Private Cloud networks have routing tables to forward traffic between instances. There is a global distributed firewall, configurable per Compute Engine instance and definable by metadata tags (for example: all instances with the "WEB" tag accept incoming traffic on ports 80 and 443).
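The tag-based firewall idea can be sketched as a toy matcher (illustrative only; the real firewall also handles protocols, source ranges, priorities, and more):

```python
# Toy model of tag-based firewall rules: a rule allows ingress on given
# ports for any instance carrying the rule's target tag.
def is_allowed(instance_tags, port, rules):
    """rules: list of (target_tag, allowed_ports) pairs."""
    return any(tag in instance_tags and port in ports
               for tag, ports in rules)

# One rule, as in the example above: instances tagged "WEB" accept 80/443.
rules = [("WEB", {80, 443})]
assert is_allowed({"WEB"}, 443, rules)       # web server: allowed
assert not is_allowed({"DB"}, 443, rules)    # untagged database VM: blocked
```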
With VPC peering you can establish connectivity between VPCs in different Google Cloud Platform projects
Ways to connect to on-premises networks:
- Direct Peering
- Put a router in the same public data center as a Google point of presence and route traffic to an on-premises system
- It is not covered by a Google SLA
- Carrier Peering
- Connection through a partner's network
- Dedicated Interconnect
- A dedicated, private connection to Google, covered by an SLA of up to 99.99%
- Partner Interconnect
- Connection through a supported service provider
- Useful if your physical location cannot reach a Dedicated Interconnect facility, or if some downtime can be tolerated
Compute Engine
- CPU, memory, amount of storage, and OS can be selected and changed later
- Persistent storage, which can be resized with no downtime
- A Windows or Linux premade image as well as a custom image can be run
- Billed per second
- Discount per incremental minute if you run more than 25% of the month (30% discount if you run the entire month)
- Up to 57% discount if you commit to 1 or 3 years of CPU usage (committed use)
- Discount if you use a preemptible machine (it can be stopped if its resources are needed elsewhere)
In 2019 the maximum number of virtual CPUs was 96 (zone dependent) and the maximum memory size was 624 GB; a megamem machine type can handle 1.4 TB.
Autoscaling allows you to add and remove virtual machines for your application based on load metrics.
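Target-utilization autoscaling, conceptually, just solves for the number of VMs that keeps average load at or below a target; a minimal sketch (the numbers are made up):

```python
# Conceptual sketch of target-utilization autoscaling: given the total
# load and a per-VM utilization target, pick enough VMs so that average
# utilization stays at or below the target.
import math

def target_vm_count(current_total_load, target_utilization_per_vm):
    return max(1, math.ceil(current_total_load / target_utilization_per_vm))

# Total load equivalent to 5.2 fully busy VMs, 60% target utilization:
print(target_vm_count(5.2, 0.6))  # 9
```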
Cloud Load Balancing
Provides cross-region load balancing, including automatic multi-region failover:
- Global HTTP(S) load balancing (Layer 7)
- Global SSL Proxy
- Layer 4, for SSL traffic that is not HTTP
- Specific port numbers
- Global TCP Proxy
- Layer 4, no HTTP, no SSL
- Specific port numbers
- Regional load balancing
- UDP traffic
- Any port number
Cloud DNS
Google also operates 8.8.8.8, a free public DNS resolver for the web.
Cloud DNS is a managed DNS service. It is programmable using the GCP Console, the command-line interface, or the API.
Cloud CDN (Content delivery Network)
Enabled by a single checkbox in the load balancer configuration
Cloud Storage
- binary large-object storage addressed by unique keys
- Data is encrypted at rest, on the server side
- The objects are immutable
- It is useful when large-object storage is needed
- A service is available to send large amounts of data offline, via hard drives or USB flash drives
- The files are organized into buckets (location and name are picked by the user)
- Object versioning is available (turned off by default)
- Offers lifecycle management policy
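A lifecycle management policy is expressed as a JSON document set on the bucket; this sketch shows the general shape (the ages and storage class are example values):

```python
# Shape of a Cloud Storage lifecycle policy: move objects to Nearline
# after 30 days, then delete them after 365 days.
import json

lifecycle = {
    "lifecycle": {
        "rule": [
            {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
             "condition": {"age": 30}},
            {"action": {"type": "Delete"},
             "condition": {"age": 365}},
        ]
    }
}
print(json.dumps(lifecycle, indent=2))
```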
Cloud Storage Classes
- Multi-Regional
- Stores your data in at least two geographic locations separated by at least 160 km
- Used for frequently accessed data
- Regional
- Lets you store your data in a single region
- Often used with Compute Engine and Kubernetes Engine
- Nearline
- Ideal when you access or modify your data about once a month
- Coldline
- Ideal when you access your data about once a year
Ways to bring data to Cloud Storage
- gsutil
- Drag & drop in the Cloud Console
- Online Storage Transfer Service
- Schedules batch transfers from another endpoint (Google Cloud or another provider)
- Offline Transfer Appliance
Cloud Storage integration
- Import and export tables to/from BigQuery
- Startup scripts, images, and general object storage for Compute Engine
- Logs and images storage from App Engine
- Datastore Backups
- Import and export tables from Cloud SQL
Cloud Bigtable
It is a fully managed NoSQL database for terabyte-scale applications
- It uses HBase API
- Compatible with Hadoop ecosystems
- Streams to Cloud Dataflow Streaming, Spark Streaming, and Storm, or feeds batch processes
- It is the same database that powers Google Search, Analytics, Maps, and Gmail
When to use Cloud Bigtable
- There is a large amount of data (petabytes)
- Data is changing fast
- Strong relational semantics are not required
- Data is naturally ordered by time
- You run asynchronous batch or real-time processing
- You run machine learning algorithms.
- You don’t need multi-row transactions
To sum up, it handles massive workloads with low latency and high throughput. It is appropriate for operational and analytical applications and IoT.
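Since Bigtable sorts rows lexicographically by key and the notes call out time-ordered data, a common row-key pattern is an entity prefix plus a zero-padded reversed timestamp, so the newest rows sort first; a small sketch (the timestamp bound is an assumption for the example):

```python
# Row-key design sketch for time-ordered data in Bigtable: rows sort
# lexicographically by key, so subtracting the timestamp from a fixed
# bound makes the newest readings sort first under each device prefix.
MAX_TS = 10**13  # assumption: millisecond timestamps stay below this bound

def row_key(device_id, ts_millis):
    return f"{device_id}#{MAX_TS - ts_millis:013d}"

# Three readings inserted out of order still sort newest-first by key:
keys = sorted(row_key("sensor-1", t) for t in (1000, 3000, 2000))
print(keys[0])  # key built from the newest timestamp (3000)
```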
Cloud SQL
Managed RDBMS (Relational Data Base Management System).
Offers MySQL and PostgreSQL databases as a service:
- Replication:
- From Google instances to Google instances
- From non-Google instances to Google instances
- From Google instances to non-Google instances
- Up to 7 backups per instance
- encrypted data
- Vertical scaling (read and write)
- Horizontal scaling (read)
- Google security
- network firewall
- Up to 10 TB of storage
Cloud SQL integration
- With App Engine, using standard drivers
- With Compute Engine, using external IP addresses
- With external applications and clients
Cloud Spanner
- Horizontally scalable RDBMS
- Automatic replication
- Strong and global consistency
- Managed instances with high availability
- Uses SQL
It is appropriate if you need:
- A RDBMS with joins and secondary indexes
- High availability
- Strong global consistency
- Database size up to Petabytes
- Many IOPS
Cloud Datastore
Fully managed NoSQL database designed for application backends
Highly scalable
Automatic scaling
Supports terabyte-scale databases
Supports multi-row transactions
Benefits of Cloud Datastore
Local development tools
Includes a free daily quota
RESTful interface
Atomic transactions (ACID)
High availability of reads and writes
Massive scaling with high performance
Flexible storage and querying of data (SQL-like language)
Encryption at rest
Fully managed with no downtime
Google Kubernetes Engine
- Managed, production-ready environment for deploying containerized applications
- Grants high availability
- Runs Kubernetes, thus ensuring portability across clouds and on-premises
- Includes auto node repair, auto upgrade, and autoscaling
- Regional clusters with multiple masters and node storage replication across multiple zones
Google Kubernetes Engine On-Prem
It is GKE that runs on-premises
- kubernetes best practices pre-loaded
- easy update to latest Kubernetes Engine
Stackdriver
Built-in logging and monitoring solution for Google Cloud Platform
- Logging
- View, filter, and search logs
- Define metrics based on logs
- Incorporate logs into alerts
- Export logs to:
- BigQuery
- Cloud Storage
- Cloud Pub/Sub
- Monitoring
- Metrics collection
- Dashboarding
- Alerting solutions
- Debugger
- Connects the production application with its source code and takes snapshots of variable values
- Error Reporting
- Tracks and groups errors, and notifies when new ones are detected
- Trace/Profiler
- Observe call latency between functions, CPU, and memory usage
App Engine
It is a Platform as a service for scalable Applications
Designed for Backend applications and mobile backends
There is a free daily use quota
Provides:
- NoSQL Datastore
- memcache
- load balancing
- health checks
- application logging
- User authentication API
Scales automatically depending on the amount of traffic
Preconfigured with:
- Java 7
- Python 2.7
- Go
- PHP
- (Specific versions are supported)
Persistent storage with queries, sorting and transactions
Restrictions:
- No writing to the local file system
- All requests time out at 60 seconds
- Third-party software is limited
There is a simulated sandbox to emulate App Engine on your local computer; from there you can deploy to App Engine in production.
Security Scanner
Automatically scans for and detects common web application vulnerabilities
App Engine Flexible
Runs in a container instead of a sandbox (Docker inside Compute Engine)
Customizable container
Instances are auto health-checked
Critical backward-compatible operating system updates are automatically applied
Instances are restarted every week
App Engine Flexible can access App Engine services
Support for:
- Java 8
- Servlet 3.1
- Jetty 9
- Python 2.7
- Node.js
- Go
Cloud Endpoints
Distributed API management system. It works with APIs that implement the OpenAPI specification (formerly Swagger)
- User authentication
- Automated deployment
- Logging and monitoring
- API Keys
- Easy integration
Supported platforms for Cloud Endpoints
- App Engine Flexible environment
- Kubernetes Engine
- Compute Engine
- Android
- iOS
- Javascript
Apigee
Platform for developing and managing API proxies
- Helps you secure and monetize APIs
Cloud Source Repository
It is a Git Repository hosted on Google Cloud Platform
Includes integration with Stackdriver Debugger without slowing down the users
Allows any number of Git repositories
Integration with Github and Bitbucket repositories
Cloud Functions
Single-purpose functions that respond to events, without you managing a server or runtime:
- from Cloud Storage
- from Pub/Sub
- HTTP invocations for synchronous execution
Written in JavaScript, Python, or Go; JavaScript functions execute in a Node.js environment
You are billed to the nearest 100 milliseconds, and only while the code is running.
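A minimal sketch of an HTTP-triggered function in Python: in production the argument is a flask.Request, so a tiny stub stands in here to exercise the handler locally.

```python
# Minimal sketch of an HTTP-triggered Cloud Function in Python.
def hello_http(request):
    """In production, 'request' is a flask.Request; only .args is used here."""
    name = request.args.get("name", "World")
    return f"Hello, {name}!"

class FakeRequest:
    """Local stand-in with the one attribute the handler uses."""
    def __init__(self, args):
        self.args = args

print(hello_http(FakeRequest({"name": "GCP"})))  # Hello, GCP!
```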
Deployment Manager
Infrastructure management service that automates the creation and management of resources.
You create a template in YAML or Python, and Deployment Manager performs the actions needed to deploy the environment the template describes
Cloud Dataproc
A managed way to run Hadoop, Spark, Hive and Pig on Google Cloud Platform. A Hadoop cluster will be built in 90 seconds or less.
It can be monitored with Stackdriver
Preemptible instances can be used to make clusters cheaper.
Once the data is in your cluster, you can use Spark to mine it, for example to discover patterns through machine learning.
Cloud Dataflow
When the data arrives in real time or has an unpredictable size, Dataflow is a good choice. It is used to build data pipelines in both batch and streaming modes:
- Resource Management
- On demand (autoscale)
- Intelligent work-scheduling
- Autoscaling (horizontal)
- Unified programming Model
- Open Source
- Monitoring
- Integrated
- Cloud Storage
- Cloud Pub/Sub
- Cloud Datastore
- Cloud Bigtable
- BigQuery
- extensions to Kafka and HDFS
- Reliable & Consistent Processing
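Conceptually, a Dataflow pipeline under the unified programming model chains transforms such as ParDo and GroupByKey; a pure-Python sketch of that model (not the real Apache Beam API) using the classic word count:

```python
# Pure-Python sketch of the pipeline model: a ParDo-style flat-map
# followed by a GroupByKey-style aggregation.
from collections import Counter

def par_do(records, fn):
    """Apply fn to each element and flatten the results (like ParDo)."""
    return [out for record in records for out in fn(record)]

def group_and_count(pairs):
    """Group (key, value) pairs by key and count them (like GroupByKey + combine)."""
    return dict(Counter(key for key, _ in pairs))

lines = ["to be", "or not to be"]
words = par_do(lines, lambda line: [(w, 1) for w in line.split()])
print(group_and_count(words))  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```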
BigQuery
It is a fully managed data warehouse. It provides near-real-time analysis of hundreds of terabytes.
Uses SQL.
Features:
- Data import from Cloud Datastore and Cloud Storage
- Streaming ingestion
- Global availability
- Security and permissions
- Cost controls
- Highly available
- Super fast performance
- Integrations:
- Cloud Dataflow
- Spark
- Hadoop
- Export to google products
- There are some limits on datasets and queries
- Discount for continuous usage
- Petabytes of database size
Cloud Pub/Sub
Many-to-many asynchronous messaging
Applications subscribe to topics
Integration with Cloud Dataflow
Guarantees at-least-once delivery at low latency
- Highly scalable
- Encryption
- Replicated storage (replicated in multiple servers and in multiple zones)
- Message queue by topic
- end to end acknowledgement
- Fan out
Suitable for:
- building blocks in Dataflow, IoT or Marketing analytics
- Push notification for cloud-based applications
- Connect applications (Compute Engine and App Engine)
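The many-to-many, fan-out behaviour can be sketched with an in-memory toy (illustrative only; the real service adds durability, replicated storage, and acknowledgements):

```python
# Toy in-memory model of the Pub/Sub pattern: publishers and subscribers
# are decoupled through named topics, and every subscription on a topic
# gets its own copy of each message (fan-out).
from collections import defaultdict

class MiniPubSub:
    def __init__(self):
        self.subs = defaultdict(list)   # topic -> list of subscriber queues

    def subscribe(self, topic):
        queue = []
        self.subs[topic].append(queue)
        return queue

    def publish(self, topic, message):
        for queue in self.subs[topic]:  # fan-out to every subscription
            queue.append(message)

bus = MiniPubSub()
a = bus.subscribe("sensor-readings")
b = bus.subscribe("sensor-readings")
bus.publish("sensor-readings", {"temp": 21})
print(a, b)  # both subscriptions received their own copy
```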
Cloud Datalab
Lets you use Jupyter notebooks to explore, analyze, and visualize data on Google Cloud Platform
Provides an interactive Python interface ready to use for data exploration
Integrations:
- BigQuery
- Compute Engine
- Cloud Storage
Multilanguage Support:
Pay per use pricing
Interactive data visualization
Git-based control version, linkable with GitHub and Bitbucket
Open Source
IPython support
When to use:
- Documentation
- Visualization
- Analyze BigQuery, Compute Engine, and Cloud Storage data using Python, SQL, and JavaScript
TensorFlow
TensorFlow is an open source software library for machine learning.
Cloud Vision API
Analyzes images with a REST API
- Detect inappropriate content
- Analyze sentiment
- Extract text
- Get keywords
Cloud Speech API
- Recognizes over 80 languages
- Can return text in real time
- Highly accurate
- Access from any device
Cloud Natural Language API
- Reveals the structure and meaning of text
- Syntax analysis
- Identify nouns, verbs, adjectives
- Recognize people and places
- Extract information about items mentioned in text
- Integrate with Cloud Storage
- Available in English, Spanish and Japanese
- Integrated in REST API
- Sentiment analysis
Cloud Translation API
- Translate arbitrary strings between thousands of language pairs
- Language detection
Cloud Video Intelligence API
- Annotate the contents of video
- Detects scene changes
- Flag inappropriate content
Cloud CDN (Content Delivery Network)
- Cache load-balanced frontend content that comes from Compute Engine
- Cache static content that is served from Cloud Storage