The sheer number of Google Cloud Platform services can be overwhelming.  I found that breaking them down into their component parts helped me to get a clearer understanding.

With that in mind, today I decided to write about one of the fundamental GCP data storage options: Cloud Storage.

Google Cloud Storage

When deciding which Google Cloud service to use to store data, you have to think about how you are going to use that data. Google offers different services depending on how your data is structured, how you will be accessing it, and how often you will be accessing it.

In my notes I defined Cloud Storage thusly:

Binary large-object storage accessible by URL. High durability, high availability.  It’s not a file system – it’s “buckets”.  Buckets contain immutable objects.  Appropriate for web-content, downloads, etc.

If you are anything like me the part of this definition that catches your attention is: “buckets”.  WTF does buckets mean?   I found that it’s best to not over-think it 🙂

It really just means that in Google Cloud Storage you are not dealing with a file system as you are on your Linux or Windows computer. The closest comparable thing in a file system might be a folder or directory. In Cloud Storage, buckets are the most fundamental containers that hold data, and everything in Cloud Storage must be stored in a bucket.

One difference between directories and buckets is that while you can use them both to organize your content you cannot “nest” buckets inside one another.
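To make the "accessible by URL" and "no nesting" points concrete, here is a minimal sketch in plain Python. It assumes the common https://storage.googleapis.com/BUCKET/OBJECT URL form, which only works without credentials when the object is publicly readable. The bucket name, the object names, and the two helper functions are made up for illustration; they are not part of any Google library.

```python
from urllib.parse import quote


def object_url(bucket: str, object_name: str) -> str:
    """Build the HTTPS URL for an object (hypothetical helper)."""
    # Slashes are left alone: they are simply characters in the object's
    # name, not real directories. Other unsafe characters are escaped.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name, safe='/')}"


def list_by_prefix(object_names, prefix):
    """Imitate 'opening a folder' by filtering a flat list of names."""
    return [name for name in object_names if name.startswith(prefix)]


# A bucket is a flat collection of named objects (example names only).
names = ["images/logo.png", "images/banner 1.png", "downloads/app.zip"]

print(object_url("my-example-bucket", names[1]))
print(list_by_prefix(names, "images/"))
```

So anything that looks like a folder inside a bucket is really just a shared prefix in the object names.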

The Binary large-object part of the definition from my notes means that each item in a bucket is an object – just a chunk of data to GCP.  Each object has associated metadata (name-value pairs) that describes that object’s qualities.

It’s also important to understand that objects are immutable, meaning that you cannot change an object in place.  You can upload a new version to overwrite an object, and you can even use versioning to keep a series of versions of the same object.  What you can’t do is go into GCP and edit an object itself.
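Here is a toy in-memory model of that behaviour, just to show the idea. It is not the real Cloud Storage API: ToyBucket, StoredObject, and the simple 1, 2, 3 generation counter are all assumptions made for this sketch (the real service assigns its own generation numbers).

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True)
class StoredObject:
    """One immutable object: raw bytes plus name-value metadata."""
    data: bytes
    metadata: tuple = ()   # tuple of (name, value) pairs
    generation: int = 1    # simplified version counter (assumption)


@dataclass
class ToyBucket:
    """Hypothetical stand-in for a bucket; everything lives in a dict."""
    versioning: bool = False
    _objects: dict = field(default_factory=dict)  # name -> list of versions

    def upload(self, name, data, metadata=None):
        history = self._objects.setdefault(name, [])
        generation = history[-1].generation + 1 if history else 1
        obj = StoredObject(data, tuple(sorted((metadata or {}).items())), generation)
        if self.versioning:
            history.append(obj)   # keep the older versions around
        else:
            history[:] = [obj]    # the new upload replaces the old object
        return obj

    def latest(self, name):
        return self._objects[name][-1]

    def versions(self, name):
        return list(self._objects[name])


bucket = ToyBucket(versioning=True)
bucket.upload("notes.txt", b"first draft", {"content-type": "text/plain"})
bucket.upload("notes.txt", b"second draft", {"content-type": "text/plain"})
print(len(bucket.versions("notes.txt")), bucket.latest("notes.txt").data)

try:
    bucket.latest("notes.txt").data = b"edited in place"
except FrozenInstanceError:
    print("objects cannot be edited, only replaced by a new upload")
```

The only way to "change" notes.txt in this model is to upload it again, which mirrors the overwrite-or-version behaviour described above.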

You can control who has access to objects using GCP’s IAM policies or with Access Control Lists (ACLs).
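As a rough illustration of what an IAM policy looks like, here it is as a plain Python dict. The "bindings" layout and the two role names follow the format Google documents for IAM policies, but the member email is made up, and has_role is a hypothetical helper of mine that only does a literal lookup. It does not evaluate inherited policies, conditions, or the special meaning of allUsers.

```python
# A policy is a list of bindings; each binding ties one role to some members.
policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
        {"role": "roles/storage.objectAdmin", "members": ["user:alice@example.com"]},
    ]
}


def has_role(policy, member, role):
    """Literal check: is this member listed under this role? (sketch only)"""
    return any(
        binding["role"] == role and member in binding["members"]
        for binding in policy["bindings"]
    )


print(has_role(policy, "user:alice@example.com", "roles/storage.objectAdmin"))
print(has_role(policy, "user:alice@example.com", "roles/storage.objectViewer"))
```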

 

Cloud Storage Classes

Finally, you’ll need to decide which Cloud Storage class is the best fit for your data. There are four classes in total, but within those four I think there are really two classes, each with a subclass.  Again, from my notes:
         

          Multi-regional

              High performance

              For most frequently accessed data

              Appropriate for content storage and delivery

              Highest storage price – low transfer price

 

          Regional

              High performance

              Data  accessed frequently within a region

              Appropriate for transcoding / regional analytics

 

          Nearline

              For data accessed less than once a month

              Appropriate for backup / longtail content

 

          Coldline

              Accessed less than once a year

              Archive / disaster recovery

              Lowest storage price – highest transfer price

 

The idea is that if you or your users are going to access your data on the reg then you choose Multi-Regional or Regional.  If you are mostly just keeping your data around (backups, archives) or doing some type of infrequent scheduled batch processing, you choose Nearline or Coldline.

The first two have a higher storage price and a lower transfer price, and the latter two are the opposite.
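The rule of thumb from my notes can be written down as a tiny function. This is only a sketch of the thresholds listed above (less than once a year, less than once a month); suggest_storage_class and its parameters are names I made up, and a real decision should also look at current pricing.

```python
def suggest_storage_class(accesses_per_year: float, multi_region: bool = False) -> str:
    """Map an expected access frequency to one of the four classes (sketch)."""
    if accesses_per_year < 1:
        return "Coldline"   # archive / disaster recovery
    if accesses_per_year < 12:
        return "Nearline"   # less than once a month: backup / longtail content
    # Frequently accessed data: pick by where the readers are.
    return "Multi-regional" if multi_region else "Regional"


print(suggest_storage_class(0.5))
print(suggest_storage_class(6))
print(suggest_storage_class(365))
print(suggest_storage_class(365, multi_region=True))
```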

Hopefully this gives you at least a basic understanding of Google Cloud Platform’s Cloud Storage.  Cloud Storage is only one of the myriad data storage options GCP provides.

Next week I’ll write a post outlining the GCP Database offerings.  Until then here is a great page to continue reading about Google Cloud Storage.

