Chapter 8

The Google Cloud

Subscribe and stay connected to the cloud

I first discovered the Google Cloud as a product called App Engine back in 2008, around the time it was first launched. I was working on my startup (Socialwok) and was struggling with keeping the servers running. Being a founder and engineer, I had enough to do and certainly did not want to add the task of maintaining servers to my already pretty busy schedule.

“If I have seen further it is by standing on the shoulders of giants.”

I would often lie awake at night worrying about our customers and their expectations of us. I would worry that the people who paid us money to use our software would personally hate me if we failed them in any one of the many ways in which we could. I worried that our server would overload when they made more use of our product and that we would have to drop everything and try to figure out a way to scale it. I worried that making major changes would require us to shut down for a while. I worried that a database crash or a hacker attack would cause us to lose all the data our customers trusted us with. I worried that keeping our customers happy would cause our expenses to balloon. I worried that sooner or later our small team would have to stop work on the core product because our time would eventually be taken up by monitoring, managing, and securing our infrastructure. And finally, I worried about getting popular and having many customers show up at our front door and that we wouldn’t be able to scale up to accommodate them.

At the time, there were a few basic cloud providers around, but all of them required us to still worry about much of the above, except for maybe that we could easily add more servers to handle higher usage. We would still have to deal with all the plumbing that goes into maintaining each running server, so my worries would not be addressed. As I was doing my research on which virtual server provider to pick, I stumbled onto Google App Engine (GAE).

I was intrigued. The promise of GAE was that it was a fully managed platform for web applications. This is what separated it from everything else. And when they said “fully managed,” they really meant it. There were no servers or databases to manage. There were no backups to make. There was nothing but the promise that once I uploaded my code into GAE using their easy user interface, and if my code was built with certain defined rules in mind, it would just work. If a million users showed up the next minute, it would handle them. If hackers came calling, they would be dealing with the might of Google. Our customer data would be safe and secure using the same technology Google uses for their own data. And best of all (especially for a poor startup), we paid only for usage, so we would pay nothing when no one was using our software.

What started as one product, GAE, is now a planetary-scale cloud platform that addresses your every need from running your applications to managing your data. What hasn’t changed is the fundamental promise on which Google built their cloud, the promise of a managed solution designed to be useful to a wide audience. This is a really hard promise to live up to. As Steve Jobs put it, “simplicity is the ultimate sophistication.” It’s really hard to hide complexity from users and still give them unlimited access to all the powerful capabilities of the platform.

What Are “Managed” Services?

Up to this point, I have made numerous references to the word managed. But what does it mean and why is it important? All the complex software that your business depends on (for example, a database or a web server) needs two sets of people to deal with it: 1) those who use it, and 2) those who keep it running.

The people who use that software could be anyone, really. It could be you, me, that guy from marketing, or it could even be a software developer building an application that depends on a working database like I was with my startup. You can think of this group as consumers. They depend on a piece of software to achieve their goals. The typical reader of this book possibly falls into this category. Their needs are usually simple. They want this piece of software to work well, be bug-free, and keep their data safe. And although these needs are simple enough, fulfilling them is a massive task that requires the second group of people to act.

The second group is made up of people who keep the software running. In so doing, they keep the first group happy. This group is usually made up of software engineers who add new functionality and fix bugs as well as system administrators (aka developer operations) who monitor the health of the software, monitor its data, and often get out of bed at night when the software fails at 2 a.m.

This is all pretty familiar to you if you’re in the software business, but wait—this is where the word managed comes in. Imagine not needing this second group. Imagine everything being taken care of for you and completely invisible to you. For example, say you have a terabyte of data (1 terabyte = 2 million photos) that your business needs to mine for insights and reports. The obvious thing to do would be to get your technology team to set up a database and upload the data into it. Now say your business has multiple offices. This means the database will need to be accessible over the Internet, so security is important. Soon the database gets popular within your organization, and with more people using it and more data flowing into it, it naturally slows down to a crawl. Your technology team now needs to add more disks and somehow figure out how to speed up the database. This requires shutting down the system for maintenance, meaning your business will lose access to the database for a few days. And after things are back to normal, the system administrator rushes into your office and informs you that they discovered a major security hole in the database software and that they will need to take the database down again to upgrade to a new version. Another week passes, and even with all the effort, the database is still not quite as fast as you’d like it to be. This, along with all the resources needed for regular maintenance, is turning into a real drag on performance, and the costs are adding up.

What if, at this stressful point in your business’s life, you decided to adopt Google BigQuery (GBQ), a big data analytics tool and a part of the Google Cloud? Your technology team would have to upload the data into it, and that’s pretty much it. It doesn’t matter if you have several terabytes or even petabytes (1 petabyte = 2 billion photos) of data. It will handle any amount you throw at it. Once you upload or stream (in real time) the data into GBQ, it’s available to you to access and mine for insights using just your web browser.

This is the first time in human history that technology as powerful as Google BigQuery has been available to anyone and everyone for as little as the cost of a cup of coffee. The entire software infrastructure behind BigQuery is actively managed, upgraded, and kept safe by armies of Google engineers. Google is also constantly adding new features and improving the speed and scalability of this technology. By leveraging this fully managed cloud infrastructure, your business will not only save itself a lot of resources and headaches; it will also have access to much better technology than it currently has. And BigQuery is just one example of all the managed cloud services that I will cover in this book.

Imagine another scenario, where you are the founder of a new software as a service (SaaS) startup. You have big plans to build a CRM solution for the investment banking industry. You know that reputation is everything and that you cannot afford to have any slip-ups or it could cost you your business. Your customers are quite sophisticated and want to understand what security standards you adhere to. Also, each deal will bring in thousands of new accounts that will actively use your software, so it will need to scale. You have a small team and big ideas, so you certainly don’t want to waste months rebuilding stuff just to help you scale. The resources you might spend on rebuilding would be put to better use creating new functionality and moving your product ahead.

Since you are a software company, the status quo would be for you to build your core software product and also have the team deal with the entire supporting infrastructure such as databases, multiple servers, and all kinds of other stuff to handle the complexity involved in scaling a software product to thousands of users. However, going with this outdated mindset would cost you and your team money, peace of mind, and, most important, valuable time. While dealing with all of that, your competitors would likely bypass you, and your customers and investors would likely fume waiting for you to deliver.

You, however, are a smart founder. You get your team together and recommend they give the Google Cloud a try. The team, having built the initial prototype using the popular Ruby on Rails technology, will need to figure out how compatible it is with the Google Cloud. They spend a few hours reading up on the Google Cloud before deploying the main CRM product into Google App Engine (the flexible environment option) and adding a Google Cloud SQL database for storing the data. A few hours of effort have now made your application faster, more scalable, highly secure, and extremely easy to manage. Your data is automatically backed up and replicated. You can now boast about your entire system’s compliance with several major security certifications. In addition to all of these product advantages, you’ve also just gained a lot of other wins: Your team is happier, leaner, and more productive, and all of your resources are now focused on making your core product better.

To put it all together, a managed cloud infrastructure is a strong foundation that you can build on. Traditionally, the old cloud was simply about using virtual servers so you didn’t have to pay for hardware, an approach known as infrastructure as a service (IaaS), but things have come a long way. With the new managed cloud, also known as platform as a service (PaaS), you don’t have to think in such low-level terms. Instead, you can focus on the bigger picture while all the major and minor details are taken care of for you.

The services provided by the Google Cloud fall into a few categories: computing, data, security, machine learning, and management. I will cover some of these services and show how other startups and companies have very successfully leveraged them to build amazing things.

Too Good to Be True?

At this point, some of you are probably thinking, “This fanboy has told us a lot about how great the cloud is, but what about the bad stuff?” There has to be a downside, right? Maybe some of this sounds too good to be true. You’re absolutely right: there are valid concerns that need to be addressed.

People like to have control. With the managed cloud, you are handing a lot of that control over to the cloud provider. Software is a critical component, if not the product itself, for many businesses, and it can make you uncomfortable to have someone else manage and run it for you. In some cases (as with Netflix and Amazon Prime Video, or Spotify and Google Play Music, for example), your cloud provider is also your major competitor, which can make management teams squirm in their seats.

But let’s be honest. Running production software infrastructure is not easy, and it’s clearly not something you simply want to hand over to your developers to do on the side. It’s a full-time job best left up to the experts. As a cloud provider, Google has invested over a decade and billions of dollars to build their solution and has armies of highly specialized talent improving, fixing, and managing it 24 hours a day, 7 days a week, 365 days a year. In the world of cloud providers, your reputation is everything. Although Google is free to build a product to compete with you, it would not be very smart of them to risk their massive investments in their cloud just to cause you a little grief. The fact is that running your business on the Google Cloud is probably your best bet at competing successfully with larger companies like Google.

When building my startup on the Google Cloud, I experienced a lot of these concerns, especially since I had to sell the idea to the rest of the team. The issue of handing control of our infrastructure over to Google was not a big one for me since we had little to begin with, but this next issue bothered me a lot more. Vendor lock-in is a term referring to a situation where the cost of switching vendors is very high. For example, it’s easy to change your coffee shop, but it’s much harder to change your camera since you’ve already invested in expensive lenses that only work with certain models of a specific camera brand. You can find vendor lock-ins everywhere. Some are expected, such as when they are the result of technical innovations, but others are the result of greedy business practices. I’ve often been asked if building software using managed services such as Google’s database or the Google App Engine constitutes a lock-in. I’m of the opinion that it does not.

Google has done a lot to make their software open source and standards compliant. The idea behind it is this: They made it so you are free to run your software anywhere on any cloud or even your own laptop with the hope that you’ll see the value in running it on their cloud. A good example of this is the Google Container Engine (GKE), the software that manages and orchestrates your entire application in the cloud. The GKE is built using two open-source technologies that you are free to run anywhere: Kubernetes and Docker. I have to warn you, however, that this portability does not extend to everything. Some services, such as Google App Engine and Google Cloud Machine Learning, are highly specialized and require you to build some Google-specific integrations to have them work for you. The good news is that a lot of work is being done under the umbrella of Google Infrastructure for Everyone Else (GIFEE) to adopt open APIs and standards throughout the cloud. My opinion is that software is very malleable, and integrations can be abstracted away, so in the end, this is not a concern. Most software today is riddled with vendor-specific integrations, so this is not any different. Lock-ins can also be contractual, where your cloud vendor might make a case for the need to sign a long-term contract. I would be very wary of these, as they are contrary to the very idea of cloud computing.

Running Your Code

Google Compute Engine (GCE)

The Google Compute Engine (GCE) is a basic building block of the Google Cloud. This is the basic computing unit available on the platform. Traditionally, this is what a server was. On the cloud, it is called a computing unit. It includes the processor, memory, and disk, all of which are very flexible and can be resized to your needs. In the past, you would order server hardware from one of the many companies that made them. You would indicate the RAM, disk, and microprocessor specifications you wanted, and they would ship it to you or your data center in a box. You then had to unpack it and set it up. Compare this manual process to the cloud, where you can instantly conjure up one computing unit (or tens of thousands) with a couple of clicks of the mouse. The RAM, the disk, and the processing power are flexible on the Google Cloud and can be changed at a moment’s notice. For example, if you need more disk space, you can add terabytes of it without ever needing to shut down your application.

The GCE is a managed service that provides you with as much computing power as you need. If your company has an existing application that you need to move to the cloud quickly, the GCE would be the way to go. You would still have to set up and manage your application that runs on it, and if your application needs more than one computing unit to work, then it means more work on your end. Alternatively, you could just use one of the other services I discuss below that are built specifically to make things even easier for your applications.

The GCE will give you a lot of flexibility. It’s a great option when initially moving an existing application to the Google Cloud. Other use cases include highly specialized and heavy computing tasks that won’t fit into the limitations enforced by other fully managed services such as the Google App Engine. Encoding a video file between formats is a good example of such a task. Think of the GCE as your server in the cloud. You are free to do or run whatever you want while the server itself is managed for you.

Google Container Engine (GKE)

The Google Container Engine (GKE) is a fully managed solution to set up large and complex applications in the Google Cloud. Think of it as the next level up from the Compute Engine. If your business is a software application for a social network, you would probably have various services that make up your entire application. You might have one set of services for managing friends and connections, one for handling photo and video uploads, one for handling user registration, etc. This kind of complex system would need several groups of servers to handle each of these functions. Also, you’d need these groups to be able to scale and talk amongst themselves. The GKE can take care of setting up and maintaining such a complex system. All you have to do is define how you want it set up, and the software does the rest. If your needs change (for example, if you see a spike in account registrations), then the GKE will scale automatically to fulfill them.

In my experience dealing with large software projects, I’ve often seen that a lot of complexity lies with getting the various components to work well together. When working with cloud-based software, these various components often reside on separate servers, so getting them to all work with each other can be a major resource sink. This complexity can go up several notches when your application is globally distributed or is expecting a lot of users.

On the Google Cloud, each server is essentially a Google Compute Engine unit, and the GKE is the best way to manage groups of several thousands of these units where each group performs a separate task. Your developers have to define setup options including what groups they expect to have, where in the world the groups should be located, how many copies of the groups should exist, and how many Compute Engine units make up each group. And once these are defined, the GKE works like a watchdog to ensure these parameters are always adhered to. If anything deviates from these parameters, the GKE will take care of it. For example, if a group has fewer servers than it should, the GKE will launch more servers to fill the gap. Also, if you need new code rolled out across your servers, the GKE will take care of it without affecting your users. I can vouch for the fact that the GKE will save you a lot of man-hours and do wonders for the happiness of your developers and the quality of your product.
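For readers who like to see an idea in code, the watchdog behavior described above can be sketched as a simple comparison between what you asked for and what is actually running. This is only an illustration of the concept, not how the GKE or Kubernetes is implemented; the group names, the `Action` type, and the `reconcile` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One corrective step for a server group (hypothetical type)."""
    group: str
    kind: str   # "launch" or "remove"
    count: int


def reconcile(desired, actual):
    """Compare desired server counts with actual ones and list the fixes.

    Both arguments map a group name to a number of servers.
    """
    actions = []
    for group in sorted(set(desired) | set(actual)):
        want = desired.get(group, 0)
        have = actual.get(group, 0)
        if have < want:
            # Too few servers: launch enough to fill the gap.
            actions.append(Action(group, "launch", want - have))
        elif have > want:
            # Too many servers: remove the surplus.
            actions.append(Action(group, "remove", have - want))
    return actions


# Assumed example: what the developers declared vs. what is running now.
desired = {"registration": 3, "uploads": 5, "friends": 2}
actual = {"registration": 1, "uploads": 5, "friends": 4}

for action in reconcile(desired, actual):
    print(action)
```

A real orchestrator runs a check like this continuously, which is why a failed server gets replaced without anyone being paged.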

Google App Engine (GAE)

The Google App Engine (GAE) is the pinnacle of what I consider a fully managed service for your application. There is literally nothing you have to do other than push your code into the GAE. I made extensive use of this for my startup. When dealing with the GAE, you don’t have to worry about servers or any of that. It’s all abstracted away. Once you upload your code, you could get a million users or just one user; it’s all the same to you since it’ll be handled for you. You will simply be charged for your usage.

Let’s dig deeper into the GAE, the service that powered Snapchat’s success. It’s my favorite of all the managed services on the Google Cloud. It’s a truly fire-and-forget platform. Once you deploy your application into it, you can just sit back and watch it fly. Even if a million users suddenly show up within the first few seconds, your application will scale up within milliseconds to handle them. If you get no users, your application will scale down to zero, and you’ll pay nothing.

Sure, GAE is often considered to be a little opinionated, and you have to do some things a certain way, but the rewards are so much greater. For example, if your application has bugs like all applications do, the platform will pipe all the error messages into a beautiful web UI for your engineering team to look at and fix. They can do this by changing the application’s code right in the browser. One click will save the changes and run the new code for everyone to use. In all of my years of software development, there is nothing that comes even close in terms of ease of use. The GAE has support for some of the most popular software development technologies, including Python, Java, Go, and the hugely popular Ruby on Rails framework.

Every application has data that it has to store and look up. On the Google Cloud Platform, the two most popular options are Datastore and Cloud SQL. Datastore is the best option unless you need your application to talk to more traditional relational databases, in which case Cloud SQL would be better. Datastore is tightly coupled with the GAE, and it scales almost infinitely to accommodate any amount of load. You can push any amount of data into it (there are really no set limits) and send it any amount of traffic. It will be consistently fast and will not require any babysitting. Datastore is the ultimate managed database to go along with your application on the GAE. Furthermore, Google Cloud Console is a web and mobile app that gives you full control over your Google Cloud account. A few clicks is all it takes for your developers to access all the logs produced by your application, look at performance data to help fine tune things, and see your application’s usage stats.

One thing that people using the GAE often take for granted is the technology updates that you get for free. With the wide adoption of browsers such as Chrome and Firefox, the web has come a long way from even a few years ago. New technologies include HTTP/2 and QUIC, which promise to speed up web browsing, and newer versions of TLS, the security standard that keeps the traffic between browsers and servers private. A lot of these new developments are the result of research done at Google, and they are being incorporated into popular browsers. However, they’re instantly available to applications running on the GAE. So without you even lifting a finger, your users will have a better experience dealing with your site. When your server and their browsers talk, the latest protocols and technologies are available to everyone.

To help you comprehend what a leap all of this really is, let me take you through what life is like for those who don’t leverage the GAE. I commonly come across websites and even mobile apps with a software architecture that, while built on virtual servers, has many manually set up and hardwired components. It could be a project management app or custom-built CRM software, the kind you’re probably familiar with at work. From experience, I can say that the team building it probably started off by running it on one server, and they maybe used another server to run the database software. This is a fairly common setup. Virtual servers by themselves come with nothing, so your team has to install the operating system, add every software library your app needs, set up the environment correctly, set up firewalls and other security measures (scary), and finally install and run your application. They then have to repeat this whole process once more for the database server. Each of these steps takes me just a few words to describe, but it can take hours or a couple of days for engineers to get it done right. Also, there is a lot more work still to do to set up the networking stuff, such as IP addresses and a load balancer. Now your team has to build a lot of tools to make replicating this process easier every time a new machine is set up, and most times, all of this needs to be duplicated for setting up testing environments as well.

If right about now you’re thinking, “He’s just making mountains out of molehills,” or, “What’s a little more work? Our team is pretty solid,” hold up—I’m not done yet. First, you’re assuming that the guys you hired to develop your software products are also experts at security, networking, and managing servers. Second, we’ve only just covered the initial setup. There is still that other pesky little thing called “maintenance,” which includes keeping all of your servers updated with the latest security patches, being vigilant so you won’t get hacked, dealing with random failures due to issues such as the disk filling up, figuring out how to analyze all of your log data, tracking application performance, ensuring the app is fast enough…. I could go on for at least another page, but I think I’ve made my point.

Actually, this is fun, so I think I’ll go on a little longer, especially since that big customer of yours is now complaining that your application is too slow. Everyone is now scrambling to either scale up the database or co-locate your application in a data center closer to this customer. This co-location stuff is really hard to get right, and so is trying to scale up your database. And while you’re in the thick of things, another four customers start pressing for those new features you promised and are threatening to leave if they suffer any more downtime due to your growing pains. You’re really stressed at this point and are having a hard time prioritizing between building new features to improve your product and putting out all the fires just to keep things running. And now, in the middle of this terrible week, your top sales guy rushes in with good news: He’s got a huge customer on the line with the potential to add a thousand new users. You should be happy, right? Finally, some good news. But is it? If you sign up this large account, you’re probably going to have to hire more people to help scale the infrastructure even further to accommodate this new client and keep them happy. If only you could somehow deal with this stuff so you can focus on the fun things in life, like new features.

Google Firebase

Google Firebase was initially built by a startup that Google acquired in 2014 and integrated into their cloud offerings. It has come far in the few years that it’s been a part of the Google family. This is a managed service designed to serve as a backend for mobile apps, but it can just as well be used for desktop and web apps.

Firebase is the first “no coding needed” (almost) service on the Google Cloud. You can just use the web user interface to define what kind of data you will be storing, and it takes care of everything else. It will sync your data between all your users’ devices, handle your application’s sign-ups and logins, and take care of notifications when data is changed. This is just a short list of the things it can do as a backend for your application.

As for your app itself, Firebase will automatically test it for bugs, report failures, and even allow you to make money by connecting it to Google’s ad network.

I have great hopes for this service’s ability to make app development more accessible to people without coding skills who have a problem to solve.

Managing Your Data

Google Cloud SQL

Everyone needs one of these: a good old relational database. Google Cloud SQL is a fully managed relational database. This means there’s no more worrying about backups, security, etc. It’s easy to set up through a webpage, and it’s fast and secure. There’s also no need to change your code. Cloud SQL is compatible with the popular MySQL database, so most software will not need any changes. If you need to add more space or power to the database, it’s easy to do through the same web user interface.

These traditional databases (relational databases) work great for most applications, but keep in mind that you need to decide how your data should be structured and what part of the data has what kind of relationship with other parts, etc. And you often need to change these structures as applications mature, and that takes work. When choosing this kind of database, you make a trade-off between speed and flexibility. The ability to create complex queries to find just the right data provides flexibility at the cost of some speed and slightly higher data management requirements.

Google Cloud Datastore

The limitations (perceived or otherwise) of the above-mentioned relational database have led to a whole new type of database called NoSQL, or document databases. With this type of database, everything is stored as a key-to-value mapping. For example, key 12345 could map to your user profile. This simplifies things and makes room for much higher read/write performance.
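To make the key-to-value idea concrete, here is a minimal sketch of such a store built on an ordinary in-memory dictionary. It is a toy for illustration only and is not the Datastore API; the `KeyValueStore` class and the sample profile fields are hypothetical.

```python
import json


class KeyValueStore:
    """A toy document store: each key maps to one self-contained document."""

    def __init__(self):
        self._data = {}

    def put(self, key, document):
        # Store the whole document under its key; no table schema is needed.
        self._data[key] = json.dumps(document)

    def get(self, key):
        # A lookup is a single step by key, with no joins across tables.
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)


store = KeyValueStore()
store.put("12345", {"name": "Ada", "city": "London"})  # assumed sample profile
print(store.get("12345"))
```

Because every read and write touches exactly one key, work like this is easy to spread across many machines, which is where the performance advantage comes from.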

The Google Cloud Datastore is often thought of as the go-to datastore for web scale performance without any of the management headaches of the relational database model. If you’re building a game that will potentially get a lot of usage or if you expect a major marketing event to bring lots of global visitors, then this is probably the best choice for your application.

The Google Cloud Datastore does well at upholding Google’s reputation for having a massively scalable, high-performing database while still providing a lot of the flexibility of the Cloud SQL relational database. This is a truly fully managed database solution on the Google Cloud, and it nicely complements your application running on the Google App Engine.

Google Cloud Bigtable

Sometimes the performance needs of your application are so high that you cannot achieve them without considerable investment. This is often the case with financial applications that deal with market data or stock transactions. In a recently published case study, a company that was tasked with building an audit trail by the US Securities and Exchange Commission (SEC) needed to record every order and change conducted in the US equities and options markets. They expected to have to deal with over 6 billion messages per hour, with peaks of up to 10 billion.

This is truly a staggering amount of data, but this was what they would have to deal with if they wanted to innovate in the realm of financial technology. For a solution, they turned to a new data service called Google Cloud Bigtable, which was designed to be scaled up to match your needs. It can easily handle over ten thousand read/write operations per second.

Bigtable, which is relatively new to the Google Cloud Platform, has been used internally at Google for a decade now. It is the underlying technology for Google Search, Gmail, and Google Maps, to name a few of the very popular applications that use it.

Google BigQuery

The notion that all data is valuable is becoming increasingly popular; however, from experience, I’ve seen that most companies do a terrible job at identifying and storing data that could be of immense value to them. And the companies that do capture data usually do a terrible job at making use of it. A lot of this is due to the inherent complexity and costs involved with identifying, storing, and analyzing data. It’s usually not obvious what data is valuable, and so it’s advisable to store it all, but traditionally, the investments required to do that would scare away most organizations.

Data is usually scattered across log files, stored in various databases, collected by web services such as Google Analytics, generated by users using your product, scraped from the web, and the list goes on. The first thing you need to do is consolidate all of it. How can you fix this? The one obvious solution that comes to mind is to keep it simple and cheap, and to store all the data and mine it as you please. This is easy enough to talk about but very hard to implement—at least it was until Google BigQuery came around. And when talking to startups about the value of their data, I cannot help but sing the praises of this one-stop-shop, fully managed, data-munching beast of a service.

Google BigQuery is a fully managed data solution (data warehouse) that can store any amount of data, and you can derive insights from it in seconds. When using BigQuery, there is nothing to manage. It can ingest your data from various sources, and this includes the ability to stream it in real time. When the time comes to generate reports or mine the data for insights, you can generate tables and graphs of your results through a simple web interface.

The part I find most interesting and powerful is that BigQuery can scan massive amounts of data in a few seconds, which makes working with it a breeze. And the best part is that anyone in your organization can use it as long as they know some simple SQL. A simple query like “SELECT user, SUM(clicks) AS total FROM table GROUP BY user ORDER BY total DESC” can tell you in seconds which of your users are the most active by going over massive amounts of click data.
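To make that kind of query concrete, here is a minimal sketch that runs the same aggregation on a handful of made-up rows, sorted so the most active user comes first. It uses Python's built-in SQLite database purely as a local stand-in, not BigQuery itself. The table name `clicks_log`, its columns, and the sample rows are assumptions for illustration.

```python
import sqlite3

def most_active_users(rows):
    """Return (user, total_clicks) pairs, most active user first."""
    # In-memory SQLite database: a local stand-in, not BigQuery.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks_log (user TEXT, clicks INTEGER)")
    conn.executemany("INSERT INTO clicks_log VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT user, SUM(clicks) AS total_clicks "
        "FROM clicks_log GROUP BY user "
        "ORDER BY total_clicks DESC, user ASC"
    ).fetchall()
    conn.close()
    return result

# Hypothetical click events: (user, clicks recorded in one session).
sample_rows = [("ana", 3), ("ben", 10), ("ana", 4), ("cy", 1), ("ben", 2)]
for user, total in most_active_users(sample_rows):
    print(user, total)
```

The SQL is the part that carries over. What a managed warehouse adds is the ability to run it over far more data than fits on one machine.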

Another feature of BigQuery that I love is the public datasets. Google has made a lot of valuable public datasets available, including all the taxi rides in New York City since 2009 and all the open-source code available on GitHub. This makes very interesting data-driven innovation possible. For example, people use BigQuery to search the GitHub source code for common security holes so they can be fixed, making open-source projects more secure. And BigQuery makes it possible to use the NYC taxi ride data to create an app that predicts the best place to stand to get a taxi in downtown NYC during rush hour.

Google Cloud Dataflow

Whether you’re dealing with the Internet of Things (IoT) or just your popular game app, you could find yourself in a position where you’re dealing with millions or billions of messages or events. Or maybe you’re trying to use external data from Twitter, such as the Twitter Firehose API, which provides you with millions of live tweets in real time. Maybe your marketing company needs to mine this Twitter data around a sporting event to see what’s trending so you can better target your ads. In all of these examples, large amounts of data are flowing in live, and you need to process it.

Unlike the queries you run on BigQuery, here you are using the data to do things that are much more complex. Let’s take the example of Peter and his drone video startup from the “Drone Dreams in Palo Alto” story a couple of chapters ago. In that story, lots of drone video was streaming in from people all over the world, and Peter’s team needed to process the video as it came in to make it available on their app. Processing video can be a heavy task, and this is where Google Cloud Dataflow shines. It parallelizes the work so that many computers tackle the problem simultaneously. It handles all the complexity of keeping things efficient and running smoothly if any of the tasks need to be rerun or subdivided further, ensuring that everything in the work pipeline flows for as long as needed.
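The idea of splitting work across many workers can be sketched on a single machine. The following is not the Dataflow API. It is a minimal illustration using Python's standard `concurrent.futures` module, with threads standing in for separate computers. The function `process_clip` and the clip names are hypothetical placeholders for real video processing.

```python
from concurrent.futures import ThreadPoolExecutor

def process_clip(clip_name):
    """Stand-in for heavy video processing: here it only tags the name."""
    return clip_name.upper() + ":processed"

def process_all(clips, workers=4):
    """Process clips in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_clip, clips))

# Hypothetical incoming uploads.
incoming = ["beach.mp4", "canyon.mp4", "harbor.mp4"]
print(process_all(incoming))
```

A managed service takes care of what this sketch leaves out: rerunning failed tasks, subdividing slow ones, and deciding how many machines to use.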

All of the preceding examples can be considered very complex problems that require some serious brainpower and server resources just to get started on. Problems of this nature, however, are often categorized under “data processing,” where you have to take one type of data and transform it into another type. This is also not the kind of data that would typically fit on a laptop. We’re talking “big data” here. I know you’re thinking, “Couldn’t I just use BigQuery?” Well, yes, you could, but it’s not exactly the right tool for transforming data. It’s really best at deriving answers from your data. For transformation work like this, Dataflow is the better choice. And like everything else I’ve covered, it’s fully managed. You don’t have to deal with things such as setting it up or trying to guess how many servers you would need.

Security and Oversight

Google Stackdriver

The one thing all of this managed magic cannot do is fix bugs in your software for you. All software, including the software from that shiny new startup, has bugs and issues in its code. These can be anything from a simple crash caused by a user typing a phone number into an email field to something more complex, such as slow connections to a database slowing down your entire application. The one thing that all software needs is monitoring and the ability to alert the right people when things go wrong.

Google Stackdriver is a service that monitors the health of your entire application and all of its constituent parts. Every log message, user request, disk, service, and other resource is monitored. You can set up alerts to suit your needs or just use the web interface to look at live graphs and reports of how well things are going. If your application is built on Google App Engine, then you need to do almost nothing to get this high level of comprehensive monitoring across your entire application. It just works. I’ve seen months and months of valuable man-hours dedicated to manually setting up monitoring and reporting, and even that was not as comprehensive. It was also often quite fragile.

Stackdriver is clearly a tool that will have a positive impact on your bottom line, and it will keep your developers happy. As an engineer, I can attest to the value of having the ability to quickly look up anything I need to when solving issues that affect the user experience.

Compliance and Certifications

If you’re dealing with sectors such as banking, health, and government, then the words compliance, audit, and certification are a part of your daily life. To be honest, when building my startup on the Google Cloud Platform, I didn’t rush to look into these issues. We were pretty comfortable with the level of security of other Google products such as Gmail, so we went ahead and trusted in the Google platform. This line of thinking probably only works for some small startups. The rest need to deal with regulations.

Regulators require software platforms to be compliant and require audits and certifications as proof. Some certifications are local to the US, but there are others around the world, including those from the European Union (EU). Google is a global company that takes regulation and compliance very seriously. They run independent annual audits of their data centers, infrastructure, and operations. These cover a number of ISO security certifications for everything from cloud security to operations, people, processes, and their data centers. They also include PCI standards for dealing with credit card, banking, and financial information and HIPAA compliance for dealing with private medical and health data. And for customers who have to deal with the data protection directive that governs the transmission of personal data within the EU, Google will ensure they are compliant and ensure their data is contained within the required geographic regions.

Google Cloud Machine Learning

Machine learning and artificial intelligence (AI) are very exciting and fast-moving technologies. They’re also incredibly hard for most companies to adopt and find the right talent for. Almost any task can be sufficiently automated by AI, saving you time and money.

Let’s say your business wants to analyze calls made by your sales team to ensure they’re using the optimal approach and the right keywords. This is a difficult problem. Traditional computer programs cannot handle it, so you would probably end up hiring people to manually listen to and grade all the audio files. But what if you have a multilingual sales team selling to Europe and Asia? Google Cloud Machine Learning is a collection of managed services that can deal with such a complex problem. For example, you could use the Google Cloud Speech API to convert the audio recordings to text. It supports more than eighty languages, and the number is growing. Once you have text transcripts of all the calls, you can easily search for words and phrases and build a solution around the results.
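Once transcripts exist as plain text, the search step is the easy part. Here is a minimal sketch, assuming the transcripts have already been produced (the speech-to-text call itself is not shown). The call IDs, transcript text, and keyword list are made-up examples.

```python
import re
from collections import Counter

def keyword_counts(transcript, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    counts = Counter()
    for keyword in keywords:
        # \b keeps "trial" from matching inside a longer word.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        counts[keyword] = len(re.findall(pattern, transcript, re.IGNORECASE))
    return dict(counts)

# Hypothetical transcripts keyed by call ID.
transcripts = {
    "call_001": "Thanks for your time. Our free trial includes a discount.",
    "call_002": "The trial is free, and the free trial lasts a month.",
}
for call_id, text in transcripts.items():
    print(call_id, keyword_counts(text, ["free trial", "discount"]))
```

From counts like these you could grade each call or flag the ones that never mention a key phrase.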

Let’s say you have a website where users upload holiday photos, and you want to sort and store them based on where the photo was taken. Let’s also say that most of the photos are from traditional handheld cameras, so they don’t have any location information. This is another very complex problem bordering on impossible, unless, of course, you have access to the Google Cloud Vision API, which can tell you where each photo was taken based on what’s in it. For example, pictures of people smiling in front of the Taj Mahal would be tagged correctly in the bucket indicated for the user’s trip to India.

The complexity that Google hides from you so you can focus on the problem at hand is mind-boggling. If you’re more technically capable, a recently introduced feature allows you to build your own machine learning models with the Google Cloud doing all the heavy lifting for you. Google Cloud Machine Learning is backed by specialized hardware that makes deep learning (AI) training run a lot faster and cheaper than it would on your own resources.

All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement. The TC50 photograph above (center) is by Jen Consalvo.

© 2016 Culture Capital Corp.