Vision

Cloud Vision is a set of tools designed for image analysis.

The Cloud Vision APIs use machine learning models that have already been trained, so you can skip straight to the fun stuff. It’s also possible to train your own models (you can learn more about that here), but this guide is going to stick with the pre-trained models.

Enable Cloud Vision API

Before we can use the Cloud Vision API, we have to enable it. Go here:

https://console.cloud.google.com/flows/enableapi?apiid=vision.googleapis.com

Make sure your project is selected, and click the Enable button.

Credentials

The Cloud Vision API requires your Cloud project’s credentials to work. When you deploy to App Engine this will work ~magically~ automatically, but when running or deploying locally you have to set your credentials manually. Follow the steps here to set up your local credentials.

Important: Before proceeding, make sure you have your GOOGLE_APPLICATION_CREDENTIALS environment variable set. Nothing will work without this.
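
If you want to catch a missing credentials variable before any Vision code runs, a small check like the one below can help. This is only a sketch: the CredentialsCheck class and its findProblem() helper are made-up names for illustration, not part of any Google library, and the check only confirms that the file exists, not that the key inside it is valid.

```java
import java.io.File;

/** Sketch: fail fast if local credentials are not configured. */
class CredentialsCheck {

  /**
   * Returns null if the path points to an existing file, or a
   * human-readable description of the problem otherwise.
   * (Hypothetical helper; it does not validate the key itself.)
   */
  static String findProblem(String credentialsPath) {
    if (credentialsPath == null || credentialsPath.trim().isEmpty()) {
      return "GOOGLE_APPLICATION_CREDENTIALS is not set.";
    }
    if (!new File(credentialsPath).isFile()) {
      return "GOOGLE_APPLICATION_CREDENTIALS points to a missing file: " + credentialsPath;
    }
    return null;
  }

  public static void main(String[] args) {
    String problem = findProblem(System.getenv("GOOGLE_APPLICATION_CREDENTIALS"));
    if (problem != null) {
      System.err.println(problem);
      System.exit(1);
    }
    System.out.println("Credentials file found.");
  }
}
```

Running this before the examples below gives a clearer message than the exception the client library throws when it can't find credentials.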

Maven Dependency

As mentioned above, the Cloud Vision API allows us to write code that analyzes images. It’s available as a web service, or as a library that can be called from many languages. We’re going to use it as a Java library.

To add the library to our classpath, we can use this Maven dependency:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-vision</artifactId>
  <version>1.70.0</version>
</dependency>

Hello World

(You can view the full source of this example here.)

import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.EntityAnnotation;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Feature.Type;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.protobuf.ByteString;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class VisionHelloWorld {

  public static void main(String[] args) throws IOException {

    String filePath = "C:\\Users\\kevin\\Desktop\\Stanley.jpg";

    ByteString imageBytes = ByteString.readFrom(new FileInputStream(filePath));
    Image image = Image.newBuilder().setContent(imageBytes).build();

    Feature feature = Feature.newBuilder().setType(Type.LABEL_DETECTION).build();
    AnnotateImageRequest request =
        AnnotateImageRequest.newBuilder().addFeatures(feature).setImage(image).build();
    List<AnnotateImageRequest> requests = new ArrayList<>();
    requests.add(request);

    // try-with-resources closes the client even if the request throws.
    try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
      BatchAnnotateImagesResponse batchResponse = client.batchAnnotateImages(requests);
      List<AnnotateImageResponse> imageResponses = batchResponse.getResponsesList();
      AnnotateImageResponse imageResponse = imageResponses.get(0);

      // If the API reported an error, there are no labels to print.
      if (imageResponse.hasError()) {
        System.out.println("Error: " + imageResponse.getError().getMessage());
        return;
      }

      for (EntityAnnotation annotation : imageResponse.getLabelAnnotationsList()) {
        System.out.println(annotation.getDescription() + ": " + annotation.getScore());
      }
    }
  }
}

To run this example, first make sure your GOOGLE_APPLICATION_CREDENTIALS environment variable is set and that you’ve enabled the Vision API. Then change the filePath variable to point to an image on your computer and execute this command:

mvn clean compile exec:java

You should see something like this printed to the console:

Cat: 0.99598557
Mammal: 0.9890478
Vertebrate: 0.9851104
Whiskers: 0.9777251
Small to medium-sized cats: 0.97744334
Felidae: 0.96784574
Carnivore: 0.9342105

This is the result of requesting labels for a photo of a cat (the image is not reproduced here).
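
Each label comes with a score between 0 and 1, and higher scores mean the API is more confident in the label. If you only care about the most confident labels, one option is to filter by a minimum score. The sketch below assumes a made-up Label class standing in for the description and score of an EntityAnnotation, so it runs without the Vision library; the 0.97 threshold is an arbitrary value chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: keep only the labels whose score meets a threshold. */
class LabelFilter {

  /** Hypothetical stand-in for an EntityAnnotation's description and score. */
  static class Label {
    final String description;
    final float score;

    Label(String description, float score) {
      this.description = description;
      this.score = score;
    }
  }

  /** Returns the labels with a score of at least minScore, in their original order. */
  static List<Label> atLeast(List<Label> labels, float minScore) {
    List<Label> kept = new ArrayList<>();
    for (Label label : labels) {
      if (label.score >= minScore) {
        kept.add(label);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // The scores printed by the Hello World example above.
    List<Label> labels = Arrays.asList(
        new Label("Cat", 0.99598557f),
        new Label("Mammal", 0.9890478f),
        new Label("Vertebrate", 0.9851104f),
        new Label("Whiskers", 0.9777251f),
        new Label("Small to medium-sized cats", 0.97744334f),
        new Label("Felidae", 0.96784574f),
        new Label("Carnivore", 0.9342105f));

    // 0.97 is an arbitrary cutoff for this sketch.
    for (Label label : atLeast(labels, 0.97f)) {
      System.out.println(label.description + ": " + label.score);
    }
  }
}
```

With this cutoff, Felidae and Carnivore are dropped and the other five labels are kept. In real code you could apply the same check to annotation.getScore() inside the loop that prints the labels.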

Web App

The above example performs image analysis in a standard Java application. This is useful if you want to build a desktop application or analyze images on your computer. But you can also use the Vision API in server code, which comes in handy if you want to build a web app.

(You can download the full code for this example here.)

Let’s start with the HTML:

index.jsp

<%@ page import="com.google.appengine.api.blobstore.BlobstoreService" %>
<%@ page import="com.google.appengine.api.blobstore.BlobstoreServiceFactory" %>
<% BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
   String uploadUrl = blobstoreService.createUploadUrl("/image-analysis"); %>

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>Image Upload Analysis</title>
  </head>
  <body>
    <h1>Image Upload Analysis</h1>

    <form method="POST" enctype="multipart/form-data" action="<%= uploadUrl %>">
      <p>Upload an image:</p>
      <input type="file" name="image">
      <br/><br/>
      <button>Submit</button>
    </form>
  </body>
</html>

This JSP file gets the Blobstore upload URL and uses it as the form’s action attribute. This allows a user to upload an image, which gets stored in Blobstore, and then the request is forwarded to our /image-analysis URL. This URL maps to a servlet:

ImageAnalysisServlet.java

package io.happycoding.servlets;

import com.google.appengine.api.blobstore.BlobInfo;
import com.google.appengine.api.blobstore.BlobInfoFactory;
import com.google.appengine.api.blobstore.BlobKey;
import com.google.appengine.api.blobstore.BlobstoreService;
import com.google.appengine.api.blobstore.BlobstoreServiceFactory;
import com.google.appengine.api.images.ImagesService;
import com.google.appengine.api.images.ImagesServiceFactory;
import com.google.appengine.api.images.ServingUrlOptions;
import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.EntityAnnotation;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.protobuf.ByteString;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * When the user submits the form, Blobstore processes the file upload
 * and then forwards the request to this servlet. This servlet can then
 * analyze the image using the Vision API.
 */
@WebServlet("/image-analysis")
public class ImageAnalysisServlet extends HttpServlet {

  @Override
  public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException {

    PrintWriter out = response.getWriter();

    // Get the BlobKey that points to the image uploaded by the user.
    BlobKey blobKey = getBlobKey(request, "image");

    // User didn't upload a file, so render an error message.
    if(blobKey == null) {
      out.println("Please upload an image file.");
      return;
    }

    // Get the URL of the image that the user uploaded.
    String imageUrl = getUploadedFileUrl(blobKey);

    // Get the labels of the image that the user uploaded.
    byte[] blobBytes = getBlobBytes(blobKey);
    List<EntityAnnotation> imageLabels = getImageLabels(blobBytes);

    // Output some HTML that shows the data the user entered.
    // A real codebase would probably store these in Datastore.
    response.setContentType("text/html");
    out.println("<p>Here's the image you uploaded:</p>");
    out.println("<a href=\"" + imageUrl + "\">");
    out.println("<img src=\"" + imageUrl + "\" />");
    out.println("</a>");
    out.println("<p>Here are the labels we extracted:</p>");
    out.println("<ul>");
    for(EntityAnnotation label : imageLabels){
      out.println("<li>" + label.getDescription() + " " + label.getScore() + "</li>");
    }
    out.println("</ul>");
  }

  /**
   * Returns the BlobKey that points to the file uploaded by the user, or null if the user didn't upload a file.
   */
  private BlobKey getBlobKey(HttpServletRequest request, String formInputElementName){
    BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
    Map<String, List<BlobKey>> blobs = blobstoreService.getUploads(request);
    // Look up the uploads by the input name passed in, not a hard-coded name.
    List<BlobKey> blobKeys = blobs.get(formInputElementName);

    // User submitted form without selecting a file, so we can't get a BlobKey. (devserver)
    if(blobKeys == null || blobKeys.isEmpty()) {
      return null;
    }

    // Our form only contains a single file input, so get the first index.
    BlobKey blobKey = blobKeys.get(0);

    // User submitted form without selecting a file, so the BlobKey is empty. (live server)
    BlobInfo blobInfo = new BlobInfoFactory().loadBlobInfo(blobKey);
    if (blobInfo.getSize() == 0) {
      blobstoreService.delete(blobKey);
      return null;
    }

    return blobKey;
  }

  /**
   * Blobstore stores files as binary data. This function retrieves the
   * binary data stored at the BlobKey parameter.
   */
  private byte[] getBlobBytes(BlobKey blobKey) throws IOException {
    BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
    ByteArrayOutputStream outputBytes = new ByteArrayOutputStream();

    int fetchSize = BlobstoreService.MAX_BLOB_FETCH_SIZE;
    long currentByteIndex = 0;
    boolean continueReading = true;
    while (continueReading) {
      // end index is inclusive, so we have to subtract 1 to get fetchSize bytes
      byte[] b = blobstoreService.fetchData(blobKey, currentByteIndex, currentByteIndex + fetchSize - 1);
      outputBytes.write(b);

      // if we read fewer bytes than we requested, then we reached the end
      if (b.length < fetchSize) {
        continueReading = false;
      }

      currentByteIndex += fetchSize;
    }

    return outputBytes.toByteArray();
  }

  /**
   * Uses the Google Cloud Vision API to generate a list of labels that apply to the image
   * represented by the binary data stored in imgBytes.
   */
  private List<EntityAnnotation> getImageLabels(byte[] imgBytes) throws IOException {
    ByteString byteString = ByteString.copyFrom(imgBytes);
    Image image = Image.newBuilder().setContent(byteString).build();

    Feature feature = Feature.newBuilder().setType(Feature.Type.LABEL_DETECTION).build();
    AnnotateImageRequest request =
        AnnotateImageRequest.newBuilder().addFeatures(feature).setImage(image).build();
    List<AnnotateImageRequest> requests = new ArrayList<>();
    requests.add(request);

    // try-with-resources closes the client even if the request throws.
    try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
      BatchAnnotateImagesResponse batchResponse = client.batchAnnotateImages(requests);
      List<AnnotateImageResponse> imageResponses = batchResponse.getResponsesList();
      AnnotateImageResponse imageResponse = imageResponses.get(0);

      if (imageResponse.hasError()) {
        System.err.println("Error getting image labels: " + imageResponse.getError().getMessage());
        // Return an empty list so the caller can loop over it without a null check.
        return new ArrayList<>();
      }

      return imageResponse.getLabelAnnotationsList();
    }
  }

  /**
   * Returns a URL that points to the uploaded file.
   */
  private String getUploadedFileUrl(BlobKey blobKey){
    ImagesService imagesService = ImagesServiceFactory.getImagesService();
    ServingUrlOptions options = ServingUrlOptions.Builder.withBlobKey(blobKey);
    return imagesService.getServingUrl(options);
  }
}

This servlet contains a doPost() function along with some helper functions that use the Cloud Vision API to generate a list of labels for the uploaded image.
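
The chunked read in getBlobBytes() is the part that's easiest to get wrong, because the end index is inclusive. To see the loop on its own, here's a sketch that runs the same logic against an in-memory byte array instead of Blobstore. The fetch() function is a made-up stand-in for blobstoreService.fetchData(); in particular, returning an empty array when the start index is past the end of the data is an assumption of this sketch, not documented Blobstore behavior.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Sketch: read data in fixed-size chunks using inclusive end indexes. */
class ChunkedReadSketch {

  /**
   * Hypothetical stand-in for fetchData(): returns the bytes from start to end,
   * both inclusive, clipped to the end of the data.
   */
  static byte[] fetch(byte[] data, long start, long end) {
    if (start >= data.length) {
      return new byte[0]; // assumption: reading past the end yields no bytes
    }
    int from = (int) start;
    int toExclusive = (int) Math.min(end + 1, data.length);
    return Arrays.copyOfRange(data, from, toExclusive);
  }

  /** Same loop shape as getBlobBytes(), with the fetch size passed in. */
  static byte[] readAll(byte[] data, int fetchSize) {
    ByteArrayOutputStream outputBytes = new ByteArrayOutputStream();
    long currentByteIndex = 0;
    while (true) {
      // end index is inclusive, so subtract 1 to request fetchSize bytes
      byte[] chunk = fetch(data, currentByteIndex, currentByteIndex + fetchSize - 1);
      outputBytes.write(chunk, 0, chunk.length);

      // a short chunk means we reached the end of the data
      if (chunk.length < fetchSize) {
        break;
      }
      currentByteIndex += fetchSize;
    }
    return outputBytes.toByteArray();
  }

  public static void main(String[] args) {
    byte[] data = new byte[10];
    for (int i = 0; i < data.length; i++) {
      data[i] = (byte) i;
    }

    // 10 bytes in chunks of 4: the loop reads 4, 4, and then 2 bytes.
    byte[] copy = readAll(data, 4);
    System.out.println(Arrays.equals(data, copy));
  }
}
```

Note what happens when the data length is an exact multiple of the fetch size: the loop makes one extra request that starts just past the end. This sketch handles that by returning an empty chunk, but it's worth testing how the real fetchData() behaves in that case.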

To run this example, first make sure your GOOGLE_APPLICATION_CREDENTIALS environment variable is set and that you’ve enabled the Vision API, and then execute this command:

mvn appengine:devserver

Then navigate to localhost:8080. You should see something like this:

image upload form

image labels webpage

Learn More