<Microsoft Cognitive Services> developer portal

Computer Vision API (v3.2)

The Computer Vision API provides state-of-the-art algorithms to process images and return information. For example, it can be used to determine if an image contains mature content, or it can be used to find all the faces in an image. It also has other features like estimating dominant and accent colors, categorizing the content of images, and describing an image with complete English sentences. Additionally, it can also intelligently generate images thumbnails for displaying large images effectively.

This API is currently available in:

Australia East - australiaeast.api.cognitive.microsoft.com
Brazil South - brazilsouth.api.cognitive.microsoft.com
Canada Central - canadacentral.api.cognitive.microsoft.com
Central India - centralindia.api.cognitive.microsoft.com
Central US - centralus.api.cognitive.microsoft.com
East Asia - eastasia.api.cognitive.microsoft.com
East US - eastus.api.cognitive.microsoft.com
East US 2 - eastus2.api.cognitive.microsoft.com
France Central - francecentral.api.cognitive.microsoft.com
Japan East - japaneast.api.cognitive.microsoft.com
Japan West - japanwest.api.cognitive.microsoft.com
Korea Central - koreacentral.api.cognitive.microsoft.com
North Central US - northcentralus.api.cognitive.microsoft.com
North Europe - northeurope.api.cognitive.microsoft.com
South Africa North - southafricanorth.api.cognitive.microsoft.com
South Central US - southcentralus.api.cognitive.microsoft.com
Southeast Asia - southeastasia.api.cognitive.microsoft.com
UK South - uksouth.api.cognitive.microsoft.com
West Central US - westcentralus.api.cognitive.microsoft.com
West Europe - westeurope.api.cognitive.microsoft.com
West US - westus.api.cognitive.microsoft.com
West US 2 - westus2.api.cognitive.microsoft.com

OCR

Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream.

Upon success, the OCR results will be returned.

Upon failure, the error code together with an error message will be returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.

Http Method

POST

Select the testing console in the region where you created your resource:

West US West US 2 East US East US 2 West Central US South Central US West Europe North Europe Southeast Asia East Asia Australia East Brazil South Canada Central Central India UK South Japan East Central US France Central Korea Central Japan West North Central US South Africa North UAE North Norway East West US 3 Jio India West

Request URL

https://{endpoint}/vision/v3.2/ocr[?language][&detectOrientation][&model-version]

Request parameters

language (optional)

string

The BCP-47 language code of the text to be detected in the image.The default value is "unk", then the service will auto detect the language of the text in the image.

Supported languages:

unk (AutoDetect)
zh-Hans (ChineseSimplified)
zh-Hant (ChineseTraditional)
cs (Czech)
da (Danish)
nl (Dutch)
en (English)
fi (Finnish)
fr (French)
de (German)
el (Greek)
hu (Hungarian)
it (Italian)
ja (Japanese)
ko (Korean)
nb (Norwegian)
pl (Polish)
pt (Portuguese,
ru (Russian)
es (Spanish)
sv (Swedish)
tr (Turkish)
ar (Arabic)
ro (Romanian)
sr-Cyrl (SerbianCyrillic)
sr-Latn (SerbianLatin)
sk (Slovak)

detectOrientation (optional)

boolean

Whether detect the text orientation in the image. With detectOrientation=true the OCR service tries to detect the image orientation and correct it before further processing (e.g. if it's upside-down).

model-version (optional)

string

Optional parameter to specify the version of the AI model. The default value is "latest".

Request headers

Content-Type

string

Media type of the body sent to the API.

Ocp-Apim-Subscription-Key

string

Subscription key which provides access to this API. Found in your Cognitive Services accounts.

Request body

Input passed within the POST body. Supported input methods: raw image binary or image URL.

Input requirements:

Supported image formats: JPEG, PNG, GIF, BMP.
Image file size must be less than 4MB.
Image dimensions must be between 50 x 50 and 4200 x 4200 pixels, and the image cannot be larger than 10 megapixels.

{"url":"http://example.com/images/test.jpg"}

[Binary image data]

[Binary image data]

Response 200

The OCR results in the hierarchy of region/line/word. The results include text, bounding box for regions, lines and words.

textAngle
The angle, in radians, of the detected text with respect to the closest horizontal or vertical direction. After rotating the input image clockwise by this angle, the recognized text lines become horizontal or vertical. In combination with the orientation property it can be used to overlay recognition results correctly on the original image, by rotating either the original image or recognition results by a suitable angle around the center of the original image. If the angle cannot be confidently detected, this property is not present. If the image contains text at different angles, only part of the text will be recognized correctly.

orientation
Orientation of the text recognized in the image, if requested. The value (up, down, left, or right) refers to the direction that the top of the recognized text is facing, after the image has been rotated around its center according to the detected text angle (see textAngle property).
If detection of the orientation was not requested, or no text is detected, the value is 'NotDetected'.

language
The BCP-47 language code (user-provided or auto-detected) of the text detected in the image.

regions
An array of objects, where each object represents a region of recognized text. A region consists of multiple lines (e.g. a column of text in a multi-column document).

lines
An array of objects, where each object represents a line of recognized text.

words
An array of objects, where each object represents a recognized word.

boundingBox
Bounding box of a recognized region, line, or word, depending on the parent object. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, width, and height of the bounding box, in the coordinate system of the input image, after it has been rotated around its center according to the detected text angle (see textAngle property), with the origin at the top-left corner, and the y-axis pointing down.

text
String value of a recognized word.

application/json

{
  "language": "en",
  "textAngle": -2.0000000000000338,
  "orientation": "Up",
  "regions": [
    {
      "boundingBox": "462,379,497,258",
      "lines": [
        {
          "boundingBox": "462,379,497,74",
          "words": [
            {
              "boundingBox": "462,379,41,73",
              "text": "A"
            },
            {
              "boundingBox": "523,379,153,73",
              "text": "GOAL"
            },
            {
              "boundingBox": "694,379,265,74",
              "text": "WITHOUT"
            }
          ]
        },
        {
          "boundingBox": "565,471,289,74",
          "words": [
            {
              "boundingBox": "565,471,41,73",
              "text": "A"
            },
            {
              "boundingBox": "626,471,150,73",
              "text": "PLAN"
            },
            {
              "boundingBox": "801,472,53,73",
              "text": "IS"
            }
          ]
        },
        {
          "boundingBox": "519,563,375,74",
          "words": [
            {
              "boundingBox": "519,563,149,74",
              "text": "JUST"
            },
            {
              "boundingBox": "683,564,41,72",
              "text": "A"
            },
            {
              "boundingBox": "741,564,153,73",
              "text": "WISH"
            }
          ]
        }
      ]
    }
  ],
  "modelVersion": "2021-04-01"
}

Response 400

Possible Errors:

InvalidRequest
InvalidArgument

Additional details are provided in the inner error.

application/json

{
    "error": {
        "code": "InvalidRequest",
        "message": "Input data is not a valid image.",
        "innererror": {
            "code": "InvalidImageFormat",
            "message": "Input data is not a valid image."
        }
    }
}

Response 415

Unsupported media type error. Content-Type is not in the allowed types:

For an image URL: Content-Type should be application/json
For a binary image data: Content-Type should be application/octet-stream or multipart/form-data

application/json

{
    "code": "BadArgument",
    "message": "Invalid Media Type"
}

Response 500

Internal service error.

application/json

{
    "error": {
        "code": "InternalServerError",
        "message": "Internal server error"
    }
}

Response 503

Service unavailable. A transient fault has occurred. Please try again later.

application/json

{
    "error": {
        "code": "ServiceUnavailable",
        "message": "The service is temporarily unavailable.",
        "innererror": {
            "code": "ServiceUnavailable",
            "message": "The service is temporarily unavailable."
        }
    }
}

Code samples

@ECHO OFF

curl -v -X POST "https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr?language=unk&detectOrientation=true&model-version=latest"
-H "Content-Type: application/json"
-H "Ocp-Apim-Subscription-Key: {subscription key}"

--data-ascii "{body}"

using System;
using System.Net.Http.Headers;
using System.Text;
using System.Net.Http;
using System.Web;

namespace CSHttpClientSample
{
    static class Program
    {
        static void Main()
        {
            MakeRequest();
            Console.WriteLine("Hit ENTER to exit...");
            Console.ReadLine();
        }
        
        static async void MakeRequest()
        {
            var client = new HttpClient();
            var queryString = HttpUtility.ParseQueryString(string.Empty);

            // Request headers
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "{subscription key}");

            // Request parameters
            queryString["language"] = "unk";
            queryString["detectOrientation"] = "true";
            queryString["model-version"] = "latest";
            var uri = "https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr?" + queryString;

            HttpResponseMessage response;

            // Request body
            byte[] byteData = Encoding.UTF8.GetBytes("{body}");

            using (var content = new ByteArrayContent(byteData))
            {
               content.Headers.ContentType = new MediaTypeHeaderValue("< your content type, i.e. application/json >");
               response = await client.PostAsync(uri, content);
            }

        }
    }
}

// // This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class JavaSample 
{
    public static void main(String[] args) 
    {
        HttpClient httpclient = HttpClients.createDefault();

        try
        {
            URIBuilder builder = new URIBuilder("https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr");

            builder.setParameter("language", "unk");
            builder.setParameter("detectOrientation", "true");
            builder.setParameter("model-version", "latest");

            URI uri = builder.build();
            HttpPost request = new HttpPost(uri);
            request.setHeader("Content-Type", "application/json");
            request.setHeader("Ocp-Apim-Subscription-Key", "{subscription key}");


            // Request body
            StringEntity reqEntity = new StringEntity("{body}");
            request.setEntity(reqEntity);

            HttpResponse response = httpclient.execute(request);
            HttpEntity entity = response.getEntity();

            if (entity != null) 
            {
                System.out.println(EntityUtils.toString(entity));
            }
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}

<!DOCTYPE html>
<html>
<head>
    <title>JSSample</title>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
    $(function() {
        var params = {
            // Request parameters
            "language": "unk",
            "detectOrientation": "true",
            "model-version": "latest",
        };
      
        $.ajax({
            url: "https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr?" + $.param(params),
            beforeSend: function(xhrObj){
                // Request headers
                xhrObj.setRequestHeader("Content-Type","application/json");
                xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key","{subscription key}");
            },
            type: "POST",
            // Request body
            data: "{body}",
        })
        .done(function(data) {
            alert("success");
        })
        .fail(function() {
            alert("error");
        });
    });
</script>
</body>
</html>

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    
    NSString* path = @"https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr";
    NSArray* array = @[
                         // Request parameters
                         @"entities=true",
                         @"language=unk",
                         @"detectOrientation=true",
                         @"model-version=latest",
                      ];
    
    NSString* string = [array componentsJoinedByString:@"&"];
    path = [path stringByAppendingFormat:@"?%@", string];

    NSLog(@"%@", path);

    NSMutableURLRequest* _request = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:path]];
    [_request setHTTPMethod:@"POST"];
    // Request headers
    [_request setValue:@"application/json" forHTTPHeaderField:@"Content-Type"];
    [_request setValue:@"{subscription key}" forHTTPHeaderField:@"Ocp-Apim-Subscription-Key"];
    // Request body
    [_request setHTTPBody:[@"{body}" dataUsingEncoding:NSUTF8StringEncoding]];
    
    NSURLResponse *response = nil;
    NSError *error = nil;
    NSData* _connectionData = [NSURLConnection sendSynchronousRequest:_request returningResponse:&response error:&error];

    if (nil != error)
    {
        NSLog(@"Error: %@", error);
    }
    else
    {
        NSError* error = nil;
        NSMutableDictionary* json = nil;
        NSString* dataString = [[NSString alloc] initWithData:_connectionData encoding:NSUTF8StringEncoding];
        NSLog(@"%@", dataString);
        
        if (nil != _connectionData)
        {
            json = [NSJSONSerialization JSONObjectWithData:_connectionData options:NSJSONReadingMutableContainers error:&error];
        }
        
        if (error || !json)
        {
            NSLog(@"Could not parse loaded json with error:%@", error);
        }
        
        NSLog(@"%@", json);
        _connectionData = nil;
    }
    
    [pool drain];

    return 0;
}

<?php
// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
require_once 'HTTP/Request2.php';

$request = new Http_Request2('https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr');
$url = $request->getUrl();

$headers = array(
    // Request headers
    'Content-Type' => 'application/json',
    'Ocp-Apim-Subscription-Key' => '{subscription key}',
);

$request->setHeader($headers);

$parameters = array(
    // Request parameters
    'language' => 'unk',
    'detectOrientation' => 'true',
    'model-version' => 'latest',
);

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body
$request->setBody("{body}");

try
{
    $response = $request->send();
    echo $response->getBody();
}
catch (HttpException $ex)
{
    echo $ex;
}

?>

########### Python 2.7 #############
import httplib, urllib, base64

headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': '{subscription key}',
}

params = urllib.urlencode({
    # Request parameters
    'language': 'unk',
    'detectOrientation': 'true',
    'model-version': 'latest',
})

try:
    conn = httplib.HTTPSConnection('westus3.api.cognitive.microsoft.com')
    conn.request("POST", "/vision/v3.2/ocr?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    print("[Errno {0}] {1}".format(e.errno, e.strerror))

####################################

########### Python 3.2 #############
import http.client, urllib.request, urllib.parse, urllib.error, base64

headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': '{subscription key}',
}

params = urllib.parse.urlencode({
    # Request parameters
    'language': 'unk',
    'detectOrientation': 'true',
    'model-version': 'latest',
})

try:
    conn = http.client.HTTPSConnection('westus3.api.cognitive.microsoft.com')
    conn.request("POST", "/vision/v3.2/ocr?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    print(data)
    conn.close()
except Exception as e:
    print("[Errno {0}] {1}".format(e.errno, e.strerror))

####################################

require 'net/http'

uri = URI('https://westus3.api.cognitive.microsoft.com/vision/v3.2/ocr')

query = URI.encode_www_form({
    # Request parameters
    'language' => 'unk',
    'detectOrientation' => 'true',
    'model-version' => 'latest'
})
if query.length > 0
  if uri.query && uri.query.length > 0
    uri.query += '&' + query
  else
    uri.query = query
  end
end

request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
request['Content-Type'] = 'application/json'
# Request headers
request['Ocp-Apim-Subscription-Key'] = '{subscription key}'
# Request body
request.body = "{body}"

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.request(request)
end

puts response.body