
How to Use Microsoft Azure Form Recognizer API in .NET

October 31, 2024

Unlock insights with Microsoft Azure Form Recognizer API in .NET. Learn step-by-step integration for accurate, efficient document data extraction.


Getting Started with Azure Form Recognizer API

 

  • Before writing any code, create a Form Recognizer resource in the Azure portal and copy its endpoint and API key.

  • Install the Azure.AI.FormRecognizer NuGet package in your .NET project. It provides the client types used to call the API.

  • Most of the work goes through the FormRecognizerClient class, which exposes the methods for analyzing documents.

 

dotnet add package Azure.AI.FormRecognizer

 

Sample Configuration and Initialization

 

  • Initialize the FormRecognizerClient with the endpoint and API key from your Form Recognizer resource.

  • Create the client once, typically in the part of your application that configures dependency injection or service initialization, and reuse that instance for all requests.

 

using Azure;
using Azure.AI.FormRecognizer;
using Azure.AI.FormRecognizer.Models;
using Azure.AI.FormRecognizer.Training;
using System;
using System.IO;
using System.Threading.Tasks;

namespace FormRecognizerExample
{
    public class FormRecognizerService
    {
        private readonly FormRecognizerClient _formRecognizerClient;

        public FormRecognizerService(string endpoint, string apiKey)
        {
            var credential = new AzureKeyCredential(apiKey);
            _formRecognizerClient = new FormRecognizerClient(new Uri(endpoint), credential);
        }
    }
}
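
The constructor above takes the endpoint and API key as plain strings, so something has to supply them. A minimal sketch of one option, reading both from environment variables and failing early if either is missing, is shown below. The variable names `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_API_KEY` and the `FormRecognizerSettings` class are assumptions made for illustration; they are not defined by the SDK.

```csharp
using System;

public sealed class FormRecognizerSettings
{
    // Hypothetical variable names; use whatever your deployment defines.
    public const string EndpointVariable = "FORM_RECOGNIZER_ENDPOINT";
    public const string ApiKeyVariable = "FORM_RECOGNIZER_API_KEY";

    public string Endpoint { get; }
    public string ApiKey { get; }

    private FormRecognizerSettings(string endpoint, string apiKey)
    {
        Endpoint = endpoint;
        ApiKey = apiKey;
    }

    // Reads both values and validates them before any client is created.
    public static FormRecognizerSettings FromEnvironment()
    {
        var endpoint = Require(EndpointVariable);
        var apiKey = Require(ApiKeyVariable);

        // Reject anything that is not an absolute https URL.
        if (!Uri.TryCreate(endpoint, UriKind.Absolute, out var parsed)
            || parsed.Scheme != Uri.UriSchemeHttps)
        {
            throw new InvalidOperationException(
                $"{EndpointVariable} must be an absolute https URL.");
        }

        return new FormRecognizerSettings(endpoint, apiKey);
    }

    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable {name} is not set.");
        }
        return value;
    }
}
```

With this in place, the service could be created as `new FormRecognizerService(settings.Endpoint, settings.ApiKey)` after calling `FormRecognizerSettings.FromEnvironment()`. Keeping the key out of source code is the main point; a secret store would serve the same purpose.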

 

Analyzing Forms with Form Recognizer

 

  • To analyze a document, call StartRecognizeContentAsync for basic extraction of text and tables, or StartRecognizeCustomFormsAsync to extract fields with a model you trained yourself.

  • Custom models are addressed by model ID, so keep the ID that was returned when you trained the model in Azure.

 

public async Task AnalyzeFormAsync(string filePath)
{
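    // Open the document read-only; the stream is disposed when the method returns.
    using var stream = File.OpenRead(filePath);

    // Start the long-running analysis, then wait for the service to finish it.
    var operation = await _formRecognizerClient.StartRecognizeContentAsync(stream);
    var result = await operation.WaitForCompletionAsync();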

    foreach (var page in result.Value)
    {
        Console.WriteLine($"Page number: {page.PageNumber}");

        foreach (var table in page.Tables)
        {
            Console.WriteLine("Table data:");

            foreach (var cell in table.Cells)
            {
                Console.WriteLine($"Cell text: '{cell.Text}'");
            }
        }
    }
}
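
The custom-model path mentioned in this section follows the same start-then-wait pattern. The sketch below assumes the same client generation as the rest of the article (`FormRecognizerClient`), a model ID from an earlier training run, and a local file; the class and method names are illustrative only.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Azure;
using Azure.AI.FormRecognizer;
using Azure.AI.FormRecognizer.Models;

public static class CustomFormExample
{
    // modelId is the ID returned when the custom model was trained.
    public static async Task PrintFieldsAsync(
        FormRecognizerClient client, string modelId, string filePath)
    {
        using var stream = File.OpenRead(filePath);

        // Submit the document to the custom model and wait for the result.
        var operation = await client.StartRecognizeCustomFormsAsync(modelId, stream);
        Response<RecognizedFormCollection> result = await operation.WaitForCompletionAsync();

        foreach (RecognizedForm form in result.Value)
        {
            Console.WriteLine($"Form type: {form.FormType}");

            foreach (FormField field in form.Fields.Values)
            {
                // ValueData can be null when the service found no text for the field.
                string text = field.ValueData?.Text ?? "(no value)";
                Console.WriteLine($"{field.Name}: '{text}' (confidence {field.Confidence:F2})");
            }
        }
    }
}
```

Printing the confidence next to each field makes it easier to decide which values need manual review.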

 

Handling Models and Training Data

 

  • Form Recognizer can train custom models on your own forms. You upload sample documents to an Azure Blob Storage container and, for labeled training, mark the fields the model should learn to recognize.

  • Keep the training data clean and consistently organized; the quality of the samples directly affects the accuracy of the trained model.

 

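// Training is exposed by FormTrainingClient (namespace Azure.AI.FormRecognizer.Training),
// not by FormRecognizerClient, so the caller passes a training client in.
public async Task TrainModelAsync(FormTrainingClient trainingClient, string trainingDataUrl)
{
    // trainingDataUrl is a SAS URL for the blob container that holds the training documents.
    // useTrainingLabels: true expects label files in the container; pass false for unlabeled training.
    var operation = await trainingClient.StartTrainingAsync(new Uri(trainingDataUrl), useTrainingLabels: true);
    var customFormModel = await operation.WaitForCompletionAsync();

    Console.WriteLine($"Model ID: {customFormModel.Value.ModelId}");
}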

 

Handling Errors and Debugging

 

  • Calls to a live service can fail because of network problems, invalid credentials, or unsupported documents. Wrap them in try-catch blocks and inspect the exception to find out what went wrong.

  • The Azure SDK also emits diagnostic logs, which help trace problems in the requests and responses exchanged with the API.

 

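try
{
    await AnalyzeFormAsync("path/to/your/document.pdf");
}
catch (RequestFailedException ex)
{
    // Errors returned by the service carry an HTTP status and an error code.
    Console.WriteLine($"Request failed ({ex.Status}, {ex.ErrorCode}): {ex.Message}");
}
catch (Exception ex)
{
    // Anything else, such as a missing file or a network failure.
    Console.WriteLine($"An error occurred: {ex.Message}");
}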
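
The SDK logging mentioned above can be switched on with a console listener from `Azure.Core.Diagnostics`. This is a small sketch; the `SdkLogging` wrapper and its method name are illustrative, and the log level is a parameter you would tune for your own needs.

```csharp
using System;
using System.Diagnostics.Tracing;
using Azure.Core.Diagnostics;

public static class SdkLogging
{
    // Writes Azure SDK events at the given level (and more severe) to the console.
    // Dispose the returned listener to stop logging.
    public static IDisposable ToConsole(EventLevel level = EventLevel.Informational)
        => AzureEventSourceListener.CreateConsoleLogger(level);
}
```

Wrapping a call in `using (SdkLogging.ToConsole(EventLevel.Verbose)) { ... }` limits the extra output to the code under investigation.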

 

Optimizing Performance

 

  • Reduce waiting time by processing multiple documents concurrently and by using the asynchronous methods shown in the examples, which keep the application responsive while the service works.

  • Retrain models periodically with new sample documents to maintain accuracy as your forms and data drift over time.
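
One way to process several documents at once without flooding the service is to cap the number of calls in flight. The helper below is a generic sketch built only on the .NET base class library; `BoundedConcurrency` and the limit you pass in are assumptions for illustration, and the right limit depends on your resource's rate limits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedConcurrency
{
    // Runs `work` for every item, with at most `maxParallel` calls in flight.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, int maxParallel, Func<T, Task> work)
    {
        using var gate = new SemaphoreSlim(maxParallel);

        // ToList() starts every task now; each one waits at the gate for its turn.
        var tasks = items.Select(async item =>
        {
            await gate.WaitAsync();
            try
            {
                await work(item);
            }
            finally
            {
                gate.Release();
            }
        }).ToList();

        // Completes when all items are done; rethrows if any call failed.
        await Task.WhenAll(tasks);
    }
}
```

For example, `await BoundedConcurrency.RunAsync(paths, 4, path => service.AnalyzeFormAsync(path));` would analyze a list of files four at a time, where `service` is an instance of the `FormRecognizerService` class from earlier.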