FAQ Series_101 — Form Recognizer

Amit Damle
Feb 9, 2022

Hi everyone, I am sure you remember the struggle of delivering a task, such as a specific analysis or implementation, with limited knowledge of the subject. I have always felt that if I had the answers to the most common questions, I could concentrate on the complex scenarios. But once a task is over, I forget to document what I learned, and when the same task needs to be repeated, I have to start all over again. So I have decided to start a 101 FAQ series for Azure components to help get this information as quickly as possible. This is by no means a complete list; it covers the questions I have to answer most of the time.

Azure Form Recognizer FAQ —

1. What is Form Recognizer?

Azure Form Recognizer is a cloud-based managed AI service that uses machine learning and deep learning models to extract and analyze form fields, text, and tables from your documents. Form Recognizer analyzes your forms and documents, extracts text and data, maps field relationships as key-value pairs, and returns a structured JSON output. Here is more info on Form Recognizer.

2. How do I use Form Recognizer?

Extracting data with Form Recognizer is a simple 3-step process:

➊ Create a Form Recognizer resource in the Azure Portal

➋ Choose a model from the predefined models

➌ Use the SDK or REST API to call your preferred model endpoint

Here is the detailed info
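The three steps above can be sketched in Python. This is a minimal sketch, assuming the `azure-ai-formrecognizer` package (version 3.2 or later) is installed and that a resource already exists; the endpoint, key, and document URL are placeholders, and `summarize_fields` is a hypothetical helper of mine, not part of the SDK.

```python
def summarize_fields(fields):
    """Flatten a mapping of field name -> field object into plain tuples.

    Each field object is expected to expose `.value` and `.confidence`,
    as the SDK's analyzed document fields do. Missing fields are skipped.
    """
    return {
        name: (field.value, field.confidence)
        for name, field in fields.items()
        if field is not None
    }


def analyze_invoice(endpoint, key, document_url):
    """Call the prebuilt invoice model and return one dict per document."""
    # Imported lazily so the helper above works without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    poller = client.begin_analyze_document_from_url("prebuilt-invoice", document_url)
    result = poller.result()  # blocks until the analysis finishes
    return [summarize_fields(doc.fields) for doc in result.documents]
```

A call would look like `analyze_invoice("https://<resource>.cognitiveservices.azure.com/", "<key>", "https://<storage>/invoice.pdf")`, with the placeholders replaced by your own values.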

3. What development languages are supported by the Form Recognizer SDK?

At present it supports four development languages: C#, Java, JavaScript, and Python. If you want to develop in any other language, you can use the REST APIs.
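For languages without an SDK, the REST call is a plain HTTP POST, so it translates easily. The sketch below builds such a request using only the Python standard library, so the shape is easy to port. The `api-version` value and the v3-style path are assumptions; check which version your resource supports.

```python
import json
import urllib.request

API_VERSION = "2022-08-31"  # assumption: use the version your resource supports


def build_analyze_request(endpoint, key, model_id, document_url):
    """Build (but do not send) a v3-style analyze request for a document URL."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # API key authentication
            "Content-Type": "application/json",
        },
    )
```

Sending the request with `urllib.request.urlopen` starts an asynchronous analysis; the response carries an `Operation-Location` header that is polled for the result.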

4. Which languages are supported by Form Recognizer?

Form Recognizer supports 7 languages for handwritten text and 122 languages for printed text. Please refer to this for the exhaustive list of supported languages. The custom neural model supports only English at the time of writing this blog.

5. Which formats are supported by Form Recognizer out of the box?

Form Recognizer supports 6 out-of-the-box models, plus custom models that you can train using your own data.
The supported formats are General Document, Layout, and the prebuilt models for Invoices, Receipts, ID Documents, and Business Cards. More info
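When calling these models from code, each one is addressed by a model ID. The mapping below is a small sketch; the ID strings are the v3 identifiers as I understand them, so treat them as assumptions and verify them against the docs for your API version. `model_id_for` is a hypothetical helper.

```python
# Assumption: v3 model IDs; verify against the docs for your API version.
PREBUILT_MODEL_IDS = {
    "General Document": "prebuilt-document",
    "Layout": "prebuilt-layout",
    "Invoices": "prebuilt-invoice",
    "Receipts": "prebuilt-receipt",
    "ID Documents": "prebuilt-idDocument",
    "Business Cards": "prebuilt-businessCard",
}


def model_id_for(document_kind):
    """Return the model ID for one of the six out-of-the-box formats."""
    try:
        return PREBUILT_MODEL_IDS[document_kind]
    except KeyError:
        known = ", ".join(PREBUILT_MODEL_IDS)
        raise ValueError(f"unknown format {document_kind!r}; expected one of: {known}") from None
```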

6. What is Form Recognizer Studio?

Form Recognizer Studio (preview) is an online tool for visually exploring, understanding, and integrating features from the Form Recognizer service into your applications. Refer to this for more details.

7. What if the prebuilt models do not help me extract the required fields?

You need to use custom models, which allow you to train a Form Recognizer model on your own data and labels. More info here

8. How many samples do I need to train my custom model?

You can train your model with a minimum of 5 documents. If the document quality is low, additional samples will be required.

9. Which custom model format should I use?

Form Recognizer supports 2 types of custom models: template-based models and neural models.

Template-based models depend on a consistent template (layout) to extract data, whereas custom neural models can work with structured, semi-structured, or unstructured documents. At the time of writing this blog, custom neural models extract only form fields (key-value pairs) and selection marks; tables, signatures, and regions are not supported.

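10. What is the difference between the custom template and custom neural models?

See the comparison in the official Form Recognizer docs. In short, as noted in the previous answer, template models rely on a consistent layout, while neural models also handle semi-structured and unstructured documents.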

11. Where can I use custom neural models?

Custom neural models can be used for extracting data from surveys, questionnaires, application forms, invoices, contracts, letters, and other documents where table, signature, and region fields are not present.

12. What are the supported formats for samples?

Form Recognizer supports JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned) documents.
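Before uploading a training set, it can help to check it against the two constraints mentioned in this FAQ: the supported file formats and the minimum of 5 documents. The helper below is a hypothetical sketch; the extension list is my reading of the formats above, and it checks names only, not file contents or quality.

```python
from pathlib import Path

# Extensions assumed to correspond to JPEG, PNG, BMP, TIFF, and PDF.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".pdf"}
MIN_TRAINING_DOCUMENTS = 5


def check_training_samples(filenames):
    """Split filenames into usable and unsupported, and flag too-small sets."""
    usable, unsupported = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            usable.append(name)
        else:
            unsupported.append(name)
    return {
        "usable": usable,
        "unsupported": unsupported,
        "enough": len(usable) >= MIN_TRAINING_DOCUMENTS,
    }
```

Passing this check does not guarantee a good model; as noted earlier, low-quality documents need additional samples.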

13. How much time does it take to train a custom model?

It depends on the size and complexity of the documents; on average, training completes in a few minutes.

14. How do I deploy my custom model?

Use containers if on-premises deployment is required (v2.1). For version 3.x, you can use Form Recognizer Studio or the REST API to create and use the model.

15. Can I deploy multiple versions of the model?

Yes, it is possible to create multiple versions and use a specific version at inference time.

16. How much will it cost me?

Please refer to the pricing page.

17. Can I run a Form Recognizer model on my on-premises hardware?

Yes. At the time of writing this blog, the Form Recognizer v2.1 container is available in preview. Check this document.

18. What security controls does Form Recognizer provide to secure my data extraction?

Azure Form Recognizer provides various security controls, e.g. authentication using an API key and encryption of data in transit using TLS 1.2. Please refer to this document.

19. How do I automate the data extraction process using Form Recognizer?

Use Azure Functions or Logic Apps.
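Whichever host runs the automation, a REST-based flow has to wait for the asynchronous analysis to finish before it can use the output. The polling loop below is a sketch of that one step. `fetch_status` is a hypothetical callable that returns the parsed JSON from the operation URL, and the status strings are assumed from the v3 REST API.

```python
import time


def wait_for_result(fetch_status, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the analysis succeeds, fails, or the attempts run out.

    `fetch_status` returns a dict with a "status" key; other statuses such
    as "notStarted" or "running" simply trigger another poll.
    """
    for _ in range(max_attempts):
        payload = fetch_status()
        status = payload.get("status")
        if status == "succeeded":
            return payload.get("analyzeResult")
        if status == "failed":
            raise RuntimeError(f"analysis failed: {payload.get('error')}")
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError("analysis did not finish in time")
```

The `sleep` parameter is injectable so the loop can be tested without waiting.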

20. Is it possible to extract data from a table spanning multiple pages?

This needs custom code to correlate the table parts across pages. Please refer to this link for a sample.
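To give an idea of what such correlation code can look like, here is a minimal sketch. It is not the linked sample. It works on a simplified, hypothetical representation (one dict per table with a page number and rows), and the matching heuristic is my own assumption: a table continues the previous one when it sits on the next page and has the same number of columns.

```python
def merge_spanning_tables(tables):
    """Merge tables that continue onto the following page.

    `tables` is a list of dicts like {"page": 1, "rows": [[...], ...]},
    sorted by page. Returns dicts with first_page, last_page, and rows.
    """
    merged = []
    for table in tables:
        rows = [list(row) for row in table["rows"]]
        prev = merged[-1] if merged else None
        continues = bool(
            prev is not None
            and rows
            and prev["rows"]
            and table["page"] == prev["last_page"] + 1
            and len(rows[0]) == len(prev["rows"][0])
        )
        if continues:
            # Skip a header row repeated at the top of the new page.
            if rows[0] == prev["rows"][0]:
                rows = rows[1:]
            prev["rows"].extend(rows)
            prev["last_page"] = table["page"]
        else:
            merged.append(
                {"first_page": table["page"], "last_page": table["page"], "rows": rows}
            )
    return merged
```

Real documents usually need stricter checks, for example comparing header text or column positions, because two unrelated tables on consecutive pages can share a column count.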

21. I have multiple types of forms. Do I need to specify the type while performing extraction with Form Recognizer?

No. One composed model can be created from multiple individual custom models that together cover all required document types. While extracting data, users can submit documents without mentioning the type. The composed model runs classification to identify the specific custom model to use for data extraction. Please refer to this.

22. How many models are supported in a composed model?

5 models for the Free tier and 100 for the Standard tier.
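A small guard like the one below can catch an oversized composed model before the create call is made. It is a sketch that encodes only the two limits quoted above; the tier names and the helper itself are hypothetical.

```python
# Limits as quoted above; the tier keys are illustrative names.
COMPOSED_MODEL_LIMITS = {"free": 5, "standard": 100}


def check_composed_model(component_model_ids, tier="standard"):
    """Return de-duplicated model IDs, or raise if the tier limit is exceeded."""
    limit = COMPOSED_MODEL_LIMITS[tier]
    unique_ids = list(dict.fromkeys(component_model_ids))  # drop duplicates, keep order
    if len(unique_ids) > limit:
        raise ValueError(
            f"{len(unique_ids)} component models exceed the {tier} tier limit of {limit}"
        )
    return unique_ids
```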

23. I am using an old version of Form Recognizer, i.e. 2.1. How can I migrate to the new version, i.e. 3.1?

Please refer to the migration guide.

24. How do I back up and recover my custom Form Recognizer models?

When an HA and DR scenario is required, two Form Recognizer resources need to be present in two separate regions. Custom models can be synced across the two regions by periodically executing the copy REST APIs.

Please refer to the backup and recovery docs.
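The copy step can also be driven from the Python SDK instead of raw REST calls. The sketch below assumes two administration clients that behave like `DocumentModelAdministrationClient` from `azure-ai-formrecognizer` 3.2 or later; the method names are tied to that SDK version and should be verified against the version you have installed. The function names are hypothetical.

```python
def copy_model_to_secondary(source_admin_client, target_admin_client, model_id):
    """Copy one custom model from the primary resource to the secondary one."""
    # The target resource authorizes the copy and names the new model.
    authorization = target_admin_client.get_copy_authorization(model_id=model_id)
    # The source resource performs the copy as a long-running operation.
    poller = source_admin_client.begin_copy_document_model_to(model_id, target=authorization)
    return poller.result()


def sync_models(source_admin_client, target_admin_client, model_ids):
    """Copy each listed model in turn; meant to be run on a schedule."""
    return [
        copy_model_to_secondary(source_admin_client, target_admin_client, model_id)
        for model_id in model_ids
    ]
```

Keeping the clients as parameters means the same function works for either direction, which is useful when failing back to the primary region.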
