Microsoft OCR Library for Windows Runtime

(blogs.windows.com)

132 points · maouida · 11y ago · 45 comments

steeve11y ago· 7 in thread

We had great results using tesseract-ocr[1] with SWT (Stroke Width Transform, a state-of-the-art text detection algorithm, via libccv[2]) on Linux.

You can use our python bindings for both[3,4], although they might be slightly outdated:

[1] https://code.google.com/p/tesseract-ocr/

[2] http://libccv.org/doc/doc-swt/

[3] https://github.com/veezio/pytesseract

[4] https://github.com/veezio/pyccv

danbruc11y ago

Be aware that SWT is patented [1] if you want to use it commercially.

[1] http://www.google.com/patents/US20090285482

discjockeydom11y ago

This link shows the claims of the published application. The recently allowed claims are a lot more narrow and less problematic. Still worth reviewing though in case you are worried you infringe:

http://www.scribd.com/doc/240266916/12122729

ap2221311y ago

It looks like Microsoft is the assignee? If so, is this included in the Microsoft OCR library?

MrBuddyCasino11y ago

Is it still possible to generate pixel correct hOCR when using SWT? Also, what is the main advantage of SWT - improving speed or accuracy?

steeve11y ago

I'm not sure about speed, but for accuracy, it's great. We've had terrible results with tesseract when giving it text that wasn't properly cropped with SWT.
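A rough sketch of the pipeline described here: detect text boxes first, crop each one, and pass only the crops to the recogniser. Everything in it is an assumption for illustration. `crop`, `ocr_regions`, and the `recognize` callable are hypothetical names, the image is a plain list of pixel rows, and none of this is the pyccv or pytesseract API.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]            # rows of grayscale pixels (assumed representation)
Box = Tuple[int, int, int, int]    # (x, y, width, height), as a text detector might report


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box, clipped to the image bounds."""
    x, y, w, h = box
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = max(x + w, 0), max(y + h, 0)
    # Slicing clips automatically when x1 or y1 run past the image edge.
    return [row[x0:x1] for row in image[y0:y1]]


def ocr_regions(
    image: Image,
    boxes: Sequence[Box],
    recognize: Callable[[Image], str],
) -> List[str]:
    """Run the recogniser on each detected box, top-to-bottom then left-to-right."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    return [recognize(crop(image, b)) for b in ordered]
```

In a real setup the boxes would come from the text detector and `recognize` would wrap the OCR engine; both are left as stand-ins here.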

beagle311y ago

What did you use to generate pyccv? (It looks automatically generated)

Does it still work with an up-to-date ccv?

steeve11y ago

This method: http://www.kaij.org/blog/?p=98

Although SWIG might work better now.

rikkus11y ago· 7 in thread

It doesn't appear that you can use this in a 'normal' .NET app. Any ideas why?

NetMonkey11y ago

This is really one of my big frustrations with Microsoft.

On one hand, they really try to push everybody to upgrade to their newest and shiniest, by making a lot of stuff (like this) only available on Windows 8+.

On the other hand, they don't even bother to put in a box with "What operating systems will this work on", so you don't have to do trial/error, research WinRT, and then be disappointed when you realize this will apparently never work on Windows 7. And maybe only in Metro apps? What is Windows Runtime and am I just supposed to know this?

I really enjoy coding C# and working in .NET. Microsoft has some really great stable techs which work well for years and years - but increasingly if you want anything new and shiny from them, you have to run the newest OS. Which if you work with anything related to enterprise, good luck only targeting Windows 8.

And honestly, despite working almost exclusively with MS tech, I just don't really trust any platform from them that doesn't have significant traction and track record as they all too often just give up and try something new - and sometimes without real replacements available.

danbruc11y ago

The MSDN documentation for the classes [1] clearly states the supported platforms. Admittedly the restriction to store apps is missing on the page for the namespace [2].

  Minimum supported client  Windows 8.1 [Windows Store apps only]
  Minimum supported server  Windows Server 2012 R2 [Windows Store apps only]
  Minimum supported phone   Windows Phone 8

[1] http://msdn.microsoft.com/en-us/library/windows/apps/xaml/wi...

[2] http://msdn.microsoft.com/en-us/library/windows/apps/xaml/wi...

pjmlp11y ago

> What is Windows Runtime and am I just supposed to know this?

If you subscribe to MSDN like any Windows developer, this has been explained multiple times in the last two years.

Just for the clueless ones.

Windows Runtime is an evolution of COM, based on the ideas that were at the genesis of .NET, namely Ext-VOS.

http://blogs.msdn.com/b/dsyme/archive/2012/07/05/more-c-net-...

So a native version of .NET, so to speak. And unless Windows 9 changes it, the future of Windows APIs.

The .NET runtime starting with Windows 8 acquired additional capabilities:

- Ahead of time compilation to native code for Windows Phone apps, with the MDIL binary format

- Consumption and creation of Windows Runtime components

> but increasingly if you want anything new and shiny from them, you have to run the newest OS.

No different from other commercial vendors.

ghuntley11y ago

Can confirm that the actual package successfully installs into a Profile78 Portable Class Library. So whilst the marketing heavily mentions Windows Phone, in theory this library will also work on Xamarin (iOS/Android, etc) and also within standard .NET applications (ASP.NET/Console/etc).

nb: haven't actually tested past installation at this stage.

edit: nope :(

rikkus11y ago

Ah excellent, will try that, thanks!

rikkus11y ago

I made a C# Console app and added the nuget package. It adds, but there aren't any references. Within the nuget package, though, there are three subdirectories within 'lib', one being 'win81'.

  packages\Microsoft.Windows.Ocr.1.0.0\lib\win81

Within this, there are 'ARM', 'x86' and 'x64' directories and dlls within them. VS refuses to add them to my project, so I'm guessing they're native and not COM libraries.

Why would I think they might be .NET libraries if they have 'x86' and 'x64' labels? Because C++/CLI has to be compiled to separate dlls, I believe.

pjmlp11y ago

It has nothing to do with C++/CLI, but with marshaling, JIT and NGEN.

When calling native code outside the CLR the runtime needs to know which type of marshaling code to generate.

It also plays a role when using unsafe code blocks in .NET.
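To illustrate why a package would ship one DLL per architecture: whatever loads native code has to choose the build that matches the running process, not just the machine. A minimal sketch of that choice follows, assuming the directory layout quoted above. `native_dir` and its mapping are hypothetical and are not how NuGet or the CLR actually resolve native binaries.

```python
from pathlib import PureWindowsPath

# Directory taken from the package layout quoted in the thread.
PACKAGE_LIB = PureWindowsPath(r"packages\Microsoft.Windows.Ocr.1.0.0\lib\win81")


def native_dir(machine: str, pointer_bits: int) -> PureWindowsPath:
    """Pick the per-architecture subdirectory for the running process (hypothetical helper)."""
    m = machine.lower()
    if m.startswith("arm") and pointer_bits == 32:
        return PACKAGE_LIB / "ARM"
    if m in ("amd64", "x86_64", "x86", "i386", "i686"):
        # A 32-bit process on a 64-bit machine still needs the x86 build.
        return PACKAGE_LIB / ("x64" if pointer_bits == 64 else "x86")
    raise ValueError(f"no native build for {machine!r} ({pointer_bits}-bit)")
```

On a real machine the arguments could come from `platform.machine()` and `struct.calcsize("P") * 8`.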

reallycurious11y ago· 6 in thread

is this better than the terrassect OCR?

jeroen11y ago

Tesseract: https://code.google.com/p/tesseract-ocr/

josteink11y ago

I think that's a sort of apples and pears type of comparison.

Tesseract can be used everywhere, and is used dominantly on open platforms. This is an offering from Microsoft to be used on their platform only.

They may both be good, but they have widely different platform targets.

RobAley11y ago

My guess is he meant better at actually OCR'ing text, not better for implementation.

gondo11y ago

what are you talking about? it is always about the results. OCR is a tool and it doesn't matter if it runs on windows, linux, osx, phone, tablet, watch. if this microsoft OCR produces better results than tesseract, then people will simply create a service running on windows (yes even on windows phone) and some kind of API to talk to it. the question remains the same: does it produce better results than tesseract?

so far, this microsoft OCR is just a bunch of words without any proof that it actually works whatsoever. show me some pictures or videos of results.
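One way to put a number on "better results" is the character error rate: the edit distance between the OCR output and a hand-checked transcript, divided by the transcript length. The sketch below is a generic metric, not something the blog post or either library is known to report, and the function names are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = cur
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by reference length; 0.0 means a perfect transcription."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Running both engines over the same set of images and comparing the average rate would give the kind of evidence asked for here.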

MrBuddyCasino11y ago

That's the big question. Tesseract is pretty good, though quite slow I must say.

Norm--11y ago

It depends on what is being scanned. If you have a perfectly formatted image, directly taken from a scanner, it's a pretty darn quick process.

But in my experience, what adds to the slowness is pre-processing the image to make it suitable for OCR, especially tesseract. I still haven't found the magic combination of filters because every image is different, especially if you source them from users' camera phones.
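As an example of the kind of filter meant here, one common pre-processing step is global binarisation with Otsu's method, which picks the threshold that best separates dark text from light background. This is only a sketch in pure Python over a flat list of 0-255 grayscale values. It is not the "magic combination", and unevenly lit phone photos often need local thresholding instead.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the 0-255 threshold that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their intensities
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # no pixels in the dark class yet
        if w1 == 0:
            break           # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels: Sequence[int], threshold: int) -> List[int]:
    """Map pixels above the threshold to 1 and the rest to 0 (dark text on light paper assumed)."""
    return [1 if p > threshold else 0 for p in pixels]
```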

cipher011y ago· 5 in thread

"demonstrated in code snippets below". The code snippets are actually images and even worse, they're JPEGs which is the reason why the text looks horrible.

drblast11y ago

If only there were some automated way to convert those images to text.

allegory11y ago

Now that is possibly the cruellest irony I've seen for a while. Well spotted :)

kyberias11y ago

Yeah, stop being silly and just click the link on the page to the actual documentation with examples:

http://msdn.microsoft.com/en-us/library/windows/apps/xaml/wi...

jwr11y ago

You are expected to OCR them using the library.

tiedemann11y ago

"This blog was written by Jelena Mojasevic, Program Manager at Microsoft" - It seems no one told her how to embed code snippets.

jamessantiago11y ago· 2 in thread

Off topic, but this made me think that it would be neat if libraries on places like github and nuget could somehow include "cited by" data. Something that referenced open source (maybe closed source too) projects that had a dependency on the library, similar to google scholar or CiteSeerX.
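The core of such a "cited by" feature is just an inverted dependency map. A minimal sketch, assuming each project's dependency list is already known; the `cited_by` name and the input shape are invented for illustration and do not correspond to any GitHub or NuGet API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def cited_by(dependencies: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Invert {project: [libraries it depends on]} into {library: [projects that depend on it]}."""
    index = defaultdict(set)
    for project, libraries in dependencies.items():
        for library in libraries:
            index[library].add(project)
    # Sort the dependents so the output is stable across runs.
    return {library: sorted(projects) for library, projects in index.items()}
```

Collecting the dependency lists in the first place is the hard part; the inversion itself is cheap.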

asuidyasiud11y ago

You can get a DOI for github.

afandian11y ago

Then what? There's nothing magical about DOIs. You need someone to store the citation metadata. And generate / deposit citation metadata. And maintain the persistence of the DOI. What precisely does the DOI represent? A codebase? A fork of it? A file? A file at a particular revision? A changeset?

swalsh11y ago

This is very cool! I've been working on a receipt scanning tool in C# for keeping track of kitchen inventory (tired of calling my wife asking if we have sesame oil or some odd ball thing)

I found a few libraries, but they only worked with relatively perfect scans (my goal is to be able to just use a phone). When I get home definitely going to give this a go.

mdaniel11y ago

On http://msdn.microsoft.com/en-us/library/windows/apps/windows... they mention the supported languages and their statuses, but Korean is only "Good".

I freely admit that I do not speak Korean, but if one compares "Chinese Simplified" characters (listed as "Very good") with those in the Korean alphabet, I am surprised those two entries aren't transposed.

Is there something that makes recognizing Korean harder than Chinese Simplified, or was that just a product management decision?

jccodez11y ago

tesseract is really looking great with google adding searchable pdf as output in the latest release candidate.

Norm--11y ago

So from reading the list of reasons for inaccurate results, it sounds like this library is totally useless for images taken with mobile phones, yet it is only allowed to run on mobile ;)

Now I would be more interested in an image correction library

"....
- Blurry images
- Handwritten or cursive text
- Artistic font styles
- Small text size (less than 15 pixels for Western languages, or less than 20 pixels for East Asian languages)
- Complex backgrounds
- Shadows or glare over text
- Perspective distortion
- Oversized or dropped capital letters at the beginnings of words
- Subscript, superscript, or strikethrough text"

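The text-size limits in that list are the one item that is easy to check mechanically. A small sketch, assuming the text height in pixels has already been estimated somehow: it returns how much an image would need to be enlarged to reach the quoted minimums. The function name and the `script` keys are hypothetical, and upscaling will not fix blur or the other items.

```python
# Minimum text heights quoted from the documentation excerpt above.
MIN_TEXT_HEIGHT_PX = {"western": 15, "east_asian": 20}


def upscale_factor(text_height_px: float, script: str = "western") -> float:
    """Return the scale factor needed to reach the minimum text height (1.0 if already large enough)."""
    if text_height_px <= 0:
        raise ValueError("text height must be positive")
    minimum = MIN_TEXT_HEIGHT_PX[script]
    return max(1.0, minimum / text_height_px)
```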