Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
- intro: Google. Ian J. Goodfellow
- arxiv: https://arxiv.org/abs/1312.6082
End-to-End Text Recognition with Convolutional Neural Networks
- paper: http://www.cs.stanford.edu/~acoates/papers/wangwucoatesng_icpr2012.pdf
- honors thesis: http://cs.stanford.edu/people/dwu4/HonorThesis.pdf
Word Spotting and Recognition with Embedded Attributes
Reading Text in the Wild with Convolutional Neural Networks
- arxiv: http://arxiv.org/abs/1412.1842
- homepage: http://www.robots.ox.ac.uk/~vgg/publications/2016/Jaderberg16/
- demo: http://zeus.robots.ox.ac.uk/textsearch/#/search/
- code: http://www.robots.ox.ac.uk/~vgg/research/text/
Deep structured output learning for unconstrained text recognition
- intro: “propose an architecture consisting of a character sequence CNN and an N-gram encoding CNN which act on an input image in parallel and whose outputs are utilized along with a CRF model to recognize the text content present within the image.”
- arxiv: http://arxiv.org/abs/1412.5903
Deep Features for Text Spotting
- paper: http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14/jaderberg14.pdf
- bitbucket: https://bitbucket.org/jaderberg/eccv2014_textspotting
- gitxiv: http://gitxiv.com/posts/uB4y7QdD5XquEJ69c/deep-features-for-text-spotting
Reading Scene Text in Deep Convolutional Sequences
DeepFont: Identify Your Font from An Image
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- intro: Convolutional Recurrent Neural Network (CRNN)
- arxiv: http://arxiv.org/abs/1507.05717
- github: https://github.com/bgshih/crnn
- github: https://github.com/meijieru/crnn.pytorch
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild
Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks
DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images
End-to-End Interpretation of the French Street Name Signs Dataset
- paper: http://link.springer.com/chapter/10.1007%2F978-3-319-46604-0_30
- github: https://github.com/tensorflow/models/tree/master/street
End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance
Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading
Improving Text Proposals for Scene Images with Fully Convolutional Networks
- intro: Universitat Autonoma de Barcelona (UAB) & University of Florence
- intro: International Conference on Pattern Recognition (ICPR) - DLPR (Deep Learning for Pattern Recognition) workshop
- arxiv: https://arxiv.org/abs/1702.05089
Scene Text Eraser
Attention-based Extraction of Structured Information from Street View Imagery
- intro: University College London & Google Inc
- arxiv: https://arxiv.org/abs/1704.03549
- github: https://github.com/tensorflow/models/tree/master/attention_ocr
STN-OCR: A single Neural Network for Text Detection and Text Recognition
- arxiv: https://arxiv.org/abs/1707.08831
- github(MXNet): https://github.com/Bartzi/stn-ocr
Text Detection
Object Proposals for Text Extraction in the Wild
- intro: ICDAR 2015
- arxiv: http://arxiv.org/abs/1509.02317
- github: https://github.com/lluisgomez/TextProposals
Text-Attentional Convolutional Neural Networks for Scene Text Detection
Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network
Synthetic Data for Text Localisation in Natural Images
- intro: CVPR 2016
- project page: http://www.robots.ox.ac.uk/~vgg/data/scenetext/
- arxiv: http://arxiv.org/abs/1604.06646
- paper: http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf
- github: https://github.com/ankush-me/SynthText
Scene Text Detection via Holistic, Multi-Channel Prediction
Detecting Text in Natural Image with Connectionist Text Proposal Network
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1609.03605
- github(Caffe): https://github.com/tianzhi0549/CTPN
- github(CUDA8.0 support): https://github.com/qingswu/CTPN
- demo: http://textdet.com/
- github(Tensorflow): https://github.com/eragonruan/text-detection-ctpn
TextBoxes: A Fast Text Detector with a Single Deep Neural Network
- intro: AAAI 2017
- arxiv: https://arxiv.org/abs/1611.06779
- github(Caffe): https://github.com/MhLiao/TextBoxes
- github: https://github.com/xiaodiu2010/TextBoxes-TensorFlow
TextBoxes++: A Single-Shot Oriented Scene Text Detector
- intro: Huazhong University of Science and Technology (HUST)
- arxiv: https://arxiv.org/abs/1801.02765
- github: https://github.com/MhLiao/TextBoxes_plusplus
Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection
- intro: CVPR 2017
- intro: F-measure 70.64%, outperforming the existing state-of-the-art method with F-measure 63.76%
- arxiv: https://arxiv.org/abs/1703.01425
Detecting Oriented Text in Natural Images by Linking Segments
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1703.06520
- github(Tensorflow): https://github.com/dengdan/seglink
Deep Direct Regression for Multi-Oriented Scene Text Detection
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting
WordFence: Text Detection in Natural Images with Border Awareness
- intro: ICIP 2017
- arxiv: https://arxiv.org/abs/1705.05483
SSD-text detection: Text Detector
- intro: A modified SSD model for text detection
- github: https://github.com/oyxhust/ssd-text_detection
R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection
- intro: Samsung R&D Institute China
- arxiv: https://arxiv.org/abs/1706.09579
R-PHOC: Segmentation-Free Word Spotting using CNN
- intro: ICDAR 2017
- arxiv: https://arxiv.org/abs/1707.01294
Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks
EAST: An Efficient and Accurate Scene Text Detector
- intro: CVPR 2017. Megvii
- arxiv: https://arxiv.org/abs/1704.03155
- paper: http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhou_EAST_An_Efficient_CVPR_2017_paper.pdf
- github(Tensorflow): https://github.com/argman/EAST
Deep Scene Text Detection with Connected Component Proposals
- intro: Amap Vision Lab, Alibaba Group
- arxiv: https://arxiv.org/abs/1708.05133
Single Shot Text Detector with Regional Attention
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1709.00138
- github: https://github.com/BestSonny/SSTD
- code: http://sstd.whuang.org
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
Deep Residual Text Detection Network for Scene Text
- intro: IAPR International Conference on Document Analysis and Recognition (ICDAR) 2017. Samsung R&D Institute of China, Beijing
- arxiv: https://arxiv.org/abs/1711.04147
Feature Enhancement Network: A Refined Scene Text Detector
- intro: AAAI 2018
- arxiv: https://arxiv.org/abs/1711.04249
ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene
Detecting Curve Text in the Wild: New Dataset and New Solution
FOTS: Fast Oriented Text Spotting with a Unified Network
PixelLink: Detecting Scene Text via Instance Segmentation
- intro: AAAI 2018. Zhejiang University & Chinese Academy of Sciences
- arxiv: https://arxiv.org/abs/1801.01315
Sliding Line Point Regression for Shape Robust Scene Text Detection
Single Shot TextSpotter with Explicit Alignment and Attention
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1803.03474
Rotation-Sensitive Regression for Oriented Scene Text Detection
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1803.05265
Detecting Multi-Oriented Text with Corner-based Region Proposals
- arxiv: https://arxiv.org/abs/1804.02690
- github: https://github.com/xhzdeng/crpn
An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches
IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection
- intro: IJCAI 2018
- arxiv: https://arxiv.org/abs/1805.01167
Text Recognition
Sequence to sequence learning for unconstrained scene text recognition
- intro: master's thesis
- arxiv: http://arxiv.org/abs/1607.06125
Drawing and Recognizing Chinese Characters with Recurrent Neural Network
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
- intro: correct rates: Dataset-CASIA 97.10% and Dataset-ICDAR 97.15%
- arxiv: https://arxiv.org/abs/1610.02616
Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition
Visual attention models for scene text recognition
Focusing Attention: Towards Accurate Text Recognition in Natural Images
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1709.02054
Scene Text Recognition with Sliding Convolutional Character Models
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition
A New Hybrid-parameter Recurrent Neural Networks for Online Handwritten Chinese Character Recognition
Arbitrarily-Oriented Text Recognition
- intro: A method used in ICDAR 2017 word recognition competitions
- arxiv: https://arxiv.org/abs/1711.04226
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
Breaking Captcha
Using deep learning to break a Captcha system
- intro: “Using Torch code to break simplecaptcha with 92% accuracy”
- blog: https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-captcha-system/
- github: https://github.com/arunpatala/captcha
Breaking reddit captcha with 96% accuracy
- blog: https://deepmlblog.wordpress.com/2016/01/05/breaking-reddit-captcha-with-96-accuracy/
- github: https://github.com/arunpatala/reddit.captcha
I’m not a human: Breaking the Google reCAPTCHA
Neural Net CAPTCHA Cracker
- slides: http://www.cs.sjsu.edu/faculty/pollett/masters/Semesters/Spring15/geetika/CS298%20Slides%20-%20PDF
- github: https://github.com/bgeetika/Captcha-Decoder
- demo: http://cp-training.appspot.com/
Recurrent neural networks for decoding CAPTCHAS
- blog: https://deepmlblog.wordpress.com/2016/01/12/recurrent-neural-networks-for-decoding-captchas/
- demo: http://simplecaptcha.sourceforge.net/
- code: http://sourceforge.net/projects/simplecaptcha/
Reading irctc captchas with 95% accuracy using deep learning
I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs
- intro: Automatically solves 70.78% of image reCaptcha challenges, requiring only 19 seconds per challenge. Applied to the Facebook image captcha, the approach achieves an accuracy of 83.5%.
- paper: http://www.cs.columbia.edu/~polakis/papers/sivakorn_eurosp16.pdf
SimGAN-Captcha
- intro: Solve captchas without manually labeling a training set
- github: https://github.com/rickyhan/SimGAN-Captcha
Handwritten Recognition
High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps
Recognize your handwritten numbers
Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras
MNIST Handwritten Digit Classifier
LeNet – Convolutional Neural Network in Python
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention
MLPaint: the Real-Time Handwritten Digit Recognizer
- blog: http://blog.mldb.ai/blog/posts/2016/09/mlpaint/
- github: https://github.com/mldbai/mlpaint
- demo: https://docs.mldb.ai/ipy/notebooks/_demos/_latest/Image%20Processing%20with%20Convolutions.html
Training a Computer to Recognize Your Handwriting
Using TensorFlow to create your own handwriting recognition engine
- blog: https://niektemme.com/2016/02/21/tensorflow-handwriting/
- github: https://github.com/niektemme/tensorflow-mnist-predict/
Building a Deep Handwritten Digits Classifier using Microsoft Cognitive Toolkit
- blog: https://medium.com/@tuzzer/building-a-deep-handwritten-digits-classifier-using-microsoft-cognitive-toolkit-6ae966caec69#.c3h6o7oxf
- github: https://github.com/tuzzer/ai-gym/blob/a97936619cf56b5ed43329c6fa13f7e26b1d46b8/MNIST/minist_softmax_cntk.py
Hand Writing Recognition Using Convolutional Neural Networks
- intro: This CNN-based model for recognizing handwritten digits attains a validation accuracy of 99.2% after training for 12 epochs. It is trained on the MNIST dataset from Kaggle.
- github: https://github.com/ayushoriginal/HandWritingRecognition-CNN
Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling
- intro: The model requires only 0.57 MB of storage, while performance decreases by only 0.91%.
- arxiv: https://arxiv.org/abs/1705.05207
Handwritten digit string recognition by combination of residual network and RNN-CTC
Plate Recognition
Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs
Number plate recognition with Tensorflow
- blog: http://matthewearl.github.io/2016/05/06/cnn-anpr/
- github(Deep ANPR): https://github.com/matthewearl/deep-anpr
Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN
- intro: International Workshop on Advanced Image Technology (IWAIT 2017), January 8-10, 2017, Penang, Malaysia
- arxiv: https://arxiv.org/abs/1701.06439
License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks
Adversarial Generation of Training Examples for Vehicle License Plate Recognition
Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks
High Accuracy Chinese Plate Recognition Framework
- intro: A high-performance Chinese license plate recognition framework based on deep learning.
- github: https://github.com/zeusees/HyperLPR
Applying OCR Technology for Receipt Recognition
- blog: http://rnd.azoft.com/applying-ocr-technology-receipt-recognition/
- mirror: http://pan.baidu.com/s/1qXQBQiC
Hacking MNIST in 30 lines of Python
- blog: http://jrusev.github.io/post/hacking-mnist/
- github: https://github.com/jrusev/simple-neural-networks
Optical Character Recognition Using One-Shot Learning, RNN, and TensorFlow
Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning
ocropy: Python-based tools for document analysis and OCR
- github: https://github.com/tmbdev/ocropy
Extracting text from an image using Ocropus
CLSTM : A small C++ implementation of LSTM networks, focused on OCR
- github: https://github.com/tmbdev/clstm
OCR text recognition using tensorflow with attention
Digit Recognition via CNN: digital meter numbers detection
- github(caffe): https://github.com/SHUCV/digit
Attention-OCR: Visual Attention based OCR
umaru: An OCR system based on Torch that uses LSTM/GRU-RNN and CTC, drawing on the work of rnnlib and clstm
Tesseract.js: Pure Javascript OCR for 62 Languages
- homepage: http://tesseract.projectnaptha.com/
- github: https://github.com/naptha/tesseract.js
DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel)
deep ocr: make a better chinese character recognition OCR than tesseract
Practical Deep OCR for scene text using CTPN + CRNN
Tensorflow-based CNN+LSTM trained with CTC-loss for OCR
SSD_scene_text_detection
- github: https://github.com/chenxinpeng/SSD_scene_text_detection
- blog: http://blog.csdn.net/u010167269/article/details/52563573
Deep Learning for OCR
Scene Text Localization & Recognition Resources
- intro: A curated list of resources dedicated to scene text localization and recognition
- github: https://github.com/chongyangtao/Awesome-Scene-Text-Recognition
Scene Text Localization & Recognition Resources
- intro: A collection of papers and resources on text localization and recognition in images
- github: https://github.com/whitelok/image-text-localization-recognition/blob/master/README.zh-cn.md
awesome-ocr: A curated list of promising OCR resources