some suggestions #112
1. Normalization is influenced by the alpha channel.
```python
grayscale = (data[..., 0]-data[..., 0].min()) / (data[..., 0].max()-data[..., 0].min())*255
```
2. The paste size does not match.
```python
padded.paste(im, (0, 0, im.size[0], im.size[1]))
```
3. The wrong pad pixel is used when the text is inverted.
The text is sometimes inverted, but the pad pixel is hard-coded to 255:
```python
padded = Image.new('L', dims, 255)
```
I noticed this causes recognition errors: when the text pixels are 255 and the pad pixels are also 255, the padded region is recognized as text.
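A minimal sketch of how the three points above could fit together, not the project's actual code. The helper names `to_grayscale` and `pad_to_multiple`, the `divisor=32` default, and the mean-based guess of the background value are all assumptions made for illustration.

```python
import numpy as np
from PIL import Image


def to_grayscale(img: Image.Image) -> np.ndarray:
    """Min-max normalize the luminance channel only, so alpha cannot skew the range."""
    data = np.asarray(img.convert("LA"), dtype=np.float32)
    lum = data[..., 0]  # channel 0 is luminance, channel 1 is alpha
    span = lum.max() - lum.min()
    if span == 0:  # flat image: avoid dividing by zero
        return lum.astype(np.uint8)
    return ((lum - lum.min()) / span * 255).astype(np.uint8)


def pad_to_multiple(gray: np.ndarray, divisor: int = 32) -> Image.Image:
    """Pad a grayscale array so both dimensions are divisible by `divisor`."""
    h, w = gray.shape
    # Round width and height up to the next multiple of `divisor`.
    dims = (-(-w // divisor) * divisor, -(-h // divisor) * divisor)
    # Assumption: the background covers most of the image, so the mean tells
    # us whether it is light (pad with 255) or dark (pad with 0).
    background = 255 if gray.mean() > 127 else 0
    padded = Image.new("L", dims, background)
    # A 2-tuple gives only the upper-left corner, so the pasted size always matches.
    padded.paste(Image.fromarray(gray), (0, 0))
    return padded
```

With this heuristic, an inverted image (white text on black) is padded with 0 instead of 255, so the padded region is not confused with text. The heuristic would fail on images where the text covers more area than the background.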
|
Thanks for the contribution.
I have implemented these changes and pushed them to your branch (49480af). Feel free to comment. |
|
Thanks for your attention. @lukas-blecher
Here is the padded image. The line at the bottom does not exist in the original picture but appears after padding. I think this line is introduced during an intermediate processing step. If the information above is not enough, I will add more once my WSL is repaired. 😁 I encountered these problems with those images; if you do not mind, you can test them before I repair my WSL. |
|
I'm happy to give any insight if you have a specific question. |
|
Glad to hear your reply. Your code is much better than my coworkers'. My confusion comes from gaps in my knowledge. Here is what I have:
Here is what I want to know:
|
|
In fact, I want to build a handwritten formula recognition project, and I noticed your to-do list includes this. Would it be possible for me to become a collaborator on this project and learn from you? 😁 |
|
I sadly don't have any more papers for you. When I started this project the ViT was freshly proposed and I wanted to make a formula recognition model. I can tell you though it is helpful to have a CNN backbone in the encoder. Regarding the handwriting project: I see you already noticed the colab notebook I linked in the README: https://colab.research.google.com/drive/1ba_qCGJl29dFQqfBjdqMik3o_EqPE4fr |
Thanks for your advice. I look forward to your further results. |








