I'm trying to understand the basics of machine learning, and I have this theoretical question:
I have a deep linear network (no activation functions) and k classes to learn. Assume that, during training, I only show my network points sampled from a simple distribution D supported on a circle, and that these points all have the same label, say every training example is a pair (x, 0). I want my trained network to perform well on this circle.
How quickly can my network reach zero error on points with this label drawn from the same distribution D?
How can I prove it?
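To make the setup concrete, here is a minimal NumPy sketch of the experiment I have in mind. It is only an illustration under my own assumptions, not a reference implementation: the function names (`sample_circle`, `forward`, `steps_to_zero_error`), the softmax cross-entropy loss, plain mini-batch gradient descent, and all hyperparameters are made up for the example. I also assume the circle is not centered at the origin: without bias terms the logits at x and -x are negatives of each other, so a bias-free linear network cannot assign class 0 to both.

```python
import numpy as np


def sample_circle(n, rng, center=(2.0, 0.0), radius=1.0):
    """Draw n points uniformly from a circle (assumed off-center; hypothetical choice)."""
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    unit = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    return np.asarray(center) + radius * unit


def forward(weights, x):
    """Deep linear network: a product of weight matrices, no activation, no bias."""
    h = x
    for w in weights:
        h = h @ w
    return h


def steps_to_zero_error(k=3, depth=3, width=8, lr=0.05, batch=64,
                        max_steps=5000, seed=0):
    """Train on points that all carry label 0; return the first step at which
    a held-out sample from D is classified without error, or None if that
    never happens within max_steps. All defaults are assumptions."""
    rng = np.random.default_rng(seed)
    dims = [2] + [width] * (depth - 1) + [k]
    weights = [rng.normal(0.0, 1.0 / np.sqrt(dims[i]), size=(dims[i], dims[i + 1]))
               for i in range(depth)]
    x_eval = sample_circle(1000, rng)  # held-out points from the same D

    for step in range(max_steps + 1):
        if np.all(forward(weights, x_eval).argmax(axis=1) == 0):
            return step
        if step == max_steps:
            break

        # Forward pass, keeping every layer's input for backpropagation.
        x = sample_circle(batch, rng)
        acts = [x]
        for w in weights:
            acts.append(acts[-1] @ w)

        # Softmax cross-entropy gradient with respect to the logits (label 0).
        logits = acts[-1] - acts[-1].max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = probs
        grad[:, 0] -= 1.0
        grad /= batch

        # Backward pass through the linear layers, then a gradient step.
        for i in reversed(range(depth)):
            grad_w = acts[i].T @ grad
            grad = grad @ weights[i].T  # uses the weights before the update
            weights[i] -= lr * grad_w
    return None
```

Calling `steps_to_zero_error()` gives an empirical step count for one random seed; what I am after is a theoretical statement about how this number behaves.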
Thank you very much!