There is no hard limit. It depends on your patience and your traffic. Keep in mind that the more variations you test, the longer the experiment takes to declare a winner, because the same traffic is split across more variations. Another important factor is the difference between variations: the more substantially the variations differ, the more likely their performance differs by a wide margin, and the faster you can detect a statistically significant difference. Additionally, a control variation is not required. You can compare variations to one another using a Bayesian stats calculator, which estimates a 'probability to be best' for each variation and can let you declare a winner sooner.
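To make 'probability to be best' concrete, here is a minimal sketch of how such a figure can be estimated. It is not the method of any particular calculator: it assumes a uniform Beta(1, 1) prior on each conversion rate and uses Monte Carlo sampling, and the function name, variation names, and counts are made up for illustration.

```python
import random

def probability_to_be_best(results, draws=20000, seed=0):
    """Estimate each variation's probability of having the highest conversion rate.

    results maps variation name -> (conversions, visitors).
    Assumption: a uniform Beta(1, 1) prior on every conversion rate.
    """
    rng = random.Random(seed)
    names = list(results)
    wins = dict.fromkeys(names, 0)
    for _ in range(draws):
        # Draw one plausible conversion rate per variation from its posterior.
        samples = {
            name: rng.betavariate(1 + conv, 1 + visitors - conv)
            for name, (conv, visitors) in results.items()
        }
        # Credit the variation with the highest sampled rate in this draw.
        wins[max(samples, key=samples.get)] += 1
    return {name: wins[name] / draws for name in names}

# Hypothetical counts, for illustration only; no control is needed.
example = {"A": (120, 2400), "B": (150, 2400), "C": (135, 2400)}
probabilities = probability_to_be_best(example)
for name, p in probabilities.items():
    print(f"{name}: {p:.1%} probability to be best")
```

Note that the variations are compared directly with one another, which is why no designated control is needed.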
It comes down to how bold you are and how quickly you want results. Your resources are traffic and creativity, and you need to use them wisely. The explore-exploit tradeoff means balancing two goals: exploiting what you already know by giving users the best experience you have found so far, and exploring new variations, which risks serving a sub-optimal experience while you try to discover an even better one.
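One common way to balance exploration and exploitation automatically is Thompson sampling. The sketch below is an illustration of that technique, not something the text above prescribes: it assumes binary conversions and a Beta(1, 1) prior, and the "true" conversion rates are invented purely to drive the simulation.

```python
import random

def thompson_pick(stats, rng):
    """Choose the variation to show the next visitor.

    stats maps variation name -> [conversions, visitors].
    """
    # Sample a plausible conversion rate for each variation. Uncertain
    # variations sometimes draw high values, which is the exploration.
    draws = {
        name: rng.betavariate(1 + conv, 1 + visitors - conv)
        for name, (conv, visitors) in stats.items()
    }
    # Showing the variation with the best draw is the exploitation.
    return max(draws, key=draws.get)

def simulate(true_rates, visitors=5000, seed=0):
    """Simulate an experiment and return the per-variation counts."""
    rng = random.Random(seed)
    stats = {name: [0, 0] for name in true_rates}
    for _ in range(visitors):
        name = thompson_pick(stats, rng)
        stats[name][1] += 1
        if rng.random() < true_rates[name]:
            stats[name][0] += 1
    return stats

# Hypothetical true rates; in a real experiment these are unknown.
true_rates = {"A": 0.05, "B": 0.12, "C": 0.04}
stats = simulate(true_rates)
for name, (conv, seen) in stats.items():
    print(f"{name}: shown {seen} times, {conv} conversions")
```

Early on, traffic is spread across all variations. As evidence accumulates, most visitors are routed to the variation that appears best, so fewer users are served a sub-optimal experience.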
There is no silver bullet. Do your best to come up with a diverse set of variations, and accept the risk of exploring them in the short term to improve performance in the long run. Don't create a new variation just for the sake of testing more; do it only if you have a reasonable belief that it will outperform anything you have tried so far.
Working with small changes takes a lot of time and only allows you to "fine tune" the performance of your page. If you test major differences, on the other hand, you can make leaps in performance. That said, major changes are risky: a weak variation can drag performance down for the duration of the experiment. One option is to make major changes but run the experiment on a small portion of your traffic.
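Limiting an experiment to a small portion of traffic can be sketched as a deterministic, hash-based assignment. This is only one possible approach; the function name, the 10% exposure, and the "default" label for users outside the experiment are assumptions made for illustration.

```python
import hashlib

def assign(user_id, experiment, exposure=0.10, variations=("B", "C")):
    """Return the variation for a user, or "default" if outside the experiment.

    exposure is the fraction of users who see an experimental variation.
    """
    # Hashing the experiment name with the user ID gives each user a stable
    # bucket in [0, 1) that is independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    if bucket >= exposure:
        return "default"  # everyone else keeps the current experience
    # Split the exposed users evenly among the experimental variations.
    index = min(int(bucket / exposure * len(variations)), len(variations) - 1)
    return variations[index]

# The same user always lands in the same group.
print(assign("user-42", "homepage-redesign"))
```

Because the assignment depends only on the user ID and the experiment name, a returning user keeps seeing the same variation, and the exposure can be raised later once a bold variation looks safe.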